Agent Evaluation

Databricks Agent Evaluation Python SDK.

For more details, see Databricks Agent Evaluation.

databricks.agents.evals.generate_evals_df(docs: DataFrame | pyspark.sql.DataFrame, *, num_evals: int, agent_description: str | None = None, question_guidelines: str | None = None, guidelines: str | None = None) DataFrame

Generate an evaluation dataset with synthetic requests and synthetic expected_facts, given a set of documents.

The generated evaluation set can be used with Databricks Agent Evaluation.

For more details, see the Synthesize evaluation set guide.

Parameters:
  • docs – A pandas/Spark DataFrame with a text column content and a doc_uri column.

  • num_evals – The total number of evaluations to generate across all the documents. The function tries to distribute generated evals over all of your documents, taking into consideration their size. If num_evals is less than the number of documents, not all documents will be covered in the evaluation set.

  • agent_description – Optional task description of the agent used to guide the generation.

  • question_guidelines – Optional guidelines to guide the question generation. The string can be formatted in markdown and may include sections like:

    • User Personas: Types of users the agent should support

    • Example Questions: Sample questions to guide generation

    • Additional Guidelines: Extra rules or requirements
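
A minimal usage sketch; the document contents, URIs, and guideline text below are illustrative placeholders:

import pandas as pd
from databricks.agents.evals import generate_evals_df

# Source documents; the DataFrame must have `content` and `doc_uri` columns.
docs = pd.DataFrame(
    [
        {
            "content": "Retrieval-augmented generation (RAG) combines retrieval with generation ...",
            "doc_uri": "dbfs:/docs/rag_overview.md",
        },
        {
            "content": "Databricks Model Serving exposes models behind REST endpoints ...",
            "doc_uri": "dbfs:/docs/model_serving.md",
        },
    ]
)

evals = generate_evals_df(
    docs,
    num_evals=10,
    agent_description="A chatbot that answers questions about Databricks documentation.",
    question_guidelines="""
# User Personas
- A data engineer new to Databricks

# Example Questions
- What is RAG?
""",
)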

databricks.agents.evals.estimate_synthetic_num_evals(docs: DataFrame | pyspark.sql.DataFrame, *, eval_per_x_tokens: int) int

Estimate the number of evals to synthetically generate for full coverage over the documents.

Parameters:
  • docs – A pandas/Spark DataFrame with a text column content.

  • eval_per_x_tokens – Generate 1 eval for every x tokens to control the coverage level. 500 tokens is ~1 page of text.

Returns:

The estimated number of evaluations to generate.
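
A sketch of sizing num_evals from the documents themselves, reusing the docs DataFrame from the previous sketch:

from databricks.agents.evals import estimate_synthetic_num_evals, generate_evals_df

# Roughly one eval per page of text (500 tokens is ~1 page).
num_evals = estimate_synthetic_num_evals(docs, eval_per_x_tokens=500)
evals = generate_evals_df(docs, num_evals=num_evals)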

databricks.agents.evals.metric(eval_fn=None, *, name: str | None = None)

Note

Experimental: This function may change or be removed in a future release without warning.

Create a custom agent metric from a user-defined eval function.

Can be used as a decorator on the eval_fn.

The eval_fn should have the following signature:

def eval_fn(
    *,
    request_id: str,
    request: Union[ChatCompletionRequest, str],
    response: Optional[Any],
    retrieved_context: Optional[List[Dict[str, str]]],
    expected_response: Optional[Any],
    expected_facts: Optional[List[str]],
    expected_retrieved_context: Optional[List[Dict[str, str]]],
    custom_expected: Optional[Dict[str, Any]],
    custom_inputs: Optional[Dict[str, Any]],
    custom_outputs: Optional[Dict[str, Any]],
    trace: Optional[mlflow.entities.Trace],
    tool_calls: Optional[List[ToolCallInvocation]],
    **kwargs,
) -> Optional[Union[int, float, bool]]:
    """
    Args:
        request_id: The ID of the request.
        request: The agent's input from your input eval dataset.
        response: The agent's raw output, passed through from the agent as is.
        retrieved_context: Retrieved context, either from your input eval dataset or extracted
                           from the trace. If you have custom extraction logic, use the `trace` field.
        expected_response: The expected response from your input eval dataset.
        expected_facts: The expected facts from your input eval dataset.
        expected_retrieved_context: The expected retrieved context from your input eval dataset.
        custom_expected: Custom expected information from your input eval dataset.
        custom_inputs: Custom inputs from your input eval dataset.
        custom_outputs: Custom outputs from the agent's response.
        trace: The trace object. You can use this to extract additional information from the trace.
        tool_calls: List of tool call invocations, either from your agent's response (ChatAgent only)
                    or from the trace. Extraction from the trace is prioritized because the trace
                    contains additional information, such as the available tools and the span from
                    which each tool was called.
    """

eval_fn will always be called with named arguments; declare only the arguments you use. If kwargs is declared, all available arguments will be passed.

The return value of the function should be either a number or a boolean. It will be used as the metric value. Return None if the metric cannot be computed.

Parameters:
  • eval_fn – The user-defined eval function.

  • name – The name of the metric. If not provided, the function name will be used.
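
A sketch of a custom metric using the decorator form; the metric name and the length budget are illustrative:

from databricks.agents.evals import metric

@metric(name="response_within_length_budget")
def response_within_length_budget(*, response, **kwargs):
    # Return None when the metric cannot be computed.
    if response is None:
        return None
    text = response if isinstance(response, str) else str(response)
    # Boolean metric: True if the raw response stays under a hypothetical 2000-character budget.
    return len(text) <= 2000

Custom metrics defined this way are typically passed to mlflow.evaluate via its extra_metrics argument with model_type="databricks-agent"; see the Agent Evaluation guide for details.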

class databricks.agents.evals.ToolCallInvocation(tool_name: str, tool_call_args: Dict[str, Any], tool_call_id: str | None = None, tool_call_result: Dict[str, Any] | None = None, raw_span: mlflow.entities.span.Span | None = None, available_tools: List[Dict[str, Any]] | None = None)

Bases: object

tool_name: str
tool_call_args: Dict[str, Any]
tool_call_id: str | None = None
tool_call_result: Dict[str, Any] | None = None
raw_span: Span | None = None
available_tools: List[Dict[str, Any]] | None = None
to_dict() Dict[str, Any]
classmethod from_dict(tool_calls: List[Dict[str, Any]] | Dict[str, Any] | None) ToolCallInvocation | List[ToolCallInvocation] | None

Judges

databricks.agents.evals.judges.chunk_relevance(request: str | Dict[str, Any], retrieved_context: List[Dict[str, Any]], assessment_name: str | None = None) List[Assessment]

The chunk-relevance-precision LLM judge determines whether the chunks returned by the retriever are relevant to the input request. Precision is calculated as the number of relevant chunks returned divided by the total number of chunks returned. For example, if the retriever returns four chunks and the LLM judge determines that three of the four are relevant to the request, then llm_judged/chunk_relevance/precision is 0.75.

Parameters:
  • request – Input to the application to evaluate, user’s question or query. For example, “What is RAG?”.

  • retrieved_context

    Retrieval results generated by the retriever in the application being evaluated. It should be a list of dictionaries with the following keys:

    • doc_uri (Optional): The doc_uri of the context.

    • content: The content of the context.

  • assessment_name – Optional override for the assessment name. If present, the output Assessment will use this as the name instead of “chunk_relevance”

Required input arguments:

request, retrieved_context

Returns:

Chunk relevance assessment result for each of the chunks in the given input.
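
A sketch of invoking the judge directly; the request and chunks are illustrative:

from databricks.agents.evals import judges

assessments = judges.chunk_relevance(
    request="What is RAG?",
    retrieved_context=[
        {"doc_uri": "docs/rag_overview.md", "content": "RAG combines retrieval with generation ..."},
        {"content": "Databricks Model Serving exposes models behind REST endpoints ..."},
    ],
)
# One Assessment per chunk, in the same order as retrieved_context.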

databricks.agents.evals.judges.context_sufficiency(request: str | Dict[str, Any], retrieved_context: List[Dict[str, Any]], expected_response: str | None = None, expected_facts: List[str] | None = None, assessment_name: str | None = None) Assessment

The context_sufficiency LLM judge determines whether the retriever has retrieved documents that are sufficient to produce the expected response or expected facts.

Parameters:
  • request – Input to the application to evaluate, user’s question or query. For example, “What is RAG?”.

  • expected_response – Ground-truth (correct) answer for the input request.

  • retrieved_context

    Retrieval results generated by the retriever in the application being evaluated. It should be a list of dictionaries with the following keys:

    • doc_uri (Optional): The doc_uri of the context.

    • content: The content of the context.

  • expected_facts – Array of strings containing facts expected in the correct response for the input request.

  • assessment_name – Optional override for the assessment name. If present, the output Assessment will use this as the name instead of “context_sufficiency”

Required input arguments:

request, retrieved_context, oneof(expected_response, expected_facts)

Returns:

Context sufficiency assessment result for the given input.

databricks.agents.evals.judges.correctness(request: str | Dict[str, Any], response: str | Dict[str, Any], expected_response: str | None = None, expected_facts: List[str] | None = None, assessment_name: str | None = None) Assessment

The correctness LLM judge gives a binary evaluation and written rationale on whether the response generated by the agent is factually accurate and semantically similar to the provided expected response or expected facts.

Parameters:
  • request – Input to the application to evaluate, user’s question or query. For example, “What is RAG?”.

  • response – Response generated by the application being evaluated.

  • expected_response – Ground-truth (correct) answer for the input request.

  • expected_facts – Array of strings containing facts expected in the correct response for the input request.

  • assessment_name – Optional override for the assessment name. If present, the output Assessment will use this as the name instead of “correctness”

Required input arguments:

request, response, oneof(expected_response, expected_facts)

Returns:

Correctness assessment result for the given input.
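
A sketch of invoking the judge with expected_facts (pass exactly one of expected_response or expected_facts); the values are illustrative:

from databricks.agents.evals import judges

assessment = judges.correctness(
    request="What is RAG?",
    response="RAG stands for retrieval-augmented generation, which grounds answers in retrieved documents.",
    expected_facts=[
        "RAG stands for retrieval-augmented generation",
        "RAG grounds responses in retrieved documents",
    ],
)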

databricks.agents.evals.judges.groundedness(request: str | Dict[str, Any], response: str | Dict[str, Any], retrieved_context: List[Dict[str, Any]], assessment_name: str | None = None) Assessment

The groundedness LLM judge returns a binary evaluation and written rationale on whether the generated response is factually consistent with the retrieved context.

Parameters:
  • request – Input to the application to evaluate, user’s question or query. For example, “What is RAG?”.

  • response – Response generated by the application being evaluated.

  • retrieved_context

    Retrieval results generated by the retriever in the application being evaluated. It should be a list of dictionaries with the following keys:

    • doc_uri (Optional): The doc_uri of the context.

    • content: The content of the context.

  • assessment_name – Optional override for the assessment name. If present, the output Assessment will use this as the name instead of “groundedness”

Required input arguments:

request, response, retrieved_context

Returns:

Groundedness assessment result for the given input.

databricks.agents.evals.judges.guideline_adherence(request: str | Dict[str, Any], response: str | Dict[str, Any], guidelines: List[str] | Dict[str, List[str]], assessment_name: str | None = None) Assessment | List[Assessment]

The guideline_adherence LLM judge determines whether the response to the request adheres to the provided guidelines.

Parameters:
  • request – Input to the application to evaluate, user’s question or query. For example, “What is RAG?”.

  • response – Response generated by the application being evaluated.

  • guidelines – One of the following: - Array of strings containing the guidelines that the response should adhere to. - Mapping of string (named guidelines) to array of strings containing the guidelines the response should adhere to.

  • assessment_name – Optional override for the assessment name. If present, the output Assessment will use this as the name instead of “guideline_adherence”

Required input arguments:

request, response, guidelines

Returns:

Guideline adherence assessment(s) result for the given input. Returns a list when named guidelines are provided.
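
A sketch using named guidelines, which yields one Assessment per named group; the guideline text is illustrative:

from databricks.agents.evals import judges

assessments = judges.guideline_adherence(
    request="What is RAG?",
    response="RAG stands for retrieval-augmented generation ...",
    guidelines={
        "tone": ["The response must be professional and free of slang."],
        "conciseness": ["The response must be no longer than three sentences."],
    },
)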

databricks.agents.evals.judges.relevance_to_query(request: str | Dict[str, Any], response: str | Dict[str, Any], assessment_name: str | None = None) Assessment

The relevance_to_query LLM judge determines whether the response is relevant to the input request.

Parameters:
  • request – Input to the application to evaluate, user’s question or query. For example, “What is RAG?”.

  • response – Response generated by the application being evaluated.

  • assessment_name – Optional override for the assessment name. If present, the output Assessment will use this as the name instead of “relevance_to_query”

Required input arguments:

request, response

Returns:

Relevance to query assessment result for the given input.

databricks.agents.evals.judges.safety(request: str | Dict[str, Any], response: str | Dict[str, Any], assessment_name: str | None = None) Assessment

The safety LLM judge returns a binary rating and a written rationale on whether the generated response has harmful or toxic content.

Parameters:
  • request – Input to the application to evaluate, user’s question or query. For example, “What is RAG?”.

  • response – Response generated by the application being evaluated.

  • assessment_name – Optional override for the assessment name. If present, the output Assessment will use this as the name instead of “safety”

Required input arguments:

request, response

Returns:

Safety assessment result for the given input.

Datasets

Databricks Agent Datasets Python SDK.

For more details, see Databricks Agent Evaluation: https://docs.databricks.com/en/generative-ai/agent-evaluation/index.html

databricks.agents.datasets.create_dataset(uc_table_name: str, experiment_id: str | list[str] | None = None) Dataset

Create a dataset with the given name and associate it with the given experiment.

Parameters:
  • uc_table_name – The UC table location of the dataset.

  • experiment_id – The ID of the experiment to associate the dataset with. If not provided, the current experiment is inferred from the environment.
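
A minimal sketch; the UC table name is a placeholder and its catalog and schema are assumed to exist:

from databricks.agents import datasets

# experiment_id omitted: inferred from the current environment.
dataset = datasets.create_dataset("main.eval.support_agent_evals")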

class databricks.agents.datasets.Dataset(dataset_id: str, digest: str | None = None, name: str | None = None, schema: str | None = None, profile: str | None = None, source: str | None = None, source_type: str | None = None, create_time: str | None = None, created_by: str | None = None, last_update_time: str | None = None, last_updated_by: str | None = None)

Bases: Dataset

A dataset for storing evaluation records (inputs and expectations).

dataset_id: str

The unique identifier of the dataset.

digest: str | None = None

String digest (hash) of the dataset provided by the caller that uniquely identifies the dataset.

name: str | None = None

The UC table name of the dataset.

classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) SchemaF[A]

The schema of the dataset. E.g., MLflow ColSpec JSON for a dataframe, MLflow TensorSpec JSON for an ndarray, or another schema format.

profile: str | None = None

The profile of the dataset, summary statistics.

source: str | None = None

Source information for the dataset.

source_type: str | None = None

The type of the dataset source, e.g. “databricks-uc-table”, “DBFS”, “S3”, …

create_time: str | None = None

The time the dataset was created.

created_by: str | None = None

The user who created the dataset.

last_update_time: str | None = None

The time the dataset was last updated.

last_updated_by: str | None = None

The user who last updated the dataset.

set_profile(profile: str) Dataset

Set the profile of the dataset.

insert(records: list[Dict] | DataFrame | pyspark.sql.DataFrame) Dataset

Insert records into the dataset. Records that share the same inputs will be merged into a single record.

Parameters:
  • records – A list of dicts, a pandas DataFrame, or a Spark DataFrame. For the input schema, see https://docs.databricks.com/en/generative-ai/agent-evaluation/evaluation-schema.html

to_df() DataFrame

Convert the dataset to a pandas DataFrame.
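
A sketch of inserting records and reading them back, continuing from the create_dataset sketch above; the record fields shown are illustrative only, so consult the evaluation input schema linked above for the exact layout:

dataset = dataset.insert(
    [
        {
            "request": "What is RAG?",
            "expected_facts": ["RAG stands for retrieval-augmented generation"],
        }
    ]
)
df = dataset.to_df()  # pandas DataFrame of the merged records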

classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) A
classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) A
to_dict(encode_json=False) Dict[str, dict | list | str | int | float | bool | None]
to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: int | str | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) str
databricks.agents.datasets.delete_dataset(uc_table_name: str) None

Delete the dataset with the given name.

databricks.agents.datasets.get_dataset(uc_table_name: str) Dataset

Get the dataset with the given name.

Review App

Databricks Agent Review App Python SDK.

For more details, see Databricks Agent Evaluation: https://docs.databricks.com/en/generative-ai/agent-evaluation/index.html

class databricks.agents.review_app.Agent(agent_name: str, model_serving_endpoint: str)

Bases: object

The agent configuration, used for generating responses in the review app.

agent_name: str
model_serving_endpoint: str
classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) A
classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) A
classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) SchemaF[A]
to_dict(encode_json=False) Dict[str, dict | list | str | int | float | bool | None]
to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: int | str | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) str
databricks.agents.review_app.get_review_app(experiment_id: str | None = None) ReviewApp

Gets or creates (if it doesn’t exist) the review app for the given experiment ID.

Parameters:

experiment_id – Optional. The experiment ID for which to get the review app. If not provided, the experiment ID is inferred from the current active environment.
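
A minimal sketch of fetching the review app for the active environment:

from databricks.agents import review_app

# No experiment_id passed: it is inferred from the current environment.
my_app = review_app.get_review_app()
print(my_app.url)  # share this URL with stakeholders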

class databricks.agents.review_app.LabelingSession(name: str, assigned_users: list[str], agent: str | None, label_schemas: list[str], labeling_session_id: str, mlflow_run_id: str, review_app_id: str, experiment_id: str, url: str)

Bases: object

A session for labeling items in the review app.

name: str
assigned_users: list[str]
agent: str | None
label_schemas: list[str]
labeling_session_id: str
mlflow_run_id: str
review_app_id: str
experiment_id: str
url: str
add_dataset(dataset_name: str, record_ids: list[str] | None = None) LabelingSession

Add a dataset to the labeling session.

Parameters:
  • dataset_name – The name of the dataset.

  • record_ids – Optional. The individual record IDs to be added to the session. If not provided, all records in the dataset will be added.

add_traces(traces: Iterable[Trace] | Iterable[str] | DataFrame) LabelingSession

Add traces to the labeling session.

Parameters:

traces – Can be either:

  • a pandas DataFrame with a ‘trace’ column, where each value is an mlflow.entities.Trace object or its JSON string representation

  • an iterable of mlflow.entities.Trace objects

  • an iterable of JSON string representations of mlflow.entities.Trace objects
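
A sketch of adding recent traces to a session, assuming session was returned by create_labeling_session (documented below); mlflow.search_traces returns a pandas DataFrame with a trace column:

import mlflow

traces_df = mlflow.search_traces(
    experiment_ids=[session.experiment_id],
    max_results=10,
)
session = session.add_traces(traces_df)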

sync_expectations(to_dataset: str) None

Sync the expectations from the labeling session to a dataset.

set_assigned_users(assigned_users: list[str]) LabelingSession

Set the assigned users for the labeling session.

classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) A
classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) A
classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) SchemaF[A]
to_dict(encode_json=False) Dict[str, dict | list | str | int | float | bool | None]
to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: int | str | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) str
class databricks.agents.review_app.ReviewApp(review_app_id: str, experiment_id: str, url: str, agents: list[databricks.rag_eval.review_app.entities.Agent] = <factory>, label_schemas: list[databricks.rag_eval.review_app.entities.LabelSchema] = <factory>)

Bases: object

A review app is used to collect feedback from stakeholders for a given experiment.

review_app_id

The ID of the review app.

Type:

str

experiment_id

The ID of the experiment.

Type:

str

url

The URL of the review app for stakeholders to provide feedback.

Type:

str

agents

The agents to be used to generate responses.

Type:

list[databricks.rag_eval.review_app.entities.Agent]

label_schemas

The label schemas to be used in the review app.

Type:

list[databricks.rag_eval.review_app.entities.LabelSchema]

review_app_id: str
experiment_id: str
url: str
agents: list[Agent]
label_schemas: list[LabelSchema]
add_agent(*, agent_name: str, model_serving_endpoint: str, overwrite: bool = False) ReviewApp

Add an agent to the review app to be used to generate responses.

remove_agent(agent_name: str) ReviewApp

Remove an agent from the review app.

create_label_schema(name: str, *, type: Literal['feedback', 'expectation'], title: str, input: InputCategorical | InputCategoricalList | InputText | InputTextList | InputNumeric, instruction: str | None = None, enable_comment: bool = False, overwrite: bool = False) LabelSchema

Create a new label schema for the review app.

A label schema defines the type of input that stakeholders will provide when labeling items in the review app.

Parameters:
  • name – The name of the label schema. Must be unique across the review app.

  • type – The type of the label schema. Either “feedback” or “expectation”.

  • title – The title of the label schema shown to stakeholders.

  • input – The input type of the label schema.

  • instruction – Optional. The instruction shown to stakeholders.

  • enable_comment – Optional. Whether to enable comments for the label schema.

  • overwrite – Optional. Whether to overwrite the existing label schema with the same name.
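
A sketch of a single-select feedback schema, continuing from the get_review_app sketch above; the schema name, title, and options are illustrative:

from databricks.agents.review_app import label_schemas

quality_schema = my_app.create_label_schema(
    name="response_quality",
    type="feedback",
    title="How good is the agent's response?",
    input=label_schemas.InputCategorical(options=["Good", "Acceptable", "Poor"]),
    instruction="Judge overall helpfulness and accuracy.",
    enable_comment=True,
)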

delete_label_schema(label_schema_name: str) ReviewApp

Delete a label schema from the review app.

create_labeling_session(name: str, *, assigned_users: list[str] = [], agent: str | None = None, label_schemas: list[str] = []) LabelingSession

Create a new labeling session in the review app.

Parameters:
  • name – The name of the labeling session.

  • assigned_users – The users that will be assigned to label items in the session.

  • agent – The agent to be used to generate responses for the items in the session.

  • label_schemas – The label schemas to be used in the session.
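
A sketch of creating a session that uses the schema defined above; the session name and user email are placeholders:

session = my_app.create_labeling_session(
    "support-agent-review",
    assigned_users=["first.last@example.com"],
    label_schemas=["response_quality"],
)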

get_labeling_sessions() list[LabelingSession]

Get all labeling sessions in the review app.

delete_labeling_session(labeling_session: LabelingSession) ReviewApp

Delete a labeling session from the review app.

classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) A
classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) A
classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) SchemaF[A]
to_dict(encode_json=False) Dict[str, dict | list | str | int | float | bool | None]
to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: int | str | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) str

Label Schemas

Label schemas for configuring the Review App.

class databricks.agents.review_app.label_schemas.LabelSchemaType(value)

Bases: StrEnum

Type of label schema.

FEEDBACK = 'feedback'
EXPECTATION = 'expectation'
class databricks.agents.review_app.label_schemas.LabelSchema(name: str, type: Literal['feedback', 'expectation'], title: str, input: InputCategorical | InputCategoricalList | InputText | InputTextList | InputNumeric, instruction: str | None = None, enable_comment: bool = False)

Bases: object

A label schema for collecting input from stakeholders.

name: str
type: Literal['feedback', 'expectation']
title: str
input: InputCategorical | InputCategoricalList | InputText | InputTextList | InputNumeric
instruction: str | None = None
enable_comment: bool = False
classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) A
classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) A
classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) SchemaF[A]
to_dict(encode_json=False) Dict[str, dict | list | str | int | float | bool | None]
to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: int | str | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) str
class databricks.agents.review_app.label_schemas.InputCategorical(options: list[str])

Bases: InputType

A single-select dropdown for collecting assessments from stakeholders.

options: list[str]
classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) A
classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) A
classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) SchemaF[A]
to_dict(encode_json=False) Dict[str, dict | list | str | int | float | bool | None]
to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: int | str | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) str
class databricks.agents.review_app.label_schemas.InputCategoricalList(options: list[str])

Bases: InputType

A multi-select dropdown for collecting assessments from stakeholders.

options: list[str]
classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) A
classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) A
classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) SchemaF[A]
to_dict(encode_json=False) Dict[str, dict | list | str | int | float | bool | None]
to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: int | str | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) str
class databricks.agents.review_app.label_schemas.InputNumeric(min_value: float | None = None, max_value: float | None = None)

Bases: InputType

A numeric input for collecting assessments from stakeholders.

min_value: float | None = None
max_value: float | None = None
classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) A
classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) A
classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) SchemaF[A]
to_dict(encode_json=False) Dict[str, dict | list | str | int | float | bool | None]
to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: int | str | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) str
class databricks.agents.review_app.label_schemas.InputText(max_length: int | None = None)

Bases: InputType

A free-form text box for collecting assessments from stakeholders.

max_length: int | None = None
classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) A
classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) A
classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) SchemaF[A]
to_dict(encode_json=False) Dict[str, dict | list | str | int | float | bool | None]
to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: int | str | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) str
class databricks.agents.review_app.label_schemas.InputTextList(max_length_each: int | None = None, max_count: int | None = None)

Bases: InputType

Like InputText, but allows multiple entries.

max_length_each: int | None = None
max_count: int | None = None
classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) A
classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) A
classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) SchemaF[A]
to_dict(encode_json=False) Dict[str, dict | list | str | int | float | bool | None]
to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: int | str | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) str
databricks.agents.review_app.label_schemas.EXPECTED_FACTS = "expected_facts"
databricks.agents.review_app.label_schemas.GUIDELINES = "guidelines"
databricks.agents.review_app.label_schemas.EXPECTED_RESPONSE = "expected_response"