Agent Evaluation

Databricks Agent Evaluation Python SDK.

For more details see Databricks Agent Evaluation.

databricks.agents.evals.generate_evals_df(docs: DataFrame | pyspark.sql.DataFrame, *, num_evals: int, agent_description: str | None = None, question_guidelines: str | None = None, guidelines: str | None = None) → DataFrame

Generate an evaluation dataset with synthetic requests and synthetic expected_facts, given a set of documents.

The generated evaluation set can be used with Databricks Agent Evaluation.

For more details, see the Synthesize evaluation set guide.

Parameters:

docs – A pandas/Spark DataFrame with a text column content and a doc_uri column.
num_evals – The total number of evaluations to generate across all the documents. The function tries to distribute generated evals over all of your documents, taking into consideration their size. If num_evals is less than the number of documents, not all documents will be covered in the evaluation set.
agent_description – Optional task description of the agent used to guide the generation.
question_guidelines – Optional guidelines to guide the question generation. The string can be formatted in markdown and may include sections like: - User Personas: Types of users the agent should support - Example Questions: Sample questions to guide generation - Additional Guidelines: Extra rules or requirements

databricks.agents.evals.estimate_synthetic_num_evals(docs: DataFrame | pyspark.sql.DataFrame, *, eval_per_x_tokens: int) → int

Estimate the number of evals to synthetically generate for full coverage over the documents.

Parameters:

docs – A pandas/Spark DataFrame with a text column content.
eval_per_x_tokens – Generate 1 eval for every x tokens to control the coverage level. 500 tokens is ~1 page of text.

Returns:

The estimated number of evaluations to generate.

databricks.agents.evals.metric(eval_fn=None, *, name: str | None = None, aggregations: List[Literal['min', 'max', 'mean', 'median', 'variance', 'p90', 'p99'] | Callable] | None = None)

Create a custom agent metric from a user-defined eval function.

Can be used as a decorator on the eval_fn.

The eval_fn should have the following signature:

def eval_fn(
    *,
    request_id: str,
    request: Union[ChatCompletionRequest, str],
    response: Optional[Any],
    retrieved_context: Optional[List[Dict[str, str]]]
    expected_response: Optional[Any],
    expected_facts: Optional[List[str]],
    guidelines: Optional[Union[List[str], Dict[str, List[str]]]],
    expected_retrieved_context: Optional[List[Dict[str, str]]],
    custom_expected: Optional[Dict[str, Any]],
    custom_inputs: Optional[Dict[str, Any]],
    custom_outputs: Optional[Dict[str, Any]],
    trace: Optional[mlflow.entities.Trace],
    tool_calls: Optional[List[ToolCallInvocation]],
    **kwargs,
) -> Optional[Union[int, float, bool]]:
    """
    Args:
        request_id: The ID of the request.
        request: The agent's input from your input eval dataset.
        response: The agent's raw output. Whatever we get from the agent, we will pass it here as is.
        retrieved_context: Retrieved context, can be from your input eval dataset or from the trace,
                           we will try to extract retrieval context from the trace;
                           if you have custom extraction logic, use the `trace` field.
        expected_response: The expected response from your input eval dataset.
        expected_facts: The expected facts from your input eval dataset.
        guidelines: The guidelines from your input eval dataset.
        expected_retrieved_context: The expected retrieved context from your input eval dataset.
        custom_expected: Custom expected information from your input eval dataset.
        custom_inputs: Custom inputs from your input eval dataset.
        custom_outputs: Custom outputs from the agent's response.
        trace: The trace object. You can use this to extract additional information from the trace.
        tool_calls: List of tool call invocations, can be from your agent's response (ChatAgent only)
                    or from the trace. We will prioritize extracting from the trace as it contains
                    additional information such as available tools and from which span the tool was called.
    """

eval_fn will always be called with named arguments. You only need to declare the arguments you need. If kwargs is declared, all available arguments will be passed.

The return value of the function should be either a number or a boolean. It will be used as the metric value. Return None if the metric cannot be computed.

Parameters:

eval_fn – The user-defined eval function.
name – The name of the metric. If not provided, the function name will be used.
aggregations – The aggregations to apply to the metric.

class databricks.agents.evals.ToolCallInvocation(tool_name: str, tool_call_args: Dict[str, Any], tool_call_id: str | None = None, tool_call_result: Dict[str, Any] | None = None, raw_span: mlflow.entities.span.Span | None = None, available_tools: List[Dict[str, Any]] | None = None)

Bases: object

tool_name: str

tool_call_args: Dict[str, Any]

tool_call_id: str | None = None

tool_call_result: Dict[str, Any] | None = None

raw_span: Span | None = None

available_tools: List[Dict[str, Any]] | None = None

to_dict() → Dict[str, Any]

classmethod from_dict(tool_calls: List[Dict[str, Any]] | Dict[str, Any] | None) → ToolCallInvocation | List[ToolCallInvocation] | None

Judges

databricks.agents.evals.judges.chunk_relevance(request: str | Dict[str, Any], retrieved_context: List[Dict[str, Any]] | Any, assessment_name: str | None = None) → List[Assessment]

The chunk-relevance-precision LLM judge determines whether the chunks returned by the retriever are relevant to the input request. Precision is calculated as the number of relevant chunks returned divided by the total number of chunks returned. For example, if the retriever returns four chunks, and the LLM judge determines that three of the four returned documents are relevant to the request, then llm_judged/chunk_relevance/precision is 0.75.

Parameters:

request – Input to the application to evaluate, user’s question or query. For example, “What is RAG?”.
retrieved_context –
Retrieval results generated by the retriever in the application being evaluated. It should be a list of dictionaries with the following keys:
- doc_uri (Optional): The doc_uri of the context.
- content: The content of the context.
assessment_name – Optional override for the assessment name. If present, the output Assessment will use this as the name instead of “chunk_relevance”

Required input arguments:: request, retrieved_context

Returns:: Chunk relevance assessment result for each of the chunks in the given input.

databricks.agents.evals.judges.context_sufficiency(request: str | Dict[str, Any], retrieved_context: List[Dict[str, Any]] | Any, expected_response: str | None = None, expected_facts: List[str] | None = None, assessment_name: str | None = None) → Assessment

The context_sufficiency LLM judge determines whether the retriever has retrieved documents that are sufficient to produce the expected response or expected facts.

Parameters:

request – Input to the application to evaluate, user’s question or query. For example, “What is RAG?”.
expected_response – Ground-truth (correct) answer for the input request.
retrieved_context –
Retrieval results generated by the retriever in the application being evaluated. It should be a list of dictionaries with the following keys:
- doc_uri (Optional): The doc_uri of the context.
- content: The content of the context.
expected_facts – Array of strings containing facts expected in the correct response for the input request.
assessment_name – Optional override for the assessment name. If present, the output Assessment will use this as the name instead of “context_sufficiency”

Required input arguments:: request, retrieved_context, oneof(expected_response, expected_facts)

Returns:: Context sufficiency assessment result for the given input.

The correctness LLM judge gives a binary evaluation and written rationale on whether the response generated by the agent is factually accurate and semantically similar to the provided expected response or expected facts.

Parameters:

request – Input to the application to evaluate, user’s question or query. For example, “What is RAG?”.
response – Response generated by the application being evaluated.
expected_response – Ground-truth (correct) answer for the input request.
expected_facts – Array of strings containing facts expected in the correct response for the input request.
assessment_name – Optional override for the assessment name. If present, the output Assessment will use this as the name instead of “correctness”

Required input arguments:: request, response, oneof(expected_response, expected_facts)

Returns:: Correctness assessment result for the given input.

databricks.agents.evals.judges.groundedness(request: str | Dict[str, Any], response: str | Dict[str, Any], retrieved_context: List[Dict[str, Any]] | Any, assessment_name: str | None = None) → Assessment

The groundedness LLM judge returns a binary evaluation and written rationale on whether the generated response is factually consistent with the retrieved context.

Parameters:

request – Input to the application to evaluate, user’s question or query. For example, “What is RAG?”.
response – Response generated by the application being evaluated.
retrieved_context –
Retrieval results generated by the retriever in the application being evaluated. It should be a list of dictionaries with the following keys:
- doc_uri (Optional): The doc_uri of the context.
- content: The content of the context.
assessment_name – Optional override for the assessment name. If present, the output Assessment will use this as the name instead of “groundedness”

Required input arguments:: request, response, retrieved_context

Returns:: Groundedness assessment result for the given input.

Deprecated since version The: guideline_adherence function is deprecated. Use the guidelines function instead.

The guideline_adherence LLM judge determines whether the provided context (one of or both of guidelines context or the response to the request) adheres to the provided guidelines.

Parameters:

request – Input to the application to evaluate, user’s question or query. For example, “What is RAG?”.
guidelines – One of the following: - Array of strings containing the guidelines that the response or context should adhere to. - Mapping of string (named guidelines) to array of strings containing the guidelines the response or context should adhere to.
response – Response generated by the application being evaluated.
guidelines_context – Mapping of a string (context field name) to any object (content) containing context the guidelines can apply to. The values in the mapping will be cast to strings.
assessment_name – Optional override for the assessment name. If present, the output Assessment will use this as the name instead of “guideline_adherence”

Required input arguments:: guidelines, oneof(request, response, guidelines_context)

Returns:: Guideline adherence assessment(s) result for the given input. Returns a list when named guidelines are provided.

databricks.agents.evals.judges.guidelines(guidelines: str | List[str], context: Dict[str, Any], assessment_name: str | None = None) → Assessment

The guidelines LLM judge determines whether the provided context adheres to the provided guidelines.

Parameters:

guidelines – Array of strings containing the guidelines that the context should adhere to.
context – Mapping of a string (context field name) to any object (content) containing context the guidelines can apply to. The values in the mapping will be cast to strings.
assessment_name – Optional override for the assessment name. If present, the output Assessment will use this as the name instead of “guidelines”

Returns:

Guidelines assessment result for the given input.

databricks.agents.evals.judges.relevance_to_query(request: str | Dict[str, Any], response: str | Dict[str, Any], assessment_name: str | None = None) → Assessment

The relevance_to_query LLM judge determines whether the response is relevant to the input request.

Parameters:

request – Input to the application to evaluate, user’s question or query. For example, “What is RAG?”.
response – Response generated by the application being evaluated.
assessment_name – Optional override for the assessment name. If present, the output Assessment will use this as the name instead of “relevance_to_query”

Required input arguments:: request, response

Returns:: Relevance to query assessment result for the given input.

The safety LLM judge returns a binary rating and a written rationale on whether the generated response has harmful or toxic content.

Parameters:

request – Input to the application to evaluate, user’s question or query. For example, “What is RAG?”.
response – Response generated by the application being evaluated.
assessment_name – Optional override for the assessment name. If present, the output Assessment will use this as the name instead of “safety”

Required input arguments:: response

Returns:: Safety assessment result for the given input.

databricks.agents.evals.judges.custom_prompt_judge(*, name: str, prompt_template: str, numeric_values: dict[str, int | float] | None = None) → Callable[[...], Feedback]

Create a custom prompt judge that evaluates inputs using a template.

Example prompt template:

``` You will look at the response and determine the formality of the response.

<request>{{request}}</request> <response>{{response}}</response>

You must choose one of the following categories.

[[formal]]: The response is very formal. [[semi_formal]]: The response is somewhat formal. The response is somewhat formal if the response mentions friendship, etc. [[not_formal]]: The response is not formal. ```

Variable names in the template should be enclosed in double curly braces, e.g., {{request}}, {{response}}. They should be alphanumeric and can include underscores, but should not contain spaces or special characters.

It is required for the prompt template to request choices as outputs, with each choice enclosed in square brackets. Choice names should be alphanumeric and can include underscores, but should not contain spaces or special characters.

Parameters:

name (str) – Name of the judge, used as the assessment name.
prompt_template (str) – Template string with {{var_name}} placeholders for variable substitution. Should be prompted with choices as outputs.
numeric_values (dict[str, int | float] | None) – Optional mapping from categorical values to numeric scores. Useful if you want to create a custom judge that returns continuous valued outputs. Defaults to None.

Returns:

A callable that takes keyword arguments mapping to the template variables and returns an mlflow Feedback.

Datasets

Databricks Agent Datasets Python SDK.

For more details see Databricks Agent Evaluation <https://docs.databricks.com/en/generative-ai/agent-evaluation/index.html>

databricks.agents.datasets.create_dataset(uc_table_name: str, experiment_id: str | list[str] | None = None) → Dataset

Create a dataset with the given name and associate it with the given experiment.

Parameters:

uc_table_name – The UC table location of the dataset.
experiment_id – The ID of the experiment to associate the dataset with. If not provided, the current experiment is inferred from the environment.

Bases: Dataset

A dataset for storing evaluation records (inputs and expectations).

dataset_id: str: The unique identifier of the dataset.

digest: str | None = None: String digest (hash) of the dataset provided by the caller that uniquely identifies

name: str | None = None: The UC table name of the dataset.

classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) → SchemaF[A]: The schema of the dataset. E.g., MLflow ColSpec JSON for a dataframe, MLflow TensorSpec JSON for an ndarray, or another schema format.

profile: str | None = None: The profile of the dataset, summary statistics.

source: DatasetSource | None = None: Source information for the dataset.

source_type: str | None = None: The type of the dataset source, e.g. “databricks-uc-table”, “DBFS”, “S3”, …

create_time: str | None = None: The time the dataset was created.

created_by: str | None = None: The user who created the dataset.

last_update_time: str | None = None: The time the dataset was last updated.

last_updated_by: str | None = None: The user who last updated the dataset.

set_profile(profile: str) → Dataset: Set the profile of the dataset.

insert(records: list[Dict] | DataFrame | pyspark.sql.DataFrame) → Dataset

merge_records(records: list[Dict] | DataFrame | pyspark.sql.DataFrame) → Dataset

Merge records into the dataset. Records that share the same inputs will be merged into a single record with the merged expectations and tags.

Parameters:

records – A list of dicts, a pandas DataFrame, or a Spark DataFrame. For the input schema
https (see) – //docs.databricks.com/en/generative-ai/agent-evaluation/evaluation-schema.html

to_df() → DataFrame: Convert the dataset to a pandas DataFrame.

classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) → A

classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) → A

Create config dictionary for the dataset.

Subclasses should override this method to provide additional fields in the config dict, e.g., schema, profile, etc.

Returns a string dictionary containing the following fields: name, digest, source, source type.

to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: str | int | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) → str

Obtains a JSON string representation of the Dataset.

Returns:: A JSON string representation of the Dataset.

databricks.agents.datasets.delete_dataset(uc_table_name: str) → None: Delete the dataset with the given name.

databricks.agents.datasets.get_dataset(uc_table_name: str) → Dataset: Get the dataset with the given name.

Review App

Databricks Agent Review App Python SDK.

For more details see Databricks Agent Evaluation <https://docs.databricks.com/en/generative-ai/agent-evaluation/index.html>

class databricks.agents.review_app.Agent(agent_name: str, model_serving_endpoint: str)

Bases: object

The agent configuration, used for generating responses in the review app.

agent_name: str

model_serving_endpoint: str

classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) → A

classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) → A

classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) → SchemaF[A]

to_dict(encode_json=False) → Dict[str, dict | list | str | int | float | bool | None]

to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: str | int | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) → str

databricks.agents.review_app.get_review_app(experiment_id: str | None = None) → ReviewApp

Gets or creates (if it doesn’t exist) the review app for the given experiment ID.

Parameters:: experiment_id – Optional. The experiment ID for which to get the review app. If not provided, the experiment ID is inferred from the current active environment.

class databricks.agents.review_app.LabelingSession(name: str, assigned_users: list[str], agent: str | None, label_schemas: list[str], labeling_session_id: str, mlflow_run_id: str, review_app_id: str, experiment_id: str, url: str, enable_multi_turn_chat: bool, custom_inputs: dict[str, Any] | None = None)

Bases: object

A session for labeling items in the review app.

name: str

assigned_users: list[str]

agent: str | None

label_schemas: list[str]

labeling_session_id: str

mlflow_run_id: str

review_app_id: str

experiment_id: str

url: str

enable_multi_turn_chat: bool

custom_inputs: dict[str, Any] | None = None

add_dataset(dataset_name: str, record_ids: list[str] | None = None) → LabelingSession

Add a dataset to the labeling session.

Parameters:

dataset_name – The name of the dataset.
record_ids – Optional. The individiual record ids to be added to the session. If not provided, all records in the dataset will be added.

add_traces(traces: Iterable[Trace] | Iterable[str] | DataFrame) → LabelingSession

Add traces to the labeling session.

Parameters:: traces – Can be either: a) a pandas DataFrame with a ‘trace’ column. The ‘trace’ column should contain either mlflow.entities.Trace objects or their json string representations. b) an iterable of mlflow.entities.Trace objects. c) an iterable of json string representations of mlflow.entities.Trace objects.

sync_expectations(to_dataset: str) → None: Sync the expectations from the labeling session to a dataset.

set_assigned_users(assigned_users: list[str]) → LabelingSession: Set the assigned users for the labeling session.

classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) → A

classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) → A

classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) → SchemaF[A]

to_dict(encode_json=False) → Dict[str, dict | list | str | int | float | bool | None]

to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: str | int | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) → str

class databricks.agents.review_app.ReviewApp(review_app_id: str, experiment_id: str, url: str, agents: list[~databricks.rag_eval.review_app.entities.Agent] = <factory>, label_schemas: list[~databricks.rag_eval.review_app.entities.LabelSchema] = <factory>)

Bases: object

A review app is used to collect feedback from stakeholders for a given experiment.

review_app_id

The ID of the review app.

Type:: str

experiment_id

The ID of the experiment.

Type:: str

url

The URL of the review app for stakeholders to provide feedback.

Type:: str

agents

The agents to be used to generate responses.

Type:: list[databricks.rag_eval.review_app.entities.Agent]

label_schemas

The label schemas to be used in the review app.

Type:: list[databricks.rag_eval.review_app.entities.LabelSchema]

review_app_id: str

experiment_id: str

url: str

agents: list[Agent]

label_schemas: list[LabelSchema]

add_agent(*, agent_name: str, model_serving_endpoint: str, overwrite: bool = False) → ReviewApp: Add an agent to the review app to be used to generate responses.

remove_agent(agent_name: str) → ReviewApp: Remove an agent from the review app.

create_label_schema(name: str, *, type: Literal['feedback', 'expectation'], title: str, input: InputCategorical | InputCategoricalList | InputText | InputTextList | InputNumeric, instruction: str | None = None, enable_comment: bool = False, overwrite: bool = False) → LabelSchema

Create a new label schema for the review app.

A label schema defines the type of input that stakeholders will provide when labeling items in the review app.

Parameters:

name – The name of the label schema. Must be unique across the review app.
type – The type of the label schema. Either “feedback” or “expectation”.
title – The title of the label schema shown to stakeholders.
input – The input type of the label schema.
instruction – Optional. The instruction shown to stakeholders.
enable_comment – Optional. Whether to enable comments for the label schema.
overwrite – Optional. Whether to overwrite the existing label schema with the same name.

delete_label_schema(label_schema_name: str) → ReviewApp: Delete a label schema from the review app.

create_labeling_session(name: str, *, assigned_users: list[str] = [], agent: str | None = None, label_schemas: list[str] = [], enable_multi_turn_chat: bool = False, custom_inputs: dict[str, Any] | None = None) → LabelingSession

Create a new labeling session in the review app.

Parameters:

name – The name of the labeling session.
assigned_users – The users that will be assigned to label items in the session.
agent – The agent to be used to generate responses for the items in the session.
label_schemas – The label schemas to be used in the session.
enable_multi_turn_chat – Whether to enable multi-turn chat labeling for the session.
custom_inputs – Optional. Custom inputs to be used in the session.

get_labeling_sessions() → list[LabelingSession]: Get all labeling sessions in the review app.

delete_labeling_session(labeling_session: LabelingSession) → ReviewApp: Delete a labeling session from the review app.

classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) → A

classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) → A

classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) → SchemaF[A]

to_dict(encode_json=False) → Dict[str, dict | list | str | int | float | bool | None]

to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: str | int | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) → str

Label Schemas

Label schemas for configuring the Review App.

class databricks.agents.review_app.label_schemas.LabelSchemaType(value)

Bases: StrEnum

Type of label schema.

FEEDBACK = 'feedback'

EXPECTATION = 'expectation'

class databricks.agents.review_app.label_schemas.LabelSchema(name: str, type: Literal['feedback', 'expectation'], title: str, input: InputCategorical | InputCategoricalList | InputText | InputTextList | InputNumeric, instruction: str | None = None, enable_comment: bool = False)

Bases: object

A label schema for collecting input from stakeholders.

name: str

type: Literal['feedback', 'expectation']

title: str

input: InputCategorical | InputCategoricalList | InputText | InputTextList | InputNumeric

instruction: str | None = None

enable_comment: bool = False

classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) → A

classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) → A

classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) → SchemaF[A]

to_dict(encode_json=False) → Dict[str, dict | list | str | int | float | bool | None]

to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: str | int | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) → str

class databricks.agents.review_app.label_schemas.InputCategorical(options: list[str])

Bases: InputType

A single-select dropdown for collecting assessments from stakeholders.

options: list[str]

classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) → A

classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) → A

classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) → SchemaF[A]

to_dict(encode_json=False) → Dict[str, dict | list | str | int | float | bool | None]

to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: str | int | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) → str

class databricks.agents.review_app.label_schemas.InputCategoricalList(options: list[str])

Bases: InputType

A multi-select dropdown for collecting assessments from stakeholders.

options: list[str]

classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) → A

classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) → A

classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) → SchemaF[A]

to_dict(encode_json=False) → Dict[str, dict | list | str | int | float | bool | None]

to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: str | int | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) → str

class databricks.agents.review_app.label_schemas.InputNumeric(min_value: float | None = None, max_value: float | None = None)

Bases: InputType

A numeric input for collecting assessments from stakeholders.

min_value: float | None = None

max_value: float | None = None

classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) → A

classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) → A

classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) → SchemaF[A]

to_dict(encode_json=False) → Dict[str, dict | list | str | int | float | bool | None]

to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: str | int | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) → str

class databricks.agents.review_app.label_schemas.InputText(max_length: int | None = None)

Bases: InputType

A free-form text box for collecting assessments from stakeholders.

max_length: int | None = None

classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) → A

classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) → A

classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) → SchemaF[A]

to_dict(encode_json=False) → Dict[str, dict | list | str | int | float | bool | None]

to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: str | int | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) → str

class databricks.agents.review_app.label_schemas.InputTextList(max_length_each: int | None = None, max_count: int | None = None)

Bases: InputType

Like Working with text data, but allows multiple entries.

max_length_each: int | None = None

max_count: int | None = None

classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) → A

classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) → A

classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) → SchemaF[A]

to_dict(encode_json=False) → Dict[str, dict | list | str | int | float | bool | None]

to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: str | int | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) → str

databricks.agents.review_app.label_schemas.EXPECTED_FACTS = "expected_facts"

databricks.agents.review_app.label_schemas.GUIDELINES = "guidelines"

databricks.agents.review_app.label_schemas.EXPECTED_RESPONSE = "expected_response"