Lakehouse Monitoring for GenAI

For more details, see “What is Lakehouse Monitoring for generative AI?”

databricks.agents.monitoring.create_external_monitor(*, catalog_name: str, schema_name: str, assessments_config: AssessmentsSuiteConfig | dict, experiment_id: str | None = None, experiment_name: str | None = None) ExternalMonitor

Create a monitor for a GenAI application served outside Databricks.

Parameters:
  • catalog_name (str) – The name of the catalog in UC to create the checkpoint table in.

  • schema_name (str) – The name of the schema in UC to create the checkpoint table in.

  • assessments_config (AssessmentsSuiteConfig | dict) – The configuration for the suite of assessments to be run on traces from the GenAI application.

  • experiment_id (str | None, optional) – ID of the MLflow experiment that the monitor should be associated with. Defaults to the currently active experiment.

  • experiment_name (str | None, optional) – The name of the MLflow experiment that the monitor should be associated with. Defaults to the currently active experiment.

Returns:

The created monitor.

Return type:

ExternalMonitor
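
As a rough illustration of the signature above, the sketch below wraps the call in a helper. The helper name create_example_external_monitor, the sample value, and the guideline text are placeholders rather than anything prescribed by the library. The import is deferred so that the snippet loads even where the databricks-agents package is not installed; calling the helper requires that package and a configured Databricks workspace.

```python
def create_example_external_monitor(catalog_name: str, schema_name: str):
    """Create an external monitor with two builtin judges and one guidelines judge.

    Hypothetical helper for illustration only.
    """
    # Deferred import: only needed when the helper is actually invoked.
    from databricks.agents.monitoring import (
        AssessmentsSuiteConfig,
        BuiltinJudge,
        GuidelinesJudge,
        create_external_monitor,
    )

    config = AssessmentsSuiteConfig(
        sample=0.5,  # placeholder; must be in (0.0, 1.0]
        assessments=[
            BuiltinJudge(name="safety"),
            BuiltinJudge(name="relevance_to_query"),
            # Placeholder guideline name and text.
            GuidelinesJudge(guidelines={"english": ["The response must be in English."]}),
        ],
    )
    # experiment_id / experiment_name are omitted, so the documented default
    # (the currently active MLflow experiment) applies.
    return create_external_monitor(
        catalog_name=catalog_name,
        schema_name=schema_name,
        assessments_config=config,
    )
```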

databricks.agents.monitoring.delete_external_monitor(*, experiment_id: str | None = None, experiment_name: str | None = None) None

Deletes the monitor for a GenAI application served outside Databricks.

Parameters:
  • experiment_id (str | None, optional) – ID of the MLflow experiment that the monitor is associated with. Defaults to None.

  • experiment_name (str | None, optional) – Name of the MLflow experiment that the monitor is associated with. Defaults to None.

databricks.agents.monitoring.get_external_monitor(*, experiment_id: str | None = None, experiment_name: str | None = None) ExternalMonitor

Gets the monitor for a GenAI application served outside Databricks.

Parameters:
  • experiment_id (str | None, optional) – ID of the MLflow experiment that the monitor is associated with. Defaults to None.

  • experiment_name (str | None, optional) – Name of the MLflow experiment that the monitor is associated with. Defaults to None.

Raises:
  • ValueError – When neither experiment_id nor experiment_name is provided.

  • ValueError – When no monitor is found for the given experiment_id or experiment_name.

Returns:

The retrieved external monitor.

Return type:

entities.ExternalMonitor
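
Because a missing monitor surfaces as a ValueError, one possible pattern is "get, otherwise create". The sketch below is an assumption about how a caller might structure this, not part of the library: get_or_create is a hypothetical helper, and the lookup and creation functions are passed in as arguments so the logic can be exercised without a workspace.

```python
from typing import Any, Callable


def get_or_create(
    get_fn: Callable[..., Any],
    create_fn: Callable[..., Any],
    *,
    experiment_name: str,
    **create_kwargs: Any,
) -> Any:
    """Return the existing monitor, or create one if the lookup raises ValueError."""
    try:
        return get_fn(experiment_name=experiment_name)
    except ValueError:
        # Documented ValueError cases: no identifier given, or no monitor found.
        # An identifier is always passed here, so this is treated as "not found".
        return create_fn(experiment_name=experiment_name, **create_kwargs)
```

In real use, get_fn and create_fn would be get_external_monitor and create_external_monitor, with catalog_name, schema_name, and assessments_config supplied as keyword arguments.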

databricks.agents.monitoring.update_external_monitor(*, experiment_id: str | None = None, experiment_name: str | None = None, assessments_config: AssessmentsSuiteConfig | dict) ExternalMonitor

Updates the monitor for a GenAI application served outside Databricks.

Parameters:
  • assessments_config (AssessmentsSuiteConfig | dict) – The updated configuration for the suite of assessments to be run on traces from the GenAI application. Partial updates of arrays are not supported, so the assessments specified here replace your monitor’s existing assessments. Non-nested fields such as sample are not updated if left unspecified.

  • experiment_id (str | None, optional) – ID of the MLflow experiment that the monitor is associated with. Defaults to None.

  • experiment_name (str | None, optional) – Name of the MLflow experiment that the monitor is associated with. Defaults to None.

Raises:

ValueError – When assessments_config is not provided.

Returns:

The updated external monitor.

Return type:

entities.ExternalMonitor
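
The update rule described for assessments_config can be modelled on plain dictionaries. The sketch below is only an illustration of the documented behaviour, not the library's implementation; apply_update and the string entries standing in for judges are hypothetical.

```python
from typing import Any, Optional


def apply_update(current: dict[str, Any], update: dict[str, Optional[Any]]) -> dict[str, Any]:
    """Model the documented update rule on plain dicts (illustrative only).

    - A field left as None in `update` keeps its current value.
    - `assessments` is replaced wholesale; list entries are never merged.
    """
    merged = dict(current)
    for key, value in update.items():
        if value is not None:
            merged[key] = value  # lists included: full replacement, no merging
    return merged


current = {"sample": 0.5, "paused": False, "assessments": ["safety", "groundedness"]}
update = {"sample": None, "paused": None, "assessments": ["relevance_to_query"]}
result = apply_update(current, update)
```

Here sample and paused keep their previous values, while the assessments list is replaced by the new single entry, so any judge that should remain active has to be listed again in the update.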

databricks.agents.monitoring.create_monitor(endpoint_name: str, *, monitoring_config: dict | MonitoringConfig, experiment_id: str | None = None) Monitor

Create a monitor for a Databricks serving endpoint.

Parameters:
  • endpoint_name – The name of the serving endpoint.

  • monitoring_config – The monitoring configuration.

  • experiment_id – The experiment ID to log the monitoring results. Defaults to the experiment used to log the model that is serving the provided endpoint_name.

Returns:

The monitor for the serving endpoint.

databricks.agents.monitoring.delete_monitor(*, endpoint_name: str | None = None) None

Deletes a monitor for a Databricks serving endpoint.

Parameters:

endpoint_name (str, optional) – The name of the agent’s serving endpoint.

databricks.agents.monitoring.get_monitor(*, endpoint_name: str) Monitor

Retrieves a monitor for a Databricks serving endpoint.

Parameters:

endpoint_name (str) – The name of the agent’s serving endpoint.

Returns:

Monitor | ExternalMonitor metadata. For external monitors, this will include the status of the ingestion endpoint.

databricks.agents.monitoring.update_monitor(*, endpoint_name: str, monitoring_config: dict | MonitoringConfig) Monitor

Partially update a monitor for a serving endpoint.

Parameters:
  • endpoint_name (str) – The name of the agent’s serving endpoint. Only supported for agents served on Databricks.

  • monitoring_config – The configuration change, using upsert semantics.

Returns:

The updated monitor for the serving endpoint.

Return type:

Monitor
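
As a minimal sketch of the upsert semantics, the helper below pauses a monitor by setting only the paused field. The name pause_monitor is hypothetical, and the import is deferred so the snippet loads without the databricks-agents package; calling it requires that package and an existing monitor on the named endpoint.

```python
def pause_monitor(endpoint_name: str):
    """Pause a monitor without touching its other settings (hypothetical helper)."""
    # Deferred import: only needed when the helper is actually invoked.
    from databricks.agents.monitoring import MonitoringConfig, update_monitor

    # Only `paused` is set. The remaining fields stay None and, under the
    # documented upsert semantics, are left unchanged on the monitor.
    return update_monitor(
        endpoint_name=endpoint_name,
        monitoring_config=MonitoringConfig(paused=True),
    )
```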

class databricks.agents.monitoring.AssessmentsSuiteConfig(sample: float | None = None, paused: bool | None = None, assessments: list[AssessmentConfig] | None = None)

Bases: object

Configuration for a suite of assessments to be run on traces from a GenAI application.

Raises:
  • ValueError – When sample is not between 0.0 (exclusive) and 1.0 (inclusive).

  • ValueError – When more than one guidelines judge is found.

  • ValueError – When duplicate builtin judges are found.

sample: float | None = None
paused: bool | None = None
assessments: list[AssessmentConfig] | None = None
classmethod from_dict(data: dict)
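
The three documented ValueError conditions can be summarised as a small stand-in check. This is a sketch of the rules as stated above, not the library's own validation code; check_suite and its parameters are hypothetical.

```python
from typing import Optional, Sequence


def check_suite(
    sample: Optional[float],
    builtin_names: Sequence[str],
    guidelines_judge_count: int,
) -> None:
    """Raise ValueError for the documented invalid suite configurations."""
    # sample must lie in (0.0, 1.0]: 0.0 is rejected, 1.0 is accepted.
    if sample is not None and not (0.0 < sample <= 1.0):
        raise ValueError("sample must be greater than 0.0 and at most 1.0")
    # At most one guidelines judge per suite.
    if guidelines_judge_count > 1:
        raise ValueError("more than one guidelines judge found")
    # Each builtin judge may appear only once.
    if len(set(builtin_names)) != len(builtin_names):
        raise ValueError("duplicate builtin judges found")


check_suite(1.0, ["safety", "groundedness"], 1)  # valid: passes silently
```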
class databricks.agents.monitoring.GuidelinesJudge(guidelines: dict[str, list[str]])

Bases: AssessmentConfig

Configuration for a guideline adherence judge to be run on traces from a GenAI application.

guidelines: dict[str, list[str]]
class databricks.agents.monitoring.BuiltinJudge(name: Literal['safety', 'groundedness', 'relevance_to_query', 'chunk_relevance'])

Bases: AssessmentConfig

Configuration for a builtin judge to be run on traces from a GenAI application.

Raises:

ValueError – When the judge name is invalid.

name: Literal['safety', 'groundedness', 'relevance_to_query', 'chunk_relevance']
class databricks.agents.monitoring.Monitor(endpoint_name: str, evaluated_traces_table: str, monitoring_config: MonitoringConfig, experiment_id: str, workspace_path: str)

Bases: object

The monitor for a serving endpoint.

endpoint_name: str
evaluated_traces_table: str
monitoring_config: MonitoringConfig
experiment_id: str
workspace_path: str
property monitoring_page_url: str
classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) A
classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) A
classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) SchemaF[A]
to_dict(encode_json=False) Dict[str, dict | list | str | int | float | bool | None]
to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: int | str | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) str
class databricks.agents.monitoring.MonitoringConfig(sample: float | None = None, metrics: list[str] | None = None, periodic: PeriodicMonitoringConfig | None = None, paused: bool | None = None, global_guidelines: dict[str, list[str]] | None = None)

Bases: object

Configuration for monitoring a GenAI application. All fields are optional for upsert semantics.

periodic is deprecated and will be removed in a future release.

sample: float | None = None
metrics: list[str] | None = None
periodic: PeriodicMonitoringConfig | None = None
paused: bool | None = None
global_guidelines: dict[str, list[str]] | None = None
classmethod from_assessments_suite_config(assessments_suite_config: AssessmentsSuiteConfig) MonitoringConfig
to_assessments_suite_config() AssessmentsSuiteConfig
classmethod from_dict(kvs: dict | list | str | int | float | bool | None, *, infer_missing=False) A
classmethod from_json(s: str | bytes | bytearray, *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) A
classmethod schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) SchemaF[A]
to_dict(encode_json=False) Dict[str, dict | list | str | int | float | bool | None]
to_json(*, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: int | str | None = None, separators: Tuple[str, str] | None = None, default: Callable | None = None, sort_keys: bool = False, **kw) str
class databricks.agents.monitoring.SchedulePauseStatus(value)

Bases: StrEnum

An enumeration.

UNPAUSED = 'UNPAUSED'
PAUSED = 'PAUSED'