databricks.vector_search package

class databricks.vector_search.client.VectorSearchClient(workspace_url=None, personal_access_token=None, service_principal_client_id=None, service_principal_client_secret=None, azure_tenant_id=None, azure_login_id='2ff814a6-3304-4ab8-85cb-cd0e6f879c1d', disable_notice=False, credential_strategy=None)

Bases: object

A client for interacting with the Vector Search service.

This client provides methods for managing endpoints and indexes in the Vector Search service.

Initialize the VectorSearchClient.

Parameters:
  • workspace_url (str) – The URL of the Databricks workspace (e.g., “https://your-workspace.databricks.com”). Optional if running inside a Databricks notebook (auto-detected).

  • personal_access_token (str) – Personal access token for authentication. Used with workspace_url for PAT auth. Optional if running inside a Databricks notebook (auto-detected).

  • service_principal_client_id (str) – OAuth client ID of the service principal. Required for service principal auth.

  • service_principal_client_secret (str) – OAuth client secret of the service principal. Required for service principal auth.

  • azure_tenant_id (str) – Azure tenant ID for Azure-based authentication. Required only for Azure service principal auth.

  • azure_login_id (str) – Azure login ID (Databricks Azure Application ID) for authentication. Defaults to “2ff814a6-3304-4ab8-85cb-cd0e6f879c1d” (AZURE_PUBLIC). See all login IDs: https://github.com/databricks/databricks-sdk-py/blob/main/databricks/sdk/environments.py

  • disable_notice (bool) – Whether to disable authentication notice messages. Default is False.

  • credential_strategy (CredentialStrategy) – Credential strategy for specialized authentication scenarios. Use CredentialStrategy.MODEL_SERVING_USER_CREDENTIALS to authenticate with the calling user’s credentials in Model Serving environments.

Note

Authentication Methods

The client supports multiple authentication methods. Choose one based on your environment:

1. Auto-detection (Recommended for Databricks Notebooks)

When no credentials are provided, the client automatically detects credentials from the Databricks notebook environment. This is the simplest method when running inside Databricks.

Example:

from databricks.vector_search.client import VectorSearchClient

# Credentials are automatically detected from notebook context
client = VectorSearchClient()

2. Personal Access Token (PAT)

Use a personal access token for user-based authentication. Requires both workspace_url and personal_access_token.

Example:

from databricks.vector_search.client import VectorSearchClient

client = VectorSearchClient(
    workspace_url="https://your-workspace.databricks.com",
    personal_access_token="dapi..."
)

3. Service Principal (OAuth M2M) - AWS & GCP

Use OAuth Machine-to-Machine authentication with a service principal for automated workflows. Requires workspace_url, service_principal_client_id, and service_principal_client_secret.

Example:

from databricks.vector_search.client import VectorSearchClient

client = VectorSearchClient(
    workspace_url="https://your-workspace.databricks.com",
    service_principal_client_id="your-client-id",
    service_principal_client_secret="your-client-secret"
)

4. Azure Service Principal

For Azure Databricks, use Azure AD authentication by passing azure_tenant_id in addition to the service principal credentials.

Example:

from databricks.vector_search.client import VectorSearchClient

client = VectorSearchClient(
    workspace_url="https://adb-1234567890123456.7.azuredatabricks.net",
    service_principal_client_id="your-azure-app-id",
    service_principal_client_secret="your-azure-app-secret",
    azure_tenant_id="your-azure-tenant-id"
)

5. Model Serving with User Credentials

When calling from within Databricks Model Serving, use the invoking user’s credentials for fine-grained access control.

Example:

from databricks.vector_search.client import VectorSearchClient
from databricks.vector_search.utils import CredentialStrategy

client = VectorSearchClient(
    credential_strategy=CredentialStrategy.MODEL_SERVING_USER_CREDENTIALS
)

Additional Notes:

  • Service principal authentication requires workspace_url to be explicitly provided

  • Personal access token authentication also requires workspace_url

  • When using auto-detection in notebooks, credentials are inferred from the execution context

  • Azure authentication requires both azure_tenant_id and standard service principal credentials
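One way to choose among the authentication methods above is to assemble the constructor arguments from the environment. The following is a minimal sketch, not part of the library: the helper name client_kwargs_from_env and the environment variable names are assumptions made for illustration. Only the keyword names come from the parameter list above.

```python
import os


def client_kwargs_from_env(env=None):
    """Build VectorSearchClient keyword arguments from environment variables.

    The variable names used here are assumptions for this sketch; they are
    not names that the client is documented to read by itself.
    """
    env = os.environ if env is None else env
    url = env.get("DATABRICKS_HOST")
    client_id = env.get("DATABRICKS_CLIENT_ID")
    client_secret = env.get("DATABRICKS_CLIENT_SECRET")
    token = env.get("DATABRICKS_TOKEN")
    kwargs = {}

    if client_id and client_secret:
        # Service principal (OAuth M2M): workspace_url must be explicit.
        if not url:
            raise ValueError("service principal auth requires workspace_url")
        kwargs.update(
            workspace_url=url,
            service_principal_client_id=client_id,
            service_principal_client_secret=client_secret,
        )
        tenant = env.get("AZURE_TENANT_ID")
        if tenant:
            # Azure service principal auth additionally needs the tenant ID.
            kwargs["azure_tenant_id"] = tenant
    elif token:
        # Personal access token auth also requires workspace_url.
        if not url:
            raise ValueError("PAT auth requires workspace_url")
        kwargs.update(workspace_url=url, personal_access_token=token)

    # With no credentials found, the empty dict leaves auto-detection in charge.
    return kwargs
```

Inside a notebook the result would typically be empty, so VectorSearchClient(**client_kwargs_from_env()) falls back to auto-detection.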

create_delta_sync_index(endpoint_name, index_name, primary_key, source_table_name, pipeline_type, embedding_dimension=None, embedding_vector_column=None, embedding_source_column=None, embedding_model_endpoint_name=None, sync_computed_embeddings=False, columns_to_sync=None, model_endpoint_name_for_query=None, budget_policy_id=None, usage_policy_id=None)

Create a delta sync index.

Parameters:
  • columns_to_sync (str) – The columns to sync to the vector index. The primary key and vector column are always synced. If this field is not set, all columns are synced.

  • endpoint_name (str) – The name of the endpoint.

  • index_name (str) – The name of the index.

  • primary_key (str) – The primary key of the index.

  • source_table_name (str) – The name of the source table.

  • pipeline_type (str) – The type of the pipeline. Must be CONTINUOUS or TRIGGERED.

  • embedding_dimension (int) – The dimension of the embedding vector.

  • embedding_vector_column (str) – The name of the embedding vector column.

  • embedding_source_column (str) – The name of the embedding source column.

  • embedding_model_endpoint_name (str) – The name of the embedding model endpoint.

  • sync_computed_embeddings (bool) – Whether to automatically sync the vector index contents and computed embeddings to a new Unity Catalog (UC) table. The table is named ${index_name}_writeback_table.

  • model_endpoint_name_for_query (str) – When set, queries will use this embedding model instead of the embedding_model_endpoint_name. If unset, queries continue to use embedding_model_endpoint_name.

  • budget_policy_id (str) – The budget policy ID to associate with this index for cost tracking and management (deprecated, use usage_policy_id).

  • usage_policy_id (str) – The usage policy ID to associate with this index for cost tracking and management.
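The embedding parameters above suggest two ways to configure a delta sync index: embeddings computed by a model endpoint from a source column, or precomputed vectors stored in the source table. The sketch below illustrates both. The helper names are hypothetical, client is assumed to be a VectorSearchClient, and the pairing of parameters in each helper is an assumption drawn from the parameter descriptions rather than a documented rule.

```python
def create_managed_embedding_index(client, endpoint_name, index_name,
                                   source_table_name, primary_key,
                                   text_column, embedding_endpoint):
    """Create a TRIGGERED index whose embeddings are computed from a text column."""
    return client.create_delta_sync_index(
        endpoint_name=endpoint_name,
        index_name=index_name,
        primary_key=primary_key,
        source_table_name=source_table_name,
        pipeline_type="TRIGGERED",
        embedding_source_column=text_column,
        embedding_model_endpoint_name=embedding_endpoint,
    )


def create_precomputed_embedding_index(client, endpoint_name, index_name,
                                       source_table_name, primary_key,
                                       vector_column, dimension):
    """Create a TRIGGERED index over vectors already stored in the source table."""
    return client.create_delta_sync_index(
        endpoint_name=endpoint_name,
        index_name=index_name,
        primary_key=primary_key,
        source_table_name=source_table_name,
        pipeline_type="TRIGGERED",
        embedding_vector_column=vector_column,
        embedding_dimension=dimension,
    )
```

Passing pipeline_type="CONTINUOUS" instead selects the other documented pipeline type.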

create_delta_sync_index_and_wait(endpoint_name, index_name, primary_key, source_table_name, pipeline_type, embedding_dimension=None, embedding_vector_column=None, embedding_source_column=None, embedding_model_endpoint_name=None, sync_computed_embeddings=False, columns_to_sync=None, model_endpoint_name_for_query=None, budget_policy_id=None, usage_policy_id=None, verbose=False, timeout=datetime.timedelta(days=1))

Create a delta sync index and wait for it to be ready.

Parameters:
  • columns_to_sync (str) – The columns to sync to the vector index. The primary key and vector column are always synced. If this field is not set, all columns are synced.

  • endpoint_name (str) – The name of the endpoint.

  • index_name (str) – The name of the index.

  • primary_key (str) – The primary key of the index.

  • source_table_name (str) – The name of the source table.

  • pipeline_type (str) – The type of the pipeline. Must be CONTINUOUS or TRIGGERED.

  • embedding_dimension (int) – The dimension of the embedding vector.

  • embedding_vector_column (str) – The name of the embedding vector column.

  • embedding_source_column (str) – The name of the embedding source column.

  • embedding_model_endpoint_name (str) – The name of the embedding model endpoint.

  • verbose (bool) – Whether to print status messages.

  • timeout (datetime.timedelta) – The maximum time to wait before raising an Exception.

  • sync_computed_embeddings (bool) – Whether to automatically sync the vector index contents and computed embeddings to a new Unity Catalog (UC) table. The table is named ${index_name}_writeback_table.

  • model_endpoint_name_for_query (str) – The name of the embedding model endpoint to be used for querying, not ingestion.

  • budget_policy_id (str) – The budget policy ID to associate with this index for cost tracking and management (deprecated, use usage_policy_id).

  • usage_policy_id (str) – The usage policy ID to associate with this index for cost tracking and management.

create_direct_access_index(endpoint_name, index_name, primary_key, embedding_dimension, embedding_vector_column, schema, embedding_model_endpoint_name=None, budget_policy_id=None, usage_policy_id=None)

Create a direct access index.

Parameters:
  • endpoint_name (str) – The name of the endpoint.

  • index_name (str) – The name of the index.

  • primary_key (str) – The primary key of the index.

  • embedding_dimension (int) – The dimension of the embedding vector.

  • embedding_vector_column (str) – The name of the embedding vector column.

  • schema (dict) – The schema of the index.

  • embedding_model_endpoint_name (str) – The name of the optional embedding model endpoint to use when querying.

  • budget_policy_id (str) – The budget policy id to be applied to the index (deprecated, use usage_policy_id).

  • usage_policy_id (str) – The usage policy id to be applied to the index.
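A direct access index takes its column layout from the schema dictionary. The sketch below wraps the call with a small consistency check. The helper name, the example column names, and the type strings in EXAMPLE_SCHEMA are assumptions for illustration; client is assumed to be a VectorSearchClient.

```python
# Hypothetical schema: column name -> type string (the format is assumed).
EXAMPLE_SCHEMA = {"id": "int", "text": "string", "text_vector": "array<float>"}


def create_direct_index(client, endpoint_name, index_name, schema,
                        primary_key, vector_column, dimension):
    """Create a direct access index after checking the schema mentions both columns."""
    for column in (primary_key, vector_column):
        if column not in schema:
            raise ValueError(f"column {column!r} is not in the schema")
    return client.create_direct_access_index(
        endpoint_name=endpoint_name,
        index_name=index_name,
        primary_key=primary_key,
        embedding_dimension=dimension,
        embedding_vector_column=vector_column,
        schema=schema,
    )
```

The check only catches mismatched column names before the request is sent; it does not validate the type strings.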

create_endpoint(name, endpoint_type='STANDARD', budget_policy_id=None, usage_policy_id=None)

Create an endpoint.

Parameters:
  • name (str) – The name of the endpoint.

  • endpoint_type (str) – The type of the endpoint. Must be STANDARD or ENTERPRISE.

  • budget_policy_id (str) – The id of the budget policy to assign to the endpoint (deprecated, use usage_policy_id).

  • usage_policy_id (str) – The id of the usage policy to assign to the endpoint.

create_endpoint_and_wait(name, endpoint_type='STANDARD', budget_policy_id=None, usage_policy_id=None, verbose=False, timeout=datetime.timedelta(seconds=3600))

Create an endpoint and wait for it to be online.

Parameters:
  • name (str) – The name of the endpoint.

  • endpoint_type (str) – The type of the endpoint. Must be STANDARD or ENTERPRISE.

  • budget_policy_id (str) – The id of the budget policy to assign to the endpoint (deprecated, use usage_policy_id).

  • usage_policy_id (str) – The id of the usage policy to assign to the endpoint.

  • verbose (bool) – Whether to print status messages.

  • timeout (datetime.timedelta) – The maximum time to wait before raising an Exception.
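As an illustration of the parameters above, the following sketch creates an endpoint with a shorter timeout than the one-hour default and rejects endpoint types other than the two documented values. The helper name and the 20-minute default are hypothetical; client is assumed to be a VectorSearchClient.

```python
import datetime

VALID_ENDPOINT_TYPES = ("STANDARD", "ENTERPRISE")


def create_endpoint_blocking(client, name, endpoint_type="STANDARD",
                             usage_policy_id=None, minutes=20):
    """Create an endpoint and block until it is online or the timeout elapses."""
    if endpoint_type not in VALID_ENDPOINT_TYPES:
        raise ValueError(f"endpoint_type must be one of {VALID_ENDPOINT_TYPES}")
    return client.create_endpoint_and_wait(
        name=name,
        endpoint_type=endpoint_type,
        usage_policy_id=usage_policy_id,  # preferred over the deprecated budget_policy_id
        verbose=True,
        timeout=datetime.timedelta(minutes=minutes),
    )
```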

delete_endpoint(name)

Delete an endpoint.

Parameters:

name (str) – The name of the endpoint.

delete_index(endpoint_name=None, index_name=None)

Delete an index.

Parameters:
  • endpoint_name (Optional[str]) – The optional name of the endpoint.

  • index_name (str) – The name of the index.

get_endpoint(name)

Get an endpoint.

Parameters:

name (str) – The name of the endpoint.

get_index(endpoint_name=None, index_name=None)

Get an index.

Parameters:
  • endpoint_name (Optional[str]) – The optional name of the endpoint.

  • index_name (str) – The name of the index.
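Index objects are obtained through the client rather than constructed directly. A minimal sketch of that lookup follows, assuming client is a VectorSearchClient and that get_index returns an object exposing describe(). The helper name is hypothetical.

```python
def describe_index(client, index_name, endpoint_name=None):
    """Look up an index through the client and return its metadata."""
    index = client.get_index(endpoint_name=endpoint_name, index_name=index_name)
    return index.describe()
```

Because endpoint_name is optional, describe_index(client, "catalog.schema.my_index") is enough when the index name alone identifies the index. The three-part name here is only a placeholder.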

list_endpoints()

List all endpoints.

list_indexes(name)

List all indexes for an endpoint.

Parameters:

name (str) – The name of the endpoint.

update_endpoint_budget_policy(name, budget_policy_id=None, usage_policy_id=None)

Update an endpoint’s budget/usage policy.

Parameters:
  • name (str) – The name of the endpoint.

  • budget_policy_id (str) – The id of the budget policy to assign to the endpoint (deprecated, use usage_policy_id).

  • usage_policy_id (str) – The id of the usage policy to assign to the endpoint.

update_endpoint_usage_policy(name, usage_policy_id)

Update an endpoint’s usage policy (alias for update_endpoint_budget_policy).

Parameters:
  • name (str) – The name of the endpoint.

  • usage_policy_id (str) – The id of the usage policy to assign to the endpoint.

update_index_budget_policy(index_name, budget_policy_id=None, usage_policy_id=None)

Update the budget/usage policy of an index.

Parameters:
  • index_name (str) – The name of the index.

  • budget_policy_id (str) – The budget policy id to be applied to the index (deprecated, use usage_policy_id).

  • usage_policy_id (str) – The usage policy id to be applied to the index.

update_index_usage_policy(index_name, usage_policy_id)

Update the usage policy of an index (alias for update_index_budget_policy).

Parameters:
  • index_name (str) – The name of the index.

  • usage_policy_id (str) – The usage policy id to be applied to the index.
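Since budget_policy_id is deprecated in favor of usage_policy_id, new code can use the usage policy methods throughout. The sketch below applies one policy to an endpoint and to a set of its indexes. The helper name is hypothetical and client is assumed to be a VectorSearchClient.

```python
def apply_usage_policy(client, usage_policy_id, endpoint_name, index_names=()):
    """Assign one usage policy to an endpoint and to each listed index."""
    client.update_endpoint_usage_policy(endpoint_name, usage_policy_id)
    for index_name in index_names:
        client.update_index_usage_policy(index_name, usage_policy_id)
```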

validate(disable_notice=False)

Validate the client configuration.

Parameters:

disable_notice (bool) – Whether to disable the authentication notice message.

wait_for_endpoint(name, verbose=False, timeout=datetime.timedelta(seconds=3600))

Wait for an endpoint to be online.

Parameters:
  • name (str) – The name of the endpoint.

  • verbose (bool) – Whether to print status messages.

  • timeout (datetime.timedelta) – The maximum time to wait before raising an Exception.

class databricks.vector_search.index.VectorSearchIndex(workspace_url: str, index_url: str, name: str, endpoint_name: str, mlserving_endpoint_name: str | None = None, personal_access_token: str | None = None, service_principal_client_id: str | None = None, service_principal_client_secret: str | None = None, azure_tenant_id: str | None = None, azure_login_id: str | None = None, use_user_passed_credentials: bool = False, credential_strategy: CredentialStrategy | None = None, get_reranker_url_callable: callable | None = None, mlserving_endpoint_name_for_query: str | None = None, total_retries: int = 3, backoff_factor: float = 1, backoff_jitter: float = 0.2)

Bases: object

VectorSearchIndex is a helper class that represents a Vector Search Index.

Do not instantiate this class directly. Obtain instances through the VectorSearchClient class instead.

Initialize a VectorSearchIndex instance.

Parameters:
  • workspace_url (str) – The URL of the Databricks workspace.

  • index_url (str) – The direct URL to the vector search index endpoint.

  • name (str) – The name of the vector search index.

  • endpoint_name (str) – The name of the vector search endpoint.

  • mlserving_endpoint_name (str) – The name of the model serving endpoint used for embedding generation during ingestion.

  • personal_access_token (str) – Personal access token for authentication.

  • service_principal_client_id (str) – Service principal client ID for authentication.

  • service_principal_client_secret (str) – Service principal client secret for authentication.

  • azure_tenant_id (str) – Azure tenant ID for Azure-based authentication.

  • azure_login_id (str) – Azure login ID (Databricks Azure Application ID) for authentication.

  • use_user_passed_credentials (bool) – Whether credentials were explicitly provided by the user (True) or inferred automatically (False).

  • credential_strategy (CredentialStrategy) – The credential strategy to use for authentication.

  • get_reranker_url_callable (callable) – A callable function to retrieve the reranker-compatible index URL when needed.

  • mlserving_endpoint_name_for_query (str) – The name of the model serving endpoint to use for queries (if different from ingestion endpoint).

  • total_retries (int) – Total number of retries for requests. Defaults to 3.

  • backoff_factor (float) – Backoff factor for retry delays. Defaults to 1.

  • backoff_jitter (float) – Random jitter proportion (0-1) to add to backoff delays. Defaults to 0.2.
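The retry settings above describe an exponential backoff. As a sketch of the arithmetic, the function below evaluates the delay formula given in the query parameter documentation, backoff_factor * (2 ** (retry_count - 1)) seconds, without jitter. It assumes retry_count starts at 0, which is the reading that reproduces the documented example of 0.5s, 1s, 2s, 4s for backoff_factor=1. The function name is hypothetical and this is not the library's implementation.

```python
def backoff_delays(backoff_factor, total_retries):
    """Return the base delay in seconds before each retry, ignoring jitter.

    Assumes retry_count starts at 0 (an assumption of this sketch).
    """
    return [
        float(backoff_factor) * 2 ** (retry_count - 1)
        for retry_count in range(total_retries)
    ]


print(backoff_delays(1, 4))  # [0.5, 1.0, 2.0, 4.0]
```

With the defaults of total_retries=3 and backoff_factor=1, the same formula gives base delays of 0.5, 1.0, and 2.0 seconds before jitter is applied.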

delete(primary_keys)

Delete data from the index.

Parameters:

primary_keys – List of primary keys to delete from the index.

describe()

Describe the index. This returns metadata about the index.

scan(num_results=10, last_primary_key=None)

Given all the data in the index sorted by primary key, this returns the next num_results rows after the primary key specified by last_primary_key. If last_primary_key is None, it returns the first num_results rows.

Note that if there are ongoing updates to the index, the scan results may not be consistent.

Parameters:
  • num_results – Number of results to return.

  • last_primary_key – The last primary key from the previous page. It is used as the exclusive starting primary key.
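The pagination contract described above can be sketched as a generator that feeds each page's last primary key into the next call. The response shape used here, a dictionary with "data" and "last_primary_key" entries, is an assumption and should be checked against the actual scan response. The helper name is hypothetical, and index is assumed to be a VectorSearchIndex.

```python
def scan_all(index, page_size=100):
    """Yield every row in the index, fetching one page at a time."""
    last_key = None
    while True:
        page = index.scan(num_results=page_size, last_primary_key=last_key)
        rows = page.get("data") or []  # assumed response field
        yield from rows
        next_key = page.get("last_primary_key")  # assumed response field
        # Stop on an empty page, a missing key, or a key that did not advance.
        if not rows or next_key is None or next_key == last_key:
            break
        last_key = next_key
```

As noted above, rows may be missed or repeated if the index is being updated while the scan runs.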

scan_index(num_results=10, last_primary_key=None)

Deprecated since version 0.36: This will be removed in 0.37. Use the scan function instead.

similarity_search(columns, query_text=None, query_vector=None, filters=None, ...)

Perform a similarity search on the index. This returns the top K results that are most similar to the query.

Parameters:
  • columns – List of column names to return in the results.

  • query_text – Query text to search for.

  • query_vector – Query vector to search for.

  • filters – Filters to apply to the query.

  • num_results – Number of results to return.

  • debug_level – Debug level to use for the query.

  • score_threshold – Score threshold to use for the query.

  • query_type – Query type of this query. Choices are “ANN” and “HYBRID”.

  • columns_to_rerank – List of column names to use for reranking the results.

  • disable_notice – Whether to disable the notice message.

  • reranker – Reranker to use for the query.

  • total_retries – Total number of retries for the request. Set to 0 to disable retries.

  • backoff_factor – Backoff factor to apply between retry attempts. The delay between retries is calculated as {backoff_factor} * (2 ** (retry_count - 1)) seconds, with retry_count starting at 0. For example, with backoff_factor=1, the delays are 0.5s, 1s, 2s, 4s, and so on.

  • backoff_jitter – Random jitter to add to backoff delays to avoid thundering herd problem. Value between 0 and 1 representing the proportion of jitter to apply.
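The parameters above describe the index's similarity search call, which is named similarity_search in the databricks-vectorsearch package. A minimal sketch of a text query follows. The helper name is hypothetical, index is assumed to be a VectorSearchIndex obtained from the client, and the filter format in the usage note is an assumption.

```python
def search_text(index, query, columns, num_results=5, filters=None):
    """Run a HYBRID text query and return the raw response."""
    return index.similarity_search(
        columns=columns,          # columns to return in the results
        query_text=query,         # use query_vector instead for precomputed vectors
        filters=filters,
        num_results=num_results,
        query_type="HYBRID",      # the other documented choice is "ANN"
    )
```

A call might look like search_text(index, "how do I rotate a token?", ["id", "text"], filters={"category": "security"}), where the column names and the filter dictionary are placeholders.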

sync()

Sync the index with the source Delta table. This only works with a managed delta sync index whose pipeline_type is "TRIGGERED".

upsert(inputs)

Upsert data into the index.

Parameters:

inputs – List of dictionaries to upsert into the index.
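For larger loads, the list of dictionaries can be sent in several smaller upsert calls. The sketch below shows one way to do that. The helper name and the batch size are hypothetical, no request size limit is implied, and index is assumed to be a VectorSearchIndex.

```python
def upsert_in_batches(index, rows, batch_size=100):
    """Upsert rows (a list of dictionaries) in fixed-size batches."""
    responses = []
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        responses.append(index.upsert(batch))
    return responses
```

Rows can later be removed with index.delete(primary_keys=[...]), passing the primary key values of the rows to drop.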

wait_until_ready(verbose=False, timeout=datetime.timedelta(days=1), wait_for_updates=False)

Wait for the index to be online.

Parameters:
  • verbose (bool) – Whether to print status messages.

  • timeout (datetime.timedelta) – The maximum time to wait before raising an Exception.

  • wait_for_updates (bool) – If True, also wait for any pending updates to the index to complete.
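For a TRIGGERED delta sync index, sync() and wait_until_ready() can be combined into a blocking refresh. The sketch below assumes that waiting with wait_for_updates=True covers the update started by the preceding sync() call, which should be verified for a given deployment. The helper name and the 60-minute default are hypothetical.

```python
import datetime


def refresh_index(index, minutes=60):
    """Trigger a sync and block until the index reports ready."""
    index.sync()
    index.wait_until_ready(
        verbose=True,
        timeout=datetime.timedelta(minutes=minutes),
        wait_for_updates=True,
    )
```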