databricks.ai_search package
The canonical Python module for Databricks AI Search.
The classes AISearchClient / AISearchIndex / AISearchException are
the canonical names; VectorSearchClient / VectorSearchIndex /
VectorSearchException are preserved as backward-compat aliases (each is
the same class, so isinstance and is checks work either way).
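The aliasing behavior described above can be illustrated with a minimal sketch. The class below is a stand-in defined only for this example, not the real AISearchClient; it shows why an alias that is bound to the same class object passes both identity and isinstance checks.

```python
# Stand-in class for illustration only; the real AISearchClient lives in
# databricks.ai_search.client and is not imported here.
class AISearchClient:
    pass


# A backward-compat alias is a second name bound to the same class object.
VectorSearchClient = AISearchClient

client = VectorSearchClient()

# Both names refer to one class, so identity and isinstance checks agree.
same_class = VectorSearchClient is AISearchClient
is_instance = isinstance(client, AISearchClient)
print(same_class, is_instance)
```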
- class databricks.ai_search.client.AISearchClient(workspace_url=None, personal_access_token=None, service_principal_client_id=None, service_principal_client_secret=None, azure_tenant_id=None, azure_login_id='2ff814a6-3304-4ab8-85cb-cd0e6f879c1d', disable_notice=False, credential_strategy=None)
Bases:
object
A client for interacting with the AI Search service.
This client provides methods for managing endpoints and indexes in the AI Search service.
Initialize the AISearchClient.
- Parameters:
workspace_url (str) – The URL of the Databricks workspace (e.g., “https://your-workspace.databricks.com”). Optional if running inside a Databricks notebook (auto-detected).
personal_access_token (str) – Personal access token for authentication. Used with workspace_url for PAT auth. Optional if running inside a Databricks notebook (auto-detected).
service_principal_client_id (str) – OAuth client ID of the service principal. Required for service principal auth.
service_principal_client_secret (str) – OAuth client secret of the service principal. Required for service principal auth.
azure_tenant_id (str) – Azure tenant ID for Azure-based authentication. Required only for Azure service principal auth.
azure_login_id (str) – Azure login ID (Databricks Azure Application ID) for authentication. Defaults to “2ff814a6-3304-4ab8-85cb-cd0e6f879c1d” (AZURE_PUBLIC). See all login IDs: https://github.com/databricks/databricks-sdk-py/blob/main/databricks/sdk/environments.py
disable_notice (bool) – Whether to disable authentication notice messages. Default is False.
credential_strategy (CredentialStrategy) – Credential strategy for specialized authentication scenarios. Use CredentialStrategy.MODEL_SERVING_USER_CREDENTIALS to authenticate with the calling user’s credentials in Model Serving environments.
Note
Authentication Methods
The client supports multiple authentication methods. Choose one based on your environment:
- 1. Auto-detection (Recommended for Databricks Notebooks)
When no credentials are provided, the client automatically detects credentials from the Databricks notebook environment. This is the simplest method when running inside Databricks.
Example:
    from databricks.ai_search.client import AISearchClient

    # Credentials are automatically detected from notebook context
    client = AISearchClient()
- 2. Personal Access Token (PAT)
Use a personal access token for user-based authentication. Requires both workspace_url and personal_access_token.
Example:
    from databricks.ai_search.client import AISearchClient

    client = AISearchClient(
        workspace_url="https://your-workspace.databricks.com",
        personal_access_token="dapi..."
    )
- 3. Service Principal (OAuth M2M) - AWS & GCP
Use OAuth Machine-to-Machine authentication with a service principal for automated workflows. Requires workspace_url, service_principal_client_id, and service_principal_client_secret.
Example:
    from databricks.ai_search.client import AISearchClient

    client = AISearchClient(
        workspace_url="https://your-workspace.databricks.com",
        service_principal_client_id="your-client-id",
        service_principal_client_secret="your-client-secret"
    )
- 4. Azure Service Principal
For Azure Databricks, use Azure AD authentication with additional azure_tenant_id.
Example:
    from databricks.ai_search.client import AISearchClient

    client = AISearchClient(
        workspace_url="https://adb-1234567890123456.7.azuredatabricks.net",
        service_principal_client_id="your-azure-app-id",
        service_principal_client_secret="your-azure-app-secret",
        azure_tenant_id="your-azure-tenant-id"
    )
- 5. Model Serving with User Credentials
When calling from within Databricks Model Serving, use the invoking user’s credentials for fine-grained access control.
Example:
    from databricks.ai_search.client import AISearchClient
    from databricks.ai_search.utils import CredentialStrategy

    client = AISearchClient(
        credential_strategy=CredentialStrategy.MODEL_SERVING_USER_CREDENTIALS
    )
Additional Notes:
Service principal authentication requires workspace_url to be explicitly provided
Personal access token authentication also requires workspace_url
When using auto-detection in notebooks, credentials are inferred from the execution context
Azure authentication requires both azure_tenant_id and standard service principal credentials
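The rules above can be sketched as a small helper that picks keyword arguments for AISearchClient from a settings mapping. This is only an illustration under stated assumptions: the setting names (WORKSPACE_URL, PAT, SP_CLIENT_ID, SP_CLIENT_SECRET, AZURE_TENANT_ID) and the helper itself are hypothetical and not part of the package. Only the returned keyword names come from the constructor signature.

```python
from typing import Dict, Mapping


def client_kwargs_from_env(env: Mapping[str, str]) -> Dict[str, str]:
    """Choose AISearchClient keyword arguments from a settings mapping.

    The setting names read here are hypothetical; adapt them to your setup.
    """
    url = env.get("WORKSPACE_URL")
    token = env.get("PAT")
    client_id = env.get("SP_CLIENT_ID")
    client_secret = env.get("SP_CLIENT_SECRET")
    tenant_id = env.get("AZURE_TENANT_ID")

    # Service principal (OAuth M2M); an Azure tenant ID is added when present.
    if client_id and client_secret:
        if not url:
            raise ValueError("service principal auth requires workspace_url")
        kwargs = {
            "workspace_url": url,
            "service_principal_client_id": client_id,
            "service_principal_client_secret": client_secret,
        }
        if tenant_id:
            kwargs["azure_tenant_id"] = tenant_id
        return kwargs

    # Personal access token; workspace_url must be given explicitly.
    if token:
        if not url:
            raise ValueError("PAT auth requires workspace_url")
        return {"workspace_url": url, "personal_access_token": token}

    # No credentials supplied: rely on notebook auto-detection.
    return {}


pat_kwargs = client_kwargs_from_env({"WORKSPACE_URL": "https://w", "PAT": "dapi-x"})
print(sorted(pat_kwargs))
```

The resulting dictionary could then be passed as AISearchClient(**kwargs); an empty dictionary corresponds to the auto-detection case.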
- create_delta_sync_index(endpoint_name, index_name, primary_key, source_table_name, pipeline_type, embedding_dimension=None, embedding_vector_column=None, embedding_source_column=None, embedding_model_endpoint_name=None, sync_computed_embeddings=False, columns_to_sync=None, model_endpoint_name_for_query=None, budget_policy_id=None, usage_policy_id=None, index_subtype=None)
Create a delta sync index.
- Parameters:
columns_to_sync (str) – The columns to sync to the vector index. The primary key and vector column are always synced. If this field is not set, all columns are synced.
endpoint_name (str) – The name of the endpoint.
index_name (str) – The name of the index.
primary_key (str) – The primary key of the index.
source_table_name (str) – The name of the source table.
pipeline_type (str) – The type of the pipeline. Must be CONTINUOUS or TRIGGERED.
embedding_dimension (int) – The dimension of the embedding vector.
embedding_vector_column (str) – The name of the embedding vector column.
embedding_source_column (str) – The name of the embedding source column.
embedding_model_endpoint_name (str) – The name of the embedding model endpoint.
sync_computed_embeddings (bool) – Whether to automatically sync the vector index contents and computed embeddings to a new UC table named ${index_name}_writeback_table.
model_endpoint_name_for_query (str) – When set, queries will use this embedding model instead of the embedding_model_endpoint_name. If unset, queries continue to use embedding_model_endpoint_name.
budget_policy_id (str) – The budget policy ID to associate with this index for cost tracking and management (deprecated, use usage_policy_id).
usage_policy_id (str) – The usage policy ID to associate with this index for cost tracking and management.
index_subtype (str) – The subtype of the index. “HYBRID” (default) supports semantic search with embeddings and keyword search. “FULL_TEXT” supports keyword search only, without embeddings.
Note
BETA FEATURE: The index_subtype parameter is in beta and subject to change. “FULL_TEXT” enables keyword-only search without embeddings. Currently only supported for STORAGE_OPTIMIZED endpoints.
Warning
Only STORAGE_OPTIMIZED endpoints support index_subtype="FULL_TEXT". STANDARD endpoints will return an error if FULL_TEXT is specified.
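The constraints on pipeline_type and index_subtype stated above can be checked before calling the service. The helper below is a hypothetical client-side sketch, not part of the package; it assumes the caller already knows the endpoint type, and the server remains the authority on what is accepted.

```python
from typing import Optional

VALID_PIPELINE_TYPES = {"CONTINUOUS", "TRIGGERED"}
VALID_INDEX_SUBTYPES = {"HYBRID", "FULL_TEXT"}


def check_delta_sync_args(
    pipeline_type: str,
    endpoint_type: str,
    index_subtype: Optional[str] = None,
) -> None:
    """Raise ValueError when the documented constraints are violated."""
    if pipeline_type not in VALID_PIPELINE_TYPES:
        raise ValueError("pipeline_type must be CONTINUOUS or TRIGGERED")
    if index_subtype is None:
        return  # the documented default (HYBRID) applies
    if index_subtype not in VALID_INDEX_SUBTYPES:
        raise ValueError("index_subtype must be HYBRID or FULL_TEXT")
    # FULL_TEXT is documented as supported only on STORAGE_OPTIMIZED endpoints.
    if index_subtype == "FULL_TEXT" and endpoint_type != "STORAGE_OPTIMIZED":
        raise ValueError("FULL_TEXT requires a STORAGE_OPTIMIZED endpoint")


# Passes silently: a keyword-only index on a storage-optimized endpoint.
check_delta_sync_args("TRIGGERED", "STORAGE_OPTIMIZED", "FULL_TEXT")
```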
- create_delta_sync_index_and_wait(endpoint_name, index_name, primary_key, source_table_name, pipeline_type, embedding_dimension=None, embedding_vector_column=None, embedding_source_column=None, embedding_model_endpoint_name=None, sync_computed_embeddings=False, columns_to_sync=None, model_endpoint_name_for_query=None, budget_policy_id=None, usage_policy_id=None, verbose=False, timeout=datetime.timedelta(days=1), index_subtype=None)
Create a delta sync index and wait for it to be ready.
- Parameters:
columns_to_sync (str) – The columns to sync to the vector index. The primary key and vector column are always synced. If this field is not set, all columns are synced.
endpoint_name (str) – The name of the endpoint.
index_name (str) – The name of the index.
primary_key (str) – The primary key of the index.
source_table_name (str) – The name of the source table.
pipeline_type (str) – The type of the pipeline. Must be CONTINUOUS or TRIGGERED.
embedding_dimension (int) – The dimension of the embedding vector.
embedding_vector_column (str) – The name of the embedding vector column.
embedding_source_column (str) – The name of the embedding source column.
embedding_model_endpoint_name (str) – The name of the embedding model endpoint.
verbose (bool) – Whether to print status messages.
timeout (datetime.timedelta) – The time allowed until we timeout with an Exception.
sync_computed_embeddings (bool) – Whether to automatically sync the vector index contents and computed embeddings to a new UC table named ${index_name}_writeback_table.
model_endpoint_name_for_query (str) – The name of the embedding model endpoint to be used for querying, not ingestion.
budget_policy_id (str) – The budget policy ID to associate with this index for cost tracking and management (deprecated, use usage_policy_id).
usage_policy_id (str) – The usage policy ID to associate with this index for cost tracking and management.
index_subtype (str) – The subtype of the index. “HYBRID” (default) supports semantic search with embeddings and keyword search. “FULL_TEXT” supports keyword search only, without embeddings.
Note
BETA FEATURE: The index_subtype parameter is in beta and subject to change. “FULL_TEXT” enables keyword-only search without embeddings. Currently only supported for STORAGE_OPTIMIZED endpoints.
Warning
Only STORAGE_OPTIMIZED endpoints support index_subtype="FULL_TEXT". STANDARD endpoints will return an error if FULL_TEXT is specified.
- create_direct_access_index(endpoint_name, index_name, primary_key, embedding_dimension, embedding_vector_column, schema, embedding_model_endpoint_name=None, budget_policy_id=None, usage_policy_id=None)
Create a direct access index.
- Parameters:
endpoint_name (str) – The name of the endpoint.
index_name (str) – The name of the index.
primary_key (str) – The primary key of the index.
embedding_dimension (int) – The dimension of the embedding vector.
embedding_vector_column (str) – The name of the embedding vector column.
schema (dict) – The schema of the index.
embedding_model_endpoint_name (str) – The name of the optional embedding model endpoint to use when querying.
budget_policy_id (str) – The budget policy id to be applied to the index (deprecated, use usage_policy_id).
usage_policy_id (str) – The usage policy id to be applied to the index.
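As a rough sketch of how the arguments for a direct access index fit together, the hypothetical helper below assembles the keyword arguments and runs two sanity checks. The schema value format (column name mapped to a type string such as "array<float>") and the requirement that the primary key and vector column appear in the schema are assumptions made for this example, not statements from this reference.

```python
from typing import Any, Dict


def direct_access_index_kwargs(
    endpoint_name: str,
    index_name: str,
    primary_key: str,
    embedding_vector_column: str,
    embedding_dimension: int,
    schema: Dict[str, str],
) -> Dict[str, Any]:
    """Assemble keyword arguments for create_direct_access_index (sketch)."""
    # Assumption: both key columns must be declared in the schema.
    for column in (primary_key, embedding_vector_column):
        if column not in schema:
            raise ValueError(f"column {column!r} is missing from schema")
    if embedding_dimension <= 0:
        raise ValueError("embedding_dimension must be positive")
    return {
        "endpoint_name": endpoint_name,
        "index_name": index_name,
        "primary_key": primary_key,
        "embedding_dimension": embedding_dimension,
        "embedding_vector_column": embedding_vector_column,
        "schema": schema,
    }


# Hypothetical schema; the type strings are assumed, not taken from this page.
example_schema = {"id": "int", "text": "string", "text_vector": "array<float>"}
index_kwargs = direct_access_index_kwargs(
    endpoint_name="my-endpoint",
    index_name="catalog.schema.my_index",
    primary_key="id",
    embedding_vector_column="text_vector",
    embedding_dimension=1024,
    schema=example_schema,
)
print(sorted(index_kwargs))
```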
- create_endpoint(name: str, endpoint_type: str = 'STANDARD', budget_policy_id: str | None = None, usage_policy_id: str | None = None, target_qps: int | None = None) -> Dict[str, Any]
Create an endpoint.
- Parameters:
name (str) – The name of the endpoint.
endpoint_type (str) – The type of the endpoint. Must be STANDARD or STORAGE_OPTIMIZED.
budget_policy_id (str) – The id of the budget policy to assign to the endpoint (deprecated, use usage_policy_id).
usage_policy_id (str) – The id of the usage policy to assign to the endpoint.
target_qps (int) – Target queries per second for the endpoint. Must be a positive integer. Capacity is automatically adjusted at index creation/sync time to best match this throughput target (best-effort, not guaranteed). Optional.
Note
BETA FEATURE: The target_qps parameter is in beta and subject to change. It enables automatic capacity scaling based on desired throughput. Currently only supported for STANDARD endpoints.
Warning
Only STANDARD endpoints support target_qps. Storage Optimized endpoints will return an error if target_qps is specified.
Example:
    from databricks.ai_search.client import AISearchClient

    client = AISearchClient()

    # Create endpoint with target_qps
    endpoint = client.create_endpoint(
        name="high-throughput-endpoint",
        endpoint_type="STANDARD",
        target_qps=200
    )

    # Check the scaling info
    scaling_info = endpoint.get("endpoint", {}).get("scaling_info", {})
    print(f"Requested target QPS: {scaling_info.get('requested_target_qps')}")
- create_endpoint_and_wait(name: str, endpoint_type: str = 'STANDARD', budget_policy_id: str | None = None, usage_policy_id: str | None = None, target_qps: int | None = None, verbose: bool = False, timeout: timedelta = datetime.timedelta(seconds=3600)) -> None
Create an endpoint and wait for it to be online.
- Parameters:
name (str) – The name of the endpoint.
endpoint_type (str) – The type of the endpoint. Must be STANDARD or STORAGE_OPTIMIZED.
budget_policy_id (str) – The id of the budget policy to assign to the endpoint (deprecated, use usage_policy_id).
usage_policy_id (str) – The id of the usage policy to assign to the endpoint.
target_qps (int) – Target queries per second for the endpoint. Must be a positive integer. Capacity is automatically adjusted at index creation/sync time to best match this throughput target (best-effort, not guaranteed). Optional.
verbose (bool) – Whether to print status messages.
timeout (datetime.timedelta) – The time allowed until we timeout with an Exception.
Note
BETA FEATURE: The target_qps parameter is in beta and subject to change. It enables automatic capacity scaling based on desired throughput. Currently only supported for STANDARD endpoints.
Warning
Only STANDARD endpoints support target_qps. Storage Optimized endpoints will return an error if target_qps is specified.
- delete_endpoint(name)
Delete an endpoint.
- Parameters:
name (str) – The name of the endpoint.
- delete_index(endpoint_name=None, index_name=None)
Delete an index.
- Parameters:
endpoint_name (Optional[str]) – The optional name of the endpoint.
index_name (str) – The name of the index.
- endpoint_exists(name)
Check if an endpoint exists.
This method provides a cleaner alternative to catching NotFound exceptions when checking for resource existence.
- Parameters:
name (str) – The name of the endpoint.
- Returns:
True if the endpoint exists, False otherwise.
- Return type:
bool
Example:
    from databricks.ai_search.client import AISearchClient

    client = AISearchClient()

    if client.endpoint_exists("my-endpoint"):
        endpoint = client.get_endpoint("my-endpoint")
    else:
        endpoint = client.create_endpoint("my-endpoint")
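For comparison, the try/except pattern that an existence check replaces can be sketched with stand-ins. FakeClient and the NotFound class below are hypothetical and exist only so the example runs on its own; they are not the SDK's implementation.

```python
class NotFound(Exception):
    """Stand-in for the SDK's NotFound error (hypothetical here)."""


class FakeClient:
    """In-memory stand-in for AISearchClient, for illustration only."""

    def __init__(self):
        self._endpoints = {}

    def get_endpoint(self, name):
        if name not in self._endpoints:
            raise NotFound(name)
        return self._endpoints[name]

    def create_endpoint(self, name):
        self._endpoints[name] = {"name": name}
        return self._endpoints[name]

    def endpoint_exists(self, name):
        # The pattern callers would otherwise repeat at every call site.
        try:
            self.get_endpoint(name)
            return True
        except NotFound:
            return False


def get_or_create_endpoint(client, name):
    """Return the endpoint, creating it first when it does not exist."""
    if client.endpoint_exists(name):
        return client.get_endpoint(name)
    return client.create_endpoint(name)


fake = FakeClient()
first = get_or_create_endpoint(fake, "my-endpoint")   # created
second = get_or_create_endpoint(fake, "my-endpoint")  # fetched
print(first is second)
```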
- get_async_index(endpoint_name=None, index_name=None)
Get an async index for use in async/await code.
Returns an AsyncAISearchIndex that mirrors the data-plane methods of AISearchIndex (upsert, delete, similarity_search, scan, describe, sync) over httpx.AsyncClient. Control-plane index lifecycle (create / delete / wait_until_ready) remains on the sync client.
- Parameters:
endpoint_name (Optional[str]) – The optional name of the endpoint.
index_name (str) – The name of the index.
- Returns:
An async index bound to the event loop where its first method is awaited. Sharing across loops is unsupported.
- Return type:
AsyncAISearchIndex
- Raises:
NotFound – If the index does not exist.
PermissionDenied – If user lacks permission to view the index.
Note
This method issues a synchronous HTTP request to resolve the index URL. In async contexts (e.g. FastAPI handlers), wrap the call in asyncio.to_thread:
    idx = await asyncio.to_thread(
        client.get_async_index, endpoint_name=..., index_name=...
    )
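The effect of asyncio.to_thread can be seen in a self-contained sketch. The blocking function here is a hypothetical stand-in for a synchronous call such as client.get_async_index; the point is only that the blocking work runs in a worker thread while the event loop stays free.

```python
import asyncio
import time


def resolve_index_blocking(endpoint_name: str, index_name: str) -> str:
    """Hypothetical stand-in for a blocking, synchronous lookup."""
    time.sleep(0.05)  # simulates the synchronous HTTP round trip
    return f"{endpoint_name}/{index_name}"


async def handler() -> str:
    # The blocking function runs in a worker thread; keyword arguments are
    # forwarded unchanged, and the coroutine resumes with its return value.
    return await asyncio.to_thread(
        resolve_index_blocking,
        endpoint_name="my-endpoint",
        index_name="my-index",
    )


result = asyncio.run(handler())
print(result)
```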
Example:
    from databricks.ai_search.client import AISearchClient

    client = AISearchClient()

    async with client.get_async_index(
        endpoint_name="my-endpoint", index_name="catalog.schema.my_index"
    ) as idx:
        results = await idx.similarity_search(
            query_text="hello", columns=["id"], num_results=10
        )
- get_endpoint(name)
Get an endpoint.
- Parameters:
name (str) – The name of the endpoint.
- Raises:
NotFound – If the endpoint does not exist.
PermissionDenied – If user lacks permission to view the endpoint.
- get_index(endpoint_name=None, index_name=None)
Get an index.
- Parameters:
endpoint_name (Optional[str]) – The optional name of the endpoint.
index_name (str) – The name of the index.
- Raises:
NotFound – If the index does not exist.
PermissionDenied – If user lacks permission to view the index.
- index_exists(endpoint_name=None, index_name=None)
Check if an index exists.
This method provides a cleaner alternative to catching NotFound exceptions when checking for resource existence.
- Parameters:
endpoint_name (Optional[str]) – The optional name of the endpoint.
index_name (str) – The name of the index.
- Returns:
True if the index exists, False otherwise.
- Return type:
bool
Example:
    from databricks.ai_search.client import AISearchClient

    client = AISearchClient()

    if client.index_exists(index_name="my-index"):
        index = client.get_index(index_name="my-index")
    else:
        index = client.create_delta_sync_index(...)
- list_endpoints()
List all endpoints.
- list_indexes(name)
List all indexes for an endpoint.
- Parameters:
name (str) – The name of the endpoint.
- update_endpoint(name: str, target_qps: int | None = None) -> Dict[str, Any]
Update an endpoint’s configuration.
- Parameters:
name (str) – The name of the endpoint. Required.
target_qps (int) – Target queries per second for the endpoint. Must be a positive integer. Optional.
Note
BETA FEATURE: The target_qps parameter is in beta and subject to change. It enables automatic capacity scaling based on desired throughput. Currently only supported for STANDARD endpoints.
Warning
Only STANDARD endpoints support target_qps. Storage Optimized endpoints will return an error if target_qps is specified.
Example:
    from databricks.ai_search.client import AISearchClient

    client = AISearchClient()

    # Update existing endpoint with target_qps
    response = client.update_endpoint(name="my-endpoint", target_qps=100)

    # Check scaling state
    scaling_info = response.get("endpoint", {}).get("scaling_info", {})
    state = scaling_info.get("state")
    if state == "SCALING_CHANGE_APPLIED":
        print("Scaling configuration applied")
    elif state == "SCALING_CHANGE_IN_PROGRESS":
        print("Scaling change in progress, capacity will be adjusted at next index sync")
- update_endpoint_budget_policy(name, budget_policy_id=None, usage_policy_id=None)
Update an endpoint’s budget/usage policy.
- Parameters:
name (str) – The name of the endpoint.
budget_policy_id (str) – The id of the budget policy to assign to the endpoint (deprecated, use usage_policy_id).
usage_policy_id (str) – The id of the usage policy to assign to the endpoint.
- update_endpoint_usage_policy(name, usage_policy_id)
Update an endpoint’s usage policy (alias for update_endpoint_budget_policy).
- Parameters:
name (str) – The name of the endpoint.
usage_policy_id (str) – The id of the usage policy to assign to the endpoint.
- update_index_budget_policy(index_name, budget_policy_id=None, usage_policy_id=None)
Update the budget/usage policy of an index.
- Parameters:
index_name (str) – The name of the index.
budget_policy_id (str) – The budget policy id to be applied to the index (deprecated, use usage_policy_id).
usage_policy_id (str) – The usage policy id to be applied to the index.
- update_index_usage_policy(index_name, usage_policy_id)
Update the usage policy of an index (alias for update_index_budget_policy).
- Parameters:
index_name (str) – The name of the index.
usage_policy_id (str) – The usage policy id to be applied to the index.
- validate(disable_notice=False)
Validate the client configuration.
- Parameters:
disable_notice (bool) – Whether to disable the authentication notice message.
- wait_for_endpoint(name, verbose=False, timeout=datetime.timedelta(seconds=3600))
Wait for an endpoint to be online.
- Parameters:
name (str) – The name of the endpoint.
verbose (bool) – Whether to print status messages.
timeout (datetime.timedelta) – The time allowed until we timeout with an Exception.
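The wait-with-timeout behavior can be pictured as a polling loop. The sketch below is a generic illustration, not the package's implementation: the wait_until helper, the poll interval, the TimeoutError it raises, and the state names are all assumptions made for this example.

```python
import time
from datetime import timedelta
from typing import Callable


def wait_until(
    is_ready: Callable[[], bool],
    timeout: timedelta,
    poll_interval: float = 0.01,
    verbose: bool = False,
) -> None:
    """Poll is_ready until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout.total_seconds()
    while not is_ready():
        if time.monotonic() >= deadline:
            raise TimeoutError(f"not ready after {timeout}")
        if verbose:
            print("still waiting...")
        time.sleep(poll_interval)


# Demo with a scripted sequence of states; the state names are hypothetical.
states = ["PROVISIONING", "PROVISIONING", "ONLINE"]
polls = []


def endpoint_is_online() -> bool:
    polls.append(states[min(len(polls), len(states) - 1)])
    return polls[-1] == "ONLINE"


wait_until(endpoint_is_online, timeout=timedelta(seconds=5))
print(len(polls))
```

Using time.monotonic for the deadline keeps the loop unaffected by wall-clock adjustments.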
- class databricks.ai_search.index.AISearchIndex(workspace_url: str, index_url: str, name: str, endpoint_name: str, mlserving_endpoint_name: str | None = None, personal_access_token: str | None = None, service_principal_client_id: str | None = None, service_principal_client_secret: str | None = None, azure_tenant_id: str | None = None, azure_login_id: str | None = None, use_user_passed_credentials: bool = False, credential_strategy: CredentialStrategy | None = None, get_reranker_url_callable: callable | None = None, mlserving_endpoint_name_for_query: str | None = None, total_retries: int = 3, backoff_factor: float = 1, backoff_jitter: float = 0.2)
Bases:
object
AISearchIndex is a helper class that represents an AI Search Index.
Those who wish to use this class should not instantiate it directly, but rather use the AISearchClient class.
Initialize an AISearchIndex instance.
- Parameters:
workspace_url (str) – The URL of the Databricks workspace.
index_url (str) – The direct URL to the vector search index endpoint.
name (str) – The name of the vector search index.
endpoint_name (str) – The name of the vector search endpoint.
mlserving_endpoint_name (str) – The name of the model serving endpoint used for embedding generation during ingestion.
personal_access_token (str) – Personal access token for authentication.
service_principal_client_id (str) – Service principal client ID for authentication.
service_principal_client_secret (str) – Service principal client secret for authentication.
azure_tenant_id (str) – Azure tenant ID for Azure-based authentication.
azure_login_id (str) – Azure login ID (Databricks Azure Application ID) for authentication.
use_user_passed_credentials (bool) – Whether credentials were explicitly provided by the user (True) or inferred automatically (False).
credential_strategy (CredentialStrategy) – The credential strategy to use for authentication.
get_reranker_url_callable (callable) – A callable function to retrieve the reranker-compatible index URL when needed.
mlserving_endpoint_name_for_query (str) – The name of the model serving endpoint to use for queries (if different from ingestion endpoint).
total_retries (int) – Total number of retries for requests. Defaults to 3.
backoff_factor (float) – Backoff factor for retry delays. Defaults to 1.
backoff_jitter (float) – Random jitter proportion (0-1) to add to backoff delays. Defaults to 0.2.
- delete(primary_keys)
Delete data from the index.
- Parameters:
primary_keys – List of primary keys to delete from the index.
- describe()
Describe the index. This returns metadata about the index.
- scan(num_results=10, last_primary_key=None)
Given all the data in the index sorted by primary key, this returns the next num_results data after the primary key specified by last_primary_key. If last_primary_key is None, it returns the first num_results.
Please note that if there are ongoing updates to the index, the scan results may not be consistent.
- Parameters:
num_results – Number of results to return.
last_primary_key – Last primary key from the previous page, used as the exclusive starting primary key.
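The exclusive-start pagination described above can be sketched as a loop that feeds the last key of each page into the next call. The fake_scan function is a hypothetical stand-in that returns a plain list of row dictionaries; the real scan response may be shaped differently and need unpacking first.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

Row = Dict[str, Any]


def iter_all_rows(
    scan_page: Callable[[int, Optional[Any]], List[Row]],
    primary_key: str,
    page_size: int = 10,
) -> Iterator[Row]:
    """Yield every row by paging with an exclusive starting primary key."""
    last_key = None
    while True:
        rows = scan_page(page_size, last_key)
        if not rows:
            return  # an empty page means the scan is complete
        yield from rows
        last_key = rows[-1][primary_key]


# Hypothetical in-memory data, already sorted by primary key.
DATA = [{"id": i} for i in range(1, 26)]


def fake_scan(num_results: int, last_primary_key: Optional[int]) -> List[Row]:
    """Stand-in for index.scan: rows strictly after last_primary_key."""
    remaining = [
        row for row in DATA
        if last_primary_key is None or row["id"] > last_primary_key
    ]
    return remaining[:num_results]


all_ids = [row["id"] for row in iter_all_rows(fake_scan, "id")]
print(len(all_ids))
```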
- scan_index(num_results=10, last_primary_key=None)
Deprecated since version 0.36: This will be removed in 0.37. Use the scan function instead.
- similarity_search(columns: List[str], query_text: str | None = None, query_vector: List[float] | None = None, filters: str | Dict[str, Any] | None = None, num_results: int = 5, debug_level: int = 0, score_threshold: float | None = None, query_type: str | None = None, columns_to_rerank: List[str] | None = None, disable_notice: bool = False, reranker: Reranker | None = None, total_retries: int = 3, backoff_factor: float = 1, backoff_jitter: float = 0.2, query_columns: List[str] | None = None, sort_columns: List[str] | None = None, facets: List[str] | None = None)
Perform a similarity search on the index. This returns the top K results that are most similar to the query.
- Parameters:
columns – List of column names to return in the results.
query_text – Query text to search for.
query_vector – Query vector to search for.
filters – Filters to apply to the query.
num_results – Number of results to return.
debug_level – Debug level to use for the query.
score_threshold – Score threshold to use for the query. If reranker is used, the score threshold is applied before reranking.
query_type – Query type of this query. Choices are “ANN”, “HYBRID”, and “FULL_TEXT”.
query_columns – Text columns to search for query_text. When empty, all text columns are searched.
sort_columns – Sort results by column values instead of the default relevance ordering. Each clause has the form “<column> ASC” or “<column> DESC”, for example [“rating DESC”, “price ASC”].
facets – Facets to compute over the matched results. Each entry is one of: “<column>” (top 10 distinct values by count), “<column> TOP <n>” (top n distinct values, n > 0), or “<column> BUCKETS [[from,to],…]” (inclusive numeric ranges). TOP and BUCKETS are case-insensitive; a column may appear at most once.
columns_to_rerank – (Deprecated) List of column names to use for reranking the results. Use the reranker parameter instead.
disable_notice – Whether to disable the notice message.
reranker (Optional[databricks.ai_search.reranker.Reranker]) – Optional reranker to apply on the top results. Pass an instance of databricks.ai_search.reranker.DatabricksReranker with columns_to_rerank=[...]. The reranker reorders the initial results using the specified text columns.
total_retries – Total number of retries for the request. Set to 0 to disable retries.
backoff_factor – Backoff factor to apply between retry attempts. The delay between retries is calculated as {backoff_factor} * (2 ** (retry_count - 1)) seconds. For example, with backoff_factor=1 and retry_count starting at 0, delays are 0.5s, 1s, 2s, 4s, etc.
backoff_jitter – Random jitter to add to backoff delays to avoid the thundering herd problem. Value between 0 and 1 representing the proportion of jitter to apply.
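The backoff formula given for backoff_factor can be worked through numerically. This sketch assumes retry_count starts at 0, which is what the documented example delays (0.5s, 1s, 2s, 4s) imply, and it leaves out jitter; the helper name is hypothetical.

```python
from typing import List


def backoff_delays(backoff_factor: float, retries: int) -> List[float]:
    """Delays from backoff_factor * 2 ** (retry_count - 1), retry_count = 0, 1, ..."""
    return [
        backoff_factor * 2.0 ** (retry_count - 1)
        for retry_count in range(retries)
    ]


delays = backoff_delays(backoff_factor=1, retries=4)
print(delays)  # [0.5, 1.0, 2.0, 4.0]
```

Halving backoff_factor halves every delay, so the factor scales the whole schedule rather than only the first wait.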
- Example:
Use the Databricks reranker to improve the ordering of hybrid search results:
    from databricks.ai_search.reranker import DatabricksReranker

    results = index.similarity_search(
        query_text="How to create a AI Search index",
        columns=["id", "text", "parent_doc_summary", "date"],
        # The final number of results to return. The reranker will automatically overfetch 50 documents and rerank them.
        num_results=10,
        query_type="hybrid",
        # Needed for debug info to get any warnings and time to rerank the results.
        debug_level=1,
        # The text reranked will be concatenated and if it is longer than 2000 characters, it will be truncated.
        # Include shorter, important columns first.
        reranker=DatabricksReranker(columns_to_rerank=["parent_doc_summary", "text", "other_column"]),
    )

    # Check if reranking was successful and how much additional time it took to rerank the results.
    if "warnings" in results['debug_info']:
        print(results['debug_info']['warnings'])
    else:
        print(f"Reranking was successful and took {results['debug_info']['reranker_time']}ms")
- sync()
Sync the index. This is used to sync the index with the source Delta table. This only works for managed Delta Sync indexes with pipeline_type="TRIGGERED".
- upsert(inputs)
Upsert data into the index.
- Parameters:
inputs – List of dictionaries to upsert into the index.
- wait_until_ready(verbose=False, timeout=datetime.timedelta(days=1), wait_for_updates=False)
Wait for the index to be online.
- Parameters:
verbose (bool) – Whether to print status messages.
timeout (datetime.timedelta) – The time allowed until we timeout with an Exception.
wait_for_updates (bool) – If true, the index will also wait for any updates to be completed.
- class databricks.ai_search.async_index.AsyncAISearchIndex(workspace_url: str, index_url: str, name: str, endpoint_name: str, mlserving_endpoint_name: str | None = None, personal_access_token: str | None = None, service_principal_client_id: str | None = None, service_principal_client_secret: str | None = None, azure_tenant_id: str | None = None, azure_login_id: str | None = None, use_user_passed_credentials: bool = False, credential_strategy: CredentialStrategy | None = None, get_reranker_url_callable: callable | None = None, mlserving_endpoint_name_for_query: str | None = None, total_retries: int = 3, backoff_factor: float = 1, backoff_jitter: float = 0.2)
Bases:
object
AsyncAISearchIndex is a helper class that represents an AI Search Index for async/await code.
Those who wish to use this class should not instantiate it directly, but rather use AISearchClient.get_async_index() to obtain an instance.
Index lifecycle operations (create, delete, wait_until_ready) are available only on AISearchClient.
Example:
    async with client.get_async_index(
        endpoint_name="my-endpoint", index_name="my-index"
    ) as idx:
        results = await idx.similarity_search(
            query_vector=[0.1, 0.2, ...], columns=["id"], num_results=10
        )
Initialize an AsyncAISearchIndex instance.
- Parameters:
workspace_url (str) – The URL of the Databricks workspace.
index_url (str) – The direct URL to the vector search index endpoint.
name (str) – The name of the vector search index.
endpoint_name (str) – The name of the vector search endpoint.
mlserving_endpoint_name (str) – The name of the model serving endpoint used for embedding generation during ingestion.
personal_access_token (str) – Personal access token for authentication.
service_principal_client_id (str) – Service principal client ID for authentication.
service_principal_client_secret (str) – Service principal client secret for authentication.
azure_tenant_id (str) – Azure tenant ID for Azure-based authentication.
azure_login_id (str) – Azure login ID (Databricks Azure Application ID) for authentication.
use_user_passed_credentials (bool) – Whether credentials were explicitly provided by the user (True) or inferred automatically (False).
credential_strategy (CredentialStrategy) – The credential strategy to use for authentication.
get_reranker_url_callable (callable) – A callable function to retrieve the reranker-compatible index URL when needed.
mlserving_endpoint_name_for_query (str) – The name of the model serving endpoint to use for queries (if different from ingestion endpoint).
total_retries (int) – Total number of retries for requests. Defaults to 3.
backoff_factor (float) – Backoff factor for retry delays. Defaults to 1.
backoff_jitter (float) – Random jitter proportion (0-1) to add to backoff delays. Defaults to 0.2.
- async aclose()
Close this index instance.
- async delete(primary_keys)
Delete data from the index.
- Parameters:
primary_keys – List of primary keys to delete from the index.
- async describe()
Describe the index. This returns metadata about the index.
- async scan(num_results=10, last_primary_key=None)
Given all the data in the index sorted by primary key, this returns the next num_results data after the primary key specified by last_primary_key. If last_primary_key is None, it returns the first num_results.
Please note that if there are ongoing updates to the index, the scan results may not be consistent.
- Parameters:
num_results – Number of results to return.
last_primary_key – Last primary key from previous pagination, used as the exclusive starting primary key.
- async similarity_search(columns: List[str], query_text: str | None = None, query_vector: List[float] | None = None, filters: str | Dict[str, Any] | None = None, num_results: int = 5, debug_level: int = 0, score_threshold: float | None = None, query_type: str | None = None, columns_to_rerank: List[str] | None = None, disable_notice: bool = False, reranker: Reranker | None = None, total_retries: int = 3, backoff_factor: float = 1, backoff_jitter: float = 0.2, query_columns: List[str] | None = None, sort_columns: List[str] | None = None, facets: List[str] | None = None)
Perform a similarity search on the index. This returns the top K results that are most similar to the query.
- Parameters:
columns – List of column names to return in the results.
query_text – Query text to search for.
query_vector – Query vector to search for.
filters – Filters to apply to the query.
num_results – Number of results to return.
debug_level – Debug level to use for the query.
score_threshold – Score threshold to use for the query. If reranker is used, the score threshold is applied before reranking.
query_type – Query type of this query. Choices are “ANN”, “HYBRID”, and “FULL_TEXT”.
query_columns – Text columns to search for query_text. When empty, all text columns are searched.
sort_columns – Sort results by column values instead of the default relevance ordering. Each clause has the form “<column> ASC” or “<column> DESC”, for example [“rating DESC”, “price ASC”].
facets – Facets to compute over the matched results. Each entry is one of: “<column>” (top 10 distinct values by count), “<column> TOP <n>” (top n distinct values, n > 0), or “<column> BUCKETS [[from,to],…]” (inclusive numeric ranges). TOP and BUCKETS are case-insensitive; a column may appear at most once.
columns_to_rerank – (Deprecated) List of column names to use for reranking the results. Use the reranker parameter instead.
disable_notice – Whether to disable the notice message.
reranker (Optional[databricks.ai_search.reranker.Reranker]) – Optional reranker to apply on the top results. Pass an instance of databricks.ai_search.reranker.DatabricksReranker with columns_to_rerank=[...]. The reranker reorders the initial results using the specified text columns.
total_retries – Total number of retries for the request. Set to 0 to disable retries.
backoff_factor – Backoff factor to apply between retry attempts. The delay between retries is calculated as backoff_factor * (2 ** (retry_count - 1)) seconds. For example, with backoff_factor=1, delays are 0.5s, 1s, 2s, 4s, etc., which corresponds to retry_count = 0, 1, 2, 3 in the formula.
backoff_jitter – Random jitter to add to backoff delays to avoid the thundering herd problem. Value between 0 and 1 representing the proportion of jitter to apply.
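To make the backoff formula above concrete, the following sketch evaluates it for a given backoff_factor and total_retries. Starting retry_count at 0 is an assumption, chosen only because it reproduces the documented 0.5s, 1s, 2s, 4s example; the helper name backoff_delays is hypothetical, and jitter is left out because the reference does not say exactly how it is applied.

```python
from typing import List


def backoff_delays(backoff_factor: float, total_retries: int, first_retry_count: int = 0) -> List[float]:
    """Base delay (in seconds) before each retry, per the documented formula, without jitter."""
    return [
        backoff_factor * (2 ** (retry_count - 1))
        for retry_count in range(first_retry_count, first_retry_count + total_retries)
    ]


# With backoff_factor=1 and a zero-based retry_count (assumption), four retries give:
delays = backoff_delays(backoff_factor=1, total_retries=4)
total_wait = sum(delays)  # upper-level estimate of time spent waiting, ignoring jitter
```

Summing the list gives a rough idea of the worst-case time a call may spend waiting before it gives up.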
- Example:
Use the Databricks reranker to improve the ordering of hybrid search results:
from databricks.ai_search.reranker import DatabricksReranker

results = index.similarity_search(
    query_text="How to create a AI Search index",
    columns=["id", "text", "parent_doc_summary", "date"],
    # The final number of results to return. The reranker will automatically overfetch 50 documents and rerank them.
    num_results=10,
    query_type="hybrid",
    # Needed for debug info to get any warnings and time to rerank the results.
    debug_level=1,
    # The text reranked will be concatenated and if it is longer than 2000 characters, it will be truncated.
    # Include shorter, important columns first.
    reranker=DatabricksReranker(columns_to_rerank=["parent_doc_summary", "text", "other_column"]),
)

# Check if reranking was successful and how much additional time it took to rerank the results.
if "warnings" in results['debug_info']:
    print(results['debug_info']['warnings'])
else:
    print(f"Reranking was successful and took {results['debug_info']['reranker_time']}ms")
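The sort_columns and facets parameters of similarity_search take plain strings in the formats listed in the parameter descriptions. The sketch below assembles those strings; the helper names (sort_clause, facet_top, facet_buckets, check_facets) and the column names are hypothetical, and the duplicate check assumes column names contain no spaces.

```python
from typing import List, Optional, Sequence, Tuple


def sort_clause(column: str, descending: bool = False) -> str:
    """Build a "<column> ASC" or "<column> DESC" clause."""
    return f"{column} {'DESC' if descending else 'ASC'}"


def facet_top(column: str, n: Optional[int] = None) -> str:
    """Build "<column>" (service default) or "<column> TOP <n>" with n > 0."""
    if n is None:
        return column
    if n <= 0:
        raise ValueError("n must be greater than 0")
    return f"{column} TOP {n}"


def facet_buckets(column: str, ranges: Sequence[Tuple[float, float]]) -> str:
    """Build "<column> BUCKETS [[from,to],...]" from inclusive numeric ranges."""
    body = ",".join(f"[{low},{high}]" for low, high in ranges)
    return f"{column} BUCKETS [{body}]"


def check_facets(facets: List[str]) -> List[str]:
    """Reject lists where a column appears more than once (assumes no spaces in column names)."""
    seen = set()
    for entry in facets:
        column = entry.split()[0]
        if column in seen:
            raise ValueError(f"column {column!r} appears in more than one facet")
        seen.add(column)
    return facets


sort_columns = [sort_clause("rating", descending=True), sort_clause("price")]
facets = check_facets([
    facet_top("category"),
    facet_top("brand", 5),
    facet_buckets("price", [(0, 50), (50, 100)]),
])
```

The resulting lists can then be passed as the sort_columns and facets arguments of similarity_search.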
- async sync()
Sync the index with its source Delta table. This only works for a managed Delta Sync index with pipeline type “TRIGGERED”.
- async upsert(inputs)
Upsert data into the index.
- Parameters:
inputs – List of dictionaries to upsert into the index.
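As an illustration of the inputs shape, the sketch below sends a list of dictionaries to upsert in fixed-size batches. Batching is an optional design choice, not something the reference requires; batch_size, the "id" and "text" column names, and RecordingIndex (a hypothetical stand-in that only records calls) are all assumptions made so the example runs without a workspace.

```python
import asyncio
from typing import Any, Dict, Iterator, List


def chunked(rows: List[Dict[str, Any]], size: int) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of at most `size` rows."""
    if size <= 0:
        raise ValueError("size must be greater than 0")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


async def upsert_in_batches(index: Any, rows: List[Dict[str, Any]], batch_size: int = 100) -> int:
    """Await index.upsert once per batch and return the number of rows sent."""
    sent = 0
    for batch in chunked(rows, batch_size):
        await index.upsert(batch)
        sent += len(batch)
    return sent


class RecordingIndex:
    """Hypothetical stand-in that records each upsert call instead of calling the service."""

    def __init__(self) -> None:
        self.calls: List[List[Dict[str, Any]]] = []

    async def upsert(self, inputs: List[Dict[str, Any]]) -> None:
        self.calls.append(list(inputs))


demo_index = RecordingIndex()
demo_rows = [{"id": i, "text": f"doc {i}"} for i in range(5)]  # hypothetical columns
rows_sent = asyncio.run(upsert_in_batches(demo_index, demo_rows, batch_size=2))
```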
- class databricks.ai_search.reranker.DatabricksReranker(columns_to_rerank: list[str])
Bases:
Reranker
Initialize a DatabricksReranker config object.
- Args:
columns_to_rerank: A list of column names to use for reranking the results.
- class databricks.ai_search.reranker.ExperimentalDatabricksFinetunedReranker(columns_to_rerank: list[str], endpoint_name: str | None = None)
Bases:
Reranker
EXPERIMENTAL. Not covered by SDK compatibility guarantees.
Rerank query results using a finetuned reranker model hosted on a Model Serving endpoint in the caller’s workspace (typically created by the reranker finetuning job).
Initialize an ExperimentalDatabricksFinetunedReranker.
- Args:
- columns_to_rerank: List of column names to concatenate and send to the reranker model for each document.
- endpoint_name: Name of the Model Serving endpoint hosting the finetuned reranker. If None, the index resolves it at query time via _default_finetuned_endpoint_name() using the index’s UC table UUID (see that function for the exact formula). This requires one extra describe() round-trip the first time the index handles a finetuned-reranker query; the UUID is cached on the index for subsequent queries.
- exception databricks.ai_search.exceptions.AISearchException(message, status_code=None, response_content=None)
Bases:
Exception
Base exception for all AI Search SDK errors.
- Attributes:
status_code: HTTP status code if applicable
response_content: Raw response content from the API
- exception databricks.ai_search.exceptions.BadRequest(message, status_code=None, response_content=None)
Bases:
AISearchException
- exception databricks.ai_search.exceptions.InvalidInputException(message, status_code=None, response_content=None)
Bases:
AISearchException
- exception databricks.ai_search.exceptions.NotFound(message, status_code=None, response_content=None)
Bases:
AISearchException
- exception databricks.ai_search.exceptions.PermissionDenied(message, status_code=None, response_content=None)
Bases:
AISearchException
- exception databricks.ai_search.exceptions.ResourceConflict(message, status_code=None, response_content=None)
Bases:
AISearchException
- exception databricks.ai_search.exceptions.TooManyRequests(message, status_code=None, response_content=None)
Bases:
AISearchException
- databricks.ai_search.exceptions.VectorSearchException
alias of
AISearchException
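Since every exception listed above derives from AISearchException, callers can catch specific subclasses first and fall back to the base class. The sketch below shows that pattern. To stay runnable without the package installed, it defines local stand-ins with the documented names and constructor signature; in real code these would be imported from databricks.ai_search.exceptions. The summarize helper, the stand-ins' attribute storage, and the status codes used in the demo are illustrative assumptions.

```python
from typing import Any, Optional


# Local stand-ins mirroring the documented hierarchy (assumption: real code imports these).
class AISearchException(Exception):
    def __init__(self, message: str, status_code: Optional[int] = None, response_content: Any = None) -> None:
        super().__init__(message)
        self.status_code = status_code
        self.response_content = response_content


class NotFound(AISearchException):
    pass


class TooManyRequests(AISearchException):
    pass


# Backward-compat alias: the same class object, as the reference states.
VectorSearchException = AISearchException


def summarize(exc: Exception) -> str:
    """Check the most specific subclasses first, then the shared base class."""
    if isinstance(exc, TooManyRequests):
        return f"rate limited (HTTP {exc.status_code}); retry later"
    if isinstance(exc, NotFound):
        return "index or endpoint not found"
    if isinstance(exc, AISearchException):
        return f"AI Search error (HTTP {exc.status_code}): {exc}"
    return f"unexpected error: {exc}"


# Illustrative values only; the reference does not list which status codes each class carries.
rate_limited = summarize(TooManyRequests("slow down", status_code=429))
missing = summarize(NotFound("no such index", status_code=404))
generic = summarize(VectorSearchException("bad state", status_code=500))
```

Because the alias is the same class, code that still catches VectorSearchException also catches errors raised under the AISearchException name.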