Databricks DSPy Integrations Python API

Setup:

Install databricks-dspy.

pip install -U databricks-dspy

If you are outside Databricks, set the Databricks workspace hostname and personal access token as environment variables:

export DATABRICKS_HOSTNAME="https://your-databricks-workspace"
export DATABRICKS_TOKEN="your-personal-access-token"
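Equivalently, the variables can be set from Python before any client is created (a minimal sketch; the hostname and token values are placeholders to substitute with your own):

```python
import os

# Placeholder values; substitute your workspace URL and personal access token.
os.environ["DATABRICKS_HOSTNAME"] = "https://your-databricks-workspace"
os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"

# Clients created afterwards (e.g. WorkspaceClient()) pick these up as
# default credentials from the environment.
```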
class databricks_dspy.DatabricksLM(model: str, workspace_client: WorkspaceClient | None = None, create_pt_endpoint: bool = False, pt_entity: PtServedModel | None = None, **kwargs)

Bases: LM

Subclass of dspy.LM for compatibility with Databricks.

Parameters:
  • model – The model to use. Must start with ‘databricks/’.

  • workspace_client – The workspace client to use. If not provided, a new one will be created with default credentials from the environment.

  • create_pt_endpoint – Whether to create a provisioned throughput endpoint to make LM calls.

  • pt_entity – The entity to serve; used only when create_pt_endpoint is True.

Example 1: Use a Databricks model with preconfigured workspace client.

import dspy
import databricks_dspy
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
lm = databricks_dspy.DatabricksLM(
    "databricks/databricks-llama-4-maverick",
    workspace_client=w,
)
dspy.configure(lm=lm)

predict = dspy.Predict("q->a")
print(predict(q="why did a chicken cross the kitchen?"))

Example 2: Create a provisioned throughput endpoint for a Databricks model.

import dspy
import databricks_dspy
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import PtServedModel

w = WorkspaceClient()
entity = PtServedModel(
    entity_name="system.ai.llama-4-maverick",
    entity_version="1",
    provisioned_model_units=50,
)
lm = databricks_dspy.DatabricksLM(
    "databricks/provisioned-llama-4-maverick",
    workspace_client=w,
    create_pt_endpoint=True,
    pt_entity=entity,
)
dspy.configure(lm=lm)

predict = dspy.Predict("q->a")
print(predict(q="why did a chicken cross the kitchen?"))

tear_down()

Tear down resources created by this LM, such as the provisioned throughput endpoint created when create_pt_endpoint is True.
forward(**kwargs)

Forward pass for the language model.

Subclasses must implement this method, and the response must match one of the following formats:

  • [OpenAI response format](https://platform.openai.com/docs/api-reference/responses/object)

  • [OpenAI chat completion format](https://platform.openai.com/docs/api-reference/chat/object)

  • [OpenAI text completion format](https://platform.openai.com/docs/api-reference/completions/object)
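For illustration, a response in the OpenAI chat completion format is a dictionary shaped roughly like this (a minimal sketch; all field values are placeholders):

```python
# Minimal illustration of the OpenAI chat completion response shape that a
# forward() implementation may return. All values here are placeholders.
response = {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "model": "databricks-llama-4-maverick",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello!"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7},
}

# The generated text lives at choices[0].message.content.
text = response["choices"][0]["message"]["content"]
```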

class databricks_dspy.DatabricksRM(databricks_index_name: str, databricks_endpoint: str | None = None, databricks_token: str | None = None, databricks_client_id: str | None = None, databricks_client_secret: str | None = None, columns: list[str] | None = None, filters_json: str | None = None, k: int = 3, docs_id_column_name: str = 'id', docs_uri_column_name: str | None = None, text_column_name: str = 'text', use_with_databricks_agent_framework: bool = False, workspace_client: WorkspaceClient | None = None)

Bases: Retrieve

A retriever module that uses a Databricks Mosaic AI Vector Search Index to return the top-k embeddings for a given query.

Examples

Below is a code snippet that shows how to set up a Databricks Vector Search Index and configure a DatabricksRM DSPy retriever module to query the index.

(Example adapted from "Databricks: How to create and query a Vector Search Index": https://docs.databricks.com/en/generative-ai/create-query-vector-search.html#create-a-vector-search-index)

from databricks.vector_search.client import VectorSearchClient
from databricks.sdk import WorkspaceClient

# Create a Databricks workspace client
w = WorkspaceClient()

# Create a Databricks Vector Search Endpoint
client = VectorSearchClient()
client.create_endpoint(name="your_vector_search_endpoint_name", endpoint_type="STANDARD")

# Create a Databricks Direct Access Vector Search Index
index = client.create_direct_access_index(
    endpoint_name="your_vector_search_endpoint_name",
    index_name="your_index_name",
    primary_key="id",
    embedding_dimension=1024,
    embedding_vector_column="text_vector",
    schema={
        "id": "int",
        "field2": "str",
        "field3": "float",
        "text_vector": "array<float>",
    },
)

# Create a DatabricksRM retriever module to query the Databricks Direct Access Vector
# Search Index
retriever = DatabricksRM(
    databricks_index_name="your_index_name",
    docs_id_column_name="id",
    text_column_name="field2",
    k=3,
    workspace_client=w,
)

Below is a code snippet that shows how to query the Databricks Direct Access Vector Search Index using the DatabricksRM retriever module:

retrieved_results = retriever(query="Example query text")
forward(query: str | list[float], query_type: str = 'ANN', filters_json: str | None = None, query_vector: list[float] | None = None) → Prediction | list[dict[str, Any]]

Retrieve documents from a Databricks Mosaic AI Vector Search Index that are relevant to the specified query.

Parameters:
  • query (Union[str, list[float]]) – The query text or numeric query vector for which to retrieve relevant documents.

  • query_type (str) – The type of search query to perform against the Databricks Vector Search Index. Must be either ‘ANN’ (approximate nearest neighbor) or ‘HYBRID’ (hybrid search).

  • filters_json (Optional[str]) – A JSON string specifying additional query filters. Example filters: {"id <": 5} selects records that have an id column value less than 5, and {"id >=": 5, "id <": 10} selects records that have an id column value greater than or equal to 5 and less than 10. If specified, this parameter overrides the filters_json parameter passed to the constructor.

  • query_vector (Optional[list[float]]) – An optional query vector to use in combination with the query text for HYBRID search. This parameter can only be provided when query_type is ‘HYBRID’ and query is a string. When provided, both the query text and query vector will be used for hybrid search.

Returns:

A list of dictionaries when use_with_databricks_agent_framework is True, or a dspy.Prediction object when use_with_databricks_agent_framework is False.
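Since filters_json is a plain JSON string, building it with json.dumps avoids quoting mistakes (a minimal sketch using the second filter from the parameter description above):

```python
import json

# Selects records with an id greater than or equal to 5 and less than 10.
filters = {"id >=": 5, "id <": 10}
filters_json = json.dumps(filters)

# The resulting string can be passed either to the DatabricksRM constructor
# or to forward(), where it overrides the constructor's value.
```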