Feature Table

Classes

class databricks.ml_features.entities.feature_table.FeatureTable(name, table_id, description, primary_keys, partition_columns, features, creation_timestamp=None, online_stores=None, notebook_producers=None, job_producers=None, table_data_sources=None, path_data_sources=None, custom_data_sources=None, timestamp_keys=None, tags=None)

Note

Aliases: databricks.feature_engineering.entities.feature_table.FeatureTable, databricks.feature_store.entities.feature_table.FeatureTable

Value class describing one feature table.

This class is typically not instantiated directly. Instead, create_table() creates FeatureTable objects.

Time Windows

class databricks.ml_features.entities.time_window.TimeWindow(*, duration: timedelta, offset: Optional[timedelta] = None, _start_inclusive: bool = True, _end_inclusive: bool = False)

Bases: _FeatureStoreObject

Defines an aggregation time window.

Parameters
  • duration – The length of the time window. This defines how far back in time the window spans from the requested time. This must be positive. The interval defined by this window includes the start (earlier in time) endpoint, but not the end (later in time) endpoint. That is, the interval is [ts - duration, ts).

  • offset – Optional offset to adjust the end of the window. This can be used to shift the window by a certain duration backwards. This must be non-positive if provided. Defaults to 0.
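The interval arithmetic described above can be sketched in plain Python. This is an illustration only, not the library's implementation: window_bounds and in_window are hypothetical helper names, and the sketch assumes that a non-positive offset shifts the whole window (both endpoints) backwards, so the interval becomes [ts + offset - duration, ts + offset).

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def window_bounds(
    ts: datetime, duration: timedelta, offset: Optional[timedelta] = None
) -> Tuple[datetime, datetime]:
    """Return (start, end) of the half-open interval [start, end)."""
    if duration <= timedelta(0):
        raise ValueError("duration must be positive")
    offset = offset or timedelta(0)  # a missing offset is treated as 0
    if offset > timedelta(0):
        raise ValueError("offset must be non-positive")
    end = ts + offset          # assumption: the offset moves the end backwards
    start = end - duration     # the window always spans exactly `duration`
    return start, end


def in_window(
    event_time: datetime,
    ts: datetime,
    duration: timedelta,
    offset: Optional[timedelta] = None,
) -> bool:
    """True if event_time falls in the window (start included, end excluded)."""
    start, end = window_bounds(ts, duration, offset)
    return start <= event_time < end
```

With ts = 2024-01-08 and a 7-day duration, the sketch yields the interval [2024-01-01, 2024-01-08), so an event stamped exactly at ts is excluded.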

__init__(*, duration: timedelta, offset: Optional[timedelta] = None, _start_inclusive: bool = True, _end_inclusive: bool = False)

Initialize a TimeWindow object. See class documentation.

property duration: timedelta

The length of the time window.

property offset: timedelta

The offset to adjust the end of the window.

spark_window(partition_columns: List[str], order_column: str) → WindowSpec

Creates a Spark WindowSpec using rangeBetween with time-based windows.

Parameters:

partition_columns (list[str]): Columns to partition by.

order_column (str): Column to order by (must be a single timestamp column).

Returns:

pyspark.sql.window.Window: A configured WindowSpec.
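For orientation, a common PySpark pattern for time-based rangeBetween windows is sketched below. It is not taken from the library source: range_bounds_seconds and sketch_window_spec are hypothetical names, and the sketch assumes the timestamp column is ordered by its epoch-second value and that the exclusive end is approximated by dropping the final second (rangeBetween is inclusive on both ends).

```python
from datetime import timedelta
from typing import List, Optional, Tuple


def range_bounds_seconds(
    duration: timedelta, offset: Optional[timedelta] = None
) -> Tuple[int, int]:
    """Bounds, in seconds relative to the current row, for rangeBetween."""
    offset = offset or timedelta(0)
    end = int(offset.total_seconds())            # 0 or negative
    start = end - int(duration.total_seconds())  # duration further back
    # rangeBetween includes both endpoints, so subtract one second
    # to leave the end of the window out (second-level granularity).
    return start, end - 1


def sketch_window_spec(
    partition_columns: List[str],
    order_column: str,
    duration: timedelta,
    offset: Optional[timedelta] = None,
):
    """Build a WindowSpec; pyspark is imported lazily so the helper
    above can be used without a Spark installation."""
    from pyspark.sql import Window, functions as F

    lo, hi = range_bounds_seconds(duration, offset)
    return (
        Window.partitionBy(*partition_columns)
        .orderBy(F.col(order_column).cast("long"))  # timestamp -> epoch seconds
        .rangeBetween(lo, hi)
    )
```

The resulting spec could then be applied with an aggregate such as F.sum("amount").over(spec), where amount is a placeholder column name.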

class databricks.ml_features.entities.time_window.TumblingWindow(*, window_duration: timedelta)

Bases: TimeWindow

Tumbling windows partition a continuous stream of data into non-overlapping, fixed-duration windows. Each event belongs to exactly one window.

Example: a 5-day tumbling window creates the windows [Day1-5], [Day6-10], [Day11-15].

Parameters

window_duration – The length of each time window. This must be positive.
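The "exactly one window per event" property can be illustrated with a small stand-alone sketch. The function name tumbling_window_for and the origin parameter (the instant at which the first window starts) are assumptions made for this example. The documentation above does not state how windows are aligned.

```python
from datetime import datetime, timedelta
from typing import Tuple


def tumbling_window_for(
    event_time: datetime,
    window_duration: timedelta,
    origin: datetime = datetime(1970, 1, 1),  # hypothetical alignment point
) -> Tuple[datetime, datetime]:
    """Return the single [start, end) tumbling window containing event_time."""
    if window_duration <= timedelta(0):
        raise ValueError("window_duration must be positive")
    # Floor division of two timedeltas gives the index of the window.
    index = (event_time - origin) // window_duration
    start = origin + index * window_duration
    return start, start + window_duration
```

With origin = 2024-01-01 (Day 1) and a 5-day duration, an event at noon on 2024-01-07 maps to the window starting 2024-01-06, which corresponds to [Day6-10] in the example above.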

__init__(*, window_duration: timedelta)

Initialize a TumblingWindow object. See class documentation.

class databricks.ml_features.entities.time_window.SlidingWindow(*, window_duration: timedelta, slide_duration: timedelta)

Bases: TimeWindow

Sliding windows create overlapping, fixed-duration windows that advance by a specified slide interval. Data points can belong to multiple windows.

Example: a 5-day window with a 1-day slide creates overlapping 5-day periods: [Day1-5], [Day2-6], [Day3-7], etc.

Parameters
  • window_duration – The length of each time window. This must be positive.

  • slide_duration – The interval by which windows advance. This must be positive and less than window_duration.
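To make the overlap concrete, the sketch below lists every sliding window that contains a given event. It is illustrative only: sliding_windows_for and the origin alignment parameter are hypothetical, and windows are treated as half-open [start, end) intervals whose starts fall on multiples of slide_duration from origin.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def sliding_windows_for(
    event_time: datetime,
    window_duration: timedelta,
    slide_duration: timedelta,
    origin: datetime = datetime(1970, 1, 1),  # hypothetical alignment point
) -> List[Tuple[datetime, datetime]]:
    """Return all [start, end) sliding windows containing event_time, oldest first."""
    if window_duration <= timedelta(0):
        raise ValueError("window_duration must be positive")
    if not timedelta(0) < slide_duration < window_duration:
        raise ValueError("slide_duration must be positive and less than window_duration")
    # Latest window start that is at or before the event.
    k = (event_time - origin) // slide_duration
    start = origin + k * slide_duration
    windows = []
    # Walk backwards one slide at a time while the window still covers the event.
    while start + window_duration > event_time:
        windows.append((start, start + window_duration))
        start -= slide_duration
    windows.reverse()
    return windows
```

With a 5-day window, a 1-day slide, and origin = 2024-01-01 (Day 1), an event at noon on Day 5 belongs to the five windows starting on Day 1 through Day 5.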

__init__(*, window_duration: timedelta, slide_duration: timedelta)

Initialize a SlidingWindow object. See class documentation.

property slide_duration: timedelta

The interval by which windows advance.

class databricks.ml_features.entities.time_window.ContinuousWindow(*, window_duration: timedelta, offset: Optional[timedelta] = None)

Bases: TimeWindow

Continuous windows are typically used for point-in-time windows, where duration and offset are explicitly defined. This provides maximum fidelity for windowing scenarios.

Timeline:

<── older ─────────────────── evaluation_time ──> newer

Basic Window (duration=7d):

[─── 7 days ───]|
└─ start        └─ evaluation_time (excluded)

With Offset (duration=7d, offset=-1d):

[─── 7 days ───]        |
└─ start       └─ end   └─ evaluation_time
                        (1d ago)

Parameters
  • window_duration – The length of the window from the event time. This must be positive.

  • offset – Shifts the end of the window backwards from the event time. This must be non-positive if provided. Defaults to 0.
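A point-in-time aggregation over such a window can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the library's behavior: continuous_sum is a hypothetical helper, events are (timestamp, value) pairs, and the window is taken to be [evaluation_time + offset - window_duration, evaluation_time + offset), following the timeline diagram above.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple


def continuous_sum(
    events: Iterable[Tuple[datetime, float]],
    evaluation_time: datetime,
    window_duration: timedelta,
    offset: Optional[timedelta] = None,
) -> float:
    """Sum the values of events that fall inside the window for evaluation_time."""
    if window_duration <= timedelta(0):
        raise ValueError("window_duration must be positive")
    offset = offset or timedelta(0)
    if offset > timedelta(0):
        raise ValueError("offset must be non-positive")
    end = evaluation_time + offset   # end is excluded
    start = end - window_duration    # start is included
    return sum(value for ts, value in events if start <= ts < end)
```

For example, with evaluation_time = 2024-01-08 and a 7-day window, an event stamped exactly at 2024-01-08 is left out. Adding offset = timedelta(days=-1) also leaves out anything from 2024-01-07 onwards.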

__init__(*, window_duration: timedelta, offset: Optional[timedelta] = None)

Initialize a ContinuousWindow object. See class documentation.