pyspark.sql.functions.min_by

pyspark.sql.functions.min_by(col: ColumnOrName, ord: ColumnOrName) → pyspark.sql.column.Column

Returns the value associated with the minimum value of ord.

Parameters
col : Column or str

target column whose value will be returned

ord : Column or str

column to be minimized

Returns
Column

value associated with the minimum value of ord.

Examples

>>> from pyspark.sql.functions import min_by
>>> df = spark.createDataFrame([
...     ("Java", 2012, 20000), ("dotNET", 2012, 5000),
...     ("dotNET", 2013, 48000), ("Java", 2013, 30000)],
...     schema=("course", "year", "earnings"))
>>> df.groupby("course").agg(min_by("year", "earnings")).show()
+------+----------------------+
|course|min_by(year, earnings)|
+------+----------------------+
|  Java|                  2012|
|dotNET|                  2012|
+------+----------------------+
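
To make the per-group semantics concrete, the following is a minimal plain-Python sketch of what the aggregation above computes. It is not how Spark implements min_by. The helper name min_by_per_group is hypothetical, the sketch assumes no null values in the ordering column, and it resolves ties by keeping the first row seen, which is a choice of this sketch and not a guarantee of the Spark function.

```python
def min_by_per_group(rows, key, col, ord_col):
    """For each distinct `key`, return the `col` value taken from the
    row that has the smallest `ord_col` value (hypothetical helper)."""
    best = {}  # key -> (smallest ord value seen so far, matching col value)
    for row in rows:
        k, o, v = row[key], row[ord_col], row[col]
        # Strict "<" keeps the first row seen on ties (sketch assumption).
        if k not in best or o < best[k][0]:
            best[k] = (o, v)
    return {k: v for k, (_, v) in best.items()}


# Same data as the DataFrame example above.
rows = [
    {"course": "Java", "year": 2012, "earnings": 20000},
    {"course": "dotNET", "year": 2012, "earnings": 5000},
    {"course": "dotNET", "year": 2013, "earnings": 48000},
    {"course": "Java", "year": 2013, "earnings": 30000},
]

result = min_by_per_group(rows, key="course", col="year", ord_col="earnings")
print(result)  # {'Java': 2012, 'dotNET': 2012}
```

For each course, the returned year is the one on the row with the lowest earnings, which matches the table produced by min_by("year", "earnings").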