pyspark.sql.functions.posexplode_outer

pyspark.sql.functions.posexplode_outer(col: ColumnOrName) → pyspark.sql.column.Column

Returns a new row for each element of the given array or map, together with the element's position. Unlike posexplode, if the array or map is null or empty, a single row is still produced, with null in the position column and in each element column. By default the position column is named pos, array elements go in a column named col, and map entries go in columns named key and value; other names can be given, for example with alias.

Examples

>>> from pyspark.sql.functions import posexplode_outer
>>> df = spark.createDataFrame(
...     [(1, ["foo", "bar"], {"x": 1.0}), (2, [], {}), (3, None, None)],
...     ("id", "an_array", "a_map")
... )
>>> df.select("id", "an_array", posexplode_outer("a_map")).show()
+---+----------+----+----+-----+
| id|  an_array| pos| key|value|
+---+----------+----+----+-----+
|  1|[foo, bar]|   0|   x|  1.0|
|  2|        []|null|null| null|
|  3|      null|null|null| null|
+---+----------+----+----+-----+
>>> df.select("id", "a_map", posexplode_outer("an_array")).show()
+---+----------+----+----+
| id|     a_map| pos| col|
+---+----------+----+----+
|  1|{x -> 1.0}|   0| foo|
|  1|{x -> 1.0}|   1| bar|
|  2|        {}|null|null|
|  3|      null|null|null|
+---+----------+----+----+
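
The row-generation rule shown in the tables above can be sketched in plain Python, without a Spark session. This is only an illustration of the documented behaviour, not Spark's implementation. The helper name posexplode_outer_rows and its is_map flag are hypothetical: the flag stands in for the column type, which Spark knows from the schema even when the value itself is null.

```python
from typing import Any, List, Optional, Tuple


def posexplode_outer_rows(value: Optional[Any], is_map: bool = False) -> List[Tuple]:
    """Return the generated columns for one input cell.

    Hypothetical helper that mimics the documented semantics:
    arrays give (pos, col) tuples, maps give (pos, key, value) tuples,
    and a null or empty input gives one all-null tuple.
    """
    width = 3 if is_map else 2
    if not value:
        # None, [] and {} all produce a single row of nulls.
        return [(None,) * width]
    if is_map:
        return [(pos, key, val) for pos, (key, val) in enumerate(value.items())]
    return [(pos, item) for pos, item in enumerate(value)]


# Mirror the an_array column of the example DataFrame.
data = [(1, ["foo", "bar"]), (2, []), (3, None)]
for row_id, an_array in data:
    for generated in posexplode_outer_rows(an_array):
        print((row_id,) + generated)
```

Each input row keeps its id and is repeated once per generated tuple, so ids 2 and 3 still appear exactly once, which is the difference from posexplode.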