pyspark.sql.DataFrame.dropna
DataFrame.dropna(how: str = 'any', thresh: Optional[int] = None, subset: Union[str, Tuple[str, …], List[str], None] = None) → pyspark.sql.dataframe.DataFrame

Returns a new DataFrame omitting rows with null values. DataFrame.dropna() and DataFrameNaFunctions.drop() are aliases of each other.

Parameters
- how: str, optional
  'any' or 'all'. If 'any', drop a row if it contains any nulls. If 'all', drop a row only if all its values are null. Defaults to 'any'.
- thresh: int, optional
  Default None. If specified, drop rows that have fewer than thresh non-null values. This overrides the how parameter.
- subset: str, tuple or list, optional
  Column name, or tuple or list of column names, to consider. By default all columns are considered.
Examples
>>> df4.na.drop().show()
+---+------+-----+
|age|height| name|
+---+------+-----+
| 10|    80|Alice|
+---+------+-----+
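
How how, thresh, and subset interact is easiest to see on a small in-memory model. The following is a minimal plain-Python sketch of the row-filtering rule described under Parameters. It is not Spark's implementation and does not need a Spark session. The helper name dropna_rows and the sample rows are hypothetical, chosen so that the default call keeps only the Alice row, as in the example output above.

```python
from typing import Any, Dict, List, Optional, Sequence, Union

Row = Dict[str, Any]


def dropna_rows(
    rows: List[Row],
    how: str = "any",
    thresh: Optional[int] = None,
    subset: Optional[Union[str, Sequence[str]]] = None,
) -> List[Row]:
    """Hypothetical pure-Python model of the dropna() filtering rule."""
    if how not in ("any", "all"):
        raise ValueError("how should be 'any' or 'all'")
    if not rows:
        return []

    # subset may be a single column name or a collection of names;
    # with no subset, every column is considered.
    if subset is None:
        columns = list(rows[0].keys())
    elif isinstance(subset, str):
        columns = [subset]
    else:
        columns = list(subset)

    # An explicit thresh takes precedence over how. Otherwise 'any'
    # requires every considered column to be non-null, and 'all'
    # requires at least one.
    if thresh is None:
        thresh = len(columns) if how == "any" else 1

    # Keep a row only if it has at least thresh non-null values.
    return [
        row
        for row in rows
        if sum(row[c] is not None for c in columns) >= thresh
    ]


# Hypothetical sample data (None plays the role of null).
rows = [
    {"age": 10, "height": 80, "name": "Alice"},
    {"age": 5, "height": None, "name": "Bob"},
    {"age": None, "height": None, "name": "Tom"},
    {"age": None, "height": None, "name": None},
]

default_names = [r["name"] for r in dropna_rows(rows)]
all_names = [r["name"] for r in dropna_rows(rows, how="all")]
thresh_names = [r["name"] for r in dropna_rows(rows, thresh=2)]
subset_names = [r["name"] for r in dropna_rows(rows, subset="name")]
```

In this model, the default call keeps only Alice, how="all" drops only the fully null row, thresh=2 keeps Alice and Bob regardless of how, and subset="name" keeps every row whose name is not null.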