pyspark.sql.DataFrame.explain
DataFrame.explain(extended: Union[bool, str, None] = None, mode: Optional[str] = None) → None
Prints the (logical and physical) plans to the console for debugging purposes.
Parameters
    extended : bool, optional
        Default False. If False, prints only the physical plan. When this is a string and mode is not specified, the string is treated as the mode.
    mode : str, optional
        Specifies the expected output format of plans.
        simple: Print only a physical plan.
        extended: Print both logical and physical plans.
        codegen: Print a physical plan and the generated code, if available.
        cost: Print a logical plan and statistics, if available.
        formatted: Split explain output into two sections: a physical plan outline and node details.
        Added optional argument mode to specify the expected output format of plans.
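The way extended and mode interact can be sketched in plain Python. The helper below, resolve_explain_mode, is a hypothetical name and not part of PySpark. It only mirrors the rules described in the parameter list. Rejecting calls that set both arguments, and accepting only exact lowercase mode names, are assumptions made for this sketch.

```python
from typing import Optional, Union

# Mode names taken from the parameter description.
VALID_MODES = ("simple", "extended", "codegen", "cost", "formatted")


def resolve_explain_mode(
    extended: Union[bool, str, None] = None,
    mode: Optional[str] = None,
) -> str:
    """Hypothetical helper: return the mode name a call would resolve to."""
    if extended is not None and mode is not None:
        # Assumption: setting both is ambiguous, so this sketch rejects it.
        raise ValueError("extended and mode should not be set together")
    if isinstance(extended, str):
        # A string passed as `extended` is treated as the mode.
        mode = extended
    elif mode is None:
        # No mode given: True selects "extended", False or None "simple".
        mode = "extended" if extended else "simple"
    if mode not in VALID_MODES:
        # Assumption: only exact lowercase names are accepted here.
        raise ValueError(f"unknown explain mode: {mode!r}")
    return mode
```

Under these assumptions, resolve_explain_mode(True) and resolve_explain_mode("extended") resolve to the same mode, which matches the equivalence the parameter description implies.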
Examples
>>> df.explain()
== Physical Plan ==
*(1) Scan ExistingRDD[age#0,name#1]

>>> df.explain(True)
== Parsed Logical Plan ==
...
== Analyzed Logical Plan ==
...
== Optimized Logical Plan ==
...
== Physical Plan ==
...

>>> df.explain(mode="formatted")
== Physical Plan ==
* Scan ExistingRDD (1)
(1) Scan ExistingRDD [codegen id : 1]
Output [2]: [age#0, name#1]
...

>>> df.explain("cost")
== Optimized Logical Plan ==
...Statistics...
...
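Because explain() returns None and writes to the console, the plan text is not directly available for logging or comparison. One possible workaround is sketched below, assuming the plan is written through Python's sys.stdout; this is worth verifying against your Spark version. explain_to_string is a hypothetical helper, not a PySpark API.

```python
import contextlib
import io


def explain_to_string(df, *args, **kwargs) -> str:
    """Hypothetical helper: call df.explain(...) and return what it printed.

    Assumes explain() writes through Python's sys.stdout. Output written
    directly by the JVM process would not be captured.
    """
    buffer = io.StringIO()
    # Temporarily route anything printed to sys.stdout into the buffer.
    with contextlib.redirect_stdout(buffer):
        df.explain(*args, **kwargs)
    return buffer.getvalue()
```

For instance, plan = explain_to_string(df, mode="formatted") would hold the same text that df.explain(mode="formatted") prints, provided the stdout assumption holds.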