pyspark.sql.DataFrame.semanticHash

DataFrame.semanticHash() → int

Returns a hash code of the logical query plan of this DataFrame.

Notes

Unlike a standard hash code, this hash is computed from a simplified form of the query plan that tolerates cosmetic differences such as attribute names. Two DataFrames whose plans differ only in such cosmetic details therefore produce the same hash.

This API is a developer API.

Examples

>>> spark.range(10).selectExpr("id as col0").semanticHash()  
1855039936
>>> spark.range(10).selectExpr("id as col1").semanticHash()  
1855039936