pyspark.RDD.saveAsHadoopDataset
RDD.saveAsHadoopDataset(conf: Dict[str, str], keyConverter: Optional[str] = None, valueConverter: Optional[str] = None) → None

Output a Python RDD of key-value pairs (of form RDD[(K, V)]) to any Hadoop file system, using the old Hadoop OutputFormat API (mapred package). Keys/values are converted for output using either user-specified converters or, by default, "org.apache.spark.api.python.JavaToWritableConverter".

Parameters
- conf : dict
  Hadoop job configuration
- keyConverter : str, optional
  fully qualified classname of key converter (None by default)
- valueConverter : str, optional
  fully qualified classname of value converter (None by default)