pyspark.RDD.saveAsSequenceFile
RDD.saveAsSequenceFile(path: str, compressionCodecClass: Optional[str] = None) → None

Output a Python RDD of key-value pairs (of form RDD[(K, V)]) to any Hadoop file system, using the “org.apache.hadoop.io.Writable” types that are converted from the RDD’s key and value types. The mechanism is as follows:

1. Pickle is used to convert the pickled Python RDD into an RDD of Java objects.
2. Keys and values of this Java RDD are converted to Writables and written out.
- Parameters
- path : str
path to the sequence file
- compressionCodecClass : str, optional
fully qualified class name of the compression codec class, e.g. “org.apache.hadoop.io.compress.GzipCodec” (None by default)
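
- Examples
The following is a minimal sketch of a write-then-read round trip, not a definitive recipe. It assumes an active SparkContext is passed in as `sc`; the helper names `sample_pairs` and `round_trip` are hypothetical and exist only for this illustration. The file is read back with `SparkContext.sequenceFile`.

```python
import os
import tempfile


def sample_pairs():
    # Hypothetical sample data: int keys and str values, both of which
    # can be converted to Hadoop Writable types.
    return [(1, ""), (1, "a"), (3, "x")]


def round_trip(sc, pairs, codec=None):
    """Save pairs as a sequence file, load them back, and return them sorted.

    `sc` is assumed to be an active SparkContext. `codec` is passed through
    as compressionCodecClass (None means no codec is requested).
    """
    with tempfile.TemporaryDirectory(prefix="saveAsSequenceFile") as d:
        # The output path must not exist yet, so use a fresh name
        # inside the temporary directory.
        path = os.path.join(d, "sequence_file")
        sc.parallelize(pairs).saveAsSequenceFile(path, compressionCodecClass=codec)
        # Collect before the temporary directory is removed.
        return sorted(sc.sequenceFile(path).collect())
```

With a running context, `round_trip(sc, sample_pairs())` writes an uncompressed file, and `round_trip(sc, sample_pairs(), "org.apache.hadoop.io.compress.DefaultCodec")` requests compression. `DefaultCodec` is used here as an assumption about what is available in a local setup; depending on the Hadoop build, `GzipCodec` may need native Hadoop libraries when used with sequence files.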