pyspark.streaming.StreamingContext.binaryRecordsStream
StreamingContext.binaryRecordsStream(directory: str, recordLength: int) → pyspark.streaming.dstream.DStream[bytes]

Create an input stream that monitors a Hadoop-compatible file system for new files and reads them as flat binary files with fixed-length records. Files must be written to the monitored directory by "moving" them from another location within the same file system. Files whose names start with "." are ignored.
Parameters
- directory : str
  Directory to load data from.
- recordLength : int
  Length of each record in bytes.
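
Examples

The following is a minimal sketch of how the two parameters fit together, not an official example. It assumes a hypothetical record layout (a little-endian int32 followed by a float32, 8 bytes in total) and hypothetical helper names (`decode_record`, `publish_records`, `start_stream`). Files are written to a staging directory first and then renamed into the monitored directory, which is one way to satisfy the "moving" requirement; this assumes both directories are on the same local file system.

```python
import os
import struct

# Hypothetical record layout: little-endian int32 id + float32 value.
RECORD_FORMAT = "<if"
RECORD_LENGTH = struct.calcsize(RECORD_FORMAT)  # 8 bytes, passed as recordLength


def decode_record(record: bytes) -> tuple:
    """Unpack one fixed-length record into an (id, value) tuple."""
    return struct.unpack(RECORD_FORMAT, record)


def publish_records(records, staging_dir, monitored_dir, name):
    """Write records to a staging file, then move it into the monitored directory."""
    staging_path = os.path.join(staging_dir, name)
    with open(staging_path, "wb") as f:
        for rec_id, value in records:
            f.write(struct.pack(RECORD_FORMAT, rec_id, value))
    final_path = os.path.join(monitored_dir, name)
    # Rename instead of writing in place, so the stream never sees a partial file.
    os.replace(staging_path, final_path)
    return final_path


def start_stream(monitored_dir, batch_seconds=5):
    """Start a streaming job that prints the decoded records of each batch."""
    # Imported here so the helpers above can be used without a Spark installation.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext(appName="BinaryRecordsStreamSketch")
    ssc = StreamingContext(sc, batch_seconds)
    records = ssc.binaryRecordsStream(monitored_dir, RECORD_LENGTH)
    records.map(decode_record).pprint()
    ssc.start()
    return ssc
```

A typical session would call `start_stream(monitored_dir)` first and then `publish_records(...)` for each new file. Each file's size should be a multiple of `RECORD_LENGTH`, since the stream splits files into records of exactly that many bytes.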