pyspark.RDD.pipe
RDD.pipe(command: str, env: Optional[Dict[str, str]] = None, checkCode: bool = False) → pyspark.rdd.RDD[str]
Return an RDD created by piping elements to a forked external process.
Parameters
- command : str
  command to run.
- env : dict, optional
  environment variables to set.
- checkCode : bool, optional
  whether or not to check the return value of the shell command.
Examples
>>> sc.parallelize(['1', '2', '', '3']).pipe('cat').collect()
['1', '2', '', '3']
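To make "piping elements to a forked external process" more concrete, the following is a minimal local sketch that needs no Spark installation. It is not Spark's implementation. The helper name `pipe_partition`, its `argv` and `check_code` parameters, and the `ECHO` command are hypothetical and exist only for illustration. The sketch assumes each element is written to the child's standard input as one line, and each line of the child's standard output becomes one output element. Unlike `RDD.pipe`, it takes the command as an argument list rather than a single string, and it lets the child inherit the parent's environment when `env` is None.

```python
import subprocess
import sys
from typing import Dict, Iterable, List, Optional

# Portable stand-in for `cat`: a child Python process that copies stdin to stdout.
ECHO = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]


def pipe_partition(
    elements: Iterable[object],
    argv: List[str],
    env: Optional[Dict[str, str]] = None,
    check_code: bool = False,
) -> List[str]:
    """Hypothetical helper: send elements to a child process, one per line."""
    # Serialize every element as a single newline-terminated line.
    payload = "".join(str(e).rstrip("\n") + "\n" for e in elements)

    # Run the child, feed it the payload on stdin, and capture its stdout.
    # Assumption of this sketch: env=None means "inherit the parent's environment".
    proc = subprocess.run(
        argv,
        input=payload,
        stdout=subprocess.PIPE,
        env=env,
        text=True,
    )

    # Mirror the idea behind checkCode: fail loudly on a non-zero exit status.
    if check_code and proc.returncode != 0:
        raise RuntimeError(
            f"command {argv!r} exited with status {proc.returncode}"
        )

    # Each output line becomes one result element; empty lines are preserved.
    return proc.stdout.splitlines()


if __name__ == "__main__":
    print(pipe_partition(["1", "2", "", "3"], ECHO))
```

Passing `check_code=True` in this sketch raises `RuntimeError` when the child exits with a non-zero status, which is the kind of failure `checkCode` is meant to surface.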