pyspark.sql.DataFrameWriter.json
DataFrameWriter.json(path, mode=None, compression=None, dateFormat=None, timestampFormat=None, lineSep=None, encoding=None, ignoreNullFields=None)
Saves the content of the DataFrame in JSON format (JSON Lines text format, also called newline-delimited JSON) at the specified path.
New in version 1.4.0.
- Parameters
- path : str
the path in any Hadoop supported file system
- mode : str, optional
specifies the behavior of the save operation when data already exists.
  - append: Append contents of this DataFrame to existing data.
  - overwrite: Overwrite existing data.
  - ignore: Silently ignore this operation if data already exists.
  - error or errorifexists (default case): Throw an exception if data already exists.
- compression : str, optional
compression codec to use when saving to file. This can be one of the known case-insensitive shortened names (none, bzip2, gzip, lz4, snappy and deflate).
- dateFormat : str, optional
sets the string that indicates a date format. Custom date formats follow Spark's datetime pattern syntax. This applies to date type. If None is set, it uses the default value, yyyy-MM-dd.
- timestampFormat : str, optional
sets the string that indicates a timestamp format. Custom timestamp formats follow Spark's datetime pattern syntax. This applies to timestamp type. If None is set, it uses the default value, yyyy-MM-dd'T'HH:mm:ss[.SSS][XXX].
- encoding : str, optional
specifies the encoding (charset) of the saved JSON files. If None is set, the default UTF-8 charset is used.
- lineSep : str, optional
defines the line separator that should be used for writing. If None is set, it uses the default value, \n.
- ignoreNullFields : str or bool, optional
whether to ignore null fields when generating JSON objects. If None is set, it uses the default value, true.
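As a rough illustration of how lineSep and ignoreNullFields shape the output, the following plain-Python sketch mimics the JSON Lines layout: one compact JSON object per row, each followed by the line separator, with null fields dropped by default. The helper to_json_lines is hypothetical and not part of PySpark; it only approximates the layout and does not reproduce Spark's handling of nested types, dates, or encodings.

```python
import json


def to_json_lines(rows, line_sep="\n", ignore_null_fields=True):
    """Illustrative only: lay out dict rows as JSON Lines text.

    `to_json_lines` is a hypothetical helper, not a PySpark API.
    """
    lines = []
    for row in rows:
        if ignore_null_fields:
            # Drop keys whose value is None, mirroring ignoreNullFields=true.
            row = {key: value for key, value in row.items() if value is not None}
        # Compact separators give one object per line without extra spaces.
        lines.append(json.dumps(row, separators=(",", ":")))
    # Every record, including the last, is terminated by the separator.
    return "".join(line + line_sep for line in lines)


rows = [{"name": "Alice", "age": 30}, {"name": "Bob", "age": None}]
print(to_json_lines(rows), end="")
# {"name":"Alice","age":30}
# {"name":"Bob"}
```

Passing ignore_null_fields=False to the sketch keeps the null entry, so the second line becomes {"name":"Bob","age":null}.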
Examples
>>> import os, tempfile
>>> df.write.json(os.path.join(tempfile.mkdtemp(), 'data'))