DataFrame.to_parquet(path, engine='auto', compression='snappy', index=None, partition_cols=None, **kwargs)
Write a DataFrame to the binary parquet format.
New in version 0.21.0.
This function writes the dataframe as a parquet file. You can choose different parquet backends, and have the option of compression. See the user guide for more details.
Parameters

path : str
File path or root directory path. Will be used as the root directory path while writing a partitioned dataset.
Changed in version 1.0.0: Previously this was "fname".
engine : {'auto', 'pyarrow', 'fastparquet'}, default 'auto'
Parquet library to use. If 'auto', then the option io.parquet.engine is used. The default io.parquet.engine behavior is to try 'pyarrow', falling back to 'fastparquet' if 'pyarrow' is unavailable. (See the engine example below.)
compression : {'snappy', 'gzip', 'brotli', None}, default 'snappy'
Name of the compression to use. Use None for no compression.
index : bool, default None
If True, include the dataframe's index(es) in the file output. If False, they will not be written to the file. If None, similar to True, the dataframe's index(es) will be saved; however, instead of being saved as values, a RangeIndex will be stored as a range in the metadata, so it requires little space and is faster. Other indexes will be included as columns in the file output. (See the index example below.)
New in version 0.24.0.
partition_cols : list, optional, default None
Column names by which to partition the dataset. Columns are partitioned in the order they are given. (See the partitioning example below.)
**kwargs
Additional arguments passed to the parquet library. See pandas io for more details.
See also
read_parquet
Read a parquet file.
DataFrame.to_csv
Write a CSV file.
DataFrame.to_sql
Write to a SQL table.
DataFrame.to_hdf
Write to HDF.
Notes
This function requires either the fastparquet or pyarrow library.
Examples
>>> df = pd.DataFrame(data={'col1': [1, 2], 'col2': [3, 4]})
>>> df.to_parquet('df.parquet.gzip',
...               compression='gzip')
>>> pd.read_parquet('df.parquet.gzip')
   col1  col2
0     1     3
1     2     4
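If both backends are installed, the engine can be selected per call or pinned globally through the io.parquet.engine option. A minimal sketch, assuming pyarrow is available (the file name is illustrative):

>>> pd.set_option('io.parquet.engine', 'pyarrow')  # global default used when engine='auto'
>>> df.to_parquet('df.parquet', engine='pyarrow')  # or force the backend per call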
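To keep the index out of the file entirely, pass index=False. A minimal sketch (file name illustrative); note that the RangeIndex shown after reading back is freshly generated, not read from the file:

>>> df.to_parquet('df_noindex.parquet', index=False)  # index is not written
>>> pd.read_parquet('df_noindex.parquet')
   col1  col2
0     1     3
1     2     4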
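With partition_cols, path is treated as a root directory and the data is split into one subdirectory per partition value (hive-style layout, e.g. dataset_dir/col1=1/). A minimal sketch, assuming the pyarrow engine and a hypothetical directory name:

>>> df.to_parquet('dataset_dir', partition_cols=['col1'])  # writes a directory, not a single file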