typeddfs.abs_dfs
Defines a low-level DataFrame subclass.
It overrides many methods to automatically change the returned type back to cls.
Module Contents
- class typeddfs.abs_dfs.AbsDf
- classmethod _check(cls, df) -> None
Should raise typeddfs.df_errors.InvalidDfError (or a subclass) if the DataFrame has issues.
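As a rough illustration of the hook described above, the following sketch shows a subclass overriding _check to raise an error derived from a base error class. Everything here is a stand-in: the InvalidDfError class is a local placeholder for typeddfs.df_errors.InvalidDfError, and MissingColumnsError, AbsFrame, NamedFrame, and required_columns are hypothetical names invented for this example, not part of typeddfs.

```python
class InvalidDfError(Exception):
    """Local stand-in for typeddfs.df_errors.InvalidDfError (assumption)."""


class MissingColumnsError(InvalidDfError):
    """Hypothetical subclass used only in this sketch."""


class AbsFrame:
    """Stand-in base class whose validation hook does nothing by default."""

    @classmethod
    def _check(cls, df) -> None:
        return None


class NamedFrame(AbsFrame):
    """Hypothetical subclass that requires two columns."""

    required_columns = ("name", "value")

    @classmethod
    def _check(cls, df) -> None:
        # `in` tests column membership for a dict and for a pandas DataFrame.
        missing = [c for c in cls.required_columns if c not in df]
        if missing:
            raise MissingColumnsError(f"Missing columns: {missing}")
```

Because the raised error derives from the base error, callers can catch the base class without knowing which specific check failed.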
- classmethod can_read(cls) -> Set[typeddfs.file_formats.FileFormat]
Returns all formats that can be read using read_file. Some depend on the availability of optional packages. The lines format (.txt, .lines, etc.) is only included if this DataFrame type can support only 1 column+index. See typeddfs.file_formats.FileFormat.can_read().
- classmethod can_write(cls) -> Set[typeddfs.file_formats.FileFormat]
Returns all formats that can be written to using write_file. Some depend on the availability of optional packages. The lines format (.txt, .lines, etc.) is only included if this DataFrame type can support only 1 column+index. See typeddfs.file_formats.FileFormat.can_write().
- classmethod from_records(cls, *args, **kwargs) -> __qualname__
- classmethod read_file(cls, path: Union[pathlib.Path, str], *, file_hash: Optional[bool] = None, dir_hash: Optional[bool] = None, hex_hash: Optional[str] = None, attrs: Optional[bool] = None) -> __qualname__
Reads from a file (or possibly URL), guessing the format from the filename extension. Delegates to the read_* functions of this class.
You can always write and then read back to get the same dataframe:
.. code-block::
    # df is any DataFrame from typeddfs
    # path can use any suffix
    df.write_file(path)
    df.read_file(path)
Text files always allow encoding with .gz, .zip, .bz2, or .xz.
- Supports:
.csv, .tsv, or .tab
.json
.xml
.feather
.parquet or .snappy
.h5 or .hdf
.xlsx, .xls, .odf, etc.
.toml
.properties
.ini
.fwf (fixed-width)
.flexwf (fixed-but-unspecified-width with an optional delimiter)
.txt, .lines, or .list
- Parameters
path: Only path-like strings or pathlib objects are supported, not buffers (because we need a filename).
file_hash: Check against a hash file specific to this file (e.g. <path>.sha1).
dir_hash: Check against a per-directory hash file.
hex_hash: Check against this hex-encoded hash.
attrs: Set dataset attributes/metadata (pd.DataFrame.attrs) from a JSON file. If True, uses typeddfs.df_typing.DfTyping.attrs_suffix. If a str or Path, uses that file. If None or False, does not set.
- Returns
An instance of this class
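The extension-based dispatch described for read_file can be pictured with a small standard-library sketch. It is not the typeddfs implementation: guess_format, FORMAT_BY_SUFFIX, and the format labels are hypothetical, and the mapping covers only a few of the suffixes listed above. The sketch assumes that a trailing compression suffix is stripped before the format suffix is examined.

```python
from pathlib import PurePath
from typing import Optional, Tuple

# Compression suffixes that the text says are allowed on text formats.
COMPRESSION_SUFFIXES = {".gz", ".zip", ".bz2", ".xz"}

# Hypothetical suffix-to-format table; only a subset of the supported list.
FORMAT_BY_SUFFIX = {
    ".csv": "csv",
    ".tsv": "tsv",
    ".tab": "tsv",
    ".json": "json",
    ".feather": "feather",
    ".parquet": "parquet",
    ".snappy": "parquet",
    ".txt": "lines",
    ".lines": "lines",
    ".list": "lines",
}


def guess_format(path) -> Tuple[str, Optional[str]]:
    """Return (format label, compression suffix or None) for a filename."""
    suffixes = [s.lower() for s in PurePath(str(path)).suffixes]
    compression = None
    # Peel off a trailing compression suffix such as ".gz", if present.
    if suffixes and suffixes[-1] in COMPRESSION_SUFFIXES:
        compression = suffixes.pop()
    if not suffixes or suffixes[-1] not in FORMAT_BY_SUFFIX:
        raise ValueError(f"Unrecognized file format for {path}")
    return FORMAT_BY_SUFFIX[suffixes[-1]], compression
```

Under these assumptions, guess_format("data/table.csv.gz") yields ("csv", ".gz"), and a name that carries only a compression suffix is rejected because no format can be inferred from it.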
- classmethod read_url(cls, url: str) -> __qualname__
Reads from a URL, guessing the format from the filename extension. Delegates to the read_* functions of this class.
- Returns
An instance of this class
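One way to recover a filename extension from a URL is to look only at the path component, so that query strings and fragments do not interfere. The helper below is a hypothetical sketch using the standard library; whether read_url treats query strings this way is an assumption, not something the documentation states.

```python
from pathlib import PurePosixPath
from typing import List
from urllib.parse import urlparse


def url_suffixes(url: str) -> List[str]:
    """Return the filename suffixes found in the path component of a URL."""
    # urlparse separates the path from the query string and the fragment.
    path = urlparse(url).path
    return PurePosixPath(path).suffixes
```

For example, url_suffixes("https://example.org/data/table.csv.gz?download=1") returns [".csv", ".gz"], which could then be handled like the suffixes of a local filename.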
- write_file(self, path: Union[pathlib.Path, str], *, overwrite: bool = True, mkdirs: bool = False, file_hash: Optional[bool] = None, dir_hash: Optional[bool] = None, attrs: Optional[bool] = None) -> Optional[str]
Writes to a file, guessing the format from the filename extension. Delegates to the to_* functions of this class (e.g. to_csv). Only includes file formats that can be read back in with the corresponding read_* methods.
- Supports, where text formats permit optional .gz, .zip, .bz2, or .xz:
.csv, .tsv, or .tab
.json
.feather
.fwf (fixed-width)
.flexwf (columns aligned but using a delimiter)
.parquet or .snappy
.h5, .hdf, or .hdf5
.xlsx, .xls, and other variants for Excel
.odt and .ods (OpenOffice)
.xml
.toml
.ini
.properties
.pkl and .pickle
.txt, .lines, or .list; see to_lines() and read_lines()
- Parameters
path: Only path-like strings or pathlib objects are supported, not buffers (because we need a filename).
overwrite: If False, complain if the file already exists.
mkdirs: Make the directory and parents if they do not exist.
file_hash: Write a hash for this file. The filename will be path+"."+algorithm. If None, chooses according to self.get_typing().io.hash_file.
dir_hash: Append a hash for this file into a list. The filename will be the directory name suffixed by the algorithm (i.e. path.parent/(path.parent.name+"."+algorithm)). If None, chooses according to self.get_typing().io.hash_dir.
attrs: Write dataset attributes/metadata (pd.DataFrame.attrs) to a JSON file. Uses typeddfs.df_typing.DfTyping.attrs_suffix. If None, chooses according to self.get_typing().io.use_attrs.
- Returns
Whatever the corresponding method on pd.to_* returns. This is usually either str or None.
- Raises
InvalidDfError: If the DataFrame is not valid for this type
ValueError: If the type of a column or index name is non-str
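The hash-file naming rules given for file_hash and dir_hash can be written out with pathlib. This is a sketch of the naming formulas only, not of how typeddfs writes or verifies hashes; the function names are hypothetical, and the default algorithm "sha1" is assumed from the <path>.sha1 example in read_file.

```python
import hashlib
from pathlib import Path


def file_hash_path(path: Path, algorithm: str = "sha1") -> Path:
    """Per-file hash file: path + "." + algorithm."""
    return path.parent / (path.name + "." + algorithm)


def dir_hash_path(path: Path, algorithm: str = "sha1") -> Path:
    """Per-directory hash file: path.parent / (path.parent.name + "." + algorithm)."""
    return path.parent / (path.parent.name + "." + algorithm)


def hex_digest(path: Path, algorithm: str = "sha1") -> str:
    """Hex-encoded digest of the file's bytes, computed with hashlib."""
    hasher = hashlib.new(algorithm)
    hasher.update(path.read_bytes())
    return hasher.hexdigest()
```

Under these rules, out/data.csv gets the per-file hash file out/data.csv.sha1, while every file in out/ shares the per-directory hash file out/out.sha1.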