typeddfs.df_typing
Information about how DataFrame subclasses should be handled.
Module Contents
- typeddfs.df_typing.FINAL_DF_TYPING
- typeddfs.df_typing.FINAL_IO_TYPING
- class typeddfs.df_typing.DfTyping
Contains all information about how to type a DataFrame subclass.
- _auto_dtypes :Optional[Mapping[str, Type[Any]]]
- _column_series_name :Union[bool, None, str]
- _columns_to_drop :Optional[Set[str]]
- _index_series_name :Union[bool, None, str]
- _io_typing :IoTyping
- _more_columns_allowed :bool = True
- _more_index_names_allowed :bool = True
- _order_dclass :bool = True
- _post_processing :Optional[Callable[[T], Optional[T]]]
- _required_columns :Optional[Sequence[str]]
- _required_index_names :Optional[Sequence[str]]
- _reserved_columns :Optional[Sequence[str]]
- _reserved_index_names :Optional[Sequence[str]]
- _value_dtype :Optional[Type[Any]]
- _verifications :Optional[Sequence[Callable[[T], Union[None, bool, str]]]]
- property auto_dtypes(self) -> Mapping[str, Type[Any]]
A mapping from column/index names to the expected dtype. These are used via pd.Series.astype for automatic conversion. An error will be raised if an astype call fails. Note that Pandas frequently just does not perform the conversion, rather than raising an error. The keys should be contained in known_names, but this is not strictly required.
- property column_series_name(self) -> Union[bool, None, str]
Intelligently returns df.columns.name. Returns a value that will be forced into df.columns.name on calling convert. If None, will set df.columns.name = None. If False, will not set. (True is treated the same as None.)
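The None / False / True / string rules above can be summarized in a short sketch. This is not typeddfs code: resolve_series_name is a hypothetical helper that models only the name that would end up on df.columns.name, and it assumes a string setting is applied as-is.

```python
from typing import Optional, Union


def resolve_series_name(
    setting: Union[bool, None, str], current: Optional[str]
) -> Optional[str]:
    """Hypothetical helper: the name left on df.columns.name after convert."""
    if setting is False:
        # False: leave whatever name the DataFrame already has.
        return current
    if setting is None or setting is True:
        # None (and True, which is treated the same): clear the name.
        return None
    # A string: force that name (assumption: used unchanged).
    return setting


print(resolve_series_name(False, "old"))   # old
print(resolve_series_name(None, "old"))    # None
print(resolve_series_name("gene", "old"))  # gene
```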
- property columns_to_drop(self) -> Set[str]
Returns the set of columns that are automatically dropped by convert. This does NOT include "level_0" and "index", which are ALWAYS dropped.
- property index_series_name(self) -> Union[bool, None, str]
Intelligently returns df.index.name. Returns a value that will be forced into df.index.name on calling convert, only if the DataFrame is multi-index. If None, will set df.index.name = None if df.index.names != [None]. If False, will not set. (True is treated the same as None.)
- property is_strict(self) -> bool
Returns True if this does not allow unspecified index levels or columns.
- property known_column_names(self) -> Sequence[str]
Returns all columns that are required or reserved. The sort order positions required columns first.
- property known_index_names(self) -> Sequence[str]
Returns all index levels that are required or reserved. The sort order positions required index levels first.
- property known_names(self) -> Sequence[str]
Returns all index and column names that are required or reserved. The sort order is: required index, reserved index, required columns, reserved columns.
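As a rough illustration of the ordering described for known_names, the sketch below concatenates four name lists in the documented order. The function and its parameters are hypothetical stand-ins, not the library's implementation.

```python
from typing import List, Sequence


def known_names(
    required_index: Sequence[str],
    reserved_index: Sequence[str],
    required_columns: Sequence[str],
    reserved_columns: Sequence[str],
) -> List[str]:
    """Hypothetical helper: concatenate names in the documented order."""
    return [
        *required_index,
        *reserved_index,
        *required_columns,
        *reserved_columns,
    ]


names = known_names(["id"], ["batch"], ["x", "y"], ["note"])
print(names)  # ['id', 'batch', 'x', 'y', 'note']
```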
- property more_columns_allowed(self) -> bool
Returns whether the DataFrame allows columns that are not reserved or required.
- property more_indices_allowed(self) -> bool
Returns whether the DataFrame allows index levels that are neither reserved nor required.
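One way to picture what these two flags control is a check for names that are neither required nor reserved. The helper below is a hypothetical sketch under that reading; how typeddfs itself detects or reports such names is not covered here.

```python
from typing import List, Sequence


def unexpected_names(
    actual: Sequence[str],
    required: Sequence[str],
    reserved: Sequence[str],
    more_allowed: bool,
) -> List[str]:
    """Hypothetical helper: names that would violate the typing."""
    if more_allowed:
        # Extra names are permitted, so nothing is unexpected.
        return []
    known = set(required) | set(reserved)
    # Keep the original order of the offending names.
    return [name for name in actual if name not in known]


print(unexpected_names(["x", "y", "z"], ["x"], ["y"], more_allowed=False))  # ['z']
print(unexpected_names(["x", "y", "z"], ["x"], ["y"], more_allowed=True))   # []
```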
- property order_dataclass(self) -> bool
Whether the corresponding dataclass can be sorted (has __lt__).
- property post_processing(self) -> Optional[Callable[[T], Optional[T]]]
A function to be called at the final stage of convert. It is called immediately before verifications are checked. The function takes a copy of the input BaseDf and returns a new copy.
Note
Although a copy is passed as input, the function should not modify it. Technically, doing so will cause problems only if the DataFrame's internal values are modified. The value passed is a shallow copy (see pd.DataFrame.copy).
- property required_columns(self) -> Sequence[str]
Returns the list of required column names.
- property required_index_names(self) -> Sequence[str]
Returns the list of required index level names.
- property required_names(self) -> Sequence[str]
Returns all index and column names that are required. The sort order is: required index, required columns.
- property reserved_columns(self) -> Sequence[str]
Returns the list of reserved (optional) column names.
- property reserved_index_names(self) -> Sequence[str]
Returns the list of reserved (optional) index levels.
- property reserved_names(self) -> Sequence[str]
Returns all index and column names that are reserved (optional). The sort order is: reserved index, reserved columns.
- property value_dtype(self) -> Optional[Type[Any]]
A type for "values" in a simple DataFrame. Typically numeric.
- property verifications(self) -> Sequence[Callable[[T], Union[None, bool, str]]]
Additional requirements for the DataFrame to be conformant.
- Returns
A sequence of conditions that map the DF to None or True if the condition passes, or to False or an error message string if it fails.
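The return-value convention for verification conditions can be sketched as follows. The helper failed_verifications and the message used for a bare False are assumptions made for illustration, and a plain list of dicts stands in for the DataFrame so the example has no dependencies.

```python
from typing import Any, Callable, List, Sequence, Union

Verification = Callable[[Any], Union[None, bool, str]]


def failed_verifications(df: Any, checks: Sequence[Verification]) -> List[str]:
    """Hypothetical helper: collect a message for every failing condition."""
    failures = []
    for check in checks:
        result = check(df)
        if result is None or result is True:
            continue  # None or True: the condition passed.
        if result is False:
            # False carries no message, so fall back to the function name.
            failures.append(f"{getattr(check, '__name__', 'check')} failed")
        else:
            failures.append(str(result))  # A string is the error message.
    return failures


def not_empty(rows):
    return len(rows) > 0


def no_negative_x(rows):
    bad = [r for r in rows if r["x"] < 0]
    return f"{len(bad)} negative x value(s)" if bad else None


rows = [{"x": 1}, {"x": -2}]
print(failed_verifications(rows, [not_empty, no_negative_x]))
# ['1 negative x value(s)']
print(failed_verifications([], [not_empty, no_negative_x]))
# ['not_empty failed']
```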
- class typeddfs.df_typing.IoTyping
Abstract base class for generic types.
A generic type is typically declared by inheriting from this class parameterized with one or more type variables. For example, a generic mapping type might be defined as:
    class Mapping(Generic[KT, VT]):
        def __getitem__(self, key: KT) -> VT:
            ...
        # Etc.
This class can then be used as follows:
    def lookup_name(mapping: Mapping[KT, VT], key: KT, default: VT) -> VT:
        try:
            return mapping[key]
        except KeyError:
            return default
- _attrs_json_kwargs :Optional[Mapping[str, Any]]
- _attrs_suffix :str = .attrs.json
- _custom_readers :Optional[Mapping[str, Callable[[pathlib.Path], pandas.DataFrame]]]
- _custom_writers :Optional[Mapping[str, Callable[[pandas.DataFrame, pathlib.Path], None]]]
- _hash_alg :Optional[str] = sha256
- _hdf_key :str = df
- _read_kwargs :Optional[Mapping[typeddfs.file_formats.FileFormat, Mapping[str, Any]]]
- _recommended :bool = False
- _remap_suffixes :Optional[Mapping[str, typeddfs.file_formats.FileFormat]]
- _remapped_read_kwargs :Optional[Mapping[str, Any]]
- _remapped_write_kwargs :Optional[Mapping[str, Any]]
- _save_hash_dir :bool = False
- _save_hash_file :bool = False
- _secure :bool = False
- _text_encoding :str = utf-8
- _use_attrs :bool = False
- _write_kwargs :Optional[Mapping[typeddfs.file_formats.FileFormat, Mapping[str, Any]]]
- property attrs_json_kwargs(self) -> Mapping[str, Any]
Keyword arguments for typeddfs.json_utils.JsonUtils.encoder. Used when writing attrs.
- property attrs_suffix(self) -> str
Filename suffix detailing where to save/load per-DataFrame "attrs" (metadata). Will be appended to the DataFrame filename.
- property custom_readers(self) -> Mapping[str, Callable[[pathlib.Path], pandas.DataFrame]]
Mapping from filename suffixes (modulo compression) to custom reading methods.
- property custom_writers(self) -> Mapping[str, Callable[[pandas.DataFrame, pathlib.Path], None]]
Mapping from filename suffixes (modulo compression) to custom writing methods.
- property dir_hash(self) -> bool
Whether to save (append) to per-directory hash files by default. Specifically, in typeddfs.abs_df.AbsDf.write_file().
- property file_hash(self) -> bool
Whether to save per-file hash files by default. Specifically, in typeddfs.abs_df.AbsDf.write_file().
- property flexwf_sep(self) -> str
The delimiter used when reading "flex-width" format.
Caution
Only checks the read keyword arguments, not write.
- property hash_algorithm(self) -> Optional[str]
The hash algorithm used for checksums.
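For orientation, a checksum with a named algorithm such as the default sha256 can be computed with the standard library as sketched below. This assumes the algorithm name is one that hashlib.new accepts. file_digest is a hypothetical helper, and the layout of the hash files that typeddfs writes is not shown here.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha256") -> str:
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files are never fully in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling file_digest(Path("data.csv")) on an existing file returns the same hex string as hashlib.sha256 applied to the file's full contents.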
- property hdf_key(self) -> str
The default key used in typeddfs.abs_df.AbsDf.to_hdf(). The key is also used in typeddfs.abs_df.AbsDf.read_hdf().
- property is_text_encoding_utf(self) -> bool
- property read_kwargs(self) -> Mapping[typeddfs.file_formats.FileFormat, Mapping[str, Any]]
Passes kwargs into read functions from read_file. These are keyword arguments that are automatically added into specific read_ methods when called by read_file.
Note
This should rarely be needed.
- property read_suffix_kwargs(self) -> Mapping[str, Mapping[str, Any]]
Per-suffix kwargs into read functions from read_file. Modulo compression (e.g. .tsv is equivalent to .tsv.gz).
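"Modulo compression" can be read as: drop a trailing compression suffix before looking up the kwargs. The sketch below illustrates that reading. Both helpers are hypothetical, and the set of compression suffixes is an assumption; the library's actual list may differ.

```python
from pathlib import Path
from typing import Any, Mapping

# Assumed set of compression suffixes; the library's actual list may differ.
COMPRESSION_SUFFIXES = {".gz", ".bz2", ".xz", ".zip"}


def suffix_modulo_compression(path: str) -> str:
    """Return the format suffix, ignoring one trailing compression suffix."""
    suffixes = Path(path).suffixes
    if suffixes and suffixes[-1] in COMPRESSION_SUFFIXES:
        suffixes = suffixes[:-1]
    return suffixes[-1] if suffixes else ""


def kwargs_for(
    path: str, per_suffix: Mapping[str, Mapping[str, Any]]
) -> Mapping[str, Any]:
    """Look up per-suffix kwargs, treating .tsv and .tsv.gz alike."""
    return per_suffix.get(suffix_modulo_compression(path), {})


per_suffix = {".tsv": {"sep": "\t"}}
print(kwargs_for("data.tsv", per_suffix) == kwargs_for("data.tsv.gz", per_suffix))  # True
```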
- property recommended(self) -> bool
Whether to forbid discouraged formats like fixed-width and HDF5. Excludes all insecure formats.
- property remap_suffixes(self) -> Mapping[str, typeddfs.file_formats.FileFormat]
Returns filename suffixes that have been re-mapped to file formats. These are used in read_file and write_file.
Note
This should rarely be needed. An exception might be .txt to tsv rather than lines; Excel uses this.
- property secure(self) -> bool
Whether to forbid insecure operations and formats.
- property text_encoding(self) -> str
Can be an exact encoding like utf-8, "platform", "utf8(bom)" or "utf16(bom)". See the docs in TypedDfs.typed().encoding for details.
- property toml_aot(self) -> str
The name of the Array of Tables (AoT) used when reading TOML.
Caution
Only checks the read keyword arguments, not write.
- property use_attrs(self) -> bool
Whether to read and write pd.DataFrame.attrs when passing attrs=None.
- property write_kwargs(self) -> Mapping[typeddfs.file_formats.FileFormat, Mapping[str, Any]]
Passes kwargs into write functions from write_file. These are keyword arguments that are automatically added into specific to_ methods when called by write_file.
Note
This should rarely be needed.
- property write_suffix_kwargs(self) -> Mapping[str, Mapping[str, Any]]
Per-suffix kwargs into write functions from write_file. Modulo compression (e.g. .tsv is equivalent to .tsv.gz).