typeddfs.typed_dfs

Defines DataFrames that have convenience methods and enforce invariants.

Module Contents

class typeddfs.typed_dfs.PlainTypedDf

A trivial TypedDf that behaves like an untyped one.

class typeddfs.typed_dfs.TypedDf

A concrete BaseFrame that enforces conditions. Each subclass has required and reserved (optional) columns and index names. They may or may not permit additional columns or index names.

The constructor requires the conditions to pass but does not rearrange columns or index levels. To rearrange them, call convert.

Overrides a number of DataFrame methods so that they preserve the subclass. For example, calling df.reset_index() will return a TypedDf of the same type as df. If a condition would then fail, call untyped() first.

For example, suppose MyTypedDf has a required index name called "xyz". Then this will be fine as long as df has a column or index name called xyz: MyTypedDf.convert(df). But calling MyTypedDf.convert(df).reset_index() will fail. You can put the column "xyz" back into the index using convert: MyTypedDf.convert(df.reset_index()). Or, you can get a plain DataFrame (UntypedDf) back: MyTypedDf.convert(df).untyped().reset_index().

To summarize: Call untyped() before calling something that would result in anything invalid.
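The behavior described above can be illustrated with a toy model. This is a minimal sketch that does not use typeddfs or pandas: ToyFrame and ToyTyped are hypothetical stand-ins that track only index names and column names, and ValueError stands in for whatever error the library raises.

```python
class ToyFrame:
    """Hypothetical stand-in for a DataFrame: only index and column names."""

    def __init__(self, index_names=(), columns=()):
        self.index_names = list(index_names)
        self.columns = list(columns)

    def reset_index(self):
        # Index levels become leading columns. type(self) keeps the subclass,
        # mirroring the subclass-preserving overrides described above.
        return type(self)(columns=self.index_names + self.columns)


class ToyTyped(ToyFrame):
    """Hypothetical typed frame with one required index name."""

    required_index_names = ("xyz",)

    def __init__(self, index_names=(), columns=()):
        super().__init__(index_names, columns)
        # The constructor checks the condition but rearranges nothing.
        missing = [n for n in self.required_index_names if n not in self.index_names]
        if missing:
            raise ValueError(f"missing required index names: {missing}")

    @classmethod
    def convert(cls, frame):
        # Move required names into the index, wherever they currently live.
        names = frame.index_names + frame.columns
        index = [n for n in cls.required_index_names if n in names]
        index += [n for n in frame.index_names if n not in index]
        columns = [c for c in frame.columns if c not in index]
        return cls(index, columns)

    def untyped(self):
        # Same contents, but no enforced conditions.
        return ToyFrame(self.index_names, self.columns)


plain = ToyFrame(columns=["xyz", "value"])
typed = ToyTyped.convert(plain)  # "xyz" moves into the index

try:
    typed.reset_index()  # would produce a ToyTyped without "xyz" in the index
    failed = False
except ValueError:
    failed = True

flat = typed.untyped().reset_index()  # fine: no conditions to enforce
again = ToyTyped.convert(flat)        # "xyz" goes back into the index
```

In this model, the typed reset_index() fails because the result is constructed as the same subclass and no longer satisfies the required index name, which is the situation the summary warns about.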

classmethod _check(cls, df) -> None
classmethod _check_has_required(cls, df: pandas.DataFrame) -> None
classmethod _check_has_unexpected(cls, df: pandas.DataFrame) -> None
classmethod convert(cls, df: pandas.DataFrame) -> __qualname__

Converts a vanilla Pandas DataFrame (or any subclass) to cls. Explicitly sets the new copy's __class__ to cls. Rearranges the columns and index names. For example, if a column in df is in self.reserved_index_names(), it will be moved to the index.

The new index names will be, in order:
  • required_index_names(), in order

  • reserved_index_names(), in order

  • any extras in df, if more_indices_allowed is True

Similarly, the new columns will be, in order:
  • required_columns(), in order

  • reserved_columns(), in order

  • any extras in the original df, if more_columns_allowed is True

Note

Any column called index or level_0 will be dropped automatically.
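The ordering rules above can be sketched in plain Python. This is an illustration of the documented order only, not the library's implementation: plan_layout and its parameters are hypothetical names, the frame is reduced to two lists of names, and raising ValueError for missing or disallowed names is an assumption.

```python
from typing import List, Sequence, Tuple

# Column names that the note above says are dropped automatically.
DROPPED = ("index", "level_0")


def plan_layout(
    index_names: Sequence[str],
    columns: Sequence[str],
    required_index: Sequence[str] = (),
    reserved_index: Sequence[str] = (),
    required_columns: Sequence[str] = (),
    reserved_columns: Sequence[str] = (),
    more_indices_allowed: bool = True,
    more_columns_allowed: bool = True,
) -> Tuple[List[str], List[str]]:
    """Return (new index names, new columns) in the documented order."""
    columns = [c for c in columns if c not in DROPPED]
    present = set(index_names) | set(columns)
    declared = [*required_index, *reserved_index, *required_columns, *reserved_columns]

    # Required names may sit in either the index or the columns of the input.
    missing = [n for n in (*required_index, *required_columns) if n not in present]
    if missing:
        raise ValueError(f"missing required names: {missing}")

    # Extras are names the typing does not declare at all.
    extra_index = [n for n in index_names if n not in declared]
    extra_columns = [c for c in columns if c not in declared]
    if extra_index and not more_indices_allowed:
        raise ValueError(f"unexpected index names: {extra_index}")
    if extra_columns and not more_columns_allowed:
        raise ValueError(f"unexpected columns: {extra_columns}")

    # Required first, then reserved names that are present, then extras.
    new_index = [*required_index, *(n for n in reserved_index if n in present), *extra_index]
    new_columns = [*required_columns, *(c for c in reserved_columns if c in present), *extra_columns]
    return new_index, new_columns


new_index, new_columns = plan_layout(
    index_names=[],
    columns=["value", "index", "note", "xyz", "extra"],
    required_index=["xyz"],
    required_columns=["value"],
    reserved_columns=["note", "comment"],
)
```

Here "xyz" ends up in the index, the column named "index" is dropped, the absent reserved column "comment" is skipped, and "extra" comes last among the columns.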

Parameters

df – The Pandas DataFrame or member of cls; will have its __class__ changed but will otherwise not be affected

Returns

A copy

Raises
  • InvalidDfError – If a condition fails, such as a required column or, for specific subclasses, symmetry

  • TypeError – If df is not a DataFrame

classmethod get_typing(cls) -> typeddfs.df_typing.DfTyping
meta(self) -> __qualname__

Drops the columns, returning only the index but as the same type.

Returns

A copy

Raises

InvalidDfError – If the result does not pass the typing of this class

classmethod new_df(cls, reserved: Union[bool, Sequence[str]] = False) -> __qualname__

Returns a DataFrame that is empty but has the correct columns and indices.

Parameters

reserved – Include reserved index/column names as well as required. If True, adds all reserved index levels and columns. You can also specify the exact list of columns and index names.

Raises

InvalidDfError – If a function in verifications fails (returns False or a string).
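One way to read the reserved parameter is sketched below. This is not the library's code: names_for_new_df and all_reserved are hypothetical, and the sketch assumes that required names are always included and that an explicit list is appended after them as given.

```python
from typing import List, Sequence, Union


def names_for_new_df(
    required: Sequence[str],
    all_reserved: Sequence[str],
    reserved: Union[bool, Sequence[str]] = False,
) -> List[str]:
    """Resolve which names an empty frame would carry (assumed semantics)."""
    if reserved is True:
        chosen = list(all_reserved)  # every reserved name
    elif reserved is False:
        chosen = []                  # required names only
    else:
        chosen = list(reserved)      # exactly the names the caller listed
    return [*required, *chosen]


only_required = names_for_new_df(["key"], ["note", "tag"])
with_all = names_for_new_df(["key"], ["note", "tag"], reserved=True)
with_some = names_for_new_df(["key"], ["note", "tag"], reserved=["tag"])
```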

untyped(self) -> typeddfs.untyped_dfs.UntypedDf

Makes a copy that's an UntypedDf. It won't have enforced requirements but will still have the convenience functions.

Returns

A shallow copy with its __class__ set to an UntypedDf

See:

vanilla()