Column Order Transformer

class sdgx.data_processors.transformers.column_order.ColumnOrderTransformer

Bases: Transformer

A transformer that rearranges the columns of a DataFrame to a specified order.

Attributes:
  • column_list (list): The list of column names in the desired order.

Methods:
  • fit(metadata: Metadata | None = None, **kwargs: dict[str, Any]): Fits the transformer by remembering the order of the columns.

  • convert(raw_data: pd.DataFrame) -> pd.DataFrame: Converts the input DataFrame by rearranging its columns.

  • reverse_convert(processed_data: pd.DataFrame) -> pd.DataFrame: Reverse-converts the processed DataFrame by rearranging its columns back to their original order.

  • rearrange_columns(column_list, processed_data): Rearranges the columns of a DataFrame according to the provided column list.

_fit(metadata: Metadata | None = None, **kwargs: Dict[str, Any])

Fit the data processor.

Called before convert and reverse_convert.

Parameters:

metadata (Metadata, optional) – Metadata. Defaults to None.

static attach_columns(tabular_data: DataFrame, new_columns: DataFrame) -> DataFrame

Attach additional columns to an existing DataFrame.

Parameters:
  • tabular_data (pd.DataFrame) – The original DataFrame.

  • new_columns (pd.DataFrame) – The DataFrame containing additional columns to be attached.

Returns:

The DataFrame with new_columns attached.

Return type:

pd.DataFrame

Raises:

ValueError – If tabular_data and new_columns do not have the same number of rows.
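
The contract described above (equal row counts, otherwise a ValueError) can be illustrated with plain pandas. This is a minimal sketch, not the sdgx implementation: the function name `attach_columns_sketch` and the choice of `pd.concat` are assumptions made for illustration.

```python
import pandas as pd


def attach_columns_sketch(tabular_data: pd.DataFrame, new_columns: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for attach_columns; not the sdgx source."""
    # Refuse to combine frames whose row counts differ, as the docs describe.
    if tabular_data.shape[0] != new_columns.shape[0]:
        raise ValueError("tabular_data and new_columns must have the same number of rows.")
    # Assumption: rows are matched by position, so both indexes are reset first.
    return pd.concat(
        [tabular_data.reset_index(drop=True), new_columns.reset_index(drop=True)],
        axis=1,
    )


base = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
extra = pd.DataFrame({"c": [5, 6]})
attached = attach_columns_sketch(base, extra)
print(list(attached.columns))  # ['a', 'b', 'c']
```

Resetting the index is a design choice in this sketch: it avoids NaN-filled rows when the two frames carry different index labels. Whether the library does the same is not stated here.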

check_fitted()

Check if the processor is fitted.

Raises:

SynthesizerProcessorError – If the processor is not fitted.

column_list: list

The list of the tabular data’s column names.

convert(raw_data: DataFrame) -> DataFrame

Convert the input DataFrame by rearranging its columns.

fit(metadata: Metadata | None = None, **kwargs: dict[str, Any])

Fit the transformer by remembering the order of the columns.

fitted = False

static rearrange_columns(column_list, processed_data)

This method rearranges the columns of a given DataFrame according to the provided column list.

Any columns in the DataFrame that are not in the column list are dropped.

Parameters:
  • column_list (list) – A list of column names in the order they should appear in the output DataFrame.

  • processed_data (pd.DataFrame) – The DataFrame to be rearranged.

Returns:

The rearranged DataFrame.

Return type:

pd.DataFrame
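
The reordering and dropping behaviour described for rearrange_columns can be reproduced with `DataFrame.reindex`. This is a rough sketch under that assumption, not a copy of the library code. The name `rearrange_columns_sketch` is hypothetical.

```python
import pandas as pd


def rearrange_columns_sketch(column_list: list, processed_data: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for rearrange_columns; not the sdgx source."""
    # reindex keeps only the listed columns, in the listed order.
    # With reindex, a listed name that is absent from the frame becomes an
    # all-NaN column; the documentation does not say how the library handles that case.
    return processed_data.reindex(columns=column_list)


shuffled = pd.DataFrame({"c": [5, 6], "a": [1, 2], "extra": [0, 0], "b": [3, 4]})
ordered = rearrange_columns_sketch(["a", "b", "c"], shuffled)
print(list(ordered.columns))  # ['a', 'b', 'c']
```

The "extra" column is not in the list, so it does not appear in the result.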

static remove_columns(tabular_data: DataFrame, column_name_to_remove: list) -> DataFrame

Remove specified columns from the input tabular data.

Parameters:
  • tabular_data (pd.DataFrame) – Processed tabular data.

  • column_name_to_remove (list) – List of column names to be removed.

Returns:

Tabular data with the specified columns removed.

Return type:

pd.DataFrame
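
Column removal of this kind maps naturally onto `DataFrame.drop`. The sketch below assumes that approach and uses the hypothetical name `remove_columns_sketch`. It is not taken from the sdgx source.

```python
import pandas as pd


def remove_columns_sketch(tabular_data: pd.DataFrame, column_name_to_remove: list) -> pd.DataFrame:
    """Hypothetical stand-in for remove_columns; not the sdgx source."""
    # drop returns a new DataFrame and leaves the input untouched.
    # By default pandas raises KeyError if a listed column does not exist.
    return tabular_data.drop(columns=column_name_to_remove)


table = pd.DataFrame({"a": [1, 2], "tmp1": [0, 0], "tmp2": [9, 9]})
trimmed = remove_columns_sketch(table, ["tmp1", "tmp2"])
print(list(trimmed.columns))  # ['a']
```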

reverse_convert(processed_data: DataFrame) -> DataFrame

Reverse-convert the processed DataFrame by rearranging its columns back to their original order.
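
To show how fit, check_fitted and reverse_convert could fit together, here is a simplified sketch that does not import sdgx. Everything in it is an assumption made for illustration: the class `ColumnOrderSketch`, the `metadata_stub` object with a `column_list` attribute, and the use of RuntimeError in place of SynthesizerProcessorError.

```python
from types import SimpleNamespace

import pandas as pd


class ColumnOrderSketch:
    """Hypothetical, simplified stand-in; not the sdgx class."""

    fitted = False

    def fit(self, metadata=None, **kwargs):
        # Assumption: the metadata object exposes the original column names
        # through a `column_list` attribute.
        self.column_list = list(metadata.column_list)
        self.fitted = True

    def check_fitted(self):
        # The documented method raises SynthesizerProcessorError; a plain
        # RuntimeError keeps this sketch free of sdgx imports.
        if not self.fitted:
            raise RuntimeError("Processor is not fitted.")

    def reverse_convert(self, processed_data: pd.DataFrame) -> pd.DataFrame:
        # Assumption: the fitted check runs before any reordering.
        self.check_fitted()
        return processed_data.reindex(columns=self.column_list)


metadata_stub = SimpleNamespace(column_list=["id", "name", "age"])
transformer = ColumnOrderSketch()
transformer.fit(metadata_stub)

generated = pd.DataFrame({"age": [30], "id": [1], "name": ["Ada"]})
restored = transformer.reverse_convert(generated)
print(list(restored.columns))  # ['id', 'name', 'age']
```

Calling reverse_convert on an unfitted instance of this sketch raises RuntimeError, mirroring the "called before convert and reverse_convert" requirement stated for _fit.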