Column Order Transformer
- class sdgx.data_processors.transformers.column_order.ColumnOrderTransformer
Bases: Transformer
A transformer that rearranges the columns of a DataFrame to a specified order.
- column_list
The list of column names in the desired order.
- Type:
list
- fit(metadata: Metadata | None = None, **kwargs: dict[str, Any]): Fits the transformer by remembering the order of the columns.
- convert(raw_data: pd.DataFrame) -> pd.DataFrame: Converts the input DataFrame by rearranging its columns.
- reverse_convert(processed_data: pd.DataFrame) -> pd.DataFrame: Reverse-converts the processed DataFrame by rearranging its columns back to their original order.
- rearrange_columns(column_list, processed_data)
Rearranges the columns of a DataFrame according to the provided column list.
- _fit(metadata: Metadata | None = None, **kwargs: Dict[str, Any])
Fit the data processor.
Called before convert and reverse_convert.
- Parameters:
metadata (Metadata, optional) – Metadata. Defaults to None.
- static attach_columns(tabular_data: DataFrame, new_columns: DataFrame) -> DataFrame
Attach additional columns to an existing DataFrame.
- Parameters:
tabular_data (pd.DataFrame) – The original DataFrame.
new_columns (pd.DataFrame) – The DataFrame containing additional columns to be attached.
- Returns:
The DataFrame with new_columns attached.
- Return type:
result_data (pd.DataFrame)
- Raises:
- ValueError – If tabular_data and new_columns do not have the same number of rows.
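The documented contract of attach_columns (side-by-side attachment, with a ValueError on a row-count mismatch) can be illustrated with plain pandas. This is a minimal sketch, not the sdgx source: the name `attach_columns_sketch` is hypothetical, and pairing rows by position via `reset_index` is an assumption of the example.

```python
import pandas as pd


def attach_columns_sketch(tabular_data: pd.DataFrame, new_columns: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for attach_columns; not the library's code."""
    if len(tabular_data) != len(new_columns):
        raise ValueError("tabular_data and new_columns must have the same number of rows.")
    # Assumption: rows are paired by position, so both indexes are reset first.
    return pd.concat(
        [tabular_data.reset_index(drop=True), new_columns.reset_index(drop=True)],
        axis=1,
    )


base = pd.DataFrame({"id": [1, 2, 3], "age": [31, 45, 27]})
extra = pd.DataFrame({"city": ["Oslo", "Lima", "Pune"]})
combined = attach_columns_sketch(base, extra)
print(list(combined.columns))  # ['id', 'age', 'city']
```

Checking the row counts before concatenating avoids the silent NaN padding that pd.concat would otherwise produce for frames of different lengths.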
- check_fitted()
Check if the processor is fitted.
- Raises:
SynthesizerProcessorError – If the processor is not fitted.
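The fitted-flag guard described here might look roughly like the following. It is only a sketch: `ProcessorNotFittedError` and `ProcessorSketch` are hypothetical stand-ins (the former for SynthesizerProcessorError), so the example runs without sdgx installed.

```python
class ProcessorNotFittedError(Exception):
    """Hypothetical stand-in for SynthesizerProcessorError."""


class ProcessorSketch:
    """Hypothetical processor that tracks whether fit() has been called."""

    fitted = False  # class-level default, mirroring the documented attribute

    def fit(self) -> None:
        # A real processor would learn something here; the sketch only sets the flag.
        self.fitted = True

    def check_fitted(self) -> None:
        if not self.fitted:
            raise ProcessorNotFittedError("Processor is not fitted.")


processor = ProcessorSketch()
try:
    processor.check_fitted()
    raised_before_fit = False
except ProcessorNotFittedError:
    raised_before_fit = True

processor.fit()
processor.check_fitted()  # no exception once fitted
```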
- column_list: list
The list of tabular data’s columns.
- convert(raw_data: DataFrame) -> DataFrame
Convert the input DataFrame by rearranging its columns.
- fit(metadata: Metadata | None = None, **kwargs: dict[str, Any])
Fit method for the transformer.
Remember the order of the columns.
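To make the "remember, then restore" idea concrete, here is a small sketch of the round trip. `ColumnOrderSketch` is a hypothetical stand-in, not the sdgx class, and it takes a plain list of names where the real fit receives a Metadata object; how the real method reads names from Metadata is not shown on this page.

```python
import pandas as pd


class ColumnOrderSketch:
    """Hypothetical stand-in for ColumnOrderTransformer."""

    def __init__(self) -> None:
        self.column_list: list = []
        self.fitted = False

    def fit(self, column_names: list) -> None:
        # Assumption: a plain list stands in for the column names held by Metadata.
        self.column_list = list(column_names)
        self.fitted = True

    def reverse_convert(self, processed_data: pd.DataFrame) -> pd.DataFrame:
        if not self.fitted:
            raise RuntimeError("call fit() before reverse_convert()")
        # Select the remembered columns in their remembered order.
        return processed_data[self.column_list]


sketch = ColumnOrderSketch()
sketch.fit(["id", "name", "score"])
shuffled = pd.DataFrame({"score": [0.5], "id": [7], "name": ["a"]})
restored = sketch.reverse_convert(shuffled)
print(list(restored.columns))  # ['id', 'name', 'score']
```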
- fitted = False
- static rearrange_columns(column_list, processed_data)
This method rearranges the columns of a given DataFrame according to the provided column list.
Any columns in the DataFrame that are not in the column list are dropped.
- Parameters:
column_list (list) – A list of column names in the order they should appear in the output DataFrame.
processed_data (pd.DataFrame) – The DataFrame to be rearranged.
- Returns:
The rearranged DataFrame.
- Return type:
result_data (pd.DataFrame)
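The reorder-and-drop behavior described above can be reproduced with pandas' `DataFrame.reindex`. The function name `rearrange_columns_sketch` is hypothetical, and using `reindex` is an assumption of this sketch rather than a statement about how sdgx implements the method.

```python
import pandas as pd


def rearrange_columns_sketch(column_list: list, processed_data: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for rearrange_columns."""
    # reindex keeps only the listed columns, in the listed order.
    # (In pandas, a listed name that is absent from the frame becomes an all-NaN column.)
    return processed_data.reindex(columns=column_list)


df = pd.DataFrame({"c": [3], "a": [1], "extra": [9], "b": [2]})
ordered = rearrange_columns_sketch(["a", "b", "c"], df)
print(list(ordered.columns))  # ['a', 'b', 'c']
```

Here the `extra` column is dropped because it does not appear in the column list.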
- static remove_columns(tabular_data: DataFrame, column_name_to_remove: list) -> DataFrame
Remove specified columns from the input tabular data.
- Parameters:
tabular_data (pd.DataFrame) – Processed tabular data
column_name_to_remove (list) – List of column names to be removed
- Returns:
Tabular data with specified columns removed
- Return type:
result_data (pd.DataFrame)
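A comparable effect can be had with pandas' `DataFrame.drop`. The sketch below is illustrative only: `remove_columns_sketch` is a hypothetical name, and the page does not say whether the library's method copies the input or how it treats names that are not present.

```python
import pandas as pd


def remove_columns_sketch(tabular_data: pd.DataFrame, column_name_to_remove: list) -> pd.DataFrame:
    """Hypothetical stand-in for remove_columns."""
    # drop returns a new DataFrame; the input frame is left unchanged.
    return tabular_data.drop(columns=column_name_to_remove)


table = pd.DataFrame({"id": [1, 2], "tmp": [0, 0], "value": [3.5, 4.5]})
trimmed = remove_columns_sketch(table, ["tmp"])
print(list(trimmed.columns))  # ['id', 'value']
```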