DataProcessor
- class sdgx.data_processors.base.DataProcessor
Bases: object
Base class for data processors.
- _fit(metadata: Metadata | None = None, **kwargs: Dict[str, Any])
Fit the data processor.
Called before convert and reverse_convert.
- Parameters:
metadata (Metadata, optional) – Metadata. Defaults to None.
- static attach_columns(tabular_data: DataFrame, new_columns: DataFrame) → DataFrame
Attach additional columns to an existing DataFrame.
- Parameters:
tabular_data (pd.DataFrame) – The original DataFrame.
new_columns (pd.DataFrame) – The DataFrame containing the additional columns to attach.
- Returns:
The DataFrame with new_columns attached.
- Return type:
pd.DataFrame
- Raises:
ValueError – If tabular_data and new_columns do not have the same number of rows.
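The contract described for attach_columns (same row count required, new columns appended, ValueError otherwise) can be illustrated with a minimal sketch. This is not the library's implementation: `attach_columns_sketch` is a hypothetical stand-in, and aligning rows by position via `reset_index` is an assumption of this example.

```python
import pandas as pd


def attach_columns_sketch(tabular_data: pd.DataFrame, new_columns: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for DataProcessor.attach_columns (not the real code)."""
    # Mirror the documented contract: both frames must have the same number of rows.
    if len(tabular_data) != len(new_columns):
        raise ValueError("tabular_data and new_columns must have the same number of rows.")
    # Assumption of this sketch: rows are matched by position, so index labels are reset.
    result_data = pd.concat(
        [tabular_data.reset_index(drop=True), new_columns.reset_index(drop=True)],
        axis=1,
    )
    return result_data


original = pd.DataFrame({"id": [1, 2, 3]})
extra = pd.DataFrame({"score": [0.1, 0.2, 0.3]})
attached = attach_columns_sketch(original, extra)
```

Here `attached` holds the columns of `original` followed by those of `extra`; passing frames of different lengths raises ValueError, as the entry above documents.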
- check_fitted()
Check if the processor is fitted.
- Raises:
SynthesizerProcessorError – If the processor is not fitted.
- convert(raw_data: DataFrame) → DataFrame
Convert raw data into processed data.
- Parameters:
raw_data (pd.DataFrame) – Raw data
- Returns:
Processed data
- Return type:
pd.DataFrame
- fitted = False
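The relationship between _fit, the fitted flag, check_fitted, and convert can be sketched with a small stand-alone class. Everything here is illustrative: `SketchProcessor`, its public `fit` wrapper, and `ProcessorNotFittedError` (a local stand-in for SynthesizerProcessorError) are hypothetical names, and having convert call check_fitted is an assumption rather than documented behavior.

```python
import pandas as pd


class ProcessorNotFittedError(Exception):
    """Local stand-in for SynthesizerProcessorError (hypothetical)."""


class SketchProcessor:
    """Illustrative lifecycle only; not the sdgx DataProcessor."""

    # Class-level default, matching the documented `fitted = False`.
    fitted = False

    def _fit(self, metadata=None, **kwargs):
        # Subclass-specific fitting logic would go here; this sketch stores the metadata.
        self.metadata = metadata

    def fit(self, metadata=None, **kwargs):
        # Hypothetical public wrapper: run _fit, then mark the instance as fitted.
        self._fit(metadata, **kwargs)
        self.fitted = True

    def check_fitted(self):
        # Raise if fit has not been run on this instance.
        if not self.fitted:
            raise ProcessorNotFittedError("Processor is not fitted.")

    def convert(self, raw_data: pd.DataFrame) -> pd.DataFrame:
        # Assumption: conversion is guarded by the fitted check.
        self.check_fitted()
        # A base-style no-op conversion: return a copy of the input.
        return raw_data.copy()


processor = SketchProcessor()
processor.fit()
processed = processor.convert(pd.DataFrame({"a": [1, 2]}))
```

Calling `convert` on an instance that has not been fitted raises the stand-in error in this sketch, which is the kind of guard check_fitted is documented to provide.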
- static remove_columns(tabular_data: DataFrame, column_name_to_remove: list) → DataFrame
Remove specified columns from the input tabular data.
- Parameters:
tabular_data (pd.DataFrame) – Processed tabular data
column_name_to_remove (list) – List of column names to be removed
- Returns:
Tabular data with specified columns removed
- Return type:
pd.DataFrame
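As a rough illustration of remove_columns, the following sketch drops the named columns with pandas. `remove_columns_sketch` is a hypothetical stand-in, not the library's code. The entry above does not say what happens when a listed column is missing; this sketch simply lets `DataFrame.drop` raise its default KeyError.

```python
import pandas as pd


def remove_columns_sketch(tabular_data: pd.DataFrame, column_name_to_remove: list) -> pd.DataFrame:
    """Hypothetical stand-in for DataProcessor.remove_columns (not the real code)."""
    # DataFrame.drop returns a new frame, so the input is left unmodified.
    result_data = tabular_data.drop(columns=column_name_to_remove)
    return result_data


processed_data = pd.DataFrame({"id": [1, 2], "tmp_a": [0, 0], "tmp_b": [1, 1]})
trimmed = remove_columns_sketch(processed_data, ["tmp_a", "tmp_b"])
```

After the call, `trimmed` contains only the `id` column, while `processed_data` still has all three.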