Int Formatter

class sdgx.data_processors.formatters.int.IntValueFormatter

Bases: Formatter

Formatter class for handling Int values in pd.DataFrame.

_fit(metadata: Metadata | None = None, **kwargs: Dict[str, Any])

Fit the data processor.

Called before convert and reverse_convert.

Parameters:

metadata (Metadata, optional) – Metadata. Defaults to None.

static attach_columns(tabular_data: DataFrame, new_columns: DataFrame) → DataFrame

Attach additional columns to an existing DataFrame.

Parameters:
  • tabular_data (pd.DataFrame) – The original DataFrame.

  • new_columns (pd.DataFrame) – The DataFrame containing additional columns to be attached.

Returns:

The DataFrame with new_columns attached.

Return type:

  • result_data (pd.DataFrame)

Raises:

ValueError – If tabular_data and new_columns do not have the same number of rows.
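
The contract described above (same row count required, new columns appended) can be sketched in plain pandas. This is a minimal illustration, not the sdgx implementation: the function name `attach_columns_sketch` is hypothetical, and aligning rows by position rather than by index label is an assumption of the sketch.

```python
import pandas as pd


def attach_columns_sketch(tabular_data: pd.DataFrame, new_columns: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for attach_columns; not the sdgx implementation."""
    # Mirror the documented contract: the row counts must match.
    if len(tabular_data) != len(new_columns):
        raise ValueError("tabular_data and new_columns must have the same number of rows")
    # Assumption of this sketch: rows are aligned by position, so both
    # indexes are reset before concatenating side by side.
    return pd.concat(
        [tabular_data.reset_index(drop=True), new_columns.reset_index(drop=True)],
        axis=1,
    )


base = pd.DataFrame({"age": [31, 45]})
extra = pd.DataFrame({"city": ["Oslo", "Lima"]})
combined = attach_columns_sketch(base, extra)
```

Here `combined` holds both the `age` and `city` columns, while passing frames with different row counts raises `ValueError`.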

check_fitted()

Check if the processor is fitted.

Raises:

SynthesizerProcessorError – If the processor is not fitted.
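
The fit-then-check pattern can be illustrated with a small, self-contained sketch. The class `FittedGuardSketch` and the exception `ProcessorNotFittedError` are hypothetical stand-ins for the processor and for `SynthesizerProcessorError`; how sdgx actually sets the `fitted` flag is assumed here, not taken from its source.

```python
class ProcessorNotFittedError(Exception):
    """Hypothetical stand-in for SynthesizerProcessorError."""


class FittedGuardSketch:
    # Class-level default, matching the documented `fitted = False` attribute.
    fitted = False

    def fit(self) -> None:
        # Assumption: fitting flips the flag on the instance.
        self.fitted = True

    def check_fitted(self) -> None:
        if not self.fitted:
            raise ProcessorNotFittedError("Processor is not fitted; call fit() first.")


guard = FittedGuardSketch()
try:
    guard.check_fitted()
    raised_before_fit = False
except ProcessorNotFittedError:
    raised_before_fit = True

guard.fit()
guard.check_fitted()  # does not raise once the processor is fitted
```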

convert(raw_data: DataFrame) → DataFrame

Performs no action; this formatter does nothing during conversion.

fit(metadata: Metadata | None = None, **kwargs: dict[str, Any])

Fit method for the formatter.

The formatter uses metadata to record which columns are of int type, so that they can be converted back to int during post-processing.

fitted = False
int_columns: set

Set of column names that are of type int, populated by the fit method using metadata.

static remove_columns(tabular_data: DataFrame, column_name_to_remove: list) → DataFrame

Remove specified columns from the input tabular data.

Parameters:
  • tabular_data (pd.DataFrame) – Processed tabular data.

  • column_name_to_remove (list) – List of column names to be removed.

Returns:

Tabular data with specified columns removed

Return type:

  • result_data (pd.DataFrame)
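
Column removal as described above maps naturally onto `DataFrame.drop`. The following is a minimal sketch under that assumption; `remove_columns_sketch` is a hypothetical name, and the real method may differ, for example in how it treats missing column names.

```python
import pandas as pd


def remove_columns_sketch(tabular_data: pd.DataFrame, column_name_to_remove: list) -> pd.DataFrame:
    """Hypothetical stand-in for remove_columns; not the sdgx implementation."""
    # DataFrame.drop returns a new frame, so the input is left untouched.
    return tabular_data.drop(columns=column_name_to_remove)


processed = pd.DataFrame({"id": [1, 2], "tmp": [0.1, 0.2], "age": [30, 41]})
trimmed = remove_columns_sketch(processed, ["tmp"])
```

After the call, `trimmed` contains only `id` and `age`, and `processed` still has all three columns.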

reverse_convert(processed_data: DataFrame) → DataFrame

Reverse conversion for the formatter.

Converts the int columns recorded during fit back to the int type.
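
The overall flow (record int columns at fit time, leave data alone in convert, cast back in reverse_convert) might look roughly like the sketch below. `IntFormatterSketch` is a hypothetical class, not the sdgx `IntValueFormatter`: it takes column names directly instead of a `Metadata` object, and rounding before the cast is an assumption of the sketch.

```python
import pandas as pd


class IntFormatterSketch:
    """Hypothetical illustration of the documented flow; not the sdgx class."""

    def __init__(self) -> None:
        self.int_columns: set = set()
        self.fitted = False

    def fit(self, int_column_names) -> None:
        # In sdgx the names come from Metadata; here they are passed in directly.
        self.int_columns = set(int_column_names)
        self.fitted = True

    def convert(self, raw_data: pd.DataFrame) -> pd.DataFrame:
        # Documented as "no action"; returning the input as-is is this sketch's reading.
        return raw_data

    def reverse_convert(self, processed_data: pd.DataFrame) -> pd.DataFrame:
        result = processed_data.copy()
        for col in self.int_columns:
            if col in result.columns:
                # Assumption: round to the nearest whole number, then cast to int.
                result[col] = result[col].round().astype(int)
        return result


formatter = IntFormatterSketch()
formatter.fit(["age"])

# Sampled data often comes back as floats even for integer columns.
sampled = pd.DataFrame({"age": [30.2, 41.7], "score": [0.5, 0.9]})
restored = formatter.reverse_convert(sampled)
```

Only the columns recorded in `int_columns` are cast; `score` keeps its float values, and the `sampled` frame itself is not modified.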