DataProcessorManager

class sdgx.data_processors.manager.DataProcessorManager(*args, **kwargs)

Bases: Manager

This is a plugin management class for data processing components.

Properties:
  • register_type: Specifies the type of data processors to register.

  • project_name: Stores the project name from the extension module.

  • hookspecs_model: Stores the hook specifications model from the extension module.

  • preset_defalut_processors: Stores a list of default processor names in lowercase. Note the spelling of the attribute name in the code.

  • registed_data_processors: Property that returns the registered data processors.

  • registed_default_processor_list: Property that returns the registered default data processors.

Methods:
  • load_all_local_model: Loads all local models for formatters, generators, samplers, and transformers.

  • init_data_processor: Initializes a data processor with the given name and keyword arguments.

  • init_all_processors: Initializes all registered data processors with optional keyword arguments.

  • init_default_processors: Initializes default processors that are both registered and preset.

_load_dir(module)

Imports all Python files in a submodule.
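
One way to picture what importing every Python file in a submodule involves is the sketch below. It is not the sdgx implementation: it swaps in the standard library's pkgutil.iter_modules to enumerate modules, and the function name load_dir and the choice to skip subpackages and dunder modules are assumptions made for illustration.

```python
import importlib
import json
import pkgutil
from types import ModuleType


def load_dir(package: ModuleType) -> list[str]:
    """Import every plain module directly inside `package` (hypothetical helper)."""
    imported = []
    for info in pkgutil.iter_modules(package.__path__):
        # Assumption: skip subpackages and dunder modules such as __main__.
        if info.ispkg or info.name.startswith("__"):
            continue
        full_name = f"{package.__name__}.{info.name}"
        importlib.import_module(full_name)
        imported.append(full_name)
    return imported


# Usage: import the submodules of the standard-library json package.
loaded = load_dir(json)
print(loaded)
```

A plugin manager would typically register each imported module after importing it; that step is left out here because it depends on the plugin framework in use.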

_normalize_name(name: str) → str

hookspecs_model = <module 'sdgx.data_processors.extension'>

The hook specifications model from the extension module.

init(c, **kwargs: dict[str, Any])

Initializes a new instance of a subclass of self.register_type.

Raises:

init_all_processors(**kwargs: Any) → list[DataProcessor]

Initializes all registered data processors.

init_data_processor(processor_name, **kwargs: dict[str, Any]) → DataProcessor

Initializes a data processor with the given name and parameters.
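
The name-based lookup described here can be sketched as follows. This is a self-contained toy, not the sdgx code: ToyProcessor, ToyEmailGenerator, the REGISTRY dict, and the KeyError raised for unknown names are all hypothetical stand-ins, and lowercasing the name is an assumption suggested by the lowercase names in the preset list.

```python
from typing import Any


class ToyProcessor:
    """Hypothetical stand-in for a data processor base class."""

    def __init__(self, **kwargs: Any) -> None:
        # Keep the keyword arguments so the caller can inspect them.
        self.options = kwargs


class ToyEmailGenerator(ToyProcessor):
    """Hypothetical processor used only for this example."""


# Assumption: processors are stored under lowercase names.
REGISTRY: dict[str, type] = {"toyemailgenerator": ToyEmailGenerator}


def init_processor(processor_name: str, **kwargs: Any) -> ToyProcessor:
    """Look up a registered class by name and instantiate it with kwargs."""
    key = processor_name.strip().lower()
    if key not in REGISTRY:
        raise KeyError(f"{processor_name!r} is not registered")
    return REGISTRY[key](**kwargs)


# Usage: the lookup is case-insensitive because the name is normalized first.
processor = init_processor("ToyEmailGenerator", locale="en_US")
```

Keyword arguments are passed through unchanged, so each processor class decides which options it accepts.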

init_default_processors(**kwargs: Any) → list[DataProcessor]

Initializes all default data processors.

load_all_local_model()

Loads all local models for formatters, generators, samplers, and transformers.

preset_defalut_processors = ['specificcombinationtransformer', 'fixedcombinationtransformer', 'nonvaluetransformer', 'outliertransformer', 'emailgenerator', 'chnpiigenerator', 'intvalueformatter', 'datetimeformatter', 'constvaluetransformer', 'positivenegativefilter', 'emptytransformer', 'columnordertransformer']

The preset_defalut_processors list stores the lowercase names of the processors loaded by default. When the synthesizer is used, these processors are loaded automatically for the user's convenience.

Always keep ColumnOrderTransformer as the last entry.
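
To illustrate how a preset list like this one can drive default initialization, the sketch below keeps only the preset names that are also registered and preserves the preset order, so the last preset entry stays last. The function select_default_names, the shortened PRESET list, and the sample registry are hypothetical; the real manager may select its defaults differently.

```python
# Shortened, hypothetical preset list; the final entry must stay last.
PRESET = [
    "nonvaluetransformer",
    "emailgenerator",
    "columnordertransformer",
]


def select_default_names(registered: dict[str, type], preset: list[str]) -> list[str]:
    """Return the preset names that are registered, in preset order."""
    return [name for name in preset if name in registered]


# Hypothetical registry: one preset name is missing, one extra name is present.
registered = {
    "columnordertransformer": object,
    "customplugin": object,
    "emailgenerator": object,
}
defaults = select_default_names(registered, PRESET)
```

Iterating over the preset list, rather than over the registry, is what keeps the ordering under the control of the preset.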

project_name: str = 'sdgx.data_processor'

Stores the project name from the extension module.

property registed_cls: dict[str, type]

Access all registered classes.

Lazily loaded; loading happens only once.
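
The load-once behavior can be pictured with a small property that caches its result. This is a generic sketch of lazy loading with a hypothetical LazyRegistry class and a placeholder _load_all step; it does not reproduce how sdgx discovers classes.

```python
class LazyRegistry:
    """Hypothetical registry whose contents are loaded on first access."""

    def __init__(self) -> None:
        self._registed_cls: dict[str, type] = {}
        self._loaded = False
        self.load_count = 0  # counts how often the loading step runs

    def _load_all(self) -> None:
        # Placeholder for plugin discovery; registers one example class.
        self.load_count += 1
        self._registed_cls["example"] = dict

    @property
    def registed_cls(self) -> dict[str, type]:
        # Load on first access only; later accesses reuse the cached dict.
        if not self._loaded:
            self._load_all()
            self._loaded = True
        return self._registed_cls


registry = LazyRegistry()
first = registry.registed_cls
second = registry.registed_cls
```

Both accesses return the same dictionary object, and the loading step runs a single time.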

property registed_data_processors

This property returns all registered data processors.

property registed_default_processor_list

This property returns all registered default data processors.

register(cls_name, cls: type)

Registers a new model. If the model is already registered, it is skipped.
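
A minimal sketch of the skip-if-already-registered rule follows. It assumes that "already registered" means the normalized name is already present and that normalization lowercases the name; TinyManager and both of these choices are hypothetical and stand in for the real Manager.

```python
class TinyManager:
    """Hypothetical manager that ignores duplicate registrations."""

    def __init__(self) -> None:
        self._registed_cls: dict[str, type] = {}

    @staticmethod
    def _normalize_name(name: str) -> str:
        # Assumption: names are compared case-insensitively.
        return name.strip().lower()

    def register(self, cls_name: str, cls: type) -> bool:
        """Register cls under cls_name; return False if the name was taken."""
        key = self._normalize_name(cls_name)
        if key in self._registed_cls:
            return False  # already registered: skip without overwriting
        self._registed_cls[key] = cls
        return True


manager = TinyManager()
added_first = manager.register("MyModel", dict)
added_again = manager.register("mymodel", list)  # same name after normalization
```

Returning a boolean is a convenience of this sketch so the caller can tell whether the registration took effect.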

register_type

Specifies the type of data processors to register.

alias of DataProcessor