Extended Synthetic Data Generator
Note
Read the Architecture page first to understand the purpose of each component.
SDG builds its plug-in system on pluggy, which discovers plugins through the entry points of Python projects.
A plugin project is made up of three parts:
A class that inherits from the register_type of the Manager and contains your own logic.
A register function, whose name is defined (decorated) by @hookspec. You need to implement it and use @hookimpl to declare it as a registered hook.
An entry-points entry in pyproject.toml that points to the hookimpl function. The subdomain of the entry point is the PROJECT_NAME you can find in the Manager.
View the latest extension example on GitHub.
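The three parts above can be sketched with plain pluggy. This is a minimal illustration, not SDG's actual code: the names PROJECT_NAME value, BaseProcessor, HostSpec, Manager, UppercaseProcessor and MyPlugin are all hypothetical stand-ins, and the plugin is registered by hand so the example runs without installing a package. In a real plugin the hook would be discovered through the entry point shown in the trailing comment.

```python
import pluggy

# Hypothetical project name; in SDG, use the PROJECT_NAME found in the Manager.
PROJECT_NAME = "example_sdg_plugin"

hookspec = pluggy.HookspecMarker(PROJECT_NAME)
hookimpl = pluggy.HookimplMarker(PROJECT_NAME)


class BaseProcessor:
    """Hypothetical stand-in for a Manager's register_type."""

    def convert(self, rows):
        return rows


class HostSpec:
    """Host side: fixes the name and signature of the register function."""

    @hookspec
    def register(self, manager):
        """Called for every plugin so it can register its classes."""


class Manager:
    """Simplified, hypothetical stand-in for an SDG Manager."""

    register_type = BaseProcessor

    def __init__(self):
        self.registry = {}
        self.pm = pluggy.PluginManager(PROJECT_NAME)
        self.pm.add_hookspecs(HostSpec)

    def register(self, name, cls):
        # Only accept classes derived from register_type.
        if not issubclass(cls, self.register_type):
            raise TypeError(f"{cls.__name__} must inherit from BaseProcessor")
        self.registry[name] = cls

    def load_plugins(self, *plugins):
        # Manual registration keeps the sketch runnable; an installed plugin
        # would instead be found through its entry point.
        for plugin in plugins:
            self.pm.register(plugin)
        # pluggy hooks must be called with keyword arguments.
        self.pm.hook.register(manager=self)


# Part 1: a class that inherits from register_type and holds your logic.
class UppercaseProcessor(BaseProcessor):
    def convert(self, rows):
        return [str(row).upper() for row in rows]


# Part 2: the register function, declared as a hook with @hookimpl.
class MyPlugin:
    @hookimpl
    def register(self, manager):
        manager.register("uppercase", UppercaseProcessor)


# Part 3 lives in pyproject.toml (package and module names are hypothetical):
#
#   [project.entry-points."example_sdg_plugin"]
#   myplugin = "my_package.my_module"

manager = Manager()
manager.load_plugins(MyPlugin())
processor = manager.registry["uppercase"]()
print(processor.convert(["a", "b"]))  # ['A', 'B']
```

The type check in Manager.register mirrors the requirement that a plugin class derive from register_type, and the hook name is fixed on the host side by @hookspec, so a plugin only has to supply a matching @hookimpl function.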
Plugin-supported modules
API Reference for extended Data Connector: Data Connector is used to connect to data sources.
API Reference for extended Cacher for DataLoader: Cacher is used to improve performance, reduce network overhead, and support large datasets.
API Reference for extended Data Processor: Data Processor is used to pre-process and post-process data. It is useful for business logic.
API Reference for extended Inspector for Metadata: Inspector is used to extract metadata such as patterns, types, etc. from raw data.
API Reference for extended Model: Model is fitted on the processed data and used to generate synthetic data.
API Reference for extended Data Exporter: Data Exporter is used to export data to a destination. Use it from the CLI or as a library to save your processed data or synthetic data.