Command Line Interface#
The Command Line Interface (CLI) is designed to simplify the use of SDG and to let other programs call SDG more conveniently.
There are two main commands in the CLI:
- fit: Fit, fine-tune, or retrain the model, then save the final model to a specified path.
- sample: Load an existing model and sample synthetic data.
Because SDG supports a plug-in system, users can list all available components of each kind with the corresponding list-{component} command.
Note
If you want to use SDG as a library, please refer to Use Synthetic Data Generator as a library.
If you want to extend SDG with your own components, please refer to Developer guides for extension.
CLI for synthetic single-table data#
sdgx#
sdgx [OPTIONS] COMMAND [ARGS]...
fit#
Fit the synthesizer, or load an existing synthesizer for fine-tuning, retraining, or continued training.
sdgx fit [OPTIONS]
Options
- --torchrun <torchrun>#
Use torchrun to run the CLI.
- --torchrun_kwargs <torchrun_kwargs>#
[Json String] torchrun kwargs.
- --save_dir <save_dir>#
Required The directory to save the synthesizer
- --model <model>#
Required The name of the model.
- --model_path <model_path>#
The path of the model to load
- --model_kwargs <model_kwargs>#
[Json String] The kwargs of the model for initialization
- --load_dir <load_dir>#
The directory to load the synthesizer from. If it is specified, model_path will be ignored.
- --metadata_path <metadata_path>#
The path of the metadata to load
- --data_connector <data_connector>#
The name of the data connector to use
- --data_connector_kwargs <data_connector_kwargs>#
[Json String] The kwargs of the data connector to use
- --raw_data_loaders_kwargs <raw_data_loaders_kwargs>#
[Json String] The kwargs of the raw data loader to use
- --processed_data_loaders_kwargs <processed_data_loaders_kwargs>#
[Json String] The kwargs of the processed data loader to use
- --data_processors <data_processors>#
[Comma separated list] The name of the data processors to use, e.g. ‘processor_x,processor_y’
- --data_processors_kwargs <data_processors_kwargs>#
[Json String] The kwargs of the data processors to use
- --inspector_max_chunk <inspector_max_chunk>#
The maximum number of chunks for the inspector to load
- --metadata_include_inspectors <metadata_include_inspectors>#
[Comma separated list] The name of the inspectors to include, e.g. ‘inspector_x,inspector_y’
- --metadata_exclude_inspectors <metadata_exclude_inspectors>#
[Comma separated list] The name of the inspectors to exclude, e.g. ‘inspector_x,inspector_y’
- --inspector_init_kwargs <inspector_init_kwargs>#
[Json String] The kwargs of the inspector to use
- --model_fit_kwargs <model_fit_kwargs>#
[Json String] The kwargs of the model fit method
- --dry_run <dry_run>#
Dry run. Only initialize the synthesizer, without fitting or saving.
- --json_output <json_output>#
Exit with json output.
- --log_to_file <log_to_file>#
Log to file.
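Because the CLI is meant to be called from other programs, the option conventions above can be sketched in Python: options marked [Json String] receive a single JSON-encoded argument, and options marked [Comma separated list] receive names joined by commas. This is a minimal sketch, not part of SDG. The helper `build_fit_command` and every value passed to it (directory, model name, connector name, keyword arguments) are hypothetical placeholders; only the option names come from the list above.

```python
import json
import shlex


def build_fit_command(save_dir, model, *, model_kwargs=None,
                      data_connector=None, data_connector_kwargs=None,
                      data_processors=None):
    """Assemble an argv list for `sdgx fit` from Python values (hypothetical helper)."""
    # --save_dir and --model are the two options marked Required.
    argv = ["sdgx", "fit", "--save_dir", save_dir, "--model", model]
    if model_kwargs is not None:
        # [Json String] options take one JSON-encoded argument.
        argv += ["--model_kwargs", json.dumps(model_kwargs)]
    if data_connector is not None:
        argv += ["--data_connector", data_connector]
    if data_connector_kwargs is not None:
        argv += ["--data_connector_kwargs", json.dumps(data_connector_kwargs)]
    if data_processors:
        # [Comma separated list] options take names joined by commas.
        argv += ["--data_processors", ",".join(data_processors)]
    return argv


fit_argv = build_fit_command(
    "./my_model",                                 # hypothetical output directory
    "my_model_name",                              # hypothetical model name
    model_kwargs={"epochs": 10},                  # hypothetical keyword argument
    data_connector="my_connector",                # hypothetical connector name
    data_connector_kwargs={"path": "data.csv"},   # hypothetical keyword argument
    data_processors=["processor_x", "processor_y"],
)

# Print a shell-quoted form that can be pasted into a terminal.
print(shlex.join(fit_argv))
```

With SDG installed, the same list could be handed to `subprocess.run(fit_argv, check=True)`. Passing a list rather than a single string avoids shell-quoting problems with the JSON arguments.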
list-cachers#
sdgx list-cachers [OPTIONS]
Options
- --json_output <json_output>#
Exit with json output.
- --log_to_file <log_to_file>#
Log to file.
list-data-connectors#
sdgx list-data-connectors [OPTIONS]
Options
- --json_output <json_output>#
Exit with json output.
- --log_to_file <log_to_file>#
Log to file.
list-data-exporters#
sdgx list-data-exporters [OPTIONS]
Options
- --json_output <json_output>#
Exit with json output.
- --log_to_file <log_to_file>#
Log to file.
list-data-processors#
sdgx list-data-processors [OPTIONS]
Options
- --json_output <json_output>#
Exit with json output.
- --log_to_file <log_to_file>#
Log to file.
list-models#
sdgx list-models [OPTIONS]
Options
- --json_output <json_output>#
Exit with json output.
- --log_to_file <log_to_file>#
Log to file.
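The list-{component} commands documented above all follow the same naming pattern, so a calling program can generate them from the component names. The sketch below only assembles the command lines and does not assume anything about their output. The helper `build_list_command` is hypothetical, and the component names are copied from the subcommands listed above.

```python
import shlex

# Component names copied from the list-* subcommands documented above.
COMPONENTS = ("cachers", "data-connectors", "data-exporters",
              "data-processors", "models")


def build_list_command(component):
    """Return the argv list for `sdgx list-{component}` (hypothetical helper)."""
    if component not in COMPONENTS:
        raise ValueError(f"unknown component: {component!r}")
    return ["sdgx", f"list-{component}"]


list_commands = [build_list_command(name) for name in COMPONENTS]

# Print each command line, one per component kind.
for argv in list_commands:
    print(shlex.join(argv))
```

Running one of these before `fit` or `sample` is a way to discover which names are valid for options such as --model or --data_connector in a given installation.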
sample#
Load a synthesizer and sample.
load_dir should contain model and metadata. Please check Synthesizer’s load method for more details.
sdgx sample [OPTIONS]
Options
- --torchrun <torchrun>#
Use torchrun to run the CLI.
- --torchrun_kwargs <torchrun_kwargs>#
[Json String] torchrun kwargs.
- --load_dir <load_dir>#
Required The directory to load the synthesizer.
- --model <model>#
Required The name of the model.
- --raw_data_loaders_kwargs <raw_data_loaders_kwargs>#
[Json String] The kwargs of the raw data loaders.
- --processed_data_loaders_kwargs <processed_data_loaders_kwargs>#
[Json String] The kwargs of the processed data loaders.
- --data_processors <data_processors>#
[Comma separated list] The name of the data processors, e.g. ‘data_processor_1,data_processor_2’.
- --data_processors_kwargs <data_processors_kwargs>#
[Json String] The kwargs of the data processors.
- --count <count>#
The number of samples to generate.
- --chunksize <chunksize>#
The size of each chunk. If count is very large, chunksize is recommended.
- --model_sample_args <model_sample_args>#
[Json String] The kwargs passed to model.sample.
- --data_exporter <data_exporter>#
Required The name of the data exporter.
- --data_exporter_kwargs <data_exporter_kwargs>#
[Json String] The kwargs of the data exporter.
- --export_dst <export_dst>#
The destination of the exported data.
- --dry_run <dry_run>#
Dry run. Only initialize the synthesizer without sampling.
- --json_output <json_output>#
Exit with json output.
- --log_to_file <log_to_file>#
Log to file.
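A `sdgx sample` call can be assembled from Python values in the same spirit. This is a minimal sketch under stated assumptions: the helper `build_sample_command` is hypothetical, the directory, model name, exporter name, and destination are placeholders, and `load_dir` is assumed to point at a directory that already contains a model and metadata, as required above. Only the option names come from the list above.

```python
import json
import shlex


def build_sample_command(load_dir, model, data_exporter, *, count=None,
                         chunksize=None, export_dst=None,
                         data_exporter_kwargs=None, model_sample_args=None):
    """Assemble an argv list for `sdgx sample` (hypothetical helper)."""
    # --load_dir, --model and --data_exporter are the options marked Required.
    argv = ["sdgx", "sample", "--load_dir", load_dir,
            "--model", model, "--data_exporter", data_exporter]
    if count is not None:
        argv += ["--count", str(count)]
    if chunksize is not None:
        # Recommended by the option description when count is very large.
        argv += ["--chunksize", str(chunksize)]
    if export_dst is not None:
        argv += ["--export_dst", export_dst]
    if data_exporter_kwargs is not None:
        # [Json String] options take one JSON-encoded argument.
        argv += ["--data_exporter_kwargs", json.dumps(data_exporter_kwargs)]
    if model_sample_args is not None:
        argv += ["--model_sample_args", json.dumps(model_sample_args)]
    return argv


sample_argv = build_sample_command(
    "./my_model",          # hypothetical directory holding model and metadata
    "my_model_name",       # hypothetical model name
    "my_exporter",         # hypothetical exporter name
    count=1_000_000,
    chunksize=10_000,
    export_dst="./synthetic.csv",  # hypothetical destination
)

# Print a shell-quoted form that can be pasted into a terminal.
print(shlex.join(sample_argv))
```

With SDG installed, the list could be run with `subprocess.run(sample_argv, check=True)`.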