GeneratorConnector

class sdgx.data_connectors.generator_connector.GeneratorConnector(generator_caller: Callable[[], Generator[DataFrame, None, None]], *args, **kwargs)

Bases: DataConnector

A virtual data connector that wraps a generator in a DataConnector.

Passing offset=0 to read will reset the generator.

Warning

offset and limit are ignored because a generator does not support random access. A Cacher can be used to support them. See Data Loader for more details.

Note

This connector is not registered by default, so it can only be used when sdgx is used as a library.
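The reset and sequential-read behaviour described above can be illustrated without the library. This is a minimal sketch, not the sdgx implementation: `SketchGeneratorConnector` and `make_chunks` are hypothetical names, and a dict of column lists stands in for a DataFrame. It assumes that each read returns the next chunk and that None signals an exhausted generator.

```python
from typing import Callable, Dict, Generator, List, Optional

# Stand-in for a pandas DataFrame: column name -> list of values.
Chunk = Dict[str, List[int]]


class SketchGeneratorConnector:
    """Hypothetical stand-in used for illustration; not the sdgx class."""

    def __init__(self, generator_caller: Callable[[], Generator[Chunk, None, None]]):
        # Keep the factory so the generator can be recreated later.
        self.generator_caller = generator_caller
        self._generator = generator_caller()

    def read(self, offset: int = 0, limit: Optional[int] = None) -> Optional[Chunk]:
        # offset == 0 restarts the generator; any other offset just continues.
        if offset == 0:
            self._generator = self.generator_caller()
        # limit is ignored here: a generator cannot seek or truncate.
        return next(self._generator, None)


def make_chunks() -> Generator[Chunk, None, None]:
    yield {"id": [1, 2]}
    yield {"id": [3]}


connector = SketchGeneratorConnector(make_chunks)
first = connector.read(offset=0)   # starts from the beginning
second = connector.read(offset=2)  # next chunk, whatever the offset value is
done = connector.read(offset=3)    # generator exhausted
again = connector.read(offset=0)   # reset: the first chunk again
```

Passing a callable that creates the generator, rather than a generator object, is what makes the reset possible: an exhausted generator cannot be rewound, but it can be created again.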

_columns() → list[str]

Subclasses should implement this to read columns if there is an efficient way to peek at them.

See columns for more details.
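For a generator-backed source, one way to peek at columns is to take only the first chunk from a freshly created generator. The sketch below shows that idea under stated assumptions: `peek_columns` and `make_chunks` are hypothetical helpers, a dict of column lists stands in for a DataFrame, and this is not necessarily how the library implements `_columns`.

```python
from typing import Callable, Dict, Generator, List

# Stand-in for a pandas DataFrame: column name -> list of values.
Chunk = Dict[str, List[int]]


def peek_columns(generator_caller: Callable[[], Generator[Chunk, None, None]]) -> List[str]:
    # Create a fresh generator so the peek does not consume a shared one.
    for chunk in generator_caller():
        # Analogous to list(df.columns) on a real DataFrame.
        return list(chunk)
    # The generator yielded nothing, so there are no columns to report.
    return []


def make_chunks() -> Generator[Chunk, None, None]:
    yield {"id": [1, 2], "score": [10, 20]}
    yield {"id": [3], "score": [30]}


columns = peek_columns(make_chunks)
```

Only the first chunk is produced; the rest of the generator is never evaluated, which keeps the peek cheap when chunks are expensive to build.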

_iter(offset=0, chunksize=0) → Generator[DataFrame, None, None]

Subclasses should implement this to read data in chunks.

See iter for more details.

_read(offset: int = 0, limit: int | None = None) → DataFrame | None

Ignores limit and allows sequential reading.

columns() → list[str]

Interface for peeking at columns.

finalize()

Finalize the data connector.

property identity: str

iter(offset: int = 0, chunksize: int = 0) → Generator[DataFrame, None, None]

Interface for reading data in chunks.

Parameters:
  • offset (int, optional) – Offset for reading. Defaults to 0.

  • chunksize (int, optional) – Chunksize for reading. Defaults to 0.

Returns:

Generator/iterator over the DataFrames that were read

Return type:

Generator[pd.DataFrame, None, None]
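Chunked iteration lets a caller aggregate over the data without holding all of it in memory. The following sketch is illustrative only: `iter_chunks` and `make_chunks` are hypothetical names, a dict of column lists stands in for a DataFrame, and it assumes that every call starts a fresh generator and that offset and chunksize have no effect on a generator-backed source.

```python
from typing import Callable, Dict, Generator, List

# Stand-in for a pandas DataFrame: column name -> list of values.
Chunk = Dict[str, List[int]]


def iter_chunks(
    generator_caller: Callable[[], Generator[Chunk, None, None]],
    offset: int = 0,
    chunksize: int = 0,
) -> Generator[Chunk, None, None]:
    # offset and chunksize are accepted for interface compatibility only;
    # in this sketch the generator decides the chunk boundaries.
    yield from generator_caller()


def make_chunks() -> Generator[Chunk, None, None]:
    yield {"id": [1, 2], "score": [10, 20]}
    yield {"id": [3], "score": [30]}


total_rows = 0
score_sum = 0
for chunk in iter_chunks(make_chunks):
    # Process one chunk at a time, then let it go out of scope.
    total_rows += len(chunk["id"])
    score_sum += sum(chunk["score"])
```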

keys() → list[str]

Same as columns.

read(offset: int = 0, limit: int | None = None) → DataFrame | None

Interface for reading data.

Parameters:
  • offset (int, optional) – Offset for reading. Defaults to 0.

  • limit (int, optional) – Limit for reading. Defaults to None. None reads all data; 0 reads no data (header only).

Returns:

The DataFrame that was read

Return type:

pd.DataFrame | None
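The offset and limit semantics above apply to sources with random access; as the warning on this class states, a generator-backed connector ignores them. The sketch below shows what the parameters mean on an in-memory table. `read_slice` is a hypothetical helper, not part of sdgx, and a dict of column lists stands in for a DataFrame.

```python
from typing import Dict, List, Optional

# Stand-in for a pandas DataFrame: column name -> list of values.
Table = Dict[str, List[int]]


def read_slice(table: Table, offset: int = 0, limit: Optional[int] = None) -> Table:
    # limit=None reads to the end; limit=0 keeps the columns but no rows.
    end = None if limit is None else offset + limit
    return {name: values[offset:end] for name, values in table.items()}


table = {"id": [1, 2, 3, 4], "score": [10, 20, 30, 40]}

everything = read_slice(table)                 # all rows
header_only = read_slice(table, limit=0)       # column names, empty data
window = read_slice(table, offset=1, limit=2)  # rows 1 and 2 (zero-based)
```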