CsvConnector

class sdgx.data_connectors.csv_connector.CsvConnector(path, sep=',', header='infer', **read_csv_kwargs)

Wraps a CSV file as a DataConnector.

Parameters:

path (str) – Path to csv file
sep (str, optional) – Separator. Defaults to ‘,’.
header (str, optional) – Header. Defaults to ‘infer’.
read_csv_kwargs (dict, optional) – Additional keyword arguments passed to pd.read_csv; see https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html

Example

from sdgx.data_connectors.csv_connector import CsvConnector
connector = CsvConnector(
    path="data.csv",
)
df = connector.read()

_columns() → list[str]

Subclasses should implement this to read the column names when there is an efficient way to peek at them.

See column for more details.
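As a rough illustration of what "peeking" at columns can mean for a CSV source, the sketch below reads only the header row and leaves the data rows untouched. It uses the standard-library csv module rather than pandas, so it is not the library's implementation; `peek_columns` and the sample data are hypothetical names introduced for this example.

```python
import csv
import io


def peek_columns(stream, sep=","):
    """Return the header row of a CSV stream without reading the data rows."""
    reader = csv.reader(stream, delimiter=sep)
    # Only the first row is consumed; an empty stream yields an empty list.
    return next(reader, [])


# Hypothetical in-memory data standing in for a file on disk.
sample = io.StringIO("id,name,score\n1,ann,0.5\n2,bob,0.7\n")
columns = peek_columns(sample)
print(columns)
```

Because only one row is parsed, the cost does not grow with the size of the file, which is the point of providing a dedicated column reader.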

_iter(offset: int = 0, chunksize: int = 1000) → Generator[DataFrame, None, None]

Subclasses should implement this to read data in chunks.

See iter for more details.
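The shape of a chunked reader can be sketched as a generator that skips `offset` data rows and then yields at most `chunksize` rows at a time. This is a minimal standard-library sketch, not the connector's pandas-based implementation: `iter_chunks` and `DATA` are hypothetical, chunks are lists of dicts instead of DataFrames, and `chunksize` is assumed to be positive.

```python
import csv
import io
from itertools import islice


def iter_chunks(stream, offset=0, chunksize=1000, sep=","):
    """Yield lists of row dicts, at most `chunksize` rows per list."""
    reader = csv.DictReader(stream, delimiter=sep)
    # Skip the first `offset` data rows; DictReader consumes the header itself.
    rows = islice(reader, offset, None)
    while True:
        chunk = list(islice(rows, chunksize))
        if not chunk:
            # No rows left: end the generator.
            return
        yield chunk


# Hypothetical sample with five data rows.
DATA = "id,value\n1,a\n2,b\n3,c\n4,d\n5,e\n"
sizes = [len(chunk) for chunk in iter_chunks(io.StringIO(DATA), chunksize=2)]
print(sizes)
```

Only one chunk is held in memory at a time, so the caller can process inputs that are larger than available memory.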

_read(offset: int = 0, limit: int | None = None) → DataFrame | None

Subclasses must implement this to read data.

See read for more details.

iter(offset: int = 0, chunksize: int = 0) → Generator[DataFrame, None, None]

Interface for reading data in chunks.

Parameters:

Returns:

Generator/iterator yielding the DataFrames that were read

Return type:

Generator[pd.DataFrame, None, None]

read(offset: int = 0, limit: int | None = None) → DataFrame | None

Interface for reading data.

Parameters:

offset (int, optional) – Offset for reading. Defaults to 0.
limit (int, optional) – Limit for reading. Defaults to None. None reads all data; 0 reads no data (header only).

Returns:

The DataFrame that was read

Return type:

pd.DataFrame
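The offset and limit semantics described above (None reads everything, 0 returns only the header) can be illustrated with a small standard-library sketch. It is an assumption-laden stand-in, not the connector's code: `read_rows` and `DATA` are hypothetical, and it returns a `(columns, rows)` tuple instead of a DataFrame.

```python
import csv
import io
from itertools import islice


def read_rows(stream, offset=0, limit=None, sep=","):
    """Return (columns, rows), skipping `offset` rows and keeping up to `limit`."""
    reader = csv.reader(stream, delimiter=sep)
    columns = next(reader, [])
    # limit=None reads to the end; limit=0 selects no rows, leaving only the header.
    stop = None if limit is None else offset + limit
    rows = list(islice(reader, offset, stop))
    return columns, rows


# Hypothetical sample with three data rows.
DATA = "id,value\n1,a\n2,b\n3,c\n"
everything = read_rows(io.StringIO(DATA))
header_only = read_rows(io.StringIO(DATA), limit=0)
window = read_rows(io.StringIO(DATA), offset=1, limit=1)
print(everything, header_only, window)
```

Treating 0 and None as distinct values is what lets a caller ask for the schema alone without paying for a full read.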