bff.read_sql_by_chunks

bff.read_sql_by_chunks(sql, cnxn, params=None, chunksize=8000000, column_types=None, **kwargs)

Read SQL query by chunks into a DataFrame.

This function calls pandas' read_sql with the chunksize option, so the query result is read as an iterator of chunks rather than all at once.

The columns of each chunk are cast to the requested types to reduce memory usage, and these types are preserved when the chunks of the iterator are concatenated.
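The read, cast, and concatenate steps described above can be sketched with plain pandas. This is a minimal illustration, not the library's actual implementation: the function name read_sql_by_chunks_sketch, the in-memory SQLite database, and the table t are all hypothetical and exist only to make the example self-contained.

```python
import sqlite3

import pandas as pd


def read_sql_by_chunks_sketch(sql, cnxn, params=None, chunksize=8_000_000,
                              column_types=None, **kwargs):
    """Hypothetical stand-in showing the read / cast / concatenate idea."""
    # With chunksize set, pd.read_sql returns an iterator of DataFrames.
    chunks = pd.read_sql(sql, cnxn, params=params, chunksize=chunksize, **kwargs)
    casted = []
    for chunk in chunks:
        # Cast each chunk before storing it, so the wide default dtypes
        # never accumulate for the full result set.
        if column_types is not None:
            chunk = chunk.astype(column_types)
        casted.append(chunk)
    if not casted:
        return pd.DataFrame()
    # Chunks share the same dtypes, so concatenation keeps them unchanged.
    return pd.concat(casted, ignore_index=True)


# Assumed demo data: a small in-memory SQLite table named "t".
cnxn = sqlite3.connect(":memory:")
pd.DataFrame({"id": range(10), "score": [0.5] * 10}).to_sql("t", cnxn, index=False)

df = read_sql_by_chunks_sketch(
    "SELECT id, score FROM t",
    cnxn,
    chunksize=3,  # tiny on purpose, to force several chunks
    column_types={"id": "int16", "score": "float32"},
)
print(df.dtypes)
```

The chunk size is deliberately tiny here so that several chunks are produced; in practice the default of 8,000,000 rows would return a table this small in a single chunk.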

Parameters
  • sql (str) – SQL query to be executed.

  • cnxn (SQLAlchemy connectable (engine/connection) or database string URI) – Connection object representing a single connection to the database.

  • params (list or dict, default None) – Parameters to pass to the execute method.

  • chunksize (int, default 8,000,000) – Number of rows to include in each chunk.

  • column_types (dict, default None) – Dictionary mapping each column name to the type it should be cast to. If None, no cast is done.

  • **kwargs – Additional keyword arguments to be passed to the pd.read_sql function.
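To illustrate why column_types matters for memory, the sketch below measures one chunk before and after casting. It uses only pandas and NumPy; the column names id and flag and the chosen target types are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical chunk as a SQL driver might return it: 64-bit integers everywhere.
chunk = pd.DataFrame({
    "id": np.arange(1000, dtype="int64"),
    "flag": np.zeros(1000, dtype="int64"),
})

# Same shape as the column_types argument: column name -> target type.
column_types = {"id": "int32", "flag": "int8"}

before = chunk.memory_usage(index=False).sum()
after = chunk.astype(column_types).memory_usage(index=False).sum()

print(before)  # 16000 bytes: 2 columns x 1000 rows x 8 bytes
print(after)   # 5000 bytes: 1000 x 4 bytes + 1000 x 1 byte
```

The target types must be wide enough for the data: casting a column to a type that cannot represent its values would corrupt or reject them, so the mapping should be chosen from the known value ranges.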

Returns

DataFrame containing the concatenated chunks, with columns cast to the requested types.

Return type

pd.DataFrame