bff.pipe_multiprocessing_pd
bff.pipe_multiprocessing_pd(df, func, *, nb_proc=None, **kwargs)
Compute function on DataFrame with nb_proc processes.
The given function must return a new DataFrame. Rows must be independent: the result for a row must not depend on a value computed from the whole DataFrame.
By default, the function uses as many processes as there are CPUs available on the machine.
The DataFrame is split into nb_proc parts, and each part is processed by a different process. The results are then concatenated and returned.
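The independence requirement above can be illustrated with a small sequential sketch. It does not call bff or start any process: apply_in_chunks, double and center are hypothetical names introduced here, and the chunking only imitates the split, compute and concatenate steps described above.

```python
import numpy as np
import pandas as pd


def double(df):
    # Row-wise: each output value depends only on its own row.
    return df.assign(doubled=df["x"] * 2)


def center(df):
    # Depends on the mean of whatever DataFrame it receives,
    # so the result changes when the data is split into chunks.
    return df.assign(centered=df["x"] - df["x"].mean())


def apply_in_chunks(df, func, nb_chunks):
    # Sequential stand-in (assumption) for split / compute / concatenate.
    bounds = np.linspace(0, len(df), nb_chunks + 1, dtype=int)
    chunks = [df.iloc[start:stop] for start, stop in zip(bounds[:-1], bounds[1:])]
    return pd.concat([func(chunk) for chunk in chunks])


df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0]})

# Same result whether computed on the whole DataFrame or per chunk.
row_wise_ok = apply_in_chunks(df, double, 2).equals(double(df))

# Different result: each chunk subtracts its own mean, not the global one.
whole_frame_ok = apply_in_chunks(df, center, 2).equals(center(df))
```

A function like double is safe to run on separate parts of the DataFrame, while a function like center would silently produce different values once the data is split.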
- Parameters
df (pd.DataFrame) – DataFrame to be processed by the function.
func (function) – Function that takes the DataFrame as input and returns a new DataFrame.
nb_proc (Union[int, None], default None) – Number of processes to use. If not provided, multiprocessing.cpu_count() processes are used.
**kwargs – Additional keyword arguments to be passed to func.
- Returns
The DataFrame computed by func.
- Return type
pd.DataFrame
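As a rough sketch of how the parameters documented above could fit together, the helper below resolves the nb_proc default, binds **kwargs to func, splits the DataFrame and concatenates the results. It is not bff's implementation: split_apply_concat, its mapper parameter and scale are hypothetical names, and the default mapper is the built-in map, so the sketch runs sequentially.

```python
import functools
import multiprocessing

import numpy as np
import pandas as pd


def split_apply_concat(df, func, *, nb_proc=None, mapper=map, **kwargs):
    # Hypothetical helper (assumption), mirroring the documented signature.
    if nb_proc is None:
        nb_proc = multiprocessing.cpu_count()
    # Bind the keyword arguments so each part is processed as func(part, **kwargs).
    bound = functools.partial(func, **kwargs)
    # Split the rows into nb_proc contiguous parts.
    bounds = np.linspace(0, len(df), nb_proc + 1, dtype=int)
    parts = [df.iloc[start:stop] for start, stop in zip(bounds[:-1], bounds[1:])]
    # mapper applies the bound function to every part; results are concatenated.
    return pd.concat(list(mapper(bound, parts)))


def scale(df, *, factor=1):
    # Example function: returns a new DataFrame with one extra column.
    return df.assign(scaled=df["x"] * factor)


df = pd.DataFrame({"x": range(10)})
result = split_apply_concat(df, scale, nb_proc=3, factor=10)
```

To run the parts in separate processes, the map method of a multiprocessing.Pool(nb_proc) could be passed as mapper, with the pool created under an if __name__ == "__main__": guard. In that case func has to be defined at module top level so that it can be pickled and sent to the worker processes.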