[Chennaipy] Chennaipy - Monday Module - 15 May 2023
selvi dct
selvi.dct at gmail.com
Mon May 15 15:30:09 EDT 2023
Date: 15 May 2023
Module : modin
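Installation : pip install modin
(Modin also needs an execution engine such as Ray or Dask;
pip install "modin[all]" installs Modin together with both.)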
About:
Modin is a drop-in replacement for pandas. While pandas is single-threaded,
Modin lets you instantly speed up your workflows by scaling pandas so it uses
all of your cores. Modin works especially well on larger datasets, where
pandas becomes painfully slow or runs out of memory.
By simply replacing the import statement, Modin offers users
effortless speed and scale for their pandas workflows:
import modin.pandas as pd
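
Because only the import line changes, a script can be written so that it
uses Modin when it is installed and plain pandas otherwise. A minimal
sketch of that idea follows; the try/except fallback and the toy data are
assumptions for illustration, not something the Modin project prescribes:

```python
# Assumption: fall back to plain pandas when Modin is not installed,
# so the same script runs in both environments.
try:
    import modin.pandas as pd  # parallel, pandas-compatible API
except ImportError:
    import pandas as pd  # single-threaded fallback

# Hypothetical toy data; existing pandas code is written the same way.
df = pd.DataFrame(
    {"city": ["Chennai", "Chennai", "Madurai"], "sales": [10, 20, 5]}
)

# Ordinary pandas-style groupby; nothing here is Modin-specific.
totals = df.groupby("city")["sales"].sum()
print(totals)
```

Everything after the import is ordinary pandas code, which is the point of
the drop-in design.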
Sample:
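import modin.pandas as pd
import numpy as np

# Reading a CSV uses the same call as in pandas (the file name is a placeholder).
df = pd.read_csv("my_dataset.csv")

# Two frames of random integers: 256 x 256 and 4096 x 4096.
left_data = np.random.randint(0, 100, size=(2**8, 2**8))
right_data = np.random.randint(0, 100, size=(2**12, 2**12))
left_df = pd.DataFrame(left_data)
right_df = pd.DataFrame(right_data)

# Time an inner merge on the column labelled 10 (%timeit needs IPython or Jupyter).
%timeit left_df.merge(right_df, how="inner", on=10)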
3.59 s ± 107 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit right_df.merge(left_df, how="inner", on=10)
1.22 s ± 40.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
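
The %timeit magic is only available inside IPython or Jupyter. In a plain
Python script a similar measurement can be approximated with the standard
library. The sketch below is one way to do it: the helper name time_call is
hypothetical, the workload is a stand-in, and the figures it prints will
not match IPython's exactly.

```python
import statistics
import timeit


def time_call(fn, repeat=7, number=1):
    """Call fn() `number` times per run, for `repeat` runs.

    Returns (mean, std_dev) of the per-loop time in seconds.
    """
    totals = timeit.repeat(fn, repeat=repeat, number=number)
    per_loop = [t / number for t in totals]
    return statistics.mean(per_loop), statistics.pstdev(per_loop)


# Stand-in workload so the sketch runs without any dataset; swap in
# lambda: left_df.merge(right_df, how="inner", on=10) to time the merge.
mean_s, std_s = time_call(lambda: sorted(range(100_000), reverse=True))
print(f"{mean_s * 1e3:.2f} ms ± {std_s * 1e3:.2f} ms per loop")
```

The defaults of 7 runs with 1 loop each mirror the settings shown in the
%timeit output above.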
Reference:
https://pypi.org/project/modin/