[Pandas-dev] Faster .apply natively
Roman Yurchak
rth.yurchak at gmail.com
Mon Nov 23 16:36:21 EST 2020
On 21/11/2020 06:31, Abdur-Rahmaan Janhangeer wrote:
> A normal NLP function of
> reducing a sentence to its essential lowercase version
> in 10 lines of list-comprehension processing takes an
> eternity for the tens of thousands of rows.
As far as I can tell, calling .apply on 10k rows has an overhead of a
few ms. If it takes much longer than that, the bottleneck is in your
function.
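One way to check where the time goes is to time the function on its own,
outside pandas. The sketch below is only illustrative: `normalize` is a
hypothetical stand-in for the NLP function from the original question, and
the sample rows are made up.

```python
import time

def normalize(sentence):
    # Hypothetical stand-in for the per-row NLP function:
    # strip surrounding punctuation from each word and lowercase it.
    words = [w.strip(".,!?;:").lower() for w in sentence.split()]
    return " ".join(w for w in words if w)

# Made-up sample data: 10k rows, matching the order of magnitude discussed.
rows = ["The quick, brown Fox jumps over the lazy dog!"] * 10_000

# Time a plain Python loop over the rows, with no pandas involved.
start = time.perf_counter()
results = [normalize(s) for s in rows]
elapsed = time.perf_counter() - start

print(f"{len(rows)} rows in {elapsed * 1e3:.1f} ms "
      f"({elapsed / len(rows) * 1e6:.2f} us per call)")
```

If this plain loop already takes about as long as the .apply call on the
same data, then .apply is not what is slow.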
The question then becomes how to make that function faster. The typical
answers are optimizing it in Python, rewriting it in a lower-level
language (Cython, or maybe using numba), parallelizing over rows or,
in this case, possibly caching.
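As a rough illustration of the caching option, here is a minimal sketch
using `functools.lru_cache` from the standard library. It assumes the
function is pure and takes a hashable argument (a string here), and it only
pays off if the same sentences occur many times. `normalize_cached` and the
sample rows are hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # unbounded cache; assumes distinct inputs fit in memory
def normalize_cached(sentence):
    # Hypothetical stand-in for the per-row NLP function.
    words = [w.strip(".,!?;:").lower() for w in sentence.split()]
    return " ".join(w for w in words if w)

# Made-up data with many repeated values.
rows = ["Hello, World!", "Hello, World!", "Good morning."] * 1000

results = [normalize_cached(s) for s in rows]

# cache_info() reports how often the stored result was reused.
info = normalize_cached.cache_info()
print(f"hits={info.hits} misses={info.misses}")
```

A function wrapped this way can be passed to .apply like any other
callable. With mostly unique sentences the cache adds memory use and a
small lookup cost without saving any work.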
See https://pandas.pydata.org/docs/user_guide/enhancingperf.html for
more details. The .apply method cannot make an arbitrary Python
function faster, and even parallelization has its limits in pure
Python.
--
Roman