[Pandas-dev] Import time/size optimization - how much do people care?

Marc Garcia garcia.marc at gmail.com
Tue Oct 12 13:24:34 EDT 2021


+1 on all of them.

I don't think 3 should be that complex, though I might be wrong.
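
For anyone skimming the thread: items 1 and 2 in the quoted list below are
both forms of lazy importing. Here is a minimal sketch of the two usual
patterns, under stated assumptions; it is not how pandas does (or would do)
it. `json` stands in for a heavy optional dependency such as pyarrow, and
`to_text`, `make_lazy_namespace`, `_LAZY_ATTRS` and `fakepd` are hypothetical
names made up for illustration.

```python
import importlib
import types


def to_text(obj):
    # Pattern A: import inside the function, so the cost is paid on the
    # first call rather than when the enclosing module is imported.
    # `json` is only a stand-in for a heavy optional dependency.
    import json

    return json.dumps(obj)


# Pattern B: resolve names in a package namespace on first access,
# using a module-level __getattr__ (PEP 562, Python 3.7+).
# Hypothetical mapping: public name -> (module to import, attribute in it).
_LAZY_ATTRS = {
    "dumps": ("json", "dumps"),
    "loads": ("json", "loads"),
}


def make_lazy_namespace(name, lazy_attrs):
    """Return a module object whose listed attributes are imported lazily."""
    ns = types.ModuleType(name)

    def _lazy_getattr(attr):
        # Called only when normal attribute lookup on the module fails.
        try:
            module_name, attr_name = lazy_attrs[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        value = getattr(importlib.import_module(module_name), attr_name)
        setattr(ns, attr, value)  # cache, so later lookups skip this hook
        return value

    # Storing the hook in the module's __dict__ is what PEP 562 looks for.
    ns.__getattr__ = _lazy_getattr
    return ns


fakepd = make_lazy_namespace("fakepd", _LAZY_ATTRS)
```

In a real package the hook would simply be a `def __getattr__(name)` at the
top level of `__init__.py`; the `types.ModuleType` wrapper is only there to
keep the sketch in one file. One known trade-off of pattern B is that the
lazy names do not show up in `dir()` unless `__dir__` is defined as well.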

On Tue, Oct 12, 2021 at 12:21 PM Brock Mendel <jbrockmendel at gmail.com>
wrote:

> > For this do you have in mind moving imports from the top of the file
> > into the functions that use them in our code base? Or would it be more
> > about not loading components of pandas until the user uses them
> > (components like plotting, timeseries, IO connectors...)?
>
> Some of each.  The main candidates I've looked at recently:
>
> 1) make the pyarrow import lazy (~15%, see
> https://github.com/pandas-dev/pandas/issues/41432#issuecomment-939083050)
> 2) make the pandas.io.api imports (into the pd namespace) lazy (4-5%)
> 3) avoid @doc/@Appender/@Substitution at runtime (~4-5%, but a PITA; I
> think not worth it)
>
> On Tue, Oct 12, 2021 at 9:46 AM Marc Garcia <garcia.marc at gmail.com> wrote:
>
>> Hi Brock, thanks for having a look at this.
>>
>> Just a question. For this do you have in mind moving imports from the top
>> of the file into the functions that use them in our code base? Or would it
>> be more about not loading components of pandas until the user uses them
>> (components like plotting, timeseries, IO connectors...)? The main
>> difference being that in the latter case, most Python files would keep the
>> imports at the top, but we'd avoid loading pandas modules until needed.
>>
>> Feels like the latter, where it makes sense, could be a nice thing, and
>> not only for the loading time and the base memory footprint.
>>
>> On Mon, Oct 11, 2021 at 6:03 PM Brock Mendel <jbrockmendel at gmail.com>
>> wrote:
>>
>>> I've spent some time looking at our import time and the memory footprint
>>> at import and I _think_ we can cut another 20-30% by e.g. lazifying
>>> imports.  The last 5-10% of that is pretty hairy though.
>>>
>>> My question for the community is: is this worth optimizing?  Is there
>>> anyone (dask maybe?) for whom import time and memory footprint are a
>>> pain point?
>>> _______________________________________________
>>> Pandas-dev mailing list
>>> Pandas-dev at python.org
>>> https://mail.python.org/mailman/listinfo/pandas-dev
>>>
>>
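
To put numbers on the import time and memory footprint discussed above, one
option is CPython's built-in `python -X importtime -c "import pandas"`, which
prints a per-module breakdown to stderr. The sketch below is a rougher,
in-process alternative and is only an illustration: `measure_import` is a
hypothetical helper, and `tracemalloc` only sees allocations made through
Python's allocator, so memory used by C extensions is undercounted.

```python
import importlib
import time
import tracemalloc


def measure_import(module_name):
    """Return (seconds, bytes) spent importing `module_name`.

    Only meaningful in a fresh interpreter: a module that is already in
    sys.modules is returned from the cache and costs almost nothing.
    """
    tracemalloc.start()
    start = time.perf_counter()
    importlib.import_module(module_name)
    elapsed = time.perf_counter() - start
    # Bytes currently allocated via Python's allocator since start().
    allocated, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed, allocated


if __name__ == "__main__":
    # "json" is a stand-in; swap in "pandas" where it is installed.
    seconds, size = measure_import("json")
    print(f"import took {seconds:.4f} s, ~{size / 1024:.1f} KiB allocated")
```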