[Python-Dev] Investigating time for `import requests`

Nathaniel Smith njs at pobox.com
Sun Oct 1 22:34:55 EDT 2017


On Sun, Oct 1, 2017 at 7:04 PM, INADA Naoki <songofacandy at gmail.com> wrote:
> 4. http.client
>
> import time:      1376 |       2448 |                   email.header
> ...
> import time:      1469 |       7791 |                   email.utils
> import time:       408 |      10646 |                 email._policybase
> import time:       939 |      12210 |               email.feedparser
> import time:       322 |      12720 |             email.parser
> ...
> import time:       599 |       1361 |             email.message
> import time:      1162 |      16694 |           http.client
>
> email.parser has a very large import tree,
> but I don't know how to break it up.

There is some work to get urllib3/requests to stop using http.client,
though it's not clear if/when it will actually happen:
https://github.com/shazow/urllib3/pull/1068

> Another major slowness comes from compiling regular expressions.
> I think we can increase the cache size of `re.compile` and use
> on-demand cached compiling (e.g. `re.match()`) instead of
> "compile at import time" in many modules.
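[Editor's illustration, not part of the original mail: a minimal sketch of the difference being suggested. A module-level re.compile() runs at import time even if the pattern is never used, while passing the pattern string to re.match() defers compilation to the first call and reuses re's internal per-pattern cache on later calls.]

```python
import re

# Eager: compiled when the module is imported, adding to import
# time even if the pattern is never used.
_DATE_EAGER = re.compile(r"\d{4}-\d{2}-\d{2}")

def match_eager(s):
    return _DATE_EAGER.match(s)

def match_lazy(s):
    # Lazy: re.match() compiles on the first call and stores the
    # compiled pattern in re's internal cache, so repeated calls
    # with the same pattern string reuse it.
    return re.match(r"\d{4}-\d{2}-\d{2}", s)
```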

In principle re.compile() itself could be made lazy -- return a
regular expression object that just holds the string, and compiles
and caches it the first time it's used. That might be tricky to do in
a backwards-compatible way, since it moves detection of invalid
regexes from compile time to use time, but it could be an opt-in flag.
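[Editor's illustration, not part of the original mail: a hedged sketch of the lazy idea. `LazyPattern` is a hypothetical name, not a real re API; it only illustrates deferring compilation to first use, including the caveat that an invalid regex now raises at use time rather than at definition time.]

```python
import re

class LazyPattern:
    """Hypothetical sketch: hold the pattern string and compile it
    (caching the result) on first use.  Note this moves detection
    of invalid regexes from definition time to use time."""

    def __init__(self, pattern, flags=0):
        self._pattern = pattern
        self._flags = flags
        self._compiled = None  # nothing compiled yet

    def _get(self):
        # Compile once, on first use; an invalid pattern raises here.
        if self._compiled is None:
            self._compiled = re.compile(self._pattern, self._flags)
        return self._compiled

    def match(self, string):
        return self._get().match(string)

    def search(self, string):
        return self._get().search(string)

# Defining the pattern at import time is now cheap.
DATE = LazyPattern(r"(\d{4})-(\d{2})-(\d{2})")
```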

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
