[Python-ideas] re.compile_lazy - on first use compiled regexes

M.-A. Lemburg mal at egenix.com
Sat Mar 23 22:31:31 CET 2013


On 23.03.2013 22:20, Antoine Pitrou wrote:
> On Sat, 23 Mar 2013 22:19:02 +0100
> "M.-A. Lemburg" <mal at egenix.com> wrote:
>>
>> Hmm, I'm not following you. The patterns would get compiled once
>> at Python build time when installing the stdlib. The bytecode
>> version wouldn't change for those compiled patterns - unless, of
>> course, you upgrade to a new Python version, but then you'd
>> rebuild the bytecode versions of the REs :-)
>>
>> To make them generally useful, I agree, you would have to add a
>> RE compiler version to the bytecode pickle, but AFAICS this
>> should not affect the usefulness for the stdlib RE cache.
> 
> Ah, you're talking only about the stdlib.
> Well, sure, that would work, but we have to remember to regenerate
> those pickles by hand each time the re bytecode is updated (which
> doesn't happen often, admittedly). That's a bit of a maintenance burden.

No, that would happen at build time automatically. setup.py
would create the module with the pickled RE bytecodes by scanning the
stdlib modules for RE patterns, and the re module would then use that
generated module to seed its cache.

That's the high-level idea. I'm sure there are a few pitfalls
along the way :-)
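
To make the scanning step a bit more concrete, here is a minimal sketch of how a build step might collect candidate patterns. It only assumes that the interesting patterns are string literals passed directly to re.compile(); the function name find_literal_patterns and the SAMPLE source are made up for illustration and are not part of setup.py or the re module.

```python
import ast

def find_literal_patterns(source):
    """Collect string/bytes literals passed as first argument to re.compile()."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "compile"
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id == "re"
                and node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, (str, bytes))):
            found.append(node.args[0].value)
    return found

# Hypothetical module source standing in for a stdlib file.
SAMPLE = '''
import re
WORD = re.compile("[a-z]+")
def is_number(s):
    return re.compile("[0-9]+").match(s)
DYNAMIC = re.compile(make_pattern())  # not a literal, so it is skipped
'''

patterns = find_literal_patterns(SAMPLE)
```

Patterns built at runtime, or compiled through an alias such as "from re import compile", would be missed by this approach, which is probably one of the pitfalls.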

>> The whole idea is really very similar to the Python VM bytecode
>> caching Python is using to speed up imports of modules.
> 
> Except that the VM bytecode caching works automatically and
> transparently :-)

Should be the same for the REs in the stdlib. The user wouldn't
notice (except for the speedup hopefully). Code in the stdlib
compiling the REs wouldn't need to be touched either, since the
cache in the re module would simply reuse the compiled versions.

>> Perhaps we could have a GSoC student give it a try and see
>> whether it results in noticeable startup time speedups?!
> 
> That's a rather smallish topic for a GSoC project, IMHO.

Well, you could extend it by adding some RE optimization
tasks on top of it :-)

-- 
Marc-Andre Lemburg
eGenix.com



