OK to memoize re objects?

kj no.email at please.post
Mon Sep 21 09:33:05 EDT 2009


In <mailman.120.1253406305.2807.python-list at python.org> Robert Kern <robert.kern at gmail.com> writes:

>kj wrote:
>> 
>> My Python code is filled with assignments of regexp objects to
>> global variables at the top level; e.g.:
>> 
>> _spam_re = re.compile('^(?:ham|eggs)$', re.I)
>> 
>> Don't like it.  My Perl-pickled brain wishes that re.compile was
>> a memoizing method, so that I could use it anywhere, even inside
>> tight loops, without ever having to worry about the overhead of
>> regexp compilation.

>Just use re.search(), etc. They already memoize the compiled regex objects.

Thanks.

I find the docs are pretty confusing on this point.  They first
make the point of noting that pre-compiling regular expressions is
more efficient, and then *immediately* shoot down this point by
saying that one need not worry about pre-compiling in most cases.
From the docs:

    ...using compile() and saving the resulting regular expression
    object for reuse is more efficient when the expression will be
    used several times in a single program.

    Note: The compiled versions of the most recent patterns passed
    to re.match(), re.search() or re.compile() are cached, so
    programs that use only a few regular expressions at a time
    needn't worry about compiling regular expressions.

Honestly I don't know what to make of this...  I would love to see
an example in which re.compile was unequivocally preferable, to
really understand what the docs are saying here...
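
One way to probe the question might be to time both forms side by side. Here is a rough sketch, not a definitive benchmark: the pattern is the one from my original post, while the word list, the function names, and the repeat count are all made up for illustration. I make no claim about which form wins or by how much; that presumably depends on the Python version and on how many distinct patterns are in play.

```python
import re
import timeit

PATTERN = r'^(?:ham|eggs)$'                      # pattern from the original post
WORDS = ['ham', 'EGGS', 'spam', 'toast'] * 250   # hypothetical test data

# Form 1: compile once at the top level and reuse the compiled object.
_spam_re = re.compile(PATTERN, re.I)

def with_precompiled(words=WORDS):
    search = _spam_re.search
    return sum(1 for w in words if search(w))

# Form 2: call the module-level function every time and rely on
# whatever caching the re module does internally.
def with_module_function(words=WORDS):
    return sum(1 for w in words if re.search(PATTERN, w, re.I))

if __name__ == '__main__':
    # The repeat count (200) is arbitrary; adjust to taste.
    for fn in (with_precompiled, with_module_function):
        secs = timeit.timeit(fn, number=200)
        print('%-22s %.4f s' % (fn.__name__, secs))
```

Both functions should count the same matches; only the timings are of interest. Varying the number of distinct patterns used between calls would be the next thing to try, since the note in the docs only promises caching for "the most recent patterns".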

kynn


