regexp question

Ben Finney bignose-hates-spam at and-benfinney-does-too.id.au
Thu Dec 4 22:11:52 EST 2003


On Fri, 05 Dec 2003 02:26:53 -0000, python_charmer2000 wrote:
> re1 = <some regexp>
> re2 = <some regexp>
> re3 = <some regexp>
> 
> big_re = re.compile(re1 + '|' + re2 + '|' + re3)
> 
> Now the "match.re.pattern" is the entire regexp, big_re.  But I want
> to print out the portion of the big re that was matched -- was it re1?
> re2?  or re3?  Is it possible to determine this, or do I have to make
> a second pass through the collection of re's and compare them against
> the "matched text" in order to determine which part of the big_re was
> matched?

A second pass will work no matter what your regexes happen to be, and is
easily understood.  Implement that, and see if it's fast enough.
(Optimising before you have measured a problem is known as "premature
optimisation" and is bad practice.)  In fact, it may be better (from a
readability standpoint) to simply compile each of the regexes separately
and try them all each time.
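
A minimal sketch of that approach, assuming three placeholder patterns in
place of re1, re2 and re3; the helper name which_matched is made up for
illustration and is not part of the re module:

```python
import re

# Placeholder patterns standing in for re1, re2 and re3 (assumptions).
patterns = ['abc', 'def', 'ghi']

# Compile each regex separately so each keeps its own .pattern.
compiled = [re.compile(p) for p in patterns]

def which_matched(text):
    """Return the source pattern of the first regex that matches
    at the start of text, or None if none of them match."""
    for regex in compiled:
        if regex.match(text):
            return regex.pattern
    return None

print(which_matched('def'))
```

Trying the regexes in list order mirrors the left-to-right order of the
alternatives in the combined pattern, so the first hit is reported.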

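An alternative, if it's not fast enough:  Wrap each regex in its own
group, and check which group matched with the match object's group()
method.  (This relies on the individual regexes containing no capturing
groups of their own; otherwise the group numbers shift.)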

    >>> import re
    >>> regex1 = 'abc'
    >>> regex2 = 'def'
    >>> regex3 = 'ghi'
    >>> big_regex = re.compile(
    ...     '(' + regex1 + ')'
    ...     + '|(' + regex2 + ')'
    ...     + '|(' + regex3 + ')'
    ... )
    >>> match = big_regex.match('def')
    >>> match.groups()
    (None, 'def', None)
    >>> match.group(1)
    >>> match.group(2)
    'def'
    >>> match.group(3)
    >>>
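
If counting group numbers becomes awkward, named groups are one possible
variation on the same idea.  The sketch below assumes the same placeholder
patterns as above; the labels re1..re3 and the helper matched_part are
chosen for illustration only.  It uses the match object's lastgroup
attribute, which holds the name of the last matched capturing group.

```python
import re

# Placeholder sub-patterns keyed by an illustrative label (assumptions).
parts = {'re1': 'abc', 're2': 'def', 're3': 'ghi'}

# Builds '(?P<re1>abc)|(?P<re2>def)|(?P<re3>ghi)'.
big_regex = re.compile(
    '|'.join('(?P<%s>%s)' % (name, pattern) for name, pattern in parts.items())
)

def matched_part(text):
    """Return (label, pattern) for the alternative that matched
    at the start of text, or None if nothing matched."""
    match = big_regex.match(text)
    if match is None:
        return None
    name = match.lastgroup  # name of the last matched capturing group
    return name, parts[name]

print(matched_part('def'))
```

This keeps the single compiled regex while still reporting which piece of
it matched, at the cost of a slightly busier pattern string.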


-- 
 \          "As the evening sky faded from a salmon color to a sort of |
  `\   flint gray, I thought back to the salmon I caught that morning, |
_o__) and how gray he was, and how I named him Flint."  -- Jack Handey |
Ben Finney <http://bignose.squidly.org/>



