Question: Optional Regular Expression Grouping

Ian Kelly ian.g.kelly at gmail.com
Mon Oct 10 19:03:02 EDT 2011


On Mon, Oct 10, 2011 at 4:49 PM, MRAB <python at mrabarnett.plus.com> wrote:
> Instead of "\S" I'd recommend using "[^\]]", or using a lazy repetition
> "\S+?".

Preferably the former.  The core problem is that the regex matches
ambiguously on the problem string.  Lazy repetition doesn't remove
that ambiguity; it only changes which of the possible matches the
module tries first.
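
A rough sketch of the difference follows.  The sample string and the
bracket patterns are made up for illustration (they are not the OP's
actual input), and the anchored variants are only there to show what
happens once the rest of the pattern constrains the match:

```python
import re

# Hypothetical input: two bracketed items with no whitespace between them.
text = "[foo][bar]"

# Greedy \S+ runs across the inner "][" and stops at the last bracket.
greedy = re.search(r"\[(\S+)\]", text).group(1)

# Lazy \S+? stops at the first "]" here, because nothing forces it on.
lazy = re.search(r"\[(\S+?)\]", text).group(1)

# The negated class cannot cross a "]" at all.
negated = re.search(r"\[([^\]]+)\]", text).group(1)

# Once the pattern is anchored, the lazy version backtracks and
# expands across "][" anyway, since that is still a legal match.
lazy_anchored = re.search(r"^\[(\S+?)\]$", text)

# The negated class has no such match available, so it fails cleanly.
negated_anchored = re.search(r"^\[([^\]]+)\]$", text)

print(greedy, lazy, negated)
print(lazy_anchored.group(1), negated_anchored)
```

With "[^\]]" the unwanted match simply does not exist, whereas "\S+?"
only demotes it.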

Other notes to the OP:  Always use raw strings (r'') when writing
regex patterns, to make sure the backslashes are escape characters in
the pattern rather than in the string literal.
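
To illustrate, here is a minimal sketch using "\b", which is a
backspace character in an ordinary string literal but a word boundary
to the re module.  The pattern and test string are my own examples:

```python
import re

plain = "\bfoo\b"   # Python converts each \b to a backspace (U+0008)
raw = r"\bfoo\b"    # backslashes survive, so re sees \b as a word boundary

text = "a foo b"

# The non-raw pattern looks for backspace, "foo", backspace: no match.
plain_match = re.search(plain, text)

# The raw pattern looks for "foo" as a whole word.
raw_match = re.search(raw, text)

print(len(plain), len(raw))
print(plain_match, raw_match.group())
```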

The '^foo|bar$' construct you're using is wonky.  I think you're
writing this to mean "match if the entire string is either 'foo' or
'bar'".  But what that actually matches is "anything that either
starts with 'foo' or ends with 'bar'".  The correct way to do the
former would be either '^foo$|^bar$' or '^(?:foo|bar)$'.
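
A quick check of the two readings, using sample strings I made up for
the purpose:

```python
import re

# Alternation binds loosest: this is (^foo) OR (bar$).
loose = re.compile(r"^foo|bar$")

# Grouping the alternatives makes both anchors apply to each one.
strict = re.compile(r"^(?:foo|bar)$")

samples = ["foo", "bar", "foobaz", "bazbar", "baz"]

loose_hits = [s for s in samples if loose.search(s)]
strict_hits = [s for s in samples if strict.search(s)]

print(loose_hits)
print(strict_hits)
```

The loose pattern accepts "foobaz" and "bazbar" as well; the grouped
one accepts only "foo" and "bar".  On Python 3.4 and later,
re.fullmatch(r'foo|bar', s) is another way to require that the whole
string match.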


