regexp upward compatibility bug ?
Jeff Epler
jepler at unpythonic.net
Thu Jan 29 10:19:47 EST 2004
The problem is the use of '-' inside the character sets, as in
r'[\w-]'
Here's what the library reference manual has to say:
[]
Used to indicate a set of characters. Characters can be listed
individually, or a range of characters can be indicated by giving
two characters and separating them by a "-". Special characters are
not active inside sets. For example, [akm$] will match any of the
characters "a", "k", "m", or "$"; [a-z] will match any lowercase
letter, and [a-zA-Z0-9] matches any letter or digit. Character
classes such as \w or \S (defined below) are also acceptable inside
a range. If you want to include a "]" or a "-" inside a set, precede
it with a backslash, or place it as the first character. The pattern
[]] will match ']', for example.
http://www.python.org/doc/current/lib/re-syntax.html
So you may want to write r'[-\w]' or r'[\w\-]' instead, based on my
reading.
The same goes for the later part of the pattern, [\w-\.?=], which could
be written r'[-\w\.?=]' or r'[\w\-\.?=]'.
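
As a rough check of the two spellings suggested above, here is a short
Python sketch. The variable names and sample characters are only for
illustration, and r'[-\w.?=]' is my assumed rewrite of the later set, not
something taken from the original code. Whether the original spelling
compiles depends on the Python version, so the sketch only records the
outcome rather than assuming one.

```python
import re

# Hyphen placed first, or escaped, so it cannot be read as a range operator.
hyphen_first = re.compile(r'[-\w]')
hyphen_escaped = re.compile(r'[\w\-]')

# Characters from a small sample that each spelling accepts.
sample = '-a_9.'
first_hits = [ch for ch in sample if hyphen_first.match(ch)]
escaped_hits = [ch for ch in sample if hyphen_escaped.match(ch)]

# Assumed rewrite of the later set; '.' needs no backslash inside a set.
fixed_later = re.compile(r'[-\w.?=]')
later_hits = [ch for ch in '-a.?=/' if fixed_later.match(ch)]

# The original spelling: record whether this interpreter accepts it.
try:
    re.compile(r'[\w-\.?=]')
    original_compiles = True
except re.error:
    original_compiles = False

print(first_hits, escaped_hits, later_hits, original_compiles)
```

Both rewritten sets should accept the hyphen and the word characters and
reject '.', so the first two lists should come out the same.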
Jeff