regex in python
Jim Segrave
jes at nl.demon.net
Thu May 25 07:56:35 EDT 2006
In article <1148551097.266423.141230 at j55g2000cwa.googlegroups.com>,
gisleyt <gisleyt at gmail.com> wrote:
>I'm trying to compile a perfectly valid regex, but get the error
>message:
>
> r = re.compile(r'([^\d]*)(\d{1,3}\.\d{0,2})?(\d*)(\,\d{1,3}\.\d{0,2})?(\d*)?.*')
>Traceback (most recent call last):
> File "<stdin>", line 1, in ?
> File "/usr/lib/python2.3/sre.py", line 179, in compile
> return _compile(pattern, flags)
> File "/usr/lib/python2.3/sre.py", line 230, in _compile
> raise error, v # invalid expression
>sre_constants.error: nothing to repeat
>>>>
>
>What does this mean? I know that the regex
>([^\d]*)(\d{1,3}\.\d{0,2})?(\d*)(\,\d{1,3}\.\d{0,2})?(\d*)?.*
>is valid because I'm able to use it in Regex Coach. But is Python's
>regex syntax different from the ordinary syntax?
Your problem lies right near the end:
>>> import re
>>> r = re.compile(r'(\d*)?')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.4/sre.py", line 180, in compile
    return _compile(pattern, flags)
  File "/usr/local/lib/python2.4/sre.py", line 227, in _compile
    raise error, v # invalid expression
sre_constants.error: nothing to repeat
Since the term \d* can be matched by the empty string, what would it
mean to ask for 0 or 1 copies of the empty string? How is that
different from 17 copies of the empty string?
So:
r = re.compile(r'([^\d]*)(\d{1,3}\.\d{0,2})?(\d*)(\,\d{1,3}\.\d{0,2})?(\d*).*')
will be accepted.
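As a rough check, the sketch below tries to compile both forms of the
pattern and reports which ones the local re module accepts. The helper
name compiles is made up for illustration. The traceback above comes
from Python 2.4; other interpreter versions may well accept the
original pattern too, so no result is assumed for it here.

```python
import re

# The pattern as originally posted, ending in (\d*)?
ORIGINAL = r'([^\d]*)(\d{1,3}\.\d{0,2})?(\d*)(\,\d{1,3}\.\d{0,2})?(\d*)?.*'
# The same pattern with the trailing ? removed from the last group
FIXED = r'([^\d]*)(\d{1,3}\.\d{0,2})?(\d*)(\,\d{1,3}\.\d{0,2})?(\d*).*'


def compiles(pattern):
    """Return True if re.compile accepts the pattern (hypothetical helper)."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


# \d* on its own already matches the empty string, which is why making
# it optional as well adds nothing.
empty = re.match(r'\d*', 'abc').group()

print("original accepted: %s" % compiles(ORIGINAL))
print("fixed accepted:    %s" % compiles(FIXED))
print("\\d* matched %r at the start of 'abc'" % empty)
```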
>By the way, i'm using it to normalise strings like:
>
>London|country/uk/region/europe/geocoord/32.3244,42,1221244
>to:
>London|country/uk/region/europe/geocoord/32.32,42,12
>
>By using \1\2\4 as replace. I'm open for other suggestions to achieve
>this!
But you're looking for a string followed by two floats and your sample
input is a string, a float, an integer, a comma and another
integer. If you actually mean the input is
London|country/uk/region/europe/geocoord/32.3244,42.1221244
and you want to convert it to:
London|country/uk/region/europe/geocoord/32.32,42.12
then the above regex will work.
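For the substitution itself, here is a minimal sketch, assuming the
input really contains two floats as in the corrected sample. The
function name normalise is hypothetical; it joins groups 1, 2 and 4 by
hand instead of passing \1\2\4 to re.sub, so that an optional group
which did not take part in the match contributes an empty string.

```python
import re

# Corrected pattern: the last group is (\d*) with no trailing ?
PATTERN = re.compile(
    r'([^\d]*)(\d{1,3}\.\d{0,2})?(\d*)(\,\d{1,3}\.\d{0,2})?(\d*).*'
)


def normalise(line):
    """Keep the leading text and truncate both floats to two decimals.

    Hypothetical helper, not part of any library.
    """
    # Every part of the pattern is optional, so match() always succeeds.
    m = PATTERN.match(line)
    # Groups 2 and 4 are optional and may be None; treat those as ''.
    return ''.join(g or '' for g in m.group(1, 2, 4))


sample = 'London|country/uk/region/europe/geocoord/32.3244,42.1221244'
print(normalise(sample))
```

With the sample line above this returns
London|country/uk/region/europe/geocoord/32.32,42.12. With the sample
as first posted (ending in 42,1221244), group 4 does not match and the
second number is dropped, which is why the input format matters.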
--
Jim Segrave (jes at jes-2.demon.nl)