regex in python

Jim Segrave jes at nl.demon.net
Thu May 25 07:56:35 EDT 2006


In article <1148551097.266423.141230 at j55g2000cwa.googlegroups.com>,
gisleyt <gisleyt at gmail.com> wrote:
>I'm trying to compile a perfectly valid regex, but get the error
>message:
>
> r = re.compile(r'([^\d]*)(\d{1,3}\.\d{0,2})?(\d*)(\,\d{1,3}\.\d{0,2})?(\d*)?.*')
>Traceback (most recent call last):
>  File "<stdin>", line 1, in ?
>  File "/usr/lib/python2.3/sre.py", line 179, in compile
>    return _compile(pattern, flags)
>  File "/usr/lib/python2.3/sre.py", line 230, in _compile
>    raise error, v # invalid expression
>sre_constants.error: nothing to repeat
>>>>
>
>What does this mean? I know that the regex
>([^\d]*)(\d{1,3}\.\d{0,2})?(\d*)(\,\d{1,3}\.\d{0,2})?(\d*)?.*
>is valid because I'm able to use it in Regex Coach. But is Python's
>regex syntax different from an ordinary syntax?

Your problem lies right near the end:

>>> import re
>>> r = re.compile(r'(\d*)?')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.4/sre.py", line 180, in compile
    return _compile(pattern, flags)
  File "/usr/local/lib/python2.4/sre.py", line 227, in _compile
    raise error, v # invalid expression
sre_constants.error: nothing to repeat

Since the term \d* can already be matched by the empty string, what would
it mean to ask for 0 or 1 copies of it? How is that different from 17
copies of the empty string? The trailing ? adds nothing, and this version
of sre rejects it as "nothing to repeat".

So:
r = re.compile(r'([^\d]*)(\d{1,3}\.\d{0,2})?(\d*)(\,\d{1,3}\.\d{0,2})?(\d*).*')

will be accepted.
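
To check that dropping the trailing ? changes nothing about what is
matched, here is a minimal sketch. It assumes a current Python 3
interpreter, not the 2.3/2.4 ones in the tracebacks above; newer
versions of the re module may accept (\d*)? without complaint, so the
compile is wrapped in a try/except. The helper try_compile and the
sample strings are made up for illustration.

```python
import re

def try_compile(pattern):
    """Return the compiled pattern, or None if this re module rejects it."""
    try:
        return re.compile(pattern)
    except re.error:
        return None

# Rejected by the Python 2.3/2.4 sre shown in the tracebacks; may compile elsewhere.
with_q = try_compile(r'(\d*)?')
# Same set of matches, without the redundant quantifier.
without_q = try_compile(r'(\d*)')

samples = ['', '7', '1221244', 'abc', '42abc']
for s in samples:
    matched = without_q.match(s).group(0)
    if with_q is None:
        print(repr(s), '->', repr(matched), '(original form rejected here)')
    else:
        # Where both forms compile, they consume exactly the same text.
        print(repr(s), '->', repr(matched), repr(with_q.match(s).group(0)))
```

On an interpreter that accepts both forms, the two columns should be
identical for every sample, since an optional "zero or more digits" is
still just "zero or more digits".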

>By the way, i'm using it to normalise strings like:
>
>London|country/uk/region/europe/geocoord/32.3244,42,1221244
>to:
>London|country/uk/region/europe/geocoord/32.32,42,12
>
>By using \1\2\4 as replace. I'm open for other suggestions to achieve
>this!

But your regex looks for a string followed by two floats, while your
sample input is a string, a float, a comma, an integer, a comma and
another integer. If you actually mean the input is

London|country/uk/region/europe/geocoord/32.3244,42.1221244

and you want to convert it to:

London|country/uk/region/europe/geocoord/32.32,42.12

then the above regex will work.
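
As a rough check of both cases, the sketch below applies the corrected
regex with \1\2\4 as the replacement to the two-float input and to the
input as originally posted. It assumes Python 3.5 or later, where a
group that did not take part in the match is substituted as an empty
string; the function name normalise is only illustrative.

```python
import re

PATTERN = re.compile(
    r'([^\d]*)(\d{1,3}\.\d{0,2})?(\d*)(\,\d{1,3}\.\d{0,2})?(\d*).*'
)

def normalise(line):
    # count=1 because the pattern can also match the empty string,
    # so one substitution per line is all that is wanted.
    return PATTERN.sub(r'\1\2\4', line, count=1)

two_floats = 'London|country/uk/region/europe/geocoord/32.3244,42.1221244'
as_posted = 'London|country/uk/region/europe/geocoord/32.3244,42,1221244'

print(normalise(two_floats))
# London|country/uk/region/europe/geocoord/32.32,42.12

print(normalise(as_posted))
# London|country/uk/region/europe/geocoord/32.32
```

With the input as posted, group 4 needs a "." after ",42" and finds a
"," instead, so it does not match and everything after the first float
is dropped.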

-- 
Jim Segrave           (jes at jes-2.demon.nl)



