Problem with re group in python 2.1 and 2.2

Pedro RODRIGUEZ pedro_rodriguez at club-internet.fr
Sun Jun 30 11:55:08 EDT 2002


Greetings,

While writing some unit tests for an RFC 1738 parser, I found the following
strange behaviour when using parentheses inside a named group in a regular
expression.

The expressions (?P<name>.*) and (?P<name>(.*)) don't behave the same way
when they sit inside an optional group. When that optional group fails to
match, the first form leaves the named group as None, but the second form
leaves it as an empty string.

The problem occurs with Python 2.1.1 and 2.2. Python 1.5.2, OTOH, works fine.

I checked the documentation but didn't find any hint on this. So, before
submitting a bug report, I wanted to know if I missed something.


# Code exhibiting the problem
import sys
print sys.version

import re

# [ user [ : password ] @ ] host

x = re.compile("((?P<user>(.*))(:(?P<password>.*))?@)?(?P<host>.*)") 
m = x.match("host")
print m.groupdict()

x = re.compile("((?P<user>.*)(:(?P<password>.*))?@)?(?P<host>.*)") 
m = x.match("host")
print m.groupdict()


# Outputs 1.5.2
1.5.2 (#1, Jul  5 2001, 03:02:19)  [GCC 2.96 20000731 (Red Hat Linux 7.1 2
{'password': None, 'host': 'host', 'user': None}
{'password': None, 'host': 'host', 'user': None}

# Outputs 2.1.1
2.1.1 (#1, Sep 27 2001, 13:48:33)
[GCC 2.96 20000731 (Red Hat Linux 7.1 2.96-81)]
{'user': '', 'password': None, 'host': 'host'}
{'user': None, 'password': None, 'host': 'host'}

# Outputs 2.2
2.2 (#2, Jan 15 2002, 20:55:15)
[GCC 2.96 20000731 (Red Hat Linux 7.1 2.96-96)] 
{'host': 'host', 'password': None, 'user': ''} 
{'host': 'host', 'password': None, 'user': None}
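
For anyone who wants to check their own interpreter, here is a minimal sketch of how the discrepancy above could be pinned down in a test. The helper names (differing_groups, normalized_groupdict) are hypothetical and not part of the re module, and the code assumes a Python recent enough to iterate over a dict directly. The second helper is only a possible workaround: it assumes an empty group and an unmatched group can be treated alike, which may not hold for every URL.

```python
import re

# The two patterns from the post; they differ only in the extra
# parentheses inside the named group "user".
NESTED = re.compile("((?P<user>(.*))(:(?P<password>.*))?@)?(?P<host>.*)")
PLAIN = re.compile("((?P<user>.*)(:(?P<password>.*))?@)?(?P<host>.*)")


def differing_groups(text):
    """Return the sorted names of groups whose values differ
    between the two patterns (hypothetical helper)."""
    nested = NESTED.match(text).groupdict()
    plain = PLAIN.match(text).groupdict()
    return sorted([name for name in plain if nested[name] != plain[name]])


def normalized_groupdict(pattern, text):
    """Hypothetical workaround: report an empty group as None, so
    both patterns give the same answer whatever the re version.
    Assumption: '' and None can be treated alike by the caller."""
    result = {}
    groups = pattern.match(text).groupdict()
    for name in groups:
        value = groups[name]
        if value == "":
            value = None
        result[name] = value
    return result


# An empty list means the interpreter treats both patterns alike;
# ['user'] means it shows the behaviour described in this post.
print(differing_groups("host"))
print(normalized_groupdict(NESTED, "host"))
```

Note that the workaround also turns a genuinely empty user (as in "@host") into None, so it only fits callers that don't need to tell the two cases apart.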

Regards,
Pedro


