Regexp: unexpected splitting of string in several groups

Piet van Oostrum piet at cs.uu.nl
Wed Jun 2 17:59:00 EDT 2004


>>>>> pit.grinja at gmx.de (Piet) (P) wrote:

P> vartypePattern = re.compile("([a-zA-Z]+)(\(.*\))*([^(].*[^)])")

P> However, simple one-string expressions like
P> vartypeSplit = vartypePattern.match("float")
P> are always split into two strings. The result is:
P> vartypeSplit.groups() = ('flo', None, 'at').
P> I would have either expected ('float',None,None) or ('float','','').
P> For other strings, the last two characters are also found in a
P> separate group.
P> Is this a bug or a feature? ;-)

It is a feature:
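The last part, [^(].*[^)], says: a character which is not (, possibly more
characters, and then a character which is not ). So it needs at least two
characters. The greedy [a-zA-Z]+ first takes all of "float", but then the
last part has nothing left to match, so the engine backtracks until
[a-zA-Z]+ keeps only "flo" and the last part gets "at".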
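
To see why the match splits the way it does, here is a small sketch that
runs the pattern from the question. The variable names and the extra inputs
"ab" and "abc" are my own additions for illustration; I also wrote the
pattern as a raw string, which does not change its meaning.

```python
import re

# Pattern from the question, written as a raw string.
vartype_pattern = re.compile(r"([a-zA-Z]+)(\(.*\))*([^(].*[^)])")

# Group 3 needs at least two characters, so group 1 has to give some back.
groups = vartype_pattern.match("float").groups()
print(groups)

# Illustrative inputs (not from the original post):
# two characters are too few, because group 1 needs one and group 3 needs two.
too_short = vartype_pattern.match("ab")
print(too_short)

# With three characters, group 1 keeps only the first one.
three = vartype_pattern.match("abc").groups()
print(three)
```

So the last group always takes the final two characters whenever there is
no parenthesized part in between.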

Maybe you mean something like [^()]*
Or would you like to accept )xxx) or )yyy(?
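
A minimal sketch of that suggestion, assuming the rest of the pattern stays
as it is and only the last part is replaced by [^()]*. The input
"array(10)" is a made-up example, since I do not know what the real
strings look like.

```python
import re

# Same first two groups as in the question; last group may now be empty.
relaxed_pattern = re.compile(r"([a-zA-Z]+)(\(.*\))*([^()]*)")

# The last group can match the empty string, so group 1 keeps "float".
simple = relaxed_pattern.match("float").groups()
print(simple)

# Hypothetical input with a parenthesized part.
with_parens = relaxed_pattern.match("array(10)").groups()
print(with_parens)
```

Note that an unused optional group still comes back as None, while the
last group comes back as the empty string.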
-- 
Piet van Oostrum <piet at cs.uu.nl>
URL: http://www.cs.uu.nl/~piet [PGP]
Private email: P.van.Oostrum at hccnet.nl


