Regexps and lists

Paddy paddy3118 at googlemail.com
Sun Feb 11 17:08:24 EST 2007


I don't know enough to write an R.E. engine so forgive me if I am
being naive.
I have had to match text involving lists in the past. These are usually
comma-separated words such as
 "egg,beans,ham,spam,spam"
You can match that with:
 r"(\w+)(,\w+)*"
and when you look at the groups you get the following:

>>> import re
>>> re.match(r"(\w+)(,\w+)*", "egg,beans,ham,spam,spam").groups()
('egg', ',spam')
>>>

Notice how you only get the last match as the second group's value.
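
For comparison, the stock re module can already recover every item, just
not through groups(). The following is only a minimal sketch, assuming the
whole string is expected to be a comma-separated list of words; the
variable names are illustrative and not part of any proposal.

```python
import re

text = "egg,beans,ham,spam,spam"

# A repeated group keeps only its final capture.
m = re.match(r"(\w+)(,\w+)*", text)
last_only = m.groups()

# Check the overall shape first, then pull out each word separately.
if re.fullmatch(r"\w+(?:,\w+)*", text):
    items = re.findall(r"\w+", text)
else:
    items = []

print(last_only)  # ('egg', ',spam')
print(items)      # ['egg', 'beans', 'ham', 'spam', 'spam']
```

This needs two passes over the text, which is the inconvenience the idea
below is meant to remove.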

It would be nice if a repeat operator acting on a group turned that
group into a sequence returning every match, in order (or an empty
sequence for no matches).

The above example would become:

>>> import re
>>> re.newmatch(r"(\w+)(,\w+)*", "egg,beans,ham,spam,spam").groups()
('egg', (',beans', ',ham', ',spam', ',spam'))
>>>
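
The behaviour described above can be approximated with the existing re
module by matching the leading part once and then matching the repeated
part in a loop. This is a rough sketch under stated assumptions, not an
implementation of re.newmatch (which does not exist): list_groups is a
hypothetical helper, it takes the leading and repeated sub-patterns
separately, and it does not backtrack between them the way a real engine
would.

```python
import re

def list_groups(head, item, text):
    """Hypothetical helper: return (head capture, tuple of every item capture).

    Returns None when the head pattern does not match at the start of text.
    """
    head_re = re.compile(head)
    item_re = re.compile(item)
    m = head_re.match(text)
    if m is None:
        return None
    pos = m.end()
    captures = []
    while True:
        nxt = item_re.match(text, pos)
        # Stop on failure, or on an empty match that would loop forever.
        if nxt is None or nxt.end() == pos:
            break
        captures.append(nxt.group(1))
        pos = nxt.end()
    return (m.group(1), tuple(captures))

result = list_groups(r"(\w+)", r"(,\w+)", "egg,beans,ham,spam,spam")
print(result)  # ('egg', (',beans', ',ham', ',spam', ',spam'))

# No repeats gives an empty sequence rather than None.
single = list_groups(r"(\w+)", r"(,\w+)", "egg")
print(single)  # ('egg', ())
```

Each capture keeps its leading comma because the comma sits inside the
group; writing the repeated part as r",(\w+)" would drop it.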

1. Is it possible? Do any other RE engines do this?
2. Should it be added to Python?

- Paddy.



