Arbitrary number of groups in regex

Daniel Fackrell unlearned at DELETETHIS.learn2think.org
Thu Aug 8 16:41:35 EDT 2002


"Jean-Philippe Côté" <cotej at crt.umontreal.ca> wrote in message
news:ZkA49.1477$Tv.421894 at news20.bellglobal.com...
>
> I apologize if this is a common and/or stupid question (it probably is),
> but I can't figure it out.
>
> I'm trying to write a regular expression pattern which can return
> an arbitrary number of groups, depending on the string on
> which it is applied.
>
> For instance, if I do
> >>> import re
> >>> m = re.match("PATTERN", "abcde")
> >>> m.groups()
> I'd like to see
> ('a','b','c','d','e')
>
> and if I do
> >>> import re
> >>> m = re.match("PATTERN", "xy")
> >>> m.groups()
> I'd like to see
> ('x','y')
>
> but by using a single generic pattern, and not "(\w)(\w)(\w)(\w)(\w)" in the
> first case and "(\w)(\w)" in the second case.
>
> The way I understand "(\w)*" is <<match a single alphanumeric
> character, put it into a group, return that group and repeat as
> long as you can>>, but that doesn't work:
> >>> m = re.match("(\w)*", "abcde")
> >>> m.groups()
> ('e',)
> >>>
>
> Does anybody know what the PATTERN should be ?

For this particular case many things will work to get results like what you
want (including the trivial tuple('abcde'); note that 'abcde'.split('')
raises a ValueError for the empty separator), but how about the following
as a start under Python 2.2?

>>> import re
>>> s = 'abcde'
>>> p = r'(\w)' * len(s)
>>> m = re.match(p, s)
>>> m.groups()
('a', 'b', 'c', 'd', 'e')
>>>
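
The pattern above has to be rebuilt for every input length. As a rough
sketch of one alternative (not something the original poster asked for by
name), re.findall returns every non-overlapping match of a pattern, so no
fixed number of groups is needed. The helper name split_chars is
hypothetical and used only for illustration. Unlike re.match, findall does
not anchor at the start of the string and silently skips characters that
do not match, so the two approaches are not equivalent for inputs
containing non-word characters.

```python
import re

def split_chars(s):
    """Return a tuple of the word characters in s, one per element.

    split_chars is a hypothetical helper name, used only for illustration.
    """
    # With no capturing group in the pattern, findall returns the full
    # text of each non-overlapping match as a list of strings.
    return tuple(re.findall(r'\w', s))

# For comparison: a repeated group such as (\w)* is still a single group,
# and it keeps only the text of its last repetition.
last_only = re.match(r'(\w)*', 'abcde').groups()

print(split_chars('abcde'))
print(split_chars('xy'))
print(last_only)
```

The last line shows why the "(\w)*" attempt in the question yields a single
element: the number of groups is fixed by the pattern, not by how many
times a group repeats.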

--
Daniel Fackrell (unlearned at learn2think.org)
When we attempt the impossible, we can experience true growth.




