a regexp riddle: re.search(r'(?:(\w+), |and (\w+))+', 'whatever a, bbb, and c') =? ('a', 'bbb', 'c')

MRAB python at mrabarnett.plus.com
Thu Nov 25 11:16:15 EST 2010


On 25/11/2010 04:46, Phlip wrote:
> HypoNt:
>
> I need to turn a human-readable list into a list():
>
>     print re.search(r'(?:(\w+), |and (\w+))+', 'whatever a, bbb, and c').groups()
>
> That currently returns ('c',). I'm trying to match "any word \w+
> followed by a comma, or a final word preceded by and."
>
> The match returns 'a, bbb, and c', but the groups return ('bbb', 'c').
> What do I type for .groups() to also get the 'a'?
>
> Please go easy on me (and no RTFM!), because I have only been using
> regular expressions for about 20 years...
>
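A repeated capture group keeps only its last match, so .groups() can't
give you the 'a'. Drop the outer repetition and try re.findall: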

     >>> re.findall(r'(\w+), |and (\w+)', 'whatever a, bbb, and c')
     [('a', ''), ('bbb', ''), ('', 'c')]

You can get a list of strings like this:

     >>> [x or y for x, y in re.findall(r'(\w+), |and (\w+)', 'whatever a, bbb, and c')]
     ['a', 'bbb', 'c']
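
For comparison, here is a minimal sketch that runs both approaches on the
same string. It assumes Python 3 syntax (print as a function), and the
names ITEM and parse_human_list are made up for illustration; they are not
part of the re module.

```python
import re

# Same alternation as in the findall call, without the outer (?:...)+ repetition.
# ITEM is a hypothetical name used only in this sketch.
ITEM = re.compile(r'(\w+), |and (\w+)')


def parse_human_list(text):
    """Hypothetical helper: pull the words out of a list like 'a, bbb, and c'."""
    # findall returns one (group1, group2) tuple per match; for this pattern
    # exactly one of the two is non-empty, so "or" picks the word that matched.
    return [first or second for first, second in ITEM.findall(text)]


text = 'whatever a, bbb, and c'

# With the repetition, the overall match spans the whole list, but each
# group only reports the text from the last iteration in which it took part.
repeated = re.search(r'(?:(\w+), |and (\w+))+', text)
print(repeated.group(0))   # a, bbb, and c
print(repeated.groups())   # ('bbb', 'c')

# One match per item instead of one match for the whole list.
print(parse_human_list(text))  # ['a', 'bbb', 'c']
```

One caveat with this sketch: the pattern has no word boundary before
"and", so a word that merely ends in "and" followed by a space and
another word (for example "sand castle") would also produce a match.
Using r'\band (\w+)' for the second alternative is one way to tighten
that, if it matters for your input.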


