a regexp riddle: re.search(r'(?:(\w+), |and (\w+))+', 'whatever a, bbb, and c') =? ('a', 'bbb', 'c')

Alice Bevan–McGregor alice at gothcandy.com
Thu Nov 25 05:00:11 EST 2010


Accepting input from a human is fraught with dangers and edge cases.  ;)

Some time ago I wrote a regular expression generator that creates 
regexen that can parse arbitrarily delimited text, supports quoting (to 
avoid accidentally separating two elements that should be treated as 
one), and works in both directions (text<->native).

The code that generates the regex is heavily commented:

	https://github.com/pulp/marrow.util/blob/master/marrow/util/convert.py#L123-234

You should be able to use this as-is and simply handle the optional 
'and' on the last element yourself.  You can even create an instance 
of the class with the options you want, then get the generated regular 
expression by running print(parser.pattern).

Note that I have friends who use 'and' multiple times when describing 
lists of things.  :P
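
For the 'and' handling on its own, here is a minimal sketch that uses 
only the standard re module, not the marrow.util parser. A repeated 
capturing group in re keeps only its last match, so the pattern in the 
subject line cannot return all three items; splitting on the separators 
is one way around that. The names ITEM_SEPARATOR and split_list are 
hypothetical, and the sketch assumes elements are separated by commas 
and/or the word 'and', with no quoting support.

```python
import re

# Hypothetical helper, not part of marrow.util.
# A separator is either a comma (optionally followed by 'and'),
# or a bare 'and' surrounded by whitespace.
ITEM_SEPARATOR = re.compile(r',\s*(?:and\s+)?|\s+and\s+')


def split_list(text):
    """Split 'a, bbb, and c' style input into its elements."""
    items = []
    for chunk in ITEM_SEPARATOR.split(text):
        chunk = chunk.strip()
        if chunk:  # skip empty pieces, e.g. from a trailing comma
            items.append(chunk)
    return items


print(split_list('a, bbb, and c'))  # ['a', 'bbb', 'c']
```

Because 'and' is treated as just another separator, input such as 
'a and b and c' or 'a, and b, and c' yields the same three elements.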

	— Alice.



