a regexp riddle: re.search(r'(?:(\w+), |and (\w+))+', 'whatever a, bbb, and c') =? ('a', 'bbb', 'c')

MRAB python at mrabarnett.plus.com
Thu Nov 25 15:45:02 EST 2010


On 25/11/2010 19:57, Phlip wrote:
>> Accepting input from a human is fraught with dangers and edge cases.
>
>> Here's a non-regex solution
>
> Thanks all for playing! And as usual I forgot a critical detail:
>
> I'm writing a matcher for a Morelia /viridis/ Scenario step, so the
> matcher must be a single regexp.
>
>    http://c2.com/cgi/wiki?MoreliaViridis
>
> I'm avoiding the current situation, where Morelia pulls out (.*), and
> the step handler "manually" splits that up with:
>
>    flags = re.split(r', (?:and )?', flags)
>
> That means I already had a brute-force version. A regexp version is
> always better because, especially in Morelia, it validates input. (.*)
> is less specific than (\w+).
>
> So if the step says:
>
>    Alice has crypto keys apple, barley, and flax
>
> Then the step handler could say (if this worked):
>
>    def step_user_has_crypto_keys_(self, user, *keys):
>        r'(\w+) has crypto keys (?:(\w+), )+and (\w+)'
>
>        # assert that user with those keys here
>
[snip]
You could do:

     def step_user_has_crypto_keys_(self, user, keys):
         r'(\w+) has crypto keys ((?:\w+, )+and \w+)'

to validate and capture the whole key list as one group, and then split
the keys string inside the handler.
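
A minimal sketch of that idea outside Morelia, using only the standard re module. The function name parse_keys_step is hypothetical, and calling re.match directly is an assumption, since the thread does not show how Morelia applies a step's regexp. The first pattern is the one from the quoted message; it shows why the single-regexp version falls short: a capture group inside a repetition keeps only its last match.

```python
import re

STEP = "Alice has crypto keys apple, barley, and flax"

# Pattern from the quoted message: the repeated group (\w+) is overwritten
# on every iteration, so 'apple' is lost and only 'barley' survives.
naive = re.match(r'(\w+) has crypto keys (?:(\w+), )+and (\w+)', STEP)
print(naive.groups())  # ('Alice', 'barley', 'flax')

# Suggested pattern: the whole key list is validated and captured as one group.
STEP_RE = re.compile(r'(\w+) has crypto keys ((?:\w+, )+and \w+)')


def parse_keys_step(text):
    """Hypothetical helper: return (user, [keys]) or None if the step is malformed."""
    match = STEP_RE.match(text)
    if match is None:
        return None
    user, keys_text = match.groups()
    # Same split as the existing brute-force handler in the quoted message.
    keys = re.split(r', (?:and )?', keys_text)
    return user, keys


print(parse_keys_step(STEP))  # ('Alice', ['apple', 'barley', 'flax'])
```

The regexp still rejects malformed input, so the validation stays in the matcher, and the split only ever runs on a string already known to have the "x, y, and z" shape.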


