When not to use an RE -- an example

John Machin sjmachin at lexicon.net
Sat Apr 19 20:15:19 EDT 2003


On 20 Apr 2003 00:44:22 +0100, Alexander Schmolck <a.schmolck at gmx.net>
wrote:

>sjmachin at lexicon.net (John Machin) writes:
>
>> I needed a check for strings consisting of repeated characters -- like
>> when users type "ZZZZZZZ" instead of "UNKNOWN" into a database field.
>> After implementing the obvious overlapping-substring comparison, I got
>> to thinking how this could be done with REs. The following resulted:
>> 
>> import re
>> repeats1 = re.compile(r"^(?:(.)(?=\1))+\1\Z", re.DOTALL).match
>> def repeats2(s):
>>    return len(s) > 1 and s[1:] == s[:-1]
>> for testvalue, expected in zip(
>>    ['','x','xx','xxx','xxxxxx','xy','xxy','xyy','\n\n\n','aaa\n'],
>>    [0,  0,  1,   1,    1,       0,   0,    0,    1,       0     ]):
>>    print repr(testvalue), not not repeats1(testvalue), repeats2(testvalue), expected
>> 
>> Pergly/phugly, eh? Note the effort required to ensure the newline
>> cases worked.
>
>
>What's wrong with:
>
>>>> matchRepetition = re.compile(r'(.)\1+\Z', re.DOTALL).match
>>>> map(bool, map(matchRepetition,
>...  ['','x','xx','xxx','xxxxxx','xy','xxy','xyy','\n\n\n','aaa\n']))
>[0, 0, 1, 1, 1, 0, 0, 0, 1, 0]
>
>?

(1) Too simple. (2) I didn't think of it. (3) Probably runs a lot
faster, too.

I'll get back into my box now. :-)
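
For anyone who wants to rerun the comparison, here is a minimal Python 3
sketch of the three checks from this thread, run over the same test values.
The names repeats_lookahead, repeats_backref, repeats_slice, check_all and
time_checks are made up for this illustration and appear nowhere in the
original posts. The timing helper only measures; no speed result is claimed.

```python
import re
import timeit

# The original lookahead-based RE from the first post.
repeats_lookahead = re.compile(r"^(?:(.)(?=\1))+\1\Z", re.DOTALL).match

# The simpler back-reference RE suggested in the reply.
repeats_backref = re.compile(r"(.)\1+\Z", re.DOTALL).match


def repeats_slice(s):
    """Overlapping-substring comparison: no RE needed."""
    return len(s) > 1 and s[1:] == s[:-1]


TEST_VALUES = ['', 'x', 'xx', 'xxx', 'xxxxxx', 'xy', 'xxy', 'xyy',
               '\n\n\n', 'aaa\n']
EXPECTED = [False, False, True, True, True, False, False, False,
            True, False]


def check_all():
    """Return one (lookahead, backref, slice) tuple of bools per test value."""
    return [
        (bool(repeats_lookahead(v)), bool(repeats_backref(v)), repeats_slice(v))
        for v in TEST_VALUES
    ]


def time_checks(sample="Z" * 50, number=100000):
    """Time each check on one sample string; returns seconds keyed by name."""
    checks = {
        "lookahead": repeats_lookahead,
        "backref": repeats_backref,
        "slice": repeats_slice,
    }
    return {
        name: timeit.timeit(lambda f=func: f(sample), number=number)
        for name, func in checks.items()
    }


if __name__ == "__main__":
    for value, row, expected in zip(TEST_VALUES, check_all(), EXPECTED):
        print(repr(value), row, expected)
    print(time_checks())
```

Both REs use match(), so they are anchored at the start of the string, and
\Z together with re.DOTALL is what keeps the '\n\n\n' and 'aaa\n' cases
honest.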




