a simple regex question

Roel Schroeven rschroev_nospam_ml at fastmail.fm
Sat Apr 1 04:59:08 EST 2006


John Salerno schreef:
>> pattern = '([a-z][A-Z]{3}[a-z][A-Z]{3}[a-z])+'
>> print re.search(pattern, mess).groups()
>>
>> Anyway, this returns one matching string, but when I put this letter in 
>> as the solution to the problem, I get a message saying "yes, but there 
>> are more", so assuming this means that there is more than one character 
>> with three caps on either side, is my RE written correctly to find them 
>> all? I didn't have the parentheses or + sign at first, but I added them 
>> to find all the possible matches, but still only one comes up.
>>
>> Thanks.
> 
> A quick note: I found nine more matches by using findall() instead of 
> search(), but I'm still curious how to write the RE so that it works 
> with search, especially since findall wouldn't have returned overlapping 
> matches. I guess I didn't write it to properly check multiple times.

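It seems to me search() can only get you part of the way. The pattern 
you mention above only repeats when the matches come right after each 
other, as in
xXXXxXXXxyYYYyYYYyzZZZzZZZz
and even then groups() gives you only the last repetition, because a 
repeated group remembers just its final match.

You could try something more like
pattern = '([a-z][A-Z]{3}[a-z][A-Z]{3}[a-z]+)+'
to allow for matches that are further apart, but as far as I can tell 
that doesn't help either: the greedy [a-z]+ also consumes the lowercase 
letter that the next match would have to start with, so the group never 
repeats.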
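To check how the two search() patterns above behave, here is a small 
sketch. The first test string is the one from this message; the second 
("spaced") is made up for illustration and is not taken from the 
challenge data. The variable names are mine as well.

```python
import re

# Patterns taken from the discussion above.
original = r'([a-z][A-Z]{3}[a-z][A-Z]{3}[a-z])+'
relaxed = r'([a-z][A-Z]{3}[a-z][A-Z]{3}[a-z]+)+'

# 'adjacent' is the example string from this message; 'spaced' is a
# made-up string with two candidate matches separated by 'abc'.
adjacent = 'xXXXxXXXxyYYYyYYYyzZZZzZZZz'
spaced = 'xXXXxXXXxabcyYYYyYYYy'

m1 = re.search(original, adjacent)
adjacent_whole = m1.group(0)    # everything the repeated group covered
adjacent_groups = m1.groups()   # only the last repetition is kept

m2 = re.search(relaxed, spaced)
spaced_whole = m2.group(0)
spaced_groups = m2.groups()

print(adjacent_whole)
print(adjacent_groups)
print(spaced_whole)
print(spaced_groups)
```

On these inputs the repeated group reports a single string in both 
cases, so search() alone does not hand back a list of matches.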

In any case, I think findall() is a better solution for this problem. 
search() won't find overlapping matches either, so that's no reason to 
avoid findall(), and the pattern is simpler with findall(). I solved 
this challenge with findall() and this regular expression:

pattern = r'[a-z][A-Z]{3}[a-z][A-Z]{3}[a-z]'
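As a rough sketch of how that looks in practice: the sample string below 
is invented for illustration (it is not the challenge data), and the 
second pattern, which wraps the same expression in a lookahead to pick 
up overlapping matches as well, is my own addition rather than anything 
the challenge needs.

```python
import re

pattern = r'[a-z][A-Z]{3}[a-z][A-Z]{3}[a-z]'

# Hypothetical sample text; the first three candidate matches share
# letters, so they overlap each other.
sample = 'aBCDeFGHiJKLmNOPq--xXXXxXXXx'

# findall() resumes scanning after the end of each match, so matches
# that share characters with an earlier match are skipped.
non_overlapping = re.findall(pattern, sample)

# Assumption: if overlapping matches are wanted, one option is a
# zero-width lookahead with the real pattern in a capturing group.
# The lookahead consumes nothing, so every start position is tried.
overlapping = re.findall(r'(?=(' + pattern + r'))', sample)

print(non_overlapping)
print(overlapping)
```

With this sample, the plain pattern returns two matches and the 
lookahead version returns four.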


-- 
If I have been able to see further, it was only because I stood
on the shoulders of giants.  -- Isaac Newton

Roel Schroeven


