regex alternation problem

Robert Kern robert.kern at gmail.com
Fri Apr 17 18:03:48 EDT 2009


On 2009-04-17 16:49, Jesse Aldridge wrote:
> import re
>
> s1 = "I am an american"
>
> s2 = "I am american an "
>
> for s in [s1, s2]:
>      print re.findall(" (am|an) ", s)
>
> # Results:
> # ['am']
> # ['am', 'an']
>
> -------
>
> I want the results to be the same for each string.  What am I doing
> wrong?

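findall() only finds non-overlapping matches. In s1, the match " am " consumes
the space after "am", so that space is no longer available to begin a match of
" an ". " am  an " (two spaces) would work, but not " am an ".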
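
To see which characters each match consumes, here is a minimal sketch using re.finditer() and the match spans. It is written for Python 3 (print as a function), and the names "spaced", "single_spans" and "double_result" are mine, not from the original post.

```python
import re

s1 = "I am an american"        # one space between "am" and "an"
spaced = "I am  an american"   # two spaces between "am" and "an"

pattern = re.compile(r" (am|an) ")

# Each match consumes its leading AND trailing space, so the next
# search starts after the trailing space of the previous match.
single_spans = [(m.group(1), m.span()) for m in pattern.finditer(s1)]
double_result = pattern.findall(spaced)

print(single_spans)    # only "am" is found; its span includes the space at index 4
print(double_result)   # both words are found, since each has its own spaces
```

With a single space, the span of " am " ends at index 5, and the only space that could precede "an" sits at index 4, inside that span.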

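Instead of including explicit spaces in your pattern, I suggest using the \b
"word boundary" special sequence. It matches at the edge of a word without
consuming any characters, so adjacent matches do not compete for the same space.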

 >>> for s in [s1, s2]:
...     print re.findall(r"\b(am|an)\b", s)
...
['am', 'an']
['am', 'an']
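
For reference, a Python 3 sketch of the same idea. The second pattern is an assumed alternative that is not part of the suggestion above: it keeps the literal leading space but only looks ahead at the trailing one with (?= ), so that space is never consumed. The names "boundary" and "lookahead" are hypothetical.

```python
import re

s1 = "I am an american"
s2 = "I am american an "

# The raw string matters: in a plain "..." literal, \b is a backspace
# character rather than the regex word-boundary assertion.
boundary = re.compile(r"\b(am|an)\b")

# Assumed alternative: require a following space without consuming it.
lookahead = re.compile(r" (am|an)(?= )")

boundary_results = [boundary.findall(s) for s in (s1, s2)]
lookahead_results = [lookahead.findall(s) for s in (s1, s2)]

print(boundary_results)
print(lookahead_results)
```

The two patterns are not equivalent in general: \b also matches next to punctuation and at the start or end of the string, while the lookahead version still insists on literal spaces on both sides.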

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco



