regex alternation problem
Robert Kern
robert.kern at gmail.com
Fri Apr 17 18:03:48 EDT 2009
On 2009-04-17 16:49, Jesse Aldridge wrote:
> import re
>
> s1 = "I am an american"
>
> s2 = "I am american an "
>
> for s in [s1, s2]:
>     print re.findall(" (am|an) ", s)
>
> # Results:
> # ['am']
> # ['am', 'an']
>
> -------
>
> I want the results to be the same for each string. What am I doing
> wrong?
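findall() finds non-overlapping matches. Each match of " (am|an) " consumes
its trailing space, so in " am an " nothing is left to serve as the leading
space of " an ". " am  an " (with two spaces between the words) would work,
but not " am an ".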
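To make the overlap visible, here is a small sketch (written in Python 3
syntax, unlike the session quoted in this thread; the name `spans` is only for
illustration) that uses re.finditer() to show where the single match in s1
starts and ends:

```python
import re

s1 = "I am an american"

# Each match of " (am|an) " consumes both its leading and trailing space.
spans = [(m.group(1), m.span()) for m in re.finditer(" (am|an) ", s1)]
print(spans)

# The next search resumes at the end of the previous match (index 5).
# What remains starts with "an", with no space in front of it, so the
# pattern cannot match it.
print(repr(s1[5:]))
```

The match covers indices 1 through 4, so the space at index 4 has already
been used by the time the search reaches "an".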
Instead of including explicit spaces in your pattern, I suggest using the \b
"word boundary" special sequence, which matches at the edge of a word without
consuming any characters.
>>> for s in [s1, s2]:
...     print re.findall(r"\b(am|an)\b", s)
...
['am', 'an']
['am', 'an']
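If the words really must be delimited by literal spaces rather than by any
word boundary, one possible alternative (an assumption on my part, not
something requested in the original question) is to keep the leading space
in the pattern and turn the trailing one into a lookahead, so that it is
checked but not consumed. A minimal sketch in Python 3 syntax:

```python
import re

s1 = "I am an american"
s2 = "I am american an "

# (?= ) asserts that a space follows without consuming it, so the same
# space can still serve as the leading space of the next match.
pattern = re.compile(r" (am|an)(?= )")

results = [pattern.findall(s) for s in (s1, s2)]
print(results)
```

Unlike \b, this version will not match a word at the very start or end of
the string, or one that is followed by punctuation, since it insists on an
actual space on both sides.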
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco