emulating an and operator in regular expressions

John Machin sjmachin at lexicon.net
Mon Jan 3 16:46:02 EST 2005


Terry Reedy wrote:
> "Craig Ringer" <craig at postnewspapers.com.au> wrote in message
> news:1104750397.26918.1.camel at rasputin.localnet...
> > On Mon, 2005-01-03 at 08:52, Ross La Haye wrote:
> >> How can an and operator be emulated in regular expressions in
> >> Python?
>
> Regular expressions are designed to define and detect repetition and
> alternatives.  These are easily implemented with finite state
> machines.  REs are not meant for conjunction.  'And' can be done but,
> as I remember, only messily and slowly.  The demonstration I once read
> was definitely theoretical, not practical.
>
> Python was designed for and logic (among everything else).  If you
> want practical code, use it.
>
> if match1 and match2: do whatever.
>

Provided you are careful to avoid overlapping matches, e.g. data = 'Fred
Johnson', query = ('John', 'Johnson'): both patterns match, even though
'John' is found only inside 'Johnson'.
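
A minimal sketch of the pitfall, assuming the two tests are plain
re.search calls. The helper names (both_match_naive,
both_match_lookahead, all_words_present) are made up for illustration,
and the lookahead pattern is one common way of writing a conjunction
inside a single regex, not something proposed earlier in this thread:

```python
import re

def both_match_naive(data, queries):
    """'if match1 and match2': every query matches somewhere in data."""
    return all(re.search(re.escape(q), data) for q in queries)

def both_match_lookahead(data, queries):
    """Same test as one regex: a chain of lookahead assertions."""
    pattern = ''.join(f'(?=.*{re.escape(q)})' for q in queries)
    return re.search(pattern, data, re.DOTALL) is not None

def all_words_present(data, queries):
    """Every query must equal a distinct whitespace-separated word."""
    words = data.split()
    for q in queries:
        if q not in words:
            return False
        words.remove(q)  # consume the word so it cannot satisfy two queries
    return True

data = 'Fred Johnson'
query = ('John', 'Johnson')
print(both_match_naive(data, query))      # True: 'John' matches inside 'Johnson'
print(both_match_lookahead(data, query))  # True: same overlap problem
print(all_words_present(data, query))     # False: no separate word 'John'
```

Comparing whole words rather than substrings is only one way to rule
out the overlap; whether it suits the OP's data is an assumption.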

Even this approach (A follows B or B follows A) gets tricky in the real
world of the OP, who appears to be attempting some sort of name
matching, where the word order may be scrambled. Problem is, punters
can have more than 2 words in their names, e.g. Mao Ze Dong[*], Louise
de la Valliere, and Johann Georg Friedrich von und zu Hohenlohe ... or
misreading handwriting can change the number of perceived words, e.g.
Walenkamp -> Wabu Kamp (no kidding).

[*] aka Mao Zedong aka Mao Tse Tung -- difficult enough before we start
considering variations in the order of the words.
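
To illustrate how far an order-insensitive comparison gets, here is a
rough sketch. It assumes names are split on whitespace and compared
case-insensitively; same_words is a hypothetical helper, not anything
the OP posted. It copes with scrambled word order and with names of any
length, but not with the word-count changes described above:

```python
from collections import Counter

def same_words(name_a, name_b):
    """True if both names contain the same words, in any order."""
    return Counter(name_a.lower().split()) == Counter(name_b.lower().split())

print(same_words('Mao Ze Dong', 'Dong Mao Ze'))  # True: order ignored
print(same_words('Mao Ze Dong', 'Mao Zedong'))   # False: 3 words vs 2
print(same_words('Walenkamp', 'Wabu Kamp'))      # False: misread as 2 words
```

The last two cases would need some form of approximate matching, which
is a different problem from and-ing regular expressions.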



