Regular expression to match whole words.

Simon Brunning SBrunning at trisystems.co.uk
Wed Sep 27 04:44:02 EDT 2000


I'm trying to build a regular expression to match a list of whole words.

If I don't care about whole words, it's easy - I just use:

import re

words = ['Spam', 'egg', 'chips']
rePattern = '|'.join(map(re.escape, words))

... and it's fine. The problem with this is that it will also match inside
longer words such as 'smegg'. So I tried this:

rePattern = (r'''[\s('"]'''
             + r'''[\s,.;:)'"]|[\s('"]'''.join(map(re.escape, words))
             + r'''[\s,.;:)'"]''')

... the idea being to match words only when they have a space, bracket or so
on in front and behind. This has two problems (aside from being *very*
ugly):

*	It won't match words at the beginning or end of the string, because
they *don't* have a space before/after them.
*	It won't match two words from the list in a row, because the first
match 'consumes' the space.
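
To make these problems concrete, here is a small sketch of what I'm seeing. The sample strings and the names before, after, loose and strict are made up purely for illustration; the patterns are meant to be equivalent to the two attempts above.

```python
import re

words = ['Spam', 'egg', 'chips']

# First attempt: plain alternation, which also matches inside longer words.
loose = re.compile('|'.join(map(re.escape, words)))
loose_hit = loose.search('smegg')   # finds 'egg' inside 'smegg'

# Second attempt: each word must have a delimiter character on both sides.
# 'before' and 'after' are illustrative names for the two character classes.
before = r'''[\s('"]'''
after = r'''[\s,.;:)'"]'''
strict = re.compile('|'.join(before + re.escape(w) + after for w in words))

# Problem 1: nothing precedes 'Spam' and nothing follows 'chips',
# so neither word is found.
at_edges = strict.findall('Spam and chips')

# Problem 2: the space after 'Spam' is consumed by the first match,
# so 'egg' has no leading delimiter left and is skipped.
in_a_row = strict.findall('(Spam egg chips)')

print(loose_hit.group())  # egg
print(at_edges)           # []
print(in_a_row)           # ['(Spam ', ' chips)']
```

Note that the matches in the last case also include the delimiter characters themselves, which isn't really what I want either.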

Can anyone help me out?

Cheers,
Simon Brunning
TriSystems Ltd.
sbrunning at trisystems.co.uk

P.S. You've all been very helpful already. Thanks.




