Help with Regular Expressions

Paul McGuire ptmcg at austin.rr.com
Wed Aug 10 09:27:18 EDT 2005


If your regular-expression needs get more complicated, you could take a
look at pyparsing.  The code is a bit more verbose, but many find it easier to
compose their expressions using pyparsing's classes, such as Literal,
OneOrMore, Optional, etc., plus a number of built-in helper functions
and expressions, including delimitedList, quotedString, and
cStyleComment.  Pyparsing is intended for writing recursive-descent
parsers, but can also be used (and is best learned) with simple
applications such as this one.
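
As a minimal sketch of how the classes named above compose (this assumes
pyparsing is installed; the sample strings and variable names are made up
purely for illustration):

```python
from pyparsing import (Literal, OneOrMore, Optional, Word,
                       alphas, nums, delimitedList, quotedString)

# Literal matches an exact string; OneOrMore repeats an expression.
greeting = Literal("Hello") + OneOrMore(Word(alphas))
greeting_tokens = greeting.parseString("Hello big wide World").asList()

# Optional lets a piece of the grammar be absent.
signed_int = Optional(Literal("-")) + Word(nums)
negative_tokens = signed_int.parseString("-42").asList()
positive_tokens = signed_int.parseString("42").asList()

# delimitedList matches items separated by a delimiter (a comma by
# default) and drops the delimiters from the results.
number_tokens = delimitedList(Word(nums)).parseString("1, 2, 3").asList()

# quotedString matches a single- or double-quoted string and keeps the
# quote characters in the result.
quoted_tokens = quotedString.parseString('"a b" trailing text').asList()

print(greeting_tokens)   # ['Hello', 'big', 'wide', 'World']
print(negative_tokens)   # ['-', '42']
print(positive_tokens)   # ['42']
print(number_tokens)     # ['1', '2', '3']
print(quoted_tokens)     # ['"a b"']
```

Note that parseString stops at the end of the match by default, which is
why the trailing text after the quoted string is simply ignored.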

Here is a simple script for parsing your e-mail addresses.  Note the
use of results names to give you access to the individual parsed fields
(regular expressions offer a similar capability through named groups).
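
For comparison, the similar capability in the re module might look roughly
like this.  It is only a sketch: the pattern and the split_address helper
are hypothetical names, and the pattern is an approximation of the
pyparsing grammar rather than an exact equivalent.

```python
import re

# Hypothetical pattern: a user word, then an optional "@host" part made
# of dot-separated words.  \w covers letters, digits and "_" (and, in
# Python 3, other Unicode word characters as well).
EMAIL_RE = re.compile(r"(?P<user>\w+)(?:@(?P<host>\w+(?:\.\w+)*))?")


def split_address(text):
    """Return (user, host), or None if text is not a simple address.

    host is '' when the address has no "@host" part.  fullmatch is
    stricter than the pyparsing script, which ignores trailing text.
    """
    match = EMAIL_RE.fullmatch(text)
    if match is None:
        return None
    # Named groups play the role of pyparsing's results names.
    return match.group("user"), match.group("host") or ""


for address in ["myname1", "myname1@domain.tld", "not an address!"]:
    print(address, "->", split_address(address))
```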

Download pyparsing at http://pyparsing.sourceforge.net.

-- Paul

from pyparsing import Literal, Word, Optional, delimitedList, alphanums

# define format of an email address
AT = Literal("@").suppress()
emailWord = Word(alphanums+"_")
emailDomain = delimitedList( emailWord, ".", combine=True)
emailAddress = emailWord.setResultsName("user") + \
    Optional( AT + emailDomain ).setResultsName("host")

# parse each word in wordList
wordList = ['myname1', 'myname1@domain.tld', 'myname2@domain.tld',
            'myname4@domain', 'myname5@domain.tldx']

for w in wordList:
    addr = emailAddress.parseString(w)
    print(w)
    print(addr)
    print("user:", addr.user)
    print("host:", addr.host)
    print()

This will print out:
myname1
['myname1']
user: myname1
host:

myname1@domain.tld
['myname1', 'domain.tld']
user: myname1
host: domain.tld

myname2@domain.tld
['myname2', 'domain.tld']
user: myname2
host: domain.tld

myname4@domain
['myname4', 'domain']
user: myname4
host: domain

myname5@domain.tldx
['myname5', 'domain.tldx']
user: myname5
host: domain.tldx



