Help with Regular Expressions
Paul McGuire
ptmcg at austin.rr.com
Wed Aug 10 09:27:18 EDT 2005
If your re demands get more complicated, you could take a look at
pyparsing. The code is a bit more verbose, but many find it easier to
compose their expressions using pyparsing's classes, such as Literal,
OneOrMore, Optional, etc., plus a number of built-in helper functions
and expressions, including delimitedList, quotedString, and
cStyleComment. Pyparsing is intended for writing recursive-descent
parsers, but can also be used (and is best learned) with simple
applications such as this one.
Here is a simple script for parsing your e-mail addresses. Note the
use of results names to give you access to the individual parsed fields
(re's support a similar capability through named groups).
Download pyparsing at http://pyparsing.sourceforge.net.
-- Paul

from pyparsing import Literal, Word, Optional, \
    delimitedList, alphanums

# define format of an email address
AT = Literal("@").suppress()
emailWord = Word(alphanums + "_")
emailDomain = delimitedList(emailWord, ".", combine=True)
emailAddress = emailWord.setResultsName("user") + \
    Optional(AT + emailDomain).setResultsName("host")

# parse each word in wordList
wordList = ['myname1', 'myname1@domain.tld', 'myname2@domain.tld',
            'myname4@domain', 'myname5@domain.tldx']

for w in wordList:
    addr = emailAddress.parseString(w)
    print(w)
    print(addr)
    # fields are available by the results names assigned above
    print("user:", addr.user)
    print("host:", addr.host)
    print()

Will print out:

myname1
['myname1']
user: myname1
host:

myname1@domain.tld
['myname1', 'domain.tld']
user: myname1
host: domain.tld

myname2@domain.tld
['myname2', 'domain.tld']
user: myname2
host: domain.tld

myname4@domain
['myname4', 'domain']
user: myname4
host: domain

myname5@domain.tldx
['myname5', 'domain.tldx']
user: myname5
host: domain.tldx
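
For comparison, here is a rough sketch of the similar capability in re
mentioned above, using named groups. The pattern and the names WORD,
EMAIL_RE and parse_address are only illustrations of mine and are not part
of the script above. Unlike the pyparsing version, this sketch insists that
the whole string match, so treat it as one possible equivalent rather than
an exact one.

```python
import re

# Hypothetical pattern mirroring the pyparsing grammar:
# a word of letters, digits and "_", optionally followed by "@" and a
# dot-separated domain built from the same kind of word.
WORD = r"[A-Za-z0-9_]+"
EMAIL_RE = re.compile(
    r"(?P<user>%s)(?:@(?P<host>%s(?:\.%s)*))?$" % (WORD, WORD, WORD)
)


def parse_address(text):
    """Return (user, host), with host '' when absent, or None if no match."""
    m = EMAIL_RE.match(text)
    if m is None:
        return None
    # An unmatched optional group yields None, so fall back to ''
    return m.group("user"), m.group("host") or ""


for w in ["myname1", "myname1@domain.tld", "myname4@domain"]:
    print(w, parse_address(w))
```

The named groups play the role of the results names: m.group("user")
corresponds to addr.user, and m.group("host") to addr.host.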