Regular expression - dot problem!
李政
fzhenglee23 at yahoo.com.cn
Wed Jun 7 16:29:23 EDT 2006
Hi,
I have a problem with regular expressions (the dot problem). I checked the Python Library Reference, but I can't find any useful information. Google didn't help either.
The examples I found all do it the same way:
re.compile("www").match(string).
That always works. But my pattern string must be passed as an argument to re.compile(). In this case, I can not use
Here is my code. Bold text is where the problem is. Does anyone know the right way to do this?
----------------------------------------------------------------------------------------------------------
def getLinkType(url, sitedomain):
    # get the domain which 'url' belongs to
    urldomain = urlparse4esa(url)[1]
    tmpsd = ''
    if re.compile('^www').match(sitedomain) is not None:
        tmpsd = sitedomain[4:]
    tmpsd.replace('.', '\.')  # it seems that it doesn't work
    pattern = tmpsd + '$'  # this is my pattern string.
    # Example: pattern 'ibm.com'
    if re.compile(pattern).match(urldomain) is not None:
        return INTERNAL_LINK  # match. url is internal link
    else:
        return EXTERNAL_LINK  # doesn't match. url is external link
----------------------------------------------------------------------------------------------------------
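
Two things in the code above are likely to cause the trouble. Python strings are immutable, so str.replace() returns a new string and the result has to be assigned; and match() only tries the pattern at the start of the string, so a pattern ending in '$' will not find a domain suffix. Below is a minimal sketch of one possible fix, not a definitive one. It assumes the goal is "urldomain equals sitedomain, or is a subdomain of it, ignoring a leading 'www.'". The names strip_www, domain_pattern and is_internal are made up for illustration, and the function returns a bool instead of INTERNAL_LINK / EXTERNAL_LINK, which are not defined in the post.

```python
import re


def strip_www(sitedomain):
    """Drop a leading 'www.' if present (hypothetical helper)."""
    if sitedomain.startswith('www.'):
        return sitedomain[4:]
    return sitedomain


def domain_pattern(sitedomain):
    """Build a compiled pattern from a run-time string (hypothetical helper)."""
    # re.escape() backslash-escapes '.' and other special characters,
    # so no manual replace('.', '\\.') is needed.
    escaped = re.escape(strip_www(sitedomain))
    # The domain must start at the beginning of the string or right
    # after a dot, and must run to the end of the string.
    return re.compile(r'(?:^|\.)' + escaped + '$')


def is_internal(urldomain, sitedomain):
    """Return True if urldomain belongs to sitedomain (hypothetical helper)."""
    # search() scans the whole string; match() would only try position 0.
    return domain_pattern(sitedomain).search(urldomain) is not None


# str.replace() does not change the original string; keep its return value.
tmpsd = 'ibm.com'
escaped_by_hand = tmpsd.replace('.', r'\.')

print(is_internal('www.ibm.com', 'www.ibm.com'))
print(is_internal('ibmxcom', 'www.ibm.com'))
```

With re.escape() the dot is treated literally, so 'ibmxcom' is not accepted as a match for 'ibm.com'.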
Alex, China