Regular expression - dot problem!

李政 fzhenglee23 at yahoo.com.cn
Wed Jun 7 16:29:23 EDT 2006


Hi,
   
  I have a problem with regular expressions (the dot problem). I checked the Python Library Reference, but I can't find any useful information there, and Google didn't help either.
  The examples I found all do it the same way:
   
            re.compile("www").match(string)
   
  That always works fine. But my pattern is a string variable that must be passed as an argument to re.compile(). In this case, I cannot use
   
  Here is my code. The problem is in the line that was bold in the original message (the tmpsd.replace() call, marked with a comment below). Does anyone know the right way to do this?
  ----------------------------------------------------------------------------------------------------------
  def getLinkType(url, sitedomain):
    # get the domain which 'url' belongs to
    urldomain = urlparse4esa(url)[1]
    
    tmpsd = ''
    if re.compile('^www').match(sitedomain) is not None:
        tmpsd = sitedomain[4:]
    
    tmpsd.replace('.', '\.')        # it seems that it doesn't  work
    pattern = tmpsd + '$'        # this is my pattern string. 
                                              # Example: pattern 'ibm.com'
  
    if re.compile(pattern).match(urldomain) is not None:
        return INTERNAL_LINK    # match. url is internal link
    else:
        return EXTERNAL_LINK    # doesn't match. url is external link
  ----------------------------------------------------------------------------------------------------------
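
  One possible fix, offered as a sketch rather than a definitive answer: str.replace() returns a new string instead of modifying tmpsd in place, so its result has to be assigned, and re.escape() does the escaping for every special character at once. The sketch also uses search() rather than match(), because match() only matches at the beginning of the string, so a pattern like 'ibm\.com$' could never match 'www.ibm.com'. The name get_link_type, the choice to pass in the already-parsed urldomain, and the values of INTERNAL_LINK and EXTERNAL_LINK are assumptions made for illustration; the helper urlparse4esa is not shown in this post and is left out.

```python
import re

# Placeholder values (assumption): the real constants are not shown in the post.
INTERNAL_LINK = "internal"
EXTERNAL_LINK = "external"


def get_link_type(urldomain, sitedomain):
    """Classify urldomain as internal or external relative to sitedomain."""
    # Drop a leading "www." so "www.ibm.com" and "ibm.com" are treated alike.
    # A sitedomain without "www." is kept as-is instead of becoming ''.
    if sitedomain.startswith("www."):
        sitedomain = sitedomain[4:]

    # re.escape() returns a NEW string with regex metacharacters such as '.'
    # escaped. Strings are immutable, so the return value must be used.
    # The (?:^|\.) prefix requires the domain to start the host name or to
    # follow a dot, so "notibm.com" is not mistaken for "ibm.com".
    pattern = r"(?:^|\.)" + re.escape(sitedomain) + "$"

    # search() scans the whole string; match() only tries position 0.
    if re.search(pattern, urldomain) is not None:
        return INTERNAL_LINK
    return EXTERNAL_LINK
```

  With this sketch, get_link_type('developer.ibm.com', 'www.ibm.com') is classified as internal, while 'ibmxcom' is classified as external because the dot is matched literally.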
   
  Alex, China





More information about the Python-list mailing list