small problem with re.sub

Wed Jan 30 22:43:05 EST 2008

"Astan Chee" <stanc at al.com.au> wrote in message 
news:mailman.78.1201748496.9267.python-list at python.org...
> Hi,
> I have some HTML text stored as a string. Now I want to go through this 
> string and find all 6-digit numbers and make links from them.
> I'm using re.sub and for some reason it's not picking up the previously 
> matched condition. Am I doing something wrong? This is what my code looks 
> like:
> htmlStr = re.sub('(?P<id>\d{6})','<a 
> href=\"http://linky.com/(?P=id).html\">(?P=id)</a>',htmlStr)
> It seems that it replaces it alright, but it replaces it literally. Am I 
> not escaping certain characters?
> Thanks again for the help.
> Cheers
>
> Animal Logic
> http://www.animallogic.com

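See the help for re.sub: "Backreferences, such as '\6', are replaced with 
the substring matched by group 6 in the pattern." The (?P=id) syntax is only 
recognized inside the pattern, so in the replacement string it is copied 
literally. In the replacement, use \1 (or \g<id> for the named group).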

This should work:

htmlStr = re.sub(r'(?P<id>\d{6})',
                 r'<a href="http://linky.com/\1.html">\1</a>', htmlStr)
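
For comparison, here is a small self-contained sketch of both replacement forms, the numbered \1 and the named \g<id>. The sample input string and the variable names are made up for illustration; only the pattern and the linky.com URL come from the original post.

```python
import re

# Hypothetical sample input; the real htmlStr would be the poster's HTML.
html_str = "Ticket 123456 and ticket 654321, but not 12345."

pattern = r'(?P<id>\d{6})'

# Numbered backreference: \1 is the text matched by the first group.
by_number = re.sub(pattern,
                   r'<a href="http://linky.com/\1.html">\1</a>', html_str)

# Named backreference: \g<id> is the text matched by the group named "id".
by_name = re.sub(pattern,
                 r'<a href="http://linky.com/\g<id>.html">\g<id></a>', html_str)

print(by_number)
```

Both calls should give the same result. Raw strings (the r prefix) keep the backslashes intact, so there is no need to double them.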

--Mark