[Tutor] problem in replacing regex
Moos Heintzen
iwasroot at gmail.com
Thu Apr 9 00:05:46 CEST 2009
Hi,
You can do the substitution in many ways.
You can first search for bare account numbers and replace them with
urls. Then wrap the urls in <a></a> tags.
To substitute only the account numbers that aren't already in urls,
you simply substitute account numbers that aren't preceded by a "/",
as you have been trying to do.
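
A negative lookbehind is one way to say "not preceded by a slash"
right in the pattern. This is only a sketch: link_bare_numbers and the
example.com url are placeholder names of mine, not something from your
code. The \d in the lookbehind is there so the match can't start in
the middle of a number that follows a slash.

```python
import re

# An account number (digits-digits) that is NOT preceded by "/" or by
# another digit. The lookbehind also succeeds at the very start of the
# string, where nothing precedes the number.
BARE_NUMBER = re.compile(r'(?<![/\d])\d+-\d+')


def link_bare_numbers(text):
    # \g<0> in the replacement string stands for the whole match.
    # 'http://example.com/' is a placeholder base url.
    return BARE_NUMBER.sub(r'http://example.com/\g<0>', text)


print(link_bare_numbers("12345-12"))
# http://example.com/12345-12
print(link_bare_numbers("https://hello.com/accid/12345-12"))
# https://hello.com/accid/12345-12
```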
re.sub() can accept a function instead of a string. The function
receives the match object and returns the replacement. This way you
can do extra processing on each match.
import re
text = """https://hello.com/accid/12345-12
12345-12
http://sadfsdf.com/asdf/asdf/asdf/12345-12
start12345-12end
this won't be replaced
start/123-45end
"""
def sub_num(m):
    if m.group(1) == '/':
        return m.group(0)
    else:
        # put url here
        return m.group(1) + 'http://example.com/' + m.group(2)
>>> print re.sub(r'(\D)(\d+-\d+)', sub_num , text)
https://hello.com/accid/12345-12
http://example.com/12345-12
http://sadfsdf.com/asdf/asdf/asdf/12345-12
starthttp://example.com/12345-12end
this won't be replaced
start/123-45end
>>> _
This assumes there aren't any <a> tags in the input, so you should
do this before substituting urls into <a> tags.
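
Putting the two steps in that order could look roughly like this. It
is a sketch under a few assumptions: linkify and BASE_URL are names I
made up, and https?://\S+ is a very crude url pattern (it grabs
everything up to the next whitespace), so you may need something
stricter for real input.

```python
import re

BASE_URL = 'http://example.com/'  # placeholder, put the real url here


def sub_num(m):
    # Leave numbers alone when they directly follow a "/" (already in a url).
    if m.group(1) == '/':
        return m.group(0)
    return m.group(1) + BASE_URL + m.group(2)


def linkify(text):
    # Step 1: turn bare account numbers into urls.
    text = re.sub(r'(\D)(\d+-\d+)', sub_num, text)
    # Step 2: wrap every url in an <a> tag (crude pattern, see above).
    return re.sub(r'(https?://\S+)', r'<a href="\1">\1</a>', text)


print(linkify("see 12345-12 now"))
# see <a href="http://example.com/12345-12">http://example.com/12345-12</a> now
```

Doing step 2 first would not work with this pattern: the number inside
the link text follows a ">" rather than a "/", so it would be
substituted a second time.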
I have super cow powers!
Moos