[Tutor] problem in replacing regex

Moos Heintzen iwasroot at gmail.com
Thu Apr 9 00:05:46 CEST 2009


Hi,

You can do the substitution in many ways.

You can first search for bare account numbers and replace them with
urls. Then wrap the urls in <a></a> tags.

To substitute account numbers that aren't in urls, you simply
substitute account numbers that aren't preceded by a "/", as you
have been trying to do.
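
One way to write the "not preceded by a /" condition without a callback
is a negative lookbehind. This is only a sketch and not necessarily the
pattern you were trying: the name link_bare_numbers and the example.com
base url are placeholders I made up. The lookbehind also rejects a
preceding digit, so a match can't start in the middle of a number that
follows a slash.

```python
import re

# Hypothetical pattern: an account number such as 12345-12 that is not
# preceded by "/" or by another digit.
BARE_NUM = re.compile(r'(?<![/\d])\d+-\d+')

def link_bare_numbers(text, base='http://example.com/'):
    # Prefix each bare account number with the placeholder base url.
    return BARE_NUM.sub(lambda m: base + m.group(0), text)

print(link_bare_numbers('12345-12 and http://hello.com/accid/12345-12'))
# prints: http://example.com/12345-12 and http://hello.com/accid/12345-12
```

Unlike a pattern that must consume one character before the number,
this also catches a number at the very start of the text.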

re.sub() can accept a function instead of a string. The function
receives the match object and returns the replacement. This way you
can do extra processing on each match.

import re

text = """https://hello.com/accid/12345-12

12345-12

http://sadfsdf.com/asdf/asdf/asdf/12345-12

start12345-12end

this won't be replaced
start/123-45end
"""

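def sub_num(m):
	# group(1) is the character before the number, group(2) the number
	if m.group(1) == '/':
		# already part of a url: return the whole match unchanged
		return m.group(0)
	else:
		# put url here
		return m.group(1) + 'http://example.com/' + m.group(2)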

>>> print(re.sub(r'(\D)(\d+-\d+)', sub_num, text))
https://hello.com/accid/12345-12

http://example.com/12345-12

http://sadfsdf.com/asdf/asdf/asdf/12345-12

starthttp://example.com/12345-12end

this won't be replaced
start/123-45end

>>> _

This is assuming there aren't any <a> tags in the input, so you should
do this before substituting urls into <a> tags.
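
Putting the two steps together in that order could look roughly like
this. It is a sketch under assumptions: linkify and sub_url are names
I made up, the url pattern https?://\S+ is deliberately rough (it
takes everything up to the next whitespace), and example.com is still
a placeholder.

```python
import re

def sub_num(m):
    # Leave numbers that directly follow "/" alone (already in a url).
    if m.group(1) == '/':
        return m.group(0)
    return m.group(1) + 'http://example.com/' + m.group(2)

def sub_url(m):
    # Wrap a url in an anchor tag, using the url itself as the link text.
    url = m.group(0)
    return '<a href="%s">%s</a>' % (url, url)

def linkify(text):
    # Step 1: bare account numbers -> urls.
    text = re.sub(r'(\D)(\d+-\d+)', sub_num, text)
    # Step 2: every url -> <a> tag (rough url pattern, an assumption).
    return re.sub(r'https?://\S+', sub_url, text)

print(linkify('see 12345-12 now'))
```

Because the url pattern runs up to the next whitespace, text glued to
the end of a number (as in start12345-12end) ends up inside the link.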


I have super cow powers!

Moos

