Python-list Digest, Vol 22, Issue 82

Vibha Tripathi vibtrip at yahoo.com
Tue Jul 5 16:35:37 EDT 2005


thank you George!!!

THIS is what I was looking for :)

Peace.
Vibha

From:	"George Sakkis" <gsakkis at rutgers.edu>
To:	python-list at python.org
Date:	5 Jul 2005 12:43:26 -0700
Subject:	Re: Python Regular Expressions: re.sub(regex, replacement, subject)


"Vibha Tripathi" <vibtrip at yahoo.com> wrote:

> Hi Folks,
>
> I put a Regular Expression question on this list a couple days ago. I
> would like to rephrase my question as below:
>
> In the Python re.sub(regex, replacement, subject) method/function, I
> need the second argument 'replacement' to be another regular expression
> (not a string). So when I find a 'certain kind of string' in the
> subject, I can replace it with 'another kind of string' (not a
> predefined string). Note that the 'replacement' may depend on what exact
> string is found as a result of match with the first argument 'regex'.

In re.sub, 'replacement' can be either a string or a callable that takes
a single match object and returns the replacement string. So although
the replacement cannot be a regular expression, it can be something even
more powerful: a function. Here's a toy example of what you can do that
wouldn't be possible with regular expressions alone:

>>> import re
>>> from datetime import datetime
>>> this_year = datetime.now().year
>>> rx = re.compile(r'(born|graduated|hired) in (\d{4})')
>>> def replace_year(match):
...     # group(1) is the verb, group(2) the four-digit year
...     return "%s %d years ago" % (match.group(1),
...                                 this_year - int(match.group(2)))
...
>>> # The numbers below depend on the current year (2005 here).
>>> rx.sub(replace_year, 'I was born in 1979 and graduated in 1996.')
'I was born 26 years ago and graduated 9 years ago.'
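
The same idea covers the case where the replacement depends on exactly
which string was found. Here is a minimal sketch using a lookup table;
the ABBREVIATIONS mapping and the expand() helper are made up for
illustration and are not part of the re module:

```python
import re

# Hypothetical mapping, chosen only for illustration.
ABBREVIATIONS = {"Mon": "Monday", "Tue": "Tuesday", "Wed": "Wednesday"}

# Build one alternation from the keys; re.escape guards against
# any regex metacharacters that might appear in a key.
pattern = re.compile(r"\b(" + "|".join(map(re.escape, ABBREVIATIONS)) + r")\b")

def expand(match):
    # The replacement is looked up from whichever alternative matched.
    return ABBREVIATIONS[match.group(1)]

result = pattern.sub(expand, "Meet me Tue or Wed, not Mon.")
print(result)
```

re.sub calls expand() once per non-overlapping match, so each
abbreviation gets its own replacement in a single pass.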

In cases where you don't have to transform the matched string (such as
calling int() and evaluating an expression, as in the example) but only
append or prepend another string, there is a simpler solution that
doesn't require writing a replacement function: backreferences. The
replacement can be a string in which \1 denotes the first group of the
match, \2 the second, and so on. Continuing the example, you could hide
the dates with:

>>> rx.sub(r'\1 in ****', 'I was hired in 2001 in a company of 2001 employees.')
'I was hired in **** in a company of 2001 employees.'
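
Backreferences can also be written as \g<name> or \g<number>, which
helps when groups are named or when a digit has to follow the reference.
A small sketch, assuming the pattern is rewritten with named groups (the
names 'verb' and 'year' are my own choice):

```python
import re

# Same pattern as in the example, with named groups added
# (the names 'verb' and 'year' are assumptions for illustration).
rx = re.compile(r"(?P<verb>born|graduated|hired) in (?P<year>\d{4})")

text = "I was hired in 2001 in a company of 2001 employees."

# \g<verb> refers to the named group, just as \1 would.
masked = rx.sub(r"\g<verb> in ****", text)

# Groups can be reordered freely in the replacement string.
swapped = rx.sub(r"\g<year>: \g<verb>", "born in 1979")

# \g<1>0 means "group 1 followed by a literal 0"; writing \10
# would instead be read as a reference to group 10.
suffixed = rx.sub(r"\g<1>0", "born in 1979")

print(masked)
print(swapped)
print(suffixed)
```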

By the way, run the last example without the 'r' in front of the
replacement string and you'll see what it is there for.
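
For anyone who wants to see the difference without typing it at the
prompt, here is a short sketch (the sample sentence is shortened from
the one above). Without the 'r', Python itself turns "\1" into the
single control character chr(1) before re.sub ever sees it:

```python
import re

rx = re.compile(r"(born|graduated|hired) in (\d{4})")
text = "I was hired in 2001."

# Raw string: backslash and '1' reach re.sub as two characters,
# so they are treated as a backreference to group 1.
with_raw = rx.sub(r"\1 in ****", text)

# Plain string: "\1" is an octal escape for the character chr(1),
# so no backreference is left for re.sub to expand.
without_raw = rx.sub("\1 in ****", text)

print(repr(with_raw))
print(repr(without_raw))
print(len(r"\1"), len("\1"))
```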

HTH,

George




=======
"Things are only impossible until they are not."
