Regex replacement operation

Cliff Wells clifford.wells at attbi.com
Thu Jan 16 01:29:21 EST 2003


On Wed, 2003-01-15 at 20:37, David K. Trudgett wrote:
> I'm trying to write a little micro function in Python that takes a
> string with numeric dates in it and changes those dates into an
> ISO8601 format. The dates in the input string are in a consistent
> format, so that simplifies it. In Perl, I could write something like
> the following to do it:
> 
> $str = 'Today is 16-1-2003 or 16-01-2003. New Year was 1-1-2003, reportedly.';
> 
> $str =~ s/ \b (\d{1,2}) - (\d{1,2}) - (\d{4}) \b /
>     $3 . '-' . sprintf('%02d', $2) . '-' . sprintf('%02d', $1) /gxe;
> 
> which would make $str contain:
> 
> "Today is 2003-01-16 or 2003-01-16. New Year was 2003-01-01, reportedly."
> 
> 
> How would I go about doing that in Python?
> 
> I've looked up the "re" module, but don't see any "substitute"
> command, so it seems a different approach may be in order.

The substitute command you're after is re.sub():

http://www.python.org/doc/current/lib/node99.html
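
As a rough sketch of the basic call (the variable names here are my own,
not from the original question): re.sub() takes a pattern, a replacement,
and the input string, and the replacement may refer to captured groups as
\1, \2, and so on. Backreferences alone can reorder the fields but cannot
zero-pad them, which is why the examples below do a bit more work.

```python
import re

text = 'New Year was 1-1-2003, reportedly.'

# Swap day-month-year into year-month-day using backreferences only.
# Note: no zero-padding happens here; the groups are copied verbatim.
reordered = re.sub(r'\b(\d{1,2})-(\d{1,2})-(\d{4})\b', r'\3-\2-\1', text)

print(reordered)
```

The date comes out reordered but still unpadded.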


Anyway, here are two different (yet similar) implementations.  The
second is mostly to demonstrate some useful features of the re module
and Python string interpolation.  I'm sure it can be done better :P


# ===== Example 1 ======

import re

s = 'Today is 16-1-2003 or 16-01-2003. New Year was 1-1-2003, reportedly.'

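# Raw string so the backslashes reach the regex engine untouched;
# \b mirrors the word boundaries in the Perl version.
rexp = re.compile(r'\b(\d\d?-\d\d?-\d\d\d\d)\b')

for date in rexp.findall(s):
    subdate = date.split('-')
    subdate.reverse()  # day-month-year -> year-month-day
    subdate = "%d-%02d-%02d" % tuple([int(d) for d in subdate])
    # Match the date as a whole word so that, e.g., 1-1-2003 does not
    # also match inside 11-1-2003.
    s = re.sub(r'\b' + re.escape(date) + r'\b', subdate, s)

print(s)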


# ===== Example 2 ======


s = 'Today is 16-1-2003 or 16-01-2003. New Year was 1-1-2003, reportedly.'

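# Named groups: the whole date plus its day, month, and year parts.
rexp = re.compile(r'\b(?P<date>(?P<day>\d\d?)-(?P<month>\d\d?)-(?P<year>\d\d\d\d))\b')

for match in rexp.finditer(s):
    groupdict = match.groupdict()
    for i in groupdict:
        try:
            groupdict[i] = int(groupdict[i])
        except ValueError: # date key isn't a valid int
            pass
    # The dict feeds the named %(...) placeholders directly.
    subdate = "%(year)d-%(month)02d-%(day)02d" % groupdict
    # Match the date as a whole word so shorter dates cannot hit longer ones.
    s = re.sub(r'\b' + re.escape(groupdict['date']) + r'\b', subdate, s)

print(s)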
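

# ===== Passing a function to re.sub() ======

The closest match to Perl's s///e is probably to hand re.sub() (or the
sub() method of a compiled pattern) a function instead of a replacement
string. The function receives each match object and returns the
replacement text. A minimal sketch, where to_iso and date_re are names
I made up for illustration:

```python
import re

s = 'Today is 16-1-2003 or 16-01-2003. New Year was 1-1-2003, reportedly.'

# \b keeps the pattern from matching inside longer digit runs,
# mirroring the word boundaries in the Perl version.
date_re = re.compile(r'\b(\d{1,2})-(\d{1,2})-(\d{4})\b')

def to_iso(match):
    # Called once per match; the return value replaces the matched text.
    day, month, year = [int(g) for g in match.groups()]
    return "%d-%02d-%02d" % (year, month, day)

result = date_re.sub(to_iso, s)
print(result)
```

Because the substitution happens in a single pass over the string,
already-converted dates are never rescanned.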


-- 
Cliff Wells <clifford.wells at attbi.com>





