Regular Expression - old regex module vs. re module

Paul McGuire ptmcg at austin.rr._bogus_.com
Fri Jun 30 10:46:34 EDT 2006


"Steve" <stever at cruzio.com> wrote in message
news:1151607229.548737.145800 at d56g2000cwd.googlegroups.com...
> Hi All,
>
> I'm having a tough time converting the following regex.compile patterns
> into the new re.compile format.  There are also differences between
> regsub.sub() and re.sub().
>
> Could anyone lend a hand?
>
>

This is not an re solution, but pyparsing makes for an easy-to-follow
program. transformString only needs to scan through the string once; the
"reals-before-ints" ordering is built into the definition of the
formatters variable.

Pyparsing's project wiki is at http://pyparsing.wikispaces.com.

-- Paul

-------------------
from pyparsing import Word, Combine

"""
read Perl-style formatting placeholders and replace with
proper Python %x string interp formatters

   ###### -> %6d
   ##.### -> %6.3f
   <<<<<  -> %-5s
   >>>>>  -> %5s

"""

# set up patterns to be matched - Word objects match character groups
# made up of characters in the Word constructor; Combine forces
# elements to be adjacent with no intervening whitespace
# (note use of results name in realFormat, for easy access to
# decimal places substring)
intFormat = Word("#")
realFormat = Combine(Word("#")+"."+
                     Word("#").setResultsName("decPlaces"))
leftString = Word("<")
rightString = Word(">")

# define parse actions for each - the matched tokens are the third
# arg to a parse action; the value returned from the parse action
# replaces the incoming tokens
intFormat.setParseAction( lambda s,l,toks: "%%%dd" % len(toks[0]) )
realFormat.setParseAction( lambda s,l,toks: "%%%d.%df" %
                              (len(toks[0]),len(toks.decPlaces)) )
leftString.setParseAction( lambda s,l,toks: "%%-%ds" %  len(toks[0]) )
rightString.setParseAction( lambda s,l,toks: "%%%ds" %  len(toks[0]) )

# collect all formatters into a single "grammar"
# - note reals are checked before ints
formatters = rightString | leftString | realFormat | intFormat

# set up our test string, and use transformString to invoke parse actions
# on any matched tokens
testString = """
    This is a string with
        ints: ####  # ###############
        floats: #####.#  ###.######  #.#
        left-justified strings: <<<<<<<<  << <
        right-justified strings: >>>>>>>>>>  >> >
        int at end of sentence: ####.
    """
print(formatters.transformString(testString))

-------------------
Prints:

    This is a string with
        ints: %4d  %1d %15d
        floats: %7.1f  %10.6f  %3.1f
        left-justified strings: %-8s  %-2s %-1s
        right-justified strings: %10s  %2s %1s
        int at end of sentence: %4d.
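
For comparison with the re module that the original question asked about,
here is a minimal sketch of the same conversion using re.sub() with a
callback function. This is only one possible approach, not a translation of
Steve's original patterns (which are not shown here). The names PLACEHOLDER,
to_percent_format and convert_placeholders are made up for illustration. As
in the pyparsing version, the real-number alternative is listed before the
integer alternative so that "##.###" is not consumed as an int.

```python
import re

# One compiled pattern with a named group per placeholder type.
# Order matters: "real" must be tried before "int", otherwise the
# leading #'s of "##.###" would be matched as an integer field.
# (In re.VERBOSE mode a literal '#' must be escaped as '\#'.)
PLACEHOLDER = re.compile(r"""
    (?P<real>\#+\.(?P<dec>\#+))   # ##.### -> %6.3f
  | (?P<int>\#+)                  # ###### -> %6d
  | (?P<left><+)                  # <<<<<  -> %-5s
  | (?P<right>>+)                 # >>>>>  -> %5s
""", re.VERBOSE)


def to_percent_format(match):
    """Return the %-style formatter for one matched placeholder."""
    width = len(match.group(0))          # total field width
    if match.group("real"):
        return "%%%d.%df" % (width, len(match.group("dec")))
    if match.group("int"):
        return "%%%dd" % width
    if match.group("left"):
        return "%%-%ds" % width
    return "%%%ds" % width               # only "right" is left


def convert_placeholders(text):
    """Replace every placeholder in text in a single pass."""
    return PLACEHOLDER.sub(to_percent_format, text)


print(convert_placeholders("total: ###.##  name: <<<<<<  count: ####."))
```

Because re.sub() accepts a function as the replacement, each match can be
converted according to which named group matched, so the string is still
scanned only once.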





