There's got to be an easy way to do this

Emile van Sebille emile at fenx.com
Thu Jul 5 14:34:35 EDT 2001


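I'm not sure what you mean by slower, but I just tested it, and the join
version (str_join below) comes out fastest of the four: roughly level with
filter, and about five times faster than either re version.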

"""
Elapsed seconds. Block header = loop count per call; row label = number of calls.

-    10-       std_re      cpl_re      str_join    flt_lmbda
         5 :   0.00        0.01        0.00        0.00
        10 :   0.02        0.01        0.01        0.00
        50 :   0.07        0.07        0.02        0.01
-    50-       std_re      cpl_re      str_join    flt_lmbda
         5 :   0.04        0.04        0.00        0.01
        10 :   0.07        0.07        0.02        0.01
        50 :   0.37        0.35        0.07        0.08
-   100-       std_re      cpl_re      str_join    flt_lmbda
         5 :   0.07        0.07        0.02        0.01
        10 :   0.15        0.14        0.03        0.03
        50 :   0.76        0.70        0.14        0.15
-   500-       std_re      cpl_re      str_join    flt_lmbda
         5 :   0.37        0.35        0.07        0.08
        10 :   0.74        0.70        0.14        0.15
        50 :   3.74        3.60        0.70        0.75
"""


import re
from time import time

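def std_re(iters):
    # module-level re.sub, given the pattern as a string on every call
    for i in iters:
        re.sub("[^0-9]", "", "(555) 333.2221")

# compile the pattern once, outside the timed loops
c = re.compile("[^0-9]")

def cpl_re(iters):
    # same substitution through the precompiled pattern object
    for i in iters:
        c.sub("", "(555) 333.2221")

def str_join(iters):
    # list comprehension keeps the digits, join glues them back together
    for i in iters:
        "".join([x for x in '(123)/456-7890' if x in '0123456789'])

def flt_lmbda(iters):
    # filter() applied to a string returns a string (Python 2 behaviour)
    for i in iters:
        filter(lambda c: c.isdigit(), '(123)/456-7890')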

testfuncs = (std_re, cpl_re, str_join, flt_lmbda)
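COLWIDTH = 12
# one left-justified, COLWIDTH-wide column heading per test function
funcHdrs = ('%%-%ss' % COLWIDTH) * len(testfuncs) % tuple(
    [x.__name__ for x in testfuncs])
rsltMask = '%6.2f'

# pad the result format with spaces until it fills a full column
while COLWIDTH > len(rsltMask % 0):
    rsltMask = rsltMask + ' '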

for count in (10, 50, 100, 500):          # loop count inside each call
    print '\n\n-%6d-       %s' % (count, funcHdrs),
    for iterations in (5, 10, 50):        # number of calls being timed
        print ("\n\n    %6d : " % iterations),
        for func in testfuncs:
            iters = xrange(count)
            start = time()
            for i in range(iterations):
                func(iters)
            # wall-clock seconds for all calls of this function
            print rsltMask % (time()-start),
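
The script above targets the Python of its day (print statement, xrange, filter returning a string). As a rough sketch only, the same four-way comparison might look like this on Python 3 with the standard timeit module. The names SAMPLE, NON_DIGIT and bench are made up for this illustration and are not part of the original script, and the timings it prints will not match the table above.

```python
import re
import timeit

SAMPLE = "(555) 333.2221"          # illustrative input, same shape as in the post
NON_DIGIT = re.compile("[^0-9]")   # compiled once, outside the timed code

def std_re(s):
    # module-level re.sub with the pattern passed as a string
    return re.sub("[^0-9]", "", s)

def cpl_re(s):
    # same substitution through the precompiled pattern
    return NON_DIGIT.sub("", s)

def str_join(s):
    # keep characters found in the digit string, then join them
    return "".join([x for x in s if x in "0123456789"])

def flt_lmbda(s):
    # filter() is lazy in Python 3, so join() is needed to force the work
    return "".join(filter(lambda ch: ch.isdigit(), s))

TESTFUNCS = (std_re, cpl_re, str_join, flt_lmbda)

def bench(number=10000):
    """Return {function name: seconds taken for `number` calls}."""
    return {f.__name__: timeit.timeit(lambda f=f: f(SAMPLE), number=number)
            for f in TESTFUNCS}

if __name__ == "__main__":
    for name, secs in bench().items():
        print("%-12s %8.4f" % (name, secs))
```

Unlike the original, every function here receives the same input string, so the four columns measure the same job.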

--

Emile van Sebille
emile at fenx.com

---------
"Michael Ströder" <michael at stroeder.com> wrote in message
news:3B44B031.819C93E at stroeder.com...
> Emile van Sebille wrote:
> > Or (without re):
> > print "".join([x for x in '(123)/456-7890' if x in '0123456789'])
>
> I guess this one will be significantly slower for larger
> data sets than the re solution because of
>
>   if x in '0123456789'
>
> Ciao, Michael.
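
The concern quoted above is the cost of the test `x in '0123456789'`. For a single character that test scans at most ten characters, so the work per input character is bounded and the whole expression stays linear in the input length. A minimal sketch for checking this on larger inputs, assuming Python 3 and the standard timeit module; the function names, the frozenset variant and the isdigit variant are illustrative additions, not something proposed in the thread:

```python
import timeit

DIGITS_STR = "0123456789"
DIGITS_SET = frozenset(DIGITS_STR)  # hypothetical alternative, not in the post

def digits_via_str(s):
    # the test from the post: scan a 10-character string per input character
    return "".join([x for x in s if x in DIGITS_STR])

def digits_via_set(s):
    # hash lookup instead of a substring scan
    return "".join([x for x in s if x in DIGITS_SET])

def digits_via_isdigit(s):
    # note: str.isdigit() also accepts non-ASCII digit characters in Python 3
    return "".join([x for x in s if x.isdigit()])

def compare(size=1000, number=100):
    """Time each variant on the sample string repeated `size` times."""
    data = "(123)/456-7890" * size
    return {f.__name__: timeit.timeit(lambda f=f: f(data), number=number)
            for f in (digits_via_str, digits_via_set, digits_via_isdigit)}

if __name__ == "__main__":
    for name, secs in compare().items():
        print("%-20s %8.4f" % (name, secs))
```

Raising `size` shows how each variant scales with the data; which one wins will depend on the interpreter version and the machine.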
