Why is Python so slow? - revisited.

Christian Tismer tismer at appliedbiometrics.com
Wed Jul 5 09:52:00 EDT 2000


William Dandreta wrote:
> 
> Hi chris,
> 
> Here is the fastest version of my program.

Ok. I was away for a while and did not follow the thread.

Some hints may give you a little speedup:

Don't work at the global (module) level; do everything in
functions. This gives you a speedup, since local variables
are accessed by index and don't involve dictionary lookups.
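
To make the point about lookups visible, here is a small sketch in present-day Python 3 (not the Python 1.x of this thread) that lists the bytecode instruction names for a name resolved globally and for the same value read from a local. The function names are made up for illustration, and the exact instruction names vary between CPython versions.

```python
import dis

def lookup_global():
    # 'len' is not a local here, so it is resolved by name at run time
    # (module globals first, then builtins): a dictionary lookup.
    return len

def lookup_local():
    cached = len   # resolve the name once ...
    return cached  # ... then read it back from a local slot

def opnames(func):
    """Return the bytecode instruction names of func, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]

global_ops = opnames(lookup_global)
local_ops = opnames(lookup_local)
print(global_ops)
print(local_ops)
```

On CPython the first list should contain a LOAD_GLOBAL instruction, while the second reads the cached value back with one of the "FAST" instructions that index directly into the frame's local slots.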

Further, since you are working on only 6000 lines or so,
you can save quite some time by calling readlines() once and
then iterating over the resulting list. But I don't know if
Py1.2 has readlines at all.
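
A minimal sketch of the two reading styles, written for present-day Python 3 rather than Py1.2. The temporary file and its contents are invented sample data; both helpers return the same list, so only the number of method calls in the loop differs.

```python
import os
import tempfile

def read_with_readline(path):
    """Collect the lines with one readline() call per line."""
    lines = []
    with open(path, 'r') as f:
        while True:
            line = f.readline()
            if not line:
                break
            lines.append(line)
    return lines

def read_with_readlines(path):
    """Collect all lines with a single readlines() call."""
    with open(path, 'r') as f:
        return f.readlines()

# Build a throwaway sample file of about the size mentioned above.
fd, sample_path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as f:
    for i in range(6000):
        f.write('"row%d","a","b"\n' % i)

lines_a = read_with_readline(sample_path)
lines_b = read_with_readlines(sample_path)
os.remove(sample_path)
```

For a file of this size, holding every line in memory at once is not a concern; for very large inputs the trade-off would look different.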

Here is a rough layout, untested of course!

### start of example refactored
#corrects Y2K leapyear error and
#strips off all the " marks

import strop

FINNAME  = 'lu4.txt'
FOUTNAME = 'lu4.new'

def process(finn=FINNAME, foutn=FOUTNAME):
    # caching some functions locally:
    strip = strop.strip
    splitfields = strop.splitfields
    atoi = strop.atoi
    joinfields = strop.joinfields
    _map = map
    _tuple = tuple

    lines = open(finn, 'r').readlines()
    fout = open(foutn, 'wb')
    write = fout.write

    for line in lines:
        fields = splitfields(joinfields(splitfields(line,'"'),''),',')
        fields[2] = joinfields(splitfields(fields[2],' '),'0')
        fields = _map(strip,fields)
        #convert date to YYYYMMDD format
        if fields[2][-2:] == '00':
            fields[2] = '2000' + fields[2][-10:-8] + fields[2][-7:-5]
        else:
            fields[2] = ('19' + fields[2][-2:] + fields[2][-10:-8] +
                         fields[2][-7:-5])

        #fix date for Y2K leapyear error
        if   fields[2] == '20000301':  fields[2] = '20000229'
        elif fields[2] == '20000401':  fields[2] = '20000331'
        elif '20000301' < fields[2] < '20000501':
            fields[2] = ('2000' + fields[2][4:6] +
                         '%02d' % (atoi(fields[2][6:]) - 1))

        write('%-16s,%8s,%8s,%10s,%10s,%-11s,%-11s,%-21s,%-51s\r\n' %
              _tuple(fields))

if __name__ == '__main__':
    process()  # add cmd line handling here

### end
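
Since the layout above is untested, one way to check the date handling on its own is to pull it into a small pure function. The sketch below does that in present-day Python 3. The function name convert_date is made up, and the MM/DD/YYYY input format is an assumption inferred from the slice positions, not something stated in the thread.

```python
def convert_date(raw):
    """Mirror the date logic above for one field.

    Assumes raw looks like 'MM/DD/YYYY' (an assumption based on the
    slices used), where a year ending in '00' really means 2000.
    """
    # convert date to YYYYMMDD format
    if raw[-2:] == '00':
        date = '2000' + raw[-10:-8] + raw[-7:-5]
    else:
        date = '19' + raw[-2:] + raw[-10:-8] + raw[-7:-5]

    # fix date for Y2K leapyear error: shift affected dates back one day
    if date == '20000301':
        date = '20000229'
    elif date == '20000401':
        date = '20000331'
    elif '20000301' < date < '20000501':
        date = '2000' + date[4:6] + '%02d' % (int(date[6:]) - 1)
    return date

print(convert_date('12/31/1999'))
print(convert_date('03/01/1900'))
print(convert_date('03/15/1900'))
```

A helper like this can be exercised with a handful of known dates before running the whole file through process().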

But I'm not sure the speed gain would justify changing it
at all. Well, maybe it buys you a few percent.
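
Whether a change like this pays off is easiest to settle by measuring. Here is a rough sketch using the timeit module of present-day Python 3. The helper best_time, the sample data, and the two stand-in functions are all hypothetical; they are placeholders for the old and new versions of the script, not the script itself.

```python
import timeit

def best_time(func, repeat=5, number=10):
    """Return the fastest of `repeat` timings; each timing calls func `number` times."""
    return min(timeit.repeat(func, repeat=repeat, number=number))

# Invented sample input, roughly the size discussed in this thread.
SAMPLE = ['"a","b","1 2"\n'] * 6000

def version_a():
    # explicit loop with an attribute lookup (out.append) on every pass
    out = []
    for line in SAMPLE:
        out.append(line.replace('"', '').split(','))
    return out

def version_b():
    # same work expressed as a list comprehension
    return [line.replace('"', '').split(',') for line in SAMPLE]

time_a = best_time(version_a)
time_b = best_time(version_b)
print('a: %.4fs  b: %.4fs' % (time_a, time_b))
```

Taking the minimum over several repeats reduces the influence of other activity on the machine; the absolute numbers will still differ from system to system.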

cheers - chris

> ----------------------------------------------------------
> #corrects Y2K leapyear error and
> #strips off all the " marks
> 
> from strop import strip, splitfields,atoi,joinfields
> 
> lu4 = open('lu4.txt','r')
> newlu4 = open('lu4.new','wb')
> 
> while 1:
> #for i in range(6000):
>   line = lu4.readline()
>   if not line: break
>   fields = splitfields(joinfields(splitfields(line,'"'),''),',')
>   fields[2] = joinfields(splitfields(fields[2],' '),'0')
>   fields = map(strip,fields)
> #convert date to YYYYMMDD format
>   if fields[2][-2:] == '00':
>     fields[2] = '2000' + fields[2][-10:-8] + fields[2][-7:-5]
>   else:
>     fields[2] = '19' + fields[2][-2:] + fields[2][-10:-8] + fields[2][-7:-5]
> 
> #fix date for Y2K leapyear error
>   if fields[2] == '20000301':
>     fields[2] = '20000229'
>   elif fields[2] == '20000401':
>     fields[2] = '20000331'
>   elif '20000301'<fields[2] <'20000501':
>     fields[2] = '2000' + fields[2][4:6] + '%02d' % (atoi(fields[2][6:])-1)
> 
>   newlu4.write('%-16s,%8s,%8s,%10s,%10s,%-11s,%-11s,%-21s,%-51s\r\n' %
>                tuple(fields))
> ---------------------------------------------------------------------
> Bill
> 
> Christian Tismer wrote in message
> <394D6579.8B2FF840 at appliedbiometrics.com>...
> >Also feel free to send me your source code for inspection.

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



