Improve performance for writing files with format modification

Andrew Dalke dalke at dalkescientific.com
Wed Dec 19 22:47:24 EST 2001


christine.bartels at teleatlas.com:
>I need some help to improve a function that formats a string and
>writes the result to a file.
>...
>file = open(filein,"r")
>nfile = open(fileout,"w")
>while 1:
>    fblock = file.readlines(0x2000)
>    if not fblock:
>        break
>    for i in fblock:
>        nfile.write("%20s%10s" % tuple(split(';')))
>nfile.close()
>...

It appears this takes a bunch of lines of the form

    value1;value2

and right-justifies value1 in a 20-column field and value2 in
a 10-column field.

I assume somewhere you did a 'from string import *', so
that split is in the local namespace?  And that the
split(';') is really split(i, ';')?  And that there's a
missing "\n"?

In other words, that the inner loop is written

    for i in fblock:
        nfile.write("%20s%10s\n" % tuple(string.split(i, ';')))

(btw, 'i' is usually used to store an integer - it throws me
off to see it holding a string.)
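
To make the format concrete, here is a small sketch of what that
line produces for a single record.  It assumes a current Python,
where split is a string method rather than a function in the string
module, and the sample record is made up for illustration.

```python
# Hypothetical sample record; the real data comes from the input file.
line = "abc;def"

# Right-justify the first field in 20 columns and the second in 10.
formatted = "%20s%10s\n" % tuple(line.split(';'))

# repr() makes the padding and the trailing newline visible.
print(repr(formatted))
```

Each output record is then 30 characters of padded fields plus the
newline.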

There are only a few ways to make that go faster:

 - make sure the code is inside of a function.  Local variable
lookups (as inside a function) are faster than module-level
lookups.  This probably has the biggest performance impact
on the code you presented.

 - you can manually cache the lookup for nfile.write, tuple, and
split.
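
One way to check whether the caching pays off is to time both
variants.  The following is only a rough sketch, assuming a current
Python with the timeit module; the function names and the sample
data are invented for the comparison, and the size of the gain
depends on the interpreter version, so measure rather than assume.

```python
import timeit

def format_plain(lines):
    # Looks up 'tuple' and 'out.append' again on every pass of the loop.
    out = []
    for line in lines:
        out.append("%20s%10s\n" % tuple(line.split(';')))
    return out

def format_cached(lines):
    out = []
    append = out.append   # cache the attribute lookup in a local
    tupl = tuple          # cache the built-in lookup in a local
    for line in lines:
        append("%20s%10s\n" % tupl(line.split(';')))
    return out

sample = ["value1;value2"] * 1000   # made-up test data

t_plain = timeit.timeit(lambda: format_plain(sample), number=20)
t_cached = timeit.timeit(lambda: format_cached(sample), number=20)
print("plain: %.4fs  cached: %.4fs" % (t_plain, t_cached))
```

Both functions produce the same list; only the name lookups inside
the loop differ.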

Try this

def convert(file, nfile):
    write = nfile.write  # cache the attribute lookup to a local variable
    tupl = tuple  # cache the __builtin__ lookup to a local variable
    splt = split  # cache the module lookup to a local variable
    while 1:
        fblock = file.readlines(0x2000)
        if not fblock:
            break
        for i in fblock:
            write("%20s%10s\n" % tupl(splt(i, ';')))

file = open(filein,"r")
nfile = open(fileout,"w")
convert(file, nfile)
file.close()
nfile.close()
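
For anyone who wants to try the function without real files, here is
a self-contained sketch adapted to a current Python, which is an
assumption on my part and not the setup from the original question.
It uses str.split in place of string.split, in-memory io.StringIO
objects as stand-ins for the two files, and made-up input data.  It
also strips the newline that readlines() leaves on each line, since
otherwise that newline ends up inside the second field.

```python
import io

def convert(infile, outfile):
    write = outfile.write   # cache the attribute lookup in a local
    split = str.split       # str.split stands in for string.split here
    while True:
        block = infile.readlines(0x2000)  # size hint: about 8 KB of lines
        if not block:
            break
        for line in block:
            # rstrip drops the newline that readlines() keeps on each line
            write("%20s%10s\n" % tuple(split(line.rstrip("\n"), ';')))

# In-memory stand-ins for the real input and output files (made-up data).
src = io.StringIO("a;b\nvalue1;value2\n")
dst = io.StringIO()
convert(src, dst)
print(dst.getvalue())
```

Swapping the StringIO objects for open(filein, "r") and
open(fileout, "w") gives the file-based version.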

                    Andrew
                    dalke at dalkescientific.com





