Anyone happen to have optimization hints for this loop?

writeson doug.farrell at gmail.com
Wed Jul 9 12:47:32 EDT 2008


On Jul 9, 12:04 pm, dp_pearce <dp_pea... at hotmail.com> wrote:
> I have some code that takes data from an Access database and processes
> it into text files for another application. At the moment, I am using
> a number of loops that are pretty slow. I am not a hugely experienced
> python user so I would like to know if I am doing anything
> particularly wrong or that can be hugely improved through the use of
> another method.
>
> Currently, all of the values that are to be written to file are pulled
> from the database and into a list called "domainVa". These values
> represent 3D data and need to be written to text files using line
> breaks to seperate 'layers'. I am currently looping through the list
> and appending a string, which I then write to file. This list can
> regularly contain upwards of half a million values...
>
> count = 0
> dmntString = ""
> for z in range(0, Z):
>     for y in range(0, Y):
>         for x in range(0, X):
>             fraction = domainVa[count]
>             dmntString += "  "
>             dmntString += fraction
>             count = count + 1
>         dmntString += "\n"
>     dmntString += "\n"
> dmntString += "\n***\n
>
> dmntFile     = open(dmntFilename, 'wt')
> dmntFile.write(dmntString)
> dmntFile.close()
>
> I have found that it is currently taking ~3 seconds to build the
> string but ~1 second to write the string to file, which seems wrong (I
> would normally guess the CPU/Memory would out perform disc writing
> speeds).
>
> Can anyone see a way of speeding this loop up? Perhaps by changing the
> data format? Is it wrong to append a string and write once, or should
> hold a file open and write at each instance?
>
> Thank you in advance for your time,
>
> Dan

Hi Dan,

Looking at the code sample you sent, you could do some clever stuff
making dmntString a list rather than a string and appending everywhere
you're doing a +=. Then at the end you build the string your write to
the file one time with a dmntFile.write(''.join(dmntList). But I think
the more straightforward thing would be to replace all the dmntString
+= ... lines in the loops with a dmntFile.write(whatever), you're just
constantly adding onto the file in various ways.

I think the slowdown you're seeing your code as written comes from
Python string being immutable. Every time you perform a dmntString
+= ... in the loops you're creating a new dmntString, copying in the
contents of the old, plus the appended content. And if your list can
reach a half a million items, well that's a TON of string create,
string copy operations.

Hope you find this helpful,
Doug



More information about the Python-list mailing list