writing to file very slow

Moritz Lennert mlennert at club.worldonline.be
Wed Mar 26 12:21:06 EST 2003


Thanks to Gustavo, Skip and Jack for your prompt answers,

I've tried the different solutions you proposed, but unfortunately the
time gain is not very significant (maybe 30 secs).

Tomorrow I'll try to figure out where the bottleneck is, but you've
already helped me advance quite a lot in my understanding of how things
should work (although I do have to do my homework on lambdas ;-) ), so
thanks again.

Moritz
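
One rough way to locate the bottleneck is to time the database fetch and
the file write separately. The sketch below is written for current
Python 3, not the 2003-era interpreter used in this thread. Every name in
it (timed, fake_fetch, write_rows, demo) is hypothetical, and fake_fetch
only imitates a list of row tuples; it makes no claim about how the real
PyGreSQL query object behaves.

```python
import os
import tempfile
import time


def timed(label, func, *args):
    """Call func(*args), print the elapsed wall-clock time, return (result, seconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    print("%s: %.3f s" % (label, elapsed))
    return result, elapsed


def fake_fetch(n_rows, n_cols):
    # Hypothetical stand-in for the query result: a list of row tuples.
    return [tuple(range(n_cols)) for _ in range(n_rows)]


def write_rows(rows, path):
    # Write one '|'-separated line per row and return the number of lines.
    with open(path, "w") as f:
        for row in rows:
            f.write("|".join(map(str, row)) + "\n")
    return len(rows)


def demo(n_rows=4000, n_cols=10):
    # Time each stage on its own so the slow one stands out.
    rows, fetch_s = timed("fetch", fake_fetch, n_rows, n_cols)
    fd, path = tempfile.mkstemp(suffix="rec.txt")
    os.close(fd)
    try:
        count, write_s = timed("write", write_rows, rows, path)
    finally:
        os.remove(path)
    return count, fetch_s, write_s
```

In the real script, the call that fetches rows would take the place of
fake_fetch. If the write stage turns out to be fast on already-fetched
rows, the time is probably going into the repeated fetches rather than
into the file I/O.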






> On Wed, Mar 26, 2003 at 03:54:20PM +0100, Moritz Lennert wrote:
>> Hello,
>>
>> I have written a cgi script that uses PyGreSQL to query a PostgreSQL
>> database on the basis of the input of an html form. The results of the
>> query are stored in a file for the user to download.
>>
>> Everything seems to work fine, except for the fact that writing the file
>> is very slow (example: 4 minutes for 4 thousand lines). I've even had the
>> problem that the program stopped writing the file before coming to the
>> end of the results, then launches the same query again and starts writing
>> a new file, which "crashes" again, triggering a new query, etc.
>>
>> I use the following modules:
>>
>> import cgi
>> import os
>> import _pg
>> import string
>> import tempfile
>>
>> And here is the relevant function for writing the file:
>>
>> def fichier_resultats(results):
>>   temdir="/tmp"
>>   tfilename = tempfile.mktemp('rec.txt')
>>   f=open(tfilename,'w')
>>
>>   varnames=""
>>   for z in range(len(results.listfields())-1):
>>     varnames += str(results.fieldname(z))+'|'
>>   varnames += str(results.fieldname(len(results.listfields())-1))
>>   f.write(varnames)
>>   f.write("\n")
>>   for x in range(results.ntuples()):
>>    var = ""
>>    for y in range(len(results.listfields())-1):
>>      var+=str(results.getresult()[x][y])+'|'
>>    var+=str(results.getresult()[x][len(results.listfields())-1])
>>    f.write(var)
>>    f.write('\n')
>>   f.close()
>>   return(tfilename)
>
> What everyone else said plus,
>   you call getresult() a lot, does its return value ever change?
>   is it doing a lot of work?  you could call it just once
>
>   if the list of fields never changes, just calculate the string once
>   and output it many times
>
>   listfields() is used only for its length
>
>   see my comment below, do you really only want to output the last
>   value in each row of getresult() ?
>
> modified function below (!warning, untested!)
> I use lambdas unapologetically, feel free to translate into list comps
>
> def fichier_resultats(results):
>   temdir="/tmp"
>   tfilename = tempfile.mktemp('rec.txt')
>   f=open(tfilename,'w')
>
>   # you could also
>   varnames = '|'.join(map(lambda i: str(results.fieldname(i)),
>                           range(len(results.listfields()))))
>   f.write(varnames)
>   f.write("\n")
>   gotten_result = results.getresult() # we only call this once
>   for row in gotten_result:
>     # join every value of this row, last column included; this also
>     # covers the original line
>     # var+=str(results.getresult()[x][len(results.listfields())-1])
>     f.write('|'.join(map(str, row)))
>     f.write('\n')
>   f.close()
>   return(tfilename)
>
>
> -jackdied
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>
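
For anyone who prefers list comprehensions to lambdas, the suggestions
above (fetch the rows once, build the header once, join each row in one
go) can be sketched as follows. This is an illustration for current
Python 3 and has not been run against PyGreSQL: FakeResults is a
hypothetical stand-in that models only the three methods named in this
thread, and tempfile.mkstemp is swapped in for tempfile.mktemp because it
creates the file itself instead of only returning a name.

```python
import os
import tempfile


class FakeResults:
    """Hypothetical stand-in exposing only the methods used in this thread."""

    def __init__(self, fields, rows):
        self._fields = tuple(fields)
        self._rows = [tuple(r) for r in rows]

    def listfields(self):
        return self._fields

    def fieldname(self, i):
        return self._fields[i]

    def getresult(self):
        return list(self._rows)


def fichier_resultats(results):
    # mkstemp creates the file and returns an open descriptor plus its path.
    fd, tfilename = tempfile.mkstemp(suffix="rec.txt")
    n_fields = len(results.listfields())
    # List comprehension in place of map(lambda i: ..., range(...)).
    varnames = "|".join([str(results.fieldname(i)) for i in range(n_fields)])
    rows = results.getresult()  # fetched once, outside the loop
    with os.fdopen(fd, "w") as f:
        f.write(varnames + "\n")
        for row in rows:
            # One join per row replaces the character-by-character string building.
            f.write("|".join([str(value) for value in row]) + "\n")
    return tfilename
```

A quick check with made-up data:

    res = FakeResults(["id", "name"], [(1, "a"), (2, "b")])
    path = fichier_resultats(res)

The resulting file holds a header line followed by one '|'-separated line
per row.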





