Writing Log CSV (Efficiently)

Tim Golden mail at timgolden.me.uk
Mon Apr 16 09:48:06 EDT 2007


Robert Rawlins - Think Blue wrote:
> I'm looking to write a Log file which will be CSV based, and there is a good
> possibility that it'll get quite busy once it's up and running, so I'm
> looking for the most efficient way to achieve it. 


[... snip ...]
>               myfile = open("Logs/Application.txt", "w")
> 
>               myfile.write('col1, col2, col3, col4, col5')
> 
>               myfile.close()
[... snip ...]
> But I'm a little apprehensive that open() and close() on a very regular
> basis is just going to cause issues. I'm also a little worried that we'll
> end up with 'race' type conditions and things going missing.

> So my next thought was to just have an open object for the file, and then
> perform multiple writes, until I need to send the report file somewhere
> else, at which point I would close it. This in itself causes more issues as
> we're running in a buffer which is just going to eat my memory, and as this
> is on an embedded system which may lose power, we'd be kissing goodbye to
> all the logs until that point.


I'm sure you'll get this same advice from everyone on
the list, but:

1) Use the csv module which comes with Python to avoid
reinventing the wheel. (Not to do with your main question, 
but worth it anyway).

2) Don't optimize too soon. It's hard to predict what effect 
things are likely to have on performance. A *lot* depends on 
your operating system, the environment, the frequency of
updates etc. etc. Two obvious factors are whether multiple
processes are writing to the file, and what the damage would
be if the process crashed before the buffer got written /
closed.

3) If you are really worried about performance, do some 
profiling / timing. It's surely not too hard
to generate a stream of csv writes comparable to your target
system (or at least proportional). Use the timeit or hotshot 
modules to see what difference the open/close makes.
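
As a rough illustration of points 1 and 3 together, something along
these lines might do as a starting point. It is only a sketch and
assumes a current Python 3; the names log_reopen, OpenLog and compare
and the sample row are made up for the example, not taken from your
code. It appends rows with the csv module in two ways (reopening the
file for every row, and keeping one handle open but flushing after
each row) and uses timeit to compare them. The optional os.fsync call
is one way to reduce what is lost on a power cut, at a cost you would
want to measure on the real hardware.

```python
import csv
import os
import tempfile
import timeit


def log_reopen(path, row):
    """Strategy A: open in append mode, write one row, close again."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow(row)


class OpenLog:
    """Strategy B: keep one handle open and flush after every row."""

    def __init__(self, path, sync=False):
        self._file = open(path, "a", newline="")
        self._writer = csv.writer(self._file)
        self._sync = sync

    def write(self, row):
        self._writer.writerow(row)
        self._file.flush()  # hand Python's buffer over to the OS
        if self._sync:
            # Ask the OS to push its own cache to the device as well.
            os.fsync(self._file.fileno())

    def close(self):
        self._file.close()


def compare(rows=2000):
    """Time both strategies in a throwaway directory.

    Returns a dict of total seconds per strategy for `rows` writes.
    """
    # Hypothetical sample row; the csv module handles the quoting.
    row = ["a", "b, with comma", 'c "quoted"', 4, 5.0]
    with tempfile.TemporaryDirectory() as tmp:
        path_a = os.path.join(tmp, "reopen.csv")
        path_b = os.path.join(tmp, "kept_open.csv")
        t_a = timeit.timeit(lambda: log_reopen(path_a, row), number=rows)
        log = OpenLog(path_b)
        try:
            t_b = timeit.timeit(lambda: log.write(row), number=rows)
        finally:
            log.close()
    return {"reopen": t_a, "kept_open": t_b}


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name}: {seconds:.4f}s")
```

The numbers only mean something on the target system itself, with a
row size and write rate close to the real load. Neither strategy
addresses several processes writing to the same file; that would need
separate handling.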

TJG


