How can I speed this function up?

Tim Hochberg tim.hochberg at ieee.org
Sat Nov 18 01:08:11 EST 2006


Chris wrote:
> This is just some dummy code to mimic what's being done in the real 
> code. The actual code is python which is used as a scripting language in 
> a third party app. The data structure returned by the app is more or 
> less like the "data" list in the code below. The test for "ELEMENT" is 
> necessary ... it just evaluates to true every time in this test code. In 
> the real app perhaps 90% of tests will also be true.
> 
> So my question is how can I speed up what's happening inside the 
> function write_data()? Only allowed to use vanilla python (no psyco or 
> other libraries outside of a vanilla python install).

Try collecting your output into bigger chunks before writing it out. For 
example, take a look at:

def write_data2(out, data):
     buffer = []
     append = buffer.append
     extend = buffer.extend
     for i in data:
         if i[0] == 'ELEMENT':
             append("ELEMENT %06d " % i[1])
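             # "%d " keeps the space after each number, as write_data1 does;
             # map(str, ...) alone would run the numbers together.
             extend(["%d " % j for j in i[2]])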
             append('\n')
     out.write(''.join(buffer))


def write_data3(out, data):
     buffer = []
     append = buffer.append
     for i in data:
         if i[0] == 'ELEMENT':
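             # Unlike write_data1, this leaves no trailing space on each line
             # and no newline after the last line.
             append("ELEMENT %06d %s" % (i[1], ' '.join(map(str, i[2]))))
     out.write('\n'.join(buffer))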


Both of these run almost twice as fast as the original below, although 
admittedly I didn't check that their output is actually right. Using some 
of the other suggestions mentioned in this thread may make things better 
still. It's possible that some intermediate chunk size might be better 
than collecting everything into one string, I dunno.
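
The intermediate-chunk idea could look roughly like the sketch below. It is 
written in Python 3 syntax (the code in this thread is Python 2), the name 
write_data_chunked and the default chunk_size of 10000 are made up for 
illustration, and it is untimed: whether any chunk size beats a single join 
would have to be measured on the real data.

```python
import io

def write_data_chunked(out, data, chunk_size=10000):
    """Buffer formatted records and flush every chunk_size records.

    Hypothetical helper: the name and the default chunk_size are
    illustrative assumptions, not values from the thread.
    """
    buffer = []
    append = buffer.append
    for i in data:
        if i[0] == 'ELEMENT':
            # Same layout as write_data1: every number is followed by a space.
            numbers = ''.join(["%d " % j for j in i[2]])
            append("ELEMENT %06d %s\n" % (i[1], numbers))
            if len(buffer) >= chunk_size:
                out.write(''.join(buffer))
                del buffer[:]  # empty in place so the bound append stays valid
    if buffer:
        out.write(''.join(buffer))  # flush whatever is left


# Small self-check against an in-memory file instead of a real one.
sample = [("ELEMENT", n, (1, 2, 3, 4, 5, 6)) for n in range(5)]
sink = io.StringIO()
write_data_chunked(sink, sample, chunk_size=2)
result_chunked = sink.getvalue()
```

Setting chunk_size larger than the number of records makes this behave like 
the single-join versions above, so the same function can be used to compare 
a few chunk sizes.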

cStringIO might be helpful here as a buffer instead of using lists, but 
I don't have time to try it right now.
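
For anyone who wants to try the buffer idea, here is a rough sketch. It 
assumes Python 3, where cStringIO no longer exists and io.StringIO plays 
that role; on the Python 2 of this thread, cStringIO.StringIO() would be the 
analogous call. The name write_data_sio is made up, and no claim is made 
that this beats the list-and-join versions; that needs timing.

```python
import io

def write_data_sio(out, data):
    """Collect output in an in-memory text buffer, then write it once.

    Hypothetical name; io.StringIO stands in for Python 2's cStringIO.
    """
    buf = io.StringIO()
    write = buf.write  # bind once, as the list versions do with append
    for i in data:
        if i[0] == 'ELEMENT':
            write("ELEMENT %06d " % i[1])
            for j in i[2]:
                write("%d " % j)
            write("\n")
    out.write(buf.getvalue())  # a single write to the real file


# Small self-check; a second StringIO stands in for an open file.
sample = [("ELEMENT", n, (1, 2, 3, 4, 5, 6)) for n in range(3)]
target = io.StringIO()
write_data_sio(target, sample)
result_sio = target.getvalue()
```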

-tim


> 
> I have a vested interest in showing a colleague that a python app can 
> yield results in a time comparable to his C-app, which he feels is much 
> faster. I'd like to know what I can do within the constraints of the 
> python language to get the best speed possible. Hope someone can help.
> 
> def write_data1(out, data):
>      for i in data:
>          # == compares values; 'is' only passes when both are the same object
>          if i[0] == 'ELEMENT':
>              out.write("%s %06d " % (i[0], i[1]))
>              for j in i[2]:
>                  out.write("%d " % (j))
>              out.write("\n")
> 
> import time
> 
> # basic data mimicking data returned from 3rd party app
> data = []
> for i in range(500000):
>      data.append(("ELEMENT", i, (1,2,3,4,5,6)))
> 
> # write data out to file
> fname = "test2.txt"
> out = open(fname,'w')
> start = time.clock()
> write_data2(out, data)
> out.close()
> print time.clock() - start
> 
> 



