reading binary data fast / help with optimizing ( again )

Martin v. Löwis loewis at informatik.hu-berlin.de
Mon May 6 08:28:15 EDT 2002


"Thomas Weholt" <thomas at gatsoft.no> writes:

> fmt = '4Iff3I'
> record_size = calcsize(fmt)
> desired_buffer_size = 512*1024 # want to read approx. 512k chunks per IO-call
> 
> # calculate a buffer-size based on record-size and desired buffer-size
> i = 0
> while 1:
>     buffer_size = int((desired_buffer_size + i)/ struct.calcsize(fmt))
>     if buffer_size % struct.calcsize(fmt) == 0:
>         break
>     i = i + 1

That doesn't have the desired effect; it gives a buffer size of
14580, because the loop divides the byte count by the record size
(giving a record count) and then tests that count, not a byte count,
for divisibility by the record size. What you meant is

buffer_size = desired_buffer_size - desired_buffer_size % record_size

This is actually a bit smaller than 512k; if you want the next-larger
value, use

buffer_size = desired_buffer_size - desired_buffer_size % -record_size
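
For instance (a small sketch, assuming the same '4Iff3I' format from
your post; the values in the comments are what I'd expect):

  import struct

  fmt = '4Iff3I'
  record_size = struct.calcsize(fmt)    # 36 bytes per record
  desired_buffer_size = 512*1024        # 524288

  # round down to a multiple of record_size
  buffer_down = desired_buffer_size - desired_buffer_size % record_size
  # -> 524268

  # round up to the next multiple of record_size
  buffer_up = desired_buffer_size - desired_buffer_size % -record_size
  # -> 524304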


>     for stop_pos in range(record_size, len(_data) + record_size, record_size):
>         result.append(unpack(fmt, _data[start_pos:stop_pos]))
>         start_pos = stop_pos

I'm not sure what consumes the time here; you may try leaving out the
result.append call. If that significantly affects time consumption,
you can preallocate the result with, say,

  result = [0]*(buffer_size/record_size)
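
Something like this untested sketch (unpack_buffer is just a name I
made up; it assumes data holds one buffer's worth of bytes already
read from the file):

  from struct import calcsize, unpack

  fmt = '4Iff3I'
  record_size = calcsize(fmt)

  def unpack_buffer(data):
      n = len(data) // record_size
      result = [None] * n                 # preallocate instead of append
      pos = 0
      for i in range(n):
          result[i] = unpack(fmt, data[pos:pos + record_size])
          pos = pos + record_size
      return result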

Regards,
Martin


