Writing some floats in a file in an efficient way

bartc bc at freeuk.com
Wed Feb 21 12:23:42 EST 2018


On 21/02/2018 15:54, ast wrote:
> Le 21/02/2018 à 15:02, bartc a écrit :
>> On 21/02/2018 13:27, ast wrote:
> 
>>
>> Time efficient or space efficient?
> 
> space efficient
> 
>> If the latter, how many floats are we talking about?
> 
> 10^9

OK. My experiment of writing the same 64-bit float a billion times to a 
file took 3 minutes, 8 bytes at a time. Doing it in blocks of 64KB, it 
still took 100 seconds (not using Python, and on Windows to a spinning 
hard drive, which is supposed to have slow file operations).

So perhaps both time and space requirements need looking at.

Was it just the extra overhead on top of the 8 billion bytes needed for 
those floats that you were worried about?

If not, then you might look at some mild compression, but only if the 
data lends itself to it. For example, if there are lots of zeros, lots 
of integer values among the floats, or lots of consecutive repeated values.

(And if precision is not so critical, you might also try just truncating 
a few bytes at the lower end. A bit crude, but it could be effective! 
You replace them with zeros when reading back in. However watch out for 
special floating point values, those involving underflow etc.

Although it might be better to convert to proper 32-bit float format in 
this case. This will halve space and probably time requirements.)

-- 
bartc



More information about the Python-list mailing list