Fast forward-backward (write-read)

Oscar Benjamin oscar.j.benjamin at gmail.com
Sun Oct 28 14:21:45 EDT 2012


On 28 October 2012 14:20, Virgil Stokes <vs at it.uu.se> wrote:
> On 28-Oct-2012 12:18, Dave Angel wrote:
>>
>> On 10/24/2012 03:14 AM, Virgil Stokes wrote:
>>>
>>> On 24-Oct-2012 01:46, Paul Rubin wrote:
>>>>
>>>> Virgil Stokes <vs at it.uu.se> writes:
>>>>>
>>>>> Yes, I do wish to inverse the order,  but the "forward in time" file
>>>>> will be in binary.
>>>>
>>>> I really think it will be simplest to just write the file in forward
>>>> order, then use mmap to read it one record at a time.  It might be
>>>> possible to squeeze out a little more performance with reordering tricks
>>>> but that's the first thing to try.
>>>
>>> Thanks Paul,
>>> I am working on this approach now...
>>
>> If you're using mmap to map the whole file, you'll need 64bit Windows to
>> start with.  I'd be interested to know if Windows will allow you to mmap
>> 100gb at one stroke.  Have you tried it, or are you starting by figuring
>> how to access the data from the mmap?
>
> Thanks very much for pursuing my query, Dave.
>
> I have not tried it yet --- temporarily side-tracked; but, I will post my
> findings on this issue.

If you are going to use mmap then look at numpy.memmap. It wraps
Python's mmap module so that you can access the contents of the
mapped binary file as if it were a numpy array. This means that you
don't need to handle the bytes -> float conversions yourself.

>>> import numpy
>>> a = numpy.array([4,5,6], numpy.float64)
>>> a
array([ 4.,  5.,  6.])
>>> with open('tmp.bin', 'wb') as f:  # write forwards
...   a.tofile(f)
...   a.tofile(f)
...
>>> a2 = numpy.memmap('tmp.bin', numpy.float64)
>>> a2
memmap([ 4.,  5.,  6.,  4.,  5.,  6.])
>>> a2[3]
4.0
>>> a2[5:2:-1] # read backwards
memmap([ 6.,  5.,  4.])
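
The same idea can be extended to fixed-size records. What follows is
only a rough sketch, assuming each record is a fixed number of float64
values; RECORD_LEN, write_forward and read_backward are names made up
for this illustration, not part of numpy.

```python
import numpy as np

RECORD_LEN = 3  # assumed: number of float64 values per record


def write_forward(path, records):
    """Write each record to the file in forward (time) order."""
    with open(path, 'wb') as f:
        for rec in records:
            np.asarray(rec, dtype=np.float64).tofile(f)


def read_backward(path, record_len=RECORD_LEN):
    """Yield records from last to first, one at a time."""
    # mode='r' maps the file read-only; nothing is read until it is indexed.
    data = np.memmap(path, dtype=np.float64, mode='r')
    # One row per record. This fails if the file does not hold a whole
    # number of records.
    table = data.reshape(-1, record_len)
    for row in table[::-1]:
        # Copy the row into an ordinary array so it no longer depends
        # on the mapping.
        yield np.array(row)
```

If mapping the whole file at once turns out not to be possible,
numpy.memmap also takes offset and shape arguments, so it should be
possible to map one window of the file at a time instead.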


Oscar


