[Numpy-discussion] Not enough storage for memmap on 32 bit WinXP for accumulated file size above approx. 1 GB

David Cournapeau david at ar.media.kyoto-u.ac.jp
Fri Jul 24 06:52:51 EDT 2009


Kim Hansen wrote:
>>> I tried adding the /3GB switch to boot.ini as you suggested:
>>> multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Microsoft Windows XP
>>> Professional" /noexecute=optin /fastdetect /3GB
>>> and rebooted the system.
>>>
>>> Unfortunately that did not change anything for me. I still hit a hard
>>> deck around 1.9 GB. Strange.
>>>
>>>       
>> The 3Gb thing only works for application specifically compiled for it:
>>
>> http://blogs.msdn.com/oldnewthing/archive/2004/08/12/213468.aspx
>>
>> I somewhat doubt python is built with this, but you could check this in
>> python sources to be sure,
>>
>> cheers,
>>
>> David
>>     
> Ahh, that explains it. Thank you for that enlightening link. Anyway,
> would it not be worth mentioning in the memmap documentation that
> there is this 32-bit limitation, or is it so straightforwardly obvious
> (it was not for me) that this is the case?
>   

Well, the question has popped up a few times already, so I guess it is
not so obvious :) A 32-bit architecture fundamentally means that a
pointer is 32 bits wide, so you can only address 2^32 different memory
locations. The 2 GB limit instead of 4 GB is a consequence of how the
Windows and Linux kernels work (each reserves part of the 4 GB address
space for the kernel itself). You can mmap a file which is bigger than
4 GB (just as you can, at least in theory, allocate more than 4 GB on a
32-bit system), but you cannot 'see' more than 4 GB at the same time
because the pointer is too small.

Raymond Chen gives an example of this on Windows:

http://blogs.msdn.com/oldnewthing/archive/2004/08/10/211890.aspx

I don't know if it is possible to do so in python, though.
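It does in fact seem possible in Python: numpy.memmap takes an offset
argument (in bytes), so you can map one window of a large file at a
time instead of the whole thing. A minimal sketch (using a small 1 MB
temporary file as a stand-in for a genuinely huge one):

```python
import os
import tempfile

import numpy as np

# Create a 1 MB scratch file standing in for a much larger one.
with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name
    f.write(b"\x00" * (1 << 20))

# Map only a 1 KB window starting half-way into the file. Successive
# windows of an arbitrarily large file can be mapped and released one
# at a time, so the process never needs more address space than the
# window itself.
window = np.memmap(path, dtype=np.uint8, mode="r+",
                   offset=512 * 1024, shape=(1024,))
window[:] = 1            # modify the window in place
window.flush()
del window               # unmap before reusing the address space

# Verify the write landed at the right byte offset.
with open(path, "rb") as f:
    f.seek(512 * 1024)
    data = f.read(1024)
assert data == b"\x01" * 1024
os.remove(path)
```

On a 32-bit build you would still be limited by how much you can map in
one window, but walking a >4 GB file window-by-window this way should
work.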

> The reason it isn't obvious for me is because I can read and
> manipulate files >200 GB in Python with no problems (yes, I process
> files that large), so I thought why should it not be capable of
> handling quite large memmaps as well...
>   

Handling large files is no problem on 32-bit systems: it is just a
matter of API (and kernel/filesystem support). You move the file
position using a 64-bit integer, and so on. Handling more than 4 GB of
memory at the same time is much more difficult: to address more than
4 GB, you would need a segmented architecture in your memory handling
(with one address for a segment, and a second address for the location
within that segment).
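To make the distinction concrete, here is a small sketch showing that
plain file I/O only involves a 64-bit *offset*, never a 64-bit pointer:
a position past the 4 GB mark is seekable and writable even though no
pointer that large exists in the process. (The file is created sparse
on most Unix filesystems, so little disk space is actually used; that
behaviour is filesystem-dependent.)

```python
import os
import tempfile

OFFSET = 2**32 + 100     # a file position that no 32-bit pointer can hold

with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name
    f.seek(OFFSET)       # seek() takes a 64-bit offset, even on 32 bits
    f.write(b"end")      # typically creates a sparse file with a hole

size = os.path.getsize(path)

# Read the bytes back from beyond the 4 GB boundary; at no point did
# the process need more than a few bytes of address space.
with open(path, "rb") as f:
    f.seek(OFFSET)
    tail = f.read(3)
os.remove(path)
```

That is why processing >200 GB files with ordinary reads and seeks is
fine on 32-bit Python, while memory-mapping them whole is not.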

cheers,

David



More information about the NumPy-Discussion mailing list