regex over files

Robin Becker robin at SPAMREMOVEjessikat.fsnet.co.uk
Tue Apr 26 04:55:58 EDT 2005


Richard Brodie wrote:
> "Robin Becker" <robin at reportlab.com> wrote in message
> news:mailman.2469.1114444689.1799.python-list at python.org...
> 
>>Gerald Klix wrote:
>>
>>>Map the file into RAM by using the mmap module.
>>>The file's contents are then available as a searchable string.
>>>
>>
>>That's a good idea, but I wonder if it actually saves on memory? I just tried
>>regexing through a 25MB file and ended up with a 40MB working set (it rose
>>linearly as the loop progressed through the file). Am I actually saving anything
>>by not letting normal VM do its thing?
> 
> 
> You aren't saving memory in that sense, no. If you have any RAM spare the
> file will end up in it. However, if you are short on memory, mmapping the
> file gives the VM the opportunity to discard pages from the file instead of paging
> them out. Try again with a 25GB file and watch the difference ;) YMMV.
> 
> 

:)
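
For reference, the mmap approach discussed above can be sketched roughly as follows. This is only an illustration, assuming Python 3 and a bytes pattern; the function name find_all_mmap is made up for the example. The mapping is opened read-only, so its pages stay clean and the VM can simply drop them under memory pressure.

```python
import mmap
import re


def find_all_mmap(path, pattern):
    """Return (start, end) offsets of every match of a bytes pattern.

    Hypothetical helper for illustration. Note that mmap refuses to map
    an empty file, so this raises ValueError for zero-length input.
    """
    regex = re.compile(pattern)
    with open(path, "rb") as f:
        # Length 0 means "map the whole file"; ACCESS_READ keeps the
        # mapping read-only, so no page is ever dirtied.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # The re module accepts any bytes-like object, including an mmap.
            return [match.span() for match in regex.finditer(m)]
```

The working set can still grow as the scan touches more pages, as observed above; the difference is only in what the VM has to do to reclaim them.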

So we avoid dirty page writes etc. However, I still think I could
get away with a small window into the file, which would be more efficient.
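
A minimal sketch of what such a window might look like, under some loudly stated assumptions: Python 3, a bytes pattern, no match longer than `overlap` bytes, and no empty matches. The name iter_matches_windowed and the default sizes are invented for the example. Patterns that depend on context outside the buffer (anchors, lookbehind, word boundaries at the buffer edge) may behave differently than on the whole file.

```python
import re


def iter_matches_windowed(path, pattern, window=1 << 20, overlap=4096):
    """Yield (absolute_offset, matched_bytes) using a bounded buffer.

    Hypothetical helper. Assumes every match is at most `overlap`
    bytes long, so memory use stays near window + overlap bytes.
    """
    regex = re.compile(pattern)
    with open(path, "rb") as f:
        buf = b""
        base = 0  # file offset of buf[0]
        while True:
            chunk = f.read(window)
            buf += chunk
            at_eof = not chunk
            # A match starting in the last `overlap` bytes could be cut
            # short by the buffer edge, so defer it until more data is read.
            limit = len(buf) if at_eof else max(len(buf) - overlap, 0)
            keep_from = limit
            for m in regex.finditer(buf):
                if m.start() >= limit:
                    break
                yield base + m.start(), m.group()
                # Never rescan bytes already consumed by a reported match.
                keep_from = max(keep_from, m.end())
            if at_eof:
                break
            base += keep_from
            buf = buf[keep_from:]
```

Usage would be along the lines of `for offset, text in iter_matches_windowed("big.log", rb"ID\d{4}"): ...`, where the file name and pattern are placeholders.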
-- 
Robin Becker


