[Python-ideas] Support parsing stream with `re`

Mon Oct 8 16:30:39 EDT 2018

On 08Oct2018 13:36, Ram Rachum <ram at rachum.com> wrote:
>I'm not an expert on memory. I used Process Explorer to look at the
>Process. The Working Set of the current run is 11GB. The Private Bytes is
>708MB. Actually, see all the info here:
>https://www.dropbox.com/s/tzoud028pzdkfi7/screenshot_TURING_2018-10-08_133355.jpg?dl=0

And the process' virtual size is about 353GB, which matches having your file 
mmapped (its contents are now part of your process' virtual memory space).
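
For concreteness, here is a minimal sketch of the kind of mmap-based scan 
under discussion. Your actual code isn't shown in this message, so the 
function name count_matches_mmap and its parameters are hypothetical. The 
point is only that the regex engine sees the whole file as one buffer in the 
process' address space:

```python
import mmap
import re


def count_matches_mmap(path, pattern):
    """Count regex matches over a whole (non-empty) file via mmap.

    Hypothetical helper for illustration. `pattern` must be a bytes
    pattern, since an mmap object exposes bytes.
    """
    with open(path, "rb") as f:
        # Length 0 maps the entire file; this fails on an empty file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # The mapping adds the file's size to the process' virtual
            # size; pages only become resident as the scan touches them.
            return sum(1 for _ in re.finditer(pattern, mm))
```

The virtual size grows as soon as the mapping is created, while the resident 
(physical) footprint grows only as the scan walks through the pages.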

>I've got 16GB of RAM on this computer, and Process Explorer says it's
>almost full, just ~150MB left. This is physical memory.

I'd say this is expected behaviour. As you access the memory it is paged into 
physical memory, and since it may be wanted again (the OS can't tell) it isn't 
paged out until that becomes necessary to make room for other virtual pages.

I suspect (but would need to test to find out) that sequentially reading the 
file instead of memory mapping it might not consume physical memory so 
aggressively, because your process would be reusing the same small pool of 
memory to hold data as you scan the file.
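
An untested-at-scale sketch of what I mean by sequential reading, assuming 
every match is non-empty and no longer than max_match_len bytes, and that 
the pattern doesn't depend on anchors or lookaround at buffer edges. The name 
iter_matches_chunked and its parameters are made up for illustration; `re` 
itself has no streaming interface, so the buffering is done by hand:

```python
import re


def iter_matches_chunked(path, pattern, chunk_size=1 << 20, max_match_len=1024):
    """Yield (file_offset, matched_bytes) while holding only a small buffer.

    Hypothetical helper. Assumes matches are non-empty and at most
    max_match_len bytes long; longer matches may be missed or truncated.
    """
    regex = re.compile(pattern)
    with open(path, "rb") as f:
        buf = b""
        offset = 0  # file offset of buf[0]
        while True:
            chunk = f.read(chunk_size)
            at_eof = not chunk
            buf += chunk
            # A match starting before `limit` cannot run past the end of
            # the buffer, so it is safe to report now.
            limit = len(buf) if at_eof else max(0, len(buf) - max_match_len)
            pos = 0
            for m in regex.finditer(buf):
                if m.start() >= limit:
                    break  # might be incomplete; retry with more data
                yield offset + m.start(), m.group()
                pos = m.end()
            if at_eof:
                break
            # Drop everything already scanned, keeping the unsafe tail.
            keep = max(pos, limit)
            buf = buf[keep:]
            offset += keep
```

The buffer never grows much beyond chunk_size + max_match_len bytes, so the 
process' own memory stays small. The OS may still cache the file's pages, but 
those are easier for it to discard than pages mapped into your process.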

Cheers,
Cameron Simpson <cs at cskk.id.au>