regex over files
Skip Montanaro
skip at pobox.com
Tue Apr 26 15:53:49 EDT 2005
>> It's hard to imagine how sliding a small window onto a file within Python
>> would be more efficient than the operating system's paging system. ;-)
Robin> well it might be if I only want to scan forward through the file
Robin> (think lexical analysis). Most lexical analyzers use a buffer and
Robin> produce a stream of tokens, i.e. a compressed version of the
Robin> input. There are problems crossing buffers etc, but we never
Robin> normally need the whole file in memory.
If I mmap() a file, it's not slurped into main memory immediately, though as
you pointed out, it's charged to my process's virtual memory. As I access
bits of the file's contents, the operating system pages in only what's
necessary. If I mmap() a huge file, then print out a few bytes from the
middle, only the page containing those bytes (plus whatever the kernel
chooses to read ahead) is actually copied into physical memory.
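
For what it's worth, here is a minimal sketch of running a regex straight
over a memory-mapped file, assuming a current Python 3 where re accepts any
bytes-like object and mmap objects work as context managers. The name
find_matches and the sample data are made up for illustration; they are not
from anything earlier in this thread.

```python
import mmap
import os
import re
import tempfile


def find_matches(path, pattern):
    """Return (offset, matched_bytes) pairs for a bytes regex over a file.

    Hypothetical helper for illustration. The file is mapped read-only, so
    the operating system pages data in as the regex engine touches it
    instead of the whole file being read up front.
    Note: mmap cannot map an empty file; that case raises ValueError.
    """
    regex = re.compile(pattern)
    with open(path, "rb") as f:
        # Length 0 means "map the whole file".
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Copy out offsets and matched bytes while the map is still
            # open; match objects refer back to the mmap itself.
            return [(hit.start(), hit.group()) for hit in regex.finditer(m)]


if __name__ == "__main__":
    # Build a throwaway file with one interesting token in the middle.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 1_000_000 + b"token42" + b"y" * 1_000_000)
    try:
        print(find_matches(tmp.name, rb"token\d+"))
    finally:
        os.unlink(tmp.name)
```

The pattern has to be a bytes pattern, since the mapped file is bytes rather
than str. A scan like this still touches every page up to the last match, so
the paging savings show up mainly when only part of the file gets examined.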
Skip