Regular expressions in files

Andrew M. Kuchling akuchlin at mems-exchange.org
Fri May 19 13:35:59 EDT 2000


Simon Langley <Simon.Langley at uwe.ac.uk> writes:
> Can regular expression matching be done on files?  I'd like to read just
> as much of a file as is necessary to either be a (greedy) match of the
> re or until a match definitely can't be found.

In this situation, I'd try using the mmapfile module to map the
entire file into memory, and then run the match against the mapping.
The regex engine might still wind up scanning the entire file, but
pages should only be brought into memory when they're actually touched,
depending on your platform's memory paging algorithms.
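
As a rough sketch of the idea, and not the exact mmapfile interface:
the code below assumes a current Python where the standard library's
mmap module stands in for mmapfile, and where the re module accepts
any bytes-like buffer. The helper name search_file is made up for
illustration. The pattern has to be a bytes pattern, since the mapping
exposes raw bytes.

```python
import mmap
import os
import re
import tempfile


def search_file(path, pattern):
    """Search a memory-mapped file for a bytes regex.

    Returns (offset, matched_bytes) for the first match, or None.
    search_file is a hypothetical helper, not part of any library.
    """
    with open(path, "rb") as f:
        # An empty file cannot be mapped, so treat it as "no match".
        if os.fstat(f.fileno()).st_size == 0:
            return None
        # Length 0 means "map the whole file"; the OS decides which
        # pages actually get read as the regex engine touches them.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            m = re.search(pattern, mm)
            if m is None:
                return None
            # m.group() copies the matched bytes out of the mapping,
            # so the result stays valid after the mapping is closed.
            return m.start(), m.group()


if __name__ == "__main__":
    # Small demonstration on a throwaway file.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"header\nkey=value42\ntrailer\n")
        name = tmp.name
    try:
        print(search_file(name, rb"key=(\w+)"))
    finally:
        os.unlink(name)
```

Use re.match instead of re.search if the match has to start at the
beginning of the file. Either way, a pattern that fails early should
touch only the first few pages, while one that can't be ruled out
until the end will still walk the whole mapping.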

I believe the mmapfile module is included with Pythonwin; it's also in
the Python 1.6 CVS tree (but the module should work with 1.5.2), and
an older version can be downloaded from
http://starship.python.net/crew/amk/python/code/mmap.html .

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
Little one, I would like to see anyone -- prophet, king or God -- persuade a
thousand cats to do anything at the same time.
  -- The cynical cat, in SANDMAN #18: "A Dream of a Thousand Cats"


