Scanning a file

Andrew McCarthy a_mccarthy at hotmail.com
Fri Oct 28 09:59:40 EDT 2005


On 2005-10-28, pinkfloydhomer at gmail.com <pinkfloydhomer at gmail.com> wrote:
> I'm now down to:
>
> f = open("filename", "rb")
> s = f.read()
> sub = "\x00\x00\x01\x00"
> count = s.count(sub)
> print count
>
> Which is quite fast. The only problem is that the file might be huge.
> I really have no need for reading the entire file into a string as I am
> doing here. All I want is to count occurrences of this substring. Can I
> somehow count occurrences in a file without reading it into a string
> first?

Yes - use memory mapping (the mmap module). An mmap object behaves like
a cross between a file and a string, but the operating system pages the
data into RAM only when, and for as long as, it is needed. An mmap
object doesn't have a count() method, but you can call find() in a while
loop instead, restarting each search just past the previous match.
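
A rough sketch of that find() loop, written for Python 3 (so the
substring is a bytes object rather than a str). The function name
count_in_file and its arguments are made up for illustration, and the
loop skips past each match so that it counts non-overlapping
occurrences, the same way s.count(sub) does:

```python
import mmap
import os


def count_in_file(path, sub):
    """Count non-overlapping occurrences of the bytes `sub` in a file."""
    if not sub:
        raise ValueError("sub must be a non-empty bytes object")
    # mmap refuses to map an empty file, so handle that case up front.
    if os.path.getsize(path) == 0:
        return 0
    count = 0
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ keeps it read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            pos = m.find(sub)
            while pos != -1:
                count += 1
                # Resume just past this match so matches never overlap.
                pos = m.find(sub, pos + len(sub))
    return count
```

Calling count_in_file("filename", b"\x00\x00\x01\x00") should then give
the same number as the read()-and-count() version, without holding the
whole file in a string. On a 32-bit system a very large file may still
not fit in the address space, in which case reading in chunks would be
the fallback.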

Andrew


