streaming a file object through re.finditer

Erick idadesub at gmail.com
Wed Feb 2 21:02:25 EST 2005


Hello,

I've been looking for an answer for a while, but I haven't been able
to turn anything up. Basically, I'd like to use re.finditer to search
a large file (or a file stream), but I haven't figured out how to get
finditer to work without either loading the entire file into memory or
reading one line at a time (or doing some more complicated buffering).

For example, say I do this:
cat a b c > blah

Then run this in the Python interpreter:
>>> import re
>>> for m in re.finditer(r'\w+', buffer(file('blah'))):
...   print m.group()
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: buffer object expected

Of course, this works fine, but it loads the file completely into
memory (right?):
>>> for m in re.finditer(r'\w+', buffer(file('blah').read())):
...   print m.group()
...
a
b
c

So, is there any way to run finditer over a file without reading the
whole thing into memory first?

Thanks,

-e
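
One possible approach, offered as a sketch and not as a confirmed
answer from this thread: memory-map the file with the standard mmap
module and hand the map to re.finditer. The operating system then pages
the file in on demand instead of the script reading it all up front.
The example assumes Python 3 syntax (so the pattern is a bytes pattern
and matches come back as bytes) and a non-empty file, since mmap cannot
map an empty one. The helper name iter_matches is hypothetical.

```python
import mmap
import re


def iter_matches(path, pattern=rb"\w+"):
    """Yield each match of `pattern` in the file at `path`, as bytes.

    `iter_matches` is a hypothetical helper name used for illustration.
    Assumes the file is non-empty; mmap raises ValueError otherwise.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file; pages are loaded lazily by the OS.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # re accepts bytes-like objects, so the map can be searched
            # directly without calling read() on the file.
            for m in re.finditer(pattern, mm):
                yield m.group()


if __name__ == "__main__":
    # 'blah' is the file created in the example above.
    for word in iter_matches("blah"):
        print(word.decode())
```

This only helps for real files on disk. A pipe or socket cannot be
memory-mapped, so a true stream would still need some form of chunked
buffering.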



