buffering choking sys.stdin.readlines() ?
Diez B. Roggisch
deets at nospam.web.de
Mon May 12 11:29:51 EDT 2008
cshirky wrote:
> Newbie question:
>
> I'm trying to turn a large XML file (~7G compressed) into a YAML file,
> and my program seems to be buffering the input.
>
> IOtest.py is just
>
>     import sys
>     for line in sys.stdin.readlines():
>         print line
>
> but when I run
>
> $ gzcat bigXMLfile.gz | IOtest.py
>
> it hangs, then dies.
>
> The goal of the program is to build a YAML file with print statements,
> rather than building a gigantic nested dictionary, but I am obviously
> doing something wrong in passing input through without buffering. Any
> advice gratefully fielded.
readlines() reads the whole file into memory before the loop even
starts, which is a problem for input of this size. Try using
xreadlines(), the lazy version, instead. And I'm not 100% sure, but I
*think* doing

    for line in sys.stdin:
        ...

does exactly that.
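
To make the suggestion concrete, here is a minimal sketch of a line-by-line pass-through. The helper name copy_lines is made up for illustration and is not part of the original IOtest.py. It iterates over the file object directly instead of calling readlines(), so only a small amount of input should be held in memory at a time. It also uses write() rather than print, because each line already ends with a newline and print would add a second one.

```python
import sys


def copy_lines(src, dst):
    """Copy src to dst one line at a time and return the number of lines.

    copy_lines is a hypothetical helper name used only for this sketch.
    """
    count = 0
    for line in src:     # iterating a file object yields one line at a time
        dst.write(line)  # the line keeps its trailing newline, so nothing is added
        count += 1
    return count
```

Ending the script with copy_lines(sys.stdin, sys.stdout) would give it the same role as IOtest.py in the gzcat pipeline; the YAML conversion would then replace the plain write() call.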
Diez