[issue18003] New lzma crazy slow with line-oriented reading.

Nadeem Vawda report at bugs.python.org
Sun May 19 19:52:20 CEST 2013


Nadeem Vawda added the comment:

I agree that making lzma.open() wrap its return value in a BufferedReader
(or BufferedWriter, as appropriate) is the way to go. I'm currently
travelling and don't have my SSH key with me - Serhiy, can you make the
change?

I'll put together a documentation patch that recommends using lzma.open()
rather than LZMAFile directly, and mentions the performance implications.
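
For illustration, here is a minimal sketch of the wrapping discussed above.
It is not the actual patch: open_buffered() is a hypothetical helper name,
and the in-memory sample data only keeps the example self-contained.

```python
import io
import lzma


def open_buffered(fileobj):
    # Hypothetical helper, not part of the lzma module: wrap an LZMAFile
    # in a BufferedReader so that line reads are served from a buffer.
    return io.BufferedReader(lzma.LZMAFile(fileobj, "rb"))


# Build a small compressed sample in memory (made-up data for the sketch).
payload = "".join("line %d\n" % i for i in range(1000)).encode("ascii")
compressed = io.BytesIO(lzma.compress(payload))

with open_buffered(compressed) as reader:
    # Iterating a BufferedReader yields one bytes line at a time.
    lines = list(reader)
```

Closing the BufferedReader also closes the LZMAFile it wraps, so a single
"with" block is enough here.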


> Interestingly, opening in text (i.e. unicode) mode is almost as fast as with a BufferedReader:

This is because opening in text mode returns a TextIOWrapper, which is
written in C, and presumably does its own buffering on top of
LZMAFile.read1() instead of calling LZMAFile.readline().
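
A small sketch to check the type returned in text mode. The sample data is
made up, and the example says nothing about how TextIOWrapper buffers
internally; it only shows which object the caller gets back.

```python
import io
import lzma

# Made-up sample data, compressed in memory to keep the sketch self-contained.
compressed = io.BytesIO(lzma.compress("alpha\nbeta\ngamma\n".encode("utf-8")))

with lzma.open(compressed, "rt", encoding="utf-8") as text_file:
    # In text mode the object handed back is a TextIOWrapper.
    wrapper_type = type(text_file)
    lines = [line.rstrip("\n") for line in text_file]
```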


> From my perspective default wrapping with io.BufferedReader is a great
> idea. I can't think of who would suffer. Maybe someone who wants to
> open thousands of simultaneous streams wouldn't appreciate the memory
> overhead. If that person exists then he would want an option to turn
> it off.

If someone doesn't want the BufferedReader/BufferedWriter, they can
create an LZMAFile directly; we don't plan to remove that possibility. So
I don't think that should be a problem.
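
As a rough sketch of that opt-out, assuming the caller reads in fixed-size
chunks and does not need line-oriented access (the 4096-byte chunk size and
the sample data are arbitrary choices for the example):

```python
import io
import lzma

# Made-up sample data, compressed in memory.
payload = b"x" * 10000
compressed = lzma.compress(payload)

total = 0
# Use LZMAFile directly, with no BufferedReader layered on top.
with lzma.LZMAFile(io.BytesIO(compressed)) as raw:
    while True:
        chunk = raw.read(4096)
        if not chunk:  # an empty bytes object signals end of stream
            break
        total += len(chunk)
```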

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue18003>
_______________________________________

