Multithreaded compression/decompression library with python bindings?

Stephan Houben stephanh42 at gmail.com.invalid
Thu Oct 5 14:38:18 EDT 2017


On 2017-10-05, Thomas Nyberg <tomuxiong at gmx.com> wrote:
> Btw if anyone knows a better way to handle this sort of thing, I'm all
> ears. Given my current implementation I could use any compression that
> works with stdin/stdout as long as I could sort out the waiting on the
> subprocess. In fact, bzip2 is probably more than I need...I've half used
> it out of habit rather than anything else.

lzma ("xv" format) compression is generally both better and faster than
bzip2. So that gives you already some advantage.

Moreover, the Python lzma docs say:

"When opening a file for reading, the input file may be the concatenation
of multiple separate compressed streams. These are transparently decoded
as a single logical stream."
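
A minimal sketch of that behaviour (file name is just an example, not
from the original post): two independently compressed streams written
back to back come out as one logical stream when read:

    import lzma

    # Write two separate .xz streams into the same file.
    with open("combined.xz", "wb") as f:
        f.write(lzma.compress(b"first block\n"))
        f.write(lzma.compress(b"second block\n"))

    # lzma.open() transparently decodes the concatenated streams.
    with lzma.open("combined.xz", "rb") as f:
        print(f.read())   # b'first block\nsecond block\n'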

This seems to open up the possibility of simply dividing your input
into, say, 100 MB blocks, compressing each of them in a separate
thread/process, and then concatenating the results.
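
A rough sketch of that approach, assuming a ProcessPoolExecutor; the
helper names and file names below are illustrative, not something the
Python docs define:

    import lzma
    from concurrent.futures import ProcessPoolExecutor

    BLOCK_SIZE = 100 * 1024 * 1024  # 100 MB blocks, as suggested above

    def read_blocks(path, block_size=BLOCK_SIZE):
        """Yield successive chunks of the input file."""
        with open(path, "rb") as f:
            while True:
                block = f.read(block_size)
                if not block:
                    return
                yield block

    def compress_in_blocks(src, dst):
        """Compress each block in a separate process, then concatenate."""
        with ProcessPoolExecutor() as pool, open(dst, "wb") as out:
            # map() preserves input order, so the concatenated streams
            # line up with the original data.  Note that it submits all
            # blocks up front, so this simplified version keeps the whole
            # file in memory.
            for compressed in pool.map(lzma.compress, read_blocks(src)):
                out.write(compressed)

    # compress_in_blocks("input.dat", "output.xz")

The resulting output.xz can then be decompressed with the stock xz tool
or with lzma.open(), as in the snippet above.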

Stephan


