bz2 module doesn't work properly with all bz2 files

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Fri Jun 4 22:06:18 EDT 2010


On Fri, 04 Jun 2010 12:53:26 -0700, Magdoll wrote:

> I'm not sure what's causing this, but depending on the compression
> program used, the bz2 module sometimes exits earlier.
[...]

The current bz2 module only supports single-stream files; it cannot read
files made up of several concatenated streams. This is why the BZ2File
class has no "append" mode. See this bug report:

http://bugs.python.org/issue1625

Here's an example:

>>> bz2.BZ2File('a.bz2', 'w').write('this is the first chunk of text')
>>> bz2.BZ2File('b.bz2', 'w').write('this is the second chunk of text')
>>> bz2.BZ2File('c.bz2', 'w').write('this is the third chunk of text')
>>> # concatenate the files
... d = file('concate.bz2', 'wb')
>>> for name in "abc":
...     f = file('%c.bz2' % name, 'rb')
...     d.write(f.read())
...
>>> d.close()
>>>
>>> bz2.BZ2File('concate.bz2', 'r').read()
'this is the first chunk of text'

Sure enough, BZ2File only sees the first chunk of text, but if I open it 
in (e.g.) KDE's Ark application, I see all the text.

So this is a known bug, sorry.
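
Until that is fixed, one possible workaround is to do the decompression
yourself with bz2.BZ2Decompressor, starting a fresh decompressor each time
one stream ends. This is only a rough sketch: the function name
decompress_multistream is made up for illustration, and it assumes the
whole file fits in memory and contains nothing but valid bz2 streams
(trailing junk will raise an error).

```python
import bz2

def decompress_multistream(data):
    """Decompress bytes that may hold several concatenated bz2 streams.

    Hypothetical helper, not part of the bz2 module.
    """
    chunks = []
    while data:
        # Each BZ2Decompressor handles exactly one stream.
        decomp = bz2.BZ2Decompressor()
        chunks.append(decomp.decompress(data))
        # Whatever follows the end of this stream is kept in unused_data;
        # feed it to a new decompressor on the next pass.
        data = decomp.unused_data
    return b''.join(chunks)
```

With the file from the example above, something like
decompress_multistream(open('concate.bz2', 'rb').read()) should return all
three chunks joined together rather than just the first.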


-- 
Steven


