[Tutor] Reading large bz2 Files
Lie Ryan
lie.1296 at gmail.com
Fri Feb 19 22:14:31 CET 2010
On 02/20/10 07:42, Lie Ryan wrote:
> On 02/19/10 23:42, Norman Rieß wrote:
>> Hello,
>>
>> I am trying to read a large bz2 file with this code:
>>
>> source_file = bz2.BZ2File(file, "r")
>> for line in source_file:
>>     print line.strip()
>>
>> But after 4311 lines, it stops without an error message. The bz2 file is
>> much bigger, though.
>> How can I read the whole file line by line?
>
> Is the bz2 file an archive[1]?
>
> [1] archive: contains more than one file
Or, more clearly: does the bz2 file contain multiple files compressed using the -c
flag? The -c flag will do a simple concatenation of multiple compressed
streams to stdout; it is only decompressible with bzip2 0.9.0 or later[1].
You cannot use bz2.BZ2File to open this; instead, use the stream
decompressor bz2.BZ2Decompressor.
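
A rough sketch of what reading concatenated streams with bz2.BZ2Decompressor
could look like. It assumes Python 3.3 or later, where the decompressor has
the `eof` and `unused_data` attributes (on those versions bz2.BZ2File itself
also accepts multi-stream input). The function name and the chunk size are
made up for illustration, and lines are yielded as bytes without the trailing
newline.

```python
import bz2


def iter_multistream_lines(path, chunk_size=64 * 1024):
    """Yield lines (as bytes) from a bz2 file holding one or more streams."""
    decomp = bz2.BZ2Decompressor()
    pending = b""  # decompressed bytes that do not yet end in a newline
    with open(path, "rb") as raw:
        while True:
            chunk = raw.read(chunk_size)
            if not chunk:
                break
            while chunk:
                pending += decomp.decompress(chunk)
                if decomp.eof:
                    # This stream is finished; whatever is left over belongs
                    # to the next stream, so start a fresh decompressor.
                    chunk = decomp.unused_data
                    decomp = bz2.BZ2Decompressor()
                else:
                    chunk = b""
                # Hand out every complete line, keep the unfinished tail.
                *lines, pending = pending.split(b"\n")
                for line in lines:
                    yield line
    if pending:
        yield pending
```

Usage would be along the lines of
`for line in iter_multistream_lines("big.bz2"): print(line.decode())`, where
"big.bz2" is a placeholder file name. Trailing bytes that are not a valid bz2
stream will make the decompressor raise an error; this sketch does not try to
handle that.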
A better approach is to use a real archiving format (e.g. tar).
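
If the files were packed as a bz2-compressed tar archive, the standard
tarfile module can walk the members for you. A minimal sketch, assuming a
.tar.bz2 archive of plain text files; the function name is hypothetical:

```python
import tarfile


def iter_tar_bz2_lines(path):
    """Yield (member_name, line) for each regular file in a .tar.bz2 archive."""
    with tarfile.open(path, "r:bz2") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, etc.
            fileobj = archive.extractfile(member)
            for line in fileobj:
                # Lines come back as bytes; drop the trailing newline.
                yield member.name, line.rstrip(b"\n")
```

This keeps the file names and boundaries, which a plain concatenation of bz2
streams loses.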
[1] http://www.bzip.org/1.0.3/html/description.html