[Tutor] Reading large bz2 Files
Steven D'Aprano
steve at pearwood.info
Fri Feb 19 17:04:31 CET 2010
On Fri, 19 Feb 2010 11:42:07 pm Norman Rieß wrote:
> Hello,
>
> i am trying to read a large bz2 file with this code:
>
> source_file = bz2.BZ2File(file, "r")
> for line in source_file:
>     print line.strip()
>
> But after 4311 lines, it stops without an error message. The bz2 file
> is much bigger though.
>
> How can I read the whole file line by line?
"for line in file" works for me:
>>> import bz2
>>>
>>> writer = bz2.BZ2File('file.bz2', 'w')
>>> for i in xrange(20000):
...     # write some variable text to a line
...     writer.write('abc'*(i % 5) + '\n')
...
>>> writer.close()
>>> reader = bz2.BZ2File('file.bz2', 'r')
>>> i = 0
>>> for line in reader:
...     i += 1
...
>>> reader.close()
>>> i
20000
My guess is one of two things:
(1) You are mistaken that the file is bigger than 4311 lines.
(2) You are using Windows, and somehow there is a Ctrl-Z (0x1A)
character in the file, which Windows interprets as End Of File when
reading files in text mode. Try changing the mode to "rb" and see if
the behaviour goes away.
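If you want to test both guesses at once, something like the following rough sketch might help. It reads the decompressed data in binary chunks, counts the newlines, and reports the offset of the first Ctrl-Z byte if there is one. The function name inspect_bz2 and the chunk size are my own choices for illustration, not anything from the bz2 module, and I am assuming the file is an ordinary single-stream bz2 file.

```python
import bz2

CTRL_Z = b'\x1a'  # Ctrl-Z is byte 0x1A (decimal 26)

def inspect_bz2(path, chunk_size=1024 * 1024):
    """Return (newline_count, offset_of_first_ctrl_z_or_None).

    Hypothetical helper: reads the decompressed data in binary chunks
    so that no text-mode translation can cut the read short.
    """
    newlines = 0
    ctrl_z_at = None
    offset = 0  # number of decompressed bytes seen so far
    reader = bz2.BZ2File(path, 'rb')
    try:
        while True:
            chunk = reader.read(chunk_size)
            if not chunk:
                break  # an empty read means end of the decompressed data
            newlines += chunk.count(b'\n')
            if ctrl_z_at is None:
                pos = chunk.find(CTRL_Z)
                if pos != -1:
                    ctrl_z_at = offset + pos
            offset += len(chunk)
    finally:
        reader.close()
    return newlines, ctrl_z_at
```

If the newline count comes back as roughly 4311, the file really is that short. If it is much larger and the second value is not None, the Ctrl-Z theory becomes more plausible.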
--
Steven D'Aprano