[ python-Bugs-1597011 ] Reading with bz2.BZ2File() returns one garbage character

SourceForge.net noreply at sourceforge.net
Wed Nov 15 18:46:21 CET 2006


Bugs item #1597011, was opened at 2006-11-15 12:19
Message generated for change (Comment added) made by cpn
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1597011&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Extension Modules
Group: Python 2.4
Status: Open
Resolution: None
Priority: 7
Private: No
Submitted By: Clodoaldo Pinto Neto (cpn)
Assigned to: Nobody/Anonymous (nobody)
Summary: Reading with bz2.BZ2File() returns one garbage character

Initial Comment:
When comparing two files that should be identical, the last line
differs.

The first file is bzip2-compressed and is read with bz2.BZ2File().
The second file is the same file, uncompressed, and is read with open().

The first file named file.txt.bz2 is uncompressed with:

$ bunzip2 -k file.txt.bz2

To compare I use this script:
###############################
import bz2

f1 = bz2.BZ2File(r'file.txt.bz2', 'r')
f2 = open(r'file.txt', 'r')
lines = 0
while True:
   line1 = f1.readline()
   line2 = f2.readline()
   if line1 == '':
      break
   lines += 1
   if line1 != line2:
      print 'line number:', lines
      print repr(line1)
      print repr(line2)
f1.close()
f2.close()
##############################

Output:

$ python bzp.py
line number: 588317
'\x07'
'' 

The offending file is 5.5 MB. Sorry, I could not reproduce this problem
with a smaller file.

Tested in Fedora Core 5 and Python 2.4.3
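
For readers on a current interpreter, the comparison in the script above
can be sketched roughly as follows. This is only an illustration, not the
reporter's script: it assumes Python 3.3 or later (for bz2.open), and the
function name first_mismatch is made up for this example. It reads both
files in binary mode and also reports the case where one file yields an
extra line after the other has ended.

```python
import bz2
from itertools import zip_longest


def first_mismatch(compressed_path, plain_path):
    """Return (line_number, compressed_line, plain_line) for the first
    differing line, or None if both files yield identical lines.

    Hypothetical helper for illustration; not part of the bz2 module.
    """
    with bz2.open(compressed_path, "rb") as f1, open(plain_path, "rb") as f2:
        # zip_longest pads the shorter file with b"", so an extra trailing
        # line in either file shows up as a mismatch against b"".
        pairs = zip_longest(f1, f2, fillvalue=b"")
        for number, (line1, line2) in enumerate(pairs, start=1):
            if line1 != line2:
                return number, line1, line2
    return None
```

Calling first_mismatch('file.txt.bz2', 'file.txt') returns None when the
two files agree line for line.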

----------------------------------------------------------------------

>Comment By: Clodoaldo Pinto Neto (cpn)
Date: 2006-11-15 15:46

Message:
Logged In: YES 
user_id=1646083
Originator: YES

I received this file already compressed. I don't know which compressor
was used.
There is no error if I test the compressed file with:

$ bzip2 -t file.txt.bz2
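
A rough in-Python analogue of that integrity check can be sketched with
bz2.BZ2Decompressor. This is an assumption-laden illustration, not a
diagnosis of this bug and not a description of what bzip2 -t does
internally: it assumes Python 3.3 or later (for the eof attribute), and
inspect_stream is a hypothetical name. It decompresses the first stream in
the data and reports whether the end-of-stream marker was reached and
whether any bytes follow it.

```python
import bz2


def inspect_stream(raw):
    """Decompress the first bzip2 stream found in `raw` (bytes).

    Returns (decompressed, reached_eof, trailing_bytes).
    Hypothetical helper for illustration; invalid data raises OSError.
    """
    decomp = bz2.BZ2Decompressor()
    data = decomp.decompress(raw)
    # eof is True once the end-of-stream marker has been seen;
    # unused_data holds whatever bytes came after that marker.
    return data, decomp.eof, decomp.unused_data
```

For a file, the bytes can be obtained with open(path, 'rb').read(); a
5.5 MB file fits comfortably in memory.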

----------------------------------------------------------------------

Comment By: Georg Brandl (gbrandl)
Date: 2006-11-15 15:30

Message:
Logged In: YES 
user_id=849994
Originator: NO

With your file, I can reproduce that on Linux, Python 2.5.

Which compressor did you compress your file with?
I unpacked it with bunzip2 without problems, then recompressed it with
bzip2, which resulted in a slightly smaller (51 bytes) file, which then
didn't trigger the bug.
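
The recompression experiment described above can be approximated from
Python as a minimal sketch. The assumptions are labelled: it uses the bz2
module's one-shot functions rather than the bzip2 command-line tool, it
assumes Python 3, and recompress_delta is a made-up name. It returns how
many bytes smaller the recompressed data is, and whether the content
survived the round trip unchanged.

```python
import bz2


def recompress_delta(raw, compresslevel=9):
    """Decompress `raw`, compress it again, and compare sizes.

    Returns (bytes_saved, content_unchanged).
    Hypothetical helper for illustration only.
    """
    plain = bz2.decompress(raw)
    again = bz2.compress(plain, compresslevel)
    # A positive bytes_saved means the recompressed data is smaller.
    return len(raw) - len(again), bz2.decompress(again) == plain
```

A nonzero difference only shows that the original was produced with
different settings or a different compressor; it does not by itself
explain the garbage character.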

----------------------------------------------------------------------

Comment By: Clodoaldo Pinto Neto (cpn)
Date: 2006-11-15 12:35

Message:
Logged In: YES 
user_id=1646083
Originator: YES

Confirmed in Windows Python 2.4 and 2.5

http://groups.google.com/group/comp.lang.python/tree/browse_frm/thread/3010fd664d78010f/4166d429b25c9ed4?rnum=1&_done=%2Fgroup%2Fcomp.lang.python%2Fbrowse_frm%2Fthread%2F3010fd664d78010f%2F4166d429b25c9ed4%3Ftvc%3D1%26#doc_7770aa47861db452

----------------------------------------------------------------------

Comment By: Clodoaldo Pinto Neto (cpn)
Date: 2006-11-15 12:28

Message:
Logged In: YES 
user_id=1646083
Originator: YES

I can't upload the bz2 sample file. So it is here:
http://fahstats.com/img/file.txt.bz2 

----------------------------------------------------------------------