Python3.1: gzip encoding with UTF-8 fails

Antoine Pitrou solipsis at pitrou.net
Mon Dec 21 07:24:09 EST 2009


Hello,

On Sun, 20 Dec 2009 17:08:33 +0100, Johannes Bauer wrote:
> 
> #!/usr/bin/python3
> import gzip
> x = gzip.open("testdatei", "wb")
> x.write("ä")

The bug here is that you are trying to write a Unicode text string ("ä")
to a binary file (a gzip file). The gzip module has now been fixed to
reject this; in the next 3.x versions it will raise a TypeError:

>>> x = gzip.open("testdatei", "wb")
>>> x.write("ä")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/antoine/py3k/__svn__/Lib/gzip.py", line 227, in write
    self.crc = zlib.crc32(data, self.crc) & 0xffffffff
TypeError: must be bytes or buffer, not str

You have to encode manually if you want to write text strings to a gzip 
file:

>>> x = gzip.open("testdatei", "wb")
>>> x.write("ä".encode('utf8'))
>>> x.close()
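As a fuller illustration of the same idea, here is a minimal sketch of a
round trip: encode on write, decode on read. The helper names
write_text_gz and read_text_gz are made up for this example and are not
part of the gzip module; the temporary directory is only there so that
the snippet does not leave files behind.

```python
import gzip
import os
import tempfile


def write_text_gz(path, text, encoding="utf-8"):
    """Encode text to bytes, then write the bytes to a binary gzip file."""
    with gzip.open(path, "wb") as f:
        f.write(text.encode(encoding))


def read_text_gz(path, encoding="utf-8"):
    """Read the decompressed bytes and decode them back to a text string."""
    with gzip.open(path, "rb") as f:
        return f.read().decode(encoding)


# Hypothetical file location, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "testdatei")
write_text_gz(path, "ä")
roundtrip = read_text_gz(path)

# The decompressed payload is the UTF-8 encoding of the text.
with gzip.open(path, "rb") as f:
    raw = f.read()
```

On more recent Python versions (3.3 and later, as far as I know),
gzip.open() also accepts a text mode such as "wt" together with an
encoding argument, which does the encoding step for you.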


Regards

Antoine.




