UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte

Steven D'Aprano steve+comp.lang.python at pearwood.info
Thu Dec 6 06:27:28 EST 2012


On Thu, 06 Dec 2012 02:07:51 -0800, iMath wrote:

> the following code is originally from
> http://zetcode.com/databases/mysqlpythontutorial/ within the "Writing
> images" part.
> 
> 
> import MySQLdb as mdb
> import sys
> 
> try:
>     fin = open("Chrome_Logo.svg.png",'rb') 
>     img = fin.read()
>     fin.close()
> except IOError as e:
>     print ("Error %d: %s" % (e.args[0],e.args[1]))
>     sys.exit(1)

Every time a programmer catches an exception, only to merely print a 
vague error message and then exit, God kills a kitten. Please don't do 
that.

If all you are going to do is print an error message and then exit, 
please don't bother. All you do is make debugging harder. When Python 
detects an error, by default it prints a full traceback, which gives you 
lots of information to track down the error. By catching that exception 
as you do, you lose that information and make it harder to debug.
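
As a minimal sketch of the alternative (the helper name read_image is made up 
for illustration, it is not part of the tutorial), you can read the file with 
no try/except at all, so that any error surfaces with its full traceback:

```python
def read_image(path):
    """Return the raw bytes of the file at `path`.

    There is deliberately no try/except here: if the file is missing or
    unreadable, the exception propagates and Python prints the traceback.
    """
    with open(path, 'rb') as fin:  # the with-block closes the file for us
        return fin.read()
```

If you really do need to catch the exception, say to log it, a bare "raise" 
inside the except block re-raises it afterwards so the traceback is not lost.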

Moving on to the next thing:


[snip code]
> I ported it to Python 3, and also changed fin = open("chrome.png")
> to
> fin = open("Chrome_Logo.png",'rb')
> but when I run it, it gives the following error:
> 
> Traceback (most recent call last):
>   File "E:\Python\py32\itest4.py", line 20, in <module>
>     mdb.escape_string(img))
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0:
> invalid start byte
> 
> so how to fix it ?

I suggest you start by reading the documentation for 
MySQLdb.escape_string. What does it do? What does it expect? A byte 
string or a unicode text string?

It seems very strange to me that you are reading a binary file, then 
passing its contents to something which appears to be expecting a text 
string. What seems to happen is that the PNG data starts with a 0x89 
byte, and escape_string tries to decode the bytes as UTF-8 text:

py> img = b"\x89\x00\x23\xf2"  # fake PNG binary data
py> img.decode('utf-8')  # I'm expecting text
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: 
invalid start byte
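
To see why 0x89 in particular is rejected, here is a small sketch (the 
function name is mine, it has nothing to do with MySQLdb). In UTF-8, bytes 
with the bit pattern 10xxxxxx are continuation bytes and can never begin a 
character, and 0x89 is 10001001 in binary. Every PNG file starts with the 
same eight-byte signature, so every PNG will fail the same way:

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the fixed first eight bytes of a PNG file


def is_utf8_continuation_byte(byte):
    """Return True if `byte` (an int 0-255) has the bit pattern 10xxxxxx."""
    return (byte & 0b11000000) == 0b10000000


first = PNG_SIGNATURE[0]  # indexing bytes gives an int in Python 3
print(hex(first), is_utf8_continuation_byte(first))

try:
    PNG_SIGNATURE.decode('utf-8')
except UnicodeDecodeError as err:
    # Caught here only to inspect the error, not to hide it.
    print(err.reason)
```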

Without knowing more about escape_string, I can only make a wild guess. 
Try this:

import base64
img = fin.read()  # read the binary data of the PNG file
# base64 output is pure ASCII, but encodebytes() returns bytes in
# Python 3, so decode it to get an actual text string
data = base64.encodebytes(img).decode('ascii')
cursor.execute("INSERT INTO Images SET Data='%s'" % \
        mdb.escape_string(data))


and see what that does.
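
For what it's worth, here is a sketch of the round trip that suggestion 
relies on, independent of MySQLdb (the two function names are illustrative 
only). Base64 output contains nothing but ASCII characters, so it survives 
being treated as text, but whatever reads the column back has to decode it 
again to recover the image:

```python
import base64


def image_to_text(img):
    """Encode raw image bytes as an ASCII-only str."""
    # encodebytes() returns bytes, so decode to get a real text string
    return base64.encodebytes(img).decode('ascii')


def text_to_image(data):
    """Reverse image_to_text(): recover the original bytes."""
    return base64.decodebytes(data.encode('ascii'))


img = b"\x89\x00\x23\xf2"  # fake PNG binary data
data = image_to_text(img)
assert text_to_image(data) == img  # nothing is lost in the round trip
```

The price is size: base64 text is roughly a third larger than the binary 
data it encodes.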


-- 
Steven


