catch UnicodeDecodeError

Philipp Hagemeister phihag at phihag.de
Wed Jul 25 07:35:09 EDT 2012


Hi Jaroslav,

You can catch a UnicodeDecodeError just like any other exception. Can
you provide a full example program that shows your problem?

This works fine on my system:


import sys

# Write two bytes that are not valid UTF-8.
with open('tmp', 'wb') as f:
    f.write(b'\xff\xff')

with open('tmp', 'rb') as f:
    buf = f.read()
try:
    buf.decode('utf-8')
except UnicodeDecodeError as ude:
    sys.exit("Found a bad char in file " + "tmp")


Note that you cannot possibly determine the line number if you don't
know what encoding the file is in (and what EOL it uses).
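
To illustrate why the encoding matters, here is a small sketch (the
sample character is made up for this example, not taken from your
file). In UTF-16, a byte with the value 10 can be half of a character
that is not a newline at all, so counting such bytes would miscount
lines:

```python
# Hypothetical sample data, chosen only to illustrate the point.
text = "\u010a"  # LATIN CAPITAL LETTER C WITH DOT ABOVE, contains no newline
raw = text.encode("utf-16-le")  # encodes to the two bytes 0x0A 0x01

# Counting bytes with the value 10 finds a "newline" that is not there.
false_newlines = raw.count(b"\n")

# In UTF-8 the same character does not contain a byte with the value 10.
utf8_newlines = text.encode("utf-8").count(b"\n")
```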

What you can do, inside the except block, is count the number of bytes
with the value 10 (the ASCII line feed, b'\n') before ude.start, like
this:

lineGuess = buf[:ude.start].count(b'\n') + 1
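
Putting the pieces together, a minimal sketch could look like the
following. The function name guess_bad_line and the sample bytes are
made up for illustration, and the result is only a guess that assumes
an ASCII-compatible encoding with \n line endings:

```python
def guess_bad_line(buf, encoding="utf-8"):
    """Return (line_guess, byte_offset) of the first undecodable byte, or None."""
    try:
        buf.decode(encoding)
    except UnicodeDecodeError as ude:
        # ude.start is the byte offset of the first byte that failed to decode.
        line_guess = buf[:ude.start].count(b"\n") + 1
        return line_guess, ude.start
    return None


# Hypothetical sample: the bad bytes sit on the third line.
sample = b"first\nsecond\n\xff\xff\n"
result = guess_bad_line(sample)
```

Returning the byte offset alongside the line guess is useful because
the offset is exact, while the line number depends on the assumptions
above.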

- Philipp

On 07/25/2012 01:05 PM, jaroslav.dobrek at gmail.com wrote:
> it doesn't work
