Conversion of Unicode to ASCII NIGHTMARE

Robert Kern robert.kern at gmail.com
Tue Apr 4 12:39:17 EDT 2006


Roger Binns wrote:
> "Paul Boddie" <paul at boddie.org.uk> wrote in message news:1144081137.137744.253790 at i39g2000cwa.googlegroups.com...
> 
>>It looks like you may have Unicode objects that you're presenting to
>>sqlite. In any case, with earlier versions of pysqlite that I've used,
>>you need to connect with a special unicode_results parameter,
> 
> He is using apsw.  apsw correctly handles unicode.  In fact it won't
> accept a str with bytes >127 as they will be an unknown encoding and
> SQLite only uses Unicode internally.  It does have a blob type
> using buffer for situations where binary data needs to be stored.
> pysqlite's mishandling of Unicode is one of the things that drove
> me to writing apsw in the first place.

Ah, I misread the OP's traceback.

Okay, the OP is getting regular (byte) strings from the Oracle DB, probably
encoded in ISO-8859-1 if I had to guess. He is trying to pass them to
SQLiteCur.execute(), which tries to make a unicode string from the input. With
no encoding given, that conversion uses the default ASCII codec, which fails on
any byte above 127:

In [1]: unicode('\xdc')
---------------------------------------------------------------------------
exceptions.UnicodeDecodeError             Traceback (most recent call last)

/Users/kern/<ipython console>

UnicodeDecodeError: 'ascii' codec can't decode byte 0xdc in position 0: ordinal not in range(128)

*Now*, my advice to the OP is to figure out the encoding of the strings that are
being returned from Oracle. As I said, ISO-8859-1 is probably a good guess.
Then, he would *decode* the string to a unicode string using the encoding. E.g.:

  row = row.decode('iso-8859-1')

Then everything should be peachy. I hope.
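If a row comes back as a tuple of columns rather than a single string, each
byte-string column would need decoding on its own. Here is a rough sketch of
that, assuming the data really is ISO-8859-1. The decode_row helper and the
sample row are made up for illustration and are not part of apsw or any Oracle
driver. It is written with bytes literals so it also runs on newer Pythons; on
an older Python 2 the isinstance check would be against str instead.

```python
# Hypothetical helper: decode every byte-string column of a row to text.
# The encoding is an assumption (ISO-8859-1); verify it against the database.
def decode_row(row, encoding='iso-8859-1'):
    """Return a tuple with each byte string in `row` decoded to text."""
    return tuple(
        value.decode(encoding) if isinstance(value, bytes) else value
        for value in row
    )

# Made-up sample row: an integer, a byte string containing 0xdc, and a NULL.
row = (1, b'M\xdcLLER', None)
decoded = decode_row(row)

# The implicit ASCII conversion fails on the same byte, as in the traceback.
try:
    b'\xdc'.decode('ascii')
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True
```

Non-string columns (numbers, NULLs) pass through untouched, so the decoded
tuple can be handed to execute() as the bindings.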

-- 
Robert Kern
robert.kern at gmail.com

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco



