Conversion of Unicode to ASCII NIGHTMARE

ChaosKCW da.martian at gmail.com
Mon Apr 3 11:57:10 EDT 2006


Hi

I am reading from an Oracle database using cx_Oracle. I am writing to a
SQLite database using apsw.

The Oracle database is returning UTF-8 characters for European item
names, i.e. special characters from an ASCII perspective.

I get the following error:
>    SQLiteCur.execute(sql, row)
>UnicodeDecodeError: 'ascii' codec can't decode byte 0xdc in position 12: ordinal not in range(128)
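
The same message can be reproduced without any database involved. What follows is only a minimal sketch in Python 3 syntax: the byte string is made up (the real item name is not shown in this post) and simply places byte 0xdc at position 12, as in the traceback.

```python
# Hypothetical item name; byte 0xdc sits at index 12, as in the traceback.
raw = b"ARTIKELNAME \xdcBER"

try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    # Keep the details; the name `exc` is cleared after the except block.
    message = str(exc)
    position = exc.start
    print(message)

# Lossy alternatives: drop or replace the byte that ASCII cannot represent.
dropped = raw.decode("ascii", "ignore")
replaced = raw.decode("ascii", "replace")
```

The point of the sketch is that the error comes from a decode step applied to bytes, not from an encode step, so it can appear even when the code only calls encode explicitly.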

I have googled for several days now and still can't get it to encode to
ASCII.

I encode the SQL as follows:

        sql = "insert into %s values %s" % (SQLiteTable, paramstr)
        # encode() returns a new string, so the result has to be kept
        sql = sql.encode('ascii', 'ignore')

I then encode each of the row values returned from Oracle like this:

     row = map(encodestr, row)
     SQLiteCur.execute(sql, row)

where encodestr is as follows:

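import types
import unicodedata

def encodestr(item):
    # Python 2: types.StringTypes covers both str and unicode.
    if isinstance(item, types.StringTypes):
        # Decode as UTF-8, decompose accents (NFKD), then drop non-ASCII.
        return unicodedata.normalize(
            'NFKD', unicode(item, 'utf-8', 'ignore')).encode('ASCII', 'ignore')
    else:
        return item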
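
For comparison, here is a rough Python 3 sketch of the same idea. The name to_ascii and its encoding parameter are made up for illustration, and it assumes the incoming bytes really are UTF-8, which may not be true of the Oracle data.

```python
import unicodedata

def to_ascii(item, encoding="utf-8"):
    """Return an ASCII-only str for text or bytes; pass other values through."""
    if isinstance(item, bytes):
        # Decode first; bytes that are invalid in `encoding` are dropped.
        item = item.decode(encoding, "ignore")
    if isinstance(item, str):
        # NFKD splits an accented letter into base letter + combining mark.
        decomposed = unicodedata.normalize("NFKD", item)
        # Encoding to ASCII with 'ignore' then discards the combining marks.
        return decomposed.encode("ascii", "ignore").decode("ascii")
    return item

print(to_ascii("Überzug"))
print(to_ascii("Überzug".encode("utf-8")))
```

Letters that have no decomposition, such as ß, are removed entirely by this approach rather than replaced by a lookalike, so it is lossy by design.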

I have tried a thousand similar functions, permutations of the above
from various Google searches, but I still get the above exception on
the line:

     SQLiteCur.execute(sql, row)

and the exception is related to the data in one field.
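
One way to narrow down which field is responsible is to scan a row for values that are not pure ASCII. The helper below is a hypothetical diagnostic written in Python 3 syntax, and the sample row is invented; it is a sketch, not part of cx_Oracle or apsw.

```python
def non_ascii_fields(row):
    """Return (index, value) pairs for fields that are not pure ASCII."""
    offenders = []
    for index, value in enumerate(row):
        if isinstance(value, bytes):
            is_ascii = all(byte < 128 for byte in value)
        elif isinstance(value, str):
            is_ascii = all(ord(char) < 128 for char in value)
        else:
            is_ascii = True  # numbers, None, etc. are not affected
        if not is_ascii:
            offenders.append((index, value))
    return offenders

# Invented row: only the second field contains a non-ASCII byte.
row = (1017, b"ARTIKELNAME \xdcBER", "plain", None)
print(non_ascii_fields(row))
```

Running this on each row just before the insert would show the exact bytes that trigger the exception, which makes it easier to tell what encoding they are actually in.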

In the end I resorted to using Oracle's convert function in the SQL
statement, but I would like to understand why this is happening and why
it's so hard to convert the string in Python. I have read many
complaints about this from other people, some of whom have written
custom stripping routines. I haven't tried a custom routine yet, because
I think it should be possible in Python.

Thanks,



