Unicode / cx_Oracle problem

Richard Schulman raschulmanxx at verizon.net
Sun Sep 10 16:25:45 EDT 2006


On Sun, 10 Sep 2006 11:42:26 +0200, "Diez B. Roggisch"
<deets at nospam.web.de> wrote:

>What does print repr(mean) give you?

That is a useful suggestion.

For context, I reproduce the source code:

import codecs
import cx_Oracle

in_file = codecs.open("c:\\pythonapps\\mean.my",encoding="utf_16_LE")
connection = cx_Oracle.connect("username", "password")
cursor = connection.cursor()
for row in in_file:
    id = row[0]
    mean = row[1]
    print "Value of row is ", repr(row)                    #debug line
    print "Value of the variable 'id' is ", repr(id)       #debug line
    print "Value of the variable 'mean' is ", repr(mean)   #debug line
    cursor.execute("""INSERT INTO mean (mean_id,mean_eng_txt)
        VALUES (:id,:mean)""",id=id,mean=mean)

Here is the result from the print repr() statements:

Value of row is  u"\ufeff(3,'sadness, lament; sympathize with, pity')\r\n"
Value of the variable 'id' is  u'\ufeff'
Value of the variable 'mean' is  u'('

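Clearly, the values loaded into the 'id' and 'mean' variables are not
satisfactory. Each row is a single unicode string, so row[0] and row[1]
are just its first two characters: the BOM and the opening parenthesis.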
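
One possible way around this, as a minimal sketch: the generic "utf_16"
codec consumes a leading BOM, whereas "utf_16_LE" leaves it in the text as
u'\ufeff', and each line can then be parsed into its two fields instead of
being indexed character by character. The helper name parse_row and the
regular expression are hypothetical. They assume every line has the form
(number,'text') seen in the debug output, and they do not handle escaped
quotes inside the text.

```python
import codecs
import re

BOM = u"\ufeff"

# Assumed line format, taken from the debug output: (3,'some text')
ROW_PATTERN = re.compile(r"^\((\d+),\s*'(.*)'\)$")


def parse_row(line):
    """Split one line of the input file into (id, meaning)."""
    # Drop a leading BOM if present, then surrounding whitespace and the line ending.
    text = line.lstrip(BOM).strip()
    match = ROW_PATTERN.match(text)
    if match is None:
        raise ValueError("unrecognized row format: %r" % (line,))
    return int(match.group(1)), match.group(2)


# In-memory stand-in for the first line of the file: BOM + UTF-16-LE bytes.
sample_text = u"(3,'sadness, lament; sympathize with, pity')\r\n"
raw = codecs.BOM_UTF16_LE + sample_text.encode("utf_16_le")

with_bom = raw.decode("utf_16_le")   # BOM survives as the first character
without_bom = raw.decode("utf_16")   # BOM is detected and consumed

row_id, meaning = parse_row(with_bom)
```

The resulting pair could then be passed to cursor.execute in place of
row[0] and row[1].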

>... 
>The oracle NLS is a sometimes tricky beast, as it sets the encoding it 
>tries to be clever and assigns an existing connection some encoding, 
>based on the users/machines locale. Which can yield unexpected results, 
>such as "Dusseldorf" instead of "Düsseldorf" when querying a german city 
>list with an english locale.

Agreed.

>So - you have to figure out, what encoding your db-connection expects. 
>You can do so by issuing some queries against the session tables I 
>believe - I don't have my oracle resources at home, but googling will 
>bring you there, the important oracle term is NLS.

It's very hard to figure out what to do on the basis of complexities
on the order of

http://download-east.oracle.com/docs/cd/B25329_01/doc/appdev.102/b25108/xedev_global.htm#sthref1042

(tiny equivalent: http://tinyurl.com/fnc54)

But I'm not even sure I got that far. My problems so far seem to come
earlier: in Python or in Python's cx_Oracle driver. To be candid, I'm very
tempted at this point to abandon the Python effort and revert to an
all-UCS-2 environment, much as I dislike the complexities of Java and C#
and the poor support available for all-Java databases.

>Then you need to encode the unicode string before passing it - something 
>like this:
>
>mean = mean.encode("latin1")

I don't see how the Chinese characters embedded in the English text
will carry over if I do that.
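
A small illustration of that concern, assuming a hypothetical value that
mixes English with one CJK character: Latin-1 has no code points for
Chinese characters, so the encode call fails, while UTF-8 round-trips the
same string. Whether the connection would accept UTF-8 bytes depends on
the client's NLS settings, which this sketch does not address.

```python
# Hypothetical sample: English text followed by one CJK character (U+54C0).
sample = u"sadness \u54c0"

try:
    sample.encode("latin1")
    latin1_ok = True
except UnicodeEncodeError:
    # Latin-1 only covers code points U+0000 through U+00FF.
    latin1_ok = False

# UTF-8 can represent every Unicode code point, so nothing is lost.
utf8_bytes = sample.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")
```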

In any case, thanks for your patient and generous help.

Richard Schulman
Delete the antispamming 'xx' characters for email reply


