handling unicode data

Wed Jun 28 10:55:10 EDT 2006

Hi all,

I'm starting to learn Python but am having some difficulties with how
it handles the encoding of data I'm reading from a database. I'm using
pymssql to access data stored in a SQL Server database, and the
following is the script I'm using for testing purposes.

-----------------------------------------------------------------------------
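import pymssql

# Connect to the local SQL Server instance.
mssqlConnection = pymssql.connect(host='localhost', user='sa',
                                  password='password', database='TestDB')
cur = mssqlConnection.cursor()
query = "Select ID, Term from TestTable where ID > 200 and ID < 300;"
cur.execute(query)

# Fetch rows one at a time and collect the "Term" column.
row = cur.fetchone()
results = []
while row is not None:
    term = row[1]
    print type(row[1])   # Python type of the value returned by the driver
    print term           # the value as the console renders it
    results.append(term)
    row = cur.fetchone()

cur.close()
mssqlConnection.close()
print results            # list output shows each value's repr()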
-----------------------------------------------------------------------------

In the console output, for a record where I expected to see "França"
I'm getting the following:

"<type 'str'>"   -   when I print the type (print type(row[1]))
"Fran+a"         -   when I print the "term" variable (print term)
"Fran\xd8a"      -   when I print all the query results (print results)

The values in the "Term" column of "TestTable" are stored as Unicode
(the column's datatype is nvarchar), yet the Python type of the values
I'm reading is str, not unicode.
It all seems to be an encoding issue, but I can't see what I'm doing
wrong.
Any thoughts?
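
For anyone who wants to reproduce the str/unicode distinction without a
database, here is a minimal sketch. It is not tied to pymssql. The
ENCODING value, the to_text helper and the sample bytes are all
assumptions made up for illustration: the sample uses Latin-1, where
"ç" is the byte 0xe7, which is not the 0xd8 byte shown in the output
above, so the encoding the driver actually uses remains unknown. The
b"..." literals need Python 2.6 or later.

```python
# -*- coding: utf-8 -*-
# Sketch only: ENCODING is an assumption, not what pymssql is known to use.
ENCODING = "latin-1"


def to_text(raw, encoding=ENCODING):
    """Decode a byte string into a unicode string (hypothetical helper)."""
    if isinstance(raw, bytes):
        return raw.decode(encoding)
    return raw  # already a unicode string, leave it alone


# Hypothetical sample: "França" encoded as Latin-1 bytes.
raw = b"Fran\xe7a"

# Decoding turns the bytes into a unicode string of six characters.
text = to_text(raw)

# Printing a list shows the repr() of each item, which is why a
# non-ASCII byte appears as an escape such as \xe7 in list output.
listed = repr([raw])
```

If the bytes are decoded with the wrong codec, the call either raises
UnicodeDecodeError or silently yields a different character, so the
encoding has to match whatever the driver really returns.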

thanks in advance,
Filipe