handling unicode data
Fredrik Lundh
fredrik at pythonware.com
Wed Jun 28 11:10:26 EDT 2006
Filipe wrote:
> In the console output, for a record where I expected to see "França"
> I'm getting the following:
>
> "<type 'str'>" - When I print the type (print type(row[1]))
> "Fran+a" - When I print the "term" variable (print term)
> "Fran\xd8a" - When I print all the query results (print results)
>
> The values in "Term" column in "TestTable" are stored as unicode (the
> column's datatype is nvarchar), yet, the python data type of the values
> I'm reading is not unicode.
> It all seems to be an encoding issue, but I can't see what I'm doing
> wrong..
looks like the DB-API driver returns 8-bit ISO-8859-1 strings instead of Unicode
strings. there might be some configuration option for this; see
in worst case, you could do something like
    def unicodify(value):
        # decode 8-bit strings; leave everything else untouched
        if isinstance(value, str):
            value = unicode(value, "iso-8859-1")
        return value

    term = unicodify(row[1])
but it's definitely better if you can get the DB-API driver to do the right thing.
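
for reference, here is a rough sketch of the same workaround applied to a whole result row, written in Python 3 terms (where an 8-bit string is a bytes object and str is already Unicode). the decode_row name and the sample row are made up for illustration, and the "iso-8859-1" default is only an assumption carried over from the guess above; use whatever encoding the driver really emits.

```python
def decode_row(row, encoding="iso-8859-1"):
    """Return a tuple in which every bytes value in row is decoded to str.

    Non-bytes values (numbers, None, text that is already str) pass through.
    The default encoding is an assumption; pass the driver's real encoding.
    """
    return tuple(
        value.decode(encoding) if isinstance(value, bytes) else value
        for value in row
    )


# hypothetical row as a driver might return it: an id plus an 8-bit string
# (0xE7 is the ISO-8859-1 byte for "ç")
row = (1, b"Fran\xe7a")
decoded = decode_row(row)
```

applying this once per fetched row keeps the decoding in a single place, so it is easy to drop again if the driver can be configured to return Unicode directly.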
</F>