handling unicode data

Fredrik Lundh fredrik at pythonware.com
Wed Jun 28 11:10:26 EDT 2006


Filipe wrote:

> In the console output, for a record where I expected to see "França"
> I'm getting the following:
>
> "<type 'str'>"   -    When I print the type (print type(row[1]))
> "Fran+a"         -    When I print the "term" variable (print term)
> "Fran\xe7a"     -    When I print all the query results (print results)
>
> The values in "Term" column in "TestTable" are stored as unicode (the
> column's datatype is nvarchar), yet, the python data type of the values
> I'm reading is not unicode.
> It all seems to be an encoding issue, but I can't see what I'm doing
> wrong..

looks like the DB-API driver returns 8-bit ISO-8859-1 strings instead of Unicode
strings.  there might be some configuration option for this; see

in the worst case, you could do something like

    def unicodify(value):
        if isinstance(value, str):
            value = unicode(value, "iso-8859-1")
        return value

    term = unicodify(row[1])

but it's definitely better if you can get the DB-API driver to do the right thing.
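
if you do end up with the workaround, here's a rough sketch of applying the same idea to a whole row at once.  it assumes the driver really is handing back ISO-8859-1 data (where byte 0xE7 is "ç"); the unicodify_row helper and the sample row are made up for illustration and aren't part of any DB-API driver.  testing for bytes rather than str keeps the check working on Python 3 as well (on Python 2.6 and 2.7, bytes is just another name for str; on older versions, use str as in the function above).

```python
def unicodify(value, encoding="iso-8859-1"):
    # decode 8-bit strings; leave numbers, None and Unicode strings alone
    if isinstance(value, bytes):
        value = value.decode(encoding)
    return value

def unicodify_row(row, encoding="iso-8859-1"):
    # hypothetical helper: decode every 8-bit string column in one row
    return tuple(unicodify(value, encoding) for value in row)

# made-up sample row, standing in for what cursor.fetchone() might return
row = (1, b"Fran\xe7a", None)
clean = unicodify_row(row)
term = clean[1]  # now a Unicode string
```

the encoding is a parameter only because ISO-8859-1 is a guess based on the output shown; if the driver turns out to use something else, pass that instead.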

</F> 





