adodbapi / string encoding problem

Thu Sep 25 08:17:42 EDT 2003

Achim Domma wrote:

> I read a webpage via urllib2. The result of the 'read' call is of type
> 'str'. This string can be written to disc via
> file('out.html','w').write(html). Then I write the string into a Memo field
> in an Access database, using adodbapi. If I read the text back I get a
> unicode string, which cannot be written to disc via file(...) due to
> encoding problems. How do I have to decode the unicode string to get my
> original data back?

You have to know the encoding of the original page; encoding the unicode
string with it gives you str data again.

Assuming (1) you had Western European characters including the euro sign,
(2) they were correctly translated into unicode and (3) you want them back
that way:

>>> s = u"äöüÄÖÜ".encode("iso-8859-15")
>>> s
'\xe4\xf6\xfc\xc4\xd6\xdc'
>>> print s
äöüÄÖÜ
>>> type(s)
<type 'str'>
>>>
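
The euro sign is where the choice of encoding matters. A small sketch
(written so that it also runs on newer Pythons; the variable names are only
for illustration) showing that iso-8859-15 has a byte for it while
iso-8859-1 does not:

```python
euro = u"\u20ac"  # EURO SIGN

# iso-8859-15 maps the euro sign to the single byte 0xA4
euro_latin9 = euro.encode("iso-8859-15")

# iso-8859-1 has no euro sign, so encoding raises UnicodeEncodeError
try:
    euro.encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
```

So if the page was really iso-8859-1 or some Windows code page, pick that
name instead; the pattern stays the same.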

Or, more generally:

unicodeFromAccess.encode(targetEncoding)
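
Spelled out as a round trip to disc, a minimal sketch. write_encoded,
unicode_from_access and the temporary out.html path are placeholder names,
standing in for whatever adodbapi returned and wherever you want the file;
the file is opened in binary mode so the bytes go out unchanged:

```python
import os
import tempfile

def write_encoded(text, path, target_encoding):
    """Encode a unicode string and write the resulting bytes to path."""
    data = text.encode(target_encoding)
    f = open(path, "wb")  # binary mode: no newline or charset translation
    try:
        f.write(data)
    finally:
        f.close()
    return data

# Stand-in for the value read back from the Memo field
unicode_from_access = u"\xe4\xf6\xfc\xc4\xd6\xdc"

out_path = os.path.join(tempfile.mkdtemp(), "out.html")
written = write_encoded(unicode_from_access, out_path, "iso-8859-15")
```

If the text contains a character the target encoding cannot represent,
encode() raises UnicodeEncodeError; passing "replace" as the second argument
to encode() substitutes "?" instead of failing.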

Peter