adodbapi / string encoding problem

Alex Martelli aleax at aleax.it
Thu Sep 25 08:12:44 EDT 2003


Achim Domma wrote:

> Hi,
> 
> I read a webpage via urllib2. The result of the 'read' call is of type
> 'str'. This string can be written to disc via
> file('out.html','w').write(html). Then I write the string into a Memofield
> in an Access database, using adodbapi. If I read the text back I get a
> unicode string, which cannot be written to disc via file(...) due to
> encoding problems. How do I have to decode the unicode string to get my
> original data back?

You have to *EN*-code Unicode into a string, in the same way the string
had been *DE*-coded to Unicode originally, in order to be sure to get
the same string back; specifically, you have to use the same *codec*
(which stands for COder-DECoder).  I don't know what codec adodbapi is
using.  Python's normal default codec is ASCII, which is the "minimum
common denominator" of just about every encoding around -- if adodbapi
hadn't surreptitiously inserted a different codec, nothing could have
been decoded that would cause problems in encoding it back ;-).
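
To make the round trip concrete, here is a minimal sketch.  It is written
for Python 3, where the old 'str' is 'bytes' and the old 'unicode' is
'str'.  The codec name cp1252 is purely an assumption standing in for
whatever adodbapi actually used (which is not known here), and the sample
text and the helper ascii_fails are made up for illustration.

```python
# Assumption: cp1252 stands in for the unknown codec the database layer applied.
ASSUMED_CODEC = "cp1252"

# Bytes as they might come back from reading a web page (illustrative sample).
original = "café €5".encode(ASSUMED_CODEC)

# Decoding bytes to text, as the database layer presumably did on the way in.
as_text = original.decode(ASSUMED_CODEC)

# Encoding with the SAME codec recovers the original bytes exactly.
round_trip = as_text.encode(ASSUMED_CODEC)


def ascii_fails(text):
    """Return True if text cannot be encoded with the ASCII codec."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return True
    return False
```

Encoding as_text with ASCII fails because of the non-ASCII characters,
which is the kind of error the original question describes; encoding with
the codec that did the decoding gives back the original bytes, which can
then be written to a file opened in binary mode.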


Alex




