Conversion of Unicode to ASCII NIGHTMARE

ChaosKCW da.martian at gmail.com
Mon Apr 10 04:37:13 EDT 2006


Roger Binns wrote:

>
> No.  APSW converts it *to* Unicode.  SQLite only accepts Unicode
> so a Unicode string has to be supplied.  If you supply a non-Unicode
> string then conversion has to happen.  APSW asks Python to
> supply the string in Unicode.  If Python can't do that (eg
> it doesn't know the encoding) then you get an error.

If what you say is true, I have to ask why I get a conversion error
which states it can't convert to ASCII, not that it can't convert to
Unicode?


> > Ok if SQLite uses unicode internally why do you need to ignore
> > everything greater than 127,
>
> I never said that.  I said that a special case is made so that
> if the string you supply only contains ASCII characters (ie <=127)
> then the ASCII string is converted to Unicode.  (In fact it is
> valid UTF-8 hence the shortcut).
>
> > the ascii table (256 bit one) fits into
> > unicode just fine as far as I recall?
>
> No, ASCII characters have defined Unicode codepoints.  The ASCII
> character number just happens to be the same as the Unicode
> codepoints.  But there are only 127 ASCII characters.
>
> > Or did I miss the boat here ?
>
> For bytes greater than 127, what character set is used?  There
> are hundreds of character sets that define those characters.
> You have to tell the computer which one to use.  See the Unicode
> article referenced above.

Yes, I know there are a million "extended" ASCII character sets, which
happen to be the bane of all existence. Most computers deal in bytes
natively, and the 7-bit coding still causes problems to this day. But
since the error I get is a conversion error to ASCII, not from ASCII,
I am willing to accept loss of information. You can't encode Unicode
into ASCII without loss of information or two-character codes. In my
mind, somewhere inside the "cursor.execute" function, it converts to
ASCII. I say this because of the error message received. So I am
missing how a function which supposedly converts everything to Unicode
ends up doing an ASCII conversion?
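
For what it's worth, here is a minimal sketch of how an error that names
ASCII can come out of a conversion *to* Unicode. It is written in
Python 3 syntax so it runs today; in the Python 2 of this thread the
same decode happens implicitly, with "ascii" as the default codec,
whenever a byte string is coerced to Unicode. The sample bytes and the
helper name decode_error_message are made up for illustration, and
nothing here is taken from APSW itself.

```python
# Illustrative only: the sample bytes and helper name are assumptions.
raw = b"caf\xe9"  # 0xE9 is not ASCII; its meaning depends on the character set


def decode_error_message(data, codec="ascii"):
    """Return the error text from decoding bytes *to* Unicode, or None on success."""
    try:
        data.decode(codec)
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


# The message names the 'ascii' codec even though the target is Unicode.
message = decode_error_message(raw)
print(message)

# The same byte maps to different characters under different character sets.
print(ascii(raw.decode("latin-1")), ascii(raw.decode("cp437")))

# Going the other way (Unicode to ASCII) is the lossy direction.
lossy = "caf\u00e9".encode("ascii", "replace")
print(lossy)
```

If that is what is happening here, the "ascii" in the message would be
the codec used to read the bytes, not the target of the conversion.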



