byte count unicode string
John Machin
sjmachin at lexicon.net
Wed Sep 20 06:12:54 EDT 2006
willie wrote:
> John Machin:
>
> >You are confusing the hell out of yourself. You say that your web app
> >deals only with UTF-8 strings. Where do you get "the unicode string"
> >from??? If name is a utf-8 string, as your comment says, then len(name)
> >is all you need!!!
>
>
> # I'll go ahead and concede defeat since you appear to be on the
> # verge of a heart attack :)
> # I can see that I lack clarity so I don't blame you.
All you have to do is use terminology like "Python str object, encoded
in utf-8" and "Python unicode object".
>
> # By UTF-8 string, I mean a unicode object with UTF-8 encoding:
There is no such animal as a "unicode object with UTF-8 encoding".
Don't make up terminology as you go.
>
> >>> type(ustr)
> <type 'unicode'>
> >>> repr(ustr)
> "u'\\u2708'"
Sigh. I suppose we have to infer that "ustr" is the same as the "name"
that you were getting as post data. Is that correct?
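Assuming it is, here is a minimal sketch of the two lengths being confused in this thread. The value of ustr is copied from the repr() quoted above; the names encoded, char_count and byte_count are only illustrative. In the Python 2 of this thread the two types are unicode and str; in Python 3 they are called str and bytes, but the arithmetic is the same.

```python
# The text value from the quoted session: one code point, U+2708.
ustr = u"\u2708"

# Encoding produces a byte string; this is what "encoded in utf-8" means.
encoded = ustr.encode("utf-8")

# len() of the text object counts code points.
char_count = len(ustr)

# len() of the encoded object counts bytes.
byte_count = len(encoded)

print(char_count)
print(byte_count)
```

So the byte count of a unicode object is the len() of its encoded form, not the len() of the object itself.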
>
> # The database API expects unicode objects:
> # A template query, then a variable number of values.
> # Perhaps I'm a victim of arbitrary design decisions :)
And the database will encode those unicode objects as utf-8, silently
truncating any that are too long -- just as Duncan feared? "Arbitrary"
is not the word for it.
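If the limit really is counted in UTF-8 bytes (an assumption; the thread does not say what the database does or what the limit is), one way to guard against silent truncation is to cut the text yourself before handing it over. A rough sketch, with truncate_utf8 and max_bytes as hypothetical names:

```python
def truncate_utf8(text, max_bytes):
    """Return text cut so that its UTF-8 encoding fits in max_bytes.

    Hypothetical helper: the real byte limit depends on the database
    column, which is not given in this thread.
    """
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # Slicing bytes may split a multi-byte character; "ignore" drops
    # the incomplete trailing sequence instead of raising an error.
    return encoded[:max_bytes].decode("utf-8", "ignore")


# Two 3-byte characters with a 4-byte limit: only the first one fits.
shortened = truncate_utf8(u"\u2708\u2708", 4)
```

Checking len(text.encode("utf-8")) against the limit and rejecting the input outright would work too; which is better depends on the application.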
Good luck!
Cheers,
John