Newbie question about text encoding
random832 at fastmail.us
random832 at fastmail.us
Fri Mar 6 08:33:00 EST 2015
On Fri, Mar 6, 2015, at 04:06, Rustom Mody wrote:
> Also:
> Can a programmer who is away from UTF-16 in one part of the system (say
> by using python3)
> assume he is safe all over?
The most common failure of UTF-16 support, supposedly, is in programs
misusing the number of code units (for length or random access) as a
proxy for the number of characters.
However, when do you _really_ want the number of characters? You may
want to use it for, for example, the number of columns in a 'monospace'
font, which you've already screwed up because you haven't accounted for
double-wide characters or combining marks. Or you may want the position
that pressing an arrow key or backspace or forward-delete a number of
times will reach, which has its own rules in e.g. Indic languages (and
also fails on Latin with combining marks).
More information about the Python-list
mailing list