[Python-Dev] Generalised String Coercion

Stephen J. Turnbull stephen at xemacs.org
Tue Aug 9 07:28:08 CEST 2005


>>>>> "Martin" == Martin v Löwis <martin at v.loewis.de> writes:

    Martin> While this would work, it would still feel wrong: the
    Martin> binary data are *not* latin1 (most likely), so declaring
    Martin> them to be latin1 would be confusing. Perhaps a synonym
    Martin> '8bit' for latin1 could be introduced.

Be careful.  This alias has caused Emacs a fair amount of pain, because
binary data escapes into contexts (such as Universal Newline
processing) where it gets interpreted as character data.  We've also
had some problems in codec implementation, because latin1 and (e.g.)
latin9 differ in semantics beyond the coded character set assigned to
the GR register.  Controls are treated differently, for example: they
_are_ binary (alias latin1) octets, but they are not in the range of
the latin9 code.
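
As a rough illustration of that kind of leak, here is a minimal sketch
in present-day Python 3 (not the 2005 interpreter under discussion, and
not Emacs's codec machinery).  The helper name roundtrip_text_mode and
the sample byte strings are made up for the example.  It assumes only
the standard io module and the built-in "latin-1" codec.

```python
import io


def roundtrip_text_mode(data: bytes) -> bytes:
    """Hypothetical helper: push bytes through a latin-1 text layer
    with universal newlines enabled, then encode them back."""
    reader = io.TextIOWrapper(io.BytesIO(data), encoding="latin-1", newline=None)
    return reader.read().encode("latin-1")


# Arbitrary "binary" payload containing CR LF and a lone CR.
blob = b"\x00\r\n\xff\r\x01"

# A bare decode/encode through latin-1 preserves every octet.
plain = blob.decode("latin-1").encode("latin-1")

# Once the data is treated as text, newline translation rewrites it.
mangled = roundtrip_text_mode(blob)

# Octet 0x85 is a C1 control (NEL).  As bytes it is inert; as a latin-1
# character, str.splitlines() treats it as a line boundary.
record = b"a\x85b"
as_bytes = record.splitlines()
as_text = record.decode("latin-1").splitlines()

print(plain == blob, mangled, as_bytes, as_text)
```

The codec itself is lossless here; the damage comes from text-oriented
processing that the "latin1" label invites further down the line.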

I won't go so far as to say it won't work, but it will require careful
design.

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

