Codecs for ISO 8859-11 (Thai) and 8859-16 (Romanian)

"Martin v. Löwis" martin at v.loewis.de
Mon Aug 2 10:10:17 EDT 2004


Peter Jacobi wrote:
> a) ISO 8859-n vs ISO-8859-n
> If the information at 
>  http://en.wikipedia.org/wiki/ISO_8859-1#ISO_8859-1_vs_ISO-8859-1 
> is correct, Python 8859-n 
> codecs do implement the ISO standard charsets ISO 8859-n 
> in the specialized IANA forms ISO-8859-n (and in agreement 
> with the Unicode mapping files). So any difficult C0/C1 
> wording in the original ISO standard can be disregarded.

I see. According to RFC 1345, this is definitely the case
for ISO-8859-1. ISO-8859-16 is not defined in an RFC, but
in

http://www.iana.org/assignments/charset-reg/ISO-8859-16

This is a confusing document, as it refers both to ISO/IEC
8859-16:2001 (no control characters) and to the Unicode character
map (with control characters). We might interpret this as a
mistake, and assume that it was intended to include control
characters (as all the other ISO-8859-n charsets do).
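
As a rough check of what the Python codecs actually do with the
control ranges, the following sketch decodes the C0 (0x00-0x1F) and
C1 (0x80-0x9F) bytes and compares the result with the identical
code points. The helper name maps_controls_to_same_code_points is
made up for this illustration, and the sketch assumes a Python
version that ships the iso8859_16 codec.

```python
# C0 (0x00-0x1F) and C1 (0x80-0x9F) are the control ranges in question.
control_bytes = bytes(list(range(0x00, 0x20)) + list(range(0x80, 0xA0)))


def maps_controls_to_same_code_points(codec_name):
    """Return True if the codec decodes every control byte to the
    Unicode code point with the same numeric value."""
    decoded = control_bytes.decode(codec_name)
    return [ord(ch) for ch in decoded] == list(control_bytes)


for name in ("iso8859_1", "iso8859_16"):
    print(name, maps_controls_to_same_code_points(name))
```

If both lines report True, the codecs follow the Unicode mapping
files (with control characters) rather than the bare ISO standard.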

For ISO-8859-11, the situation is even more confusing, as
that is not a registered IANA character set, according to

http://www.iana.org/assignments/character-sets

Therefore, it would be a protocol violation (strictly speaking)
to use iso-8859-11 in, say, a MIME charset= header.
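
Note that whether Python accepts a codec name is a separate
question from whether IANA has registered it. A minimal sketch,
assuming a Python version that ships the ISO 8859-11 codec:

```python
import codecs

# codecs.lookup consults Python's own codec registry, not the IANA list,
# so a successful lookup says nothing about protocol validity.
info = codecs.lookup("iso-8859-11")
print(info.name)

# Byte 0xA1 is the first Thai letter in this charset (U+0E01, KO KAI).
first_thai = b"\xa1".decode("iso-8859-11")
print(hex(ord(first_thai)))
```

So the codec can be used locally for conversion even where the
name should not appear on the wire.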

Regards,
Martin


