[issue3649] IA5 Encoding should be in the default encodings

Sun Aug 24 23:52:17 CEST 2008

Martin v. Löwis <martin at v.loewis.de> added the comment:

I don't think this codec should be named IA-5. IA-5 is specified in
ITU-T Rec. T.50 (International Alphabet No. 5), recently renamed to
"International Reference Alphabet", and it does *not* specify that the
characters 0..31 are printable. Instead, IA5 is identical to ISO 646
(i.e. allowing for national variants), and the International Reference
Version of IA5 (e.g. as used in ASN.1 IA5String) is identical to US-ASCII.

If GSM uses a modified version of this, it should receive a separate
name. If you were looking at section 2 (Structure of EMI messages), what
makes you think that this specification calls the encoding "IA5"? In my
copy, it says:

# Alphanumeric characters are encoded as two numeric IA5 characters,
# the higher 3 bits (0..7) first, the lower 4 bits (0..F) thereafter,
# according to the following table.

So it *uses* IA5 to hex-encode the encoded characters. To achieve that, one would
have to write

  text.encode("emi-section-2").encode("hex")

[Notice that the "hex" codec already uses IA-5]
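
To illustrate just the hex step described in the quoted passage, here is a
rough sketch. The function name hex_ia5 is made up for this example, and the
upper-case digits are an assumption based on the "0..F" notation in the quote;
nothing here is taken from the EMI table itself.

```python
def hex_ia5(data: bytes) -> str:
    """Write each byte as two characters: high bits first, low 4 bits second."""
    digits = "0123456789ABCDEF"  # assumed upper-case, following "0..F" in the quote
    out = []
    for b in data:
        out.append(digits[b >> 4])    # higher bits (0..7 for a 7-bit value)
        out.append(digits[b & 0x0F])  # lower 4 bits (0..F)
    return "".join(out)
```

The result consists only of the characters 0-9 and A-F, all of which exist in
the International Reference Version of IA5, which is why the hex form needs
no further encoding.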

In any case, I don't think this is general enough to deserve inclusion
in the standard library. The codec system is designed to be flexible
enough to support additional codecs outside the core.
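
As a minimal sketch of what "outside the core" could look like, an
application can register its own codec through codecs.register. The
codec name "emi-section-2" is the hypothetical one used above, and the
table below is only a placeholder (printable ASCII mapped to itself),
NOT the real table from the EMI specification. The sketch is written for
Python 3 and ignores the errors argument.

```python
import codecs

CODEC_NAME = "emi-section-2"  # hypothetical name, as used in this message

# Placeholder table, NOT the real EMI section 2 table:
# printable ASCII maps to itself so the sketch stays runnable.
_ENCODE_TABLE = {chr(i): i for i in range(0x20, 0x7F)}
_DECODE_TABLE = {v: k for k, v in _ENCODE_TABLE.items()}


def _encode(text, errors="strict"):
    # The errors argument is ignored; unmapped characters always raise.
    out = bytearray()
    for pos, ch in enumerate(text):
        if ch not in _ENCODE_TABLE:
            raise UnicodeEncodeError(CODEC_NAME, text, pos, pos + 1,
                                     "character not in table")
        out.append(_ENCODE_TABLE[ch])
    return bytes(out), len(text)


def _decode(data, errors="strict"):
    data = bytes(data)  # the codec machinery may pass a memoryview
    chars = []
    for pos, byte in enumerate(data):
        if byte not in _DECODE_TABLE:
            raise UnicodeDecodeError(CODEC_NAME, data, pos, pos + 1,
                                     "byte not in table")
        chars.append(_DECODE_TABLE[byte])
    return "".join(chars), len(data)


def _search(name):
    # Depending on the Python version, the lookup name arrives with
    # hyphens or underscores, so accept both spellings.
    if name.replace("-", "_") == "emi_section_2":
        return codecs.CodecInfo(encode=_encode, decode=_decode,
                                name=CODEC_NAME)
    return None


codecs.register(_search)
```

With the search function registered, "Hi".encode("emi-section-2") goes
through _encode like any built-in codec would, without any change to the
standard library.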

----------
nosy: +loewis

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue3649>
_______________________________________