PEP263 (Specifying encoding) and bytecode strings

Alex Martelli aleax at aleax.it
Mon May 5 04:14:44 EDT 2003


Tony Meyer wrote:

>> > Is there some way to specify that all strings are
>> > bytecodes, and not encoded characters?
>> Couldn't resourcepackage just insert a suitable encoding
>> "declaration", which is after all a comment and thus
>> innocuous for any previous release of Python -- or else emit
>> non-ascii chars as escape sequences?
> 
> I probably phrased my question poorly: what, then, is the correct
> encoding for the output of zlib.compress()?  I know IANA has a list [1]
> of encodings, but it's not really clear which is the right one.
> 
> [zlib.compress() returns a string of '\xXX's, where XX is from 00 to
> FF.]
> 
> I'm happy to admit that I know almost nothing about encodings.
> Particularly, I don't see why '\xda' [2] is not considered ascii.  It
> should be an upper case U with an acute, according to rfc1345 [3] -
> isn't that what ascii is?

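No.  The A in ASCII stands for "American" (the rest of the letters
stand for "Standard Code for Information Interchange").  ASCII is a
7-bit code, so it only defines the values 0x00 through 0x7F; '\xda'
lies outside that range, and the upper case U with an acute that you
found belongs to 8-bit character sets such as ISO 8859-1.  The
standards body that approved ASCII is ANSI (the American National
Standards Institute), and the document that currently defines it is: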

Standard ANSI X3.4-1986, "US-ASCII. Coded Character Set - 7-Bit 
American Standard Code for Information Interchange".

US-ASCII is the preferred name of this encoding for MIME purposes,
by the way.  But almost always it's referred to as just ASCII.
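
To make the 7-bit point concrete, here is a minimal sketch in present-day
Python 3 (not the Python of this thread, where plain strings were byte
strings).  The helper name is_ascii is made up for illustration; the codec
names "ascii" and "iso-8859-1" are the standard ones.

```python
raw = b"\xda"  # the byte from the question: value 218, above the 7-bit range


def is_ascii(data: bytes) -> bool:
    """Hypothetical helper: True if every byte fits in 7 bits (0..127)."""
    return all(b < 128 for b in data)


# Decoding as ASCII fails because 0xDA is not a defined ASCII value.
try:
    raw.decode("ascii")
    decoded_as_ascii = True
except UnicodeDecodeError:
    decoded_as_ascii = False

# Decoding as ISO 8859-1 succeeds and yields U+00DA,
# LATIN CAPITAL LETTER U WITH ACUTE.
as_latin1 = raw.decode("iso-8859-1")
```

So the character in the rfc1345 table is real, but it lives in an 8-bit
character set rather than in ASCII itself.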

I suspect (but cannot be sure) that ISO 8859-1 is the encoding
you want to use for your purposes.
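
The reason ISO 8859-1 is a plausible fit is that it assigns a character to
every byte value from 0x00 to 0xFF, so arbitrary binary data survives a
round trip through it.  A small sketch, again in present-day Python 3 and
with a made-up sample payload, assuming the goal is only to carry
zlib.compress() output through a text layer unchanged:

```python
import zlib

original = b"hello world " * 10  # sample data, chosen only for illustration
payload = zlib.compress(original)  # arbitrary bytes in the range 0x00..0xFF

# ISO 8859-1 maps each byte to the code point with the same value,
# so decoding never fails and encoding gives back the same bytes.
text = payload.decode("iso-8859-1")
restored = text.encode("iso-8859-1")

# The same holds for every possible byte value.
all_bytes = bytes(range(256))
all_round_trip = all_bytes.decode("iso-8859-1").encode("iso-8859-1")
```

This shows only that the mapping is lossless; whether it suits
resourcepackage's generated modules is a separate question.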


Alex




