[Python-Dev] ascii default encoding

Fredrik Lundh fredrik@pythonware.com
Tue, 18 Jul 2000 12:21:54 +0200


jack wrote:

> On the Mac, under CodeWarrior, the default locale (and, actually, the only
> locale) is "C", and it uses the mac-roman charset, so including the upper
> 128 characters.

I could have sworn that the "C" locale only included characters in
the ASCII range [1], but a more careful reading of the specification
indicates that the behaviour is only *defined* for characters in the
ASCII range...

two alternatives:

    -- limit the string constants to the portable range (0..127).
       if you want more characters, you have to explicitly call
       locale.setlocale(..., "C").

    -- leave it as it is.  this means that code that does things like
       "if char in string.whitespace" will break if char is unicode,
       and you're running on a mac.
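
to make the second alternative concrete, here is a small sketch in present-day Python 3 (an assumption; this thread predates it), where `bytes` stands in for the 8-bit string and `str` for unicode. it only illustrates why byte 202 and code point 202 disagree under mac-roman; the variable names are made up for the example.

```python
import string

raw = bytes([202])  # the 8-bit character written as chr(202) in the thread

# under mac-roman, byte 202 is NO-BREAK SPACE, i.e. a whitespace character
as_mac_roman = raw.decode("mac_roman")

# as a unicode code point, 202 is LATIN CAPITAL LETTER E WITH CIRCUMFLEX
as_code_point = chr(202)

# first alternative: whitespace limited to the portable range (0..127)
portable_whitespace = "".join(c for c in string.whitespace if ord(c) < 128)

print(as_mac_roman.isspace(), as_code_point.isspace())
print(as_mac_roman in portable_whitespace, as_code_point in portable_whitespace)
```

so a whitespace test gives different answers depending on whether the value 202 arrived as a mac-roman byte or as a unicode character.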

on second thought, it's probably better to just fix mixed string
comparisons...  (so that instead of throwing an exception, unichr(202)
will simply be != chr(202)...)
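
a rough sketch of that idea, again in Python 3 terms (an assumption, with `bytes` playing the 8-bit string). `lenient_equal` is a hypothetical helper, not an existing API: it treats an undecodable 8-bit string as simply unequal instead of raising.

```python
def lenient_equal(text, raw, encoding="ascii"):
    """Compare a unicode string with an 8-bit string.

    If `raw` cannot be decoded with `encoding`, the two are
    reported as unequal rather than raising an exception.
    """
    try:
        return text == raw.decode(encoding)
    except UnicodeDecodeError:
        return False


# byte 202 is outside ASCII, so the comparison is just "not equal"
print(lenient_equal(chr(202), bytes([202])))

# pure ASCII data still compares as before
print(lenient_equal("a", b"a"))
```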

</F>

1) not quite true: the portable character set only lists all characters
in the ASCII character set, it doesn't define what code points they
should use.
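
as a small illustration of that footnote (Python 3, using two codecs from its standard library as an assumed example): the same character can sit at different code points in different encodings.

```python
# the character "A" is in both repertoires, but at different byte values
ascii_a = "A".encode("ascii")    # 0x41 in ASCII
ebcdic_a = "A".encode("cp037")   # 0xC1 in EBCDIC (code page 037)

print(hex(ascii_a[0]), hex(ebcdic_a[0]))
```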