IDLE "Codepage" Switching?

Stephen Tucker stephen_tucker at sil.org
Wed Jan 18 05:43:01 EST 2023


Thanks for these responses.

I was encouraged to read that I'm not the only one to find this all
confusing.

I have investigated a little further.

1. I produced the following IDLE log:

>>> mylongstr = ""
>>> for thisCP in range (1, 256):
    mylongstr += chr (thisCP) + " " + str (ord (chr (thisCP))) + ", "


>>> print mylongstr
1, 2, 3, 4, 5, 6, 7, 8, 9,
 10, 11, 12,
 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
31,   32, ! 33, " 34, # 35, $ 36, % 37, & 38, ' 39, ( 40, ) 41, * 42, + 43,
, 44, - 45, . 46, / 47, 0 48, 1 49, 2 50, 3 51, 4 52, 5 53, 6 54, 7 55, 8
56, 9 57, : 58, ; 59, < 60, = 61, > 62, ? 63, @ 64, A 65, B 66, C 67, D 68,
E 69, F 70, G 71, H 72, I 73, J 74, K 75, L 76, M 77, N 78, O 79, P 80, Q
81, R 82, S 83, T 84, U 85, V 86, W 87, X 88, Y 89, Z 90, [ 91, \ 92, ] 93,
^ 94, _ 95, ` 96, a 97, b 98, c 99, d 100, e 101, f 102, g 103, h 104, i
105, j 106, k 107, l 108, m 109, n 110, o 111, p 112, q 113, r 114, s 115,
t 116, u 117, v 118, w 119, x 120, y 121, z 122, { 123, | 124, } 125, ~
126, 127, タ 128, チ 129, ツ 130, テ 131, ト 132, ナ 133, ニ 134, ヌ 135, ネ 136, ノ
137, ハ 138, ヒ 139, フ 140, ヘ 141, ホ 142, マ 143, ミ 144, ム 145, メ 146, モ 147,
ヤ 148, ユ 149, ヨ 150, ラ 151, リ 152, ル 153, レ 154, ロ 155, ワ 156, ン 157, ゙
158, ゚ 159, ᅠ 160, ᄀ 161, ᄁ 162, ᆪ 163, ᄂ 164, ᆬ 165, ᆭ 166, ᄃ 167, ᄄ 168,
ᄅ 169, ᆰ 170, ᆱ 171, ᆲ 172, ᆳ 173, ᆴ 174, ᆵ 175, ᄚ 176, ᄆ 177, ᄇ 178, ᄈ
179, ᄡ 180, ᄉ 181, ᄊ 182, ᄋ 183, ᄌ 184, ᄍ 185, ᄎ 186, ᄏ 187, ᄐ 188, ᄑ 189,
ᄒ 190, ﾿ 191, À 192, Á 193, Â 194, Ã 195, Ä 196, Å 197, Æ 198, Ç 199, È
200, É 201, Ê 202, Ë 203, Ì 204, Í 205, Î 206, Ï 207, Ð 208, Ñ 209, Ò 210,
Ó 211, Ô 212, Õ 213, Ö 214, × 215, Ø 216, Ù 217, Ú 218, Û 219, Ü 220, Ý
221, Þ 222, ß 223, à 224, á 225, â 226, ã 227, ä 228, å 229, æ 230, ç 231,
è 232, é 233, ê 234, ë 235, ì 236, í 237, î 238, ï 239, ð 240, ñ 241, ò
242, ó 243, ô 244, õ 245, ö 246, ÷ 247, ø 248, ù 249, ú 250, û 251, ü 252,
ý 253, þ 254, ÿ 255,
>>>

2. I copied and pasted the IDLE log into a text file and ran a program on
it that told me about every byte in the log.

3. I discovered the following:

Bytes 1 to 127 (01 to 7F hex) inclusive were printed as-is;

Bytes 128 to 191 (80 to BF hex) inclusive were output as UTF-8-encoded
characters whose codepoints were FF00 hex more than the byte values (hence
the strange glyphs);

Bytes 192 to 255 (C0 to FF hex) inclusive were output as UTF-8-encoded
characters, with no offset added to their codepoints!
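
To make the pattern in item 3 concrete, here is a minimal Python 3 sketch
of the mapping as I have described it. It only models what the pasted log
appeared to contain; it is not taken from IDLE's source, and the names
observed_codepoint and utf8_hex are made up for this illustration.

```python
def observed_codepoint(byte_value):
    """Code point the pasted log appeared to contain for a byte value.

    Assumption: this encodes only the three ranges described above,
    not anything known about how IDLE works internally.
    """
    if 0x80 <= byte_value <= 0xBF:
        # Bytes 128 to 191 showed up shifted by FF00 hex.
        return byte_value + 0xFF00
    # Bytes 1 to 127 and 192 to 255 showed up with no offset.
    return byte_value


def utf8_hex(code_point):
    """Return the UTF-8 encoding of a code point as spaced hex bytes."""
    return " ".join("%02X" % b for b in chr(code_point).encode("utf-8"))


# Print only hex values, so the output does not depend on the console font.
for byte_value in (0x41, 0x7F, 0x80, 0xBF, 0xC0, 0xFF):
    cp = observed_codepoint(byte_value)
    print("byte %02X -> U+%04X -> UTF-8 %s" % (byte_value, cp, utf8_hex(cp)))
```

For byte 80 hex this gives U+FF80, encoded in UTF-8 as EF BE 80, whereas
byte C0 hex stays at U+00C0, encoded as C3 80.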

I thought you might just be interested in this - there does seem to be some
method in IDLE's mind, at least.

Stephen Tucker.

On Wed, Jan 18, 2023 at 9:41 AM Peter J. Holzer <hjp-python at hjp.at> wrote:

> On 2023-01-17 22:58:53 -0500, Thomas Passin wrote:
> > On 1/17/2023 8:46 PM, rbowman wrote:
> > > On Tue, 17 Jan 2023 12:47:29 +0000, Stephen Tucker wrote:
> > > > 2. Does the IDLE in Python 3.x behave the same way?
> > >
> > > fwiw
> > >
> > > Python 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0] on linux
> > > Type "help", "copyright", "credits" or "license()" for more information.
> > > str = ""
> > > for c in range(140, 169):
> > >      str += chr(c) + " "
> > >
> > > print(str)
> > > Œ   Ž     ‘ ’ “ ” • – — ˜ ™ š › œ   ž Ÿ   ¡ ¢ £ ¤ ¥
> > > ¦ § ¨
> > >
> > >
> > > I don't know how this will appear since Pan is showing the icon for a
> > > character not in its set.  However, even with more undefined characters
> > > the printable ones do not change. I get the same output running Python3
> > > from the terminal so it's not an IDLE thing.
> >
> > I'm not sure what explanation is being asked for here.  Let's take Python3,
> > so we can be sure that the strings are in unicode.  The font being used by
> > the console isn't mentioned, but there's no reason it should have glyphs for
> > any random unicode character.
>
> Also note that the characters between 128 (U+0080) and 159 (U+009F)
> inclusive aren't printable characters. They are control characters.
>
>         hp
>
> --
>    _  | Peter J. Holzer    | Story must make more sense than reality.
> |_|_) |                    |
> | |   | hjp at hjp.at         |    -- Charles Stross, "Creative writing
> __/   | http://www.hjp.at/ |       challenge!"
> --
> https://mail.python.org/mailman/listinfo/python-list
>

