PEP 3131: Supporting Non-ASCII Identifiers

Wed May 16 09:46:10 EDT 2007

Eric Brunel:

> Funny you talk about Japanese, a language I'm a bit familiar with and 
> for which I actually know some input methods. The thing is, these only 
> work if you know the transcription to the latin alphabet of the word you 
> want to type, which closely matches its pronunciation. So if you don't 
> know that 売り場 is pronounced "uriba" for example, you have absolutely 
> no way of entering the word. Even if you could choose among a list of 
> characters, are you aware that there are almost 2000 "basic" Chinese 
> characters used in the Japanese language? And if I'm not mistaken, there 
> are several tens of thousands of characters in the Chinese language itself. 
> This makes typing them virtually impossible if you don't know the 
> language and/or have the correct keyboard.

    It is nowhere near that difficult. There are several ways to 
approach this. One is to break each character into pieces and look 
through the subset of characters that use a given piece (the Radical 
part of the IME). For 売, you can start with the cross with a short 
bottom stroke at the top of the character, 士; for 場, look for the 
cross-shaped piece on the left, 土. The middle character is simple 
looking, so it is probably not Chinese, and I found it in Hiragana. 
Another approach is to count strokes (the Strokes section of the IME) 
and look through the characters with that number of strokes. Within 
each list, the characters are ordered from simplest to most complex, so 
you can get a feel for where to look.
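
    The two lookups described above can be sketched in a few lines of 
Python. This is only an illustration, not how any particular IME is 
implemented: the table KANJI_TABLE and the helpers by_radical and 
by_strokes are hypothetical names, and the table holds a handful of 
hand-entered characters where a real IME ships a full dictionary.

```python
# Hypothetical miniature table: (character, radical, total stroke count).
# Rows are listed from simplest to most complex, as in an IME's lists.
KANJI_TABLE = [
    ("士", "士", 3),
    ("土", "土", 3),
    ("地", "土", 6),
    ("売", "士", 7),
    ("声", "士", 7),
    ("坂", "土", 7),
    ("場", "土", 12),
]


def by_radical(radical):
    """Return the characters filed under `radical`, simplest first."""
    rows = [row for row in KANJI_TABLE if row[1] == radical]
    # sorted() is stable, so characters with equal stroke counts keep table order.
    return [char for char, _, _ in sorted(rows, key=lambda row: row[2])]


def by_strokes(count):
    """Return the characters that have exactly `count` strokes."""
    return [char for char, _, strokes in KANJI_TABLE if strokes == count]


if __name__ == "__main__":
    print(by_radical("士"))  # candidates to scan when looking for 売
    print(by_radical("土"))  # candidates to scan when looking for 場
    print(by_strokes(7))     # everything in the table with seven strokes
```

    Either way, the user only has to recognise a shape or count strokes 
and then scan a short candidate list; no knowledge of the pronunciation 
is needed.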

    Neil