transliteration in Python
Frederick H. Bartlett
fbartlet at optonline.net
Fri Jan 4 09:14:04 EST 2002
Giorgi,
But transliteration isn't what encodings do. You are asking that an
encoding manage to handle variable-length ASCII strings and just know
what to do with them. Given your question, you must want the "shch" in
"shchi" to be replaced by the Cyrillic character Unicode knows as "shcha"
(0429/0449);
meanwhile "sh" would be "sha" (0428/0448), "ch" would be "che"
(0427/0447), "k" would be "ka" (041a/043a), and "kh" would be "ha"
(0425/0445). That's a lot of intelligence to build into an encoding.
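
For what it's worth, the transliteration step can be kept separate from the
encoding step. What follows is only a rough sketch, not anything shipped in
the encodings package: the TRANSLIT table and the transliterate() function are
names made up for illustration, the table covers only the lowercase letters
mentioned above plus "i", and it assumes that "KOIR-8" means the codec Python
calls "koi8_r". It tries the longest Latin key first, so "shch" wins over "sh"
and "kh" wins over "k".

```python
# Hypothetical, deliberately tiny mapping table (lowercase only).
# A real table would need every letter plus the uppercase forms.
TRANSLIT = {
    "shch": "\u0449",  # shcha
    "sh": "\u0448",    # sha
    "ch": "\u0447",    # che
    "kh": "\u0445",    # ha
    "k": "\u043a",     # ka
    "i": "\u0438",     # i
}


def transliterate(text, table=TRANSLIT):
    """Replace Latin sequences with Cyrillic, longest match first."""
    # Sort keys by length so that "shch" is tried before "sh" or "ch".
    keys = sorted(table, key=len, reverse=True)
    out = []
    pos = 0
    while pos < len(text):
        for key in keys:
            if text.startswith(key, pos):
                out.append(table[key])
                pos += len(key)
                break
        else:
            # No key matches here: copy the character through unchanged.
            out.append(text[pos])
            pos += 1
    return "".join(out)


# Step 1: transliterate Latin text to a Unicode (Cyrillic) string.
cyrillic = transliterate("shchi")

# Step 2: only now does an encoding come in, turning Unicode into bytes.
koi8 = cyrillic.encode("koi8_r")
```

The point of the two steps is that the codec never sees the Latin text at
all; it only maps one Unicode character to one byte.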
My experience with encoding issues comes from TeX, where I used to
design my own encodings in .tfm files so that I could type Classical
Greek, Russian, and Georgian in American ASCII with reasonable clarity.
(Clear to me, anyway.) But in TeX one can use ligatures to accomplish
neat encoding effects; no other encoding works that way.
Fred
Giorgi Lekishvili wrote:
>
> Hello!
>
> I am sorry to say this, but I was pretty confused exploring the
> encodings folder in the Python21 distribution.
>
> Can someone smarter than me explain the common syntax of transliteration
> of one encoding to another?
>
> Suppose we have string sl="shchi" and want to recode it in "KOIR-8". How
> can this be achieved?
>
> Thank you.
>
> Greetings,
> Giorgi