[Tutor] Absolute newbie - Transliteration
Bob Gailer
bgailer@alum.rpi.edu
Wed May 21 11:46:01 2003
At 11:51 PM 5/20/2003 -0700, David Rogers wrote:
>Hi
>
>I'm an absolute newbie - this is my first attempt with Python or any
>"real" language, so my advance apologies for any stupid comments. I
>joined the list just to ask this question, after doing a little searching
>in the list archives and the documentation and not being able to find out
>what I want to know.
>
>I'm trying to make scripts to transliterate a file from (Unicode) Cyrillic
>characters to each of
>- Roman script, and
>- International Phonetic Alphabet (more Unicode).
>
>(Whether I end up with separate scripts, one for each transliteration, or
>one script for all with a bigger dictionary/list/table, is not important
>to me.)
>
>The transliteration will not always be one-to-one in terms of the number
>of characters, for example the "ch" sound is one letter in Russian but
>corresponds to two letters in English.
>
>I have found the following in the Python web documentation...
>
>>translate(table[, deletechars])
>>
>>
>>Return a copy of the string where all characters occurring in the
>>optional argument deletechars are removed, and the remaining characters
>>have been mapped through the given translation table, which must be a
>>string of length 256.
>
>
>...but I don't understand what format my table needs to be in, or even if
>this accommodates Unicode, or the problem of one character sometimes
>translating to two. If I'm completely on the wrong track here, somebody
>laugh now before it's too late. :-)
>
>
>What I don't want is a pointer to a non-modifiable Cyrillic-to-Roman
>transliteration application, because I want to re-use what I do here when
>I make other transliteration tables to speed up IPA transcription from
>other languages too. I love IPA. :-)
>
>On the other hand, if somebody has already done something like what I
>want, in a script I can modify for other uses, then I'm all ears.
>(Some of me is ears all the time.) I'm happy to make the lists,
>dictionary entries, or whatever format they need to be in - I just want to
>know how to get Python to read this stuff and then give me back the right
>thing.
>
>I'm using Mac OS X, if it makes any difference.
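The translate() you quoted is the one for plain 8-bit strings, not Unicode:
its table is a 256-character string that maps each character to exactly one
other character, so it can't turn one letter into two. My best guess is a
dictionary that maps each Cyrillic character to its replacement string; you
then go through the text one character at a time and look each one up.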
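
Here is a minimal sketch of the dictionary idea, assuming a current Python 3
(where every str is Unicode). The table name CYRILLIC_TO_ROMAN, the function
names, and the handful of romanizations in it are only illustrative
placeholders, not a complete or standard scheme; you would fill in your own
entries, and make a second table for IPA.

```python
# Hypothetical, deliberately incomplete table: one Cyrillic letter -> Roman text.
# A value may be longer than one character (e.g. "ch").
CYRILLIC_TO_ROMAN = {
    "ч": "ch",
    "Ч": "Ch",
    "а": "a",
    "й": "y",
}


def transliterate(text, table):
    """Replace each character found in table; leave all others unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


def transliterate_builtin(text, table):
    """Same result using str.translate, which in Python 3 accepts a mapping
    whose values may be strings of any length."""
    return text.translate(str.maketrans(table))


print(transliterate("чай", CYRILLIC_TO_ROMAN))          # chay
print(transliterate_builtin("чай", CYRILLIC_TO_ROMAN))  # chay
```

Characters that are not in the table (spaces, punctuation, anything you have
not mapped yet) pass through untouched, so you can build the table up a few
letters at a time. To reuse the script for another language, swap in a
different dictionary.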
Bob Gailer