transliteration in Python
Jason Orendorff
jason at jorendorff.com
Fri Jan 4 04:58:00 EST 2002
> Can someone smarter than me explain the common syntax of transliteration
> of one encoding to another?
No. But if you can settle for a dumber-than-a-box-of-hammers person,
read on...
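s1 = b"shchi" # start with some ASCII bytes
u = s1.decode('ascii') # decode them into a unicode string
s2 = u.encode('utf16') # encode it as UTF-16 bytes (a byte-order mark is prepended)
outfile = open('out.bin', 'wb') # 'out.bin' is just a placeholder name; note the binary mode
outfile.write(s2) # write them to a binary file, for example
outfile.close()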
[In this case, I know that s1 is ASCII-encoded, because I typed in the
letters "shchi" and I know that those are all ASCII characters, and Python
and my computer both handle ASCII just fine by default. But if you
*don't* know the encoding of s1, there is in general no reliable way
to find out. Sometimes you can make a pretty good heuristic guess.]
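One rough way to make that kind of guess is to try a strict decode with a few candidate encodings and take the first one that doesn't blow up. Here is a minimal sketch for a current Python 3; the name guess_decode and the candidate list are my own inventions for illustration, not anything standard. Keep in mind that a successful decode is only evidence, not proof, and that most single-byte encodings will happily accept nearly any bytes, so they are useless as candidates here.

```python
def guess_decode(data, candidates=("ascii", "utf-8")):
    """Return (text, encoding) for the first candidate that decodes cleanly.

    Hypothetical helper: returns (None, None) if no candidate fits.
    """
    for name in candidates:
        try:
            # bytes.decode is strict by default, so bad input raises.
            return data.decode(name), name
        except UnicodeDecodeError:
            continue
    return None, None


# "\u0449\u0438" is the Cyrillic word that "shchi" transliterates.
shchi_cyrillic = "\u0449\u0438"

print(guess_decode(b"shchi"))
print(guess_decode(shchi_cyrillic.encode("utf-8")))
print(guess_decode(shchi_cyrillic.encode("koi8_r")))
```

The last call comes back empty-handed: the KOI8-R bytes are neither ASCII nor valid UTF-8, and nothing in the bytes themselves says which 8-bit encoding they really are.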
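Python only supports a limited set of encodings out of the box. "KOIR-8",
as you spelled it, isn't a name Python recognizes. If you meant KOI8-R, the
common Russian encoding, Python's codecs do know that one, as 'koi8_r'.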
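If you're not sure whether your Python knows a given encoding name, you can just ask it. The sketch below assumes a current Python 3, where codecs.lookup raises LookupError for a name it doesn't recognize; has_codec and transcode are hypothetical helper names I made up for this example. It also shows the same decode-then-encode pattern going from KOI8-R bytes to UTF-16.

```python
import codecs


def has_codec(name):
    """Return True if this Python recognizes the encoding name."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


def transcode(data, source, target):
    """Decode bytes from one encoding, then re-encode them in another."""
    return data.decode(source).encode(target)


print(has_codec("koir-8"))  # the misspelled name
print(has_codec("koi8_r"))  # the name Python uses for KOI8-R

# Two Cyrillic letters as KOI8-R bytes, converted to UTF-16 bytes.
koi8_bytes = "\u0449\u0438".encode("koi8_r")
utf16_bytes = transcode(koi8_bytes, "koi8_r", "utf-16")
print(repr(koi8_bytes), repr(utf16_bytes))
```

Going through a unicode string in the middle is the whole trick: every codec only has to know how to get to and from Unicode, not to and from every other encoding.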
## Jason Orendorff http://www.jorendorff.com/