replacing Chinese chars with their spellings
Boudewijn Rempt
boud at valdyas.org
Wed Apr 24 16:06:07 EDT 2002
Dan Jacobson wrote:
> Before I start learning python, here's what I want to do: I have a
> table of Hakka Chinese words and their pronunciations. I scan a file
> and replace any Hakka ["big5" 2, 4, 6 ... byte long strings] there
> with their pronunciations. If it were just one character [two byte]
> words I would use the "c2t" program. Is there a template that munches
> forth in a file and replaces the longest match in a database
> before moving on?
Is the whole file in Big-5? If so, the easiest approach is to convert
the file to Unicode with iconv and then open it in Python. If you also
convert your table to Unicode, you can match on whole Unicode
characters instead of two-byte sequences.
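
A minimal sketch of the longest-match replacement, assuming a current
Python 3 (which can decode Big-5 itself through its "big5" codec, so
iconv becomes optional). The names build_replacer and romanize_big5 and
the sample table entries are made up for illustration; the spellings are
placeholders, not real Hakka data.

```python
import re

def build_replacer(table):
    """Return a function that replaces every table key found in a string.

    Keys are tried longest first, so at each position the longest
    matching word wins before the scan moves on.
    """
    if not table:
        return lambda text: text  # nothing to replace
    # Regex alternation picks the first alternative that matches,
    # so sorting by length (descending) gives longest-match behaviour.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return lambda text: pattern.sub(lambda m: table[m.group(0)], text)

def romanize_big5(data, replace):
    """Decode Big-5 bytes to a Unicode string, then apply the replacer."""
    return replace(data.decode("big5"))

# Hypothetical table: placeholder spellings, not real pronunciation data.
table = {
    "\u5ba2": "hak",            # one-character word
    "\u5ba2\u5bb6": "hak-ka",   # two-character word starting with the same character
    "\u8a71": "fa",
}

replace = build_replacer(table)
print(replace("x\u5ba2\u5bb6\u8a71y\u5ba2"))
```

For a real file you would read the bytes (or open it with
encoding="big5") and load the table from your own data. With a very
large table, a dictionary lookup that tries the longest candidate
substring first may scale better than one big regular expression.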
--
Boudewijn Rempt | http://www.valdyas.org