Shift-JIS to UTF-8 conversion

rbsharp at gmx.de
Fri May 20 03:16:15 EDT 2005


Hello,
I think the answer is basically correct, but shift-jis is not a standard
part of Python 2.3. You will need either to use Python 2.4, where the
cjkcodecs are integrated, or to install them under Python 2.3. The link is
http://cjkpython.i18n.org/

Under Python 2.3 you then also need:
import cjkcodecs.aliases
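
A quick way to tell which case applies is to ask the codec registry for
the name before opening any files. This is only a sketch: the helper name
have_codec is made up for illustration, and the fallback import assumes
the cjkcodecs package from the link above has been installed.

```python
import codecs

def have_codec(name):
    """Return True if Python can find a codec registered under `name`."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True

if not have_codec("shift-jis"):
    # Assumption: we are on Python 2.3 and the separately installed
    # cjkcodecs package is available; importing its aliases module
    # registers the CJK codec names.
    import cjkcodecs.aliases
```

On an interpreter that already ships the CJK codecs the import is never
reached, so the same script can be used in both setups.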

Richard

Jeff Epler wrote:
> I think you do something like this (untested):
>
> import codecs
>
> def transcode(infile, outfile, incoding="shift-jis",
>         outcoding="utf-8"):
>     f = codecs.open(infile, "rb", incoding)
>     g = codecs.open(outfile, "wb", outcoding)
>
>     g.write(f.read())
> # If the file is so large that it can't be read at once, do a loop
> # which reads and writes smaller chunks
> #    while 1:
> #        block = f.read(4096000)
> #        if not block: break
> #        g.write(block)
> 
>     f.close()
>     g.close()
> 
> Jeff
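
For reference, the commented-out loop in the quoted function could be
written out roughly as follows. This is a sketch, not Jeff's code: the
name transcode_chunked and the blocksize parameter are invented here, and
it uses io.open (available from Python 2.6 on) in place of codecs.open,
with newline="" so that line endings pass through unchanged.

```python
import io

def transcode_chunked(infile, outfile, incoding="shift-jis",
                      outcoding="utf-8", blocksize=65536):
    """Re-encode infile into outfile, reading blocksize characters at a time."""
    f = io.open(infile, "r", encoding=incoding, newline="")
    g = io.open(outfile, "w", encoding=outcoding, newline="")
    try:
        while True:
            # read() counts decoded characters, so a multi-byte
            # character is never split across two blocks.
            block = f.read(blocksize)
            if not block:
                break
            g.write(block)
    finally:
        # Close both files even if decoding fails part-way through.
        f.close()
        g.close()
```

A call such as transcode_chunked("in.txt", "out.txt") would then behave
like the quoted function while keeping only one block in memory at a time;
the file names are placeholders.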



