encoding latin1 to utf-8

Piet van Oostrum piet at cs.uu.nl
Mon Sep 10 15:34:02 EDT 2007


>>>>> Harshad Modi <modiinfo at gmail.com> (HM) wrote:

>HM> Hello,
>HM> I wrote a function for converting a file from latin1 to utf-8, but I
>HM> think it does not work properly.
>HM> Please guide me.

>HM> It does not give the proper result: for example, I got "Belgi�"
>HM> (Belgium) using this method:

>HM> import codecs
>HM> import sys
>HM> # Encoding / decoding functions
>HM> def encode(filename):
>HM>  file = codecs.open(filename, encoding="latin-1")
>HM>  data = file.read()
>HM>  file = codecs.open(filename,"wb", encoding="utf-8")
>HM>  file.write(data)

>HM> file_name=sys.argv[1]
>HM> encode(file_name)

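I tried this program and it works correctly for me. So you probably used
the wrong input file (for example, one that is not actually Latin-1) or
you misinterpreted the output (for example, by viewing it in a program
that does not display UTF-8). To be sure, make hex dumps of your input and
output files.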
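
A minimal sketch of such a check, written for current Python 3 rather than
the Python 2 of the program above. The helper name hex_dump and the sample
word "België" are assumptions for illustration only; the original post does
not say which character was lost.

```python
def hex_dump(data, width=16):
    """Return a simple hex dump of a bytes object, `width` bytes per row."""
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        rows.append(f"{offset:08x}  {hex_part}")
    return "\n".join(rows)


# Assumed sample text ("België"); not taken from the original post.
sample = "Belgi\u00eb"

# What a genuine Latin-1 input file would contain: one byte (eb) for the e-diaeresis.
latin1_bytes = sample.encode("latin-1")

# What a correct UTF-8 output file would contain: two bytes (c3 ab).
utf8_bytes = sample.encode("utf-8")

# What results if the input was already UTF-8 but was decoded as Latin-1
# and then encoded to UTF-8 again (double encoding).
double_encoded = utf8_bytes.decode("latin-1").encode("utf-8")

print(hex_dump(latin1_bytes))    # 00000000  42 65 6c 67 69 eb
print(hex_dump(utf8_bytes))      # 00000000  42 65 6c 67 69 c3 ab
print(hex_dump(double_encoded))  # 00000000  42 65 6c 67 69 c3 83 c2 ab
```

To inspect real files, read them in binary mode, e.g.
open(filename, "rb").read(), and pass the result to hex_dump. If the output
file shows the second pattern, the conversion worked and the problem lies in
how the output is displayed. If it shows the third, the input was most
likely UTF-8 already.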
-- 
Piet van Oostrum <piet at cs.uu.nl>
URL: http://www.cs.uu.nl/~piet [PGP 8DAE142BE17999C4]
Private email: piet at vanoostrum.org


