Convert big5 to unicode

John Machin sjmachin at lexicon.net
Thu Sep 7 06:54:49 EDT 2006


xiejw top-posted:
> Install the codecs. In Debian, you can do:
>  apt-get install python-cjkcodecs

With Python 2.4 on Windows, no extra installation step is required;
the CJK codecs ship with the standard library.

| Python 2.4.3 (#69, Mar 29 2006, 17:35:34) [MSC v.1310 32 bit (Intel)] on win32
| >>> bc = '\xb1i'
| >>> unicode(bc, 'big5')
| u'\u5f35'
| >>>
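
For anyone reading this on Python 3 rather than the 2.4 used above: the
unicode() builtin is gone there, and the same conversion goes through
bytes.decode(). A minimal sketch under that assumption; the helper name
big5_to_unicode is made up for illustration and is not part of any library.

```python
def big5_to_unicode(data):
    """Decode a Big5-encoded byte string into a Unicode str (Python 3)."""
    return data.decode('big5')


# The same two bytes as in the interactive session above.
big5_bytes = b'\xb1i'
text = big5_to_unicode(big5_bytes)

# Once decoded, the text can be re-encoded to any other encoding.
utf8_bytes = text.encode('utf-8')
round_trip = text.encode('big5')
```

If the input may contain bytes that are not valid Big5, decode() raises
UnicodeDecodeError by default; passing errors='replace' as the second
argument substitutes U+FFFD instead.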

HTH,
John

>
> Then it is easy to decode (I use 'gb2312'):
>
>  str = '我们'
>  u = unicode(str,'gb2312')
>
> The conversion is done, and you can get the string as UTF-8:
>  str_utf8 = u.encode("utf-8")
>
> You can get the original string:
>  str_gb = u.encode("gb2312")
>
>
> GM wrote:
>
> > Dear all,
> >
> > Could you give me some guidance on how to convert my big5 string to
> > unicode using python? I already know that I might use cjkcodecs or
> > python 2.4, but I still have no idea what exactly I should do.
> > Please give me some sample code if you can. Thanks a lot.
> > 
> > Regards,
> > 
> > Gary



