Some questions about decode/encode

glacier rong.xian at gmail.com
Sun Jan 27 05:18:48 EST 2008


On Jan 24, 4:44 pm, Marc 'BlackJack' Rintsch <bj_... at gmx.net> wrote:
> On Wed, 23 Jan 2008 19:49:01 -0800, glacier wrote:
> > My second question is: has anyone tested decoding a very long mbcs
> > string? Yesterday I tried to decode a long (20+ MB) XML file, and the
> > result was very strange and caused SAX to fail to parse the decoded string.
>
> That's because SAX wants bytes, not a decoded string.  Don't decode it
> yourself.
>
> > However, when I used another text editor to convert the file to UTF-8,
> > SAX parsed the content successfully.
>
> Because now you feed SAX with bytes instead of a unicode string.
>
> Ciao,
>         Marc 'BlackJack' Rintsch

Yep. I fed SAX the unicode string because SAX doesn't support my
encoding (GBK).

Is there a better way to solve this?
I mean, if I shouldn't convert the GBK string to a unicode string, what
should I do to make SAX work?
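
To make the question concrete, here is a rough sketch of one workaround I
could try, doing in code what the text editor did: decode the GBK bytes,
rewrite the encoding declaration to UTF-8, re-encode, and hand SAX the bytes.
It is written for Python 3, and the names TextCollector, gbk_xml_to_utf8 and
parse_gbk_xml are made up for illustration. The regular expression assumes
the XML declaration, if present, sits at the very start of the data.

```python
import re
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler: gathers character data to show the parse worked."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def gbk_xml_to_utf8(raw):
    """Transcode GBK-encoded XML bytes to UTF-8 bytes."""
    text = raw.decode("gbk")
    # Assumption: the declaration (if any) is the first thing in the data.
    # Rewrite its encoding="..." value so it matches the new byte encoding.
    text = re.sub(
        r'^(<\?xml[^>]*?encoding=)(["\'])[^"\']*\2',
        r'\g<1>\g<2>utf-8\g<2>',
        text,
        count=1,
    )
    return text.encode("utf-8")


def parse_gbk_xml(raw):
    """Parse GBK-encoded XML bytes with SAX and return the collected text."""
    handler = TextCollector()
    xml.sax.parseString(gbk_xml_to_utf8(raw), handler)
    return "".join(handler.chunks)
```

With this, SAX only ever sees UTF-8 bytes, so it would not need to know
about GBK at all. For a 20+ MB file this holds the whole document in memory
more than once, which may or may not be acceptable.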

Thanks, Marc.
:)


