ignoring chinese characters parsing xml file

limodou limodou at gmail.com
Tue Oct 23 03:20:36 EDT 2007


On 10/23/07, Stefan Behnel <stefan.behnel-n05pAM at web.de> wrote:
> Fabian López wrote:
> > Thanks Mark, the code is like this. The attrib name is the problem:
> >
> > from lxml import etree
> >
> > context = etree.iterparse("file.xml")
> > for action, elem in context:
> >     if elem.tag == "weblog":
> >         print action, elem.tag , elem.attrib["name"],elem.attrib["url"],
>
> The problem is the print statement. Looks like your terminal encoding (that
> Python needs to encode the unicode string to) can't handle these unicode
> characters.
>
I agree. For Japanese, you need to know the exact encoding name your
terminal uses and encode the unicode string to it before printing, like this:

print text.encode('encoding')
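
A slightly fuller sketch of the same idea, for illustration only. The
helper name encode_for_terminal and the sample string are made up here,
and falling back to UTF-8 and using errors="replace" are my assumptions,
not something taken from the code above. It encodes the text to the
terminal's encoding (or one you name) and substitutes "?" for anything
that encoding can't represent, so printing doesn't raise an error:

```python
import sys


def encode_for_terminal(text, encoding=None, errors="replace"):
    """Encode a unicode string to bytes for output.

    If no encoding is given, use the one reported by sys.stdout and
    fall back to UTF-8 when it is unknown (an assumption, adjust as needed).
    Characters the encoding cannot represent are replaced with '?'.
    """
    if encoding is None:
        encoding = getattr(sys.stdout, "encoding", None) or "utf-8"
    return text.encode(encoding, errors)


# Hypothetical attribute value containing two Chinese characters.
name = u"\u4e2d\u6587"

utf8_bytes = encode_for_terminal(name, "utf-8")    # representable in UTF-8
ascii_bytes = encode_for_terminal(name, "ascii")   # not representable, gives '?'
```

With the loop above you would pass elem.attrib["name"] instead of the
sample string. Changing errors to "strict" brings back the exception if
you would rather see the failure than lose characters.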

-- 
I like python!
UliPad <<The Python Editor>>: http://code.google.com/p/ulipad/
meide <<wxPython UI module>>: http://code.google.com/p/meide/
My Blog: http://www.donews.net/limodou


