SAX unicode and ascii parsing problem

Steve Holden steve at holdenweb.com
Tue Nov 30 16:02:30 EST 2010


On 11/30/2010 3:43 PM, goldtech wrote:
> Hi,
> 
> I'm trying to parse an xml file using SAX. About half-way through a
> file I get this error:
> 
> Traceback (most recent call last):
>   File "C:\Python26\Lib\site-packages\pythonwin\pywin\framework\scriptutils.py", line 325, in RunScript
>     exec codeObject in __main__.__dict__
>   File "E:\sc\b2.py", line 58, in <module>
>     parser.parse(open(r'ppb5.xml'))
>   File "C:\Python26\Lib\xml\sax\expatreader.py", line 107, in parse
>     xmlreader.IncrementalParser.parse(self, source)
>   File "C:\Python26\Lib\xml\sax\xmlreader.py", line 123, in parse
>     self.feed(buffer)
>   File "C:\Python26\Lib\xml\sax\expatreader.py", line 207, in feed
>     self._parser.Parse(data, isFinal)
>   File "C:\Python26\Lib\xml\sax\expatreader.py", line 304, in end_element
>     self._cont_handler.endElement(name)
>   File "E:\sc\b2.py", line 51, in endElement
>     d.write(csv+"\n")
> UnicodeEncodeError: 'ascii' codec can't encode characters in position 146-147: ordinal not in range(128)
> 
> I'm using ActivePython 2.6. I'm trying to figure out the simplest fix.
> If there's a Python way to just take the source XML file and
> convert/process it so this will not happen - that would be best. Or
> should I just update to Python 3?
> 
> I tried this, thinking it might convert the file so that I could then
> parse the new file, but nothing changed - it didn't work:
> 
> uc = open(r'E:\sc\ppb4.xml').read().decode('utf8')
> ascii = uc.decode('ascii')
> mex9 = open( r'E:\scrapes\ppb5.xml', 'w' )
> mex9.write(ascii)
> mex9.close()
> 
> Again, I'm looking for something simple, even if it's a few more lines
> of code... or should I upgrade(?)
> 
> Thanks, appreciate any help.

I'm just as stumped as I was when you first asked this question 13
minutes ago. ;-)
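
For the archive: a traceback like the one quoted is what Python 2
typically produces when a unicode string (which is what SAX passes to the
handler) is written to a file opened in plain byte mode, because the
implicit conversion uses the ASCII codec. Below is a minimal sketch of one
possible fix, assuming the input is UTF-8 and that UTF-8 output is
acceptable. The RowWriter class, the convert function and the "item"
element name are made up for illustration and are not taken from the
original b2.py:

```python
import io
import xml.sax


class RowWriter(xml.sax.ContentHandler):
    """Hypothetical handler: writes the text of each matching element as one line."""

    def __init__(self, out, element_name):
        xml.sax.ContentHandler.__init__(self)
        self.out = out                    # text-mode file object that accepts unicode
        self.element_name = element_name
        self.chunks = None                # None means "not inside a matching element"

    def startElement(self, name, attrs):
        if name == self.element_name:
            self.chunks = []

    def characters(self, content):
        # The parser may deliver the text of one element in several pieces.
        if self.chunks is not None:
            self.chunks.append(content)

    def endElement(self, name):
        if name == self.element_name:
            self.out.write(u"".join(self.chunks) + u"\n")
            self.chunks = None


def convert(xml_path, out_path, element_name="item"):
    # io.open with an explicit encoding returns a file object that encodes
    # unicode to UTF-8 on write instead of falling back to ASCII.
    with io.open(out_path, "w", encoding="utf-8") as out:
        with open(xml_path, "rb") as source:
            xml.sax.parse(source, RowWriter(out, element_name))
```

Something like convert(r'ppb5.xml', r'out.csv') would then write the
non-ASCII characters as UTF-8 rather than raising. The only change that
matters is how the output file is opened; the same idea should carry over
to an existing handler by opening d with io.open(..., encoding='utf-8').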

regards
 Steve

-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
PyCon 2011 Atlanta March 9-17       http://us.pycon.org/
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/



