[XML-SIG] sax expatreader and unicode

Joe Murray jmurray@agyinc.com
Tue, 17 Apr 2001 18:21:04 -0700


What am I missing? The SAX expatreader seems unable to handle some
Unicode characters.  I thought this was supported.  I believe the xml.dom
modules handle Unicode characters just fine...

From the text:

"...LEX. IN NAïVE H4 AND CHO CELLS, PS1 CO-IMM..."

Output:
.
.
.

  File "analyzexml.py", line 68, in analyze_sax
    parser.parse(stream)
  File "/usr/local/lib/python2.0/site-packages/_xmlplus/sax/expatreader.py", line 43, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/local/lib/python2.0/site-packages/_xmlplus/sax/xmlreader.py", line 120, in parse
    self.feed(buffer)
  File "/usr/local/lib/python2.0/site-packages/_xmlplus/sax/expatreader.py", line 87, in feed
    self._parser.Parse(data, isFinal)
UnicodeError: UTF-8 decoding error: invalid data
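
One possible reading of the error, offered as an assumption rather than a
confirmed diagnosis: the "ï" in the text is stored as the single Latin-1
byte 0xEF, and the document carries no encoding declaration, so expat
falls back to UTF-8 and rejects the byte.  The sketch below reproduces
that situation with the standard-library xml.sax in a current Python 3,
not the Python 2.0 _xmlplus package from the traceback, so the failure
surfaces as a SAXParseException instead of a UnicodeError.  The names
TextCollector and parse_text, the <doc> element, and the sample bytes
are hypothetical.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler that gathers all character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_text(data):
    """Parse a bytes document and return its concatenated text."""
    handler = TextCollector()
    xml.sax.parseString(data, handler)
    return "".join(handler.chunks)


# Assumed input: Latin-1 bytes with no encoding declaration.
raw = b"<doc>NA\xefVE H4</doc>"

# Same bytes, but the prolog tells expat they are ISO-8859-1.
declared = b'<?xml version="1.0" encoding="iso-8859-1"?>' + raw

try:
    parse_text(raw)
except xml.sax.SAXParseException as exc:
    # Without a declaration the parser assumes UTF-8 and rejects 0xEF.
    print("undeclared:", exc.getMessage())

print("declared:", parse_text(declared))
```

If this assumption matches the real input, either adding an encoding
declaration to the document or re-encoding the file as UTF-8 before
parsing should avoid the error.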


jm

-- 
Joseph Murray
Bioinformatics Specialist, AGY Therapeutics
290 Utah Avenue, South San Francisco, CA 94080
(650) 228-1146