[XML-SIG] sax2 parsing from a string

Sam Brauer sam@webslingerZ.com
Thu, 27 Sep 2001 10:49:32 -0400 (EDT)


Thanks!  This set me on the right path.
I'd only done sax1 so far with PyXML, and did not realize that the
characters() method had a different number of arguments.
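
For the archive, here is a minimal sketch of a namespace-aware SAX2 parse
from a string. It assumes a current Python 3 standard library, so
io.BytesIO stands in for cStringIO; the TextCollector class and the
parse_string helper are hypothetical names used only for illustration.
The point to notice is that characters() takes a single content argument.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces
from xml.sax.xmlreader import InputSource


class TextCollector(ContentHandler):
    """Hypothetical handler: records element names and character data."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self.text = []

    def startElementNS(self, name, qname, attrs):
        # With feature_namespaces on, name is a (namespace URI, local name) tuple.
        self.elements.append(name)

    def characters(self, content):
        # SAX2 signature: one content argument, no start/length.
        self.text.append(content)


def parse_string(xmlbytes):
    """Parse a bytes string holding an XML document; return the handler."""
    handler = TextCollector()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(handler)
    source = InputSource()
    source.setByteStream(io.BytesIO(xmlbytes))
    parser.parse(source)  # no parser.close() needed after parse()
    return handler


handler = parse_string(b'<doc xmlns="urn:example"><p>hello</p></doc>')
print(handler.elements)
print("".join(handler.text))
```

The text is joined at the end because a parser may deliver character data
in more than one characters() call.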

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Sam Brauer : Systems Programmer : sam@webslingerZ.com

On Thu, 27 Sep 2001, Martin v. Loewis wrote:

> > Can someone give me a brief example showing how to create a
> > namespace-aware sax2 parser and use it to parse a string containing an
> > XML document?
>
> I see a number of confusing pieces of information in your message;
> perhaps you can help make sense of them.
>
> > parser = xml.sax.make_parser()
> > parser.setFeature(xml.sax.handler.feature_namespaces, 1)
> > parser.setContentHandler(myhandler)
> > inputsource = xml.sax.xmlreader.InputSource()
> > inbuffer = cStringIO.StringIO()
> > inbuffer.write(xmlstring)
> > inbuffer.seek(0)
> > inputsource.setByteStream(inbuffer)
> > parser.parse(inputsource)
> > parser.close()
>
> You don't need to close the parser if you use the .parse method; close
> is only needed when using it as an IncrementalParser (i.e. through feed).
>
> >     self._parser.Parse(data, isFinal)
> >   File "extensions/pyexpat.c", line 522, in CharacterData
> > TypeError: not enough arguments; expected 4, got 2
>
> I cannot reproduce this problem. Can you please find out what content
> handler exactly you gave to the expat reader? It appears that you
> somehow put in a character data handler that expects 4 arguments,
> whereas pyexpat will only pass 2 of them.
>
> To find this out, please print myhandler, and perhaps
> myhandler.characters.
>
> > If I replace the line:
> > inputsource.setByteStream(inbuffer)
> >
> > with:
> > inputsource.setCharacterStream(inbuffer)
> >
> >
> > I get:
> > Traceback (most recent call last):
>
> This is not so surprising: the character stream interface is inherited
> from Java, but it doesn't work in Python (yet?).
>
> > Also (on a tangent), I think in xml.sax.saxutils.XMLGenerator and
> > xml.sax.saxutils.XMLFilterBase that the characters() and
> > ignorableWhitespace() methods need to have 4 arguments instead of 2...
> >
> > For example:
> >    def characters(self, content, start, length):
> >        self._out.write(escape(content[start:start+length]))
>
> No, they don't. A SAX2 characters handler has only a single content
> argument; it was SAX1 where you had start and length arguments.
>
> Regards,
> Martin
>
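
The distinction Martin draws between parse() and the incremental
interface can be sketched as follows. This assumes a current Python 3
standard library, where the default reader returned by make_parser()
supports feed() and close(); the ChunkCollector class is a hypothetical
name used only for illustration.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class ChunkCollector(ContentHandler):
    """Hypothetical handler: accumulates character data."""

    def __init__(self):
        super().__init__()
        self.text = []

    def characters(self, content):
        self.text.append(content)


collector = ChunkCollector()
parser = xml.sax.make_parser()
parser.setContentHandler(collector)

# Incremental use: hand the document over in pieces, then close() to
# signal that no more data is coming.
for chunk in (b"<doc>hel", b"lo</doc>"):
    parser.feed(chunk)
parser.close()

print("".join(collector.text))
```

With parse() the reader finishes the document itself, which is why the
explicit close() is only part of the feed() pattern.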