Parsing XML streams

Peter Scott sketerpot at chase3000.com
Thu Sep 11 19:30:18 EDT 2003


I have a program that listens on an IRC channel and logs everything to
XML on standard output. The format of the XML is pretty
straightforward, looking like this:

<channel name='#sandbox'>
    <message user='PeterScott'>Hello, my bot</message>
    <message user='PeterScott'>This is a message</message>
    <nickchange>
        <oldnick>PeterScott</oldnick>
        <newnick>PeterSc</newnick>
    </nickchange>
</channel>

I'm writing another program that should parse that sort of XML on its
stdin, printing out a more user-friendly representation. For this, I
need to parse the XML as it comes in, not all at once.

I wrote a parser using xml.sax, and it works well, provided that I
read in the whole document first. However, I want to read the XML
piece by piece, with event handlers called as each element arrives,
and then wait for more input.

Is there some way to do this with the standard Python XML parsers?
Will I need to use PyXML? Or what?
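
One avenue that may be worth trying: the parser returned by
xml.sax.make_parser() is documented as an IncrementalParser, which has
feed() and close() methods, so input can be handed over a line at a
time instead of all at once. Below is a minimal sketch under that
assumption, written for current Python 3. The names ChannelLogHandler
and parse_stream, and the output format, are made up for illustration.

```python
import xml.sax


class ChannelLogHandler(xml.sax.ContentHandler):
    """Hypothetical handler: turns log elements into readable lines."""

    def __init__(self, emit=print):
        super().__init__()
        self._emit = emit      # called once per formatted line
        self._text = []        # character data of the current element
        self._user = None
        self._oldnick = None

    def startElement(self, name, attrs):
        self._text = []
        if name == "message":
            self._user = attrs.get("user")

    def characters(self, content):
        # Text can arrive in several pieces, so collect and join later.
        self._text.append(content)

    def endElement(self, name):
        text = "".join(self._text).strip()
        if name == "message":
            self._emit("<%s> %s" % (self._user, text))
        elif name == "oldnick":
            self._oldnick = text
        elif name == "newnick":
            self._emit("%s is now known as %s" % (self._oldnick, text))
        self._text = []


def parse_stream(stream, handler):
    """Feed a file-like object to a SAX parser one line at a time."""
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    for line in stream:
        parser.feed(line)      # handlers run as the parser consumes input
    parser.close()             # raises if the document is still incomplete
```

With this, parse_stream(sys.stdin, ChannelLogHandler()) would print
lines as the log comes in. Note that close() expects a finished
document, so for a log that never ends it would only be reached when
stdin hits end-of-file. How promptly the handlers fire after each
feed() may also depend on the underlying expat version.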

Thanks,
-Peter



