[Expat-discuss] use of expat with socket
Fred L. Drake, Jr.
fdrake@acm.org
Tue, 24 Jul 2001 15:59:52 -0400 (EDT)
Thomas J. Clancy writes:
> I'm new to this discussion group and new to expat, so forgive me if this
> question has already been asked (I didn't see anything related to this on
> the web site).
I'm sure it's been asked, though. This seems to come up for every
XML parser I've come across.
> I want to use expat to replace our own home brewed XML parser. The problem
> is that while I'm getting in data from a socket (the protocol to our product
> is in XML), I may get more than one XML document at a time. When I tried to
I don't know of a general-purpose XML parser that supports anything
like this, and (only half facetiously) hope I never do.
The problem is that, in the general case, there's no way to
determine if a stream is *supposed* to contain multiple documents.
What is needed is some external way to determine the end of the input;
you can then feed the parser data buffers until the end-of-buffer
function returns true. You can do this by embedding the chunks of XML
into another protocol; this should not be difficult if you can
determine the size of each XML document in bytes before sending it, so
that each document can be preceded by the byte-count. Otherwise,
you'll need a stream encoding that contains explicit end-of-file
markers.
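As a rough sketch of the byte-count approach, the following uses Python's
bundled expat binding (xml.parsers.expat) for brevity. The 4-byte big-endian
length prefix and the helper names frame(), read_exact() and
parse_documents() are assumptions made for illustration; they are not part
of expat or of any existing protocol. Each document gets a fresh parser, and
the final Parse() call is told that the document is complete.

```python
import io
import struct
import xml.parsers.expat

def frame(doc: bytes) -> bytes:
    """Prefix one XML document with its length (assumed 4-byte big-endian)."""
    return struct.pack("!I", len(doc)) + doc

def read_exact(stream, n: int) -> bytes:
    """Read exactly n bytes, or raise EOFError if the stream ends early."""
    buf = b""
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError("stream ended inside a frame")
        buf += chunk
    return buf

def parse_documents(stream):
    """Yield the root element name of each framed document in the stream."""
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of stream between frames
        if len(header) < 4:
            header += read_exact(stream, 4 - len(header))
        (size,) = struct.unpack("!I", header)
        body = read_exact(stream, size)
        names = []
        parser = xml.parsers.expat.ParserCreate()  # fresh parser per document
        parser.StartElementHandler = lambda name, attrs: names.append(name)
        parser.Parse(body, True)  # True: this is the last data for the document
        yield names[0]  # the first start tag seen is the root element

# Hypothetical usage with an in-memory stream standing in for the socket;
# with a real socket, something like sock.makefile("rb") could play this role.
stream = io.BytesIO(frame(b"<a><x/></a>") + frame(b"<b/>"))
for root in parse_documents(stream):
    print(root)
```

The receiver never has to guess where a document ends: the byte-count says
so, and the parser only ever sees one document at a time.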
> simulate this with the xmlwf app by creating a file that contained two xml
> documents, expat crapped out with:
>
> "junk after document element at line 7."
No, it didn't "crap out"; it found a real XML error! It just
depends on how you look at it.
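For illustration, the same condition can be reproduced without xmlwf. This
is a small sketch using Python's xml.parsers.expat binding; the helper name
parse_error() is hypothetical. It feeds two concatenated documents to a
single parser and returns expat's own error string.

```python
import xml.parsers.expat
from xml.parsers.expat import errors

def parse_error(data: bytes):
    """Return expat's error string for data, or None if it is well-formed."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True: no more data follows
    except xml.parsers.expat.ExpatError as exc:
        return xml.parsers.expat.ErrorString(exc.code)
    return None

# Two complete documents back to back are not one well-formed document.
two_docs = b"<a/>\n<b/>\n"
print(parse_error(two_docs))
```

The second root element is what the parser objects to: a well-formed XML
document has exactly one.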
-Fred
--
Fred L. Drake, Jr. <fdrake at acm.org>
PythonLabs at Digital Creations