[Expat-discuss] use of expat with socket

Fred L. Drake, Jr. fdrake@acm.org
Tue, 24 Jul 2001 15:59:52 -0400 (EDT)


Thomas J. Clancy writes:
 > I'm new to this discussion group and new to expat, so forgive me if this
 > question has already been asked (I didn't see anything related to this on
 > the web site).

  I'm sure it's been asked, though.  This seems to come up for every
XML parser I've come across.

 > I want to use expat to replace our own home brewed XML parser.  The problem
 > is that while I'm getting in data from a socket (the protocol to our product
 > is in XML), I may get more than one XML document at a time.  When I tried to

  I don't know of a general-purpose XML parser that supports anything
like this, and (only half facetiously) hope I never do.
  The problem is that, in the general case, there's no way to
determine if a stream is *supposed* to contain multiple documents.
What is needed is some external way to determine the end of the input;
you can then feed the parser data buffers until your end-of-input
test returns true.  You can do this by embedding the chunks of XML
into another protocol; this should not be difficult if you can
determine the size of each XML document in bytes before sending it, so
that each document can be preceded by its byte count.  Otherwise,
you'll need a stream encoding that contains explicit end-of-file
markers.
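
  As a rough illustration of the byte-count approach, here is a
minimal sketch using Python's xml.parsers.expat binding.  The wire
format (a 4-byte big-endian length before each document) and the
helper names frame(), read_exact() and parse_framed() are assumptions
made up for this example, not part of expat or of any existing
protocol.

```python
import io
import struct
import xml.parsers.expat


def frame(document: bytes) -> bytes:
    """Prefix one XML document with its length (hypothetical 4-byte header)."""
    return struct.pack(">I", len(document)) + document


def read_exact(stream, count):
    """Read up to `count` bytes; returns fewer only at end of stream."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def parse_framed(stream, chunk_size=4096):
    """Yield the root element name of each length-prefixed document."""
    while True:
        header = read_exact(stream, 4)
        if not header:
            return  # clean end of stream between documents
        if len(header) < 4:
            raise EOFError("truncated length header")
        (remaining,) = struct.unpack(">I", header)

        # One fresh parser per document; the byte count tells us
        # which buffer is the final one.
        names = []
        parser = xml.parsers.expat.ParserCreate()
        parser.StartElementHandler = lambda name, attrs: names.append(name)
        while True:
            want = min(chunk_size, remaining)
            data = read_exact(stream, want)
            if len(data) < want:
                raise EOFError("truncated document")
            remaining -= want
            parser.Parse(data, remaining == 0)
            if remaining == 0:
                break
        yield names[0]


if __name__ == "__main__":
    demo = io.BytesIO(frame(b"<a><x/></a>") + frame(b"<b/>"))
    for root in parse_framed(demo, chunk_size=3):
        print(root)
```

  With a real socket, something like sock.makefile("rb") should give
a file-like object that can be passed in place of the BytesIO used
here.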

 > simulate this with the xmlwf app by creating a file that contained two xml
 > documents, expat crapped out with:
 > 
 > "junk after document element at line 7."

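  No, it didn't "crap out"; it found a real XML error!  A well-formed
XML document has exactly one document element, so anything that
follows it is an error.  It just depends on how you look at it.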
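
  For what it's worth, the same error can be reproduced without
xmlwf.  The sketch below uses Python's xml.parsers.expat binding;
parse_error() is just a helper name chosen for this example.

```python
import xml.parsers.expat
from xml.parsers.expat import errors


def parse_error(data: bytes):
    """Parse `data` as one complete document; return the error text or None."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final buffer
    except xml.parsers.expat.ExpatError as exc:
        return xml.parsers.expat.ErrorString(exc.code)
    return None


if __name__ == "__main__":
    # Two documents back to back: the second root element is reported
    # as an error, not silently treated as a new document.
    print(parse_error(b"<a/>\n<b/>"))
    print(parse_error(b"<a/>"))
```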


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations