From highos@highos.com  Tue Jul  3 06:21:35 2001
From: highos@highos.com (Jesse Tie Ten Quee)
Date: Mon, 2 Jul 2001 23:21:35 -0600
Subject: [Expat-discuss] This list still alive?
Message-ID: <20010702232135.A327@highos.com>

Yo,

Hey guys.. i was wondering if there is a "real world" app that shows off
expat without getting too complicated, e.g. something that would not
take a day just to understand how everything fits together ;)

outline.c is a start, but i was thinking a little larger...

-- 
Jesse Tie Ten Quee - highos at highos dot com


From wischhusen@web.de  Tue Jul  3 11:49:37 2001
From: wischhusen@web.de (Christian Wischhusen)
Date: Tue, 3 Jul 2001 12:49:37 +0200
Subject: [Expat-discuss] Character encoding ISO 8859-1
Message-ID: <200107031049.f63AnbF30387@mailgate3.cinetic.de>

Hi,
I'm using expat with ISO 8859-1 encoded XML files and I have the following
problem: expat converts German characters, e.g.

   ß (Small sharp s, German (sz ligature) ("&szlig;"))
 or
   ü (Small u, dieresis or umlaut mark ("&uuml;"))

 to a sequence of two bytes, e.g.
 ß (sz) -> 0xC39F
 ü (small u, dieresis) -> 0xC3BC

As I use expat for the German language, I expect expat not to modify the
character data between XML elements. Does anybody have a suggestion to
solve my problem?

  Chris


From henrik.eriksson@axis.com  Tue Jul  3 12:09:16 2001
From: henrik.eriksson@axis.com (Henrik Eriksson)
Date: Tue, 3 Jul 2001 13:09:16 +0200 
Subject: [Expat-discuss] Character encoding ISO 8859-1
Message-ID: <B6B64A8D263A4945BB5DCF3F9F400EB40FBA6B@mailse02.axis.se>

Hi

> -----Original Message-----
> From: Christian Wischhusen [mailto:wischhusen@web.de]
> Sent: Tuesday, July 03, 2001 12:50 PM
>
> Hi,
> I'm using expat with ISO 8859-1 encoded xml files and I have
> following problem: expat converts german characters e.g.
>
>    ß (Small sharp s, German (sz ligature) ("&szlig;"))
>  or
>    ü (Small u, dieresis or umlaut mark ("&uuml;"))
>
>  to a sequence of two bytes, e.g.
>  ß (sz) -> 0xC39F
>  ü (small u, dieresis) -> 0xC3BC

This is quite correct; expat uses UTF-8 encoding in the callbacks
and the sequences above are UTF-8 encodings of the ISO 8859-1
characters ß and ü.

> As I use expat for german language I expect from expat that
> expat doesn't modify the character data between xml elements.
> Do anybody have a suggestion to solve my problem?

As said above, expat uses UTF-8 in the callbacks. I don't think
there is any way to change this.
>
>   Chris

Best regards,
Henrik Eriksson
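
If the application needs single-byte ISO 8859-1 text again, one option is
to convert the UTF-8 data back inside the handler. The following is only a
minimal sketch: the function name utf8_to_latin1 is made up for this
example and is not part of expat, and replacing characters above U+00FF
with '?' is an assumption about what the application can tolerate.

```c
#include <assert.h>
#include <stddef.h>

/* Convert `len` bytes of UTF-8 (as delivered to expat handlers) to
 * ISO 8859-1. `out` needs room for at least `len` bytes, because the
 * output is never longer than the input. Characters above U+00FF and
 * malformed sequences are written as '?'. Returns the number of bytes
 * written; the result is not NUL-terminated. */
size_t utf8_to_latin1(const char *in, size_t len, unsigned char *out)
{
    size_t i = 0, n = 0;

    assert(in != NULL && out != NULL);
    while (i < len) {
        unsigned char c = (unsigned char)in[i];

        if (c < 0x80) {                         /* plain ASCII */
            out[n++] = c;
            i += 1;
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < len &&
                   ((unsigned char)in[i + 1] & 0xC0) == 0x80) {
            /* two-byte sequence encoding U+0080..U+00FF */
            out[n++] = (unsigned char)(((c & 0x03) << 6) |
                                       ((unsigned char)in[i + 1] & 0x3F));
            i += 2;
        } else {
            /* outside Latin-1 or malformed: emit '?' and skip the lead
             * byte plus any continuation bytes that follow it */
            out[n++] = '?';
            i += 1;
            while (i < len && ((unsigned char)in[i] & 0xC0) == 0x80)
                i += 1;
        }
    }
    return n;
}
```

With this, the two bytes 0xC3 0x9F from the example above come out as the
single byte 0xDF again.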


From cw@intergga.ch  Tue Jul  3 15:01:56 2001
From: cw@intergga.ch (Christian Wattinger)
Date: Tue, 03 Jul 2001 16:01:56 +0200
Subject: [Expat-discuss] expat conflict with apache
Message-ID: <B7679CF2.27CF%cw@intergga.ch>

hi

i installed expat and  the perl-module XML::Parser onto
my mac osx machine.

when i run a CGI - it makes use of XML::Parser which uses EXPAT -
under my apache/mod_perl server it generates an error saying:

------------
dyld: /usr/sbin/httpd multiple definitions of symbol _XML_DefaultCurrent

/usr/sbin/httpd definition of _XML_DefaultCurrent
/Library/Perl/darwin/auto/XML/Parser/Expat/Expat.bundle definition of
_XML_DefaultCurrent
-------------

the CGI script runs fine in the  tcsh-shell

on first try i only removed the
/Library/Perl/darwin/auto/XML/Parser/Expat/Expat.bundle
but of course this makes it worse saying:
---------------
[Tue Jul  3 15:27:30 2001] [error] PerlRun: `Can't locate loadable object
for module 
XML::Parser::Expat in @INC (@INC contains: /System/Library/Perl/darwin
/System/Library/Perl
 /Library/Perl/darwin /Library/Perl /Library/Perl
/Network/Library/Perl/darwin /Network/Library/Perl
 /Network/Library/Perl /usr/ /usr/lib/perl) at
/Library/Perl/darwin/XML/Parser.pm line 15
 Compilation failed in require at /Library/Perl/darwin/XML/Parser.pm line 15
during global destruction.
BEGIN failed--compilation aborted at /Library/Perl/darwin/XML/Parser.pm line
19 during global destruction.
Compilation failed in require at /Library/Perl/Create_Sentence.pm line 4
during global destruction.
BEGIN failed--compilation aborted at /Library/Perl/Create_Sentence.pm line 4
during global destruction.
Compilation failed in require at /Library/Perl/Speak.pm line 5 during global
destruction.
[Tue Jul  3 15:27:30 2001] [error] Can't locate object method "uri" via
package 
"Apache::PerlRun" at /System/Library/Perl/darwin/Apache/PerlRun.pm line 212
during global destruction.
--------------

obviously i can't remove
/usr/sbin/httpd
well it's the apache server...

any ideas?

cheers
christian


From Michael_B_Allen@ml.com  Tue Jul  3 22:51:25 2001
From: Michael_B_Allen@ml.com (Allen, Michael B (RSCH))
Date: Tue, 3 Jul 2001 17:51:25 -0400
Subject: [Expat-discuss] XML_UNICODE_WCHAR_T Ineffective
Message-ID: <B27EB33BAB29D2119ABF0001FA7EF289053BF026@ewfd04.exchange.ml.com>

Hi,

The XML_UNICODE_WCHAR_T define doesn't seem to be working for me. My
build is:

gcc -Wall -I include -L lib -lmba \
        -lexpat -DXML_UNICODE_WCHAR_T -o x0.o x0.c

which compiles without type warnings (whereas without
-DXML_UNICODE_WCHAR_T I do get a type warning about using wchar_t in
the code) but my end tag handler:

void
end(void *userData, const wchar_t *name)
{
  log_hexdump(NORM, (const char *)name, 0, 16, 16, "hexdump of name\n");
  log(NORM, "end called %s\n", name);
}

produces the following:

x0.c:15: hexdump of name 
00000:  75 73 65 72 00 00 69 64 00 00 61 75 74 68 00 00  |user..id..auth..|
x0.c:16: end called user

Notice that if these were wchar_t characters, the hexdump function (which just
blindly prints the hex values of a region of memory) would show three
zeros in between each ASCII equivalent hex code. Also, the %ls printf format
specifier does not work -- I must use %s instead.

Is this a bug or am I doing something wrong?

Linux 2.2.14 i686
glibc-2.1.3-15
expat-1.95.1

Thanks,
Mike
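
The hexdump argument above can be checked with a small sketch that lays out
an ASCII string in code units of different widths and dumps the bytes. The
helper name dump_as_units is hypothetical, and little-endian byte order is
assumed because the report is from an i686 machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Store each character of the ASCII string `s` as a little-endian code
 * unit `unit` bytes wide (1 = UTF-8, 2 = UTF-16, 4 = 32-bit wchar_t) and
 * format the resulting bytes as space-separated hex into `out`.
 * Returns the number of bytes dumped. */
size_t dump_as_units(const char *s, size_t unit, char *out, size_t outsz)
{
    size_t len = strlen(s), nbytes = 0, pos = 0, i, k;

    assert(unit >= 1 && outsz > 0);
    out[0] = '\0';
    for (i = 0; i < len; i++) {
        for (k = 0; k < unit; k++) {
            /* the low-order byte carries the ASCII code, the rest are 0 */
            unsigned byte = (k == 0) ? (unsigned char)s[i] : 0u;

            if (pos + 4 > outsz)        /* room for " xx" plus NUL */
                return nbytes;
            pos += (size_t)sprintf(out + pos, "%s%02x",
                                   nbytes ? " " : "", byte);
            nbytes++;
        }
    }
    return nbytes;
}
```

For "user" this gives "75 73 65 72" with one-byte units, which matches the
start of the dump above. That is what one would expect if the library
itself was built without the macro and only the application was compiled
with -DXML_UNICODE_WCHAR_T; this is an assumption about the build, not
something the report confirms.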


From Michael_B_Allen@ml.com  Wed Jul  4 02:59:13 2001
From: Michael_B_Allen@ml.com (Allen, Michael B (RSCH))
Date: Tue, 3 Jul 2001 21:59:13 -0400
Subject: [Expat-discuss] XML_UNICODE_WCHAR_T Ineffective
Message-ID: <B27EB33BAB29D2119ABF0001FA7EF289053BF02C@ewfd04.exchange.ml.com>


> -----Original Message-----
> From:	rolf@pointsman.de [SMTP:rolf@pointsman.de]
> 
> handler functions according to the typedef's. Any XML_Char, this is:
> any PCDATA out of the XML document, reaches handler level always UTF-8
> encoded.
> 
	Not really, but I think I see the problem. The "Overview of Expat" by Clark
	Cooper reads roughly:

	XML_UNICODE_WCHAR_T
	Use UTF-16 internally as declared as wchar_t from <stddef.h> and pass
	strings to the application this way.

	I assumed from this that Expat would use the system definition of wchar_t.
	On my system I believe wchar_t is uint32_t. But I suspect Expat is passing
	the handlers UTF-16 regardless (not UTF-8).

	So what is the XML_UNICODE_WCHAR_T for if I have to use UTF-8 to
	convert to wchar_t with mbstowcs?

	I have a lot of code that uses wchar_t. I don't want to have a big chunk using
	UTF-8 and the rest wchar_t. Yuck.

	Mike
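
	If the handlers do deliver UTF-8, the rest of a wchar_t code base can
	still be fed by converting at the handler boundary. mbstowcs depends on
	the current locale, so the sketch below decodes UTF-8 by hand instead.
	The name utf8_to_wcs is made up for this example, and it assumes a
	wchar_t wide enough for the code points involved (32 bits on the Linux
	system described above).

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Decode `len` bytes of UTF-8 into at most `outmax - 1` wide characters
 * and NUL-terminate the result. Malformed or truncated sequences become
 * U+FFFD. Overlong forms and surrogates are not rejected, so this is a
 * convenience converter, not a validator. Returns the number of wide
 * characters written, not counting the terminator. */
size_t utf8_to_wcs(const char *in, size_t len, wchar_t *out, size_t outmax)
{
    size_t i = 0, n = 0;

    assert(in != NULL && out != NULL && outmax > 0);
    while (i < len && n + 1 < outmax) {
        unsigned char c = (unsigned char)in[i++];
        unsigned long cp;
        int extra;                  /* continuation bytes still expected */

        if (c < 0x80)                { cp = c;        extra = 0; }
        else if ((c & 0xE0) == 0xC0) { cp = c & 0x1F; extra = 1; }
        else if ((c & 0xF0) == 0xE0) { cp = c & 0x0F; extra = 2; }
        else if ((c & 0xF8) == 0xF0) { cp = c & 0x07; extra = 3; }
        else                         { cp = 0xFFFD;   extra = 0; }

        while (extra > 0) {
            if (i >= len || ((unsigned char)in[i] & 0xC0) != 0x80) {
                cp = 0xFFFD;        /* truncated or malformed sequence */
                break;
            }
            cp = (cp << 6) | ((unsigned char)in[i++] & 0x3F);
            extra--;
        }
        out[n++] = (wchar_t)cp;
    }
    out[n] = L'\0';
    return n;
}
```

	A handler would call this with the pointer and length it receives and
	pass the wide result on to the existing wchar_t code.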


From Michael_B_Allen@ml.com  Wed Jul  4 03:27:33 2001
From: Michael_B_Allen@ml.com (Allen, Michael B (RSCH))
Date: Tue, 3 Jul 2001 22:27:33 -0400
Subject: [Expat-discuss] xmlparse.h v.s. expat.h
Message-ID: <B27EB33BAB29D2119ABF0001FA7EF289053BF02E@ewfd04.exchange.ml.com>

What's the difference between xmlparse.h and expat.h? I see I have both but they're different.

Mike


From Michael_B_Allen@ml.com  Fri Jul  6 02:10:12 2001
From: Michael_B_Allen@ml.com (Allen, Michael B (RSCH))
Date: Thu, 5 Jul 2001 21:10:12 -0400
Subject: [Expat-discuss] XML_UNICODE_WCHAR_T Ineffective
Message-ID: <B27EB33BAB29D2119ABF0001FA7EF289053BF044@ewfd04.exchange.ml.com>


> -----Original Message-----
> From:	Allen, Michael B (RSCH) 
> 
> 	XML_UNICODE_WCHAR_T
> 	Use UTF-16 internally as declared as wchar_t from <stddef.h> and pass
> 	strings to the application this way.
> 
> 	I assumed from this that Expat would use the system definition of wchar_t.
> 	On my system I believe wchar_t is uint32_t. But I suspect Expat is passing
> 	the handlers UTF-16 regardless(not UTF-8).
> 
		Whoops, if it was UTF-16 my hexdumps would show every other character
		as '\0'. That can't be it.


From jgp@4js.com  Mon Jul  9 17:50:18 2001
From: jgp@4js.com (Jean Georges PERRIN)
Date: Mon, 9 Jul 2001 18:50:18 +0200
Subject: [Expat-discuss] Expat "add-ons"
Message-ID: <NDBBIAIPJBHEKEDFKCJGMEDPFHAA.jgp@4js.com>

Hi,

We recently decided to focus a bit more on Expat, as it seems to be quite a
cool project.

However, I was wondering about the other projects that are using Expat. We
found Centerpoint/XML (a DOM / SAX parser), Sablotron (an XSLT library), ...

Are there any others? Is there a list somewhere?

TIA

Jean Georges PERRIN
--
Four J's Development Tools (www.4js.com)
jgp@4js.com - Tel +33 (0)3 88 18 61 20 - Fax +33 (0)3 88 18 61 21
--
Join the 4GL/BDL community. Subscribe to the Four J's users mailing list.
Visit: www.4js.com/html/supp/mlsubscribe.htm
--
<CAUTION Warning="StandardDisclaimer"/>


From Michael_B_Allen@ml.com  Mon Jul  9 21:14:07 2001
From: Michael_B_Allen@ml.com (Allen, Michael B (RSCH))
Date: Mon, 9 Jul 2001 16:14:07 -0400
Subject: [Expat-discuss] Expat "add-ons"
Message-ID: <B27EB33BAB29D2119ABF0001FA7EF289053BF051@ewfd04.exchange.ml.com>

> -----Original Message-----
> From:	Jean Georges PERRIN [SMTP:jgp@4js.com]
> 
> Hi,
> 
> We recently decided to focus a bit more on Expat, as it seems to be quite a
> cool project.
> 
> However, I was wondering about the other projects that were using Expat. We
> found Centerpoint/XML a DOM / SAX parser, Sablotron an XSLT library, ...
> 
> Is there any others? Is there a list somewhere?
> 
	I *just* started a DOM in c this weekend. The focus at the moment is to complement Expat.

	http://auditorymodels.org/domc

	Mike


From Michael_B_Allen@ml.com  Mon Jul  9 21:52:30 2001
From: Michael_B_Allen@ml.com (Allen, Michael B (RSCH))
Date: Mon, 9 Jul 2001 16:52:30 -0400
Subject: [Expat-discuss] Expat "add-ons"
Message-ID: <B27EB33BAB29D2119ABF0001FA7EF289053BF055@ewfd04.exchange.ml.com>

The DOM is just a tree data structure capable of representing *any*
well-formed XML document.

http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/level-one-core.html

You can walk, filter, and generally navigate the tree more easily than you
might with its counterpart - SAX. There is also an HTML version but I presume
a prerequisite of using that would be an HTML parser. Expat is not an HTML
parser AFAIK. But I'm new to XML so I'll include expat-discuss so people
can correct any lies :~)

Mike


> -----Original Message-----
> From:	Dru Nelson [SMTP:dru@redwoodsoft.com]
> Sent:	Monday, July 09, 2001 4:45 PM
> To:	Allen, Michael B (RSCH)
> Subject:	RE: [Expat-discuss] Expat "add-ons"
> 
> 
> So, unless Expat could do HTML, I could get it in the standard DOM.
> 
	It does not so you can't get an HTML DOM Document object.

> What XML formats generate a DOM?  
> 
	All

> (I am assuming that DOM still means the DOM used by w3c for web page
> documents, right?)
> 
	Yes, see link above.

> Dru Nelson
> San Mateo, California
> 
> 
> On Mon, 9 Jul 2001, Allen, Michael B (RSCH) wrote:
> 
> > No. The Document Object Model (DOM) is not a parser. It's just a data structure for representing a tree of nodes. Expat would do the parsing.
> > 
> > Sorry,
> > Mike
> > 
> > > -----Original Message-----
> > > From:	Dru Nelson [SMTP:dru@redwoodsoft.com]
> > > Sent:	Monday, July 09, 2001 4:21 PM
> > > To:	Allen, Michael B (RSCH)
> > > Subject:	RE: [Expat-discuss] Expat "add-ons"
> > > 
> > > 
> > > Hi, will your program parse plain HTML?
> > > 
> > > Dru Nelson
> > > San Mateo, California
> > > 
> > > 
> > > > 	I *just* started a DOM in c this weekend. The focus at the moment is to complement Expat.
> > > > 
> > > > 	http://auditorymodels.org/domc
> > > > 
> > > > 	Mike
> > > 
> > 
> > 
> 


From ken@bitsko.slc.ut.us  Tue Jul 10 18:07:20 2001
From: ken@bitsko.slc.ut.us (Ken MacLeod)
Date: 10 Jul 2001 12:07:20 -0500
Subject: [Expat-discuss] Expat "add-ons"
In-Reply-To: "Jean Georges PERRIN"'s message of "Mon, 9 Jul 2001 18:50:18 +0200"
References: <NDBBIAIPJBHEKEDFKCJGMEDPFHAA.jgp@4js.com>
Message-ID: <m0k81gvhkn.fsf@bitsko.slc.ut.us>

"Jean Georges PERRIN" <jgp@4js.com> writes:

> We recently decided to focus a bit more on Expat, as it seems to be
> quite a cool project.
> 
> However, I was wondering about the other projects that were using
> Expat. We found Centerpoint/XML a DOM / SAX parser, Sablotron an
> XSLT library, ...
> 
> Is there any others? Is there a list somewhere?

Orchard/C[1] includes a SAX/DOM implementation.  Overall, Orchard is
currently in an alpha stage, but the XML modules are part of the core
so they are well covered in the unit and functional tests.  Orchard/C
uses a preprocessor, virtual methods, and garbage collection to make
user-written C code much simpler and more flexible -- one step beyond
Gnome's "C++ in C" style of coding, for example.  Orchard also has a
transparent bridge to Perl, and eventually to Python, Ruby, Tcl, and
similar languages.

  -- Ken

[1] <http://Casbah.org/~kmacleod/orchard/>


From m0ntar3@home.com  Thu Jul 12 10:52:07 2001
From: m0ntar3@home.com (Chris Garrity)
Date: Thu, 12 Jul 2001 04:52:07 -0500
Subject: [Expat-discuss] General Purpose Tree to Hold Things (Help and Hints Please)
Message-ID: <3B4D7347.F5CFDCCF@home.com>

	I've started to write some general purpose routines to build a tree from an XML
document, in straight C.

	I've defined a node-structure to hold a tag name, text, and an attribute list.
I set the tag and attribute list fields in my start-tag handler, and then I push
the node onto a stack that I've defined. In my text handler, I peek at the node
on top of the stack, return a pointer to it, and add text to the text field. In
the end-tag handler, I pop the stack and insert into a tree. The tree node I
create at this point holds the depth (from within the document) of the current
tag.

	The tree I've defined is an N-ary tree, with each node having a pointer to its
child and to a list of its siblings. Implementing a proper insertion algorithm
is what I'm working on currently.

	Basically, while the depth of the new tree node is greater than the current
node, I descend. When the depth of the new tree node is equal to the current
node, I traverse across the list of siblings. The problem I see with this is
that the new tree node will not always be a descendant of the first tree node at
depth N. I figure I can pass the tag name of the parent along with the new tree
node, and then know how far over to traverse the list of siblings.

	C++ is not really an option in the current environment I'm working in, so
that's not a solution presently (no STL solution).

	Comments?
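
	One possible way around the depth-based search, offered only as a sketch:
link each node to its parent when the start tag arrives instead of inserting
at the end tag. The innermost open element is then always the parent of the
next node, and a parent pointer can stand in for the separate stack. All
names below (struct node, struct builder, builder_open, builder_close,
node_free) are hypothetical, and text and attribute storage is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct node {
    char *name;
    int depth;
    struct node *parent;
    struct node *first_child;
    struct node *last_child;    /* lets a child be appended in O(1) */
    struct node *next_sibling;
};

struct builder {
    struct node *root;
    struct node *current;       /* innermost element still open */
};

/* Call from the start-tag handler. Returns the new node, or NULL if
 * memory runs out. */
struct node *builder_open(struct builder *b, const char *name)
{
    struct node *n = calloc(1, sizeof *n);

    if (n == NULL)
        return NULL;
    n->name = malloc(strlen(name) + 1);
    if (n->name == NULL) {
        free(n);
        return NULL;
    }
    strcpy(n->name, name);
    n->parent = b->current;
    n->depth = b->current ? b->current->depth + 1 : 0;
    if (b->current == NULL) {
        if (b->root == NULL)    /* well-formed XML has a single root */
            b->root = n;
    } else if (b->current->last_child == NULL) {
        b->current->first_child = b->current->last_child = n;
    } else {
        b->current->last_child->next_sibling = n;
        b->current->last_child = n;
    }
    b->current = n;
    return n;
}

/* Call from the end-tag handler: the parent becomes current again. */
void builder_close(struct builder *b)
{
    assert(b->current != NULL);
    b->current = b->current->parent;
}

/* Free a node, its following siblings, and all their descendants. */
void node_free(struct node *n)
{
    while (n != NULL) {
        struct node *next = n->next_sibling;

        node_free(n->first_child);
        free(n->name);
        free(n);
        n = next;
    }
}
```

	The text handler would append to the node that b->current points at, so no
sibling list ever has to be walked to find the right parent.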


From Michael_B_Allen@ml.com  Thu Jul 12 23:00:37 2001
From: Michael_B_Allen@ml.com (Allen, Michael B (RSCH))
Date: Thu, 12 Jul 2001 18:00:37 -0400
Subject: [Expat-discuss] General Purpose Tree to Hold Things (Help
 and Hints Please)
Message-ID: <B27EB33BAB29D2119ABF0001FA7EF289053BF072@ewfd04.exchange.ml.com>

Well, I'm not really sure I understand what your immediate problems are,
but you do know about DOM, right? The Document Object Model is the W3C
recommended way to build and manipulate a tree data structure representing
an XML (or possibly HTML) document.

http://www.w3.org/DOM/

There are already several implementations in c. There's one by Oracle and
Gnome's Gdome, but I don't think Oracle's is usable for commercial purposes
and there certainly isn't any source code, and Gdome uses a C++ in c
technique that is a little more sophisticated than I need (20000 lines of
code).

Incidentally, I have recently started to implement a comparatively light
weight DOM in c (the target is more like 2000 lines of code).

http://auditorymodels.org/domc/

but this code doesn't work and I have already practically rewritten it
entirely. I will post a new batch by the end of this weekend but I suspect
it too will be unusable. I don't plan on spending more than two weeks or so
on this though.

Mike

> -----Original Message-----
> From:	Chris Garrity [SMTP:m0ntar3@home.com]
> Sent:	Thursday, July 12, 2001 5:52 AM
> To:	expat-discuss@lists.sourceforge.net
> Subject:	[Expat-discuss] General Purpose Tree to Hold Things (Help and Hints Please)
> 
> 
> 	I've started to write some general purpose routines to build a tree from an XML
> document, in straight C.
> 
> 	I've defined a node-structure to hold a tag name, text, and an attribute list.
> I set the tag and attribute list fields in my start-tag handler, and then I push
> the node onto a stack that I've defined. In my text handler, I peek at the node
> on top of the stack, return a pointer to it, and add text to the text field. In
> the end-tag handler, I pop the stack and insert into a tree. The tree node I
> create at this point holds the depth (from within the document) of the current
> tag.
> 
> 	The tree I've defined is an N-ary tree, with each node having a pointer to it's
> child and to a list of it's siblings. Implementing a proper insertion algorithm
> is what I'm working on currently.
> 
> 	Basically, while the depth of the new tree node is greater than the current
> node, I descend. When the depth of the new tree node is equal to the current
> node, I traverse across the list of siblings. The problem I see with this is
> that the new tree node well not always be a descendant of the first tree node at
> depth N. I figure I can pass the tag name of the parent along with the new tree
> node,  and then know how far over to traverse the list of siblings.
> 
> 	C++ is not a really an option in the current environment I'm working in, so
> that's not a solution presently (no STL solution).
> 
> 	Comments?
> 
> _______________________________________________
> Expat-discuss mailing list
> Expat-discuss@lists.sourceforge.net
> http://lists.sourceforge.net/lists/listinfo/expat-discuss


From edrich@informatik.fh-kl.de  Tue Jul 17 13:52:09 2001
From: edrich@informatik.fh-kl.de (Ralf Edrich)
Date: Tue, 17 Jul 2001 13:52:09 +0100
Subject: [Expat-discuss] Content of Tags
Message-ID: <019601c10ebf$4a56a7e0$6b155d8f@edrichpc>

Hi,

how can I access the content between elements
using expat?

<Element>
SomeContent
</Element>

TIA,
 
Ralf


From djm@maccormack.net  Tue Jul 17 17:14:41 2001
From: djm@maccormack.net (David MacCormack)
Date: Tue, 17 Jul 2001 12:14:41 -0400 (EDT)
Subject: [Expat-discuss] Content of Tags
In-Reply-To: <019601c10ebf$4a56a7e0$6b155d8f@edrichpc>
Message-ID: <Pine.LNX.4.33.0107171213380.29924-100000@the-wall.maccormack.net>

http://www.xml.com/pub/a/1999/09/expat/index.html


-- 
----------------
David MacCormack
djm@maccormack.net


On Tue, 17 Jul 2001, Ralf Edrich wrote:

> Hi,
>
> how can I access the content between elements
> using expat?
>
> <Element>
> SomeContent
> </Element>
>
> TIA,
>
> Ralf


From Michael_B_Allen@ml.com  Tue Jul 17 20:41:54 2001
From: Michael_B_Allen@ml.com (Allen, Michael B (RSCH))
Date: Tue, 17 Jul 2001 15:41:54 -0400
Subject: [Expat-discuss] Content of Tags
Message-ID: <B27EB33BAB29D2119ABF0001FA7EF289053BF094@ewfd04.exchange.ml.com>

That's the character data so it's not going to be passed to the start and end tag handlers. Use XML_SetCharacterDataHandler for that.

Mike
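
A minimal sketch of such a handler follows. It has the shape that
XML_SetCharacterDataHandler expects when XML_Char is char; the names
struct text_buf and collect_text, the 256-byte limit, and the silent
truncation are assumptions made for the example. Note that the text passed
in is not NUL-terminated, so only len bytes may be read.

```c
#include <assert.h>
#include <string.h>

struct text_buf {
    char   data[256];   /* collected text, always NUL-terminated */
    size_t used;        /* bytes stored so far, excluding the NUL */
};

/* Append the `len` bytes at `s` to the buffer passed as user data.
 * Anything that does not fit is silently dropped. */
void collect_text(void *userData, const char *s, int len)
{
    struct text_buf *tb = userData;
    size_t room, n;

    assert(tb != NULL && len >= 0);
    room = sizeof tb->data - 1 - tb->used;
    n = (size_t)len < room ? (size_t)len : room;
    memcpy(tb->data + tb->used, s, n);
    tb->used += n;
    tb->data[tb->used] = '\0';
}
```

The buffer would be registered with XML_SetUserData and the function with
XML_SetCharacterDataHandler. For the element above, the collected text
includes the newlines around SomeContent.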

> -----Original Message-----
> From:	Ralf Edrich [SMTP:edrich@informatik.fh-kl.de]
> Sent:	Tuesday, July 17, 2001 8:52 AM
> To:	expat-discuss@lists.sourceforge.net
> Subject:	[Expat-discuss] Content of Tags
> 
> Hi,
> 
> how can I access the content between elements
> using expat?
> 
> <Element>
> SomeContent
> </Element>
> 
> TIA,
>  
> Ralf


From dpuryear@usa.net  Tue Jul 17 21:35:22 2001
From: dpuryear@usa.net (Dustin Puryear)
Date: Tue, 17 Jul 2001 15:35:22 -0500
Subject: [Expat-discuss] working with character data
Message-ID: <3B54A18A.7020406@usa.net>

I am having some problems with an application that uses XML. (It is a 
part of a testing harness for Jabber, the XML-based Instant Messaging 
tool.) The problem is that it seems I'm losing messages at some point. I 
think it may be within jabberd, but I'm not so sure that my client isn't 
properly processing the XML messages.

Can I assume that the character data sent to the character data handler 
by expat is the entire message, or could it only be part of the message, 
and I need to push what I have on a stack and wait to see what I get next?

In other words, if I send:

<tag>a b c d</tag>

Can I assume that "a b c d" are sent to my handler or not? Will it 
possibly be sent as "a", then "b c", and then "d", or some other 
variation? My handler is quite simple, as shown below:

void char_data_hdlr(void *userdata, const XML_Char *s, int len)
{
         user_data_t *ud = userdata;
         char buf[MAX_XML_BUFSZ+1];
         struct timeval tv;
         reply_data_t *reply;
         int id;

         memcpy(buf, s, len);
         buf[len] = '\0';
         DPRINT("found message: %s\n", buf);

         /* scan for our start times at the beginning of the message */
         if (sscanf(buf, " %d %ld %ld ", &id, &(tv.tv_sec),
                         &(tv.tv_usec)) == 3)
         {
                 reply = malloc(sizeof(reply_data_t));
                 if (reply == NULL)
                 {
                         perror("malloc()");
                         exit(EXIT_FAILURE);
                 }

                 DPRINT("char_data_hdlr(): adding buf = %s with sec = "
                                 "%ld and usec = %ld\n",
                                 buf, tv.tv_sec, tv.tv_usec);

                 reply->begin.tv_sec = tv.tv_sec;
                 reply->begin.tv_usec = tv.tv_usec;
                 reply->id = id;
                 list_add(&(ud->reply_list), (void *) reply);
         }
}

Here is how I set up expat:

        /* setup our XML parser */
         parser = XML_ParserCreate(NULL);
         XML_SetUserData(parser, &ud);
         XML_SetElementHandler(parser,
                         start_element_hdlr,
                         end_element_hdlr);
         XML_SetCharacterDataHandler(parser, char_data_hdlr);
         parser_done = 0;

I am using expat v1.2. Any help is appreciated.

Regards, Dustin

-- 
Dustin Puryear <dpuryear@usa.net>
http://members.telocity.com/~dpuryear
In the beginning the Universe was created.
This has been widely regarded as a bad move. - Douglas Adams


From AMARTIN@artech.com.uy  Tue Jul 17 21:52:26 2001
From: AMARTIN@artech.com.uy (Alvaro Martin)
Date: Tue, 17 Jul 2001 17:52:26 -0300
Subject: [Expat-discuss] working with character data
Message-ID: <61F232495BC8D211999E0004ACAEC3C9014EDE49@proxy.artech.com.uy>

No, you cannot assume that the character data sent to the character data
handler by expat is the entire message.

This is especially true when you have newline characters (but I think it is
not the only case).
If you have
	<tag>a
		b
	</tag>

you will get

	"a"
	"\n"
	"\t\tb"
	"\n"
	"\n"

Regards, Alvaro
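
Since the text can arrive in pieces, a common pattern is to collect it in
the character data handler and only interpret it in the end-tag handler. A
rough sketch, assuming XML_Char is char and that each message is a leaf
element; the names struct msg_state, on_start, on_text and on_end are made
up for this example. The buffer grows as needed, so a long piece cannot
overrun it the way a fixed-size buf filled with memcpy could.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct msg_state {
    char  *text;        /* text collected since the last start tag */
    size_t len, cap;
    int    id;
    long   sec, usec;
    int    parsed;      /* number of messages parsed successfully */
};

/* Start tag: forget text left over from before this element. */
void on_start(void *userData, const char *name, const char **atts)
{
    struct msg_state *st = userData;

    (void)name;
    (void)atts;
    st->len = 0;
}

/* Character data: append this piece; do not try to parse it yet. */
void on_text(void *userData, const char *s, int len)
{
    struct msg_state *st = userData;
    size_t need;

    assert(len >= 0);
    need = st->len + (size_t)len + 1;       /* +1 for the final NUL */
    if (need > st->cap) {
        size_t cap = st->cap ? st->cap : 64;
        char *p;

        while (cap < need)
            cap *= 2;
        p = realloc(st->text, cap);
        if (p == NULL)
            return;                         /* out of memory: drop piece */
        st->text = p;
        st->cap = cap;
    }
    memcpy(st->text + st->len, s, (size_t)len);
    st->len += (size_t)len;
}

/* End tag: the text is complete now, so it is safe to scan it. */
void on_end(void *userData, const char *name)
{
    struct msg_state *st = userData;

    (void)name;
    if (st->len == 0)
        return;
    st->text[st->len] = '\0';
    if (sscanf(st->text, " %d %ld %ld", &st->id, &st->sec, &st->usec) == 3)
        st->parsed++;
    st->len = 0;
}
```

With this arrangement it no longer matters whether "a b c d" arrives in one
call or in several.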


From Michael_B_Allen@ml.com  Tue Jul 17 21:52:00 2001
From: Michael_B_Allen@ml.com (Allen, Michael B (RSCH))
Date: Tue, 17 Jul 2001 16:52:00 -0400
Subject: [Expat-discuss] working with character data
Message-ID: <B27EB33BAB29D2119ABF0001FA7EF289053BF095@ewfd04.exchange.ml.com>


> -----Original Message-----
> From:	Dustin Puryear [SMTP:dpuryear@usa.net]
> 
> Can I assume that the character data sent to the character data handler 
> by expat is the entire message, or could it only be part of the message, 
> 
	NO!

	Go to XML.com and search on Expat. Read that stuff carefully.

	Mike


From syprat@yahoo.fr  Wed Jul 18 11:07:07 2001
From: syprat@yahoo.fr (=?iso-8859-1?q?Sylvain=20PRAT?=)
Date: Wed, 18 Jul 2001 12:07:07 +0200 (CEST)
Subject: [Expat-discuss] Question : internal buffer
Message-ID: <20010718100707.63648.qmail@web14802.mail.yahoo.com>

I have a little question:
How do expat 1.95.1 and 1.2 handle their internal buffer? For example,
I have to parse a big file, but I don't have much memory available, so
I can't put the whole file in memory, in a buffer. Can the isFinal
parameter help me? Is the internal buffer too big for me, or is its size
configurable? Can I use expat for my problem?

Thanks in advance.
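
For what it is worth, the usual pattern is to read the file in small
pieces and hand each piece to the parser, setting the final flag only on
the last one. The sketch below shows just that loop, without linking
against expat: feed_fn mirrors the shape of XML_Parse (nonzero on
success), while parse_stream, count_feed and struct feed_stats are
hypothetical names, and the 4096-byte ceiling is an arbitrary choice.

```c
#include <assert.h>
#include <stdio.h>

/* Same shape as XML_Parse() with the parser passed as void *:
 * returns zero on error, nonzero on success. */
typedef int (*feed_fn)(void *parser, const char *s, int len, int is_final);

/* Read `fp` in pieces of at most `chunk` bytes (chunk <= 4096) and pass
 * each piece to `feed`. Only the last call has is_final set.
 * Returns 1 on success, 0 on a read or parse error. */
int parse_stream(FILE *fp, size_t chunk, feed_fn feed, void *parser)
{
    char buf[4096];

    assert(fp != NULL && feed != NULL && chunk > 0 && chunk <= sizeof buf);
    for (;;) {
        size_t n = fread(buf, 1, chunk, fp);
        int done = n < chunk;       /* short read: end of file or error */

        if (ferror(fp))
            return 0;
        if (!feed(parser, buf, (int)n, done))
            return 0;
        if (done)
            return 1;
    }
}

/* Stand-in for a real parser: it only counts what it is given. */
struct feed_stats { long bytes; int calls; int finals; };

int count_feed(void *parser, const char *s, int len, int is_final)
{
    struct feed_stats *st = parser;

    (void)s;
    st->bytes += len;
    st->calls += 1;
    st->finals += is_final ? 1 : 0;
    return 1;
}
```

With expat, the feed function would be a thin wrapper that casts the
pointer back to an XML_Parser and calls XML_Parse. The application then
holds only one chunk of the file at a time; how much the parser keeps
internally is a separate question that this sketch does not answer.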



From brc@fourlittlemice.com  Thu Jul 19 08:05:25 2001
From: brc@fourlittlemice.com (Dirk Dierckx)
Date: Thu, 19 Jul 2001 09:05:25 +0200
Subject: [Expat-discuss] Using Expat 1.95.1 under WIN32 and Solaris (7/8)
Message-ID: <FBEPJFEBIOECCAJIGEKMOENPCDAA.brc@fourlittlemice.com>

Hi,

I've been using Expat 1.95.1 under Linux for some time now, but I need to
port my code to Win32. To do this I've downloaded expat_win32bin_1_95_1.zip
from sourceforge, but there is no import library included with the package
(only the header file and the dll). My question now is, how can I use the
dll? The method to create an import library by hand (extracting all exports
from the dll using dumpbin, converting it to a .def file and using link to
create the .lib) isn't well documented, so I think there must be another way
to do this. Anyone?

The same code needs to be ported to Solaris (7 & 8) too, so can I use the
same source tarball as for Linux and compile it under Solaris (using gcc)
without modifications, or are there some things to keep in mind?
---
Regards,
Dirk Dierckx
Software Engineer

B. Rekencentra NV
Kromstraat 50, B-2520 Ranst (Belgium)
Phone   +32 (3) 470.14.00
Fax     +32 (3) 470.14.01
Email   dirk.dierckx@rekencentra.be


From paul.burlumi@argogroup.com  Fri Jul 20 11:15:26 2001
From: paul.burlumi@argogroup.com (Paul Burlumi)
Date: Fri, 20 Jul 2001 11:15:26 +0100
Subject: [Expat-discuss] XML_UNICODE_WCHAR_T / XML_StartDoctypeDeclHandler
Message-ID: <ABB39CCA97F2D840BB0FEDDEBD5A4D0415849D@mail-svr1.elstead-ad.elstead.argogroup.com>

Two questions.

1) Does the XML_UNICODE_WCHAR_T compile time macro described in Clark
Cooper's Overview of Expat on xml.com still work? If so how do I enable
this feature without getting compile time warnings?

2) I am using XML_SetStartDoctypeDeclHandler to set a handler that is
called at the start of a DOCTYPE declaration. While parsing UTF-8
documents the parameter 'pubid' appears to be set correctly. While
parsing UTF-16 documents the contents of this variable do not appear to
be either UTF-8 or UTF-16. Does anybody have any ideas?

Many thanks

Paul Burlumi


From mballen@erols.com  Sun Jul 22 07:45:36 2001
From: mballen@erols.com (Michael B. Allen)
Date: Sun, 22 Jul 2001 02:45:36 -0400
Subject: [Expat-discuss] domc-0.3 released -- W3C's DOM in c
Message-ID: <20010722024536.B11167@nano.foo.net>

DOMC is a light weight implementation of the Document Object
Model (DOM) in ANSI c as specified in the W3C DOM Core Level
1 recommendation. When coupled with the Expat XML Parser
Toolkit, DOMC can load, store, build, and directly manipulate
XML documents represented as a tree in memory.

Enjoy,
Mike

Sun Jul 22 00:34:30 EDT 2001

                         DOMC
          The Document Object Model (DOM) in c

             http://auditorymodels.org/domc/

In the spirit of Expat and the community process I have
written a light weight implementation of the Document Object
Model (DOM) in ANSI c as specified in the W3C DOM Core Level 1
recommendation:

  http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/level-one-core.html

under the MIT License. The package is available for immediate
download here:

  http://auditorymodels.org/domc/src/

The Document Object Model is the W3C recommended way to
manipulate XML and HTML documents as a tree of nodes. It is
the more sophisticated but more memory-intensive alternative
to the SAX api.

Some functionality with respect to EntityReferences, Notations,
character encoding, and other peripherals is missing; however,
the package should be immediately usable in many contexts. The
goal is full compliance, although HTML is not supported and I
have no plans to support it in the future. It is small (~1800
lines of code) and should prove to be highly extensible. I
strongly encourage you to contribute changes, but as stated in
the MIT License you are not obligated to do so.

BUILDING:

UNIX - The package has not been prepared as a library but
the Makefile should build any of the examples and serve as a
model for how to integrate this code into your own. I suspect
in the simplest case this would be to export the appropriate
LD_LIBRARY_PATH or equivalent such as perhaps

  export LD_LIBRARY_PATH=domc_0.3/lib/

and compile with the equivalent of:

  gcc -Wall -I domc_0.3/lib -L domc_0.3/lib \
        lib/stack.c lib/node.c lib/dom.c lib/expatls.c \
        -o my_program my_program.c -lexpat

If expatls.c is left out the DOM_DocumentLS functions will
not be available but Expat will not be required to compile
(however, you will not be able to load or save to XML format
without your own serialization routines).

WINDOWS - I have never built this package on the Windows
platform but there's no reason why it shouldn't build with a
little help. Some defines like DOM_String_dup will need to be replaced
with their equivalent windows functions. If you successfully
build DOMC on Windows I'm sure others would appreciate
your sending me a patch (preferably with your best guess at
conditional compilation that doesn't break the UNIX build).

THE API:

The popular C++ in c technique was NOT used. This means all
methods are expressed as functions that follow the pattern:

  DOM_Object_methodName(DOM_Object *obj, <parameters>)

where the first parameter is always the object the method is
being invoked on. Simple typedefs are used to compensate for
the lack of OO constructs in the c language. For example the
following IDL samples from the W3C specification:

  Element createElement(in DOMString tagName)
              raises(DOMException);
  NodeList getElementsByTagName(in DOMString name);
  Node appendChild(in Node newChild) raises(DOMException);

have been implemented in DOMC with the following prototypes:

  DOM_Element *DOM_Document_createElement( \
              DOM_Document *doc, const DOM_String *tagName);
  DOM_NodeList *DOM_Element_getElementsByTagName( \
              DOM_Element *element, const DOM_String *name);
  DOM_Node *DOM_Node_appendChild( \
              DOM_Node *node, DOM_Node *newChild);
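
To make the naming pattern concrete, here is a minimal sketch using a
made-up Counter type. Counter, Counter_add and Counter_getValue are
hypothetical names chosen only for illustration; they are not part of
DOMC.

```c
#include <stddef.h>

/* Hypothetical "class": a plain struct behind a typedef. */
typedef struct Counter {
    int value;
} Counter;

/* The method counter.add(n) becomes Counter_add(obj, n):
 * the object being operated on is always the first parameter. */
void
Counter_add(Counter *obj, int n)
{
    obj->value += n;
}

/* The accessor counter.getValue() becomes Counter_getValue(obj). */
int
Counter_getValue(const Counter *obj)
{
    return obj->value;
}
```

No function pointers or vtables are involved, so every call site names
the exact function being invoked.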

I believe the only definitions in DOMC that are not specified
in the W3C documentation are the DOM_Exception
constants:

  #define DOM_NO_MEMORY_ERR               11
  #define DOM_NULL_POINTER_ERR            12
  #define DOM_SYSTEM_ERR                  13
  #define DOM_XML_PARSER_ERR              14

the memory management functions:

  void DOM_Document_destroyNode( \
              DOM_Document *doc, DOM_Node *node); 
  void DOM_Document_destroyNodeList(DOM_Document *doc, \
              DOM_NodeList *nl, int free_nodes);
  void DOM_Document_destroyNamedNodeMap(DOM_Document *doc, \
              DOM_NamedNodeMap *nnm, int free_nodes);

and load/save convenience operations:

  int DOM_DocumentLS_load(DOM_Document *doc, DOM_String *uri);
  int DOM_DocumentLS_save(DOM_Document *doc, \
              DOM_String *uri, DOM_Node *node);

The above load/save implementation is dependent on the Expat
XML Parser Toolkit which can be obtained from:

  http://expat.sourceforge.net/

however the source file can be left out of the compilation or
replaced with another implementation of the above methods.

All functions specified in Core Level 1 and two from Level 2
(e.g. DOM_Implementation_createDocument) have been completely
implemented minus the functionality previously mentioned.

The code should be highly portable however it has not been
compiled on any platform other than Linux. The only non-ANSI
code I can think of at the moment is the use of strdup however
this will need to be replaced with a function that checks for
invalid characters anyway.

Finally, below is an example program (something I always
look for when evaluating a new package) that was used to test
DOMC. There are several more example programs packaged with
the distribution.

--8<--

/* d4.c
 *
 * Load the XML file specified on the command line and build
 * a DOM tree returned as a DOM_Document object. Get the root
 * (yes, that union is a little strange -- doesn't happen a
 * lot) and add some new elements using a DOM_DocumentFragment
 * object. Then save the result as XML to stdout.
 */

#include <stdlib.h>
#include <stdio.h>
#include "dom.h"

int
main(int argc, char *argv[])
{
    DOM_Document *doc;
    DOM_DocumentFragment *dfrag;
    DOM_Element *root, *e0, *e1;
    DOM_Text *t0;

    if (argc < 2) {
        return EXIT_FAILURE;
    }

    doc = DOM_Implementation_createDocument(NULL, NULL, NULL);
    if (DOM_DocumentLS_load(doc, argv[1]) == 0) {
        return EXIT_FAILURE;
    }

    root = doc->u.Document.documentElement;

    dfrag = DOM_Document_createDocumentFragment(doc);
    e0 = DOM_Document_createElement(doc, "foo");
    e1 = DOM_Document_createElement(doc, "bar");
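    /* Adjacent string literals are concatenated by the compiler, so
     * the source indentation does not end up inside the text node. */
    t0 = DOM_Document_createTextNode(doc,
            "This tests the DocumentFragment operations such "
            "as properly moving nodes from the DocumentFragment "
            "into the children of another.");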

    DOM_Node_appendChild(dfrag, e0);
    DOM_Node_appendChild(dfrag, e1);
    DOM_Node_appendChild(dfrag, t0);

    if (DOM_Node_appendChild(root->lastChild, dfrag) == NULL) {
        return EXIT_FAILURE;
    }

    if (DOM_DocumentLS_save(doc, "/dev/stdout", NULL) == 0) {
        return EXIT_FAILURE;
    }

    DOM_Document_destroyNode(doc, dfrag);
    DOM_Document_destroyNode(doc, doc);

    return EXIT_SUCCESS;
}


From Michael_B_Allen@ml.com  Sun Jul 22 21:20:28 2001
From: Michael_B_Allen@ml.com (Allen, Michael B (RSCH))
Date: Sun, 22 Jul 2001 16:20:28 -0400
Subject: [Expat-discuss] XML_UNICODE_WCHAR_T / XML_StartDoctypeDeclHandler
Message-ID: <B27EB33BAB29D2119ABF0001FA7EF289053BF0B4@ewfd04.exchange.ml.com>


> -----Original Message-----
> From:	Paul Burlumi [SMTP:paul.burlumi@argogroup.com]
> Sent:	Friday, July 20, 2001 6:15 AM
> To:	expat-discuss@lists.sourceforge.net
> Subject:	[Expat-discuss] XML_UNICODE_WCHAR_T / XML_StartDoctypeDeclHandler
> 
> 
> Two questions.
> 
> 1) Does the XML_UNICODE_WCHAR_T compile time macro described in Clark
> Cooper's Overview of Expat on xml.com still work? If so how do I enable
> this feature without getting compile time warnings?
> 
	There's a bug report about this. See the bugs page on the sourceforge site. You
	might add your compiler messages to the Tracker. I never saw any such
	messages on Linux but I could never get it to work anyway. 

> 2) I am using XML_SetStartDoctypeDeclHandler to set a handler that is
> called at the start of a DOCTYPE declaration. While parsing UTF-8
> documents the parameter 'pubid' appears to be set correctly. While
> parsing UTF-16 documents the contents of this variable do not appear to
> be either UTF-8 or UTF-16. Does anybody have any ideas?
> 
	Sounds roughly like *my* "XML_UNICODE_WCHAR_T isn't working" diagnostics. I
	wonder if James' original source works and this version is just broken. Worth
	trying to roll back to the http://jclark.com/ code just to see.

	Is this compiled with XML_UNICODE_WCHAR_T? I think I saw a message by
	James Clark (the original author) that you had to set the XML_UNICODE macro
	as well. I know Clark Cooper's XML.com docs say that you don't if
	XML_UNICODE_WCHAR_T is set but maybe it's wrong or is just no longer true.

	Mike


From Rainer.Aschwanden@724.com  Mon Jul 23 10:15:38 2001
From: Rainer.Aschwanden@724.com (Rainer Aschwanden)
Date: Mon, 23 Jul 2001 11:15:38 +0200
Subject: [Expat-discuss] XML schema support
Message-ID: <19A252AE8B23D511A1EF00B0D0AB52E8235A62@inffrimail01.fri.724.com>

What level of XML Schema support does expat offer? Does it support the full W3C
spec? I had a look at the readme and reference, but couldn't find a word
about it.

Thanks
Rainer


From fdrake@acm.org  Mon Jul 23 14:29:28 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 23 Jul 2001 09:29:28 -0400 (EDT)
Subject: [Expat-discuss] XML schema support
In-Reply-To: <19A252AE8B23D511A1EF00B0D0AB52E8235A62@inffrimail01.fri.724.com>
References: <19A252AE8B23D511A1EF00B0D0AB52E8235A62@inffrimail01.fri.724.com>
Message-ID: <15196.9912.990449.142892@cj42289-a.reston1.va.home.com>

Rainer Aschwanden writes:
 > What level of XML Schema support does expat offer? Does it support the full W3C
 > spec? I had a look at the readme and reference, but couldn't find a word
 > about it.

  Expat is an XML 1.0 non-validating parser.  Any additional support
for schemas would need to be built on top; it does not include any
specific support for schemas, but could certainly be used as a parser
for the schema.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From tclancy@personity.com  Tue Jul 24 18:23:27 2001
From: tclancy@personity.com (Thomas J. Clancy)
Date: Tue, 24 Jul 2001 13:23:27 -0400
Subject: [Expat-discuss] use of expat with socket
Message-ID: <FFENLFFJILFHPIDBGJAFOENCCFAA.tclancy@personity.com>

Hey There,

I'm new to this discussion group and new to expat, so forgive me if this
question has already been asked (I didn't see anything related to this on
the web site).

I want to use expat to replace our own home brewed XML parser.  The problem
is that while I'm getting in data from a socket (the protocol to our product
is in XML), I may get more than one XML document at a time.  When I tried to
simulate this with the xmlwf app by creating a file that contained two xml
documents, expat crapped out with:

"junk after document element at line 7."

Here is the XML I was messing around with:

<?xml version="1.0" ?>
<foo>
  this is a test
  <b>this is</b>
  only a test.
</foo>
<?xml version="1.0" ?>
<foo2>
  this is a test
  <b>this is</b>
  only a test.
</foo2>


I need to detect through some handler that I've reached the end of the
document and to not continue processing.  Then I can put away the remainder
of the buffer and process it later.  Can this be done?


Thomas J. Clancy


From fdrake@acm.org  Tue Jul 24 20:59:52 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 24 Jul 2001 15:59:52 -0400 (EDT)
Subject: [Expat-discuss] use of expat with socket
In-Reply-To: <FFENLFFJILFHPIDBGJAFOENCCFAA.tclancy@personity.com>
References: <FFENLFFJILFHPIDBGJAFOENCCFAA.tclancy@personity.com>
Message-ID: <15197.54200.812543.238046@cj42289-a.reston1.va.home.com>

Thomas J. Clancy writes:
 > I'm new to this discussion group and new to expat, so forgive me if this
 > question has already been asked (I didn't see anything related to this on
 > the web site).

  I'm sure it's been asked, though.  This seems to come up for every
XML parser I've come across.

 > I want to use expat to replace our own home brewed XML parser.  The problem
 > is that while I'm getting in data from a socket (the protocol to our product
 > is in XML), I may get more than one XML document at a time.  When I tried to

  I don't know of a general-purpose XML parser that supports anything
like this, and (only half facetiously) hope I never do.
  The problem is that, in the general case, there's no way to
determine if a stream is *supposed* to contain multiple documents.
What is needed is some external way to determine the end of the input;
you can then feed the parser data buffers until the end-of-buffer
function returns true.  You can do this by embedding the chunks of XML
into another protocol; this should not be difficult if you can
determine the size of each XML document in bytes before sending it, so
that each document can be preceded by the byte-count.  Otherwise,
you'll need a stream encoding that contains explicit end-of-file
markers.
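
  A minimal sketch of the byte-count framing, assuming a 4-byte
big-endian length prefix in front of each document.  The prefix format
and the names frame_next and struct frame are assumptions made for
illustration; nothing here is part of Expat.

```c
#include <stddef.h>
#include <stdint.h>

/* One framed XML document located inside a receive buffer. */
struct frame {
    const unsigned char *data;  /* first byte of the document */
    size_t len;                 /* document length in bytes */
};

/*
 * Try to extract one document from buf[0..avail).
 * Assumed wire format: 4-byte big-endian byte count, then the document.
 * Returns the number of bytes consumed (prefix plus document), or 0 if
 * the buffer does not yet hold a complete frame.
 */
size_t
frame_next(const unsigned char *buf, size_t avail, struct frame *out)
{
    uint32_t len;

    if (avail < 4)
        return 0;               /* length prefix still incomplete */
    len = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
          ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
    if (avail - 4 < len)
        return 0;               /* document body still incomplete */
    out->data = buf + 4;
    out->len = len;
    return 4 + (size_t)len;
}
```

  Each extracted frame can then be handed to its own parser with the
final-chunk flag set, and whatever follows in the buffer is kept for
the next call.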

 > simulate this with the xmlwf app by creating a file that contained two xml
 > documents, expat crapped out with:
 > 
 > "junk after document element at line 7."

  No, it didn't "crap out"; it found a real XML error!  It just
depends on how you look at it.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From Michael_B_Allen@ml.com  Tue Jul 24 21:19:15 2001
From: Michael_B_Allen@ml.com (Allen, Michael B (RSCH))
Date: Tue, 24 Jul 2001 16:19:15 -0400
Subject: [Expat-discuss] use of expat with socket
Message-ID: <B27EB33BAB29D2119ABF0001FA7EF289053BF0C7@ewfd04.exchange.ml.com>

Just use some special character to signify EOR (end of record) and then
call XML_ParseBuffer with the isFinal parameter as true when you see it.
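
A rough sketch of the record-splitting half, assuming UTF-8 documents
and a NUL byte as the EOR marker (NUL cannot occur inside a UTF-8 XML
1.0 document). The marker choice and the name record_length are
assumptions for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Assumed end-of-record marker: a single NUL byte. */
#define EOR '\0'

/*
 * Return the length of the first complete record in buf[0..avail),
 * not counting the EOR byte, or -1 if no EOR byte has arrived yet.
 */
long
record_length(const char *buf, size_t avail)
{
    const char *end = memchr(buf, EOR, avail);

    return end != NULL ? (long)(end - buf) : -1;
}
```

The bytes before the marker are what gets passed to the parser as the
final chunk; the bytes after it belong to the next record.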

Mike



From tclancy@personity.com  Wed Jul 25 13:06:55 2001
From: tclancy@personity.com (Thomas J. Clancy)
Date: Wed, 25 Jul 2001 08:06:55 -0400
Subject: [Expat-discuss] use of expat with socket
In-Reply-To: <15197.54200.812543.238046@cj42289-a.reston1.va.home.com>
Message-ID: <FFENLFFJILFHPIDBGJAFCENOCFAA.tclancy@personity.com>

Yes, I agree with this.  But it would be nice to have a handler that
notified you when the end of the xml document had been reached (i.e. the
close tag of the main document element) so that you could take it from there
and come back when you want more, perhaps resetting some flag in the parser
so that it would continue from the point in the buffer where you left off as
if it were starting anew.  The fact that the parser knew there was junk
after the closing document element leads me to believe that the parser knew
when it had reached the end of the document.

But I do like the idea of embedding the XML into another protocol, and MIME
seems to be a particularly nice way in which to wrap this.

Thanks for your input.  It was most helpful.

tom



From duncan.palmer@s3group.com  Wed Jul 25 13:25:26 2001
From: duncan.palmer@s3group.com (Duncan Palmer)
Date: Wed, 25 Jul 2001 13:25:26 +0100
Subject: [Expat-discuss] use of expat with socket
References: <B27EB33BAB29D2119ABF0001FA7EF289053BF0C7@ewfd04.exchange.ml.com>
Message-ID: <3B5EBAB6.2F7E68F7@s3group.com>


"Allen, Michael B (RSCH)" wrote:
> 
> Just use some special character to signify EOR (end of record) and then
> call XML_ParseBuffer with the isFinal parameter as true when you see it.

This won't actually work (not for me anyway). I had wanted to use the
parser in the following manner:

p = XML_ParserCreate(NULL)

do {
 receive(xml)
 XML_Parse(p, xml, xml length, TRUE)
} while(1)

XML_ParserFree(p)

The document I was feeding to XML_Parse was complete, and I'd set the
isFinal param on XML_Parse. But I had the same problem as Thomas, so I
ended up creating a new parser for each XML document. It would be nicer
not to have to do this...
 
Dunk.


-- 
Duncan Palmer                     duncan.palmer@s3group.com
Software Design Engineer          Phone: +353-1-2911561
Silicon and Software Systems      Fax:   +353-1-2911001


From fdrake@acm.org  Wed Jul 25 14:37:30 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 25 Jul 2001 09:37:30 -0400 (EDT)
Subject: [Expat-discuss] use of expat with socket
In-Reply-To: <3B5EBAB6.2F7E68F7@s3group.com>
References: <B27EB33BAB29D2119ABF0001FA7EF289053BF0C7@ewfd04.exchange.ml.com>
	<3B5EBAB6.2F7E68F7@s3group.com>
Message-ID: <15198.52122.662592.807905@cj42289-a.reston1.va.home.com>

Duncan Palmer writes:
 > This won't actually work (not for me anyway). I had wanted to use the
 > parser in the following manner:
 > 
 > p = XML_ParserCreate(NULL)
 > 
 > do {
 >  receive(xml)
 >  XML_Parse(p, xml, xml length, TRUE)
 > } while(1)
 > 
 > XML_ParserFree(p)
 > 
 > The document i was feeding to XML_Parse was complete, and i'd set the
 > isFinal param on XML_Parse. But I had the same problem as Thomas, so
 > ended up creating a new parser for each XML document. it would be nicer
 > not to have to do this...

  I'd couch this as being a different problem:  the first is parsing
multiple documents from a single stream, and the second is parsing
multiple documents using a single parser object.  While commonly
observed together, their solutions are distinct.
  Is creation/deletion of the parser object so expensive that it
really makes sense to re-use the parser?  Why not just use:

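int
parse_xml(const char *xml, int length)
{
    /* Use a fresh parser for each document and free it afterwards. */
    XML_Parser p = XML_ParserCreate(NULL);
    int ok;

    if (p == NULL)
        return 0;
    ok = XML_Parse(p, xml, length, 1 /* isFinal */);
    XML_ParserFree(p);
    return ok;
}

int
main(int argc, char *argv[])
{
    int ok;

    ...
    do {
        receive(xml);
        ok = parse_xml(xml, xml_length);
    } while (ok);

    return !ok;
}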


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From fdrake@acm.org  Wed Jul 25 14:48:11 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 25 Jul 2001 09:48:11 -0400 (EDT)
Subject: [Expat-discuss] use of expat with socket
In-Reply-To: <FFENLFFJILFHPIDBGJAFCENOCFAA.tclancy@personity.com>
References: <15197.54200.812543.238046@cj42289-a.reston1.va.home.com>
	<FFENLFFJILFHPIDBGJAFCENOCFAA.tclancy@personity.com>
Message-ID: <15198.52763.88902.940961@cj42289-a.reston1.va.home.com>

Thomas J. Clancy writes:
 > Yes, I agree with this.  But it would be nice to have a handler that
 > notified you when the end of the xml document had been reached (i.e. the
 > close tag of the main document element) so that you could take it from there

  There already *is* an end-element event, and all that's needed from
there is a counter so you know when the document element is complete.

 > and come back when you want more, perhaps resetting some flag in the parser
 > so that it would continue from the point in the buffer where you left off as
 > if it were starting anew.  The fact that the parser knew there was junk
 > after the closing document element leads me to believe that the parser knew
 > when it had reached the end of the document.

  It knows when it reaches the end of the document element, but
there's a specific requirement that nothing follows that.  If there
is, even a non-validating parser is required to report an error.  It
knows it's reached the end of the document because that's when you
stop feeding it data.  In all carefully engineered systems, it is the
responsibility of the container to determine where the end of the
contained object is; the contained object cannot be assumed to know
enough about the container to read itself from the container's
stream.

 > But I do like the idea of embedding the XML into another protocol, and MIME
 > seems to be a particularly nice way in which to wrap this.

  Yes, indeed!

 > Thanks for your input.  It was most helpful.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From ebohlman@earthlink.net  Wed Jul 25 16:52:56 2001
From: ebohlman@earthlink.net (Eric Bohlman)
Date: Wed, 25 Jul 2001 10:52:56 -0500
Subject: [Expat-discuss] use of expat with socket
Message-ID: <200107251524.IAA23514@avocet.mail.pas.earthlink.net>

7/25/01 8:48:11 AM, "Fred L. Drake, Jr." <fdrake@acm.org> wrote:

>
>Thomas J. Clancy writes:
> > Yes, I agree with this.  But it would be nice to have a handler that
> > notified you when the end of the xml document had been reached (i.e. the
> > close tag of the main document element) so that you could take it from there
>
>  There already *is* an end-element event, and all that's needed from
>there is a counter so you know when the document element is complete.
>
> > and come back when you want more, perhaps resetting some flag in the parser
> > so that it would continue from the point in the buffer where you left off as
> > if it were starting anew.  The fact that the parser knew there was junk
> > after the closing document element leads me to believe that the parser knew
> > when it had reached the end of the document.
>
>  It knows when it reaches the end of the document element, but
>there's a specific requirement that nothing follows that.  If there
>is, even a non-validating parser is required to report an error.  It

Not quite correct: whitespace, comments, and processing instructions are allowed to follow the 
document element (which also means that you can't use a simple counter to determine when the document 
is complete).


From fdrake@acm.org  Thu Jul 26 03:20:39 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 25 Jul 2001 22:20:39 -0400 (EDT)
Subject: [Expat-discuss] Parsing multiple documents
In-Reply-To: <019d01c0bdc2$16ac0050$9302a8c0@intranet.pspl.co.in>
References: <019d01c0bdc2$16ac0050$9302a8c0@intranet.pspl.co.in>
Message-ID: <15199.32375.466988.527494@cj42289-a.reston1.va.home.com>

Pratibha Venkatachalam writes:
 > Does the Expatpp C++ wrapper for expat allow for parsing
 > consecutive XML documents?  When I try doing so, documents
 > following the first are rejected by the parser as garbage following
 > the document.  Is there any way to reinitialize the parser after
 > each document parse?

  Sorry for responding so late.
  I don't know anything about the Expatpp wrapper; can someone provide
a link to more information?  (I'd like to start a list of Expat users
(esp. Open Source projects) on the Web site; if anyone has a project
they'd like to see included, please post a link either to the list or
directly to me.  I'd be very interested in knowing which projects are
using or are interested in the current line of development (1.95.x and
beyond).)
  There has been a recent thread on this topic; the basic response
(including mine) is that the XML must be encapsulated in some other
protocol (MIME encoding was suggested, but is not the only way).
Other parsers that I'm aware of seem to lean the same way.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From brc@fourlittlemice.com  Thu Jul 26 08:10:34 2001
From: brc@fourlittlemice.com (Dirk Dierckx)
Date: Thu, 26 Jul 2001 09:10:34 +0200
Subject: [Expat-discuss] Using Expat 1.95.1 under WIN32 and Solaris (7/8)
Message-ID: <FBEPJFEBIOECCAJIGEKMEEOICDAA.brc@fourlittlemice.com>

Hi,

I've been using Expat 1.95.1 under Linux for some time now, but I need
to port my code to Win32. To do this I've downloaded
expat_win32bin_1_95_1.zip from SourceForge, but there is no import
library included with the package (only the header file and the DLL).
My question now is: how can I use the DLL? The method to create an
import library by hand (extracting all exports from the DLL using
dumpbin, converting them to a .def file and using link to create the
.lib) isn't well documented, so I think there must be another way to
do this. Anyone?

The same code needs to be ported to Solaris (7 & 8) too, so can I use
the same source tarball as for Linux and compile it under Solaris
(using gcc) without modifications, or are there some things to keep
in mind?
---
Regards,
Dirk Dierckx
Software Engineer

B. Rekencentra NV
Kromstraat 50, B-2520 Ranst (Belgium)
Phone   +32 (3) 470.14.00
Fax     +32 (3) 470.14.01
Email   dirk.dierckx@rekencentra.be


From fdrake@acm.org  Thu Jul 26 13:23:24 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 26 Jul 2001 08:23:24 -0400 (EDT)
Subject: [Expat-discuss] Using Expat 1.95.1 under WIN32 and Solaris (7/8)
In-Reply-To: <FBEPJFEBIOECCAJIGEKMEEOICDAA.brc@fourlittlemice.com>
References: <FBEPJFEBIOECCAJIGEKMEEOICDAA.brc@fourlittlemice.com>
Message-ID: <15200.3004.85351.502082@cj42289-a.reston1.va.home.com>

Dirk Dierckx writes:
 > I'm using Expat 1.95.1 under Linux for some time now, but I need to
 > port my code to Win32.  To do this I've downloaded the
 > expat_win32bin_1_95_1.zip from sourceforge but there is no import
 > library included with the package (only the header file and the
 > dll).  My question now is, how can I use the dll.  The method to
 > create an import library by hand (extracting all exports from the
 > dll using dumpbin, converting it to a .def file and using link to
 > create the .lib) isn't well documented so I think there must be
 > another way to do this.

  There will be an import library in the upcoming 1.95.2 release.

 > The same code needs to be ported to Solaris(7 & 8) too, so can I
 > use the same source tarball as for Linux and compile it under
 > Solaris (using gcc) without modifications or are there some things
 > to keep in mind?

  Some problems have been reported building on Solaris, but those seem
to be fixed in the upcoming 1.95.2 release.  (I was able to build
without a problem on one of the Solaris boxes on the SourceForge
compile farm, at least.)
  I currently expect the 1.95.2 release to happen sometime tomorrow.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From zhenghong@mustardtech.com  Fri Jul 27 10:36:54 2001
From: zhenghong@mustardtech.com (zheng hong)
Date: Fri, 27 Jul 2001 17:36:54 +0800
Subject: [Expat-discuss] Installing error
Message-ID: <000a01c1167f$ac338600$0401a8c0@mustardtech.com>

Dear Sir

I am using "expat-1.95.1.tar.gz" on the Red Hat Linux 6.2 operating
system. I ran ./configure, make and make install. "./configure" and
"make" ran without any problem, but "make install" fails with this
error:

/usr/bin/install: cannot create regular file `/usr/local/lib/libexpat.so.0.0.1': Permission denied
make[1]: *** [install] Error 1

Can you tell me what is going wrong? Why is there a permission error?

I can be reached at zhenghong@mustardtech.com

Thank you very much
Zheng


From mballen@erols.com  Fri Jul 27 20:28:27 2001
From: mballen@erols.com (Michael B. Allen)
Date: Fri, 27 Jul 2001 15:28:27 -0400
Subject: [Expat-discuss] Installing error
In-Reply-To: <000a01c1167f$ac338600$0401a8c0@mustardtech.com>; from zhenghong@mustardtech.com on Fri, Jul 27, 2001 at 05:36:54PM +0800
References: <000a01c1167f$ac338600$0401a8c0@mustardtech.com>
Message-ID: <20010727152827.A859@nano.foo.net>

On Fri, Jul 27, 2001 at 05:36:54PM +0800, zheng hong wrote:
> Dear Sir
> 
> I am using the "expat-1.95.1.tar.gz" in Red Hat Linux 6.2 operating
> system, I use the ./configure, make and make install, the using
> "./configure" and "make" don't have any problem, but I use "make install"
> that have error, the error is: /usr/bin/install: cannot create regular
> file `/usr/local/lib/libexpat.so.0.0.1': Permission denied
> make[1]: *** [install] Error 1. Can you tell me which side error? why
> they have permission error?

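You must be root to write to /usr/local/lib/. Either run "make install"
as root, or pass --prefix to ./configure so that it installs into a
directory you can write to.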

Mike


From fdrake@acm.org  Fri Jul 27 20:24:34 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 27 Jul 2001 15:24:34 -0400 (EDT)
Subject: [Expat-discuss] use of expat with socket
In-Reply-To: <200107251524.IAA23514@avocet.mail.pas.earthlink.net>
References: <200107251524.IAA23514@avocet.mail.pas.earthlink.net>
Message-ID: <15201.49138.869954.970497@cj42289-a.reston1.va.home.com>

Eric Bohlman writes:
 > Not quite correct: whitespace, comments, and processing instructions are allowed to follow the 
 > document element (which also means that you can't use a simple counter to determine when the document 
 > is complete).

  An excellent point!  I stand corrected.
  This is also a really good reason you can't assume you're at the end
of the document until you reach the end of the input data.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From fdrake@acm.org  Fri Jul 27 22:03:20 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 27 Jul 2001 17:03:20 -0400 (EDT)
Subject: [Expat-discuss] new releases bolluxed
Message-ID: <15201.55064.921979.1625@cj42289-a.reston1.va.home.com>

  I just tried to release Expat version 1.95.2, but something is
messed up at SourceForge.  I'll have to look at it later tonight to
see if I can fix it.
  You can pull the files out of CVS, but it's not as nice for the
Windows users.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From fdrake@acm.org  Sat Jul 28 04:40:35 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 27 Jul 2001 23:40:35 -0400 (EDT)
Subject: [Expat-discuss] new releases bolluxed
In-Reply-To: <15201.55064.921979.1625@cj42289-a.reston1.va.home.com>
References: <15201.55064.921979.1625@cj42289-a.reston1.va.home.com>
Message-ID: <15202.13363.797821.454651@cj42289-a.reston1.va.home.com>

Fred L. Drake, Jr. writes:
 >   I just tried to release Expat version 1.95.2, but something is
 > messed up at SourceForge.  I'll have to look at it later tonight to
 > see if I can fix it.

  Well, it worked after all!  I guess I just had to wait for some cron
job at SourceForge.  I still think they're having problems (the file
release admin pages certainly don't work with Konqueror!), but it's
easy enough to get the files now.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From syprat@yahoo.fr  Mon Jul 30 16:11:46 2001
From: syprat@yahoo.fr (Sylvain PRAT)
Date: Mon, 30 Jul 2001 17:11:46 +0200 (CEST)
Subject: [Expat-discuss] Expat 16 bits compatibility
Message-ID: <20010730151146.38787.qmail@web14807.mail.yahoo.com>

Hi,

I'm trying hard to port expat to a 16-bit version,
in fact to a Windows 3.1 DLL, with MSVC 1.52. I'm
getting many compiler errors (I do not know VC 1.52
very well), and I can't understand what the
namingBitmap in the file nametab.h is.
I can't see why it is so difficult to make expat
16-bit, so can anyone help me?

Thanks in advance.


From fdrake@acm.org  Mon Jul 30 17:06:40 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 30 Jul 2001 12:06:40 -0400 (EDT)
Subject: [Expat-discuss] Expat 16 bits compatibility
In-Reply-To: <20010730151146.38787.qmail@web14807.mail.yahoo.com>
References: <20010730151146.38787.qmail@web14807.mail.yahoo.com>
Message-ID: <15205.34320.972169.847950@cj42289-a.reston1.va.home.com>

Sylvain PRAT writes:
 > I'm trying hard to port expat to a 16-bit version,
 > in fact to a Windows 3.1 DLL, with MSVC 1.52. I'm
 > getting many compiler errors (I do not know VC 1.52
 > very well), and I can't understand what the
 > namingBitmap in the file nametab.h is.
 > I can't see why it is so difficult to make expat
 > 16-bit, so can anyone help me?

  I don't have any access to Windows 3.1 development tools, so I don't
know how much I can help.  Perhaps if you posted the specific error
output you get we can make a little progress.
  It's not clear to me that a 16-bit port would be very useful to many
people, but it's always good to identify places in the code where
there are unnecessary assumptions; that will help anyone on a 64-bit
platform as well.
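
As an illustration of the kind of assumption I mean (this is a sketch,
not Expat's actual code): suppose a lookup table is a bit table stored
as 32-bit words. demo_bitmap and demo_bit_is_set are hypothetical names
for this example. Declaring the table as plain "unsigned" and testing
bits with "1 << n" quietly assumes that int is at least 32 bits wide,
which fails with a 16-bit compiler. Using unsigned long, which ISO C
guarantees to be at least 32 bits, avoids that.

```c
#include <assert.h>

/* Hypothetical 64-entry bit table stored as two 32-bit words.
 * Bits 0 and 63 are set.  unsigned long holds 32 bits everywhere. */
static const unsigned long demo_bitmap[2] = {
    0x00000001UL, 0x80000000UL
};

/* Test bit n without assuming anything about the width of int:
 * both the table element and the mask are unsigned long. */
int demo_bit_is_set(unsigned n)
{
    assert(n < 64u);
    return (demo_bitmap[n >> 5] & (1UL << (n & 0x1Fu))) != 0;
}
```

The same habit (spelling out the minimum width you need rather than
relying on int) is what keeps code like this working when long grows
to 64 bits.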


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From fdrake@acm.org  Mon Jul 30 20:12:24 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 30 Jul 2001 15:12:24 -0400 (EDT)
Subject: [Expat-discuss] Anyone using BeOS?
Message-ID: <15205.45464.632584.320171@cj42289-a.reston1.va.home.com>

  Does anyone here have a BeOS machine?  I'd like to learn a little
more about the apparent __declspec support on that platform, since
past patches indicate that it differs from the MSVC support.
  If anyone has a pointer to relevant documentation, I'd really
appreciate it.
  Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation