From ken@bitsko.slc.ut.us Sun Apr 1 17:18:41 2001
From: ken@bitsko.slc.ut.us (Ken MacLeod)
Date: 01 Apr 2001 11:18:41 -0500
Subject: [Expat-discuss] Compiling expat on Win32
In-Reply-To: Bjoern Hoehrmann's message of "Sun, 01 Apr 2001 10:42:26 +0200"
References: <09pdctkv60ru76o53d5d9oe0h3k8b721t1@4ax.com>
Message-ID:

Is anyone working on Win32 issues? Below is a message sent to
perl-xml about my Orchard project that uses expat with Perl, but I
suspect that XML::Parser+expat would also have problems compiling on
Win32.

-- Ken

-------- forwarded --------
Bjoern Hoehrmann writes:

> Hi,
>
> This is not the best place to discuss Orchard/C but the appropriate list
> seems to be dead and http://sourceforge.net/mail/?group_id=10839 lists
> no mailing lists, so let's go :-)
>
> After some hours of evil hacking to get all prerequisites running and
> digging through the build process of Orchard I finally got it running
> under Win32 (Windows NT 4.0 with ActiveState Perl 5.6).
>
> Expat
> -----
> Release Version won't compile, followed [1] and got it running. Since
> expat-1.95.1 is somewhat old and the latest CVS version produces even
> more errors, I think there is no real interest to keep it running under
> win32 without problems...
>
> BoehmGC
> -------
> Compiles just fine
>
> Python
> ------
> Got latest Python 2.1 beta 1, installs fine, runs fine
>
> Orchard
> -------
> Hell! This was really complicated; after I understood how those
> makefiles are built up and I finally understood how moc.py was supposed
> to work, I wrote a .cmd file to call it. Damn! Windows throws weird
> error messages. Someday I discovered cmd.exe cannot handle lines longer
> than 500 or so chars, I wrote a Perl script to call it [2].
> I'm too dumb
> to change the MakeFile.pl, I kept getting some errors for unresolved
> symbols and so on, I fired up Visual Studio and got the perlif.dll
> built, renamed it to OrchardC.dll, copied gc.dll, expat.dll and
> liborchard.dll into the arch/auto/class/orchardc/ directory, hacked
> 'Makefile' to remove 'all' from 'install' build stage and ran 'nmake
> install' (you can see I've been really frustrated :-) - worked. Tests
> run fine except t/basic.t test 17, it fails.
>
> [1] http://sourceforge.net/tracker/index.php?func=detail&aid=221127&group_id=10127&atid=110127
> [2]
>
> #!perl -w
> use strict;
> use warnings;
> use Win32::Process;
>
> my $proc;
> my @files = qw|
>   hash.moc if.moc int.moc key.moc list.moc moc.moc nil.moc node.moc
>   saxd_expat.moc string.moc symbol.moc tree_builder.moc xml_fs.moc
>   xml_nonopt.moc xpath.moc
>
>   t/except_unit.moc t/expat_func.moc t/expat_perf.moc t/expat_tree.moc
>   t/hash_unit.moc t/if_unit.moc t/int_unit.moc t/key_unit.moc
>   t/list_unit.moc t/MyHandler.moc t/MyNode.moc t/node_func.moc
>   t/node_unit.moc t/NullHandler.moc t/string_unit.moc t/symbol_unit.moc
>   t/tree_builder_func.moc t/xml_node_func.moc t/xml_node_unit.moc
>   t/xpath_unit.moc
>
>   perlif/perl_object.moc|;
>
> my $cmdline = 'python .\moc.py ' . join(' ', @files);
> Win32::Process::Create(
>   $proc,
>   'd:\winapp\python\python.exe', $cmdline,
>   0, NORMAL_PRIORITY_CLASS, '.');
>
> For future releases, please do not include the .c files if moc.py has to
> be run anyway; better still, don't provide the .moc files and don't require
> python for anything. Please provide some Win32 makefiles and provide a
> ppm package.
> --
> Björn Höhrmann ^ mailto:bjoern@hoehrmann.de ^ http://www.bjoernsworld.de
> am Badedeich 7 ° Telefon: +49(0)4667/981028 ° http://bjoern.hoehrmann.de
> 25899 Dagebüll # PGP Pub. KeyID: 0xA4357E78 # http://learn.to/quote [!]e
> -- listen, learn, contribute -- David J.
Marcus
> _______________________________________________
> Perl-XML mailing list
> Perl-XML@listserv.ActiveState.com
> http://listserv.ActiveState.com/mailman/listinfo/perl-xml

From derhoermi@gmx.net Sun Apr 1 20:23:29 2001
From: derhoermi@gmx.net (Bjoern Hoehrmann)
Date: Sun, 01 Apr 2001 21:23:29 +0200
Subject: [Expat-discuss] Re: Compiling expat on Win32
In-Reply-To:
References: <09pdctkv60ru76o53d5d9oe0h3k8b721t1@4ax.com>
Message-ID:

* Ken MacLeod wrote:
>Is anyone working on Win32 issues? Below is a message sent to
>perl-xml about my Orchard project that uses expat with Perl, but I
>suspect that XML::Parser+expat would also have problems compiling on
>Win32.

XML::Parser 2.30 doesn't compile (in addition to this problem) because
of an unresolved symbol _XML_DefaultCurrent in expat.lib and because the
MakeFile.pl searches for libexpat.lib while the library is named
expat.lib. This issue was raised on perl-xml, but nobody gave a reply...
--
Björn Höhrmann ^ mailto:bjoern@hoehrmann.de ^ http://www.bjoernsworld.de
am Badedeich 7 ° Telefon: +49(0)4667/981028 ° http://bjoern.hoehrmann.de
25899 Dagebüll # PGP Pub. KeyID: 0xA4357E78 # http://learn.to/quote [!]e
-- listen, learn, contribute -- David J. Marcus

From Tjuricek@mohomine.com Wed Apr 4 00:56:04 2001
From: Tjuricek@mohomine.com (Tristan Juricek)
Date: Tue, 3 Apr 2001 16:56:04 -0700
Subject: [Expat-discuss] Cygwin
Message-ID:

I've been having numerous problems configuring the install process to work
under Cygwin. (I'd like to use the XML::Parser perl module under Cygwin
Perl.) I can get the installation so it can find windows.h, but I don't
know the defines that need to be set so the library will compile.

Has anybody got it to compile under Cygwin?
I don't know if there is
actual support for it or if this is configure magic making me think there
might be, so I figured I'd ask.

-Tristan

From Tjuricek@mohomine.com Wed Apr 4 03:07:21 2001
From: Tjuricek@mohomine.com (Tristan Juricek)
Date: Tue, 3 Apr 2001 19:07:21 -0700
Subject: [Expat-discuss] Cygwin
Message-ID:

Yep, upgrading from gcc 2.95.2-7 to gcc 2.95.3-2 made expat work out of the
box in Cygwin (1.1.8). Good stuff. Thanks for your quick reply.

-Tristan

-----Original Message-----
From: David Crowley [mailto:dcrowley@scitegic.com]
Sent: Tuesday, April 03, 2001 5:17 PM
To: Tristan Juricek
Subject: Re: [Expat-discuss] Cygwin

At 04:56 PM 4/3/2001, you wrote:
>I've been having numerous problems configuring the install process to work
>under Cygwin. (I'd like to use the XML::Parser perl module under Cygwin
>Perl.) I can get the installation so it can find windows.h, but I don't
>know the defines that need to be set so the library will compile.
>
>Has anybody got it to compile under Cygwin? I don't know if there is
>actual support for it or if this is configure magic making me think there
>might be, so I figured I'd ask.
>
>-Tristan

Just last week I got Expat to compile with Cygwin out of the box. But I had
to get the very latest cygwin gcc to do it. With an old cygwin gcc I had from
a few months ago, I had to modify Expat.h. About 2 weeks ago I got a gcc
update and Expat didn't compile at all. The include path lookup for
windows.h and all the win32api files was all screwed up.
I updated the cygwin gcc package again about a week ago and now Expat
compiles out of the box. Give that a shot.

David

From pratibha@pspl.co.in Thu Apr 5 12:18:06 2001
From: pratibha@pspl.co.in (Pratibha Venkatachalam)
Date: Thu, 5 Apr 2001 16:48:06 +0530
Subject: [Expat-discuss] Parsing multiple documents
Message-ID: <019d01c0bdc2$16ac0050$9302a8c0@intranet.pspl.co.in>

Does the Expatpp C++ wrapper for expat allow for parsing consecutive XML
documents?
When I try doing so, documents following the first are rejected by the
parser as garbage following the document.
Is there any way to reinitialize the parser after each document parse?

-Pratibha.

From paulp@ActiveState.com Thu Apr 5 23:42:53 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Thu, 05 Apr 2001 15:42:53 -0700
Subject: [Expat-discuss] Parsing multiple documents
References: <019d01c0bdc2$16ac0050$9302a8c0@intranet.pspl.co.in>
Message-ID: <3ACCF4ED.40CFB84@ActiveState.com>

Why do you want to re-initialize the parser instead of creating another
one? The parser object is not that large... I mean it isn't tiny but the
malloc is going to be basically forgotten in the overhead of doing I/O
and XML parsing.

--
Take a recipe. Leave a recipe.
Python Cookbook!
http://www.ActiveState.com/pythoncookbook

From southstar@minecard.com.tw Tue Apr 10 08:13:07 2001
From: southstar@minecard.com.tw (張南星)
Date: Tue, 10 Apr 2001 15:13:07 +0800
Subject: [Expat-discuss] Is Expat-1.9.5 is validating parser
Message-ID:

hi!!
   I am not sure if Expat has the function to validate against the DTD
schema/Definition. For example, if I use an element that has no
declaration in the DTD, will Expat return an error?

  May you give me help????

Sincerely
Neil Chang (from Taiwan)

From ikkampfcaspar@oeri.ch Tue Apr 10 08:26:56 2001
From: ikkampfcaspar@oeri.ch (Hans-Peter Oeri)
Date: Tue, 10 Apr 2001 09:26:56 +0200 (CEST)
Subject: [Expat-discuss] Is Expat-1.9.5 is validating parser
In-Reply-To:
References:
Message-ID: <986887616.3ad2b5c0b3fa1@imap.oeri.ch>

Hi!

Quoting 張南星:

> schema/Definition. For example, if I use an element that has no
> declaration in the DTD, will Expat return an error?

No, Expat is NOT validating. That task is left to the application, if
necessary. Expat offers the hooks needed to accomplish a validation, but
does not do it itself.

Don't conform!
Ik Kampf Caspar 75

From t19nguyen@yahoo.com Wed Apr 11 19:24:33 2001
From: t19nguyen@yahoo.com (David Nguyen)
Date: Wed, 11 Apr 2001 11:24:33 -0700 (PDT)
Subject: [Expat-discuss] Is expat thread-safe?
Message-ID: <20010411182433.30134.qmail@web11605.mail.yahoo.com>

Hi every one,
Is expat thread-safe in a multi-threaded environment?
Thanks.
David Nguyen

__________________________________________________
Do You Yahoo!?
Get email at your own domain with Yahoo! Mail.
http://personal.mail.yahoo.com/

From paulp@ActiveState.com Wed Apr 11 19:30:46 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Wed, 11 Apr 2001 11:30:46 -0700
Subject: [Expat-discuss] Is expat thread-safe?
References: <20010411182433.30134.qmail@web11605.mail.yahoo.com>
Message-ID: <3AD4A2D6.4723A30A@ActiveState.com>

Yes, expat is designed to be thread-safe and I don't know of any reports
of bugs relating to that.

David Nguyen wrote:
>
> Hi every one,
> Is expat thread-safe in a multi-threaded environment?
> Thanks.
> David Nguyen
>
> __________________________________________________
> Do You Yahoo!?
> Get email at your own domain with Yahoo! Mail.
> http://personal.mail.yahoo.com/
>
> _______________________________________________
> Expat-discuss mailing list
> Expat-discuss@lists.sourceforge.net
> http://lists.sourceforge.net/lists/listinfo/expat-discuss

--
Take a recipe. Leave a recipe.
Python Cookbook! http://www.ActiveState.com/pythoncookbook

From bronson@rinspin.com Sat Apr 14 22:19:53 2001
From: bronson@rinspin.com (Scott Bronson)
Date: Sat, 14 Apr 2001 14:19:53 -0700
Subject: [Expat-discuss] Stopping the parse
Message-ID: <20010414141953.D18421@rinspin.com>

When parsing the XML file, there are ample opportunities for me to run
out of memory. What's the best way of stopping parsing from one of the
callback routines (i.e. the element handlers)?

Right now I setjmp/longjmp, but that's virtually guaranteed to leak memory
and leave the parser in a bad state. It's too aggressive. Is there a way
of treating expat nicer?

Thanks,
- Scott

From awkkock@yahoo.com Wed Apr 18 15:24:23 2001
From: awkkock@yahoo.com (Ambles Kock)
Date: Wed, 18 Apr 2001 07:24:23 -0700 (PDT)
Subject: [Expat-discuss] Object file format error on OSF4
Message-ID: <20010418142423.1029.qmail@web4406.mail.yahoo.com>

Hi all,

I am using expat on Solaris, HPUX11, SGI, FreeBSD, Linux, and DEC.
Everything worked fine except for the DEC one.

When I try to link my executable with xmlparse.o, it gives me this error:

..../lib/xmlparse.o: local_is_complete: iaux(266) > iauxMax(220) for obj ..../lib/xmlparse.o

Has anyone seen this before? I am anxious to fix this. Thanks a million in advance.

Ambles

Ambles Kock | I N K T O M I               |Tel: 650-653-5636
Soft. Eng.  | Essential to the Internet   |Fax: 650-653-1848

From bronson@rinspin.com Wed Apr 18 17:48:42 2001
From: bronson@rinspin.com (bronson@rinspin.com)
Date: Wed, 18 Apr 2001 09:48:42 -0700
Subject: [Expat-discuss] Stopping the parse -- anybody home?
Message-ID: <20010418094842.A2808@rinspin.com>

I'm in a callback, and I've just run out of memory. I'd like to stop
parsing. Is there any way to do this? Right now I just longjmp my
way all the way out, but that's almost guaranteed to cause memory
leaks and confuse Expat's internals.

Is there any better way of stopping parsing? Thanks,

I asked this same question last week. Since I got no replies, I assume
the answer is, "no"? Can someone reassure me that this even made it to
the list? Thanks,

- Scott

From fdrake@acm.org Wed Apr 18 18:04:05 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 18 Apr 2001 13:04:05 -0400 (EDT)
Subject: [Expat-discuss] Stopping the parse -- anybody home?
In-Reply-To: <20010418094842.A2808@rinspin.com>
References: <20010418094842.A2808@rinspin.com>
Message-ID: <15069.51461.660638.600908@cj42289-a.reston1.va.home.com>

bronson@rinspin.com writes:
 > I'm in a callback, and I've just run out of memory. I'd like to stop
 > parsing. Is there any way to do this?
Right now I just longjmp my > way all the way out, but that's almost guaranteed to cause memory > leaks and confuse Expat's internals. > > Is there any better way of stopping parsing? Thanks, Not that I can tell! This would be a very welcome addition to the API, I'm sure. It looks like I'll be able to spend some time on Expat next week; I'll keep this new requirement in mind. > I asked this same question last week. Since I got no replies, I assume > the answer is, "no"? Can someone reassure me that this even made it to > the list? Thanks, Sorry; all of us have been completely swamped. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From michael@vivtek.com Wed Apr 18 18:01:55 2001 From: michael@vivtek.com (Michael Roberts) Date: Wed, 18 Apr 2001 12:01:55 -0500 Subject: [Expat-discuss] Stopping the parse -- anybody home? References: <20010418094842.A2808@rinspin.com> Message-ID: <3ADDC883.972EE47F@vivtek.com> It did indeed make it to the list and I was kind of hoping somebody would answer it. You might just keep a flag attached to the parse, and skip out of all handlers when it gets set. That's the approach I'd try first. Michael bronson@rinspin.com wrote: > I'm in a callback, and I've just run out of memory. I'd like to stop > parsing. Is there any way to do this? Right now I just longjmp my > way all the way out, but that's almost guaranteed to cause memory > leaks and confuse Expat's internals. > > Is there any better way of stopping parsing? Thanks, > > I asked this same question last week. Since I got no replies, I assume > the answer is, "no"? Can someone reassure me that this even made it to > the list? Thanks, > > - Scott > > _______________________________________________ > Expat-discuss mailing list > Expat-discuss@lists.sourceforge.net > http://lists.sourceforge.net/lists/listinfo/expat-discuss From fdrake@acm.org Wed Apr 18 18:38:03 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Wed, 18 Apr 2001 13:38:03 -0400 (EDT) Subject: [Expat-discuss] Stopping the parse -- anybody home? In-Reply-To: <3ADDC883.972EE47F@vivtek.com> References: <20010418094842.A2808@rinspin.com> <3ADDC883.972EE47F@vivtek.com> Message-ID: <15069.53499.461120.365967@cj42289-a.reston1.va.home.com> Michael Roberts writes: > It did indeed make it to the list and I was kind of hoping somebody would > answer it. Looks like our responses crossed in the mail! > You might just keep a flag attached to the parse, and skip out of all > handlers when it gets set. That's the approach I'd try first. Here's a (slightly) better approach that we use in the Python bindings for Expat: when a Python handler raises an exception, we clear all the handlers registered with the parser instance being used. This avoids having to check a flag for each callback (which gives us more maintainable application code), and can be just a little faster. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From dcrowley@scitegic.com Wed Apr 18 19:27:29 2001 From: dcrowley@scitegic.com (David Crowley) Date: Wed, 18 Apr 2001 11:27:29 -0700 Subject: [Expat-discuss] Stopping the parse -- anybody home? In-Reply-To: <15069.53499.461120.365967@cj42289-a.reston1.va.home.com> References: <3ADDC883.972EE47F@vivtek.com> <20010418094842.A2808@rinspin.com> <3ADDC883.972EE47F@vivtek.com> Message-ID: <5.0.2.1.0.20010418110917.01e4f138@mail.internal.scitegic.com> At 10:38 AM 4/18/2001, Fred L. Drake, Jr. wrote: >Michael Roberts writes: > > It did indeed make it to the list and I was kind of hoping somebody would > > answer it. > > Looks like our responses crossed in the mail! > > > You might just keep a flag attached to the parse, and skip out of all > > handlers when it gets set. That's the approach I'd try first. 
>
>  Here's a (slightly) better approach that we use in the Python
>bindings for Expat: when a Python handler raises an exception, we
>clear all the handlers registered with the parser instance being used.
>This avoids having to check a flag for each callback (which gives us
>more maintainable application code), and can be just a little faster.

I actually tried to respond last weekend but my mail bounced and I didn't
get back to it. The situation I am in is that I need to break out of a
parse and then continue at a later time. So I set up a wrapper class
around my file to read the file and return "tokens", where I say a "token"
is anything before a ">" character. So my loop is like this:

    bool stopParse = false;          /* set by a handler to end the loop */
    tokenizer t("myfile.xml");
    while (1) {
        /* ask Expat for room, fill it with one token, then parse it */
        void *buffer = XML_GetBuffer(parser, 1024);
        int read = t.readToken(buffer, 1024);
        XML_ParseBuffer(parser, read, read == 0);
        if (stopParse || read == 0)
            break;
    }

    void endElementHandler(...)
    {
        /* needToStop stands for whatever condition the application uses */
        if (needToStop)
            stopParse = true;
    }

The tokens returned for an xml file of "data" are "", "", "data", and "."
I guess you could also write the tokenizer to break it up a little bit
more to break up the "data" token. But that's the general idea.

The Xerces parser kind of does something like that with the tokens, but I
MUCH prefer Expat.

David

From bronson@rinspin.com Wed Apr 18 19:32:51 2001
From: bronson@rinspin.com (Scott Bronson)
Date: Wed, 18 Apr 2001 11:32:51 -0700
Subject: [Expat-discuss] Stopping the parse
In-Reply-To: <15069.53499.461120.365967@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Wed, Apr 18, 2001 at 01:38:03PM -0400
References: <20010418094842.A2808@rinspin.com> <3ADDC883.972EE47F@vivtek.com> <15069.53499.461120.365967@cj42289-a.reston1.va.home.com>
Message-ID: <20010418113251.B22930@rinspin.com>

On Wed, Apr 18, 2001 at 01:38:03PM -0400, Fred L. Drake, Jr.
wrote:
> Here's a (slightly) better approach that we use in the Python
> bindings for Expat: when a Python handler raises an exception, we
> clear all the handlers registered with the parser instance being used.
> This avoids having to check a flag for each callback (which gives us
> more maintainable application code), and can be just a little faster.

This sounds like a good approach. I'll use it. Thanks, Fred.

I just need to be sure that if Expat finds a syntax error later in the
file, that doesn't mask the "real" error I hit earlier on.

- Scott

p.s. This gives me another idea for how to stop expat:

    buf = XML_GetBuffer(parser, buf_len)
    XML_ParseBuffer(parser, buf_len, 0)

then, from a callback...

    bzero(buf, buf_len);

:)

From gstein@lyra.org Thu Apr 19 04:06:16 2001
From: gstein@lyra.org (Greg Stein)
Date: Wed, 18 Apr 2001 20:06:16 -0700
Subject: [Expat-discuss] Object file format error on OSF4
In-Reply-To: <20010418142423.1029.qmail@web4406.mail.yahoo.com>; from awkkock@yahoo.com on Wed, Apr 18, 2001 at 07:24:23AM -0700
References: <20010418142423.1029.qmail@web4406.mail.yahoo.com>
Message-ID: <20010418200616.R31832@lyra.org>

On Wed, Apr 18, 2001 at 07:24:23AM -0700, Ambles Kock wrote:
>
> Hi all,
>
> I am using expat on Solaris, HPUX11, SGI, FreeBSD, Linux, and DEC. Everything worked fine except for the DEC one.
>
> When I try to link my executable with xmlparse.o, it gives me this error:
> ..../lib/xmlparse.o: local_is_complete: iaux(266) > iauxMax(220) for obj ..../lib/xmlparse.o
>
> Has anyone seen this before? I am anxious to fix this. Thanks a million in advance.

Never seen that. Sounds like a basic compiler/linker problem. Possibly a
corrupted .o file. I'd suggest a "make clean" and try making it again. If it
still doesn't work... woof. No idea.
Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From Josh.Martin@abq.sc.philips.com Fri Apr 20 00:11:15 2001
From: Josh.Martin@abq.sc.philips.com (Josh Martin)
Date: Thu, 19 Apr 2001 17:11:15 -0600 (MDT)
Subject: [Expat-discuss] External entity parsing problems
Message-ID: <200104192311.RAA08430@abqn42.sca.philips.com>

---------------------- multipart/mixed attachment
I'm having an odd problem while trying to parse external entities with
xmlcheck ("xmlcheck -v test.xml" gcc 2.95.2 HP-UX 11.00 expat 1.95.1,
compiled with 'gcc -O2 -Wall -Wl,+vallcompatwarnings -lexpat -o xmlcheck
xmlcheck.c').

test.dtd defines a parameter entity (pcdata), includes xhtml-lat1.ent and
test.ent (via external parameter definitions and declarations), and then
uses these parameter entities to define some elements and another entity.
test.xml defines an internal entity and then proceeds to use all of the
elements and an entity from each file. XML_DTD was defined at compile
time, and I have param entities set to parse unless standalone.

Here's the problem: At the end of my external entity handler (at line 206)
I free the extern_ent parser (via XML_ParserFree()) after it is done
parsing the external reference. If I do this, then the parser refuses to
'include' test.ent and it dies on the reference to %myname; at line 10
with "undefined entity". But, if I comment out the XML_ParserFree()
statement, then it gladly parses and includes the test.ent file, but
forgets the definition for pcdata and dies on the reference to %pcdata; on
line 9 with "undefined entity".

Obviously I must be trying to use expat for something it wasn't designed
for, but which coding method am I supposed to use? Which "bug" is actually
the correct way that expat should operate? And what do I need to be doing
to get these documents to parse correctly?

Please let me know if I've been confusing or vague, or let me know if you
can't read my code, and I'll try to explain myself better.
All of the attached files were written by me, with the exception of
xhtml-lat1.ent, which is borrowed from the W3C website (I wanted to test
© and ®).

- Josh Martin

---------------------- multipart/mixed attachment
A non-text attachment was scrubbed...
Name: not available
Type: text/x-sun-c-file
Size: 5287 bytes
Desc: xmlcheck.c
Url : http://mail.libexpat.org/pipermail-21/expat-discuss/attachments/20010419/5ee97fac/attachment.bin
---------------------- multipart/mixed attachment
]>
We have a Thing® & stuff.
My name is &fred;. Your name is &bob;. Our name is not Joe.
---------------------- multipart/mixed attachment
%HTMLlat1;
%TESTent;
---------------------- multipart/mixed attachment
---------------------- multipart/mixed attachment
---------------------- multipart/mixed attachment--

From thomas@urgent.rug.ac.be Tue Apr 24 10:48:48 2001
From: thomas@urgent.rug.ac.be (Thomas Vander Stichele)
Date: Tue, 24 Apr 2001 11:48:48 +0200 (CEST)
Subject: [Expat-discuss] newbie question
Message-ID:

Hi.

I'm kinda new to expat. I'm trying to write a program which needs to use
stream-based XML parsing in C. I first tried libxml from gnome, but I
can't get it to work without it segfaulting. So I tried expat, and it
seems to work a lot better. Some things aren't very clear to me, however,
and that's why I'm turning to this list. Though it seems like there isn't
much traffic on it.

What I wanted to know:

a) is there a more elaborate example of expat use in another project?
Preferably one that does something with the data in between start and end
tags

b) the doc says that the data in between these tags can be spread over
calls to (*XML_CharacterDataHandler). Based on what is it spread?
What would be the best way to get all of the data in between tags in one
buffer? Allocate one in the start element callback, add this data in the
character data handler callback, and use it in the end element and free
it there as well?
c) most handlers have a void *userData as a first argument; it's not very clear to me how this works. There are two functions that seem to have something to do with this, XML_SetUserData and *XML_GetUserData. But what do they do ? it's not entirely clear from the expat source code. As far as I can tell, userData itself is a #define macro that interfaces between the Parser structure and a "global" userData; perhaps because expat was first implemented with globals and then changed to use functions ? But why does userData have to be passed as an argument to handlers, and what is a handler supposed to do with it ? Any help would be appreciated. Thanks in advance, thomas <-*- -*-> god loves his children yeah <-*- thomas@apestaart.org -*-> URGent, the best radio on the Internet - 24/7 ! - http://urgent.rug.ac.be/ From ken@bitsko.slc.ut.us Tue Apr 24 16:45:29 2001 From: ken@bitsko.slc.ut.us (Ken MacLeod) Date: 24 Apr 2001 10:45:29 -0500 Subject: [Expat-discuss] newbie question In-Reply-To: Thomas Vander Stichele's message of "Tue, 24 Apr 2001 11:48:48 +0200 (CEST)" References: Message-ID: Thomas Vander Stichele writes: > a) is there a more elaborate example of expat use in another > project? Preferably one that does something with the data inbetween > start and end tags I don't know of any examples off-hand, besides the included one and the xml.com article linked to from the web site[1]. > b) the doc says that the data inbetween these tags can be spread > over calls to *XML_CharacterDataHandler). Based on what is it > spread ? Although probably deterministic, it's also probably not worth delving into. Expat spreads out a lot on whitespace and non-alphabetic characters. You must always expect text to be spread out. > What would be the best way to get all of the data inbetween tags in > one buffer ? Allocate one in the start element callback, add this > data in the character data handler callback, and use it in the end > element and free it there as well ? Yes. 
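
A minimal sketch of the buffer-per-element approach confirmed just above. It is written against Python's standard `xml.parsers.expat` binding rather than the C API, purely so it runs without a build step; that choice, the class name `TextCollector`, and the sample document are assumptions for illustration and not anything posted in this thread. In C the same three handlers would be registered with XML_SetElementHandler and XML_SetCharacterDataHandler.

```python
import xml.parsers.expat


class TextCollector:
    """Collect the text of each leaf element into one string."""

    def __init__(self):
        self.parser = xml.parsers.expat.ParserCreate()
        # Leave Expat's own chunking visible so the accumulation is exercised.
        self.parser.buffer_text = False
        self.parser.StartElementHandler = self.start
        self.parser.EndElementHandler = self.end
        self.parser.CharacterDataHandler = self.chars
        self.chunks = []    # pieces of text seen since the last start tag
        self.calls = 0      # number of character data callbacks received
        self.results = []   # (element name, complete text) pairs

    def start(self, name, attrs):
        self.chunks = []            # "allocate" a fresh buffer

    def chars(self, data):
        self.calls += 1
        self.chunks.append(data)    # may run several times per element

    def end(self, name):
        text = "".join(self.chunks)
        if text.strip():
            self.results.append((name, text))   # the whole text is usable here
        self.chunks = []            # "free" the buffer

    def parse(self, document):
        self.parser.Parse(document, True)
        return self.results


collector = TextCollector()
pairs = collector.parse("<doc><p>line one\nline two &amp; more</p></doc>")
print(pairs)
print("character data callbacks:", collector.calls)
```

The text of `p` arrives in more than one callback, yet the end handler sees it as a single string. A single flat buffer like this only suits leaf elements, because a nested start tag resets it.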
> c) most handlers have a void *userData as a first argument; it's not > very clear to me how this works. There are two functions that seem > to have something to do with this, XML_SetUserData and > *XML_GetUserData. But what do they do ? it's not entirely clear > from the expat source code. userData is a pointer to anything *you* want to store there and be passed back to all of your handlers. This way you could have more than one parser allocated, yet calling the same set of handlers -- the userData can be used to distinguish the streams or hold instance-specific information that you are gathering. Use XML_SetUserData to assign that pointer to a parser instance, and XML_GetUserData to get it back (if you ever need to). -- Ken [1] From bronson@rinspin.com Wed Apr 25 02:03:32 2001 From: bronson@rinspin.com (Scott Bronson) Date: Tue, 24 Apr 2001 18:03:32 -0700 Subject: [Expat-discuss] newbie question In-Reply-To: ; from thomas@urgent.rug.ac.be on Tue, Apr 24, 2001 at 11:48:48AM +0200 References: Message-ID: <20010424180332.A28317@rinspin.com> On Tue, Apr 24, 2001 at 11:48:48AM +0200, Thomas Vander Stichele wrote: > I'm kinda new to expat. I'm trying to write a program which needs to use > stream-based XML parsing in C. Me too. I saw that this is a pretty generic problem, so I tried to create a reusable library called tagstack. This just maintains a stack of elements, and collects all the chardata for each tag into a single zero-terminated string. Like Expat, it provides two callbacks, one for open tag, and one for close. It doesn't need a chardata callback, though, because it handles that internally all by itself. It's still very much a work in progress -- I haven't even written a full client for tagstack yet. But the absurdly simple demo works. You can find what I've done at ftp://www.trestle.com/pub/tagstack-0.01.tar.gz I hope to have it useful in a week or two. I'd be interested in any comments on my design of tagstack and its use of Expat. 
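
The idea described above (a stack of open elements, with the character data of each one gathered into a single string and only open/close callbacks exposed) might look roughly like the following. This is a hypothetical sketch in Python's standard `xml.parsers.expat` binding, not the tagstack code itself; the function name `parse_with_stack` and the callback signatures are invented for illustration.

```python
import xml.parsers.expat


def parse_with_stack(document, on_open, on_close):
    """Call on_open(path) and on_close(path, text) for every element."""
    stack = []  # one [name, list-of-text-pieces] entry per open element

    def start(name, attrs):
        stack.append([name, []])
        on_open([entry[0] for entry in stack])

    def chars(data):
        if stack:
            # Text belongs to the innermost element that is still open.
            stack[-1][1].append(data)

    def end(name):
        path = [entry[0] for entry in stack]
        _, pieces = stack.pop()
        on_close(path, "".join(pieces))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(document, True)


closed = []
parse_with_stack(
    "<a>x<b>inner</b>y</a>",
    on_open=lambda path: None,
    on_close=lambda path, text: closed.append(("/".join(path), text)),
)
print(closed)
```

Because each open element owns its own list of pieces, text on either side of a child element ends up with the parent, and the client never needs a character data callback of its own.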
> b) the doc says that the data in between these tags can be spread over
> calls to (*XML_CharacterDataHandler). Based on what is it spread?

It seems pretty complex. It breaks on a lot of CRs, and some whitespace,
but it's hard to say exactly what. Just rely on it possibly breaking
anywhere.

- Scott

From sam@uchicago.edu Wed Apr 25 02:23:22 2001
From: sam@uchicago.edu (Sam TH)
Date: Tue, 24 Apr 2001 20:23:22 -0500
Subject: [Expat-discuss] newbie question
In-Reply-To: ; from thomas@urgent.rug.ac.be on Tue, Apr 24, 2001 at 11:48:48AM +0200
References:
Message-ID: <20010424202322.A12222@uchicago.edu>

On Tue, Apr 24, 2001 at 11:48:48AM +0200, Thomas Vander Stichele wrote:
> a) is there a more elaborate example of expat use in another project?
> Preferably one that does something with the data in between start and end
> tags

You could look at the AbiWord project (www.abisource.com/lxr for the
source code), since we make fairly extensive use of expat.

sam th --- sam@uchicago.edu --- http://www.abisource.com/~sam/
OpenPGP Key: CABD33FC --- http://samth.dyndns.org/key
DeCSS: http://samth.dyndns.org/decss

From rjamison@lincom-asg.com Fri Apr 27 21:49:41 2001
From: rjamison@lincom-asg.com (Bob Jamison)
Date: Fri, 27 Apr 2001 15:49:41 -0500
Subject: [Expat-discuss] Borland makefile for 1.95.1
Message-ID: <3AE9DB65.5090209@lincom-asg.com>

Hi, guys,

I don't know if this thing would be useful to anyone,
but recently I needed to make a static linking library
for Win32. I couldn't find out how to add another
target to a VC++ project, so I used Borland.
So if anyone could use it, here is a Makefile for Borland
C++ 5 (or 5.5), command-line style.  It makes the DLL,
import lib, and static lib fairly cleanly.

By the way, the Borland C++ command-line compiler has
been a free download for the last few months (which
is a great deal, I think).

Another thought:  since several Win32 compilers are becoming
either fairly cheap or free, it might help to use a common
compiler switch for all of them.  Would it maybe be possible
to use '__WIN32__' instead of 'COMPILED_FROM_DSP' ?
This is pre-defined, and does not need to be set in either VC++ or
Borland.  Just a suggestion.

Anyway, I hope this is of use to someone.  Thanks
for maintaining this nice library.  I'll see what I can do for
the CVS version soon.


Bob

===== SNIP! ======
#########################################################################
## Borland C++ 5 Makefile for Expat
## Date: 01 Apr 01
## Author: Bob Jamison
#########################################################################
## NOTES
## This file works with Borland C++ Builder 5's
## command-line compiler.  The IDE-less compiler is also
## available as a free download at http://Borland.com.
##
## Please note that this makefile requires the Borland C++ \bin
## directory to be in the current PATH setting.
##
## To use, type
##     make -f Makefile.bcc
## to build
##     expat.dll       - the main Dynamic Link Library
##     expat.lib       - the corresponding import library
##     expatstatic.lib - a static linking library, possibly
##                       producing smaller executables
##     make -f Makefile.bcc clean
##         Removes objects from directory
##
## On the first build, you can ignore the Tlib "not found" warnings.
##
#########################################################################

EXPAT_MAJOR_VERSION=1
EXPAT_MINOR_VERSION=95
EXPAT_EDIT=1
EXPAT_VERSION=$(EXPAT_MAJOR_VERSION).$(EXPAT_MINOR_VERSION).$(EXPAT_EDIT)
VERSION="\"$(EXPAT_VERSION)\""

CC   = bcc32
LINK = ilink32
AR   = tlib

USERDEFINES = _DEBUG;COMPILED_FROM_DSP;VERSION=$(VERSION)
WARNINGS    = -w-rch -w-par -w-ccc
CFLAGS      = -O2 -H- -Vx -Ve -X- -r- -a8 -b- -k -y -v -vi- -c -tW -tWM

all: expat.dll expatstatic.lib

OBJ  = xmlparse.obj xmlrole.obj xmltok.obj
SRCS = xmlparse.c xmlrole.c xmltok.c

.c.obj:
	$(CC) $(CFLAGS) $(WARNINGS) -D$(USERDEFINES) -n$(@D) {$< }

expat.dll: $(OBJ)
	$(CC) -WD -lGi -eexpat.dll $(OBJ)

expatstatic.lib: $(OBJ)
	$(AR) /u expatstatic.lib $(OBJ)

clean:
	del /q /f $(OBJ) *.tds expat.dll expat.lib expatstatic.lib

==== UN-SNIP =====

From gstein@lyra.org Fri Apr 27 22:23:19 2001
From: gstein@lyra.org (Greg Stein)
Date: Fri, 27 Apr 2001 14:23:19 -0700
Subject: [Expat-discuss] Borland makefile for 1.95.1
In-Reply-To: <3AE9DB65.5090209@lincom-asg.com>; from rjamison@lincom-asg.com on Fri, Apr 27, 2001 at 03:49:41PM -0500
References: <3AE9DB65.5090209@lincom-asg.com>
Message-ID: <20010427142319.U1374@lyra.org>

Please submit this as a patch so that we can track it properly and get it
integrated into the next release.

thx,
-g

On Fri, Apr 27, 2001 at 03:49:41PM -0500, Bob Jamison wrote:
> Hi, guys,
>
> I don't know if this thing would be useful to anyone,
> but recently I needed to make a static linking library
> for Win32.  I couldn't find out how to add another
> target to a VC++ project, so I used Borland.
>
> So if anyone could use it, here is a Makefile for Borland
> C++ 5 (or 5.5), command-line style.  It makes the DLL,
> import lib, and static lib fairly cleanly.
>
> By the way, the Borland C++ command-line compiler has
> been a free download for the last few months (which
> is a great deal, I think).
>
> Another thought:  since several Win32 compilers are becoming
> either fairly cheap or free, it might help to use a common
> compiler switch for all of them.  Would it maybe be possible
> to use '__WIN32__' instead of 'COMPILED_FROM_DSP' ?
> This is pre-defined, and does not need to be set in either VC++ or
> Borland.  Just a suggestion.
>
> Anyway, I hope this is of use to someone.  Thanks
> for maintaining this nice library.  I'll see what I can do for
> the CVS version soon.
>
>
> Bob
>...

--
Greg Stein, http://www.lyra.org/

From michael_wissner@intuit.com Mon Apr 30 21:07:38 2001
From: michael_wissner@intuit.com (Michael Wissner)
Date: Mon, 30 Apr 2001 16:07:38 -0400
Subject: [Expat-discuss] Encoding lower 32 characters
Message-ID: <010801c0d1b1$34ba3910$db1e13ac@NAUTILUS>

I must be missing something about encoding the lower 32 non-whitespace
US-ASCII characters in an XML file when using expat to read the file.

As I read the XML spec,
http://www.w3.org/TR/2000/REC-xml-20001006#charsets , it is saying that an
XML character is any legal Unicode/UCS character, and it implies that the
lower 32, non-whitespace characters are not legal Unicode characters.  The
spec gives the following definition of a legal XML character:

    Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
             | [#x10000-#x10FFFF]
    /* any Unicode character, excluding the surrogate blocks, FFFE,
       and FFFF. */

I don't have the Unicode spec handy, but I think Unicode (and by extension
utf-8) is supposed to include all US-ASCII characters as a subset.  Is this
not so?

Since I find it hard to believe that certain US-ASCII characters were
omitted from Unicode, my next guess is that the intent of the XML spec is to
say that those special characters are not valid in an XML file; that a valid
XML file should encode those characters using character references such as
"" so that they don't appear literally in the file.
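For reference, the Char production quoted above is easy to write down as
a C predicate, which makes it plain which code points fall outside it.
This is only a sketch of the rule as quoted from the XML 1.0 spec, not
expat's internal check; is_xml_char and first_illegal_byte are made-up
names, and the second helper only looks at single-byte (ASCII) values,
skipping any byte of 0x80 or above rather than decoding UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Returns nonzero if code point c matches the XML 1.0 Char production:
 *   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
 */
int is_xml_char(unsigned long c)
{
    return c == 0x9 || c == 0xA || c == 0xD
        || (c >= 0x20 && c <= 0xD7FFUL)
        || (c >= 0xE000UL && c <= 0xFFFDUL)
        || (c >= 0x10000UL && c <= 0x10FFFFUL);
}

/* Returns the index of the first ASCII byte in s[0..n) that is not a
 * legal XML character, or n if there is none.  Bytes >= 0x80 are
 * skipped (assumption: they belong to multi-byte sequences that are
 * validated elsewhere). */
size_t first_illegal_byte(const char *s, size_t n)
{
    size_t i;

    assert(s != NULL || n == 0);
    for (i = 0; i < n; i++) {
        unsigned char ch = (unsigned char)s[i];
        if (ch < 0x80 && !is_xml_char(ch))
            return i;
    }
    return n;
}
```

An application that has to carry such bytes could run something like
first_illegal_byte over its data before writing it out, and switch to
its own escaping when the result is less than the length.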
I've tried this, but when I attempt to parse a file containing one of the
special character references using expat, it generates an error indicating
that the character code is illegal.  Is this error message correct, or is
this a bug/misfeature in expat?  Is it a bug in the XML spec?  If it's
correct, how can I transmit application data that contains these characters?
Clearly I can create my own application-level escaping mechanism, but
doesn't this defeat the purpose of having an application-independent
standard like XML?

    Michael Wissner

From paulp@ActiveState.com Mon Apr 30 23:08:12 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Mon, 30 Apr 2001 15:08:12 -0700
Subject: [Expat-discuss] Encoding lower 32 characters
References: <010801c0d1b1$34ba3910$db1e13ac@NAUTILUS>
Message-ID: <3AEDE24C.F480C50@ActiveState.com>

Michael Wissner wrote:
>
> ...
>
> Since I find it hard to believe that certain US-ASCII characters were
> omitted from Unicode, my next guess is that the intent of the XML spec is to
> say that those special characters are not valid in an XML file; that a valid
> XML file should encode those characters using character references such as
> "" so that they don't appear literally in the file.

"Well-Formedness Constraint: Legal Character
Characters referred to using character references must match the
production for Char."

[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
             | [#x10000-#x10FFFF]
    /* any Unicode character, excluding the surrogate blocks, FFFE,
       and FFFF. */

> ... Is it a bug in the XML spec?

Well, it is intentional, but you could argue that it is a wrong
intention. :)

> ... If it's
> correct, how can I transmit application data that contains these characters?
> Clearly I can create my own application-level escaping mechanism, but
> doesn't this defeat the purpose of having an application-independent
> standard like XML?

It defeats part of the purpose but encoding "control characters" is
actually pretty rare.
You could make the argument that "<", ">" and "&" are XML's
control characters so the others would be redundant. If you want to
insert a NAK or ESC , I'd suggest or and so on. You could even
standardize your encoding for these characters. :)

--
Take a recipe. Leave a recipe.
Python Cookbook! http://www.ActiveState.com/pythoncookbook

From mkhan@crosscom.com Mon Apr 30 23:50:16 2001
From: mkhan@crosscom.com (Mohammad Khan)
Date: Mon, 30 Apr 2001 17:50:16 -0500
Subject: [Expat-discuss] Expat Problem on Solaris system!
Message-ID: <4.3.2.7.2.20010430174526.00abd4a0@mail194208.popserver.pop.net>

An HTML attachment was scrubbed...
URL: http://mail.libexpat.org/pipermail-21/expat-discuss/attachments/20010430/cc1f26f9/attachment.html