From bogus@does.not.exist.com Fri Aug 30 14:56:58 2002
From: bogus@does.not.exist.com ()
Date: Fri Aug 30 14:57:14 2002
Subject: No subject
Message-ID:

/* XML_UNICODE_WCHAR_T will work only if sizeof(wchar_t) == 2 and wchar_t
   uses Unicode. */

Why? I could be willing to work on this... but where to start? Does
anybody have an idea how big this task is? Also any good workarounds?

On Solaris with the Sun C compiler wchar_t is 4 bytes, it doesn't appear
to be negotiable. Is there something I am missing?

Thanks-
Butler
matt_butler@equilibrium.com

From gstein@lyra.org Thu Aug 1 12:10:44 2002
From: gstein@lyra.org (Greg Stein)
Date: Thu Aug 1 11:10:44 2002
Subject: [Expat-discuss] Re: troubles with expat sol 8/gcc
In-Reply-To: ; from cewatson@memphis.edu on Thu, Aug 01, 2002 at 09:20:57AM -0700
References:
Message-ID: <20020801111146.J22527@lyra.org>

[ copying to the discussion mailing list ]

I think it would be most helpful to see how Perl is complaining. What is the
error message? I'm not a Perl guy, but there are people on this list that
might be able to help track the issue down from there.

Also helpful would be a listing of what Expat files were actually installed,
and where.

Expat has testing tools, but they're really used for testing Expat itself,
rather than "did it install properly?" And those tools are going to say
"everything is fine" (otherwise, we wouldn't have released :-)

Cheers,
-g

On Thu, Aug 01, 2002 at 09:20:57AM -0700, chris watson wrote:
> o'reilly came out with a book called perl for sys admins.
> in the book they describe how to build an xml document that
> contains user meta data. one of the tools required to build
> tools that manage this xml document is expat.
>
> i am trying to build all of these tools in such a way that
> the tool directory can be picked up and moved to a different
> location and everything in the tool directory still works.
> this means that i am building all of the tools from source
> code and using the --prefix flag with the configure script.
>
> the tools that i have that are attempting to use expat are
> perl and the perl module xml::parser. seems that i cannot
> get xml::parser to run properly....that is, it complains
> about expat.
>
> so...i was wondering if you could lend a hand, or your
> eyeballs, and maybe tell me what i am doing wrong.
>
> is there a utility that comes with the expat distribution
> that i can use to check and make sure that my installation
> of expat is working properly?
>
> thanks
>
> cw

--
Greg Stein, http://www.lyra.org/

From cewatson@memphis.edu Thu Aug 1 13:10:38 2002
From: cewatson@memphis.edu (Chris Watson)
Date: Thu Aug 1 12:10:38 2002
Subject: [Expat-discuss] Re: troubles with expat sol 8/gcc/xml-parser
References: <20020801111146.J22527@lyra.org>
Message-ID: <3D498567.93B9B635@memphis.edu>

i have installed perl version 5.8
i have installed expat version 1.95.4
i am trying to install XML-Parser-2.31 a perl module.

the parser package allows you to test the distribution with a make test.
when i run the make test the error messages that i get are (please
excuse the ^M):

+++++++++++++++++++++++++++++++++++++++++++
make test^M^M
PERL_DL_NONLAZY=1 /server/utils/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t^M
t/astress.........Can't load '/gaia/bartlett/d1/scripts/nis/src/XML-Parser-2.31/blib/arch/auto/XML/Parser/Expat/Expat.so' for module XML::Parser::Expat: ld.so.1: /server/utils/bin/perl: fatal: relocation error: file /gaia/bartlett/d1/scripts/nis/src/XML-Parser-2.31/blib/arch/auto/XML/Parser/Expat/Expat.so: symbol XML_GetErrorCode: referenced symbol not found at /server/utils/lib/perl5/5.8.0/sun4-solaris/DynaLoader.pm line 229.^M
 at /gaia/bartlett/d1/scripts/nis/src/XML-Parser-2.31/blib/lib/XML/Parser.pm line 15^M
Compilation failed in require at /gaia/bartlett/d1/scripts/nis/src/XML-Parser-2.31/blib/lib/XML/Parser.pm line 15.^M
BEGIN failed--compilation aborted at /gaia/bartlett/d1/scripts/nis/src/XML-Parser-2.31/blib/lib/XML/Parser.pm line 19.^M
Compilation failed in require at t/astress.t line 11.^M
BEGIN failed--compilation aborted at t/astress.t line 11.^M
^M
^Mt/astress.........NOK 1^M
^Mt/astress.........dubious^M
        Test returned status 255 (wstat 65280, 0xff00)^M
DIED. FAILED tests 1-25^M
        Failed 25/25 tests, 0.00% okay^M
Failed Test      Stat Wstat Total Fail  Failed  List of Failed^M
-------------------------------------------------------------------------------^M
t/astress.t       255 65280    25   25 100.00%  1-25^M
t/cdata.t         255 65280     2    2 100.00%  1-2^M
t/decl.t          255 65280    30   30 100.00%  1-30^M
t/defaulted.t     255 65280     4    4 100.00%  1-4^M
t/encoding.t      255 65280     4    4 100.00%  1-4^M
t/external_ent.t  255 65280     5    5 100.00%  1-5^M
t/file.t          255 65280     2    2 100.00%  1-2^M
t/finish.t        255 65280     3    3 100.00%  1-3^M
t/namespaces.t    255 65280    16   16 100.00%  1-16^M
t/parament.t      255 65280    12   12 100.00%  1-12^M
t/partial.t       255 65280     3    3 100.00%  1-3^M
t/skip.t          255 65280     4    4 100.00%  1-4^M
t/stream.t        255 65280     3    3 100.00%  1-3^M
++++++++++++++++++++++++++++++++++++++++++++++

actually there are a lot of errors, you can see by the totals at the bottom of
the section above. all of the errors start with the same complaint about
expat.

From Josh.Martin@abq.sc.philips.com Thu Aug 1 16:09:18 2002
From: Josh.Martin@abq.sc.philips.com (Josh Martin)
Date: Thu Aug 1 15:09:18 2002
Subject: [Expat-discuss] Re: troubles with expat sol 8/gcc/xml-parser
Message-ID: <200208012107.g71L75c15162@atoae450.abq.sc.philips.com>

Did you run 'make test' before or after you did 'make install'? A
well-behaved package is supposed to be testable _before_ you install it,
but it might be having problems with an uninstalled shared/dynamic
library. Try testing after installing, and let us know if it still
doesn't work.

Also, make sure that the directory holding the expat library is listed in
your SHLIB_PATH or LD_LIBRARY_PATH environment variable. I can't remember
which one Solaris uses, but I believe it's LD_LIBRARY_PATH, although I use
both because I'm paranoid.

BTW, using --prefix=foo doesn't by itself let you build a package that
can be moved around; it just lets you change the default location that
the package will be installed to.

PS I apologize for continuing to break convention, especially since it
causes problems when mixing the methods, but I much prefer it when I, and
others, post to the top of a message, rather than the bottom.

- Josh Martin

From Josh.Martin@abq.sc.philips.com Thu Aug 1 16:09:35 2002
From: Josh.Martin@abq.sc.philips.com (Josh Martin)
Date: Thu Aug 1 15:09:35 2002
Subject: [Expat-discuss] Predefined Entity Expanding
Message-ID: <200208012109.g71L94c15242@atoae450.abq.sc.philips.com>

> Is Expat supposed to expand predefined entities in the character data
> handler?
> - snip -
> TIA,
>
> Mark

Yes, expat is supposed to expand predefined entities in the character data
handler. I don't know why you are having problems. Please post a copy
of your code so that we can determine if it's a new problem in expat, or a
problem in your code.

- Josh Martin

From mark@mitchenall.com Thu Aug 1 19:17:05 2002
From: mark@mitchenall.com (Mark Mitchenall)
Date: Thu Aug 1 18:17:05 2002
Subject: [Expat-discuss] Predefined Entity Expanding
In-Reply-To: <200208012109.g71L94c15242@atoae450.abq.sc.philips.com>
Message-ID:

on 1/8/2002 10:09 PM, Josh Martin at Josh.Martin@abq.sc.philips.com wrote:

> Yes, expat is supposed to expand predefined entities in the character data
> handler. I don't know why you are having problems. Please post a copy
> of your code so that we can determine if it's a new problem in expat, or a
> problem in your code.

Thanks for your reply Josh,

I did post a response, but didn't notice that it only went to Karl, not to
the rest of the list (perhaps this is a potential feature request for the
discussion lists, i.e. reply to list, rather than reply to sender).

The bug was completely my fault as my handler was badly coded... here's my
response to Karl...

I wrote (after seeing the errors of my ways):

> Thanks for the quick response. I created a separate test app, and it seems
> the problem I was seeing was caused by some bad coding in my
> characterDataHandler. I don't know why I didn't see this before (except
> that I almost certainly changed too much code in one go without running my
> tests at each stage).
>
> So, no problem with expat in this respect!

Somehow I knew as soon as I sent a support request, I'd see that it was my
own fault in the first place.

Thanks to everyone!

Mark

--
Mark Mitchenall
Principal Consultant
mitchenall.com

Email: mark@mitchenall.com
Tel: +44(0)20 8452 3031
Mobile: +44(0)7850 847 543
http://www.mitchenall.com/

New Freeware, Open-Source 4D Components available....
http://www.mitchenall.com/products/freeware/

From CMacgowan@temgweb.com Mon Aug 5 13:35:02 2002
From: CMacgowan@temgweb.com (MacGowan, Chris)
Date: Mon Aug 5 12:35:02 2002
Subject: [Expat-discuss] Appling the Expat Parser as a SAX Parser
Message-ID: <288FAF5565A1A74EA5E35C39E7EE1D4201F3108E@mail.temgweb.com>

Hello expat discussion,

First, the Expat Parser has been a great success for me, thanks for all the
great work! Also, as I pose this question, realize that this is the first
time that I have implemented an XML Parser :-)

I am currently using Expat (Release 1.95.2) in an application that is
sending/receiving messages between a mainframe and client pc. I have
implemented the Expat Parser as a DOM parser, passing it a buffer with
'well-formed-xml' and it works great! The messages from the host mainframe
can get rather large, so I would like to implement a SAX Parser due to
memory constraints.

I have some basic questions ...

If I do something like below, and pass this routine a buffer that will force
the loop to execute 3-4 times (3-4K), the parser will return an error (syntax
error) because the start of the xml file is being chopped up ??

    int nBufferSize = 1000;
    for (;;)
    {
        int bytes_read;
        void *buff = XML_GetBuffer(p, nBufferSize);
        if (buff == NULL)
        {
            /* handle error */
        }

        bytes_read = read(docfd, buff, nBufferSize);
        if (bytes_read < 0)
        {
            /* handle error */
        }

        if (! XML_ParseBuffer(p, bytes_read, bytes_read == 0))
        {
            /* handle parse error */
        }

        if (bytes_read == 0)
            break;
    }

So basically, if Expat is a stream based parser how does it handle a SAX
implementation. If I have a 'well-formed-xml' file and then send it into
the parser in parts ... how does it handle this ... or how do I handle it
???

Any comments are much appreciated.

The xml file

The xml file chopped up

Thanks
Chris Macgowan
macgowan@pobox.com

From fdrake@acm.org Mon Aug 5 15:01:02 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon Aug 5 14:01:02 2002
Subject: [Expat-discuss] Re: troubles with expat sol 8/gcc/xml-parser
In-Reply-To: <3D498567.93B9B635@memphis.edu>
References: <20020801111146.J22527@lyra.org> <3D498567.93B9B635@memphis.edu>
Message-ID: <15694.59224.595755.535183@grendel.zope.com>

Chris Watson writes:
 > i have installed perl version 5.8
 > i have installed expat version 1.95.4
 > i am trying to install XML-Parser-2.31 a perl module.
 >
 > the parser package allows you to test the distribution with a make test. when
 > i run the make test the error messages that i get are, (please
 > excuse the ^M):
...
 > /gaia/bartlett/d1/scripts/nis/src/XML-Parser-2.31/blib/arch/auto/XML/Parser/Expat/Expat.so: symbol XML_GetErrorCode: referenced symbol
 > not found at /server/utils/lib/perl5/5.8.0/sun4-solaris/DynaLoader.pm line 229.

The XML::Parser::Expat module should be re-linked to use the static
version of the Expat library (libexpat.a). I don't know how to do
this since I don't use Perl much, and have never built a Perl
extension.

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From Josh.Martin@abq.sc.philips.com Tue Aug 6 11:38:02 2002
From: Josh.Martin@abq.sc.philips.com (Josh Martin)
Date: Tue Aug 6 10:38:02 2002
Subject: [Expat-discuss] Re: troubles with expat sol 8/gcc/xml-parser
Message-ID: <200208061737.g76Hb0h29881@atoae450.abq.sc.philips.com>

Hi,

Ah, see, there's the rub. If you're writing an extension for Perl (at
least in C) then you must either build it as a dynamic library, or you
have to statically link it in to perl itself, meaning recompiling perl.
Now, I'm rather vague on static versus dynamic libraries on solaris, but I
think it should be possible to build the XML-Parser module dynamically (as
normal) but link the expat library into it statically when you build it.
That is, assuming you only built a static version of expat.
You might (but shouldn't) need to get so drastic as to edit the
XML::Parser makefile and add the '-lexpat' compiler switch (possibly along
with the '-L' compiler switch specifying the directory expat was installed
to if it's not in your libpath).

BTW, Perl version 5.6.1 is the most recent stable release. As far as I
could tell there is no version 5.8. Okay, I take that back... 5.8 is the
most recent release. While it is not the "stable" release, it is also not
a development release, so that shouldn't be causing you any problems. You
might want to check the perldelta man page (or perldoc) just to be sure.

Also, did you try running 'make test' after 'make install' like I
suggested? Did it help?

- Josh Martin

From martap@tango04.net Tue Aug 6 11:47:02 2002
From: martap@tango04.net (Marta Padilla)
Date: Tue Aug 6 10:47:02 2002
Subject: [Expat-discuss] Problems with utf8_toUtf8
Message-ID:

Hi,

I'm still working to adapt Expat to AS/400.
The code has a different behaviour in the following function:

utf8_toUtf8

  if (fromLim - *fromP > toLim - *toP) {
    /* Avoid copying partial characters. */
    for (fromLim = *fromP + (toLim - *toP); fromLim > *fromP; fromLim--)
      if (((unsigned char)fromLim[-1] & 0xc0) != 0x80)

--> WHY 0xc0 and 0x80 ??

        break;
  }
  for (to = *toP, from = *fromP; from != fromLim; from++, to++)
    *to = *from;
  *fromP = from;
  *toP = to;

I guess it has something to do with ASCII codes. Can anyone clarify for me
what this function does?

Thanks!
Marta

From Josh.Martin@abq.sc.philips.com Tue Aug 6 13:08:23 2002
From: Josh.Martin@abq.sc.philips.com (Josh Martin)
Date: Tue Aug 6 12:08:23 2002
Subject: [Expat-discuss] Problems with utf8_toUtf8
Message-ID: <200208061907.g76J7Xh10021@atoae450.abq.sc.philips.com>

Okay, the two byte hex sequence "0xc0 0x80" represents unicode character
number (in hex) 80. This is the first (from zero) UTF-8 character which is
represented by two (or more) bytes, all of the previous characters being
ASCII characters which are only encoded by one byte. I would (and will for
this message) _erroneously_ call this two byte character 'character zero'.

So basically, from what I see, the first for loop looks for the first
non-ASCII character, breaks out, and then copies that character from the
"from" buffer to the "to" buffer, also skipping 'character zero', which
must be invalid. So my guess would be this function is used to copy one
UTF-8 buffer to another.

I do have reason to believe that my analysis is not entirely correct,
especially since I can't see why you would skip character 0x80, but Fred
will have to tell us the truth.
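
For what it's worth, the test can also be read as a bit mask rather than
as a two-byte sequence: (b & 0xc0) == 0x80 holds exactly for bytes of the
form 10xxxxxx, which are UTF-8 continuation bytes. Under that reading the
first loop seems to be pulling fromLim back so that the copy does not end
inside a multi-byte character, which would fit the "Avoid copying partial
characters" comment. Below is a minimal sketch of that idea. It is not
Expat's code: is_continuation(), utf8_seq_len() and utf8_safe_prefix() are
made-up names, and well-formed UTF-8 input is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not in Expat): true for a UTF-8 continuation byte,
   i.e. a byte of the form 10xxxxxx (0x80..0xBF). */
int is_continuation(unsigned char b)
{
    return (b & 0xc0) == 0x80;
}

/* Hypothetical helper: length in bytes of the character that starts with
   lead byte b, or 0 if b cannot start a character. */
size_t utf8_seq_len(unsigned char b)
{
    if (b < 0x80) return 1;            /* 0xxxxxxx: ASCII */
    if ((b & 0xe0) == 0xc0) return 2;  /* 110xxxxx */
    if ((b & 0xf0) == 0xe0) return 3;  /* 1110xxxx */
    if ((b & 0xf8) == 0xf0) return 4;  /* 11110xxx */
    return 0;                          /* continuation or invalid byte */
}

/* Hypothetical helper: largest count <= n such that s[0..count) ends on a
   character boundary. Assumes s holds well-formed UTF-8 that may have been
   cut off after n bytes. */
size_t utf8_safe_prefix(const unsigned char *s, size_t n)
{
    size_t start = n;
    size_t need;

    assert(s != NULL || n == 0);

    /* Step back over trailing continuation bytes. */
    while (start > 0 && is_continuation(s[start - 1]))
        start--;
    if (start == 0)
        return 0;       /* empty, or nothing but continuation bytes */

    start--;            /* index of the last lead (or ASCII) byte */
    need = utf8_seq_len(s[start]);
    if (need == 0 || start + need > n)
        return start;   /* last character is incomplete: drop it */
    return n;
}
```

With the four bytes 61 C3 A9 62 ("a", e-acute, "b"), a limit of 2 is
pulled back to 1, because C3 starts a two-byte character whose second byte
would otherwise be cut off.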

My question is, why does this function concern you? What problems are you
having with it?

- Josh Martin

From Josh.Martin@abq.sc.philips.com Tue Aug 6 13:37:03 2002
From: Josh.Martin@abq.sc.philips.com (Josh Martin)
Date: Tue Aug 6 12:37:03 2002
Subject: [Expat-discuss] Appling the Expat Parser as a SAX Parser
Message-ID: <200208061935.g76JZrh18470@atoae450.abq.sc.philips.com>

Hi,

I'm glad you're enjoying expat!

First, just to let you know, the newest release of Expat is 1.95.4, which
fixes some bugs, as well as adding (currently undocumented) new
functionality, so you might want to get that.

Second, are you asking if your code will produce that error message, or
are you telling us it did?

Third, although this might not actually apply to your use for the
document, the XML document that you included is 'well-formed', but not
'valid'. It needs at least the tag at the beginning, and a declaration.

Basically though, as long as you don't add any extra characters (such as a
newline or null character at the end) to the buffer pieces that you send
to the parser, and you don't try to send it more after you tell it it's
done, then Expat doesn't care how many pieces you split the document in
to. The first implementation is the hardest, if not necessarily the most
complicated. :)
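
To make the point about pieces concrete, here is a small sketch of a
driver that cuts a document into fixed-size chunks. It does not call Expat
itself: feed_fn, feed_in_chunks(), struct collector and collect() are
hypothetical names, and the assumption is that with Expat the callback
would be a thin wrapper around XML_Parse(parser, data, len, is_final).
Each piece is passed with its exact byte count, nothing is appended, and
the final flag is set once, on the last piece.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback shape. With Expat it would presumably wrap
   XML_Parse(parser, data, len, is_final) and return its status. */
typedef int (*feed_fn)(void *ctx, const char *data, int len, int is_final);

/* Hypothetical driver: hands doc to feed in pieces of at most chunk bytes.
   Returns 1 on success, 0 as soon as feed reports a failure. */
int feed_in_chunks(const char *doc, size_t doc_len, size_t chunk,
                   feed_fn feed, void *ctx)
{
    size_t off = 0;

    assert(chunk > 0);

    /* Every piece except the last: exactly chunk bytes, not final. */
    while (doc_len - off > chunk) {
        if (!feed(ctx, doc + off, (int)chunk, 0))
            return 0;
        off += chunk;
    }
    /* Last piece (possibly empty): exact remaining length, final flag set. */
    return feed(ctx, doc + off, (int)(doc_len - off), 1) != 0;
}

/* Stand-in for a parser: records what it was given. */
struct collector {
    char   buf[256];
    size_t used;
    int    calls;
    int    finals;
};

int collect(void *ctx, const char *data, int len, int is_final)
{
    struct collector *c = (struct collector *)ctx;

    if (c->used + (size_t)len > sizeof c->buf)
        return 0;                       /* would overflow: report failure */
    memcpy(c->buf + c->used, data, (size_t)len);
    c->used += (size_t)len;
    c->calls++;
    c->finals += is_final ? 1 : 0;
    return 1;
}
```

The collector only glues the pieces back together. Getting a byte-for-byte
identical document out of it is what the "no extra characters" rule
amounts to, wherever the cuts happen to fall.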

If you're using normal output for your error messages (such as the screen
or a log file) then let me suggest this hairy looking piece of code for
handling parser errors. It prints out the line number the error occurred
at, the error message, the line that had the error, and then puts a caret
'^' under the first character that expat didn't like.

    long b_index;

    if (XML_Parse(p, buff, length, length == 0) == 0) {
        /* Display the line and point to the character where the error
           occurred */
        b_index = XML_GetCurrentByteIndex(p);
        strtok(&buff[b_index-XML_GetCurrentColumnNumber(p)], "\n\r");
        fprintf(stderr, "Parse error at line %1$d: %2$s\n%3$s\n%4$*5$c\n",
                XML_GetCurrentLineNumber(p),
                XML_ErrorString(XML_GetErrorCode(p)),
                &(buff[b_index-XML_GetCurrentColumnNumber(p)]),
                '^', XML_GetCurrentColumnNumber(p)+1);
        XML_ParserFree(p);
        /* Add other exit cleanup code here. */
        exit(-1);
    }

This should work fine with XML_ParseBuffer as well.

I hope this all helps.

- Josh Martin

From fdrake@acm.org Tue Aug 6 15:08:02 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue Aug 6 14:08:02 2002
Subject: [Expat-discuss] Re: troubles with expat sol 8/gcc/xml-parser
In-Reply-To: <200208061737.g76Hb0h29881@atoae450.abq.sc.philips.com>
References: <200208061737.g76Hb0h29881@atoae450.abq.sc.philips.com>
Message-ID: <15696.14965.798436.868702@grendel.zope.com>

Josh Martin writes:
 > Ah, see, there's the rub. If you're writing an extension for Perl
 > (at least in C) then you must either build it as a dynamic library,
 > or you have to statically link it in to perl itself,

This is similar to Python; those are the only two options.

 > meaning recompiling perl. Now, I'm rather vague on static versus
 > dynamic libraries on solaris, but I think it should be possible to
 > build the XML-Parser module dynamically (as normal) but link the
 > expat library into it statically when you build it. That is,
 > assuming you only built a static version of expat. You might (but
 > shouldn't) need to get so drastic as to edit the XML::Parser
 > makefile and add the '-lexpat' compiler switch (possibly along with
 > the '-L' compiler switch specifying the directory expat was
 > installed to if it's not in your libpath).

It should be quite easy to construct the link line, but if the .so
libraries are installed in the same place as the static lib, using
-l/-L probably won't work since those usually default to linking the
dynamic library. There will be some other option to force the library
to be linked statically (-static or something like that), or you can
name the .a file explicitly instead of using the -l/-L options at all.

What we do for Python (PyXML and Python 2.3+) is actually include the
Expat library sources in the distribution and link the .o files
directly into the extension, instead of using the library file at
all. This actually seems to work quite well.

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From rolf@pointsman.de Tue Aug 6 20:09:03 2002
From: rolf@pointsman.de (rolf@pointsman.de)
Date: Tue Aug 6 19:09:03 2002
Subject: [Expat-discuss] Re: troubles with expat sol 8/gcc/xml-parser
In-Reply-To: <15696.14965.798436.868702@grendel.zope.com>
Message-ID: <200208070159.DAA21483@pointsman.pointsman.de>

On 6 Aug, Fred L. Drake, Jr. wrote:
>
> What we do for Python (PyXML and Python 2.3+) is actually include the
> Expat library sources in the distribution and link the .o files
> directly into the extension, instead of using the library file at
> all. This actually seems to work quite well.

Very interesting. We do it exactly the same way, for our expat based
Tcl extension. Yes, this works quite well. And it lowers the build
hassle (for the people that otherwise first would have to install
expat). On the other hand... But, yes, disk space is so cheap these
days, etc. ppp. And linking the expat object files directly into the
application has the *big* advantage that it simply works.
rolf From msszczep@midway.uchicago.edu Tue Aug 6 20:34:02 2002 From: msszczep@midway.uchicago.edu (Mitchell Szczepanczyk) Date: Tue Aug 6 19:34:02 2002 Subject: [Expat-discuss] Question/coding about expat and attributes Message-ID: During the past few weeks, I have been using expat to write an XML parser with some success. However, I have recently encountered a problem with regards to Expat handling tag attributes that I have not been able to solve, and I'm wondering if anyone on this list can offer some assistance. The program I've been writing parses and XML document and puts the XML document into a hierarchical node structure. Each start tag gets its own node, nodes have pointers for names and attributes and nodes are linked with one another. Everything seems to work fine except for the attributes: whenever I try to put a single set of attributes (via the starthandler element) into a node, my program seems to put the most recent set of attributes into all of the nodes in the structure. And yet, there's nothing in the code (at least that I can see) which would suggest otherwise. The following is a test XML input file with attributes, followed by the output of my program which illustrates the problem, followed by the code of the program I've been writing. Can anyone see why this might be happening and/or offer a solution for putting unique attributes in each tag of the output as in the input? Thanks in advance, ---------- _ Z Mitchell Szczepanczyk / http://home.uchicago.edu/~msszczep http://www.geocities.com/szczepanczyk SAMPLE XML INPUT FILE BEGINS HERE SAMPLE XML INPUT FILE ENDS HERE RESULTING OUTPUT BEGINS HERE RESULTING OUTPUT ENDS HERE CODE BEGINS HERE /***************************************************************** * tree.c * * This program parses a well-formed SGML document and turns it into a * tree structure. * This program is modeled on the outline.c example in the Expat package. 
 *
 * Use as a Unix command line argument as follows:
 *
 * % more file.xml | tree
 *
 *****************************************************************
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <expat.h>

#define BUFFSIZE 8192

char Buff[BUFFSIZE];

typedef struct element_s {
  struct node_s *mp_nod_Node;
  const char *mp_szName;
  const char **mp_arr_szAtts;
} element_t;

typedef struct character_s {
  struct node_s *mp_nod_Node;
  const char *mp_szText;
} character_t;

typedef struct atts_s {
  struct node_s *mp_nod_Node;
  int mp_int_nodeType; /* enum type 1=element, 2=character*/
} atts_t;

typedef struct node_s {
  struct node_s *mp_nodMother;
  struct node_s *mp_nodDaughter;
  struct node_s *mp_nodLeftsister;
  struct node_s *mp_nodRightsister;
  struct element_s *mp_nod_Element;
  struct atts_s *mp_nod_Atts;
  struct character_s *mp_nod_Char;
} node_t;

node_t *top;

void prettyPrint(node_t *top, int i) {
  int j, k;
  if (top->mp_nod_Atts->mp_int_nodeType == 1) {
    for (j=0; j<i; j++)
      printf("  ");
    printf("<%s", top->mp_nod_Element->mp_szName);
    for (k = 0; top->mp_nod_Element->mp_arr_szAtts[k]; k += 2) {
      printf(" %s='%s'", top->mp_nod_Element->mp_arr_szAtts[k],
             top->mp_nod_Element->mp_arr_szAtts[k + 1]);
    }
    printf(">\n");
  }
  else if (top->mp_nod_Atts->mp_int_nodeType == 2) {
    if (top->mp_nod_Char) {
      if (strcmp (top->mp_nod_Char->mp_szText, "\n") != 0) {
        for (j=0; j<i; j++)
          printf("  ");
        printf("%s\n", top->mp_nod_Char->mp_szText);
      }
    }
  }
  if (top->mp_nodDaughter) {
    i=i+1;
    prettyPrint(top->mp_nodDaughter, i);
    i=i-1;
  }
  if (top->mp_nod_Atts->mp_int_nodeType == 1) {
    for (j=0; j<i; j++)
      printf("  ");
    printf("</%s>\n", top->mp_nod_Element->mp_szName);
  }
  if (top->mp_nodRightsister) {
    prettyPrint(top->mp_nodRightsister, i);
  }
}

static void startElement(void *userData, const char *name, const char **atts) {
  if (!(top)) {
    top = (node_t *)malloc(sizeof(node_t));
    top->mp_nodMother = NULL;
    top->mp_nodDaughter = NULL;
    top->mp_nodLeftsister = NULL;
    top->mp_nodRightsister = NULL;
    top->mp_nod_Element = (element_t *)malloc(sizeof(element_t));
    top->mp_nod_Element->mp_szName = name;
    top->mp_nod_Element->mp_arr_szAtts = atts;
    // The above line is one example of how the attributes are put in a node
    top->mp_nod_Element->mp_nod_Node = top;
    top->mp_nod_Atts = (atts_t *)malloc(sizeof(atts_t));
    top->mp_nod_Atts->mp_int_nodeType = 1;
    top->mp_nod_Atts->mp_nod_Node = top;
  }
  else if (!(top->mp_nodDaughter)) {
    top->mp_nodDaughter = (node_t *)malloc(sizeof(node_t));
    top->mp_nodDaughter->mp_nodMother = NULL;
    top->mp_nodDaughter->mp_nodDaughter = NULL;
    top->mp_nodDaughter->mp_nodLeftsister = NULL;
    top->mp_nodDaughter->mp_nodRightsister = NULL;
    top->mp_nodDaughter->mp_nod_Element = (element_t*)malloc(sizeof(element_t));
    top->mp_nodDaughter->mp_nod_Element->mp_szName = name;
    top->mp_nodDaughter->mp_nod_Element->mp_nod_Node = top;
    top->mp_nodDaughter->mp_nod_Element->mp_arr_szAtts = atts;
    // The above line is one example of how the attributes are put in a node
    top->mp_nodDaughter->mp_nod_Atts = (atts_t*)malloc(sizeof(atts_t));
    top->mp_nodDaughter->mp_nod_Atts->mp_int_nodeType = 1;
    top->mp_nodDaughter->mp_nod_Atts->mp_nod_Node = top;
    top->mp_nodDaughter->mp_nodMother = top;
    top = top->mp_nodDaughter;
  }
  else {
    top = top->mp_nodDaughter;
    if (top->mp_nodRightsister) {
      do {
        top = top->mp_nodRightsister;
      } while (top->mp_nodRightsister);
    }
    top->mp_nodRightsister = (node_t *)malloc(sizeof(node_t));
    top->mp_nodRightsister->mp_nodMother = NULL;
    top->mp_nodRightsister->mp_nodDaughter = NULL;
    top->mp_nodRightsister->mp_nodLeftsister = NULL;
    top->mp_nodRightsister->mp_nodRightsister = NULL;
    top->mp_nodRightsister->mp_nod_Element = (element_t*)malloc(sizeof(element_t));
    top->mp_nodRightsister->mp_nod_Element->mp_szName = name;
    top->mp_nodRightsister->mp_nod_Element->mp_arr_szAtts = atts;
    // The above line is one example of how the attributes are put in a node
    top->mp_nodRightsister->mp_nod_Element->mp_nod_Node = top;
    top->mp_nodRightsister->mp_nod_Atts = (atts_t*)malloc(sizeof(atts_t));
    top->mp_nodRightsister->mp_nod_Atts->mp_int_nodeType = 1;
    top->mp_nodRightsister->mp_nod_Atts->mp_nod_Node = top;
    top->mp_nodRightsister->mp_nodLeftsister = top;
    top->mp_nodRightsister->mp_nodMother = top->mp_nodMother;
    top = top->mp_nodRightsister;
  }
}

static void endElement(void *userData, const char *name) {
  if (top->mp_nodMother) {
    top = top->mp_nodMother;
  }
}

int main(int argc, char *argv[]) {
  int i = 0;
  XML_Parser p = XML_ParserCreate(NULL);
  if (! p) {
    fprintf(stderr, "Couldn't allocate memory for parser\n");
    exit(-1);
  }
  XML_SetElementHandler(p, startElement, endElement);
  for (;;) {
    int done;
    int len;
    len = fread(Buff, 1, BUFFSIZE, stdin);
    if (ferror(stdin)) {
      fprintf(stderr, "Read error\n");
      exit(-1);
    }
    done = feof(stdin);
    if (! XML_Parse(p, Buff, len, done)) {
      fprintf(stderr, "Parse error at line %d:\n%s\n",
              XML_GetCurrentLineNumber(p),
              XML_ErrorString(XML_GetErrorCode(p)));
      exit(-1);
    }
    if (done)
      break;
  }
  prettyPrint(top, i);
  return 0;
}

CODE ENDS HERE

From F.J.Franklin@sheffield.ac.uk Wed Aug 7 01:06:03 2002
From: F.J.Franklin@sheffield.ac.uk (F J Franklin)
Date: Wed Aug 7 00:06:03 2002
Subject: [Expat-discuss] Problems with utf8_toUtf8
In-Reply-To:
Message-ID:

On Tue, 6 Aug 2002, Marta Padilla wrote:

> I'm still working to adapt Expat to AS/400. The code has a different
> behaviour in following function:
>
> utf8_toUtf8
>
> if (fromLim - *fromP > toLim - *toP) {
> /* Avoid copying partial characters. */
> for (fromLim = *fromP + (toLim - *toP); fromLim > *fromP; fromLim--)
> if (((unsigned char)fromLim[-1] & 0xc0) != 0x80)
>
> --> WHY 0xc0 and 0x80 ??

This has nothing to do with ASCII. This line seems to be checking that the
UTF-8 byte "fromLim[-1]" is not a trailing (or 'continuing') byte in a
multi-byte sequence. A character expressed in UTF-8 has between 1 and 6
bytes, inclusive, and all but the first will satisfy "& 0xc0) == 0x80".

Don't know about the function, sorry.

Regards,
Frank

Francis James Franklin
F.J.Franklin@shef.ac.uk

"No, she really likes me. She told me I look like Britney Spears, and why would you say that to somebody you don't like?"
--- Elle Woods From fdrake@acm.org Wed Aug 7 12:50:04 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed Aug 7 11:50:04 2002 Subject: [Expat-discuss] Question/coding about expat and attributes In-Reply-To: References: Message-ID: <15697.27557.543257.605315@grendel.zope.com> Mitchell Szczepanczyk writes: > The program I've been writing parses and XML document and puts the > XML document into a hierarchical node structure. Each start tag gets its > own node, nodes have pointers for names and attributes and nodes are > linked with one another. Everything seems to work fine except for the > attributes: whenever I try to put a single set of attributes (via the > starthandler element) into a node, my program seems to put the most recent > set of attributes into all of the nodes in the structure. And yet, > there's nothing in the code (at least that I can see) which would suggest > otherwise. It's not hard to see: you're storing a reference to a buffer of attributes into your structure, but you don't own the memory being referenced. Your node structure needs to store a copy of the memory, so there's a bit of work you need to do to allocate memory and make a copy of the attribute information. Expat uses the same result buffers as much as it can to avoid making additional allocations (important for performance), allowing you to copy only as much data as you need to for your application. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From binu.subramanian@sciatl.com Wed Aug 7 22:45:01 2002 From: binu.subramanian@sciatl.com (Subramanian, Binu) Date: Wed Aug 7 21:45:01 2002 Subject: [Expat-discuss] XML Parser Error Message-ID: <6973AC049FFDD41197BB0002A52916C657F0F4@bninchemex01.barconet.com> Hello, I am trying to use expat inorder to parse my XML files. I converted it into a static library and am using the expat compiled for UTF-16. I use the SAXInCpp wrapper for expat and try to parse my XML file. 
The following is the first 10 lines.....The XML file is valid and is
viewed correctly in IE 6.0. I am working on VC 6.0 and Win 2 K.

----------------------------------------------------------------------------
----------------------------------------------
1.0.0 00:00:00, Saturday, December 30, 1899
----------------------------------------------------------------------------
----------------------------------------------

I get the following errors:

----------------------------------------------------------------------------
----------------------------------------------
XML Parser error Parse exception at 2,37
Could not resolve XML document
XML Parser error Parse exception at 2,37
error in processing external entity reference
----------------------------------------------------------------------------
----------------------------------------------

Does anyone have any idea why?

kr,
Binu

From binu.subramanian@sciatl.com Thu Aug 8 00:23:02 2002
From: binu.subramanian@sciatl.com (Subramanian, Binu)
Date: Wed Aug 7 23:23:02 2002
Subject: [Expat-discuss] Encoding issues in expat
Message-ID: <6973AC049FFDD41197BB0002A52916C657F0F5@bninchemex01.barconet.com>

Hello Karl,

I have integrated expat into my application running on VC 6.0.
I read in the XML file using expat in VC 6.0, store it in a CString and
then display it in an edit control in a dialog. I use the SAXInCpp API's
which has a wrapper for expat.

First case : Expat compiled in UTF-8 ( without specifying XML_UNICODE in
the build options.)
For the following eg XML File:
1. The Start Element, End Element functions give me the corresponding
tags : ie "GridData", "Identification", etc....
2. The entities &#161; corresponding to ¡, are prefixed with a character Â
and so in my edit control i see the loaded XML file display Â¡

Second Case : Expat compiled in UTF-16( by specifying XML_UNICODE in the
build options.)
For the following eg XML File:
1. The Start Element, End Element functions DO NOT give me the
corresponding tags : ie "GridData". Instead only the first character is
given during parsing : 'G'.
2. The entities &#161; corresponding to ¡, are not prefixed with the
character Â in this case.

What should i do?

-----------------------------------------------------
1.0.0 00:00:00, Saturday, December 30, 1899
...............
¡¢£¥¦§©®°±²&#179;µ¹¼½¾This is first cell This is first row
-----------------------------------------------------

kr,
Binu

-----Original Message-----
From: Karl Waclawek [mailto:karl@waclawek.net]
Sent: 26 July 2002 18:55
To: Subramanian, Binu; expat-discuss@lists.sourceforge.net
Subject: Re: [Expat-discuss] Encoding issues in expat

> I am working on Win 2000, VC++ 6.0
> I am using expat 1.95.4 version.
> It is compiled for UTF-8 output and i have specified the encoding in the XML
> file as UTF-8.
> Still when i load the XML file, a character Â is prefixed to the special
> characters like ( Euro, trademark, etc).
> What can i do to see that the Euro character properly? I have enclosed the
> XML file i am using.
> Any suggestion/help will be useful.

To understand you correctly:
You are reading an XML file using Expat, writing it out again to another
file, based on the callbacks from Expat. Then you view this other file
loading it into some word processor, is that right?

Well, which word processor do you use? On Windows, not all editors
can display UTF-8 well. And even if they can, they usually require
a BOM (byte order mark) at the beginning of the file, even for UTF-16.

In any case, the native Unicode version for Windows is UTF-16(LE).
So, first I recommend you compile Expat for UTF-16.
Then I recommend you write a BOM to the output file,
details about BOMs can be found on http://www.unicode.org.
Then you should be able to display it.

Btw, it seems the file you attached does not contain the Euro symbol.
Karl

From roel.dijkema@philips.com Thu Aug 8 03:28:09 2002
From: roel.dijkema@philips.com (roel.dijkema@philips.com)
Date: Thu Aug 8 02:28:09 2002
Subject: [Expat-discuss] Memory error while using expat
Message-ID:

Hello,

I am using expat to parse XML files in an ANSI C application on an HPUX-11
machine, built with GCC. All XML files are automatically generated.
With some files my application core-dumps (crash): memory error. I traced
it back to the function lookup, just prior to calling the function hash.
The pointer to the string which is the argument for hash points to an
illegal address (totally out of range with respect to the previous
addresses used). I have no idea why this happens.
I was able to find a containment: increasing INIT_SIZE from 64 to 128, but
I have no guarantee this will work in any future file.
Any suggestion how to prevent this or why this is happening is welcome!

Kind Regards
Roel
_
Ing Roel Dijkema
senior ICT engineer
PHILIPS Semiconductors - ATO Nijmegen
Information & Automation Group
Gerstweg 2, 6534 AE Nijmegen, The Netherlands
bld: AO-1.108
pho: +31 24 353 4041
fax : +31 24 353 2123

From liapis@liaison.gr Thu Aug 8 05:26:04 2002
From: liapis@liaison.gr (Spyros Liapis)
Date: Thu Aug 8 04:26:04 2002
Subject: [Expat-discuss] Using precompiled expat.dll
Message-ID:

Hi,

I 've downloaded and used successfully expat on unix using the expat package.
I tried to use expat on windows platform using the expatwin32 package.
I am using gcc compiler. How can I use the precompiled dll of expat?
(I have not installed (yet :)) cygwin, neither using the MS Visual developer studio)
I am not much of a windows programmer :) I am familiar with unix programming environment.

I tried to use the makefile below:

CC=gcc
CFLAGS= -I..\expat\include
LDFLAGS= -g
LIBS= -L..\expat\lib -lexpat
OBJS= elements.o

xmlapp: $(OBJS)
	$(CC) $(LDFLAGS) -o run $(OBJS) $(LIBS)

in include dir there is the expat.h include file and
in lib dir there are libexpat.lib and libexpat.dll
If I am right .lib .dll corresponds to .a .so in unix respectively (??)

anyway...

C:\EXPAT-~1.4\mytest>make
gcc -I..\expat\include -c -o elements.o elements.c
gcc -g -o run elements.o -L..\expat\lib -lexpat
c:/djgpp/bin/ld.exe: cannot find -lexpat
collect2: ld returned 1 exit status
make.exe: *** [xmlapp] Error 1

any ideas or change compiler?

thanx.
spyros.

From fdrake@acm.org Thu Aug 8 12:17:08 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu Aug 8 11:17:08 2002
Subject: [Expat-discuss] New Expat functionality and API proposal
Message-ID: <15698.46452.77115.176524@grendel.zope.com>

Implementing a blocking mode in Expat
=====================================

Requests for a pull-based API for Expat have surfaced a few times over
(at least) the last couple of years; there is a feature request for
this on SourceForge (issue #544682):

    http://sourceforge.net/tracker/index.php?func=detail&aid=544682&group_id=10127&atid=110127

An additional motivation is that we'd like to be able to share a
codebase with the Mozilla project, which is currently using a
substantially modified version of an older version of Expat.

Pull-based parsers have become increasingly popular as the limitations
of DOM- or SAX-like APIs have become better known. The pull-based
APIs provide an opportunity to build each part of an application in
the way that's most appropriate, allowing a mixture of DOM- and
SAX-like behaviors.

Expat could provide the basis for an efficient pull-based API if it
offered an opportunity to suspend parsing temporarily, allowing
parsing to resume when the application is ready for additional
information from the document. A .NET-like API could easily be built
on top of such a feature.

Karl Waclawek and I have been having discussions about this, and think
we have a good idea of how to introduce such a feature into Expat.
There are questions and issues regarding the possible API that would
need to be exposed; I've summarized our ideas and analysis below in the
form of two alternate API proposals.

We welcome feedback and discussion, including the introduction of
additional API proposals, on the expat-discuss list.


Supporting Information
----------------------

Expat 1.95.6 / 1.96 will include a new enumeration, XML_Status,
specifying return values for the XML_Parse() and XML_ParseBuffer()
functions. Our recommendation is that the result of XML_Parse() and
XML_ParseBuffer() be tested for these values specifically, even when
using older versions of Expat 1.95.x -- this will be completely
equivalent in practice.

This change allows us to extend the number of possible return values
in the future; the documented API in Expat 1.95 through 1.95.4 really
only defines a boolean interpretation of these return values, but only
the two specific values, now named by XML_Status enum names, were
actually used.


API Option 1
------------

This alternative introduces two new functions and three new
constants. These are only needed if an application uses the new
functionality.

XML_STATUS_SUSPENDED
    New value in the XML_Status enumeration. This is only used if
    XML_SuspendParser() has been called.

XML_ERROR_NOT_SUSPENDED
XML_ERROR_SUSPENDED
    These new error codes would be used to indicate that a call to the
    parser was made when the parser was not in the expected internal
    state, and indicate programming errors in the application.

XML_Status XML_SuspendParser(XML_Parser parser)
    Inform the parser that parsing should be suspended when the
    currently active callback returns. It should only be called from
    a callback. Returns XML_STATUS_OK or XML_STATUS_ERROR.

    Multiple calls to XML_SuspendParser() during a callback are
    allowed, and are equivalent to a single call to
    XML_SuspendParser().

    It is an error to call this function while a callback function is
    not active.

XML_Status XML_ResumeParser(XML_Parser parser)
    Resume parsing using a suspended parser. Returns XML_STATUS_OK,
    XML_STATUS_ERROR, or XML_STATUS_SUSPENDED.

    If the parser has not been suspended, this returns
    XML_STATUS_ERROR, and XML_GetErrorCode() returns
    XML_ERROR_NOT_SUSPENDED. The parser is not invalidated in this
    case, and parsing may be continued with additional input using
    XML_Parse() or XML_ParseBuffer().

The following functions change:

XML_Status XML_Parse(XML_Parser parser, const char *s, int len, int isFinal)
XML_Status XML_ParseBuffer(XML_Parser parser, int len, int isFinal)
    These two existing functions will change the meaning of their
    return value slightly. If parsing is suspended using
    XML_SuspendParser(), they will return XML_STATUS_SUSPENDED,
    otherwise the current values of XML_STATUS_OK and XML_STATUS_ERROR
    may be returned. If XML_STATUS_SUSPENDED is returned, the parse
    of the input document can only be resumed using
    XML_ResumeParser().

    If either of these is called on a suspended parser,
    XML_STATUS_ERROR will be returned with the error code
    XML_ERROR_SUSPENDED returned by XML_GetErrorCode(). The parser is
    not invalidated in this case, and parsing may still be resumed.

void * XML_GetBuffer(XML_Parser parser, int len)
    If the parser has been suspended, returns NULL and
    XML_GetErrorCode() returns XML_ERROR_SUSPENDED. Parsing the input
    which has already been passed into Expat should be continued using
    XML_ResumeParser(). No changes if the parser was not suspended.


Potential Issues
----------------

The risk inherent in this API variant is that it does change the
interpretation of the return code for XML_Parse() and
XML_ParseBuffer(). This is only significant if any callback ever
calls XML_SuspendParser(). In the case of suspension,
XML_STATUS_SUSPENDED would be returned, but an existing main loop will
recognize this as a successful parse. This would be a programming
error in the revised API, but not the old API. If the buffer being
parsed was not the last buffer, a reasonable error would be returned
when the main loop calls XML_Parse() or XML_ParseBuffer() again, but
if the last input buffer was already passed (isFinal is true), there
would be no opportunity to report the error, possibly making it
difficult to diagnose application errors introduced by this change.

We don't know how important this change is in practice for Expat
1.95.x users; we would appreciate feedback on the expat-discuss list.


API Option 2
------------

This version of the API changes provides increased backward
compatibility, at the cost of a cruftier API to Expat.

An alternate version of the API also adds the XML_SuspendParser() and
XML_ResumeParser() functions, and the new XML_ERROR_* constants, but
not the new XML_Status value. This variant would describe suspension
as a pseudo-error from the XML_Parse() and XML_ParseBuffer()
functions, allowing existing applications to report "errors" from the
main loop if they had not been prepared for the suspension feature,
but some callback function called XML_SuspendParser(). This would
only be expected to occur during development, but applications that
only suspend parsing occasionally may find that poorly tested code
paths expose problems late in the development cycle or even after the
application has entered production.
The alternate version uses this description for XML_Parse() and XML_ParseBuffer(): XML_Status XML_Parse(XML_Parser parser, const char *s, int len, int isFinal) XML_Status XML_ParseBuffer(XML_Parser parser, int len, int isFinal) If XML_STATUS_ERROR is returned, a main loop which supports the suspension feature needs to check whether XML_GetErrorCode(parser) == XML_ERROR_SUSPENDED. If so, the parse was suspended and the call to continue the parse needs to be XML_ResumeParser(). Otherwise, the error is "real". This approach conflates error codes with the state of the parse, and labels the normal operation of the parser as an error. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From fdrake@acm.org Fri Aug 9 14:07:02 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri Aug 9 13:07:02 2002 Subject: [Expat-discuss] XML Parser Error In-Reply-To: <6973AC049FFDD41197BB0002A52916C657F0F4@bninchemex01.barconet.com> References: <6973AC049FFDD41197BB0002A52916C657F0F4@bninchemex01.barconet.com> Message-ID: <15700.8360.621872.811052@grendel.zope.com> Subramanian, Binu writes: > I use the SAXInCpp wrapper for expat and try to parse my XML file. The > following is the first 10 lines.....The XML file is valid and is > viewed correctly in IE 6.0. I am working on VC 6.0 and Win 2 K. ... > I get the following errors: > ---------------------------------------------------------------------------- > ---------------------------------------------- > XML Parser error Parse exception at 2,37 > Could not resolve XML document > XML Parser error Parse exception at 2,37 > error in processing external entity reference ... > Does anyone have any idea why? It looks like the SAX wrapper is trying to resolve the system identifier "1_EN.dtd" and can't. Does the file exist? Is it in the same directory as the document? Depending on the SAX wrapper's entity resolution support, it may not be able to resolve relative references. 
You should determine what the capabilities of the supplied entity resolver are; the error message from Expat (the last line of the output you provided) indicates that the external entity handler signalled an error. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From bronson@rinspin.com Fri Aug 9 21:20:02 2002 From: bronson@rinspin.com (Scott Bronson) Date: Fri Aug 9 20:20:02 2002 Subject: [Expat-discuss] New Expat functionality and API proposal In-Reply-To: <15698.46452.77115.176524@grendel.zope.com> References: <15698.46452.77115.176524@grendel.zope.com> Message-ID: <1028949476.21441.3.camel@emma> ---------------------- multipart/mixed attachment > Requests for a pull-based API for Expat have surfaced a few times over > (at least) the last couple of years; there is a feature request for > this on SourceForge (issue #544682): > Expat could provide the basis for an efficient pull-based API if it > offered an opportunity to suspend parsing temporarily, allowing > parsing to resume when the application is ready for additional > information from the document. A .NET-like API could easily be built > on top of such a feature. Why? What does suspend have to do with pulling? > Karl Waclawek and I have been having discussions about this, and think > we have a good idea of how to introduce such a feature into Expat. > There are questions and issues regarding the possible API that would > need to be exposed; I've summarized our ideas an analysis below in the > form of two alternate API proposals. > > We welcome feedback and discussion, including the introduction of > additional API proposals, on the expat-discuss list. I actually thought about this a while ago but never went anywhere with it (due to other problems with the project that was to use it). But, I did send the following set of files to a friend. Glad they're still in the send mail directory. After seeing how ugly push got, I wrote a shim to implement pull. 
Exrub (stupid name -- I was tired) is pretty easy to use: you just keep asking it for tokens until it returns EOF. So, to slam together an example, this is how you would parse an arbitrary number of section elements like this:
... using Exrub. The (off the top of my head) Pythonish code:

    parser = exrub.Exrub()
    file = open('file.xml', 'r')
    parser.SetFile(file)
    ...
    while 1:
        tok = parser.GetNextNonWSToken()   # ignore whitespace
        if tok.type == START:
            if tok.name == 'section':
                if tok.attrs.has_key('name'):
                    NewSection(tok.attrs['name'])
                else:
                    Error("All sections must have a name attribute")
            else:
                Error("This element can only contain sections.")
        elif tok.type == END:
            break
        else:   # token is character data
            print tok.data

(NewSection would keep asking for more tokens and parsing sub-elements
until it gets an end section tag, whereupon it would return)

An exrub token has a type (start tag / end tag / character data) and a
name. If it's a start tag, it also has all of the attributes in a hash.
If it's char data, it contains the data in a string. Pretty simple.

It's MUCH easier to parse an XML file using this style of pull than it
is to try to implement a FSM to reassemble data that is pushed. Compare
read-exrub.py and read-fsm.py. The biggest thing to notice is that the
structure of the code in read-exrub is pretty similar to the XML file.
The structure of read-fsm, though, is totally different. Good luck
understanding it...

So, is Exrub (minus the name) similar to what you were thinking? If
not, then why not? :)

- Scott

---------------------- multipart/mixed attachment
A non-text attachment was scrubbed...
Name: parse-vs-fsm.tar.gz Type: application/x-gzip Size: 7806 bytes Desc: not available Url : http://mail.libexpat.org/pipermail-21/expat-discuss/attachments/20020809/509f8389/parse-vs-fsm.tar.bin ---------------------- multipart/mixed attachment-- From karl@waclawek.net Sat Aug 10 19:25:04 2002 From: karl@waclawek.net (Karl Waclawek) Date: Sat Aug 10 18:25:04 2002 Subject: [Expat-discuss] Encoding issues in expat References: <6973AC049FFDD41197BB0002A52916C657F0F5@bninchemex01.barconet.com> Message-ID: <01da01c240d8$fe445580$0207a8c0@karl> > Second Case : Expat compiled in UTF-16( by specifying XML_UNICODE in the > build options.) On Windows, you should use XML_UNICODE_WCHAR_T. Or better, use the "expatw files" project, where everything is set up already. You are using 1.95.4, aren't you? > For the following eg XML File: > 1. The Start Element, End Element functions DO NOT give me the corresponding > tags : ie "GridData". Instead only the first character is > given during parsing : 'G'. How do you determine that? That is, what do you use to view those 16bit characters? Karl From karl@waclawek.net Sat Aug 10 19:28:03 2002 From: karl@waclawek.net (Karl Waclawek) Date: Sat Aug 10 18:28:03 2002 Subject: [Expat-discuss] Memory error while using expat References: Message-ID: <01e501c240d9$60d248b0$0207a8c0@karl> > I am using expat to parse XML-files in an ansi- C-application on HPUX-11 > machine en GCC built. All XML-files are automatically generated. > With some files my application core-dumps (crash): memory error. I traced > it back to the function lookup just prior to calling the function hash. > The pointer to the > string which is argument for hash points to an illegal address (totally > out of range with respect to the previous addresses used). I have no idea > why this happens. > I was able to find a containment: increment INIT_SIZE from 64 to 128, but > I have no guarantee this will work in any future file. 
> Any suggestion how to prevent this or why this is happening is welcome !

If this is *not* an issue with the HPUX-11 environment,
then maybe we can reproduce your problem if you give us
a detailed test case.

Karl

From fdrake@acm.org Sat Aug 10 22:31:01 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Sat Aug 10 21:31:01 2002
Subject: [Expat-discuss] Re: troubles with expat sol 8/gcc/xml-parser
In-Reply-To: <200208070159.DAA21483@pointsman.pointsman.de>
References: <15696.14965.798436.868702@grendel.zope.com> <200208070159.DAA21483@pointsman.pointsman.de>
Message-ID: <15701.59483.427754.90204@grendel.zope.com>

rolf@pointsman.de writes:
> Very interesting. We do it exactly the same way, for our expat based
> Tcl extension. Yes, this works quite well. And it lowers the build
> hassle (for the people, that otherwise first would have to install
> expat). On the other hand...

Yes; this is a big issue. We used to require that Expat already be
installed, or you didn't get the Expat bindings. That actually causes
an enormous amount of pain, because application developers could not
rely on it being there.

> But, yes, disk space is so cheap these days, etc. ppp. And linking the
> expat object files directly into the application has the *big*
> advantage, that it simply works.

That certainly helps!

-Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From gp@familiehaase.de Sun Aug 11 10:43:05 2002
From: gp@familiehaase.de (Gerrit P. Haase)
Date: Sun Aug 11 09:43:05 2002
Subject: [Expat-discuss] Using precompiled expat.dll
In-Reply-To:
References:
Message-ID: <4997918669.20020811144400@familiehaase.de>

Hello Spyros,

On Thursday, 8 August 2002 at 13:28 you wrote:

> I 've downloaded and used successfully expat on unix using the expat package.
> I tried to use expat on windows platform using the expatwin32 package.
> I am using gcc compiler. How can I use the precompiled dll of expat?
> (I have not installed (yet :)) cygwin, neither using the MS Visual developer studio) Expat is available for the Cygwin netrelease via the Cygwin mirrors. You can also get the sourcepackage with a small patch and a script where you can see the details how I build the Cygwin version of Expat. [...] > any ideas or change compiler? I use the Cygwin toolchain to build libexpat like it is done on Linux too. This includes the autotools and GCC. Take a look at the Cygwin source package to get familiar with building on top of Cygwin. Regards, Gerrit -- =^..^= From binu.subramanian@sciatl.com Mon Aug 12 00:34:02 2002 From: binu.subramanian@sciatl.com (Subramanian, Binu) Date: Sun Aug 11 23:34:02 2002 Subject: [Expat-discuss] XML Parser Error Message-ID: <6973AC049FFDD41197BB0002A52916C657F104@bninchemex01.barconet.com> Thanx Fred. Found out that the parser searches for the specified DTD file in the current directory which was different from the path in which the XML file + DTD were located. After setting the current directory to the directory where the DTD was found, i didnt get the error. kr, Binu -----Original Message----- From: Fred L. Drake, Jr. [mailto:fdrake@acm.org] Sent: 10 August 2002 01:36 To: Subramanian, Binu Cc: expat-discuss@lists.sourceforge.net Subject: Re: [Expat-discuss] XML Parser Error Subramanian, Binu writes: > I use the SAXInCpp wrapper for expat and try to parse my XML file. The > following is the first 10 lines.....The XML file is valid and is > viewed correctly in IE 6.0. I am working on VC 6.0 and Win 2 K. ... > I get the following errors: > ---------------------------------------------------------------------------- > ---------------------------------------------- > XML Parser error Parse exception at 2,37 > Could not resolve XML document > XML Parser error Parse exception at 2,37 > error in processing external entity reference ... > Does anyone have any idea why? 
It looks like the SAX wrapper is trying to resolve the system identifier "1_EN.dtd" and can't. Does the file exist? Is it in the same directory as the document? Depending on the SAX wrapper's entity resolution support, it may not be able to resolve relative references. You should determine what the capabilities of the supplied entity resolver are; the error message from Expat (the last line of the output you provided) indicates that the external entity handler signalled an error. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From binu.subramanian@sciatl.com Mon Aug 12 00:35:01 2002 From: binu.subramanian@sciatl.com (Subramanian, Binu) Date: Sun Aug 11 23:35:01 2002 Subject: [Expat-discuss] Source code : use expat Message-ID: <6973AC049FFDD41197BB0002A52916C657F106@bninchemex01.barconet.com> Hello, Is there any source code in C available which will give me an idea how to use expat? Thanx. Binu From martap@tango04.net Mon Aug 12 02:15:05 2002 From: martap@tango04.net (Marta Padilla) Date: Mon Aug 12 01:15:05 2002 Subject: [Expat-discuss] Source code : use expat In-Reply-To: <6973AC049FFDD41197BB0002A52916C657F106@bninchemex01.barconet.com> Message-ID: You can find a good example in the Expat distribution itself (the xmlwf application). xmlwf.c is an application using Expat (defining handlers, etc). Marta -----Original Message----- From: expat-discuss-admin@lists.sourceforge.net [mailto:expat-discuss-admin@lists.sourceforge.net] On behalf of Subramanian, Binu Sent: Monday, 12 August 2002 6:07 To: expat-discuss@lists.sourceforge.net Subject: [Expat-discuss] Source code : use expat Hello, Is there any source code in C available which will give me an idea how to use expat? Thanx. Binu ------------------------------------------------------- This sf.net email is sponsored by:ThinkGeek Welcome to geek heaven.
http://thinkgeek.com/sf _______________________________________________ Expat-discuss mailing list Expat-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/expat-discuss From binu.subramanian@sciatl.com Mon Aug 12 05:22:02 2002 From: binu.subramanian@sciatl.com (Subramanian, Binu) Date: Mon Aug 12 04:22:02 2002 Subject: [Expat-discuss] Source code : use expat Message-ID: <6973AC049FFDD41197BB0002A52916C657F10A@bninchemex01.barconet.com> Thanx a ton. You just saved me a whole lot of trouble. -----Original Message----- From: Marta Padilla [mailto:martap@tango04.net] Sent: 12 August 2002 13:53 To: Subramanian, Binu; expat-discuss@lists.sourceforge.net Subject: RE: [Expat-discuss] Source code : use expat You can find a good example in the Expat distribution itself (the xmlwf application). xmlwf.c is an application using Expat (defining handlers, etc). Marta -----Original Message----- From: expat-discuss-admin@lists.sourceforge.net [mailto:expat-discuss-admin@lists.sourceforge.net] On behalf of Subramanian, Binu Sent: Monday, 12 August 2002 6:07 To: expat-discuss@lists.sourceforge.net Subject: [Expat-discuss] Source code : use expat Hello, Is there any source code in C available which will give me an idea how to use expat? Thanx. Binu From fdrake@acm.org Mon Aug 12 13:05:06 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon Aug 12 12:05:06 2002 Subject: [Expat-discuss] New Expat functionality and API proposal In-Reply-To: <1028949476.21441.3.camel@emma> References: <15698.46452.77115.176524@grendel.zope.com> <1028949476.21441.3.camel@emma> Message-ID: <15704.1726.857200.863947@grendel.zope.com> Scott Bronson writes: > Why?
What does suspend have to do with pulling? A pull-based scheme can be implemented by having each event-based callback suspend the parser and let the top-level loop present the last set of collected data as the current event. > So, is Exrub (minus the name) similar to what you were thinking? > If not, then why not? :) I've looked at your example code only briefly. Your "exrub" example certainly looks very similar to the .NET XmlReader model; this is *one* of the cases that the proposed changes can support. Using suspend/resume as the basis allows more specialized applications to be constructed as well, where suspension may only be needed some of the time. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From gstein@lyra.org Tue Aug 13 23:27:01 2002 From: gstein@lyra.org (Greg Stein) Date: Tue Aug 13 22:27:01 2002 Subject: [Expat-discuss] New Expat functionality and API proposal In-Reply-To: <15698.46452.77115.176524@grendel.zope.com>; from fdrake@acm.org on Thu, Aug 08, 2002 at 02:16:20PM -0400 References: <15698.46452.77115.176524@grendel.zope.com> Message-ID: <20020813223011.G13410@lyra.org> On Thu, Aug 08, 2002 at 02:16:20PM -0400, Fred L. Drake, Jr. wrote: >... > We welcome feedback and discussion, including the introduction of > additional API proposals, on the expat-discuss list. I'm not inclined towards either of the two proposals you put forward. It is a strange mixture of callbacks and pull-style. An application would need to implement both. Instead, I would much rather see an extra layer over xmltok. It would be a *pure* pull API rather than a hybrid. The API would be modelled similar to the XMLPULL API (see http://www.xmlpull.org/). Essentially, the API could be a generator -- just implement a next() method which returns a token plus data. You could set up the generator to have callbacks to fetch more data, or you could pass in a buffer into the next() call somehow (I prefer the callback approach). 
The API would need a return value for "partial token", in which case you must feed it more data. The layer over xmltok would perform entity expansion, namespace handling, and other stuff like that. Essentially, it would be an entirely new interface. Users could use it, or use the classic SAX-like interface. If we do it right, we could even refactor the class interface to use the new Pull interface internally. I'll just rough out an interface here. Definitely pseudo-code, pending a deeper interest in the style :-)

    struct XML_Token {
        enum XML_Token_Type {
            XML_TOKEN_START_TAG,
            XML_TOKEN_END_TAG,
            XML_TOKEN_ATTR,
            ...
            XML_TOKEN_PARTIAL,  /* couldn't get a full token. call again... */
            ...
        } type;
        const char *name;   /* valid for START_TAG and ATTR */
        const char *value;  /* valid for ATTR */
        const char *uri;    /* namespace. valid for START_TAG, ATTR */
        const char *cdata;  /* valid for CDATA */
                            /* ### use 'name' instead? */
        size_t cdataSize;   /* length of (binary) CDATA returned */
    };

    typedef struct XML_Generator XML_Generator;

    XML_Generator *XML_GeneratorCreate(void);  /* ### params? */

    XML_Error XML_Next(
        XML_Token *token,          /* OUT: filled in with values */
        size_t *consumed,          /* OUT: number of bytes consumed */
        XML_Generator *generator,  /* IN: generator to use */
        const char *buffer,        /* IN: buffer to read. NULL to use Reader */
        size_t bufsize             /* IN: size of input buffer */
    );

    void XML_SetReader(XML_Generator *generator, XML_Reader readerFunc);

    /* ### dunno on this */
    typedef .... XML_Reader ( ... );

I think that is about it. Thoughts? Cheers, -g -- Greg Stein, http://www.lyra.org/ From kwaclaw@thestar.ca Wed Aug 14 07:55:03 2002 From: kwaclaw@thestar.ca (Karl Waclawek) Date: Wed Aug 14 06:55:03 2002 Subject: [Expat-discuss] New Expat functionality and API proposal References: <15698.46452.77115.176524@grendel.zope.com> <20020813223011.G13410@lyra.org> Message-ID: <003001c2439a$201225c0$9e539696@citkwaclaww2k> > On Thu, Aug 08, 2002 at 02:16:20PM -0400, Fred L. Drake, Jr. wrote: > >...
> > We welcome feedback and discussion, including the introduction of > > additional API proposals, on the expat-discuss list. > > I'm not inclined towards either of the two proposals you put forward. It is > a strange mixture of callbacks and pull-style. An application would need to > implement both. The goal was to enable a pull API on top of Expat. Our proposal - at least how I understand it - was not meant to turn Expat into a pull parser directly. > Instead, I would much rather see an extra layer over xmltok. It would be a > *pure* pull API rather than a hybrid. The API would be modelled similar to > the XMLPULL API (see http://www.xmlpull.org/). Essentially, the API could be > a generator -- just implement a next() method which returns a token plus > data. You could set up the generator to have callbacks to fetch more data, > or you could pass in a buffer into the next() call somehow (I prefer the > callback approach). The API would need a return value for "partial token", > in which case you must feed it more data. > > The layer over xmltok would perform entity expansion, namespace handling, > and other stuff like that. I like all of that! But, the internal interface is already a pull interface, isn't it? I mean, the main parsing loops in xmlparse.c are calling XmlContentTok, XmlCdataSectionTok, XmlPrologTok, XmlAttributeValueTok, and XmlEntityValueTok - depending on the state. These are essentially like the next() function, except for not being combined into one. I think you are proposing a pull layer directly above those token fetch functions. Essentially this means re-writing part of xmlparse.c. Definitely a more efficient idea than our proposal, but a *lot* more work. Our proposal simply has the advantage of being doable with comparatively little effort, given the resources we have. > Essentially, it would be an entirely new interface. Users could use it, or > use the classic SAX-like interface. 
If we do it right, we could even > refactor the class interface to use the new Pull interface internally. It already does - see above. For the rest - well, I smell a volunteer! ;-) Btw, Expat can easily be turned into a pull parser already, if you feed it the data in one-byte buffers. Just build a nextTag() function that does this in an inner loop, and returns only if a callback gets triggered. Efficiency is another question, of course. Karl Get to know us http://www.thestar.com - Canada's largest daily newspaper online http://www.toronto.com - All you need to know about T.O. http://www.workopolis.com - Canada's biggest job site http://www.torontostartv.com - Webcasting & Production http://www.newinhomes.com - Ontario's Largest New Home & Condo Website http://www.waymoresports.com - Canada's most comprehensive sports site http://www.tmgtv.ca - Torstar Media Group Television From ajit_dias@hotmail.com Fri Aug 16 19:19:02 2002 From: ajit_dias@hotmail.com (Ajit Dias) Date: Fri Aug 16 18:19:02 2002 Subject: [Expat-discuss] XML_ParserReset and memory leak?? Message-ID: Hi, I am using the XML_ParserReset function and am seeing a memory leak. I am not sure if I am using it incorrectly or if there is a bug. The leak can be observed by using the Windows Task Manager. Below is a simple sample application. Thanks for your feedback.
Expat version: 1.95.4
Platform: win200

ajit

    #include <stdlib.h>
    #include <string.h>
    #include "expat.h"

    char buffer[] = "";

    void main()
    {
        XML_Parser parser = XML_ParserCreate(NULL);

        while ( true )
        {
            int ret = XML_Parse( parser, buffer, strlen(buffer ), 1);
            if( ret == 0 )
            {
                abort();
            }

            ret = XML_ParserReset( parser, NULL );
            if( ret == 0 )
            {
                abort();
            }
        }
    }

_________________________________________________________________ MSN Photos is the easiest way to share and print your photos: http://photos.msn.com/support/worldwide.aspx From karl@waclawek.net Fri Aug 16 20:01:02 2002 From: karl@waclawek.net (Karl Waclawek) Date: Fri Aug 16 19:01:02 2002 Subject: [Expat-discuss] XML_ParserReset and memory leak?? References: Message-ID: <000701c24594$f862dc20$0207a8c0@karl>

> Hi,
>
> I am using the XML_ParserReset function and am seeing a memory leak. I am not sure
> if I am using it incorrectly or if there is a bug. The leak can be observed
> by using the Windows Task Manager. Below is a simple sample application.
> Thanks for your feedback.
>
> Expat version: 1.95.4
> Platform: win200
>
> ajit
>
> #include <stdlib.h>
> #include <string.h>
> #include "expat.h"
>
> char buffer[] = " version=\"1.00\">";
>
> void main()
> {
>     XML_Parser parser = XML_ParserCreate(NULL);
>
>     while ( true )
>     {
>         int ret = XML_Parse( parser, buffer, strlen(buffer ), 1);
>         if( ret == 0 )
>         {
>             abort();
>         }
>
>         ret = XML_ParserReset( parser, NULL );
>         if( ret == 0 )
>         {
>             abort();
>         }
>     }
> }

You still need to call XML_ParserFree. XML_ParserReset just allows you to re-use the same parser instance for additional runs. In any case, your use of abort() would produce memory leaks anyway, since XML_ParserFree would not get called in that case. I recommend you check the HTML documentation (reference.html) which has some sample code for the main parsing loop. Karl From karl@waclawek.net Fri Aug 16 20:56:03 2002 From: karl@waclawek.net (Karl Waclawek) Date: Fri Aug 16 19:56:03 2002 Subject: [Expat-discuss] XML_ParserReset and memory leak??
References: Message-ID: <002301c2459c$ac3b3010$0207a8c0@karl>

> Hi,
>
> I am using the XML_ParserReset function and am seeing a memory leak. I am not sure
> if I am using it incorrectly or if there is a bug. The leak can be observed
> by using the Windows Task Manager. Below is a simple sample application.
> Thanks for your feedback.
>
> Expat version: 1.95.4
> Platform: win200
>
> ajit
>
> #include <stdlib.h>
> #include <string.h>
> #include "expat.h"
>
> char buffer[] = " version=\"1.00\">";
>
> void main()
> {
>     XML_Parser parser = XML_ParserCreate(NULL);
>
>     while ( true )
>     {
>         int ret = XML_Parse( parser, buffer, strlen(buffer ), 1);
>         if( ret == 0 )
>         {
>             abort();
>         }
>
>         ret = XML_ParserReset( parser, NULL );
>         if( ret == 0 )
>         {
>             abort();
>         }
>     }
> }

I guess I replied too quickly. Rolf Ade pointed out to me that what you are testing is increasing memory usage as the loop progresses. I have to look into it, or maybe Rolf will help me out there. ;-) Karl From karl@waclawek.net Sat Aug 17 22:22:01 2002 From: karl@waclawek.net (Karl Waclawek) Date: Sat Aug 17 21:22:01 2002 Subject: [Expat-discuss] XML_ParserReset and memory leak?? References: Message-ID: <000c01c24671$e429d230$0207a8c0@karl> > I am using the XML_ParserReset function and am seeing a memory leak. I am not sure > if I am using it incorrectly or if there is a bug. I believe that this is a bug. Please submit a bug report. I am working on a fix. Karl From brasilino@recife.pe.gov.br Mon Aug 19 06:16:03 2002 From: brasilino@recife.pe.gov.br (Lucas Brasilino) Date: Mon Aug 19 05:16:03 2002 Subject: [Expat-discuss] Newbies questions Message-ID: <3D60E0A8.3060702@recife.pe.gov.br> Hi All: I'm starting to study XML and am looking for XML tools, DTDs and so on. I'm very interested in Expat since there are few XML toolkits written in C; most are written in Java. I have some experience as an SGML user, using jade (openjade), the DocBook DTD and DSSSL, but not as an XML DTD designer, which I'm studying.
At James Clark's expat page, he says that It is currently not a validating XML processor Since it is a parser looking for structural errors in a XML document throught a external DTD and entities isn't it validating a XML document? What's the diference between a parser and a validator ? (in my poor perception both are the same :-) ) Is this quote above outdated ? Sorry about this boring questions, but I wasn't be able to find those answers in list archive. By the way, can anybody point me out some XSL toolkit written in C ? And some practical tutorials about XML DTD and XSL ? Best regards -- []'s Lucas Brasilino brasilino@recife.pe.gov.br http://www.recife.pe.gov.br Emprel - Empresa Municipal de Informatica (pt_BR) Municipal Computing Enterprise (en_US) Recife - Pernambuco - Brasil Fone: +55-81-34167078 From ta-meyer@ihug.co.nz Tue Aug 20 03:40:02 2002 From: ta-meyer@ihug.co.nz (ta-meyer@ihug.co.nz) Date: Tue Aug 20 02:40:02 2002 Subject: [Expat-discuss] Re: Newbies questions Message-ID: > What's the diference between a parser and a validator ? (in my poor > perception both are the same :-) ) To use expat as a parser, the XML must be "well-formed". This must mean that it is 'valid' XML - i.e. all the tags that are opened are closed, you aren't missing any '>'s, and so on. To be valid (i.e. validated by a validating parser) the XML must conform to some type of given structure, like a DTD. (It must also be well-formed, as above). Obviously, if there is no DTD specified (or similar structure-giving device), an XML document can't be 'valid'. Expat checks for well-formedness, but does not validate (in the sense given above). Obviously it depends on your situation whether being valid (in this way) is important, or whether well-formedness is enough. (Well-formness is _always_ important, though, unless you want to spend a lot of time coding for possible typos in the XML). > And some practical tutorials about XML DTD and XSL ? 
Actually, when I first learned how to use XML (and DTD's and XSL and so on), I found that good-old-fashioned books were best. So I'd personally recommend your local library. I guess it depends on how good that library is (mine is a university library, so probably more up-to-date than some others). No doubt others on the list will have online tutorials that might be useful. Hope this helps. = Tony Meyer Auckland, NZ From brasilino@recife.pe.gov.br Tue Aug 20 05:33:03 2002 From: brasilino@recife.pe.gov.br (Lucas Brasilino) Date: Tue Aug 20 04:33:03 2002 Subject: [Expat-discuss] Re: Newbies questions References: Message-ID: <3D622823.2040105@recife.pe.gov.br> Hi Tony: > To use expat as a parser, the XML must be "well-formed". This must mean that > it is 'valid' XML - i.e. all the tags that are opened are closed, you aren't > missing any '>'s, and so on. > > To be valid (i.e. validated by a validating parser) the XML must conform to > some type of given structure, like a DTD. (It must also be well-formed, as > above). Obviously, if there is no DTD specified (or similar structure-giving > device), an XML document can't be 'valid'. > > Expat checks for well-formedness, but does not validate (in the sense given > above). I see... Now I understand. Thanks for your explanation., it was very useful. bests regards -- []'s Lucas Brasilino brasilino@recife.pe.gov.br http://www.recife.pe.gov.br Emprel - Empresa Municipal de Informatica (pt_BR) Municipal Computing Enterprise (en_US) Recife - Pernambuco - Brasil Fone: +55-81-34167078 From Michael.Kaufman@arbitron.com Tue Aug 20 08:14:01 2002 From: Michael.Kaufman@arbitron.com (Kaufman, Michael) Date: Tue Aug 20 07:14:01 2002 Subject: [Expat-discuss] Writing XML ? Message-ID: <411EA40BC162D211B92B0008C7B1D2B30C8E350A@arbmdex.arbitron.com> All, I've been doing a bit of searching for XML libraries. It seems the two good ones are expat and libxml (the GNOME project one) and there's also some LT XML v1.2 library coming out of the UK. 
This may very well be a dumb newbie question but... do any of these libraries support CREATING XML files as opposed to just parsing/reading them?? I have a lot of flat text files that are well-formatted, but am looking at transforming them into XML form. I already have a parser for my pre-existing files, what I need is a library to link into this parser that will allow me to re-write as XML. I saw the LT XML library had some PrintXXXX functions, but they didn't seem to be exactly what I wanted (I may be wrong). The expat library doesn't seem to support it at all, and I can't quite tell either way with libxml. Can anyone help? Thanks, -Mike From kwaclaw@thestar.ca Tue Aug 20 10:27:03 2002 From: kwaclaw@thestar.ca (Karl Waclawek) Date: Tue Aug 20 09:27:03 2002 Subject: [Expat-discuss] Writing XML ? References: <411EA40BC162D211B92B0008C7B1D2B30C8E350A@arbmdex.arbitron.com> Message-ID: <004c01c24866$57895fc0$9e539696@citkwaclaww2k> > I've been doing a bit of searching for XML libraries. It seems the two good > ones are expat and libxml (the GNOME project one) and there's also some LT > XML v1.2 library coming out of the UK. This may very well be a dumb newbie > question but... do any of these libraries support CREATING XML files as > opposed to just parsing/reading them?? I have a lot of flat text files that > are well-formatted, but am looking at transforming them into XML form. I > already have a parser for my pre-existing files, what I need is a library to > link into this parser that will allow me to re-write as XML. > I saw the LT XML library had some PrintXXXX functions, but they didn't seem > to be exactly what I wanted (I may be wrong). The expat library doesn't > seem to support it at all, and I can't quite tell either way with libxml. > Can anyone help? Expat is an XML parser, no writing API is supported. I believe that the .NET framework has XML writer support. There are also numerous OpenSource efforts, you just have to find them. 
If you are a Delphi programmer, you might even be able to use the XML writer library that I wrote. Karl From mark@mitchenall.com Tue Aug 20 10:49:04 2002 From: mark@mitchenall.com (Mark Mitchenall) Date: Tue Aug 20 09:49:04 2002 Subject: [Expat-discuss] Alternative Encodings In-Reply-To: <004c01c24866$57895fc0$9e539696@citkwaclaww2k> Message-ID: Can anyone point me in the direction of some examples of the UnknownEncodingHandler, etc? I need to add support for Shift-JIS and Big5 encodings in the documents, and although I think I know what's necessary, some pointers to some C source which demonstrates the features would be really useful. Can anyone suggest any URLs? I saw in the archive that I should have a look at the Perl XML::Parser, but I couldn't find the source. TIA, Mark -- Mark Mitchenall Principal Consultant mitchenall.com Email: mark@mitchenall.com Tel: +44(0)20 8452 3031 Mobile: +44(0)7850 847 543 http://www.mitchenall.com/ From kwaclaw@thestar.ca Tue Aug 20 11:04:03 2002 From: kwaclaw@thestar.ca (Karl Waclawek) Date: Tue Aug 20 10:04:03 2002 Subject: [Expat-discuss] Alternative Encodings References: Message-ID: <008401c2486b$7876fe40$9e539696@citkwaclaww2k> > Can anyone point me in the direction of some examples of the > UnknownEncodingHandler, etc? I need to add support for Shift-JIS and Big5 > encodings in the documents, and although I think I know what's necessary, > some pointers to some C source which demonstrates the features would be > really useful. > > Can anyone suggest any URLs?
I saw in the archive that I should have a look > at the Perl XML::Parser, but I couldn't find the source. The xmlwf utility included with Expat has some code for the unknownEncodingHandler. Karl From kwaclaw@thestar.ca Tue Aug 20 11:29:04 2002 From: kwaclaw@thestar.ca (Karl Waclawek) Date: Tue Aug 20 10:29:04 2002 Subject: [Expat-discuss] Alternative Encodings References: Message-ID: <009301c2486e$f24fe580$9e539696@citkwaclaww2k> > Can anyone point me in the direction of some examples of the > UnknownEncodingHandler, etc? I need to add support for Shift-JIS and Big5 > encodings in the documents, and although I think I know what's necessary, > some pointers to some C source which demonstrates the features would be > really useful. > > Can anyone suggest any URLs? I saw in the archive that I should have a look > at the Perl XML::Parser, but I couldn't find the source. I think I found the source at: http://www.perl.com/CPAN-local/modules/by-category/11_String_Lang_Text_Proc/XML/ The archive is: XML-Parser-2.31.tar.gz Karl
From mark@mitchenall.com Tue Aug 20 11:40:05 2002 From: mark@mitchenall.com (Mark Mitchenall) Date: Tue Aug 20 10:40:05 2002 Subject: [Expat-discuss] Alternative Encodings In-Reply-To: <008401c2486b$7876fe40$9e539696@citkwaclaww2k> Message-ID: on 20/8/2002 6:03 PM, Karl Waclawek at kwaclaw@thestar.ca wrote: > The xmlwf utility included with Expat has some code > for the unknownEncodingHandler. Thanks for that... it's just the ticket. Best, Mark -- Mark Mitchenall Principal Consultant mitchenall.com Email: mark@mitchenall.com Tel: +44(0)20 8452 3031 Mobile: +44(0)7850 847 543 http://www.mitchenall.com/ New Freeware, Open-Source 4D Components available.... http://www.mitchenall.com/products/freeware/ From michael@vivtek.com Tue Aug 20 12:07:06 2002 From: michael@vivtek.com (Michael Roberts) Date: Tue Aug 20 11:07:06 2002 Subject: [Expat-discuss] Writing XML ? References: <411EA40BC162D211B92B0008C7B1D2B30C8E350A@arbmdex.arbitron.com> <004c01c24866$57895fc0$9e539696@citkwaclaww2k> Message-ID: <3D62852B.50108@vivtek.com> > > >>what I need is a library to >>link into this parser that will allow me to re-write as XML. >> You can try my XMLAPI; I use it a lot and I'm discovering other people who do as well. It's part of the wftk workflow distribution (http://www.vivtek.com/wftk.html); I suppose at some point I should probably release it as a separate distribution as well....
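
For the flat-file-to-XML case raised in this thread, where Expat itself offers no writing API, simple output can often be produced with plain string formatting, as long as text content is escaped first. What follows is only a minimal sketch in C under stated assumptions: element names are already valid XML names, text values fit a small fixed buffer, and encoding is not handled. The helpers `xml_escape` and `xml_element` are hypothetical names for illustration; they are not part of Expat, libxml, LT XML or XMLAPI.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not an Expat function): copy `in` to `out`,
   replacing the five characters that have predefined XML entities.
   Returns the number of bytes written, excluding the terminating NUL,
   or -1 if `out` is too small. */
int xml_escape(const char *in, char *out, size_t outsize)
{
    size_t used = 0;

    for (; *in != '\0'; ++in) {
        const char *rep;
        char single[2];
        size_t len;

        switch (*in) {
        case '&':  rep = "&amp;";  break;
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '"':  rep = "&quot;"; break;
        case '\'': rep = "&apos;"; break;
        default:
            single[0] = *in;
            single[1] = '\0';
            rep = single;
            break;
        }
        len = strlen(rep);
        if (used + len + 1 > outsize)   /* keep room for the NUL */
            return -1;
        memcpy(out + used, rep, len);
        used += len;
    }
    if (used + 1 > outsize)             /* covers the empty-input case */
        return -1;
    out[used] = '\0';
    return (int)used;
}

/* Hypothetical helper: format <name>escaped text</name> into `out`.
   Assumes the escaped text fits in 256 bytes; returns the length
   written, or -1 on any overflow. */
int xml_element(const char *name, const char *text,
                char *out, size_t outsize)
{
    char escaped[256];
    int n;

    if (xml_escape(text, escaped, sizeof escaped) < 0)
        return -1;
    n = snprintf(out, outsize, "<%s>%s</%s>", name, escaped, name);
    return (n < 0 || (size_t)n >= outsize) ? -1 : n;
}
```

Calling `xml_element("title", "R&D <draft>", buf, sizeof buf)` on a large enough buffer leaves `<title>R&amp;D &lt;draft&gt;</title>` in `buf`. An existing flat-file parser could call such a helper once per field; anything beyond that (attributes, namespaces, non-ASCII encodings) is probably better left to a real writer library.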
Michael From Josh.Martin@abq.sc.philips.com Wed Aug 21 16:35:05 2002 From: Josh.Martin@abq.sc.philips.com (Josh Martin) Date: Wed Aug 21 15:35:05 2002 Subject: [Expat-discuss] Character conversion from Ansi and Latin 1 to UTF-8 Message-ID: <200208212234.g7LMYhB24529@atoae450.abq.sc.philips.com> > Hi, > > Could anyone please refer me to the function(s) in Expat that actually convert from the Latin1 and Ansi character sets to UTF-8? I need to convert documents to UTF-8 and XML and would just like to see code examples of that kind of conversion to ensure that my own understanding is correct. Help will be appreciated. > > Louw The GNU libiconv is a C conversion library which you can use to convert one encoding into another. You can find out more information about this useful package at http://www.gnu.org/software/libiconv/ I hope this helps. - Josh Martin From fdrake@acm.org Fri Aug 23 13:22:27 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri Aug 23 12:22:27 2002 Subject: [Expat-discuss] Expat mailing lists are moving Message-ID: <15718.35598.477399.965833@grendel.zope.com> All three of the mailing lists for the Expat project are moving, and will no longer be hosted at SourceForge. SourceForge has proven itself an incredibly valuable resource for open source developers, and it's certainly made the continued maintenance of Expat much nicer than it could have been. Unfortunately, its very success has put enough load on their systems that the performance and reliability of some components have fallen behind, most importantly the mailing list servers. To avoid suffering the highly variable delays in getting mail off the SourceForge list servers to recipients' inboxes, we're going to host the lists in the libexpat.org domain. We will continue to use the Mailman mailing list manager used on SourceForge, but with the list archive software provided with Mailman, which isn't quite as nice as the archive interface available on SourceForge.
The new lists have been created, and we'll probably shut down the old lists next week. You can subscribe to the new lists at: http://mail.libexpat.org/mailman-21/listinfo/ The mail from the revision control system and the issue trackers won't be redirected to the new lists until early next week, to give you time to subscribe to the new lists. Please let me know if you have any problems with these changes. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From fdrake@acm.org Fri Aug 23 16:00:02 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri Aug 23 15:00:02 2002 Subject: [Expat-discuss] Expat on AS400. Help needed ! In-Reply-To: References: <15679.58276.425854.242020@grendel.zope.com> Message-ID: <15718.45087.382995.648629@grendel.zope.com> Marta Padilla writes: > - In OS400 file names cannot be longer than 8 characters so I had to change > file names like "xmltok_impl" and the source code of both library > and xmlwf (the application I was testing) with these changes. Does the AS/400 use an 8.3 convention, or something else? I'd like to take care of this before another release comes out, if we can get the information we need. > - Other "silly" issue was to change lines longer than 80 characters (another > wonderful feature of OS400) in all source code. This has been taken care of in the Expat sources now (in CVS). Many of the build support files used on Unix, Windows, and Mac OS still have long lines; those will be more difficult to change, and may not be possible for all of them. Thanks! -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From ta-meyer@ihug.co.nz Sat Aug 24 20:00:48 2002 From: ta-meyer@ihug.co.nz (ta-meyer@ihug.co.nz) Date: Sat Aug 24 19:00:48 2002 Subject: [Expat-discuss] Re: Expat mailing lists are moving Message-ID: Just a reminder that someone should update the "mailing lists" link on http://www.libexpat.org to reflect the fact that the lists have moved.
= Tony Meyer Auckland, NZ From fdrake@acm.org Mon Aug 26 09:10:05 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon Aug 26 08:10:05 2002 Subject: [Expat-discuss] Re: Expat mailing lists are moving In-Reply-To: References: Message-ID: <15722.17570.466901.161200@grendel.zope.com> ta-meyer@ihug.co.nz writes: > Just a reminder that someone should update the "mailing lists" link on > http://www.libexpat.org to reflect the fact that the lists have moved. Thanks for the reminder! I've made this update now. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From fdrake@acm.org Mon Aug 26 14:05:05 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon Aug 26 13:05:05 2002 Subject: [Expat-discuss] Starting mailing list cutover Message-ID: <15722.35296.723163.411869@grendel.zope.com> I'm switching the check-in and issue-tracker email notifications to the new lists. If you haven't subscribed to the new lists, the subscription pages are available at: http://mail.libexpat.org/mailman-21/listinfo/ -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From fdrake@acm.org Wed Aug 28 23:10:04 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed Aug 28 22:10:04 2002 Subject: [Expat-discuss] Final mailing list cutover Message-ID: <15725.44177.79747.334402@grendel.zope.com> The expat-bugs and expat-checkins lists are now only being served by the new lists at libexpat.org. The expat-discuss list will only be served by libexpat.org sometime tomorrow; if you have not migrated your subscriptions, this is a good time to do so. Information on the new lists is available at: http://mail.libexpat.org/mailman-21/listinfo/ -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation
From dliu@bbn.com Wed Aug 28 17:04:36 2002 From: dliu@bbn.com (Daben Liu) Date: Wed, 28 Aug 2002 12:04:36 -0400 Subject: [Expat-discuss] problem adding new encoding to perl XML::Parser Message-ID: The XML::Parser installed from CPAN does not come with GB2312 encoding support. However, I was not able to add the support as instructed by the XML::Encoding package. To add this support, I did the following:

1. Download GB2312.TXT from ftp.unicode.org
2. Download the XML::Encoding 1.01 and get two binaries: make_encmap and compile_encoding
3. run make_encmap as follows: make_encmap GB2312 GB2312.TXT > GB2312.encmap
4. Add expat='yes' to the first line of GB2312.encmap
5. run compile_encoding: compile_encoding -o GB2312.enc GB2312.encmap
6. copy GB2312.enc to /usr/lib/perl5/site_perl/5.005/i386-linux/XML/Parser/Encodings

Then I made the following perl script:

---------------
#!/usr/bin/perl
use XML::Parser;
my $xmlfile = $ARGV[0];
my $parser = new XML::Parser();
my $doc = $parser->parsefile ("$xmlfile");
---------------

I ran this script with a well-formed xml file having a head line as: Following error occurs: unknown encoding at line 1, column 30, byte 30 at /usr/lib/perl5/site_perl/5.005/i386-linux/XML/Parser.pm line 185 Changing the encoding to other supported ones seems to work without error. I'm wondering if there is something I'm missing in the process. Thanks for any suggestions! Daben
From fdrake@acm.org Wed Aug 28 17:18:46 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 28 Aug 2002 12:18:46 -0400 Subject: [Expat-discuss] problem adding new encoding to perl XML::Parser In-Reply-To: References: Message-ID: <15724.63462.634823.827758@grendel.zope.com> Daben Liu writes: > The XML::Parser installed from CPAN does not come with > GB2312 encoding support. However, I was not able to add > the support as instructed by the XML::Encoding package. There's a mailing list specifically for Perl & XML; perhaps someone there might know more about this. I'm afraid I don't know much about encoding support with the Perl bindings for Expat. Information about the perl-xml mailing list is available at: http://aspn.activestate.com/ASPN/Mail/Browse/Threaded/perl-xml -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From fdrake@acm.org Fri Aug 30 20:33:49 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 30 Aug 2002 15:33:49 -0400 Subject: [Expat-discuss] Mailing list migration complete Message-ID: <15727.51357.361390.448843@grendel.zope.com> The mailing list migration is now complete. Only the lists at libexpat.org should be active. The archives of the lists from SourceForge have been integrated, so you only need to go to one place to find email related to Expat. If you have any problems with the lists, please don't hesitate to contact me. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation