From jason at heddway.com Mon Apr 3 17:51:34 2006
From: jason at heddway.com (jason heddings)
Date: Mon, 3 Apr 2006 09:51:34 -0600
Subject: [Expat-discuss] Line Endings
Message-ID: <005401c65736$7c2fdd80$2a00a8c0@enterprise>

Hello-

This may be an established question, but I am unable to find a
reference to it.

I am having a problem parsing files that do not have Unix-style line
endings. If I take the same file and parse it with Windows-style line
endings, the Expat parser errors.

Any ideas?

Thanks,
--jah

From karl at waclawek.net Thu Apr 6 22:26:59 2006
From: karl at waclawek.net (Karl Waclawek)
Date: Thu, 06 Apr 2006 16:26:59 -0400
Subject: [Expat-discuss] best practice to "XMLify" binary data
In-Reply-To: <31543.1144353778@www051.gmx.net>
References: <31543.1144353778@www051.gmx.net>
Message-ID: <44357993.7020400@waclawek.net>

Stefan Momma wrote:
> We have the following situation:
>
> XML documents are generated by an application, and sometimes there are
> binary characters embedded somewhere inside a CDATA section of the document
> which are not valid UTF-8.
> These XML data are parsed using expat in a different application.
>
> Is there a "best practice" what to do as the final step in the XML
> generation process to manipulate the data such that they do not end up with
> invalid token errors for this material inside CDATA? We want to retain as
> much as possible of the original data, so some kind of replacement
> representation would be helpful. Which tools can you recommend for that
> task?
>

Use base64 encoding.

From francis.moore at rawflow.com Fri Apr 7 14:40:28 2006
From: francis.moore at rawflow.com (Frank Moore)
Date: Fri, 07 Apr 2006 13:40:28 +0100
Subject: [Expat-discuss] Setting a CDATA handler in Expat
Message-ID: <44365DBC.4040905@rawflow.com>

Hi,

Can anyone tell me how to set a CDATA event handler in Python using Expat.
I'm currently using the Expat XML Parser (version 1.95.5).

And this is the Python code:

    outFile = file(localPlayerHtml, 'w')
    inFile = file(tempPlayerHtml, 'r')
    htmlparser = saxexts.make_parser()
    htmlhandler = PlayerHtmlHandler(outFile, stream)
    htmlparser.setDocumentHandler(htmlhandler)
    htmlparser.StartCdataSectionHandler = PlayerHtmlHandler.startCDATA   <-- here
    htmlparser.EndCdataSectionHandler = PlayerHtmlHandler.endCDATA       <-- and here
    htmlparser.parseFile(inFile)

The class PlayerHtmlHandler(saxlib.HandlerBase) in file
PlayerHtmlHandler.py contains the following two functions:

    def startCDATA(self):
        print "Starting CDATA..."

    def endCDATA(self):
        print "Finishing CDATA..."

And I'm including the contents of PlayerHtmlHandler using the import
statement:

    from PlayerHtmlHandler import *

For some reason when I run the code, the two CDATA handlers are not called.
Can anyone see what I'm doing wrong?

Many thanks,
Frank.

From bruno at clisp.org Fri Apr 7 14:18:10 2006
From: bruno at clisp.org (Bruno Haible)
Date: Fri, 7 Apr 2006 14:18:10 +0200
Subject: [Expat-discuss] XML_LARGE_SIZE binary compatibility problem
Message-ID: <200604071418.10160.bruno@clisp.org>

Hello,

In expat-2.0.0, a compilation option XML_LARGE_SIZE has been introduced that,
on 32-bit systems, changes the size of the return type of the functions
  XML_GetCurrentLineNumber
  XML_GetCurrentColumnNumber
  XML_GetCurrentByteIndex
from 32-bit ([unsigned] long) to 64-bit ([unsigned] long long).

The ABI is not the same: on some CPUs 64-bit results are returned in memory,
and on the other CPUs, where 64-bit results are returned in registers, the
registers for 32-bit return and 64-bit return may not be the same.

Question 1:
How should a program that wants to use <expat.h> and -lexpat proceed?
The installed expat.h and expat_external.h are the same in both cases.
In other words, someone using the expat.m4 macro in his package and
doing #include <expat.h> will be assuming that the return types are
32-bit large, while in the libexpat.so.1 they are actually 64-bit large.

Question 2:
How should a program that wants to use libexpat.so through dynamic loading
(dlopen ("libexpat.so.1")) proceed? The list of exported symbols that are
visible through dlsym() is the same in both cases. How can it know whether
the library was built with XML_LARGE_SIZE or not?

If these issues are not resolved, there is no way to make my program (GNU gettext)
link against libexpat without risking crashes, and I will have to rewrite my
code to use libxml2 instead of libexpat.

Bruno

From francis.moore at rawflow.com Fri Apr 7 16:37:44 2006
From: francis.moore at rawflow.com (Frank Moore)
Date: Fri, 07 Apr 2006 15:37:44 +0100
Subject: [Expat-discuss] CDATA Handler in Expat
Message-ID: <44367938.8080409@rawflow.com>

Hi,

With reference to the email I sent earlier, to learn how to set the CDATA
handler myself, I've copied the recipe 'Using the SAX2 LexicalHandler
Interface' from
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/84516
into a .py file, added some output ('***************************') to mark
the places where I expect the code to pass through, run it and didn't get
what I thought I would.

I have a CDATA section in an xml file, run the .py against the file and it
does not output what I would expect.

The .xml file has (amongst other things) the following content:

... ...

The output is :

The comment delimiters '<!-- -->' are encoded, but where the CDATA start
and end blocks are, there are just empty lines. I would expect to see some
'***************************************' as per the code. I would also
expect the <![CDATA[ ]]> to be output as it says it does in the code.

I'm using Python 2.4.1, PyXML 0.8.4, and Expat XML Parser 1.95.5.

The recipe however, does mention at the end:

"My tests using Python 2.1, PyXML 0.7 (from CVS) and PIRXX 1.2 indicate
that PIRXX (i.e. Xerces/C) reports all events, xmlproc leaves out the
start/end entity ones, and pyexpat misses those too, in addition to the
start/end DTD events."

When it says "xmlproc leaves out the start/end entity ones, and pyexpat
misses those too" does this include the startCDATA, endCDATA events?
If it does, how am I supposed to pick out the contents of my CDATA section
with the version of Expat that I'm currently using?

Many thanks,
Frank.

###########################################################################################
# echoxml.py

import sys
from xml.sax import sax2exts, saxutils, handler
from xml.sax import SAXNotSupportedException, SAXNotRecognizedException

class EchoGenerator(saxutils.XMLGenerator):

    def __init__(self, out=None, encoding="iso-8859-1"):
        saxutils.XMLGenerator.__init__(self, out, encoding)
        self._in_entity = 0
        self._in_cdata = 0

    def characters(self, content):
        if self._in_entity:
            return
        elif self._in_cdata:
            self._out.write('*******************************************')
            self._out.write(content)
        else:
            saxutils.XMLGenerator.characters(self, content)

    # -- LexicalHandler interface

    def comment(self, content):
        self._out.write('<!--%s-->' % content)

    def startDTD(self, name, public_id, system_id):
        self._out.write('\n')

    def startEntity(self, name):
        self._out.write('&%s;' % name)
        self._in_entity = 1

    def endEntity(self, name):
        self._in_entity = 0

    def startCDATA(self):
        self._out.write('*******************************************')
        self._out.write('<![CDATA[')
        self._in_cdata = 1

    def endCDATA(self):
        self._out.write(']]>')
        self._out.write('*******************************************')
        self._in_cdata = 0

def test(xmlfile):
    parser = sax2exts.make_parser([
        'pirxx',
        'xml.sax.drivers2.drv_xmlproc',
        'xml.sax.drivers2.drv_pyexpat',
    ])
    print >>sys.stderr, "*** Using", parser
    try:
        parser.setFeature(handler.feature_namespaces, 1)
    except (SAXNotRecognizedException, SAXNotSupportedException):
        pass
    try:
        parser.setFeature(handler.feature_validation, 0)
    except (SAXNotRecognizedException, SAXNotSupportedException):
        pass

    saxhandler = EchoGenerator()
    parser.setContentHandler(saxhandler)
    parser.setProperty(handler.property_lexical_handler, saxhandler)
    parser.parse(xmlfile)

if __name__ == "__main__":
    test('build.xml')

From karl at waclawek.net Fri Apr 7 20:47:46 2006
From: karl at waclawek.net (Karl Waclawek)
Date: Fri, 07 Apr 2006 14:47:46 -0400
Subject: [Expat-discuss] XML_LARGE_SIZE binary compatibility problem
In-Reply-To: <200604071418.10160.bruno@clisp.org>
References: <200604071418.10160.bruno@clisp.org>
Message-ID: <4436B3D2.3090406@waclawek.net>

Bruno Haible wrote:
> Hello,
>
> In expat-2.0.0, a compilation option XML_LARGE_SIZE has been introduced that,
> on 32-bit systems, changes the size of the return type of the functions
> XML_GetCurrentLineNumber
> XML_GetCurrentColumnNumber
> XML_GetCurrentByteIndex
> from 32-bit ([unsigned] long) to 64-bit ([unsigned] long long).
>
> The ABI is not the same: on some CPUs 64-bit results are returned in memory,
> and on the other CPUs, where 64-bit results are returned in registers, the
> registers for 32-bit return and 64-bit return may not be the same.
>
> Question 1:
> How should a program that wants to use <expat.h> and -lexpat proceed?
> The installed expat.h and expat_external.h are the same in both cases.
> In other words, someone using the expat.m4 macro in his package and
> doing #include <expat.h> will be assuming that the return types are
> 32-bit large, while in the libexpat.so.1 they are actually 64-bit large.
>
> Question 2:
> How should a program that wants to use libexpat.so through dynamic loading
> (dlopen ("libexpat.so.1")) proceed? The list of exported symbols that are
> visible through dlsym() is the same in both cases. How can it know whether
> the library was built with XML_LARGE_SIZE or not?
>
> If these issues are not resolved, there is no way to make my program (GNU gettext)
> link against libexpat without risking crashes, and I will have to rewrite my
> code to use libxml2 instead of libexpat.
>

The standard build is without XML_LARGE_SIZE defined. Therefore your
program should rely on that ABI. We introduced this compile option only
for special needs, where one needs to get the line/column number when
processing extremely large files.

In this case you would create a custom build of libexpat and you would
know that you need to define XML_LARGE_SIZE when compiling your
application to get the correct header files.

We could add a new return value to XML_GetFeatureList(), so that you can
dynamically detect if XML_LARGE_SIZE was used. Would that help you?

Karl

From nickmacd at gmail.com Fri Apr 7 21:49:48 2006
From: nickmacd at gmail.com (Nick MacDonald)
Date: Fri, 7 Apr 2006 15:49:48 -0400
Subject: [Expat-discuss] XML_LARGE_SIZE binary compatibility problem
In-Reply-To: <4436B3D2.3090406@waclawek.net>
References: <200604071418.10160.bruno@clisp.org> <4436B3D2.3090406@waclawek.net>
Message-ID:

In general, this sounds like a wise idea. Better to know and be sure at
run time (so decent diagnostics could be produced) than to risk bizarre
code behaviours or crashes.

> We could add a new return value to XML_GetFeatureList(), so that you can
> dynamically detect if XML_LARGE_SIZE was used.

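The run-time check discussed in this thread can be sketched from Python, whose `xml.parsers.expat` module exposes the list returned by XML_GetFeatureList() as `features`, a list of (name, value) pairs. This is only an illustration, not part of Expat itself: the helper names `expat_feature` and `built_with_large_size` are made up here, and the sketch assumes that a library built with the option reports it under the name "XML_LARGE_SIZE".

```python
import xml.parsers.expat


def expat_feature(name):
    """Return the value Expat reports for a build feature, or None if absent."""
    # xml.parsers.expat.features mirrors XML_GetFeatureList() as (name, value) pairs.
    for feature_name, value in xml.parsers.expat.features:
        if feature_name == name:
            return value
    return None


def built_with_large_size():
    """Assumption: an XML_LARGE_SIZE build lists a feature with that name."""
    return expat_feature("XML_LARGE_SIZE") is not None


if __name__ == "__main__":
    print("XML_LARGE_SIZE build:", built_with_large_size())
```

A C program loading libexpat through dlopen() would walk the same list by calling XML_GetFeatureList() directly.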
From bruno at clisp.org Fri Apr 7 22:59:43 2006
From: bruno at clisp.org (Bruno Haible)
Date: Fri, 7 Apr 2006 22:59:43 +0200
Subject: [Expat-discuss] XML_LARGE_SIZE binary compatibility problem
In-Reply-To: <4436B3D2.3090406@waclawek.net>
References: <200604071418.10160.bruno@clisp.org> <4436B3D2.3090406@waclawek.net>
Message-ID: <200604072259.43433.bruno@clisp.org>

Karl Waclawek wrote:
> The standard build is without XML_LARGE_SIZE defined. Therefore your
> program should rely on that ABI. We introduced this compile option only for special
> needs, where one needs to get the line/column number when processing extremely
> large files.

This compile option is so prominently mentioned in the README that it's likely
that some distributor (Linux distributor, mingw, fink, netbsd, freebsd, openbsd,
openpkg, ...) will turn it on. Remember that FreeBSD made 'off_t' a 64-bit type
a few years ago, at a time when this seemed totally unreasonable?

> In this case you would create a custom build of libexpat and you would know
> that you need to define XML_LARGE_SIZE when compiling your application to get
> the correct header files.

The question is: How would an application know this? In the current state,
it cannot know it by looking at expat.h. The only way I can distinguish the
two ABIs - as a human - is to disassemble selected functions of libexpat.so.

> We could add a new return value to XML_GetFeatureList(), so that you can
> dynamically detect if XML_LARGE_SIZE was used. Would that help you?

Thank you, yes, such a function returning, say, a 'const char * const *' pointing
to a NULL-terminated array of feature strings, would fix the second case (loading
via dlopen).

For the first case, #include <expat.h>, I would suggest that in a build with
XML_LARGE_SIZE, the lines

  #ifndef XML_LARGE_SIZE
  # define XML_LARGE_SIZE 1
  #endif

be inserted at or near the beginning of expat_external.h. Then #include <expat.h>
can be used without prior knowledge, and the expat.m4 macro does not need to be
modified.

Bruno

From karl at waclawek.net Sat Apr 8 04:52:40 2006
From: karl at waclawek.net (Karl Waclawek)
Date: Fri, 07 Apr 2006 22:52:40 -0400
Subject: [Expat-discuss] XML_LARGE_SIZE binary compatibility problem
In-Reply-To: <200604072259.43433.bruno@clisp.org>
References: <200604071418.10160.bruno@clisp.org> <4436B3D2.3090406@waclawek.net> <200604072259.43433.bruno@clisp.org>
Message-ID: <44372578.8020407@waclawek.net>

Bruno Haible wrote:
> Karl Waclawek wrote:
>
>> The standard build is without XML_LARGE_SIZE defined. Therefore your
>> program should rely on that ABI. We introduced this compile option only for special
>> needs, where one needs to get the line/column number when processing extremely
>> large files.
>>
>
> This compile option is so prominently mentioned in the README that it's likely
> that some distributor (Linux distributor, mingw, fink, netbsd, freebsd, openbsd,
> openpkg, ...) will turn it on. Remember that FreeBSD made 'off_t' a 64-bit type
> a few years ago, at a time when this seemed totally unreasonable?
>

I would think that such seasoned programmers would realize that this is
a breaking change. Anyway, I have also added a short paragraph in the
README pointing this out.

>
>> In this case you would create a custom build of libexpat and you would know
>> that you need to define XML_LARGE_SIZE when compiling your application to get
>> the correct header files.
>>
>
> The question is: How would an application know this? In the current state,
> it cannot know it by looking at expat.h. The only way I can distinguish the
> two ABIs - as a human - is to disassemble selected functions of libexpat.so.
>
>
>> We could add a new return value to XML_GetFeatureList(), so that you can
>> dynamically detect if XML_LARGE_SIZE was used. Would that help you?
>>
>
> Thank you, yes, such a function returning, say, a 'const char * const *' pointing
> to a NULL-terminated array of feature strings, would fix the second case (loading
> via dlopen).
>

OK, just added to CVS. New enum value XML_FEATURE_LARGE_SIZE,
new return value for XML_GetFeatureList().
Not sure however, if anonymous CVS is currently in sync with developer
CVS on SourceForge.

> For the first case, #include <expat.h>, I would suggest that in a build with
> XML_LARGE_SIZE, the lines
>
> #ifndef XML_LARGE_SIZE
> # define XML_LARGE_SIZE 1
> #endif
>
> be inserted at or near the beginning of expat_external.h. Then #include <expat.h>
> can be used without prior knowledge, and the expat.m4 macro does not need to be
> modified.
>
>

Isn't that something you would have to ask the "builder"?

Karl

From weigelt at metux.de Tue Apr 11 13:08:13 2006
From: weigelt at metux.de (Enrico Weigelt)
Date: Tue, 11 Apr 2006 13:08:13 +0200
Subject: [Expat-discuss] test - mail.libexpat.org seems offline
In-Reply-To: <200603260150.45592.fdrake@acm.org>
References: <20060324000713.GA9893@nibiru.local> <200603260150.45592.fdrake@acm.org>
Message-ID: <20060411110813.GB5039@nibiru.local>

* Fred L. Drake, Jr. wrote:

Hi,

> The DNS servers that serve libexpat.org have been suffering severe DDoS
> attacks lately; the administrators for those servers are still working
> against those attacks. This has affected other domains I have registered
> there as well.

If you like, I could offer 3 additional backup servers.

cu
--
---------------------------------------------------------------------
 Enrico Weigelt    ==   metux IT service
  phone:     +49 36207 519931         www:       http://www.metux.de/
  fax:       +49 36207 519932         email:     contact at metux.de
  cellphone: +49 174 7066481
---------------------------------------------------------------------

From pmorange at gmail.com Fri Apr 14 16:19:52 2006
From: pmorange at gmail.com (Philippe Morange)
Date: Fri, 14 Apr 2006 15:19:52 +0100
Subject: [Expat-discuss] eXpat and Windows 64-bits
Message-ID:

Hi,
I need to port eXpat under Windows 64-bits.
Has it already been done ?
If not, what would I need to know, before trying that ?
I will probably use Microsoft Visual Studio to do that...

Thanks,
Philippe Morange.
Mecalog.

From karl at waclawek.net Fri Apr 14 16:56:40 2006
From: karl at waclawek.net (Karl Waclawek)
Date: Fri, 14 Apr 2006 10:56:40 -0400
Subject: [Expat-discuss] eXpat and Windows 64-bits
In-Reply-To:
References:
Message-ID: <443FB828.3070001@waclawek.net>

Philippe Morange wrote:
> Hi,
> I need to port eXpat under Windows 64-bits.
> Has it already been done ?
> If not, what would I need to know, before trying that ?
> I will probably use Microsoft Visual Studio to do that...
>
>

The best thing is simply to try building it.
At this point I know of no reason why this build should fail.

Karl

From karl at waclawek.net Fri Apr 14 19:22:38 2006
From: karl at waclawek.net (Karl Waclawek)
Date: Fri, 14 Apr 2006 13:22:38 -0400
Subject: [Expat-discuss] eXpat and Windows 64-bits
In-Reply-To:
References:
Message-ID: <443FDA5E.1030103@waclawek.net>

Philippe Morange wrote:
> Hi,
> I need to port eXpat under Windows 64-bits.
> Has it already been done ?
> If not, what would I need to know, before trying that ?
> I will probably use Microsoft Visual Studio to do that...
>
>

Actually, I built Expat on VS 2005 with 64bit portability warnings
turned on, and found one minor issue with the MUST_CONVERT macro.
I checked in this patch for xmlparse.c:

-#define MUST_CONVERT(enc, s) (!(enc)->isUtf16 || (((unsigned long)s) & 1))
+#define MUST_CONVERT(enc, s) (!(enc)->isUtf16 || ((s - NULL) & 1))

Some of the demo programs have integer conversion warnings as well,
but it should be safe to cast them away.

Karl

From paul at pando.com Fri Apr 14 20:30:02 2006
From: paul at pando.com (Paul Davies)
Date: Fri, 14 Apr 2006 14:30:02 -0400
Subject: [Expat-discuss] Universal Binary for Mac
Message-ID: <20060414182849.DF4A6FEE56@dkny.pando.com>

Has anyone built expat for a Universal Binary for the Mac?

thanks
Paul
--

From amelikian at positronnetworks.com Tue Apr 18 15:49:26 2006
From: amelikian at positronnetworks.com (Melikian, Alex)
Date: Tue, 18 Apr 2006 09:49:26 -0400
Subject: [Expat-discuss] VMWare Linux Build
Message-ID: <567E04E775EA244A9F7E4140EAEB411104B9C208@srvexchg2.positron.qc.ca>

Hello,

I'm new to expat. I tried installing it on a virtual Linux Machine
(VMWare Red Hat Linux emulated on a Windows XP), hence attempted the
'configure', 'make' and 'make install' commands. The 'configure' command
ran w/o any problems, but when executing the 'make' command, the
execution stopped with this error:

lib/xmlparse.c:77:2: error: #error memmove does not exist on this
platform, nor is a substitute available.

Does this mean that expat cannot be installed on VMWare machines, or
is this a more common problem that can be worked around?

I apologize in advance if this has been discussed before.

Thanks for your help,

alex

From karl at waclawek.net Tue Apr 18 17:42:17 2006
From: karl at waclawek.net (Karl Waclawek)
Date: Tue, 18 Apr 2006 11:42:17 -0400
Subject: [Expat-discuss] VMWare Linux Build
In-Reply-To: <567E04E775EA244A9F7E4140EAEB411104B9C208@srvexchg2.positron.qc.ca>
References: <567E04E775EA244A9F7E4140EAEB411104B9C208@srvexchg2.positron.qc.ca>
Message-ID: <444508D9.2090201@waclawek.net>

Melikian, Alex wrote:
> Hello,
>
> I'm new to expat. I tried installing it on a virtual Linux Machine
> (VMWare Red Hat Linux emulated on a Windows XP), hence attempted the
> 'configure', 'make' and 'make install' commands. The 'configure' command
> ran w/o any problems, but when executing the 'make' command, the
> execution stopped with this error:
>
> lib/xmlparse.c:77:2: error: #error memmove does not exist on this
> platform, nor is a substitute available.
>
> Does this mean that expat cannot be installed on VMWare machines, or
> is this a more common problem that can be worked around?
>

Try commenting out the #ifndef HAVE_MEMMOVE block and rebuild.
If it builds OK and Expat works, the problem might be with configure.
In this case Greg Stein would likely know how to fix it.

Karl

From amelikian at positronnetworks.com Wed Apr 19 16:11:20 2006
From: amelikian at positronnetworks.com (Melikian, Alex)
Date: Wed, 19 Apr 2006 10:11:20 -0400
Subject: [Expat-discuss] VMWare Linux Build
Message-ID: <567E04E775EA244A9F7E4140EAEB411104B9CD15@srvexchg2.positron.qc.ca>

Thanks for the suggestion, it appears to be a problem with the configure
script. I'm not sure why, the configure script does realize the
"memmove" exists but for some reason the HAVE_MEMMOVE macro is not
defined. I commented out that MACRO section and it fully installed
without problems.

Thanks again.

alex

-----Original Message-----
From: Karl Waclawek [mailto:karl at waclawek.net]
Sent: Tuesday, April 18, 2006 11:42 AM
To: Melikian, Alex
Cc: expat-discuss at libexpat.org
Subject: Re: [Expat-discuss] VMWare Linux Build

Melikian, Alex wrote:
> Hello,
>
> I'm new to expat. I tried installing it on a virtual Linux Machine
> (VMWare Red Hat Linux emulated on a Windows XP), hence attempted the
> 'configure', 'make' and 'make install' commands. The 'configure' command
> ran w/o any problems, but when executing the 'make' command, the
> execution stopped with this error:
>
> lib/xmlparse.c:77:2: error: #error memmove does not exist on this
> platform, nor is a substitute available.
>
> Does this mean that expat cannot be installed on VMWare machines, or
> is this a more common problem that can be worked around?
>

Try commenting out the #ifndef HAVE_MEMMOVE block and rebuild.
If it builds OK and Expat works, the problem might be with configure.
In this case Greg Stein would likely know how to fix it.

Karl

From karl at waclawek.net Wed Apr 19 16:22:16 2006
From: karl at waclawek.net (Karl Waclawek)
Date: Wed, 19 Apr 2006 10:22:16 -0400
Subject: [Expat-discuss] VMWare Linux Build
In-Reply-To: <567E04E775EA244A9F7E4140EAEB411104B9CD15@srvexchg2.positron.qc.ca>
References: <567E04E775EA244A9F7E4140EAEB411104B9CD15@srvexchg2.positron.qc.ca>
Message-ID: <44464798.506@waclawek.net>

Melikian, Alex wrote:
> Thanks for the suggestion, it appears to be a problem with the configure
> script. I'm not sure why, the configure script does realize the
> "memmove" exists but for some reason the HAVE_MEMMOVE macro is not
> defined. I commented out that MACRO section and it fully installed
> without problems.
>

Great! Alternatively, if you don't want to change the source, just
define HAVE_MEMMOVE in the header file created by configure (I think it
is called expat_config.h).

Karl

From halloko at hotmail.com Sun Apr 23 11:01:10 2006
From: halloko at hotmail.com (Søren Dreijer)
Date: Sun, 23 Apr 2006 11:01:10 +0200
Subject: [Expat-discuss] Cancelling Parsing
Message-ID:

Hi,

I'm looking for an easy way to cancel parsing of an XML document. For
instance, imagine you discovered some invalid text input in a specific
tag. What would be the best-practice way to cancel parsing the rest of
the document?

Thanks,
Søren Dreijer

From gnschmidt at ukonline.co.uk Sun Apr 23 15:18:42 2006
From: gnschmidt at ukonline.co.uk (Gerald Schmidt)
Date: Sun, 23 Apr 2006 14:18:42 +0100
Subject: [Expat-discuss] Cancelling Parsing
References:
Message-ID: <000c01c666d8$72ce0030$f41a86d4@Family>

enum XML_Status XMLCALL
XML_StopParser(XML_Parser p, XML_Bool resumable);

Stops parsing, causing XML_Parse or XML_ParseBuffer to return.
Must be called from within a call-back handler, except when aborting
(when resumable is XML_FALSE) an already suspended parser.

(From the documentation that ships with the Dev-C++ Expat package.)

----- Original Message -----
From: "Søren Dreijer"
To:
Sent: Sunday, April 23, 2006 10:01 AM
Subject: [Expat-discuss] Cancelling Parsing

Hi,

I'm looking for an easy way to cancel parsing of an XML document. For
instance, imagine you discovered some invalid text input in a specific
tag. What would be the best-practice way to cancel parsing the rest of
the document?

Thanks,
Søren Dreijer

From halloko at hotmail.com Sun Apr 23 16:27:00 2006
From: halloko at hotmail.com (Søren Dreijer)
Date: Sun, 23 Apr 2006 16:27:00 +0200
Subject: [Expat-discuss] Cancelling Parsing
Message-ID:

Ah, thanks a lot!

From jason at heddings.com Tue Apr 25 21:45:42 2006
From: jason at heddings.com (jason heddings)
Date: Tue, 25 Apr 2006 13:45:42 -0600
Subject: [Expat-discuss] Line Endings
In-Reply-To: <24933-75905@sneakemail.com>
Message-ID: <004a01c668a0$d6d04fb0$6500a8c0@enterprise>

I ended up finding the problem... It had to do with the way the file
was being opened, rather than a line-ending issue.

--jah

-----Original Message-----
From: Mark [mailto:11mjazbdg02 at sneakemail.com]
Sent: Tuesday, 04 April, 2006 06:09
To: jason at heddings.com
Subject: RE: [Expat-discuss] Line Endings

> I am having a problem parsing files that do not have Unix-style line
> endings. If I take the same file and parse it with Windows-style line
> endings, the Expat parser errors.

Can you be more specific?

I have parsed (with expat) files with UNIX and Windows line endings
without many problems. The only time I found a difficulty was if the
line endings were significant parts of the data. The only workaround
for this that I could design is to use custom "escape sequences" to
represent the different characters of the line endings as any XML
parser will "normalise" line endings to just a line feed.

Mark.

From elver.loho at gmail.com Fri Apr 28 14:13:47 2006
From: elver.loho at gmail.com (Elver Loho)
Date: Fri, 28 Apr 2006 15:13:47 +0300
Subject: [Expat-discuss] Is this a bug?
Message-ID: <4a4bc5cd0604280513s4fc62751t2267bd0466cd7ad4@mail.gmail.com>

Hiya, all!

I have no idea whether this is a bug or not, but I'd like to get
people's opinion on it.

I'm using Python's xml.parsers.expat to parse an RSS feed from
http://blog.kriso.ee/feed/ to generate a static component for our
webstore sidebar at http://www.kriso.ee/

The problem is with the title of the latest entry: "Nip/Tuck'i
stsenarist filmi kirjutamas" -- I have a method registered as
CharacterDataHandler and for the title, it returns *three times*

So the line:

Nip/Tuck'i stsenarist filmi kirjutamas

Ends up causing calls to:
StartElementHandler("title")
CharacterDataHandler("Nip/Tuck")
CharacterDataHandler("'")
CharacterDataHandler("i stsenarist filmi kirjutamas")
EndElementHandler("title")

It's trivial to work around it, but what's going on here?

Elver

From karl at waclawek.net Fri Apr 28 15:35:06 2006
From: karl at waclawek.net (Karl Waclawek)
Date: Fri, 28 Apr 2006 09:35:06 -0400
Subject: [Expat-discuss] Is this a bug?
In-Reply-To: <4a4bc5cd0604280513s4fc62751t2267bd0466cd7ad4@mail.gmail.com>
References: <4a4bc5cd0604280513s4fc62751t2267bd0466cd7ad4@mail.gmail.com>
Message-ID: <44521A0A.2060405@waclawek.net>

Elver Loho wrote:
> Hiya, all!
>
> I have no idea whether this is a bug or not, but I'd like to get
> people's opinion on it.
>
> I'm using Python's xml.parsers.expat to parse an RSS feed from
> http://blog.kriso.ee/feed/ to generate a static component for our
> webstore sidebar at http://www.kriso.ee/
>
> The problem is with the title of the latest entry: "Nip/Tuck'i
> stsenarist filmi kirjutamas" -- I have a method registered as
> CharacterDataHandler and for the title, it returns *three times*
>
> So the line:
>
> Nip/Tuck'i stsenarist filmi kirjutamas
>
> Ends up causing calls to:
> StartElementHandler("title")
> CharacterDataHandler("Nip/Tuck")
> CharacterDataHandler("'")
> CharacterDataHandler("i stsenarist filmi kirjutamas")
> EndElementHandler("title")
>
> It's trivial to work around it, but what's going on here?
>

That is normal behaviour. Expat may split character data into any
number of call-backs. You need to buffer them.

Karl
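
The buffering described in the reply above can be sketched with Python's xml.parsers.expat as follows. This is a minimal illustration rather than code from the thread: the class name `TitleCollector` and the function `collect_titles` are hypothetical, it is written for Python 3, and it assumes the text of interest sits in `title` elements as in the RSS example.

```python
import xml.parsers.expat


class TitleCollector:
    """Buffers character data so each <title> text is handled as one string."""

    def __init__(self):
        self.titles = []     # completed <title> texts
        self._chunks = None  # list of pieces while inside <title>, else None

    def start_element(self, name, attrs):
        if name == "title":
            self._chunks = []

    def character_data(self, data):
        # Expat may deliver one run of text in several call-backs.
        if self._chunks is not None:
            self._chunks.append(data)

    def end_element(self, name):
        if name == "title" and self._chunks is not None:
            # Join the pieces only once the element is complete.
            self.titles.append("".join(self._chunks))
            self._chunks = None


def collect_titles(xml_bytes):
    """Parse a whole document and return the text of every <title> element."""
    collector = TitleCollector()
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = collector.start_element
    parser.CharacterDataHandler = collector.character_data
    parser.EndElementHandler = collector.end_element
    parser.Parse(xml_bytes, True)
    return collector.titles
```

Joining the pieces in the end-element handler means the number of call-backs no longer matters. The Python binding also offers a `buffer_text` attribute on the parser object that asks it to coalesce adjacent character data before calling the handler, which may be enough when no per-element state is needed.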