From lewicki at provider.pl Mon Mar 1 06:03:06 2004
From: lewicki at provider.pl (Pawel Lewicki)
Date: Mon Mar 1 06:40:43 2004
Subject: [XML-SIG] Re: Some questions from a beginner
References: <200402290748.57630.derekfountain@yahoo.co.uk>
Message-ID:

>"Fredrik Lundh" wrote in message news:c1s48o$rd6$1@sea.gmane.org...
...
> if the source file is relatively structured (e.g. it contains many
> thousand records, all having an identical structure), you can use
> an incremental DOM parsing approach.
>
>
> here's an example for the elementtree library:
>
> http://effbot.org/zone/element-pull.htm
>
> I'm sure you can use a similar approach with many other DOM libraries.
>
>

Thanx a lot. ElementTree library works perfectly for me.

Pawel

From gward at python.net Mon Mar 1 07:41:12 2004
From: gward at python.net (gward@python.net)
Date: Mon Mar 1 07:41:16 2004
Subject: [XML-SIG] (no subject)
Message-ID: <200403011241.i21Ce4vF019904@mxzilla2.xs4all.nl>

From lewicki at provider.pl Tue Mar 2 04:36:08 2004
From: lewicki at provider.pl (Pawel Lewicki)
Date: Tue Mar 2 04:36:36 2004
Subject: [XML-SIG] Re: Some questions from a beginner
References: <200402290748.57630.derekfountain@yahoo.co.uk>
Message-ID:

> >"Fredrik Lundh" wrote in message
> news:c1s48o$rd6$1@sea.gmane.org...
> ...
> > if the source file is relatively structured (e.g. it contains many
> > thousand records, all having an identical structure), you can use
> > an incremental DOM parsing approach.
> >
> >
> > here's an example for the elementtree library:
> >
> > http://effbot.org/zone/element-pull.htm
> >
> > I'm sure you can use a similar approach with many other DOM libraries.
> >
> >
>
> Thanx a lot. ElementTree library works perfectly for me.
>
> Pawel

But I still have a problem. Is ElementTree considered hardrock stable? I
have a "memory protection violation" error after parsing about 60MB of xml
file.

Pawel

From xu at reflexsecurity.com Tue Mar 2 16:21:57 2004
From: xu at reflexsecurity.com (J.
Xu)
Date: Tue Mar 2 18:14:14 2004
Subject: [XML-SIG] python xml parser
Message-ID:

How can I pass stdout of processes, or output stream of network channels
(like ssh) to a python xml parser (either through DOM or SAX) so that
xml can be used as the communication tool among different components of
the software? I think the standard python xml reader expects a file-like
object or string as the input, but not sure if it can take a stream.
Any suggestions are highly appreciated.

From chrish at cryptocard.com Wed Mar 3 08:26:26 2004
From: chrish at cryptocard.com (Chris Herborth)
Date: Wed Mar 3 08:20:47 2004
Subject: [XML-SIG] python xml parser
In-Reply-To:
References:
Message-ID: <4045DD02.30004@cryptocard.com>

J. Xu wrote:
> How can I pass stdout of processes, or output stream of network channels
> (like ssh) to a python xml parser (either through DOM or SAX) so that
> xml can be used as the communication tool among different components of
> the software? I think the standard python xml reader expects a file-like
> object or string as the input, but not sure if it can take a stream.
> Any suggestions are highly appreciated.

If you're not passing large amounts of data around, httplib looks pretty
easy to hook up:

# Pretty much straight from the Library Reference:
import httplib

conn = httplib.HTTPConnection("www.python.org")
conn.request("GET", "/index.html")
r1 = conn.getresponse()
data1 = r1.read()

# pass data1 to an XML processor's parseString() or similar method
# exercise for the reader ;-)

conn.close()

--
Chris Herborth chrish@cryptocard.com
Documentation Overlord, CRYPTOCard Corp. http://www.cryptocard.com/
Never send a monster to do the work of an evil scientist.
Postatem obscuri lateris nescitis.
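For the "stdout of processes" half of the question, a minimal sketch may help. It is not from the thread: it assumes a current Python 3 with the standard-library subprocess module (which postdates these messages), and the child command is a made-up producer that prints one small document. The only point it illustrates is that a process's stdout pipe is already a file-like object that xml.dom.minidom.parse() will read.

```python
# Sketch only: Python 3, standard library. The producer command below is
# hypothetical; substitute whatever process actually emits your XML.
import subprocess
import sys
import xml.dom.minidom


def parse_process_output(argv):
    """Run argv and parse whatever XML it writes to stdout."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    try:
        # proc.stdout is a file-like object; minidom reads it until EOF.
        return xml.dom.minidom.parse(proc.stdout)
    finally:
        proc.stdout.close()
        proc.wait()


# Hypothetical producer: a child Python process printing one document.
producer = [sys.executable, "-c", "print('<msg><body>hello</body></msg>')"]
doc = parse_process_output(producer)
body_text = doc.getElementsByTagName("body")[0].firstChild.data
```

Note that minidom builds the whole tree only once the producer closes its end of the pipe, so this suits one document per process run rather than a long-lived stream of messages.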
From kissfrito at yahoo.com Wed Mar 3 15:43:51 2004
From: kissfrito at yahoo.com (Matt Lombardo)
Date: Wed Mar 3 17:46:47 2004
Subject: [XML-SIG] question about pyXML
Message-ID: <20040303204351.15973.qmail@web20508.mail.yahoo.com>

Hi, my name is Matt and I have a question regarding your software. Just to
note, I am working on a Solaris 8 box. I am trying to get XML Diff 0.6.4 up
and running. The administrator of this system recently installed Python
2.3.3, but now when I try to get PyXML working and run
python setup.py build, I get:

not copying xml/FtCore.py (output up-to-date) not copying xml/__init__.py (output up-to-date) not copying xml/ns.py (output up-to-date) not copying xml/dom/Attr.py (output up-to-date) not copying xml/dom/CDATASection.py (output up-to-date) not copying xml/dom/CharacterData.py (output up-to-date) not copying xml/dom/Comment.py (output up-to-date) not copying xml/dom/DOMImplementation.py (output up-to-date) not copying xml/dom/Document.py (output up-to-date) not copying xml/dom/DocumentFragment.py (output up-to-date) not copying xml/dom/DocumentType.py (output up-to-date) not copying xml/dom/Element.py (output up-to-date) not copying xml/dom/Entity.py (output up-to-date) not copying xml/dom/EntityReference.py (output up-to-date) not copying xml/dom/Event.py (output up-to-date) not copying xml/dom/FtNode.py (output up-to-date) not copying xml/dom/MessageSource.py (output up-to-date) not copying xml/dom/NamedNodeMap.py (output up-to-date) not copying xml/dom/NodeFilter.py (output up-to-date) not copying xml/dom/NodeIterator.py (output up-to-date) not copying xml/dom/NodeList.py (output up-to-date) not copying xml/dom/Notation.py (output up-to-date) not copying xml/dom/ProcessingInstruction.py (output up-to-date) not copying xml/dom/Range.py (output up-to-date) not copying xml/dom/Text.py (output up-to-date) not copying xml/dom/TreeWalker.py (output up-to-date) not copying xml/dom/__init__.py (output up-to-date) not copying 
xml/dom/domreg.py (output up-to-date) not copying xml/dom/expatbuilder.py (output up-to-date) not copying xml/dom/javadom.py (output up-to-date) not copying xml/dom/minicompat.py (output up-to-date) not copying xml/dom/minidom.py (output up-to-date) not copying xml/dom/minitraversal.py (output up-to-date) not copying xml/dom/pulldom.py (output up-to-date) not copying xml/dom/xmlbuilder.py (output up-to-date) not copying xml/dom/html/GenerateHtml.py (output up-to-date) not copying xml/dom/html/HTMLAnchorElement.py (output up-to-date) not copying xml/dom/html/HTMLAppletElement.py (output up-to-date) not copying xml/dom/html/HTMLAreaElement.py (output up-to-date) not copying xml/dom/html/HTMLBRElement.py (output up-to-date) not copying xml/dom/html/HTMLBaseElement.py (output up-to-date) not copying xml/dom/html/HTMLBaseFontElement.py (output up-to-date) not copying xml/dom/html/HTMLBodyElement.py (output up-to-date) not copying xml/dom/html/HTMLButtonElement.py (output up-to-date) not copying xml/dom/html/HTMLCollection.py (output up-to-date) not copying xml/dom/html/HTMLDListElement.py (output up-to-date) not copying xml/dom/html/HTMLDOMImplementation.py (output up-to-date) not copying xml/dom/html/HTMLDirectoryElement.py (output up-to-date) not copying xml/dom/html/HTMLDivElement.py (output up-to-date) not copying xml/dom/html/HTMLDocument.py (output up-to-date) not copying xml/dom/html/HTMLElement.py (output up-to-date) not copying xml/dom/html/__init__.py (output up-to-date) not copying xml/dom/html/HTMLFieldSetElement.py (output up-to-date) not copying xml/dom/html/HTMLFontElement.py (output up-to-date) not copying xml/dom/html/HTMLFormElement.py (output up-to-date) not copying xml/dom/html/HTMLFrameElement.py (output up-to-date) not copying xml/dom/html/HTMLFrameSetElement.py (output up-to-date) not copying xml/dom/html/HTMLHRElement.py (output up-to-date) not copying xml/dom/html/HTMLHeadElement.py (output up-to-date) not copying 
xml/dom/html/HTMLHeadingElement.py (output up-to-date) not copying xml/dom/html/HTMLHtmlElement.py (output up-to-date) not copying xml/dom/html/HTMLIFrameElement.py (output up-to-date) not copying xml/dom/html/HTMLImageElement.py (output up-to-date) not copying xml/dom/html/HTMLInputElement.py (output up-to-date) not copying xml/dom/html/HTMLIsIndexElement.py (output up-to-date) not copying xml/dom/html/HTMLLIElement.py (output up-to-date) not copying xml/dom/html/HTMLLabelElement.py (output up-to-date) not copying xml/dom/html/HTMLLegendElement.py (output up-to-date) not copying xml/dom/html/HTMLLinkElement.py (output up-to-date) not copying xml/dom/html/HTMLMapElement.py (output up-to-date) not copying xml/dom/html/HTMLMenuElement.py (output up-to-date) not copying xml/dom/html/HTMLMetaElement.py (output up-to-date) not copying xml/dom/html/HTMLModElement.py (output up-to-date) not copying xml/dom/html/HTMLOListElement.py (output up-to-date) not copying xml/dom/html/HTMLObjectElement.py (output up-to-date) not copying xml/dom/html/HTMLOptGroupElement.py (output up-to-date) not copying xml/dom/html/HTMLOptionElement.py (output up-to-date) not copying xml/dom/html/HTMLParagraphElement.py (output up-to-date) not copying xml/dom/html/HTMLParamElement.py (output up-to-date) not copying xml/dom/html/HTMLPreElement.py (output up-to-date) not copying xml/dom/html/HTMLQuoteElement.py (output up-to-date) not copying xml/dom/html/HTMLScriptElement.py (output up-to-date) not copying xml/dom/html/HTMLSelectElement.py (output up-to-date) not copying xml/dom/html/HTMLStyleElement.py (output up-to-date) not copying xml/dom/html/HTMLTableCaptionElement.py (output up-to-date) not copying xml/dom/html/HTMLTableCellElement.py (output up-to-date) not copying xml/dom/html/HTMLTableColElement.py (output up-to-date) not copying xml/dom/html/HTMLTableElement.py (output up-to-date) not copying xml/dom/html/HTMLTableRowElement.py (output up-to-date) not copying 
xml/dom/html/HTMLTableSectionElement.py (output up-to-date) not copying xml/dom/html/HTMLTextAreaElement.py (output up-to-date) not copying xml/dom/html/HTMLTitleElement.py (output up-to-date) not copying xml/dom/html/HTMLUListElement.py (output up-to-date) not copying xml/dom/ext/Dom2Sax.py (output up-to-date) not copying xml/dom/ext/Printer.py (output up-to-date) not copying xml/dom/ext/Visitor.py (output up-to-date) not copying xml/dom/ext/XHtml2HtmlPrinter.py (output up-to-date) not copying xml/dom/ext/XHtmlPrinter.py (output up-to-date) not copying xml/dom/ext/__init__.py (output up-to-date) not copying xml/dom/ext/c14n.py (output up-to-date) not copying xml/dom/ext/reader/HtmlLib.py (output up-to-date) not copying xml/dom/ext/reader/HtmlSax.py (output up-to-date) not copying xml/dom/ext/reader/PyExpat.py (output up-to-date) not copying xml/dom/ext/reader/Sax.py (output up-to-date) not copying xml/dom/ext/reader/Sax2.py (output up-to-date) not copying xml/dom/ext/reader/Sax2Lib.py (output up-to-date) not copying xml/dom/ext/reader/Sgmlop.py (output up-to-date) not copying xml/dom/ext/reader/__init__.py (output up-to-date) not copying xml/marshal/__init__.py (output up-to-date) not copying xml/marshal/generic.py (output up-to-date) not copying xml/marshal/wddx.py (output up-to-date) not copying xml/unicode/__init__.py (output up-to-date) not copying xml/unicode/iso8859.py (output up-to-date) not copying xml/unicode/utf8_iso.py (output up-to-date) not copying xml/parsers/__init__.py (output up-to-date) not copying xml/parsers/expat.py (output up-to-date) not copying xml/parsers/sgmllib.py (output up-to-date) not copying xml/parsers/xmlproc/__init__.py (output up-to-date) not copying xml/parsers/xmlproc/_outputters.py (output up-to-date) not copying xml/parsers/xmlproc/catalog.py (output up-to-date) not copying xml/parsers/xmlproc/charconv.py (output up-to-date) not copying xml/parsers/xmlproc/dtdparser.py (output up-to-date) not copying 
xml/parsers/xmlproc/errors.py (output up-to-date) not copying xml/parsers/xmlproc/namespace.py (output up-to-date) not copying xml/parsers/xmlproc/utils.py (output up-to-date) not copying xml/parsers/xmlproc/xcatalog.py (output up-to-date) not copying xml/parsers/xmlproc/xmlapp.py (output up-to-date) not copying xml/parsers/xmlproc/xmldtd.py (output up-to-date) not copying xml/parsers/xmlproc/xmlproc.py (output up-to-date) not copying xml/parsers/xmlproc/xmlutils.py (output up-to-date) not copying xml/parsers/xmlproc/xmlval.py (output up-to-date) not copying xml/sax/__init__.py (output up-to-date) not copying xml/sax/_exceptions.py (output up-to-date) not copying xml/sax/expatreader.py (output up-to-date) not copying xml/sax/handler.py (output up-to-date) not copying xml/sax/sax2exts.py (output up-to-date) not copying xml/sax/saxexts.py (output up-to-date) not copying xml/sax/saxlib.py (output up-to-date) not copying xml/sax/saxutils.py (output up-to-date) not copying xml/sax/writer.py (output up-to-date) not copying xml/sax/xmlreader.py (output up-to-date) not copying xml/sax/drivers/__init__.py (output up-to-date) not copying xml/sax/drivers/drv_htmllib.py (output up-to-date) not copying xml/sax/drivers/drv_ltdriver.py (output up-to-date) not copying xml/sax/drivers/drv_ltdriver_val.py (output up-to-date) not copying xml/sax/drivers/drv_pyexpat.py (output up-to-date) not copying xml/sax/drivers/drv_sgmllib.py (output up-to-date) not copying xml/sax/drivers/drv_sgmlop.py (output up-to-date) not copying xml/sax/drivers/drv_xmldc.py (output up-to-date) not copying xml/sax/drivers/drv_xmllib.py (output up-to-date) not copying xml/sax/drivers/drv_xmlproc.py (output up-to-date) not copying xml/sax/drivers/drv_xmlproc_val.py (output up-to-date) not copying xml/sax/drivers/drv_xmltoolkit.py (output up-to-date) not copying xml/sax/drivers/pylibs.py (output up-to-date) not copying xml/sax/drivers2/__init__.py (output up-to-date) not copying xml/sax/drivers2/drv_htmllib.py 
(output up-to-date) not copying xml/sax/drivers2/drv_javasax.py (output up-to-date) not copying xml/sax/drivers2/drv_pyexpat.py (output up-to-date) not copying xml/sax/drivers2/drv_sgmllib.py (output up-to-date) not copying xml/sax/drivers2/drv_sgmlop.py (output up-to-date) not copying xml/sax/drivers2/drv_sgmlop_html.py (output up-to-date) not copying xml/sax/drivers2/drv_xmlproc.py (output up-to-date) not copying xml/utils/__init__.py (output up-to-date) not copying xml/utils/characters.py (output up-to-date) not copying xml/utils/iso8601.py (output up-to-date) not copying xml/utils/qp_xml.py (output up-to-date) not copying xml/schema/__init__.py (output up-to-date) not copying xml/schema/trex.py (output up-to-date) not copying xml/xpath/BuiltInExtFunctions.py (output up-to-date) not copying xml/xpath/Context.py (output up-to-date) not copying xml/xpath/Conversions.py (output up-to-date) not copying xml/xpath/CoreFunctions.py (output up-to-date) not copying xml/xpath/ExpandedNameWrapper.py (output up-to-date) not copying xml/xpath/MessageSource.py (output up-to-date) not copying xml/xpath/NamespaceNode.py (output up-to-date) not copying xml/xpath/ParsedAbbreviatedAbsoluteLocationPath.py (output up-to-date) not copying xml/xpath/ParsedAbbreviatedRelativeLocationPath.py (output up-to-date) not copying xml/xpath/ParsedAbsoluteLocationPath.py (output up-to-date) not copying xml/xpath/ParsedAxisSpecifier.py (output up-to-date) not copying xml/xpath/ParsedExpr.py (output up-to-date) not copying xml/xpath/ParsedNodeTest.py (output up-to-date) not copying xml/xpath/ParsedPredicateList.py (output up-to-date) not copying xml/xpath/ParsedStep.py (output up-to-date) not copying xml/xpath/ParsedRelativeLocationPath.py (output up-to-date) not copying xml/xpath/Set.py (output up-to-date) not copying xml/xpath/Util.py (output up-to-date) not copying xml/xpath/XPathGrammar.py (output up-to-date) not copying xml/xpath/XPathParser.py (output up-to-date) not copying 
xml/xpath/XPathParserBase.py (output up-to-date) not copying xml/xpath/__init__.py (output up-to-date) not copying xml/xpath/pyxpath.py (output up-to-date) not copying xml/xpath/yappsrt.py (output up-to-date)
running build_ext
building '_xmlplus.parsers.pyexpat' extension
gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC -DXML_NS=1 -DXML_DTD=1 -DBYTEORDER=4321 -DXML_CONTEXT_BYTES=1024 -Iextensions/expat/lib -I/opt/local/include/python2.2 -c extensions/pyexpat.c -o build/temp.solaris-2.8-sun4u-2.2/pyexpat.o
unable to execute gcc: No such file or directory
error: command 'gcc' failed with exit status 1

any feedback would be much appreciated,
thanks
matt

From morillas at posta.unizar.es Wed Mar 3 18:18:09 2004
From: morillas at posta.unizar.es (luis miguel morillas)
Date: Wed Mar 3 17:58:47 2004
Subject: [XML-SIG] question about pyXML
In-Reply-To: <20040303204351.15973.qmail@web20508.mail.yahoo.com>
References: <20040303204351.15973.qmail@web20508.mail.yahoo.com>
Message-ID: <20040303231808.GA1877@marmota>

Subject: [XML-SIG] question about pyXML
Date: Wed, Mar 03, 2004 at 12:43:51 -0800
Quoting Matt Lombardo (kissfrito@yahoo.com):
> Hi, my name is Matt and I have a question regarding
> your software. Just to note, I am working on a Solaris
> 8 box.
I am trying to get XML Diff 0.6.4 up and
> running and recently the administrator of this system
> installed Python 2.3.3 but now when I try to get
> PyXML working and use the python setup.py build and
> get:
> [snip]
> building '_xmlplus.parsers.pyexpat' extension
> gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC
> -DXML_NS=1 -DXML_DTD=1 -DBYTEORDER=4321
> -DXML_CONTEXT_BYTES=1024 -Iextensions/expat/lib
> -I/opt/local/include/python2.2 -c extensions/pyexpat.c
> -o build/temp.solaris-2.8-sun4u-2.2/pyexpat.o
> unable to execute gcc: No such file or directory
> error: command 'gcc' failed with exit status 1
>

You need the gcc compiler to build the new code.

--
Luis Miguel
No to software patents in Europe
EuropeSwPatentFree http://EuropeSwPatentFree.hispalinux.es

From dkuhlman at cutter.rexx.com Thu Mar 4 11:26:29 2004
From: dkuhlman at cutter.rexx.com (Dave Kuhlman)
Date: Thu Mar 4 11:26:36 2004
Subject: [XML-SIG] python xml parser
In-Reply-To: ; from xu@reflexsecurity.com on Tue, Mar 02, 2004 at 04:21:57PM -0500
References:
Message-ID: <20040304082628.A79846@cutter.rexx.com>

On Tue, Mar 02, 2004 at 04:21:57PM -0500, J. Xu wrote:
> How can I pass stdout of processes, or output stream of network channels
> (like ssh) to a python xml parser (either through DOM or SAX) so that
> xml can be used as the communication tool among different components of
> the software? I think the standard python xml reader expects a file-like
> object or string as the input, but not sure if it can take a stream.
> Any suggestions are highly appreciated.

The minidom parser in the Python standard library will take a file
name or a file object. From
http://www.python.org/doc/current/lib/module-xml.dom.minidom.html:

    parse(filename_or_file, parser)
        Return a Document from the given input. filename_or_file
        may be either a file name, or a file-like object. parser,
        if given, must be a SAX2 parser object.
        This function will change the document handler of the parser
        and activate namespace support; other parser configuration
        (like setting an entity resolver) must have been done in
        advance.

Dave

--
Dave Kuhlman
dkuhlman@rexx.com
http://www.rexx.com/~dkuhlman

From mike at skew.org Thu Mar 4 17:54:17 2004
From: mike at skew.org (Mike Brown)
Date: Thu Mar 4 18:30:36 2004
Subject: [XML-SIG] python xml parser
In-Reply-To: <20040304082628.A79846@cutter.rexx.com> "from Dave Kuhlman at Mar 4, 2004 08:26:29 am"
Message-ID: <200403042254.i24MsHw2080430@chilled.skew.org>

Dave Kuhlman wrote:
> On Tue, Mar 02, 2004 at 04:21:57PM -0500, J. Xu wrote:
> > How can I pass stdout of processes, or output stream of network channels
> > (like ssh) to a python xml parser (either through DOM or SAX) so that
> > xml can be used as the communication tool among different components of
> > the software? I think the standard python xml reader expects a file-like
> > object or string as the input, but not sure if it can take a stream.
> > Any suggestions are highly appreciated.
>
> The minidom parser in the Python standard library will take a file
> name or a file object. From
> http://www.python.org/doc/current/lib/module-xml.dom.minidom.html:
>
> parse(filename_or_file, parser)
> Return a Document from the given input. filename_or_file
> may be either a file name, or a file-like object. parser,
> if given, must be a SAX2 parser object. This function will
> change the document handler of the parser and activate
> namespace support; other parser configuration (like setting
> an entity resolver) must have been done in advance.
>

J.
Xu - Yes, as you guessed and as Dave mentions, a file-like object is all you
need on the XML reading side. You can obtain the stdout stream of a process
as a file-like object when you create the process via os.popen(),
os.popen2(), os.popen3(), os.popen4() -- see the os module docs for info on
how to use those functions.

The method for obtaining a file-like object that wraps a network stream will
depend on the library/software you are using. It may provide an API that will
give you a file-like object, or a socket object, or it may not give you any
hooks at all. Your task will be to figure out what access (if any) you have
to the network stream, and then if it's not a file-like object, you'll need
to wrap it in one.

I believe the main requirement for a file-like object, as far as the Python
XML reading tools are concerned, is just that it have a read() method that
supplies all the bytes of the XML to be parsed.

-Mike

From rsalz at datapower.com Thu Mar 4 18:47:27 2004
From: rsalz at datapower.com (Rich Salz)
Date: Thu Mar 4 18:47:30 2004
Subject: [XML-SIG] python xml parser
In-Reply-To: <200403042254.i24MsHw2080430@chilled.skew.org>
Message-ID:

> The method for obtaining a file-like object that wraps a network stream will
> depend on the library/software you are using.

? The "makefile" method dates back to python 1.5

/r$

--
Rich Salz Chief Security Architect
DataPower Technology http://www.datapower.com
XS40 XML Security Gateway http://www.datapower.com/products/xs40.html
XML Security Overview http://www.datapower.com/xmldev/xmlsecurity.html

From mike at skew.org Thu Mar 4 18:58:27 2004
From: mike at skew.org (Mike Brown)
Date: Thu Mar 4 18:58:57 2004
Subject: [XML-SIG] python xml parser
In-Reply-To: "from Rich Salz at Mar 4, 2004 06:47:27 pm"
Message-ID: <200403042358.i24NwRqu080871@chilled.skew.org>

Rich Salz wrote:
> > The method for obtaining a file-like object that wraps a network stream will
> > depend on the library/software you are using.
>
> ?
The "makefile" method dates back to python 1.5

That assumes that you have a socket object. The question was very general,
asking about SSH and other network streams. I don't know what SSH libs are
available for Python and what their APIs are, so I make no assumptions as to
how they expose their decrypted incoming streams. General question gets a
general answer. :)

From bob at robisonranch.net Thu Mar 4 23:16:00 2004
From: bob at robisonranch.net (Bob Robison)
Date: Thu Mar 4 23:16:11 2004
Subject: [XML-SIG] PyXML parser problem
Message-ID: <20040304221600.40d2c2c9.bob@robisonranch.net>

I'm trying to get some existing code running with py2exe 0.5.0 and am having
some problems. I've narrowed things down to a simple test file:

#!/usr/bin/env python
from xml.dom.ext.reader import Sax2
xr=Sax2.Reader()
print "Success"

If I run this with 'python test.py' then I get an AttributeError raised from
saxexts.py:

    return drv_module.create_parser()
AttributeError: 'module' object has no attribute 'create_parser'

If I run python interactively, and then type in the commands above from the
prompt, it works fine. I don't understand what the issues are, but obviously
there is something in the environment that is different.

Putting in a few prints in the saxexts.py library, it appears that the parser
that is trying to be created is pyexpat -- which fails. When it works it
apparently tries 'xml.sax.drivers2.drv_pyexpat' which seems to work. I
hard-coded to try the specific name mentioned above and then it works running
the test.py file.

However, when I try to use py2exe I get a similar problem: The make_parser
routine returns:

xml.sax._exceptions.SAXReaderNotAvailable: No parsers found.

Can someone offer some clues about the differences in these environments, and
how to possibly force the right thing?

I'm using py2exe 0.5.0, Python 2.3.3, and PyXML 0.8.3 on Windows 2000. The
original program (and all of my normal work) is done on Linux, where I've
been using Python 2.2.
I haven't tried upgrading the Linux to 2.3 yet. I would have tried Python 2.2
on windows, but py2exe was wanting the later python.

Help!

bob

From fredrik at pythonware.com Fri Mar 5 01:09:30 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri Mar 5 01:09:55 2004
Subject: [XML-SIG] Re: Some questions from a beginner
References: <200402290748.57630.derekfountain@yahoo.co.uk>
Message-ID:

Pawel Lewicki wrote:

> But I still have a problem. Is ElementTree considered hardrock stable? I
> have a "memory protection violation" error after parsing about 60MB of xml
> file.

ElementTree is pure Python, and uses only mechanisms that have been in
Python since the early days. If an application using it crashes, the cause
of the crash is probably somewhere else.

In order, I'd suspect the following:

    a C extension you've written yourself (look for reference count errors)
    the database interface
    the Python bindings for the database interface
    the database
    that other C extension you've written yourself
    the Python bindings for the XML parser or the parser itself
    hardware problems (bad RAM)
    a computer virus
    that newly introduced odd corner of Python that your code is using
    a bad day
    several bad days in a row
    the parts of the Python core used by element trees

almost-serious'ly yrs /F

From rho at bigpond.net.au Fri Mar 5 15:24:46 2004
From: rho at bigpond.net.au (Robert Barta)
Date: Fri Mar 5 15:24:52 2004
Subject: [XML-SIG] python xml parser
In-Reply-To:
References:
Message-ID: <20040305202446.GB868@namod.qld.bigpond.net.au>

On Tue, Mar 02, 2004 at 04:21:57PM -0500, J. Xu wrote:
> How can I pass stdout of processes, or output stream of network channels
> (like ssh) to a python xml parser (either through DOM or SAX) so that
> xml can be used as the communication tool among different components of
> the software?

A slightly odd question here, I'd think. To 'pass stdout' you simply use
the shell 'piping'

    myprocess_whatever | python mypython

If you have created a tunnel with SSH then that will have established
a local port to connect to. You simply let your Python script connect
to that port. It does not have to know that that is SSH tunnel anyway.

> I think the standard python xml reader expects a file-like
> object or string as the input, but not sure if it can take a stream.

Not sure about the 'standard python xml reader', but pretty much
everyone allows to parse from a stream, file or a string. Reading the
docs helps.

\rho

From chrmic at gmx.de Fri Mar 5 17:36:55 2004
From: chrmic at gmx.de (Christoph Michalke)
Date: Fri Mar 5 20:18:53 2004
Subject: [XML-SIG] Re: PyXML parser problem
In-Reply-To: http://mail.python.org/pipermail/xml-sig/2004-March/010180.html
Message-ID: <40490F17.21919.7B381A6@localhost>

Bob Robison wrote on Thu Mar 4 23:16:00 EST 2004 in
http://mail.python.org/pipermail/xml-sig/2004-March/010180.html

> I'm trying to get some existing code running with
> py2exe 0.5.0 and am having some problems.

> However, when I try to use py2exe I get a similar
> problem: The make_parser routine returns:
> xml.sax._exceptions.SAXReaderNotAvailable: No parsers found.
> I'm using py2exe 0.5.0, Python 2.3.3, and PyXML
> 0.8.3 on Windows 2000.

For me the following worked with py2exe 0.4.2, Python 2.3.3, PyXML 0.8.3 on
Windows XP:

setup.py py2exe --includes xml.sax.drivers2.drv_pyexpat --packages encodings --force-imports encodings

Using "-p xml" or "-p _xmlplus" instead of
"--includes xml.sax.drivers2.drv_pyexpat" didn't work for me, although both
are suggested elsewhere. Also note that py2exe 0.4.2 and 0.5.0 differ a bit,
but probably only in other respects.

Chris M.

From bob at robisonranch.net Sat Mar 6 09:04:46 2004
From: bob at robisonranch.net (Bob Robison)
Date: Sat Mar 6 09:04:55 2004
Subject: [XML-SIG] Re: PyXML parser problem
Message-ID: <20040306080446.40f643d6.bob@robisonranch.net>

Chris,

Thanks! That did the trick. py2exe 0.5.0 didn't like the --force-imports
argument, so I had to leave that out, but using the --includes seems to have
solved the problem. I appreciate the help!

bob

Chris M. wrote on Fri Mar 5 17:36:55 EST 2004 in
http://mail.python.org/pipermail/xml-sig/2004-March/010184.html

>Bob Robison wrote on Thu Mar 4 23:16:00 EST 2004 in
>http://mail.python.org/pipermail/xml-sig/2004-March/010180.html
>
>> I'm trying to get some existing code running with
>> py2exe 0.5.0 and am having some problems.
>
>> However, when I try to use py2exe I get a similar
>> problem: The make_parser routine returns:
>> xml.sax._exceptions.SAXReaderNotAvailable: No parsers found.
>
>> I'm using py2exe 0.5.0, Python 2.3.3, and PyXML
>> 0.8.3 on Windows 2000.
>
>For me the following worked with py2exe 0.4.2, Python 2.3.3, PyXML
>0.8.3 on
>Windows XP:
>
>setup.py py2exe --includes xml.sax.drivers2.drv_pyexpat --packages
>encodings --force-imports encodings

From noreply at sourceforge.net Mon Mar 8 17:27:10 2004
From: noreply at sourceforge.net (SourceForge.net)
Date: Mon Mar 8 17:27:16 2004
Subject: [XML-SIG] [ pyxml-Bugs-912327 ] HTMLDocument and HTMLElement upper case attributes sometimes
Message-ID:

Bugs item #912327, was opened at 2004-03-08 22:27
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=912327&group_id=6473

Category: DOM
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Murray Steele (murraysteele)
Assigned to: Nobody/Anonymous (nobody)
Summary: HTMLDocument and HTMLElement upper case attributes sometimes

Initial Comment:
xml.dom.html.HTMLDocument overrides createAttribute of xml.dom.Document to
create an attribute with an uppercase version of the attribute name passed
in. xml.dom.html.HTMLElement then overrides getAttribute, getAttributeNode,
hasAttribute, removeAttribute and setAttribute of xml.dom.Element to look
for attributes with an uppercase name. So far so good (bug: 555303
not-with-standing).

Unfortunately xml.dom.ext.reader.Sgmlop calls setAttributeNS (which
eventually calls through to createAttributeNS in xml.dom.Document) to create
attribute nodes on the HTMLElements. This neatly bypasses the uppercasing
done in HTMLDocument and HTMLElement and means that none of the neat
htmlanchorelement.href attribute references work. Shame :(

Seems like the simple fix is to add createAttributeNS to HTMLDocument and/or
setAttributeNS to HTMLElement and have these methods pass through to the
base class methods with uppercased names instead.
Although it seems like a more thorough fix would involve making sure that
all the (get|set)Attribute(.*) methods of the xml.dom.html objects are made
auto-upper-case-magic.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=912327&group_id=6473

From jgentil at sebistar.net Wed Mar 10 01:23:35 2004
From: jgentil at sebistar.net (Jon-Pierre Gentil)
Date: Wed Mar 10 01:23:38 2004
Subject: [XML-SIG] XSLT?
Message-ID: <1078899815.4872.75.camel@mastermind>

What would everyone recommend to use as a good XSLT processor?

Thanks!

--
Jon-Pierre Gentil : PGP Key ID 0xA21BC30E
AIM: Zenethian : Jabber: jgentil@jabber.org

From markus_jais at yahoo.de Wed Mar 10 01:58:27 2004
From: markus_jais at yahoo.de (Markus Jais)
Date: Wed Mar 10 01:58:30 2004
Subject: [XML-SIG] XSLT?
In-Reply-To: <1078899815.4872.75.camel@mastermind>
Message-ID: <20040310065827.71569.qmail@web25205.mail.ukl.yahoo.com>

hi
if you need speed, check out libxslt
(www.xmlsoft.org).
it is written in C and comes with Python bindings
included.
it is way faster than a pure Python processor.

if speed is not important, 4suite has a great XSLT
processor
http://www.4suite.org/index.xhtml

regards
markus

--- Jon-Pierre Gentil wrote:
> What would everyone recommend to use as a good XSLT
> processor?
>
> Thanks!
> > -- > Jon-Pierre Gentil : PGP Key ID 0xA21BC30E > AIM: Zenethian : Jabber: > jgentil@jabber.org > > ATTACHMENT part 1.2 application/pgp-signature name=signature.asc > _______________________________________________ > XML-SIG maillist - XML-SIG@python.org > http://mail.python.org/mailman/listinfo/xml-sig > Mit sch?nen Gr??en von Yahoo! Mail - http://mail.yahoo.de From phthenry at earthlink.net Wed Mar 10 04:01:50 2004 From: phthenry at earthlink.net (Paul Tremblay) Date: Wed Mar 10 04:02:36 2004 Subject: [XML-SIG] XSLT? In-Reply-To: <20040310065827.71569.qmail@web25205.mail.ukl.yahoo.com> References: <1078899815.4872.75.camel@mastermind> <20040310065827.71569.qmail@web25205.mail.ukl.yahoo.com> Message-ID: <20040310090150.GA12131@localhost.localdomain> On Wed, Mar 10, 2004 at 07:58:27AM +0100, Markus Jais wrote: > > hi > if you need speed, check out libxslt > (www.xmlsoft.org). > it is written in C and comes with Python bindings > included. > it is way faster than a pure Python processor. I regret to bring this up and offend the hard workers who have written 4suite, but I noticed the same thing. I processed a 2 Megabyte document on my 300MHZ machine. libxst took 6 seconds. xalan and saxon took about 15 seconds(counting how long it took java to "warm up.") 4suite took 1 minute and 50 seconds. Perhaps I am missing something? Or is the feeling that speed is not that important, since most people process relatively smaller docuemnts, where one wouldn't notice the speed? For example, with a 20k document, 4suite might be faster than the java appications. Paul -- ************************ *Paul Tremblay * *phthenry@earthlink.net* ************************ From brian at sweetapp.com Wed Mar 10 13:23:00 2004 From: brian at sweetapp.com (Brian Quinlan) Date: Wed Mar 10 13:23:12 2004 Subject: [XML-SIG] XSLT? 
In-Reply-To: <20040310090150.GA12131@localhost.localdomain> Message-ID: <055501c406cc$b8b54900$0000fea9@dell8200> Paul Tremblay wrote: > I processed a 2 Megabyte document on my 300MHZ machine. libxst took 6 > seconds. xalan and saxon took about 15 seconds(counting how long it took > java to "warm up.") 4suite took 1 minute and 50 seconds. Did you try the C implementation of Xalan? It should have less startup overhead. And there are Python bindings here: http://pyana.sourceforge.net/ Cheers, Brian From quirxi at aon.at Thu Mar 11 03:40:14 2004 From: quirxi at aon.at (Arno Wilhelm) Date: Thu Mar 11 03:40:29 2004 Subject: [XML-SIG] Problems with "ignorable whitespace" in python's minidom and pulldom ! Message-ID: <405025EE.9050203@aon.at> Hello, I hope this is the right mailinglist for this kind of topic. If not, do not hestiate to ignore this posting or direct me to another mailing list. Here is the problem: My application is a web server centered programm that uses mod_python and xml has to process xml files. These xml files have most of the time ignorable white spaces like \n, \r \t between the different tags. The problem is that minidom seems to interpret these white spaces as text nodes and I cannot know in before how many of these "text nodes" are in between the real data nodes. This seems to disturb the real structure of the dom tree and child nodes are no longer child nodes etc. That makes it hard to write a reliable xml application since I cannot know how many spaces the writer/editor of the xml file has put in between the tags. 
So I tried to find a way of getting rid of these unwanted text nodes
with this piece of code, but that did not help either:

################################################################################
#
################################################################################
def cleanUpNodes( nodes ):
    """Removes all TEXT_NODES in parameter nodes that contain only characters
       that are defined as whitespace in the string library"""
    for node in nodes.childNodes:
        if node.nodeType == Node.TEXT_NODE:
            node.data = string.strip(node.data)
    nodes.normalize()
################################################################################
#
################################################################################

I also tried out pulldom, but it reports the white space as "CHARACTERS"
events and not as "IGNORABLE_WHITESPACE" events. Another thing is that
pulldom never seems to generate an "END_DOCUMENT" event ?!

The big question is: Does anybody know a way around this problem ?
Am I missing something ? How can I get rid of these unwanted
whitespace text nodes ?
Here is an example that shows what the same code interprets as child nodes
when processing the same xml file without and with white space in between
the tags:

<############### XML File with white spaces #################>

<############################# Code #############################>
#!/usr/bin/python

from xml.dom import minidom
from xml.dom import Node
import string

################################################################################
def cleanUpNodes( nodes ):
    """Removes all TEXT_NODES in parameter nodes that contain only characters
       that are defined as whitespace in the string library"""
    for node in nodes.childNodes:
        if node.nodeType == Node.TEXT_NODE:
            node.data = string.strip(node.data)
    nodes.normalize()

###############################################################################
def dumpTree( xmlFileIn, xmlFileOut ):
    try:
        dom = minidom.parse( xmlFileIn )
        file = open( xmlFileOut, "w" )
    except IOError, (errno, strerror):
        print "I/O error(%s): %s" % (errno, strerror )
        return
    cleanUpNodes( dom.documentElement )
    for node in dom.documentElement.childNodes:
        while ( node ):
            file.write( "\n node ->" + node.nodeName )
            file.write( node.toxml('ISO-8859-1') )
            node = node.firstChild
    file.close()
    return 1

###############################################################################
dumpTree( "index_wos.xml", "without_space.xml" )

<####################### Output with XML with whitespace ####################>
node ->child_1
node ->#text
node ->child_2
node ->#text

<#################### Output with XML without whitespace ####################>
node ->child_1
node ->child_11
node ->child_111
node ->child_2
node ->child_21

regards,
Arno Wilhelm

From and-xml at doxdesk.com Thu Mar 11 05:24:07 2004
From: and-xml at doxdesk.com (Andrew Clover)
Date: Thu Mar 11 05:23:39 2004
Subject: [XML-SIG] Problems with "ignorable whitespace" in python's minidom and pulldom !
In-Reply-To: <405025EE.9050203@aon.at> References: <405025EE.9050203@aon.at> Message-ID: <40503E47.8050000@doxdesk.com> Arno Wilhelm wrote: > The problem is that minidom seems to interpret these white spaces > as text nodes That's correct. That's what the DOM specification says should be done by default. In DOM Level 3 Load/Save there is a DOMConfiguration parameter 'element-content-whitespace' which can be used to filter out ignorable whitespace at parse-time. However minidom does not yet support DOM 3 LS. (Plug detour:) pxdom does support this, but like minidom it does not (yet) read external entities such as the DTD external subset, so unless you're putting declarations in the internal subset of the they won't be able to tell which elements contain 'element content'; in this case whitespace is not 'ignorable' by design. A workaround - and the only way to do it if you're not using DTDs anyway - is to tell pxdom to assume all undefined elements contain 'element content'. Hence the following example would give you a document free of whitespace nodes: import pxdom doc= pxdom.parse('filename.xml', { 'element-content-whitespace': False, 'pxdom-assume-element-content': True }) (End plug detour.) > I cannot know in before how many of these "text nodes" are in > between the real data nodes. DOM specifies that the document text nodes will be in 'normal' form after parsing, so you can be sure it'll be 0 or 1, no more. (Unless you're using a *really* old minidom where this may not hold true.) > So I tried to find a way of getting rid of these unwanted text nodes > with this piece of code but that did not help either: > def cleanUpNodes( nodes ): > for node in nodes.childNodes: > if node.nodeType == Node.TEXT_NODE: > node.data = string.strip(node.data) > nodes.normalize() That should work, you'd just need to make it recursive so it does the whole subtree not just the immediate children. 
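A minimal sketch of that recursive variant, assuming a current Python and the
standard-library xml.dom.minidom. The names blank_whitespace and
clean_up_nodes and the sample document are illustrative, not taken from the
original post. It empties whitespace-only text nodes throughout the subtree
and then relies on normalize(), which discards empty text nodes, so text with
real content is left untouched:

```python
from xml.dom import minidom, Node


def blank_whitespace(node):
    """Recursively empty every text node that holds only whitespace."""
    for child in node.childNodes:
        if child.nodeType == Node.TEXT_NODE:
            if child.data.strip() == "":
                child.data = ""
        else:
            blank_whitespace(child)


def clean_up_nodes(root):
    """Remove whitespace-only text nodes from the whole subtree under root."""
    blank_whitespace(root)
    # normalize() walks the full subtree and drops text nodes that are empty.
    root.normalize()


# Hypothetical sample document with indentation between the tags.
doc = minidom.parseString(
    "<root>\n  <child_1>\n    <child_11>some text</child_11>\n  </child_1>\n"
    "  <child_2/>\n</root>"
)
clean_up_nodes(doc.documentElement)
top_level = [n.nodeName for n in doc.documentElement.childNodes]
print(top_level)
```

After the call, the children of the document element should be the element
nodes only, with no '#text' entries in between.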
Here's another version:

   def removeWhitespaceNodes(parent):
     for child in list(parent.childNodes):
       if child.nodeType==child.TEXT_NODE and child.data.strip()=='':
         parent.removeChild(child)
       else:
         removeWhitespaceNodes(child)

--
Andrew Clover
mailto:and@doxdesk.com
http://www.doxdesk.com/

From quirxi at aon.at Thu Mar 11 17:21:40 2004
From: quirxi at aon.at (Arno Wilhelm)
Date: Thu Mar 11 17:21:47 2004
Subject: [XML-SIG] Problems with "ignorable whitespace" in python's minidom and pulldom !
In-Reply-To: <40503E47.8050000@doxdesk.com>
References: <405025EE.9050203@aon.at> <40503E47.8050000@doxdesk.com>
Message-ID: <4050E674.6020308@aon.at>

Hello Andrew,

thanks for your answer. After doing some research on the internet I have
found out that you are the author of the python pxdom module.
How does pxdom compare to the standard dom and minidom implementation
shipped with python itself ?
Can it already be used in production environments ?
How "fast" is it when parsing larger documents ?
I have read that the next version 1.1 will also support external resource
resolution & loading. Does that mean that it can also load external xml
files linked to the actual xml document by a kind of url ?

regards,
Arno Wilhelm

> Arno Wilhelm wrote:
>
>  > The problem is that minidom seems to interpret these white spaces
>  > as text nodes
>
> That's correct. That's what the DOM specification says should be done by
> default.
>
> In DOM Level 3 Load/Save there is a DOMConfiguration parameter
> 'element-content-whitespace' which can be used to filter out ignorable
> whitespace at parse-time. However minidom does not yet support DOM 3 LS.
>
> (Plug detour:) pxdom does support this, but like minidom it does not
> (yet) read external entities such as the DTD external subset, so unless
> you're putting declarations in the internal subset of the
> they won't be able to tell which elements contain 'element
> content'; in this case whitespace is not 'ignorable' by design.
>
> A workaround - and the only way to do it if you're not using DTDs anyway
> - is to tell pxdom to assume all undefined elements contain 'element
> content'. Hence the following example would give you a document free of
> whitespace nodes:
>
>   import pxdom
>   doc= pxdom.parse('filename.xml', {
>     'element-content-whitespace': False,
>     'pxdom-assume-element-content': True
>   })
>
> (End plug detour.)
>
>  > I cannot know in before how many of these "text nodes" are in
>  > between the real data nodes.
>
> DOM specifies that the document text nodes will be in 'normal' form
> after parsing, so you can be sure it'll be 0 or 1, no more. (Unless
> you're using a *really* old minidom where this may not hold true.)
>
>  > So I tried to find a way of getting rid of these unwanted text nodes
>  > with this piece of code but that did not help either:
>
>  > def cleanUpNodes( nodes ):
>  >     for node in nodes.childNodes:
>  >         if node.nodeType == Node.TEXT_NODE:
>  >             node.data = string.strip(node.data)
>  >     nodes.normalize()
>
> That should work, you'd just need to make it recursive so it does the
> whole subtree not just the immediate children. Here's another version:
>
>   def removeWhitespaceNodes(parent):
>     for child in list(parent.childNodes):
>       if child.nodeType==child.TEXT_NODE and child.data.strip()=='':
>         parent.removeChild(child)
>       else:
>         removeWhitespaceNodes(child)
>

From and-xml at doxdesk.com Sun Mar 14 19:08:16 2004
From: and-xml at doxdesk.com (Andrew Clover)
Date: Sun Mar 14 21:36:16 2004
Subject: [XML-SIG] Problems with "ignorable whitespace" in python's minidom and pulldom !
In-Reply-To: <4050E674.6020308@aon.at>
References: <405025EE.9050203@aon.at> <40503E47.8050000@doxdesk.com> <4050E674.6020308@aon.at>
Message-ID: <4054F3F0.2020301@doxdesk.com>

Arno Wilhelm wrote:

> I have found out that you are the author of the python pxdom module.

Yes. Hello!

> How is pxdom compared to the standard dom and minidom implementation shipped
> with python itself?
I haven't devised any sort of proper DOM benchmark, but timing (a) the W3C DOM test suite and (b) the PXTL test suite showed pxdom to be similar in speed to PyXML's 4DOM, considerably slower than minidom. It depends on what kind of operation is being done, of course. YMMV. > Can it already be used in production environments ? For the DOM Level 1 and 2 features, yes; these have been pretty static for some time. DOM Level 3 was a moving target up until recently so the implementation is not so mature, but 1.0 [final] seems fairly stable. There is still the chance that the specification for DOM Level 3 might change again before it hits final Recommendation, but hopefully not in any significant manner. > How "fast" is it when parsing larger documents ? Really slow. It's a pure-Python parser with minimal optimisation, quite apart from DOM memory footprint issues. pxdom aims for correctness and ease of embedding in other projects (without having to worry about whether/what-versions-of other XML packages are installed); for speed, by design, it's very poor. > Does that mean that [1.1] can also load external xml files linked > to the actual xml document by a kind of url ? Kind of: the URI is used in an entity declaration, for example: ... &fish; Most notably, the DTD external subset ("xhtml1-strict.dtd" etc.) is loaded like this in the declaration. The feature is only of use if you're dealing with DTD-reliant (non-standalone) documents. If you're looking for a general-purpose XML inclusion scheme, check out XInclude. cheers, -- Andrew Clover mailto:and@doxdesk.com http://www.doxdesk.com/
From rafaelb at linuxmail.org Tue Mar 16 08:19:25 2004 From: rafaelb at linuxmail.org (Rafael B) Date: Tue Mar 16 08:19:31 2004 Subject: [XML-SIG] Difficulty instaling XML Message-ID: <20040316131925.E5DC13982E7@ws5-1.us4.outblaze.com> HI, I'm trying to install PyXML and also Imaging just to run Skencil (vetorial graphics), but it is being hard. I read in the Imaging's readme file that I needed the Python with support for development. So, I downloaded the source code of python 2.33 and installed it, as recommended in the readme file. However, and I type "python setup.py build" to install PyXML, the following msg appears: running build running build_py running build_ext error: invalid Python installatinon: unable to open /usr/lib/python2.3/config/Makefile (No Such file or directory) Well, I'll wait for help because I'm having a lot of trouble just to install a simple program that has a lot of dependencies. Bye Rafael -- ______________________________________________ Check out the latest SMS services @ http://www.linuxmail.org This allows you to send and receive SMS through your mailbox. Powered by Outblaze From walter.doerwald at livinglogic.de Wed Mar 17 08:03:51 2004 From: walter.doerwald at livinglogic.de (=?ISO-8859-15?Q?Walter_D=F6rwald?=) Date: Wed Mar 17 08:04:00 2004 Subject: [XML-SIG] Locator and unknown line and column numbers Message-ID: <40584CB7.6080104@livinglogic.de> Hello XML-SIG! How should a parser report unknown line and column numbers?
I have a SAX parser for sgmlop that handles line numbers simply by
splitting the input into lines and feeding them to the parser. This
takes care of the line numbers, but feeding character by character
to the parser to get column numbers would probably be too slow, so
the column number is unknown. I've implemented getColumnNumber() as:

    def getColumnNumber(self):
        return None

as None is IMHO the best representation of "unknown". Unfortunately
this doesn't work with SAXParseException, because
SAXParseException.__str__() uses "%d" for formatting the column
number.

So should I use some nonsense integer value for this
or should SAXParseException.__str__() be changed to
be able to handle None?

Bye,
   Walter Dörwald

From fdrake at acm.org Thu Mar 18 13:59:28 2004
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu Mar 18 13:59:37 2004
Subject: [XML-SIG] Locator and unknown line and column numbers
In-Reply-To: <40584CB7.6080104@livinglogic.de>
References: <40584CB7.6080104@livinglogic.de>
Message-ID: <200403181359.28979.fdrake@acm.org>

On Wednesday 17 March 2004 08:03 am, Walter Dörwald wrote:
> So should I use some nonsense integer value for this
> or should SAXParseException.__str__() be changed to
> be able to handle None?

I'd certainly prefer to see the latter; it makes more sense for Python.

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From noreply at sourceforge.net Thu Mar 18 14:23:23 2004
From: noreply at sourceforge.net (SourceForge.net)
Date: Thu Mar 18 14:23:28 2004
Subject: [XML-SIG] [ pyxml-Patches-919008 ] None as line/col # in Location/SAXParseException
Message-ID:

Patches item #919008, was opened at 2004-03-18 20:22
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=306473&aid=919008&group_id=6473

Category: SAX
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: Nobody/Anonymous (nobody)
Summary: None as line/col # in Location/SAXParseException

Initial Comment:
This patch changes Location and SAXParseException
so that None is supported as line or column number
and will be formatted as '?' in __str__.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=306473&aid=919008&group_id=6473

From walter.doerwald at livinglogic.de Thu Mar 18 14:25:12 2004
From: walter.doerwald at livinglogic.de (Walter Dörwald)
Date: Thu Mar 18 14:25:17 2004
Subject: [XML-SIG] Locator and unknown line and column numbers
In-Reply-To: <200403181359.28979.fdrake@acm.org>
References: <40584CB7.6080104@livinglogic.de> <200403181359.28979.fdrake@acm.org>
Message-ID: <4059F798.6060807@livinglogic.de>

Fred L. Drake, Jr. wrote:

> On Wednesday 17 March 2004 08:03 am, Walter Dörwald wrote:
>  > So should I use some nonsense integer value for this
>  > or should SAXParseException.__str__() be changed to
>  > be able to handle None?
>
> I'd certainly prefer to see the latter; it makes more sense for Python.
OK, here is a patch (totally untested):
https://sourceforge.net/tracker/index.php?func=detail&aid=919008&group_id=6473&atid=306473

Bye,
   Walter Dörwald

From fdrake at acm.org Thu Mar 18 15:31:31 2004
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu Mar 18 15:31:42 2004
Subject: [XML-SIG] Locator and unknown line and column numbers
In-Reply-To: <4059F798.6060807@livinglogic.de>
References: <40584CB7.6080104@livinglogic.de> <200403181359.28979.fdrake@acm.org> <4059F798.6060807@livinglogic.de>
Message-ID: <200403181531.31190.fdrake@acm.org>

On Thursday 18 March 2004 02:25 pm, Walter Dörwald wrote:
> OK, here is a patch (totally untested):

Looks good; thanks! I've assigned it to myself to commit; hopefully I can
manage to get to it this evening.

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From jmg34 at cornell.edu Wed Mar 24 16:11:27 2004
From: jmg34 at cornell.edu (Joshua M. Goldfarb)
Date: Wed Mar 24 16:11:41 2004
Subject: [XML-SIG] Python Web Service Client Question
Message-ID: <6.0.3.0.0.20040324160335.0223cdd8@localhost>

Good afternoon,

I'm trying to write a simple web service client in Python.
The web service I'm trying to access is simple. It takes an integer
and returns a String. For some reason, when Python serializes the
request, I get this snippet (sanitized a bit):

....snip....
123
....snip....

The web service complains that it expects a simple type, and I can
understand why, given what I pasted above.

I can get Python to name the first whatever I want it to be named,
but I can't get Python to remove it. So, I guess there are two
approaches:

1) Figure out how to get Python to remove the outer
2) Name the outer something that the web service won't complain
   about (i.e., )

I can't figure out how to do either of the two approaches.
Perhaps there is also another approach (this is only my second day
coding in Python, though I'm already thoroughly amazed by the language).
Here is the Python web service client code: from ZSI.client import Binding from ZSI import TC log_file = open('soaplog.txt', 'w+') u = '/WebService.jws' n = 'http://www.openuri.org/' b = Binding(url=u, ns=n, host='localhost', port=7001, tracefile=log_file) class MyInt: def __init__(self, Empno): self.empno=Empno MyInt.typecode=TC.Struct(MyInt, [TC.Integer('empno')], 'empno') empnum=MyInt(123) result=b.methodName(empnum)[0] print result Thanks, Josh -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/xml-sig/attachments/20040324/c0e3e326/attachment.html From wilson at visi.com Wed Mar 24 16:58:07 2004 From: wilson at visi.com (Tim Wilson) Date: Wed Mar 24 16:58:14 2004 Subject: [XML-SIG] Extracting info from XHTML with Xpath Message-ID: Hi everyone, I'm going to be teaching a course on building Web pages with Web standards and I thought it would be fun to show a little demo of a python script that could extract information from an XHTML document. I found Simon Willison's description of using Xpath and Python, but I haven't had any luck getting an Xpath expression that works. I've got a Web page at http://www.hopkins.k12.mn.us/Pages/district/special/pq/timelytopics.html that lists a bunch of upcoming tech classes in our school district. I'd like to extract the coursetitles and dates. Would anyone be willing to have a quick look at the source for that page and suggest a way to address the

and

information? -Tim -- Tim Wilson Twin Cities, Minnesota, USA Educational technology guy, Linux and OS X fan, Grad. student, Daddy mailto: wilson@visi.com aim: tis270 public key: 0x8C0F8813 From morillas at posta.unizar.es Wed Mar 24 17:58:09 2004 From: morillas at posta.unizar.es (luis miguel morillas) Date: Wed Mar 24 17:37:34 2004 Subject: [XML-SIG] Extracting info from XHTML with Xpath In-Reply-To: References: Message-ID: <20040324225809.GA3314@marmota> Asunto: [XML-SIG] Extracting info from XHTML with Xpath Fecha: mi?, mar 24, 2004 at 03:58:07 -0600 Citando a Tim Wilson (wilson@visi.com): > Hi everyone, > > I'm going to be teaching a course on building Web pages with Web standards > and I thought it would be fun to show a little demo of a python script that > could extract information from an XHTML document. I found Simon Willison's > description of using Xpath and Python, but I haven't had any luck getting an > Xpath expression that works. > > I've got a Web page at > > http://www.hopkins.k12.mn.us/Pages/district/special/pq/timelytopics.html > > that lists a bunch of upcoming tech classes in our school district. I'd like > to extract the coursetitles and dates. > > Would anyone be willing to have a quick look at the source for that page and > suggest a way to address the

and

> information?
>

Perhaps

from xml.dom.ext.reader import PyExpat
from xml.xpath import Evaluate
from xml.dom.ext import PrettyPrint

path0 = '//h3[@class="coursetitle"]'

reader = PyExpat.Reader()
dom = reader.fromUri('http://www.hopkins.k12.mn.us/Pages/district/special/pq/timelytopics.html')
myElements = Evaluate(path0, dom.documentElement)
for element in myElements:
    PrettyPrint(element)

--
lm

From JRBoverhof at lbl.gov Wed Mar 24 19:06:47 2004
From: JRBoverhof at lbl.gov (Joshua Boverhof)
Date: Wed Mar 24 19:02:39 2004
Subject: [XML-SIG] Python Web Service Client Question
In-Reply-To: <6.0.3.0.0.20040324160335.0223cdd8@localhost>
References: <6.0.3.0.0.20040324160335.0223cdd8@localhost>
Message-ID: <40622297.2060002@lbl.gov>

I think this is what you want.

-josh

------------------------------------
#!/usr/bin/env python
import sys
from ZSI.client import Binding
from ZSI import TC

u = '/WebService.jws'
n = 'http://www.openuri.org/'
b = Binding(url=u, ns=n, host='localhost', port=7001, tracefile=sys.stdout)

class MyInt(int):
    typecode = TC.Integer('empno')

empnum=MyInt(123)
result=b.methodName(empnum)
print result
-----------------------------------
123
-----------------------------------

Joshua M. Goldfarb wrote:

> Good afternoon,
>
> I'm trying to write a simple web service client in Python.
> The web service I'm trying to access is simple. It takes an integer
> and returns a String. For some reason, when Python serializes the
> request, I get this snippet (sanitized a bit):
>
> ....snip....
>
>
>
> 123
>
>
>
> ....snip....
>
> The web service complains that it expects a simple type, and I can
> understand why, given what I pasted above.
>
> I can get Python to name the first whatever I want it to
> be named, but I can't get Python to remove it.
So, I guess there are > two approaches: > > 1) Figure out how to get Python to remove the outer > 2) Name the outer something that the web service won't > complain about (i.e., ) > > I can't figure out how to do either of the two approaches. > Perhaps there is also another approach (this is only my second day > coding in Python, though I'm already thoroughly amazed by the language). > > Here is the Python web service client code: > > from ZSI.client import Binding > from ZSI import TC > > log_file = open('soaplog.txt', 'w+') > u = '/WebService.jws' > n = 'http://www.openuri.org/' > b = Binding(url=u, ns=n, host='localhost', port=7001, tracefile=log_file) > > class MyInt: > def __init__(self, Empno): > self.empno=Empno > MyInt.typecode=TC.Struct(MyInt, [TC.Integer('empno')], 'empno') > > empnum=MyInt(123) > > result=b.methodName(empnum)[0] > print result > > Thanks, > > Josh > >------------------------------------------------------------------------ > >_______________________________________________ >XML-SIG maillist - XML-SIG@python.org >http://mail.python.org/mailman/listinfo/xml-sig > > From tpassin at comcast.net Wed Mar 24 20:11:40 2004 From: tpassin at comcast.net (Thomas B. Passin) Date: Wed Mar 24 20:08:25 2004 Subject: [XML-SIG] Extracting info from XHTML with Xpath In-Reply-To: References: Message-ID: <406231CC.3020607@comcast.net> Tim Wilson wrote: > > Would anyone be willing to have a quick look at the source for that page and > suggest a way to address the

and

> information? The question is - are you asking for help with getting Python to apply an xpath expression, or are you asking for help in writing the correct xpath expressions? Cheers, Tom P From wilson at visi.com Wed Mar 24 22:21:08 2004 From: wilson at visi.com (Tim Wilson) Date: Wed Mar 24 22:21:14 2004 Subject: [XML-SIG] Extracting info from XHTML with Xpath In-Reply-To: <406231CC.3020607@comcast.net> Message-ID: On 3/24/04 7:11 PM, "Thomas B. Passin" wrote: > The question is - are you asking for help with getting Python to apply > an xpath expression, or are you asking for help in writing the correct > xpath expressions? Hey Tom, I understand how to use the Xpath expression once it's created. I'm just having trouble finding the right expression. I wrote a little script that used Xpath on an RSS feed a year or so ago, but this XHTML file has me puzzled. -Tim -- Tim Wilson Twin Cities, Minnesota, USA Educational technology guy, Linux and OS X fan, Grad. student, Daddy mailto: wilson@visi.com aim: tis270 public key: 0x8C0F8813 From tpassin at comcast.net Wed Mar 24 23:53:32 2004 From: tpassin at comcast.net (Thomas B. Passin) Date: Wed Mar 24 23:50:16 2004 Subject: [XML-SIG] Extracting info from XHTML with Xpath In-Reply-To: References: Message-ID: <406265CC.9000306@comcast.net> Tim Wilson wrote: > On 3/24/04 7:11 PM, "Thomas B. Passin" wrote: > > >>The question is - are you asking for help with getting Python to apply >>an xpath expression, or are you asking for help in writing the correct >>xpath expressions? > > > Hey Tom, > > I understand how to use the Xpath expression once it's created. I'm just > having trouble finding the right expression. I wrote a little script that > used Xpath on an RSS feed a year or so ago, but this XHTML file has me > puzzled. > It is simple in xslt. It would be an expression like this - //html:h3[@class='coursetitle'] The problem is that you are using a default namespace, and there is no standard way to tell XPath to use it. 
There is no prefix bound to the xhtml namespace. In xslt, you can bind
a prefix in the stylesheet, and that is where the "html:" would come from.

Some xpath processors let you bind a namespace, but I am not sure about
the PyXML one.

One approach would be to run the source through an xslt stylesheet that
is an identity transform except that it adds an explicit namespace
prefix. Then an xpath expression like the one above would work.

Cheers,

Tom P

From mike at skew.org Thu Mar 25 00:49:55 2004
From: mike at skew.org (Mike Brown)
Date: Thu Mar 25 00:49:55 2004
Subject: [XML-SIG] Extracting info from XHTML with Xpath
In-Reply-To: <406265CC.9000306@comcast.net> "from Thomas B. Passin at Mar 24, 2004 11:53:32 pm"
Message-ID: <200403250549.i2P5nt0d079965@chilled.skew.org>

Thomas B. Passin wrote:
> Some xpath processors let you bind a namespace, but I am not sure about
> the PyXML one.

Yes, it's quite possible. To expand on Luis's example, I added the 'html'
prefix to the XPath expression, and created an explicit XPath context with
the appropriate binding, rather than letting a simple, default context be
created by Evaluate() (don't worry if you don't know what I mean by that)
(and I haven't tested this):

from xml.dom.ext.reader import PyExpat
from xml.xpath import Evaluate
from xml.xpath.Context import Context
from xml.dom.ext import PrettyPrint

path0 = '//html:h3[@class="coursetitle"]'
reader = PyExpat.Reader()
dom = reader.fromUri('http://www.hopkins.k12.mn.us/Pages/district/special/pq/timelytopics.html')
ctx = Context(dom.documentElement,
              processorNss={'html': 'http://www.w3.org/1999/xhtml'})
myElements = Evaluate(path0, context=ctx)
for element in myElements:
    PrettyPrint(element)

If you need the empty namespace (no namespace), use xml.dom.EMPTY_NAMESPACE.
If you need an empty prefix to assign the default namespace, use
xml.dom.EMPTY_PREFIX. Note however that changing the default namespace does
not affect how QNames are interpreted in XPath expressions.
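To illustrate that last point outside PyXML, here is a small sketch using the
standard-library xml.etree.ElementTree from a current Python. That choice is
an assumption made for the sake of a runnable example; it is not the
xml.xpath API used above, and the sample document is made up. An unprefixed
name in the path matches only elements in no namespace, so the XHTML elements
are found only once a prefix is bound to the XHTML namespace:

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical stand-in for the real page: XHTML with a default namespace.
doc = ET.fromstring(
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<h3 class="coursetitle">Intro to XPath</h3>'
    '<h3 class="other">Not a course</h3>'
    '</body></html>'
)

# Unprefixed name: looks for h3 in no namespace, so nothing matches.
unprefixed = doc.findall(".//h3[@class='coursetitle']")

# Prefixed name with an explicit prefix-to-namespace binding.
prefixed = doc.findall(".//html:h3[@class='coursetitle']", {"html": XHTML_NS})

titles = [el.text for el in prefixed]
print(len(unprefixed), titles)
```

The prefix itself ('html' here) is arbitrary; only the namespace URI it is
bound to has to match the one declared in the document.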
You can also create variable bindings in the same way, with a dictionary
named varBindings. Make the keys be tuples consisting of
(namespace, local-name) of each variable.

The API in 4Suite is about the same, but with Ft.Xml.XPath instead of
xml.xpath, and you must supply a Domlette document, not a minidom document,
in the context.

-Mike

From wilson at visi.com Thu Mar 25 01:20:21 2004
From: wilson at visi.com (Tim Wilson)
Date: Thu Mar 25 01:20:23 2004
Subject: [XML-SIG] Extracting info from XHTML with Xpath
In-Reply-To: <200403250549.i2P5nt0d079965@chilled.skew.org>
Message-ID:

I've got a ton to learn about XML processing, but I was able to piece the
following together using libxml2 and Simon Willison's information at
http://simon.incutio.com/archive/2003/10/21/xpathRocks

#!/usr/bin/python

import libxml2
import urllib2

url = 'http://www.hopkins.k12.mn.us/Pages/district/special/pq/timelytopics.html'
dom = libxml2.parseDoc(urllib2.urlopen(url).read())
ctxt = dom.xpathNewContext()
ctxt.xpathRegisterNs('xhtml', 'http://www.w3.org/1999/xhtml')
titles = [t.content for t in
          ctxt.xpathEval('//xhtml:h3[@class="coursetitle"]')]
newtitles = []
for title in titles:
    newtitles.append(' '.join([word.strip() for word in title.split()]))
newtitles.sort()
for title in newtitles:
    print title

I couldn't find any way to remove extraneous whitespace from the tag
contents without all the splitting, stripping, and joining.

-Tim

--
Tim Wilson
Twin Cities, Minnesota, USA
Educational technology guy, Linux and OS X fan, Grad. student, Daddy
mailto: wilson@visi.com aim: tis270 public key: 0x8C0F8813

From nmkolev at uni-bonn.de Thu Mar 25 14:26:23 2004
From: nmkolev at uni-bonn.de (Nickolay Kolev)
Date: Thu Mar 25 14:26:23 2004
Subject: [XML-SIG] Libxml2 Bindings on MacOSX
Message-ID: <4D2BB0FE-7E92-11D8-A32F-000A95DB0ECE@uni-bonn.de>

Hi all,

Can anyone who has already done this please give me some pointers on
installing the libxml2 and libxslt bindings for the Panther-supplied
Python.
I am on the half hour Google mark, and am getting a little annoyed.

Many thanks in advance!

Best regards,
Nicky

From wilson at visi.com Thu Mar 25 14:57:01 2004
From: wilson at visi.com (Tim Wilson)
Date: Thu Mar 25 14:57:12 2004
Subject: [XML-SIG] Libxml2 Bindings on MacOSX
In-Reply-To: <4D2BB0FE-7E92-11D8-A32F-000A95DB0ECE@uni-bonn.de>
Message-ID:

On 3/25/04 1:26 PM, "Nickolay Kolev" wrote:

> Can anyone who has already done this please give me some pointers on
> installing the libxml2 and libxslt bindings for the Panther-supplied
> Python.

I just did this a couple days ago. Here's what I did:

1. Visit http://homepages.cwi.nl/~jack/macpython/download.html and download the MacPython 2.3 for Panther addons installer. (http://ftp.cwi.nl/jack/python/mac/MacPython-Panther-2.3-2.dmg)
2. Once installed, you'll find a MacPython-2.3 folder in your Applications folder.
3. Run the PackageManager
4. At this point, PackageManager will probably give you an error that it couldn't download an information.
5. Now visit http://www.python.org/packman/ to try downloading a different database. I found that the experimental version for 10.3 worked for me. (http://www.python.org/packman/version-0.3/exp-darwin-7.0.0-Power_Macintosh.plist)
6. Use the URL of that plist file in the "Open URL" option in the File menu of the PackageManager.
7. Once PackageManager is able to connect to that database, you'll be able to choose to install libxml2 from the list of available packages.
8. Q.E.D.

Hope this helps.

-Tim

--
Tim Wilson
Twin Cities, Minnesota, USA
Educational technology guy, Linux and OS X fan, Grad.
student, Daddy
mailto: wilson@visi.com aim: tis270 public key: 0x8C0F8813

From phthenry at earthlink.net Thu Mar 25 15:31:31 2004
From: phthenry at earthlink.net (Paul Tremblay)
Date: Thu Mar 25 15:32:09 2004
Subject: [XML-SIG] Libxml2 Bindings on MacOSX
In-Reply-To: <4D2BB0FE-7E92-11D8-A32F-000A95DB0ECE@uni-bonn.de>
References: <4D2BB0FE-7E92-11D8-A32F-000A95DB0ECE@uni-bonn.de>
Message-ID: <20040325203131.GB2078@localhost.localdomain>

Could you be a little more explicit? I installed libxslt on my girlfriend's Mac OS X. It was just a matter of downloading the package and doing the typical

    python setup.py build
    python setup.py install

Are you having specific problems?

Paul

On Thu, Mar 25, 2004 at 08:26:23PM +0100, Nickolay Kolev wrote:
> To: xml-sig@python.org
> From: Nickolay Kolev
> Date: Thu, 25 Mar 2004 20:26:23 +0100
> Subject: [XML-SIG] Libxml2 Bindings on MacOSX
>
> Hi all,
>
> Can anyone who has already done this please give me some pointers on
> installing the libxml2 and libxslt bindings for the Panther-supplied
> Python.
>
> I am on the half hour Google mark, and am getting a little annoyed.
>
> Many thanks in advance!
>
> Best regards,
> Nicky

--
************************
*Paul Tremblay         *
*phthenry@earthlink.net*
************************

From maparent at acm.org Thu Mar 25 16:23:37 2004
From: maparent at acm.org (Marc-Antoine Parent)
Date: Thu Mar 25 16:23:45 2004
Subject: [XML-SIG] Libxml2 Bindings on MacOSX
In-Reply-To: <20040325203131.GB2078@localhost.localdomain>
References: <4D2BB0FE-7E92-11D8-A32F-000A95DB0ECE@uni-bonn.de> <20040325203131.GB2078@localhost.localdomain>
Message-ID:

I also keep building those packages, for other reasons: I keep an up to date copy of (prebound) libraries and Python bindings at http://www.istop.com/~maparent/tinderbox/libxml.tar.gz

Marc-Antoine

From robert.s.walter at kwmap.net Thu Mar 25 16:37:34 2004
From: robert.s.walter at kwmap.net (robert.s.walter@kwmap.net)
Date: Thu Mar 25 22:25:32 2004
Subject: [XML-SIG] Re: pyxml.sourceforge.net
Message-ID: <200403252137.i2PLbYFl093479@mxzilla8.xs4all.nl>

Skipped content of type multipart/alternative

From tpassin at comcast.net Thu Mar 25 22:45:39 2004
From: tpassin at comcast.net (Thomas B. Passin)
Date: Thu Mar 25 22:42:28 2004
Subject: Possible Scam? Re: [XML-SIG] Re: pyxml.sourceforge.net
In-Reply-To: <200403252137.i2PLbYFl093479@mxzilla8.xs4all.nl>
References: <200403252137.i2PLbYFl093479@mxzilla8.xs4all.nl>
Message-ID: <4063A763.7030700@comcast.net>

Anyone else think this looks fishy? The originating address is outright wrong - it claims to be from mail.python.org ([12.155.117.29]), which would not make sense and is the wrong IP address for python.org anyway, and the intent seems to be to get you to click on the links. That's suspicious right there.

Cheers,

Tom P

robert.s.walter@kwmap.net wrote:
> Hello,
>
> I checked your website and I would like to link it from my website, KwMap.net.
I am working on a complex graph of keywords, organised by their meaning and logical relationships.
>
> Under each keyword I list relevant websites and resources. I am considering listing Pyxml.sourceforge.net under keywords such as "bookmark","xml" and "python". Also, whenever visitors come from your site to KwMap.net, a chart containing the top 20 relevant keywords to your website will be added, personalizing the display. You can see this feature right now, as I already have analyzed your website:
>
> http://www.KwMap.net/?dom=Pyxml.sourceforge.net
>
> You can submit Pyxml.sourceforge.net to KwMap.net here: http://www.kwmap.net/add.cgi. If you add your site, you will be supplied with an account, so you can log in and edit and delete your links.
>
> The personalization system works by interpreting the http-referrer fields generated by your site so I don't actually need the "dom=Pyxml.sourceforge.net" part in the URL. As long as the link is on your domain, my website will automatically trigger your specific keyword chart.
>
> So, if you consider KwMap.net to be useful, here is the link code for your site :
>

From bhartsho at yahoo.com Thu Mar 25 21:55:28 2004
From: bhartsho at yahoo.com (brett hartshorn)
Date: Fri Mar 26 00:25:07 2004
Subject: [XML-SIG] xml.dom.minidom.Text no more __init__?
Message-ID: <20040326025528.61514.qmail@web13421.mail.yahoo.com>

What happened to the Text node in minidom? With Redhat9 it works fine, but in Fedora Core1 it seems to have changed.

I am trying to overload the Text class and here is the error message I am getting:

    xml.dom.minidom.Text.__init__(self, data)
    AttributeError: class Text has no attribute '__init__'

-brett

From veillard at redhat.com Fri Mar 26 05:01:02 2004
From: veillard at redhat.com (Daniel Veillard)
Date: Fri Mar 26 05:01:26 2004
Subject: Possible Scam?
Re: [XML-SIG] Re: pyxml.sourceforge.net
In-Reply-To: <4063A763.7030700@comcast.net>
References: <200403252137.i2PLbYFl093479@mxzilla8.xs4all.nl> <4063A763.7030700@comcast.net>
Message-ID: <20040326100102.GF6665@redhat.com>

On Thu, Mar 25, 2004 at 10:45:39PM -0500, Thomas B. Passin wrote:
> Anyone else think this looks fishy? The originating address is outright
> wrong - it claims to be from mail.python.org ([12.155.117.29]), which
> would not make sense and is the wrong IP address for python.org anyway,
> and the intent seems to be to get you to click on the links. That's
> suspicious right there.

I got a dozen of those for various sites I maintain -> /dev/null

Daniel

--
Daniel Veillard | Red Hat Network https://rhn.redhat.com/
veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

From fdrake at acm.org Fri Mar 26 07:22:06 2004
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri Mar 26 07:22:18 2004
Subject: [XML-SIG] xml.dom.minidom.Text no more __init__?
In-Reply-To: <20040326025528.61514.qmail@web13421.mail.yahoo.com>
References: <20040326025528.61514.qmail@web13421.mail.yahoo.com>
Message-ID: <200403260722.06345.fdrake@acm.org>

On Thursday 25 March 2004 09:55 pm, brett hartshorn wrote:
> What happened to the Text node in minidom?
> With Redhat9 it works fine, but in Fedora Core1 it seems to have changed.
>
> I am trying to overload the Text class and here is the error message I am
> getting:
>
> xml.dom.minidom.Text.__init__(self, data)
> AttributeError: class Text has no attribute '__init__'

The constructor for the Text class has never been part of the documented API. The __init__(), in particular, was removed to support faster creation of an entire tree (which was very successful).

The Document object has a factory method, createTextNode(), that takes the text for the node as the only argument. This is the only documented and supported way to construct new text nodes.
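A minimal sketch of the factory-method approach described above, using the minidom that ships with current Python. The element name "note" and the sample text are placeholders chosen for this illustration, not taken from the thread.

```python
from xml.dom import minidom

# Build a small tree using only the Document object's factory methods.
doc = minidom.Document()
root = doc.createElement("note")        # "note" is an arbitrary example name
doc.appendChild(root)

# createTextNode() takes the node's text as its only argument and returns
# a Text node that already belongs to this document.
text = doc.createTextNode("hello, world")
root.appendChild(text)

print(root.toxml())
# -> <note>hello, world</note>
print(text.ownerDocument is doc)
# -> True
```

Because the factory sets both the data and the owning document, no call to a Text constructor is needed.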
Other factory methods are used for other node types.

-Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From bhartsho at yahoo.com Fri Mar 26 11:10:26 2004
From: bhartsho at yahoo.com (brett hartshorn)
Date: Fri Mar 26 12:11:28 2004
Subject: [XML-SIG] xml.dom.minidom.Text no more __init__?
In-Reply-To: <200403260722.06345.fdrake@acm.org>
Message-ID: <20040326161026.55776.qmail@web13421.mail.yahoo.com>

Hi Fred,

So does that mean that the Text class can not be subclassed? Looks like Text's parent class CharacterData has an __init__, I thought calling init on Text would have been forwarded to its parent?

Here's my code:

    class TextNode(BaseNode, xml.dom.minidom.Text):
        def __init__(self, data):
            BaseNode.__init__(self)
            xml.dom.minidom.Text.__init__(self, data)

-brett

--- "Fred L. Drake, Jr." wrote:
> On Thursday 25 March 2004 09:55 pm, brett hartshorn wrote:
> > What happened to the Text node in minidom?
> > With Redhat9 it works fine, but in Fedora Core1 it seems to have changed.
> >
> > I am trying to overload the Text class and here is the error message I am
> > getting:
> >
> > xml.dom.minidom.Text.__init__(self, data)
> > AttributeError: class Text has no attribute '__init__'
>
> The constructor for the Text class has never been part of the documented API.
> The __init__(), in particular, was removed to support faster creation of an
> entire tree (which was very successful).
>
> The Document object has a factory method, createTextNode(), that takes the
> text for the node as the only argument. This is the only documented and
> supported way to construct new text nodes. Other factory methods are used
> for other node types.
>
>
> -Fred
>
> --
> Fred L. Drake, Jr.
> PythonLabs at Zope Corporation
>

From fdrake at acm.org Mon Mar 29 10:12:12 2004
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon Mar 29 10:14:18 2004
Subject: [XML-SIG] xml.dom.minidom.Text no more __init__?
In-Reply-To: <20040326161026.55776.qmail@web13421.mail.yahoo.com>
References: <20040326161026.55776.qmail@web13421.mail.yahoo.com>
Message-ID: <200403291012.12984.fdrake@acm.org>

On Friday 26 March 2004 11:10 am, brett hartshorn wrote:
> So does that mean that the Text class can not be subclassed? Looks like
> Text's parent class CharacterData has an __init__, I thought calling init
> on Text would have been forwarded to its parent?

You certainly can subclass Text, but CharacterData does not have an __init__() either (else Text would inherit it). Your subclass will need to either initialize the .data and .ownerDocument attributes in the __init__(), or you'll need to initialize them after construction.

I've added a comment to the source since this question comes up so frequently.

-Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From dkuhlman at cutter.rexx.com Tue Mar 30 14:32:36 2004
From: dkuhlman at cutter.rexx.com (Dave Kuhlman)
Date: Tue Mar 30 14:32:39 2004
Subject: [XML-SIG] ANN: exportLiteral extension to generateDS.py
Message-ID: <20040330113236.A85233@cutter.rexx.com>

I've implemented an extension to generateDS.py. generateDS.py generates Python classes that represent the elements in an XML document, given an Xschema definition of the XML document type. The new extension will export a Python literal representation of the XML document.

What It Does
============

When generateDS.py generates the Python source code for your classes, this new feature also generates an "exportLiteral" method in each class. If you call this method on the root (top-most) object, it will write out a literal representation of your class instances as Python code.
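To make the idea concrete, here is a hand-written toy sketch of what "exporting instances as Python literal code" can look like. It is not generateDS.py output: the Person class, its fields, and the exportLiteral(outfile, level) signature used here are all assumptions invented for this illustration, and the generated code may well differ.

```python
import io


class Person:
    """Hypothetical stand-in for a generated class (not generateDS output)."""

    def __init__(self, name="", age=0, children=None):
        self.name = name
        self.age = age
        self.children = children or []

    def exportLiteral(self, outfile, level=0):
        # Write a constructor call that would rebuild this instance.
        pad = "    " * level
        outfile.write("%sPerson(\n" % pad)
        outfile.write("%s    name=%r,\n" % (pad, self.name))
        outfile.write("%s    age=%r,\n" % (pad, self.age))
        outfile.write("%s    children=[\n" % pad)
        for child in self.children:
            child.exportLiteral(outfile, level + 2)
            outfile.write(",\n")
        outfile.write("%s    ],\n" % pad)
        outfile.write("%s)" % pad)


# Export a small tree as Python source text.
root = Person("Ann", 40, [Person("Bo", 7)])
buf = io.StringIO()
buf.write("rootObj = ")
root.exportLiteral(buf)
buf.write("\n")
source = buf.getvalue()
print(source)

# Executing the text (standing in for importing it as a module)
# re-creates an equivalent tree of instances.
namespace = {"Person": Person}
exec(source, namespace)
copy = namespace["rootObj"]
print(copy.name, copy.children[0].name)
```

The point of the sketch is only the round trip: instances are written out as constructor calls, and running that text rebuilds the same structure.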
generateDS.py also generates a function at top level (parseLiteral) that parses an XML document and calls the "exportLiteral" method on the root object to write the data structure (instances of your generated classes) as a Python module that you can import to (re-)create instances of the classes that represent your XML document.

Why You Might Care
==================

This feature means that the classes that you generate from an XML schema support the interchangeability of XML and Python literals. This means that, given classes generated by generateDS.py for your XML document type, you can perform the following transformations:

- Translate an XML document into a Python module containing a literal definition of the contents of the XML document.

- Translate the literal definition of a Python data structure into an XML instance document.

This capability enables you to:

- Work with an XML (text) document, then exchange it for a Python text representation of the content of that document.

- Work with a Python literal text representation of your XML document, then exchange that for an XML document that represents the same content.

- "Freeze" your XML document as a Python module that you can import. The module can be edited with your text editor, so perhaps it would be better to say that it is frozen, but not too hard. The classes that you generate with generateDS.py can be used to:

  1. Read in an XML document.

  2. (Optionally) modify the Python instances that represent that XML document.

  3. Write the instances out as a Python module that you can later import.

Where to Find It
================

You can find generateDS.py at: http://www.rexx.com/~dkuhlman/generateDS.html.

Dave

--
Dave Kuhlman
dkuhlman@rexx.com
http://www.rexx.com/~dkuhlman