From LETICIA at tesla.cujae.edu.cu Wed Apr 6 00:25:24 2005
From: LETICIA at tesla.cujae.edu.cu (Leticia Larrosa)
Date: Wed Apr 6 00:34:33 2005
Subject: [XML-SIG] validate dom tree object
Message-ID:

Hi all:

I have to validate a DOM tree, but I can't save it to an XML file.

I have been using the validating parser "xmlproc":

    xml.parsers.xmlproc.xmlapp.DTDConsumer
    xml.parsers.xmlproc.xmlapp.ErrorHandler
    xml.parsers.xmlproc.xmlval

but I found that the function "create_input_source", in the following
code from the module "xml.parsers.xmlproc.xmlapp":

    class InputSourceFactory:
        "A class that creates file-like objects from system identifiers."

        def create_input_source(self,sysid):
            if sysid[1:3]==":\\" or urlparse.urlparse(sysid)[0] == '':
                return open(sysid)
            else:
                return urllib2.urlopen(sysid)

doesn't accept an instance of an XML DOM tree. I get the following error:

      File "C:\Python23\Lib\site-packages\_xmlplus\parsers\xmlproc\xmlval.py", line 31, in parse_resource
        self.parser.parse_resource(sysid)
      File "C:\Python23\Lib\site-packages\_xmlplus\parsers\xmlproc\xmlutils.py", line 123, in parse_resource
        infile = self.isf.create_input_source(sysID)
      File "C:\Python23\Lib\site-packages\_xmlplus\parsers\xmlproc\xmlapp.py", line 224, in create_input_source
        if sysid[1:3]==":\\" or urlparse.urlparse(sysid)[0] == '':
    AttributeError: Document instance has no attribute '__getitem__'

My question is whether there is a way to validate an XML DOM tree
object, rather than an XML file. I welcome any suggestions about other
ways to validate besides "xmlproc".

Thanks in advance

From mertz at gnosis.cx Thu Apr 7 06:36:07 2005
From: mertz at gnosis.cx (David Mertz, Ph.D.)
Date: Thu Apr 7 06:38:45 2005
Subject: [XML-SIG] [Announce] Gnosis Utils 1.2.0
Message-ID: <3iLVClKkXIiW092yn@gnosis.cx>

David Mertz (mertz@gnosis.cx)
Frank McIngvale (frankm@hiwaay.net)

This release of the Gnosis Utilities contains several new modules, as
well as fixes, enhancements, and speedups in existing subpackages.
Try it out, have fun, send feedback!

NEW SUBPACKAGES
------------------------------------------------------------------------

ADDED gnosis.utils.hashcash (also runs standalone). Python
implementation of Hashcash v.1 (backward compatible with hashcash v.0).

ADDED disthelper. A collection of scripts and modules that are
generally useful for building/maintaining a Python source distribution.

  - To use disthelper in your own distribution, copy the entire
    disthelper/ tree into the toplevel of your tree, as has been done
    in Gnosis_Utils.
  - See disthelper/README for details.

ADDED gnosis.pyconfig. Detect actual capabilities available in a Python
interpreter. gnosis.pyconfig lets you write much more robust/readable
code than simply relying on sys.version_info.

ADDED gnosis.xml.xmlmap. Unicode->XML legality testing & Unicode helper
functions.

  - For detailed background information on the motivation for this
    module, see "All About Python and Unicode" at:
    http://boodebr.org/python/pyunicode/index.html

ENHANCEMENTS
------------------------------------------------------------------------

There is one critical change in gnosis.xml.pickle; all xml.pickle users
are encouraged to upgrade.

Catch unpickleable data in gnosis.xml.pickle and abort, instead of
creating pickles that can't be reloaded.

  - Recommend all xml.pickle users upgrade to 1.2.0.
  - Issues: not all valid Unicode strings are valid XML CDATA or
    attribute values.
  - There have not been any reports of data loss due to this, but it is
    a possibility with 1.1.1.
  - A future Gnosis release will fix this so that the 'bad' data can be
    pickled, instead of just bailing out.

Numerous new convenience functions for gnosis.xml.objectify. See
gnosis/docs/xml_matters_39.txt for more discussion.
  - addChild()       # moved to utils subpackage
  - walk_xo()        # Recursively traverse the nodes
  - write_xml()      # Serialize an _XO_ object back into XML
  - XPath()          # Find node(s) within an _XO_ object
  - pyobj_printer()  # moved to utils subpackage

Some of the older conveniences have been tweaked and/or were not
announced previously:

  - content()        # The (mixed) content of o as a list
  - children()       # The child nodes (not PCDATA) of o
  - text()           # List of textual children
  - dumps()          # The PCDATA in o (preserves whitespace)
  - normalize()      # Whitespace normalize string,
                     # e.g. o.PCDATA==normalize(dumps(o))
  - tagname()        # The element tag o was generated from
  - attributes()     # List of (XML) attributes of o

Fixed two bugs causing leaks in long running gnosis.xml.objectify
processes (file close and expat base class).

Significant speedups by miscellaneous refactoring/cleanup in
gnosis.xml.objectify.

Fixes to gnosis.xml.indexer suggested by Uche Ogbuji.

Security fixes to gnosis.utils.convert.dmText2Html.

DOWNLOADING:
------------------------------------------------------------------------

Browse the latest development snapshot or download it using 'wget -r'
or similar tools:

    http://gnosis.cx/download/gnosis/

It may be obtained at:

    http://gnosis.cx/download/Gnosis_Utils-1.2.0.tar.gz

The current release is always available as:

    http://gnosis.cx/download/Gnosis_Utils-current.tar.gz

Other distribution formats and older versions can be found at:

    http://gnosis.cx/download/Gnosis_Utils.More/

BACKGROUND:
------------------------------------------------------------------------

Gnosis Utilities contains a number of Python libraries, most (but not
all) related to working with XML.
These include:

    disthelper              (Create more flexible distutils archives)
    gnosis.indexer          (Full-text indexing/searching)
    gnosis.xml.pickle       (XML pickling of Python objects)
    gnosis.xml.objectify    (Any XML to "native" Python objects)
    gnosis.xml.validity     (Enforce validity constraints)
    gnosis.xml.relax        (Tools for working with RelaxNG)
    gnosis.xml.indexer      (XPATH indexing of XML documents)
    [...].convert.txt2html  (Convert ASCII source files to HTML)
    gnosis.util.dtd2sql     (DTD -> SQL 'CREATE TABLE' statements)
    gnosis.util.sql2dtd     (SQL query -> DTD for query results)
    gnosis.util.xml2sql     (XML -> SQL 'INSERT INTO' statements)
    gnosis.util.combinators (Combinatorial higher-order functions)
    gnosis.util.introspect  (Introspect Python objects)
    gnosis.utils.hashcash   (Hashcash proof-of-work protocol)
    gnosis.magic            (Multimethods, metaclasses, etc)
    gnosis.trigramlib       (Work w/ trigrams, e.g. spam filtering)
    gnosis.pyconfig         (Capability-based version adaptation)

...and so much more! :-)

From and-xml at doxdesk.com Thu Apr 7 19:56:16 2005
From: and-xml at doxdesk.com (Andrew Clover)
Date: Thu Apr 7 19:55:36 2005
Subject: [XML-SIG] validate dom tree object
In-Reply-To:
References:
Message-ID: <42557440.5020308@doxdesk.com>

Leticia Larrosa wrote:

> i found that the function "create_input_source" in the following code
> that are placed in the module "xml.parsers.xmlproc.xmlapp":

> "A class that creates file-like objects from system identifiers."

> don't accept a instance of a xml dom tree.

That's correct. A DOM tree is not a system identifier (effectively, a
URI).

> My question is if exist a way of validate a xml dom tree object, not a xml
> file.

Not currently as far as I know. For reasons of tradition, validation is
typically done at parse-time. There is no standard way in the W3C DOM
interface to retain the information from the element and attribute-list
declarations in the DTD, so most Python implementations do not keep
this information, which is needed to do validation.
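
Since validation happens at parse time, one possible workaround is to
serialise the DOM tree to an in-memory buffer and hand that buffer to a
parser, so that nothing has to be written to disk. The sketch below
assumes a current Python 3 and uses only the standard library. Note that
xml.sax is not a validating parser; it merely stands in for one here.
dom_to_stream and TagCollector are hypothetical names made up for the
illustration, not part of xmlproc or PyXML.

```python
import io
import xml.sax
from xml.dom import minidom


def dom_to_stream(document):
    """Serialise a minidom Document into an in-memory, file-like object."""
    return io.BytesIO(document.toxml(encoding="utf-8"))


class TagCollector(xml.sax.ContentHandler):
    """Records element names; stands in for a validator's callbacks."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


doc = minidom.parseString("<order><item>tea</item></order>")
handler = TagCollector()
# Re-parse from memory; a validating parser would be plugged in here.
xml.sax.parse(dom_to_stream(doc), handler)
```

The cost is a second parse of the document, which is usually acceptable
for small trees.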
(My own implementation, pxdom, does keep the information, in a
non-standard extended interface. So it would be possible to create a
DOM-based validator based on this... it's something I'm considering
doing myself, but not a priority at the moment.)

-- 
Andrew Clover
mailto:and@doxdesk.com
http://www.doxdesk.com/

From Sylvain.Thenault at logilab.fr Wed Apr 6 12:15:49 2005
From: Sylvain.Thenault at logilab.fr (Sylvain Thénault)
Date: Fri Apr 8 08:35:48 2005
Subject: [XML-SIG] prepare_input_source and relative path
In-Reply-To: <20050211084631.GA3844@logilab.fr>
References: <20050210100217.GE3811@logilab.fr>
 <200502102015.j1AKFPsR009831@chilled.skew.org>
 <20050211084631.GA3844@logilab.fr>
Message-ID: <20050406101549.GA4276@logilab.fr>

Hi!

After some time working on other stuff, I remembered that I still had
the prepare_input_source patch pending. I've attached the patch to this
message as it is now, and it works well for me. I'll check it in in the
next few days if no one objects.

I also think that having a new release once this patch is applied would
be a very good thing, since it fixes some important XML compliance
problems, such as those reported in:

http://sourceforge.net/tracker/index.php?func=detail&aid=616431&group_id=6473&atid=106473
http://bugs.debian.org/213324
http://bugs.debian.org/182967

It may even fix SF bugs #749284 and #567411, which are related to
system id problems, but I'm not sure at all about those...

regards

PS: I'm not sure I should check in the test_Uri.py file, since it
depends on Ft test tools and I have no time to backport it to use
unittest only.

-- 
Sylvain Thénault
LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org

-------------- next part --------------
--- /usr/lib/python2.3/site-packages/_xmlplus/sax/saxutils.py 2004-11-29 13:36:36.000000000 +0100
+++ /home/syt/cvs_work/_xmlplus/sax/saxutils.py 2005-04-06 11:55:50.000000000 +0200
@@ -10,6 +10,8 @@
 import xmlreader
 import sys, _exceptions, saxlib
 
+from xml.Uri import Absolutize, MakeUrllibSafe,IsAbsolute
+
 try:
     _StringTypes = [types.StringType, types.UnicodeType]
 except AttributeError: # 1.5 compatibility:UnicodeType not defined
@@ -513,19 +515,29 @@
             source.setSystemId(f.name)
 
     if source.getByteStream() is None:
-        sysid = source.getSystemId()
-        if os.path.isfile(sysid):
-            basehead = os.path.split(os.path.normpath(base))[0]
-            source.setSystemId(os.path.join(basehead, sysid))
-            f = open(sysid, "rb")
-        else:
-            source.setSystemId(urlparse.urljoin(base, sysid))
-            f = urllib2.urlopen(source.getSystemId())
-
+        sysid = absolute_system_id(source.getSystemId(), base)
+        source.setSystemId(sysid)
+        f = urllib2.urlopen(sysid)
         source.setByteStream(f)
 
     return source
 
+
+def absolute_system_id(sysid, base=''):
+    # if a base is given, sysid may be relative to it, make the
+    # join before isfile() test
+    if base:
+        basehead = os.path.split(os.path.abspath(base))[0]
+        path = os.path.join(basehead, sysid)
+    else:
+        path = os.path.abspath(sysid)
+    if os.path.isfile(path):
+        sysid = 'file:%s' % path
+    elif base:
+        sysid = Absolutize(sysid, base)
+    #assert IsAbsolute(sysid)
+    return MakeUrllibSafe(sysid)
+
 # ===========================================================================
 #
 # DEPRECATED SAX 1.0 CLASSES
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Uri.py
Type: text/x-python
Size: 16518 bytes
Desc: not available
Url : http://mail.python.org/pipermail/xml-sig/attachments/20050406/b17b2adc/Uri-0001.py
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_saxutils.py
Type: text/x-python
Size: 2062 bytes
Desc: not available
Url : http://mail.python.org/pipermail/xml-sig/attachments/20050406/b17b2adc/test_saxutils-0001.py
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_Uri.py
Type: text/x-python
Size: 37608 bytes
Desc: not available
Url : http://mail.python.org/pipermail/xml-sig/attachments/20050406/b17b2adc/test_Uri-0001.py

From mike at skew.org Fri Apr 8 08:56:23 2005
From: mike at skew.org (Mike Brown)
Date: Fri Apr 8 08:56:28 2005
Subject: [XML-SIG] prepare_input_source and relative path
In-Reply-To: <20050406101549.GA4276@logilab.fr>
Message-ID: <200504080656.j386uN2C085172@chilled.skew.org>

Sylvain Thénault wrote:
> after some times working on other stuff, I've remembered that I had
> still the prepare_input_source patch pending. I've joined to this
> message the patch as it is now, and it's well working for me. I'll
> check it in in the next few days if no one objects.

I still have not had time to review and test your changes, and don't
anticipate being able to do it anytime soon. :/

I would just be happy to know that in PyXML,

(1) OS-specific file system paths or relative URI references
    are never used as base URIs; only absolute URIs are
    (and they can be derived from what is given, in some cases)

and

(2) resolution of a relative URI reference to absolute form
    is carried out in accordance with RFC 3986 -- e.g.,
    via Absolutize() in Uri.py.

Anything else, like the details of OS path <-> URI conversion,
improvements to percent-encoding/decoding, better str vs unicode
handling, IDN support, etc. is lower priority.

One thing you should do is compare your Uri.py against the current one.
We've made a number of changes since February, and some of them are
important:
http://cvs.4suite.org/viewcvs/4Suite/Ft/Lib/Uri.py?r1=1.98&r2=1.89

> PS: i'm not sure I should checkin the test_Uri.py file since it depends
> on Ft test tools, and I've no time to backport it to use unittest only.

I wouldn't worry about it, in this case.

From Sylvain.Thenault at logilab.fr Fri Apr 8 11:24:49 2005
From: Sylvain.Thenault at logilab.fr (Sylvain Thénault)
Date: Fri Apr 8 11:24:52 2005
Subject: [XML-SIG] prepare_input_source and relative path
In-Reply-To: <200504080656.j386uN2C085172@chilled.skew.org>
References: <20050406101549.GA4276@logilab.fr>
 <200504080656.j386uN2C085172@chilled.skew.org>
Message-ID: <20050408092448.GA4174@logilab.fr>

On Friday 08 April at 00:56, Mike Brown wrote:
> Sylvain Thénault wrote:
> > after some times working on other stuff, I've remembered that I had
> > still the prepare_input_source patch pending. I've joined to this
> > message the patch as it is now, and it's well working for me. I'll
> > check it in in the next few days if no one objects.
>
> I still have not had time to review and test your changes, and don't
> anticipate being able to do it anytime soon. :/

maybe just a quick look at the test_saxutils.py I attached to my
previous post?

> I would just be happy to know that in PyXML,
>
> (1) OS-specific file system paths or relative URI references
>     are never used as base URIs; only absolute URIs are
>     (and they can be derived from what is given, in some cases)
>
> and
>
> (2) resolution of a relative URI reference to absolute form
>     is carried out in accordance with RFC 3986 -- e.g.,
>     via Absolutize() in Uri.py.

So be happy :) I've modified (actually simplified) the patch so that a
ValueError is raised if a non-empty but non-absolute path is given to
the absolute_system_id function, and the actual absolutization is done
with the Absolutize function.
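
For readers on a current Python 3, the behaviour described here (use a
file: URI for an existing local path, reject a base that is not an
absolute URI, resolve relative references against the base) could be
sketched roughly as follows. This is an illustration under stated
assumptions, not the patch itself: urllib.parse.urljoin and
pathlib.Path.as_uri are standard-library stand-ins for Absolutize,
IsAbsolute and MakeUrllibSafe from Uri.py, they are not guaranteed to
behave identically, and no percent-encoding clean-up is attempted.

```python
import os
from pathlib import Path
from urllib.parse import urljoin, urlsplit


def absolute_system_id(sysid, base=""):
    """Return an absolute URI for sysid, resolving against base if needed."""
    if os.path.exists(sysid):
        # Existing local file: convert the OS path to a file: URI.
        return Path(sysid).resolve().as_uri()
    if base:
        # Only absolute URIs (those with a scheme) are accepted as a base.
        if not urlsplit(base).scheme:
            raise ValueError("base must be an absolute URI: %r" % (base,))
        sysid = urljoin(base, sysid)
    if not urlsplit(sysid).scheme:
        raise ValueError("cannot make system id absolute: %r" % (sysid,))
    return sysid


# Assumes no local file named "schema/po.dtd" exists in the working directory.
resolved = absolute_system_id("schema/po.dtd", "http://example.org/docs/po.xml")
```

Raising ValueError rather than using assert keeps the check active even
when Python runs with -O.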
> Anything else, like the details of OS path <-> URI conversion, improvements to
> percent-encoding/decoding, better str vs unicode handling, IDN support, etc.
> is lower priority.

+1

> One thing you should do is compare your Uri.py against the current one. We've
> made a number of changes since February, and some of them are important:
> http://cvs.4suite.org/viewcvs/4Suite/Ft/Lib/Uri.py?r1=1.98&r2=1.89

I've checked this and updated my imported version of Uri.py

-- 
Sylvain Thénault
LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org

From Sylvain.Thenault at logilab.fr Fri Apr 8 11:26:50 2005
From: Sylvain.Thenault at logilab.fr (Sylvain Thénault)
Date: Fri Apr 8 11:26:52 2005
Subject: [XML-SIG] prepare_input_source and relative path
In-Reply-To: <20050408092448.GA4174@logilab.fr>
References: <20050406101549.GA4276@logilab.fr>
 <200504080656.j386uN2C085172@chilled.skew.org>
 <20050408092448.GA4174@logilab.fr>
Message-ID: <20050408092650.GA4523@logilab.fr>

the updated patch...

-- 
Sylvain Thénault
LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org

-------------- next part --------------
--- /usr/lib/python2.3/site-packages/_xmlplus/sax/saxutils.py 2004-11-29 13:36:36.000000000 +0100
+++ cvs_work/_xmlplus/sax/saxutils.py 2005-04-08 11:11:36.000000000 +0200
@@ -10,6 +10,8 @@
 import xmlreader
 import sys, _exceptions, saxlib
 
+from xml.Uri import Absolutize, MakeUrllibSafe,IsAbsolute
+
 try:
     _StringTypes = [types.StringType, types.UnicodeType]
 except AttributeError: # 1.5 compatibility:UnicodeType not defined
@@ -510,22 +512,28 @@
         source = xmlreader.InputSource()
         source.setByteStream(f)
         if hasattr(f, "name"):
-            source.setSystemId(f.name)
+            source.setSystemId(absolute_system_id(f.name, base))
 
     if source.getByteStream() is None:
-        sysid = source.getSystemId()
-        if os.path.isfile(sysid):
-            basehead = os.path.split(os.path.normpath(base))[0]
-            source.setSystemId(os.path.join(basehead, sysid))
-            f = open(sysid, "rb")
-        else:
-            source.setSystemId(urlparse.urljoin(base, sysid))
-            f = urllib2.urlopen(source.getSystemId())
-
+        sysid = absolute_system_id(source.getSystemId(), base)
+        source.setSystemId(sysid)
+        f = urllib2.urlopen(sysid)
         source.setByteStream(f)
 
     return source
 
+
+def absolute_system_id(sysid, base=''):
+    # if a base is given, sysid may be relative to it, make the
+    # join before isfile() test
+    if os.path.exists(sysid):
+        sysid = 'file:%s' % os.path.abspath(sysid)
+    elif base:
+        assert IsAbsolute(base), base
+        sysid = Absolutize(sysid, base)
+    assert IsAbsolute(sysid)
+    return MakeUrllibSafe(sysid)
+
 # ===========================================================================
 #
 # DEPRECATED SAX 1.0 CLASSES

From Sylvain.Thenault at logilab.fr Fri Apr 8 11:30:11 2005
From: Sylvain.Thenault at logilab.fr (Sylvain Thénault)
Date: Fri Apr 8 11:30:14 2005
Subject: [XML-SIG] prepare_input_source and relative path
In-Reply-To: <20050408092650.GA4523@logilab.fr>
References: <20050406101549.GA4276@logilab.fr>
 <200504080656.j386uN2C085172@chilled.skew.org>
 <20050408092448.GA4174@logilab.fr>
 <20050408092650.GA4523@logilab.fr>
Message-ID: <20050408093011.GA4546@logilab.fr>

the same patch with just outdated comments removed...

-- 
Sylvain Thénault
LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org

-------------- next part --------------
--- /usr/lib/python2.3/site-packages/_xmlplus/sax/saxutils.py 2004-11-29 13:36:36.000000000 +0100
+++ cvs_work/_xmlplus/sax/saxutils.py 2005-04-08 11:29:42.000000000 +0200
@@ -10,6 +10,8 @@
 import xmlreader
 import sys, _exceptions, saxlib
 
+from xml.Uri import Absolutize, MakeUrllibSafe,IsAbsolute
+
 try:
     _StringTypes = [types.StringType, types.UnicodeType]
 except AttributeError: # 1.5 compatibility:UnicodeType not defined
@@ -510,22 +512,25 @@
         source = xmlreader.InputSource()
         source.setByteStream(f)
         if hasattr(f, "name"):
-            source.setSystemId(f.name)
+            source.setSystemId(absolute_system_id(f.name, base))
 
     if source.getByteStream() is None:
-        sysid = source.getSystemId()
-        if os.path.isfile(sysid):
-            basehead = os.path.split(os.path.normpath(base))[0]
-            source.setSystemId(os.path.join(basehead, sysid))
-            f = open(sysid, "rb")
-        else:
-            source.setSystemId(urlparse.urljoin(base, sysid))
-            f = urllib2.urlopen(source.getSystemId())
-
+        sysid = absolute_system_id(source.getSystemId(), base)
+        source.setSystemId(sysid)
+        f = urllib2.urlopen(sysid)
         source.setByteStream(f)
 
     return source
 
+
+def absolute_system_id(sysid, base=''):
+    if os.path.exists(sysid):
+        sysid = 'file:%s' % os.path.abspath(sysid)
+    elif base:
+        sysid = Absolutize(sysid, base)
+    assert IsAbsolute(sysid)
+    return MakeUrllibSafe(sysid)
+
 # ===========================================================================
 #
 # DEPRECATED SAX 1.0 CLASSES

From gvwilson at cs.utoronto.ca Fri Apr 8 13:37:26 2005
From: gvwilson at cs.utoronto.ca (Greg Wilson)
Date: Fri Apr 8 13:38:41 2005
Subject: [XML-SIG] Simplest free DOM-like toolkit with validation?
Message-ID:

Hi everyone.
I'm about to start work on the XML data crunching lectures for the
PSF-funded software development course [1], and would like advice on
which of Python's many XML toolkits to use. The default choice is
minidom, but having taught with it for a couple of semesters, I find it
clunky compared to (for example) ElementTree. However, ElementTree
doesn't support validation against external DTDs, or RelaxNG schemas.

So, what would you suggest? Ease of use is more important than an
all-encompassing feature set (since this is for teaching purposes); it
has to be available on all major platforms (Windows, Mac, Linux,
Solaris), trivial to install, and have at least some documentation.

Thanks,
Greg

[1] http://www.python.org/psf/grants/

From fredrik at pythonware.com Fri Apr 8 17:04:08 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri Apr 8 17:05:32 2005
Subject: [XML-SIG] Re: Simplest free DOM-like toolkit with validation?
References:
Message-ID:

Greg Wilson wrote:

> Hi everyone. I'm about to start work on the XML data crunching lectures for the PSF-funded
> software development course [1], and would like advice on which of Python's many XML toolkits to
> use. The default choice is minidom, but having taught with it for a couple of semesters, I find
> it clunky compared to (for example) ElementTree. However, ElementTree doesn't support validation
> against external DTDs, or RelaxNG schemas.

have you looked at ElementRXP?

http://online.effbot.org/2005_02_01_archive.htm#elementrxp

From gvwilson at cs.utoronto.ca Fri Apr 8 20:28:22 2005
From: gvwilson at cs.utoronto.ca (Greg Wilson)
Date: Fri Apr 8 20:30:35 2005
Subject: [XML-SIG] Re: Simplest free DOM-like toolkit with validation?
In-Reply-To:
References:
Message-ID:

>> Greg Wilson wrote:
>>...[I] would like advice on which of Python's many XML toolkits to use [in a course].

> Fredrik Lundh wrote:
> have you looked at ElementRXP?
> http://online.effbot.org/2005_02_01_archive.htm#elementrxp

Mm.
So people would have to:

* download two supplementary libraries
* parse with one
* transform the nodes produced by that parser from one format to another

That's going to be a hard sell to people who typically only have a
first-year course in computer science. I _could_ package it up, and
hand them black magic, but when things go wrong, they'll be completely
lost.

Thanks,
Greg

From fredrik at pythonware.com Fri Apr 8 21:59:40 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri Apr 8 22:00:57 2005
Subject: [XML-SIG] Re: Simplest free DOM-like toolkit with validation?
References:
Message-ID:

Greg Wilson wrote:

> Mm. So people would have to:
>
> * download two supplementary libraries

one is a Python module, one is a parser. both are available as prebuilt
kits for many platforms.

> * parse with one
>
> * transform the nodes produced by that parser from one format to another

no, they have to call a single function that does it for them (and the
"transformation" is extremely light-weight: there's hardly any copying
of data going on, just references being moved from tuple objects to
instance attributes. RXP+conversion+ElementTree is *faster* than
expat+ElementTree, after all).

if you want even more performance, *and* DTD and RelaxNG support, this
was just released:

http://codespeak.net/lxml/

(still things to download, though)

From Uche.Ogbuji at fourthought.com Sun Apr 10 07:58:44 2005
From: Uche.Ogbuji at fourthought.com (Uche Ogbuji)
Date: Sun Apr 10 08:59:10 2005
Subject: [XML-SIG] prepare_input_source and relative path
In-Reply-To: <20050408093011.GA4546@logilab.fr>
References: <20050406101549.GA4276@logilab.fr>
 <200504080656.j386uN2C085172@chilled.skew.org>
 <20050408092448.GA4174@logilab.fr>
 <20050408092650.GA4523@logilab.fr>
 <20050408093011.GA4546@logilab.fr>
Message-ID: <1113112724.7426.5.camel@borgia>

On Fri, 2005-04-08 at 11:30 +0200, Sylvain Thénault wrote:
> the same patch with just outdated comments removed...
Seems to me there's been a reasonable level of review, and that your
check-in would improve the situation. I'd say go for it.

-- 
Uche Ogbuji
Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
Use CSS to display XML, part 2 - http://www-128.ibm.com/developerworks/edu/x-dw-x-xmlcss2-i.html
Writing and Reading XML with XIST - http://www.xml.com/pub/a/2005/03/16/py-xml.html
Use XSLT to prepare XML for import into OpenOffice Calc - http://www.ibm.com/developerworks/xml/library/x-oocalc/
Be humble, not imperial (in design) - http://www.adtmag.com/article.asp?id=10286
State of the art in XML modeling - http://www.ibm.com/developerworks/xml/library/x-think30.html

From LETICIA at tesla.cujae.edu.cu Sun Apr 10 21:48:20 2005
From: LETICIA at tesla.cujae.edu.cu (Leticia Larrosa)
Date: Sun Apr 10 21:55:43 2005
Subject: [XML-SIG] validate file-like class object
Message-ID:

Hi all!

I receive a StringIO (file-like) object that contains an XML document.
I need to parse it, but first I need to validate it.

So I made a slight change to "_xmlplus\parsers\xmlproc\xmlapp.py". I
replaced the function "create_input_source" with:

    def create_input_source(self,sysid):
        if isinstance(sysid, StringIO.StringIO):
            return sysid
        elif sysid[1:3]==":\\" or urlparse.urlparse(sysid)[0] == '':
            return open(sysid)
        else:
            return urllib2.urlopen(sysid)

and added the following import line:

    import StringIO

As you can see, "sysid" may now be a StringIO instance. The following
code shows how I use the above change:

    from xml.parsers.xmlproc import xmlval
    from xml.dom.ext.reader.Sax2 import FromXmlStream

    xv = xmlval.XMLValidator()
    # xml_fileIO is an instance of StringIO.StringIO that contains an XML document.
    xv.parse_resource(xml_fileIO)
    domTree = FromXmlStream(xml_fileIO)

Please send me any suggestions.

What can I do to get this change incorporated into the standard library?
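
One thing to watch in the usage above is that the same stream object is
handed to two consumers in a row, and a stream that the first consumer
has read to the end (or closed) is of no use to the second. A minimal
sketch of one way around this, assuming a current Python 3: take the
text out of the buffer once and give each consumer its own fresh
buffer. Here xml.sax (which checks well-formedness only and does not
validate) stands in for the xmlproc pass, minidom stands in for
FromXmlStream, and parse_twice is a hypothetical helper name.

```python
import io
import xml.sax
from xml.dom import minidom


def parse_twice(xml_stream):
    """Feed the same XML text to two consumers through separate buffers."""
    # getvalue() returns the whole text regardless of the read position,
    # but it must be called before anything closes the stream.
    raw = xml_stream.getvalue().encode("utf-8")

    # First pass: stand-in for the validating parse.
    xml.sax.parse(io.BytesIO(raw), xml.sax.ContentHandler())

    # Second pass: build the DOM tree from an independent buffer.
    return minidom.parse(io.BytesIO(raw))


dom_tree = parse_twice(io.StringIO("<root><child>hi</child></root>"))
```

Because each pass gets its own BytesIO, it does not matter whether the
first consumer closes or exhausts its stream.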
Thanks in advance

From LETICIA at tesla.cujae.edu.cu Sun Apr 10 22:31:01 2005
From: LETICIA at tesla.cujae.edu.cu (Leticia Larrosa)
Date: Sun Apr 10 22:31:35 2005
Subject: [XML-SIG] validate file-like class object
In-Reply-To:
References:
Message-ID:

In my last email there is an error in the following two lines:

    xv.parse_resource(xml_fileIO)
    domTree = FromXmlStream(xml_fileIO)

I must make a deep copy of the StringIO object first, because
"xv.parse_resource" closes the file. The code must be:

    import copy

    xml_fileIOCopy = copy.deepcopy(xml_fileIO)
    xv.parse_resource(xml_fileIO)
    domTree = FromXmlStream(xml_fileIOCopy)

Please send me any suggestions about my last email.

Thanks in advance

From priyank.ks at gmail.com Mon Apr 11 11:58:31 2005
From: priyank.ks at gmail.com (Priyank Bhargav)
Date: Mon Apr 11 11:58:33 2005
Subject: [XML-SIG] problem with pyexpat.so
Message-ID:

hello,

I have Python 2.3.5 and have installed the PyXML 0.8.4 package. When I
run a program that I am working on, it tries to import pyexpat.so but
comes up with the following error:

    undefined symbol: PyUnicodeUCS2_DecodeUTF8

The following is displayed on the terminal:

    Traceback (most recent call last):
      File "davtest.py", line 32, in ?
        from DAV.davserver import DAVRequestHandler
      File "/home/virtual_data/VR/VRdavserver/DAV/davserver.py", line 48, in ?
        from propfind import PROPFIND
      File "/home/virtual_data/VR/VRdavserver/DAV/propfind.py", line 33, in ?
        import utils
      File "/home/virtual_data/VR/VRdavserver/DAV/utils.py", line 16, in ?
        from xml.dom.ext.reader import PyExpat
      File "/usr/lib/python2.3/xml/dom/ext/reader/PyExpat.py", line 25, in ?
        from xml.parsers import expat
      File "/usr/lib/python2.3/xml/parsers/expat.py", line 4, in ?
        from pyexpat import *
    ImportError: /usr/lib/python2.3/xml/parsers/pyexpat.so: undefined symbol: PyUnicodeUCS2_DecodeUTF8

Please guide me in overcoming this problem. Thanks for any help.

From fredrik at pythonware.com Mon Apr 11 15:23:33 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon Apr 11 15:26:10 2005
Subject: [XML-SIG] Re: problem with pyexpat.so
References:
Message-ID:

Priyank Bhargav wrote:

> i have python 2.3.5 and have installed PyXml-0.8.4 package. when i run
> a program that i am working on , which tries to import pyexpat.so but comes
> up with the following error.
> undefined symbol: PyUnicodeUCS2_DecodeUTF8

do the following links help?
http://www.python.org/doc/faq/extending.html#when-importing-module-x-why-do-i-get-undefined-symbol-pyunicodeucs2
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=594207&group_id=6473

(the next time you get a weird error message, try googling for the message)

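A note on the `undefined symbol: PyUnicodeUCS2_DecodeUTF8` error from the pyexpat.so thread earlier in this archive: that symbol name indicates the extension module was compiled against a narrow (UCS2) Python build, while the interpreter loading it exports the wide (UCS4) names. The following is only a minimal sketch for checking which kind of interpreter is running. The helper name `unicode_build` is made up for illustration, and on Python 3.3 and later `sys.maxunicode` is always 0x10FFFF, so the check is only informative on older interpreters.

```python
import sys


def unicode_build(maxunicode=sys.maxunicode):
    """Classify an interpreter as a narrow (UCS2) or wide (UCS4) Unicode build.

    Narrow builds of Python 2 report sys.maxunicode == 0xFFFF; wide builds
    report 0x10FFFF. The function name is hypothetical, not a library API.
    """
    return "UCS4" if maxunicode > 0xFFFF else "UCS2"


print("sys.maxunicode = %#x -> %s build" % (sys.maxunicode, unicode_build()))
```

If the interpreter reports a different build than the one the extension expects, the usual remedy is to rebuild the extension (here PyXML) with the same interpreter that will import it.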
From faassen at infrae.com Tue Apr 12 13:55:06 2005 From: faassen at infrae.com (Martijn Faassen) Date: Tue Apr 12 13:52:25 2005 Subject: [XML-SIG] lxml 0.5.1 released Message-ID: <425BB71A.70403@infrae.com> Hi there, I just released this is also a good place to announce lxml. lxml, a Pythonic wrapper for libxml2 and libxslt, has been released. You can find out much more about it and download it here: http://codespeak.net/lxml Regards, Martijn From faassen at infrae.com Tue Apr 12 13:56:24 2005 From: faassen at infrae.com (Martijn Faassen) Date: Tue Apr 12 13:53:46 2005 Subject: [XML-SIG] Simplest free DOM-like toolkit with validation? In-Reply-To: References: Message-ID: <425BB768.10800@infrae.com> Greg Wilson wrote: > Hi everyone. I'm about to start work on the XML data crunching lectures > for the PSF-funded software development course [1], and would like > advice on which of Python's many XML toolkits to use. The default > choice is minidom, but having taught with it for a couple of semesters, > I find it clunky compared to (for example) ElementTree. However, > ElementTree doesn't support validation against external DTDs, or RelaxNG > schemas. lxml supports RelaxNG, though the support is only starting. Unfortunately it's not trivial to install, especially due to the libxml2/libxslt requirement. http://codespeak.net/lxml Regards, Martijn From claire_mcentee at hotmail.com Tue Apr 12 19:12:24 2005 From: claire_mcentee at hotmail.com (Claire Mc Entee) Date: Tue Apr 12 19:12:27 2005 Subject: [XML-SIG] Query re: Xml and Flash Message-ID: Hello there, I am developing a site using XML and Flash and I was wondering if you had any ideas as to which book would be the best to buy? Kind regards, Claire Mc Entee _________________________________________________________________ Don't know what Meegos are? Click to find out. http://meegos.msn.ie From jlujan at dreamcatalyst.com Wed Apr 13 05:37:20 2005 From: jlujan at dreamcatalyst.com (J. 
Lujan)
Date: Wed Apr 13 05:37:18 2005
Subject: [XML-SIG] Sequential SAX2 Filters?
Message-ID: <425C93F0.5050509@dreamcatalyst.com>

OK I am relatively new to python and I know this is kind of a general python question but I haven't found an answer anywhere else. I know you can chain filters by calling nextfilter.startElement(..) and so on. But what if you want sequential filters to modify information that might have been added to the document by a previous filter? I assume you need to parse for each filter but it doesn't matter. What I cannot figure out is how to get the results of the parse into a string that I can pass on to a second parse with a different filter. I hope I am being clear enough here. I want to parse a file and have the result put into a string that I can parse a second time using a different filter. Any suggestions?

Thank You,
J. Lujan

From Sylvain.Thenault at logilab.fr Wed Apr 13 16:07:31 2005
From: Sylvain.Thenault at logilab.fr (Sylvain Thénault)
Date: Wed Apr 13 16:07:34 2005
Subject: [XML-SIG] prepare_input_source and relative path
In-Reply-To: <1113112724.7426.5.camel@borgia>
References: <20050406101549.GA4276@logilab.fr> <200504080656.j386uN2C085172@chilled.skew.org> <20050408092448.GA4174@logilab.fr> <20050408092650.GA4523@logilab.fr> <20050408093011.GA4546@logilab.fr> <1113112724.7426.5.camel@borgia>
Message-ID: <20050413140731.GA7133@logilab.fr>

On Saturday 09 April à 23:58, Uche Ogbuji wrote:
> On Fri, 2005-04-08 at 11:30 +0200, Sylvain Thénault wrote:
> > the same patch with just outdated comments removed...
>
> Seems to me there's been a reasonable level of review, and that your
> check-in would improve the situation. I'd say go for it.

I've just checked in the patch and related modules. Dunno if I should close the related bug or wait to be sure it is for the submitter's use case.

--
Sylvain Thénault                               LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org From joseph_sacco at comcast.net Wed Apr 13 21:57:55 2005 From: joseph_sacco at comcast.net (Joseph E. Sacco, Ph.D.) Date: Wed Apr 13 21:57:59 2005 Subject: [XML-SIG] PyXML-0.8.4 : regression tests fail Message-ID: <1113422275.31830.17.camel@plantain.jesacco.com> System: * Powermac G4 silver, * Yellow Dog Linux 4.0.1 [FC2 clone for PPC's] * python-2.3.5 * gcc-3.3.3 =========================================================================== Three of the regression tests fail: test_expatreader test_howto test_minidom See attached output from regrtest.py. These failures have the collateral effect of nuking a python-based GNOME menu editor written my Travis Watkins [alleykat@gmail.com] http://www.realistanew.com/2005/03/18/gnome-menu-editor/ The menu-editor library imports xml.dom.minidom. When PyXML is installed, the menu-editor crashes: % menu-editor python: Objects/stringobject.c:110: PyString_FromString: Assertion `str != ((void *)0)' failed. Aborted When PyXML is removed, the menu-editor runs. 
-Joseph -- joseph_sacco [at] comcast [dot] net -------------- next part -------------- test_c14n test test_c14n skipped -- an optional feature could not be imported test_dom test test_dom skipped -- an optional feature could not be imported test_domreg test_encodings test_expatreader test test_expatreader failed -- Traceback (most recent call last): File "/usr/local/src/Python/PyXML-0.8.4/test/test_expatreader.py", line 21, in setUp self.parser.setFeature(handler.feature_namespace_prefixes, 1) File "/var/tmp/python-2.3.5-root/usr/lib/python2.3/xml/sax/expatreader.py", line 156, in setFeature SAXNotSupportedException: expat does not report namespace prefixes test_filter test_howto test test_howto crashed -- exceptions.AttributeError : 'module' object has no attribute 'DefaultHandler' test_htmlb test test_htmlb skipped -- an optional feature could not be imported test_javadom test test_javadom skipped -- an optional feature could not be imported test_marshal test test_marshal skipped -- an optional feature could not be imported test_minidom test test_minidom failed -- Writing: 'Test Failed: ', expected: '' test_ns test test_ns skipped -- an optional feature could not be imported test_pyexpat test_sax test test_sax skipped -- an optional feature could not be imported test_sax2 test test_sax2 skipped -- an optional feature could not be imported test_sax2_xmlproc test_sax_xmlproc test test_sax_xmlproc skipped -- an optional feature could not be imported test_saxdrivers test test_saxdrivers skipped -- an optional feature could not be imported test_utils test test_utils skipped -- an optional feature could not be imported test_xmlbuilder test_xmlproc test test_xmlproc skipped -- an optional feature could not be imported 6 tests OK. 
3 tests failed: test_expatreader test_howto test_minidom
12 tests skipped: test_c14n test_dom test_htmlb test_javadom test_marshal test_ns test_sax test_sax2 test_sax_xmlproc test_saxdrivers test_utils test_xmlproc

From uche.ogbuji at fourthought.com Thu Apr 14 00:04:00 2005
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Thu Apr 14 00:04:04 2005
Subject: [XML-SIG] prepare_input_source and relative path
In-Reply-To: <20050413140731.GA7133@logilab.fr>
References: <20050406101549.GA4276@logilab.fr> <200504080656.j386uN2C085172@chilled.skew.org> <20050408092448.GA4174@logilab.fr> <20050408092650.GA4523@logilab.fr> <20050408093011.GA4546@logilab.fr> <1113112724.7426.5.camel@borgia> <20050413140731.GA7133@logilab.fr>
Message-ID: <1113429840.30484.7.camel@borgia>

On Wed, 2005-04-13 at 16:07 +0200, Sylvain Thénault wrote:
> On Saturday 09 April à 23:58, Uche Ogbuji wrote:
> > On Fri, 2005-04-08 at 11:30 +0200, Sylvain Thénault wrote:
> > > the same patch with just outdated comments removed...
> >
> > Seems to me there's been a reasonable level of review, and that your
> > check-in would improve the situation. I'd say go for it.
>
> I've juste checked in the patch and related modules. Dunno if I should
> close the related bug or wait to be sure it is for the submitter's use
> case.

Maybe wait a few weeks, then close?

--
Uche Ogbuji                               Fourthought, Inc.
http://uche.ogbuji.net http://fourthought.com http://copia.ogbuji.net http://4Suite.org Use CSS to display XML, part 2 - http://www-128.ibm.com/developerworks/edu/x-dw-x-xmlcss2-i.html Writing and Reading XML with XIST - http://www.xml.com/pub/a/2005/03/16/py-xml.html Use XSLT to prepare XML for import into OpenOffice Calc - http://www.ibm.com/developerworks/xml/library/x-oocalc/ Schema standardization for top-down semantic transparency - http://www-128.ibm.com/developerworks/xml/library/x-think31.html From Uche.Ogbuji at fourthought.com Thu Apr 14 03:54:26 2005 From: Uche.Ogbuji at fourthought.com (Uche Ogbuji) Date: Thu Apr 14 04:01:04 2005 Subject: [XML-SIG] Sequential SAX2 Filters? In-Reply-To: <425C93F0.5050509@dreamcatalyst.com> References: <425C93F0.5050509@dreamcatalyst.com> Message-ID: <1113443667.30484.13.camel@borgia> On Tue, 2005-04-12 at 22:37 -0500, J. Lujan wrote: > OK I am relatively new to python and I know this is kind of a general > python question but I haven't found an answer any where else. I know you > can chain filters by calling nextfilter.startElement(..) and so on. But > what if you want sequential filters to modify information that might > have been added to the document by a previous filter? I assume you need > to parse for each filter but it doesn't matter. What I cannot figure out > is how to get the results of the parse into a string that I can pass on > to a second parse with a different filter. I hope I am being clear > enough here. I want to parse a file and have the result put into a > string that I can parse a second time using a different filter. Any > suggestions? If I understand you, you may want to use some variation on xml.sax.XMLGenerator you might find the following useful: http://www.xml.com/pub/a/2003/03/12/py-xml.html -- Uche Ogbuji Fourthought, Inc. 
http://uche.ogbuji.net                    http://fourthought.com
http://copia.ogbuji.net                   http://4Suite.org
Use CSS to display XML, part 2 - http://www-128.ibm.com/developerworks/edu/x-dw-x-xmlcss2-i.html
Writing and Reading XML with XIST - http://www.xml.com/pub/a/2005/03/16/py-xml.html
Use XSLT to prepare XML for import into OpenOffice Calc - http://www.ibm.com/developerworks/xml/library/x-oocalc/
Schema standardization for top-down semantic transparency - http://www-128.ibm.com/developerworks/xml/library/x-think31.html

From Sylvain.Thenault at logilab.fr Thu Apr 14 10:51:32 2005
From: Sylvain.Thenault at logilab.fr (Sylvain Thénault)
Date: Thu Apr 14 10:51:34 2005
Subject: [XML-SIG] prepare_input_source and relative path
In-Reply-To: <1113429840.30484.7.camel@borgia>
References: <20050406101549.GA4276@logilab.fr> <200504080656.j386uN2C085172@chilled.skew.org> <20050408092448.GA4174@logilab.fr> <20050408092650.GA4523@logilab.fr> <20050408093011.GA4546@logilab.fr> <1113112724.7426.5.camel@borgia> <20050413140731.GA7133@logilab.fr> <1113429840.30484.7.camel@borgia>
Message-ID: <20050414085132.GA4127@logilab.fr>

On Wednesday 13 April à 16:04, Uche Ogbuji wrote:
> On Wed, 2005-04-13 at 16:07 +0200, Sylvain Thénault wrote:
> > On Saturday 09 April à 23:58, Uche Ogbuji wrote:
> > > On Fri, 2005-04-08 at 11:30 +0200, Sylvain Thénault wrote:
> > > > the same patch with just outdated comments removed...
> > >
> > > Seems to me there's been a reasonable level of review, and that your
> > > check-in would improve the situation. I'd say go for it.
> >
> > I've juste checked in the patch and related modules. Dunno if I should
> > close the related bug or wait to be sure it is for the submitter's use
> > case.
>
> Maybe wait a few weeks, then close?

yep, sounds good to me. Is there any plan to make a release including the bug fix soon? And how would this fix be backported to the base Python distribution?

--
Sylvain Thénault                               LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org From lists at dreamcatalyst.com Thu Apr 14 11:20:11 2005 From: lists at dreamcatalyst.com (Lists) Date: Thu Apr 14 11:20:08 2005 Subject: [XML-SIG] Sequential SAX2 Filters? In-Reply-To: <1113443667.30484.13.camel@borgia> References: <425C93F0.5050509@dreamcatalyst.com> <1113443667.30484.13.camel@borgia> Message-ID: <425E35CB.6080400@dreamcatalyst.com> Uche Ogbuji wrote: >On Tue, 2005-04-12 at 22:37 -0500, J. Lujan wrote: > > >>OK I am relatively new to python and I know this is kind of a general >>python question but I haven't found an answer any where else. I know you >>can chain filters by calling nextfilter.startElement(..) and so on. But >>what if you want sequential filters to modify information that might >>have been added to the document by a previous filter? I assume you need >>to parse for each filter but it doesn't matter. What I cannot figure out >>is how to get the results of the parse into a string that I can pass on >>to a second parse with a different filter. I hope I am being clear >>enough here. I want to parse a file and have the result put into a >>string that I can parse a second time using a different filter. Any >>suggestions? >> >> > >If I understand you, you may want to use some variation on >xml.sax.XMLGenerator you might find the following useful: > >http://www.xml.com/pub/a/2003/03/12/py-xml.html > > > > Well, I started off with your example "Tip: SAX filters for flexible processing" on IBM's developerworks and have reviewed many of your other articles. I under stand how most of it works including XMLGenerator. The problem I have is more general, a lack of Python knowledge. When I try using XMLGenerator, the result goes to standard output, my questions is how to get it to go into a string(variable) within the program that can be passed on to a second parser instance. I assume I pass a reference to a global variable to the filter. 
I am missing how to get XMLGenerator to write to that variable. Am I missing something with XMLGenerator? Is there a way to get the output from the XMLGenerator instance itself? Hopefully I am being more clear this time. Thank You, J. Lujan From lists at dreamcatalyst.com Thu Apr 14 11:43:12 2005 From: lists at dreamcatalyst.com (Lists) Date: Thu Apr 14 11:43:11 2005 Subject: [XML-SIG] Sequential SAX2 Filters? In-Reply-To: <425E35CB.6080400@dreamcatalyst.com> References: <425C93F0.5050509@dreamcatalyst.com> <1113443667.30484.13.camel@borgia> <425E35CB.6080400@dreamcatalyst.com> Message-ID: <425E3B30.4070602@dreamcatalyst.com> Lists wrote: > Uche Ogbuji wrote: > >> On Tue, 2005-04-12 at 22:37 -0500, J. Lujan wrote: >> >> >>> OK I am relatively new to python and I know this is kind of a >>> general python question but I haven't found an answer any where >>> else. I know you can chain filters by calling >>> nextfilter.startElement(..) and so on. But what if you want >>> sequential filters to modify information that might have been added >>> to the document by a previous filter? I assume you need to parse for >>> each filter but it doesn't matter. What I cannot figure out is how >>> to get the results of the parse into a string that I can pass on to >>> a second parse with a different filter. I hope I am being clear >>> enough here. I want to parse a file and have the result put into a >>> string that I can parse a second time using a different filter. Any >>> suggestions? >>> >> >> >> If I understand you, you may want to use some variation on >> xml.sax.XMLGenerator you might find the following useful: >> >> http://www.xml.com/pub/a/2003/03/12/py-xml.html >> >> >> >> > Well, I started off with your example "Tip: SAX filters for flexible > processing" on IBM's developerworks and have reviewed many of your > other articles. I under stand how most of it works including > XMLGenerator. The problem I have is more general, a lack of Python > knowledge. 
When I try using XMLGenerator, the result goes to standard > output, my questions is how to get it to go into a string(variable) > within the program that can be passed on to a second parser instance. > I assume I pass a reference to a global variable to the filter. I am > missing how to get XMLGenerator to write to that variable. Am I > missing something with XMLGenerator? Is there a way to get the output > from the XMLGenerator instance itself? Hopefully I am being more clear > this time. > > Thank You, > J. Lujan > _______________________________________________ > XML-SIG maillist - XML-SIG@python.org > http://mail.python.org/mailman/listinfo/xml-sig Sorry, I did not review the article you suggested close enough. I reviewed it again and saw that you had understood me and given me a solution. Thank you very much. All of your articles have helped to fill the void in understandable documentation. Thank You, J. Lujan From uche.ogbuji at fourthought.com Thu Apr 14 15:18:05 2005 From: uche.ogbuji at fourthought.com (Uche Ogbuji) Date: Thu Apr 14 15:18:08 2005 Subject: [XML-SIG] Sequential SAX2 Filters? In-Reply-To: <425E3B30.4070602@dreamcatalyst.com> References: <425C93F0.5050509@dreamcatalyst.com> <1113443667.30484.13.camel@borgia> <425E35CB.6080400@dreamcatalyst.com> <425E3B30.4070602@dreamcatalyst.com> Message-ID: <1113484686.1638.26.camel@borgia> On Thu, 2005-04-14 at 04:43 -0500, Lists wrote: > Sorry, I did not review the article you suggested close enough. I > reviewed it again and saw that you had understood me and given me a > solution. Thank you very much. All of your articles have helped to fill > the void in understandable documentation. Always happy to help, and it's fun to use the time machine :-) . For completeness, I'll mention what you might have already discovered. The first optional parameter to XMLGenerator is an output buffer, and it can be a string buffer. 
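To make the round trip concrete, here is a minimal sketch of two SAX filters applied in sequence, with each pass writing its XMLGenerator output into an in-memory buffer that the next pass then parses. It is written for current Python 3, so it uses `io.StringIO` where code of this era would use `cStringIO`. `RenameFilter` and `run_filter` are hypothetical names invented for this illustration; they are not part of `xml.sax` and not taken from the article.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class RenameFilter(XMLFilterBase):
    """Hypothetical filter: renames one element name as events pass through."""

    def __init__(self, parent, old, new):
        super().__init__(parent)
        self._old, self._new = old, new

    def _map(self, name):
        return self._new if name == self._old else name

    def startElement(self, name, attrs):
        # Forward the (possibly renamed) event to the downstream handler.
        super().startElement(self._map(name), attrs)

    def endElement(self, name):
        super().endElement(self._map(name))


def run_filter(xml_text, old, new):
    """Parse xml_text through one RenameFilter; return the output as a string."""
    buffer = io.StringIO()                       # in-memory output buffer
    filt = RenameFilter(xml.sax.make_parser(), old, new)
    filt.setContentHandler(XMLGenerator(buffer, encoding="utf-8"))
    filt.parse(io.StringIO(xml_text))            # file-like input is accepted
    return buffer.getvalue()


# The second pass sees the element that the first pass introduced.
first = run_filter("<doc><a>text</a></doc>", "a", "b")
second = run_filter(first, "b", "c")
print(second)
```

Each pass is a complete parse, so the second filter can act on anything the first one added. When no intermediate string is needed, passing one filter as the `parent` of the next avoids the re-parse.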
So if instead of, as in listing 1 of the article:

    logger = XMLGenerator(output, encoding)

You do

    import cStringIO  #Probably want to import this at the top
    buffer = cStringIO.StringIO()
    logger = XMLGenerator(buffer, encoding)

Then all the output from the generator would be accumulated in buffer, and you could get it using buffer.getvalue().

Good luck.

--
Uche Ogbuji                               Fourthought, Inc.
http://uche.ogbuji.net                    http://fourthought.com
http://copia.ogbuji.net                   http://4Suite.org
Use CSS to display XML, part 2 - http://www-128.ibm.com/developerworks/edu/x-dw-x-xmlcss2-i.html
Writing and Reading XML with XIST - http://www.xml.com/pub/a/2005/03/16/py-xml.html
Use XSLT to prepare XML for import into OpenOffice Calc - http://www.ibm.com/developerworks/xml/library/x-oocalc/
Schema standardization for top-down semantic transparency - http://www-128.ibm.com/developerworks/xml/library/x-think31.html

From pawel at sakowski.pl Thu Apr 14 21:59:05 2005
From: pawel at sakowski.pl (Paweł Sakowski)
Date: Thu Apr 14 21:59:08 2005
Subject: [XML-SIG] Invalid character encoding handling in PyXML-0.8.4
Message-ID: <1113508745.25122.11.camel@athlon.ac.pld-linux.org>

A simple test case:

$ LANG=pl_PL.ISO-8859-2 python
Python 2.4 (#1, Dec 23 2004, 10:29:41)
[GCC 3.3.5 (PLD Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from xml.marshal import generic
>>> generic.dumps("piątek")
'pi\xb1tek'

"\xb1" is the ISO 8859-2 encoding of "ą". Still, the XML specification makes it clear that "In the absence of external character encoding information (such as MIME headers), parsed entities which are stored in an encoding other than UTF-8 or UTF-16 MUST begin with a text declaration (see 4.3.1 The Text Declaration) containing an encoding declaration". So, the XML obtained above is not well-formed:

>>> generic.loads(generic.dumps("piątek"))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "xml/marshal/generic.py", line 321, in loads
    return m._load(file)
  File "xml/marshal/generic.py", line 331, in _load
    p.parseFile(file)
  File "xml/sax/drivers/drv_pyexpat.py", line 68, in parseFile
    if self.parser.Parse(buf, 0) != 1:
xml.parsers.expat.ExpatError: not well-formed (invalid token): line 1, column 40

I'd also like to make a related feature request:

>>> generic.dumps(u"czwartek")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.4/site-packages/_xmlplus/marshal/generic.py", line 59, in dumps
  File "/usr/lib/python2.4/site-packages/_xmlplus/marshal/generic.py", line 104, in m_root
  File "/usr/lib/python2.4/site-packages/_xmlplus/marshal/generic.py", line 92, in _marshal
AttributeError: Marshaller instance has no attribute 'm_unicode'

Given XML's well defined character encoding semantics, it would be useful (and IMO pretty straightforward) to support unicode strings by simply encoding them with the document's encoding.

--
+----------------------------------------------------------------------+
| Paweł Sakowski                                     Never trust a man |
|                             who can count up to 1023 on his fingers. |
+----------------------------------------------------------------------+

From LETICIA at tesla.cujae.edu.cu Fri Apr 15 00:46:16 2005
From: LETICIA at tesla.cujae.edu.cu (Leticia Larrosa)
Date: Fri Apr 15 00:49:33 2005
Subject: [XML-SIG] WSDL generator & pass xml between web services
Message-ID:

Hi all:

I have three questions:

1- How can I automatically generate a WSDL from Python code that represents a web service? Does any tool exist? I know about tools for Java or Perl, but I can't find anything for Python.

2- If I want to pass an "XML document" in a SOAP or XML-RPC message, I know of three options:
- it can be passed as a single quoted string
- it can be passed as an attachment to the SOAP message
- the fields can be passed as separate parameters (an option that I can't use)
Are there any other options? The SOAPpy library doesn't support attachments; does anybody know of another library that does?

3- Is there another mailing list for discussing these topics? Any suggestions?

Thanks in advance

From rsalz at datapower.com Fri Apr 15 02:38:43 2005
From: rsalz at datapower.com (Rich Salz)
Date: Fri Apr 15 02:38:46 2005
Subject: [XML-SIG] WSDL generator & pass xml between web services
In-Reply-To:
Message-ID:

> 1- how can generate automatically a WSDL from python code that represent a
> web services?

I don't know of any tools that do this.

> 2- if i want to pass and "xml document" in a soap or xml-rpc message i know
> three options:

In general, you can't embed XML in XML, so you either have to treat it as a string (e.g., CDATA) or use attachments. ZSI has some support for attachments.

> 3- exists other mailling list to discusess this topics? any suggestions?

pywebsvcs-talk@lists.sf.net

	/r$

--
Rich Salz                  Chief Security Architect
DataPower Technology       http://www.datapower.com
XS40 XML Security Gateway  http://www.datapower.com/products/xs40.html

From gvwilson at cs.utoronto.ca Fri Apr 15 19:47:12 2005
From: gvwilson at cs.utoronto.ca (Greg Wilson)
Date: Fri Apr 15 19:49:12 2005
Subject: [XML-SIG] Text in ElementTree?
Message-ID:

I'm trying to write an example of how to move elements around in a document using ElementTree. The objective is to take things like this:

<em><h1>Heading</h1></em>

and turn them into:

<h1><em>Heading</em></h1>

i.e., put the emphasis inside the h1-h4 elements, instead of the other way around. It's almost working, but I'm still having trouble handling nodes whose children are interspersed text and elements. The script below shows the problem. Run it from the command line with no argument, and it'll break on the 'Single Inversion' text. Run it again with 'movetext' as its only argument, and it'll break on the last test (in which the body element has strings, em+heading, and other elements as children). I think the root of my problem is that I don't understand how ElementTree stores text --- if you have:

a c d f

then what are p's children? What is p.text? What happens if you assign a new value to p.text?

Thanks,
Greg

(Note: if your news reader breaks the 'Mixed Content' and 'Nested' test cases across lines, you may have to edit them.)

import sys
from cElementTree import Element, fromstring, tostring

# from visitor import Visitor
class Visitor(object):

    def __init__(self):
        pass

    def visit(self, root):
        self.beforeAll(root)
        self.traverse(root)
        self.afterAll(root)

    def traverse(self, current):
        self.beforeNode(current)
        self.atNode(current)
        for child in current:
            self.traverse(child)
        self.afterNode(current)

    def doNothing(self, node):
        pass

    beforeAll = doNothing
    afterAll = doNothing
    beforeNode = doNothing
    afterNode = doNothing
    atNode = doNothing

HeadingTags = ('h1', 'h2', 'h3', 'h4')

def containsOnlyHeading(node):
    '''Does a node contain only a single heading?'''
    return (len(node) == 1) and \
           (node[0].tag in HeadingTags)

class Finder(Visitor):
    '''Locate all nodes in a tree that have emphasized nodes containing a single heading as children.'''

    def beforeAll(self, root):
        self.nodes = []

    def atNode(self, node):
        for child in node:
            if (child.tag == 'em') and containsOnlyHeading(child):
                self.nodes.append(node)
                return

def transform(parent):
    '''Transform a node that has emphasized children containing headings.'''
    print '..parent', tostring(parent)

    # Helper function to locate a child in a parent.
    def findIndex(parent, child):
        for i in range(len(parent)):
            if parent[i] is child:
                return i
        return -1

    # Get all emphasized nodes, and filter to get the ones to be modified.
    allEmph = parent.findall('em')
    allEmph = [x for x in allEmph if containsOnlyHeading(x)]
    assert allEmph

    # Transform each in turn.
    for emph in allEmph:
        print '....emph', tostring(emph)

        # Get the heading.
        assert len(emph) == 1
        heading = emph[0]
        assert heading.tag in HeadingTags
        print '....heading', tostring(heading)

        # Take the heading out of the emphasized node.
        emph.remove(heading)
        print '....after removing heading, emph is', tostring(emph)

        # Put the heading in the parent in the emphasized node's place.
        loc = findIndex(parent, emph)
        assert loc >= 0
        parent[loc] = heading
        print '....after putting heading in emph place, parent is', tostring(parent)

        # Move the heading's children and text to the emphasized node.
        if 'movetext' in sys.argv[1:]:
            emph.text = heading.text
            heading.text = None
            print '....after moving text, heading is', tostring(heading), 'and emph is', tostring(emph)
        else:
            print '....not moving text'
        while len(heading):
            child = heading[0]
            emph.append(child)
            heading.remove(child)
            print '......after moving', tostring(child), 'emph is', tostring(emph), 'and heading is', tostring(heading)

        # Make the emphasized node the heading's only child.
        heading.append(emph)
        print 'after attaching emph to heading, heading is', tostring(heading)

def normalize(root):
    '''Normalize an entire document.'''
    f = Finder()
    f.visit(root)
    for node in f.nodes:
        transform(node)

if __name__ == '__main__':
    tests = (
        ('Empty', '', ''),
        ('Single', '', ''),
        ('Em Only', 'unchanged', 'unchanged'),
        ('H1 Only', '

unchanged

', '

unchanged

'), ('Already Normalized', '

unchanged

', '

unchanged

'), ('Single Inversion', '

changed

', '

changed

'), ('Mixed Content', '

change this and that

', '

change this and that

'), ('Nested', '

x

space

y

space', '

x

space

y

space') ) for (name, input, expected) in tests: print name print 'INPUT', input doc = fromstring(input) normalize(doc) actual = tostring(doc) print 'EXPECTED', expected print 'ACTUAL', actual print assert actual == expected From gvwilson at cs.utoronto.ca Fri Apr 15 19:56:31 2005 From: gvwilson at cs.utoronto.ca (Greg Wilson) Date: Fri Apr 15 19:57:45 2005 Subject: [XML-SIG] Re: Text in ElementTree? In-Reply-To: References: Message-ID: > Greg Wilson wrote: > I'm trying to write an example of how to move elements around in a > document using ElementTree. This has to be some kind of record --- less than 180 seconds between my posting, and an email telling me to look at the "Mixed Content" section of the docs (my fault for printing it out, then losing the last page ;-). Thanks, Greg From uche.ogbuji at fourthought.com Sat Apr 16 01:52:41 2005 From: uche.ogbuji at fourthought.com (Uche Ogbuji) Date: Sat Apr 16 01:52:53 2005 Subject: [XML-SIG] ANN: 4Suite 1.0b1 Message-ID: <1113609161.14099.8.camel@borgia> Today we release 4Suite 1.0 beta 1, now available from Sourceforge and ftp.4suite.org. Highlights of changes -- * Minor to huge performance increases throughout the core libraries: * Ft.Lib.Set implemented in C, resulting in faster evaluation of XPath expressions using '/', '//', and '|') * 40x speedup in startup time for Domlette parsing * Additional optimization of XPath expressions involving '/', '//', 'child::' and others * .docindex removed from Domlette nodes; they now compare to each other in document order (e.g., when in a list, .sort() can be called for document order). Resulting XPath/XSLT speedup * XSLT whitespace stripping optimized; xml:space handling fixed * XML string utilities now in module Ft.Xml.Lib.XmlString, implemented in C, as part of Domlette optimizations * Mini-SAX interface added to the Domlette Expat library, via Ft.Xml.Sax. The "mini" is due to the fact that only the interfaces required for parsing XSLT stylesheets are exposed. 
* XML catalogs and XSLT stylesheets now read via the fast mini-SAX API * Many more Domlette speed and memory optimizations * RDF core: various optimizations result in speedup of RDF completes and faster database access * RDF core: better Unicode handling; improvements to the way statements are printed, compared, iterated over, reified, serialized/deserialized * RDF core: optimized RDFS validation (90% speedup plus other optimizations) plus updated to current standard * XML core: Ft.Xml.MarkupWriter introduced--very friendly interface for generating XML content based on the XSLT writers, but adding convenience methods for common element creation patterns, and for inserting chunks of "literal" markup * XML core: FtMiniDom removed * XML core: All dependencies on PyExpat removed * XML core: Domlette nodes have new attributes: * xpathNamespaces, a dict of in-scope namespaces * xpathAttributes, the node's attributes, not counting those for namespace bindings * Environment variable XML_CATALOG_FILES now supported * XPath: extension functions that access the underlying OS are now disabled by default (e.g., f:spawnv(), f:system(), f:env-var()) * In Ft.Lib.Uri.OsPathToUri and UriToOsPath, attemptAbsolute now defaults to True * XLink: xlink:show='replace' behaves more usefully & sensibly * BerkeleyDB (bsddb) RDF and repository drivers added (Py 2.3+ only) * MySQL RDF and repository drivers optimized * Repository: XSLT DocDefs repository can now use extension functions/elements * Repository: better handling of external entity resolution * Doc generation issues resolved. API docs can now be generated for every module in 4Suite (various * XHTML 1.0 Transitional DTD added to default catalog * Many more bug fixes and enhancements. See also http://uche.ogbuji.net/tech/akara/? xslt=irc.xslt&date=2005-04-11#03:52:25 4Suite is a comprehensive platform for XML and RDF processing, with base libraries and a server framework. 
It is implemented in Python and C, and provides Python and XSLT APIs, Web and command line interfaces. For general information, see: http://4suite.org http://uche.ogbuji.net/tech/4Suite/ http://uche.ogbuji.net/tech/akara/nodes/2003-01-01/4suite-section For the files, see: ftp://ftp.4suite.org/pub/4Suite/ Sources: ftp://ftp.4suite.org/pub/4Suite/4Suite-1.0b1.tar.gz Windows installer: ftp://ftp.4suite.org/pub/4Suite/4Suite-1.0b1.win32-py2.2.exe ftp://ftp.4suite.org/pub/4Suite/4Suite-1.0b1.win32-py2.3.exe ftp://ftp.4suite.org/pub/4Suite/4Suite-1.0b1.win32-py2.4.exe Windows zip: ftp://ftp.4suite.org/pub/4Suite/4Suite-1.0b1.zip You can also get the files on Sourceforge: https://sourceforge.net/projects/foursuite/ https://sourceforge.net/project/showfiles.php?group_id=39954 Documentation: In the locations specified above, with filenames of the form 4Suite-docs-1.0b1.* Release notes -- If you have built a 4Suite repository using an older version of 4Suite, you will probably have to make adjustments for this new release. If you used 0.12.0a3, or a more recent version, then you will have to follow the migration instructions detailed in the following message: http://lists.fourthought.com/pipermail/4suite/2004-October/012933.html In general, it's worth being familiar with the following document: http://uche.ogbuji.net/tech/akara/nodes/2003-01-01/backup There have been on and off problems for Mac OS X users. We think these are now resolved. Please see the following page for current status, notes and recommendations: http://uche.ogbuji.net/tech/akara/nodes/2003-01-01/osx Installation locations have changed since 0.12.0a3 on both Windows and Unix. See the current installation directory layout document at: http://4suite.org/docs/installation-locations.xhtml If there is a server config files at the default location for the build and platform (e.g. /usr/local/lib/4Suite/4ss.conf by default on UNIX), it will be renamed to 4ss.conf.old and then overwritten. 
For insulation from Domlette implementation changes, developers should always use the generic Ft.Xml.Domlette APIs (rather than, say, Ft.Xml.cDomlette).

-- 
Uche Ogbuji                               Fourthought, Inc.
http://uche.ogbuji.net                    http://fourthought.com
http://copia.ogbuji.net                   http://4Suite.org
Use CSS to display XML, part 2 - http://www-128.ibm.com/developerworks/edu/x-dw-x-xmlcss2-i.html
Writing and Reading XML with XIST - http://www.xml.com/pub/a/2005/03/16/py-xml.html
Use XSLT to prepare XML for import into OpenOffice Calc - http://www.ibm.com/developerworks/xml/library/x-oocalc/
Schema standardization for top-down semantic transparency - http://www-128.ibm.com/developerworks/xml/library/x-think31.html

From lienard.bruno at free.fr  Sun Apr 17 19:33:40 2005
From: lienard.bruno at free.fr (lienard.bruno@free.fr)
Date: Sun Apr 17 19:33:41 2005
Subject: [XML-SIG] Bug in ElementTree (or in my xml file )?
Message-ID: <1113759220.42629df459d95@imp5-q.free.fr>

I have a strange bug in an XML file. I try to parse it with the following script:

    import cElementTree as ElementTree
    ##from elementtree import ElementTree

    # raw string, so the backslashes in the Windows path are taken literally
    inputFile = r"c:\Test0\bug.xml"

    tree = ElementTree.ElementTree()
    tree.parse(inputFile)
    root = tree.getroot()

    # walk every element in document order and print its tag
    iter = root.getiterator()
    for element in iter:
        print element.tag

The file is here:
It is should be adapted on the fly and should be
affected as minimal as possible during
adaptation. Adaptation approaches are different
in the adaptation granularity (procedure, module,
(simplicity, duration, ? ? ? ? ? ? ? automation,
I get the following error:

Traceback (most recent call last):
  File "C:\Python23\Lib\site-packages\pythonwin\pywin\framework\scriptutils.py", line 310, in RunScript
    exec codeObject in __main__.__dict__
  File "C:\Test0\FineReader\FixXml.py", line 7, in ?
    text += gettext(e)
  File "", line 24, in parse
SyntaxError: not well-formed (invalid token): line 17, column 22

I can't understand why! I think my XML file is wrong (when I remove the corresponding line, everything is OK). Can anybody tell me where the mistake is?

Thank you,
Bruno Lienard

From fredrik at pythonware.com  Sun Apr 17 19:51:25 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun Apr 17 19:52:24 2005
Subject: [XML-SIG] Re: Bug in ElementTree (or in my xml file )?
References: <1113759220.42629df459d95@imp5-q.free.fr>
Message-ID: 

Bruno wrote:

> I have a strange bug in an XML file. I try to parse it with the following
> script:

> The file is here:

your script and your XML look fine, and run just fine on my machine, but the exception looks a bit strange:

> I get the following error:
>
> Traceback (most recent call last):
>   File "C:\Python23\Lib\site-packages\pythonwin\pywin\framework\scriptutils.py",
>   line 310, in RunScript
>     exec codeObject in __main__.__dict__
>   File "C:\Test0\FineReader\FixXml.py", line 7, in ?
>     text += gettext(e)

what's FixXml? that's not a part of (c)ElementTree, and shouldn't be invoked by your test script.

From lienard.bruno at free.fr  Sun Apr 17 22:06:09 2005
From: lienard.bruno at free.fr (lienard.bruno@free.fr)
Date: Sun Apr 17 22:06:24 2005
Subject: [XML-SIG] Re: Bug in ElementTree (or in my xml file )?
Message-ID: <1113768369.4262c1b135608@imp5-q.free.fr>

FixXml.py is the name of the script; I launched it from Pythonwin. I get the same result when I launch it from the DOS prompt:

C:\Test0\FineReader>c:\python23\python FixXml.py
Traceback (most recent call last):
  File "FixXml.py", line 7, in ?
    tree.parse(inputFile)
  File "", line 24, in parse
SyntaxError: not well-formed (invalid token): line 17, column 22

It's quite strange! In fact, my test file is rather large, and I have the same problem with all lines of the same kind, that look like:

I tried to correct the file with Tidy, without any success, but when I remove these lines, I can parse the file. I don't understand! I use Python 2.3 and the latest versions of cElementTree and ElementTree, and I get the same result with both of them.

Thank you
Bruno

From lienard.bruno at free.fr  Sun Apr 17 22:39:13 2005
From: lienard.bruno at free.fr (lienard.bruno@free.fr)
Date: Sun Apr 17 22:41:17 2005
Subject: [XML-SIG] Re: Bug in ElementTree (or in my xml file )?
Message-ID: <1113770353.4262c971aadc1@imp5-q.free.fr>

I think I have found the cause of the problem. It seems to be an incorrect character (Ascii code A0). I don't know why this character is there, but when I remove it, everything is OK.

Thanks for your help.
Bruno

From fredrik at pythonware.com  Mon Apr 18 13:20:37 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon Apr 18 13:21:30 2005
Subject: [XML-SIG] Re: Bug in ElementTree (or in my xml file )?
References: <1113770353.4262c971aadc1@imp5-q.free.fr>
Message-ID: 

lienard.bruno@free.fr wrote:

> I think I have found the cause of the problem. It seems to be an incorrect
> character (Ascii code A0). I don't know why this character is there, but when I
> remove it, everything is OK.

0xA0 is a non-breaking space (chr(160), HTML &nbsp;), and is not part of the US-ASCII character set.

(for some reason, it didn't survive the paste-into-mail-and-paste-from-mail-into-editor trip, which is why your example worked well on my machine).

if you want to use that character in the file, set the encoding to "iso-8859-1" instead of "us-ascii".
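
The diagnosis in this thread can be reproduced with a small sketch. It is written for present-day Python 3 and the standard-library xml.etree.ElementTree rather than the 2005-era cElementTree, and the sample document, the helper names find_non_ascii and parses, and the element names are all made up for illustration; Bruno's real file is not available here. It shows two things under those assumptions: how to locate bytes above 0x7F in a file, and that the same bytes are rejected when the XML declaration says "us-ascii" but accepted when it says "iso-8859-1".

```python
import xml.etree.ElementTree as ET


def find_non_ascii(data):
    """Return (line, column, byte) for every byte above 0x7F (1-based positions)."""
    hits = []
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte > 0x7F:
                hits.append((lineno, col, byte))
    return hits


def parses(data):
    """True if the bytes parse as well-formed XML, False on a parse error."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample: a single 0xA0 byte (non-breaking space in ISO-8859-1)
# sits between two words, as in the failing lines described above.
body = b"<doc>\n<line>duration,\xa0automation</line>\n</doc>"
as_ascii = b'<?xml version="1.0" encoding="us-ascii"?>\n' + body
as_latin1 = b'<?xml version="1.0" encoding="iso-8859-1"?>\n' + body

if __name__ == "__main__":
    print(find_non_ascii(as_ascii))      # where the offending byte sits
    print(parses(as_ascii))              # declared us-ascii
    print(parses(as_latin1))             # declared iso-8859-1
```

Scanning the raw bytes first is a deliberate choice: the parser only reports the first offending position, whereas the scan lists all of them, which helps when a large file contains many such lines.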
From olivier.baudouin at normandnet.fr Mon Apr 18 13:43:39 2005 From: olivier.baudouin at normandnet.fr (Olivier Baudouin) Date: Mon Apr 18 13:42:42 2005 Subject: [XML-SIG] (no subject) Message-ID: Hello, I try to install PyXML-0.8.4 on Mandrake 10.0 with Python 2.4. And I have this error message : (...) extensions/expat/lib/xmlparse.c:75:2: #error memmove does not exist on this platform, nor is a substitute available error: command 'gcc' failed with exit status 1 What's the matter ? OB
From noreply at sourceforge.net  Wed Apr 20 02:34:25 2005
From: noreply at sourceforge.net (SourceForge.net)
Date: Wed Apr 20 02:34:27 2005
Subject: [XML-SIG] [ pyxml-Bugs-1186373 ] Error calling normalize() on text node
Message-ID: 

Bugs item #1186373, was opened at 2005-04-20 00:34
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=1186373&group_id=6473

Category: DOM
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Chui Tey (teyc)
Assigned to: Nobody/Anonymous (nobody)
Summary: Error calling normalize() on text node

Initial Comment:
Traceback (most recent call last):
  File "broken.py", line 24, in ?
    node.normalize()
  File "c:\Python23\lib\xml\dom\minidom.py", line 208, in normalize
    self.childNodes[:] = L
TypeError: object doesn't support slice assignment

Patch:

*** minidom.py.084 Wed Apr 20 10:24:59 2005
--- minidom.py Wed Apr 20 10:33:42 2005
***************
*** 179,184 ****
--- 179,189 ----
          return oldChild

      def normalize(self):
+
+         if self.nodeType == Node.TEXT_NODE:
+             if not self.data: self.unlink()
+             return
+
          L = []
          for child in self.childNodes:
              if child.nodeType == Node.TEXT_NODE:

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=1186373&group_id=6473

From uche.ogbuji at fourthought.com  Wed Apr 20 22:06:14 2005
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Wed Apr 20 22:06:17 2005
Subject: [XML-SIG] ANN: Amara XML Toolkit 1.0b2
Message-ID: <1114027574.7121.36.camel@borgia>

http://uche.ogbuji.net/tech/4Suite/amara
ftp://ftp.4suite.org/pub/Amara/

Changes in this release:

* More mutation API improvements (del and assignment can now be used with elements as well as attributes) [1][2][3]
* Slight improvements in 4Suite XSLT compatibility (still some remaining
  issues)
* Bug fixes

[1] http://copia.ogbuji.net/blog/2005-04-17/Amara_gets
[2] http://copia.ogbuji.net/blog/2005-04-18/Elements_v
[3] http://copia.ogbuji.net/blog/2005-04-18/Deletion_a

Amara XML Toolkit is a collection of Python tools for XML processing-- not just tools that happen to be written in Python, but tools built from the ground up to use Python idioms and take advantage of Python's many strengths.

Amara builds on 4Suite [http://4Suite.org], but whereas 4Suite focuses more on literal implementation of XML standards in Python, Amara focuses on Pythonic idiom. It provides tools you can trust to conform with XML standards without losing the familiar Python feel.

The components of Amara are:

* Bindery: data binding tool (a very Pythonic XML API)
* Scimitar: implementation of the ISO Schematron schema language for XML; converts Schematron files to Python scripts
* domtools: set of tools to augment Python DOMs
* saxtools: set of tools to make SAX easier to use in Python
* Flextyper: user-defined datatypes in Python for XML processing

There's a lot in Amara, but here are highlights:

Amara Bindery: XML as easy as py
--------------------------------

Bindery turns an XML document into a tree of Python objects corresponding to the vocabulary used in the XML document, for maximum clarity. For example, the document

    <monty>
      <python spam="eggs">What do you mean "bleh"</python>
      <python>But I was looking for argument</python>
    </monty>

becomes a data structure such that you can write

    binding.monty.python.spam

in order to get the value "eggs", or

    binding.monty.python[1]

in order to get the value "But I was looking for argument".

There are other such tools for Python, and what makes Anobind unique is that it's driven by a very declarative rules-based system for binding XML to the Python data. You can register rules, triggered by XPattern expressions, for specialized binding behavior. It includes XPath support and supports mutation. Bindery is very efficient, using SAX to generate bindings.

Scimitar: Schematron for Python
-------------------------------

Merged in from a separate project, Scimitar is an implementation of ISO Schematron that compiles a Schematron schema into a Python validator script.

You typically use scimitar in two phases. Say you have a Schematron schema schema1.stron and you want to validate multiple XML files against it: instance1.xml, instance2.xml, instance3.xml.

First you run schema1.stron through the scimitar compiler script, scimitar.py:

    scimitar.py schema1.stron

The generated file, schema1.py, can be used to validate XML instances:

    python schema1.py instance1.xml

which emits a validation report.

Amara DOM Tools: giving DOM a more Pythonic face
------------------------------------------------

DOM came from the Java world, hardly the most Pythonic API possible. Some DOM-like implementations such as 4Suite's Domlettes mix in some Pythonic idiom. Amara DOM Tools goes even further.

Amara DOM Tools features pushdom, similar to xml.dom.pulldom, but easier to use. It also includes Python generator-based tools for DOM processing, and a function to return an XPath location for any DOM node.

Amara SAX Tools: SAX without the brain explosion
------------------------------------------------

Tenorsax (amara.saxtools.tenorsax) is a framework for "linearizing" SAX logic so that it flows more naturally, and needs a lot less state machine wizardry.

License
-------

Amara is open source, provided under the 4Suite variant of the Apache license. See the file COPYING for details.

Installation
------------

Amara requires Python 2.3 or more recent and 4Suite 1.0a4 or more recent. Make sure these are installed, unpack Amara to a convenient location and run

    python setup.py install

-- 
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://fourthought.com http://copia.ogbuji.net http://4Suite.org Use CSS to display XML, part 2 - http://www-128.ibm.com/developerworks/edu/x-dw-x-xmlcss2-i.html Writing and Reading XML with XIST - http://www.xml.com/pub/a/2005/03/16/py-xml.html Use XSLT to prepare XML for import into OpenOffice Calc - http://www.ibm.com/developerworks/xml/library/x-oocalc/ Schema standardization for top-down semantic transparency - http://www-128.ibm.com/developerworks/xml/library/x-think31.html From walter at livinglogic.de Thu Apr 21 19:39:31 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Thu Apr 21 19:39:34 2005 Subject: [XML-SIG] XIST 2.9 has been released Message-ID: <4267E553.4030403@livinglogic.de> XIST 2.9 has been released! What is it? =========== XIST is an extensible HTML/XML generator written in Python. XIST is also a DOM parser (built on top of SAX2) with a very simple and Pythonesque tree API. Every XML element type corresponds to a Python class, and these Python classes provide a conversion method to transform the XML tree (e.g. into HTML). XIST can be considered "object oriented XSL". What's new in version 2.9? ========================== * XIST trees can now be pickled. The only restriction is that global attributes must come from a namespace that has been turned into a module via makemod, so that this module can be imported on unpickling. * Two arguments of the walk method have been renamed: filtermode has been renamed to inmode and walkmode has been renamed to outmode. For these modes two new values are supported: ll.xist.xsc.walkindex The value passed to the filter function or yielded from the iterator is a list containing child indizes and attribute names that specify the path to the node in question. ll.xist.xsc.walkrootindex The filter function will be called with two arguments: The first is the root node of the tree (i.e. 
the node for which walk has been called), the second one is an index path (just like for ll.xist.xsc.walkindex). If used as an outmode a tuple with these two values will be yielded. * Attribute mappings now support __getitem__, __setitem__ and __delitem__ with list arguments, i.e. you can do: >>> from ll.xist.ns import html >>> e = html.a("gurk", href=("hinz", "kunz")) >>> print e.attrs[["href", 0]] hinz >>> e.attrs[["href", 0]] = "hurz" >>> print e["href"] hurzkunz >>> del e.attrs[["href", 0]] >>> print e["href"] kunz * XML attributes can now be accessed as Python attributes, i.e. >>> from ll.xist.ns import html >>> e = html.a("spam", href="eggs") >>> print e.attrs.href eggs (Don't confuse this with e.Attrs.href which is the attribute class.) * Frag and Element now support Node subclasses in their __getitem__ method: An iterator for all children of the specified type will be returned. * The encoding used for parsing now defaults to None. When reading from an URL and no default encoding has been specified the one from the Content-Type header will be used. If this still doesn't result in a usable encoding, "utf-8" will be used when parsing XML and "iso-8859-1" will be used when parsing broken HTML. * All error and warning classes from ll.xist.errors have been merged into ll.xist.xsc. This avoids import problems with circular imports. * The attributes showLocation and showPath of ll.xist.presenters.TreePresenter have been lowercased and presenters are properly reset after they've done their job. * The class attribute xmlname will no longer be turned into a list containing the Python and the XML name, but will be the XML name only. You can get the Python name from foo.__class__.__name__. * DeprecationWarnings for name and attrHandlers have finally been removed. * Instances of ll.xist.xsc.Entity subclasses can now be compared. __eq__ simply checks if the objects are instances of the same class. 
For changes in older versions see: http://www.livinglogic.de/Python/xist/History.html Where can I get it? =================== XIST can be downloaded from http://ftp.livinglogic.de/xist/ or ftp://ftp.livinglogic.de/pub/livinglogic/xist/ Web pages are at http://www.livinglogic.de/Python/xist/ ViewCVS access is available at http://www.livinglogic.de/viewcvs/ For information about the mailing lists go to http://www.livinglogic.de/Python/xist/Mailinglists.html Bye, Walter Dörwald
From gvwilson at cs.utoronto.ca Tue Apr 26 19:28:45 2005 From: gvwilson at cs.utoronto.ca (Greg Wilson) Date: Tue Apr 26 19:31:48 2005 Subject: [XML-SIG] ann: Data Crunching Message-ID: *ahem* Readers of this newsgroup might be interested in a new book on data crunching, which is available from Amazon: http://www.amazon.com/exec/obidos/ASIN/0974514071 or directly from the Pragmatic Programmers: http://www.pragmaticprogrammer.com/titles/gwd/index.html The book covers basic text processing, regular expressions, XML manipulation, binary data handling, and the 10% of relational databases that every programmer should know. Most of the examples are in Python (though Unix command line tools, XSL, and SQL are in there as well). Hope you enjoy it, Greg Wilson (its proud, but somewhat bashful, author) From morgan.lean at epiphanygames.com.au Wed Apr 27 04:11:22 2005 From: morgan.lean at epiphanygames.com.au (morgan lean) Date: Wed Apr 27 04:11:39 2005 Subject: [XML-SIG] Using XML Message-ID: <20050427021137.B06A91E4006@bag.python.org> Hi Peeps! I have only been using python for a few months now and I love it. I want to create a pure Python XML blog that uses both CGI and sockets to update an XML file on a server, I have decided that I don't want to use a database for a BLOG but xml instead. What I want to do simply append to an XML file at the bottom node from a normal form. What do you think about the idea? Has anyone done this yet and could you give me your code if you have. Cheers Morgan -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/xml-sig/attachments/20050427/78648fe0/attachment.html From olivier.collioud at wipo.int Wed Apr 27 07:30:45 2005 From: olivier.collioud at wipo.int (Olivier Collioud) Date: Wed Apr 27 07:31:33 2005 Subject: [XML-SIG] Using XML Message-ID: Hello Morgan, your love to python will increase again if you take a look at http://www.effbot.org/zone/element-index.htm and use ElementTree (or cElementTree if you are speed addict) to process your XML data. Olivier. >>> "morgan lean" 27/04/05 4:11:22 AM >>> Hi Peeps! I have only been using python for a few months now and I love it. I want to create a pure Python XML blog that uses both CGI and sockets to update an XML file on a server, I have decided that I don't want to use a database for a BLOG but xml instead. What I want to do simply append to an XML file at the bottom node from a normal form. What do you think about the idea? Has anyone done this yet and could you give me your code if you have. Cheers Morgan -- IMPORTANT: This electronic message may contain privileged, confidential and copyright protected information. If you have received this email by mistake, please immediately notify the sender and delete this email and all its attachments. Please ensure all e-mail attachments are scanned for viruses prior to opening or using. Views expressed by individuals in this electronic message do not necessarily reflect the views of the Organization. From faassen at infrae.com Wed Apr 27 14:03:32 2005 From: faassen at infrae.com (Martijn Faassen) Date: Wed Apr 27 13:58:53 2005 Subject: [XML-SIG] Using XML In-Reply-To: References: Message-ID: <426F7F94.3080409@infrae.com> Olivier Collioud wrote: > your love to python will increase again if you take a look at > http://www.effbot.org/zone/element-index.htm and use ElementTree (or > cElementTree if you are speed addict) to process your XML data. 
If you need a bit more in the way of XML features such as XPath, XSLT and Relax NG, still don't want to miss out the friendly API of ElementTree, *and* want nice performance (libxml2), you may also want to consider lxml, which has all of those: http://codespeak.net/lxml Regards, Martijn From noreply at sourceforge.net Fri Apr 29 18:52:04 2005 From: noreply at sourceforge.net (SourceForge.net) Date: Fri Apr 29 18:52:06 2005 Subject: [XML-SIG] [ pyxml-Bugs-1192536 ] marshal.generic wont marshal new style classes Message-ID: Bugs item #1192536, was opened at 2005-04-29 16:52 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=106473&aid=1192536&group_id=6473 Category: DOM Group: None Status: Open Resolution: None Priority: 5 Submitted By: Mark E (markenglish) Assigned to: Nobody/Anonymous (nobody) Summary: marshal.generic wont marshal new style classes Initial Comment: The generic marshaller will not marshal new style classes. Using Python 2.4 on Windows XP sp2 with PyXML 0.8.4 ---start code--- >>> class WillWork: ... pass ... >>> class WontWork(object): ... pass ... >>> will = WillWork() >>> wont = WontWork() >>> import xml.marshal.generic >>> xml.marshal.generic.dumps(will) '' >>> xml.marshal.generic.dumps(wont) Traceback (most recent call last): File "", line 1, in ? 
File "C:\Program Files\Python24\Lib\site- packages\_xmlplus\marshal\generic.py", line 59, in dumps L = [self.PROLOGUE + self.DTD] + self.m_root (value, dict) File "C:\Program Files\Python24\Lib\site- packages\_xmlplus\marshal\generic.py", line 104, in m_root L = ['<%s>' % name] + self._marshal(value,dict) + ['' % name] File "C:\Program Files\Python24\Lib\site- packages\_xmlplus\marshal\generic.py", line 92, in _marshal return getattr(self, meth)(value, dict) AttributeError: Marshaller instance has no attribute 'm_WontWork' ---end code--- ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=106473&aid=1192536&group_id=6473 From Mark.English at liffe.com Fri Apr 29 18:53:04 2005 From: Mark.English at liffe.com (Mark English) Date: Fri Apr 29 18:53:08 2005 Subject: [XML-SIG] xml.marhshal.generic and new style classes Message-ID: <40E605146701DE428FAF21286A97D3090457E4CA@wphexa02.corp.lh.int> I may have missed this being addressed somewhere, but I'm having the following problem. The generic marshaller will not marshal new style classes. I've seen reports of similar problems, but no solutions. ---start code--- For example: >>> class WillWork: ... pass ... >>> class WontWork(object): ... pass ... >>> will = WillWork() >>> wont = WontWork() >>> import xml.marshal.generic >>> xml.marshal.generic.dumps(will) '' >>> xml.marshal.generic.dumps(wont) Traceback (most recent call last): File "", line 1, in ? 
File "C:\Program Files\Python24\Lib\site-packages\_xmlplus\marshal\generic.py", line 59, in dumps L = [self.PROLOGUE + self.DTD] + self.m_root(value, dict) File "C:\Program Files\Python24\Lib\site-packages\_xmlplus\marshal\generic.py", line 104, in m_root L = ['<%s>' % name] + self._marshal(value,dict) + ['' % name] File "C:\Program Files\Python24\Lib\site-packages\_xmlplus\marshal\generic.py", line 92, in _marshal return getattr(self, meth)(value, dict) AttributeError: Marshaller instance has no attribute 'm_WontWork' ---end code--- Has anyone found a fix for this, or is anything planned ? I've raised a bug on sourceforge. Thanks, Mark ----------------------------------------------------------------------- The information contained in this e-mail is confidential and solely for the intended addressee(s). Unauthorised reproduction, disclosure, modification, and/or distribution of this email may be unlawful. If you have received this email in error, please notify the sender immediately and delete it from your system. The views expressed in this message do not necessarily reflect those of LIFFE Holdings Plc or any of its subsidiary companies. -----------------------------------------------------------------------
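
A note on the xml.marshal.generic report in this digest: the traceback ends in getattr(self, meth)(value, dict) failing on 'm_WontWork', which suggests the marshaller picks a handler method from the name of the value's type. Under that reading, an old-style instance has the type name 'instance' and finds a handler, while a new-style object reports its own class name and finds none. The sketch below is a toy written for Python 3 to illustrate that dispatch pattern and one possible fallback; SketchMarshaller, its method names, and its output format are all hypothetical and are not PyXML's code or its XML format.

```python
from xml.sax.saxutils import escape


class SketchMarshaller:
    """Toy marshaller (hypothetical, not PyXML): dispatches on 'm_' + type name."""

    def marshal(self, value):
        # Name-based lookup, the pattern the traceback hints at.
        meth = getattr(self, "m_" + type(value).__name__, None)
        if meth is None and hasattr(value, "__dict__"):
            # Fallback: no handler named after the class, so treat any object
            # that carries instance attributes as a generic instance.
            meth = self.m_instance
        if meth is None:
            raise TypeError("cannot marshal %s" % type(value).__name__)
        return meth(value)

    def m_int(self, value):
        return "<int>%d</int>" % value

    def m_str(self, value):
        return "<string>%s</string>" % escape(value)

    def m_instance(self, value):
        cls = type(value)
        parts = ['<object module="%s" class="%s">' % (cls.__module__, cls.__name__)]
        # Sort attribute names so the output is deterministic.
        for name in sorted(vars(value)):
            inner = self.marshal(vars(value)[name])
            parts.append('<attr name="%s">%s</attr>' % (name, inner))
        parts.append("</object>")
        return "".join(parts)


class WontWork(object):
    pass


if __name__ == "__main__":
    wont = WontWork()
    wont.answer = 42
    wont.label = "spam & eggs"
    print(SketchMarshaller().marshal(wont))
```

Without the hasattr fallback, marshal(wont) would fail the same way the report describes, because there is no method named m_WontWork. Whether a fix of this shape suits PyXML itself is an open question for its maintainers.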