From pythonTutor at venix.com Fri Oct 1 21:43:02 2004 From: pythonTutor at venix.com (Lloyd Kvam) Date: Fri Oct 1 21:43:08 2004 Subject: [XML-SIG] xsl transforms for displaying XML in a browser Message-ID: <1096659781.2655.63.camel@laptop.venix.com> I have a rather complicated (100 tags) XML file that needs to get displayed sensibly in a browser. I've started writing the XSL to transform the document to html. (CSS is too simple and the XML file needs to be readable for people without requiring special software.) Is there a smart way to do this? 
I am thinking of writing a python script to simply generate an XSL file with explicit templates for every tag. I would then modify this collection of boiler-plate code to get a reasonable layout. Any suggestions are welcome. Thanks. -- Lloyd Kvam Venix Corp From tpassin at comcast.net Fri Oct 1 22:52:19 2004 From: tpassin at comcast.net (Thomas B. Passin) Date: Fri Oct 1 22:51:23 2004 Subject: [XML-SIG] xsl transforms for displaying XML in a browser In-Reply-To: <1096659781.2655.63.camel@laptop.venix.com> References: <1096659781.2655.63.camel@laptop.venix.com> Message-ID: <415DC383.3030205@comcast.net> Lloyd Kvam wrote: > I have a rather complicated (100 tags) XML file that needs to get > displayed sensibly in a browser. I've started writing the XSL to > transform the document to html. (CSS is too simple and the XML file > needs to be readable for people without requiring special software.) > > Is there a smart way to do this? I am thinking of writing a python > script to simply generate an XSL file with explicit templates for > every tag. I would then modify this collection of boiler-plate code > to get a reasonable layout. Well, both Mozilla/Firefox and Internet Explorer already will display pretty good pretty-printed versions. Maybe you don't need to do anything special. Cheers, Tom P -- Thomas B. Passin Explorer's Guide to the Semantic Web (Manning Books) http://www.manning.com/catalog/view.php?book=passin From pythonTutor at venix.com Fri Oct 1 23:12:49 2004 From: pythonTutor at venix.com (Lloyd Kvam) Date: Fri Oct 1 23:12:59 2004 Subject: [XML-SIG] xsl transforms for displaying XML in a browser In-Reply-To: <415DC383.3030205@comcast.net> References: <1096659781.2655.63.camel@laptop.venix.com> <415DC383.3030205@comcast.net> Message-ID: <1096665169.2655.71.camel@laptop.venix.com> I did try the lazy route. The default XML display is just not good enough. Pretty printed XML is too cluttered and a simple text extract just runs on. 
Essentially I need to add some labels, blocking, and breaks into the text stream. I'll simply write it out. Thanks. On Fri, 2004-10-01 at 16:52, Thomas B. Passin wrote: > Lloyd Kvam wrote: > > I have a rather complicated (100 tags) XML file that needs to get > > displayed sensibly in a browser. I've started writing the XSL to > > transform the document to html. (CSS is too simple and the XML file > > needs to be readable for people without requiring special software.) > > > > Is there a smart way to do this? I am thinking of writing a python > > script to simply generate an XSL file with explicit templates for > > every tag. I would then modify this collection of boiler-plate code > > to get a reasonable layout. > > Well, both Mozilla/Firefox and Internet Explorer already will display > pretty good pretty-printed versions. Maybe you don't need to do > anything special. > > Cheers, > > Tom P -- Lloyd Kvam Venix Corp From dkuhlman at cutter.rexx.com Sat Oct 2 02:19:44 2004 From: dkuhlman at cutter.rexx.com (Dave Kuhlman) Date: Sat Oct 2 02:19:45 2004 Subject: [XML-SIG] xsl transforms for displaying XML in a browser In-Reply-To: <1096665169.2655.71.camel@laptop.venix.com>; from pythonTutor@venix.com on Fri, Oct 01, 2004 at 05:12:49PM -0400 References: <1096659781.2655.63.camel@laptop.venix.com> <415DC383.3030205@comcast.net> <1096665169.2655.71.camel@laptop.venix.com> Message-ID: <20041001171943.A98218@cutter.rexx.com> On Fri, Oct 01, 2004 at 05:12:49PM -0400, Lloyd Kvam wrote: > I did try the lazy route. The default XML display is just not good > enough. Pretty printed XML is too cluttered and a simple text extract > just runs on. Essentially I need to add some labels, blocking, and > breaks into the text stream. I'll simply write it out. > > Thanks. > > On Fri, 2004-10-01 at 16:52, Thomas B. Passin wrote: > > Lloyd Kvam wrote: > > > I have a rather complicated (100 tags) XML file that needs to get > > > displayed sensibly in a browser. 
I've started writing the XSL to > > > transform the document to html. (CSS is too simple and the XML file > > > needs to be readable for people without requiring special software.) > > > > > > Is there a smart way to do this? I am thinking of writing a python > > > script to simply generate an XSL file with explicit templates for > > > every tag. I would then modify this collection of boiler-plate code > > > to get a reasonable layout. > > > > Well, both Mozilla/Firefox and Internet Explorer already will display > > pretty good pretty-printed versions. Maybe you don't need to do > > anything special. Lloyd - This isn't XSLT, but ... Have you looked at SciTE. It's a text editor that can export to HTML and PDF and several others. I just tried it on an XML file and the resulting HTML and PDF look reasonable. It looks like the exported output is slightly custom-izable. The generated HTML uses CSS, which it embeds in the HTML output file. So you might be able to customize that a bit also. See: http://scintilla.sourceforge.net/SciTE.html For a little information on customization, see the documentation and search for ("export.html" and export.pdf") at: http://scintilla.sourceforge.net/SciTEDoc.html Dave -- Dave Kuhlman http://www.rexx.com/~dkuhlman From lahiru25 at yahoo.com Sat Oct 2 15:45:08 2004 From: lahiru25 at yahoo.com (Lahiru Jayasundera) Date: Sat Oct 2 15:45:12 2004 Subject: [XML-SIG] Python / XML / XSLT vs. Cocoon for website server side Message-ID: <20041002134508.512.qmail@web41408.mail.yahoo.com> dear Richard i'm looking a web site written in XML, i was very happy to see your mail, can you tell me the address of your web site. lahiru _______________________________ Do you Yahoo!? Declare Yourself - Register online to vote today! 
http://vote.yahoo.com From pythonTutor at venix.com Mon Oct 4 04:00:13 2004 From: pythonTutor at venix.com (Lloyd Kvam) Date: Mon Oct 4 04:01:00 2004 Subject: [XML-SIG] xsl transforms for displaying XML in a browser In-Reply-To: <20041001171943.A98218@cutter.rexx.com> References: <1096659781.2655.63.camel@laptop.venix.com> <415DC383.3030205@comcast.net> <1096665169.2655.71.camel@laptop.venix.com> <20041001171943.A98218@cutter.rexx.com> Message-ID: <1096855212.3093.77.camel@laptop.venix.com> Thanks Dave. What I wound up doing was generating a boilerplate xsl file by modifying your generateDS.py program. Rather than generating Python classes I modified the generate function to call code to write xsl transformations for each element. This resulted in about 1400 lines of code. I deleted most of it (where the default to would be adequate) and then fixed the remaining 400 lines to produce useful XHTML and tacked on a style sheet. I decided to work from the schema because my generated XML omits many elements that were unnecessary for my purposes. This way the transform should work with any XML files that conform to this schema. Since the code inserted into generateDS makes little sense on its own, I am enclosing a small script that generates a set of transforms for a given XML file. For me, the benefits of generating the xsl file were: correctly spelled element names complete layout of the structure to be transformed most of the coding now becomes the addition of markup elements On Fri, 2004-10-01 at 20:19, Dave Kuhlman wrote: > On Fri, Oct 01, 2004 at 05:12:49PM -0400, Lloyd Kvam wrote: > > I did try the lazy route. The default XML display is just not good > > enough. Pretty printed XML is too cluttered and a simple text extract > > just runs on. Essentially I need to add some labels, blocking, and > > breaks into the text stream. I'll simply write it out. > > > > Thanks. > > > > On Fri, 2004-10-01 at 16:52, Thomas B. 
Passin wrote: > > > Lloyd Kvam wrote: > > > > I have a rather complicated (100 tags) XML file that needs to get > > > > displayed sensibly in a browser. I've started writing the XSL to > > > > transform the document to html. (CSS is too simple and the XML file > > > > needs to be readable for people without requiring special software.) > > > > > > > > Is there a smart way to do this? I am thinking of writing a python > > > > script to simply generate an XSL file with explicit templates for > > > > every tag. I would then modify this collection of boiler-plate code > > > > to get a reasonable layout. > > > > > > Well, both Mozilla/Firefox and Internet Explorer already will display > > > pretty good pretty-printed versions. Maybe you don't need to do > > > anything special. > > > Lloyd - > > This isn't XSLT, but ... > > Have you looked at SciTE. It's a text editor that can export to > HTML and PDF and several others. I just tried it on an XML file > and the resulting HTML and PDF look reasonable. > > It looks like the exported output is slightly custom-izable. > > The generated HTML uses CSS, which it embeds in the HTML output > file. So you might be able to customize that a bit also. > > See: > > http://scintilla.sourceforge.net/SciTE.html > > For a little information on customization, see the documentation > and search for ("export.html" and export.pdf") at: > > http://scintilla.sourceforge.net/SciTEDoc.html > > Dave -- Lloyd Kvam Venix Corp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: xml2xsl.py Type: text/x-python Size: 1121 bytes Desc: not available Url : http://mail.python.org/pipermail/xml-sig/attachments/20041003/1af35bc0/xml2xsl.py From konrad.hinsen at laposte.net Mon Oct 4 15:57:55 2004 From: konrad.hinsen at laposte.net (konrad.hinsen@laposte.net) Date: Mon Oct 4 15:56:57 2004 Subject: [XML-SIG] Parsing a unicode string Message-ID: <640312DE-160D-11D9-BCFC-000A95999556@laposte.net> Is there any straightforward way to parse XML data from an existing Unicode string? I have found a procedure that works, but seems unnecessarily complicated:

    xml_data = xml_data.encode('utf-8')
    xml_data = '\n'.join(xml_data.split('\n')[1:])
    xml_data = '\n' + xml_data
    dom_tree = xml.dom.minidom.parseString(xml_data)

My first attempt was

    dom_tree = xml.dom.minidom.parseString(xml_data)

and I still think that would be the nicest way. Konrad. -- --------------------------------------------------------------------- Konrad Hinsen Laboratoire Léon Brillouin, CEA Saclay, 91191 Gif-sur-Yvette Cedex, France Tel.: +33-1 69 08 79 25 Fax: +33-1 69 08 82 61 E-Mail: hinsen@llb.saclay.cea.fr --------------------------------------------------------------------- From lists at futuresign.de Mon Oct 4 17:30:24 2004 From: lists at futuresign.de (Christian Ledermann) Date: Mon Oct 4 17:29:22 2004 Subject: [XML-SIG] xbel Bookmark Collections Message-ID: <1096903823.17715.6.camel@web.futuresign.de> under http://www.maenner-club.de/Links/plone/xbel a plone bookmark collection is available in the xbel format. there are other collections available as well, just apply /xbel instead of /bookmark_view to the url in any bookmarkfolder on this site to get an xbel file ;) this collection is maintained with ATBookmarks an xbel supporting software build with Plone. 
cheers christian From mike at skew.org Tue Oct 5 05:57:15 2004 From: mike at skew.org (Mike Brown) Date: Tue Oct 5 05:57:18 2004 Subject: [XML-SIG] Parsing a unicode string In-Reply-To: <640312DE-160D-11D9-BCFC-000A95999556@laposte.net> "from konrad.hinsen@laposte.net at Oct 4, 2004 03:57:55 pm" Message-ID: <200410050357.i953vF2K067528@chilled.skew.org> konrad.hinsen@laposte.net wrote: > Is there any straightforward way to parse XML data from an existing > Unicode string? I have found a procedure that works, but seems > unnecessarily complicated: > > xml_data = xml_data.encode('utf-8') > xml_data = '\n'.join(xml_data.split('\n')[1:]) > xml_data = '\n' + xml_data > dom_tree = xml.dom.minidom.parseString(xml_data) You've got the general idea - 1. encode it 2. notify the parser of the encoding Parsers should have a way to be notified of encoding externally; by the rules of XML parsing as defined in the XML spec, such an external declaration takes precedence. Unfortunately, even though parsers may have the necessary APIs, their Python wrappers may not. We didn't even add it to 4Suite until a few months ago. So you're pretty much left with the option you've gone with - rewrite the encoding declaration in the XML itself. The method you are using in the code above is obviously assuming a bit more than it should-- there might not be any line breaks or an XML declaration at all. A regular expression would be better. 
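A minimal sketch of the regular-expression approach suggested above, assuming Python 3 (where `str` is the Unicode type) and the standard-library `xml.dom.minidom`. The helper name `parse_unicode_xml` and the pattern `_XML_DECL` are illustrative and do not come from the original posts:

```python
import re
import xml.dom.minidom

# Matches an XML declaration at the very start of the text, if present.
# Requiring whitespace after "xml" avoids matching e.g. <?xml-stylesheet ...?>.
_XML_DECL = re.compile(r'\A<\?xml\s[^?]*\?>')


def parse_unicode_xml(text):
    """Parse XML held in a text (Unicode) string; return a minidom Document."""
    # 1. Drop any existing declaration, since its encoding may be stale.
    body = _XML_DECL.sub('', text, count=1)
    # 2. Prepend a declaration that matches the encoding we are about to use.
    declared = '<?xml version="1.0" encoding="utf-8"?>\n' + body
    # 3. Encode, then hand the parser a byte string.
    return xml.dom.minidom.parseString(declared.encode('utf-8'))


# Works whether or not the input carries its own declaration or line breaks.
doc = parse_unicode_xml(
    '<?xml version="1.0" encoding="iso-8859-1"?>\n<doc>L\u00e9on</doc>')
print(doc.documentElement.firstChild.data)
```

Recent Python 3 versions also accept a `str` in `parseString` directly, so this rewrite mainly mirrors the workaround discussed in the thread.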
-Mike From konrad.hinsen at laposte.net Tue Oct 5 09:31:51 2004 From: konrad.hinsen at laposte.net (konrad.hinsen@laposte.net) Date: Tue Oct 5 09:31:55 2004 Subject: [XML-SIG] Parsing a unicode string In-Reply-To: <200410050357.i953vF2K067528@chilled.skew.org> References: <200410050357.i953vF2K067528@chilled.skew.org> Message-ID: <9FB347EA-16A0-11D9-9585-000A95AB5F10@laposte.net> On 05.10.2004, at 05:57, Mike Brown wrote: > Parsers should have a way to be notified of encoding externally; > by the rules of XML parsing as defined in the XML spec, such an > external declaration takes precedence. I'd also expect parsers to accept unicode string objects with no encoding specification whatsoever. Decoding a Unicode encoding and parsing XML are two distinct steps, so why not propose them as distinct functions? > The method you are using in the code above is obviously assuming > a bit more than it should-- there might not be any line breaks > or an XML declaration at all. A regular expression would be better. > Yes, but this is for code that knows where its XML data comes from. Just an excuse for being lazy :-) Konrad. From fredrik at pythonware.com Tue Oct 5 12:01:16 2004 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue Oct 5 11:59:15 2004 Subject: [XML-SIG] Re: Parsing a unicode string References: <200410050357.i953vF2K067528@chilled.skew.org> <9FB347EA-16A0-11D9-9585-000A95AB5F10@laposte.net> Message-ID: konrad.hinsen@laposte.net wrote: > I'd also expect parsers to accept unicode string objects with no encoding specification > whatsoever. Decoding a Unicode encoding and parsing XML are two distinct steps not really; XML is defined in terms of encoded bytestreams. if an entity is stored in a Python Unicode string, it's not really an XML entity. it just looks like one. (it doesn't make much sense from a design perspective either. if you use standard XML tools, you get encoded streams. 
if you use Python code to generate it, you might as well generate a DOM or DOM-like structure in the first place...). From konrad.hinsen at laposte.net Tue Oct 5 14:50:59 2004 From: konrad.hinsen at laposte.net (konrad.hinsen@laposte.net) Date: Tue Oct 5 14:50:02 2004 Subject: [XML-SIG] Re: Parsing a unicode string In-Reply-To: References: <200410050357.i953vF2K067528@chilled.skew.org> <9FB347EA-16A0-11D9-9585-000A95AB5F10@laposte.net> Message-ID: <3510865C-16CD-11D9-A1FD-000A95999556@laposte.net> On Oct 5, 2004, at 12:01, Fredrik Lundh wrote: > (it doesn't make much sense from a design perspective either. if you > use > standard XML tools, you get encoded streams. if you use Python code > to generate it, you might as well generate a DOM or DOM-like structure > in the first place...). > In my case, the data is available inside a Python-scriptable editor (Leo). For Leo it's just a piece of Unicode text, the script I am adding knows it is XML and generates a DOM representation (or maybe something else soon, everytime I use DOM I hate it more). Konrad. 
From mike at skew.org Tue Oct 5 21:10:16 2004 From: mike at skew.org (Mike Brown) Date: Tue Oct 5 21:10:15 2004 Subject: [XML-SIG] Re: Parsing a unicode string In-Reply-To: "from Fredrik Lundh at Oct 5, 2004 12:01:16 pm" Message-ID: <200410051910.i95JAGl1072473@chilled.skew.org> Fredrik Lundh wrote: > > I'd also expect parsers to accept unicode string objects with no encoding specification > > whatsoever. Decoding a Unicode encoding and parsing XML are two distinct steps > > not really; XML is defined in terms of encoded bytestreams. To clarify for Konrad's benefit - XML syntax is defined in terms of ISO/IEC 10646 characters. XML parsing is defined in terms of encoded byte streams. If the XML spec weren't so strict about what a parser must do, it would be able to operate on pre-decoded streams. 
But as it is, the lowest-level parser must play dumb, and any Unicode-friendliness must be provided by a higher layer. SAX for example does accept Unicode character streams as entities and specifies that any encoding declaration appearing in the stream will be ignored, which is technically a violation of a couple of rules, e.g. that the declaration must be accurate :) From veillard at redhat.com Tue Oct 5 23:16:12 2004 From: veillard at redhat.com (Daniel Veillard) Date: Tue Oct 5 23:16:32 2004 Subject: [XML-SIG] Re: Parsing a unicode string In-Reply-To: <200410051910.i95JAGl1072473@chilled.skew.org> References: <200410051910.i95JAGl1072473@chilled.skew.org> Message-ID: <20041005211612.GH29015@redhat.com> On Tue, Oct 05, 2004 at 01:10:16PM -0600, Mike Brown wrote: > Fredrik Lundh wrote: > > > I'd also expect parsers to accept unicode string objects with no encoding specification > > > whatsoever. Decoding a Unicode encoding and parsing XML are two distinct steps > > > > not really; XML is defined in terms of encoded bytestreams. > > To clarify for Konrad's benefit - > > XML syntax is defined in terms of ISO/IEC 10646 characters. > XML parsing is defined in terms of encoded byte streams. > > If the XML spec weren't so strict about what a parser must do, it would be > able to operate on pre-decoded streams. But as it is, the lowest-level parser > must play dumb, and any Unicode-friendliness must be provided by a higher > layer. Actually it should not be a problem: http://www.w3.org/TR/REC-xml/#sec-guessing-with-ext-info "The second possible case occurs when the XML entity is accompanied by encoding information, as in some file systems and some network protocols." 
this is typically the case if your environment tells you the data is available in a given encoding (UCS4/UCS2/UTF-8 usually) "When multiple sources of information are available, their relative priority and the preferred method of handling conflict should be specified as part of the higher-level protocol used to deliver XML." One could argue that the internal API is a very high level protocol or simply use "If an XML entity is in a file, the Byte-Order Mark and encoding declaration are used (if present) to determine the character encoding." and in the case of strings if the BOM is present it will tell the right way to decode the data at the parser level. Daniel -- Daniel Veillard | Red Hat Desktop team http://redhat.com/ veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/ From tpassin at comcast.net Tue Oct 5 23:33:46 2004 From: tpassin at comcast.net (Thomas B. Passin) Date: Tue Oct 5 23:32:38 2004 Subject: [XML-SIG] Re: Parsing a unicode string In-Reply-To: <3510865C-16CD-11D9-A1FD-000A95999556@laposte.net> References: <200410050357.i953vF2K067528@chilled.skew.org> <9FB347EA-16A0-11D9-9585-000A95AB5F10@laposte.net> <3510865C-16CD-11D9-A1FD-000A95999556@laposte.net> Message-ID: <4163133A.1070208@comcast.net> konrad.hinsen@laposte.net wrote: > In my case, the data is available inside a Python-scriptable editor > (Leo). Speaking of Leo, which version are you using? I'm very fond of it, but I had serious problems with version 4, so I've been sticking with 3.12. Cheers, Tom P -- Thomas B. 
Passin Explorer's Guide to the Semantic Web (Manning Books) http://www.manning.com/catalog/view.php?book=passin From mike at skew.org Wed Oct 6 01:46:07 2004 From: mike at skew.org (Mike Brown) Date: Wed Oct 6 01:46:06 2004 Subject: [XML-SIG] Re: Parsing a unicode string In-Reply-To: <20041005211612.GH29015@redhat.com> "from Daniel Veillard at Oct 5, 2004 05:16:12 pm" Message-ID: <200410052346.i95Nk7Aq074658@chilled.skew.org> Daniel Veillard wrote: > > If the XML spec weren't so strict about what a parser must do, it would be > > able to operate on pre-decoded streams. But as it is, the lowest-level parser > > must play dumb, and any Unicode-friendliness must be provided by a higher > > layer. > > Actually it should not be a problem: > http://www.w3.org/TR/REC-xml/#sec-guessing-with-ext-info > > "The second possible case occurs when the XML entity is accompanied by > encoding information, as in some file systems and some network protocols." > > this is typically the case if your environment tells you the data is > available in a given encoding (UCS4/UCS2/UTF-8 usually) Oh, I'm sure that's still talking about parsing bytes, though. "It's already decoded to a character sequence / encoding does not apply" is not the kind of external encoding information that they are talking about there, or anywhere else. That would be a very liberal reading of the spec, I think. But perhaps this is a discussion for xml-dev. I'm not about to rejoin that forum, though. As I just told someone else, xml-dev, to me, was too many engineers coming up with too many solutions to too many problems that they, themselves, have willed into existence. :) Nevertheless, I think it would be a good idea for all of Python's XML parsing APIs to support external encoding declarations, so it would at least be possible to blindly encode to whatever your favorite encoding is and then notify the parser accordingly. 
Like I said, this functionality only went into 4Suite a few months ago [1], and I went a bit out of my way to make it properly use the encoding information from HTTP streams and to follow the rules of RFCs 3023 and 2616. -Mike [1] documented here: http://uche.ogbuji.net/tech/akara/nodes/2004-06-12/external-encoding 
From konrad.hinsen at laposte.net Wed Oct 6 09:28:47 2004 From: konrad.hinsen at laposte.net (konrad.hinsen@laposte.net) Date: Wed Oct 6 09:28:49 2004 Subject: [XML-SIG] Re: Parsing a unicode string In-Reply-To: <4163133A.1070208@comcast.net> References: <200410050357.i953vF2K067528@chilled.skew.org> <9FB347EA-16A0-11D9-9585-000A95AB5F10@laposte.net> <3510865C-16CD-11D9-A1FD-000A95999556@laposte.net> <4163133A.1070208@comcast.net> Message-ID: <5C58509E-1769-11D9-991C-000A95AB5F10@laposte.net> On 05.10.2004, at 23:33, Thomas B. Passin wrote: > konrad.hinsen@laposte.net wrote: >> In my case, the data is available inside a Python-scriptable editor >> (Leo). > > Speaking of Leo, which version are you using? I'm very fond of it, but > I had serious problems with version 4, so I've been sticking with > 3.12. > I discovered Leo recently, so I started with 4.2 in the beta phase and now I use 4.2final. The relatively few problems that I encountered are more likely to blame on the Mac port of Tk (TclTkAqua) than on Leo itself. Leo is a terrific tool for some applications. The project that lead to my XML questions is management of bibliographical data, which greatly profits both from the Outline approach and from the tight link between data and Python scripts in Leo. Konrad. 
From konrad.hinsen at laposte.net Wed Oct 6 09:37:17 2004 From: konrad.hinsen at laposte.net (konrad.hinsen@laposte.net) Date: Wed Oct 6 09:37:21 2004 Subject: [XML-SIG] Re: Parsing a unicode string In-Reply-To: <200410051910.i95JAGl1072473@chilled.skew.org> References: <200410051910.i95JAGl1072473@chilled.skew.org> Message-ID: <8C664140-176A-11D9-991C-000A95AB5F10@laposte.net> On 05.10.2004, at 21:10, Mike Brown wrote: > To clarify for Konrad's benefit - > > XML syntax is defined in terms of ISO/IEC 10646 characters. > XML parsing is defined in terms of encoded byte streams. Interesting. I had always thought of XML as a (unicode) text representation of structured data, and of the encoding as a means to make it compatible with the currently dominating world of byte streams. What does one gain by marrying XML to byte streams? If some day in the future 32-bit units becomes the smallest useful ones in computing, this will just cause compatibility headaches. Konrad. From fdrake at acm.org Wed Oct 6 16:58:50 2004 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed Oct 6 16:58:59 2004 Subject: [XML-SIG] Re: Parsing a unicode string In-Reply-To: <8C664140-176A-11D9-991C-000A95AB5F10@laposte.net> References: <200410051910.i95JAGl1072473@chilled.skew.org> <8C664140-176A-11D9-991C-000A95AB5F10@laposte.net> Message-ID: <200410061058.51142.fdrake@acm.org> On Wednesday 06 October 2004 03:37 am, konrad.hinsen@laposte.net wrote: > What does one gain by marrying XML to byte streams? If some day in the > future 32-bit units becomes the smallest useful ones in computing, this > will just cause compatibility headaches. All serialization formats end up being tied to byte streams. Files on disk are byte streams. Data comes over a socket as a byte stream. It's perfectly ok for the encoding used to be a 16-bit encoding; those should work just fine for XML. I think it's possible for a low-level parser API to accept Python Unicode objects and use the internal encoding for them. 
One catch is that for each additional chunk of data, it needs to always check that it gets Unicode. Tedious, but doable. (I just haven't had time to do this for the Expat bindings yet.) -Fred -- Fred L. Drake, Jr. From tpassin at comcast.net Wed Oct 6 21:19:34 2004 From: tpassin at comcast.net (Thomas B. Passin) Date: Wed Oct 6 21:18:25 2004 Subject: [XML-SIG] Re: Parsing a unicode string In-Reply-To: <8C664140-176A-11D9-991C-000A95AB5F10@laposte.net> References: <200410051910.i95JAGl1072473@chilled.skew.org> <8C664140-176A-11D9-991C-000A95AB5F10@laposte.net> Message-ID: <41644546.8000904@comcast.net> konrad.hinsen@laposte.net wrote: > On 05.10.2004, at 21:10, Mike Brown wrote: > >> To clarify for Konrad's benefit - >> >> XML syntax is defined in terms of ISO/IEC 10646 characters. >> XML parsing is defined in terms of encoded byte streams. > > > Interesting. I had always thought of XML as a (unicode) text > representation of structured data, and of the encoding as a means to > make it compatible with the currently dominating world of byte streams. > > What does one gain by marrying XML to byte streams? If some day in the > future 32-bit units becomes the smallest useful ones in computing, this > will just cause compatibility headaches. Well, it isn't really married to byte streams, exactly. The xml Rec says - "Definition: A parsed entity contains text, a sequence of characters, which may represent markup or character data.] [Definition: A character is an atomic unit of text as specified by ISO/IEC 10646:2000 [ISO/IEC 10646]. Legal characters are tab, carriage return, line feed, and the legal characters of Unicode and ISO/IEC 10646." So the key things are the *sequence of characters*, and that a character is an iso/iec 10646 atomic unit. It may be that as a practical matter of network implementation, the sequence of characters is handled as a stream of bytes, but the XML Rec does not say any such thing. 
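To make the distinction between a sequence of characters and its byte-level encoding concrete, here is a small sketch, assuming Python 3 and the standard-library `xml.dom.minidom`. The sample document and the variable names are invented for illustration. The same characters are serialized in two encodings and parsed from each byte stream:

```python
import xml.dom.minidom

# One sequence of characters (the content is a made-up example).
body = "<doc>L\u00e9on Brillouin</doc>"

# Two different byte-level serializations of those same characters.
as_utf8 = ('<?xml version="1.0" encoding="utf-8"?>' + body).encode("utf-8")
# Python's "utf-16" codec prepends a byte-order mark to the output.
as_utf16 = ('<?xml version="1.0" encoding="utf-16"?>' + body).encode("utf-16")

text_from_utf8 = xml.dom.minidom.parseString(as_utf8).documentElement.firstChild.data
text_from_utf16 = xml.dom.minidom.parseString(as_utf16).documentElement.firstChild.data

# The byte streams differ, while the parsed character data is the same.
print(as_utf8 == as_utf16, text_from_utf8 == text_from_utf16)
```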
Of course, an xml processor has to be able to handle utf-8 and utf-16
encodings, so in that sense it does have to know about byte streams. If
you generalize from byte streams to character sequences, then yes, that
is exactly what xml is about. That's why some people keep insisting that
xml is "bits on the wire".

Cheers,

Tom P

--
Thomas B. Passin
Explorer's Guide to the Semantic Web (Manning Books)
http://www.manning.com/catalog/view.php?book=passin

From fredrik at pythonware.com Sat Oct 9 15:04:50 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat Oct 9 15:02:47 2004
Subject: [XML-SIG] ANN: ElementTree 1.2.1 (october 9, 2004)
Message-ID:

The Element type is a simple but flexible container object, designed to
store hierarchical data structures, such as simplified XML infosets, in
memory. The ElementTree package provides a Python implementation of this
type, plus code to serialize element trees to and from XML files.

ElementTree 1.2.1 is 1.2 plus performance improvements that have been
backported from the 1.3 development version. The new release is 20-30%
faster than 1.2, on many kinds of XML documents.
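
As a rough sketch of what "container object plus serialization" means in practice, the following builds a tiny tree, writes it out as XML and parses it back. It assumes the xml.etree.ElementTree module bundled with current Python 3 rather than the standalone 1.2.1 package announced here, and the element and attribute names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Build a small tree in memory; tag and attribute names are made up.
root = ET.Element("releases")
item = ET.SubElement(root, "release", version="1.2.1")
item.text = "performance backports"

# Serialize the tree to XML text, then parse that text into a new tree.
xml_text = ET.tostring(root, encoding="unicode")
parsed = ET.fromstring(xml_text)

# Elements behave like simple containers: find children, read attributes.
first = parsed.find("release")
print(xml_text)
print(first.get("version"), first.text)
```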
You can get the ElementTree toolkit from:

    http://effbot.org/downloads

Brief documentation and some code samples (including an XML-RPC
unmarshaller in 16 lines) are available from:

    http://effbot.org/zone/element.htm

enjoy /F

From uche.ogbuji at fourthought.com Sat Oct 9 21:52:21 2004
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Sat Oct 9 21:52:24 2004
Subject: [XML-SIG] 4Suite now works with mod_python just fine
In-Reply-To: <8BCEFA2B-0CFF-11D9-9EC0-0003938E2CCE@utas.edu.au>
References: <200409222144.i8MLirYM077390@chilled.skew.org> <8BCEFA2B-0CFF-11D9-9EC0-0003938E2CCE@utas.edu.au>
Message-ID: <1097351540.3417.27054.camel@borgia>

On Wed, 2004-09-22 at 19:26, Thomas Henry Sutton wrote:
> Finally: we are using libxml2 and libxslt because we had to abandon
> 4Suite due to instability. Are there known problems with using 4Suite
> with ModPython and Apache 2? We had a range of problems (complaining
> about the expat version 'til we recompiled Apache, Apache SEGFAULTing
> when trying to do XSLT processing, etc) which suggest to me at least
> that there are some problems with the way 4Suite does things.

No. Actually the bugs turned out to be in *Python*. There has been a
long history of problems running 4Suite under mod_python, and they
baffled everyone until recently. As one of the main people working with
the problem, Robert Sanderson says:

"[T]he reason that mod_python wouldn't work before (Python 2.2) was that
the python interpreter used a busted method to determine if it was in
restricted mode or not and it would think that mod_python was restricted
when it wasn't, so the unicode codecs couldn't be loaded [This problem]
was fixed in 2.3, thankfully, as it was a showstopper."

You also mentioned segfaults. Almost all XML parsing segfaults reported
under 4Suite and pyexpat have been traced to recent, broken expat builds
(fixed late last year). Making sure you have the latest code across the
board is *highly* advisable.
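
One quick check along these lines is to ask Python which expat it was built against. The sketch below is an illustration for a current Python, not something from the thread: it uses the standard xml.parsers.expat module and reports only the expat linked into Python's own pyexpat, so it says nothing about a copy bundled with 4Suite or loaded by Apache.

```python
import xml.parsers.expat as expat

# EXPAT_VERSION is a string; version_info is a (major, minor, micro) tuple.
expat_version = expat.EXPAT_VERSION
expat_version_info = expat.version_info

print(expat_version, expat_version_info)
```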
I'd be *very* surprised if you get seg faults outside such
circumstances, as the 4Suite core libs (as opposed to the server
platform) have proven rock solid stable for years now.

There is one residual problem in that Apache seems to confuse the
dynamic loader to load the wrong version of expat (i.e. not the version
4Suite builds in). He did get us to work around that problem by
upgrading 4Suite's built-in expat to 1.95.8, which I did yesterday.

Now, using the latest 4Suite, we have 2 success reports with mod_python,
in situations that used not to work.

So if libxslt is working for you, it's all good, but if you do want to
use 4Suite, as long as you grab the latest code base, all should be
well. Let us know if not.

--
Uche Ogbuji
Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
A hands-on introduction to ISO Schematron - http://www-106.ibm.com/developerworks/edu/x-dw-xschematron-i.html
Schematron abstract patterns - http://www.ibm.com/developerworks/xml/library/x-stron.html
Wrestling HTML (using Python) - http://www.xml.com/pub/a/2004/09/08/pyxml.html
Enterprise data goes high fashion - http://www.adtmag.com/article.asp?id=10061
Principles of XML design: Considering container elements - http://www-106.ibm.com/developerworks/xml/library/x-contain.html
Hacking XML Hacks - http://www-106.ibm.com/developerworks/xml/library/x-think26.html
A survey of XML standards - http://www-106.ibm.com/developerworks/xml/library/x-stand4/

From uche.ogbuji at fourthought.com Sat Oct 9 22:34:35 2004
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Sat Oct 9 22:34:37 2004
Subject: [XML-SIG] Re: Parsing a unicode string
In-Reply-To: <8C664140-176A-11D9-991C-000A95AB5F10@laposte.net>
References: <200410051910.i95JAGl1072473@chilled.skew.org> <8C664140-176A-11D9-991C-000A95AB5F10@laposte.net>
Message-ID: <1097354074.3417.27145.camel@borgia>

On Wed, 2004-10-06 at 01:37, konrad.hinsen@laposte.net wrote:
> On 05.10.2004, at 21:10, Mike Brown wrote:
>
> > To clarify for Konrad's benefit -
> >
> > XML syntax is defined in terms of ISO/IEC 10646 characters.
> > XML parsing is defined in terms of encoded byte streams.
>
> Interesting. I had always thought of XML as a (unicode) text
> representation of structured data, and of the encoding as a means to
> make it compatible with the currently dominating world of byte streams.
>
> What does one gain by marrying XML to byte streams? If some day in the
> future 32-bit units becomes the smallest useful ones in computing, this
> will just cause compatibility headaches.

Unicode is an abstraction. It doesn't really make sense to try defining
an XML *parser* as operating on Unicode. Python uses a special data
structure to represent Unicode. Surely you don't expect the XML spec to
define parsing as some transformation on this data structure? It really
only makes sense to describe XML parsing in terms of byte streams.

Now the character model, which is the *result* of parsing, *is* defined
in terms of abstract Unicode.

It's a bit twisty, but the way XML sorts this out makes perfect sense.

--
Uche Ogbuji
Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com

From and-xml at doxdesk.com Sun Oct 10 19:52:07 2004
From: and-xml at doxdesk.com (Andrew Clover)
Date: Sun Oct 10 18:50:44 2004
Subject: [XML-SIG] Re: Parsing a unicode string
In-Reply-To: <1097354074.3417.27145.camel@borgia>
References: <200410051910.i95JAGl1072473@chilled.skew.org> <8C664140-176A-11D9-991C-000A95AB5F10@laposte.net> <1097354074.3417.27145.camel@borgia>
Message-ID: <416976C7.7080409@doxdesk.com>

> It really only makes sense to describe XML parsing in terms of byte
> streams.

Certainly this has traditionally been the case. In DOM Level 3 LS,
however, LSInput can now specify a character input source
(characterStream or stringData properties) in which no attempt is made
to do byte-to-character decoding.

There was a bit of a kerfuffle over what inputEncoding such Documents
should report; 'utf-16' was decided on as this is DOM's native string
type.

Unfortunately this doesn't quite hang with Python where a DOM-acceptable
string might be narrow or, in the case where Python is compiled with
wide chars, 32 bits long. (pxdom plumps for reporting 'utf-8' and
'utf-32' in these cases, but it's not really clear-cut.)
Anyway as a consequence pxdom can indeed accept Unicode strings to
parseString, but this can't be relied upon for other implementations,
especially DOM Level 2 ones.

--
Andrew Clover
mailto:and@doxdesk.com
http://www.doxdesk.com/

From hchan2 at stuy.edu Tue Oct 12 02:33:05 2004
From: hchan2 at stuy.edu (Henry Chan)
Date: Tue Oct 12 02:33:35 2004
Subject: [XML-SIG] Limit XML Parsing with SAX
Message-ID: <416B2641.4010508@stuy.edu>

Hey there,

I'm a complete newbie to the world of XML and really python in general.
I think I only clocked about a month in python. Anyway, I was writing
this script to parse the releases off of this anime webpage. I want it
to list the 5 most recent release elements. I've written something up
using SAX as I was reading the Python/XML How-to. It works but it's
super slow. The way in which I have limited the amount of release
elements is by having a variable that increments each time a release
element is encountered. Then, it checks against the limit. If it's >=
the limit, it returns. I have that at the beginning of startElement().
It's pretty sloppy. I was hoping there would be a way to stop parsing
once this limit is hit instead of continue parsing while ignoring the
results.

-Henry

From fredrik at pythonware.com Tue Oct 12 10:43:15 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue Oct 12 10:41:14 2004
Subject: [XML-SIG] Re: Limit XML Parsing with SAX
References: <416B2641.4010508@stuy.edu>
Message-ID:

Henry Chan wrote:

> I'm a complete newbie to the world of XML and really python in general. I think I only clocked
> about a month in python. Anyway, I was writing this script to parse the releases off of this anime
> webpage. I want it to list the 5 most recent release elements. I've written something up using SAX
> as I was reading the Python/XML How-to. It works but it's super slow.
> The way in which I have
> limited the amount of release elements is by having a variable that increments each time a release
> element is encountered. Then, it checks against the limit. If it's >= the limit, it returns. I
> have that at the beginning of startElement(). It's pretty sloppy. I was hoping there would be a
> way to stop parsing once this limit is hit instead of continue parsing while ignoring the results.

raise an exception inside the method, and catch it outside the parser
call. here's an example:

    http://article.gmane.org/gmane.comp.python.xml/2981

From matthias-gmane at mteege.de Tue Oct 12 10:41:12 2004
From: matthias-gmane at mteege.de (Matthias Teege)
Date: Tue Oct 12 11:20:57 2004
Subject: [XML-SIG] no parent?
Message-ID: <864ql0l45d.fsf@mut.mteege.de>

Moin,

I'm new to pyxml and try to replace a placeholder in a Openoffice file
with some data from a database. I use the following python code:

    import xml.dom
    import xml.dom.ext
    import xml.dom.minidom
    import xml.parsers.expat
    import sys
    from zipfile import *
    from StringIO import *

    def fillData( filename ):
        dataSource = StringIO (inFile.read( filename ))
        tempFileName = "/tmp/workfile"
        dataSink = open(tempFileName, "w")
        document = xml.dom.minidom.parse( dataSource )
        elements = document.getElementsByTagName("text:placeholder")
        for element in elements:
            parent = element.parentNode
            print element, parent
            children = parent.childNodes
            i = len(children)-1
            while (i >= 0):
                if (children[i].nodeName == "text:placeholder"):
                    parent.removeChild( children[i] )
                    # not remove, replace with text
                i = i - 1
        xml.dom.ext.Print( document, dataSink )

    inFile = ZipFile( sys.argv[1] )
    fillData('content.xml')
    inFile.close

Because I'm new to XML I first try to remove the "text:placeholder"
nodes from the file. There are two placeholders in my sample file but
If I run the code I get the following:

    None
    Traceback (most recent call last):
      File "./oo.py", line 32, in ?
        fillData('content.xml')
      File "./oo.py", line 22, in fillData
        children = parent.childNodes
    AttributeError: 'NoneType' object has no attribute 'childNodes'

Where is the parent of the second placeholder? The relevant part from
content.xml looks like this:

    ... Template This is some text but I need to replace <Name> with
    some usefull text like <usefull> . And here som more text.

Is there a better way to replace the complete placeholder tag with
useful text?

See you,
Matthias

--
Matthias Teege -- http://www.mteege.de
make world not war

From Brian.Reynolds at risaris.com Tue Oct 12 16:58:29 2004
From: Brian.Reynolds at risaris.com (Brian Reynolds)
Date: Tue Oct 12 16:58:39 2004
Subject: [XML-SIG] PrettyPrint question
Message-ID: <1097593108.2475.33.camel@lxbre.risaris.com>

Hi All,

I'm using the PrettyPrint function to print a XSD contained in a DOM
tree. What I would like to do is ensure that the attributes are printed
with the double-quotes instead of single quotes.
Is there any way of doing this? I'm using pyxml v0.8.3.

Just to give an example, I've got this bit of code:

    from xml.dom.ext.reader import Sax2
    from xml.dom.ext import *
    myString = "data"
    reader = Sax2.Reader()
    doc = reader.fromString( myString )
    PrettyPrint( doc)

and executing this:

    data

Thanks in advance,
Brian

From dkuhlman at cutter.rexx.com Tue Oct 12 18:39:46 2004
From: dkuhlman at cutter.rexx.com (Dave Kuhlman)
Date: Tue Oct 12 18:39:47 2004
Subject: [XML-SIG] PrettyPrint question
In-Reply-To: <1097593108.2475.33.camel@lxbre.risaris.com>; from Brian.Reynolds@risaris.com on Tue, Oct 12, 2004 at 03:58:29PM +0100
References: <1097593108.2475.33.camel@lxbre.risaris.com>
Message-ID: <20041012093946.A95319@cutter.rexx.com>

On Tue, Oct 12, 2004 at 03:58:29PM +0100, Brian Reynolds wrote:
> Hi All,
>
> I'm using the PrettyPrint function to print a XSD contained in a DOM
> tree. What I would like to do is ensure that the attributes are printed
> with the double-quotes instead of single quotes.
> Is there any way of doing this? I'm using pyxml v0.8.3.
>

What about trying the following:

    from xml.dom import minidom
    doc = minidom.parse('mydoc.xml')
    root = doc.documentElement
    print root.toprettyxml()

Dave

--
Dave Kuhlman
http://www.rexx.com/~dkuhlman

From and-xml at doxdesk.com Wed Oct 13 01:55:47 2004
From: and-xml at doxdesk.com (Andrew Clover)
Date: Wed Oct 13 00:54:19 2004
Subject: [XML-SIG] no parent?
In-Reply-To: <864ql0l45d.fsf@mut.mteege.de>
References: <864ql0l45d.fsf@mut.mteege.de>
Message-ID: <416C6F03.1000304@doxdesk.com>

Matthias Teege wrote:

> elements = document.getElementsByTagName("text:placeholder")
> for element in elements:
>     parent = element.parentNode
>     children = parent.childNodes
>     i = len(children)-1
>     while (i >= 0):
>         if (children[i].nodeName == "text:placeholder"):
>             parent.removeChild( children[i] )
>             # not remove, replace with text
>         i = i - 1

I'm not sure what your inner 'while' loop is trying to do here. Having
got hold of a placeholder element, you then try to find all its siblings
that are placeholder elements too, and replace *them*? However because
they're also placeholder elements they will already be in the list used
by the outer 'for' loop.
Remove one of the placeholder elements from the document and it will
still be in the list used by the outer loop.(*) So when the outer loop
gets to its next element you could well have already removed it from the
document - thus it will have no parent and parentNode will be None as
you encountered.

> Is there a better way to replace the complete placeholder tag with
> useful text?

Well, at a guess:

    placeholders= document.getElementsByTagName('text:placeholder')
    for placeholder in list(placeholders):
        text= document.createTextNode('Hello!') # or whatever
        placeholder.parentNode.replaceChild(text, placeholder)

(assuming you don't have any nested placeholder elements!)

(*) - this is actually a slight misbehaviour from minidom. When you do a
getElementsByTagName, the DOM standard says you're supposed to get a
'live' list back, so that when you remove the elements from the document
they disappear from that list too. minidom (and 4DOM) instead return a
plain static list that does not change.

HOWEVER: even on a fully-compliant implementation the original code
wouldn't work properly, because the list index used by the 'for' loop
would miss out elements as you removed their siblings from the list.

When you are iterating a NodeList in a way that may be destructive, it's
a good idea to make a copy, whether it's a childNodes NodeList, which is
always 'live', or a getElementsByTagName one, which may or may not be,
depending on the DOM implementation you are using; the list()
constructor or a slice-copy [:] operation can be used to do this.

--
Andrew Clover
mailto:and@doxdesk.com
http://www.doxdesk.com/

From matthias-gmane at mteege.de Wed Oct 13 17:11:50 2004
From: matthias-gmane at mteege.de (Matthias Teege)
Date: Wed Oct 13 17:11:59 2004
Subject: [XML-SIG] Re: no parent?
References: <864ql0l45d.fsf@mut.mteege.de> <416C6F03.1000304@doxdesk.com>
Message-ID: <86lleafy9f.fsf@mut.mteege.de>

Andrew Clover writes:

> > Is there a better way to replace the complete placeholder tag with
> > useful text?
> Well, at a guess:
>     placeholders= document.getElementsByTagName('text:placeholder')
>     for placeholder in list(placeholders):
>         text= document.createTextNode('Hello!') # or whatever
>         placeholder.parentNode.replaceChild(text, placeholder)
> (assuming you don't have any nested placeholder elements!)

This is exactly what I'm trying to do. Thanks. How do I get the content
in the placeholder tag? I need it to decide what text I put in there but
I only get the attributes with 'placeholder.attributes.items()'.

Later I'll try to replace placeholders in nested elements because I
need to insert tables. Is this a lot more work?

Many thanks
Matthias

From uche.ogbuji at fourthought.com Thu Oct 14 05:23:29 2004
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Thu Oct 14 05:23:33 2004
Subject: [XML-SIG] no parent?
In-Reply-To: <416C6F03.1000304@doxdesk.com>
References: <864ql0l45d.fsf@mut.mteege.de> <416C6F03.1000304@doxdesk.com>
Message-ID: <1097724209.3417.41687.camel@borgia>

On Tue, 2004-10-12 at 17:55, Andrew Clover wrote:
> (*) - this is actually a slight misbehaviour from minidom. When you do a
> getElementsByTagName, the DOM standard says you're supposed to get a
> 'live' list back, so that when you remove the elements from the document
> they disappear from that list too. minidom (and 4DOM) instead return a
> plain static list that does not change.

Yeah. One of several examples of where the DOM WG was on crack. This is
one area of compliance we purposefully declined in 4DOM (this and entity
ref handling were the main areas of non-compliance).

Minidom does not really claim to be a compliant DOM implementation, so I
don't see this as any sort of misbehavior.

--
Uche Ogbuji
Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com

From uche.ogbuji at fourthought.com Thu Oct 14 05:25:13 2004
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Thu Oct 14 05:25:18 2004
Subject: [XML-SIG] PrettyPrint question
In-Reply-To: <20041012093946.A95319@cutter.rexx.com>
References: <1097593108.2475.33.camel@lxbre.risaris.com> <20041012093946.A95319@cutter.rexx.com>
Message-ID: <1097724313.3417.41696.camel@borgia>

On Tue, 2004-10-12 at 10:39, Dave Kuhlman wrote:
> On Tue, Oct 12, 2004 at 03:58:29PM +0100, Brian Reynolds wrote:
> > Hi All,
> >
> > I'm using the PrettyPrint function to print a XSD contained in a DOM
> > tree. What I would like to do is ensure that the attributes are printed
> > with the double-quotes instead of single quotes.
> > Is there any way of doing this? I'm using pyxml v0.8.3.
> >
>
> What about trying the following:
>
>     from xml.dom import minidom
>     doc = minidom.parse('mydoc.xml')
>     root = doc.documentElement
>     print root.toprettyxml()

Yes. That's a better alternative for a variety of reasons.

--
Uche Ogbuji
Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com

From Brian.Reynolds at risaris.com Thu Oct 14 10:14:34 2004
From: Brian.Reynolds at risaris.com (Brian Reynolds)
Date: Thu Oct 14 10:14:45 2004
Subject: [XML-SIG] PrettyPrint question
In-Reply-To: <1097724313.3417.41696.camel@borgia>
References: <1097593108.2475.33.camel@lxbre.risaris.com> <20041012093946.A95319@cutter.rexx.com> <1097724313.3417.41696.camel@borgia>
Message-ID: <1097741674.2357.0.camel@lxbre.risaris.com>

Hi Dave, Uche,

Thanks - that did the trick.

Brian

On Thu, 2004-10-14 at 04:25, Uche Ogbuji wrote:
> On Tue, 2004-10-12 at 10:39, Dave Kuhlman wrote:
> > On Tue, Oct 12, 2004 at 03:58:29PM +0100, Brian Reynolds wrote:
> > > Hi All,
> > >
> > > I'm using the PrettyPrint function to print a XSD contained in a DOM
> > > tree. What I would like to do is ensure that the attributes are printed
> > > with the double-quotes instead of single quotes.
> > > Is there any way of doing this? I'm using pyxml v0.8.3.
> > >
> >
> > What about trying the following:
> >
> >     from xml.dom import minidom
> >     doc = minidom.parse('mydoc.xml')
> >     root = doc.documentElement
> >     print root.toprettyxml()
>
> Yes. That's a better alternative for a variety of reasons.

From andras.sebok at freemail.hu Thu Oct 14 14:50:32 2004
From: andras.sebok at freemail.hu (Sebők András)
Date: Thu Oct 14 14:50:35 2004
Subject: [XML-SIG] xml.lap.hu
Message-ID:

To Whom it may Concern,

Please be informed that a new page was added to the famous Hungarian
www.startlap.hu link collection. The name is http://xml.lap.hu/

This is a link collection about XML in any topic. Your website was also
added to http://xml.lap.hu/. This is very famous in Hungary. Could you
please check it by searching the Hungarian pages with Google?

1. http://www.google.co.hu
2. check the "pages from Hungary" ----> "lapok Magyarországról"
3. Search for xml and you will see my page.

I am happy to offer you an advertising opportunity on this page. On the
top right side of the page there is a Flash banner. This can be
purchased from me; if you are interested in it please do not hesitate to
contact me. Do not forget, this is the first page with Google in Hungary
in XML!!!

Best Regards,
Andras Sebok
+36 / 30 - 37-88-467

From and-xml at doxdesk.com Thu Oct 14 17:44:54 2004
From: and-xml at doxdesk.com (Andrew Clover)
Date: Thu Oct 14 16:43:25 2004
Subject: [XML-SIG] no parent?
In-Reply-To: <1097724209.3417.41687.camel@borgia>
References: <864ql0l45d.fsf@mut.mteege.de> <416C6F03.1000304@doxdesk.com> <1097724209.3417.41687.camel@borgia>
Message-ID: <416E9EF6.9020707@doxdesk.com>

Uche Ogbuji wrote:

> One of several examples of where the DOM WG was on crack.

Heh. Yeah, it's not one of their more practical decisions, but it's
understandable: every other NodeList in the DOM is 'live', including
childNodeses and the traditional DOM Level 0 HTML ones W3 inherited. For
getElements... to behave differently would be a bit odd. Probably it
should have returned a different kind of List interface to highlight the
difference.
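
The difference between the two kinds of list can be seen directly in minidom. The following is a small sketch, assuming the xml.dom.minidom that ships with current Python 3 (its behaviour here matches what the thread describes); the document and the replacement text are invented for the example.

```python
from xml.dom import minidom

# A tiny made-up document with three sibling elements.
doc = minidom.parseString("<root><p/><p/><p/></root>")
root = doc.documentElement

found = doc.getElementsByTagName("p")  # minidom builds this list once
children = root.childNodes             # the element's own child list

root.removeChild(found[0])

static_len = len(found)      # the removed node is still in this list
live_len = len(children)     # this list reflects the removal
orphan_parent = found[0].parentNode  # a removed node has no parent

# Safe destructive iteration: loop over a copy, modify the real tree.
for node in list(children):
    root.replaceChild(doc.createTextNode("x"), node)

print(static_len, live_len, orphan_parent)
print(root.toxml())
```

Iterating over `list(children)` rather than `children` itself is the copy-before-modifying habit recommended earlier in the thread; it works the same whether the underlying list is live or static.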

> This is one area of compliance we purposefully declined in 4DOM (this
> and entity ref handling were the main areas of non-compliance).

Entities are another disaster, indeed. Dealing with entity references
(especially in combination with DOM 3 stuff and all the other DTD
nonsense) makes writing a fully compliant DOM implementation orders of
magnitude more complex than it should really be.

[ Rant: I would really like to see a Sanity-Enhanced XML with all the
obsolete DTD baggage shorn off. No entities/entity references (we have
XInclude now for higher-level includes and between character references
and proper Unicode text editors there's no need for the likes of
&eacute;), no attribute defaulting (DOM makes it easy to detect/cope
with missing attributes anyway) and no ID attribute types (pending
xml:id catching on).

As for validation we have Schema, RELAX and Schematron built on top of
XML. There is no need for the added complication of building the
validation method, and what is essentially a full macro pre-processor,
into XML itself.

This would bring benefits by making parsers and DOM imps much less
complicated and prone to bugs, and would remove a number of grey areas
where the spec is not clear what an imp should do. The divergence of
behaviour and outright bugginess in the crannies of XML tools - not just
Python ones - even after so many years shows there is a real problem
with the status quo. ]

> I don't see this as any sort of misbehavior.

I would have preferred it if minidom/4DOM had made it clearer the spec
wasn't being followed though; currently the Library Reference doesn't
mention the issue. Perhaps a different method name
(getStaticElementListByTagName or something?) would have helped. Ah
well, too late now, it's just another DOM gotcha...

--
Andrew Clover
mailto:and@doxdesk.com
http://www.doxdesk.com/

From and-xml at doxdesk.com Thu Oct 14 17:54:53 2004
From: and-xml at doxdesk.com (Andrew Clover)
Date: Thu Oct 14 16:53:24 2004
Subject: [XML-SIG] Re: no parent?
In-Reply-To: <86lleafy9f.fsf@mut.mteege.de> References: <864ql0l45d.fsf@mut.mteege.de> <416C6F03.1000304@doxdesk.com> <86lleafy9f.fsf@mut.mteege.de> Message-ID: <416EA14D.7030208@doxdesk.com> Matthias Teege schrieb: > How do I get the content in the placeholder tag? If you're sure the tag contains only simple text, you can access the value of the single Text node that will be inside it. eg. data= placeholder.firstChild.data replacement= translationLookup[data] text= document.createTextNode(replacement) > I only get the attributes with 'placeholder.attributes.items()'. Yep, you want the childNodes, not the attributes. The 'firstChild' property is shorthand for 'childNodes[0]'. (BTW: avoid using Python dictionary methods like items() on NamedNodeMap objects like Element.attributes; it is not specified what this will do if anything, and different DOM implementations do slightly different things.) > Later I'll try to replace placeholders in nested elements because I > need to insert tables. Is this a lot more work? Do you really mean nested placeholder elements, ie.: eggs spam If you just mean nested tables with simple placeholders inside them somewhere, that's no problem, getElementsByTagName will still happily root them out wherever they are in the document. -- Andrew Clover mailto:and@doxdesk.com http://www.doxdesk.com/ From laura.rodriguez.saldana at barcelona.eds.es Fri Oct 15 14:15:39 2004 From: laura.rodriguez.saldana at barcelona.eds.es (laura.rodriguez.saldana@barcelona.eds.es) Date: Fri Oct 15 14:16:21 2004 Subject: [XML-SIG] MDaemon Warning - virus found: Delivery reports about your e-mail Message-ID: <20041015121620.354CA1E4002@bag.python.org> ******************************* WARNING ****************************** Este mensaje ha sido analizado por MDaemon AntiVirus y ha encontrado un fichero anexo(s) infectado(s). Por favor revise el reporte de abajo. 
Attachment                          Virus name                   Action taken
----------------------------------------------------------------------
document.zip                        I-Worm.Mydoom.m              Removed
**********************************************************************

The message was undeliverable due to the following reason(s):

Your message was not delivered because the destination computer was
not reachable within the allowed queue period. The amount of time
a message is queued before it is returned depends on local
configuration parameters.

Most likely there is a network problem that prevented delivery, but
it is also possible that the computer is turned off, or does not
have a mail system running right now.

Your message was not delivered within 5 days:
Server 58.198.236.84 is not responding.

The following recipients did not receive this message:

Please reply to postmaster@python.org
if you feel this message to be in error.

From cmtaylor at ti.com Mon Oct 18 22:41:44 2004
From: cmtaylor at ti.com (Taylor, Martin)
Date: Mon Oct 18 22:41:50 2004
Subject: [XML-SIG] Starting SOAPpy server on Windows boot
Message-ID:

I've written a simple SOAP server using the SOAPpy libraries and would
like to get this server to start automatically on certain Windows XP or
2000 machines when they boot. I've tried putting a line like:

  start /B C:\Python23\python.exe "C:\TI-CAT\SOAP Experiments\my_test_server.py" >%TEMP%\my_test_server.log

in my AUTOEXEC.BAT file, but it doesn't seem to work there. If I log
in, open a command prompt, and then run the line above, it works fine.

Has anyone done anything like this that could help me solve this
seemingly trivial problem?

Thanks,
Martin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/xml-sig/attachments/20041018/951619a4/attachment.htm

From tpassin at comcast.net Tue Oct 19 05:22:39 2004
From: tpassin at comcast.net (Thomas B. Passin)
Date: Tue Oct 19 05:20:50 2004
Subject: [XML-SIG] Starting SOAPpy server on Windows boot
In-Reply-To:
References:
Message-ID: <4174887F.2030301@comcast.net>

Taylor, Martin wrote:

> I've written a simple SOAP server using the SOAPpy libraries and would
> like to get this server to start automatically on certain Windows XP or
> 2000 machines when they boot. I've tried putting a line like:
>
> start /B C:\Python23\python.exe "C:\TI-CAT\SOAP
> Experiments\my_test_server.py" >%TEMP%\my_test_server.log
>
> in my AUTOEXEC.BAT file, but it doesn't seem to work there. If I
> log in, open a command prompt, and then run the line above, it works
> fine.
>
> Has anyone done anything like this that could help me solve this
> seemingly trivial problem?

Those versions of Windows do not use autoexec.bat, as best I know. Put
the command into another batch (or better, a .cmd) file, and put a
shortcut to that file (or the batch file itself) in the Start menu's
Startup folder, located at

  x:\Documents and Settings\\Start Menu\Programs\Startup\

where "x" and "" are whatever apply to your system.

Cheers,

Tom P

---
Thomas B. Passin
Explorer's Guide to the Semantic Web (Manning Books)
http://www.manning.com/catalog/view.php?book=passin

From postmaster at mototransportar.com.co Fri Oct 22 15:28:05 2004
From: postmaster at mototransportar.com.co (mototransportar.com.co PostMaster)
Date: Fri Oct 22 15:28:09 2004
Subject: [XML-SIG] Error sending message [1098451765566.1760.2c15.apolo] from [mototransportar.com.co]
Message-ID: <20041022132808.08DF41E4002@bag.python.org>

[<00>] XMail bounce: Rcpt=[encoder-windows-1251@mozilla.org];Error=[550 : Recipient address rejected: User unknown in virtual alias table]

[<01>] Error sending message [1098451765566.1760.2c15.apolo] from [mototransportar.com.co].
ID:
Mail From:
Rcpt To:
Server: [140.211.166.133]

[<02>] The reason of the delivery failure was:

550 : Recipient address rejected: User unknown in virtual alias table

[<04>] Here is listed the message log file:

[PeekTime] 1098451683 : Fri, 22 Oct 2004 08:28:03 -0500
<<
ErrCode = -82
ErrString = [RCPT TO:] not permitted by remote SMTP server
ErrInfo = 550 : Recipient address rejected: User unknown in virtual alias table
SMAIL SMTP-Send MX = "smtp.osuosl.org." SMTP = "mototransportar.com.co" From = "xml-sig@python.org" To = "encoder-windows-1251@mozilla.org" Failed !
SMTP-Error = "550 : Recipient address rejected: User unknown in virtual alias table"
SMTP-Server = "smtp.osuosl.org."
>>

[<05>] Here is listed the initial part of the message:

Received: from python.org (192.168.0.109:1448)
    by mototransportar.com.co with [XMail 1.20 ESMTP Server]
    id for from ; Fri, 22 Oct 2004 08:28:03 -0500
From: xml-sig@python.org
To: encoder-windows-1251@mozilla.org
Subject: Returned mail: see transcript for details
Date: Fri, 22 Oct 2004 08:26:47 -0500
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_NextPart_000_0011_1347C174.C85F4CA5"
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2600.0000
X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2600.0000

From cmtaylor at ti.com Wed Oct 20 16:25:17 2004
From: cmtaylor at ti.com (Taylor, Martin)
Date: Fri Oct 22 16:23:38 2004
Subject: [XML-SIG] Re: Starting SOAPpy server on Windows boot
Message-ID:

I found a solution, thanks to Keith J. Farmer [kfarmer@thuban.org]. He
suggested looking at "FireDaemon" (http://www.firedaemon.com/). I
downloaded a trial version, set it up to run my SOAPpy server as a
Windows service, and it worked first time! I'll continue to check out
this "cool tool" over my 30 day evaluation period and may eventually buy
dozens of copies for our testing lab.

C. Martin Taylor
Sr. Test Automation Specialist
Texas Instruments, Inc.
Educational and Productivity Solutions
7800 Banner Dr. MS 3946
Dallas, TX 75251

From pythonTutor at venix.com Fri Oct 22 17:19:20 2004
From: pythonTutor at venix.com (Lloyd Kvam)
Date: Fri Oct 22 17:19:26 2004
Subject: [XML-SIG] managing ID attributes in an XML document
Message-ID: <1098458359.3777.34.camel@laptop.venix.com>

I need to create ID attributes while manipulating an XML document. Is
there a Python XML package with a getIDList type of function built in?

Keeping ID and IDREFS straight seems complicated enough to have gotten
some special support, but I have failed to track it down. Any pointers
are greatly appreciated.

--
Lloyd Kvam
Venix Corp

From uche.ogbuji at fourthought.com Sat Oct 23 20:10:24 2004
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Sat Oct 23 20:10:29 2004
Subject: [XML-SIG] managing ID attributes in an XML document
In-Reply-To: <1098458359.3777.34.camel@laptop.venix.com>
References: <1098458359.3777.34.camel@laptop.venix.com>
Message-ID: <1098555023.3417.71508.camel@borgia>

On Fri, 2004-10-22 at 09:19, Lloyd Kvam wrote:
> I need to create ID attributes while manipulating an XML document. Is
> there a Python XML package with a getIDList type of function built in?
>
> Keeping ID and IDREFS straight seems complicated enough to have gotten
> some special support, but I have failed to track it down. Any pointers
> are greatly appreciated.

I think the problem is lack of itch for scratching with special
support. I know I almost never use ID/IDREF, so I've never thought to
implement such a function. I also haven't heard of such in any XML
package.

Should be easy to create such a SAX filter with a back-end parser that
supports DeclHandler (xmlproc?).

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
A hands-on introduction to ISO Schematron - http://www-106.ibm.com/developerworks/edu/x-dw-xschematron-i.html
Schematron abstract patterns - http://www.ibm.com/developerworks/xml/library/x-stron.html
Wrestling HTML (using Python) - http://www.xml.com/pub/a/2004/09/08/pyxml.html
Enterprise data goes high fashion - http://www.adtmag.com/article.asp?id=10061
Principles of XML design: Considering container elements - http://www-106.ibm.com/developerworks/xml/library/x-contain.html
Hacking XML Hacks - http://www-106.ibm.com/developerworks/xml/library/x-think26.html
A survey of XML standards - http://www-106.ibm.com/developerworks/xml/library/x-stand4/

From uche.ogbuji at fourthought.com Sun Oct 24 06:37:40 2004
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Sun Oct 24 06:37:44 2004
Subject: [XML-SIG] managing ID attributes in an XML document
In-Reply-To: <1098555023.3417.71508.camel@borgia>
References: <1098458359.3777.34.camel@laptop.venix.com>
    <1098555023.3417.71508.camel@borgia>
Message-ID: <1098592659.3417.72779.camel@borgia>

On Sat, 2004-10-23 at 12:10, Uche Ogbuji wrote:
> On Fri, 2004-10-22 at 09:19, Lloyd Kvam wrote:
> > I need to create ID attributes while manipulating an XML document. Is
> > there a Python XML package with a getIDList type of function built in?
> >
> > Keeping ID and IDREFS straight seems complicated enough to have gotten
> > some special support, but I have failed to track it down. Any pointers
> > are greatly appreciated.
>
> I think the problem is lack of itch for scratching with special
> support. I know I almost never use ID/IDREF, so I've never thought to
> implement such a function. I also haven't heard of such in any XML
> package.
>
> Should be easy to create such a SAX filter with a back-end parser that
> supports DeclHandler (xmlproc?).

And this got me thinking. I've been considering adding an "ask the
columnist" section in the Python/XML series [1].
I might go ahead and write up such a SAX filter as the first entry.

The idea for "ask the columnist" would be to have folks e-mail me
questions such as the best techniques for some XML processing task, or
what packages I've run across in the course of writing the column that
can help with some problem or need. I'd pick the best questions that
I can handle and address them in the following column. If you like
this idea, or if you have any such question, please let me know.

Thanks.

[1] http://www.xml.com/pub/at/24

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
A hands-on introduction to ISO Schematron - http://www-106.ibm.com/developerworks/edu/x-dw-xschematron-i.html
Schematron abstract patterns - http://www.ibm.com/developerworks/xml/library/x-stron.html
Wrestling HTML (using Python) - http://www.xml.com/pub/a/2004/09/08/pyxml.html
Enterprise data goes high fashion - http://www.adtmag.com/article.asp?id=10061
Principles of XML design: Considering container elements - http://www-106.ibm.com/developerworks/xml/library/x-contain.html
Hacking XML Hacks - http://www-106.ibm.com/developerworks/xml/library/x-think26.html
A survey of XML standards - http://www-106.ibm.com/developerworks/xml/library/x-stand4/

From ik3 at inf.tu-dresden.de Sun Oct 24 12:19:24 2004
From: ik3 at inf.tu-dresden.de (Ingo Keller)
Date: Sun Oct 24 12:18:40 2004
Subject: [XML-SIG] XML Schema / SQL Schema
Message-ID: <200410241219.24905.ik3@inf.tu-dresden.de>

Hi All,

does anyone know of a tool or library that can translate an XML Schema
into an SQL schema?

I would also need something that could build a parser from that XML
Schema, so that I can parse the corresponding XML files and feed them
into a database.

Where could I find such tools?

Thanks,
Ingo

From csad7 at t-online.de Sun Oct 24 12:31:56 2004
From: csad7 at t-online.de (c.)
Date: Sun Oct 24 12:31:55 2004
Subject: [XML-SIG] ask the columnist (was: managing ID attributes in an XML document)
In-Reply-To: <20041024100009.1CA1A1E400D@bag.python.org>
References: <20041024100009.1CA1A1E400D@bag.python.org>
Message-ID: <417B849C.8070201@cdot.de>

> The idea for "ask the columnist" would be to have folks e-mail me
> questions such as the best techniques for some XML processing task, or
> what packages I've run across in the course of writing the column that
> can help with some problem or need. I'd pick the best questions that
> I can handle and address them in the following column. If you like
> this idea, or if you have any such question, please let me know.

I think that's a great offer. There seem to be some areas of XML
processing in Python that lack any implementation, and (at least for
me, not being an experienced developer) implementing things with a
guideline to follow would be very helpful.

One could see this as something like an update to the good but quite
dated "Python and XML" book, could one not?

chris

-----------------------------
christof hoeke | c@cthedot.de

From and-xml at doxdesk.com Sun Oct 24 21:35:16 2004
From: and-xml at doxdesk.com (Andrew Clover)
Date: Sun Oct 24 20:35:15 2004
Subject: [XML-SIG] managing ID attributes in an XML document
In-Reply-To: <1098458359.3777.34.camel@laptop.venix.com>
References: <1098458359.3777.34.camel@laptop.venix.com>
Message-ID: <417C03F4.1070402@doxdesk.com>

Lloyd Kvam wrote:

> I need to create ID attributes while manipulating an XML document.

Minidom and pxdom both support the DOM Level 3 Core ID functionality
(Attr.isId, Element.setIdAttribute etc.), and will automatically make an
attribute have IDness when it is added to a document whose DTD states
that element+attribute is of type ID.

> Is there a Python XML package with a getIDList type of function built in?

Not to my knowledge. Minidom maintains a cache of previously-looked-up
IDs, but that's internal and not really what you want.
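
A minimal sketch of the DOM Level 3 ID calls named above, using the
standard library's minidom under a current Python 3. The sample
document, the "item" and "ref" element names, the "target" attribute
and the helper names walk_elements and collect_ids are all invented for
illustration. The sample has no DTD, so the "id" attributes are flagged
as IDs by hand with setIdAttribute.

```python
from xml.dom import minidom


def walk_elements(node):
    """Yield every element below node, in document order."""
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            yield child
            yield from walk_elements(child)


def collect_ids(document):
    """Return {ID value: element} for every attribute whose isId is true."""
    ids = {}
    for element in walk_elements(document):
        attrs = element.attributes
        # Use the DOM's own length/item() rather than dictionary methods.
        for i in range(attrs.length):
            attr = attrs.item(i)
            if attr.isId:
                ids[attr.value] = element
    return ids


# Hypothetical sample document, not taken from the thread.
doc = minidom.parseString(
    '<doc><item id="a1"/><item id="a2"/>'
    '<ref target="a1"/><ref target="zz"/></doc>'
)

# No DTD here, so mark the "id" attributes as IDs explicitly.
for item in doc.getElementsByTagName("item"):
    item.setIdAttribute("id")

ids = collect_ids(doc)

# A reference is dangling when getElementById finds no matching element.
dangling = [
    ref.getAttribute("target")
    for ref in doc.getElementsByTagName("ref")
    if doc.getElementById(ref.getAttribute("target")) is None
]

print(sorted(ids))
print(dangling)
```

collect_ids plays the role of the getIDList function asked about; the
dangling list shows the getElementById test applied to each reference.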
Assuming you're manipulating the document with DOM tools, there's
probably no shortcut to just walking the entire document noting the
value of every attribute where Attr.isId is True.

If you're just checking each IDREF attribute matches something, is the
test 'document.getElementById(idrefAttr.value) is not None' too slow?

--
Andrew Clover
mailto:and@doxdesk.com
http://www.doxdesk.com/

From jdouglas at cancerbacup.org Mon Oct 25 19:14:29 2004
From: jdouglas at cancerbacup.org (Jo Douglas)
Date: Mon Oct 25 19:14:37 2004
Subject: [XML-SIG] Out of Office AutoReply: Your letter
Message-ID: <5420F9FBDB1A404B9328B9DB7D337042E868E6@abigail.cancerbacup.org>

Thank you for your e-mail.

I will be away from the office on annual leave from Friday 11th Octember
until Monday 8th November and will reply to your e-mail on my return.

If your query is urgent please do contact my colleague Alice Castle on
020 7920 7252.

Thank you

From walter at livinglogic.de Tue Oct 26 19:31:14 2004
From: walter at livinglogic.de (=?ISO-8859-15?Q?Walter_D=F6rwald?=)
Date: Tue Oct 26 19:31:17 2004
Subject: [XML-SIG] ANN: XIST 2.6
Message-ID: <417E89E2.8030904@livinglogic.de>

XIST 2.6 has been released!

What is it?
===========

XIST is an extensible HTML/XML generator written in Python. XIST is also
a DOM parser (built on top of SAX2) with a very simple and Pythonesque
tree API. Every XML element type corresponds to a Python class, and
these Python classes provide a conversion method to transform the XML
tree (e.g. into HTML). XIST can be considered "object oriented XSL".

What's new in version 2.6?
==========================

* A new API named XFind has been added for iterating through XML trees.
  XFind expressions look somewhat like XPath expressions but are pure
  Python expressions.
  For example finding all images inside links in an HTML page can be
  done like this:

    from ll.xist import parsers
    from ll.xist.ns import html
    node = parsers.parseURL("http://www.python.org/", tidy=True)
    for img in node//html.a/html.img:
        print img["src"]

* ToNode now tries iterating through the value passed in, so it's now
  possible to pass iterators and generators (and generator expressions
  in Python 2.4) to Frag and Element constructors.

* Parsing broken HTML is now done with the HTML parser from libxml2.

* The parser now has the option to ignore illegal elements, attributes,
  processing instructions and entities.

* A new class ll.xist.xsc.NSPool has been added. An NSPool contains a
  pool of namespaces from which the parser selects the appropriate
  namespace once an xmlns attribute is encountered.

* Other minor changes and updates.

For changes in older versions see:
http://www.livinglogic.de/Python/xist/History.html

Where can I get it?
===================

XIST can be downloaded from http://ftp.livinglogic.de/xist/
or ftp://ftp.livinglogic.de/pub/livinglogic/xist/

Web pages are at http://www.livinglogic.de/Python/xist/

ViewCVS access is available at http://www.livinglogic.de/viewcvs/

For information about the mailing lists go to
http://www.livinglogic.de/Python/xist/Mailinglists.html

Bye,
Walter Dörwald

From sroussennac at alcion.fr Wed Oct 27 14:46:36 2004
From: sroussennac at alcion.fr (Sébastien Roussennac)
Date: Wed Oct 27 14:57:29 2004
Subject: [XML-SIG] Broken link
Message-ID: <200410271246.i9RCkaW08007@alphalcion.alphalan.fr>

Hi,

In the page http://pyxml.sourceforge.net/topics/xbel/

....
Supporting Software
....
Joris Graaumans (joris@cs.uu.nl) has developed a couple of XSLT
stylesheets for XBEL.
....

The corresponding HTML link (http://www.cs.uu.nl/~joris/stuff.html) is
broken.

BTW, sorry for my webmail not supporting HTML e-mails.
Regards

-------------------------------------------------------------------
Alcion - http://www.alcion.fr/
-------------------------------------------------------------------

From postmaster at python.org Fri Oct 29 08:01:52 2004
From: postmaster at python.org (Returned mail)
Date: Fri Oct 29 08:01:53 2004
Subject: [XML-SIG] hi
Message-ID: <200410290558.i9T5w7Wq007061@relay1.slt.lk>

The message was not delivered due to the following reason:

Your message could not be delivered because the destination server was
unreachable within the allowed queue period. The amount of time
a message is queued before it is returned depends on local
configuration parameters.

Most likely there is a network problem that prevented delivery, but
it is also possible that the computer is turned off, or does not
have a mail system running right now.

Your message could not be delivered within 1 days:
Host 33.216.65.90 is not responding.

The following recipients could not receive this message:

Please reply to postmaster@python.org
if you feel this message to be in error.

-------------- next part --------------
******************* Virus Warning Message from SLTNet Team **********************
"SLTNet Virus Wall has detected an uncleanable virus in your incoming
email attachment. In order to prevent you from being infected from such
viruses, the mail attachment has been deleted."

The uncleanable file is deleted. (Found virus WORM_MYDOOM.M in the file attachment.scr)
*********************************************************************************

From msml at intelnett.com Fri Oct 29 20:27:24 2004
From: msml at intelnett.com (msml@intelnett.com)
Date: Fri Oct 29 20:27:40 2004
Subject: [XML-SIG] Xml-sig@python.org
Message-ID: <20041029182738.183A51E4003@bag.python.org>

Your message was undeliverable due to the following reason:

Your message was not delivered because the destination server was
not reachable within the allowed queue period.
The amount of time a message is queued before it is returned depends on
local configuration parameters.

Most likely there is a network problem that prevented delivery, but
it is also possible that the computer is turned off, or does not
have a mail system running right now.

Your message could not be delivered within 3 days:
Host 215.138.214.117 is not responding.

The following recipients did not receive this message:

Please reply to postmaster@python.org
if you feel this message to be in error.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: attachment.zip
Type: application/octet-stream
Size: 29280 bytes
Desc: not available
Url : http://mail.python.org/pipermail/xml-sig/attachments/20041029/74ef47d6/attachment-0001.obj

From Administrator at bag.python.org Sat Oct 30 13:36:34 2004
From: Administrator at bag.python.org (Administrator@bag.python.org)
Date: Sat Oct 30 13:41:00 2004
Subject: [XML-SIG] ScanMail Message: To Sender virus found and action taken.
Message-ID: <044a01c4be74$b5553300$1465a8c0@ohmcorp.com>

ScanMail for Microsoft Exchange has detected virus-infected attachment(s).

Sender = xml-sig@python.org
Recipient(s) = Sales
Subject = error
Scanning Time = 10/30/2004 04:36:33
Engine/Pattern = 7.100-1003/983

Action on virus found:
The attachment attachment.zip contains WORM_MYDOOM.L virus. ScanMail has Deleted it.

Warning to sender. ScanMail has detected a virus in an email you sent.