From martin at v.loewis.de  Sun Dec  2 19:10:53 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sun, 02 Dec 2007 19:10:53 +0100
Subject: [XML-SIG] Problems with PyXML Mac OS 10.5 install
In-Reply-To: <BED9D15F73EDDE48BF480604236A4560040753AE@ex1.ad.dcs.gla.ac.uk>
References: <BED9D15F73EDDE48BF480604236A4560040753AE@ex1.ad.dcs.gla.ac.uk>
Message-ID: <4752F52D.90902@v.loewis.de>

> It can load the xml module OK, but I presume that this is simply the old
> version. It seems to me that it's simply not being installed - if I use
> the spotlight to find xpath, it's only found in the local directory, not
> where it should be installed.

Well, "setup.py install" should tell you what files it copies - does
that not look right?

Notice that PyXML installs itself as _xmlplus.
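
To check which package Python actually picks up, something like the
following sketch may help. It assumes only that PyXML, if installed, is
importable as _xmlplus; the variable names xml_path and xmlplus_path are
made up for illustration.

```python
import xml

# Path of the xml package this interpreter actually imported. On Python 2
# with a recent enough PyXML installed, this should point into _xmlplus
# rather than into the standard library.
xml_path = xml.__file__
print(xml_path)

# Check directly whether an _xmlplus package is importable at all.
try:
    import _xmlplus
    xmlplus_path = _xmlplus.__file__
except ImportError:
    xmlplus_path = None  # PyXML is not visible to this interpreter
print(xmlplus_path)
```

If the second value is None, the install did not land anywhere on this
interpreter's sys.path.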

Regards,
Martin

From martin at v.loewis.de  Tue Dec  4 00:39:29 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 04 Dec 2007 00:39:29 +0100
Subject: [XML-SIG] How to parse an XML in SAX
In-Reply-To: <op.t1mecenn999q39@genius.homelinux.net>
References: <op.t1mecenn999q39@genius.homelinux.net>
Message-ID: <475493B1.7020900@v.loewis.de>

> Hi I want to parse an XML using sax but my big issue are the
> WhiteSpaces when they get reported. I want to know how to efficiently
> ignore them. I know there are some DocumentHandlers and one specific
> for ignore Whitespace but I still come up with a bunch of invisible
> nodes like \t or \n.
> 
> Anyone have a tutorial on how to handle SAX for this kind of parsing?

In general, the notion of "significant whitespace" is pretty weak in
XML (independent of SAX, so I don't think Stefan's bashing of SAX
was of any help). Here is what I know about it:
- whitespace should be preserved if the attribute xml:space was
  given on an element, and has the value "preserve". Otherwise,
  it's up to the application what precisely to do with
  whitespace.
- whitespace in "element content" is usually considered ignorable,
  and the XML spec requires that it is reported as such. However,
  whether an element has element content depends on the DTD, so only
  a validating parser can know. If you turn on validation in SAX,
  whitespace in element content will be reported through the
  "ignorableWhitespace" event.

So, it's your own choice, and you should make that choice based on
your knowledge of the actual XML application. Typical options are
a) preserve all whitespace
b) perform validation, then strip all whitespace in element content
c) drop whitespace that completely spans from one tag to another,
   assuming the element has element content. In SAX, track the
   character data reported since the last startElement or endElement,
   and then choose to drop the whitespace at the next startElement or
   endElement.
d) In many cases, you have either element content or simple text
   content, so in SAX, you can drop the whitespace if you see nested
   elements.
e) strip whitespace, in the sense of Python's string.strip. I.e.
   at endElement, perform .strip() on the collected data.
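
As a rough illustration of options (c) and (e), here is a minimal sketch
using Python's standard xml.sax module (written for Python 3). The class
name WhitespaceDroppingHandler, its texts attribute, and the sample
document are made up for this example. It buffers the characters events
seen between two tag events, drops the run if it is whitespace-only, and
strips it otherwise.

```python
import xml.sax


class WhitespaceDroppingHandler(xml.sax.ContentHandler):
    """Collect (element name, text) pairs, ignoring whitespace-only runs."""

    def __init__(self):
        super().__init__()
        self._stack = []   # names of the currently open elements
        self._buffer = []  # character chunks seen since the last tag event
        self.texts = []    # (enclosing element name, stripped text) pairs

    def _flush(self):
        # characters() may deliver one text run in several chunks,
        # so join them before deciding what to do with the run.
        text = "".join(self._buffer).strip()
        self._buffer = []
        if text and self._stack:
            self.texts.append((self._stack[-1], text))

    def startElement(self, name, attrs):
        self._flush()             # text before a nested start tag
        self._stack.append(name)

    def endElement(self, name):
        self._flush()             # text before the end tag
        self._stack.pop()

    def characters(self, content):
        self._buffer.append(content)


SAMPLE = b"<doc>\n  <title> Hello </title>\n  <body>\n\tworld\n  </body>\n</doc>"

handler = WhitespaceDroppingHandler()
xml.sax.parseString(SAMPLE, handler)
print(handler.texts)
```

ignorableWhitespace is left at its default, which does nothing, so any
whitespace a validating parser reports through that event is dropped as
well.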

HTH,
Martin

From gelston at doosanbabcock.com  Wed Dec  5 11:38:18 2007
From: gelston at doosanbabcock.com (Elston, Gareth R)
Date: Wed, 5 Dec 2007 10:38:18 -0000
Subject: [XML-SIG] Amara xml_xpath() behaviour
Message-ID: <9D4464CAAAB788439D66EE2432F9B5F10292EE28@00001EXCH.uk.mitsuibabcock.com>

Hi,

I'm new to XML and I've just started using Amara - I'm very impressed.

I've been trying to use xml_xpath() on a bindery object itself created
with xml_xpath(). I didn't get what I expected, which may be my
misunderstanding of what xml_xpath() is doing. Here's a short example to
illustrate (I'm using Amara 1.2.0.2 and Python 2.4.3 on Windows XP.):

In [1]: import amara

In [2]: l = amara.parse('file:///F:/lines.xml')

In [3]: print l.xml()
<?xml version="1.0" encoding="UTF-8"?>
<Lines>
  <Line>
    <Point x2="1.0" x1="1.0"/>
    <Point x2="2.0" x1="1.0"/>
  </Line>
  <Line>
    <Point x2="2.0" x1="2.0"/>
    <Point x2="3.0" x1="2.0"/>
  </Line>
  <Line>
    <Point x2="3.0" x1="3.0"/>
    <Point x2="5.0" x1="3.0"/>
  </Line>
</Lines>

In [4]: l.xml_xpath('//Line')
Out[4]:
[<amara.bindery.Line object at 0x015628D0>,
 <amara.bindery.Line object at 0x0169E6B0>,
 <amara.bindery.Line object at 0x01759D10>]

In [5]: print l.xml_xpath('//Line')[0].xml()
<Line>
    <Point x2="1.0" x1="1.0"/>
    <Point x2="2.0" x1="1.0"/>
  </Line>

In [6]: l.xml_xpath('//Line')[0].xml_xpath('//Point')
Out[6]:
[<amara.bindery.Point object at 0x0169E210>,
 <amara.bindery.Point object at 0x0169E250>,
 <amara.bindery.Point object at 0x0169EA50>,
 <amara.bindery.Point object at 0x01759CD0>,
 <amara.bindery.Point object at 0x01759D50>,
 <amara.bindery.Point object at 0x01759D90>]

I expected only 2 amara.bindery.Point objects in the last step. Is this
(all 6 Points in the XML data) the expected behaviour?

Thanks,
Gareth

From morillas at gmail.com  Wed Dec  5 12:29:03 2007
From: morillas at gmail.com (Luis Miguel Morillas)
Date: Wed, 5 Dec 2007 12:29:03 +0100
Subject: [XML-SIG] Amara xml_xpath() behaviour
In-Reply-To: <9D4464CAAAB788439D66EE2432F9B5F10292EE28@00001EXCH.uk.mitsuibabcock.com>
References: <9D4464CAAAB788439D66EE2432F9B5F10292EE28@00001EXCH.uk.mitsuibabcock.com>
Message-ID: <68d25cbc0712050329q3cf4b58etce7793d3e1b9b38f@mail.gmail.com>

2007/12/5, Elston, Gareth R <gelston at doosanbabcock.com>:
> [...]
> In [6]: l.xml_xpath('//Line')[0].xml_xpath('//Point')
> [...]
>
> I expected only 2 amara.bindery.Point objects in the last step. Is this
> (all 6 Points in the XML data) the expected behaviour?
>

About XPath: http://www.w3.org/TR/xpath

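A path that starts with // is absolute, so //Point selects all the
Point descendants of the document root, whatever the context node is.
To get only the Points of one Line, use a relative path: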

l.xml_xpath('//Line')[0].xml_xpath('Point')

or better:
l.xml_xpath('//Line[1]/Point')
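
The same absolute-versus-relative distinction can be tried out with the
standard library's xml.etree.ElementTree. This is only a sketch: it is a
different library from Amara and supports just a subset of XPath (it does
not accept absolute paths on an element, so './/Point' from the root
stands in for '//Point'). The sample document mirrors the one from the
question, and the variable names are made up.

```python
import xml.etree.ElementTree as ET

DOC = """<Lines>
  <Line><Point x1="1.0" x2="1.0"/><Point x1="1.0" x2="2.0"/></Line>
  <Line><Point x1="2.0" x2="2.0"/><Point x1="2.0" x2="3.0"/></Line>
  <Line><Point x1="3.0" x2="3.0"/><Point x1="3.0" x2="5.0"/></Line>
</Lines>"""

root = ET.fromstring(DOC)              # the <Lines> element
first_line = root.findall("Line")[0]   # first <Line> child

all_points = root.findall(".//Point")        # every Point below the root
child_points = first_line.findall("Point")   # Point children of one Line
same_points = root.findall("Line[1]/Point")  # the same two, in one path

print(len(all_points), len(child_points), len(same_points))
```
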

But be careful, because Amara's XPath has some problems with node
ordering (see
http://lists.fourthought.com/pipermail/4suite/2007-June/008285.html),
and this will not be fixed until Amara 2.0.

-- lm

From noreply at sourceforge.net  Fri Dec 21 23:08:15 2007
From: noreply at sourceforge.net (SourceForge.net)
Date: Fri, 21 Dec 2007 14:08:15 -0800
Subject: [XML-SIG] [ pyxml-XBEL-1856104 ] getAttribute returns blank string
Message-ID: <E1J5q2R-0000KC-Lj@sc8-sf-web21.sourceforge.net>

XBEL item #1856104, was opened at 2007-12-21 16:08
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=707658&aid=1856104&group_id=6473

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Gazi Alankus (alanic7)
Assigned to: Nobody/Anonymous (nobody)
Summary: getAttribute returns blank string 

Initial Comment:
When I read an HTML document using xml.dom.ext.reader.HtmlLib.Reader, the getAttribute() method of the elements returns a blank string.

Attached is a test source with comments on the prints. As far as I can see, xml/dom/Element.py implements getAttribute() as:

    def getAttribute(self, name):
        att = self.attributes.getNamedItem(name)
        return att and att.value or ''

The last three prints are from that return line. I'm not sure whether xml/dom/Element.py is the source of the getAttribute() I used, but my trials show that this code should work, yet it does not. So either this is not the source, or maybe the Python source is compiled and the compiler messes something up.
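
For reference, an expression of the form "att and att.value or ''" falls back to '' whenever att itself tests false, not only when it is None. The sketch below shows that general Python behaviour with a made-up FakeAttr class. It is not PyXML's attribute class and is not offered as a diagnosis of this report.

```python
class FakeAttr:
    """Hypothetical stand-in for an attribute node; not PyXML's class."""

    def __init__(self, value):
        self.value = value

    def __len__(self):
        # An object whose __len__ returns 0 is false in a boolean context.
        return 0


def get_value_and_or(att):
    # Same shape as the return line quoted above.
    return att and att.value or ''


def get_value_explicit(att):
    # Tests only for a missing node, not for truthiness.
    if att is None:
        return ''
    return att.value


att = FakeAttr("http://example.org/")
print(repr(get_value_and_or(att)))
print(repr(get_value_explicit(att)))
```
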

Version info:

pyXml pyxml-0.8.4, installed through Gentoo ebuild.

Python 2.4.4 (#1, Nov  6 2007, 18:42:27)
[GCC 4.1.2 (Gentoo 4.1.2)] on linux2


