Confusion over etree.ElementTree.Element.getiterator

Ben Sizer kylotan at gmail.com
Mon Jul 5 06:40:53 EDT 2010


On Jul 4, 7:33 am, Stefan Behnel <stefan... at behnel.de> wrote:
> Ben Sizer, 04.07.2010 00:32:
>
> > On Jul 3, 11:12 pm, Ben Sizer <kylo... at gmail.com> wrote:
>
> >> >>> for el in root.getiterator():
> >> ...        print el
> >> [much output snipped]
> >> <Element {http://www.w3.org/1999/xhtml}a at d871e8>
> >> <Element {http://www.w3.org/1999/xhtml}a at d87288>
> >> <Element {http://www.w3.org/1999/xhtml}script at d87300>
> >> <Element {http://www.w3.org/1999/xhtml}script at d87378>
>
> > Hmm, I think I've worked it out. Apparently the XML namespace forms
> > part of the tag name in this case. Is that what is intended?
>
> Sure.
>
> > I didn't see any examples of this in the docs.
>
> Admittedly, it's three clicks away from the library docs on docs.python.org.
>
> http://effbot.org/zone/element.htm#xml-namespaces

Hopefully someone will see fit to roll this important documentation
into docs.python.org before the next release... oops, too late. ;)

It's one of those things that's easy to fix once you know what the
problem is. Unfortunately it makes the examples a bit awkward. The
example on http://docs.python.org/library/xml.etree.elementtree.html
opens an XHTML file and reads a "p" tag within a "body" tag, but the
XHTML specification (http://www.w3.org/TR/xhtml1/#strict) states that
'The root element of the document must contain an xmlns declaration
for the XHTML namespace'. So I don't see how the example code could
work on a conforming XHTML file: the namespace is always in effect
and becomes part of every tag name, but the code looks the tags up
by their bare names.
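
To make the mismatch concrete, here is a rough sketch of what I mean.
The tiny XHTML document and the names XHTML, XHTML_NS, plain and
qualified are made up for illustration and are not taken from the
docs. It is written for a current Python, so it uses root.iter()
rather than getiterator(), which has since been removed.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal XHTML document, made up for illustration.
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body><p>Hello</p></body>
</html>"""

XHTML_NS = "http://www.w3.org/1999/xhtml"

root = ET.fromstring(XHTML)

# The namespace is part of every tag name, in {uri}local form.
tags = [el.tag for el in root.iter()]

# An unqualified path finds nothing: no element is named plain "body".
plain = root.find("body/p")

# Qualifying each step of the path with the namespace URI works.
qualified = root.find("{%s}body/{%s}p" % (XHTML_NS, XHTML_NS))

print(tags[0])         # {http://www.w3.org/1999/xhtml}html
print(plain)           # None
print(qualified.text)  # Hello
```

The unqualified lookup returns None rather than raising, which is
probably why this is so easy to trip over.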

That's my excuse anyway! :)

--
Ben Sizer


