only a simple xml reader <tag:id>value</tag:id>

Wed Feb 8 09:33:38 EST 2006

martijn at gamecreators.nl wrote:
> Hi,
>
> Is it possible to get the value of a <tag:id>value</tag:id> element?
>
> When I do this:
> -----------------------------------------------------
> theXML = """<?xml version="1.0"?>
>     <title>The Fascist Menace</title>
> """
> import xml.dom.minidom as dom
> doc = dom.parseString(theXML)
> print doc.getElementsByTagName('title')[0].toxml()
>
> I get: <title>The Fascist Menace</title>, and that's OK for me
> -----------------------------------------------------
>
> But the XML file I must read has other tags:
> theXML = """<?xml version="1.0"?>
>     <title:id>The Fascist Menace</title:id>
>     <title:name>bla la etc</title:name>
> """
>
> How do I get those values?
> I tried things like:
> print doc.getElementsByTagName('title:id')[0].toxml() <--error

Addressing your general question, unfortunately you're a bit stuck.
Minidom is rather confused about whether or not it's a namespace-aware
library.  Addressing your specific example, I strongly advise you not
to use documents that are not well-formed according to Namespaces 1.0.
Your second example is a well-formed XML 1.0 external parsed entity,
but not a well-formed XML 1.0 document entity, because it has multiple
elements at document level.  It's also not well-formed according to
XMLNS 1.0 unless you declare the "title" prefix.  You will not be able
to use a document that does not conform to XMLNS 1.0 with most XML
technologies, including XSLT, WXS (W3C XML Schema), RELAX NG, etc.
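
As a rough illustration of both problems, the sketch below feeds two
variants of your second example to minidom.  The helper name try_parse
and the sample strings are made up for this illustration.  It assumes
minidom's default parser (namespace-aware expat), which should reject
both inputs with an ExpatError.

```python
import xml.dom.minidom as dom
from xml.parsers.expat import ExpatError


def try_parse(text):
    """Return None if text parses, otherwise the parser's error message."""
    try:
        dom.parseString(text)
    except ExpatError as exc:
        return str(exc)
    return None


# Two elements at document level: not a well-formed document entity.
two_roots = "<title>one</title><title>two</title>"

# A single root element, but the "title" prefix is never declared.
unbound_prefix = "<title:id>The Fascist Menace</title:id>"

two_roots_error = try_parse(two_roots)
unbound_prefix_error = try_parse(unbound_prefix)

print(two_roots_error)
print(unbound_prefix_error)
```

If that holds, the error in your attempt comes from parsing the
document, before getElementsByTagName is ever called.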

If you have indeed declared a namespace and are just giving us a very
bad example, use:

print doc.getElementsByTagNameNS(title_namespace, 'id')
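
A minimal sketch of what that could look like end to end.  The
namespace URI http://example.com/title, the wrapping "record" element,
and the helper text_of are all assumptions made for this illustration;
substitute whatever your real document declares.

```python
import xml.dom.minidom as dom

# Hypothetical namespace URI; use the one your document actually declares.
TITLE_NS = "http://example.com/title"

# A single root element that declares the "title" prefix.
the_xml = """<?xml version="1.0"?>
<record xmlns:title="http://example.com/title">
    <title:id>The Fascist Menace</title:id>
    <title:name>bla la etc</title:name>
</record>
"""

doc = dom.parseString(the_xml)


def text_of(local_name):
    """Return the text of the first element with this local name in TITLE_NS."""
    node = doc.getElementsByTagNameNS(TITLE_NS, local_name)[0]
    return "".join(
        child.data
        for child in node.childNodes
        if child.nodeType == child.TEXT_NODE
    )


title_id = text_of("id")
title_name = text_of("name")

print(title_id)
print(title_name)
```

The lookup matches on the namespace URI and the local name, so it keeps
working even if the document binds a different prefix to the same URI.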

--
Uche Ogbuji                               Fourthought, Inc.
http://uche.ogbuji.net                    http://fourthought.com
http://copia.ogbuji.net                   http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/