xml bug?

Imbaud Pierre pierre.imbaud at laposte.net
Thu Dec 28 13:58:21 EST 2006

I am using the standard xml library to create another library able to
read, and maybe write, XMP files.
Then an xml library bug popped up:
xml.dom.minidom was unable to parse an XML file that came from an
example provided by an official organization (http://www.iptc.org/IPTC4XMP).
The parsed file was somewhat hairy, but I have been able to reproduce
the bug with a simplified version, which goes:

<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>
<x:xmpmeta xmlns:x='adobe:ns:meta/' x:xmptk='XMP toolkit 3.0-28, 
framework 1.6'>
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#' 

  <rdf:Description rdf:about='uuid:f5b64178-9394-11d9-bb8e-a67e6693b6e9'
   xmlns:xmpPLUS='XMP Photographic Licensing Universal System (xmpPLUS, 

<?xpacket end='w'?>

The offending part is the one that goes: xmpPLUS='....'
It triggers an exception: ValueError: too many values to unpack,
in _parse_ns_name. Some debugging showed an obvious mistake
in the scanning of the name argument, which goes beyond the closing
" ' ".
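
A minimal sketch of a reproduction attempt, for anyone who wants to check
their own Python version. The SAMPLE document and the try_parse helper are
made up for illustration and are not the IPTC file; the sketch assumes the
trigger is an element whose namespace name contains spaces, which the post
suggests but does not confirm. It reports either outcome rather than
assuming the parse fails.

```python
from xml.dom import minidom

# Hypothetical stand-in document (not the IPTC example): the namespace
# name bound to prefix "p" contains spaces, as in the xmpPLUS declaration.
SAMPLE = (
    "<root xmlns:p='a namespace name with spaces'>"
    "<p:item/>"
    "</root>"
)


def try_parse(text):
    """Return (True, None) if minidom parses text, else (False, exception)."""
    try:
        doc = minidom.parseString(text)
    except Exception as exc:  # report whatever the parser raised
        return False, exc
    doc.unlink()  # release the DOM tree, we only care whether parsing worked
    return True, None


ok, err = try_parse(SAMPLE)
if ok:
    print("parsed without error")
else:
    print("parse failed: %s: %s" % (type(err).__name__, err))
```

Running the same helper on the full file and on progressively smaller
versions is one way to narrow down which declaration is responsible.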

I'm aware I don't give enough material here to allow a full understanding
of the bug. But this is not the place for that, and that's not my point.

Now my points are:
- How do I find the version of a given library? There is a __version__
   attribute of the module, is that it?
- How do I access a given library's bug list? Maybe this bug is known and
   about to be fixed, in which case it would be useless to report it.
- How do I report bugs in a standard lib?
- I tried to copy the lib somewhere and put it BEFORE the official lib in
   "the path" (that is: sys.path), but the stack shown by the traceback
   still shows the original files being used. Is there a special
   mechanism bypassing the sys.path search for standard libs? (I may
   be wrong on this, it seems hard to believe...)

- Does someone know a good tool to validate an XML file?
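
On the version and sys.path questions above, a small sketch that shows which
file a module was actually loaded from may help. The module_origin helper is
a hypothetical name introduced here. Standard library modules ship with the
interpreter, so the Python version is usually the relevant version number,
and submodules such as xml.dom.minidom are located through their parent
package's __path__ rather than directly through sys.path.

```python
import sys
import xml.dom
import xml.dom.minidom


def module_origin(module):
    """Return the file a module was loaded from, or None if it has none."""
    return getattr(module, "__file__", None)


# The interpreter version, which is also the version of its standard library.
print("Python version:", sys.version.split()[0])

# The file that "from xml.dom import minidom" really resolves to.
print("minidom loaded from:", module_origin(xml.dom.minidom))

# Directories searched for submodules of the xml.dom package.
print("xml.dom.__path__:", list(xml.dom.__path__))

# The first few entries searched for top-level modules and packages.
print("sys.path starts with:", sys.path[:3])
```

If the printed file is still the original one, the copy is not the module
being imported, whatever the order of sys.path.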

btw, my code:

from xml.dom import minidom

class whatever:
    def __init__(self, inStream):
        # inStream may be a filename or an open file object
        xmldoc = minidom.parse(inStream)

Thanks for any help...
