ElementTree/DTD question

Greg Wilson gvwilson at cdf.toronto.edu
Tue Mar 15 16:50:25 EST 2005


I'm trying to convert from minidom to ElementTree for handling XML,
and am having trouble with entities in DTDs.  My Python script looks
like this:

----------------------------------------------------------------------

#!/usr/bin/env python

import sys
from elementtree import ElementTree

for filename in sys.argv[1:]:
    ElementTree.parse(filename)

----------------------------------------------------------------------

My first attempt was this XML file (first.xml), which declares the
entity in an internal DTD subset:

----------------------------------------------------------------------

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE lec [
  <!ENTITY ldots "&#x8230;">
]>
<lec title="Introduction">
<topic title="Motivation" summary="motivation for course">
 <slide>
  <b1>Write an introduction&ldots;</b1>
 </slide>
</topic>
</lec>

----------------------------------------------------------------------

Running "python validate.py first.xml" produces:

----------------------------------------------------------------------

Traceback (most recent call last):
  File "validate.py", line 7, in ?
    ElementTree.parse(filename)
  File "C:\Python23\Lib\site-packages\elementtree\ElementTree.py", line 865, in parse
    tree.parse(source, parser)
  File "C:\Python23\Lib\site-packages\elementtree\ElementTree.py", line 589, in parse
    parser.feed(data)
  File "C:\Python23\Lib\site-packages\elementtree\ElementTree.py", line 1160, in feed
    self._parser.Parse(data, 0)
  File "C:\Python23\Lib\site-packages\elementtree\ElementTree.py", line 1113, in _default
    raise expat.error(
xml.parsers.expat.ExpatError: undefined entity &ldots;: line 9, column 27

----------------------------------------------------------------------
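For comparison, here is a minimal sketch of the same internal-subset
document fed to the standard-library xml.etree.ElementTree. This is an
assumption on two counts: it uses Python 3 and the stdlib module rather
than the elementtree package shown in the traceback, which may behave
differently, and the names FIRST_XML, root, text and last are only
illustrative. It shows expat expanding an entity declared in the
internal subset, and it also shows which character the declaration
actually produces: "&#x8230;" is a hexadecimal reference, so the result
is U+8230, whereas the horizontal ellipsis is U+2026 (decimal 8230,
i.e. "&#8230;" or "&#x2026;").

```python
import xml.etree.ElementTree as ET

# Same document as first.xml above, inlined so the sketch is self-contained.
FIRST_XML = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE lec [
  <!ENTITY ldots "&#x8230;">
]>
<lec title="Introduction">
<topic title="Motivation" summary="motivation for course">
 <slide>
  <b1>Write an introduction&ldots;</b1>
 </slide>
</topic>
</lec>
"""

# Parse the document; the entity is declared in the internal subset,
# so the parser can expand it without any outside help.
root = ET.fromstring(FIRST_XML)
text = root.find("topic/slide/b1").text

# Inspect the character that &ldots; expanded to.
last = text[-1]
print("entity expanded to U+%04X" % ord(last))
```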

All right, so I moved the entity declaration out into an external DTD
and used this XML file:

----------------------------------------------------------------------

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE lec SYSTEM "swc.dtd">
<lec title="Introduction">
<topic title="Motivation" summary="motivation for course">
 <slide>
  <b1>Write an introduction&ldots;</b1>
 </slide>
</topic>
</lec>

----------------------------------------------------------------------

with this minimalist DTD (saved as "swc.dtd" in the same directory as
both the XML file and the script):

----------------------------------------------------------------------

<!ENTITY ldots "&#x8230;">

----------------------------------------------------------------------

I get the same error; only the line number changes.  Does anyone know
what I'm doing wrong?  (Note: minidom loads it just fine...)
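One possible workaround for the external-DTD case, sketched here under
stated assumptions: expat does not fetch "swc.dtd" by default, so the
parser never sees the declaration of ldots. The standard-library
xml.etree.ElementTree.XMLParser (Python 3, not the elementtree package
in the traceback) exposes an "entity" dictionary that is consulted for
entities the parser could not resolve, and the entity can be registered
there by hand. The helper parse_with_entities and the names DOC, root
and b1_text are hypothetical, and the replacement U+2026 (horizontal
ellipsis) is what the DTD is presumably meant to produce, not what
"&#x8230;" literally says.

```python
import xml.etree.ElementTree as ET

# External-DTD variant of the document, inlined for a self-contained sketch.
# swc.dtd is never read here; the DOCTYPE only tells the parser that
# declarations may live elsewhere.
DOC = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE lec SYSTEM "swc.dtd">
<lec title="Introduction">
<topic title="Motivation" summary="motivation for course">
 <slide>
  <b1>Write an introduction&ldots;</b1>
 </slide>
</topic>
</lec>
"""


def parse_with_entities(text, entities):
    """Parse text, resolving otherwise-undefined entities from a dict.

    Hypothetical helper: entities maps entity names (without & and ;)
    to their replacement strings.
    """
    parser = ET.XMLParser()
    parser.entity.update(entities)  # registered by hand, not read from the DTD
    return ET.fromstring(text, parser=parser)


# Assumption: ldots should be the horizontal ellipsis, U+2026.
root = parse_with_entities(DOC, {"ldots": "\u2026"})
b1_text = root.find("topic/slide/b1").text
print(repr(b1_text))
```

The obvious cost is that the entity table is duplicated in Python
rather than read from the DTD, so the two have to be kept in sync.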

Thanks,
Greg Wilson
gvwilson at cs.utoronto.ca


