Telling Expat to ignore junk in XML feed

Peter Clark pc451 at yahoo.com
Mon May 26 10:11:17 EDT 2003


Is there a way to ignore junk in an XML feed? Here's the traceback I
get when I try to parse it (NOTE: I have ZERO control over this feed,
so unfortunately I can't fix the real problem at its source):
---
  File "/home/peter/bin/k_weather.py", line 52, in ?
    dom = parseString(xmlweather)
  File "/usr/lib/python2.2/xml/dom/minidom.py", line 967, in parseString
    return _doparse(pulldom.parseString, args, kwargs)
  File "/usr/lib/python2.2/xml/dom/minidom.py", line 954, in _doparse
    toktype, rootNode = events.getEvent()
  File "/usr/lib/python2.2/xml/dom/pulldom.py", line 255, in getEvent
    self.parser.feed(buf)
  File "/usr/lib/python2.2/xml/sax/expatreader.py", line 148, in feed
    self._err_handler.fatalError(exc)
  File "/usr/lib/python2.2/xml/sax/handler.py", line 38, in fatalError
    raise exception
xml.sax._exceptions.SAXParseException: <unknown>:3:0: junk after document element
---
Looking at the feed in question, it's no surprise that expat choked:
---
<?xml version="1.0"?>
<br />
<b>Warning</b>:  fopen(/home3/petersen/work/cache/weather/USMN0027) [<a href='http://www.php.net/function.fopen'>function.fopen</a>]: failed to create stream: Permission denied in <b>/home3/petersen/work/production/weather/weather.php</b> on line <b>114</b><br />
<br />
<b>Warning</b>:  fputs(): supplied argument is not a valid stream resource in <b>/home3/petersen/work/production/weather/weather.php</b> on line <b>115</b><br />
<br />
<b>Warning</b>:  fclose(): supplied argument is not a valid stream resource in <b>/home3/petersen/work/production/weather/weather.php</b> on line <b>116</b><br />
<weather>
(correct feed begins here)
---
It looks like a bunch of PHP warnings have been written into the top
of the feed, but the XML that follows them is intact. If I delete the
junk by hand, it all parses fine. Doing it by hand every time, however,
is out of the question. So: is there some way to tell expat to ignore
the junk and parse the rest of the XML?
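
For reference, here is a minimal sketch of the manual cleanup done in
code: cut everything before the root element, then parse what is left.
It assumes the real document always begins at <weather>, as in the
sample above. The names strip_leading_junk and parse_feed are made up
for illustration, and the code is written for current Python rather
than 2.2. This cleans the string before expat sees it; it does not
change how expat itself behaves.

```python
import re
from xml.dom.minidom import parseString

# Assumption: the real document always starts at the <weather> root tag,
# as in the sample feed.
ROOT_TAG = "weather"


def strip_leading_junk(text, root_tag=ROOT_TAG):
    """Return text with everything before the root element's opening tag removed."""
    # Require whitespace, ">" or "/" after the tag name so that a longer
    # name that merely starts with "weather" is not mistaken for the root.
    match = re.search(r"<%s[\s>/]" % re.escape(root_tag), text)
    if match is None:
        raise ValueError("no <%s> element found in feed" % root_tag)
    return text[match.start():]


def parse_feed(text):
    """Parse the feed after discarding any junk that precedes the root element."""
    return parseString(strip_leading_junk(text))
```

Note that this also drops the <?xml ...?> declaration, which is
harmless for an already-decoded string but would lose a declared
encoding if the feed were passed as raw bytes. It does nothing about
junk that might appear after the closing </weather> tag.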
    :Peter



