Help with xml.parsers.expat please?

Will Stuyvesant hwlgw at hotmail.com
Fri Jul 4 08:10:38 EDT 2003


There seems to be no validating XML parser in the Python
standard library.  And I am stuck with Python 2.1.1 until
my webmaster upgrades (I use Python for CGI).  I know PyXML
has validating parsers, but I cannot compile things on the
(Unix) web server.  And even if I could, the compiler I
have access to would be different from the one used to
build Python for CGI.

I need to write a CGI script that does XML validation (and
then later also does other things).  It does not have to
be fully standards-compliant validation, but it should at
least check that elements are declared and that they appear
only where they are allowed in the XML tree.
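
To make that requirement concrete, here is roughly the kind
of check I mean.  It is only a sketch: the table ALLOWED and
the element names in it are made up for illustration, the
rules are written by hand instead of being read from a DTD,
and it is written for a current Python rather than 2.1.1.

```python
import xml.parsers.expat

# Hypothetical content rules: parent element -> children allowed directly inside it.
ALLOWED = {
    None: {"ROOTTAG"},          # None stands for the document level
    "ROOTTAG": {"SECTION"},
    "SECTION": {"PARA"},
    "PARA": set(),
}

def check_structure(xml_text, allowed=ALLOWED):
    """Return a list of error strings; an empty list means the tree passed."""
    errors = []
    stack = [None]              # currently open elements, innermost last

    def start(name, attrs):
        parent = stack[-1]
        if name not in allowed:
            errors.append("undeclared element: %s" % name)
        elif name not in allowed.get(parent, set()):
            errors.append("%s not allowed inside %s" % (name, parent))
        stack.append(name)

    def end(name):
        stack.pop()

    p = xml.parsers.expat.ParserCreate()
    p.StartElementHandler = start
    p.EndElementHandler = end
    p.Parse(xml_text, True)     # True: this string is the whole document
    return errors
```

This only checks parent/child placement; it says nothing
about ordering, repetition counts or attributes, which a
real DTD content model would also constrain.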

I tried to understand SAX and DOM but I gave up, and
effbot advises avoiding them anyway.  So I am studying
xml.parsers.expat now, but I am stuck.

The program below *does* print information about DOCTYPE
declarations but nothing about the element definitions in
the DTD.  I feed it an XML file with a DOCTYPE declaration
like <!DOCTYPE ROOTTAG SYSTEM "MYDTD.DTD"> and the DTD is
in the same directory.  I also tried feeding the DTD
itself to this program, but that doesn't work either
(ExpatError: syntax error at the first element definition).

Please help if you can.  




# file: minimal_validate.py
#
import xml.parsers.expat

def element_decl_handler(name, model):
    print 'ELEMENT definition: ', name, ' model: ', model

def doctype_decl_handler(doctypeName, systemId, publicId, has_internal_subset):
    print 'DOCTYPE declaration: '
    print '    doctypeName: ', doctypeName
    print '    systemId: ', systemId
    print '    publicId:', publicId
    print '    internal subset:', has_internal_subset

p = xml.parsers.expat.ParserCreate()

p.ElementDeclHandler = element_decl_handler
p.StartDoctypeDeclHandler = doctype_decl_handler

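import sys
# open() rather than file(): the file() builtin only exists from Python 2.2 on
data = open(sys.argv[1]).read()
p.Parse(data, 1)    # 1 = this is the final chunk of input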
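
One thing I have not been able to confirm on 2.1.1: as far
as I can tell, expat does not read an external DTD subset
unless parameter entity parsing is switched on and an
ExternalEntityRefHandler fetches the file itself.  Feeding
the DTD directly fails because a DTD on its own is not a
well-formed XML document.  Below is a sketch of that idea,
assuming a current Python and assuming the system
identifier is a plain file name relative to the XML file;
collect_element_decls is a name I made up.

```python
import os
import xml.parsers.expat

def collect_element_decls(xml_path):
    """Parse xml_path and return [(element_name, content_model), ...] from its DTD."""
    decls = []
    base_dir = os.path.dirname(os.path.abspath(xml_path))

    def element_decl_handler(name, model):
        decls.append((name, model))

    def external_entity_ref_handler(context, base, system_id, public_id):
        # expat calls this for the external DTD subset (context is None then).
        if system_id is None:
            return 1
        sub = p.ExternalEntityParserCreate(context)
        sub.ElementDeclHandler = element_decl_handler
        # Assumption: system_id is a file name relative to the XML file.
        with open(os.path.join(base_dir, system_id), "rb") as f:
            sub.Parse(f.read(), True)
        return 1                # nonzero tells expat to carry on

    p = xml.parsers.expat.ParserCreate()
    # Without this call expat skips the external subset entirely.
    p.SetParamEntityParsing(xml.parsers.expat.XML_PARAM_ENTITY_PARSING_ALWAYS)
    p.ElementDeclHandler = element_decl_handler
    p.ExternalEntityRefHandler = external_entity_ref_handler
    with open(xml_path, "rb") as f:
        p.Parse(f.read(), True)
    return decls
```

Each model comes back as a nested tuple describing the
content model, which would still have to be interpreted to
do any actual validation.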



