XML Validation with Python

Will Stuyvesant hwlgw at hotmail.com
Thu Jul 3 10:47:33 EDT 2003


I could not find a way to write a simple command-line
utility for XML validation using only the Python
Standard Library.  I also found the xml.sax
documentation unclear, and there are no good examples
to look at.  The XML examples in the Python Cookbook
and in Python in a Nutshell are BAD, too.  Nowhere is
the design of the class library motivated (for example,
"why do you need a handler in xml.sax.parse(), and why
is there no default handler?"), and there are no simple
examples of how to use it.  I like the approach taken
in The Python Standard Library by Fredrik Lundh MUCH
more: clear examples and explanations.  It is a damn
shame that O'Reilly does not want a new edition; the
poor guy is now putting a free version on his website.
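
On the handler question: SAX is event-driven, so the
parser reports each element and text event to a handler
object instead of building a tree.  The base class
xml.sax.handler.ContentHandler has do-nothing methods,
so it can stand in as a "default" handler.  The sketch
below uses it that way.  The function name
is_well_formed is made up for illustration.  Note that
this only checks well-formedness; the expat parser
behind xml.sax is non-validating, so it does not check
a document against its DTD.

```python
import xml.sax
from xml.sax.handler import ContentHandler


def is_well_formed(xml_bytes):
    """Return True if xml_bytes parses as well-formed XML.

    This is NOT DTD validation: the standard library's expat-based
    SAX parser is non-validating.
    """
    try:
        # ContentHandler's callbacks are all no-ops, so the parser
        # simply walks the document and raises on the first error.
        xml.sax.parseString(xml_bytes, ContentHandler())
    except xml.sax.SAXParseException:
        return False
    return True
```

Calling is_well_formed(b"<a><b/></a>") should succeed,
while a document with a mismatched tag such as
b"<a><b></a>" should be rejected.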

I have found a solution for XML validation using the
third-party pyRXP library from
http://www.reportlab.com/xml/pyrxp.html.  Their
"download and install" info is a mess.  I first
downloaded a .ZIP containing only .DLL and .PYD files,
and it turned out you had to plunk those into
C:\Python22\DLL.  This made me turn away from pyRXP
initially, because a bad installation usually means bad
software.  But later on I found a bigger .ZIP with more
stuff, so maybe I should have used that one?  At least
it works now: I can do "import pyRXP".  Make sure you
also download pyRXP_Documentation.pdf; it is good
documentation with examples.  I notice the docs in the
bigger .ZIP are in .RML format... whatever that is!

I cannot believe the amount of bad documentation and
bad install approaches I see with third-party software.
That is why I normally stick to the Python Standard
Library only.

Anyway, I can now do XML validation; "validate.py" is
below.  But it does not solve my initial problem: if
the file validates, validate.py prints nothing, and if
there is a mistake it prints an error message.  What I
really wanted, to give more confidence that the
validation is okay, was to print 1 or 0 depending on
the result, but I have not figured out yet how to do
that and now I am too tired of it all...

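# file: validate.py
# Validate an XML file with pyRXP.  Prints nothing on success;
# a parse failure shows up as an error message.
import sys

if len(sys.argv) < 2 or sys.argv[1] in ['-h', '--help', '/?']:
    print('Usage: validate.py xmlfilename')
    sys.exit()

import pyRXP

# Read the whole document into memory and close the file.
f = open(sys.argv[1], 'r')
xml_text = f.read()
f.close()

p = pyRXP.Parser()
p.parse(xml_text)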
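
The 1-or-0 output wished for above could be sketched as
follows.  This assumes that the error message comes
from an exception raised by the parser's parse() call;
the post does not confirm that.  The helper name
validation_flag is hypothetical, and it catches
Exception broadly because the exact exception class is
not given in the post.

```python
def validation_flag(parse, text):
    """Return 1 if parse(text) completes, 0 if it raises.

    `parse` is any callable that raises an exception on a bad
    document.  That pyRXP's parser behaves this way is an
    assumption, not something stated in the post.
    """
    try:
        parse(text)
    except Exception:
        return 0
    return 1


def report(filename, parse):
    """Read `filename`, print 1 or 0, and return the same flag."""
    with open(filename, 'r') as f:
        flag = validation_flag(parse, f.read())
    print(flag)
    return flag
```

With pyRXP installed, report(sys.argv[1],
pyRXP.Parser().parse) would then print 1 for a document
the parser accepts and 0 otherwise.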



