XML Validation with Python

Alan Kennedy alanmk at hotmail.com
Thu Jul 3 14:24:53 EDT 2003


Will Stuyvesant wrote:

> Because I want to use it from a cgi script written in Python.  And I
> am not allowed to install 3rd party stuff on the webserver.  Even if I
> was it would not be a solution since it has to be easy to put it on
> another webserver.  But of course: if there is a validating parser
> written completely in Python then I can use it too!  If it runs under
> Python 2.1.1, that is (that is what they have at the website).  I will
> investigate this www.garshol.priv.no link you gave me, thank you.

Glad to be of help.

There is a comment on Lars's site that is vaguely worrying. It says:

"Note that it is recommended to use xmlproc through the SAX API rather
than directly, since this provides much greater freedom in the choice
of parsers. (For example, you can switch to using Pyexpat which is
written in C without changing your code.)"

That seems to indicate to me that the author is encouraging users not
to rely on xmlproc too much. Perhaps performance might be an issue?
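
To make the "through the SAX API" advice concrete, here is a minimal
sketch of asking SAX for a preferred parser and falling back to the
default one. It is written for a current Python rather than 2.1.1, the
driver name "xml.sax.drivers2.drv_xmlproc" is my assumption about where
the xmlproc SAX driver lives, and parse_with_sax and ElementCounter are
made-up names for illustration.

```python
import io
import xml.sax
from xml.sax import SAXNotRecognizedException, SAXNotSupportedException
from xml.sax.handler import ContentHandler, feature_validation


class ElementCounter(ContentHandler):
    """Counts how often each element name occurs in the document."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1


def parse_with_sax(xml_bytes, preferred=("xml.sax.drivers2.drv_xmlproc",)):
    """Parse xml_bytes with the first available SAX driver.

    The driver name in `preferred` is an assumption. If it cannot be
    imported, make_parser() falls back to the default (expat) driver.
    Returns (element counts, whether validation was switched on).
    """
    parser = xml.sax.make_parser(list(preferred))

    # Ask for validation, but cope with a parser that cannot provide it
    # (the expat driver refuses this feature).
    try:
        parser.setFeature(feature_validation, True)
        validating = True
    except (SAXNotSupportedException, SAXNotRecognizedException):
        validating = False

    handler = ElementCounter()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.counts, validating


if __name__ == "__main__":
    counts, validating = parse_with_sax(b"<order><item/><item/></order>")
    print(counts, "validating:", validating)
```

The handler code never mentions a specific parser, which is the point
of the quoted advice: only the list passed to make_parser() changes
when you swap parsers.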

One more thing: there are alternative validation methods, which may or
may not be suitable, depending on your requirements.

For example, there is pytrex, a Python implementation of James Clark's
Tree Regular EXpressions (TREX). It was written by James Tauber in pure
Python, and uses the inbuilt C parser. I personally find TREX and
pytrex a very natural, and thus easy to learn, way to check structures
in a tree, including data validation. Pytrex is not complete, and is no
longer maintained, but what's there is good code, with nice little
features such as the ability to define your own datatype validation
functions, which are called at match time.

http://pytrex.sourceforge.net/
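
To illustrate the general idea of matching a tree against a pattern
with datatype checks called at match time, here is a minimal sketch in
pure Python. It is NOT the pytrex API or TREX syntax: the tuple-based
pattern format and the names matches and is_positive_int are invented
for illustration.

```python
import xml.etree.ElementTree as ET


def is_positive_int(text):
    """Hypothetical datatype check: text must be an integer > 0."""
    return text is not None and text.strip().isdigit() and int(text) > 0


def matches(element, pattern):
    """Return True if element fits pattern.

    A pattern is (tag, text_check, child_patterns), where text_check is
    None or a function called on the element's text at match time, and
    child_patterns lists the expected children in order.
    """
    tag, text_check, child_patterns = pattern
    if element.tag != tag:
        return False
    if text_check is not None and not text_check(element.text):
        return False
    children = list(element)
    if len(children) != len(child_patterns):
        return False
    return all(matches(c, p) for c, p in zip(children, child_patterns))


# Illustrative pattern: <order> holding one <quantity> with a positive integer.
ORDER = ("order", None, [("quantity", is_positive_int, [])])

if __name__ == "__main__":
    good = ET.fromstring("<order><quantity>3</quantity></order>")
    bad = ET.fromstring("<order><quantity>none</quantity></order>")
    print(matches(good, ORDER), matches(bad, ORDER))
```

This only handles fixed sequences of children; a real pattern language
adds choice, repetition, attributes and so on.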

Pytrex is unlikely ever to be completed, because James Clark has
abandoned TREX in favour of RELAX-NG, for which I haven't seen any
Python implementation.

http://www.relaxng.org/

There is a Python implementation of XML-Schema, xsv, written by Henry
Thompson, which I think was kept fairly up to date with the XML-Schema
spec as it evolved. However, given the complexity of XML-Schema, and
never having tried to use xsv, I have no idea how stable it is.

http://www.ltg.ed.ac.uk/~ht/xsv-status.html

I note that the author also maintains a web service for validating
documents.

Are you sure that XML validation-parsing is the right solution for
your problem? There may be simpler ways.

-- 
alan kennedy
-----------------------------------------------------
check http headers here: http://xhaus.com/headers
email alan:              http://xhaus.com/mailto/alan



