XML-schema 'best practice' question

Thu Sep 18 02:28:55 EDT 2008

Hi all

This is not strictly a Python question, but as I am writing in Python,
and as I know there are some XML gurus on this list, I hope it is
appropriate here.

XML-schemas are used to define the structure of an xml document, and
to validate that a particular document conforms to the schema. They
can also be used to transform the document, by filling in missing
attributes with default values.

In my situation, both the creation and the processing of the xml
document are under my control. I know that this begs the question 'why
use xml in the first place', but let's not go there for the moment.

Using minixsv, validating a document with a schema works, but is quite
slow. I appreciate that lxml may be quicker, but I think that my
question is still applicable.

I am thinking of adding a check to see if a document has changed since
it was last validated, and if not, skip the validation step. However,
I then do not get the default values filled in.

I can think of two possible solutions. I just wondered if this is a
common design issue when it comes to xml and schemas, and if there is
a 'best practice' to handle it.

1. Don't use default values - create the document with all values
filled in.

2. Use python to check for missing values and fill in the defaults
when processing the document.

Or maybe the best practice is to *always* validate a document before
processing it.

How do experienced practitioners handle this situation?

Thanks for any hints.

Frank Millman