[XML-SIG] validating with XML schema (long)

Thomas B. Passin tpassin at comcast.net
Tue Oct 14 23:06:46 EDT 2003


Hawkeye Parker wrote:
 > i need to validate the structure/content of some xml, and i'm parsing
 > etc. with python.  i've been learning a bit about XML Schema and i'd
 > like to confirm some basic assumptions:
 >
 > -validation with XML Schema (or any other validation language)
 > doesn't "just happen".  i.e., just because you specify an .xsd file in
 > your xml, you still need to explicitly "call" it to validate the xml.
 > it must be, correct?
 >

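You must make validation happen, but not by "calling" a specified xsd
file.  The schema location named in the document is only a hint; you run
a validating processor over the document, and the processor decides
which schema to load.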

 > assuming i'm right so far:  in terms of validation, it seems that DTD
 > is unwieldy and that XML Schema (.xsd) is a much better choice,

Huh???  Most people think that XML Schema is the unwieldy beast, not the DTD.

 > except that there's little support for it in general, and specifically
 > in python.  in fact, there doesn't seem to be a whole lot of xml
 > validation support at all . . . .  this makes me think that:
 >
 > -there are other (more sensible?) ways to validate the xml, like
 > parsing into DOM and then using python to validate according to your
 > desires.  maybe messy but obvious.

Not very feasible except for quite restricted kinds of validation, though.
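
For what it's worth, a hand-rolled check of that kind might look like the
sketch below.  The ALLOWED_CHILDREN table, StructureError and
check_structure are names made up for illustration; they stand in for
whatever rules you actually care about.  It only checks element names and
nesting, which is about as far as this approach goes comfortably.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: element name -> names of the children allowed inside it.
ALLOWED_CHILDREN = {
    "PPSiteBuilder": {"site"},
    "site": set(),
}


class StructureError(Exception):
    """Raised when the document breaks one of the hand-written rules."""


def check_structure(xml_text, root_name="PPSiteBuilder"):
    root = ET.fromstring(xml_text)
    if root.tag != root_name:
        raise StructureError("unexpected root element <%s>" % root.tag)
    # Walk the tree and compare every child against the table.
    stack = [root]
    while stack:
        parent = stack.pop()
        allowed = ALLOWED_CHILDREN.get(parent.tag, set())
        for child in parent:
            if child.tag not in allowed:
                raise StructureError(
                    "<%s> is not allowed inside <%s>" % (child.tag, parent.tag))
            stack.append(child)


GOOD = "<PPSiteBuilder><site></site></PPSiteBuilder>"
BAD = ("<PPSiteBuilder><site></site>"
       "<reallyReallyStupidWrongTag></reallyReallyStupidWrongTag>"
       "</PPSiteBuilder>")
```

Calling check_structure(GOOD) returns quietly, while check_structure(BAD)
raises StructureError.  Ordering, cardinality, attributes and data types
would all need more hand-written code, which is where a real schema
language starts to pay off.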

 > -xml is new, validation of xml is newer, validation with XML Schema
 > is newer yet.
 >
 > in any case, i've gotten XSV and run it against a few of my own
 > examples.  again, i'm confused:  XSV seems to validate the XML Schema
 > itself (schemaErrors) as much as the XML (instanceErrors).  i guess this
 > is good.  moreover, i was expecting to write something like this:
 >
 > XSV.validate('foo.xml', 'foo.xsd')
 >
 > which would raise an exception if anything went wrong with the
 > validation of the XML (.xml) file according to the XML Schema file
 > (.xsd).  instead, i get an (opaque) xml object that i will have to parse
 > further, eventually to raise my own custom exceptions.

xsv comes with an xslt stylesheet to make the results easier to read. 
You could start with that (command line operation) until you understand 
what xsv is telling you.

 >
 > lastly, here's an example of some simple xml and an empty schema:
 >
 > <?xml version="1.0" encoding="UTF-16" ?>
 > <PPSiteBuilder xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
 >     xsi:noNamespaceSchemaLocation='PPSiteBuilderSchema.xsd'>
 > 	<site></site>
 > 	<reallyReallyStupidWrongTag></reallyReallyStupidWrongTag>
 > </PPSiteBuilder>
 >
 > <?xml version="1.0"?>
 > <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
 > </xsd:schema>
 >
 > here's the xsv output:
 >
 > <?xml version='1.0'?>
 > <xsv xmlns="http://www.w3.org/2000/05/xsv" docElt="{None}PPSiteBuilder"
 >      instanceAssessed="true" instanceErrors="0" schemaErrors="0"
 >      schemaLocs="None -> PPSiteBuilderSchema.xsd; None -> PPSiteBuilderSchema.xsd"
 >      target="file:///C:/sandbox/site_builder/siteBuilder.xml"
 >      validation="lax"
 >      version="XSV 2.5-2 of 2003/07/09 13:08:04">
 >   <schemaDocAttempt
 >     URI="file:///C:/sandbox/site_builder/PPSiteBuilderSchema.xsd"
 >                     outcome="success" source="schemaLoc"/>
 >   <schemaDocAttempt
 >     URI="file:///C:/sandbox/site_builder/PPSiteBuilderSchema.xsd"
 >                     outcome="redundant" source="schemaLoc"/>
 > </xsv>
 >
 >
 > XSV does not complain about this example,

But it does tell you what it did.  In this case, xsv could not find a
declaration for any of the elements (since the schema is empty), so it
fell back to "lax" mode - validation="lax" in the output.  This means it
did not check the elements it found.  XML Schema validation can be
either lax or strict - you have to read up on it.  With lax validation,
xsv reported no errors because no declaration in the schema was violated
(there were none to violate).
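
If you want an exception rather than a report, something along these
lines could sit on top of the xsv output.  This is only a sketch: it
assumes the report always has the shape shown in your message (an xsv
root element carrying instanceErrors, schemaErrors and validation
attributes), and treating a "lax" run as a failure is my own policy
choice, not something xsv does.  check_xsv_report and ValidationFailed
are made-up names, and REPORT is a trimmed copy of your output.

```python
import xml.etree.ElementTree as ET

# Namespace of the report, as it appears in the output quoted above.
XSV_NS = "{http://www.w3.org/2000/05/xsv}"


class ValidationFailed(Exception):
    """Raised when the report does not show a clean, strict pass."""


def check_xsv_report(report_xml, allow_lax=False):
    root = ET.fromstring(report_xml)
    if root.tag != XSV_NS + "xsv":
        raise ValueError("not an XSV report: %s" % root.tag)
    # Attribute names are taken from the quoted output; a missing
    # attribute is treated as zero errors (an assumption).
    schema_errors = int(root.get("schemaErrors", "0"))
    instance_errors = int(root.get("instanceErrors", "0"))
    if schema_errors or instance_errors:
        raise ValidationFailed(
            "%d schema error(s), %d instance error(s)"
            % (schema_errors, instance_errors))
    # A lax run means the elements were not actually checked.
    if root.get("validation") == "lax" and not allow_lax:
        raise ValidationFailed("validation was lax; nothing was checked")


REPORT = """<xsv xmlns="http://www.w3.org/2000/05/xsv" docElt="{None}PPSiteBuilder"
     instanceAssessed="true" instanceErrors="0" schemaErrors="0"
     validation="lax" version="XSV 2.5-2 of 2003/07/09 13:08:04"/>"""
```

With the report above, check_xsv_report(REPORT) raises ValidationFailed
because of the lax run, and check_xsv_report(REPORT, allow_lax=True)
passes.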

 > though none of the elements (<PPSiteBuilder>, <site>, etc.) are
 > specified in the Schema.  i expect i'm missing something basic about
 > xml, validation, and XML Schema, but this is just the sort of *very bad*
 > xml that i want to be able to catch during validation.
 >

Learn how to enforce strict validation, or just use a DTD, or go to 
RELAX NG.  If you use a DTD, you have to use a validating parser and 
tell it to validate - Python can do this.  Search Google, you should 
find enough information.
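
To give an idea of the DTD route, here is a rough sketch.  It uses the
third-party lxml package as a stand-in for whichever validating parser
you end up with, so take that as an assumption on my part.  The DTD text
is invented for your example (one site child and nothing else), and
dtd_errors is a made-up helper name.

```python
import io

# Hypothetical DTD for the example document: exactly one <site> child.
DTD_TEXT = """
<!ELEMENT PPSiteBuilder (site)>
<!ELEMENT site (#PCDATA)>
"""

VALID_DOC = "<PPSiteBuilder><site></site></PPSiteBuilder>"
INVALID_DOC = ("<PPSiteBuilder><site></site>"
               "<reallyReallyStupidWrongTag></reallyReallyStupidWrongTag>"
               "</PPSiteBuilder>")


def dtd_errors(xml_text, dtd_text=DTD_TEXT):
    """Return a list of error messages; the list is empty if the document is valid."""
    # lxml is third-party; it is imported here so that the rest of the
    # module still loads on a machine that does not have it installed.
    from lxml import etree

    dtd = etree.DTD(io.StringIO(dtd_text))
    root = etree.fromstring(xml_text)
    if dtd.validate(root):
        return []
    # error_log holds the messages from the most recent validate() call.
    return [entry.message for entry in dtd.error_log]
```

dtd_errors(VALID_DOC) should come back empty, while
dtd_errors(INVALID_DOC) should list at least one complaint about the
unexpected element.  Note that the DTD is handed to the validator
explicitly here; nothing in the document itself triggers the check.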

Cheers,

Tom P




