XML Schema?

Romuald Texier rtexier at elikya.com
Thu Feb 15 06:24:38 EST 2001


Did you take a look at http://4suite.org/ ?

Regards.

Romuald Texier.

Harry George wrote:

> Thanks for the pointer.  I didn't find LTXML in my initial literature
> search.  Given that it exists, I don't see much reason to continue on
> my effort.  Maybe the xml-conf site could use the testcase generator.
> 
> Another possibility I considered was to do a Python binding to the
> Apache Xerces-C++ parser.  Do you know if anyone has done that?  That
> would hook into IBM's significant C++/Java XML-oriented releases.
> 
> I'm not a fan of Schema either, but it sure is being hyped to the
> local decision makers -- so I need a Python treatment.  The whole XML
> world has migrated from "It is deliberately simple so all languages can
> play" to "Let's complexify it so those pesky GPL guys can't keep up."
> 
> Uche Ogbuji <uche at ogbuji.net> writes:
> 
> > Harry George wrote:
> > > 
> > > Anyone have a Python XML Schema parser/validator?  I thought I saw
> > > comments that it wasn't being done yet as part of xml-sig.  Of course,
> > > we don't actually need an XML Schema validator in Python (Java or C++
> > > renditions would do fine), but there is a social cachet to it, so
> > > maybe it is worth the effort.
> > 
> > I'm not personally a fan of XML Schemas, but I think this would be a
> > very worthwhile project.  You'd probably get plenty of help as well.
> > 
> > > Assuming it is an open task, here is an approach.  Anyone see holes in
> > > this, besides it being a humongous task?
> > > 
> > > 1. Get the specs from OASIS-->W3C.
> > > 
> > > 2. Get test cases (for schemas and for instances).  There are a few
> > >    cases at xml-conf, but I think a lot more will be needed.  So I'll
> > >    need to generate them, and that suggests a case generator, plus of
> > >    course a test driver.  I have the testcase generator and driver
> > >    done.
> > > 
> > > 3. XML Schema is basically a regular expression problem, with nodes as
> > >    the "characters".
> > 
> > Hmm.  I wouldn't go this far.  That holds for the most basic parts of
> > the content model, but the entire data-type system and parts of the
> > content model need a different approach than a regular grammar.
> > 
> > >    So we can use classical lexer algorithms:
> > >    regexpr --> NFA --> DFA.  The hassles may be at the leaf nodes,
> > >    where XML Schema has lots of special cases.  I don't know if there
> > >    are non-re constraints in the specs, but if so I'd apply them after
> > >    the initial pass.
> > 
> > Interesting approach.
> > 
> > > 4. Given that state machine, run schemas through the parser until it
> > >    can build machines from valid schemas and detect invalid ones.
> > > 
> > > 5. Given a sound state machine, run instance test cases through the
> > >    package until it is passing valid instances and detecting invalid
> > >    ones.
> > > 
> > > 6. This would probably be an iterative enhancement exercise, once the
> > >    state machine engine was in place.
> > > 
> > > I have a lex-workalike I wrote in Modula-2, which I'll use as the
> > > starting point.  Probably could use a SAX input approach ("next node"
> > > instead of "next char"), maybe with one node of lookahead.
> > 
> > Just to note: LT-XML supposedly has a Python interface and an XSchemas
> > validator.  I still think your effort would be worthwhile, especially
> > given your fresh approach.
> > 
> > http://www.ltg.ed.ac.uk/software/xml/
> > 
> > 
> > --
> > Uche Ogbuji
> > Personal:   uche at ogbuji.net         http://uche.ogbuji.net
> > Work:       uche.ogbuji at fourthought.com     http://Fourthought.com
> 

-- 
Romuald Texier


