[XML-SIG] Re: Issues with Unicode type

Eric van der Vlist vdv@dyomedea.com
26 Sep 2002 15:07:15 +0200


On Thu, 2002-09-26 at 14:48, Daniel Veillard wrote:

>   Hum, I think you would need a rewrite anyway for full conformance,
> the XML Schemas regexps have more complex constructs than standard regexps,
> the quantifiers may be richer (not 100% sure, I didn't check fully)
> and all the character classes/group/category/blocks are not part of
> "normal" regexps (well I never saw any such description in regexps help
> or man before, so I doubt it appeared magically in python).

I am not 100% sure...

The quantifiers are "*", "+", "?" and the "{}" constructs and they seem
to work fine in Python...
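
A quick way to check this, as a minimal sketch: the helper name
xsd_like_match is made up for illustration, and it relies on re.fullmatch
because W3C XML Schema patterns implicitly match the whole value rather
than a substring.

```python
import re


def xsd_like_match(pattern, value):
    """Return True if the whole value matches the pattern.

    Hypothetical helper: XML Schema patterns are implicitly anchored at
    both ends, which re.fullmatch emulates.
    """
    return re.fullmatch(pattern, value) is not None


# The four quantifier forms mentioned above: "*", "+", "?" and "{}".
for pattern, value in [
    (r"a*b+c?", "aabbb"),
    (r"\d{2,4}", "2002"),
    (r"x{3}", "xxx"),
    (r"x{2,}", "xxxxx"),
]:
    print(pattern, value, xsd_like_match(pattern, value))
```

This only exercises the quantifier syntax; it says nothing about the
character class escapes discussed next.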

As for the character classes/group/category/blocks, I was wondering if
they couldn't be described and generated with chargen.py.

A preparsing of the W3C XML Schema patterns could then be done to use
them.
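
To make the preparsing idea concrete, here is a rough sketch. It does not
use chargen.py (whose interface I am not assuming here); instead it builds
the classes from the standard unicodedata module. The names category_class
and preparse are invented for this example, only category escapes such as
\p{L} or \p{Nd} outside of brackets are handled, block names such as
\p{IsBasicLatin} are left untouched, and only the Basic Multilingual Plane
is covered.

```python
import re
import unicodedata


def category_class(prefix, limit=0x10000):
    """Return the body of a character class covering every code point
    below `limit` whose Unicode general category starts with `prefix`."""
    ranges = []
    start = None
    for cp in range(limit):
        if unicodedata.category(chr(cp)).startswith(prefix):
            if start is None:
                start = cp
        elif start is not None:
            ranges.append((start, cp - 1))
            start = None
    if start is not None:
        ranges.append((start, limit - 1))
    parts = []
    for lo, hi in ranges:
        if lo == hi:
            parts.append("\\u%04x" % lo)
        else:
            parts.append("\\u%04x-\\u%04x" % (lo, hi))
    return "".join(parts)


# Matches \p{L}, \p{Nd}, ... but not block names such as \p{IsBasicLatin}.
_CATEGORY_ESCAPE = re.compile(r"\\p\{([A-Z][a-z]?)\}")


def preparse(pattern):
    """Rewrite category escapes into explicit classes that Python's re
    module accepts. Assumes the escapes appear outside of brackets."""
    return _CATEGORY_ESCAPE.sub(
        lambda m: "[" + category_class(m.group(1)) + "]", pattern
    )


letters = re.compile(preparse(r"\p{L}+"))
print(bool(letters.fullmatch("Vlist")))
```

A real implementation would cache the generated classes and handle escapes
nested inside bracket expressions; this sketch does neither.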

What might be tougher are features such as character class subtraction
(e.g. [\p{IsBasicLatin}-[^\p{L}]]), which AFAIK is an extension over Perl
regexps.
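
One possible workaround, offered only as a sketch: a construct of the form
[A-[B]] (everything in A except what B matches) can be emulated with a
negative lookahead in front of the base class. The helper name subtract is
hypothetical, and the example hard-codes the classes on the assumption
that Basic Latin is U+0000 to U+007F and that the only letters in that
block are A-Z and a-z.

```python
import re


def subtract(base_class, removed_class):
    """Emulate [base-[removed]]: match one character from base_class that
    removed_class does not match. Both arguments are complete bracket
    expressions. The result is wrapped in a non-capturing group so that a
    quantifier appended to it applies to the lookahead as well."""
    return "(?:(?!%s)%s)" % (removed_class, base_class)


basic_latin = r"[\x00-\x7f]"   # stands in for \p{IsBasicLatin}
not_letter = r"[^A-Za-z]"      # stands in for [^\p{L}] within that block

# Equivalent of [\p{IsBasicLatin}-[^\p{L}]]+ under the assumptions above.
pattern = subtract(basic_latin, not_letter) + "+"

for value in ["Eric", "Er1c"]:
    print(value, re.fullmatch(pattern, value) is not None)
```

The lookahead form is convenient because it leaves both classes intact, so
it could be produced mechanically by the same preparsing step.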

Eric (impatient to see the libxml bindings for this)
-- 
Rendez-vous à Paris.
                          http://www.technoforum.fr/integ2002/index.html
------------------------------------------------------------------------
Eric van der Vlist       http://xmlfr.org            http://dyomedea.com
(W3C) XML Schema ISBN:0-596-00252-1 http://oreilly.com/catalog/xmlschema
------------------------------------------------------------------------