[XML-SIG] Re: Issues with Unicode type

Daniel Veillard veillard@redhat.com
Mon, 23 Sep 2002 17:46:59 -0400


On Mon, Sep 23, 2002 at 11:41:04PM +0200, Eric van der Vlist wrote:
> On Mon, 2002-09-23 at 23:26, Daniel Veillard wrote:
> > On Mon, Sep 23, 2002 at 10:50:34PM +0200, Eric van der Vlist wrote:
> > > Except that it's not the only location where it's broken and that won't
> > > work with regular expressions. If I define a pattern such as ".{5}" I
> > > want to check that this is 5 unicode characters, not 5 words of 16
> > > bits...
> > 
> >   I don't know about Relax regexp, but for schemas I had to rewrite
> > an engine to cope with the full regexps of the beast.
> 
> That's the same beast :-( ... there is no such thing as Relax NG regexp
> and it's just borrowing the datatypes from W3C XML Schema and most of
> their facets including patterns.

  Okay, I see.

> Would you have Python bindings available for this regexps engine?

  That should not be too hard, but they would operate on UTF-8 strings, like
all libxml2 internals. Until now regexps were not compiled by default in
libxml2; I switched them on last week, so now could be a good time to add
bindings. I will try to do this before the next release.
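
  To make the counting issue from earlier in the thread concrete, here is a
minimal sketch. It does not use libxml2 or any bindings for it; it assumes a
Python 3 interpreter, where strings are indexed by code point, and uses only
the standard `re` module. The helper name `count_units` and the sample string
are made up for the illustration.

```python
import re


def count_units(text):
    """Return (code points, UTF-16 code units, UTF-8 bytes) for text."""
    return (
        len(text),                           # Unicode code points
        len(text.encode("utf-16-le")) // 2,  # 16-bit words
        len(text.encode("utf-8")),           # bytes as a UTF-8 engine sees them
    )


# Four ASCII letters plus U+1D11E (MUSICAL SYMBOL G CLEF), which lies
# outside the Basic Multilingual Plane and needs a surrogate pair in UTF-16.
sample = "abcd\U0001D11E"

code_points, utf16_units, utf8_bytes = count_units(sample)

# A pattern such as ".{5}" should count characters, not storage units.
matches_five_chars = re.fullmatch(r".{5}", sample, re.DOTALL) is not None

print(code_points, utf16_units, utf8_bytes, matches_five_chars)
```

  The same five-character string occupies six 16-bit words and eight UTF-8
bytes, so an engine working on UTF-8 input has to decode multi-byte sequences
before applying a quantifier like "{5}".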

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard@redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/