[XML-SIG] Re: Issues with Unicode type

Daniel Veillard veillard@redhat.com
Mon, 23 Sep 2002 16:31:17 -0400


On Mon, Sep 23, 2002 at 07:15:43PM +0200, Martin v. Loewis wrote:
> Eric van der Vlist <vdv@dyomedea.com> writes:
> 
> > I would say that since an XML document is defined as a set of unicode
> > characters, a single character "&#x10800;" 
> 
> ... is ill-formed. Only characters below &#xFFFF; are allowed in XML,
> strictly speaking.

  Wrong, sorry, see the spec!

    http://www.w3.org/TR/REC-xml#NT-Char

 &#x10800; is perfectly legal (the Char production includes the range
[#x10000-#x10FFFF]) and should be viewed as a single character, for
example in XPath expressions. This doesn't mean that you have to change
your internal encoding, but you need to make sure that the wrappers for
all the accesses (computing length, etc.) are doing the computation right.

  Eric's problem can probably be solved technically by simply providing
such a wrapper function.
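
  As a rough sketch of what such a wrapper could look like (this is not
libxml or PyXML code; the names utf16_units and codepoint_length are made
up for illustration), assume the internal storage is a sequence of UTF-16
code units. A length function then only has to count each surrogate pair
as one character:

```python
import struct


def utf16_units(text):
    """Hypothetical helper: return the UTF-16 code units of text as ints."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def codepoint_length(units):
    """Hypothetical wrapper: count characters in UTF-16 code units.

    A high surrogate (0xD800-0xDBFF) followed by a low surrogate
    (0xDC00-0xDFFF) counts as one character. Any other unit, including
    an unpaired surrogate, counts as one on its own.
    """
    count = 0
    i = 0
    n = len(units)
    while i < n:
        is_pair = (
            0xD800 <= units[i] <= 0xDBFF
            and i + 1 < n
            and 0xDC00 <= units[i + 1] <= 0xDFFF
        )
        i += 2 if is_pair else 1
        count += 1
    return count


# "a", U+10800, "b": four UTF-16 code units, three characters.
units = utf16_units("a\U00010800b")
print(len(units), codepoint_length(units))
```

  Indexing and slicing wrappers would need the same pair-aware walk. On a
Python build whose Unicode type already stores full code points, the
built-in len() gives the character count directly and no wrapper is needed.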

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard@redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/