[XML-SIG] Re: Issues with Unicode type

Daniel Veillard veillard@redhat.com
Mon, 23 Sep 2002 17:32:04 -0400


On Mon, Sep 23, 2002 at 03:16:08PM -0600, Uche Ogbuji wrote:
> Oh, but then Python is so much simpler:
> 
>     
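> import re
> 
> # Matches one UTF-16 surrogate pair (high surrogate followed by low surrogate).
> SP_PAT = re.compile(u"[\uD800-\uDBFF][\uDC00-\uDFFF]")
> def smart_len(u):
>     # Each pair occupies two code units but is a single character.
>     sp_count = len(SP_PAT.findall(u))
>     return len(u) - sp_count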
> 
> 
> Problem solved.

  Modulo the space and CPU cost of the operation: findall builds a list of
every match just so it can be counted (okay, you can tell I'm primarily a
C coder :-)
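
  To make the cost concern concrete, here is a rough sketch of a single-pass
count that allocates no intermediate list. The name smart_len_onepass is
made up for illustration, and it assumes the string holds UTF-16 code units
as in the quoted snippet. Unlike the regex version it discounts every low
surrogate, so the two only agree on well-formed input.

```python
def smart_len_onepass(u):
    """Count characters in a string of UTF-16 code units in one pass.

    Hypothetical helper: assumes well-formed input, where every low
    surrogate is the second half of a pair.
    """
    count = 0
    for ch in u:
        # A low surrogate completes a pair whose high surrogate was
        # already counted, so it must not be counted again.
        if not (u"\uDC00" <= ch <= u"\uDFFF"):
            count += 1
    return count
```

  This still walks the whole string on every call, so it trades the extra
list for a Python-level loop; whether that is actually faster than the regex
would have to be measured.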

> The great thing about Python is even when it frustrates you one moment, it 
> finds a way to quickly make up for it.

  I don't think strings are classes in Python; I believe they are built-in
types, and hence one cannot make a subclass of string whose instances have
all the length/walk/extract operations special-cased to reflect XML Unicode
string semantics. I (and Eric, I bet) would like to be wrong on this :-)
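
  For what it is worth, here is a small sketch of what such a subclass might
look like, assuming Python 2.2 or later, where built-in types can be
subclassed. XMLString and _SP_PAT are hypothetical names, and the base class
is written as str (on a narrow Python 2 build it would be unicode). Only
len() is special-cased here.

```python
import re

# Matches one UTF-16 surrogate pair (high surrogate, then low surrogate).
_SP_PAT = re.compile(u"[\uD800-\uDBFF][\uDC00-\uDFFF]")


class XMLString(str):
    """Hypothetical string subclass whose len() counts a surrogate pair once."""

    def __len__(self):
        # Raw code-unit length minus one for each surrogate pair found.
        return str.__len__(self) - len(_SP_PAT.findall(self))


s = XMLString(u"a\ud800\udc00b")
print(len(s))          # length with the pair counted as one character
print(str.__len__(s))  # underlying code-unit length
```

  Indexing, slicing and iteration are untouched in this sketch and still work
on code units, and inherited methods return plain strings rather than
XMLString, so a complete version would need to override those as well.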

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard@redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/