cached encoding (Re: [Python-Dev] Internationalization Toolkit)

M.-A. Lemburg mal@lemburg.com
Wed, 10 Nov 1999 10:55:42 +0100


Fredrik Lundh wrote:
> 
> Guido van Rossum <guido@CNRI.Reston.VA.US> wrote:
> > One specific question: in your discussion of typed strings, I'm not
> > sure why you couldn't convert everything to Unicode and be done with
> > it.  I have a feeling that the answer is somewhere in your case study
> > -- maybe you can elaborate?
> 
> Marc-Andre writes:
> 
>     Unicode objects should have a pointer to a cached (read-only) char
>     buffer <defencbuf> holding the object's value using the current
>     <default encoding>.  This is needed for performance and internal
>     parsing (see below) reasons. The buffer is filled when the first
>     conversion request to the <default encoding> is issued on the object.
> 
> keeping track of an external encoding is better left
> for the application programmers -- I'm pretty sure that
> different application builders will want to handle this
> in radically different ways, depending on their environment,
> underlying user interface toolkit, etc.

It's not that hard to implement. All you have to do is check
whether the encoding used to fill <defencbuf> is still the same
as the thread's view of <default encoding>; if it isn't, the
buffer gets refilled. The <defencbuf> buffer is needed to
implement "s" et al. argument parsing anyway.
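
A rough sketch of that check, in Python rather than C to keep it short.
This is only an illustration under some assumptions: the names
CachedText, get_default_encoding and set_default_encoding are made up
here, the per-thread default is modelled with threading.local, and
"utf-8" as the fallback is an arbitrary choice, not part of the proposal.

```python
import threading

# Hypothetical per-thread <default encoding>; name and storage are assumptions.
_state = threading.local()


def get_default_encoding():
    # Fall back to "utf-8" if this thread never set one (arbitrary choice).
    return getattr(_state, "encoding", "utf-8")


def set_default_encoding(name):
    _state.encoding = name


class CachedText:
    """Toy stand-in for a Unicode object carrying a cached <defencbuf>."""

    def __init__(self, value):
        self.value = value
        self._defencbuf = None   # cached, read-only encoded bytes
        self._defenc = None      # encoding the cache was filled with
        self.encode_calls = 0    # counts real conversions, for illustration

    def default_encoded(self):
        current = get_default_encoding()
        # Refill only on first use or when the thread's default has changed.
        if self._defencbuf is None or self._defenc != current:
            self._defencbuf = self.value.encode(current)
            self._defenc = current
            self.encode_calls += 1
        return self._defencbuf
```

Repeated calls under the same default encoding reuse the buffer; the
conversion runs again only after the thread's default changes.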
 
> besides, this is how Tcl would have done it.  Python's
> not Tcl, and I think you need *very* good arguments
> for moving in that direction.
> 
> </F>

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    51 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/