[Python-Dev] just say no...
M.-A. Lemburg
mal@lemburg.com
Fri, 12 Nov 1999 16:24:33 +0100
"Fred L. Drake, Jr." wrote:
>
> M.-A. Lemburg writes:
> > Such a buffer is needed to implement "s" and "s#" argument
> > parsing. It's a simple requirement to support those two
> > parsing markers -- there's not much to argue about, really...
> > unless, of course, you want to give up Unicode object support
> > for all APIs using these parsers.
>
> Perhaps I missed the agreement that these should always receive
> UTF-8 from Unicode strings. Was this agreed upon, or has it simply
> not been argued over in favor of other topics?
It's been in the proposal since version 0.1. The idea is to
provide a decent way of making existing scripts Unicode-aware.
> If this has indeed been agreed upon... at least it can be computed
> on demand rather than at initialization!
This is what I intended to implement. The <defencbuf> buffer
will be filled upon the first request for the UTF-8 encoding.
"s" and "s#" are examples of such requests. The buffer will
remain intact until the object is destroyed (since other code
could store the pointer received via e.g. "s").
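
To illustrate the fill-on-first-request behaviour, here is a rough
sketch in Python rather than C. It is not the proposed implementation:
the class LazyUnicode and the method as_utf8() are hypothetical names
made up for this example, and only the defencbuf name comes from the
proposal.

```python
class LazyUnicode:
    """Hypothetical stand-in for the proposed Unicode object."""

    def __init__(self, text):
        self.text = text
        # Nothing is encoded at initialization time.
        self.defencbuf = None

    def as_utf8(self):
        # Fill the buffer on the first request only; later requests
        # hand back the very same bytes object.
        if self.defencbuf is None:
            self.defencbuf = self.text.encode("utf-8")
        return self.defencbuf


u = LazyUnicode("Gr\u00fc\u00dfe")
first = u.as_utf8()    # encodes and caches
second = u.as_utf8()   # reuses the cached buffer
```

Since the cached buffer is only dropped together with the object, a
caller that kept the result of an earlier request still holds valid
data for as long as the object lives.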
> Perhaps there should be two
> pointers: one to the UTF-8 buffer and one to a PyObject; if the
> PyObject is there it's an "old-style" string that's actually providing
> the buffer. This may or may not be a good idea; there's a lot of
> memory expense for long Unicode strings converted from UTF-8 that
> aren't ever converted back to UTF-8 or accessed using "s" or "s#".
> Ok, I've talked myself out of that. ;-)
Note that Unicode objects are a completely different beast ;-)
String objects are not touched in any way by the proposal.
--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 49 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/