[Python-Dev] just say no...

Fred L. Drake, Jr. fdrake@acm.org
Fri, 12 Nov 1999 09:34:56 -0500 (EST)


M.-A. Lemburg writes:
 > Such a buffer is needed to implement "s" and "s#" argument
 > parsing. It's a simple requirement to support those two
 > parsing markers -- there's not much to argue about, really...
 > unless, of course, you want to give up Unicode object support
 > for all APIs using these parsers.

  Perhaps I missed the agreement that these should always receive
UTF-8 from Unicode strings.  Was this agreed upon, or has it simply
not been argued over in favor of other topics?
  If this has indeed been agreed upon... at least it can be computed
on demand rather than at initialization!  Perhaps there should be two
pointers: one to the UTF-8 buffer and one to a PyObject; if the
PyObject is set, it's an "old-style" string that's actually providing
the buffer.  This may or may not be a good idea; there's a lot of
memory expense for long Unicode strings converted from UTF-8 that
aren't ever converted back to UTF-8 or accessed using "s" or "s#".
Ok, I've talked myself out of that.  ;-)
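
  For what it's worth, the "computed on demand" variant could look
roughly like the sketch below.  It is written in Python only for
brevity; the class LazyUnicode, its utf8_buffer() method, and the
encode_calls counter are hypothetical names made up for illustration,
not the actual Unicode object layout or any existing API.  The point
is just that the UTF-8 buffer is not built until something that
behaves like "s" or "s#" asks for it, and is then reused.

```python
class LazyUnicode:
    """Hypothetical stand-in for a Unicode object (illustration only)."""

    def __init__(self, text):
        self._text = text      # the native Unicode data
        self._utf8 = None      # UTF-8 buffer, not built at initialization
        self.encode_calls = 0  # counts real encodings, for demonstration

    def utf8_buffer(self):
        # Encode only when an "s"/"s#"-style consumer actually asks,
        # then keep the result so later requests reuse the same buffer.
        if self._utf8 is None:
            self._utf8 = self._text.encode("utf-8")
            self.encode_calls += 1
        return self._utf8


u = LazyUnicode("na\u00efve")
first = u.utf8_buffer()   # first request triggers the encoding
second = u.utf8_buffer()  # second request returns the cached buffer
```

  Strings that are never passed to such a consumer never pay for the
second buffer; strings that are pay for it once and keep it for the
lifetime of the object.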


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives