[Python-Dev] marshal (was: Buffer interface in abstract.c?)

Guido van Rossum guido@CNRI.Reston.VA.US
Tue, 10 Aug 1999 10:12:23 -0400


> Greg Stein <gstein@lyra.org> wrote:
> > > > > > >>> import unicode
> > > > > > >>> import marshal
> > > > > > >>> u = unicode.unicode
> > > > > > >>> s = u("foo")
> > > > > > >>> data = marshal.dumps(s)
> > > > > > >>> marshal.loads(data)
> > > > > > 'f\000o\000o\000'
> > > > > > >>> type(marshal.loads(data))
> > > > > > <type 'string'>
> >
> > > > Why do Unicode objects implement the bf_getcharbuffer slot ? I thought
> > > > that unicode objects use a two-byte character representation.
> > 
> > Unicode objects should *not* implement the getcharbuffer slot. Only
> > read, write, and segcount.
> 
> unicode objects do not implement the getcharbuffer slot.
> here's the relevant descriptor:
> 
> static PyBufferProcs unicode_as_buffer = {
>     (getreadbufferproc) unicode_buffer_getreadbuf,
>     (getwritebufferproc) unicode_buffer_getwritebuf,
>     (getsegcountproc) unicode_buffer_getsegcount
> };
> 
> the array module uses a similar descriptor.
> 
> maybe the unicode class shouldn't implement the
> buffer interface at all?  sure looks like the best way
> to avoid trivial mistakes (the current behaviour of
> fp.write(unicodeobj) is even more serious than the
> marshal glitch...)
> 
> or maybe the buffer design needs an overhaul?

I think most places that should use the charbuffer interface actually
use the readbuffer interface.  This is what should be fixed.
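
To illustrate the distinction, here is a minimal sketch in present-day
Python rather than the 1999 C API. Everything in it is hypothetical:
TwoByteText, read_buffer, char_buffer, write_raw and write_chars are
made-up names, and UTF-16-LE is only assumed as the two-byte layout
because it matches the 'f\000o\000o\000' output quoted above. A consumer
that asks for the raw read buffer gets the internal bytes; one that asks
for character data gets an error instead.

```python
class TwoByteText:
    """Hypothetical stand-in for an object storing two bytes per character."""

    def __init__(self, text):
        self.text = text

    def read_buffer(self):
        # Raw internal storage (assumed UTF-16-LE for this sketch).
        return self.text.encode("utf-16-le")

    def char_buffer(self):
        # Not single-byte character data, so refuse rather than leak raw bytes.
        raise TypeError("object has no single-byte character buffer")


def write_raw(obj):
    # Consumer that accepts any readable buffer, even when it wants text.
    return obj.read_buffer()


def write_chars(obj):
    # Consumer that asks for character data; TwoByteText is rejected here.
    return obj.char_buffer()


s = TwoByteText("foo")
raw = write_raw(s)
print(raw)  # b'f\x00o\x00o\x00'
```

With this split, a text-oriented consumer such as write_chars fails
loudly on a two-byte object, which is roughly what switching callers
from the readbuffer to the charbuffer interface would achieve.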

--Guido van Rossum (home page: http://www.python.org/~guido/)