[Python-Dev] marshal (was: Buffer interface in abstract.c?)

Fredrik Lundh fredrik@pythonware.com
Tue, 10 Aug 1999 14:35:33 +0200


Greg Stein <gstein@lyra.org> wrote:
> > > > > >>> import unicode
> > > > > >>> import marshal
> > > > > >>> u = unicode.unicode
> > > > > >>> s = u("foo")
> > > > > >>> data = marshal.dumps(s)
> > > > > >>> marshal.loads(data)
> > > > > 'f\000o\000o\000'
> > > > > >>> type(marshal.loads(data))
> > > > > <type 'string'>
>
> > > Why do Unicode objects implement the bf_getcharbuffer slot ? I thought
> > > that unicode objects use a two-byte character representation.
> 
> Unicode objects should *not* implement the getcharbuffer slot. Only
> read, write, and segcount.

unicode objects do not implement the getcharbuffer slot.
here's the relevant descriptor:

static PyBufferProcs unicode_as_buffer = {
    (getreadbufferproc) unicode_buffer_getreadbuf,
    (getwritebufferproc) unicode_buffer_getwritebuf,
    (getsegcountproc) unicode_buffer_getsegcount
    /* no fourth entry: bf_getcharbuffer is left NULL */
};

the array module uses a similar descriptor.
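
for illustration only, here's a rough sketch of the same effect in
present-day Python 3. this is an analogue, not the 1999 code: the
unicode module isn't involved, and an array.array with typecode "H"
is assumed as a stand-in for an object that stores two bytes per
character. marshal only sees the raw buffer, so the original type
is lost on the way back:

```python
import array
import marshal

# stand-in for a two-byte-per-character object (assumption: "H" is an
# unsigned short, which is two bytes on common platforms)
units = array.array("H", [ord(c) for c in "foo"])

data = marshal.dumps(units)      # marshal writes the raw buffer contents
restored = marshal.loads(data)   # a plain byte string, not an array

print(type(restored).__name__)
print(restored == units.tobytes())
```

the loaded object carries no trace of where its bytes came from,
which is the same kind of surprise as in the session quoted above.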

maybe the unicode class shouldn't implement the
buffer interface at all?  sure looks like the best way
to avoid trivial mistakes (the current behaviour of
fp.write(unicodeobj) is even more serious than the
marshal glitch...)
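
the fp.write case can be sketched the same way, again in present-day
Python 3 with an array standing in for the unicode object (both are
assumptions made for the example, not the 1999 code). a binary stream
accepts anything that exports a buffer and writes the raw memory
as-is, without complaint:

```python
import array
import io

# stand-in for a two-byte-per-character object (hypothetical example data)
units = array.array("H", [ord(c) for c in "foo"])

fp = io.BytesIO()    # stands in for a file opened in binary mode
fp.write(units)      # no error: the internal representation goes straight out

print(fp.getvalue() == units.tobytes())
```

what ends up in the stream depends on the object's internal layout
(item size, byte order), which is why a silent write like this is
easy to get wrong.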

or maybe the buffer design needs an overhaul?

</F>