2.2.2 Annoyance

Michael Hudson mwh at python.net
Tue Nov 25 13:14:34 EST 2003


"Miklós" <nospam at nowhere.hu> writes:

> Michael Hudson <mwh at python.net> wrote in message
> news:m37k1pzkjw.fsf at pc150.maths.bris.ac.uk...
> > "Miklós" <nospam at nowhere.hu> writes:
> >
> > > """
> > > __getitem__(self, key)
> > > Called to implement evaluation of self[key]. For sequence types, the
> > > accepted keys should be integers and slice objects.  Note that the
> > > special interpretation of
> > > """
> >
> > I think this is talking about implementing __getitem__, not calling it
> > directly.  Historically these things have been quite different, though
> > they are becoming less so.
> >
> Well, I've never been really into the guts of Python ... but could you shed
> some light for me on why these two things are different?

Hmm, that's potentially a large topic.  I'll have a go.

At the C implementation level, Python distinguishes sequences (lists,
strings, tuples, arrays, etc.) and mappings (dicts, basically).
However, Python-the-language uses the same notation -- brackets, [] --
to access the elements of both sequences and mappings, and there's
only one corresponding special method -- __getitem__.
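
To make that concrete, here is a minimal sketch of a Python-level
sequence whose single __getitem__ sees both kinds of key.  The class
name Squares is made up for illustration, and the behaviour shown
assumes a current Python 3, not 2.2:

```python
class Squares:
    """Toy sequence whose i-th element is i * i (hypothetical example)."""

    def __init__(self, length):
        self._length = length

    def __len__(self):
        return self._length

    def __getitem__(self, key):
        # obj[a:b] arrives here as a slice object, obj[i] as an int.
        if isinstance(key, slice):
            return [i * i for i in range(*key.indices(self._length))]
        if isinstance(key, int):
            if key < 0:
                key += self._length
            if not 0 <= key < self._length:
                raise IndexError("index out of range")
            return key * key
        raise TypeError("indices must be integers or slices")


squares = Squares(5)
print(squares[2])    # integer key
print(squares[1:4])  # slice key
```

Both subscriptions go through the same method; only the type of the
key differs.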

If you're implementing a sequence in C, you're expected to fill out
the tp_as_sequence structure of the type object, which contains these
fields:

typedef struct {
	inquiry sq_length;
	binaryfunc sq_concat;
	intargfunc sq_repeat;
	intargfunc sq_item;
	intintargfunc sq_slice;
	intobjargproc sq_ass_item;
	intintobjargproc sq_ass_slice;
	objobjproc sq_contains;
	/* Added in release 2.0 */
	binaryfunc sq_inplace_concat;
	intargfunc sq_inplace_repeat;
} PySequenceMethods;

Notice that "sq_item" -- the function that retrieves an element of the
sequence -- is an "intargfunc", which is:

typedef PyObject *(*intargfunc)(PyObject *, int);

and "sq_slice" is an "intintargfunc", i.e.:

typedef PyObject *(*intintargfunc)(PyObject *, int, int);

So, there's no way to actually implement a sequence that directly
handles a slice object: sq_item and sq_slice only ever receive C ints.
You have to make a mapping instead.  The corresponding function in
PyMappingMethods is mp_subscript, a binaryfunc:

typedef PyObject * (*binaryfunc)(PyObject *, PyObject *);

This is the first reason you can't pass slice objects to
list.__getitem__ in 2.2 -- list just didn't implement mp_subscript.

The other reason is: just what is "list.__getitem__"?  If list were
implemented in Python, it would be obvious: it would be an unbound
method.  But list *isn't* implemented in Python, list is implemented
in C, so list.__getitem__ is a thing called a method-wrapper.  When
you access it, magic happens to find the appropriate C level method
and wrap it up so you can call it from Python.  But for __getitem__
specifically, there are two choices, sq_item and mp_subscript!  2.2
prefers sq_item, so even after the list type grew an mp_subscript
function, [1].__getitem__(slice(0,1)) failed, because the
method-wrapper for sq_item didn't know what to do with the slice
object.  2.3 prefers mp_subscript over sq_item.
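
On an interpreter where the wrapper reaches mp_subscript (2.3 and
later, per the above; the snippet below assumes a current Python 3),
the direct call and the bracket notation should agree:

```python
items = [1]

# Calling the special method directly with an integer key.
by_index = items.__getitem__(0)

# Calling it directly with a slice object, the case that failed in 2.2.
by_slice = items.__getitem__(slice(0, 1))

print(by_index)
print(by_slice)
print(by_slice == items[0:1])
```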

Hope that helped!

Cheers,
mwh

-- 
  The "of course, while I have no problem with this at all, it's
  surely too much for a lesser being" flavor of argument always
  rings hollow to me.                       -- Tim Peters, 29 Apr 1998



