Negative array indices and slice()

Ian Kelly ian.g.kelly at gmail.com
Thu Nov 1 20:32:36 EDT 2012


On Thu, Nov 1, 2012 at 4:25 PM, Andrew Robinson
<andrew3 at r3dsolutions.com> wrote:
> The bottom line is:  __getitem__ must always *PASS* len( seq ) to slice()
> each *time* the slice() object is used.  Since this is the case, it would
> have been better to have list, itself, have a default member which takes the
> raw slice indices and does the conversion itself.  The size would not need
> to be duplicated or passed -- memory savings, & speed savings...

And then tuple would need to duplicate the same code.  As would deque.
And str.  And numpy.array, and anything else that can be sliced,
including custom sequence classes.
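
To make the point concrete, here is a minimal sketch of a custom
sequence that leans on slice.indices() instead of carrying its own
conversion code.  The class name Window and its as_list() helper are
made up for illustration; only slice.indices() is the real API.

```python
class Window:
    """Hypothetical custom sequence, invented for this illustration."""

    def __init__(self, data):
        self._data = list(data)

    def __len__(self):
        return len(self._data)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # slice.indices() resolves negative and missing values against
            # the length we pass in, so this class needs no conversion
            # logic of its own.
            start, stop, step = key.indices(len(self))
            return Window(self._data[i] for i in range(start, stop, step))
        return self._data[key]

    def as_list(self):
        # Helper (an assumption of this sketch) to inspect the contents.
        return list(self._data)


tail = Window(range(10))[-3:].as_list()
reversed_all = Window(range(4))[::-1].as_list()
```

The same three-line pattern would work unchanged in any other sequence
type, which is why the conversion lives on the slice object rather than
on list.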

> Let's apply D'Aprano's logic to numpy; Numpy could just have subclassed
> *list*;

Numpy arrays are very different internally from lists.  They are
basically fancy wrappers of C arrays, whereas lists are a higher-level
abstraction.  They allow for multiple dimensions, which lists do not.
Slices of numpy arrays produce views, whereas slices of lists produce
brand new lists.  And they certainly do not obey the Liskov
Substitution Principle with respect to lists.

> >>> class ThirdParty( list ):  # Pretend this is someone else's...
> ...     def __init__(self): return
> ...     def __getitem__(self,aSlice): return aSlice
> ...
>
> We know it will work like this by default:
> >>> a=ThirdParty()
> >>> a[1:2]
> slice(1, 2, None)
>
> # So, here's an injection...
> >>> ThirdParty.superOnlyOfNumpy__getitem__ = MyClass.__getitem__
> >>> ThirdParty.__getitem__ = lambda self,aSlice: ( 1, 3,
> ...     self.superOnlyOfNumpy__getitem__(aSlice ).step )
> >>> a[5:6]
> (1, 3, None)

I'm not understanding what this is meant to demonstrate.  Is "MyClass"
a find-replace error of "ThirdParty"?  Why do you have __getitem__
returning slice objects instead of items or subsequences?  What does
this example have to do with numpy?

> Numpy could have exported a (workable) function that would modify other list
> functions to affect ONLY numpy data types (eg: a filter).  This allows
> users creating their own classes to inject them with Numpy's filter only
> when they desire;
>
> Recall Tim Peters' "explicit is better than implicit" Zen?

We could also require the user to explicitly declare when they're
performing arithmetic on variables that might not be floats.  Then we
can turn off run-time type checking unless the user explicitly
requests it, all in the name of micro-optimization and explicitness.

Seriously, whether x is usable as a sequence index is a property of x,
not a property of the sequence.  Users shouldn't need to pick and
choose *which* particular sequence index types their custom sequences
are willing to accept.  They should even be able to accept sequence
index types that haven't been written yet.
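
As a rough illustration, here is an index type that none of the built-in
sequences have ever heard of.  The class Third is hypothetical; the
built-in types accept it only because it defines __index__.

```python
import operator


class Third:
    """Hypothetical index type that always means position 2."""

    def __index__(self):
        return 2


letters = ["a", "b", "c", "d"]
i = Third()

# list, tuple and str all accept the new type without being modified.
results = (letters[i], tuple(letters)[i], "abcd"[i], operator.index(i))

# It also works as a slice bound.
sliced = letters[:i]
```

None of list, tuple, or str had to opt in; the decision was made
entirely by the class of the index object.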

> Most importantly normal programs not using Numpy wouldn't have had to carry
> around an extra API check for index() *every* single time the heavily used
> [::] happened.  Memory & speed both.

The O(1) __index__ check is probably rather inconsequential compared
to the O(n) cost of actually performing the slicing.

> It's also a monkey patch, in that index() allows *conflicting* assumptions
> in violation of the unexpected monkey patch interaction worry.
>
> eg: Numpy *CAN* release an index() function on their floats -- at which
> point a basic no touch class (list itself) will now accept float as an index
> in direct contradiction of PEP 357's comment on floats... see?

Such a change would only affect numpy floats, not all floats, so it
would not be a monkey-patch.  In any case, that would be incorrect
usage of __index__.  We're all consenting adults here; we don't need
supervision to protect us from buggy third-party code.
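
A small sketch of the scoping, using a made-up float subclass in place
of a numpy float (IndexableFloat and try_index are illustrative names,
and defining __index__ this way is exactly the misuse described above):

```python
class IndexableFloat(float):
    """Hypothetical float subclass that (unwisely) opts in to indexing."""

    def __index__(self):
        return int(self)


def try_index(seq, i):
    # Return the item, or the string "TypeError" if seq rejects the index.
    try:
        return seq[i]
    except TypeError:
        return "TypeError"


items = ["a", "b", "c"]

plain = try_index(items, 1.0)                     # ordinary float
opted_in = try_index(items, IndexableFloat(1.0))  # subclass with __index__
float_has_index = hasattr(float, "__index__")
```

The ordinary float is still rejected; only instances of the class that
defines __index__ are affected, and list itself was never touched.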


