Fwd: Re: Negative array indicies and slice()

Andrew Robinson andrew3 at r3dsolutions.com
Sat Nov 3 18:34:04 EDT 2012


Forwarded to python list:

-------- Original Message --------
Subject: 	Re: Negative array indicies and slice()
Date: 	Sat, 03 Nov 2012 15:32:04 -0700
From: 	Andrew Robinson
Reply-To: 	andrew3 at r3dsolutions.com
To: 	Ian Kelly <>



On 11/01/2012 05:32 PM, Ian Kelly wrote:
>  On Thu, Nov 1, 2012 at 4:25 PM, Andrew Robinson
>>  The bottom line is:  __getitem__ must always *PASS* len( seq ) to slice()
>>  each *time* the slice() object is used.  Since this is the case, it would
>>  have been better to have list, itself, have a default member which takes the
>>  raw slice indices and does the conversion itself.  The size would not need
>>  to be duplicated or passed -- memory savings, & speed savings...
>  And then tuple would need to duplicate the same code.  As would deque.
>    And str.  And numpy.array, and anything else that can be sliced,
>  including custom sequence classes.
I don't think that's true.  A generic function can be shared among
different objects without being embedded in an external index data
structure to boot!

If *self* were passed to an index conversion function (as would
naturally happen anyway if it were a method), then the method could take
len( self ) without knowing what the object is.  If the object is
sliceable, len() will definitely return the required piece of
information.
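
A rough sketch of that idea follows.  The names SliceResolverMixin,
resolve_slice, MyList and MyTuple are hypothetical and exist only for this
illustration; the conversion itself is still delegated to the built-in
slice.indices(), so this shows the sharing, not a replacement for slice().

```python
class SliceResolverMixin:
    """Hypothetical mixin: any class that defines __len__ can reuse it."""

    def resolve_slice(self, a_slice):
        # len(self) works for any sized container, so this helper never
        # needs to know the concrete type it has been mixed into.
        return range(*a_slice.indices(len(self)))


class MyList(SliceResolverMixin, list):
    pass


class MyTuple(SliceResolverMixin, tuple):
    pass


# The same method body serves both sequence types.
print(list(MyList([1, 2, 3, 4, 5]).resolve_slice(slice(-3, None))))  # [2, 3, 4]
print(list(MyTuple("abcdef").resolve_slice(slice(None, None, 2))))   # [0, 2, 4]
```

The length is read from the object at the moment of the call, so nothing has
to be stored in, or passed along with, the slice object ahead of time.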

>  Numpy arrays are very different internally from lists.
Of course!  (Although, lists do allow nested lists.)

>  I'm not understanding what this is meant to demonstrate.  Is "MyClass"
>  a find-replace error of "ThirdParty"?  Why do you have __getitem__
>  returning slice objects instead of items or subsequences?  What does
>  this example have to do with numpy?
Here's a very cleaned up example file, cut and pastable:
#!/usr/bin/env python
# File: sliceIt.py  --- a pre PEP357 hypothesis test skeleton

class Float16():
     """
     Numpy creates a float type, with very limited precision -- float16.
     Rather than force you to install numpy for this test, I'm just making
     a faux object.  Normally we'd just "import numpy as np".
     """

     def __init__(self,value): self.value = value
     def AltPEP357Solution(self):
         """ This is doing exactly what __index__ would be doing. """
         return None if self.value is None else int( self.value )

class ThirdParty( list ):
     """
     A simple class to implement a list wrapper, having all the
     properties of a normal list -- but explicitly showing portions of
     the interface.
     """
     def __init__(self, aList): self.aList = aList

     def __getitem__(self, aSlice):
         print( "__getitems__", aSlice )
         temp=[]
         edges = aSlice.indices( len( self.aList ) ) # *unavoidable* call
         for i in range( *edges ): temp.append( self.aList[ i ] )
         return temp

def Inject_FloatSliceFilter( theClass ):
     """
     This is a courtesy function to allow injecting (duck punching)
     a float index filter into a user object.
     """
     def Filter_FloatSlice( self, aSlice ):

         # Single index retrieval filter
         try: start=aSlice.AltPEP357Solution()
         except AttributeError: pass
         else: return self.aList[ start ]

         # slice retrieval filter
         try: start=aSlice.start.AltPEP357Solution()
         except AttributeError: start=aSlice.start
         try: stop=aSlice.stop.AltPEP357Solution()
         except AttributeError: stop=aSlice.stop
         try: step=aSlice.step.AltPEP357Solution()
         except AttributeError: step=aSlice.step
         print( "Filter To",start,stop,step )
         return self.super_FloatSlice__getitem__( slice(start,stop,step) )

     theClass.super_FloatSlice__getitem__ = theClass.__getitem__
     theClass.__getitem__ = Filter_FloatSlice

# EOF: sliceIt.py

--------------------------------------------------------
Example run:

>>>  from sliceIt import *
>>>  test = ThirdParty( [1,2,3,4,5,6,7,8,9] )
>>>  test[0:6:3]
('__getitems__', slice(0, 6, 3))
[1, 4]
>>>  f16=Float16(8.3)
>>>  test[0:f16:2]
('__getitems__', slice(0, <sliceIt.Float16 instance at 0xb74baaac>, 2))
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "sliceIt.py", line 26, in __getitem__
     edges = aSlice.indices( len( self.aList ) )  # This is an *unavoidable* call
TypeError: object cannot be interpreted as an index
>>>  Inject_FloatSliceFilter( ThirdParty )
>>>  test[0:f16:2]
('Filter To', 0, 8, 2)
('__getitems__', slice(0, 8, 2))
[1, 3, 5, 7]
>>>  test[f16]
9

>  We could also require the user to explicitly declare when they're
>  performing arithmetic on variables that might not be floats. Then we
>  can turn off run-time type checking unless the user explicitly
>  requests it, all in the name of micro-optimization and explicitness.
:) None of those would help micro-optimization that I can see.
>  Seriously, whether x is usable as a sequence index is a property of x,
>  not a property of the sequence.
Yes, but the *LENGTH* of the sequence is a function of the *sequence*.
>  Users shouldn't need to pick and choose *which* particular sequence
>  index types their custom sequences are willing to accept. They should
>  even be able to accept sequence index types that haven't been written
>  yet.

I disagree, and "float" is a good example.  Besides, personally, I
don't have a problem with subclassing for a custom sequence, in spite of
what D'Aprano thinks.  It's the generic sequences that irritate me.

OK, then, in your opinion what's the unspoken reason that PEP 357
happened, when in fact people already could have just said: myList[
int(firstItem) : int(secondItem) : int(thirdItem) ]  ?
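
For comparison, here is a small sketch of the two spellings side by side.
IndexLike is a hypothetical stand-in for a third-party integer-like type, not
a real numpy class; the only real APIs used are the __index__ hook from
PEP 357 and operator.index().

```python
import operator


class IndexLike:
    """Hypothetical stand-in for a third-party integer-like type."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        # Called implicitly wherever Python needs a true integer index.
        return self.value


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# With __index__, the object is accepted directly in slice notation.
implicit = data[IndexLike(0):IndexLike(8):IndexLike(2)]

# Without it, every bound has to be converted by hand at the call site.
explicit = data[int(0.0):int(8.3):int(2.0)]

print(implicit, explicit)

# A plain Python float has no __index__, so it is rejected as an index.
try:
    operator.index(8.3)
    float_accepted = True
except TypeError:
    float_accepted = False
print(float_accepted)
```

Both spellings select the same elements here.  The difference is that int()
silently truncates 8.3 to 8, whereas the __index__ route only works for types
that opt in.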

>>  Most importantly normal programs not using Numpy wouldn't have had to carry
>>  around an extra API check for index() *every* single time the heavily used
>>  [::] happened.  Memory & speed both.
>  The O(1) __index__ check is probably rather inconsequential compared
>  to the O(n) cost of actually performing the slicing.
I'm sure that's true; at least, I'm sure that an O(1) index check done at
the *C* level is probably inconsequential compared to slicing at the *C*
level. When the index checking has to happen at the Python interpreter
level, I'm not so sure... I'm trying to learn how to profile that.
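
One possible way to measure it is with the standard timeit module.  This is
only a sketch: IndexLike and time_slice are hypothetical names made up for the
test, and the numbers will vary by machine and interpreter, so no expected
timings are shown.

```python
import timeit


class IndexLike:
    """Hypothetical type whose __index__ runs as Python-level code."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        return self.value


data = list(range(1000))
stop_int = 500             # plain int: the index check stays in C
stop_obj = IndexLike(500)  # custom object: __index__ is a Python call


def time_slice(stop, repeats=2000):
    # Total seconds taken by `repeats` executions of the same slice.
    return timeit.timeit(lambda: data[0:stop:2], number=repeats)


t_int = time_slice(stop_int)
t_obj = time_slice(stop_obj)
print("int bound:       %.6f s" % t_int)
print("__index__ bound: %.6f s" % t_obj)
```

Changing the length of data should show how the fixed cost of the __index__
call compares with the cost of copying the elements.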

>  <snip>
>  Such a change would only affect numpy floats, not all floats, so it
>  would not be a monkey-patch.
Users of Python generally don't bother checking the types.  The object
"typing" ability, I think, is a rather new development. When a function
accepts a float, it often returns a "float"; so there is no reason that
one might not mix a Python float and a third-party "float" as function
parameters -- and then use a return value from that function which could
be *either* kind of float.

Since this is typical behavior, variables which have traditionally been
Python floats can become another type without explicit warning, and then
may (all of a sudden) index any list, anywhere, any time.

Besides, PEP357 doesn't DISTINGUISH between system floats and Python
floats as indices.  The writers clearly believed that NEITHER of them
were acceptable as indices.  That alone makes being able to turn *some*
floats "ON" as indices unexpected behavior.  ( I think the PEP writers
did the best they could with the limited tools they had coming into
their minds. )

As an aside: I am treating this as a postmortem, gathering information
and looking for what was *good* as well as what was bad about an
implementation.  I have, for example, noticed that non-mutables can't be
made to have loops later; hence any object made strictly out of
non-mutables at every step does not need garbage collection, and that can
be used to get rid of GC overhead on any object obeying that property.
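
CPython's cyclic collector already seems to use a narrower form of this
observation: gc.is_tracked() reports that atomic objects such as ints and
strings are not tracked at all.  The sketch below shows only that documented
behavior in CPython; it makes no claim about immutable containers in general,
which the interpreter handles with its own type-specific rules.

```python
import gc

# A mutable container can be made to refer to itself after creation,
# so the cyclic collector has to keep track of it.
cyclic = []
cyclic.append(cyclic)

print(gc.is_tracked(cyclic))   # True
print(gc.is_tracked(42))       # False
print(gc.is_tracked("text"))   # False
```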


