UserList.__getslice__(): copy.copy(self.data) vs. self.__class__(self.data).
Tom Funk
_spam_sux_tdfunk at _spam_sux_nettally.com
Tue Mar 14 16:36:33 EST 2000
In the Python Reference Manual, Section 3.3.5, "Additional methods for
emulation of sequence types", we find the following entry:
...
    __getslice__ (self, i, j)
        Called to implement evaluation of self[i:j]. The returned
        object should be of the same type as self. Note that missing
        i or j in the slice expression are replaced by zero or
        sys.maxint, respectively, and no further transformations on
        the indices is performed. The interpretation of negative
        indices and indices larger than the length of the sequence
        is up to the method.
...
The current implementation of UserList.__getslice__() looks like this:
    def __getslice__(self, i, j):
        i = max(i, 0); j = max(j, 0)
        userlist = self.__class__()
        userlist.data[:] = self.data[i:j]
        return userlist
Though this follows the guidelines outlined in the reference manual, it
has an interesting side effect: it instantiates a new object of the same
class but it loses the current values of all attributes.
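The effect is easy to reproduce. The sketch below is only an illustration, not the UserList source: MiniUserList is a hypothetical stand-in for UserList, TaggedList and its tag attribute are made up for the demo, and it assumes a current Python interpreter, where slicing is dispatched to __getitem__ with a slice object rather than to __getslice__.

```python
class MiniUserList:
    """Hypothetical stand-in for UserList; only what the demo needs."""

    def __init__(self, data=None):
        self.data = list(data) if data is not None else []

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Same strategy as the current __getslice__: build a fresh
            # instance of the same class, then fill in the sliced data.
            result = self.__class__()
            result.data[:] = self.data[index]
            return result
        return self.data[index]


class TaggedList(MiniUserList):
    """Made-up subclass carrying one extra piece of state."""

    def __init__(self, data=None, tag="default"):
        MiniUserList.__init__(self, data)
        self.tag = tag


original = TaggedList([0, 1, 2, 3, 4], tag="important")
sliced = original[1:3]

print(sliced.data)  # [1, 2]
print(sliced.tag)   # default (the "important" tag did not carry over)
```

The slice has the right class and the right elements, but tag has fallen back to the constructor default.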
Is this desirable behavior? Personally, I don't believe that it is.
My thinking is that the current internal state of the class should pass
to the newly instantiated object. I think a better implementation would
be:
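    def __getslice__(self, i, j):
        i = max(i, 0); j = max(j, 0)
        userlist = copy.copy(self)  # needs "import copy" in UserList.py
        # copy.copy() is shallow, so userlist.data is still self.data here;
        # rebind it rather than assigning through data[:].
        userlist.data = self.data[i:j]
        return userlist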
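A sketch of the copy-based approach follows, under the same assumptions as any modern-Python illustration of this idea: MiniUserList, TaggedList and tag are hypothetical names, and __getitem__ with a slice object stands in for __getslice__. One detail to watch is that copy.copy() is shallow, so the copy initially shares the original's data list. The sketch therefore rebinds data to the new slice; assigning through data[:] would overwrite the original's contents as well.

```python
import copy


class MiniUserList:
    """Hypothetical stand-in for UserList; only what the demo needs."""

    def __init__(self, data=None):
        self.data = list(data) if data is not None else []

    def __getitem__(self, index):
        if isinstance(index, slice):
            result = copy.copy(self)        # shallow: result.data is self.data
            result.data = self.data[index]  # rebind, leaving self.data intact
            return result
        return self.data[index]


class TaggedList(MiniUserList):
    """Made-up subclass carrying one extra piece of state."""

    def __init__(self, data=None, tag="default"):
        MiniUserList.__init__(self, data)
        self.tag = tag


original = TaggedList([0, 1, 2, 3, 4], tag="important")
sliced = original[1:3]

print(sliced.data)    # [1, 2]
print(sliced.tag)     # important
print(original.data)  # [0, 1, 2, 3, 4]
```

Here the extra attribute survives the slice and the original list is left untouched.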
Also, should the i and j arguments be "adjusted" before being used to
access the list in self.data? Again, in the Python Reference Manual,
section 5.3.3 "Slicings," we find:
    The lower and upper bound expressions, if present, must evaluate
    to plain integers; defaults are zero and the sequence's length,
    respectively. If either bound is negative, the sequence's length
    is added to it.
So, the runtime normalizes negative numbers so that small-enough (large
enough??<g>) negative numbers start counting from the end of the
sequence; e.g., aList[-1] returns the last element of aList. As
currently written, UserList converts to zero any negative numbers that
would otherwise raise an IndexError. The resulting behavior is that a
slice is returned rather than an IndexError being raised, thus:
>>> ul=UserList.UserList([0,1,2,3,4])
>>> ul[-1] # last element
4
>>> ul[-3:-1] # 3rd- and 2nd-to-the-last elements
[2, 3]
>>> ul[-10] # calls UserList.__getitem__(self, i)
Traceback (innermost last):
File "<interactive input>", line 1, in ?
File "UserList.py", line 29, in __getitem__
def __delitem__(self, i): del self.data[i]
IndexError: list index out of range
>>> ul[-10:-1] # should raise IndexError
[0, 1, 2, 3]
>>>
Again, at least to me, this behavior seems to be inconsistent with
a "real" list object.
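The comparison with a built-in list can be checked directly by running the same expressions against one. The sketch below assumes a current Python 3 interpreter rather than the one used for the session above, and the name plain is made up for the demo. On such an interpreter a plain list raises IndexError for the out-of-range index but clamps out-of-range slice bounds, so the slice case is worth re-running on the interpreter in question before UserList is changed to match.

```python
plain = [0, 1, 2, 3, 4]

print(plain[-1])     # 4
print(plain[-3:-1])  # [2, 3]

try:
    plain[-10]
except IndexError as exc:
    print("IndexError:", exc)  # IndexError: list index out of range

print(plain[-10:-1])  # [0, 1, 2, 3]

# slice.indices() reports the normalized (start, stop, step) for a length.
print(slice(-3, -1).indices(len(plain)))   # (2, 4, 1)
print(slice(-10, -1).indices(len(plain)))  # (0, 4, 1)
```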
I think this is a better implementation:
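    def __getslice__(self, i, j):
        userlist = copy.copy(self)  # needs "import copy" in UserList.py
        # copy.copy() is shallow, so userlist.data is still self.data here;
        # rebind it rather than assigning through data[:].
        userlist.data = self.data[i:j]
        return userlist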
I'd be more than happy to implement these changes (there are a couple of
places where self.__class__() is called and where method arguments are
normalized to zero) and submit the context diffs -- unless I'm
overwhelmed with arguments to the contrary. Since UserList.py is part of
the standard distribution, and could break existing code if changes are
made, I wanted to bounce this off of the community before proceeding.
I'm using UserList in a current project and I've already implemented
these changes into a NewUserList class. Submitting context diffs would
be a piece of cake.
Any feedback? Should I proceed?
--
-=< tom >=-
Thomas D. Funk (tdfunk at asd-web.com) | "Software is the lever
Software Engineering Consultant | Archimedes was searching for"
Advanced Systems Design, Tallahassee FL. |