some random reflections of a "Python newbie": (2) language issues
Tim Peters
tim_one at email.msn.com
Mon Dec 13 05:05:27 EST 1999
[alex at magenta.com]
> ...
> The only problem with not looking at __getitem__'s "key"
> parameter seems to be that there is no way to "restart
> from the beginning" -- but I guess one could easily
> special-case-interpret a key of 0 to mean that. E.g.:
>
> class fib:
> "Enumerate the rabbits of a guy from Pisa"
> def __init__(self):
> (self.old, self.lat) = (0, 1)
> def __getitem__(self, key):
> if key==0:
> self.__init__()
> return self.old
> new = self.old+self.lat
> (self.old, self.lat) = (self.lat, new)
> return self.old
>
> Now,
> >>> rabbits=fib.fib()
> >>> for i in rabbits:
> ... if i>100: break
> ... print i
> does work as I thought it should.
>
> [... and more about enumerators ...]
The for/__getitem__ protocol was really designed for sequences, and it's a
strain to push it beyond that. This kind of stuff is cool, but after a few
years you may tire of it <0.7 wink>.
Here's a __getitem__ I've got sitting around in a Set class, that represents
a set of values via a dict self.d:
def __getitem__(self, i):
if i == 0:
self.keys = self.d.keys()
return self.keys[i]
Even that's a bit of a strain, but it does nest correctly -- albeit by
accident <wink>.
> ...
> However, there is one little problem remaining... being
> an enumerator doesn't let the object support nested loops
> on itself, which a "real" sequence has no problems with.
Exactly.
> If the object "can give out an enumerator" to itself,
> rather than keeping state itself for the enumeration,
> it would be more general/cleaner.
(At least) The same options are available in Python as in C++. The relative
burden of method overheads being what they are, though, the aforementioned
Set class also has a method that's much more heavily used than
Set.__getitem__:
# return set members, as a list
def tolist(self):
return self.d.keys()
That is, for/in works much faster on a native sequence type, so I generally
have a "tolist" or "totuple" method and do "for thing in
collection.tolist():". That doesn't work at all for unbounded collections
(like your rabbits), but is fast and obvious for most collections.
> I guess I can live with that through an "enumerator" func,
> somewhat like (if I define __enumerator__ to be the
> "give-out-an-enumerator" method):
Note that double-double-underscore names are technically reserved for the
implementation (i.e., Guido may stomp on any such name in a future release).
> def enumerate(obj):
> try:
> return obj.__enumerator__()
> except:
> try:
> return obj[:]
> except:
> return obj
>
> (not ideal, sure -- I actually want to give out the obj
> if it's an immutable sequence [can I rely on its having
> __hash__ for that...?],
Classes only have __hash__ if the user defines it. Contrarily, some mutable
objects do define __hash__; e.g., back to that surprisingly educational
<wink> Set class:
def __hash__(self):
if self.frozen:
hashcode = self.hashcode
else:
# The hash code must not depend on the order of the
# keys.
self.frozen = 1
hashcode = 0
_hash = hash
for x in self.d.keys():
hashcode = hashcode ^ _hash(x)
self.hashcode = hashcode
return hashcode
This makes it possible to have Sets of Sets (& so on), although putting a
set S in a set T freezes S. The mutating methods of Set complain if
self.frozen is true. Most times I lazier than that, though, and pass out
hashes without any protection.
The point is that people can & do abuse everything in the language in all
conceivable ways -- and in quite a few that aren't conceivable. There's
almost nothing you can count on without exception.
> slice it if it's a mutable one, etc, but, basically...), so I can
> have a loop such as "for x in enumerate(whatever):" which _is_
> polymorphic _and_ nestable.
>
> _WAY_ cool...!!!
>
> Btw, a request for stylistic advice -- am I overusing
> try/except in the above sketch for "enumerate"? Python
> seems to make it so easy and elegant to apply the good
> old "easier to ask for forgiveness than permission" idea,
> that my C++-programmer-instincts of only using exceptions
> in truly exceptional cases are eroding fast -- yet I see
> that recommendation in Guido's own book (p. 192 -- soon
> belied by the example on p. 194 that shows exceptions
> used to implement an anything-but-exceptional typeswitch,
> but...). Maybe in a smoothly dynamic language such as
> Python keeping "exception purity" is not such a high
> priority as in C++ or Java...?
You'll note that the docs rarely mention which exceptions may be raised, or
when or why; I'm still unsure whether that's Good or Bad! In the case of
checking an object for __enumerator__, that's what hasattr was designed for,
so most people would instead write
if hasattr(obj, "__enumerator__"):
...
But there's no way to test for whether obj[:] is possible without trying it,
so try/except is your only choice there. The Types-SIG archive (from about
a year ago) has many delightful words about all this <wink>.
> How would/should I implement "enumerate" without so much
> reliance on try/except, i.e. by explicitly testing "does
> this object have an __enumerator__ method that can be
> called without arguments, or, if not, can I slice it to get
> a copy, or, ...", etc...? In C++ I guess I'd do a
> dynamic_cast<> for this kind of thing (assuming inheritance
> from suitable abstract classes) -- what's the "best" Python
> idioms, and what are the trade-offs here...?
A dynamic language presents a great temptation to hyper-generalization.
Resist it! Not for your sake so much as for ours -- we'll never be able to
understand your code.
If you want to define a new all-encompassing protocol (like __enumerate__),
chances are excellent you're the only person in the world who will use it --
much as if 8 groups within a department are using C++, they'll maintain at
least 16 incompatible array classes <0.5 wink>.
If it's something you can't live without, then-- as its likely sole
user --the tradeoffs are entirely up to your judgment. Indeed, *I'm* the
world's only user of my Set class: everyone else just manipulates a dict
explicitly! "Programming infrastructure" classes & frameworks have a much
bigger audience in C++/Java, and for good reasons: it takes many more lines
of code to get something done in the latter. An enumerator in Python
usually takes no more than 4 lines of dirt-simple code, so people
instinctively realize it would take longer to read & understand the
__enumerator__ docs than to roll their own one-shots 100 times over.
That's why Python is so productive: nobody spends any time reading docs
<wink>.
>> [1.6 is supposed to add a __contains__ method]
> Oh my -- I even guessed the *name* of the special method
> needed...?!-) That cuts it, I guess -- Python and I were
> just _made_ for each other!-).
Python was made for everyone equally, Alex -- although I am pleased to say
you do seem to be *especially* equal <wink>.
python-farm-ly y'rs - tim
More information about the Python-list
mailing list