Newbie Questions: Swithing from Perl to Python

Alex Martelli aleaxit at yahoo.com
Sun Oct 26 10:07:42 EST 2003


Roy Smith wrote:
   ...
>> likely that 2.4 will grow list.sorted() (yes, a static method on the
>> list type).

Actually a classmethod -- only relevant if you subclass list, e.g.:

class mylist(list):
   ''' whatever tweaks '''

foo = mylist.sorted(bar)

now foo will be an instance of mylist, NOT just of list.  Marginal issue, of 
course.

> What do you mean by "a static method on the list type"?  Will I be able
> to do:
> 
>    for key in myDict.keys().sorted():
>       print key

Nope.  You will, if you insist, be able to code:

for key in list.sorted(myDict.keys()):
    print key

making a silly and useless copy of the keys list.  Or, you will be able to
avoid the useless copy by coding list.sorted(myDict) or also, if you prefer,
list.sorted(myDict.iterkeys()).

You can bind srt=list.sorted somewhere in your code if you want more
concision, but it will still be called as srt(somesequence), just as, say,
len(somesequence), NOT with method syntax on somesequence.

> and get what I expect?  If so, then I think that's the behavior that
> most people have found wanting in the current implementation and thus
> will make a lot of people happy.  It certainly will make me happy :-)

The list.sorted classmethod is more powerful than an instance method
of lists would be: it will let you build a sorted list out of any iterable 
at all, AND support smoothly and transparently _subclasses_ of list.

> If that's what you're talking about, there's an obvious downside, which
> is that now we'll have list.sort() and list.sorted() which do two
> different things.  This will be confusing.

Keeping one as an instance method and the other as a classmethod
should ease things.  I.e., in practice, the former will always be called
as somelist.sort(), the latter as list.sorted(somesequence).

In theory, classmethods do allow somewhat confusing use, e.g.:

myDict = {}
another = myDict.fromkeys(range(7))

this leaves myDict untouched and doesn't even look into it, except
for ascertaining its class.  Apparently, people aren't actually prone
to falling into such confusions (haven't seen _any_ occurrence wrt
fromkeys either here, on help at python.org, on python-dev, or any
other venue where I help, advise and teach).

> Is there a PEP on this I could read?  A quick look at the PEP index
> didn't show anything that looked appropos.

Unfortunately I don't think a PEP number has been assigned (I do
hope Raymond Hettinger, who's developing the patch for this as
well as other useful tweaks to the sort method for 2.3, has started the 
PEP-number-request procedure).

> I certainly understand the efficiency aspects of in-place sorting, but
> this has always seemed like premature optimization to me.  Most of the
> time (at least in the code I write), the cost of an extra copy is
> inconsequential.  I'll be happy to burn a few thousand CPU cycles if it
> lets me avoid an intermediate assignment or a couple of extra lines of
> code.  When things get too slow, then is the time to do some profiling
> and figure out where I can speed things up.

But the "couple of extra lines of code" is an even more trivial price to pay 
than the cost of a copy, in Python's general worldview.  The key is
consistency: all mutating methods of lists and dicts work inplace, and do
NOT return a list or dict (neither the original nor a copy) because Guido
does NOT like "chaining" -- such methods return None except when they
have a specifically useful item to return, such as the .pop methods.

Having .sort specified in a subtly different manner than .reverse, .update,
.extend etc would be inconsistent and harder to remember.  Specifying
them all to work in-place and return the original list would encourage
chaining, which Guido definitely does NOT want (saving a couple of
lines of code by making everything much denser is NOT what Python
is all about: "sparse is better than dense" is a Pythonic principle).  
Having them all make a preliminary copy and return the mutated copy
would be nicely functional, but unacceptable for all trivial operations
such as reverse -- slowdowns of over 100% are NOT within the realm
of "premature optimization" accusations (some operations would even
go up in big-O terms -- eek).  AND it would NOT let you "speed things
up" if the only way to e.g. extend a list copied it (try a loop of 
a.extend(b) vs one of a = a + b over a few thousand sublists b to
get an idea...).

So, I disagree with your characterization of having all mutators work 
in-place and NOT return the mutated object as "premature optimization".
It's a design that nicely achieves consistency, simplicity, and promotion
of Guido's favourite sparse-is-better-than-dense style in the "one obvious
way to do it" mold.

So, now (actually for 2.4 -- I predict more than a year, so don't hold your
breath) we get to try and allow a hopefully controlled amount of style
variation towards somewhat higher "conceptual" density -- what has
been described (not technically accurately, but suggestively) as "a more
declarative style", suitable to be read as 
"for k ranging over the sorted list corresponding to thedict,"...
rather than as
"get the list of thedict's keys; sort it; now, for k ranging over that 
list,"...
which "feels" more imperative, stylistically speaking, because it is a
sequence of obviously-imperative steps rather than an expression
conjoining "not obviously imperative" sub-steps.  It's a fine thread to
walk, since Python's roots ARE deeply imperative, of course;-).


Alex





More information about the Python-list mailing list