Python's simplicity philosophy

Sun Nov 16 15:23:19 EST 2003

On Sun, 16 Nov 2003 15:15:45 GMT, Alex Martelli <aleaxit at yahoo.com> wrote:

>Andrew Dalke wrote:
>   ...
>> The other three are in csv.py and look something like
>>         quotechar = reduce(lambda a, b, quotes = quotes:
>>                            (quotes[a] > quotes[b]) and a or b,
>>                            quotes.keys())
>> and are identical in intent to your use case -- select the
>> object from a list which maximizes some f(obj).
IMO the use cases in csv.py derive from a particular approach to a particular
problem and don't mean much beyond that, except as a trigger for ideas. I.e., any
ideas have to stand on their own, without reference to the csv use cases.
BTW, I'm guessing one could do better with "for q in data.split(candidate_quote): ..."
and then looking at q[0] and q[-1], and maybe q.strip()[0] and q.strip()[-1] as appropriate,
to gather the statistics for the two candidate quotes. This would also let you check
for escaped quotes. But that is another rabbit trail ;-)
>
>I have already suggested, in a post on this thread's direct ancestor 9 days 
>ago, the 100%-equivalent substitution in pure, simple, faster Python:
>
>quotechar = None
>for k in quotes:
>    if not quotechar or quotes[k]>quotes[quotechar]:
>        quotechar = k
>
>or, for those who consider fast, simple, obviously correct code too boring,
>
>quotechar = max([ (v,k) for k,v in quotes.iteritems() ])[-1]
>
>which is more concise and at least as clever as the original.
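For comparison, here is a minimal sketch of the three spellings side by side.
It assumes a modern Python 3 interpreter rather than the 2.3 of this thread
(so reduce comes from functools and items() replaces iteritems()), and the
counts in quotes are made up for illustration.

```python
from functools import reduce  # reduce is not a builtin in Python 3

# Hypothetical tallies: how often each candidate quote character was seen.
quotes = {'"': 7, "'": 3}

# The csv.py-style reduce, written with a conditional expression.
by_reduce = reduce(lambda a, b: a if quotes[a] > quotes[b] else b, quotes)

# The explicit loop from the quoted post (testing "is None" instead of "not").
by_loop = None
for k in quotes:
    if by_loop is None or quotes[k] > quotes[by_loop]:
        by_loop = k

# The max-over-(value, key)-pairs one-liner.
by_max = max([(v, k) for k, v in quotes.items()])[-1]

print(by_reduce, by_loop, by_max)
```

All three agree here because the counts are distinct. With tied counts they
can pick different keys: the reduce keeps the later key, the loop keeps the
earlier one, and the max version keeps the key that compares largest.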
>
>
>> This suggests the usefulness of a new function or two,
>> perhaps in itertools, which applies a function to a list
>
>Nope -- itertools is not about CONSUMERS of iterators, which this one would 
>be.  All itertools entries RETURN iterators.
Ok, but what about returning an iterator -- e.g., funumerate(f, seq) -- that supplies f(x),x pairs
like enumerate(seq) supplies i,x?

[I'd suggest extending enumerate, but I already want to pass optional range parameters there,
so one could control the numeric values returned, e.g., enumerate(seq,<params>) corresponding to
zip(xrange(<params>),seq).] [BTW, should xrange() default to xrange(0,sys.maxint,1)?]
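
A rough sketch of what such a parameterized enumerate might look like, assuming
a modern Python 3 (range instead of xrange). The name enumerate_range is
hypothetical and not part of any library.

```python
def enumerate_range(seq, *params):
    """Hypothetical helper: number the items of seq with range(*params).

    params are passed straight to range(), so at least a stop value is
    required, and numbering ends when either range or seq is exhausted.
    """
    return zip(range(*params), seq)


# Number the items 10, 15, 20, ... instead of 0, 1, 2, ...
print(list(enumerate_range('abc', 10, 100, 5)))
# [(10, 'a'), (15, 'b'), (20, 'c')]
```

Later Python versions do accept a start value, as in enumerate(seq, 10), but
not a step or stop, so the zip spelling is still the way to get those.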

>
>> of values and returns the first object which has the
>> maximum value, as in
>
>Given that the sort method of lists now has an optional key= argument, I
This is a new one on me:
 >>> seq.sort(key=lambda x:x)
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
 TypeError: sort() takes no keyword arguments

Do you mean the comparison function? Or is there something else now too?
I'm beginning to infer that key= is actually a keyword arg for a _function_
that gets a "key" value from a composite object (in which case ISTM "getkeyvalue" or
"valuefunc" would be a better name). But IMO "key" suggests something used on elements x
like x[key], not a function passed as key=whatever and then called as key(x) to get values.
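
If that inference is right, the semantics can be sketched as follows. This
assumes an interpreter whose sort accepts key= (the one in the session above
does not) and shows the decorate-sort-undecorate idiom next to it, which gives
the same ordering without any keyword support.

```python
seq = ['ab', 'cde', 'f', 'gh', 'ijk']

# key is a function called on each element x; elements are ordered by key(x).
by_key = sorted(seq, key=len)

# The same ordering via decorate-sort-undecorate. The index i keeps the
# original order among items whose key values are equal.
decorated = [(len(x), i, x) for i, x in enumerate(seq)]
decorated.sort()
by_dsu = [x for _, _, x in decorated]

print(by_key)   # ['f', 'ab', 'gh', 'cde', 'ijk']
print(by_dsu)   # ['f', 'ab', 'gh', 'cde', 'ijk']
```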

>think the obvious approach would be to add the same optional argument to min 
>and max with exactly the same semantics.  I.e., just as today:
>
>x = max(somelist)
>somelist.sort()
>assert x == somelist[-1]
>
>we'd also have
>
>x = max(somelist, key=whatever)
>somelist.sort(key=whatever)
>assert x == somelist[-1]
I think I like it, other than the name. Maybe s/key/valuefunc/ ;-)
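
A small sketch of the proposed invariant, assuming a Python in which both max
and list.sort accept key= (true of modern Python 3, not of the 2.3 used in
this thread). The sample lists are made up.

```python
somelist = ['ab', 'cde', 'f', 'ghij']

# With distinct key values, the proposed invariant holds.
x = max(somelist, key=len)
somelist.sort(key=len)
assert x == somelist[-1]

# With tied key values, max returns the first maximal item, while the
# stable sort leaves the last of the tied items at the end.
tied = ['ab', 'cd']
first_max = max(tied, key=len)
last_sorted = sorted(tied, key=len)[-1]

print(x, first_max, last_sorted)   # ghij ab cd
```

So in today's Python the assertion is only guaranteed when the maximal key
value is unique.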

>
>a perfectly natural extension, it seems to me.  I've found such total and 
>extreme opposition to this perfectly natural extension in private 
>correspondence about it with another Python committer that I've so far
>delayed proposing it on python-dev -- for reasons that escape me it would
>appear to be highly controversial rather than perfectly obvious.
>
>> def longest(seq):
>>     return imax(seq, len)
>
>That would be max(seq, key=len) in my proposal.

That's a nice option for max (and min, and ??), but ISTM that it would
also be nice to have a factory for efficient iterators of this kind.
It would probably be pretty efficient then to write

    maxlen, maxitem = max(funumerate(len,seq))

or

    def longest(seq):
        return max(funumerate(len,seq))[-1]

and it would be available as infrastructure for other efficient loops, in
addition to the key= option tied in to specific sequence processors like max and min.

Of course we can roll our own:

 >>> def funumerate(fun, seq):
 ...     for x in seq: yield fun(x),x
 ...
 >>> seq = 'ab cde f gh ijk'.split()
 >>> seq
 ['ab', 'cde', 'f', 'gh', 'ijk']
 >>> max(funumerate(len,seq))
 (3, 'ijk')
 >>> min(funumerate(len,seq))
 (1, 'f')
 >>> max(funumerate(len,seq))[-1]
 'ijk'
 >>> longest(seq)
 'ijk'
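
One difference between the two spellings is worth noting. The sketch below
assumes a modern Python 3 in which max accepts key=, and reuses the funumerate
and longest definitions from above.

```python
def funumerate(fun, seq):
    # Yield (fun(x), x) pairs, much as enumerate yields (i, x) pairs.
    for x in seq:
        yield fun(x), x


def longest(seq):
    return max(funumerate(len, seq))[-1]


seq = 'ab cde f gh ijk'.split()

# Tuples compare element by element, so a tie on length is broken by
# comparing the items themselves: 'ijk' > 'cde'.
print(max(funumerate(len, seq)))   # (3, 'ijk')
print(longest(seq))                # ijk

# max with key= compares only the key values and keeps the first maximum.
print(max(seq, key=len))           # cde
```

The pair approach therefore needs items that can be compared with each other
whenever the function values tie, while key= never compares the items.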
>
>> lines.  There were very few, and the paucity suggests
>> that 'sum' isn't needed all that often.  Then again, I'm
>> not one who suggested that that be a builtin function ;)
>
>Right, that was my own responsibility.  I did identify about
>10 spots in the standard library then (including substitutions
>for reduce); that's more than the typical built-in has, even though
>the tasks handled by the standard library are heavily slanted
>to string and text processing, networking &c, and (of course)
>"pure infrastructure", rather than typical application tasks.
>

Regards,
Bengt Richter