How is max supposed to work, especially key.

Peter Otten __peter__ at web.de
Thu Dec 4 06:24:41 EST 2014


Albert van der Horst wrote:

> In article <mailman.16378.1417111312.18130.python-list at python.org>,
> Peter Otten  <__peter__ at web.de> wrote:
>>Albert van der Horst wrote:
>>
>>> In the Rosetta code I come across this part of
>>> LU-decomposition.
>>>
>>> def pivotize(m):
>>>     """Creates the pivoting matrix for m."""
>>>     n = len(m)
>>>     ID = [[float(i == j) for i in xrange(n)] for j in xrange(n)]
>>>     for j in xrange(n):
>>>         row = max(xrange(j, n), key=lambda i: abs(m[i][j]))
>>>         if j != row:
>>>             ID[j], ID[row] = ID[row], ID[j]
>>>     return ID
>>>
>>> That it's using a cast from boolean to float and using
>>> at the other moment a float as a boolean, suggest that this
>>> code is a bit too clever for its own good, but anyway.
>>>
>>> My problem is with the max. I never saw a max with a key.
>>>
>>> In my python help(max) doesn't explain the key. It says that
>>> max can handle an iterator (I didn't know that), and you can
>>> pass an optional "key=func", but that's all.
>>>
>>> I expect it to be something like
>>>   elements in the iterator are taken into account only if the
>>>   key applied to the iterator evaluates to a True value.
>>>
>>> However that doesn't pan out:
>>> "
>>> max(xrange(100,200), key=lambda i: i%17==0 )
>>> 102
>>> "
>>>
>>> I expect the maximum number that is divisible by 17 in the
>>> range, not the minimum.
>>>
>>> Can anyone shed light on this?
>>
>>Given a function f() max(items, key=f) returns the element of the `items`
>>sequence with the greatest f(element), e. g. for
>>
>>max(["a", "bcd", "ef"], key=len)
>>
>>the values 1, 3, 2 are calculated and the longest string in the list is
>>returned:
>>
>>>>> max(["a", "bcd", "ef"], key=len)
>>'bcd'
>>
>>If there is more than one item with the maximum calculated the first is
>>given, so for your attempt
>>
>>max(xrange(100,200), key=lambda i: i%17==0 )
>>
>>the values False, False, True, False, ... are calculated and because
>>
>>>>> True > False
>>True
>>
>>the first one with a True result is returned.
>>
> 
> So in that case max doesn't return the maximum (True), but instead
> something else.
> 
> Useful as that function may be, it shouldn't have been called max.
> 
> I don't blame myself for being misled.

I believe you still misunderstand. Again, this time with an almost
real-world car example:

max(values, key=key)

calculates key(value) for every value in values and returns the value with
the greatest result. key() can be len() or an attribute getter, so that

max(cars, key=lambda car: car.weight) 

finds the heaviest car and

max(cars, key=lambda car: car.speed)

finds the fastest car. (If there is a tie the first car with maximum 
weight/speed is returned.)
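
A minimal sketch of the above, assuming Python 3. The Car type, its fields
and the sample data are made up for illustration:

```python
from collections import namedtuple

# Hypothetical Car type and sample data, invented for this example.
Car = namedtuple("Car", "name weight speed")

cars = [
    Car("a", 1200, 180),
    Car("b", 1500, 210),
    Car("c", 1500, 210),
]

# key() is applied to every car; the car with the greatest result wins.
heaviest = max(cars, key=lambda car: car.weight)
fastest = max(cars, key=lambda car: car.speed)

# "b" and "c" tie on both keys; max() returns the first one it encountered.
print(heaviest.name)  # b
print(fastest.name)   # b
```

Note that max() hands back the Car itself, not the weight or speed that was
used for the comparison.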

The advantage of this approach is that you don't have to choose a "natural" 
order, i. e. should

max(cars)

find the fastest, or the heaviest, or the [you name it] car?

Also, you are really interested in the car, and

max(car.speed for car in cars)

would only give you the speed of the fastest car, not the car object itself. 
To find the fastest car without a key argument you'd have to write

fastest_car = max((car.speed, car) for car in cars)[-1]

or even

fastest_car = max((car.speed, -i, car) for (i, car) in enumerate(cars))[-1]

if you are thorough and want the first car in the sequence with maximum
speed (hence the negated index, which lets the earliest car win a tie) or
have to deal with car objects that aren't comparable.
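
To illustrate why the index matters, here is a small sketch assuming
Python 3, where instances of a plain class cannot be ordered. The Car class
and the data are hypothetical:

```python
class Car:
    # Hypothetical class that defines no ordering, so car > car raises
    # TypeError in Python 3.
    def __init__(self, name, speed):
        self.name = name
        self.speed = speed


cars = [Car("a", 210), Car("b", 180), Car("c", 210)]

# Tuples compare element by element; when two speeds tie, the Car objects
# themselves get compared, which fails here.
try:
    max((car.speed, car) for car in cars)
    tie_failed = False
except TypeError:
    tie_failed = True

# A unique index between speed and car settles every tie before the Car
# objects are reached. The sign of the index decides which tied car wins:
# with i the last one has the greatest tuple, with -i the first one.
last_fastest = max((car.speed, i, car) for i, car in enumerate(cars))[-1]
first_fastest = max((car.speed, -i, car) for i, car in enumerate(cars))[-1]

# key= compares only the speeds and needs no decoration at all.
with_key = max(cars, key=lambda car: car.speed)

print(tie_failed)                             # True
print(last_fastest.name, first_fastest.name)  # c a
print(with_key.name)                          # a
```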

The example where the key returns a boolean value is very uncommon, but 
booleans are not treated specially. So while you can write

first_red_car = max(cars, key=lambda car: car.color == RED)
if first_red_car.color == RED:
    print(first_red_car)
else:
    print("no red cars available")


that is unidiomatic. It is also inefficient because max() always iterates
over the whole cars sequence in search of an even redder car. False == 0
and True == 1, and max() cannot be sure that the key function will never
return 2 or TruerThanTrue ;) -- but the programmer usually knows.

Better:

red_cars = (car for car in cars if car.color == RED)
first_red_car = next(red_cars, None)
if first_red_car is not None:
    print(first_red_car)
else:
    print("no red cars available")
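
The difference in work can be made visible by recording which cars the test
function sees. This is only a sketch, assuming Python 3; Car, RED, is_red()
and the sample data are invented for the demonstration:

```python
from collections import namedtuple

# Hypothetical Car type, RED constant and sample data.
Car = namedtuple("Car", "name color")
RED = "red"

cars = [Car("a", "blue"), Car("b", RED), Car("c", "green"), Car("d", RED)]

examined = []  # names of the cars that is_red() was called on


def is_red(car):
    examined.append(car.name)
    return car.color == RED


# max() has to call the key function on every car.
first_via_max = max(cars, key=is_red)
max_examined = list(examined)

examined.clear()

# next() stops pulling from the generator at the first match.
first_via_next = next((car for car in cars if is_red(car)), None)
next_examined = list(examined)

print(first_via_max.name, max_examined)    # b ['a', 'b', 'c', 'd']
print(first_via_next.name, next_examined)  # b ['a', 'b']
```

Both approaches find the same car, but only next() can stop early.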


PS: Another thing to consider is the similarity with sorted().
You can sort by speed 

sorted(cars, key=lambda car: car.speed)

or "redness"

sorted(cars, key=lambda car: car.color == RED)

but you will see the former much more often than the latter.
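
A short sketch of both sorts, assuming Python 3 and a hypothetical Car type
with made-up data:

```python
from collections import namedtuple

# Hypothetical Car type, RED constant and sample data.
Car = namedtuple("Car", "name speed color")
RED = "red"

cars = [
    Car("a", 210, RED),
    Car("b", 180, "blue"),
    Car("c", 250, "green"),
    Car("d", 180, RED),
]

# Ascending by speed; sorted() is stable, so "b" stays ahead of "d".
by_speed = sorted(cars, key=lambda car: car.speed)

# False sorts before True, so the red cars end up at the end.
by_redness = sorted(cars, key=lambda car: car.color == RED)

print([car.name for car in by_speed])    # ['b', 'd', 'a', 'c']
print([car.name for car in by_redness])  # ['b', 'c', 'a', 'd']
```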




