[Python-ideas] "Exposing" `__min__` and `__max__`

James Edwards jheiv at jheiv.com
Wed Jun 20 12:53:58 EDT 2018


I do think there is some merit in being able to as easily as possible
transition from something being, say list-backed, to a domain specific data
structure that's more efficient.  If one wanted to go from a list to a
minheap, it'd be nice just to change `x = []` to `x = Heap()` and have
everything else "just work", without having to change all `min(x)`s to
`x.min()`s, etc.  But trying to find places in production code where there
is the special casing you described is (now) on the top of my priorities.

Outside of invariant-maintaining data structures though, I think the most
compelling case I've come up with so far is that it could be implemented on
certain types of generators, without having to generate the whole
sequence.  (IMHO, they're neat side effects, but not very compelling by
themselves):

- __min__ and __max__ could be implemented on the range object, as
calculating these are straightforward, and you wouldn't need to generate
the entire sequence.  Passing an "un-listified" range to a function that
accepts lists is somewhat common, I think.  The inefficiency is compounded
if you generate the sequence multiple times, e.g.:

    def get_err(iterable):
        return max(iterable) - min(iterable)

    x = range(10000)
    get_err(x)

    # vs

    x = list(range(10000))
    get_err(x)

  Here, implementing __min__ and __max__ would make passing a range not
just as fast as passing a "listified" range, but significantly faster.  But
I think this is just a nice coincidence of exposing the special methods and
by no means motivating by itself.

- There are also a few infinite generators in itertools where calling min()
or max() on the generator will run forever, despite at least one of these
being a clearly defined:

    from itertools import count, cycle

    gen = count(start=0, step=1)  # min=0, no max
    # or
    gen = cycle([1,2,3])          # min=1, max=3

    print(min(gen))               # Will never terminate

  I'm even less sure about how often this would actually help than I am the
range example. I don't envision many places where people are passing
infinite generators to things expecting standard lists -- especially in
light of the fact that calling min() or max() on them will prevent further
execution.

Thanks for all the feedback, the search will continue :)



On Wed, Jun 20, 2018 at 11:24 AM, Guido van Rossum <guido at python.org> wrote:

> Finding more realistic use cases is key -- the actual spec is pretty
> obvious and doesn't worry me in terms of added language or implementation
> complexity.
>
> I think just finding a data structure that should implement its own
> min/max funtionality (or maybe one of these, like heapq) is not enough
> motivation. You have to find code where such a data structure (let's say a
> Tree) is passed to some function that also accepts, say, a list. Then that
> function would benefit from being able to call just `min(x)` rather than
> `x.min() if isinstance(x, Tree) else min(x)`. If whenever you have a Tree
> you know that you have a Tree (because it has other unique methods) then
> there's no burden for the user to call x.min().
>
> --Guido
>
> On Wed, Jun 20, 2018 at 5:24 AM James Edwards <jheiv at jheiv.com> wrote:
>
>> >  Are there any builtins or std library classes that offer their own
>> min()/max() methods?
>>
>> My first instinct was heapq[1], since the way to access the min value is
>> simply heap[0] (and I thought it could benefit from __min__) -- it's almost
>> the perfect motivating example.  But as it stands, the module uses
>> functions to operate directly on a standard list, so even if __min__ were
>> exposed, min(heap) would still iterate over the entire list.
>>
>> That being said, a heap *class* could take advantage of this, and
>> provide a semantically consistent optimization.
>>
>> I'm not sure how many examples will be found in stdlib, as I expect this
>> optimization to be restricted to specialized container types like heaps,
>> but I'll keep searching.
>>
>> [1] https://docs.python.org/3.6/library/heapq.html
>>
>> On Wed, Jun 20, 2018 at 3:00 AM, Steven D'Aprano <steve at pearwood.info>
>> wrote:
>>
>>> On Wed, Jun 20, 2018 at 07:05:19AM +0300, Serhiy Storchaka wrote:
>>> > 19.06.18 22:18, James Edwards пише:
>>> > >I've only recently looked for these special methods, so that in and
>>> of
>>> > >itself may be the reason these methods aren't exposed, but I could
>>> think
>>> > >of objects that may wish to implement __min__ and __max__ themselves,
>>> > >for efficiency.
>>> >
>>> > There are two questions.
>>> >
>>> > 1. What to do with additional min() and max() arguments: key and
>>> default.
>>>
>>> Since there are no reflected versions of min/max, there is no trouble
>>> with extra arguments. Just pass them through to the dunder:
>>>
>>> min(obj, key=x, default=y) => type(obj).__min__(key=x, default=y)
>>>
>>>
>>> > 2. Is the need of this feature large enough? Will the benefit for
>>> > special cases exceed the drawback of increasing implementation
>>> > complexity and slowing down common cases?
>>>
>>> Reasonable questions, but I don't think that the cost of testing:
>>>
>>>     if hasattr(type(obj), '__min__')
>>>     # or equivalent
>>>
>>> is going to be very large. Amortized over O(N) comparisons, that's
>>> practically free :-)
>>>
>>> More important, I think, is the increase in API complexity. That's two
>>> more dunders to learn about.
>>>
>>> The first part is critical: is this useful enough to justify two more
>>> dunders? I think the answer is a definite Maybe. Or perhaps Maybe Not.
>>>
>>> I think that without at least one use-case in the standard library,
>>> perhaps we should hold off on this. Unless numpy arrays are important
>>> enough to justify this on their own?
>>>
>>> Are there any builtins or std library classes that offer their own
>>> min()/max() methods? If so, that would be good evidence that making this
>>> a dunder-based protocol has stdlib use-cases.
>>>
>>>
>>>
>>> --
>>> Steve
>>> _______________________________________________
>>> Python-ideas mailing list
>>> Python-ideas at python.org
>>> https://mail.python.org/mailman/listinfo/python-ideas
>>> Code of Conduct: http://python.org/psf/codeofconduct/
>>>
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
>
> --
> --Guido van Rossum (python.org/~guido)
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20180620/ecf22984/attachment.html>


More information about the Python-ideas mailing list