[Python-ideas] Exploiting type-homogeneity in list.sort() (again!)

Erik python at lucidity.plus.com
Tue Mar 7 16:36:57 EST 2017


Hi Elliot,

On 07/03/17 21:10, Elliot Gorokhovsky wrote:
> On Tue, Mar 7, 2017 at 1:47 PM Erik <python at lucidity.plus.com
> <mailto:python at lucidity.plus.com>> wrote:
>
>
>     I'd prefer the sort optimization to be based on what my list contains
>     NOW, not on what it may have contained some time in the past, so I'm not
>     a fan of the "once heterogeneous, always considered heterogeneous"
>     behaviour if it's cheap enough to avoid it.
>
>
> Sure. Dictionaries actually don't implement this, though: as soon as
> they see a non-string key, they permanently switch to a heterogeneous
> state (IIRC).

I'd be interested to know if this approach had been considered and 
rejected for dicts - but I think dicts are a bit of a special case 
anyway. Because they are historically a fundamental building block of 
the language (for name lookups etc) they are probably more sensitive to 
small changes than other objects.

> I think the bigger problem, though, is that most list use does *not*
> involve sorting, so it would be a shame to impose the non-trivial
> overhead of type-checking on *all* list use.

Yes, I understand that issue - I just thought I'd mention something that 
hadn't been pointed out yet _IF_ the idea of a type hint were to be 
considered (that's the sub-thread I'm replying to). If you're not doing 
that, then fine - I just wanted to put down things that occurred to me 
so they were documented (if only for rejection).

So, while I'm at it ;), here are some other things I noticed scanning 
the list object source (again, only if a type hint was considered):

* What is the type hint of an empty list? (this probably depends on how 
naturally the code for all of the type hint checking deals with NULL vs 
"unknown").

* listextend() - this should do the right thing with the type hint when 
extending one list with another.

* Several other methods ('contains', 'remove', 'count', 'index') also 
use PyObject_RichCompareBool(). They could also presumably benefit from 
the same optimisation (perhaps it's not all about sort() - perhaps this 
gives a little more weight to the idea).

> Anyway, my patch could always be a precursor to a more general
> optimization along these lines. I'm almost finished fixing the problem
> Tim identified earlier in this thread; after that, it'll be ready for
> review!

Nice - good job.
E.


More information about the Python-ideas mailing list