[Python-Dev] decorate-sort-undecorate

Tim Peters tim.one at comcast.net
Tue Oct 14 15:56:32 EDT 2003


[Guido]
> ...
> When exactly do you consider  a sort a "database sort"?

Oh, roughly anything that sorts large objects according to a small piece of
each.  In the sorting app I talked about, the database was a giant XML file,
and was read into memory with each database record represented as a class
instance with a few dozen data attributes.  The full-blown __cmp__ for this
class was large.  With a more memory-efficient database representation, I'd
expect the in-memory object to be a kind of wrapper around database-access
code, maybe with a cache of recently-referenced attributes.  Then it's mondo
expensive if you have to break ties by fetching data from disk.

...

>> It's not externality, it's decomposability: ....

> I experimented a bit with the version of Outlook I have, and it seems
> to always use the delivery date/time as the second key, and always in
> descending order.

It depends some on the current view, but I misremembered Outlook's UI
anyway:  to get a multi-heading sort, you have to be depress the shift key
when clicking on the 2nd (and 3rd, etc) column (and click twice (not
double-click!) to reverse the sort order on the current column; the shift
key applies there too if you want a multi-key sort order).

> ...
> I'm not sure that this helps us decide the default behavior of sorts
> in Python, which are rarely interactive in this sense.

Python is used to implement interactive apps.

> (If someone writes an Ourlook substitute, they can pretty well code the
> sort to do whatever they want.)

The 2.3 sort is stable.  That's not only the default, there's no choice
about it.  What's getting proposed is to give up stability to ease an
implementation trick in the cases where both the stability and speed of a
sort are most often most important.  If I'm just sorting a list of floats,
it's very hard to detect whether the sort was stable (I'd have to stare at
the memory addresses of the float objects to tell).  The cases where DSU
gets used are the ones where the object isn't the key (so that stability or
lack thereof becomes obvious), and where the user cares about speed (else
they'd just pass a custom comparison function instead of bothering with
DSU).

Most sorts I do won't specify a key= argument, so most sorts I do couldn't
care less what auto-DSU does.  When I do code a DSU by hand, I nearly always
include the index component, but more for speed reasons than for stability
reasons -- but I rarely write interactive apps.  I'm told that other people
do <wink -- but a lot of people were very happy to see the 2.3 sort become
stable>.

> ...
> But I don't see the interactivity in Python apps, and that's what
> counts here.

Won't your current employer write a web-based system security monitor in
Python, showing tables of information?  An app that doesn't make a table
view sortable on a column isn't a real app <wink>.

> ...
> To cut all this short, I propose that we offer using the index as a
> second sort key as an option, on by default, whose name can be (barely
> misleading) "stable".  On by default nicely matches the behavior of
> the 2.3 sort without any options.

At this point, I may be losing track of how many options the 2.4 sort is
growing -- cmp, key, and stable?  I'd rather drop stable, and that when cmp
or key (or both) is used, sort promises not to fall back to whole-object
comparison by magic (if cmp or key invoke whole-object comparison, fine,
that's on the user's head then).  There are several ways to implement the
latter, most of which would inherit stability from the core 2.3 sort.




More information about the Python-Dev mailing list