sorting many arrays from one...

Jerzy Karczmarczuk karczma at info.unicaen.fr
Wed Jul 10 05:40:39 EDT 2002


Alex Martelli comments on my remarks about the slowdown of a sorting functional
parameterized by a named comparison function:

> > Somebody cares about explaining this behaviour? Dynamic dispatching,
> > while sort() does not do any dereferencing? Or what?


> Basically, sort needs to do O(N log N) comparisons -- if each comparison
> requires a function call, the time for those calls dominates sorting times
> even if each call does little -- the overhead is the same.  

Sorry, but this does not explain anything. Each comparison requires a function
call (perhaps inlined in some cases, but I don't believe that 'compare' may
be universally inlined) whether this comparison is passed as a parameter or
called "because the sorting device <<knows>> what to call". So, thank you
very much indeed for your nice N log N reminder, but the issue is not the
cost of the *call* (execution), but of the "call" (dereferencing/dispatching).

> If you pass
> nothing, sort knows to use built-in comparison methods, and its wonderful
> optimizations take hold.

May I ask somebody WHO KNOWS what those "wonderful optimizations" are, and
why they are so brutally deactivated with, say, .sort(cmp)?
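For concreteness, here is the kind of measurement under discussion (a minimal
sketch, not the original test; in later Python versions the cmp argument was
removed, so functools.cmp_to_key stands in for passing a comparison function
directly):

```python
import timeit
from functools import cmp_to_key

def compare(a, b):
    # a plain three-way comparison, doing exactly what the built-in order does
    return (a > b) - (a < b)

data = list(range(300000, 0, -1))   # reversed list, forces real sorting work

t_builtin = timeit.timeit(lambda: sorted(data), number=1)
t_named = timeit.timeit(lambda: sorted(data, key=cmp_to_key(compare)), number=1)

print("built-in order: %.3fs" % t_builtin)
print("named compare : %.3fs" % t_named)   # typically several times slower
```

The two calls sort the same data into the same order; only the route taken by
each comparison differs.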

> Indeed passing cmp directly
> in my test I see a slowdown of about 5 times, while f which calls cmp
> gives 10 times -- see, it IS all about function-call times.

Thank you for confirming my test. And for confirming that it all IS about
the function call. I believe I made it clear that I am aware of that.
I asked WHY? Higher-order functions are one of the important paradigms of
modern programming.

> > For people coming from the Functional Programming Realm such slowdown
> > of parameterized functionals is hard to accept...

> A slowdown of about 3.9 CPU seconds
> on my machine for that many calls works out at about 0.76 microseconds
> per call -- preparing the arguments tuple, unpacking it in the
> called function, etc.  Doesn't feel wildly out of control to me --
> it's just a LOT of function calls, using mechanisms studied for an
> interpreted, dynamic language's needs for flexibility, not to get
> superb performance.

Aha, preparing argument tuples (you mean: making them physically, or
just stacking arguments, or what?). Yes, it takes some time, but there
are interpreted languages (e.g. ML) where this is very well optimized:
no physical unpacking of any tuples inside a function.
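To make the cost of the call itself concrete, a tiny measurement (a sketch:
the function body does nothing but the comparison, so any difference between
the two timings is pure call machinery, argument handling included):

```python
import timeit

setup = "f = lambda x, y: x < y; a, b = 1, 2"

# the same comparison, with and without going through a Python-level call
t_direct = timeit.timeit("a < b", setup=setup, number=10**6)
t_call = timeit.timeit("f(a, b)", setup=setup, number=10**6)

print("direct comparison: %.3fs" % t_direct)
print("through a call   : %.3fs" % t_call)   # the call machinery dominates
```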

> What makes it FEEL huge is sort's own superb performance -- 0.16
> microseconds per comparison *everything included* when using the
> built-in comparison.
> 
> So, it's all Tim Peters' fault, for making sort so incredibly fast --
> let's all gang up against him!!!

Now, put 4 more (!) marks and it will be even funnier... But, seriously
again, a question for somebody who *knows* the details: concretely WHAT is
optimized, and WHY can it not be done for the parameterized sort?


> Oh, and, a warning when you try to do such measurements: if you
> mistakenly call sort on an already-sorted list, it only does
> 299999 comparisons (for a 300000 items list) and finishes in
> 0.04 seconds -- 0.78 with a comparison function, so the ratio
> gets even worse.  If you have just one item out of order at
> the end, 300019 comparisons, and the same performances.  This
> is truly intolerable -- we HAVE to slow this darn thing down!!!


I happen to know a little bit about serious sorting, so the warning
is a bit redundant, but thank you anyway; it suggests at least that
the sorting procedure does not use the naive quicksort. Unfortunately,
apart from an attempt to be funny, which I appreciate enormously,
it does not address my questions...

 
> > ... the idea of Decorate-Sort-Undecorate, which begins by constructing
> > an additional list with fast-sortable keys is also something hard to
> > digest.
> 
> May I suggest some Fernet-Branca or other Italian amaro to your liking?
> They do wonders to help you digest rich, too-abundant meals.

My <<milieu>> has perhaps some different views on what constitutes a
too-abundant meal, and also a different view of what is an elegant answer
to a technical question. Further on, you suggest (at length and explicitly)
that it is a good idea, because the speed of comparison makes you gain
alpha * N log N, while the cost of decorating/undecorating is linear.
That's OK, I never claimed that it wouldn't work. I have used it myself,
and I submitted my query because I am still convinced that it shouldn't be
necessary. For some people, wasting space *is* harmful independently of
the asymptotic complexity issues.
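For the record, the DSU idiom itself is simple enough (a sketch using a
case-insensitive ordering as the stand-in for an expensive key; whether the
extra O(N) list is an acceptable price is exactly the point in dispute):

```python
# Decorate-Sort-Undecorate: compute each key once (O(N)), sort the
# (key, item) pairs with the fast built-in comparison, strip the keys.
words = ["Banana", "apple", "Cherry"]

decorated = [(w.lower(), w) for w in words]   # decorate: O(N) extra list
decorated.sort()                              # sort on precomputed keys
result = [w for _, w in decorated]            # undecorate: O(N)

print(result)   # ['apple', 'Banana', 'Cherry']
```

The key function runs N times instead of the comparison running N log N
times; the cost is the temporary list of pairs.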


> What's surprising, at all, about this...?

Shall I repeat it once more? I think that the call overhead could perhaps
be better optimised, and if that seems difficult, I would like to know why.
What is surprising about my question? I am not a flea on your
nose, Maestro.
 
> Suppose there was no call overhead at all, just a very
> complicated comparison.  The effect would be exactly the same,
> no?  Applying a complicated comparison function must happen
> N log N times; applying an equally complicated transformation
> function that makes the items rapidly comparable must happen
> only N times, and thus becomes irrelevant for large N.

Nothing to do with my questions.
Moreover, in complicated, interesting cases you won't be
able to find such a transformation producing fast comparisons in
linear time anyway, so this speculation is not so interesting.
Neither is the following paragraph; I am still asking about the
function call overhead...


> But the same effect will apply in ANY language for the
> appropriate kind of sorting task if the comparison op
> you need grows rich and complicated enough to matter.
> 
> Assuming the language's sort has perfect optimization
> and the language makes it very easy to build, handle
> and dismantle sequences of tuples, of course:-).  But
> these are hardly DEFECTS, are they?-)

What does "the language's sort has perfect optimization" mean?
It is no magic.


> > I found it a bit ridiculous to pollute the
> > memory with an ephemeral sequence just because there is something lousy
> > with the parameterized sort

> I think you're using loaded language, presumably reflecting
> loaded and not-well-considered sentiments.  

Thank you. I am always ready to learn how to behave correctly from
serious, elegant people, full of sparkling humour, who invariably show
their full respect for everyone they address in this newsgroup.

> There's nothing "lousy" with functions in Python -- they just take 
> measurable time to call, that's all.  Most often one doesn't care all
> that much about performance.  

In whose name are you claiming this?
Almost all creators of almost all programming languages care a lot about
efficiency, about removing redundancies and optimizing frequent operations.
Higher-order functions and parameterized functionals are absolutely
fundamental in functional programming, and compiler optimisations there are
pushed to the extreme. In OO languages, caching mechanisms and other ways
of accelerating the dispatch of virtual methods are also quite important.
Python is no exception. It is already very good, sure, and still improving,
so I see no reason not to point out a place which I think is fragile
(if you prefer that to the word "lousy").

I appreciate your contribution to the Cookbook and I understand that
you will defend DSU. 

> That's not "polluting the memory with an ephemeral sequence" --
> it's investing some memory (not much -- O(N) memory, peanuts!)
> to reduce computation.  A hallowed, time-honored approach, you know.

It is your philosophy, but there are others. For some people, space
matters, you know.

Jerzy Karczmarczuk
Caen, France


