max(), sum(), next()

Thu Sep 4 04:28:46 EDT 2008

On Sep 4, 1:26 am, Steven D'Aprano <st... at REMOVE-THIS-
cybersource.com.au> wrote:
> On Wed, 03 Sep 2008 22:20:43 -0700, Mensanator wrote:
> > On Sep 3, 8:30 pm, Steven D'Aprano <st... at REMOVE-THIS-
> > cybersource.com.au> wrote:
> >> On Wed, 03 Sep 2008 16:20:39 -0700, Mensanator wrote:
> >> >>>> sum([])
> >> > 0
>
> >> > is a bug, just as it's a bug in Excel to evaluate blank cells as 0.
> >> > It should return None or throw an exception like sum([None,1]) does.
>
> >> You're wrong, because 99.9% of the time when users leave a blank cell
> >> in Excel, they want it to be treated as zero.
>
> > Then 99.9% of users want the wrong thing.
>
> It is to laugh.
>
> > Microsoft knows that this is a bug
>
> Says you.
>
> > but refuses to fix it to prevent breaking legacy documents (probably
> > dating back to VisiCalc). When graphing data, a missing value should be
> > interpreted as a hole in the graph
>
> "Graphing data" is not sum(). I don't expect graphing data to result in
> the same result as sum(), why would I expect them to interpret input the
> same way?
>
> > +------+             +--+------+------+-----+
>
> Why should the graphing application ignore blanks ("missing data"), but
> sum() treat missing data as an error? That makes no sense at all.

Maybe it's important to know data is missing. You can see
the holes in a graph. You can't see the holes in a sum.

>
> > and not evaluated as 0
>
> > And Microsoft provides a workaround for graphs to make 0's appear as
> > holes. Of course, this will cause legitimate 0 values to disappear, so
> > the workaround is inconsistent.
>
> I'm not aware of any spreadsheet that treats empty cells as zero for the
> purpose of graphing, and I find your claim that Excel can't draw graphs
> with zero in them implausible, but I don't have a copy of Excel to test
> it.

That was a mistake. I made a followup correction, but
you probably didn't see it.

>
> >> Spreadsheet sum() is not the
> >> same as mathematician's sum, which doesn't have a concept of "blank
> >> cells". (But if it did, it would treat them as zero, since that's the
> >> only useful thing and mathematicians are just as much pragmatists as
> >> spreadsheet users.) The Excel code does the right thing, and your
> >> "pure" solution would do the unwanted and unexpected thing and is
> >> therefore buggy.
>
> > Apparently, you don't use databases or make surface contours.
>
> Neither databases nor surface contours are sum(). What possible relevance
> are they to the question of what sum() should do?

Because a sum that includes Nulls isn't valid. If you treated
Nulls as 0, then not only would your sum be wrong, but so
would your count and the average based on those. Now you
can EXPLICITLY tell the database to only consider non-Null
values, which doesn't change the total, but DOES change
the count.
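
To put numbers on that, here is a minimal sketch in plain Python. It
assumes None stands in for a database Null, and the three readings are
made up for illustration; no real database is involved.

```python
# None stands in for a database Null (an assumption for illustration).
readings = [4, None, 6]

# Treat Nulls as 0: the total looks fine, but the count and the
# average now include a value that was never measured.
as_zero = [0 if r is None else r for r in readings]
total_zero = sum(as_zero)
count_zero = len(as_zero)
avg_zero = total_zero / count_zero

# Explicitly skip Nulls: same total, but a different count and average.
known = [r for r in readings if r is not None]
total_known = sum(known)
count_known = len(known)
avg_known = total_known / count_known

print(total_zero, count_zero, avg_zero)
print(total_known, count_known, avg_known)
```

Both totals agree, so the sum alone never reveals that a value was
missing; only the count and the average give it away.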

>
> Do you perhaps imagine that there is only "ONE POSSIBLE CORRECT WAY" to
> deal with missing data, and every function and program must deal with it
> the same way?

But that's what sum() is doing now, treating sum([]) the same
as sum([],0). Why isn't sum() defined such that "...if list
is empty, return start, IF SPECIFIED, otherwise raise exception"?
Then, instead of "ONE POSSIBLE CORRECT WAY", the user could
specify whether he wants Excel compatible behaviour or
Access compatible behaviour.
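
A rough sketch of the behaviour I'm describing, for what it's worth.
The names strict_sum and _MISSING are hypothetical, made up for this
example; this is not how the builtin sum() is defined.

```python
_MISSING = object()  # hypothetical sentinel: "no start value was given"


def strict_sum(iterable, start=_MISSING):
    """Sum the items; an empty input is an error unless start is given."""
    items = list(iterable)
    if start is _MISSING:
        if not items:
            # Nothing to add and no explicit start: refuse to guess 0.
            raise ValueError("strict_sum() of empty sequence with no start")
        return sum(items)
    # The caller asked for a start value explicitly, so honour it.
    return sum(items, start)


print(strict_sum([4, -4]))   # non-empty list, sums normally
print(strict_sum([], 0))     # caller explicitly opted in to 0
try:
    strict_sum([])
except ValueError as exc:
    print("error:", exc)
```

With this definition the caller who wants an empty list to count as 0
says so with strict_sum(data, 0), and everyone else gets an exception
to catch.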

>
> > Contour programs REQUIRE that blanks are null, not 0
>
> Lucky for them that null is not 0 then.

No, but blank cells are 0 as far as Excel is concerned.
That behaviour causes nothing but trouble and I am
saddened to see Python emulate such nonsense.

>
> > so that the Kriging
> > algorithm interpolates around the holes rather than return false
> > calculations. Excel's treatment of blank cells is inconsistent with
> > Access' treatment of Nulls and therefore wrong, anyway you slice it.
>
> No no no, you messed that sentence up. What you *really* meant was:
>
> "Access' treatment of Nulls is inconsistent with Excel's treatment of
> blank cells and therefore wrong, anyway you slice it."
>
> No of course not. That would be stupid, just as stupid as your sentence.
> Excel is not Access. They do different things. Why should they
> necessarily interpret data the same way?

Because you want consistent results?

>
> > Maybe you want to say a bug is when it doesn't do what the author
> > intended, but I say if what the intention was is wrong, then a perfect
> > implementation is still a bug because it doesn't do what it's supposed to
> > do.
>
> Who decides what it is supposed to do if not the author?

The author can't change math on a whim.

> You, in your ivory tower who doesn't care a fig for
> what people want the software to do?

True, I could care less what people want to do...

...as long as they do it consistently.

>
> Bug report: "Software does what users want it to do."
> Fix: "Make the software do something that users don't want."

What the users want doesn't carry any weight with respect
to what the database wants. The user must conform to the
needs of the database because the other way ain't ever gonna
happen.

>
> Great.

If only. But then, I probably wouldn't have a job.

>
> >> Bugs are defined by "does the code do what the user wants it to do?",
> >> not "is it mathematically pure?".
>
> > Really? So you think math IS a democracy? There is no reason to violate
> > mathematical purity.
>
> You've given a good example yourself: the Kriging algorithm needs a Null
> value which is not zero. There is no mathematical "null" which is
> distinct from zero, so there's an excellent violation of mathematical
> purity right there.

Hey, I was talking databases, you brought up mathematical purity.

>
> If I am given the job of adding up the number of widgets inside a box,
> and the box is empty, I answer that there are 0 widgets inside it.

Right. It has a known quantity and that quantity is 0.
Just because the box is empty doesn't mean the quantity
is Null.

> If I
> were to follow your advice and declare that "An error occurred, can't
> determine the number of widgets inside an empty box!" people would treat
> me as an idiot, and rightly so.

Right. But a better analogy is when a new shipment is due
but hasn't arrived yet so the quantity is unknown. Now the
boss comes up and says he needs to ship 5 widgets tomorrow
and asks how many you have. You say 0. Now the boss runs
out to Joe's Widget Emporium and pays retail only to discover
when he gets back that the shipment has arrived containing
12 widgets. Because you didn't say "I don't know, today's
shipment isn't here yet", the boss not only thinks you're
an idiot, but he fires you as well.

>
> > If I don't get EXACTLY the same answer from Excel,
> > Access, Mathematica and Python, then SOMEBODY is wrong. It would be a
> > shame if that somebody was Python.
>
> Well Excel, Python agree that the sum of an empty list is 0. What do
> Access and Mathematica do?

I don't know about Mathematica, but if you EXPLICITLY
tell Access to sum only the non-Null values, you'll get the
same answer Excel does. Otherwise, any expression that
includes a Null evaluates to Null, which certainly isn't
the same answer Excel gives.
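
I can't show Access from here, so the sketch below uses Python's
sqlite3 module as a stand-in. That is an assumption: SQLite is not
Access, and the stock table and its quantities are made up. It only
illustrates the general SQL idea of a Null propagating through an
expression versus explicitly restricting to non-Null rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (qty INTEGER)")
conn.executemany("INSERT INTO stock VALUES (?)", [(5,), (None,), (7,)])

# An arithmetic expression with a Null operand evaluates to Null.
null_expr = conn.execute("SELECT 5 + NULL").fetchone()[0]

# The same thing row by row: the Null row stays Null.
row_sums = [
    row[0]
    for row in conn.execute("SELECT qty + 1 FROM stock ORDER BY rowid")
]

# Explicitly restricting to non-Null rows gives the "blanks skipped" total.
total = conn.execute(
    "SELECT SUM(qty) FROM stock WHERE qty IS NOT NULL"
).fetchone()[0]

conn.close()

print(null_expr)
print(row_sums)
print(total)
```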

>
> >> The current behaviour of sum([]) does the right thing for the 99% of
> >> the time when users expect an integer.
>
> > Why shouldn't the users expect an exception? Isn't that why we have
> > try:except? Maybe 99% of users expect sum([])==0, but _I_ expect to be
> > able to distinguish an empty list from [4,-4].
>
> The way to distinguish lists is NOT to add them up and compare the sums:
>
> >>> sum([4, -4]) == sum([0]) == sum([1, 2, 3, -6]) == sum([-1, 2, -1])
>
> True
>
> The correct way is by comparing the lists themselves:
>
> >>> [] == [4, -4]
>
> False
>
> >> And the
> >> rest of the time, they have to specify a starting value for the sum
> >> anyway, and so sum([], initial_value) does the right thing *always*.
>
> > So if you really want [] to be 0, why not say sum([],0)?
>
> I don't want [] == 0. That's foolish. I want the sum of an empty list to
> be 0, which is a very different thing.

In certain circumstances. In others, an empty list summing
to 0 is just as foolish. That's why sum([]) should be an
error, so you can have it either way.

Isn't one of Python's slogans "Explicit is better than implicit"?

>
> And I don't need to say sum([],0) because the default value for the
> second argument is 0.

That's the problem. There is no justification for assuming
that unknown quantities are 0.

>
> > Why shouldn't nothing added to nothing return nothing? Having it
> > evaluate to 0 is wrong 99.9% of the time.
>
> It is to laugh.
>
> What's the difference between having 0 widgets in a box and having an
> empty box with, er, no widgets in it?

There are no "empty" boxes. There are only boxes with
known quantities and those with unknown quantities.
I hope that's not too ivory tower.

>
> --
> Steven