Rounding a number to nearest even

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Fri Apr 11 16:29:03 EDT 2008


On 11 abr, 15:33, Lie <Lie.1... at gmail.com> wrote:
> On Apr 11, 10:19 pm, Mikael Olofsson <mik... at isy.liu.se> wrote:

> > That's exactly how I was taught to do rounding in whatever low-level
> > class it was. The idea is to avoid a bias, which assumes that the
> > original values are already quantized. Assume that we have values
> > quantized to one decimal only, and assume that all values of this
> > decimal are equally likely. Also assume that the integer parts of our
> > numbers are equally likely to be even or odd. Then the average rounding
> > error when rounding to integers will be 0.05 if you always round up when
> > the decimal is 5. If you round towards an even number instead when the
> > decimal is 5, then you will round up half of those times, and round down
> > the other half, and the average rounding error will be 0. That's the
> > idea. Of course you could argue that it would be even more fair to make
> > the choice based on the tossing of a fair coin.
>
> That old-school rounding method you were taught is based on a wrong
> assumption about the nature of numbers. In the past, the rounding
> algorithm was based on this:
>
> Original => RoundUp(u|d|n), RoundNearestEven(u|d|n)
> ...
> 1.0 => 1(n), 1(n)
> 1.1 => 1(d), 1(d)
> 1.2 => 1(d), 1(d)
> 1.3 => 1(d), 1(d)
> 1.4 => 1(d), 1(d)
> 1.5 => 2(u), 2(u)
> 1.6 => 2(u), 2(u)
> 1.7 => 2(u), 2(u)
> 1.8 => 2(u), 2(u)
> 1.9 => 2(u), 2(u)
> 2.0 => 2(n), 2(n)
> 2.1 => 2(d), 2(d)
> 2.2 => 2(d), 2(d)
> 2.3 => 2(d), 2(d)
> 2.4 => 2(d), 2(d)
> 2.5 => 3(u), 2(d)
> 2.6 => 3(u), 3(u)
> 2.7 => 3(u), 3(u)
> 2.8 => 3(u), 3(u)
> 2.9 => 3(u), 3(u)
> ...
>
> In this used-to-be-thought-correct table, the Round Up algorithm has 2
> Unrounded, 8 Round Down, and 10 Round Up, which seems incorrect, while
> Round Even has 2 Unrounded, 9 Round Down, and 9 Round Up, which seems
> correct. The misunderstanding comes from the view that there is such a
> thing as Not Rounded, while in fact the only numbers that are Not
> Rounded are 1 and 2; 1.0 and 2.0 must still be rounded. In
> practice we can just say that all numbers must be rounded somewhere.

That's not correct. If the numbers to be rounded come from a
measurement, each entry in the left column is not just a number but the
representative of an interval (as Mikael said, they're quantized). 2.3
means that the measurement was closer to 2.3 than to 2.2 or 2.4 - that
is, it lies in [2.25, 2.35) (it doesn't really matter which side is
open or closed). It is this "interval" behavior that forces the
"round-to-even-on-halves" rule.
So, the numbers 1.6-2.4 in the left column cover the interval [1.55,
2.45) and there is no doubt that they should be rounded to 2.0 because
all of them are closer to 2.0 than to any other integer. Similarly,
the values in [2.55, 3.45) are all rounded to 3.
But what to do with [2.45, 2.55), the interval represented by 2.5? We
can assume a uniform distribution here even if the whole distribution
is not uniform (because we're talking about the smallest measurable
range). So half of the time the "true value" would have been < 2.5, and
we should round to 2. And half of the time it's > 2.5 and we should
round to 3. Always rounding to 3 introduces a bias into the process.
Rounding randomly (tossing a coin, for example) would be fair, but
people usually prefer more deterministic approaches. If the number of
intervals is not too small, the "round even" rule provides a way to
choose between those two possibilities with equal probability.
So when we round 2.5 we are actually rounding an interval which could
equally be rounded to 2 or to 3, and the same goes for 3.5, 4.5 etc. If
the number of such intervals is big, choosing the even number helps to
make as many rounds up as rounds down.
If the number of such intervals is small, *any* a priori rule will
introduce a bias.
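
A minimal sketch of the bias argument in Python, assuming values
quantized to one decimal (0.0 through 9.9, all equally likely), as in
Mikael's setup. The helper name mean_rounding_error is made up for
illustration; Decimal.quantize and the ROUND_* constants come from the
standard decimal module.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN

def mean_rounding_error(rounding):
    # Assumed input: 0.0, 0.1, ..., 9.9, so every decimal digit is
    # equally likely and even/odd integer parts are equally likely.
    values = [Decimal(n) / 10 for n in range(100)]
    # Signed error of rounding each value to an integer.
    errors = [v.quantize(Decimal("1"), rounding=rounding) - v for v in values]
    return sum(errors) / len(errors)

half_up_bias = mean_rounding_error(ROUND_HALF_UP)
half_even_bias = mean_rounding_error(ROUND_HALF_EVEN)
print("half up:  ", half_up_bias)
print("half even:", half_even_bias)
```

Under those assumptions the half-up rule averages an error of 0.05 and
the half-even rule averages 0, which are the figures quoted above.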

> > Note that if you do not have quantized values and assuming that the
> > fraction part is evenly distributed between 0 and 1, then this whole
> > argument is moot. The probability of getting exactly 0.5 is zero in that
> > case, just as the probability of getting any other specified number is zero.
>
> Another mistake: for an unquantized value the probability of getting
> exactly 0.5 (or any other specified number) is not 0 but an
> infinitesimal (i.e. lim(x) where x -> 0 (BUT NOT ZERO))

That limit IS zero. And the probability of getting exactly a certain
real number, or any finite set of real numbers, is zero too (assuming
the usual definition of probability over infinite sets). But we're not
actually talking about real numbers here.
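
A rough sketch of what that means for quantized values, assuming the
fractional part is quantized to a given number of decimal digits and
every one of the possible values is equally likely. The function name
prob_exact_half is hypothetical.

```python
from fractions import Fraction

def prob_exact_half(digits):
    # Assumption: the fractional part takes each of the 10**digits
    # quantized values with equal probability; exactly one is 0.5.
    return Fraction(1, 10 ** digits)

probabilities = [prob_exact_half(d) for d in range(1, 6)]
for d, p in enumerate(probabilities, start=1):
    print(d, "digits:", p)
```

The probability is positive for any finite number of digits, so ties do
occur and the tie rule matters; it only shrinks towards zero as the
quantization gets finer.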

--
Gabriel Genellina


