[Numpy-discussion] arange(start, stop, step) and floating point (Ticket #8)

Wed Feb 8 20:10:02 EST 2006

On 2/8/06, Tim Hochberg <tim.hochberg at cox.net> wrote:
> Sasha wrote:
> Isn't that bad numerically? That is, isn't (n*step) much more accurate
> than (step + step + ....)?

It does not matter whether n*step is more accurate than step+...+step.
As long as arange fills in the values with a repeated-addition loop
(value += step), the last element may exceed stop even if
start + length*step does not.
One may argue that filling with start + i*step is more accurate, but
that will probably be much slower (even slower than my O(N) algorithm).
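
To make the difference between the two fill strategies concrete, here
is a minimal pure-Python sketch. The function names are hypothetical
and this is not the actual XXX_fill code; it only mimics the two ways
of generating the values.

```python
def fill_by_addition(start, step, n):
    """Generate n values by repeatedly adding step (rounding error accumulates)."""
    out = []
    value = start
    for _ in range(n):
        out.append(value)
        value += step
    return out


def fill_by_multiplication(start, step, n):
    """Generate n values as start + i*step (one rounding per element)."""
    return [start + i * step for i in range(n)]


added = fill_by_addition(0.0, 0.1, 11)
multiplied = fill_by_multiplication(0.0, 0.1, 11)
print(added[-1])       # 0.9999999999999999
print(multiplied[-1])  # 1.0
```

In this particular case the accumulated value drifts below the
product; with other start and step values the drift can go the other
way, which is what makes the last element hard to bound.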

> It also seems needlessly inefficient;
I proposed the O(N) algorithm just to counter Robert's argument that it
is not possible to ensure the invariant.  On the other hand, I don't
think it is that bad - I would expect the length-computing loop to be
much faster than the main loop, which has to touch main memory.

> you
> should be able to do it in at most a few steps:
>
> length = (stop - start)/step
> while length * step < stop:
>     length += 1
> while length * step >= stop:
>    length -= 1
>
> Fix errors, convert to C and enjoy. It should normally take only a few
> tries to get the right N.

This will not work (even if you fix the error of missing start+ in the
conditions :-): start + length*step < stop does not guarantee that
start + step + ... + step < stop.
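
As an illustration, here is a sketch of one plausible reading of a
length computation that simulates the fill loop, so that the count
agrees with what repeated addition actually produces. It assumes
step > 0, and the function name is hypothetical; it is compared
against the closed-form estimate.

```python
import math


def length_by_simulation(start, stop, step):
    """Count the values < stop that a repeated-addition loop would emit.

    Assumes step > 0. Runs in O(N) but touches no array memory.
    """
    n = 0
    value = start
    while value < stop:
        n += 1
        value += step
    return n


closed_form = math.ceil((1.0 - 0.0) / 0.1)
simulated = length_by_simulation(0.0, 1.0, 0.1)
print(closed_form, simulated)  # 10 11
```

The two counts disagree here because ten additions of 0.1 give
0.9999999999999999, which is still below stop, while 10*0.1 is
exactly 1.0.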

> I see that the internals of range use repeated adding to make the range.
> I imagine that is why you proposed the repeated adding. I think that
> results in error that's on the order of length ULP, while multiplying
> would result in error on the order of 1 ULP. So perhaps we should fix
> XXX_fill to be more accurate if nothing else.
>

I don't think the accuracy of XXX_fill for fractional steps is worth
improving.  In the cases where accuracy matters, one can always use an
integral step and multiply the result by a float.  However, if anything
is done to that end, I would suggest generalizing the XXX_fill functions
to allow the accumulation to be performed using a different type,
similarly to the way the op.reduce and op.accumulate functions use their
(new in numpy) dtype argument.
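
Both ideas can be sketched with current NumPy. The values of n and
step are arbitrary choices for illustration, and the second part only
shows the existing dtype argument of ufunc accumulate, not a
generalized XXX_fill.

```python
import numpy as np

n = 10
step = 0.1

# Integral step, then scale: each element is one rounded multiplication,
# so no error accumulates from element to element.
scaled = np.arange(n) * step

# Accumulate float32 steps either in float32 or in a wider type,
# using the dtype argument of the ufunc's accumulate method.
steps32 = np.full(n, step, dtype=np.float32)
narrow = np.add.accumulate(steps32)
wide = np.add.accumulate(steps32, dtype=np.float64)

print(scaled[-1], narrow[-1], wide[-1])
```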

> >3. Change arange to ensure that arange(..., stop, ...)[-1] < stop.
> >
> I see that Travis has vetoed this in any event, but perhaps we should
> fix up the fill functions to be more accurate and maybe most of the
> problem would just magically go away.

The more I think about this, the more I am convinced that using arange
with a non-integer step is a bad idea.  Since making it illegal is not
an option, I don't see much of a point in changing exactly how bad it
is. Users who want fractional steps should just be educated about
linspace.
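
For example, a minimal sketch of the linspace alternative, where the
number of points is stated explicitly instead of being derived from a
floating-point step (the endpoint values and counts here are
arbitrary):

```python
import numpy as np

# Ten points in [0, 1): the stop value is excluded by request,
# so the length never depends on rounding in the step.
half_open = np.linspace(0.0, 1.0, num=10, endpoint=False)

# Eleven points in [0, 1]: the last element is the stop value itself.
closed = np.linspace(0.0, 1.0, num=11)

print(len(half_open), half_open[-1] < 1.0)  # 10 True
print(len(closed), closed[-1] == 1.0)       # 11 True
```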