[Numpy-discussion] A proposal to implement round in C Was: Rookie problems - Why is C-code much faster?

Tue Feb 21 17:06:08 EST 2006

[I am reposting this under a different subject because my original
post got buried in a long thread that went on to discussing unrelated
topics.  Sorry if you had to read this post twice.]

It turns out that  around (or round_) is implemented in python:

def round_(a, decimals=0):
   """Round 'a' to the given number of decimal places.  Rounding
   behaviour is equivalent to Python.

   Return 'a' if the array is not floating point.  Round both the real
   and imaginary parts separately if the array is complex.
   """
   a = asarray(a)
   if not issubclass(a.dtype.type, _nx.inexact):
       return a
   if issubclass(a.dtype.type, _nx.complexfloating):
       return round_(a.real, decimals) + 1j*round_(a.imag, decimals)
   if decimals is not 0:
       decimals = asarray(decimals)
   s = sign(a)
   if decimals is not 0:
       a = absolute(multiply(a, 10.**decimals))
   else:
       a = absolute(a)
   rem = a-asarray(a).astype(_nx.intp)
   a = _nx.where(_nx.less(rem, 0.5), _nx.floor(a), _nx.ceil(a))
   # convert back
   if decimals is not 0:
       return multiply(a, s/(10.**decimals))
   else:
       return multiply(a, s)

I see many ways to improve the performance here.  First, there is no
need to check for "decimals is not 0" three times. This can be done
once, maybe at the expense of some code duplication.  Second,
_nx.where(_nx.less(rem, 0.5), _nx.floor(a), _nx.ceil(a)) seems to be
equivalent to _nx.floor(a+0.5). Finally, if rint is implemented as a
ufunc as Mads originally suggested, "decimals is 0" branch can just
call that.

It is tempting to rewrite the whole thing in C, but before I  do that
I have a few questions about current implementation.

1.  It is implemented in oldnumeric.py .  Does this mean it is
deprecated.  If so, what is the recommended replacement?

2.  Was it intended to support  array and fractional values for
decimals or is it an implementation artifact. Currently:

>>> around(array([1.2345]*5),[1,2,3,4,5])
array([ 1.2   ,  1.23  ,  1.235 ,  1.2345,  1.2345])
>>> around(1.2345,2.5)
array(1.2332882874656679)

3. It does nothing to exact types, even if decimals<0
>>> around(1234, -2)
array(1234)

Is this a bug? Consider that
>>> round(1234, -2)
1200.0
and
>>> around(1234., -2)
array(1200.0)

Docstring is self-contradictory: "Rounding behaviour is equivalent to
Python" is not consistent with "Return 'a' if the array is not
floating point."

I propose to deprecate around and implement a new "round" member
function in C that will only accept scalar "decimals"  and will behave
like a properly vectorized builtin round.  I will do the coding if
there is interest.

In any case, something has to be done here.  I don't think the
following timings are acceptable:

> python -m timeit -s "from numpy import array; x = array([1.5]*1000)" "(x+0.5).astype(int).astype(float)"
100000 loops, best of 3: 18.8 usec per loop
> python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "around(x)"
10000 loops, best of 3: 155 usec per loop