[Numpy-discussion] Is there a more efficient way to do this?

Brett Olsen brett.olsen at gmail.com
Wed Aug 8 11:41:05 EDT 2012


On Wed, Aug 8, 2012 at 9:19 AM, Laszlo Nagy <gandalf at shopzeus.com> wrote:
> Is there a more efficient way to calculate the "slices" array below?
>
> I do not want to make copies of DATA, because it can be huge. The
> argsort is fast enough. I just need to create slices for different
> dimensions. The above code works, but it does a linear time search,
> implemented in pure Python code. For every iteration, Python code is
> executed. For 1 million rows, this is very slow. Is there a way to
> produce "slices" with numpy code? I could write C code for this, but I
> would prefer to do it with mass numpy operations.
>
> Thanks,
>
>    Laszlo
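
To make the target concrete (this is how I read the question): each row of
"slices" is a (value, start, stop) triple into the argsort order, so the row
indices holding a given value can be recovered without copying data. A tiny
illustration (the numbers here are made up just for the example):

import numpy as np

data = np.array([3, 1, 3, 2, 1, 3])
o = np.argsort(data, kind='stable')   # [1, 4, 3, 0, 2, 5]; stable only so the
                                      # output shown here is deterministic

# The slices array for this data would be:
#   [[1, 0, 2],
#    [2, 2, 3],
#    [3, 3, 6]]
# Row indices of every element equal to 3, without copying data:
value, start, stop = 3, 3, 6
rows_with_value = o[start:stop]       # array([0, 2, 5])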

#Code
import numpy as np

# rows can range from 100 up to 1,000,000
rows = 1000
data = np.random.randint(0, 101, rows)  # integers 0..100 inclusive

def get_slices_slow(data):
    # Walk the argsort order in pure Python, emitting one
    # (value, start, stop) triple per distinct value.
    o = np.argsort(data)
    n = len(data)

    slices = []
    prev_val = None
    sidx = -1

    for oidx, rowidx in enumerate(o):
        val = data[rowidx]
        if val != prev_val:
            if prev_val is not None:
                slices.append((prev_val, sidx, oidx))
            sidx = oidx
            prev_val = val

    # Close the final run (prev_val holds the last value seen).
    if 0 <= sidx < n:
        slices.append((prev_val, sidx, n))
    return np.array(slices, dtype=np.int64)

def get_slices_fast(data):
    # Count occurrences of each distinct value with boolean indexing;
    # the running total gives each slice's stop index, and the stops
    # shifted down by one row give the starts.
    nums = np.unique(data)
    slices = np.zeros((len(nums), 3), dtype=np.int64)
    slices[:, 0] = nums
    count = 0
    for i, num in enumerate(nums):
        count += (data == num).sum()
        slices[i, 2] = count
    slices[1:, 1] = slices[:-1, 2]
    return slices

def get_slices_faster(data):
    # np.bincount requires small non-negative integers; indexing the
    # cumulative counts by the unique values keeps this correct even
    # when some integers in the range never occur in data.
    nums = np.unique(data)
    slices = np.zeros((len(nums), 3), dtype=np.int64)
    slices[:, 0] = nums
    count = np.bincount(data)
    slices[:, 2] = count.cumsum()[nums]
    slices[1:, 1] = slices[:-1, 2]
    return slices

#Testing in ipython
In [2]: (get_slices_slow(data) == get_slices_fast(data)).all()
Out[2]: True

In [3]: (get_slices_slow(data) == get_slices_faster(data)).all()
Out[3]: True

In [4]: timeit get_slices_slow(data)
100 loops, best of 3: 3.51 ms per loop

In [5]: timeit get_slices_fast(data)
1000 loops, best of 3: 1.76 ms per loop

In [6]: timeit get_slices_faster(data)
10000 loops, best of 3: 116 us per loop

So using bincount and vectorized index arithmetic gets you roughly a factor
of 30 over the pure-Python scan.  Even just counting in a Python loop with
boolean indexing (get_slices_fast) already buys about a factor of 2.
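
For completeness, the same triples can also be built with np.searchsorted on
the argsort order.  This is just a sketch I haven't timed (the function name
is mine), but it avoids bincount's requirement that the values be small
non-negative integers, and it reuses the argsort you already have instead of
making a sorted copy of data:

def get_slices_searchsorted(data, o=None):
    # Find each unique value's first and one-past-last position in the
    # argsort order via binary search, giving (value, start, stop).
    if o is None:
        o = np.argsort(data)
    nums = np.unique(data)
    slices = np.zeros((len(nums), 3), dtype=np.int64)
    slices[:, 0] = nums
    slices[:, 1] = np.searchsorted(data, nums, side='left', sorter=o)
    slices[:, 2] = np.searchsorted(data, nums, side='right', sorter=o)
    return slices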

~Brett


