[Numpy-discussion] Multidimension array access in C via Python API

Sebastian Berg sebastian at sipsolutions.net
Tue Apr 5 14:19:56 EDT 2016

On Tue, 2016-04-05 at 09:48 -0700, mpc wrote:
> The idea is that I want to thin a large 2D buffer of x,y,z points to a given
> resolution by dividing the data into equal sized "cubes" (i.e. resolution is
> number of cubes along each axis) and averaging the points inside each cube
> (if any).

Another point is timing your actual code. In this case you would have
noticed that nearly all the time is spent in the while loops and very
little in the min/max calls before them.
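A minimal sketch of such a measurement, assuming a smaller random buffer than yours and a single cube whose bounds I picked arbitrarily. The helper `time_call` and the other names are mine, not taken from your code:

```python
import time

import numpy as np


def time_call(func, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best


# Assumption: 200,000 random points stand in for the real buffer.
rng = np.random.default_rng(0)
points = rng.random((200_000, 3))
x, y, z = points[:, 0], points[:, 1], points[:, 2]


def min_max_calls():
    # The six min/max calls that run once before the loops.
    return [(c.min(), c.max()) for c in (x, y, z)]


def one_cube_mask():
    # One pass of the innermost loop body: a full scan of all points.
    return np.where((x >= 0.0) & (x <= 0.01) &
                    (y >= 0.0) & (y <= 0.01) &
                    (z >= 0.0) & (z <= 0.01))


resolution = 100
t_minmax = time_call(min_max_calls)
t_mask = time_call(one_cube_mask)
# The mask is evaluated once per cube, i.e. resolution ** 3 times in total.
estimated_loop_total = t_mask * resolution ** 3

print("min/max calls, once:      %.6f s" % t_minmax)
print("one cube mask:            %.6f s" % t_mask)
print("estimated all cube masks: %.1f s" % estimated_loop_total)
```

The estimate only multiplies one measured pass by the number of cubes, so it is rough, but it is enough to show where the time goes before any C is written.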

The other thing is the algorithm, i.e. what you actually do. In the end,
it seems your code is just a multidimensional histogram. I am not sure
how fast numpy's histogram is, but I am sure it vastly outperforms this.
If you are interested in how it does this, you can check its code; it is
just Python (though numpy internally always has quite a lot of fun
boilerplate to handle corner cases).
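To make the histogram idea concrete, here is a rough sketch using `np.histogramdd`: one call counts the points per cube, and one weighted call per coordinate sums them, so the mean is sum divided by count. The function name `thin_points` is my own, and I assume every axis has a nonzero range. Unlike the loop version, each point lands in exactly one cube, so points on a cube boundary are not counted twice.

```python
import numpy as np


def thin_points(points, resolution):
    """Average the points falling into each cell of a resolution^3 grid.

    points: array of shape (N, 3); returns one mean point per occupied cell.
    Assumes max > min along every axis.
    """
    # resolution + 1 equally spaced bin edges per axis, spanning the data.
    edges = [np.linspace(points[:, k].min(), points[:, k].max(),
                         resolution + 1) for k in range(3)]

    # Number of points per cell.
    counts, _ = np.histogramdd(points, bins=edges)

    # Per-cell coordinate sums: one weighted histogram per coordinate.
    sums = [np.histogramdd(points, bins=edges, weights=points[:, k])[0]
            for k in range(3)]

    # Keep only cells that contain at least one point.
    occupied = counts > 0
    return np.column_stack([s[occupied] / counts[occupied] for s in sums])


demo = np.array([[0.0, 0.0, 0.0],
                 [0.1, 0.1, 0.1],
                 [1.0, 1.0, 1.0]])
thinned = thin_points(demo, resolution=2)
print(thinned)
```

In the demo the first two points share a cube and are averaged, while the third sits alone in the opposite corner. I have not benchmarked this against your loops; it is meant to show the shape of the approach.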

And if you first search for what you want to do, you may easily find
faster solutions, batteries included and all; there are a lot of tools
out there. The other point is: don't optimize much if you don't know
exactly what you need to optimize.

- Sebastian

>     # Fill up buffer data for demonstration purposes with initial buffer of
>     # size 10,000,000 to reduce to 1,000,000
>     size = 10000000
>     buffer = np.ndarray(shape=(size, 3), dtype=np.float64)
>     # fill it up
>     buffer[:, 0] = np.random.ranf(size)
>     buffer[:, 1] = np.random.ranf(size)
>     buffer[:, 2] = np.random.ranf(size)
>     # Create result buffer to size of cubed resolution
>     # (i.e. 100 ^ 3 = 1,000,000)
>     resolution = 100
>     thinned_buffer = np.ndarray(shape=(resolution ** 3, 3),
>                                 dtype=np.float64)
>     # Trying to convert the following into C to speed it up
>     x_buffer = buffer[:, 0]
>     y_buffer = buffer[:, 1]
>     z_buffer = buffer[:, 2]
>     min_x = x_buffer.min()
>     max_x = x_buffer.max()
>     min_y = y_buffer.min()
>     max_y = y_buffer.max()
>     min_z = z_buffer.min()
>     max_z = z_buffer.max()
>     z_block = (max_z - min_z) / resolution
>     x_block = (max_x - min_x) / resolution
>     y_block = (max_y - min_y) / resolution
>     current_idx = 0
>     x_idx = min_x
>     while x_idx < max_x:
>         y_idx = min_y
>         while y_idx < max_y:
>             z_idx = min_z
>             while z_idx < max_z:
>                 inside_block_points = np.where(
>                     (x_buffer >= x_idx) & (x_buffer <= x_idx + x_block) &
>                     (y_buffer >= y_idx) & (y_buffer <= y_idx + y_block) &
>                     (z_buffer >= z_idx) & (z_buffer <= z_idx + z_block))
>                 if inside_block_points[0].size > 0:
>                     mean_point = buffer[inside_block_points[0]].mean(axis=0)
>                     thinned_buffer[current_idx] = mean_point
>                     current_idx += 1
>                 z_idx += z_block
>             y_idx += y_block
>         x_idx += x_block
>     return thinned_buffer