[Numpy-discussion] An alternative to vectorize that lets you access the array?

Ram Rachum ram at rachum.com
Sun Jul 12 09:00:45 EDT 2020


Hi everyone,

Here's a problem I've been dealing with. I wonder whether NumPy has a tool
that will help me, or whether this could be a useful feature request.

At the upcoming EuroPython 2020, I'll give a talk about live-coding a music
synthesizer. It's going to be a fun talk: I'll use the sounddevice
<https://github.com/spatialaudio/python-sounddevice/> module to make a
program that plays music. Do attend, or watch it on YouTube when it's out :)

There's a part of my talk that I could make simpler, and thus shave off 3-4
minutes of cumbersome explanations. These 3-4 minutes matter a great deal
to me. But for that I need to do something with NumPy, and I don't know
whether it's possible.


The sounddevice library takes an ndarray of sound data and plays it.
Currently I use `vectorize` to produce that array:

    output_array = np.vectorize(f, otypes='d')(input_array)

And I'd like to replace it with this code, which is supposed to give the
same output:

    output_array = np.ndarray(input_array.shape, dtype='d')
    for i, item in enumerate(input_array):
        output_array[i] = f(item)

The reason I want the second version is that I can then have sounddevice
start playing `output_array` in a separate thread, while it's being
calculated. (Yes, I know about the GIL, I believe that sounddevice releases
it.)
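
A minimal sketch of that arrangement, under stated assumptions: `f` here is
a hypothetical stand-in for the real synthesizer function, `fill` and
`progress` are names made up for illustration, and the actual sounddevice
playback is left out. The worker thread fills `output_array` while the main
thread is free to read the part that is already computed.

```python
import threading

import numpy as np


def f(x):
    # Hypothetical scalar function standing in for the real synthesizer code.
    return x * 0.5


def fill(input_array, output_array, progress):
    # Compute samples one by one; publish how many are ready after each write.
    for i, item in enumerate(input_array):
        output_array[i] = f(item)
        progress[0] = i + 1


input_array = np.arange(10_000, dtype='d')
# np.zeros rather than an uninitialized array, so that samples that are not
# computed yet read as silence instead of arbitrary memory.
output_array = np.zeros(input_array.shape, dtype='d')
progress = [0]

worker = threading.Thread(target=fill, args=(input_array, output_array, progress))
worker.start()

# While the worker runs, a consumer can safely read the finished prefix.
ready = progress[0]
partial = output_array[:ready].copy()

worker.join()
```

Starting from zeros is a design choice in this sketch only: if playback
overtakes the calculation, the listener hears silence rather than noise.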

Unfortunately, the `for` loop is very slow, even when I'm not processing the
data on a separate thread. I benchmarked it on both CPython and PyPy3, which
is my target platform. On CPython it's 3 times slower than `vectorize`, and
on PyPy3 it's 67 times slower than `vectorize`! That's despite the fact that
the NumPy documentation says "The `vectorize` function is provided
primarily for convenience, not for performance. The implementation is
essentially a `for` loop."
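
A minimal sketch of how such a comparison can be timed, assuming a
hypothetical `f` and an arbitrary array size. This is not the benchmark
behind the numbers above, and the ratio it prints will depend on the
function, the array size and the interpreter.

```python
import timeit

import numpy as np


def f(x):
    # Hypothetical scalar function standing in for the real synthesizer code.
    return x * 0.5


input_array = np.arange(100_000, dtype='d')


def with_vectorize():
    return np.vectorize(f, otypes='d')(input_array)


def with_loop():
    output_array = np.empty(input_array.shape, dtype='d')
    for i, item in enumerate(input_array):
        output_array[i] = f(item)
    return output_array


# Sanity check: both versions should produce the same samples.
assert np.allclose(with_vectorize(), with_loop())

# Take the best of a few runs to reduce noise from other processes.
t_vec = min(timeit.repeat(with_vectorize, number=1, repeat=3))
t_loop = min(timeit.repeat(with_loop, number=1, repeat=3))
print(f"vectorize: {t_vec:.4f} s, loop: {t_loop:.4f} s, "
      f"ratio: {t_loop / t_vec:.1f}x")
```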

So here are a few questions:

1. Is there something like `vectorize`, except you get to access the output
array before it's finished? If not, what do you think about adding that as
an option to `vectorize`?

2. Is there a more efficient way of writing the `for` loop I've written
above? Or any other kind of solution to my problem?


Thanks for your help,
Ram Rachum.