[Numpy-discussion] An alternative to vectorize that lets you access the array?
Ram Rachum
ram at rachum.com
Sun Jul 12 09:00:45 EDT 2020
Hi everyone,
Here's a problem I've been dealing with. I wonder whether NumPy has a tool
that will help me, or whether this could be a useful feature request.
At the upcoming EuroPython 2020, I'll give a talk about live-coding a music
synthesizer. It's going to be a fun talk: I'll use the sounddevice
<https://github.com/spatialaudio/python-sounddevice/> module to make a
program that plays music. Do attend, or watch it on YouTube when it's out :)
There's a part of my talk that I could make simpler, and thus shave off 3-4
minutes of cumbersome explanations. These 3-4 minutes matter a great deal
to me. But for that I need to do something with NumPy and I don't know
whether it's possible or not.
The sounddevice library takes an ndarray of sound data and plays it.
Currently I use `vectorize` to produce that array:
output_array = np.vectorize(f, otypes='d')(input_array)
And I'd like to replace it with this code, which is supposed to give the
same output:
output_array = np.ndarray(input_array.shape, dtype='d')
for i, item in enumerate(input_array):
    output_array[i] = f(item)
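As a quick check that the two versions really agree, here is a minimal, self-contained sketch. The function `f` (a 440 Hz sine wave) and the 44100 Hz sample rate are hypothetical stand-ins, not the actual code from the talk.

```python
import math
import numpy as np

def f(t):
    # Hypothetical per-sample function: a 440 Hz sine wave at time t (seconds).
    return math.sin(2 * math.pi * 440 * t)

# Hypothetical input: 1000 sample times at an assumed 44100 Hz sample rate.
input_array = np.arange(1000) / 44100.0

# Version 1: np.vectorize builds the whole output array in one call.
vectorized = np.vectorize(f, otypes='d')(input_array)

# Version 2: an explicit loop that fills a preallocated array item by item.
looped = np.empty(input_array.shape, dtype='d')
for i, item in enumerate(input_array):
    looped[i] = f(item)

# Both versions should produce the same values.
same = np.allclose(vectorized, looped)
print(same)
```

The loop version has the property wanted here: every element written so far is already visible in `looped` while later elements are still being computed.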
The reason I want the second version is that I can then have sounddevice
start playing `output_array` in a separate thread, while it's being
calculated. (Yes, I know about the GIL, I believe that sounddevice releases
it.)
Unfortunately, the for loop is very slow, even when I'm not processing the
data on a separate thread. I benchmarked it on both CPython and PyPy3, which
is my target platform. On CPython it's 3 times slower than vectorize, and
on PyPy3 it's 67 times slower than vectorize! That's despite the fact that
the NumPy documentation says "The `vectorize` function is provided
primarily for convenience, not for performance. The implementation is
essentially a `for` loop."
So here are a few questions:
1. Is there something like `vectorize`, except you get to access the output
array before it's finished? If not, what do you think about adding that as
an option to `vectorize`?
2. Is there a more efficient way of writing the `for` loop I've written
above? Or any other kind of solution to my problem?
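One possible middle ground between the two versions, sketched here under assumptions, is to call the vectorized function on slices of the input and write each result into a preallocated output array. The helper name `fill_in_chunks`, the chunk size, and the sine-wave `f` are all hypothetical; this is not an existing `vectorize` option, and its speed has not been measured.

```python
import math
import numpy as np

def f(t):
    # Hypothetical per-sample function: a 440 Hz sine wave at time t (seconds).
    return math.sin(2 * math.pi * 440 * t)

def fill_in_chunks(func, input_array, output_array, chunk_size=1024):
    """Fill output_array chunk by chunk.

    Yields the number of items that are ready after each chunk, so a
    consumer can safely read output_array[:ready] in the meantime.
    """
    vectorized = np.vectorize(func, otypes='d')
    for start in range(0, len(input_array), chunk_size):
        stop = min(start + chunk_size, len(input_array))
        # One vectorize call per chunk instead of one Python-level
        # assignment per item.
        output_array[start:stop] = vectorized(input_array[start:stop])
        yield stop

# Hypothetical input: 5000 sample times at an assumed 44100 Hz sample rate.
input_array = np.arange(5000) / 44100.0
output_array = np.zeros(input_array.shape, dtype='d')

# Here the generator is simply drained; in a real program each yielded
# value would tell the playback side how much of the array is ready.
progress = list(fill_in_chunks(f, input_array, output_array))
print(progress)
```

A smaller chunk size makes the finished data available sooner, at the cost of more calls into the vectorized function; the right trade-off would need to be benchmarked on the target interpreter.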
Thanks for your help,
Ram Rachum.