[Python-ideas] Respectively and its unpacking sentence

Andrew Barnert abarnert at yahoo.com
Wed Jan 27 20:51:46 EST 2016


On Jan 27, 2016, at 16:12, Steven D'Aprano <steve at pearwood.info> wrote:
> 
> I think you would be better off trying to get better support for 
> vectorized operations into Python:

I really think, at least 90% of the time, and probably a lot more, people are better off just using numpy than reinventing it. Obviously, building a "statistics-without-numpy" module to be added to the stdlib is an exception. But otherwise, the fact that numpy already exists, and has had a couple decades of heavy use and expert attention and two predecessor libraries to work out the kinks in the design, means that it's likely to be better, even for your limited purposes, than any limited-purpose thing you come up with. 

There are a lot more edge cases than you think. For example, you thought far enough ahead that your sum works column-wise on 2D arrays. But what about when you need to work row-wise? What's the best interface: an axis parameter, or a transpose function (hey, you can even just use zip)? How do you then extend whichever choice you made to 3D? Or to when you want to get the total sum across both axes? For another example: should I be able to use vectorize to write a function of two arrays, and then apply it to a single N+1-D array, or is that going to cause more confusion than help? And so on. I wouldn't trust my own a priori intuition on those questions, so I'd go look at APL, J, MATLAB, R, and maybe Mathematica and see how their idioms best translate to Python in a variety of different kinds of problems. And I'd probably get some of it wrong, as numpy's ancestors did, and then have to agonize over compatibility-breaking changes.
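
To make the axis-versus-transpose question concrete, here is a rough pure-Python sketch. The names column_sums, row_sums, and total are made up for illustration and don't come from any library; the transpose is just zip, as suggested above. It assumes a 2D list of equal-length rows.

```python
def column_sums(rows):
    # zip(*rows) transposes a list of equal-length rows into columns.
    return [sum(col) for col in zip(*rows)]


def row_sums(rows):
    # Row-wise needs no transpose at all: each row is already a sequence.
    return [sum(row) for row in rows]


def total(rows):
    # Summing across both axes: reduce one axis, then reduce the result.
    return sum(row_sums(rows))


data = [[1, 2, 3],
        [4, 5, 6]]

print(column_sums(data))  # [5, 7, 9]
print(row_sums(data))     # [6, 15]
print(total(data))        # 21
```

Even this much already hides decisions: zip silently truncates ragged rows, and nothing here says what any of the three functions should do with 3D input.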

And after all that, what would be the benefit? I no longer have to install numpy--but now I have to install pyvec instead. Which is just a less-featureful, less-tested, less-optimized, and less-refined numpy.

If there's something actually _wrong_ with numpy's design for your purposes (and you can't trivially wrap it away), that's different. Maybe you could do a whole lot lazily by sticking to the iterator protocol? (There's a nifty Haskell package for vectorizing lazily that might be worth looking at, as long as you can stand reading about everything in terms of monadic lifting where you'd say vectorize, etc.) But "I want the same as numpy but less good" doesn't seem like a good place to start, because at best, that's what you'll end up with.
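
As a rough idea of what "sticking to the iterator protocol" could look like, here is a minimal sketch. lazy_vectorize is a hypothetical helper, not an existing API; it assumes elementwise application over one or more iterables is all you need, and it is built on nothing but map.

```python
import itertools


def lazy_vectorize(func):
    # Hypothetical helper: lift a scalar function so it applies elementwise
    # to iterables. map() is lazy, so nothing is computed until consumed.
    def wrapper(*iterables):
        return map(func, *iterables)
    return wrapper


@lazy_vectorize
def add(x, y):
    return x + y


# Both inputs are infinite; this only works because evaluation is lazy.
squares = (n * n for n in itertools.count())
result = add(squares, itertools.count())

first_five = list(itertools.islice(result, 5))
print(first_five)  # [0, 2, 6, 12, 20]
```

This handles infinite and one-shot inputs for free, but gives up indexing, shapes, and anything like an axis argument.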


