[Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?)
Gnata Xavier
xavier.gnata at gmail.com
Mon Mar 24 11:41:28 EDT 2008
> A couple of thoughts on parallelism:
>
> 1. Can someone come up with a small set of cases and time them on
> numpy, IDL, Matlab, and C, using various parallel schemes, for each of
> a representative set of architectures? We're comparing a benchmark to
> itself on different architectures, rather than seeing whether the
> thread capability is helping our competition on the same architecture.
> If it's mostly not helping them, we can forget it for the time being.
> I suspect that it is, in fact, helping them, or at least not hurting
> them.
>
>
Well, I could ask some IDL users to provide you with benchmarks.
For C/OpenMP I have posted a trivial example.
> 2. Would it slow things much to have some state that the routines
> check before deciding whether to run a parallel implementation or not?
> It could default to single thread except in the cases where
> parallelism always helps, but the user can configure it to multithread
> beyond certain thresholds of, say, number of elements. Then, in the
> short term, a savvy user can tweak that state to get parallelism for
> more than N elements. In the longer term, there could be a test
> routine that would run on install and configure the state for that
> particular machine. When numpy started it would read the saved file
> and computation would be optimized for that machine. The user could
> always override it.
>
>
No, it wouldn't cost that much, and that is exactly the way IDL (for
instance) works.
> 3. We should remember the first rule of parallel programming, which
> Anne quotes as "premature optimization is the root of all evil".
> There is a lot to fix in numpy that is more fundamental than speed. I
> am the first to want things fast (I would love my secondary eclipse
> analysis to run in less than a week), but we have gaping holes in
> documentation and other areas that one would expect to have been
> filled before a 1.0 release. I hope we can get them filled for 1.1.
> It bears repeating that our main resource shortage is in person-hours,
> and we'll get more of those as the community grows. Right now our
> deficit in documentation is hurting us badly, while our deficit in
> parallelism is not. There is no faster way of growing the community
> than making it trivial to learn how to use numpy without hand-holding
> from an experienced user. Let's explore parallelism to assess when
> and how it might be right to do it, but let's stay focussed on the
> fundamentals until we have those nailed.
>
>
That is well put and clear.
It is also clear that our deficit in parallelism is not hurting us that
badly. It is a real problem in some communities, such as astronomy and
image processing, but the lack of documentation is the more pressing
one, that is true.
Xavier