[Numpy-discussion] UMFPACK interface is unexpectedly slow

Wed Jul 21 21:34:38 EDT 2010

I hope I won't get identified as a spam bot :-). While I have not resolved
the problem itself, this is an issue that I cannot reproduce on our
cluster. I wanted to get back with some actual timings from the real
hardware we are going to be using and some details about the matrices, so
as not to chase ghosts, but this proved to be a headache saver.

It's still baffling because on the cluster I have also used stock packages
(albeit from Fedora, which is what our system administrator insists on
using) rather than my hand-compiled and optimized GotoBLAS and UMFPACK. It
didn't even occur to me to try to reproduce this on another system in the
last 4 hours I've been struggling with this, because I assumed that using
stock packages was giving me the uniformity I required. It seems I was
wrong. Nonetheless, I think it's safe to assume in this case that the
problem is not in NumPy or my code, and it would be wiser to bring this up
on Ubuntu's Launchpad.

Thanks for your patience,
Alexandru

On Thu, July 22, 2010 4:10 am, Ioan-Alexandru Lazar wrote:
> Hello everyone,
>
> First of all, let me apologize for my earlier message; I made the mistake
> of trying to indent my code using SquirrelMail's horrible interface -- and
> pressing Tab and Space resulted in sending my (incomplete) e-mail to the
> list. Cursed be Opera's keyboard shortcuts now :-).
>
> I'm currently planning to use a Python-based infrastructure for our HPC
> project. I've previously used NumPy and SciPy for basic scientific
> computing tasks, so performance hasn't been quite an issue for me until
> now. At the moment I'm not too sure as to what to do next though, and I
> was hoping that someone with more experience in performance-related
> issues could point me to a way out of this.
>
> The trouble lies in the following piece of code:
>
> ===
>     w = 2 * math.pi * f
>     M = A - (1j*w*E)
>     n = M.shape[1]
>     B1 = numpy.zeros(n)
>     B2 = numpy.zeros(n)
>     B1[n-2] = 1.0
>     B2[n-1] = 1.0
>     # slow part starts here
>     umfpack.numeric(M)
>     x1 = umfpack.solve( um.UMFPACK_A, M, B1, autoTranspose = False)
>     x2 = umfpack.solve( um.UMFPACK_A, M, B2, autoTranspose = False)
>     solution = scipy.array([ [ x1[n-2], x2[n-2] ], [ x1[n-1], x2[n-1] ]])
>     return solution
> ===
>
> This isn't really too much -- it's generating a system matrix via
> operations that take little time, as I was expecting. Trouble is, the
> solve part takes about 4 times longer than the same solve in Octave.
>
> I'm using the stock version of UMFPACK in Ubuntu's repository; it's
> compiled against standard BLAS, so it's fairly slow, but so is Octave --
> so the problem isn't there.
>
> I'm obviously doing something wrong related to memory management here,
> because the memory consumption is also rocketing, but I'm not sure what
> exactly it is that I'm doing wrong. Could you point me towards some
> relevant documentation describing what I could do in order to improve the
> performance, or give me some hint related to that?
>
> Best regards,
> Alexandru Lazar
>
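
For anyone comparing timings across machines, the following is a minimal sketch of the same computation with the factorization done once and reused for both right-hand sides. It is not the code from the message above: it swaps in SciPy's `splu` (SuperLU) for the UMFPACK wrapper, and the function name `two_port_solution` and its arguments are hypothetical. It assumes `A` and `E` are square matrices of the same shape (sparse or dense) and `f` is a frequency in Hz.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu


def two_port_solution(A, E, f):
    """Solve (A - 1j*w*E) x = b for the last two unit vectors.

    Hypothetical helper: uses SuperLU via splu, not UMFPACK.
    """
    w = 2 * np.pi * f
    # Convert to CSC once, up front; sparse direct solvers work on CSC,
    # so this avoids a hidden format conversion inside the solver call.
    M = sp.csc_matrix(A - 1j * w * E)
    n = M.shape[1]

    # Both right-hand sides as columns of one complex array, matching
    # the complex dtype of M.
    B = np.zeros((n, 2), dtype=complex)
    B[n - 2, 0] = 1.0
    B[n - 1, 1] = 1.0

    lu = splu(M)      # factorize once
    X = lu.solve(B)   # reuse the factors for both columns

    # Rows n-2 and n-1 of [x1, x2], the same 2x2 block as in the post.
    return X[n - 2:, :]
```

Timing the `splu` call and the `lu.solve` call separately on each machine should show whether the factorization or the triangular solves dominate, which helps tell a slow BLAS apart from a problem in the calling code.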