[Numpy-discussion] Profiling numpy ? (parts written in C)

Tue Dec 19 23:28:58 EST 2006

>>>>> "David" == David Cournapeau <david at ar.media.kyoto-u.ac.jp> writes:
    David> At the end, in the original context (speeding the drawing
    David> of spectrogram), this is the problem. Even if multiple
    David> backend/toolkits have obviously an impact in performances,
    David> I really don't see why a numpy function to convert an array
    David> to a RGB representation should be 10-20 times slower than
    David> matlab on the same machine.

This isn't exactly right.  When matplotlib converts a 2D grayscale
array to rgba, a lot goes on under the hood.  It's all numpy, but it's
far from single function and it involves many passes through the
data.  In principle, this could be done with one or two passes through
the data.  In practice, our normalization and colormapping abstractions
are so abstract that it is difficult (though not impossible) to
special case and optimize.

The top-level routine is

    def to_rgba(self, x, alpha=1.0):
        '''Return a normalized rgba array corresponding to x.
        If x is already an rgb or rgba array, return it unchanged.
        '''
        if hasattr(x, 'shape') and len(x.shape)>2: return x
        x = ma.asarray(x)
        x = self.norm(x)
        x = self.cmap(x, alpha)
        return x

which implies at a minimum two passes through the data, one for norm
and one for cmap.

In 99% of the use cases, cmap is a LinearSegmentedColormap though
users can define their own as long as it is callable.  My guess is
that the expensive part is Colormap.__call__, the base class for
LinearSegmentedColormap.  We could probably write some extension code
that does the following routine in one pass through the data.  But it
would be hairy.  In a quick look and rough count, I see about 10
passes through the data in the function below.

If you are interested in optimizing colormapping in mpl, I'd start
here.  I suspect there may be some low hanging fruit.

    def __call__(self, X, alpha=1.0):
        """
        X is either a scalar or an array (of any dimension).
        If scalar, a tuple of rgba values is returned, otherwise
        an array with the new shape = oldshape+(4,). If the X-values
        are integers, then they are used as indices into the array.
        If they are floating point, then they must be in the
        interval (0.0, 1.0).
        Alpha must be a scalar.
        """
        if not self._isinit: self._init()
        alpha = min(alpha, 1.0) # alpha must be between 0 and 1
        alpha = max(alpha, 0.0)
        self._lut[:-3, -1] = alpha
        mask_bad = None
        if not iterable(X):
            vtype = 'scalar'
            xa = array([X])
        else:
            vtype = 'array'
            xma = ma.asarray(X)
            xa = xma.filled(0)
            mask_bad = ma.getmask(xma)
        if typecode(xa) in typecodes['Float']:
            putmask(xa, xa==1.0, 0.9999999) #Treat 1.0 as slightly less than 1.
            xa = (xa * self.N).astype(Int)
        # Set the over-range indices before the under-range;
        # otherwise the under-range values get converted to over-range.
        putmask(xa, xa>self.N-1, self._i_over)
        putmask(xa, xa<0, self._i_under)
        if mask_bad is not None and mask_bad.shape == xa.shape:
            putmask(xa, mask_bad, self._i_bad)
        rgba = take(self._lut, xa)
        if vtype == 'scalar':
            rgba = tuple(rgba[0,:])
        return rgba

    David> I will take into account all those helpful messages, and
    David> hopefully come with something for the end of the week :),

    David> cheers

    David> David _______________________________________________
    David> Numpy-discussion mailing list Numpy-discussion at scipy.org
    David> http://projects.scipy.org/mailman/listinfo/numpy-discussion