[Numpy-discussion] Profiling numpy ? (parts written in C)
John Hunter
jdhunter at ace.bsd.uchicago.edu
Tue Dec 19 23:28:58 EST 2006
>>>>> "David" == David Cournapeau <david at ar.media.kyoto-u.ac.jp> writes:
David> At the end, in the original context (speeding the drawing
David> of spectrogram), this is the problem. Even if multiple
David> backend/toolkits have obviously an impact in performances,
David> I really don't see why a numpy function to convert an array
David> to a RGB representation should be 10-20 times slower than
David> matlab on the same machine.
This isn't exactly right. When matplotlib converts a 2D grayscale
array to rgba, a lot goes on under the hood. It's all numpy, but it's
far from single function and it involves many passes through the
data. In principle, this could be done with one or two passes through
the data. In practice, our normalization and colormapping abstractions
are so abstract that it is difficult (though not impossible) to
special case and optimize.
The top-level routine is
def to_rgba(self, x, alpha=1.0):
'''Return a normalized rgba array corresponding to x.
If x is already an rgb or rgba array, return it unchanged.
'''
if hasattr(x, 'shape') and len(x.shape)>2: return x
x = ma.asarray(x)
x = self.norm(x)
x = self.cmap(x, alpha)
return x
which implies at a minimum two passes through the data, one for norm
and one for cmap.
In 99% of the use cases, cmap is a LinearSegmentedColormap though
users can define their own as long as it is callable. My guess is
that the expensive part is Colormap.__call__, the base class for
LinearSegmentedColormap. We could probably write some extension code
that does the following routine in one pass through the data. But it
would be hairy. In a quick look and rough count, I see about 10
passes through the data in the function below.
If you are interested in optimizing colormapping in mpl, I'd start
here. I suspect there may be some low hanging fruit.
def __call__(self, X, alpha=1.0):
"""
X is either a scalar or an array (of any dimension).
If scalar, a tuple of rgba values is returned, otherwise
an array with the new shape = oldshape+(4,). If the X-values
are integers, then they are used as indices into the array.
If they are floating point, then they must be in the
interval (0.0, 1.0).
Alpha must be a scalar.
"""
if not self._isinit: self._init()
alpha = min(alpha, 1.0) # alpha must be between 0 and 1
alpha = max(alpha, 0.0)
self._lut[:-3, -1] = alpha
mask_bad = None
if not iterable(X):
vtype = 'scalar'
xa = array([X])
else:
vtype = 'array'
xma = ma.asarray(X)
xa = xma.filled(0)
mask_bad = ma.getmask(xma)
if typecode(xa) in typecodes['Float']:
putmask(xa, xa==1.0, 0.9999999) #Treat 1.0 as slightly less than 1.
xa = (xa * self.N).astype(Int)
# Set the over-range indices before the under-range;
# otherwise the under-range values get converted to over-range.
putmask(xa, xa>self.N-1, self._i_over)
putmask(xa, xa<0, self._i_under)
if mask_bad is not None and mask_bad.shape == xa.shape:
putmask(xa, mask_bad, self._i_bad)
rgba = take(self._lut, xa)
if vtype == 'scalar':
rgba = tuple(rgba[0,:])
return rgba
David> I will take into account all those helpful messages, and
David> hopefully come with something for the end of the week :),
David> cheers
David> David _______________________________________________
David> Numpy-discussion mailing list Numpy-discussion at scipy.org
David> http://projects.scipy.org/mailman/listinfo/numpy-discussion
More information about the NumPy-Discussion
mailing list