[Numpy-discussion] Re : [Newbie] Fast plotting

John Hunter jdh2358 at gmail.com
Tue Jan 6 09:34:02 EST 2009


On Tue, Jan 6, 2009 at 7:38 AM, Jean-Baptiste Rudant
<boogaloojb at yahoo.fr> wrote:
> Hello,
> I'm not an expert. Something exists in matplotlib, but it's not very
> efficient.
> import matplotlib.mlab
> import numpy
> N = 1000
> X  = numpy.random.randint(0, 10, N)
> Y = numpy.random.random(N)
> recXY = numpy.rec.fromarrays((X, Y), names='x, y')
> summary = matplotlib.mlab.rec_groupby(
>     recXY, ('x',), (('y', numpy.mean, 'y_avg'),))

And you can use rec2txt for pretty printing in the shell:

In [103]: print(matplotlib.mlab.rec2txt(summary))
   x   y_avg
   0   0.506
   1   0.531
   2   0.491
   3   0.482
   4   0.511
   5   0.507
   6   0.543
   7   0.525
   8   0.512
   9   0.472
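
For arrays with millions of points, a NumPy-only route may be worth trying as well. What follows is a minimal sketch rather than anything shipped with matplotlib: the function name group_stats is made up for illustration, and it assumes X and Y are non-empty 1-D arrays of the same length. It uses only numpy.unique, numpy.bincount and the reduceat method of ufuncs. The standard deviation is the population one (ddof=0), computed from sums of squares, which can lose precision when the spread is tiny compared to the mean.

```python
import numpy as np

def group_stats(x, y):
    """Per-distinct-x mean, std, min and max of y, without a Python loop.

    Hypothetical helper for illustration; assumes non-empty 1-D inputs.
    """
    x = np.asarray(x)
    y = np.asarray(y, dtype=float)
    # keys: sorted distinct x values; inverse: group index of every point
    keys, inverse = np.unique(x, return_inverse=True)
    inverse = inverse.ravel()
    # Number of points and sum of y in each group
    counts = np.bincount(inverse, minlength=keys.size)
    sums = np.bincount(inverse, weights=y, minlength=keys.size)
    mean = sums / counts
    # Population variance as E[y^2] - E[y]^2, clipped at 0 against round-off
    sq_sums = np.bincount(inverse, weights=y * y, minlength=keys.size)
    std = np.sqrt(np.maximum(sq_sums / counts - mean ** 2, 0.0))
    # For min/max, sort points by group so each group is a contiguous run,
    # then reduce over each run starting at the offsets in `starts`
    order = np.argsort(inverse, kind="stable")
    starts = np.concatenate(([0], np.cumsum(counts)[:-1]))
    y_sorted = y[order]
    y_min = np.minimum.reduceat(y_sorted, starts)
    y_max = np.maximum.reduceat(y_sorted, starts)
    return keys, mean, std, y_min, y_max
```

With the X and Y from the example above, new_x, new_y, y_std, y_min, y_max = group_stats(X, Y) should give arrays that can be passed straight to a plotting call. Sorting once and reusing the order for both min and max keeps the whole thing at a handful of passes over the data.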


> Jean-Baptiste Rudant
>
> ________________________________
> From: Franck Pommereau <pommereau at univ-paris12.fr>
> To: Discussion of Numerical Python <numpy-discussion at scipy.org>
> Sent: Tuesday, 6 January 2009, 10:35:01
> Subject: [Numpy-discussion] [Newbie] Fast plotting
>
> Hi all, and happy new year!
>
> I'm new to NumPy and looking for a way to compute, from a set of points
> (x, y), the mean of the y values associated with each distinct x value.
> Each point corresponds to a measurement in a benchmark (x = parameter, y =
> computation time), and I'd like to plot the mean computation time
> against the parameter values. (I know how to plot, but not how to compute
> the mean values.)
>
> My points are stored as two arrays X, Y (same size).
> In pure Python, I'd do as follows:
>
> from numpy import array
>
> s = {}  # sum of y values for each distinct x (as keys)
> n = {}  # number of summed values (same keys)
> for x, y in zip(X, Y):
>     s[x] = s.get(x, 0.0) + y
>     n[x] = n.get(x, 0) + 1
> new_x = array(sorted(s))
> new_y = array([s[x] / n[x] for x in sorted(s)])
>
> Unfortunately, this code is much too slow because my arrays have
> millions of elements. But I'm pretty sure that NumPy offers a way to
> handle this more elegantly and much faster.
>
> As a bonus, I'd be happy if the solution also let me compute the
> standard deviation, min, max, etc.
>
> Thanks in advance for any help!
> Franck
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion