[Numpy-discussion] Shape of join_by result is not what I expected

Pierre GM pgmdevlist at gmail.com
Wed Feb 10 21:16:49 EST 2010


On Feb 10, 2010, at 7:26 PM, David Carmean wrote:
>> 
> 
> Got this to "work", but now it's revealed my lack of understanding of the shape 
> of arrays;  I'd hoped that the results would look like (be the same shape as?) 
> the column_stack results.

You're misunderstanding what structured arrays / recarrays are.
Imagine a structured array of N records with 3 fields A, B and C. The shape of this array is (N,), but each element of the array is a special numpy object (numpy.void) with three elements (named A, B and C). The basic differences from an array of shape (N,3), where you would expect the three columns to correspond to the fields A, B and C, are that:
(1): the types of the fields of a structured array are not necessarily homogeneous (you can have A int, B float and C string, e.g.) whereas for a standard array every column has the same type.
(2): the organization in memory is slightly different.
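
To make the difference concrete, here is a minimal sketch comparing the two kinds of array. The field names A, B, C follow the description above; the values and the "U1" string type are made up for illustration.

```python
import numpy as np

# Structured array: 2 records, 3 fields of different types (illustrative values).
rec = np.array([(1, 2.5, "x"), (2, 3.5, "y")],
               dtype=[("A", int), ("B", float), ("C", "U1")])

# Standard array: 2 rows, 3 columns, one common type for every column.
std = np.array([[1.0, 2.5, 0.0], [2.0, 3.5, 1.0]])

print(rec.shape)        # (2,)
print(std.shape)        # (2, 3)
print(type(rec[0]))     # <class 'numpy.void'>
print(rec.dtype.names)  # ('A', 'B', 'C')
```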

Anyway, you're working with functions that return structured arrays, not standard arrays, so you end up with a 1D structured array.


>  I wanted to be able to take slices of the 
> results.   

Quite doable, depending on how you wanna slice: if you wanna take, say, the 2nd to 5th entries, just use [1:5]: the result will be a structured array with the same fields as the original.
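
As a rough sketch of that kind of slice, assuming a structured array with made-up fields A, B, C filled with placeholder values:

```python
import numpy as np

# Placeholder structured array with 6 records; field names are illustrative.
rec = np.zeros(6, dtype=[("A", int), ("B", float), ("C", float)])
rec["A"] = np.arange(6)

# 2nd to 5th entries: still a 1D structured array with the same fields.
sub = rec[1:5]

print(sub.shape)        # (4,)
print(sub.dtype.names)  # ('A', 'B', 'C')
print(sub["A"])         # [1 2 3 4]
```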



> I created the original arrays from a list of tuples of the form
> 
>    [(1265184061, 0.02), (1265184121, 0.0), (1265184181, 0.31), ]
> 
> so the resulting arrays had the shape (n,2);

What function did you use to create this array? If you end up with a shape of (n,2), something probably went wrong and you're dealing with a standard array.
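
For illustration, here is how the same list of tuples can give either result, depending on whether a structured dtype is passed. This is only a guess at what may have happened; the field name "value" is an assumption, since the original post does not name the second column.

```python
import numpy as np

rows = [(1265184061, 0.02), (1265184121, 0.0), (1265184181, 0.31)]

# No dtype given: numpy builds a standard 2D float array.
plain = np.array(rows)

# Structured dtype given: each tuple becomes one record of a 1D array.
# "value" is a hypothetical field name.
structured = np.array(rows, dtype=[("tposix", "f8"), ("value", "f8")])

print(plain.shape, plain.dtype)   # (3, 2) float64
print(structured.shape)           # (3,)
print(structured.dtype.names)     # ('tposix', 'value')
```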


> these seemed easy to 
> manipulate by slicing, and my recollection was that this was a 
> useful format to feed mplotlib.plot.
> 
> The result looks like:
> 
>  array([ (1265184061.0, 0.0, 0.029999999999999999, 152.0, 1.5600000000000001, \
>    99.879999999999995, 0.02, 3.0, 0.0, 0.040000000000000001, 0.070000000000000007, \
>       0.68999999999999995),\
>    (1265184121.0, 0.0, 0.01, 148.0, 1.46, 99.950000000000003, 0.0, 0.0, 0.0, 0.01, \
> 	0.040000000000000001, 0.56000000000000005), ] )
> 
> with shape (n,)

Be more specific: what is the dtype?


> These 1-dimensional results give me nice text output, I can't/don't know
> how to slice them;

Well, once again, that depends on what you wanna do. Please be more specific.


>  this form may work for one of my use cases, but my
> main use case is to reprocess this data--which is for one server--by
> taking one field from about 60 servers worth of this data (saved to disk
> as binary pickles) and plot them all to a single canvas.
> 
> In other words, from sixty sets of this:
> 
>  tposix  	ldavg-15  ldavg-1  ldavg-5
>  1265184061.00    0.00   0.03    1.56
>  1265184121.00    0.00   0.01    1.46
>  1265184181.00    0.00   0.65    1.37
> 
> I need to collect and plot ldavg-1 as separate time-series plots.

You can access each field independently as yourarray["tposix"], yourarray["ldavg-15"], ....
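
A small sketch of field access, using the three rows and four field names from your table. The float dtype for every field is an assumption, and the column_stack line is just one possible way to get back a plain 2D array if that is the shape you want.

```python
import numpy as np

# Field names taken from the table above; "f8" for every field is assumed.
dtype = [("tposix", "f8"), ("ldavg-15", "f8"), ("ldavg-1", "f8"), ("ldavg-5", "f8")]
server = np.array([(1265184061.0, 0.00, 0.03, 1.56),
                   (1265184121.0, 0.00, 0.01, 1.46),
                   (1265184181.0, 0.00, 0.65, 1.37)], dtype=dtype)

# Each field comes back as an ordinary 1D array.
t = server["tposix"]
load1 = server["ldavg-1"]

# Optional: rebuild a standard (n, 4) array from the fields.
as_2d = np.column_stack([server[name] for name in server.dtype.names])

print(load1)        # [0.03 0.01 0.65]
print(as_2d.shape)  # (3, 4)
```

The pair (t, load1) is a pair of plain 1D arrays, so it can be handed to a plotting call once per server.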




