[Numpy-discussion] Shape of join_by result is not what I expected
Pierre GM
pgmdevlist at gmail.com
Wed Feb 10 21:16:49 EST 2010
On Feb 10, 2010, at 7:26 PM, David Carmean wrote:
>>
>
> Got this to "work", but now it's revealed my lack of understanding of the shape
> of arrays; I'd hoped that the results would look like (be the same shape as?)
> the column_stack results.
You're misunderstanding what structured arrays / recarrays are.
Imagine a structured array of N records with 3 fields A, B and C. The shape of this array is (N,), but each element of the array is a special numpy object (numpy.void) with three elements (named A, B and C). The basic differences from an array of shape (N,3), where you would expect the three columns to correspond to the fields A, B and C, are that:
(1): the types of the fields of a structured array are not necessarily homogeneous (you can have A int, B float and C string, e.g.), whereas in a standard array every column has the same type.
(2): the organization in memory is slightly different.
Anyway, you're working with functions that return structured arrays, not standard arrays, so you end up with a 1D structured array.
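To make the difference concrete, here is a minimal sketch. The field names A, B and C follow the description above; the values and the exact dtypes are made up for illustration.

```python
import numpy as np

# Structured array: three named fields with different types.
# The values are illustrative only.
records = np.array(
    [(1, 0.5, "x"), (2, 1.5, "y"), (3, 2.5, "z")],
    dtype=[("A", "i4"), ("B", "f8"), ("C", "U1")],
)

# Standard 2D array: every column shares a single dtype.
plain = np.array([[1.0, 0.5, 10.0], [2.0, 1.5, 20.0], [3.0, 2.5, 30.0]])

print(records.shape)        # 1D: one entry per record
print(plain.shape)          # 2D: rows by columns
print(type(records[0]))     # a single record is a numpy.void
print(records.dtype.names)  # the field names
```

The structured array reports one dimension because the fields live in the dtype, not in a second axis.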
> I wanted to be able to take slices of the
> results.
Quite doable, depending on how you wanna slice: if you wanna take, say, the 2nd to 5th entries, just use [1:5]: the result will be a structured array with the same fields as the original.
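A short sketch of that kind of slice, assuming a hypothetical two-field array (the field names "tposix" and "value" and the data are invented for the example):

```python
import numpy as np

# Six made-up records: a POSIX timestamp and a measurement.
data = np.array(
    [(1265184061 + 60 * i, 0.1 * i) for i in range(6)],
    dtype=[("tposix", "i8"), ("value", "f8")],
)

# Entries 2 to 5 (indices 1, 2, 3, 4).
subset = data[1:5]

print(subset.shape)        # still 1D, now with four records
print(subset.dtype.names)  # same fields as the original
print(subset["value"])     # one field of the sliced records
```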
> I created the original arrays from a list of tuples of the form
>
> [(1265184061, 0.02), (1265184121, 0.0), (1265184181, 0.31), ]
>
> so the resulting arrays had the shape (n,2);
What function did you use to create this array? If you end up with shape (n,2), something probably went wrong and you're dealing with a standard array.
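For instance, the same list of tuples gives either kind of array depending on whether a structured dtype is passed. This is only a sketch of one possible cause; the field names "tposix" and "value" are assumptions, not something taken from your code.

```python
import numpy as np

samples = [(1265184061, 0.02), (1265184121, 0.0), (1265184181, 0.31)]

# No dtype given: numpy builds a standard 2D float array.
plain = np.array(samples)

# Structured dtype given: each tuple becomes one record.
# The field names here are hypothetical.
structured = np.array(samples, dtype=[("tposix", "f8"), ("value", "f8")])

print(plain.shape, plain.dtype)
print(structured.shape, structured.dtype.names)
```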
> these seemed easy to
> manipulate by slicing, and my recollection was that this was a
> useful format to feed mplotlib.plot.
>
> The result looks like:
>
> array([ (1265184061.0, 0.0, 0.029999999999999999, 152.0, 1.5600000000000001, \
> 99.879999999999995, 0.02, 3.0, 0.0, 0.040000000000000001, 0.070000000000000007, \
> 0.68999999999999995),\
> (1265184121.0, 0.0, 0.01, 148.0, 1.46, 99.950000000000003, 0.0, 0.0, 0.0, 0.01, \
> 0.040000000000000001, 0.56000000000000005), ] )
>
> with shape (n,)
Please be more specific: what is the dtype?
> These 1-dimensional results give me nice text output, I can't/don't know
> how to slice them;
Well, once again, that depends what you wanna do. Please be more specific.
> this form may work for one of my use cases, but my
> main use case is to reprocess this data--which is for one server--by
> taking one field from about 60 servers worth of this data (saved to disk
> as binary pickles) and plot them all to a single canvas.
>
> In other words, from sixty sets of this:
>
> tposix ldavg-15 ldavg-1 ldavg-5
> 1265184061.00 0.00 0.03 1.56
> 1265184121.00 0.00 0.01 1.46
> 1265184181.00 0.00 0.65 1.37
>
> I need to collect and plot ldavg-1 as separate time-series plots.
You can access each field independently as yourarray["tposix"], yourarray["ldavg-15"], ...
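A rough sketch of how that could look for several servers, using the three sample rows quoted above. The server names are hypothetical, both entries reuse the same rows for brevity, and how the arrays are loaded from disk is left out.

```python
import numpy as np

dtype = [("tposix", "f8"), ("ldavg-15", "f8"), ("ldavg-1", "f8"), ("ldavg-5", "f8")]
rows = [
    (1265184061.0, 0.00, 0.03, 1.56),
    (1265184121.0, 0.00, 0.01, 1.46),
    (1265184181.0, 0.00, 0.65, 1.37),
]

# Hypothetical server names; real data would come from one array per server.
servers = {
    "server-a": np.array(rows, dtype=dtype),
    "server-b": np.array(rows, dtype=dtype),
}

# Pull out time and ldavg-1 for each server as a standard (n, 2) array.
series = {}
for name, arr in servers.items():
    series[name] = np.column_stack((arr["tposix"], arr["ldavg-1"]))

print(series["server-a"].shape)
print(series["server-a"][:, 1])  # the ldavg-1 values
```

Each entry of series then has the (n, 2) layout you were expecting, and its two columns can be handed to a plotting function as x and y.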