[AstroPy] SQL-like queries against astropy.table objects
Derek Homeier
derek at astro.physik.uni-goettingen.de
Sat Feb 8 12:28:27 EST 2014
Hi,
I know almost nothing about SQL, but I agree that for basic conditionals that can be used
on numpy arrays this really is straightforward.
On 08.02.2014, at 3:00PM, Eric Jensen <ejensen1 at swarthmore.edu> wrote:
> In [35]:
>
> # Select only entries with long baselines in U and W, and Real amplitudes > 1 Jy:
> good_rows = np.logical_and(t['U'] > 100, t['V'] > 100, t['Real'] > 1)
> In [42]:
>
> # Make a new table from this selection, and see how many rows were in
> # new table vs. the original one:
> new_table = t[good_rows]
> new_table['U'].shape
>
This can be written even more concise, e.g. in the OP's example
> For example, say I have a table with the following columns (not the use
> case, but an easy demo):
> JD, mag, mag_error
>
> I could find all data points with an SNR > 100 within a date range by:
> SELECT * from table WHERE (JD > value AND JD < value AND
> mag/mag_err > 100);
new_table = t[[('JD']>minval)&(t['JD']<maxval)&(t['mag']/t['mag_err']>100)]
print(len(t), len(new_table))
directly gives you the desired selection (which works the same with any numpy structured array.
Not very painful, even if it might get a bit lengthy to type. I don't think though apt.Table offers a more
comfortable way like e.g. pytables does with tbl.where('JD>minval & JD<maxval')…
I am sure there's more powerful possibilities in SQL that are not as easy to reproduce;
operating on string columns is a bit more inconvenient, since numpy does not offer a lot of operations
working on arrays of str type; in this case I think you have to resort to python loops, but in this case
astropy Tables add_column[s] helps a lot over plain numpy. I have not explored the grouping methods
either.
HTH,
Derek
More information about the AstroPy
mailing list