[AstroPy] SQL-like queries against astropy.table objects

Derek Homeier derek at astro.physik.uni-goettingen.de
Sat Feb 8 12:28:27 EST 2014


Hi,

I know almost nothing about SQL, but I agree that for basic conditionals that can be used
on numpy arrays this really is straightforward.

On 08.02.2014, at 3:00PM, Eric Jensen <ejensen1 at swarthmore.edu> wrote:

> In [35]:
> 
> # Select only entries with long baselines in U and W, and Real amplitudes > 1 Jy:
> good_rows = np.logical_and(t['U'] > 100, t['V'] > 100, t['Real'] > 1)
> In [42]:
> 
> # Make a new table from this selection, and see how many rows were in 
> # new table vs. the original one:
> new_table = t[good_rows]
> new_table['U'].shape
> 
This can be written even more concise, e.g. in the OP's example 

> For example, say I have a table with the following columns (not the use 
> case, but an easy demo):
> JD, mag, mag_error
> 
> I could find all data points with an SNR > 100 within a date range by:
>     SELECT * from table WHERE (JD > value AND JD < value AND 
> mag/mag_err > 100);

new_table = t[[('JD']>minval)&(t['JD']<maxval)&(t['mag']/t['mag_err']>100)]
print(len(t), len(new_table))

directly gives you the desired selection (which works the same with any numpy structured array.
Not very painful, even if it might get a bit lengthy to type. I don't think though apt.Table offers a more
comfortable way like e.g. pytables does with tbl.where('JD>minval & JD<maxval')…

I am sure there's more powerful possibilities in SQL that are not as easy to reproduce;
operating on string columns is a bit more inconvenient, since numpy does not offer a lot of operations
working on arrays of str type; in this case I think  you have to resort to python loops, but in this case
astropy Tables add_column[s] helps a lot over plain numpy. I have not explored the grouping methods
either.

HTH,
 						Derek





More information about the AstroPy mailing list