[AstroPy] writing FITS files fast

Erik Bray embray at stsci.edu
Mon Apr 20 00:29:27 EDT 2015


On 4/19/2015 11:40 PM, Ivan Zolotukhin wrote:
> Hi,
>
> I need to write a FITS file from a web application in Python, and I
> need to do it fast. When I write the file using the astropy.io.fits
> module (v0.3.x), it takes several seconds for a table with 300+ columns
> and only 20 rows. Here's the hotshot profiler output of the relevant
> piece of code:

It *probably* need not be that slow, but it would help to see your actual
code.  For example, the output below shows a *lot* of calls to, and a lot of
time spent in, column.py's __getattr__ and __getitem__ and in _is_int, all of
which can *probably* be avoided.  And as you found, the most recent version
should be significantly faster: v0.3.x is very old by comparison, and there
have been a lot of improvements since then.
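
For anyone who wants to produce a comparable report on a current Python:
hotshot no longer exists in Python 3, but cProfile plus pstats prints the same
kind of table.  A minimal sketch follows; build_table is a hypothetical
stand-in for the code that builds and writes the table, not anything from the
original application.

```python
import cProfile
import io
import pstats


def build_table(n_rows=20, n_cols=300):
    # Hypothetical stand-in for the code that builds and writes the table.
    return [[r * c for c in range(n_cols)] for r in range(n_rows)]


# Profile only the call of interest.
profiler = cProfile.Profile()
profiler.enable()
table = build_table()
profiler.disable()

# Sort by internal time, then call count, and keep the top 20 entries,
# matching the ordering and restriction used in the quoted reports.
out = io.StringIO()
stats = pstats.Stats(profiler, stream=out)
stats.sort_stats("time", "calls").print_stats(20)
report = out.getvalue()
print(report)
```

The tottime column is the time spent inside a function itself, and cumtime
includes everything it calls, so a function with a huge ncalls and a large
tottime is the first place to look.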

I also agree that fitsio should give you the best bang for your buck in most
cases for raw I/O, and it should support 64-bit ints just fine.  That said,
most of the time you're seeing in the astropy.io.fits code is not I/O related
at all; it is overhead from creating the columns in the first place, and is
probably mostly avoidable.

Erik

>           8506345 function calls (8480504 primitive calls) in 18.801 seconds
>
>     Ordered by: internal time, call count
>     List reduced from 2343 to 20 due to restriction <20>
>
>     ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>      10635    5.409    0.001   13.575    0.001 /usr/local/lib/python2.7/dist-packages/astropy/io/fits/column.py:953(__getattr__)
>    3660493    4.787    0.000    8.172    0.000 /usr/local/lib/python2.7/dist-packages/astropy/io/fits/column.py:996(__getitem__)
>    3650946    3.386    0.000    3.386    0.000 /usr/local/lib/python2.7/dist-packages/astropy/io/fits/util.py:589(_is_int)
>       2099    0.393    0.000    0.393    0.000 /usr/local/lib/python2.7/dist-packages/astropy/io/fits/header.py:1779(_updateindices)
>
> For the more recent astropy version 1.0.2 the situation is somewhat
> better, but it's still significantly slower than I need:
>
>           1153615 function calls (1138771 primitive calls) in 4.364 seconds
>
>     Ordered by: internal time, call count
>     List reduced from 736 to 20 due to restriction <20>
>
>     ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>       2099    0.419    0.000    0.425    0.000 /usr/local/lib/python2.7/dist-packages/astropy/io/fits/header.py:1808(_updateindices)
>        346    0.312    0.001    0.852    0.002 /usr/local/lib/python2.7/dist-packages/astropy/io/fits/column.py:1245(__getattr__)
>     124176    0.275    0.000    0.406    0.000 /usr/local/lib/python2.7/dist-packages/astropy/io/fits/column.py:1285(__getitem__)
>        327    0.215    0.001    0.222    0.001 /usr/local/lib/python2.7/dist-packages/django/db/backends/utils.py:58(execute)
>        357    0.201    0.001    0.293    0.001 /usr/local/lib/python2.7/dist-packages/astropy/io/fits/header.py:1399(index)
>     137557    0.183    0.000    0.183    0.000 /usr/local/lib/python2.7/dist-packages/astropy/io/fits/column.py:421(__get__)
>     124226    0.132    0.000    0.132    0.000 /usr/local/lib/python2.7/dist-packages/astropy/io/fits/util.py:906(_is_int)
>
> My code is very simple, however: it just creates the necessary set of
> columns (astropy.table.MaskedColumn objects), casting data from Python
> lists coming from the database to numpy arrays. Are there ways to skip
> checks like _is_int() when the input array datatype has already been
> enforced?
>
> When the code is rewritten to use the fitsio module
> (https://pypi.python.org/pypi/fitsio/), the execution time is
> significantly better:
>
>           129409 function calls (128693 primitive calls) in 0.811 seconds
>
>     Ordered by: internal time, call count
>     List reduced from 289 to 20 due to restriction <20>
>
>     ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>        327    0.218    0.001    0.221    0.001 /usr/local/lib/python2.7/dist-packages/django/db/backends/utils.py:58(execute)
>          4    0.083    0.021    0.083    0.021 /usr/local/lib/python2.7/dist-packages/fitsio/fitslib.py:1311(_update_info)
>
> ...but fitsio does not seem to support 64-bit integers, which is
> another requirement I have.
>
> What's the fastest solution for writing FITS files that's available on
> the Python market? Am I missing something in the astropy
> configuration?
>
> --
> With best regards,
>   Ivan



