[AstroPy] Advice on reading FITS file with many HDUs

David Kirkby dkirkby at uci.edu
Mon Jan 19 13:56:34 EST 2015


Hi Kyle,

Thanks for the suggestion.  It turns out that fitsio is significantly
faster at reading my files with many small HDUs, but is very slow at
writing them (details at https://github.com/esheldon/fitsio/issues/32).
Fortunately, I can have the best of both worlds and won't have to abandon
FITS files for this application.

David

On Sat Jan 17 2015 at 8:22:00 PM Kyle Barbary <kylebarbary at gmail.com> wrote:

> Hi David,
>
> Erin Sheldon’s fitsio <https://github.com/esheldon/fitsio/> seems to do a
> bit better:
>
> In [1]: import fitsio
>
> In [2]: f = fitsio.FITS("LSST_i_descwl.fits");
>
> In [3]: f[0]
> Out[3]:
>
>   file: LSST_i_descwl.fits
>   extension: 0
>   type: IMAGE_HDU
>   image info:
>     data type: f4
>     dims: [4096,4096]
>
> Line 2 (opening the file) is nearly instantaneous, Line 3 (getting the
> info for the first HDU) takes maybe a second. Other HDU access after that
> (even near the end of the file) seems nearly instantaneous as well.
>
> Best,
> Kyle
>
> On Sat, Jan 17, 2015 at 7:45 PM, David Kirkby <dkirkby at uci.edu> wrote:
>
>> Thanks for the feedback, Perry. Yes, point taken about asking too much
>> from FITS.  I was hoping there might be a simple fix, but my plan B is to
>> save the 45K-2 HDUs in an HDF5 file instead. That's a bit more hassle but
>> probably worth it.
>>
>> David
>>
>> On Sat Jan 17 2015 at 6:33:32 PM Perry Greenfield <stsci.perry at gmail.com>
>> wrote:
>>
>>> My first reaction is that you really are asking a lot of the fits module
>>> (45K headers!). If it was only to read the first two, we could consider
>>> some option to that module not to locate all headers in the file, but just
>>> the first N, but asking for random access basically requires reading most
>>> of them anyway.
>>>
>>> My second reaction is this is a poor use of FITS. Isn't there something
>>> more efficient you could do with storing the information in a FITS file? I
>>> know, I know, someone else decided this (usually) and you don't have any
>>> control over that. Still, it makes me wonder seeing things like FITS taken
>>> to this extreme level.
>>>
>>> Really, this seems a better job for CFITSIO (I'd be curious to see how
>>> much faster it is though since it does have to run through most of the file
>>> to find the header locations as well, but will avoid the overhead of
>>> creating 45K objects).
>>>
>>> Cheers, Perry
>>>
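Perry's point above, that finding HDU N means walking every header before it, can be illustrated with a minimal stdlib-only sketch. It is not how astropy, fitsio, or CFITSIO are implemented. The helper names (`card`, `toy_hdu`, `read_header`, `data_nbytes`, `scan_hdu_offsets`) and the `max_hdus` option are hypothetical, and only simple image-like HDUs with integer sizing keywords are handled (no random groups). It relies on two facts from the FITS standard: files are laid out in 2880-byte blocks, and each header records the size of the data that follows it.

```python
import io

BLOCK = 2880  # FITS files are organized in 2880-byte blocks
CARD = 80     # each header card is 80 ASCII characters


def card(key, value):
    """Format one fixed-format header card (hypothetical helper for toy data)."""
    return f"{key:<8}= {value:>20}".ljust(CARD)


def toy_hdu(first_card, nbytes):
    """Build one HDU holding `nbytes` zero-valued 8-bit pixels (test data only)."""
    cards = [first_card, card("BITPIX", 8), card("NAXIS", 1), card("NAXIS1", nbytes)]
    if first_card.startswith("XTENSION"):
        cards += [card("PCOUNT", 0), card("GCOUNT", 1)]
    cards.append("END")
    header = "".join(c.ljust(CARD) for c in cards).ljust(BLOCK).encode("ascii")
    data = bytes(-(-nbytes // BLOCK) * BLOCK)  # zero-filled, padded to whole blocks
    return header + data


def read_header(stream):
    """Read one header; return its integer-valued keywords, or None at end of file."""
    cards = {}
    while True:
        block = stream.read(BLOCK)
        if len(block) < BLOCK:
            return None  # end of file (or a truncated file)
        for i in range(0, BLOCK, CARD):
            text = block[i:i + CARD].decode("ascii")
            key = text[:8].strip()
            if key == "END":
                return cards
            if text[8:10] == "= ":
                value = text[10:].split("/")[0].strip()
                try:
                    cards[key] = int(value)
                except ValueError:
                    pass  # non-integer values are not needed for sizing


def data_nbytes(cards):
    """Size of the data unit that follows a header, rounded up to whole blocks."""
    naxis = cards.get("NAXIS", 0)
    if naxis == 0:
        return 0
    npix = 1
    for n in range(1, naxis + 1):
        npix *= cards[f"NAXIS{n}"]
    nbytes = (abs(cards["BITPIX"]) // 8) * cards.get("GCOUNT", 1) * (
        cards.get("PCOUNT", 0) + npix)
    return -(-nbytes // BLOCK) * BLOCK


def scan_hdu_offsets(stream, max_hdus=None):
    """Return the byte offset of each header, stopping early after max_hdus."""
    offsets = []
    stream.seek(0)
    while max_hdus is None or len(offsets) < max_hdus:
        start = stream.tell()
        cards = read_header(stream)
        if cards is None:
            break
        offsets.append(start)
        stream.seek(data_nbytes(cards), io.SEEK_CUR)  # skip data without reading it
    return offsets


# Toy file: a primary HDU followed by three small image extensions.
toy_file = toy_hdu(card("SIMPLE", "T"), 5000) + b"".join(
    toy_hdu("XTENSION= 'IMAGE   '", 100) for _ in range(3))
print(scan_hdu_offsets(io.BytesIO(toy_file), max_hdus=2))  # stops after two headers
print(len(scan_hdu_offsets(io.BytesIO(toy_file))))         # full scan visits all four
```

With `max_hdus=2` the scan touches only the first two headers, which is the "first N" option Perry mentions. Reaching an arbitrary later HDU still requires reading every header in front of it, although the data units themselves can be skipped with a seek.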
>>> On Jan 17, 2015, at 8:29 PM, David Kirkby wrote:
>>>
>>> > I am reading a ~700 MB file containing ~45K HDUs and looking for
>>> > advice on how to speed things up. I typically only want to read the first 2
>>> > HDUs right after opening the file, and then a small number of randomly
>>> > selected HDUs while my program runs. The first 2 HDUs are the largest, but
>>> > still only represent about 10% of the total file size.
>>> >
>>> > The following command takes about 42 seconds:
>>> >
>>> >   from astropy.io import fits
>>> >   fits.open('LSST_i.fits', mode='readonly', memmap=False)
>>> >
>>> > Changing to memmap=True makes no difference but leads to "Too many open
>>> > files" if I try to read too many HDUs after opening the file.
>>> >
>>> > Is this what I should expect? Any suggestions for opening the file and
>>> > reading a small number of HDUs faster? If necessary, I can change the
>>> > format of the file I am reading.
>>> >
>>> > In case it helps, the file I am testing with can be downloaded from:
>>> >
>>> >   http://srs.slac.stanford.edu/DataCatalog/?experiment=LSST-DESC&folder=12915647
>>> >
>>> > thanks,
>>> > David
>>> > _______________________________________________
>>> > AstroPy mailing list
>>> > AstroPy at scipy.org
>>> > http://mail.scipy.org/mailman/listinfo/astropy