From jpivarski at gmail.com Wed Oct 7 15:59:14 2020
From: jpivarski at gmail.com (Jim Pivarski)
Date: Wed, 7 Oct 2020 14:59:14 -0500
Subject: [AstroPy] Projects involving irregularly shaped data
Message-ID:

Hi everyone,

Adrian Price-Whelan recommended that I ask my question here, since it would reach a greater number of people involved in astronomical software.

I'm a developer of Awkward Array, a Python package for manipulating large, irregularly shaped datasets: arrays with variable-length lists, nested records, missing values, or mixed data types. The interface is a strict generalization of NumPy: you can slice jagged arrays as though they were ordinary multidimensional arrays, and there are new functions that only make sense in the context of irregular data. Like NumPy, the actual calculations are precompiled loops on internally homogeneous arrays, and we're expanding it to include GPUs transparently (irregular data on GPUs in a NumPy-like syntax).

This package was developed for particle physics (variable numbers of particles emerging from an array of collision events), but it seems like these problems would exist in other fields as well. Right now, we're working on a proposal to find data analysis projects that need to deal with large, irregularly structured data to see if Awkward Array is applicable and if it can be made more useful for them. Ideally, this would motivate more interoperability with other scientific Python libraries. (We can already use Awkward Arrays in Numba; we're working on cuDF, Dask, and Zarr. Adrian also recommended ASDF, which I'm looking into now.)

Does anyone have or know about a data analysis project that is currently limited by this combination of large + irregular data? Is anyone interested in collaborating?

Thank you!
-- Jim

From erik.m.bray at gmail.com Wed Oct 7 16:19:34 2020
From: erik.m.bray at gmail.com (E. Madison Bray)
Date: Wed, 7 Oct 2020 22:19:34 +0200
Subject: [AstroPy] Projects involving irregularly shaped data
In-Reply-To:
References:
Message-ID:

Hi Jim,

Awkward Array looks, well, awesome. Thanks for pointing it out. (By the way, we met a couple years ago when you came to a workshop in France, hi!)

Besides ASDF which you already mentioned it might also be useful for dealing with some more awkward FITS files too, but I'm not sure. I'll have to give it more careful scrutiny.

I also have one non-astro/physics related project where this could be useful. In this case it's a machine learning application where I have binary matrices of 0s and 1s but of potentially different sizes, but they can be batched together when doing mini-batch gradient descent. In this case I just mask out the margins with -1s, which the models then have to account for in their evaluation. Do you know if anyone has tried adapting Awkward Array for use with PyTorch?

From karl.kosack at cea.fr Thu Oct 8 10:02:21 2020
From: karl.kosack at cea.fr (KOSACK Karl)
Date: Thu, 8 Oct 2020 16:02:21 +0200
Subject: [AstroPy] Projects involving irregularly shaped data
In-Reply-To:
References:
Message-ID: <9852E7B1-E45F-4BAF-ADF9-4BBBCAE8898B@cea.fr>

Hi Jim,

I actually do have a good application for this: Very-High-Energy (VHE) gamma-ray astronomy using Imaging Atmospheric Cherenkov Telescopes (IACTs). I'm the data processing coordinator for CTA (the next generation IACT) as well as lead developer of a python-based low-level IACT reconstruction package (ctapipe), though this response is not made in any official capacity.

In our field, we have nearly the same problem as in particle physics: we measure gamma rays by looking at air-showers produced in the atmosphere using an array of highly sensitive optical telescopes, whose cameras are very similar to particle physics detectors.
So for each detected "event" (which could be a gamma ray or cosmic ray), we read out a sparse data block consisting of N sequences of images of the shower taken at ns time resolution (where N is around 1-100), and for each image sequence we only store the readout of pixels that have a signal or are close to those that do (so M_pixels is also a variable-length array). So everything is variable-length record arrays of variable-length arrays. Once data are fully processed, the final result looks more similar to "traditional" astronomy: gamma-ray sky images, spectra, light curves etc, so most of this is hidden to the end-user.

Due to this complexity, we've so far had to process raw data "event-by-event", and "telescope-by-telescope" at least at the first stages of analysis, and have had to make a series of complex data structures and loops to handle it all. We use numpy and numba heavily at the lowest-levels (avoiding loops over pixels and time-slices), but not for the event or telescope loops. Storage of the data is also somewhat complex, as we have to break it into flat tables to avoid slowness introduced by variable-length arrays in most storage formats like HDF5 or FITS, and to support HPC optimization. Also, this is "big" data, meaning that we will generate and process about 10 PB of real data per year, and a similar volume of simulated data, so use of parallel processing is critical, machine-learning is necessary, and even GPUs and other HPC methods are interesting.

So the point of all this is: In an ideal world, we could easily apply algorithms to all events and telescopes at once (or at least as many as can fit into memory), and that requires something like awkward-array.

I've followed with interest the evolution of awkward array, but so far we have not used it due to a few factors:
1. it didn't exist when we started development, and
2. it wasn't yet stable enough to be the core data structure of our whole framework.
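
For readers who have not met this layout before, the nesting described above (events containing telescopes containing pixels, each level of variable length) can be held in flat buffers plus offset arrays, which is the general idea behind columnar jagged-array libraries. What follows is only a rough pure-Python sketch under that assumption. It is not ctapipe or Awkward Array code, and the toy values and the names `flatten` and `pixels_per_event_sum` are made up for illustration.

```python
from itertools import accumulate

# Hypothetical toy data: for each event, one list of pixel readouts per
# triggered telescope. Both nesting levels have variable length.
events = [
    [[1.5, 2.0, 0.7], [3.1]],                    # event 0: 2 telescopes
    [],                                          # event 1: nothing triggered
    [[0.2, 0.4], [5.0, 6.0, 7.0, 8.0], [9.9]],   # event 2: 3 telescopes
]


def flatten(nested):
    """Split a list of lists into (offsets, content).

    Sublist i of the input is content[offsets[i]:offsets[i + 1]].
    """
    offsets = [0] + list(accumulate(len(sub) for sub in nested))
    content = [item for sub in nested for item in sub]
    return offsets, content


# Level 1: events -> telescopes. Level 2: telescopes -> pixels.
event_offsets, telescopes = flatten(events)
telescope_offsets, pixels = flatten(telescopes)


def pixels_per_event_sum(event_offsets, telescope_offsets, pixels):
    """Sum all pixel values in each event using only the flat buffers."""
    totals = []
    for i in range(len(event_offsets) - 1):
        # Range of telescopes belonging to event i.
        first_tel, last_tel = event_offsets[i], event_offsets[i + 1]
        # Range of pixels covered by those telescopes (contiguous).
        start, stop = telescope_offsets[first_tel], telescope_offsets[last_tel]
        totals.append(sum(pixels[start:stop]))
    return totals


totals = pixels_per_event_sum(event_offsets, telescope_offsets, pixels)
print(event_offsets)      # boundaries of each event in the telescope list
print(telescope_offsets)  # boundaries of each telescope in the pixel list
print(totals)             # one number per event
```

Because `pixels` is a single homogeneous buffer, a per-event reduction touches one contiguous slice per event instead of walking nested Python objects. A compiled library could run the same loop over typed arrays, and the flat buffers map naturally onto flat tables on disk.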
However, I think it's a really interesting technology to consider for a future refactoring. Would be happy to discuss more offline.

Karl

--
Karl Kosack
CEA Saclay / CTA Observatory
https://www.cta-observatory.org/

From P.Wortmann at skatelescope.org Fri Oct 9 10:29:23 2020
From: P.Wortmann at skatelescope.org (Wortmann, Peter)
Date: Fri, 9 Oct 2020 14:29:23 +0000
Subject: [AstroPy] Projects involving irregularly shaped data
References:
Message-ID: <0807bbace491499380d21d803fe13f20@exchange.ad.skatelescope.org>

Hi Jim

Another possible use case you might want to be aware of - for the Square Kilometre Array (https://www.skatelescope.org/) we are currently in the early stages of evaluating concrete technologies for dealing with data exchange within our pipelines. We are expecting heavy I/O workloads and want to evolve our software quite a bit over the lifetime of the observatory, so we are considering building around Apache Arrow (or similar) in-memory data structures.

While most of our "primary" data is likely regularly shaped, there will definitely be very significant amounts of "secondary" data - such as sky models, or complex calibration and flagging data at minimum. To us, Awkward sounds like a very good candidate for dealing with such data.

We will likely be building some prototypes over the next year to see how we could utilise Awkward in processing. I realise this is not quite what you were asking for, but might still be worthwhile to get in touch at some point?

Greetings,
Peter Wortmann (Data Processing Architect, Square Kilometre Array Organisation)

From jpivarski at gmail.com Fri Oct 9 12:35:47 2020
From: jpivarski at gmail.com (Jim Pivarski)
Date: Fri, 9 Oct 2020 11:35:47 -0500
Subject: [AstroPy] Projects involving irregularly shaped data
In-Reply-To: <0807bbace491499380d21d803fe13f20@exchange.ad.skatelescope.org>
References: <0807bbace491499380d21d803fe13f20@exchange.ad.skatelescope.org>
Message-ID:

By the way, thanks for all the suggestions! I have been following up with everyone in private emails, to avoid making too much noise on this mailing list.

-- Jim

From jbc.develop at gmail.com Fri Oct 9 12:59:12 2020
From: jbc.develop at gmail.com (Juan BC)
Date: Fri, 9 Oct 2020 13:59:12 -0300
Subject: [AstroPy] Projects involving irregularly shaped data
In-Reply-To:
References: <0807bbace491499380d21d803fe13f20@exchange.ad.skatelescope.org>
Message-ID:

Hi

I'm the maintainer of this thing https://feets.readthedocs.io/en/latest/ (without enough time to finish the current version).

But in simple words: the project supports light curves in multiple bands, but these curves have different lengths, so I can't use the traditional time-series object of astropy.

Maybe your project can help.

--
Juan B Cabral

From erik.tollerud at gmail.com Fri Oct 9 17:23:36 2020
From: erik.tollerud at gmail.com (Erik Tollerud)
Date: Fri, 9 Oct 2020 17:23:36 -0400
Subject: [AstroPy] Astropy Ombudsperson nominations and CoCo running notes
In-Reply-To:
References:
Message-ID:

Hello Astropy community,

I am happy to say we have concluded the nomination period for the interim Ombudsperson successfully: Perry Greenfield was nominated and has been accepted in the role. To put it in the words of the nominator:

    Perry Greenfield needs little introduction. He has been involved with the project from the start and continues to be involved in this and other community developed projects. He has the integrity and ethics that are necessary for this role. He has experience working with a diverse community. Based on his branch manager position at STScI he has conflict resolution and mediation process skills. He will be a true asset for the Astropy Project.

We couldn't agree more, and are happy to welcome Perry in this role. Assuming the APE0 governance process proceeds without major modifications to the ombudsperson role, the transition from interim to permanent will occur as the governance documents dictate (which in the version as of right now requires a 2/3 confirmation from the voting members).

For the Astropy Coordination Committee,
Erik T

On Tue, Aug 25, 2020 at 12:41 PM Tom Aldcroft wrote:

> Dear Astropy community,
>
> We, the Coordination Committee, would like to give you an update on two things:
>
> - We are seeking nominations for a new ombudsperson for the Astropy Project.
> - The Astropy Coordination Committee running meeting notes are now available.
>
> *Ombudsperson*
>
> Steve Crawford has accepted a new position with NASA as the Science Mission Directorate at NASA as the new executive program office for scientific data and computing. This is fantastic news for Steve and for the scientific computing community (including Astropy!). However, it means that Steve will no longer be able to serve as the Astropy ombudsperson and he has stepped down as of Monday, Aug 17.
>
> In advance of the new Astropy governance document (APE0) coming into effect, the Coordination Committee is seeking nominations for the role of interim Astropy Ombudsperson. Self-nominations are very welcome. APE0 contains a description of the ombudsperson voting process, so when it becomes the active governance document, that process will be undertaken to replace or confirm the interim Ombudsperson.
>
> The Ombudsperson role has the following responsibilities:
>
> *Provide a point of contact for sensitive issues separate from the coordinating committee, including:*
>
> - *Monitoring the confidential at astropy.org email account*
> - *Solicit and provide anonymized feedback to the astropy coordination committee regarding coordination of the project*
> - *Assist the coordination committee and community engagement coordinator with violations of the code of conduct or other ethical concerns*
>
> *Coordination committee running notes*
>
> In order to improve transparency and communications, the Coordination Committee is taking a page from the interim finance committee: the meeting notes of the Coordination Committee are now available at the following URL:
>
> https://docs.google.com/document/d/19jnnQsnCbpSU1yUVmSLnVfOhrHyxN1P1kYAqiG4Ic7k/edit#
>
> This includes both past meetings (starting from Aug 11, 2020) and a tentative agenda for the next scheduled meeting. Questions, comments, feedback on the notes can be made as comments on the doc, emails to coordinators at astropy.org, discussion in the #project Slack channel, or issues in the project repo.
>
> Cheers,
> Tom A on behalf of the Coordination Committee
> _______________________________________________
> AstroPy mailing list
> AstroPy at python.org
> https://mail.python.org/mailman/listinfo/astropy

From perry at stsci.edu Mon Oct 12 10:05:50 2020
From: perry at stsci.edu (Perry Greenfield)
Date: Mon, 12 Oct 2020 14:05:50 +0000
Subject: [AstroPy] Astropy Ombudsperson nominations and CoCo running notes
In-Reply-To:
References:
Message-ID:

Hi Erik,

I think you mentioned that I am supposed to monitor an email address. How is that supposed to work? Something that is automatically redirected to me?

Thanks,
Perry

From erik.tollerud at gmail.com Tue Oct 27 14:18:16 2020
From: erik.tollerud at gmail.com (Erik Tollerud)
Date: Tue, 27 Oct 2020 14:18:16 -0400
Subject: [AstroPy] ANN: Astropy v4.1 released
Message-ID:

Dear colleagues,

We are very happy to announce the v4.1 release of the Astropy package, a core Python package for Astronomy:

http://www.astropy.org

Astropy is a community-driven Python package intended to contain much of the core functionality and common tools needed for astronomy and astrophysics. It is part of the Astropy Project, which aims to foster an ecosystem of interoperable astronomy packages for Python.

New and improved major functionality in this release includes:

* A new SpectralCoord class for representing and transforming spectral quantities
* Support for writing Dask arrays to FITS files
* Added True Equator Mean Equinox (TEME) frame for satellite two-line ephemeris data
* Support for in-place setting of array-valued SkyCoord and frame objects
* Change in the definition of equality comparison for coordinate classes
* Support use of SkyCoord in table vstack, dstack, and insert_row
* Support for table cross-match join with SkyCoord or N-d columns
* Support for custom attributes in Table subclasses
* Added a new Time subformat unix_tai
* Added support for the -TAB convention in FITS WCS
* Support for replacing submodels in CompoundModel
* Support for units on otherwise unitless models via the Model.coerce_units method
* Support for ASDF serialization of models

In addition, hundreds of smaller improvements and fixes have been made. An overview of the changes is provided at:

http://docs.astropy.org/en/stable/whatsnew/4.1.html

Instructions for installing Astropy are provided on our website, and extensive documentation can be found at:

http://docs.astropy.org

If you usually use pip/vanilla Python, you can do:

    pip install astropy --upgrade

If you make use of the Anaconda Python Distribution, soon you will be able to update to Astropy v4.1 with:

    conda update astropy

Or if you cannot wait for Anaconda to update their default version, you can use the astropy channel:

    conda update -c astropy astropy

Please report any issues, or request new features via our GitHub repository:

https://github.com/astropy/astropy/issues

Nearly 400 developers have contributed code to Astropy so far, and you can find out more about the team behind Astropy here:

https://www.astropy.org/team.html

The LTS (Long Term Support) version of Astropy at the time of v4.1's release is v4.0 - this version will be maintained until the next LTS release (v5.0, scheduled for Fall 2021). Additionally, note that the Astropy 4.x series only supports Python 3. Python 2 users can continue to use the 2.x series but it is no longer supported (as Python 2 itself is no longer supported). For assistance converting Python 2 code to Python 3, see the Python 3 for scientists conversion guide.

If you use Astropy directly for your work, or as a dependency to another package, please remember to acknowledge it by citing the appropriate Astropy paper. For the most up-to-date suggestions, see the acknowledgement page, but as of this release the recommendation is:

    This research made use of Astropy, a community-developed core Python package for Astronomy (Astropy Collaboration, 2018).

We hope that you enjoy using Astropy as much as we enjoyed developing it!

Erik Tollerud
v4.1 Release Coordinator
on behalf of The Astropy Project

https://www.astropy.org/announcements/release-4.1.html
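
After running one of the upgrade commands above, it can be handy to confirm which version actually ended up in the active environment. The snippet below is a small sketch using only the standard library (`importlib.metadata`, available from Python 3.8). The helper names `release_tuple`, `is_at_least`, and `installed_version` are made up for this example and are not part of Astropy, and the comparison ignores pre-release and post-release tags, so treat it as a rough check.

```python
import re
from importlib import metadata


def release_tuple(version):
    """Return the leading numeric components of a version string.

    "4.1" -> (4, 1); "4.1rc1" -> (4, 1); "4.0.1.post1" -> (4, 0, 1).
    Pre-/post-release tags are ignored, so this is only a rough comparison.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_at_least(installed, minimum):
    """True if `installed` is the same as or newer than `minimum`."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so that "4.1" compares equal to "4.1.0".
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_version(package):
    """Installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


found = installed_version("astropy")
if found is None:
    print("astropy is not installed in this environment")
elif is_at_least(found, "4.1"):
    print(f"astropy {found} is installed (4.1 or newer)")
else:
    print(f"astropy {found} predates 4.1; try one of the upgrade commands")
```

Reading the version from package metadata avoids importing the package itself, so the check also works in an environment where the import would fail for unrelated reasons.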