The Vera C. Rubin Observatory’s Data Butler provides a way for science users to retrieve data without knowing where or how it is stored. To support 10,000 science users in a hybrid cloud environment, we are modifying the Data Butler to use a client/server architecture, allowing us to share authentication and authorization controls with the Rubin Science Platform and to more easily support standard tooling for scaling up backend services. In this paper we describe the changes being made to support this and some of the difficulties encountered.
The Rubin Observatory’s Data Butler is designed to allow data file location and file formats to be abstracted away from the people writing the science pipeline algorithms. The Butler works in conjunction with the workflow graph builder to allow pipelines to be constructed from the algorithmic tasks. These pipelines can be executed at scale using object stores and multi-node clusters, or on a laptop using a local file system. The Butler and pipeline system are now in daily use during Rubin construction and early operations.
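The core idea the abstract describes is that science code requests data by dataset type and data ID, while the location and serialization format remain private to the storage layer. The following is a minimal, self-contained sketch of that abstraction; `ToyButler` and `Datastore` are hypothetical stand-ins for illustration, not the actual `lsst.daf.butler` API.

```python
import json
import tempfile
from pathlib import Path


class Datastore:
    """Toy storage backend mapping dataset keys to files on local disk.

    In a real system this could equally be an object store; callers
    never see the difference (illustrative, not the Rubin design).
    """

    def __init__(self, root: Path):
        self.root = root

    def put(self, key: str, data: dict) -> None:
        # File layout and serialization format are private to the datastore.
        (self.root / f"{key}.json").write_text(json.dumps(data))

    def get(self, key: str) -> dict:
        return json.loads((self.root / f"{key}.json").read_text())


class ToyButler:
    """Butler-like facade: data is addressed by dataset type and
    data ID, never by path or format."""

    def __init__(self, datastore: Datastore):
        self._datastore = datastore

    def put(self, data: dict, dataset_type: str, data_id: dict) -> None:
        self._datastore.put(self._key(dataset_type, data_id), data)

    def get(self, dataset_type: str, data_id: dict) -> dict:
        return self._datastore.get(self._key(dataset_type, data_id))

    @staticmethod
    def _key(dataset_type: str, data_id: dict) -> str:
        # Deterministic key built from the data ID; the caller never sees it.
        parts = "-".join(f"{k}_{v}" for k, v in sorted(data_id.items()))
        return f"{dataset_type}-{parts}"


# Usage: the science code below would be unchanged if Datastore were
# swapped for an object-store backend.
tmpdir = Path(tempfile.mkdtemp())
butler = ToyButler(Datastore(tmpdir))
butler.put({"seeing": 0.7}, "calexp", {"visit": 903334, "detector": 22})
retrieved = butler.get("calexp", {"visit": 903334, "detector": 22})
```

Swapping the `Datastore` implementation (local file system versus object store) is exactly the flexibility that lets the same pipeline run on a laptop or a multi-node cluster.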
The Vera C. Rubin Observatory will advance many areas of astronomy over the next decade with its unique wide-fast-deep multi-color imaging survey, the Legacy Survey of Space and Time (LSST).1 The LSST will produce approximately 20 TB of raw data per night, which will be automatically processed by the LSST Science Pipelines to generate science-ready data products – processed images, catalogs and alerts. To ensure that these data products enable transformative science with LSST, stringent requirements have been placed on their quality and scientific fidelity, for example on image quality and depth, astrometric and photometric performance, and object recovery completeness. In this paper we introduce faro, a framework for automatically and efficiently computing scientific performance metrics on the LSST data products for units of data of varying granularity, ranging from single-detector to full-survey summary statistics. By measuring and monitoring metrics, we are able to evaluate trends in algorithmic performance and conduct regression testing during development, compare the performance of one algorithm against another, and verify that the LSST data products will meet performance requirements by comparing to specifications. We present initial results using faro to characterize the performance of the data products produced on simulated and precursor data sets, and discuss plans to use faro to verify the performance of the LSST commissioning data products.
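The varying-granularity idea above can be sketched with a toy example: compute a metric per unit of data (here, one value per visit/detector pair), then roll those values up into a full-survey summary and compare against a specification. The residual values, the metric choice, and the threshold are all invented for illustration and are not faro's actual metrics or LSST requirements.

```python
import statistics

# Hypothetical per-source astrometric residuals in milliarcseconds,
# keyed by (visit, detector) -- invented numbers, not real LSST data.
residuals = {
    (903334, 22): [8.0, 11.0, 9.5],
    (903334, 23): [12.0, 10.5, 13.0],
    (903336, 22): [7.5, 9.0, 8.5],
}


def per_detector_metric(samples: list[float]) -> float:
    """Finest granularity: median residual for one (visit, detector) unit."""
    return statistics.median(samples)


# One metric value per single-detector unit of data.
detector_values = {unit: per_detector_metric(s) for unit, s in residuals.items()}

# Coarser granularity: a full-survey summary statistic over the
# per-detector values, verified against a specification threshold.
survey_summary = statistics.median(detector_values.values())
SPEC_MAS = 10.0  # illustrative threshold, not an actual LSST requirement
meets_spec = survey_summary <= SPEC_MAS
```

The same pattern (metric per data unit, then aggregation) supports both regression testing during development and verification against requirements.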