Paper published in a journal (Scientific congresses and symposiums)
The Well: a Large-Scale Collection of Diverse Physics Simulations for Machine Learning
Ohana, Ruben; McCabe, Michael; Meyer, Lucas et al.
2024 In Advances in Neural Information Processing Systems, 37
Peer Reviewed verified by ORBi
 

Files


Full Text
2412.00568v2.pdf
Author postprint (5.38 MB)
Details



Keywords :
Information Systems; Signal Processing; Datasets; Fluid Dynamics
Abstract :
[en] Machine learning based surrogate models offer researchers powerful tools for accelerating simulation-based workflows. However, as standard datasets in this space often cover small classes of physical behavior, it can be difficult to evaluate the efficacy of new approaches. To address this gap, we introduce the Well: a large-scale collection of datasets containing numerical simulations of a wide variety of spatiotemporal physical systems. The Well draws from domain experts and numerical software developers to provide 15TB of data across 16 datasets covering diverse domains such as biological systems, fluid dynamics, acoustic scattering, as well as magneto-hydrodynamic simulations of extra-galactic fluids or supernova explosions. These datasets can be used individually or as part of a broader benchmark suite. To facilitate usage of the Well, we provide a unified PyTorch interface for training and evaluating models. We demonstrate the function of this library by introducing example baselines that highlight the new challenges posed by the complex dynamics of the Well. The code and data are available at https://github.com/PolymathicAI/the_well.
Disciplines :
Engineering, computing & technology: Multidisciplinary, general & others
Author, co-author :
Ohana, Ruben ;  Polymathic AI ; Flatiron Institute, United States
McCabe, Michael ;  Polymathic AI
Meyer, Lucas;  Polymathic AI
Morel, Rudy;  Polymathic AI ; Flatiron Institute, United States
Agocs, Fruzsina J.;  Flatiron Institute, United States ; University of Colorado, Boulder, United States
Beneitez, Miguel;  University of Cambridge, United Kingdom
Berger, Marsha;  Flatiron Institute, United States ; New York University, United States
Burkhart, Blakesley;  Flatiron Institute, United States ; Rutgers University, United States
Dalziel, Stuart B.;  University of Cambridge, United Kingdom
Fielding, Drummond B.;  Flatiron Institute, United States ; Cornell University, United States
Fortunato, Daniel;  Flatiron Institute, United States
Goldberg, Jared A.;  Flatiron Institute, United States
Hirashima, Keiya;  Polymathic AI ; Flatiron Institute, United States ; University of Tokyo, Japan
Jiang, Yan-Fei;  Flatiron Institute, United States
Kerswell, Rich R.;  University of Cambridge, United Kingdom
Maddu, Suryanarayana;  Flatiron Institute, United States
Miller, Jonah;  Los Alamos National Laboratory, United States
Mukhopadhyay, Payel;  University of California, Berkeley, United States
Nixon, Stefan S.;  University of Cambridge, United Kingdom
Shen, Jeff;  Princeton University, United States
Watteaux, Romain;  CEA DAM, France
Régaldo-Saint Blancard, Bruno;  Polymathic AI ; Flatiron Institute, United States
Rozet, François  ;  Université de Liège - ULiège > Département d'électricité, électronique et informatique (Institut Montefiore) > Big Data ; Polymathic AI
Parker, Liam H.;  Polymathic AI ; Flatiron Institute, United States ; University of California, Berkeley, United States
Cranmer, Miles;  Polymathic AI ; University of Cambridge, United Kingdom
Ho, Shirley;  Polymathic AI ; Flatiron Institute, United States ; New York University, United States ; Princeton University, United States
 These authors have contributed equally to this work.
Language :
English
Title :
The Well: a Large-Scale Collection of Diverse Physics Simulations for Machine Learning
Publication date :
30 November 2024
Event name :
The Thirty-Eighth Annual Conference on Neural Information Processing Systems
Event place :
Vancouver, Canada
Event date :
December 10-15, 2024
Audience :
International
Journal title :
Advances in Neural Information Processing Systems
ISSN :
1049-5258
Publisher :
Neural information processing systems foundation
Volume :
37
Peer review/Selection committee :
Peer Reviewed verified by ORBi
Funding text :
The authors would like to thank the Scientific Computing Core, a division of the Flatiron Institute, a division of the Simons Foundation, and more specifically Geraud Krawezik for the computing support, the members of Polymathic AI for the insightful discussions, and especially Michael Eickenberg for his input on the paper. Polymathic AI acknowledges funding from the Simons Foundation and Schmidt Sciences, LLC. Additionally, we gratefully acknowledge the support of NVIDIA Corporation for the donation of the DGX Cloud node hours used in this research. The authors would like to thank Aaron Watters, Alex Meng and Lucy Reading-Ikkanda for their help on the visuals, as well as Keaton Burns for his help on using Dedalus. M.B and R.R.K acknowledge Dr Jacob Page and Dr Yves Dubief for their valuable discussions about the multistability of viscoelastic states, and are grateful to EPSRC for supporting this work via grant EP/V027247/1. B.B. acknowledges the generous support of the Flatiron Institute Simons Foundation for hosting the CATS database and the support of NASA award 19-ATP19-0020. R.M. would like to thank Keaton Burns for his advice on using the Dedalus package for generating data. J.S, J.A.G, Y-F J. would like to thank Lars Bildsten, William C. Schultz, and Matteo Cantiello for valuable discussions instrumental to the development of the global RSG simulation setup. These calculations were supported in part by NASA grants ATP-80NSSC18K0560 and ATP-80NSSC22K0725, and computational resources were provided by the NASA High-End Computing (HEC) program through the NASA Advanced Supercomputing (NAS) Division at Ames. J.M.M's work was supported through the Laboratory Directed Research and Development program under project number 20220564ECR at Los Alamos National Laboratory (LANL). LANL is operated by Triad National Security, LLC, for the National Nuclear Security Administration of U.S. Department of Energy (Contract No. 89233218CNA000001). P.M. acknowledges the continued support of the Neutrino Theory Network Program Grant under award number DE-AC02-07CHI11359. P.M. expresses gratitude to the Institute of Astronomy at the University of Cambridge for hosting them as a visiting researcher, during which the idea for this contribution was conceived and initiated. S.S.N would like to acknowledge that their work is funded and supported by the CEA. K.H. acknowledges support of Grants-in-Aid for JSPS Fellows (22KJ1153) and MEXT as “Program for Promoting Researches on the Supercomputer Fugaku” (Structure and Evolution of the Universe Unraveled by Fusion of Simulation and AI; Grant Number JPMXP1020230406). These calculations are partially carried out on Cray XC50 CPU-cluster at the Center for Computational Astrophysics (CfCA) of the National Astronomical Observatory of Japan.
Available on ORBi :
since 20 May 2025
