Abstract :
Hybrid modelling reduces the misspecification of expert models by combining them with machine learning (ML) components learned from data. As with many ML algorithms, hybrid model performance guarantees are limited to the training distribution. Leveraging the insight that the expert model is usually valid even outside the training domain, we overcome this limitation by introducing a hybrid data augmentation strategy termed expert augmentation. Based on a probabilistic formalization of hybrid modelling, we show why expert augmentation improves generalization. Finally, we validate the practical benefits of augmented hybrid models on a set of controlled experiments, modelling dynamical systems described by ordinary and partial differential equations.
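The abstract describes expert augmentation only at a high level. Below is a minimal sketch of the underlying idea, under illustrative assumptions: the pendulum system, the names `expert_dynamics`, `ml_residual`, and `simulate`, and the parameter ranges are hypothetical placeholders chosen for this example, not details taken from the paper (whose formulation is probabilistic).

```python
# Minimal sketch of expert augmentation for a hybrid dynamical model.
# Assumptions (not from the publication record): the expert model is an ideal
# pendulum ODE, the ML component is an already-fitted residual correction, and
# the helpers below are illustrative stand-ins for the real training pipeline.
import numpy as np

def expert_dynamics(state, omega2):
    """Expert ODE: ideal pendulum with squared angular frequency omega2."""
    theta, dtheta = state
    return np.array([dtheta, -omega2 * np.sin(theta)])

def simulate(omega2, ml_residual=None, theta0=0.5, dt=0.01, steps=500):
    """Integrate the hybrid dynamics (expert + optional ML residual) with Euler steps."""
    state = np.array([theta0, 0.0])
    traj = []
    for _ in range(steps):
        deriv = expert_dynamics(state, omega2)
        if ml_residual is not None:
            deriv = deriv + ml_residual(state)  # learned correction of the expert model
        state = state + dt * deriv
        traj.append(state.copy())
    return np.stack(traj)

def ml_residual(state):
    # Stand-in for the ML component fitted on in-distribution data
    # (here a small damping term the expert model misses).
    return np.array([0.0, -0.1 * state[1]])

rng = np.random.default_rng(0)

# Training domain: expert parameters omega2 drawn from a narrow range.
train_params = rng.uniform(0.8, 1.2, size=32)
train_set = [(w2, simulate(w2, ml_residual)) for w2 in train_params]

# Expert augmentation: resample the expert parameters from a much wider range,
# outside the training domain where the expert model is still trusted, simulate
# synthetic trajectories with the fitted hybrid model, and add them to the
# training set before refitting the downstream predictor.
aug_params = rng.uniform(0.2, 3.0, size=32)
aug_set = [(w2, simulate(w2, ml_residual)) for w2 in aug_params]

augmented_train_set = train_set + aug_set
print(f"{len(train_set)} original + {len(aug_set)} augmented trajectories")
```

The design point the sketch illustrates is that only the expert parameters are resampled outside the training range: the expert model, unlike the learned correction, is assumed to remain valid there, which is what makes the augmented data trustworthy out of distribution.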
Disciplines :
Computer science
Mathematics
Author, co-author :
Wehenkel, Antoine ; Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
Behrmann, Jens
Hsu, Hsiang
Sapiro, Guillermo
Louppe, Gilles ; Université de Liège - ULiège > Département d'électricité, électronique et informatique (Institut Montefiore) > Big Data
Jacobsen, Jörn-Henrik
Language :
English
Title :
Robust Hybrid Learning With Expert Augmentation
Publication date :
February 2023
Journal title :
Transactions on Machine Learning Research
eISSN :
2835-8856
Publisher :
OpenReview, Amherst, United States - Massachusetts