Doctoral thesis (Dissertations and theses)
Advances in Simulation-Based Inference: Towards the automation of the Scientific Method through Learning Algorithms
Hermans, Joeri
2022
 

Files


Full Text
thesis.pdf
Author postprint (5.51 MB)





Details



Keywords :
deep learning; machine learning; statistics; approximate inference; likelihood-free; simulation-based
Abstract :
This dissertation presents several novel techniques and guidelines to advance the field of simulation-based inference. Simulation-based inference, or likelihood-free inference, refers to statistical inference in settings where simulating synthetic realizations x through detailed descriptions of their generating processes is possible, but evaluating the likelihood p(x | y) of parameters y tied to realizations x is intractable. In practice, this means that while it is relatively simple to execute a computer simulation and collect samples from its generative process for various inputs y, it is difficult to invert the process and answer the question: "What set of parameters y could have been responsible for producing x, and with what probability?" The likelihood p(x | y) plays a central role in answering this question. However, for most scientific simulators, the direct evaluation of the (true and unknown) likelihood involves solving an inverse problem that rests on the integration of all possible forward realizations implicitly defined by the computer code of the simulator. This is the core reason why it is typically impossible to evaluate the likelihood model of a computer simulator: it requires us to integrate across all possible code paths for all inputs y that could have potentially led to the realization x. Classical statistical inference based on the likelihood is for this reason impractical. Nevertheless, approximate inference remains possible by relying on surrogates that produce estimates of key quantities necessary for statistical inference. This thesis introduces various techniques and guidelines to effectively construct such surrogates and demonstrates how these approximations should be applied reliably. We explicitly make the point that the dogma of data efficiency should not be central to the field. Rather, reliable approximations should be, if we are ever to deduce scientific results with the techniques developed over the years. This point is strengthened by demonstrating that all techniques can produce approximations that are not reliable from a scientific point of view, that is, when one is interested in constraining parameters or models. We argue for novel protocols that provide theoretically backed reliability properties. To that end, this thesis introduces a novel algorithm that provides such guarantees in terms of the binary classifier. In fact, the theoretical result is applicable to any binary classification problem. Finally, these contributions are framed within the context of the automation of science. This thesis concerns itself with the automation of the last step of the scientific method, which is described as a recurrence over the sequence hypothesis, experiment, and conclusion. For the most part, however, hypothesis formation and experiment design remain solely for scientists to decide; only occasionally are they explored, designed, and automated through computer-assisted means. For these two steps, we provide research avenues and proofs of concept that could unlock their automation.
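The setting the abstract describes (sampling x from a simulator is easy, evaluating p(x | y) is not) can be illustrated with a minimal sketch. The code below uses rejection-based approximate Bayesian computation, a classical likelihood-free baseline, and is not one of the techniques introduced in the thesis. The toy simulator, the uniform prior on [-5, 5], the tolerance epsilon, and all function names are assumptions made purely for illustration.

```python
import random
import statistics


def simulator(y, rng):
    """Hypothetical toy simulator: returns x for input y.

    A latent value z is drawn internally and never exposed, which is
    what makes the likelihood implicit in real simulators. (In this toy
    case the likelihood is actually tractable; it is only a stand-in.)
    """
    z = rng.gauss(0.0, 1.0)
    return y + z


def rejection_abc(x_obs, n_draws, epsilon, rng):
    """Keep prior draws of y whose simulated x lands within epsilon of x_obs."""
    accepted = []
    for _ in range(n_draws):
        y = rng.uniform(-5.0, 5.0)  # assumed prior over the parameter y
        x = simulator(y, rng)       # forward simulation only, no likelihood call
        if abs(x - x_obs) < epsilon:
            accepted.append(y)
    return accepted


if __name__ == "__main__":
    rng = random.Random(0)
    samples = rejection_abc(x_obs=1.0, n_draws=100_000, epsilon=0.1, rng=rng)
    print("accepted draws:", len(samples))
    print("approximate posterior mean:", statistics.mean(samples))
    print("approximate posterior std:", statistics.stdev(samples))
```

The accepted values of y approximate the posterior over parameters given the observation, using forward simulations alone. Most draws are rejected, which hints at why data efficiency became a focus of the field and why the abstract argues that reliability of the approximation matters more.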
Disciplines :
Computer science
Physics
Author, co-author :
Hermans, Joeri; Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
Language :
English
Title :
Advances in Simulation-Based Inference: Towards the automation of the Scientific Method through Learning Algorithms
Defense date :
2022
Institution :
ULiège - University of Liège [Faculty of Applied Sciences], Belgium
Degree :
Doctor of Philosophy in Computer Science
Promotor :
Louppe, Gilles; Université de Liège - ULiège > Département d'électricité, électronique et informatique (Institut Montefiore) > Big Data
President :
Geurts, Pierre; Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
Jury member :
Tomczak, Jakub; VU - Vrije Universiteit Amsterdam
Wehenkel, Louis; Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
Weniger, Christophe; UvA - University of Amsterdam
Funders :
F.R.S.-FNRS - Fund for Scientific Research [BE]
Funding number :
FRIA 27575
Available on ORBi :
since 05 April 2022

Statistics


Number of views
301 (32 by ULiège)
Number of downloads
596 (26 by ULiège)
