Computational Reproducibility in Modelling and Simulation Studies
A Case Study on Data Assimilation Algorithms for Agent-based Modelling and Simulation
G.M. Low Chew Tung (TU Delft - Technology, Policy and Management)
Yilin Huang – Mentor (TU Delft - System Engineering)
A. Verbraeck – Graduation committee member (TU Delft - Policy Analysis)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
In recent years, the ‘reproducibility crisis’ has become a cause for concern in the scientific community. Many disciplines, from psychology and neuroscience to machine learning and ecology, have promoted initiatives to ensure the replicability and reproducibility of the findings reported in research publications. However, ensuring reproducibility and replicability faces many challenges, such as differing definitions of the terms across disciplines, a lack of incentives for reproducing or replicating published work, and the absence of standard methods for assessing whether a reproduction or replication has succeeded. Reproducible research is fundamental to the scientific process: it helps to ensure the credibility of scientific research and facilitates the dissemination and advancement of scientific knowledge. The crisis is especially relevant to Modelling and Simulation and other computational sciences, which rely on computer simulations to support their findings, yet there is a dearth of documentation detailed enough to support successful reproductions.
This project investigated the computational reproducibility of a research publication in the field of data assimilation for agent-based simulations. Agent-Based Modelling and Simulation (ABMS) is a computational method frequently employed to study complex socio-technical systems. Data assimilation for ABMS is an emerging research area that seeks to incorporate real-time data into the model to improve its predictive capabilities. However, because the area is new, reproducibility studies of these experiments are lacking. As this is a young research field in which new methodologies continue to be published, it is important to support verification and validation processes so that the methods can be suitably adopted by applied researchers in future studies.
The main challenges of the reproduction process were identified as code quality and missing dependencies; ambiguous or missing specifications of the methodology; and inconsistencies between textual descriptions and the implemented code. Evaluation of reproducibility was considered from two perspectives: statistical metrics on the one hand and qualitative reproducibility frameworks on the other. The experiment also highlighted the importance of computational provenance, which connects the published results to the code or software used to generate them.
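As an illustration of what computational provenance can look like in practice, the following is a minimal sketch, not the approach used in the thesis. It assumes a Python-based experiment stored in a git repository; the function names `current_commit` and `provenance_record` and the field names are hypothetical. The idea is to store, next to each result, the code version, environment, parameters and seed that produced it, together with a hash of the result itself.

```python
import hashlib
import json
import platform
import subprocess
import sys
from datetime import datetime, timezone


def current_commit():
    """Return the current git commit hash, or "unknown" outside a git repository."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return "unknown"


def provenance_record(parameters, seed, result_bytes):
    """Link a result to the code version, environment and inputs that produced it."""
    return {
        "commit": current_commit(),
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "parameters": parameters,  # hypothetical experiment parameters
        "seed": seed,
        "result_sha256": hashlib.sha256(result_bytes).hexdigest(),
        "created_utc": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    record = provenance_record({"n_particles": 100}, seed=42, result_bytes=b"example output")
    print(json.dumps(record, indent=2))
```

A record of this kind, saved alongside every figure or table, would let a later reader check which commit and parameter set a published result came from.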
A series of practical steps to guide the workflow of future reproduction studies was drafted along with guiding questions to deduce computational workflows from publications and their code repositories when workflows to produce published results are missing.
To verify the implementation and reproducibility of the particle filter algorithm used in the case study, a sensitivity analysis was carried out. It examined the influence of three filter parameters on the data assimilation algorithm’s estimation accuracy: the number of particles, the resampling window, and the jitter standard deviation. From this experiment and from the literature, key elements that should be specified in future studies of data assimilation for ABMS to ensure reproducibility were identified.
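The kind of sensitivity analysis described above can be sketched as follows. This is a hedged illustration only: it uses a toy one-dimensional drifting random walk in place of an agent-based model, a basic bootstrap particle filter with multinomial resampling, and assumed noise levels. The function names `run_particle_filter` and `sensitivity_sweep` are hypothetical, and the resampling window is interpreted here as the number of model steps between assimilation steps, which is an assumption rather than a detail taken from the case study.

```python
import itertools
import math
import random
import statistics


def run_particle_filter(n_particles, resample_window, jitter_std, n_steps=60, seed=0):
    """Return the mean absolute estimation error of a toy bootstrap particle filter."""
    rng = random.Random(seed)
    obs_std = 1.0      # assumed observation noise
    process_std = 0.5  # assumed process noise of the toy model
    drift = 0.1        # assumed deterministic drift per step
    truth = 0.0
    particles = [rng.gauss(0.0, 2.0) for _ in range(n_particles)]
    errors = []
    for step in range(1, n_steps + 1):
        # Advance the hidden truth and every particle with the same stochastic model.
        truth += drift + rng.gauss(0.0, process_std)
        particles = [p + drift + rng.gauss(0.0, process_std) for p in particles]
        if step % resample_window != 0:
            continue
        # Assimilation step: weight particles by a Gaussian observation likelihood.
        observation = truth + rng.gauss(0.0, obs_std)
        log_w = [-0.5 * ((observation - p) / obs_std) ** 2 for p in particles]
        max_log_w = max(log_w)  # subtract the maximum for numerical stability
        weights = [math.exp(lw - max_log_w) for lw in log_w]
        estimate = sum(w * p for w, p in zip(weights, particles)) / sum(weights)
        errors.append(abs(estimate - truth))
        # Multinomial resampling, then jitter to restore particle diversity.
        particles = rng.choices(particles, weights=weights, k=n_particles)
        particles = [p + rng.gauss(0.0, jitter_std) for p in particles]
    if not errors:
        raise ValueError("resample_window must not exceed n_steps")
    return statistics.mean(errors)


def sensitivity_sweep(particle_counts, windows, jitters, n_repeats=5):
    """Average the error over repeated seeded runs for each parameter combination."""
    results = {}
    for n, w, j in itertools.product(particle_counts, windows, jitters):
        runs = [run_particle_filter(n, w, j, seed=r) for r in range(n_repeats)]
        results[(n, w, j)] = statistics.mean(runs)
    return results


if __name__ == "__main__":
    sweep = sensitivity_sweep([10, 100], [1, 5, 10], [0.0, 0.25])
    for (n, w, j), err in sorted(sweep.items()):
        print(f"particles={n:4d} window={w:2d} jitter_std={j:.2f} mean_abs_error={err:.3f}")
```

Fixing the seed for every run makes each cell of the sweep repeatable, which is the property a reproduction study needs in order to compare its results with those of the original publication.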
In summary, this thesis project addressed a research gap in data assimilation for ABMS by conducting a reproducibility study of a research publication that employs the Particle Filter technique. Key results from the original publication were reproduced, and the original and reproduced results were compared. A reproducibility protocol was formulated to guide researchers in future reproducibility studies. For data assimilation in agent-based simulations specifically, a list of key parameters and considerations that should be reported in studies applying the particle filter to ABMS was devised.