A.W. Heemink | TU Delft Repository

A wave data assimilation system for the North Sea based on Ensemble Kalman Filtering and the potential of satellite altimetry

Journal article (2025) - C. W.E. de Korte, M. Verlaan, A. W. Heemink

A Wave Data Assimilation System based on the Ensemble Kalman Filter (EnKF) is implemented for the North Sea, showing improved performance and physical consistency. We first show the EnKF implementation and illustrate the wave data assimilation system using identical twin experiments to assimilate synthetic observations from buoys. A sensitivity analysis shows that the ensemble size, assimilation frequency and observation uncertainty are relatively important settings. Lastly, the potential for assimilating satellite measurements was assessed by assimilating synthetic altimeter measurements with real pass-over tracks. In these experiments, the state contains the full wave spectrum, unlike in most existing schemes. The results show that wave spectra and integral variables beyond significant wave height are updated in a physically consistent way in both the buoy and satellite experiments, even though only significant wave height is assimilated. This is a key advantage of this implementation compared to the more widely used implementations in wave data assimilation. Although the satellite experiment performs slightly worse than the buoy experiment due to decreased temporal availability of measurements, the results underline the potential for assimilation of satellite altimeter measurements. Such a system provides a promising framework for future observation impact studies using satellite altimeter measurements. ...

Estimating Wind and Emission Parameters in an Atmospheric Transport Model

Conference paper (2024) - Andres Yarce Botero, Santiago Lopez Restrepo, Olga Lucia Quintero, Arnold Heemink

The present study proposes a novel data assimilation (DA) approach for estimating emission and wind direction parameters in an advection-diffusion model. This implementation aims to improve the prediction of a chemical transport model over long distances by updating the emission operator in the model using DA techniques. As a first step, we want to test the method in a small-scale scenario. A low-dimensional advection-diffusion model was used to evaluate the effectiveness of the proposed approach for varying numbers of sampled observations. The model’s emission and wind parameters are perturbed as a source of uncertainty. The parameters are sequentially estimated with the adjoint-free Ensemble Kalman filter with an augmented state vector. These sequential DA techniques exploit the ensemble of multiple model realizations to reduce uncertainty in the state and parameter representation. An associated stream function with a divergence-free condition controls the wind fields, and the estimation of this stream function through the assimilation process allows corrections of the wind fields without violating physical laws. The technique’s performance was evaluated against validation observations using metrics such as the root-mean-square (RMS) error, and the number of assimilated observations was found to have a significant impact on the parameter estimation results. This study demonstrates the potential of the proposed DA approach for improving the prediction of transport in the advection-diffusion model through parameter estimation. ...
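
To make the idea of an augmented state vector concrete, here is a minimal sketch, not the authors' implementation. It assumes a hypothetical scalar toy model (concentration decays by a fixed factor and receives a constant emission each step) instead of the advection-diffusion model of the paper, and a stochastic EnKF with perturbed observations. All function names, parameter values and the prior are illustrative assumptions.

```python
import random


def synthetic_observations(true_emission, n_steps, decay=0.5, obs_std=0.1, seed=1):
    """Twin experiment: run the toy model with the true emission and add noise."""
    rng = random.Random(seed)
    conc, observations = 0.0, []
    for _ in range(n_steps):
        conc = decay * conc + true_emission
        observations.append(conc + rng.gauss(0.0, obs_std))
    return observations


def enkf_augmented(observations, n_members=50, decay=0.5, obs_std=0.1, seed=0):
    """Estimate concentration and emission jointly; each member is (conc, emis)."""
    rng = random.Random(seed)
    conc = [0.0] * n_members
    # Assumed prior for the uncertain emission parameter (deliberately biased).
    emis = [rng.gauss(1.0, 0.5) for _ in range(n_members)]
    for y in observations:
        # Forecast: each member uses its own emission; the parameter persists.
        conc = [decay * c + q for c, q in zip(conc, emis)]
        c_mean = sum(conc) / n_members
        q_mean = sum(emis) / n_members
        var_c = sum((c - c_mean) ** 2 for c in conc) / (n_members - 1)
        cov_qc = sum((q - q_mean) * (c - c_mean)
                     for q, c in zip(emis, conc)) / (n_members - 1)
        # Kalman gain for the augmented state; only the concentration is observed.
        denom = var_c + obs_std ** 2
        gain_c, gain_q = var_c / denom, cov_qc / denom
        # Analysis with perturbed observations, one innovation per member.
        innov = [y + rng.gauss(0.0, obs_std) - c for c in conc]
        conc = [c + gain_c * d for c, d in zip(conc, innov)]
        emis = [q + gain_q * d for q, d in zip(emis, innov)]
    return sum(emis) / n_members, sum(conc) / n_members


obs = synthetic_observations(true_emission=2.0, n_steps=30)
emission_estimate, concentration_estimate = enkf_augmented(obs)
```

The emission is never observed directly; it is corrected only through its ensemble covariance with the observed concentration, which is what makes the approach adjoint-free.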

Prediction of the Magnetic State of Ferromagnetic Objects by Assimilating Data into a Physical Model

Journal article (2024) - Aad Vijn, Bart Jan Peet, Marianne Schaaphok, Eugène Lepelaars, Arnold Heemink

This paper presents a hybrid model to estimate the magnetic behaviour of a ferromagnetic structure. The mathematical-physical model has been developed using the Method of Moments combined with a hysteresis model. The mathematical model was derived by a linearisation of the hysteresis curve. The initial magnetic state of a ferromagnetic object is found through inverse computations, including regularisation techniques. The idea of dictionary regularisation is introduced to support the inverse computations with prescribed templates that reflect a priori knowledge of the typical shapes of magnetisation distributions. These templates are extracted from the Method of Moments. Data assimilation is used to update the model in time by means of measurements of the magnetic field near a ferromagnetic structure. The proposed hybrid model is implemented for a typical steel object and verified by means of numerical experiments and measurements in an experimental environment. ...

Ozone exceedance forecasting with enhanced extreme instance augmentation

A case study in Germany

Journal article (2024) - Tuo Deng, Astrid Manders, Arjo Segers, Arnold Willem Heemink, Hai Xiang Lin

Accurately forecasting ozone levels that exceed specific thresholds is pivotal for mitigating adverse effects on both the environment and public health. However, predicting such ozone exceedances remains challenging due to the infrequent occurrence of high-concentration ozone data. This research, leveraging data from 57 German monitoring stations from 1999 to 2018, introduces an Enhanced Extreme Instance Augmentation Random Forest (EEIA-RF) approach that significantly improves the prediction of days when the maximum daily 8-hour average ozone concentrations exceed 120μg/m³. A pre-trained machine learning model is used to generate additional high-concentration data, which, combined with selectively reduced low-concentration data, forms a new dataset for training a refined model. This method achieved an improvement of at least 8% in the accuracy of predicting days with ozone exceedances across Germany. Our experiment underscores the approach's value in enhancing atmospheric modeling and supporting public health advisories and environmental policy-making related to ozone pollution. ...

Design and Implementation of a Low-Cost Air Quality Network for the Aburra Valley Surrounding Mountains

Pollutants

Journal article (2023) - Andrés Yarce Botero, Santiago Lopez Restrepo, Juan Sebastian Rodriguez, Diego Valle, Julian Galvez-Serna, Elena Montilla, Francisco Botero, Bas Henzing, Arnold Heemink, More Authors...

The densest network for measuring air pollutant concentrations in Colombia is in Medellin, where most sensors are located in the heavily polluted lower parts of the valley. Measuring stations in the higher elevations on the mountains surrounding the valley are not available, which limits our understanding of the valley’s pollutant dynamics and hinders the effectiveness of data assimilation studies using chemical transport models such as LOTOS-EUROS. To address this gap in measurements, we have designed a new network of low-cost sensors to be installed at altitudes above 2000 m.a.s.l. The network consists of custom-built, solar-powered, and remotely connected sensors. Locations were strategically selected using the LOTOS-EUROS model driven by diverse meteorology-simulated fields to explore the effects of the valley wind representation on the transport of pollutants. The sensors transmit collected data to internet gateways for posterior analysis. Various tests were carried out to verify the critical characteristics of the equipment: long-range transmission modeling and experiments (with an R score of 0.96 for the best propagation model), energy power system autonomy, sensor calibration procedures, and exposure of the case to dust and water to ensure IP certifications. An inter-calibration procedure was performed to characterize the sensors against reference sensors and describe the observation error to provide acceptable ranges for the data assimilation algorithm (<10% nominal). The design, installation, testing, and implementation of this air quality network, oriented towards data assimilation over the Aburrá Valley, constitute an initial experience for the simulation capabilities toward the system’s operative capabilities. Our solution approach adds value by removing the disadvantages of low-cost devices and offers a viable solution from a developing country’s perspective, employing hardware explicitly designed for the situation. ...

Improving Air Pollution Modelling in Complex Terrain with a Coupled WRF–LOTOS–EUROS Approach

A Case Study in Aburrá Valley, Colombia

Journal article (2023) - Jhon E. Hinestroza-Ramirez, Santiago Lopez-Restrepo, A. Yarce Botero, Arjo Segers, Angela Maria Rendon-Perez, Santiago Isaza-Cadavid, A.W. Heemink, Olga Lucia Quintero

Chemical transport models (CTM) are crucial for simulating the distribution of air pollutants, such as particulate matter, and evaluating their impact on the environment and human health. However, these models rely heavily on accurate emission inventory and meteorological inputs, usually obtained from reanalyzed weather data, such as the European Centre for Medium-Range Weather Forecasts (ECMWF). These inputs do not accurately reflect the complex topography and micro-scale meteorology in tropical regions where air pollution can pose a severe public health threat. We propose coupling the LOTOS–EUROS CTM model and the weather research and forecasting (WRF) model to improve LOTOS–EUROS representation. Using WRF as a meteorological driver provides high-resolution inputs for accurate pollutant simulation. We compared LOTOS–EUROS results when WRF and ECMWF provided the meteorological inputs during low and high pollutant concentration periods. The findings indicate that the WRF–LOTOS–EUROS coupling offers a more precise representation of the meteorology and pollutant dispersion than the default input of ECMWF. The simulations also capture the spatio-temporal variability of pollutant concentration and emphasize the importance of accounting for micro-scale meteorology and topography in air pollution modelling. ...

A Knowledge-Aided Robust Ensemble Kalman Filter Algorithm for Non-Linear and Non-Gaussian Large Systems

Journal article (2022) - Santiago Lopez Restrepo, Andres Yarce, Nicolás Pinel, O. L. Quintero, Arjo Segers, A.W. Heemink

This work proposes a robust and non-Gaussian version of the shrinkage-based knowledge-aided EnKF implementation called Ensemble Time Local H_∞ Filter Knowledge-Aided (EnTLHF-KA). The EnTLHF-KA requires a target covariance matrix to integrate previously obtained information and knowledge directly into the data assimilation (DA). The proposed method is based on the robust H_∞ filter and on its ensemble time-local version the EnTLHF, using an adaptive inflation factor depending on the shrinkage covariance estimated matrix. This implies a theoretical and solid background to construct robust filters from the well-known covariance inflation technique. The proposed technique is implemented in a synthetic assimilation experiment, and in an air quality application using the LOTOS-EUROS model over the Aburrá Valley to evaluate its potential for non-linear and non-Gaussian large systems. In the spatial distribution of the PM_2.5 concentrations along the valley, the method outperforms the well-known Local Ensemble Transform Kalman Filter (LETKF), and the non-robust knowledge-aided Ensemble Kalman filter (EnKF-KA). In contrast to the other simulations, the ability to issue warnings for high concentration events is also increased. Finally, the simulation using EnTLHF-KA has lower error values than using EnKF-KA, indicating the advantages of robust approaches in high uncertainty systems. ...

Surrogate-assisted inversion for large-scale history matching

Comparative study between projection-based reduced-order modeling and deep neural network

Journal article (2022) - Cong Xiao, Hai-Xiang Lin, Olwijn Leeuwenburgh, Arnold Heemink

History matching can play a key role in improving geological characterization and reducing the uncertainty of reservoir model predictions. Application of reservoir history matching is restricted by its huge computational cost, caused by, among other things, the many runs of the full model. Surrogate models with a reduced complexity are therefore used to reduce the computational demands. This paper presents an efficient surrogate-assisted deterministic inversion framework, primarily to explore the possibility of applying a deep neural network (DNN) surrogate to approximate the gradient of large-scale history matching by using auto-differentiation (AD). In combination with the deep neural network model, the AD enables us to evaluate the gradients efficiently in a parallel manner. Furthermore, the benefits of using stochastic gradient optimizers from deep learning practice, instead of full gradient optimizers as in conventional deterministic inversions, are investigated as well. Numerical experiments are conducted on a 3D benchmark reservoir model in the context of a water-flooding production scenario. The quantity of interest, e.g., dynamic saturation for an ensemble of test models, can be accurately predicted. The proposed surrogate-assisted inversion with stochastic gradient optimizer obtains a very quick convergence rate against the model and data noise for the high-dimensional history matching problem with a large number of data and parameters. In addition, we also conduct several comparisons and evaluations with our previously proposed projection-based subdomain POD-TPWL approach in terms of computational efficiency and accuracy. The subdomain POD-TPWL constructs a local surrogate model, which is repeatedly reconstructed a number of times to maintain a satisfactory accuracy, while the DNN constructs a global surrogate model based on the entire training data and generally does not require additional reconstructions. The subdomain POD-TPWL is very sensitive to how the domain is decomposed; increasing the number of training samples does not indefinitely improve the history matching results for a fixed decomposition. Overall, these two kinds of surrogate models have demonstrated great potential in solving large-scale history matching problems. The DNN surrogate is particularly useful to generate multiple posteriors for model uncertainty quantification. ...

Position correction in dust storm forecasting using LOTOS-EUROS v2.1

Grid-distorted data assimilation v1.0

Journal article (2021) - Jianbing Jin, Arjo Segers, Hai Xiang Lin, Bas Henzing, Xiaohui Wang, Arnold Heemink, Hong Liao

When calibrating simulations of dust clouds, both the intensity and the position are important. Intensity errors arise mainly from uncertain emission and sedimentation strengths, while position errors are attributed either to imperfect emission timing or to uncertainties in the transport. Though many studies have been conducted on the calibration or correction of dust simulations, most of these focus on intensity solely and leave the position errors mainly unchanged. In this paper, a grid-distorted data assimilation, which consists of an image-morphing method and an ensemble-based variational assimilation, is designed for realigning a simulated dust plume to correct the position error. This newly developed grid-distorted data assimilation has been applied to a dust storm event in May 2017 over East Asia. Results have been compared for three configurations: a traditional assimilation configuration that focuses solely on intensity correction, a grid-distorted data assimilation that focuses on position correction only and the hybrid assimilation that combines these two. For the evaluated case, the position misfit in the simulations is shown to be dominant in the results. The traditional emission inversion only slightly improves the dust simulation, while the grid-distorted data assimilation effectively improves the dust simulation and forecasting. The hybrid assimilation that corrects both position and intensity of the dust load provides the best initial condition for forecasting of dust concentrations. ...

Conditioning of deep-learning surrogate models to image data with application to reservoir characterization

Journal article (2021) - Cong Xiao, Olwijn Leeuwenburgh, Hai-Xiang Lin, Arnold Heemink

Imaging-type monitoring techniques are used in monitoring dynamic processes in many domains, including medicine, engineering, and geophysics. This paper aims to propose an efficient workflow for application of such data for the conditioning of simulation models. Such applications are very common in e.g. the geosciences, where large-scale simulation models and measured data are used to monitor the state of e.g. energy and water systems, predict their future behavior and optimize actions to achieve desired behavior of the system. In order to reduce the high computational cost and complexity of data assimilation workflows for high-dimensional parameter estimation, a residual-in-residual dense block extension of the U-Net convolutional network architecture is proposed, to predict time-evolving features in high-dimensional grids. The network is trained using high-fidelity model simulations. We present two examples of application of the trained network as a surrogate within an iterative ensemble-based workflow to estimate the static parameters of geological reservoirs based on binary-type image data, which represent fluid facies as obtained from time-lapse seismic surveys. The differences between binary images are parameterized in terms of distances between the fluid-facies boundaries, or fronts. We discuss the impact of the choice of network architecture, loss function, and number of training samples on the accuracy of results and on overall computational cost. From comparisons with conventional workflows based entirely on high-fidelity simulation models, we conclude that the proposed surrogate-supported hybrid workflow is able to deliver results with an accuracy equal to or better than the conventional workflow, and at significantly lower cost. Cost reductions are shown to increase with the number of samples of the uncertain parameter fields. 
The hybrid workflow is generic and should be applicable in addressing inverse problems in many geophysical applications as well as other engineering domains. ...

Medellin Air Quality Initiative (MAUI)

Book chapter (2021) - A. Yarce Botero, O.L. Quintero Montoya, More authors..., S. Lopez Restrepo, N. Pinel Pelaez, J.E. Hinestroza Ramirez, Elias David Nino-Ruiz, Jimmy Anderson Flórez, Angela María Rendón, Monica Lucia Alvarez-Laínez, A.W. Heemink

This book chapter presents the Medellín Air qUality Initiative (MAUI) project; it tells a brief story of this teamwork and its scientific and technological directions. The modeling work focuses on the ecosystem and human health impacts of exposure to several pollutants that are transported over long ranges and deposited. For this objective, the WRF and LOTOS-EUROS models were configured and implemented over the region of interest, after updating some input conditions such as land use and orography. In addition, a spin-off initiative named SimpleSpace was born during this time; through this instrumentation branch, it developed a very compact and modular low-cost sensor to deploy in new air quality networks over the study domain. To test this instrument and find an alternative way to measure pollutants in the vertical layers, the Helicopter In-Situ Pollution Assessment Experiment (HIPAE) mission was developed to collect data during a helicopter overflight of Medellín. From the data obtained from the Simple units and other experiments in the payload, a cytogenotoxicity analysis quantifies the cellular damage caused by exposure to the pollutants. ...

Estimating NOx LOTOS-EUROS CTM Emission Parameters over the Northwest of South America through 4DEnVar TROPOMI NO2 Assimilation

Journal article (2021) - A. Yarce Botero, S. Lopez Restrepo, N. Pinel Pelaez, Olga Quintero-Montoya, Arjo Segers, A.W. Heemink

In this work, we present the development of a 4D-Ensemble-Variational (4DEnVar) data assimilation technique to estimate NOx top-down emissions using the regional chemical transport model LOTOS-EUROS with the NO2 observations from the TROPOspheric Monitoring Instrument (TROPOMI). The assimilation was performed for a domain in the northwest of South America centered over Colombia, which also includes regions in Panama, Venezuela and Ecuador. In the 4DEnVar approach, the implementation of the linearized and adjoint models is avoided by generating an ensemble of model simulations and by using this ensemble to approximate the nonlinear model and observation operator. Emission correction parameters’ locations were defined for positions where the model simulations showed significant discrepancies with the satellite observations. Using the 4DEnVar data assimilation method, optimal emission parameters for the LOTOS-EUROS model were estimated, allowing for corrections in areas where ground observations are unavailable and the region’s emission inventories do not correctly reflect the current emissions activities. The analyzed 4DEnVar concentrations were compared with the ground measurements of one local air quality monitoring network and the data retrieved by the satellite instrument Ozone Monitoring Instrument (OMI). The assimilation had a low impact on NO2 surface concentrations, reducing the Mean Fractional Bias from 0.45 to 0.32, and primarily enhanced the spatial and temporal variations in the simulated NO2 fields. ...
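
For readers unfamiliar with the Mean Fractional Bias quoted above, the following is a minimal sketch of how the metric is often computed. It assumes the common definition MFB = (2/N) Σ (M_i − O_i)/(M_i + O_i), with M the modelled and O the observed concentrations; the abstract does not state which definition the authors used, and the function name is illustrative.

```python
def mean_fractional_bias(model, obs):
    """Mean Fractional Bias of paired model and observed values.

    Assumes positive concentrations, so that every denominator is non-zero.
    The result lies between -2 and 2; 0 means no bias on average.
    """
    if len(model) != len(obs) or not model:
        raise ValueError("model and obs must be non-empty and of equal length")
    total = sum((m - o) / (m + o) for m, o in zip(model, obs))
    return 2.0 * total / len(model)


# A model that reads three times the observation gives 2 * (2 / 4) = 1.0.
overestimate = mean_fractional_bias([3.0], [1.0])
```

Because each term is normalised by the sum of model and observation, a few very high values do not dominate the score, which is one reason the metric is popular for air quality evaluation.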

An efficient ensemble Kalman Filter implementation via shrinkage covariance matrix estimation

Exploiting prior knowledge

Journal article (2021) - Santiago Lopez-Restrepo, Elias D. Nino-Ruiz, Luis G. Guzman-Reyes, Andres Yarce, O. L. Quintero, Nicolas Pinel, Arjo Segers, A. W. Heemink

In this paper, we propose an efficient and practical implementation of the ensemble Kalman filter via shrinkage covariance matrix estimation. Our filter implementation combines information brought by an ensemble of model realizations, and that based on our prior knowledge about the dynamical system of interest. We perform the combination of both sources of information via optimal shrinkage factors. The method exploits the rank-deficiency of ensemble covariance matrices to provide an efficient and practical implementation of the analysis step in EnKF based formulations. Localization and inflation aspects are discussed, as well. Experimental tests are performed to assess the accuracy of our proposed filter implementation by employing an Advection Diffusion Model and an Atmospheric General Circulation Model. The experimental results reveal that the use of our proposed filter implementation can mitigate the impact of sampling noise, and even more, it can avoid the impact of spurious correlations during assimilation steps. ...
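
The combination of ensemble information with prior knowledge can be illustrated with a minimal sketch of covariance shrinkage. This is not the authors' filter: it assumes a user-supplied shrinkage factor `gamma` instead of the optimal factors used in the paper, an identity matrix as a hypothetical target covariance, and plain Python lists as matrices. All names are illustrative.

```python
def sample_covariance(ensemble):
    """Unbiased sample covariance; each ensemble member is a list of state values."""
    n, dim = len(ensemble), len(ensemble[0])
    means = [sum(member[j] for member in ensemble) / n for j in range(dim)]
    return [[sum((m[i] - means[i]) * (m[j] - means[j]) for m in ensemble) / (n - 1)
             for j in range(dim)]
            for i in range(dim)]


def shrink(sample_cov, target, gamma):
    """Convex combination gamma * target + (1 - gamma) * sample_cov."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return [[gamma * t + (1.0 - gamma) * s for s, t in zip(s_row, t_row)]
            for s_row, t_row in zip(sample_cov, target)]


# Three members whose two variables move in lockstep: the sample covariance
# is rank deficient (all entries equal 4.0).
ensemble = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
p_ens = sample_covariance(ensemble)
identity = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical target covariance
p_shrunk = shrink(p_ens, identity, gamma=0.5)
```

In this toy case the shrunk matrix is [[2.5, 2.0], [2.0, 2.5]], which is invertible although the raw ensemble covariance is singular; this is the basic reason shrinkage helps when the ensemble is small relative to the state dimension.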

Spatial and Time Warping for Gauge Adjustment of Rainfall Estimates

Journal article (2021) - Camille Le Coz, Arnold Heemink, Martin Verlaan, Nick van de Giesen

Many satellite-based estimates use gauge information for bias correction. In general, bias-correction methods are focused on the intensity error and do not explicitly correct possible position or timing errors. However, position and timing errors in rainfall estimates can also lead to errors in the rainfall occurrence or the intensity. This is especially true for localized rainfall events such as the convective rainstorms occurring during the rainy season in sub-Saharan Africa. We investigated the use of warping to correct such errors. The goal was to gauge-adjust satellite-based estimates with respect to the position and the timing of the rain event, instead of its intensity. Warping is a field-deformation method that transforms an image into another one. We compared two methods, spatial warping focusing on the position errors and time warping for the timing errors. They were evaluated on two case studies: a synthetic rainfall event represented by an ellipse and a rain event in southern Ghana during the monsoon season. In both cases, the two warping methods reduced significantly the respective targeted (position or timing) errors. In the southern Ghana case, the average position error was decreased by about 45 km by the spatial warping and the average timing error was decreased from more than 1 h to 0.2 h by the time warping. Both warping methods also improved the continuous statistics on the intensity: the correlation went from 0.18 to at least 0.62 after warping in the southern Ghana case. The spatial warping seems more interesting because of its positive impact on both position and timing errors. ...

Urban air quality modeling using low-cost sensor network and data assimilation in the Aburrá Valley, Colombia

Journal article (2021) - Santiago Lopez Restrepo, Andrés Yarce, Nicolás Pinel, O.L. Quintero, Arjo Segers, A.W. Heemink

The use of low-cost air quality networks has been increasing in recent years to study urban pollution dynamics. Here we show the evaluation of the operational Aburrá Valley’s low-cost network against the official monitoring network. The results show that the PM_2.5 low-cost measurements are very close to those observed by the official network. Additionally, the low-cost network allows a higher spatial representation of the concentrations across the valley. We integrate low-cost observations with the chemical transport model Long Term Ozone Simulation-European Operational Smog (LOTOS-EUROS) using data assimilation. Two different configurations of the low-cost network were assimilated: the whole low-cost network (255 sensors), and a high-quality selection using just the sensors with a correlation factor greater than 0.8 with respect to the official network (115 sensors). The official stations were also assimilated to assess the impact of the denser low-cost network on the model performance. Both simulations assimilating the low-cost network outperform the model without assimilation and the model assimilating the official network. The capability to issue warnings for pollution events is also improved by assimilating the low-cost network with respect to the other simulations. Finally, the simulation using the high-quality configuration has lower error values than the one using the complete low-cost network, showing that it is essential to consider the quality and location of the sensors and not just their total number. Our results suggest that with the current advance in low-cost sensors, it is possible to improve model performance with low-cost network data assimilation. ...

Efficient estimation of space varying parameters in numerical models using non-intrusive subdomain reduced order modeling

Journal article (2021) - Cong Xiao, Olwijn Leeuwenburgh, Hai Xiang Lin, Arnold Heemink

A reduced order modeling algorithm for the estimation of space varying parameter patterns in numerical models is proposed. In this approach, domain decomposition is applied to construct separate approximations to the numerical model in every subdomain. We introduce a new local parameterization that decouples the computational cost of the algorithm from the number of global principal components and therefore provides attractive scaling for models with a very large number of uncertain parameter patterns. By defining uncertain parameter patterns only in the various subdomains, the number of full order simulations required for the derivation of the reduced order models can be reduced drastically. To avoid non-smoothness at the boundaries of the subdomains, the optimal local parameter patterns are projected onto global parameter patterns. The computational effort of the new methodology hardly increases when the number of parameter patterns increases. The number of training models depends primarily on the maximum number of local parameters in a subdomain, which can be decreased by refining the domain decomposition. We apply the new algorithm to a large-scale reservoir model parameter estimation problem. In this application, 282 parameters could be estimated using only 90 full order model runs. ...

Data Assimilation as a Tool to Improve Chemical Transport Models Performance in Developing Countries

Book chapter (2021) - S. Lopez Restrepo, A. Yarce Botero, More Authors..., O.L. Quintero Montoya, N. Pinel Pelaez, J.E. Hinestroza Ramirez, Elias David Nino-Ruiz, Jimmy Anderson Flórez, Angela Maíra Rendón, Monica Lucia Alvarez-Laínez, A.W. Heemink

Particulate matter (PM) is one of the most problematic pollutants in urban air. The effects of PM on human health, associated especially with PM of ≤2.5 μm in diameter, include asthma, lung cancer and cardiovascular disease. Consequently, major urban centers commonly monitor PM2.5 as part of their air quality management strategies. Chemical transport models allow continuous monitoring and prediction of pollutant behavior across all regions of interest, unlike a sensor network, where concentrations are available only at specific points. In this chapter, a data assimilation system for the LOTOS-EUROS chemical transport model has been implemented to improve the simulation and forecast of particulate matter in a densely populated urban valley of the tropical Andes. The Aburrá Valley in Colombia was used as a case study, given data availability and current environmental issues related to population expansion. Using different experiments and observation sources, we show how data assimilation can improve the model representation of pollutants. ...

Deep-Learning Inversion to Efficiently Handle Big-Data Assimilation

Application to Seismic History Matching

Conference paper (2020) - C. Xiao, A.W. Heemink, H.X. Lin, O. Leeuwenburgh

Seismic history matching can play a key role in geological characterization and uncertainty quantification. However, challenges related to intensive computational demands and complexity restrict its application in many practical cases. This paper presents a conceptual deep-learning-based framework, fully deployed in the popular PyTorch architecture, to accelerate seismic history matching. We introduce a surrogate model based on a deep convolutional neural network with a stack of dense blocks, specifically a conditional deep convolutional autoencoder-decoder architecture (cDCAE). The surrogate model allows us to fully deploy data assimilation algorithms in the PyTorch architecture and hence to easily make full use of efficient computing units, in particular GPUs, for the matrix-matrix and matrix-vector multiplications. The built-in automatic differentiation (AD) provided by PyTorch also makes it possible to evaluate gradient information efficiently in a parallel manner. Furthermore, the framework has been found to benefit from the deep-learning practice of using stochastic gradient (SG) optimizers, e.g., Adam, instead of the full gradient optimizers, e.g., Quasi-Newton, that are most common in conventional big-data assimilation. The proposed framework is tested on a benchmark 3D model in the context of petroleum engineering. The surrogate model is demonstrated to be capable of accurately predicting the quantity of interest, e.g., dynamic saturation maps for new geological realizations. Assessments demonstrating high surrogate-model accuracy are presented for an ensemble of test models. The robustness and dramatic speedup provided by the surrogate model suggest that it can help facilitate the application of large-scale seismic history matching. ...

Source backtracking for dust storm emission inversion using an adjoint method

Case study of Northeast China

Journal article (2020) - Jianbing Jin, Arjo Segers, Hong Liao, Arnold Heemink, Richard Kranenburg, Hai Xiang Lin

Emission inversion using data assimilation fundamentally relies on having the correct assumptions about the emission background error covariance. A perfect covariance accounts for the uncertainty based on prior knowledge and is able to explain differences between model simulations and observations. In practice, emission uncertainties are constructed empirically; hence, a partially unrepresentative covariance is unavoidable. Concerning its complex parameterization, dust emissions are a typical example where the uncertainty could be induced from many underlying inputs, e.g., information on soil composition and moisture, land cover and erosive wind velocity, and these can hardly be taken into account together. This paper describes how an adjoint model can be used to detect errors in the emission uncertainty assumptions. This adjoint-based sensitivity method could serve as a supplement of a data assimilation inverse modeling system to trace back the error sources in case large observation-minus-simulation residues remain after assimilation based on empirical background covariance. The method follows an application of a data assimilation emission inversion for an extremely severe dust storm over East Asia (Jin et al., 2019b). The assimilation system successfully resolved observation-minus-simulation errors using satellite AOD observations in most of the dust-affected regions. However, a large underestimation of dust in Northeast China remained despite the fact that the assimilated measurements indicated severe dust plumes there. An adjoint implementation of our dust simulation model is then used to detect the most likely source region for these unresolved dust loads. The backward modeling points to the Horqin desert as the source region, which was indicated as a non-source region by the existing emission scheme. The reference emission and uncertainty are then reconstructed over the Horqin desert by assuming higher surface erodibility. After the emission reconstruction, the emission inversion is performed again, and the posterior dust simulations and reality are now in much closer harmony. Based on our results, it is advised that emission sources in dust transport models include the Horqin desert as a more active source region. ...
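The idea of tracing an unexplained concentration back to candidate source regions with an adjoint model can be sketched on a toy problem. The example below is not the dust model or adjoint used in the paper: it assumes a hypothetical 1-D linear transport model in which each cell passes a decayed fraction of its load to the next cell downwind, and all names and numbers are illustrative. Running the transposed model backward from a receptor cell gives the sensitivity of the receptor concentration to a constant emission in every cell.

```python
def forward(emission, n_steps, decay):
    """Toy forward model: each step moves mass one cell downwind with decay, then adds emissions."""
    n_cells = len(emission)
    conc = [0.0] * n_cells
    for _ in range(n_steps):
        conc = [emission[0]] + [
            decay * conc[i - 1] + emission[i] for i in range(1, n_cells)
        ]
    return conc


def adjoint_sensitivity(n_cells, receptor, n_steps, decay):
    """Sensitivity of the final receptor concentration to a constant emission in each cell.

    The adjoint variable starts as a unit pulse at the receptor and is moved
    upwind by the transposed transport operator, one step at a time.
    """
    adjoint = [0.0] * n_cells
    adjoint[receptor] = 1.0
    sensitivity = [0.0] * n_cells
    for _ in range(n_steps):
        sensitivity = [s + a for s, a in zip(sensitivity, adjoint)]
        # Transposed transport: cell i receives from cell i + 1.
        adjoint = [decay * adjoint[i + 1] for i in range(n_cells - 1)] + [0.0]
    return sensitivity


n_cells, receptor, n_steps, decay = 8, 6, 4, 0.5
sensitivity = adjoint_sensitivity(n_cells, receptor, n_steps, decay)
print(sensitivity)
```

Cells with zero sensitivity cannot have contributed to the receptor within the simulated window, which is how a backward run narrows down the possible source region. Because the toy model is linear, each entry of `sensitivity` equals the receptor concentration produced by a unit emission in that cell, so a single backward run replaces one forward run per candidate cell.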

Forecasting PM10 and PM2.5 in the Aburra Valley (Medellin, Colombia) via EnKF based data assimilation

Journal article (2020) - Santiago Lopez Restrepo, Andrés Yarce, Nicolas Pinel, O.L. Quintero, Arjo Segers, A.W. Heemink

A data assimilation system for the LOTOS-EUROS chemical transport model has been implemented to improve the simulation and forecast of PM10 and PM2.5 in a densely populated urban valley of the tropical Andes. The Aburrá Valley in Colombia was used as a case study, given data availability and current environmental issues related to population expansion. The data assimilation system is an Ensemble Kalman filter with covariance localization based on specification of uncertainties in the emissions. The assimilated observations were obtained from a surface network for March–April 2016, a period that saw one of the worst air quality crises in the recent history of the region. In a first series of experiments, the spatial length scale of the covariance localization and the temporal length scale of the stochastic model for the emission uncertainty were calibrated to optimize the assimilation system. The calibrated system was then used in a series of assimilation experiments, in which the simulation of particulate matter concentrations was strongly improved during the assimilation period, which also improved the ability to accurately forecast PM10 and PM2.5 concentrations over a period of several days. ...
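As a rough illustration of an Ensemble Kalman filter analysis step with covariance localization, the sketch below updates a 1-D gridded ensemble with a single point observation. It is a minimal sketch under stated assumptions, not the LOTOS-EUROS implementation: the Gaussian taper, the perturbed-observation (stochastic) form of the filter, the function names and all numbers are choices made for this example. The `length_scale` argument plays the role of the spatial localization length scale that the abstract describes as being calibrated.

```python
import math
import random


def gaussian_taper(distance, length_scale):
    """Localization weight that decays with distance (Gaussian shape assumed here)."""
    return math.exp(-0.5 * (distance / length_scale) ** 2)


def enkf_update_single_obs(ensemble, obs_index, obs_value, obs_var, length_scale, rng):
    """Stochastic EnKF update of a 1-D gridded state with one point observation.

    ensemble: list of members, each a list of grid values.
    Returns a new ensemble and leaves the input unchanged.
    """
    n_members = len(ensemble)
    n_grid = len(ensemble[0])
    mean = [sum(m[i] for m in ensemble) / n_members for i in range(n_grid)]

    # Sample covariance between every grid point and the observed grid point.
    cov_with_obs = [
        sum((m[i] - mean[i]) * (m[obs_index] - mean[obs_index]) for m in ensemble)
        / (n_members - 1)
        for i in range(n_grid)
    ]
    var_at_obs = cov_with_obs[obs_index]

    # Localized Kalman gain: the taper damps spurious long-range covariances.
    gain = [
        gaussian_taper(abs(i - obs_index), length_scale)
        * cov_with_obs[i]
        / (var_at_obs + obs_var)
        for i in range(n_grid)
    ]

    updated = []
    for member in ensemble:
        # Each member is updated with its own perturbed copy of the observation.
        perturbed_obs = obs_value + rng.gauss(0.0, math.sqrt(obs_var))
        innovation = perturbed_obs - member[obs_index]
        updated.append([member[i] + gain[i] * innovation for i in range(n_grid)])
    return updated


# Hypothetical setup: 30 members, 20 grid cells, one observation at cell 5.
rng = random.Random(0)
prior = [[rng.gauss(10.0, 2.0) for _ in range(20)] for _ in range(30)]
posterior = enkf_update_single_obs(
    prior, obs_index=5, obs_value=14.0, obs_var=0.25, length_scale=3.0, rng=rng
)
prior_mean_at_obs = sum(m[5] for m in prior) / len(prior)
posterior_mean_at_obs = sum(m[5] for m in posterior) / len(posterior)
print(prior_mean_at_obs, posterior_mean_at_obs)
```

With a short localization length scale, the update stays close to the observed cell and leaves distant cells essentially untouched; a longer length scale spreads the correction further but also lets sampling noise in the ensemble covariance influence remote cells.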