A. Heinlein
33 records found
Bayesian Inverse Generative Neural Operator
Latent-Space Posterior Formulation for PDE-Constrained Inverse Problems
ReLU network. In the infinite-width limit, the residual dynamics are governed by the Neural Tangent Kernel. Under the uniform measure on the circle, this kernel depends only on the angle between two points, so the associated operator is a convolution with the Fourier modes as its eigenfunctions. Each mode decays at a rate set by its eigenvalue; the lower the frequency, the larger the eigenvalue, so low frequencies are learned first.
Away from this idealised limit the picture degrades only gradually. On a fixed low-frequency subspace, both finite sampling and frozen finite width keep the operator close to the continuum Fourier prediction, with error of order O(n^(-1/2)) in the sample size n and O(m^(-1/2)) in the width m. The description breaks only once the kernel is allowed to evolve during training. At small width the evolving kernel reaches a lower loss by strengthening its lowest-frequency components, even as its alignment with the Fourier basis fails to improve. This reinforces the low-frequency bias rather than approximating the fixed-kernel dynamics. A formal theory of this evolving-kernel regime remains the main open problem.
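The convolution argument above can be checked numerically. The following is a minimal sketch, assuming a stand-in kernel exp(cos(theta) - 1) that depends only on the angle between two points; it is not the Neural Tangent Kernel of the abstract, and the names `angle_kernel`, `fourier_mode` and `residual_factor` are hypothetical. On equispaced points the kernel matrix is circulant, so the discrete Fourier modes are its eigenvectors and the FFT of its first row gives the eigenvalues.

```python
import numpy as np

def angle_kernel(theta):
    """Stand-in kernel that depends only on the angle theta (an assumption, not the NTK)."""
    return np.exp(np.cos(theta) - 1.0)

n = 64
x = 2.0 * np.pi * np.arange(n) / n
# Kernel matrix on equispaced points, scaled by 1/n to mimic the uniform measure.
K = angle_kernel(x[:, None] - x[None, :]) / n

# K is circulant and symmetric, so the FFT of its first row gives one real
# eigenvalue per Fourier mode.
eigvals = np.fft.fft(K[0]).real

def fourier_mode(k, n):
    """Discrete Fourier mode of frequency k on n equispaced points."""
    return np.exp(2j * np.pi * k * np.arange(n) / n)

def residual_factor(k, t, lr=1.0):
    """Factor by which the mode-k residual shrinks after time t under kernel gradient flow."""
    return np.exp(-lr * eigvals[k] * t)

for k in range(4):
    print(k, eigvals[k], residual_factor(k, t=10.0))
```

For this stand-in kernel the eigenvalues fall with frequency, so the lowest modes have the smallest residual factor at any fixed time.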
Surrogates are trained to minimise the one-step error on fixed velocity and diffusion fields and evaluated autoregressively on unseen fields over horizons eight times the training window. Two families are compared at a fixed message-passing budget: single-scale models, and multiscale models organised as V-cycles over predetermined coarsened graphs. For each model, a piecewise-linear fit of the final rollout error against the CFL and Fourier numbers yields empirical stability limits, defined by a blow-up threshold.
Within these limits the surrogates reproduce the finite element reference accurately on both seen and unseen fields and show no abrupt change beyond the training horizon, although the diffusion-dominated regime is consistently harder than the advection-dominated one. The single-scale CFL limit tracks the number of message-passing blocks and lies slightly above it. Adding coarse levels at a fixed total message-passing layer budget broadens the advective stability range, decisively at the largest stride, but a two-level hierarchy trades diffusive stability and in-region accuracy for this gain. Only a three-level V-cycle removes the penalty, attaining zero blow-ups on both axes, and deeper models show no oversmoothing.
The diffusion-side limits carry large variance, traced partly to the dissipative backward-Euler reference, and should be read as indicative. The work delivers a concrete operational range for GNN surrogates and identifies how multiscale models can extend it.
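The two axes of the stability study are the standard dimensionless CFL and Fourier numbers. The sketch below uses their textbook definitions for a time step dt and mesh size h; the function names and the limit values passed to `within_stability_range` are hypothetical placeholders, not results from the abstract.

```python
def cfl_number(velocity, dt, h):
    """Advective CFL number |u| * dt / h."""
    return abs(velocity) * dt / h

def fourier_number(diffusivity, dt, h):
    """Diffusive Fourier number D * dt / h**2."""
    return diffusivity * dt / h**2

def within_stability_range(cfl, fo, cfl_limit, fo_limit):
    """True if a rollout setting lies inside the given empirical limits."""
    return cfl <= cfl_limit and fo <= fo_limit

cfl = cfl_number(velocity=2.0, dt=0.1, h=0.5)
fo = fourier_number(diffusivity=0.05, dt=0.1, h=0.5)
# The limits below are illustrative only.
print(cfl, fo, within_stability_range(cfl, fo, cfl_limit=4.0, fo_limit=1.0))
```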
conditions of a 1D diffusion equation and a 1D Burgers’ equation to their terminal solutions, and operators approximating functions with piecewise functions on dyadic partitions. We analyze the sources of error in the proposed framework by decomposing the total error into the neural network generalization error and the reconstruction error. This perspective provides insight into how different components of the model contribute to the final prediction accuracy. In a series of numerical experiments, we compare models trained with different numbers of DWT coefficients for function representation, motivated by the fact that functions can be well approximated using only a subset of DWT coefficients, which also reduces computational cost. The experimental results show that our model achieves higher prediction accuracy than comparable models based on Fourier transforms on certain tasks. Moreover, we observe a non-monotonic relationship between the model’s prediction accuracy and the number of DWT coefficients used. In addition, experiments show that using a single neural network to learn the mappings among all wavelet coefficients is often more accurate than using multiple neural networks to separately learn the mappings of wavelet coefficients at different decomposition levels. Overall, the study demonstrates the effectiveness of the proposed model and analyzes the sources of its errors, thereby revealing its strengths and limitations.
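The reconstruction part of the error can be illustrated with a small experiment. This is a sketch under assumptions: it uses the orthonormal Haar wavelet and keeps the largest-magnitude coefficients, neither of which is stated in the abstract, and all function names are hypothetical. It shows only the reconstruction error of a truncated representation, not the network generalization error.

```python
import numpy as np

def haar_dwt(signal):
    """Full orthonormal Haar decomposition of a signal whose length is a power of two."""
    approx = np.asarray(signal, dtype=float)
    details = []
    while approx.size > 1:
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    # Layout: [approximation, coarsest detail, ..., finest detail]
    return np.concatenate([approx] + details[::-1])

def haar_idwt(coeffs):
    """Inverse of haar_dwt."""
    coeffs = np.asarray(coeffs, dtype=float)
    approx, pos = coeffs[:1], 1
    while pos < coeffs.size:
        detail = coeffs[pos:pos + approx.size]
        pos += approx.size
        out = np.empty(2 * approx.size)
        out[0::2] = (approx + detail) / np.sqrt(2.0)
        out[1::2] = (approx - detail) / np.sqrt(2.0)
        approx = out
    return approx

def keep_largest(coeffs, k):
    """Zero all but the k largest-magnitude coefficients."""
    kept = np.zeros_like(coeffs)
    idx = np.argsort(np.abs(coeffs))[::-1][:k]
    kept[idx] = coeffs[idx]
    return kept

x = np.linspace(0.0, 1.0, 256, endpoint=False)
f = np.sin(2.0 * np.pi * x) + 0.5 * (x > 0.5)
c = haar_dwt(f)
ks = [4, 16, 64, 256]
errors = [np.linalg.norm(f - haar_idwt(keep_largest(c, k))) for k in ks]
print(dict(zip(ks, errors)))
```

Because the transform is orthonormal, the reconstruction error alone cannot grow as more coefficients are kept, so a non-monotonic accuracy trend would have to come from the learned mapping.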
To address these limitations, this study proposes a method based exclusively on Full Order Model (FOM) data that follows a uniform training pipeline applicable to any DH network without requiring manual analysis of the network topology or case-specific architectural adjustments. A hybrid framework based on Proper Orthogonal Decomposition (POD) is developed, in which POD extracts dominant spatial modes from high-dimensional FOM data, while a feedforward neural network predicts the corresponding temporal coefficients from compressed input features. The ROM output is subsequently used as an initial guess for the FOM state iteration procedure, thereby preserving physical consistency.
The approach is evaluated on two realistic DH networks of different scales. In both cases, the ROM achieves total relative reconstruction errors below 5% (4.8% for the smaller network and 3.6% for the larger network), with prediction times below 0.1 seconds compared to approximately 100 seconds for a single FOM iteration. For the smaller network, integrating the ROM into the optimization workflow results in a 1.17× speed-up while producing decision variables nearly identical to those obtained with the FOM. This improvement arises from skipping the first FOM iteration, reducing the number of iterations required for convergence, and updating fewer time steps per iteration. For the larger network, the ROM maintains high predictive accuracy but performs less reliably during optimization, likely due to limited training data for rarely activated backup sources. Overall, the results demonstrate that hybrid POD-based ROMs can significantly improve the computational efficiency of DH network state estimation and optimization, provided that the training dataset adequately represents all relevant operational regimes.
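The POD step of such a hybrid framework can be sketched with a thin SVD of a snapshot matrix. The snapshot data below is synthetic and the names `pod_basis` and `relative_error` are hypothetical; the abstract's feedforward network, which would predict the temporal coefficients, is not reproduced here.

```python
import numpy as np

def pod_basis(snapshots, r):
    """Leading r POD modes (left singular vectors) and all singular values."""
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    return U[:, :r], s

def relative_error(snapshots, basis):
    """Relative Frobenius error of projecting the snapshots onto the basis."""
    recon = basis @ (basis.T @ snapshots)
    return np.linalg.norm(snapshots - recon) / np.linalg.norm(snapshots)

# Synthetic snapshot matrix: rows are spatial points, columns are time steps.
rng = np.random.default_rng(0)
space = np.linspace(0.0, 1.0, 200)[:, None]
time = np.linspace(0.0, 1.0, 50)[None, :]
snapshots = (np.sin(np.pi * space) * np.cos(2.0 * np.pi * time)
             + 0.3 * np.sin(3.0 * np.pi * space) * np.sin(4.0 * np.pi * time)
             + 1e-3 * rng.standard_normal((200, 50)))

basis, s = pod_basis(snapshots, r=2)
# Temporal coefficients: the quantities a regression model would be trained to predict.
coefficients = basis.T @ snapshots
err = relative_error(snapshots, basis)
print(coefficients.shape, err)
```

The projection error equals the energy in the discarded singular values, which gives a lower bound on the error of any ROM built on the same modes.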
Sharpened CG Iteration Bound for High-contrast Heterogeneous Scalar Elliptic PDEs
Going Beyond Condition Number
Building on foundational work in spectral analysis and iterative solvers, the thesis introduces novel multi-cluster and tail-cluster bounds for the CG method. These bounds are derived through a combination of theoretical analysis and practical algorithms for partitioning eigenspectra, and are validated both analytically and numerically. The new bounds utilize key spectral characteristics, such as cluster condition numbers and spectral width, to more accurately estimate the number of iterations required for convergence. Numerical experiments demonstrate that the sharpened bounds can be up to 1000 times tighter than the classical bound and are effective in distinguishing the robustness of different Schwarz preconditioners.
Despite their improved accuracy, the practical application of these bounds for a priori iteration estimation is challenged by the need for detailed spectral information, which is often unavailable in the early stages of iterative solvers. The thesis discusses heuristic approaches for leveraging partial spectral data and highlights the dependency of bound accuracy on the choice of coefficient functions and preconditioners.
In conclusion, the sharpened CG iteration bounds developed in this work provide a significant advancement in predictive performance analysis for high-contrast elliptic problems. Future research directions include refining cluster partitioning algorithms, improving a priori spectral estimation, and extending the applicability of these bounds to more complex problems and preconditioners.
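For reference, the classical condition-number bound that the sharpened bounds are compared against can be evaluated directly. The sketch below uses the textbook estimate that the relative energy-norm error after k iterations is at most 2 * ((sqrt(kappa) - 1) / (sqrt(kappa) + 1))^k; the function name is hypothetical, and the multi-cluster and tail-cluster bounds of the thesis are not reproduced.

```python
import math

def classical_cg_iterations(kappa, tol):
    """Smallest k with 2 * rho**k <= tol, where rho = (sqrt(kappa)-1)/(sqrt(kappa)+1).

    Assumes kappa >= 1 and 0 < tol < 1.
    """
    if kappa <= 1.0:
        return 1  # a perfectly conditioned system converges in one step
    root = math.sqrt(kappa)
    rho = (root - 1.0) / (root + 1.0)
    return max(1, math.ceil(math.log(tol / 2.0) / math.log(rho)))

for kappa in (1e2, 1e4, 1e6):
    print(kappa, classical_cg_iterations(kappa, tol=1e-8))
```

The estimate grows roughly like sqrt(kappa), which is why it becomes pessimistic for high-contrast coefficients when only a few eigenvalues are extreme.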
To address these limitations, this thesis introduces two enhancement strategies. First, we propose Gradient-Enhanced HINTS (GE-HINTS), a method that incorporates first-order derivative information into the DeepONet's loss function. Motivated by the anti-frequency principle, this approach mitigates the model's spectral bias and thus improves the performance of HINTS. Second, we develop "HINTS-in-the-loop" training strategies, which make the DeepONet model aware of the true residual distributions it will encounter during inference. This is achieved through both an offline data augmentation strategy and an online, end-to-end differentiable training loop that optimizes the solver's multi-step performance.
Numerical experiments on benchmark problems demonstrated the effectiveness of our proposed methods. Both GE-HINTS and the HINTS-in-the-loop strategies significantly accelerate the convergence of the single-level HINTS solver. Overall, this thesis provides both mechanistic understanding and practical strategies for accelerating the HINTS framework. We hope these insights will aid researchers seeking effective hybrid iterative solvers and will contribute to further progress in this area.
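The effect of adding first-order derivative information to a loss can be seen on a 1D grid. This is a minimal sketch, assuming a finite-difference derivative and a mean-squared-error form; the actual GE-HINTS loss is not given in the abstract, and `gradient_enhanced_loss` and its `weight` parameter are hypothetical.

```python
import numpy as np

def gradient_enhanced_loss(pred, target, dx, weight=1.0):
    """Mean squared error on values plus a weighted mean squared error on derivatives."""
    value_term = np.mean((pred - target) ** 2)
    grad_term = np.mean((np.gradient(pred, dx) - np.gradient(target, dx)) ** 2)
    return value_term + weight * grad_term

x = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
dx = x[1] - x[0]
target = np.sin(x)

# Two predictions with errors of equal amplitude but different frequency.
low_freq_error = target + 0.1 * np.sin(2.0 * x)
high_freq_error = target + 0.1 * np.sin(16.0 * x)

print(gradient_enhanced_loss(low_freq_error, target, dx))
print(gradient_enhanced_loss(high_freq_error, target, dx))
```

Differentiating an error of frequency k scales it by about k, so the derivative term penalises high-frequency errors more strongly than a plain value loss does.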
Preconditioned Krylov Solvers under Shared-Memory Parallelism
Evaluating Convergence, Scalability, and Parallel Overhead
Github: https://github.com/Hugoreijersen/Krylov-Subspace-Methods.git
On the Effectiveness of Modeling Uncertain Constraint-Based Utility Functions with Quadratic Polynomials
With Applications in Autonomous Negotiations
The main contributions of this thesis are threefold. First, it introduces a probabilistic complexity measure for these hypercubic functions, capturing how parameters such as dimensionality, constraint width, the number of constraints, and the number of issues interact to shape the function's complexity. Second, it develops a novel agent that leverages a regression model with quadratic basis functions to construct a surrogate model of a hypercubic constraint-based utility function. Third, it evaluates the agent through extensive experiments, demonstrating how performance scales with complexity. Following the steps outlined in this thesis, the performance of surrogate models can be directly compared.
The results demonstrate that the surrogate-based method is a promising approach: the agent constructed in this thesis outperforms the agents from the 2014 Automated Negotiating Agent Competition, which used scenarios similar to those considered in this thesis. These agents all search the utility function directly rather than a surrogate model of it. Furthermore, the results indicate that simple basis functions, such as quadratic ones, enable the agent to reach the global maximum of its utility function in low-complexity hypercubic cases, with performance scaling reasonably well up to medium complexity. Beyond this point, however, performance deteriorates rapidly, clearly signaling the need for more expressive surrogate models.
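A regression surrogate with quadratic basis functions can be sketched with ordinary least squares. The utility below is a toy hypercube-constraint function and every name (`hypercube_utility`, `quadratic_features`, `fit_surrogate`, `predict`) is a hypothetical illustration; the agent, scenarios and complexity measure of the thesis are not reproduced.

```python
import numpy as np
from itertools import combinations_with_replacement

def hypercube_utility(X, lows, highs, values):
    """Sum of the values of all axis-aligned box constraints that each bid satisfies."""
    inside = np.all((X[:, None, :] >= lows) & (X[:, None, :] <= highs), axis=2)
    return inside.astype(float) @ values

def quadratic_features(X):
    """Columns: 1, x_i, and all products x_i * x_j with i <= j."""
    X = np.atleast_2d(X)
    n, d = X.shape
    cols = [np.ones(n)] + [X[:, i] for i in range(d)]
    cols += [X[:, i] * X[:, j] for i, j in combinations_with_replacement(range(d), 2)]
    return np.column_stack(cols)

def fit_surrogate(X, y):
    """Least-squares weights of the quadratic surrogate."""
    w, *_ = np.linalg.lstsq(quadratic_features(X), y, rcond=None)
    return w

def predict(w, X):
    return quadratic_features(X) @ w

rng = np.random.default_rng(1)
bids = rng.uniform(0.0, 10.0, size=(200, 2))
lows = np.array([[2.0, 2.0], [4.0, 3.0]])
highs = np.array([[8.0, 8.0], [6.0, 7.0]])
values = np.array([1.0, 2.0])

w = fit_surrogate(bids, hypercube_utility(bids, lows, highs, values))
best_bid = bids[np.argmax(predict(w, bids))]
print(best_bid)
```

A quadratic has a single smooth bump, which matches the observation that such surrogates work for low-complexity cases and degrade as the number of overlapping constraints grows.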
Operator Learning for Loss Parameter Estimation in Dredging Operations
To optimize the suction production on Trailing Suction Hopper Dredgers
Due to the nature of a lagging density sensor, we integrate a real-time rolling mean error correction mechanism. This addresses training biases for refined predictions, as well as offering an anomaly detection mechanism. The model is trained and validated on real-world vessel data, including synthetic simulations of vacuum processes, and evaluated using trip-wise and global metrics. Experimental results show that the proposed architecture significantly outperforms the rolling mean baseline setups and the classical DeepONet across accuracy metrics such as the root mean square error (RMSE).
This work demonstrates the value of combining domain knowledge with operator learning techniques in maritime engineering. The proposed framework is scalable, allowing application across entire fleets for real-time suction production estimation and anomaly detection and contributing to efficient dredging operations.
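One way to realise a rolling mean error correction with a lagging sensor is sketched below. It is an assumption-laden illustration: the window length, the lag handling and the threshold rule for flagging anomalies are hypothetical choices, not details taken from the abstract.

```python
import numpy as np

def rolling_mean_correction(pred, observed, window, lag):
    """Shift each prediction by the mean of recent residuals that are already available.

    At step t only residuals strictly older than `lag` steps are used, which
    mimics a sensor whose readings arrive late.
    """
    pred = np.asarray(pred, dtype=float)
    residual = np.asarray(observed, dtype=float) - pred
    correction = np.zeros_like(pred)
    for t in range(pred.size):
        end = t - lag                 # residuals with index < end are available
        start = max(0, end - window)
        if end > start:
            correction[t] = residual[start:end].mean()
    return pred + correction, correction

def flag_anomalies(correction, threshold):
    """Flag steps where the running bias exceeds a chosen threshold."""
    return np.abs(correction) > threshold

truth = np.sin(np.linspace(0.0, 6.0, 100))
biased_pred = truth + 2.0  # model with a constant bias
corrected, correction = rolling_mean_correction(biased_pred, truth, window=10, lag=3)
print(np.abs(corrected[4:] - truth[4:]).max())
```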
Activation function trade-offs for training efficiency of Physics-Informed Neural Networks used in solving 1D Burgers’ Equation
Analyzing the impact of the choice of adaptive activation function on the speed and accuracy of generating PDE solutions using PINNs
Leveraging Parallel Schwarz Domain Decomposition
Using node level parallelism for the implementation of the parallel Schwarz method
This thesis introduces a novel machine learning framework designed to improve Reynolds-averaged Navier-Stokes models in turbulent stratified gas-liquid flows while employing the Boussinesq approximation. The framework encompasses two methods for turbulent viscosity field inversion and introduces correction terms in the turbulence model equations to ensure an accurate prediction of the turbulent viscosity field. Through sparse symbolic regression, the framework consistently discovers models that improve the accuracy of the baseline RANS model, even in untrained flow scenarios, though further testing is needed for varied flow regimes.
Key findings include the superior performance of sparse symbolic regression models over neural network (NN) models in improving the baseline RANS model accuracy. Notably, LASSO and elastic net techniques yielded the most successful models, significantly reducing baseline errors. However, these models did not surpass the Egorov damping approach in terms of accuracy, indicating the need for further refinement.
The developed models were numerically stable and robust, which is important for practical use. However, a main limitation is that the models' accuracy during training did not always correlate with the results when coupled with the RANS equations. Moreover, data from more varied flow conditions is needed to properly assess the generalizability of the models.
Overall, this research highlights the potential of data-driven turbulence modelling to enhance two-phase flow simulations, marking a significant step forward while also identifying areas for future improvement and exploration.
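The idea behind sparse regression over a library of candidate terms can be sketched with a small LASSO solver. This is an illustration under assumptions: the library columns are random stand-ins for evaluated candidate terms, the target is synthetic, and the solver is a plain proximal-gradient (ISTA) loop with a hypothetical name, not the tooling used in the thesis.

```python
import numpy as np

def lasso_ista(A, y, alpha, n_iter=2000):
    """Minimise (1/(2n)) * ||A w - y||^2 + alpha * ||w||_1 by proximal gradient descent."""
    n = A.shape[0]
    step = n / np.linalg.norm(A, 2) ** 2  # inverse Lipschitz constant of the smooth part
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = w - step * (A.T @ (A @ w - y)) / n
        # Soft thresholding sets small coefficients exactly to zero.
        w = np.sign(z) * np.maximum(np.abs(z) - step * alpha, 0.0)
    return w

rng = np.random.default_rng(2)
# Each column stands in for one candidate term evaluated on the training data.
library = rng.standard_normal((200, 6))
# Synthetic correction that uses only two of the six candidate terms.
target = 1.5 * library[:, 1] - 0.8 * library[:, 4] + 0.01 * rng.standard_normal(200)

w = lasso_ista(library, target, alpha=0.05)
print(np.round(w, 3))
```

The L1 penalty removes unused terms entirely, which is what makes the discovered correction readable as a short formula; the price is a slight shrinkage of the retained coefficients.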