A. Heinlein
73 records found
PACMANN
Point adaptive collocation method for artificial neural networks
Physics-Informed Neural Networks (PINNs) have emerged as a tool for approximating the solution of Partial Differential Equations (PDEs) in both forward and inverse problems. PINNs minimize a loss function which includes the PDE residual determined for a set of collocation points. Previous work has shown that the number and distribution of these collocation points have a significant influence on the accuracy of the PINN solution. Therefore, the effective placement of these collocation points is an active area of research. Specifically, available adaptive collocation point sampling methods have been reported to scale poorly in terms of computational cost when applied to high-dimensional problems. In this work, we address this issue and present the Point Adaptive Collocation Method for Artificial Neural Networks (PACMANN). PACMANN incrementally moves collocation points toward regions of higher residuals using gradient-based optimization algorithms guided by the gradient of the PINN loss function, that is, the squared PDE residual. We apply PACMANN to several forward and inverse problems, including one with a low-regularity solution and the 3D Navier-Stokes equations, and demonstrate that this method matches the performance of state-of-the-art methods in terms of the accuracy/efficiency tradeoff for the low-dimensional problems, while outperforming available approaches for high-dimensional problems. Key features of the method include its low computational cost and simplicity of integration into existing physics-informed neural network pipelines. The code is available at https://github.com/CoenVisser/PACMANN.
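The idea of moving collocation points toward regions of higher residual can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that): the function `squared_residual` is a hypothetical closed-form stand-in for the squared PDE residual of a trained PINN, the gradient is approximated by central finite differences instead of automatic differentiation, and plain gradient ascent with an assumed step size is used as the optimizer.

```python
import math


def squared_residual(x):
    # Hypothetical stand-in for the squared PDE residual of a PINN;
    # it has a single peak at x = 0.7.
    return math.exp(-50.0 * (x - 0.7) ** 2)


def finite_difference_gradient(f, x, h=1e-6):
    # Central finite-difference approximation of df/dx.
    return (f(x + h) - f(x - h)) / (2.0 * h)


def move_collocation_points(points, f, step_size=0.01, n_steps=20,
                            lower=0.0, upper=1.0):
    # Gradient ascent on f: each point takes n_steps small steps in the
    # direction of increasing squared residual and is clipped to the domain.
    moved = []
    for x in points:
        for _ in range(n_steps):
            x = x + step_size * finite_difference_gradient(f, x)
            x = min(max(x, lower), upper)
        moved.append(x)
    return moved


points = [0.5, 0.6, 0.8, 0.9]
moved = move_collocation_points(points, squared_residual)
for old, new in zip(points, moved):
    print(f"{old:.2f} -> {new:.4f}")
```

In an actual training loop, such a point update would presumably alternate with further optimization of the network parameters, so that the residual landscape changes between moves.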
In this paper, pseudo-time stepping (also known as pseudo-transient continuation) is employed in order to improve nonlinear convergence. The classical algorithm is enhanced by a neural network model that is trained to predict a local pseudo-time step. Generalization of the novel approach is facilitated by predicting the local pseudo-time step separately on each element using only local information on a patch of adjacent elements as input. Numerical results for standard benchmark problems, including flow over a backward facing step geometry and Couette flow, show the performance of the machine learning-enhanced globalization approach; as the software for the simulations, the CFD Module of COMSOL Multiphysics® is employed.
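For readers unfamiliar with pseudo-transient continuation, the following sketch shows the classical idea on a scalar model problem. It rests on several assumptions: the test function F(u) = arctan(u) is chosen only because plain Newton iteration is known to diverge for it from starting values far from the root, and the pseudo-time step is updated with the switched evolution relaxation (SER) heuristic, which here stands in for the neural network prediction described in the abstract. The names `F`, `dF`, and `pseudo_time_solve` are hypothetical.

```python
import math


def F(u):
    # Model nonlinear residual with root u = 0.
    return math.atan(u)


def dF(u):
    # Derivative of the model residual.
    return 1.0 / (1.0 + u * u)


def pseudo_time_solve(u, tau=1.0, tol=1e-10, max_iter=200):
    # Implicit Euler applied to du/dt = -F(u), linearized around u:
    #   (1/tau + F'(u)) * du = -F(u)
    # For tau -> infinity this reduces to the Newton update.
    res = abs(F(u))
    for iteration in range(max_iter):
        if res < tol:
            return u, iteration
        u = u - F(u) / (1.0 / tau + dF(u))
        new_res = abs(F(u))
        if new_res == 0.0:
            return u, iteration + 1
        # SER heuristic: enlarge tau as the residual decreases.
        tau *= res / new_res
        res = new_res
    return u, max_iter


root, iterations = pseudo_time_solve(10.0)
print(root, iterations)
```

A small pseudo-time step damps the update far from the solution, while the growing step recovers fast Newton-like convergence close to it. A learned local step, as in the paper, would replace the single global `tau` by one value per element.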
We enhance machine learning algorithms for learning model parameters in complex systems represented by differential equations with domain decomposition methods. The study evaluates the performance of two approaches, namely (vanilla) Physics-Informed Neural Networks (PINNs) and Finite Basis Physics-Informed Neural Networks (FBPINNs), in learning the dynamics of test models with a quasi-stationary long-time behavior. We test the approaches for data sets in different dynamical regions and with varying noise levels. The results show that the FBPINN approach captures the overall dynamical behavior better than the vanilla PINN approach, even in cases with data only from a time domain with quasi-stationary dynamics.
Multiscale problems are challenging for neural network-based discretizations of differential equations, such as physics-informed neural networks (PINNs) and operator networks. This can be (partly) attributed to the so-called spectral bias of neural networks. To improve the performance of PINNs for time-dependent problems, a combination of multifidelity stacking PINNs and domain decomposition-based finite basis PINNs is employed. In particular, to learn the high-fidelity part of the multifidelity model, a domain decomposition in time is employed. The performance is investigated for a pendulum and a two-frequency problem as well as the Allen-Cahn equation. It can be observed that the domain decomposition approach clearly improves the PINN and stacking PINN approaches. Finally, it is demonstrated that the FBPINN approach can be extended to multifidelity physics-informed deep operator networks.
DDU-Net
A Domain Decomposition-Based CNN for High-Resolution Image Segmentation on Multiple GPUs
The segmentation of ultra-high-resolution images poses challenges such as loss of spatial information or computational inefficiency. In this work, a novel approach that combines encoder-decoder architectures with domain decomposition strategies to address these challenges is proposed. Specifically, a domain decomposition-based U-Net (DDU-Net) architecture is introduced, which partitions input images into non-overlapping patches that can be processed independently on separate devices. A communication network is added to facilitate inter-patch information exchange to enhance the understanding of spatial context. Experimental validation is performed on a synthetic dataset that is designed to measure the effectiveness of the communication network. Then, the performance is tested on the DeepGlobe land cover classification dataset as a real-world benchmark dataset. The results demonstrate that the approach, which includes inter-patch communication for images divided into 16 × 16 non-overlapping subimages, achieves a 2-3% higher intersection over union (IoU) score compared to the same network without inter-patch communication. The performance of the network that includes communication is equivalent to that of a baseline U-Net trained on the full image, showing that our model provides an effective solution for segmenting ultra-high-resolution images while preserving spatial context. The code is available at https://github.com/corne00/DDU-Net.
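Two ingredients mentioned in this abstract, the partition of an image into non-overlapping patches and the intersection over union (IoU) metric, can be sketched in a few lines. This is only an illustration under simplifying assumptions, not code from the DDU-Net repository: images are plain nested lists, masks are binary, and the helper names `split_into_patches` and `intersection_over_union` are hypothetical. The communication network itself is not modeled.

```python
def split_into_patches(image, rows, cols):
    # Partition a 2D image (list of rows) into rows x cols non-overlapping
    # patches; patches[r][c] is the patch in grid row r and grid column c.
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        raise ValueError("image size must be divisible by the patch grid")
    ph, pw = height // rows, width // cols
    return [
        [
            [row[c * pw:(c + 1) * pw] for row in image[r * ph:(r + 1) * ph]]
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def intersection_over_union(pred, target):
    # IoU of two binary masks: |pred AND target| / |pred OR target|.
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks are treated as a perfect match by convention.
    return intersection / union if union else 1.0


image = [[4 * i + j for j in range(4)] for i in range(4)]
patches = split_into_patches(image, 2, 2)
iou = intersection_over_union([[1, 1], [0, 0]], [[1, 0], [1, 0]])
print(patches[0][1], iou)
```

In the example, the predicted and target masks share one pixel out of three marked pixels, so the IoU is 1/3.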
Background: Burn injuries present a significant global health challenge. Among the most severe long-term consequences are contractures, which can lead to functional impairments and disfigurement. Understanding and predicting the evolution of post-burn wounds is essential for developing effective treatment strategies. Traditional mathematical models, while accurate, are often computationally expensive and time-consuming, limiting their practical application. Recent advancements in machine learning, particularly in deep learning, offer promising alternatives for accelerating these predictions. Methods: This study explores the use of a deep operator network, a type of neural operator, as a surrogate model for finite element simulations aimed at predicting post-burn contraction across multiple wound shapes. A deep operator network was trained on three distinct initial wound shapes, with enhancements made to the architecture by incorporating initial wound shape information and applying sine augmentation to enforce boundary conditions. Findings: The performance of the trained deep operator network was evaluated on a test set including finite element simulations based on convex combinations of the three basic wound shapes. The model achieved an R² score of 0.99, indicating strong predictive accuracy and generalization. Moreover, the model provided reliable predictions over an extended period of up to one year, with speedups of up to 128-fold on the Central Processing Unit and 235-fold on the Graphics Processing Unit, compared to the numerical model. Interpretation: These findings suggest that deep operator networks can effectively serve as a surrogate for traditional finite element methods in simulating post-burn wound evolution, with potential applications in medical treatment planning.
Randomized neural networks (RaNNs), characterized by fixed hidden layers after random initialization, offer a computationally efficient alternative to fully parameterized neural networks trained using stochastic gradient descent-type algorithms. In this paper, we integrate RaNNs with overlapping Schwarz domain decomposition in two primary ways: firstly, to formulate the least-squares problem with localized basis functions, and secondly, to construct effective overlapping Schwarz preconditioners for solving the resulting linear systems. Specifically, neural networks are randomly initialized in each subdomain following a uniform distribution, and these localized solutions are combined through a partition of unity, providing a global approximation to the solution of the partial differential equation. Boundary conditions are imposed via a constraining operator, eliminating the necessity for penalty methods. Furthermore, we apply principal component analysis (PCA) within each subdomain to reduce the number of basis functions, thereby significantly improving the conditioning of the resulting linear system. By constructing additive Schwarz (AS) and restricted AS preconditioners, we efficiently solve the least-squares problems using iterative solvers such as the Conjugate Gradient (CG) and generalized minimal residual methods. Numerical experiments clearly demonstrate that the proposed methodology substantially reduces computational time, particularly for multi-scale and time-dependent PDE problems. Additionally, we present a three-dimensional numerical example illustrating the superior efficiency of employing the CG method combined with an AS preconditioner over direct methods like QR decomposition for solving the associated least-squares system.
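The way a partition of unity glues subdomain-local approximations into one global function can be sketched as follows. The sketch makes several assumptions that are not taken from the paper: a one-dimensional domain [0, 1], three overlapping subdomains with hand-picked bounds, sine-squared window functions, and hypothetical closed-form local functions in place of randomized neural networks. The least-squares solve, the PCA reduction, and the Schwarz preconditioners are not shown.

```python
import math


def bump(x, a, b):
    # Smooth window that is positive on (a, b) and zero elsewhere.
    if x <= a or x >= b:
        return 0.0
    t = (x - a) / (b - a)
    return math.sin(math.pi * t) ** 2


def partition_of_unity(x, subdomains):
    # Normalize the windows so that the weights sum to one at x.
    raw = [bump(x, a, b) for a, b in subdomains]
    total = sum(raw)
    return [w / total for w in raw]


def combine(x, subdomains, local_functions):
    # Global approximation: weighted sum of the local approximations.
    weights = partition_of_unity(x, subdomains)
    return sum(w * f(x) for w, f in zip(weights, local_functions))


# Overlapping subdomains that together cover [0, 1] (assumed bounds).
subdomains = [(-0.1, 0.45), (0.3, 0.7), (0.55, 1.1)]
# Hypothetical local approximations; here all three equal x**2.
local_functions = [lambda x: x * x] * 3

samples = [0.0, 0.25, 0.4, 0.5, 0.6, 1.0]
values = [combine(x, subdomains, local_functions) for x in samples]
print(values)
```

Because the weights sum to one at every point, the combination reproduces any function that all local approximations agree on; in the overlap regions it blends neighboring local solutions smoothly.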
Two-level domain decomposition preconditioners lead to fast convergence and scalability of iterative solvers. However, for highly heterogeneous problems with a rapidly varying coefficient function, the condition number of the preconditioned system generally depends on the contrast of the coefficient function. As a result, the convergence may deteriorate. Enhancing the coarse space by functions constructed from suitable local eigenvalue problems restores robust, contrast-independent convergence; these coarse spaces are often denoted as adaptive or spectral coarse spaces. However, these eigenvalue problems typically rely on nonalgebraic information such that the adaptive coarse spaces cannot be constructed from the fully assembled system matrix. In this paper, a novel algebraic adaptive coarse space which relies on the a-orthogonal decomposition of (local) finite element (FE) spaces into functions that solve the elliptic PDE with some trace and FE functions that are zero on the boundary is proposed. In particular, the basis is constructed from eigenmodes of two types of local eigenvalue problems associated with the edges of the domain decomposition. To approximate functions that solve the PDE locally, we employ a transfer eigenvalue problem which has originally been proposed for the construction of optimal local approximation spaces for multiscale methods. In addition, we make use of a Dirichlet eigenvalue problem that is a slight modification of the Neumann eigenvalue problem used in the adaptive generalized Dryja-Smith-Widlund (AGDSW) coarse space. Both eigenvalue problems rely solely on local Dirichlet matrices, which can be extracted from the fully assembled system matrix, allowing for an algebraic construction. By combining arguments from multiscale and domain decomposition methods, we derive a contrast-independent upper bound for the condition number. 
While we restrict ourselves here to a two-dimensional diffusion problem discretized by low-order FEs on regular meshes, the proposed framework is general, and we conjecture that the approach can be readily extended, for instance, to other elliptic problems, three dimensions, or, under mild assumptions, higher-order discretizations. The robustness of the method is confirmed numerically for a variety of heterogeneous coefficient distributions, including binary random distributions and a coefficient function constructed from the SPE10 benchmark. The results are comparable to those of the nonalgebraic AGDSW coarse space, even in those cases where the convergence of the classical algebraic generalized Dryja-Smith-Widlund coarse space deteriorates. Moreover, the coarse space dimension is the same as or comparable to that of the AGDSW coarse space for all numerical experiments.
Physics-informed neural networks (PINNs) are a powerful approach for solving problems involving differential equations, yet they often struggle to solve problems with high frequency and/or multi-scale solutions. Finite basis physics-informed neural networks (FBPINNs) improve the performance of PINNs in this regime by combining them with an overlapping domain decomposition approach. In this work, FBPINNs are extended by adding multiple levels of domain decompositions to their solution ansatz, inspired by classical multilevel Schwarz domain decomposition methods (DDMs). Analogous to typical tests for classical DDMs, we assess how the accuracy of PINNs, FBPINNs and multilevel FBPINNs scales with respect to computational effort and solution complexity by carrying out strong and weak scaling tests. Our numerical results show that the proposed multilevel FBPINNs consistently and significantly outperform PINNs across a range of problems with high frequency and multi-scale solutions. Furthermore, as expected in classical DDMs, we show that multilevel FBPINNs improve the accuracy of FBPINNs when using large numbers of subdomains by aiding global communication between subdomains.
Solving partial differential equations (PDEs) is a common task in numerical mathematics and scientific computing. Typical discretization schemes, for example, finite element (FE), finite volume (FV), or finite difference (FD) methods, have the disadvantage that the computations have to be repeated once the boundary conditions (BCs) or the geometry change slightly; typical examples requiring the solution of many similar problems are time-dependent and inverse problems or uncertainty quantification.
The success and advancement of machine learning (ML) in fields such as image recognition and natural language processing has led to the development of novel methods for the solution of problems in physics and engineering.
A computational framework is presented to numerically simulate the effects of antihypertensive drugs, in particular calcium channel blockers, on the mechanical response of arterial walls. A stretch-dependent smooth muscle model by Uhlmann and Balzani is modified to describe the interaction of pharmacological drugs and the inhibition of smooth muscle activation. The coupled deformation-diffusion problem is then solved using the finite element software FEDDLib and overlapping Schwarz preconditioners from the Trilinos package FROSch. These preconditioners include highly scalable parallel GDSW (generalized Dryja–Smith–Widlund) and RGDSW (reduced GDSW) preconditioners. Simulation results show the expected increase in the lumen diameter of an idealized artery due to the drug-induced reduction of smooth muscle contraction, as well as a decrease in the rate of arterial contraction in the presence of calcium channel blockers. Strong and weak parallel scalability of the resulting computational implementation are also analyzed.
Multilevel extensions of overlapping Schwarz domain decomposition preconditioners of Generalized Dryja-Smith-Widlund (GDSW) type are considered in this paper. The original GDSW preconditioner is a two-level overlapping Schwarz domain decomposition preconditioner, which can be constructed algebraically from the fully assembled stiffness matrix. The FROSch software, which belongs to the ShyLU package of the Trilinos software library, provides parallel implementations of different variants of GDSW preconditioners. The coarse problem can limit the parallel scalability of two-level GDSW preconditioners. As a remedy, in the past, three-level GDSW approaches have been proposed, which can significantly extend the range of scalability. Here, a multilevel extension of the GDSW preconditioner is introduced and analyzed. Finally, parallel results for the implementation in FROSch for up to 40 000 cores of the SuperMUC-NG supercomputer at Leibniz Supercomputing Centre (LRZ) and to 48 000 cores of the JUWELS supercomputer at Jülich Supercomputing Centre (JSC) are presented.