J. Chen | TU Delft Repository

High-Performance Iterative Methods for the Helmholtz Equation

Doctoral thesis (2025) - J. Chen, C. Vuik, M.B. van Gijzen, V.N.S.R. Dwarka

The numerical solution of the Helmholtz equation presents significant challenges in computational mathematics and scientific computing, particularly for high-frequency problems in heterogeneous media. This dissertation addresses these challenges through the development of high-performance iterative methods, focusing on the critical balance between numerical efficiency and practical implementation on modern computing architectures.

The research is motivated by the growing computational demands in seismic imaging and other wave propagation applications, where increasing frequencies and larger domains necessitate more efficient solution strategies. Traditional approaches often struggle with the combined challenges of wavenumber-dependent convergence, pollution errors, and substantial memory requirements, particularly for three-dimensional problems in heterogeneous media.

This work presents a comprehensive framework for solving large-scale Helmholtz problems through matrix-free parallel implementations of preconditioned iterative methods. The framework combines the Complex Shifted Laplace Preconditioner (CSLP) with advanced deflation techniques, implemented so that no matrix needs to be stored explicitly while computational efficiency is maintained. A key innovation is the development of matrix-free implementations for higher-order deflation methods combined with the CSLP, achieved through carefully designed re-discretization schemes that preserve the advantages of Galerkin coarsening.

The methodology progresses from two-dimensional implementations to fully three-dimensional frameworks, incorporating increasingly sophisticated preconditioning techniques. A significant achievement is the development of a matrix-free parallel multilevel deflation preconditioner that exhibits near wavenumber-independent convergence while maintaining excellent parallel scalability. The implementation utilizes a hybrid MPI+OpenMP parallelization strategy, effectively addressing both computational and memory challenges in extreme-scale scenarios.

Extensive numerical experiments validate the effectiveness of these methods across a range of problem types, from academic test cases to industrial-scale applications. Notably, the framework successfully resolves a challenging seismic model, involving approximately 3.8 billion degrees of freedom, while achieving 86% parallel efficiency when scaling to 2304 CPU cores. This demonstration of practical viability for large-scale heterogeneous problems represents a significant advance in computational capabilities for seismic imaging applications.

The research makes several fundamental contributions to the field of numerical analysis and scientific computing. First, it establishes new approaches for matrix-free implementation of state-of-the-art preconditioners, significantly reducing memory requirements while maintaining numerical efficiency. Second, it demonstrates close to wavenumber-independent convergence through carefully designed deflation strategies in a parallel computing environment. Third, it provides a comprehensive framework for solving extreme-scale Helmholtz problems that combines numerical robustness with practical applicability.

The methodologies developed in this work contribute to the broader field of scientific computing, demonstrating how careful algorithm design, combined with modern computing architectures, can address previously intractable problems in wave propagation modeling. ...

Matrix-Free Parallel Scalable Multilevel Deflation Preconditioning for Heterogeneous Time-Harmonic Wave Problems

Journal article (2025) - Jinqiang Chen, Vandana Dwarka, Cornelis Vuik

We present a matrix-free parallel scalable multilevel deflation preconditioned method for heterogeneous time-harmonic wave problems. Building on the higher-order deflation preconditioning proposed by Dwarka and Vuik (SIAM J. Sci. Comput. 42(2):A901-A928, 2020; J. Comput. Phys. 469:111327, 2022) for highly indefinite time-harmonic waves, we adapt these techniques for parallel implementation in the context of solving large-scale heterogeneous problems with minimal pollution error. Our proposed method integrates the Complex Shifted Laplacian preconditioner with deflation approaches. We employ higher-order deflation vectors and re-discretization schemes derived from the Galerkin coarsening approach for a matrix-free parallel implementation. We suggest a robust and efficient configuration of the matrix-free multilevel deflation method, which yields a close to wavenumber-independent convergence and good time efficiency. Numerical experiments demonstrate the effectiveness of our approach for increasingly complex model problems. The matrix-free implementation of the preconditioned Krylov subspace methods reduces memory consumption, and the parallel framework exhibits satisfactory parallel performance and weak parallel scalability. This work represents a significant step towards developing efficient, scalable, and parallel multilevel deflation preconditioning methods for large-scale real-world applications in wave propagation. ...

A matrix-free parallel two-level deflation preconditioner for two-dimensional heterogeneous Helmholtz problems

Journal article (2024) - Jinqiang Chen, Vandana Dwarka, Cornelis Vuik

We propose a matrix-free parallel two-level deflation method combined with the Complex Shifted Laplacian Preconditioner (CSLP) for two-dimensional heterogeneous Helmholtz problems encountered in seismic exploration, antennas, and medical imaging. These problems pose challenges in terms of accuracy and convergence due to scalability issues with numerical solvers. Motivated by the limitations imposed by excessive computational time and memory constraints when employing a sequential solver with constructed matrices, we parallelize the two-level deflation method without constructing any matrices. Our approach utilizes preconditioned Krylov subspace methods and approximates the CSLP preconditioner with a parallel geometric multigrid V-cycle. For the two-level deflation, standard inter-grid deflation vectors and further high-order deflation vectors are considered. As another main contribution, the matrix-free Galerkin coarsening approach and a novel re-discretization scheme as well as high-order finite-difference schemes on the coarse grid are studied to obtain wavenumber-independent convergence. The optimal settings for an efficient coarse-grid problem solver are investigated. Numerical experiments of model problems show that the wavenumber independence has been obtained for medium wavenumbers. The matrix-free parallel framework shows satisfactory weak and strong parallel scalability. ...
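As a rough illustration of the two-level deflation idea mentioned in this abstract, the sketch below builds the textbook deflation operator P = I - A Z E^{-1} Z^T with Galerkin coarse operator E = Z^T A Z, using linear-interpolation vectors as the "standard inter-grid deflation vectors". It is a small dense 1D toy under stated assumptions (homogeneous Dirichlet boundaries, constant wavenumber, explicitly stored matrices), not the matrix-free parallel method of the paper; the function names `helmholtz_1d`, `linear_interpolation`, and `deflation_operator` are hypothetical.

```python
import numpy as np

def helmholtz_1d(n, k):
    """Dense 1D operator -u'' - k^2 u on n interior points of (0, 1),
    second-order finite differences, homogeneous Dirichlet boundaries."""
    h = 1.0 / (n + 1)
    main = (2.0 / h**2 - k**2) * np.ones(n)
    off = (-1.0 / h**2) * np.ones(n - 1)
    return np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

def linear_interpolation(n):
    """Columns are deflation vectors: linear interpolation from
    (n - 1) // 2 coarse points to n fine points (n must be odd)."""
    nc = (n - 1) // 2
    Z = np.zeros((n, nc))
    for j in range(nc):
        i = 2 * j + 1          # fine-grid index of coarse point j
        Z[i - 1, j] = 0.5
        Z[i, j] = 1.0
        Z[i + 1, j] = 0.5
    return Z

def deflation_operator(A, Z):
    """P = I - A Z E^{-1} Z^T with the Galerkin coarse operator E = Z^T A Z."""
    E = Z.T @ A @ Z
    Q = Z @ np.linalg.solve(E, Z.T)
    return np.eye(A.shape[0]) - A @ Q

n, k = 63, 20.0                # illustrative sizes only
A = helmholtz_1d(n, k)
Z = linear_interpolation(n)
P = deflation_operator(A, Z)

# By construction P A Z = 0: the span of Z is removed from the deflated operator.
residual = np.linalg.norm(P @ A @ Z)
print(residual / np.linalg.norm(A @ Z))
```

In this toy the coarse operator is formed by explicit Galerkin products; the abstract's point is precisely that such products can be replaced by re-discretization on the coarse grid so that no matrix is ever assembled.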

A matrix-free parallel solution method for the three-dimensional heterogeneous Helmholtz equation

Journal article (2024) - J. Chen, V. Dwarka, C. Vuik

The Helmholtz equation is related to seismic exploration, sonar, antennas, and medical imaging applications. It is one of the most challenging problems to solve in terms of accuracy and convergence due to the scalability issues of the numerical solvers. For 3D large-scale applications, high-performance parallel solvers are also needed. In this paper, a matrix-free parallel iterative solver is presented for the three-dimensional (3D) heterogeneous Helmholtz equation. We consider the preconditioned Krylov subspace methods for solving the linear system obtained from finite-difference discretization. The Complex Shifted Laplace Preconditioner (CSLP) is employed since it results in a linear increase in the number of iterations as a function of the wavenumber. The preconditioner is approximately inverted using one parallel 3D multigrid cycle. For parallel computing, the global domain is partitioned blockwise. The matrix-vector multiplication and preconditioning operator are implemented in a matrix-free way instead of constructing large, memory-consuming coefficient matrices. Numerical experiments of 3D model problems demonstrate the robustness and outstanding strong scaling of our matrix-free parallel solution method. Moreover, the weak parallel scalability indicates our approach is suitable for realistic 3D heterogeneous Helmholtz problems with minimized pollution error. ...
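For orientation, the CSLP is commonly written in the literature as the discrete Helmholtz operator with a complex shift added to the wavenumber term. The abstract does not state the formula, so the form below, the sign convention, and the symbols $\beta_1$, $\beta_2$ are assumptions taken from the usual presentation rather than from this paper; $k$ may vary in space for heterogeneous media.

```latex
\begin{align}
  A_h &= -\Delta_h - k^2 I, \\
  M_h &= -\Delta_h - (\beta_1 - \mathrm{i}\,\beta_2)\, k^2 I,
        \qquad \beta_1, \beta_2 \in \mathbb{R}, \\
  A_h M_h^{-1} v &= b, \qquad u = M_h^{-1} v .
\end{align}
```

The imaginary shift $\beta_2$ adds damping, which is what makes $M_h$ amenable to approximate inversion by a multigrid cycle, as the abstract describes.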

Matrix-Free Parallel Preconditioned Iterative Solvers for the 2D Helmholtz Equation Discretized with Finite Differences

Conference paper (2024) - Jinqiang Chen, Vandana Dwarka, Cornelis Vuik

We present a matrix-free parallel iterative solver for the Helmholtz equation related to applications in seismic problems and study its parallel performance. We apply Krylov subspace methods, GMRES, Bi-CGSTAB and IDR(s), to solve the linear system obtained from a second-order finite difference discretization. The Complex Shifted Laplace Preconditioner (CSLP) is employed to improve the convergence of Krylov solvers. The preconditioner is approximately inverted by multigrid iterations. For parallel computing, the global domain is partitioned blockwise. The standard MPI library is employed for data communication. The matrix-vector multiplication and preconditioning operator are implemented in a matrix-free way instead of constructing large, memory-consuming coefficient matrices. These adjustments lead to direct improvements in terms of memory consumption. Numerical experiments of model problems show that the matrix-free parallel solution method has satisfactory parallel performance and weak scalability. It allows us to solve larger problems in parallel to obtain more accurate numerical solutions. ...
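To make "matrix-free" concrete, here is a minimal serial sketch of a matrix-vector product for a second-order finite-difference Helmholtz operator that works directly on the grid array and never assembles a coefficient matrix. The sign convention (-Δu - k²u), the homogeneous Dirichlet boundaries, and the name `helmholtz_apply` are assumptions for illustration; the paper's solver is parallel (MPI, blockwise partitioning) and may use different boundary conditions.

```python
import numpy as np

def helmholtz_apply(u, k, h):
    """Return -Laplace(u) - k^2 * u using the 5-point stencil.

    u : (ny, nx) array of interior grid values
    k : scalar wavenumber, or an array with the same shape as u
    h : uniform grid spacing
    """
    # A zero ghost layer around the grid encodes homogeneous Dirichlet boundaries.
    up = np.pad(u, 1)
    lap = (up[1:-1, :-2] + up[1:-1, 2:]
           + up[:-2, 1:-1] + up[2:, 1:-1] - 4.0 * u) / h**2
    return -lap - (k**2) * u

# Check on a discrete eigenfunction of the 5-point Laplacian on the unit square.
n = 31
h = 1.0 / (n + 1)
x = h * np.arange(1, n + 1)
u = np.outer(np.sin(np.pi * x), np.sin(np.pi * x))
k = 10.0
lam = 8.0 / h**2 * np.sin(np.pi * h / 2.0) ** 2   # discrete eigenvalue of -Laplace
print(np.allclose(helmholtz_apply(u, k, h), (lam - k**2) * u))
```

Only the grid function and (optionally) a wavenumber field are stored, which is the source of the memory savings the abstract refers to; in a parallel setting the ghost layer would be filled by halo exchange with neighbouring subdomains rather than by zero padding.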

High accuracy numerical investigation of trailing edge noise at vortex shedding critical angle of attack

Conference paper (2022) - Huabin Zheng, Jinqiang Chen, Peixiang Yu, Hua Ouyang

In this paper, the trailing edge noise generated by a 2D airfoil around the critical angle of attack for vortex shedding is numerically investigated using an in-house code with high accuracy and efficiency. In the present method, a fourth-order upwind compact finite-difference scheme with the dispersion relation preserving (DRP) property is applied to the convection terms, and a fourth-order Runge-Kutta scheme is used for temporal discretization. The reflection of sound at the boundary is suppressed with the Navier-Stokes characteristic boundary condition (NSCBC). To improve computational efficiency, a novel parallel computing strategy for the high-order compact schemes is employed. Thus, direct numerical simulation (DNS) can be realized for flows at low Reynolds number (Re), while implicit large eddy simulation (ILES) is carried out for flows at high Reynolds number. The present numerical method is validated by comparing the lift coefficient, drag coefficient and Strouhal number (St) against previously published results. Using this high-accuracy, high-fidelity method, the flow field and sound field of a two-dimensional NACA0012 airfoil around the critical angle of attack (AoA) at Re = 1000 are solved simultaneously. The results indicate that, at the vortex shedding frequency, the sound source is a dipole centered at the surface of the airfoil, whereas at higher-order frequencies the sources are dipoles, quadrupoles, or more complex sources located in the wake close to the trailing edge. These findings will help to improve understanding of the generation and propagation mechanisms of trailing edge noise at low Reynolds number. ...

A Cartesian grid method with improvement of resolving the boundary layer structure for two‐dimensional incompressible flows

Journal article (2021) - Xin Tong, Lipo Wang, Jinqiang Chen, Hua Ouyang

The Cartesian grid has unique advantages in computational fluid dynamics, especially for cases with complicated boundaries. However, boundary layer structures cannot be resolved efficiently and effectively using a Cartesian mesh. To overcome this problem, a new boundary layer structure resolving (BLSR) algorithm is proposed on the basis of boundary layer physics and force balance analysis. For the present two‐dimensional test cases, numerical results show that the surface friction and drag force can be calculated more accurately without refining the near‐wall resolution. In principle, this BLSR algorithm is easy to implement with a negligible increase in computational cost. ...

A Novel Parallel Computing Strategy for Compact Difference Schemes with Consistent Accuracy and Dispersion

Journal article (2021) - Jinqiang Chen, Peixiang Yu, Hua Ouyang, Zhen F. Tian

In this paper, based on the boundary approximation approach for parallelization of the compact difference schemes, a novel strategy for the sub-domain boundary approximation schemes is proposed to maintain consistent accuracy and dispersion with the compact scheme at the interior points. In this strategy, not only is the order of accuracy of the sub-domain boundary scheme the same as that of the interior scheme, but the coefficient of the first truncation error term is also equal to that of the interior scheme. Furthermore, to realize consistent dispersion performance for a class of high-order upwind compact schemes, which usually include two expressions, we modify the opposite expression to be the sub-domain boundary scheme. As an example of application, the present strategy is applied to a fourth-order upwind compact scheme, and its accuracy is verified by a numerical test. The resolution and efficiency of the newly proposed parallel method are examined by four numerical examples, including propagation of a wave packet, convection of an isentropic vortex, Rayleigh–Taylor instability problems, and propagation of a Gauss pulse. The results obtained demonstrate that the present strategy for compact difference schemes can solve flow problems with high accuracy, resolution and efficiency in parallel computation. ...

A high‐order compact scheme for solving the 2D steady incompressible Navier‐Stokes equations in general curvilinear coordinates

Journal article (2020) - Jinqiang Chen, Peixiang Yu, Zhenfu F. Tian, Hua Ouyang

In this paper, a high‐order compact finite difference algorithm is established for the stream function‐velocity formulation of the two‐dimensional steady incompressible Navier‐Stokes equations in general curvilinear coordinates. Unlike previous work, not only the stream function and its first‐order partial derivatives but also the second‐order mixed partial derivative are treated as unknown variables. Numerical examples, including a test problem with an analytical solution, three types of lid‐driven cavity flow problems with unusual shapes, and steady flow past a circular cylinder as well as an elliptic cylinder at an angle of attack, are solved numerically by the newly proposed scheme. For two types of lid‐driven trapezoidal cavity flow, we provide detailed data on fine grids, which can be considered benchmark solutions. The results obtained show that the present numerical method can solve incompressible flows in complex geometries arising in engineering applications with high accuracy, especially when a nonorthogonal coordinate transformation is used. ...