This paper presents research on parallelizing the time-parallel time integration method Parareal on the GPU. First, the theory behind Parareal and its convergence theorems is detailed. Then two test models, the Lorenz system and the heat diffusion equation, are introduced, together with the derivation of the Forward Euler and Backward Euler methods for these problems. Second, an overview of developments in parallel programming is given, with a focus on architecture, memory organization and GPU properties. Third, the implementation of Parareal in Python using the CuPy library is shown, including the Parareal convergence plots and the code profiling results. The second half of the paper outlines the improvements made to increase the speedup of the Parareal implementation. The efficiency of linear solvers in relation to matrix properties is discussed, and it is explained why the heat diffusion equation required a different linear solver than the one built into CuPy. The creation of a custom linear solver based on the Thomas algorithm, written as a CUDA kernel in Python, is then described, as is the construction of CuPy elementwise kernels for the Lorenz system. Lastly, the speedup obtained with the CuPy built-in functions is compared with that of the custom kernels using a self-derived speedup formula. The paper concludes with a reflection on implementing Parareal in practice.