Mode-Decomposition in DeepONets
Generalization and Coupling Analysis
J.J. Taraz (TU Delft - Electrical Engineering, Mathematics and Computer Science)
A. Heinlein – Graduation committee member (TU Delft - Numerical Analysis)
H. Schuttelaars – Graduation committee member (TU Delft - Mathematical Physics)
Abstract
Operator learning promises to revolutionize scientific computing by learning solution operators for differential equations directly from data, potentially accelerating tasks like design optimization and uncertainty quantification by orders of magnitude. The deep operator network (DeepONet), the first practical architecture for operator learning, consists of a trunk network and a branch network. Its output is a linear combination of basis functions, where the basis functions are learned by the trunk network and the coefficients are learned by the branch network. Despite their theoretical promise, however, DeepONets suffer from poor accuracy compared to classical numerical solvers, which limits their practical adoption. Understanding and addressing these accuracy limitations is crucial for advancing the field.
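In standard notation, the DeepONet prediction for an input function u evaluated at a query point y takes exactly this form (a minimal sketch with generic symbols, not necessarily the notation used in the thesis):

G_\theta(u)(y) \approx \sum_{k=1}^{p} b_k(u) \, t_k(y),

where t_1, \dots, t_p are the basis functions produced by the trunk network and b_1(u), \dots, b_p(u) are the coefficients produced by the branch network from (a discretization of) the input function u.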
In this thesis, we first analyze the performance limitations of the classical DeepONet. We demonstrate that for many classical examples, the trunk network's error is much smaller than the total approximation error. Thus, the space spanned by the basis functions contains functions which approximate the true solutions well. The total approximation error is dominated by the branch network's error, i.e., the error of the coefficients.
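One way to make this comparison concrete (a sketch with generic notation, not necessarily the exact error measures used in the thesis) is to contrast the best approximation achievable within the learned basis with the approximation that uses the coefficients actually predicted by the branch network:

e_{\text{trunk}} = \min_{c \in \mathbb{R}^p} \Big\| s - \sum_{k=1}^{p} c_k \, t_k \Big\|, \qquad e_{\text{total}} = \Big\| s - \sum_{k=1}^{p} b_k(u) \, t_k \Big\|,

where s denotes the true solution corresponding to the input u. Observing e_{\text{trunk}} \ll e_{\text{total}} indicates that the learned basis is adequate and the error is dominated by the branch network's coefficients.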
To investigate this further, we construct a modified DeepONet in which the learnable trunk network is replaced by optimal basis vectors (modes) derived from a singular value decomposition (SVD); we call this modification the SVD-based operator network (SVDONet). This simplification enables us to decompose the total error into mode-specific contributions, revealing how well the coefficients of the individual spatial modes are approximated. Our mode-decomposition analysis yields several key insights.
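A minimal sketch of this kind of per-mode analysis, assuming the solutions are sampled on a fixed grid and stacked into a snapshot matrix (file names, sizes, and the existence of a separately trained branch network are illustrative assumptions, not the thesis code):

```python
import numpy as np

# Snapshot matrix of training solutions: each column is one solution sampled
# on a fixed spatial grid, shape (n_grid_points, n_train). Hypothetical file.
S_train = np.load("train_solutions.npy")

# SVD of the snapshots: the left singular vectors are the optimal basis
# vectors (modes) in the least-squares sense.
U, sigma, _ = np.linalg.svd(S_train, full_matrices=False)
p = 32                                           # number of retained modes (illustrative)
modes = U[:, :p]                                 # fixed "trunk": one column per mode

# Reference coefficients of the test solutions: projection onto the modes
# (the modes are orthonormal, so projection is a simple matrix product).
S_test = np.load("test_solutions.npy")           # hypothetical file
coeffs_ref = modes.T @ S_test                    # shape (p, n_test)

# Coefficients predicted by the trained branch network for the same test
# inputs (hypothetical array produced elsewhere).
coeffs_pred = np.load("branch_predictions.npy")  # shape (p, n_test)

# Mode-wise error: with an orthonormal basis, the squared L2 error of the
# reconstruction decomposes exactly into per-mode coefficient errors.
per_mode_error = np.mean((coeffs_pred - coeffs_ref) ** 2, axis=1)
for k, err in enumerate(per_mode_error):
    print(f"mode {k:2d}: singular value {sigma[k]:.3e}, mean squared coeff. error {err:.3e}")
```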
First, we discover that for some modes a low training error does not necessarily correspond to a low test error, i.e., the coefficients learned for these modes do not generalize well.
Second, we show that architectural choices profoundly impact generalization: the standard "unstacked" DeepONet architecture, in which all modes share hidden neurons, significantly improves generalization for the modes corresponding to small singular values, at the cost of the modes corresponding to large singular values.
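The unstacked architecture is commonly contrasted with the stacked variant of the original DeepONet formulation, in which each mode's coefficient is produced by its own subnetwork. A minimal PyTorch-style sketch of the two (layer sizes and names are illustrative assumptions, not the thesis implementation):

```python
import torch
import torch.nn as nn

p, in_dim, hidden = 32, 100, 128   # illustrative sizes, not taken from the thesis

# Stacked branch: one independent subnetwork per mode, no shared hidden neurons.
stacked_branch = nn.ModuleList([
    nn.Sequential(nn.Linear(in_dim, hidden), nn.Tanh(), nn.Linear(hidden, 1))
    for _ in range(p)
])

# Unstacked branch: a single network whose hidden neurons are shared by all p modes.
unstacked_branch = nn.Sequential(
    nn.Linear(in_dim, hidden), nn.Tanh(), nn.Linear(hidden, p)
)

u = torch.randn(8, in_dim)  # a batch of discretized input functions
coeffs_stacked = torch.cat([net(u) for net in stacked_branch], dim=1)  # shape (8, p)
coeffs_unstacked = unstacked_branch(u)                                 # shape (8, p)
```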
Third, we study how improving the coefficients of one mode affects the coefficients of the other modes. Here, we find fundamental differences between optimization algorithms.
These findings establish mode decomposition as a powerful lens for analyzing neural operators, revealing that the success of operator learning hinges not on learning all modes equally well, but on the delicate balance between mode prioritization, architectural coupling, and optimization dynamics.