C.M. Menzen
While probabilistic methods have many benefits, such as recursive estimation and uncertainty quantification, they often come with substantial memory and compute requirements.
Computational challenges are particularly pronounced in large-scale settings, where data sets contain a high number of measurements, and for high-dimensional problems, which require exponentially many parameters to describe probability distributions.
These scenarios can suffer from the curse of dimensionality, which requires exponentially growing computing resources, making conventional approaches computationally intractable.
This dissertation addresses computational challenges by leveraging tensor networks (TNs) to develop computationally efficient probabilistic algorithms.
TNs, also known as tensor decompositions, extend matrix decomposition to higher dimensions by representing large multidimensional arrays, i.e., tensors, in a compact, decomposed format, defined by TN components and TN ranks.
Under the assumption of low-rank structure, TNs enable efficient storage and computation, making large-scale and high-dimensional problems more tractable, even on resource-constrained hardware such as conventional laptops.
The focus of this work is on scalable solutions for Bayesian estimation problems involving Gaussian distributions and exact inference, including recursive filtering and Gaussian process (GP) regression. ...
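As a rough illustration of the storage argument above, the following sketch compresses a dense tensor into one common TN format, the tensor train (TT), using sequential truncated SVDs. It is a minimal example under stated assumptions, not the dissertation's algorithm: the function names `tt_svd` and `tt_to_full`, the test tensor, and the rank cap are all chosen here for illustration.

```python
from functools import reduce

import numpy as np


def tt_svd(tensor, max_rank):
    """Split a dense tensor into tensor-train cores via sequential truncated SVDs."""
    dims = tensor.shape
    cores = []
    rank_prev = 1
    mat = tensor.reshape(rank_prev * dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        rank = min(max_rank, len(s))  # truncate to the allowed TT rank
        cores.append(u[:, :rank].reshape(rank_prev, dims[k], rank))
        # Carry the remainder forward and expose the next mode as rows.
        mat = (s[:rank, None] * vt[:rank]).reshape(rank * dims[k + 1], -1)
        rank_prev = rank
    cores.append(mat.reshape(rank_prev, dims[-1], 1))
    return cores


def tt_to_full(cores):
    """Contract the cores back into a dense tensor (only feasible for small examples)."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.squeeze(axis=(0, -1))


# Illustrative low-rank tensor: sin(x1 + ... + x6) sampled on an 8-point grid per mode.
grid = np.linspace(0.0, 1.0, 8)
tensor = np.sin(reduce(np.add.outer, [grid] * 6))

cores = tt_svd(tensor, max_rank=2)
n_params_tt = sum(core.size for core in cores)
rel_err = np.linalg.norm(tt_to_full(cores) - tensor) / np.linalg.norm(tensor)

print(f"dense entries: {tensor.size}, TT parameters: {n_params_tt}")
print(f"relative reconstruction error: {rel_err:.2e}")
```

The dense tensor has 8^6 entries, while the number of TT parameters grows only linearly with the number of modes for fixed ranks. The savings depend entirely on the low-rank assumption holding for the tensor at hand.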
For the first time, this position paper introduces a fundamental link between tensor networks (TNs) and Green AI, highlighting their synergistic potential to enhance both the inclusivity and sustainability of AI research. We argue that TNs are valuable for Green AI due to their strong mathematical backbone and inherent logarithmic compression potential. We undertake a comprehensive review of the ongoing discussions on Green AI, emphasizing the importance of sustainability and inclusivity in AI research to demonstrate the significance of establishing the link between Green AI and TNs. To support our position, we first provide a comprehensive overview of efficiency metrics proposed in Green AI literature and then evaluate examples of TNs in the fields of kernel machines and deep learning using the proposed efficiency metrics. This position paper aims to incentivize meaningful, constructive discussions by bridging fundamental principles of Green AI and TNs. We advocate for researchers to seriously evaluate the integration of TNs into their research projects, and in alignment with the link established in this paper, we support prior calls encouraging researchers to treat Green AI principles as a research priority.
We present a mapping algorithm to compute large-scale magnetic field maps in indoor environments with approximate Gaussian process (GP) regression. Mapping the spatial variations in the ambient magnetic field can be used for localization algorithms in indoor areas. To compute such a map, GP regression is a suitable tool because it provides predictions of the magnetic field at new locations along with uncertainty quantification. Because full GP regression has a complexity that grows cubically with the number of data points, approximations for GPs have been extensively studied. In this paper, we build on the structured kernel interpolation (SKI) framework, speeding up inference by exploiting efficient Krylov subspace methods. More specifically, we incorporate SKI with derivatives (D-SKI) into the scalar potential model for magnetic field modeling and compute both predictive mean and covariance with a complexity that is linear in the data points. In our simulations, we show that our method achieves better accuracy than current state-of-the-art methods on magnetic field maps with a growing mapping area. In our large-scale experiments, we construct magnetic field maps from up to 40000 three-dimensional magnetic field measurements in less than two minutes on a standard laptop.
This paper presents a method for approximate Gaussian process (GP) regression with tensor networks (TNs). A parametric approximation of a GP uses a linear combination of basis functions, where the accuracy of the approximation depends on the total number of basis functions M. We develop an approach that allows us to use an exponential number of basis functions without the corresponding exponential computational complexity. The key idea that enables this is the use of low-rank TNs. We first find a suitable low-dimensional subspace from the data, described by a low-rank TN. In this low-dimensional subspace, we then infer the weights of our model by solving a Bayesian inference problem. Finally, we project the resulting weights back to the original space to make GP predictions. The benefit of our approach comes from the projection to a smaller subspace: it adapts the shape of the basis functions to the given data, and it allows for efficient computations in the smaller subspace. In an experiment with an 18-dimensional benchmark data set, we show the applicability of our method to an inverse dynamics problem.
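To make the "exponentially many basis functions without exponential cost" idea concrete, here is a minimal sketch for the simplest structured case: a rank-1 weight vector combined with product (Kronecker) basis functions. This is a simplification and not the paper's method, which uses low-rank TNs and a data-dependent subspace. The polynomial basis, the function names, and the rank-1 restriction are assumptions made for illustration.

```python
from functools import reduce

import numpy as np


def features_1d(x, n_basis):
    """Hypothetical per-dimension basis: monomials 1, x, x^2, ..."""
    return x ** np.arange(n_basis)


def predict_rank1(x, factors):
    """Model output for weights w = w_1 (x) ... (x) w_D, without forming w.

    Uses (a (x) b)^T (c (x) d) = (a^T c)(b^T d), so the cost is linear in D.
    """
    return np.prod([features_1d(xd, len(wd)) @ wd for xd, wd in zip(x, factors)])


def predict_dense(x, factors):
    """Same output via explicit Kronecker products (only feasible for small D)."""
    phi = reduce(np.kron, [features_1d(xd, len(wd)) for xd, wd in zip(x, factors)])
    w = reduce(np.kron, factors)
    return phi @ w


rng = np.random.default_rng(0)

# Small case: check the structured evaluation against the dense one.
factors_small = [rng.standard_normal(4) for _ in range(3)]
x_small = rng.uniform(-1.0, 1.0, size=3)
out_fast = predict_rank1(x_small, factors_small)
out_dense = predict_dense(x_small, factors_small)

# 18 input dimensions with 4 basis functions each: M = 4**18 total basis functions,
# far too many to store, yet the structured evaluation needs only 18 * 4 weights.
factors_big = [rng.standard_normal(4) for _ in range(18)]
x_big = rng.uniform(-1.0, 1.0, size=18)
out_big = predict_rank1(x_big, factors_big)
n_basis_total = 4 ** 18
```

A rank-1 model is usually too restrictive in practice. Higher TN ranks trade some of this efficiency for expressiveness while keeping the cost polynomial in the number of input dimensions.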
This paper proposes a Bayesian Volterra tensor network (TN) to solve high-order discrete nonlinear multiple-input multiple-output (MIMO) Volterra system identification problems. Using a low-rank tensor network to compress all Volterra kernels at once, we avoid the exponential growth of monomials with respect to the order of the Volterra kernel. Our contribution is to introduce a Bayesian framework for the low-rank Volterra TN. Compared to the least squares solution for Volterra TNs, we include prior assumptions explicitly in the model. In particular, we show for the first time how a zero-mean prior with a diagonal covariance matrix corresponds to implementing a Tikhonov regularization for the MIMO Volterra TN. Furthermore, adopting a Bayesian viewpoint enables simulations with Bayesian uncertainty bounds based on noise and prior assumptions. In addition, we demonstrate via numerical experiments how Tikhonov regularization prevents overfitting in the case of higher-rank TNs.
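The link between a zero-mean Gaussian prior with diagonal covariance and Tikhonov regularization can be checked numerically for a plain linear-Gaussian model. The sketch below assumes a generic model y = Phi w + e with e ~ N(0, sigma^2 I) and w ~ N(0, diag(prior_var)). It does not reproduce the Volterra TN setting of the paper, and all variable names and data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_weights = 50, 5

# Synthetic regression problem (illustrative data only).
Phi = rng.standard_normal((n_samples, n_weights))
w_true = rng.standard_normal(n_weights)
sigma = 0.1
y = Phi @ w_true + sigma * rng.standard_normal(n_samples)

# Zero-mean Gaussian prior with diagonal covariance diag(prior_var).
prior_var = np.array([1.0, 0.5, 2.0, 1.0, 0.1])

# Bayesian posterior mean of the weights.
post_precision = Phi.T @ Phi / sigma**2 + np.diag(1.0 / prior_var)
post_mean = np.linalg.solve(post_precision, Phi.T @ y / sigma**2)

# Tikhonov-regularized least squares:
#   min_w ||y - Phi w||^2 + ||Gamma w||^2,  Gamma = sigma * diag(prior_var)^(-1/2),
# solved as an ordinary least-squares problem on the stacked system.
Gamma = sigma * np.diag(prior_var ** -0.5)
A = np.vstack([Phi, Gamma])
b = np.concatenate([y, np.zeros(n_weights)])
w_tikhonov, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.allclose(post_mean, w_tikhonov))
```

Both routes solve the same normal equations, (Phi^T Phi + sigma^2 diag(prior_var)^(-1)) w = Phi^T y, so the posterior mean and the regularized estimate coincide. The Bayesian view additionally yields a posterior covariance, the inverse of `post_precision`, which the regularized least-squares view does not provide.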
Multiway data often naturally occurs in a tensorial format which can be approximately represented by a low-rank tensor decomposition. This is useful because complexity can be significantly reduced and the treatment of large-scale data sets can be facilitated. In this paper, we find a low-rank representation for a given tensor by solving a Bayesian inference problem. This is achieved by dividing the overall inference problem into subproblems where we sequentially infer the posterior distribution of one tensor decomposition component at a time. This leads to a probabilistic interpretation of the well-known iterative alternating linear scheme (ALS). This interpretation makes it possible to account for measurement noise, to incorporate application-specific prior knowledge, and to quantify the uncertainty of the low-rank tensor estimate. To compute the low-rank tensor estimate from the posterior distributions of the tensor decomposition components, we present an algorithm that performs the unscented transform in tensor train format.