Multivariate time series arise in a wide range of domains, such as weather forecasting and financial modeling, where multiple interdependent variables evolve simultaneously over time. For instance, temperature readings at one location may have a delayed influence on nearby regions, while currency exchange rates exhibit complex, lagged interactions across global financial markets. Effectively modeling these spatiotemporal interactions, particularly in streaming and non-stationary settings, remains a fundamental challenge. Traditional approaches such as temporal PCA operate on sample covariance matrices but often suffer from instability in the estimated eigenvectors, especially in low-data regimes or when the corresponding eigenvalues are close together. These covariance estimation errors can propagate into the learned representation and degrade performance in downstream tasks. Recent graph-based learning methods address this limitation by constructing graphs from the sample covariance matrix and learning from its structure. However, these approaches typically consider only lag-zero correlations, which limits their ability to model cross-temporal dependencies and fully capture the spatiotemporal structure inherent in multivariate time series. To overcome this limitation, this thesis proposes the Lagged spatiotemporal coVariance Neural Network (LVNN), a neural network architecture that leverages lagged covariance information to learn representations from multivariate time series in a streaming setting. LVNN constructs a spatiotemporal graph by concatenating consecutive temporal samples, computing their extended sample covariance matrix, and using it as a structural prior for graph convolutions. This design enables the model to capture variable interactions not only within time steps but also across temporal lags. However, the larger spatiotemporal covariance matrix introduces additional computational overhead and admits spurious correlations.
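The extended covariance construction described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the window length W, the variable count N, and all names are assumptions made for the example.

```python
import numpy as np

# Illustrative sketch: build the extended (lagged) sample covariance
# matrix by concatenating W consecutive temporal samples of an
# N-variate time series. Sizes below are arbitrary choices.
rng = np.random.default_rng(0)
T, N, W = 500, 4, 3                      # time steps, variables, lag window

X = rng.standard_normal((T, N))          # toy multivariate time series

# Flatten each window of W consecutive samples into one vector of size N*W.
windows = np.stack([X[t:t + W].reshape(-1) for t in range(T - W + 1)])

windows -= windows.mean(axis=0)          # center before taking covariance
C_ext = windows.T @ windows / (len(windows) - 1)   # (N*W) x (N*W)

# Block (i, j) of C_ext holds the cross-covariance between variables at
# lag offset |i - j|; the diagonal blocks are the usual lag-zero covariance.
print(C_ext.shape)
```

Each off-diagonal block therefore encodes exactly the cross-temporal dependencies that lag-zero methods discard.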
To address this, we introduce two structural modifications to the proposed model. First, we retain only spatial and backward temporal connections, corresponding to the block upper triangular part of the extended covariance matrix. Second, we apply thresholding-based sparsification to prune weak correlations and improve scalability. We begin by proving that LVNN is robust to perturbations in the online estimate of the extended covariance matrix in stationary settings, avoiding the stability issues that affect temporal PCA-based methods. These findings are empirically validated on synthetic stationary datasets. Then, to assess the effectiveness of the learned embeddings, we evaluate LVNN on single-step forecasting tasks using three real-world datasets across different forecasting horizons. The standard LVNN model performs comparably to our baselines, while the variant restricted to the block upper triangular matrix delivers the most consistent performance. Furthermore, applying hard- and soft-thresholding sparsification to the extended covariance matrix substantially reduces the computational overhead with only a minor impact on forecasting accuracy. These results support our hypothesis that cross-temporal covariance terms are a valuable source of inductive bias for representation learning in multivariate time series.
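The two structural modifications can be sketched in a few lines. This is a hedged illustration under assumed conventions (block size N, threshold tau, and the orientation of "backward" blocks are choices made for the example, not taken from the thesis):

```python
import numpy as np

def block_upper_triangular(C, N):
    """Keep only the block diagonal and blocks above it in an
    (N*W, N*W) extended covariance matrix; zero out the rest.
    Which triangle corresponds to backward temporal links is an
    assumed convention here."""
    W = C.shape[0] // N
    mask = np.kron(np.triu(np.ones((W, W))), np.ones((N, N)))
    return C * mask

def hard_threshold(C, tau):
    """Hard thresholding: drop entries with magnitude below tau."""
    return np.where(np.abs(C) >= tau, C, 0.0)

def soft_threshold(C, tau):
    """Soft thresholding: shrink all entries toward zero by tau,
    zeroing those whose magnitude falls below it."""
    return np.sign(C) * np.maximum(np.abs(C) - tau, 0.0)
```

Both masking and thresholding are elementwise, so they add negligible cost while making the graph used for convolutions sparser.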