Generating random correlation matrices with constraints
I.H. van der Brug (TU Delft - Electrical Engineering, Mathematics and Computer Science)
D. Kurowicka – Mentor (TU Delft - Applied Probability)
Nestor Parolya – Graduation committee member (TU Delft - Statistics)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Correlation matrices play a central role in multivariate modelling across fields such as finance and statistics. However, generating valid correlation matrices remains a non-trivial problem because they must satisfy a global positive definiteness condition. This thesis investigates two methods for generating correlation matrices, together with extensions that control or influence the average correlation. The first method relies on a square root decomposition of the correlation matrix, parametrizing it as the product of a matrix with unit-norm rows and its transpose. A recent extension of this method by Tuitman et al. is explored, which enables the generation of matrices with a fixed average correlation. This is achieved by constructing the decomposition iteratively so that the weighted sum of vectors has a prescribed norm, corresponding to the target average correlation. The algorithm's geometric structure, feasibility conditions, and statistical properties are analysed.
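To illustrate the square root parametrization, the following is a minimal sketch in Python with NumPy. It covers only the basic, unconstrained construction: a matrix A with unit-norm rows gives a valid correlation matrix R = A Aᵀ. It is not the fixed-average algorithm of Tuitman et al.; the function names, the choice of Gaussian rows, and the parameter k are assumptions made for illustration. The sketch also shows why the norm of the summed vectors matters: the sum of all entries of A Aᵀ equals the squared norm of the (here unweighted) sum of the rows of A, so the average off-diagonal correlation is determined by that norm.

```python
import numpy as np

def random_correlation_sqrt(n, k=None, seed=None):
    """Sample an n x n correlation matrix as R = A A^T with unit-norm rows.

    Illustrative sketch only: rows are normalised Gaussian vectors, which is
    an assumption and not necessarily the sampling scheme used in the thesis.
    """
    rng = np.random.default_rng(seed)
    k = n if k is None else k              # number of columns of A (hypothetical parameter)
    A = rng.standard_normal((n, k))
    A /= np.linalg.norm(A, axis=1, keepdims=True)   # project each row onto the unit sphere
    R = A @ A.T                            # positive semidefinite by construction
    np.fill_diagonal(R, 1.0)               # remove rounding error on the diagonal
    return R, A

def average_correlation(R):
    """Mean of the off-diagonal entries of a correlation matrix."""
    n = R.shape[0]
    return (R.sum() - n) / (n * (n - 1))

if __name__ == "__main__":
    n = 5
    R, A = random_correlation_sqrt(n, seed=0)
    s = A.sum(axis=0)                      # unweighted sum of the row vectors
    # Both expressions give the same average correlation (up to rounding).
    print(average_correlation(R))
    print((s @ s - n) / (n * (n - 1)))
```

Prescribing the norm of the summed vectors therefore prescribes the average correlation of that single matrix, which is the property the fixed-average extension relies on.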
The second method is based on the C-vine construction using partial correlations, as introduced by Joe and Kurowicka. This approach parametrizes the matrix through a structured sequence of partial correlations, exploiting the one-to-one mapping between a set of partial correlations and a full correlation matrix. The distribution from which these partial correlations are sampled can be adjusted to achieve specific properties in the resulting matrices; for example, sampling from specific Beta distributions yields matrices that follow the LKJ distribution. An extension by Joe and Kurowicka is investigated, which allows the expected value of each correlation to be fixed across samples.
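The mapping from C-vine partial correlations to a correlation matrix can be sketched as follows in Python with NumPy. This is a minimal illustration, not the implementation from the thesis: the function name and the parameter eta are assumptions, and the Beta parameters follow the commonly cited choice for the LKJ distribution, in which the partial correlations in tree k (counting from 1) are drawn from a Beta(b, b) distribution rescaled to (-1, 1) with b = eta + (d - 1 - k) / 2. The extension that fixes the expected value of each correlation is not covered here.

```python
import numpy as np

def random_correlation_cvine(d, eta=1.0, seed=None):
    """Sample a d x d correlation matrix from C-vine partial correlations.

    Illustrative sketch. With the Beta parameters below the result is
    intended to follow the LKJ distribution with shape eta (an assumption
    based on the commonly cited parametrisation).
    """
    rng = np.random.default_rng(seed)
    P = np.zeros((d, d))   # P[k, i]: partial correlation of k and i given variables 0..k-1
    R = np.eye(d)
    for k in range(d - 1):
        b = eta + (d - 2 - k) / 2.0        # tree index is k + 1 in 1-based counting
        for i in range(k + 1, d):
            P[k, i] = 2.0 * rng.beta(b, b) - 1.0   # rescale from (0, 1) to (-1, 1)
            p = P[k, i]
            # Remove the conditioning variables one at a time, from the last
            # back to the first, to recover the unconditional correlation.
            for l in range(k - 1, -1, -1):
                p = p * np.sqrt((1 - P[l, i] ** 2) * (1 - P[l, k] ** 2)) + P[l, i] * P[l, k]
            R[k, i] = R[i, k] = p
    return R

if __name__ == "__main__":
    R = random_correlation_cvine(4, eta=1.0, seed=1)
    print(np.round(R, 3))
    print(np.linalg.eigvalsh(R).min())     # should be positive
```

Because every partial correlation lies strictly inside (-1, 1), any choice of sampling distribution on that interval yields a valid correlation matrix; only the distribution of the resulting matrices changes.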
A comparison of both methods is provided in terms of construction, flexibility, numerical stability, and statistical properties of the resulting matrices. While the square root decomposition method offers strict per-matrix control over the average correlation, the C-vine approach provides greater flexibility, enabling finer control over marginal distributions. The thesis concludes with a discussion on practical trade-offs and potential directions for future work.