Repository hosted by TU Delft Library


Matrix correlations for high-dimensional data: The modified RV-coefficient

Publication files: not online

Author: Smilde, A.K. · Kiers, H.A.L. · Bijlsma, S. · Rubingh, C.M. · Erk, M.J. van
Institution: TNO Kwaliteit van Leven
Source: Bioinformatics, 3, 25, 401-405
Identifier: 241416
doi: 10.1093/bioinformatics/btn634
Keywords: Biology · Analytical research · Biomedical research · Bioinformatics · Controlled study · Correlation coefficient · Data analysis · Functional genomics · Matrix correlation · Metabolomics · Nonhuman · Priority journal · RV coefficient · Simulation · Statistical analysis · Theory · Transcriptomics · Algorithms · Computer Simulation · Genomics · Biomedical Innovation · Healthy Living


Motivation: Modern functional genomics generates high-dimensional datasets. It is often convenient to have a single simple number characterizing the relationship between pairs of such high-dimensional datasets in a comprehensive way. Matrix correlations are such numbers and are appealing since they can be interpreted in the same way as Pearson's correlations familiar to biologists. The high dimensionality of functional genomics data is, however, problematic for existing matrix correlations. The motivation of this article is 2-fold: (i) we introduce the idea of matrix correlations to the bioinformatics community and (ii) we give an improvement of the most promising matrix correlation coefficient (the RV-coefficient) circumventing the problems of high-dimensional data.

Results: The modified RV-coefficient can be used in high-dimensional data analysis studies as an easy measure of the common information of two datasets. This is shown by theoretical arguments, simulations and applications to two real-life examples from functional genomics, i.e. a transcriptomics and a metabolomics example.

© The Author 2008. Published by Oxford University Press. All rights reserved.
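The abstract does not give formulas, so the following is only an illustrative sketch of how a matrix correlation of this kind might be computed. It assumes the usual definition of the RV-coefficient as the normalized inner product of the two sample cross-product matrices XX' and YY', and it assumes that the modification amounts to setting the diagonals of those matrices to zero before taking the inner products. The function names, the column-centering step and the pure-Python list representation are choices made for this example and are not taken from the record.

```python
import math


def center_columns(rows):
    """Subtract each column mean (assumed preprocessing, not from the record)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def cross_product(rows, drop_diagonal=False):
    """Return the n x n matrix X X' for a matrix given as a list of n rows.

    With drop_diagonal=True the diagonal is set to zero, which is the
    modification assumed in this sketch.
    """
    n = len(rows)
    s = [[sum(a * b for a, b in zip(rows[i], rows[j])) for j in range(n)]
         for i in range(n)]
    if drop_diagonal:
        for i in range(n):
            s[i][i] = 0.0
    return s


def inner(a, b):
    """Element-wise inner product of two equally sized matrices."""
    return sum(u * v for ra, rb in zip(a, b) for u, v in zip(ra, rb))


def rv(x, y, modified=False):
    """Matrix correlation between x (n x p) and y (n x q), sharing n rows."""
    sx = cross_product(center_columns(x), drop_diagonal=modified)
    sy = cross_product(center_columns(y), drop_diagonal=modified)
    return inner(sx, sy) / math.sqrt(inner(sx, sx) * inner(sy, sy))


if __name__ == "__main__":
    # Two small made-up matrices measured on the same four samples.
    x = [[1.0, 2.0, 3.0], [4.0, 5.0, 7.0], [2.0, 0.0, 1.0], [3.0, 3.0, 3.0]]
    y = [[2.0, 1.0], [0.0, 3.0], [5.0, 5.0], [1.0, 2.0]]
    print("RV:", rv(x, y))
    print("modified RV (sketch):", rv(x, y, modified=True))
```

Both variants return 1 when a matrix is compared with itself. Under these assumptions the classic coefficient lies between 0 and 1, while the zero-diagonal variant can also take negative values. The two matrices only need to share the same rows (samples); the number of columns may differ.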