Using out-of-batch reference populations to improve untargeted metabolomics for screening inborn errors of metabolism

Journal Article (2020)
Author(s)

Michiel Bongaerts (Erasmus MC)

Ramon Bonte (Erasmus MC)

Serwet Demirdas (Erasmus MC)

Edwin H. Jacobs (Erasmus MC)

Esmee Oussoren (Erasmus MC)

Ans T. van der Ploeg (Erasmus MC)

Margreet A.E.M. Wagenmakers (Erasmus MC)

Robert M.W. Hofstra (Erasmus MC)

Henk J. Blom (Erasmus MC)

Marcel J.T. Reinders (TU Delft - Electrical Engineering, Mathematics and Computer Science)

George J.G. Ruijter (Erasmus MC)

Research Group
Pattern Recognition and Bioinformatics
DOI related publication
https://doi.org/10.3390/metabo11010008 (final published version)
Publication Year
2020
Language
English
Journal title
Metabolites
Issue number
1
Volume number
11
Article number
8
Pages (from-to)
1-40
Collections
Institutional Repository
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Untargeted metabolomics is an emerging technology in the laboratory diagnosis of inborn errors of metabolism (IEM). Analysis of a large number of reference samples is crucial for correcting variations in metabolite concentrations that result from factors such as diet, age, and gender, so that one can judge whether metabolite levels are abnormal. However, a large number of reference samples requires the use of out-of-batch samples, which is hampered by the semi-quantitative nature of untargeted metabolomics data, i.e., technical variations between batches. Methods to merge and accurately normalize data from multiple batches are urgently needed. Using six metrics, we compared existing normalization methods on their ability to reduce batch effects across nine independently processed batches. Many of these showed only marginal performance, which motivated us to develop Metchalizer, a normalization method that uses 10 stable isotope-labeled internal standards and a mixed-effect model. In addition, we propose a regression model with age and sex as covariates, fitted on reference samples obtained from all nine batches. Metchalizer applied to log-transformed data showed the most promising performance in batch effect removal as well as in the detection of 195 known biomarkers across 49 IEM patient samples, and performed at least comparably to an approach using 15 within-batch reference samples. Furthermore, our regression model indicates that 6.5–37% of the considered features showed significant age-dependent variation. Our comprehensive comparison of normalization methods showed that our Log-Metchalizer approach enables the use of out-of-batch reference samples to establish clinically relevant reference values for metabolite concentrations. These findings open up the possibility of using large-scale out-of-batch reference samples in a clinical setting, increasing throughput and detection accuracy.
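
The idea of judging a patient's metabolite level against reference samples while accounting for age and sex can be illustrated with a small sketch. This is not the model from the article: the abstract does not give its functional form, so the example assumes a plain linear least-squares fit on log-transformed abundances of a single feature, and all data are simulated. The function names `fit_reference_model` and `z_score` and every numeric value are hypothetical.

```python
import numpy as np

def fit_reference_model(log_abundance, age, sex):
    """Least-squares fit of log abundance on intercept, age and sex (coded 0/1).

    Assumption: a simple linear dependence on age; the article's model may differ.
    Returns the coefficient vector and the residual standard deviation.
    """
    X = np.column_stack([np.ones_like(age, dtype=float), age, sex])
    beta, *_ = np.linalg.lstsq(X, log_abundance, rcond=None)
    residuals = log_abundance - X @ beta
    dof = len(log_abundance) - X.shape[1]
    sigma = np.sqrt(residuals @ residuals / dof)
    return beta, sigma

def z_score(value, age, sex, beta, sigma):
    """Deviation of one sample from its age- and sex-matched expectation."""
    expected = beta[0] + beta[1] * age + beta[2] * sex
    return (value - expected) / sigma

# Simulated reference population (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
n = 200
age = rng.uniform(0, 60, n)
sex = rng.integers(0, 2, n).astype(float)
log_abundance = 10.0 + 0.02 * age + 0.1 * sex + rng.normal(0, 0.2, n)

beta, sigma = fit_reference_model(log_abundance, age, sex)

# A hypothetical patient sample with a strongly elevated level of this feature.
patient_z = z_score(12.5, age=5.0, sex=1.0, beta=beta, sigma=sigma)
print(f"age coefficient: {beta[1]:.3f}, residual SD: {sigma:.3f}, patient z: {patient_z:.1f}")
```

In a sketch like this, a larger reference population tightens the estimates of the age and sex coefficients, which is one reason pooling reference samples across batches is attractive once batch effects have been removed.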