Effect of Output Granularity on SARS-CoV-2 Variant Abundance Estimates using Domestic Wastewater Sequencing

Bachelor Thesis (2022)

Author(s)

Y. Kalia (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

J.A. Baaijens – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

K.A. Hildebrandt – Graduation committee member (TU Delft - Computer Graphics and Visualisation)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

SARS-CoV-2 Variant Prediction COVID Wastewater Abundance RNA-Sequencing

To reference this document use:

https://resolver.tudelft.nl/uuid:362dc6bc-6e93-4221-8573-8b7acb0f7ef0

Publication Year

2022

Language

English

Graduation Date

28-01-2022

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Monitoring SARS-CoV-2 variants is crucial to efforts to combat the COVID-19 pandemic. Lineage-level abundance estimates for SARS-CoV-2 can be obtained from viral material present in domestic wastewater. These abundance predictions can be made at different levels of granularity: at the level of individual lineages (high granularity) or at the level of variants (low granularity). This paper asks to what extent abundance predictions are more accurate at lower granularity. We show that when wastewater samples contain only one lineage, low-granularity predictions are in general more accurate than high-granularity predictions for all lineages across the Alpha, Delta and Mu variants. Variant-level overestimation, which had been expected to be a possible reason for low-granularity predictions being less accurate than high-granularity ones, was not observed in this experiment. When multiple lineages of a variant were combined into one wastewater sample, the prediction error rose because of the smaller relative abundances of the genome sequences. In this setting, overestimation was observed because the predictions for all lineages were pooled into a single lineage, and the overestimated high-granularity lineage prediction was more accurate than the low-granularity prediction. If samples are expected to contain only a very small number of lineages, it is better to make predictions at low granularity. Conversely, as the relative abundances of lineages in a sample decrease because many lineages are present, lineage-level predictions become more likely to have a smaller relative prediction error, making high granularity the better choice for accurate predictions.
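The difference between the two granularities can be illustrated with a minimal Python sketch. It is not the thesis's pipeline: the lineage-to-variant mapping is a small illustrative subset, the function names are hypothetical, the abundance values are made up, and the relative error is assumed here to be |predicted - true| / true, which may differ from the definition used in the thesis.

```python
from collections import defaultdict

# Illustrative subset of a lineage-to-variant mapping (assumption, not
# taken from the thesis). Values are WHO variant labels.
LINEAGE_TO_VARIANT = {
    "B.1.1.7": "Alpha",
    "Q.1": "Alpha",
    "B.1.617.2": "Delta",
    "AY.4": "Delta",
    "B.1.621": "Mu",
}


def aggregate_to_variants(lineage_abundances, mapping=LINEAGE_TO_VARIANT):
    """Sum lineage-level abundances (high granularity) into
    variant-level abundances (low granularity)."""
    totals = defaultdict(float)
    for lineage, abundance in lineage_abundances.items():
        # Lineages missing from the mapping are grouped under "Other".
        totals[mapping.get(lineage, "Other")] += abundance
    return dict(totals)


def relative_error(predicted, true):
    """Relative prediction error, assumed to be |predicted - true| / true."""
    if true <= 0:
        raise ValueError("true abundance must be positive")
    return abs(predicted - true) / true


# Made-up example: the sample truly contains a single Alpha lineage at 10%,
# but the prediction splits part of that signal over a sibling lineage.
true_lineages = {"B.1.1.7": 10.0}
predicted_lineages = {"B.1.1.7": 6.0, "Q.1": 3.0, "B.1.617.2": 1.0}

# High granularity: compare the prediction for the true lineage only.
lineage_err = relative_error(
    predicted_lineages["B.1.1.7"], true_lineages["B.1.1.7"]
)

# Low granularity: compare the summed Alpha prediction with the Alpha truth.
variant_err = relative_error(
    aggregate_to_variants(predicted_lineages)["Alpha"],
    aggregate_to_variants(true_lineages)["Alpha"],
)

print(f"lineage-level relative error: {lineage_err:.2f}")
print(f"variant-level relative error: {variant_err:.2f}")
```

In this made-up single-lineage case the variant-level error is smaller because signal misassigned to a sibling lineage is recovered by the aggregation, which mirrors the single-lineage finding described in the abstract. It does not reproduce the multi-lineage case, where the abstract reports the opposite outcome.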
