Understanding the influence of DNA fragment lengths in detecting cancer

Detection of cancer using blood

Bachelor Thesis (2024)

Author(s)

M. Păun (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

I.B. Pronk – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Daan Hazelaar – Mentor (Erasmus MC)

S. Makrodimitris – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

M.J.T. Reinders – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

J.A. Pouwelse – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty

Electrical Engineering, Mathematics and Computer Science

To reference this document use

https://resolver.tudelft.nl/uuid:4eb1a7bb-8199-44f9-a3bc-eaeb676d9061

Publication Year

2024

Language

English

Graduation Date

28-06-2024

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Detecting cancer at an early stage could change the course of the disease. Liquid biopsy of blood is a non-invasive examination that reveals biomarkers indicating whether a tumour is present in the body. This research examines the relevance of DNA fragments, specifically their lengths, to cancer detection. The fragment length distribution was analysed in depth as a predictor of whether a patient is healthy or has cancer. The distribution was explored from four perspectives: the complete fragment length distribution, the size range from 90 to 150 bp, the important lengths selected by feature extraction methods, and the Fourier transform of the initial data. Each was used as input to three machine learning models. Using the fragment lengths between 93 and 98 bp produced accuracy and AUC scores above 0.85 for all supervised classification models. Processing the data with the Fourier transform and using the amplitudes of the spectra as features in the Random Forest model resulted in an AUC of 0.99.
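To make the feature perspectives in the abstract more concrete, the following is a minimal sketch, not the thesis code. It builds a normalised fragment length distribution, restricts it to the 90 to 150 bp range, and computes the amplitude spectrum of its discrete Fourier transform. The function names, the toy fragment lengths and the 1 to 400 bp default range are assumptions made for illustration. In practice a library routine such as `numpy.fft.rfft` would likely replace the naive transform shown here.

```python
import cmath
from collections import Counter


def length_histogram(lengths, min_len=1, max_len=400):
    """Normalised fragment length distribution over [min_len, max_len].

    The default bounds are an illustrative assumption, not taken from the thesis.
    """
    counts = Counter(l for l in lengths if min_len <= l <= max_len)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no fragments in the requested range")
    return [counts.get(l, 0) / total for l in range(min_len, max_len + 1)]


def amplitude_spectrum(signal):
    """Amplitudes of the one-sided discrete Fourier transform (naive O(n^2))."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        spectrum.append(abs(coeff))
    return spectrum


# Hypothetical fragment lengths (in bp) for a single sample.
toy_lengths = [166, 167, 167, 168, 145, 95, 96, 330]

full_distribution = length_histogram(toy_lengths)       # complete distribution
short_range = length_histogram(toy_lengths, 90, 150)    # 90 to 150 bp range
fourier_features = amplitude_spectrum(full_distribution)  # spectrum amplitudes
```

Each of the three lists could then serve as one sample's feature vector for a classifier such as a Random Forest. The classification step is omitted here because the abstract does not specify the model settings.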

Files

Research_paper_Monica_Paun.pdf

(pdf | 0.422 MB)

License info not available