Evaluating Machine Learning Approaches for Predicting Drug Response in Cancer Cells

A Comparative Analysis of Geneformer and Support Vector Machine

Bachelor Thesis (2024)

Author(s)

S. Banas (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

N. Brouwer – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

M.J.T. Reinders – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

N.M. Gürel – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)

Faculty

Electrical Engineering, Mathematics and Computer Science

Machine learning, Cancer, SVM, Transcriptomics, Geneformer

To reference this document use:

https://resolver.tudelft.nl/uuid:00e969c3-2f2e-4969-bf89-6621d4ecd16f

More Info

Publication Year

2024

Language

English

Graduation Date

23-06-2024

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Accurately predicting how cancer cells respond to drug treatment is important for advancing drug development. This paper presents a comparative analysis of Geneformer, a deep-learning transformer pre-trained on transcriptomic data, and a Support Vector Machine (SVM). Using the Sciplex2 dataset, which includes transcriptomic data from lung cancer cells treated with three drugs, both models were trained to predict the response of cancer cells to drug treatments.

This paper investigates four questions: how Geneformer and SVM perform in predicting the treatment label of cells across different drugs and doses, which drug doses are suitable for conducting single-gene perturbation experiments, how accurately these experiments can replicate drug effects, and how Geneformer and SVM differ in their ability to identify significant genes affecting drug response.

Results indicate that while SVM generally achieves higher accuracy in predicting treatment labels of cells, Geneformer is better at identifying genes whose perturbations mimic drug effects. Geneformer's embeddings show significant shifts towards treated cell states after single-gene perturbations, suggesting a deeper understanding of gene interactions in drug response. In contrast, SVM's predictions rely more on differential gene expression. This comparative analysis underscores the strengths and limitations of each approach in modelling complex biological systems and predicting the drug response of cancer cells.
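The idea of an embedding "shifting towards the treated cell state" can be illustrated with a minimal sketch. It assumes cosine similarity as the distance measure, which the abstract does not specify; the function names and the toy two-dimensional vectors are hypothetical and are not taken from the thesis or from the Geneformer codebase.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def shift_toward_treated(control, perturbed, treated):
    """Change in similarity to the treated state caused by a perturbation.

    A positive value means the perturbed embedding lies closer to the
    treated state than the unperturbed control embedding did.
    """
    return cosine(perturbed, treated) - cosine(control, treated)

# Toy embeddings (assumed for illustration only).
control = [1.0, 0.0]    # untreated cell
treated = [0.0, 1.0]    # drug-treated cell state
perturbed = [1.0, 1.0]  # control cell after a single-gene perturbation

shift = shift_toward_treated(control, perturbed, treated)
print(f"shift towards treated state: {shift:.3f}")
```

Under these assumptions, ranking genes by this shift would give one way to shortlist perturbations that mimic a drug's effect; the thesis itself may use a different metric or aggregation.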

Files

Samuel_Banas_Research_Paper_Fi... (pdf)

(pdf | 0.763 Mb)

License info not available