Two-Step Transfer Learning Improves Deep Learning–Based Drug Response Prediction in Small Datasets

A Case Study of Glioblastoma

Journal Article (2025)
Author(s)

Jie Ju (Erasmus MC)

Ioannis Ntafoulis (Erasmus MC)

Michelle Klein (Erasmus MC)

MJT Reinders (TU Delft - Pattern Recognition and Bioinformatics)

Martine Lamfers (Erasmus MC)

Andrew P. Stubbs (Erasmus MC)

Yunlei Li (Erasmus MC)

Research Group
Pattern Recognition and Bioinformatics
DOI related publication
https://doi.org/10.1177/11779322241301507
Publication Year
2025
Language
English
Volume number
19
Pages (from-to)
1-12
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

While deep learning (DL) is used to predict patient outcomes, the shortage of patient samples limits its accuracy. In this study, we investigated how transfer learning (TL) alleviates the small sample size problem. A 2-step TL framework was constructed for a difficult task: predicting the response to the drug temozolomide (TMZ) in glioblastoma (GBM) cell cultures. GBM is aggressive, and most patients do not benefit from TMZ, the only approved chemotherapeutic agent. O6-methylguanine-DNA methyltransferase (MGMT) promoter methylation status is the only biomarker for TMZ responsiveness but has shown limited predictive power. The 2-step TL framework was built on 3 datasets: (1) a subset of the Genomics of Drug Sensitivity in Cancer (GDSC) dataset, comprising miscellaneous cell cultures treated with TMZ, cyclophosphamide, bortezomib, and oxaliplatin, as the source dataset; (2) the Human Glioblastoma Cell Culture (HGCC) dataset, for fine-tuning; and (3) a small target dataset, GSE232173, for validation. The latter two consist specifically of TMZ-treated GBM cell cultures. A separate DL model was pretrained on the GDSC cell cultures treated with each of the 4 drugs. The pretrained models were then refined on HGCC, where the best source drug was identified. Finally, the resulting DL model was validated on GSE232173. Two-step TL with pretraining on oxaliplatin not only outperformed the models without TL and with 1-step TL but also outperformed 3 benchmark methods, including MGMT. The oxaliplatin-based TL probably improved performance by increasing the weights of cell cycle-related genes, which relate to the TMZ response processes. Our findings support the potential of oxaliplatin as an alternative therapy for patients with GBM and of TL as a means of facilitating drug repurposing research.
We recommend following our methodology to enhance drug response prediction: use mixed cancers and a related drug as the source, and then fine-tune the model with the target cancer and the target drug.
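
The workflow described in the abstract (pretrain on each source drug, fine-tune on an intermediate dataset, pick the best source, then validate on the target) can be illustrated with a minimal sketch. This is not the authors' implementation. It uses a plain linear model trained by stochastic gradient descent in place of a deep network, and purely synthetic data in place of GDSC, HGCC, and GSE232173. It also treats drug response as a continuous score, which is an assumption. All names (`drug_A`, `drug_B`, `fit`, `two_step_transfer`) and hyperparameters are hypothetical.

```python
import random

def predict(w, x):
    """Linear model; the last entry of w is the bias term."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w[-1]

def mse(w, X, y):
    """Mean squared error of the model on a dataset."""
    return sum((predict(w, x) - t) ** 2 for x, t in zip(X, y)) / len(y)

def fit(X, y, w=None, lr=0.01, epochs=50):
    """Plain SGD on squared error. Passing `w` warm-starts from earlier
    weights, which is the (simplified) transfer mechanism used here."""
    n_features = len(X[0])
    w = [0.0] * (n_features + 1) if w is None else list(w)
    for _ in range(epochs):
        for x, t in zip(X, y):
            err = predict(w, x) - t
            for j in range(n_features):
                w[j] -= lr * err * x[j]
            w[-1] -= lr * err
    return w

def make_data(rng, true_w, n, noise=0.1):
    """Synthetic stand-in for expression features and response scores."""
    X = [[rng.gauss(0, 1) for _ in true_w] for _ in range(n)]
    y = [sum(a * b for a, b in zip(true_w, x)) + rng.gauss(0, noise) for x in X]
    return X, y

def two_step_transfer(sources, ft_train, ft_val, target):
    """Pretrain per source drug, fine-tune, select the best source on
    held-out fine-tuning data, then score that model on the target set."""
    results = {}
    for name, (Xs, ys) in sources.items():
        w = fit(Xs, ys)                       # step 1: pretrain on the source drug
        w = fit(*ft_train, w=w, epochs=5)     # step 2: fine-tune on the intermediate set
        results[name] = (mse(w, *ft_val), w)
    best = min(results, key=lambda k: results[k][0])
    return best, mse(results[best][1], *target)

rng = random.Random(0)
target_w = [1.0, -2.0, 0.5, 0.0, 1.5]         # hypothetical "true" target-drug effect
sources = {
    "drug_A": make_data(rng, [0.9, -1.8, 0.4, 0.1, 1.4], 200),    # similar to target
    "drug_B": make_data(rng, [-1.0, 2.0, 0.0, 1.0, -1.0], 200),   # dissimilar
}
ft_train = make_data(rng, target_w, 20)       # small intermediate set (training split)
ft_val = make_data(rng, target_w, 20)         # intermediate set (selection split)
target = make_data(rng, target_w, 10)         # small target set, used only for validation

best, target_mse = two_step_transfer(sources, ft_train, ft_val, target)
print(best, round(target_mse, 3))
```

The design point this sketch tries to capture is that source selection happens on the intermediate dataset, so the small target set is touched only once, for the final validation.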