Strategies for Fine-Tuning Geneformer to Predict the Exposure Level of Cancer Cells to Treatments

A Comparison of Different Fine-Tuning Strategies for Foundation Models

Abstract

Studying how genes interact within a cell is of significant interest in medicine, as it can reveal what drives the behavior of a cell under specific circumstances, such as disease. Once understood, gene interactions can enable the synthesis of efficient, possibly personalized treatments for these diseases and other disorders. However, studying gene interactions requires a large number of samples, which can be costly and laborious to obtain for rare disorders with little recorded data. Geneformer, a context-aware, attention-based deep learning model, was created specifically to address this problem. The model uses transfer learning: knowledge gained from a larger, similar source domain is applied to a downstream domain with limited data, on which the model is then further trained. In this paper, we assess four fine-tuning strategies, including the one used throughout the in silico experiments presented in the original Geneformer paper, to determine whether the accuracy of Geneformer on the downstream task of predicting the sensitivity of cancer cells to different treatments can be improved over the default implementation described in that paper. The model was first fine-tuned on a training set compiled from the sciplex2 dataset and then used to predict the dosage levels to which samples in a test set had been exposed. From this experiment, we conclude that the suitability of a fine-tuning strategy for a given task depends on how knowledge from the source domain is stored inside the pre-trained model and on the similarity between the source and target domains. Hence, there is no single optimal fine-tuning method for predicting the level to which cancer cells were exposed to treatments such as nutlin-3A.