Long-Short Term Memory Model for chromosomal aberration detection in Non-Invasive Prenatal Testing

Abstract

In 1997 it was discovered that fragments of DNA circulate freely in blood plasma and that, during pregnancy, this DNA originates from both the mother and the fetus. This circulating free DNA has made it possible to test for chromosomal aberrations in the fetus through non-invasive methods, thereby avoiding the 1 in 100 risk of miscarriage associated with invasive testing. Since then, multiple methods have been developed to detect chromosomal abnormalities with increasing accuracy and decreasing cost. The current state-of-the-art method, WISECONDOR, builds a within-sample reference set and uses it to calculate a z-score over a sliding window to determine whether an aberration is present. Here, we introduce a deep learning approach to non-invasive prenatal testing (NIPT) in the form of a Long Short-Term Memory (LSTM) model, which takes a sequence of GC-normalized read counts per genomic bin and outputs a label, healthy or aberrated, for each bin. To test the performance of both WISECONDOR and the newly proposed model, data are simulated and multiple experiments are set up to examine the influence of certain aspects of NIPT. The comparison with WISECONDOR shows that the performance of the LSTM model is still too inconsistent, owing to its sensitivity to the initialization of the weights and to the composition of the training set.
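The idea of scoring bins against a within-sample reference with a sliding-window z-score can be illustrated with a minimal sketch. This is not WISECONDOR's implementation: the function names, the toy data, the window size, the threshold, the hand-picked reference bins, and the use of Stouffer's method to combine per-bin z-scores are all assumptions made for illustration.

```python
import math
import statistics


def bin_z_scores(counts, reference_bins):
    """z-score of each bin against its own within-sample reference bins."""
    scores = []
    for i, value in enumerate(counts):
        ref_values = [counts[j] for j in reference_bins[i]]
        mean = statistics.mean(ref_values)
        spread = statistics.stdev(ref_values)  # sample standard deviation
        scores.append((value - mean) / spread)
    return scores


def windowed_calls(z_scores, window=5, threshold=3.0):
    """Flag each bin whose centred window has a combined |z| above threshold."""
    half = window // 2
    calls = []
    for i in range(len(z_scores)):
        chunk = z_scores[max(0, i - half): i + half + 1]
        # Stouffer's method (an assumption here): sum of z-scores / sqrt(n)
        combined = sum(chunk) / math.sqrt(len(chunk))
        calls.append(abs(combined) > threshold)
    return calls


# Hypothetical toy data: 40 GC-normalized bins with small deterministic noise.
counts = [1.0 + 0.02 * ((i * 7) % 5 - 2) for i in range(40)]
for i in range(28, 34):
    counts[i] *= 1.5  # simulated gain covering bins 28..33

# Hand-picked reference bins (assumption): bins 20..39 are compared with
# bins 0..19, and bins 0..19 with the unaffected bins 20..27.
reference_bins = [
    list(range(20)) if i >= 20 else list(range(20, 28)) for i in range(40)
]

z = bin_z_scores(counts, reference_bins)
calls = windowed_calls(z)
print([i for i, flagged in enumerate(calls) if flagged])
```

Because the windows overlap, bins directly adjacent to the simulated gain can also be flagged; a smaller window localizes the call more tightly at the cost of more noise per window.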