Acceleration of Seed Extension for BWA-MEM DNA Alignment Using GPUs

Master thesis (2019)

Authors

J.J.G. Lévy Electrical Engineering, Mathematics and Computer Science

Contributors

Z. Al-Ars Computer Engineering - (supervisor 1)

N. Ahmed Computer Engineering - (supervisor 1)

M. Zandrahimi Computer Engineering - (supervisor 2)

R. Ishihara QID/Ishihara Lab - QuTech Advanced Research Centre (supervisor 2)

Faculty

Electrical Engineering, Mathematics and Computer Science

DNA Biology GPU Parallel processing

More Info

expand_more

To reference this document use:

http://resolver.tudelft.nl/uuid:bd22471f-058a-4071-95bb-5126e263124b

Published Date

19-08-2019

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Faculty

Electrical Engineering, Mathematics and Computer Science

Abstract

DNA aligning is a compute-intensive and time-consuming task required for all further DNA processing. It consists in finding for each DNA string from a sample its location in a reference genome. Usual techniques for short reads (hundreds of bases) involve seed-extension, where a small matching seed is found with quick search through FM-index and then extended on both ends with a custom Smith-Waterman algorithm, giving optimal solution. However this seed-extension takes a tremendous amount of time. This is why we present in this thesis a solution to offload extension on a GPU to be done in a parallel fashion. This is possible thanks to the fact that the DNA reads do not present any kind of relation between each other. We used the Burrows-Wheeler Aligner - Maximal Exact Match (BWA-MEM), a reference program in the field, to which we integrated a GPU-accelerated library for extension, GASAL2. However, BWA-MEM has a left-right dependency on extension starting scores, with the left alignment starting with the seed score, then the right part starting with the previously calculated left score. We designed a solution by starting both extensions with the seed score, we called this method the "seed-only" paradigm. On our test machine featuring 12 hyperthreaded cores and an NVIDIA Tesla K40c, when running with 12 threads, we could observe a raw kernel speed-up of 4.8x ; but if we allow non-blocking calls to let the CPU run the seeding tasks while the GPU computes the extension, we can reach a 16x effective speed-up for the extension. This extension part takes around 27% of the total time but our acceleration introduces a small overhead due to memory preparations and copying, which makes the whole application 1.28x faster, getting close to the theoretical maximum of 1.37x. Additionally, the paradigm shift we operated creates minimal differences in the final main scores on good quality alignments, with a 1.82% difference for our 5.2 million sequences data set. This makes our acceleration with GASAL2 an solid and efficient solution for a single machine.

Files

Jonathan_LEVY_Master_Thesis.pd... (.pdf)

(.pdf | 2.61 Mb)