Multi-GPU Brain

A multi-node implementation for an extended Hodgkin-Huxley simulator

Master Thesis (2019)
Author(s)

M.A. van der Vlag (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

C. Strydis – Mentor (Erasmus MC)

Zaid Al-Ars – Coach (TU Delft - Computer Engineering)

A.J. van Genderen – Graduation committee member (TU Delft - Computer Engineering)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2019 Michiel van der Vlag
Publication Year
2019
Language
English
Graduation Date
28-02-2019
Awarding Institution
Delft University of Technology
Programme
Computer Engineering
Sponsors
Erasmus MC
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Current brain simulators do not scale linearly to realistic problem sizes (e.g. >100,000 neurons), which makes them impractical for researchers. The goal of this thesis is to explore the use of true multi-GPU acceleration for computationally challenging brain models and to assess the scalability of such models given sufficient access to multi-node acceleration platforms. The brain model used is a state-of-the-art, extended Hodgkin-Huxley, biophysically meaningful, three-compartmental model of the inferior-olivary nucleus. Not only the simulation of the cells but also the setup of the network is taken into account when designing and benchmarking the multi-GPU version. For network sizes varying from 65K to 4M cells, densities of 10, 100 and 1,000 synapses per neuron are simulated, executed on 8, 16, 32 and 48 GPUs. A Gaussian-distributed network of 4 million cells with a density of 1,000 synapses per neuron, executed on 48 GPUs, is set up and simulated in 465.69 seconds, of which the cell-computation phase takes 4.57 seconds: a 50x speedup over execution on a single GPU. A uniformly distributed network of the same size and density is set up and simulated in 889.89 seconds, of which the cell-computation phase takes 10.09 seconds: an 8x speedup over execution on a single GPU. For the implemented design, inter-GPU communication becomes the major bottleneck, as latency increases with the size of the transmitted packets. This communication overhead does not dominate the overall execution, however, so scaling to larger network sizes remains tractable. This scalable design gives neuroscientists a good prospect, demonstrating that large network-size simulations are feasible on a multi-GPU setup.
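
The abstract splits each simulation step into a cell-computation phase and an inter-GPU communication phase. The minimal CUDA sketch below illustrates that split; it is not the thesis code, and the two-GPU layout, the toy membrane equation in cell_step, and the full-slice voltage exchange are all illustrative assumptions. It also makes visible why the exchanged packet grows with the per-GPU slice, which is the latency bottleneck the abstract describes.

/* Illustrative sketch only: two GPUs each own half of the cells; after each
 * step the owned voltage slice is exchanged so that coupling (synaptic)
 * currents can read neighbour voltages. Names are hypothetical. */
#include <cuda_runtime.h>
#include <stdio.h>

#define N_CELLS  65536           /* smallest network size from the thesis */
#define N_GPUS   2
#define N_LOCAL  (N_CELLS / N_GPUS)
#define DT       0.05f           /* ms, illustrative time step */

/* Toy membrane update: leak toward rest plus coupling to a remote voltage.
 * Stands in for the real three-compartment Hodgkin-Huxley update. */
__global__ void cell_step(float *v_local, const float *v_remote)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= N_LOCAL) return;
    float coupling = v_remote[i] - v_local[i];
    v_local[i] += DT * (-0.1f * (v_local[i] + 65.0f) + 0.01f * coupling);
}

int main(void)
{
    float *v[N_GPUS], *halo[N_GPUS];

    for (int g = 0; g < N_GPUS; g++) {
        cudaSetDevice(g);
        cudaDeviceEnablePeerAccess(1 - g, 0);   /* allow direct GPU-GPU copies */
        cudaMalloc(&v[g],    N_LOCAL * sizeof(float));
        cudaMalloc(&halo[g], N_LOCAL * sizeof(float));
        cudaMemset(v[g],    0, N_LOCAL * sizeof(float));
        cudaMemset(halo[g], 0, N_LOCAL * sizeof(float));
    }

    for (int step = 0; step < 1000; step++) {
        /* Phase 1: cell computation, fully independent per GPU. */
        for (int g = 0; g < N_GPUS; g++) {
            cudaSetDevice(g);
            cell_step<<<(N_LOCAL + 255) / 256, 256>>>(v[g], halo[g]);
        }
        /* Barrier: all kernels must finish before buffers are exchanged. */
        for (int g = 0; g < N_GPUS; g++) {
            cudaSetDevice(g);
            cudaDeviceSynchronize();
        }
        /* Phase 2: inter-GPU communication. This transfer is the "packet"
         * whose size grows with the per-GPU slice and whose latency the
         * abstract identifies as the main bottleneck. */
        for (int g = 0; g < N_GPUS; g++)
            cudaMemcpyPeer(halo[1 - g], 1 - g, v[g], g,
                           N_LOCAL * sizeof(float));
    }

    for (int g = 0; g < N_GPUS; g++) {
        cudaSetDevice(g);
        cudaDeviceSynchronize();
        cudaFree(v[g]);
        cudaFree(halo[g]);
    }
    printf("simulated %d cells on %d GPUs\n", N_CELLS, N_GPUS);
    return 0;
}

The two-phase structure mirrors the phases benchmarked in the abstract: the kernel loop corresponds to the cell-computation time (4.57 s and 10.09 s in the reported runs), while the peer copies correspond to the communication overhead that grows with packet size.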

Files

Thesis_MAvanderVlag.pdf
(pdf | 4.83 Mb)
License info not available