Multi-GPU Brain

A multi-node implementation for an extended Hodgkin-Huxley simulator

Master Thesis (2019)
Author(s)

M.A. van der Vlag (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

C. Strydis – Mentor (Erasmus MC)

Zaid Al-Ars – Coach (TU Delft - Computer Engineering)

A.J. van Genderen – Graduation committee member (TU Delft - Computer Engineering)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2019 Michiel van der Vlag
Publication Year
2019
Language
English
Graduation Date
28-02-2019
Awarding Institution
Delft University of Technology
Programme
Computer Engineering
Sponsors
Erasmus MC
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Current brain simulators do not scale linearly to realistic problem sizes (e.g. >100,000 neurons), which makes them impractical for researchers. The goal of this thesis is to explore the use of true multi-GPU acceleration for computationally challenging brain models and to assess the scalability of such models given sufficient access to multi-node acceleration platforms. The brain model used is a state-of-the-art, extended Hodgkin-Huxley, biophysically meaningful, three-compartmental model of the inferior-olivary nucleus. Not only the simulation of the cells but also the setup of the network is taken into account when designing and benchmarking the multi-GPU version. For network sizes varying from 65K to 4M cells, densities of 10, 100 and 1,000 synapses per neuron are simulated, executed on 8, 16, 32 and 48 GPUs. A Gaussian-distributed network of 4 million cells with a density of 1,000 synapses per neuron, executed on 48 GPUs, is set up and simulated in 465.69 seconds, of which the cell-computation phase takes 4.57 seconds: a 50x speedup over execution on a single GPU. A uniformly distributed network of the same size and density is set up and simulated in 889.89 seconds, of which the cell-computation phase takes 10.09 seconds: an 8x speedup over execution on a single GPU. For the implemented design, inter-GPU communication becomes the major bottleneck, as latency increases with the size of the transmitted packets. This communication overhead does not dominate the overall execution, however, so scaling to larger network sizes remains tractable. This scalable design gives neuroscientists a good prospect, demonstrating that large network-size simulations are feasible on a multi-GPU setup.
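
The abstract splits each simulation step into a cell-computation phase and an inter-GPU communication phase. The minimal CUDA sketch below illustrates that split; it is not the thesis code, and the two-GPU layout, the toy membrane equation in cell_step, and the full-slice voltage exchange are all illustrative assumptions. It also makes visible why the exchanged packet grows with the per-GPU slice, which is the latency bottleneck the abstract describes.

/* Illustrative sketch only: two GPUs each own half of the cells; after each
 * step the owned voltage slice is exchanged so that coupling (synaptic)
 * currents can read neighbour voltages. Names are hypothetical. */
#include <cuda_runtime.h>
#include <stdio.h>

#define N_CELLS  65536           /* smallest network size from the thesis */
#define N_GPUS   2
#define N_LOCAL  (N_CELLS / N_GPUS)
#define DT       0.05f           /* ms, illustrative time step */

/* Toy membrane update: leak toward rest plus coupling to a remote voltage.
 * Stands in for the real three-compartment Hodgkin-Huxley update. */
__global__ void cell_step(float *v_local, const float *v_remote)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= N_LOCAL) return;
    float coupling = v_remote[i] - v_local[i];
    v_local[i] += DT * (-0.1f * (v_local[i] + 65.0f) + 0.01f * coupling);
}

int main(void)
{
    float *v[N_GPUS], *halo[N_GPUS];

    for (int g = 0; g < N_GPUS; g++) {
        cudaSetDevice(g);
        cudaDeviceEnablePeerAccess(1 - g, 0);   /* allow direct GPU-GPU copies */
        cudaMalloc(&v[g],    N_LOCAL * sizeof(float));
        cudaMalloc(&halo[g], N_LOCAL * sizeof(float));
        cudaMemset(v[g],    0, N_LOCAL * sizeof(float));
        cudaMemset(halo[g], 0, N_LOCAL * sizeof(float));
    }

    for (int step = 0; step < 1000; step++) {
        /* Phase 1: cell computation, fully independent per GPU. */
        for (int g = 0; g < N_GPUS; g++) {
            cudaSetDevice(g);
            cell_step<<<(N_LOCAL + 255) / 256, 256>>>(v[g], halo[g]);
        }
        /* Barrier: all kernels must finish before buffers are exchanged. */
        for (int g = 0; g < N_GPUS; g++) {
            cudaSetDevice(g);
            cudaDeviceSynchronize();
        }
        /* Phase 2: inter-GPU communication. This transfer is the "packet"
         * whose size grows with the per-GPU slice and whose latency the
         * abstract identifies as the main bottleneck. */
        for (int g = 0; g < N_GPUS; g++)
            cudaMemcpyPeer(halo[1 - g], 1 - g, v[g], g,
                           N_LOCAL * sizeof(float));
    }

    for (int g = 0; g < N_GPUS; g++) {
        cudaSetDevice(g);
        cudaDeviceSynchronize();
        cudaFree(v[g]);
        cudaFree(halo[g]);
    }
    printf("simulated %d cells on %d GPUs\n", N_CELLS, N_GPUS);
    return 0;
}

The two-phase structure mirrors the phases benchmarked in the abstract: the kernel loop corresponds to the cell-computation time (4.57 s and 10.09 s in the reported runs), while the peer copies correspond to the communication overhead that grows with packet size.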

Files

Thesis_MAvanderVlag.pdf
(pdf | 4.83 Mb)
License info not available