Ultra-low-latency deep neural network inference for gravitational wave interferometers

Abstract

Research into low-latency Convolutional Neural Network (CNN) inference is gaining momentum for tasks such as speech and image classification, because CNNs are able to surpass human accuracy in image classification. To improve the measurement setup of a gravitational wave interferometer, low-latency CNN inference is investigated. The CNN must process image data to enable certain automatic controls in the control system: within 0.1 ms of taking an image, the control system has to obtain the result of the deep neural network. Hardware acceleration is needed to reduce the execution latency of the network and reach the 0.1 ms requirement. Field-Programmable Gate Arrays (FPGAs) in particular can provide this acceleration, because they allow the layers of the network to be highly customised and thus achieve the lowest possible latency.

To reduce the design effort and complexity of the machine learning design, Xilinx introduced the FINN (Fast, Scalable Quantized Neural Network Inference on FPGAs) framework. FINN is an end-to-end deep learning framework that generates dataflow-style architectures customised for each network. To establish whether FINN can create the required ultra-low-latency CNN, some of FINN's pretrained networks are used. The first network investigated is the Tiny Fully Connected (TFC) network, a multilayer perceptron (MLP) for MNIST classification with three fully connected layers. The second is the convolutional neural network named CNV, a derivative of the VGG16 topology, which is used for deep learning image classification problems and contains multiple convolutional layers.

With the analysis tools included in FINN, it can be determined whether FINN is able to create the required ultra-low-latency CNN. The TFC network can be parallelised down to a total of 5 expected cycles, with 1 expected cycle per layer: one for the input quantization implemented as standalone thresholding, three for the fully connected layers and one for the output layer. For the CNV network, on the other hand, the initial convolution layer cannot go below 8196 expected cycles because of certain bottlenecks in FINN. These bottlenecks arise from how FINN implements certain layers and from layers that simply cannot be parallelised any further to lower their latency. To see whether CNV could still meet the latency requirement, a software emulation of the execution of the network was performed. This emulation showed that by progressively increasing the parallelisation parameters together with the clock frequency, it is possible to create an ultra-low-latency CNN pipeline. The resulting configuration has 45866 expected cycles in total for the network, 8196 expected cycles for its slowest layer, and needs a minimum clock frequency of 200 MHz. With this configuration it is possible to create a pipeline with a latency below 0.1 ms.
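The cycle and latency figures above follow from a simple relation between per-layer cycle counts, the parallelisation (folding) parameters and the clock frequency. The sketch below is a minimal illustration of that relation, not FINN's actual API: the function names and the simplified folding formula for a fully connected layer are assumptions for illustration, while the 8196-cycle and 200 MHz figures are the ones quoted in the abstract.

    # Minimal sketch (assumed names, not FINN's actual API) of how per-layer
    # expected cycles and latency relate to the folding (parallelisation)
    # parameters and the clock frequency.

    def fc_expected_cycles(in_features: int, out_features: int,
                           simd: int, pe: int) -> int:
        # Simplified folding model for a fully connected layer: each of the
        # out_features // pe output groups takes in_features // simd cycles.
        assert in_features % simd == 0 and out_features % pe == 0
        return (in_features // simd) * (out_features // pe)

    def cycles_to_latency_us(cycles: int, clk_mhz: float) -> float:
        # cycles / (clk_mhz * 1e6 Hz) seconds, expressed in microseconds.
        return cycles / clk_mhz

    # A fully parallelised layer (SIMD = number of inputs, PE = number of
    # outputs) needs a single expected cycle, as for the TFC layers above
    # (the layer width of 64 is illustrative).
    assert fc_expected_cycles(64, 64, simd=64, pe=64) == 1

    # The slowest CNV layer is stuck at 8196 expected cycles; at the minimum
    # clock frequency of 200 MHz this corresponds to roughly 41 microseconds.
    print(cycles_to_latency_us(8196, clk_mhz=200.0))  # ~40.98

In this simplified view, raising the folding parameters divides a layer's cycle count while raising the clock frequency shortens each cycle, which are exactly the two levers the software emulation explores to bring the pipeline below the 0.1 ms target.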