Modeling Inference Time of Deep Neural Networks on Memory-constrained Systems

Abstract

Deep neural networks have revolutionized multiple fields within computer science. A thorough understanding of the memory requirements and performance of deep networks on low-resource systems is therefore important. While there have been efforts to this end, the effects of severe memory limits and heavy swapping remain understudied. We profiled multiple deep networks under varying memory restrictions and on different hardware. Using this data, we develop two modeling approaches that predict a network's execution time from a description of its layers and the available memory. The first approach engineers predictive features through a theoretical analysis of the computations required to execute each layer. The second uses LASSO regression to select predictive features from an expanded set of predictors. Both approaches achieve a mean absolute percentage error of 5% on log-transformed data, but their accuracy degrades when predictions are transformed back to linear space.
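The second modeling approach and the evaluation issue noted above can be illustrated with a minimal sketch (not the thesis's actual code): a LASSO model fit to log-transformed execution times, with MAPE computed both in log space and after exponentiating the predictions back to linear space. All data and feature names here are synthetic stand-ins for the expanded predictor set.

```python
# Sketch of LASSO feature selection on log-transformed targets.
# Synthetic data only; predictors loosely mimic layer descriptors
# (e.g. parameter count, FLOPs, activation size, available memory)
# plus interaction terms forming an "expanded" feature set.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500
X = rng.uniform(1, 100, size=(n, 4))
expanded = np.hstack([X, X[:, :1] * X[:, 1:2], X[:, 2:3] / X[:, 3:4]])
# Synthetic "true" log execution time depends on only a few predictors,
# so LASSO can zero out the rest.
log_t = 1.0 + 0.03 * expanded[:, 1] + 3e-4 * expanded[:, 4] \
        + rng.normal(0, 0.1, n)

X_tr, X_te, y_tr, y_te = train_test_split(expanded, log_t, random_state=0)
model = make_pipeline(StandardScaler(), Lasso(alpha=0.01)).fit(X_tr, y_tr)

pred_log = model.predict(X_te)
mape_log = np.mean(np.abs((y_te - pred_log) / y_te)) * 100
# Back-transform: an additive error in log space becomes a
# multiplicative error in linear space, so MAPE can grow.
mape_lin = np.mean(np.abs((np.exp(y_te) - np.exp(pred_log))
                          / np.exp(y_te))) * 100
print(f"log-space MAPE:    {mape_log:.1f}%")
print(f"linear-space MAPE: {mape_lin:.1f}%")
```

The sketch shows why a low MAPE on log-transformed targets does not carry over directly: a small absolute error e in log space corresponds to a relative error of roughly |exp(e) - 1| after back-transformation, independent of how large the log-space target itself was.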