Analysis of the effect of caching convolutional network layers on resource-constrained devices

Abstract

Using transfer learning, convolutional neural networks built for different purposes can share similar layers, which can be cached and reused to reduce their load time. Four ways of loading and executing these layers (bulk, linear, DeepEye and partial loading) were analysed under different memory constraints and different numbers of similar networks. When sufficient memory is available, caching decreases the loading time and always benefits the single-threaded bulk and linear modes. For the multithreaded approaches this only holds when the loading time exceeds the execution time, which depends largely on which network is run. Under memory constraints, caching makes it possible to run multiple networks without much additional cost. Alternatively, a device can be fitted with less memory and still achieve the same results by combining transfer learning with caching.
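The core idea, caching layers that several transfer-learned networks have in common so they are loaded from storage only once, can be illustrated with a minimal sketch. The names below (LayerCache, load_layer_weights, the layer identifiers) are hypothetical and not taken from the thesis; the sketch only shows where the reduction in loading time comes from on a cache hit.

```python
import time
from typing import Dict


def load_layer_weights(layer_id: str) -> bytes:
    """Simulate loading one convolutional layer's weights from storage."""
    time.sleep(0.05)        # stand-in for disk/flash read latency
    return b"\x00" * 1024   # placeholder weight blob


class LayerCache:
    """In-memory cache of layer weights shared by several networks."""

    def __init__(self) -> None:
        self._cache: Dict[str, bytes] = {}

    def get(self, layer_id: str) -> bytes:
        # A cache hit skips the load entirely; shared layers are
        # read from storage only once across all networks.
        if layer_id not in self._cache:
            self._cache[layer_id] = load_layer_weights(layer_id)
        return self._cache[layer_id]


cache = LayerCache()
# Two networks that share their early (transfer-learned) layers:
net_a = ["conv1", "conv2", "conv3", "fc_a"]
net_b = ["conv1", "conv2", "conv3", "fc_b"]
for layer in net_a + net_b:
    cache.get(layer)        # conv1..conv3 are loaded only once
```

In practice the cache size would be bounded by the device's memory constraint, which is exactly the trade-off the four loading modes are evaluated against.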
