Evaluating and Improving Large-Scale Machine Learning Frameworks

Master Thesis (2019)
Author(s)

D.O. Graur (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Jan S. Rellermeyer – Mentor (TU Delft - Data-Intensive Systems)

Gustavo Alonso – Mentor (ETH Zürich)

Dick Epema – Graduation committee member (TU Delft - Data-Intensive Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2019 Dan Graur
Publication Year
2019
Language
English
Graduation Date
10-09-2019
Awarding Institution
Delft University of Technology
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Given the increasing popularity of Machine Learning and the ever-growing need to solve larger and more complex learning problems, it is unsurprising that numerous distributed learning strategies, along with many large-scale Machine Learning frameworks, have been proposed in recent years. It is unclear, however, how well these strategies perform across different cluster and batch sizes, or what their hardware demands are, as there is little public research on the matter. Identifying the weaknesses and limitations of the parameter update strategies is nevertheless essential to making large-scale Machine Learning more efficient and commonplace. This thesis seeks to answer these questions and to provide evidence of the strategies' limitations and the root causes behind them. To make the study possible, the thesis examines concrete implementations of the strategies within the TensorFlow and Caffe2 frameworks.
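To make the notion of a "parameter update strategy" concrete, the sketch below illustrates one common such strategy, synchronous data-parallel SGD: each worker computes a gradient on its own mini-batch, and a central step waits for all gradients, averages them, and applies a single update. This is only a minimal single-process illustration of the general idea (the function names, the least-squares objective, and the simulated three-worker setup are all assumptions for this example, not the thesis's implementation or the TensorFlow/Caffe2 APIs).

```python
import numpy as np

def worker_gradient(weights, x, y):
    """One worker's gradient of 0.5 * ||x @ w - y||^2 on its mini-batch."""
    return x.T @ (x @ weights - y)

def synchronous_update(weights, batches, lr):
    """Synchronous step: collect every worker's gradient, average them,
    then apply a single update. The step rate is bounded by the slowest
    worker, which is one source of the scaling limits studied here."""
    grads = [worker_gradient(weights, x, y) for x, y in batches]
    return weights - lr * np.mean(grads, axis=0)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])

# Simulate three workers, each holding its own noise-free mini-batch.
batches = []
for _ in range(3):
    x = rng.normal(size=(8, 2))
    batches.append((x, x @ true_w))

w = np.zeros(2)
for _ in range(200):
    w = synchronous_update(w, batches, lr=0.05)
print(np.round(w, 2))  # converges toward [2, -1]
```

Asynchronous variants drop the barrier and let workers push gradients as they finish, trading gradient staleness for throughput; comparing such trade-offs across cluster and batch sizes is the kind of question the thesis investigates.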

Files

MSc_Thesis_Graur_Dan.pdf
(pdf | 86.6 MB)
License info not available