Evaluating and Improving Large-Scale Machine Learning Frameworks

Abstract

Given the increasing popularity of Machine Learning, and the ever-growing need to solve larger and more complex learning problems, it is unsurprising that numerous distributed learning strategies have been put forward in recent years, along with many large-scale Machine Learning frameworks. It is unclear, however, how well these strategies perform across different cluster and batch sizes, or what their hardware demands are, as there is little research in the public domain on this matter. Identifying the weaknesses and limitations of the parameter update strategies is nevertheless essential to increasing the efficiency of large-scale Machine Learning and making it commonplace. This thesis seeks to answer these questions and to provide evidence of the strategies' limitations and the root causes behind them. To make the study possible, the thesis examines particular implementations of the strategies within the TensorFlow and Caffe2 frameworks.