EasyCompress

Automated Compression for Deep Learning Models

Master Thesis (2023)
Author(s)

A. Van Steenweghen (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Luís Cruz – Mentor (TU Delft - Software Engineering)

Rui Maranhao – Graduation committee member (Universidade do Porto)

Arie van Deursen – Coach (TU Delft - Software Technology)

Jan C. van Gemert – Coach (TU Delft - Pattern Recognition and Bioinformatics)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2023
Language
English
Graduation Date
03-07-2023
Awarding Institution
Delft University of Technology
Programme
Computer Science | Software Technology
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Over the past years, deep learning models have grown consistently in size. This growth has led to significant improvements in performance, but at the expense of increased computational resource demands. Compression techniques can improve the efficiency of deep learning models by shrinking their size and computational needs while preserving performance.


This thesis presents EasyCompress, an automated and user-friendly tool for compressing deep learning models. The tool improves on existing compression research by focusing on generalizability and practical usability in three ways. First, it aligns with specific compression objectives and performance requirements, ensuring the compression accomplishes its intended goal effectively. Second, it employs flexible compression techniques, making it applicable to a diverse set of models without requiring deep model knowledge. Finally, it automates the compression process, eliminating difficult and time-consuming implementation efforts.


EasyCompress intelligently selects, tailors, and combines various compression techniques to minimize model size, latency, or number of computations while preserving performance. It employs structured pruning to reduce the number of parameters and computations, uses knowledge distillation techniques to ensure better accuracy recovery, and uses quantization to achieve additional compression.
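To make the pipeline above concrete, the sketch below illustrates two of the named techniques on a single dense layer: structured pruning (dropping whole output channels by L2 norm, so the layer genuinely shrinks rather than just becoming sparse) and symmetric int8 quantization. This is a minimal NumPy illustration under assumed conventions, not the EasyCompress implementation; the function names and the 50% keep ratio are illustrative choices.

```python
import numpy as np

def prune_channels(weight: np.ndarray, keep_ratio: float) -> np.ndarray:
    """Structured pruning sketch: drop the output channels (rows) with the
    smallest L2 norm, so parameter count and FLOPs shrink proportionally."""
    norms = np.linalg.norm(weight, axis=1)          # one norm per output channel
    k = max(1, int(round(keep_ratio * weight.shape[0])))
    keep = np.sort(np.argsort(norms)[-k:])          # indices of the k strongest channels
    return weight[keep]

def quantize_int8(weight: np.ndarray):
    """Symmetric per-tensor int8 quantization: store weights as int8 plus
    one float scale, roughly quartering storage versus float32."""
    scale = np.max(np.abs(weight)) / 127.0
    q = np.round(weight / scale).astype(np.int8)
    return q, scale

# Toy 8x4 layer: keeping half the channels halves the layer's parameters.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 4))
w_small = prune_channels(w, keep_ratio=0.5)
q, scale = quantize_int8(w_small)
print(w.shape, "->", w_small.shape)   # (8, 4) -> (4, 4)
```

In a real pipeline the pruned model would then be fine-tuned (here is where knowledge distillation from the original model helps accuracy recovery) before quantization is applied as a final step.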


The tool’s effectiveness is evaluated across diverse model architectures and configurations. Experimental results on a range of models and datasets demonstrate its ability to reduce model size by at least 5-fold, inference time by at least 1.5-fold, and the number of computations by at least 3-fold. Most compression rates are even higher, reaching 10-, 20-, and even 100-fold reductions.


The tool is available online at https://thesis.abelvansteenweghen.com.

Files

Thesis_Final.pdf
(pdf | 1.23 Mb)
License info not available