Compression of the embedding layer in an LSTM model using tensor train decomposition for NLP

Abstract

Natural Language Processing (NLP) deals with the understanding and processing of human language by computer software. Several deep learning architectures are used for NLP: techniques such as recurrent neural networks and feed-forward neural networks are used to build language models that perform a variety of NLP tasks. Over the years, researchers have developed state-of-the-art language models that achieve high accuracy and performance on NLP applications. With the development of deep neural network language models, the computational resource requirements and energy costs of training and running these models have increased. This has motivated research on compressing language models in order to reduce their computational complexity. One such method is tensor decomposition, for example the tensor-train (TT) decomposition. This thesis investigates the application of the TT-decomposition method for compressing the embedding layer of a long short-term memory (LSTM) model. Specifically, it examines how the factorization of the embedding layer, and the order of the factors when the layer is represented in the TT-matrix format, affect the maximum test accuracy of the LSTM model on the NLP task of sentiment analysis. This was done by considering three different factorizations of the embedding layer in the model. In addition, the effect of changing the TT-ranks (the hyperparameters of the model when the embedding layer is represented in the TT-matrix format) on the maximum test accuracy was investigated. Based on the empirical results obtained, the thesis concludes that using a larger number of factors in the factorization of the embedding layer increases the maximum test accuracy of the model. Furthermore, within a given factorization, arranging the factors so that the maximum values of the TT-ranks had a smaller gap between them improved the maximum test accuracy. In one particular configuration, the number of parameters was reduced by a factor of 24.5 compared to the original uncompressed model, while a maximum test accuracy of 77.10% was achieved, compared to 78.05% for the original model.
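
To illustrate the kind of parameter saving involved, the following is a minimal sketch with purely hypothetical dimensions, factorizations, and TT-ranks (none taken from the thesis itself), comparing the parameter count of a dense embedding matrix with the same matrix stored in the TT-matrix format:

```python
# Minimal sketch with hypothetical numbers: parameter count of a dense
# V x D embedding matrix versus the same matrix stored in TT-matrix format.
# The row dimension V and column dimension D are factorized as
# V = v1*v2*v3 and D = d1*d2*d3, and the reshaped tensor is represented by
# TT-cores of shape (r_{k-1}, v_k, d_k, r_k) with boundary ranks r_0 = r_3 = 1.

vocab_factors = [25, 30, 40]   # hypothetical factorization of V = 30000
embed_factors = [4, 8, 8]      # hypothetical factorization of D = 256
tt_ranks = [1, 16, 16, 1]      # hypothetical TT-ranks (model hyperparameters)

V = 1
for v in vocab_factors:
    V *= v
D = 1
for d in embed_factors:
    D *= d
dense_params = V * D

# Each TT-core contributes r_{k-1} * v_k * d_k * r_k parameters.
tt_params = sum(
    tt_ranks[k] * vocab_factors[k] * embed_factors[k] * tt_ranks[k + 1]
    for k in range(len(vocab_factors))
)

print(f"dense embedding parameters: {dense_params:,}")   # 7,680,000
print(f"TT-format parameters:       {tt_params:,}")      # 68,160
print(f"compression ratio:          {dense_params / tt_params:.1f}x")
```

In the TT-matrix format only the small cores are stored rather than the full dense embedding matrix, which is the source of the kind of parameter reduction reported in the thesis; the actual factorizations, ranks, and the resulting 24.5-fold reduction are specific to the configurations studied there.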