Binary Deep Learning

Binary Weights and Semantic Binary Data Compression


Abstract

Improving the efficiency of deploying deep neural networks (DNNs) and of processing complex high-dimensional data has drawn increasing attention in recent years. Yet the deployment of large DNN models is challenged by high computational complexity and energy consumption, making them difficult to run on resource-constrained devices such as mobile phones. Moreover, the exploding volume of high-dimensional data demands storage and transmission capacities that are infeasible on mobile devices.
To alleviate these limitations, this dissertation focuses on binarization techniques, including model binarization and data binarization, to improve efficiency in terms of storage, computation, and energy.
In model binarization, we binarize both the weights and activations of DNN models, achieving up to 32× memory savings and a 58× speed-up. We also develop pruning algorithms to further compress the binarized network while maintaining accuracy. To train binarized networks efficiently, we devise new optimization methods that have fewer hyper-parameters and improve accuracy.
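The memory and speed gains above come from replacing 32-bit float weights with single-bit signs, so that dot products reduce to XNOR and popcount operations. The following is a minimal sketch of this general idea (not the dissertation's actual algorithm); the function names and the per-tensor scaling factor are illustrative assumptions.

```python
import numpy as np

def binarize(weights):
    # Hypothetical sketch: keep only the sign of each 32-bit float weight
    # (+1 encoded as bit 1, -1 as bit 0), packing 8 signs per byte.
    # One bit instead of 32 bits per weight gives the 32x memory saving.
    bits = (weights >= 0).astype(np.uint8)
    packed = np.packbits(bits)          # 8 weights per byte
    scale = np.abs(weights).mean()      # illustrative per-tensor scale
    return packed, scale

def binary_dot(packed_a, packed_b, n):
    # For +/-1 vectors, dot = n - 2 * popcount(a XOR b), so the
    # multiply-accumulate is replaced by cheap bitwise operations.
    diff_bits = np.bitwise_xor(packed_a, packed_b)
    diff = int(np.unpackbits(diff_bits)[:n].sum())
    return n - 2 * diff

w = np.array([0.3, -0.7, 1.2, -0.1, 0.5, -0.9, 0.2, 0.8], dtype=np.float32)
x = np.array([1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0, 1.0], dtype=np.float32)
pw, _ = binarize(w)
px, _ = binarize(x)
print(binary_dot(pw, px, 8))   # equals the dot product of the sign vectors
```

In a real binarized network, the scale factor multiplies the binary dot product to approximate the full-precision result; the bitwise kernel is what enables the reported speed-ups on hardware.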
In data binarization, we propose deep hashing algorithms that learn compact binary data representations. Deep hashing has become an effective technique for fast and efficient similarity search and retrieval of high-dimensional data items in large databases.
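The efficiency of hashing-based retrieval rests on comparing short binary codes by Hamming distance, which is a single XOR plus popcount per pair. A minimal sketch of that search step, assuming codes have already been produced by some learned hash function (the codes and item names below are made up):

```python
def hamming_dist(a, b):
    # Number of differing bits between two binary codes: XOR then popcount.
    return bin(a ^ b).count("1")

# Hypothetical database of 8-bit hash codes produced by a learned hash function.
database = {
    "item0": 0b10110010,
    "item1": 0b10110011,
    "item2": 0b01001100,
}
query = 0b10110110

# Rank items by Hamming distance to the query code.
ranked = sorted(database, key=lambda k: hamming_dist(database[k], query))
print(ranked[0])   # nearest item under Hamming distance -> "item0"
```

Because the codes are tiny (tens of bits rather than thousands of floats) and the distance is bitwise, both storage and search cost drop by orders of magnitude compared with exact nearest-neighbor search in the original feature space.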