Recurrent Knowledge Distillation

Conference Paper (2018)
Author(s)

S. Pintea (TU Delft - Pattern Recognition and Bioinformatics)

Yue Liu (KTH Royal Institute of Technology)

Jan van Gemert (TU Delft - Pattern Recognition and Bioinformatics)

Research Group
Pattern Recognition and Bioinformatics
Copyright
© 2018 S. Pintea, Yue Liu, J.C. van Gemert
DOI (related publication)
https://doi.org/10.1109/ICIP.2018.8451253
Publication Year
2018
Language
English
Pages (from-to)
3393-3397
ISBN (print)
978-1-4799-7062-9
ISBN (electronic)
978-1-4799-7061-2
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Knowledge distillation compacts deep networks by letting a small student network learn from a large teacher network. The accuracy of knowledge distillation recently benefited from adding residual layers. We propose to reduce the size of the student network even further by recasting multiple residual layers of the teacher network into a single recurrent student layer. We propose three variants of adding recurrent connections into the student network, and show experimentally on CIFAR-10, Scenes and MiniPlaces that we can reduce the number of parameters with little loss in accuracy.

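The core idea, replacing a stack of distinct residual blocks with a single weight-shared block applied several times, can be illustrated with a short PyTorch sketch. This is not the authors' code: the module names, channel counts, and unroll depth below are illustrative assumptions, and the paper's three recurrent-connection variants are not reproduced here.

# Minimal sketch (not the authors' implementation): a teacher-style stage with
# T distinct residual blocks vs. a student-style stage that unrolls one
# weight-shared residual block T times. Names and sizes are assumptions.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)  # residual (skip) connection

class StackedResidual(nn.Module):
    """Teacher-style stage: T residual blocks, each with its own parameters."""
    def __init__(self, channels, depth):
        super().__init__()
        self.blocks = nn.Sequential(*[ResidualBlock(channels) for _ in range(depth)])

    def forward(self, x):
        return self.blocks(x)

class RecurrentResidual(nn.Module):
    """Student-style stage: one residual block applied T times with shared weights."""
    def __init__(self, channels, depth):
        super().__init__()
        self.block = ResidualBlock(channels)
        self.depth = depth

    def forward(self, x):
        for _ in range(self.depth):
            x = self.block(x)
        return x

if __name__ == "__main__":
    x = torch.randn(2, 64, 32, 32)
    teacher_stage = StackedResidual(64, depth=4)
    student_stage = RecurrentResidual(64, depth=4)
    count = lambda m: sum(p.numel() for p in m.parameters())
    print("stacked params:  ", count(teacher_stage))   # roughly depth x the recurrent stage
    print("recurrent params:", count(student_stage))
    print("output shapes match:", teacher_stage(x).shape == student_stage(x).shape)

In this sketch the recurrent stage holds the parameters of a single block regardless of the unroll depth, which is the source of the parameter reduction the abstract describes; a full student would still be trained with a distillation loss against the teacher's outputs.
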
Files

1805.07170.pdf
(PDF | 0.488 MB)
License info not available