Recurrent Knowledge Distillation

Conference Paper (2018)
Author(s)

Silvia L. Pintea (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Yue Liu (KTH Royal Institute of Technology)

Jan van Gemert (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Research Group
Pattern Recognition and Bioinformatics
DOI related publication
https://doi.org/10.1109/ICIP.2018.8451253 (final published version)
Publication Year
2018
Language
English
Pages (from-to)
3393-3397
ISBN (print)
978-1-4799-7062-9
ISBN (electronic)
978-1-4799-7061-2
Event
25th IEEE International Conference on Image Processing (2018-10-07 - 2018-10-10), Athens, Greece
Collections
Institutional Repository
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Knowledge distillation compacts deep networks by letting a small student network learn from a large teacher network. The accuracy of knowledge distillation has recently benefited from adding residual layers. We propose to reduce the size of the student network even further by recasting multiple residual layers in the teacher network into a single recurrent student layer. We propose three variants of adding recurrent connections to the student network and show experimentally, on CIFAR-10, Scenes and MiniPlaces, that we can reduce the number of parameters with little loss in accuracy.
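
The idea of recasting several residual layers into one recurrent layer can be illustrated with a minimal sketch. It is not the paper's implementation and does not reproduce any of its three variants. It assumes small fully connected residual blocks in place of convolutional ones, uses random untrained weights, and omits the distillation training itself. All names and sizes (`make_block`, `residual_step`, `WIDTH`, `DEPTH`) are hypothetical. The sketch only shows that applying one shared block repeatedly keeps the effective depth while storing the parameters of a single block.

```python
import math
import random


def make_block(width, rng):
    """Parameters of one residual block: a width x width matrix plus a bias."""
    scale = 1.0 / math.sqrt(width)
    weights = [[rng.uniform(-scale, scale) for _ in range(width)]
               for _ in range(width)]
    bias = [0.0] * width
    return weights, bias


def residual_step(x, block):
    """Compute x + relu(W x + b) for a single block."""
    weights, bias = block
    out = []
    for i, row in enumerate(weights):
        pre = sum(w * v for w, v in zip(row, x)) + bias[i]
        out.append(x[i] + max(0.0, pre))
    return out


def teacher_forward(x, blocks):
    """Teacher: a stack of distinct residual blocks, each with its own weights."""
    for block in blocks:
        x = residual_step(x, block)
    return x


def recurrent_student_forward(x, shared_block, steps):
    """Student: one block applied `steps` times, so weights are shared."""
    for _ in range(steps):
        x = residual_step(x, shared_block)
    return x


def count_params(blocks):
    """Total number of weights and biases stored in a list of blocks."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in blocks)


rng = random.Random(0)
WIDTH, DEPTH = 16, 3  # assumed sizes, chosen only for illustration

teacher_blocks = [make_block(WIDTH, rng) for _ in range(DEPTH)]
student_block = make_block(WIDTH, rng)

x = [rng.uniform(-1.0, 1.0) for _ in range(WIDTH)]
teacher_out = teacher_forward(x, teacher_blocks)
student_out = recurrent_student_forward(x, student_block, steps=DEPTH)

teacher_params = count_params(teacher_blocks)
student_params = count_params([student_block])
print(teacher_params, student_params)  # prints: 816 272
```

In this toy setting the recurrent student stores one third of the teacher's block parameters while still applying three residual steps. How much accuracy survives such sharing is an empirical question that the paper evaluates on CIFAR-10, Scenes and MiniPlaces.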

Files

1805.07170.pdf
(pdf | 0.488 Mb)
License info not available