Teaching How to Learn to Learn

Teacher-Student Curriculum Learning for Efficient Meta-Learning

Bachelor Thesis (2024)
Author(s)

B.B. Kovács (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

J.A. de Vries – Mentor (TU Delft - Sequential Decision Making)

Matthijs Spaan – Mentor (TU Delft - Sequential Decision Making)

P.K. Murukannaiah – Graduation committee member (TU Delft - Interactive Intelligence)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2024
Language
English
Graduation Date
25-06-2024
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

We investigate whether teacher-student curriculum learning, with a teacher network that has a simpler structure than the student network, can improve meta-learning performance. The goal of meta-learning is to learn from a set of tasks and then perform well on a new, structurally similar but unseen task with minimal retraining. Instead of sampling uniformly from all data to create the training batches, curriculum learning aims to construct a sequence of mini-batches, known as a curriculum, that enhances the training process. In teacher-student curriculum learning, a "teacher" network is first trained in the standard manner, and its outputs are then used to order the training samples by difficulty and group them into mini-batches. This curriculum is then used to train the "student" network. Previous teacher-student models used either pre-trained, more complex teachers or teachers with the same structure as the student network. We investigate whether a teacher network with a simpler structure can also increase accuracy while conserving computational resources. We find that using such a curriculum worsens performance compared to not using any curriculum at all.
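
The curriculum-construction step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. The abstract does not say how difficulty is measured or in which direction samples are ordered, so the sketch assumes that difficulty is the teacher's cross-entropy loss on each sample and that the curriculum runs from easy to hard. The function names `teacher_difficulty` and `build_curriculum` and the toy data are hypothetical.

```python
import math
from typing import List, Sequence


def teacher_difficulty(probs: Sequence[Sequence[float]],
                       labels: Sequence[int]) -> List[float]:
    """Score each sample by the teacher's cross-entropy loss.

    Assumption: a higher teacher loss means a harder sample.
    """
    eps = 1e-12  # guards against log(0)
    return [-math.log(max(p[y], eps)) for p, y in zip(probs, labels)]


def build_curriculum(difficulty: Sequence[float],
                     batch_size: int) -> List[List[int]]:
    """Order sample indices from easy to hard and cut them into mini-batches.

    Assumption: easy-to-hard ordering; the abstract does not fix the direction.
    """
    order = sorted(range(len(difficulty)), key=lambda i: difficulty[i])
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]


# Toy teacher outputs: class probabilities for four samples (made-up values).
teacher_probs = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.55, 0.45]]
labels = [0, 0, 1, 1]

difficulty = teacher_difficulty(teacher_probs, labels)
curriculum = build_curriculum(difficulty, batch_size=2)
print(curriculum)  # → [[0, 2], [3, 1]]
```

The student network would then be trained on these mini-batches in the listed order instead of on uniformly sampled batches.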

Files

CSE3000-3.pdf
(pdf | 0.442 MB)
License info not available