Finding the Needle in the Pre-Trained Model Zoo

The Use of Rich Metadata and Graph Learning to Estimate Task Transferability

Master Thesis (2024)

Author(s)

H.J. van der Wilk (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

R. Hai – Mentor (TU Delft - Web Information Systems)

Ziyu Li – Mentor (TU Delft - Web Information Systems)

Avishek Anand – Graduation committee member (TU Delft - Web Information Systems)

Q. Song – Graduation committee member (TU Delft - Embedded Systems)

Faculty

Electrical Engineering, Mathematics and Computer Science

Deep learning, Transfer learning, Graph learning, Transferability estimation, Model zoos

To reference this document use:

https://resolver.tudelft.nl/uuid:59a7191d-4522-4a74-8ccf-503d11a5101b

Publication Year

2024

Language

English

Graduation Date

25-06-2024

Awarding Institution

Delft University of Technology

Programme

Computer Science

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The democratization of machine learning through public repositories, often known as model zoos, has significantly increased the availability of pre-trained models for practitioners. However, this abundance can make it difficult to choose the most suitable pre-trained model for fine-tuning on new tasks. Although various methods have been proposed in the field of transferability estimation to address this issue, these methods can take hours to execute and may still fail to find the optimal pre-trained model for fine-tuning. By exploring a new graph learning-based approach to transferability estimation, we outperform state-of-the-art methods such as LogME, improving the accuracy of the best-predicted model by up to 31.5% in less than 5 minutes.

Files

Thesis_HvdW_final.pdf

(PDF | 11.5 MB)

License info not available