Shrink-Perturb Improves Architecture Mixing During Population Based Training for Neural Architecture Search

Conference Paper (2023)

Author(s)

Alexander Chebykin (Centrum Wiskunde & Informatica (CWI))

Arkadiy Dushatskiy (Centrum Wiskunde & Informatica (CWI), TU Delft - Algorithmics)

Tanja Alderliesten (TU Delft - Algorithmics, Leiden University Medical Center)

P.A.N. Bosman (TU Delft - Algorithmics, Centrum Wiskunde & Informatica (CWI))

Research Group

Algorithmics

DOI related publication

https://doi.org/10.3233/FAIA230294

To reference this document use:

https://resolver.tudelft.nl/uuid:2244d288-b123-4124-8b42-5698e783ef4d

Publication Year

2023

Language

English

Pages (from-to)

381-388

ISBN (electronic)

9781643684369

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

In this work, we show that simultaneously training and mixing neural networks is a promising way to conduct Neural Architecture Search (NAS). For hyperparameter optimization, reusing the partially trained weights allows for efficient search, as was previously demonstrated by the Population Based Training (PBT) algorithm. We propose PBT-NAS, an adaptation of PBT to NAS where architectures are improved during training by replacing poorly-performing networks in a population with the result of mixing well-performing ones and inheriting the weights using the shrink-perturb technique. After PBT-NAS terminates, the created networks can be directly used without retraining. PBT-NAS is highly parallelizable and effective: on challenging tasks (image generation and reinforcement learning) PBT-NAS achieves superior performance compared to baselines (random search and mutation-based PBT).
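
The replace-and-mix step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the network representation (a list of `(layer_type, weights)` tuples), the function names, the uniform per-layer choice between two parents, the replacement fraction, and the shrink and perturb coefficients are all assumptions made for illustration. Shrink-perturb is taken here in its usual form, where inherited weights are scaled down and a small multiple of freshly initialized weights is added.

```python
import random


def shrink_perturb(weights, fresh, shrink=0.5, perturb=0.1):
    """Scale inherited weights down and add a fraction of fresh random weights.

    The coefficient values are illustrative assumptions, not values from the paper.
    """
    return [shrink * w + perturb * f for w, f in zip(weights, fresh)]


def mix(parent_a, parent_b, rng, shrink=0.5, perturb=0.1):
    """Build a child by taking each layer position from one of two parents.

    Hypothetical scheme: layers kept from parent_a are copied unchanged, while
    layers taken from parent_b (the donor) inherit shrink-perturbed weights.
    """
    child = []
    for (kind_a, w_a), (kind_b, w_b) in zip(parent_a, parent_b):
        if rng.random() < 0.5:
            child.append((kind_a, list(w_a)))
        else:
            fresh = [rng.gauss(0.0, 1.0) for _ in w_b]
            child.append((kind_b, shrink_perturb(w_b, fresh, shrink, perturb)))
    return child


def pbt_nas_step(population, scores, rng, replace_frac=0.25):
    """Replace the worst networks with mixes of the best ones.

    Assumes a higher score is better; replace_frac is an illustrative choice.
    """
    order = sorted(range(len(population)), key=lambda i: scores[i], reverse=True)
    n = max(1, int(len(population) * replace_frac))
    top, bottom = order[:n], order[-n:]
    new_population = list(population)
    for idx in bottom:
        parent_a = population[rng.choice(top)]
        parent_b = population[rng.choice(top)]
        new_population[idx] = mix(parent_a, parent_b, rng)
    return new_population


if __name__ == "__main__":
    rng = random.Random(0)
    # Four toy networks, each with two layers and made-up weights.
    population = [
        [("conv3x3", [0.2, -0.1]), ("dense", [0.5])],
        [("conv5x5", [0.3, 0.4]), ("dense", [-0.2])],
        [("conv3x3", [0.1, 0.1]), ("attention", [0.7])],
        [("conv5x5", [-0.3, 0.2]), ("attention", [0.0])],
    ]
    scores = [0.9, 0.1, 0.5, 0.7]
    population = pbt_nas_step(population, scores, rng)
    print([kind for kind, _ in population[1]])
```

In a full run, a step like this would alternate with ordinary training of every network in the population, so that the mixed networks continue from inherited weights rather than from scratch.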

Files

FAIA_372_FAIA230294.pdf (PDF, 0.604 MB)