Title: Efficient Neural Architecture Search for Language Modeling
Author: Li, Mingxi (TU Delft Electrical Engineering, Mathematics and Computer Science; TU Delft Interactive Intelligence)
Contributor: Oliehoek, F.A. (mentor); Pan, W. (graduation committee); van Gemert, J.C. (graduation committee); Zhou, H. (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Electrical Engineering | Embedded Systems
Date: 2019-08-21
Abstract: Neural networks have achieved great success in many difficult learning tasks such as image classification, speech recognition and natural language processing. However, neural architectures are hard to design: doing so demands considerable knowledge and time from human experts. There has therefore been growing interest in automating the design of neural architectures, a problem known as neural architecture search (NAS). Although the searched architectures have achieved competitive performance on various tasks, the efficiency of NAS still needs to be improved. Moreover, current NAS approaches disregard the dependency between a node and its predecessors and successors. This thesis builds upon BayesNAS, which employs classic Bayesian learning to search for CNN architectures, and extends it to neural architecture search for recurrent architectures. Hierarchical sparse priors are used to model the architecture parameters in order to alleviate the dependency issue. Since the update of the posterior variance is based on the Laplace approximation, an efficient method to compute the Hessian of a recurrent layer is proposed. Candidate architectures can be found after training the over-parameterized network for only one epoch. Our experiments on Penn Treebank and WikiText-2 show that competitive architectures for the language modeling task can be found in 0.3 GPU days using a single GPU. We find that our algorithm is more efficient than state-of-the-art methods.
Subject: NAS; Deep learning; Artificial intelligence
To reference this document use: http://resolver.tudelft.nl/uuid:aa5c948d-43c4-480d-9818-43949c67a3b5
Part of collection: Student theses
Document type: master thesis
Rights: © 2019 Mingxi Li
Files: PDF Thesis_Final.pdf (1.11 MB)