Title: A Case for Deep Learning in Mining Software Repositories
Author: Nijessen, Rik (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributor: Gousios, G. (mentor); Hauff, C. (graduation committee); van Deursen, A. (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Computer Science | Software Technology
Date: 2017-11-10

Abstract: Repository mining researchers have successfully applied machine learning in a variety of scenarios. However, the use of deep learning in repository mining tasks is still in its infancy. In this thesis, we describe the advantages and disadvantages of using deep learning in mining software repositories research and demonstrate these through two case studies on pull requests. In the first, we train neural models to predict, on arrival, whether a pull request is going to be merged or not. In the second, we train neural models to answer the question: given two pull requests, are these similar? We show that by using neural models, researchers are able to avoid feature engineering, because these models can be trained on raw data. Furthermore, neural models have the potential to outperform traditional supervised machine learning models, because they are able to learn relevant features by themselves. However, the power of neural models comes at a cost: optimizing their parameters and explaining their behavior is difficult, and training them is costly. We therefore recommend that researchers consider neural architectures that perform well in other domains, such as natural language processing, before creating novel architectures. It is also important to include a less costly baseline when using neural models in research, to show that their power, and thereby their cost, is justified.
Subject: deep learning; mining software repositories; pull requests
To reference this document use: http://resolver.tudelft.nl/uuid:fc0cf997-4900-435c-b213-00e5828490de
Part of collection: Student theses
Document type: master thesis
Rights: © 2017 Rik Nijessen
Files: PDF thesis_rnijessen_final.pdf (1.53 MB)