An analysis of Java release practices on GitHub

Bachelor Thesis (2024)
Author(s)

V.C. Roest (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

S. Proksch – Mentor (TU Delft - Software Engineering)

C.B. Poulsen – Graduation committee member (TU Delft - Programming Languages)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2024 Vivian Roest
Publication Year
2024
Language
English
Graduation Date
02-02-2024
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This paper examines the release practices of Java Maven repositories on GitHub. Most prior research in this vein has focused on Maven Central, the largest Maven package repository. GitHub, however, hosts 15.5 million Java repositories that remain untapped, and it also offers a competitor to Maven Central: GitHub Packages. To this end, the paper establishes an index of all Java repositories on GitHub; this dataset also includes their Maven configuration (POM.xml) files. In addition, we analyze in depth a sample of 500 000 of those 15.5 million repositories. This sample contained 170 798 Java Maven repositories with POM.xml files, of which 6 507 (≈ 3.8%) had set up a distribution configuration. Maven Central was the most popular distribution target, but GitHub Packages and other repositories were also quite popular. Among the external repositories configured in those Java projects, we notice a distinct lack of GitHub Packages, while other repositories were still present. We theorize that the lower popularity of GitHub Packages is due to its authentication requirement, which is not trivial to set up. We discuss several approaches that could improve this situation.
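To illustrate what "distribution configuration" can look like in practice, the following is a minimal sketch, not the method used in the thesis. It assumes that distribution configuration means a `distributionManagement` element in a POM file that declares the standard Maven 4.0.0 namespace; a POM without that namespace would not be matched. The helper name `distribution_urls`, the sample POM, and the `OWNER/REPOSITORY` placeholder URL are all hypothetical. The last line simply recomputes the share reported in the abstract.

```python
import xml.etree.ElementTree as ET

# Namespace declared by standard Maven 4.0.0 POM files.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def distribution_urls(pom_text):
    """Return the URLs declared under <distributionManagement>.

    Hypothetical helper for illustration; it only looks at the
    <repository> and <snapshotRepository> children.
    """
    root = ET.fromstring(pom_text)
    urls = []
    for tag in ("repository", "snapshotRepository"):
        node = root.find(f"m:distributionManagement/m:{tag}/m:url", POM_NS)
        if node is not None and node.text:
            urls.append(node.text.strip())
    return urls


# Made-up POM; OWNER/REPOSITORY is a placeholder, not a real project.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>demo</artifactId>
  <version>1.0.0</version>
  <distributionManagement>
    <repository>
      <id>github</id>
      <url>https://maven.pkg.github.com/OWNER/REPOSITORY</url>
    </repository>
  </distributionManagement>
</project>"""

print(distribution_urls(SAMPLE_POM))

# Share of sampled Maven repositories with distribution configuration,
# using the counts given in the abstract.
print(f"{6507 / 170798:.1%}")
```

A repository would count as having distribution configuration under this sketch whenever the returned list is non-empty; the host part of each URL could then be used to group targets such as Maven Central or GitHub Packages.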

Files

Research_project.pdf
(pdf | 0.159 MB)
License info not available