Finding your digital sibling
Grouping GitHub projects that share certain attributes based on interactions and activities
R.W. de Bruin (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Sebastian Proksch – Mentor (TU Delft - Software Engineering)
S. Huang – Mentor (TU Delft - Software Technology)
Julia Olkhovskaya – Graduation committee member (TU Delft - Interactive Intelligence)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
This study explores the feasibility of categorizing GitHub projects by their interactions and activities, with the aim of helping researchers and practitioners navigate the vast landscape of open-source software. Through experiments and analysis, it identifies the key attributes that contribute to project categorization, which makes it possible to group projects by interactions and activities. The findings indicate distinct clusters among GitHub projects and highlight the influence of interactions and activities on how projects are categorized. The study underscores the need to refine grouping algorithms and improve project categorization methods in future research. Future work could also involve developing user-friendly tools for project discovery and exploring correlations between interaction-related metrics and project development dynamics. Overall, the study advances our understanding of project categorization on GitHub and supports more efficient knowledge sharing and collaboration within professional fields.