Vulnerability prealerting by monitoring the online repositories of open source projects

Master thesis (2023)

Authors

A. Westfalewicz Electrical Engineering, Mathematics and Computer Science

Contributors

S. Proksch Software Engineering - (supervisor 1)

Magiel Bruntink Software Improvement Group (supervisor 1)

Faculty

Electrical Engineering, Mathematics and Computer Science

To reference this document use:

http://resolver.tudelft.nl/uuid:be08d8c2-4fd6-405b-8861-804985cbecd5

Published Date

13-01-2023

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Software security plays a crucial role in a modern world that runs on software. While closed source projects can address security issues confidentially, open source projects handle them in public, even though just as many projects rely on them. In 50% of documented cases, the vulnerabilities could have been spotted almost 20 days before their disclosure, leaving plenty of time for a potential attacker to exploit the weakness.

Based on the results of a basic text search, we conclude that the majority of security-related activity is a reaction to known vulnerabilities and that maintainers do not always mention security terms when fixing exploits. We also confirm that many security-labeled issues are not pushed to vulnerability systems, even though the maintainers recognize their security aspect. Furthermore, although commit classification models can spot security-related commits automatically, the models struggle in realistic scenarios, and no particular feature or sampling method is vastly better than the others. Nonetheless, the state-of-the-art models we evaluated spot security-related commits with an F1 score of 0.36.

Given these findings, we conclude that security-related activity is hard to distinguish automatically from everyday development activity and that manual review is required to spot these traces. The proposed methods can make this review easier. We suggest that more attention be given to open source security to avoid early public traces of vulnerabilities.

Files

Westfalewicz_MSc_thesis.pdf

(.pdf | 3.37 Mb)