Beyond obfuscation: Signature-based and relocation-resistant vulnerability detection in Uber JARs

Master Thesis (2024)
Author(s)

D. Plămădeală (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Thomas Durieux – Mentor (TU Delft - Software Engineering)

A. Panichella – Graduation committee member (TU Delft - Software Engineering)

J.E.A.P. Decouchant – Graduation committee member (TU Delft - Data-Intensive Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2024
Language
English
Graduation Date
17-05-2024
Awarding Institution
Delft University of Technology
Programme
Computer Science
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Software development often relies on dependencies managed by package managers to simplify the integration of external libraries and frameworks, reducing development time. However, developers sometimes choose to bundle dependencies directly within their software packages. Bundling dependencies means including all necessary third-party libraries directly within the application's distributable archive, such as a JAR file, so that all components are present without needing external installations. This practice, which results in Uber JARs (or fat JARs), presents both challenges and advantages within the Maven ecosystem. This project examines the prevalence, risks, and impact of Uber JARs by analyzing over 9 million POM files and 12 million JAR artifacts from Maven Central, identifying artifacts with previously undetected vulnerabilities. Notably, 10.48% of the analyzed artifacts (915,089 in total) are Uber JARs, indicating a significant prevalence within the Maven repository. Central to this work is JarSift, a tool that detects the contents of Uber JARs, including the bundled libraries, their versions, and their vulnerabilities. JarSift achieves an F1 score ranging from 0.474 to 0.857, depending on the Uber JAR configuration. The analysis reveals that about 17.13% of the Uber JARs in a small-scale dataset contained undisclosed vulnerabilities, and that 0.63% of all libraries in our dataset fully matched known vulnerable libraries. These findings highlight the need for better detection and mitigation strategies in the Maven ecosystem and inform developers of potential risks, helping them implement more robust security measures.
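To make the notion of a bundled dependency concrete, the following is a minimal sketch of a metadata-based heuristic for spotting a likely Uber JAR. It is not JarSift's method: the thesis title points to a signature-based approach that tolerates relocation and obfuscation, whereas this sketch only assumes that Maven-built JARs usually carry a `META-INF/maven/<groupId>/<artifactId>/pom.properties` entry and that bundling tends to copy those entries into the final archive. The function names `embedded_maven_coordinates` and `looks_like_uber_jar` are hypothetical, chosen for illustration only.

```python
import io
import zipfile


def embedded_maven_coordinates(jar_file):
    """Return the sorted (groupId, artifactId) pairs found under META-INF/maven/.

    `jar_file` may be a filesystem path or a binary file-like object.
    """
    coords = set()
    with zipfile.ZipFile(jar_file) as jar:
        for name in jar.namelist():
            parts = name.split("/")
            # Assumed layout: META-INF/maven/<groupId>/<artifactId>/pom.properties
            if (
                len(parts) == 5
                and parts[:2] == ["META-INF", "maven"]
                and parts[4] == "pom.properties"
            ):
                coords.add((parts[2], parts[3]))
    return sorted(coords)


def looks_like_uber_jar(jar_file):
    """Heuristic only: more than one embedded coordinate suggests bundled dependencies."""
    return len(embedded_maven_coordinates(jar_file)) > 1


if __name__ == "__main__":
    # Build a tiny in-memory archive that mimics an Uber JAR with one bundled library.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        jar.writestr("META-INF/maven/org.example/app/pom.properties", "version=1.0\n")
        jar.writestr("META-INF/maven/com.google.guava/guava/pom.properties", "version=32.0\n")
    print(embedded_maven_coordinates(io.BytesIO(buffer.getvalue())))
    print(looks_like_uber_jar(io.BytesIO(buffer.getvalue())))
```

A heuristic of this kind fails whenever the build strips or rewrites the metadata, or when classes are relocated or obfuscated without any accompanying descriptor, which is the gap that content-based detection is meant to close.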
