D. Spinellis

18 records found

Topic Classification of Publications

Identifying publication topics based on existing journals

Accurate topic classification is crucial in the scientific community when it comes to finding relevant journals. However, the efficiency and accuracy of topic classification of publications do not seem to be at their best, especially with the fast-paced rise in the quan ...
The growth of academic publications, heterogeneity of datasets and the absence of a globally accepted organization identifier introduce the challenge of affiliation disambiguation in bibliographic databases. In this paper, we create a baseline using the currently implemented algo ...

Towards More Effective Querying of Medical Literature in Alexandria3K

How useful can Alexandria3K be for performing literature reviews

The Alexandria3K library, a versatile Python-based tool, has been expanded to include the integration of the PubMed dataset, enriching its capabilities in the analysis of scientific papers. Originally supporting major datasets like Crossref and US patents, and smaller yet s ...
This thesis investigates the prevalence of Pylint warnings in open-source Python projects and evaluates the effectiveness of an AI-driven tool for automatically fixing these warnings. The study also explores how developers perceive automated code suggestions and seeks to streamli ...

Author Name Disambiguation using Large Language Models

Contributions to a system for open reproducible publication research

Author name disambiguation, otherwise described as (publication) record linking, is a problem that has had considerable research dedicated to its solving. Author attributions, calculating research metrics and conducting literature reviews are amongst processes that experience ...

The set of regression and integration tests at many modern software companies is huge. It is difficult to run all tests after each code change, so the tests are often run for batches of code changes by different developers, late in the release cycle. This has ...

In (open-source) development, developers routinely rely on other libraries to improve their coding efficiency by reusing code. This reliance on other packages could cause issues when critical dependencies suddenly have a vulnerability introduced to them. This work analyzes t ...
Using open-source packages when developing software applications is the general practice among a vast number of software developers. However, importing open-source code which may depend on other existing technologies may lead to the appearance of a transitive dependency chain. As ...
The main principle of Open Source development is that developers can reuse different libraries over and over again to make their lives easier. That is why this practice has gained a lot of popularity. However, libraries usually depend on other already existing pieces of code. Thi ...
Developers rely on different software to improve their efficiency, reuse parts of code, and maintain it with ease, which is why open source software libraries have gained much popularity over the past years. This paper analyzes what are the most used packages fro ...
The use of open-source packages is a common practice among developers. It decreases the development time and improves maintainability. But adding a dependency to a project comes with inherent risks such as introducing vulnerabilities. A few solutions that help visualize all of the ...
This research studies the symptoms, root causes, impact, triggers, fixes, and system dependency of bugs in the Puppet configuration management system. Puppet is a widely used open-source configuration management system that performs various administrative tasks on machines based ...
Configuration management systems are a class of software used to automate system administrative tasks, one of which is the configuration of software systems. Although the automation is less error-prone than manual configuration done by a human, bugs in the source code can still c ...
Research that focuses on examining software bugs is critical when developing tools for preventing and for fixing software issues. Previous work in this area has explored other types of systems, such as bugs of compilers and security issues stemming from open source systems hosted ...
The study of bugs can provide important information to understand their nature in the context of complex software systems, as well as to support developers in their detection, fixing, and prevention. Previous studies focused on analyzing bugs from different perspectives such as chang ...
There are many valid reasons for someone to choose to stay anonymous online, not least of which is the fact that online privacy is a human right. However, discrimination against users of anonymity networks by web servers and content distribution networks on the grounds of defen ...
Tor is an anonymity network used by a vast number of users in order to protect their privacy on the internet. It should not come as a surprise that this service is also used for abuse such as denial-of-service attacks and other malicious activities because of the anonymity it pro ...
The Lightning Network (LN) is a second-layer solution built on top of the Bitcoin protocol, allowing faster and cheaper transactions without compromising on decentralization. LN is also designed to be more anonymous, since less information has to be shared with the entire network ...