A. Katsifodimos | TU Delft Repository

Styx: Transactional Stateful Functions on Streaming Dataflows

Journal article (2025) - K. Psarakis (author) , G.C. Christodoulou (author) , G. Siachamis (author) , M. Fragkoulis (author) , A. Katsifodimos (author)

Developing stateful cloud applications, such as low-latency workflows and microservices with strict consistency requirements, remains arduous for programmers. The Stateful Functions-as-a-Service (SFaaS) paradigm aims to serve these use cases. However, existing approaches provide ...

Accelerating machine learning queries with linear algebra query processing

Journal article (2025) - Wenbo Sun (author) , Asterios Katsifodimos (author) , R. Hai (author)

The rapid growth of large-scale machine learning (ML) models has led numerous commercial companies to utilize ML models for generating predictive results to help business decision-making. As two primary components in traditional predictive pipelines, data processing, and model pr ...

Transactional Cloud Applications: Status Quo, Challenges, and Opportunities

Other (2025) - R.N. Laigner (author) , G.C. Christodoulou (author) , K. Psarakis (author) , Asterios Katsifodimos (author) , Yongluan Zhou (author)

Transactional cloud applications such as payment, booking, reservation systems, and complex business workflows are currently being rewritten for deployment in the cloud. This migration to the cloud is happening mainly for reasons of cost and scalability. Over the years, applicati ...

Transactional Cloud Applications Go with the (Data) Flow

Conference paper (2025) - K. Psarakis (author) , G.C. Christodoulou (author) , Marios Fragkoulis (author) , Asterios Katsifodimos (author)

Traditional monolithic applications are migrated to the cloud, typically using a microservice-like architecture. Although this migration leads to significant benefits such as scalability and development agility, it also leaves behind the transactional guarantees that database sys ...

Cascade: From Imperative Code to Stateful Dataflows

Conference paper (2025) - M. Schutte (author) , Lucas van Mol (author) , G.C. Christodoulou (author) , A. Katsifodimos (author)

Executing applications in the cloud is becoming increasingly popular, primarily developed as microservices containing imperative code. In our previous work, we have made the case that such applications can benefit from using dataflow-based runtimes in a cloud environment. In part ...

Styx in Action: Transactional Cloud Applications Made Easy

Conference paper (2025) - K. Psarakis (author) , O. Mráz (author) , G.C. Christodoulou (author) , G. Siachamis (author) , Marios Fragkoulis (author) , Asterios Katsifodimos (author)

Developing and deploying transactional cloud applications such as banking and e-commerce systems is a daunting task for developers. The reason for this difficulty is twofold. First, developing such applications shifts the developers’ focus from the application logic to considerat ...

Evaluating Stream Processing Autoscalers

Conference paper (2024) - G. Siachamis (author) , G.C. Christodoulou (author) , K. Psarakis (author) , Marios Fragkoulis (author) , A. van Deursen (author) , A. Katsifodimos (author)

While the concept of large-scale stream processing is very popular nowadays, efficient dynamic allocation of resources is still an open issue in the area. The database research community has yet to evaluate different autoscaling techniques for stream processing engines under a ro ...

LLM-PQA: LLM-enhanced Prediction Query Answering

Conference paper (2024) - Z. Li (author) , Wenjie Zhao (author) , Asterios Katsifodimos (author) , R. Hai (author)

The advent of Large Language Models (LLMs) provides an opportunity to change the way queries are processed, moving beyond the constraints of conventional SQL-based database systems. However, using an LLM to answer a prediction query is still challenging, since an external ML mode ...

Stateful Entities: Object-oriented Cloud Applications as Distributed Dataflows

Conference paper (2024) - K. Psarakis (author) , W.D. Zorgdrager (author) , Marios Fragkoulis (author) , Guido Salvaneschi (author) , Asterios Katsifodimos (author)

Although the cloud has reached a state of robustness, the burden of using its resources falls on the shoulders of programmers who struggle to keep up with ever-growing cloud infrastructure services and abstractions. As a result, state management, scaling, operation, and failure m ...

CheckMate: Evaluating Checkpointing Protocols for Streaming Dataflows

Conference paper (2024) - G. Siachamis (author) , K. Psarakis (author) , Marios Fragkoulis (author) , Arie Van Deursen (author) , Paris Carbone (author) , Asterios Katsifodimos (author)

Stream processing in the last decade has seen broad adoption in both commercial and research settings. One key element for this success is the ability of modern stream processors to handle failures while ensuring exactly-once processing guarantees. At the moment of writing, virtu ...

Human-in-the-Loop Feature Discovery for Tabular Data

Conference paper (2024) - A. Ionescu (author) , Zeger Mouw (author) , E.A. Aivaloglou (author) , Rihan Hai (author) , Asterios Katsifodimos (author)

In recent years, researchers have developed several methods to automate discovering datasets and augmenting features for training Machine Learning (ML) models. Together with feature selection, these efforts have paved the way towards what is termed the feature discovery process. ...

Key Insights from a Feature Discovery User Study

Conference paper (2024) - A. Ionescu (author) , Zeger Mouw (author) , E.A. Aivaloglou (author) , Asterios Katsifodimos (author)

Multiple works in data management research focus on automating the processes of data augmentation and feature discovery to save users from having to perform these tasks manually. Yet, this automation often leads to a disconnect with the users, as it fails to consider the specific ...

Adaptive Distributed Streaming Similarity Joins

Conference paper (2023) - George Siachamis (author) , K. Psarakis (author) , M. Fragkoulis (author) , Odysseas Papapetrou (author) , A. Van Deursen (author) , Asterios Katsifodimos (author)

How can we perform similarity joins of multi-dimensional streams in a distributed fashion, achieving low latency? Can we adaptively repartition those streams in order to retain high performance under concept drifts? Current approaches to similarity joins are either restricted to ...

A survey on the evolution of stream processing systems

Journal article (2023) - Marios Fragkoulis (author) , Paris Carbone (author) , Vasiliki Kalavri (author) , Asterios Katsifodimos (author)

Stream processing has been an active research field for more than 20 years, but it is now witnessing its prime time due to recent successful efforts by the research community and numerous worldwide open-source communities. This survey provides a comprehensive overview of fundamen ...

Topio: An Open-Source Web Platform for Trading Geospatial Data

Conference paper (2023) - Andra Ionescu (author) , Kostas Patroumpas (author) , K. Psarakis (author) , Georgios Chatzigeorgakidis (author) , Diego Collarana (author) , Kai Barenscher (author) , Dimitrios Skoutas (author) , A. Katsifodimos (author) , Spiros Athanasiou (author)

The increasing need for data trading across businesses nowadays has created a demand for data marketplaces. However, despite the intentions of both data providers and consumers, today’s data marketplaces remain mere data catalogs. We believe that marketplaces of the future requir ...

Towards Evaluating Stream Processing Autoscalers

Conference paper (2023) - George Siachamis (author) , Job Kanis (author) , Wybe Koper (author) , K. Psarakis (author) , M. Fragkoulis (author) , A. Van Deursen (author) , A. Katsifodimos (author)

In this work, we evaluate autoscaling solutions for stream processing engines. Although autoscaling has become a mainstream subject of research in the last decade, the database research community has yet to evaluate different autoscaling techniques under a proper benchmarking set ...

Leveraging Large Language Models for Sequential Recommendation

Conference paper (2023) - Jesse Harte (author) , Wouter Zorgdrager (author) , Panos Louridas (author) , A. Katsifodimos (author) , Dietmar Jannach (author) , Marios Fragkoulis (author)

Sequential recommendation problems have received increasing attention in research during the past few years, leading to the inception of a large variety of algorithmic approaches. In this work, we explore how large language models (LLMs), which are nowadays introducing disruptive ...

Metadata Representations for Queryable Repositories of Machine Learning Models

Journal article (2023) - Z. Li (author) , Henk Kant (author) , R. Hai (author) , Asterios Katsifodimos (author) , Marco Brambilla (author) , Alessandro Bozzon (author)

Machine learning (ML) practitioners and organizations are building model repositories of pre-trained models, referred to as model zoos. These model zoos contain metadata describing the properties of the ML models and datasets. The metadata serves crucial roles for reporting, audi ...

An Empirical Performance Comparison between Matrix Multiplication Join and Hash Join on GPUs

Conference paper (2023) - Wenbo Sun (author) , Asterios Katsifodimos (author) , R. Hai (author)

Recent advances in Graphic Processing Units (GPUs) have facilitated a significant performance boost for database operators, in particular, joins. It has been intensively studied how conventional join implementations, such as hash joins, benefit from the massive parallelism of GPU ...

Automatic Table Union Search with Tabular Representation Learning

Conference paper (2023) - Xuming Hu (author) , Shen Wang (author) , Xiao Qin (author) , Chuan Lei (author) , Zhengyuan Shen (author) , Christos Faloutsos (author) , Asterios Katsifodimos (author) , George Karypis (author) , Lijie Wen (author) , Philip S. Yu (author)

Given a data lake of tabular data as well as a query table, how can we retrieve all the tables in the data lake that can be unioned with the query table? Table union search constitutes an essential task in data discovery and preparation as it enables data scientists to navigate m ...