Searched for: collection:ir
(1 - 20 of 34)

document
Li, Z. (author), Sun, W. (author), Hai, R. (author), Bozzon, A. (author), Katsifodimos, A (author)
The proliferation of pre-trained ML models in public Web-based model zoos facilitates the engineering of ML pipelines to address complex inference queries over datasets and streams of unstructured content. Constructing an optimal plan for a query is hard, especially when constraints (e.g. accuracy or execution time) must be taken into consideration...
conference paper 2023
document
Li, Z. (author), Hai, R. (author), Katsifodimos, A (author), Bozzon, A. (author)
Machine learning (ML) researchers and practitioners are building repositories of pre-trained models, called model zoos. These model zoos contain metadata that detail various properties of the ML models and datasets, which are useful for reporting, auditing, reproducibility, and interpretability. Unfortunately, the existing metadata...
conference paper 2023
document
Siachamis, G. (author), Psarakis, K. (author), Fragkoulis, M. (author), Papapetrou, Odysseas (author), van Deursen, A. (author), Katsifodimos, A (author)
How can we perform similarity joins of multi-dimensional streams in a distributed fashion, achieving low latency? Can we adaptively repartition those streams in order to retain high performance under concept drifts? Current approaches to similarity joins are either restricted to single-node deployments or focus on set-similarity joins, failing...
conference paper 2023
document
Ionescu, A. (author), Alexandridou, Alexandra (author), Psarakis, K. (author), Patroumpas, Kostas (author), Chatzigeorgakidis, Georgios (author), Skoutas, Dimitrios (author), Athanasiou, Spiros (author), Hai, R. (author), Katsifodimos, A (author)
The increasing need for data trading has created a high demand for data marketplaces. These marketplaces require a set of value-added services, such as advanced search and discovery, that have been proposed in the database research community for years, but are yet to be put to practice. In this paper we propose to demonstrate the Topio...
conference paper 2023
document
Siachamis, G. (author), Kanis, Job (author), Koper, Wybe (author), Psarakis, K. (author), Fragkoulis, M. (author), van Deursen, A. (author), Katsifodimos, A (author)
In this work, we evaluate autoscaling solutions for stream processing engines. Although autoscaling has become a mainstream subject of research in the last decade, the database research community has yet to evaluate different autoscaling techniques under a proper benchmarking setting and evaluation framework. As a result, every newly proposed...
conference paper 2023
document
Harte, Jesse (author), Zorgdrager, Wouter (author), Louridas, Panos (author), Katsifodimos, A (author), Jannach, Dietmar (author), Fragkoulis, M. (author)
Sequential recommendation problems have received increasing attention in research during the past few years, leading to the inception of a large variety of algorithmic approaches. In this work, we explore how large language models (LLMs), which are nowadays introducing disruptive effects in many AI-based applications, can be used to build or...
conference paper 2023
document
Hai, R. (author), Koutras, C. (author), Ionescu, A. (author), Li, Z. (author), Sun, W. (author), van Schijndel, Jessie (author), Kang, Yan (author), Katsifodimos, A (author)
Machine learning (ML) training data is often scattered across disparate collections of datasets, called data silos. This fragmentation poses a major challenge for data-intensive ML applications: integrating and transforming data residing in different sources demand a lot of manual work and computational resources. With data privacy and...
conference paper 2023
document
Li, Z. (author), Schonfeld, Mariette (author), Hai, R. (author), Bozzon, A. (author), Katsifodimos, A (author)
Given a set of pre-trained Machine Learning (ML) models, can we solve complex analytic tasks that make use of those models by formulating ML inference queries? Can we mitigate different tradeoffs, e.g., high accuracy, low execution costs and memory footprint, when optimizing the queries? In this work we present different multi-objective ML...
conference paper 2023
document
Sun, W. (author), Katsifodimos, A (author), Hai, R. (author)
Recent advances in Graphic Processing Units (GPUs) have facilitated a significant performance boost for database operators, in particular, joins. It has been intensively studied how conventional join implementations, such as hash joins, benefit from the massive parallelism of GPUs. With the proliferation of machine learning, more databases...
conference paper 2023
document
Fragkoulis, M. (author), Carbone, Paris (author), Kalavri, Vasiliki (author), Katsifodimos, A (author)
Stream processing has been an active research field for more than 20 years, but it is now witnessing its prime time due to recent successful efforts by the research community and numerous worldwide open-source communities. This survey provides a comprehensive overview of fundamental aspects of stream processing systems and their evolution in...
journal article 2023
document
Li, Z. (author), Kant, Henk (author), Hai, R. (author), Katsifodimos, A (author), Brambilla, Marco (author), Bozzon, A. (author)
Machine learning (ML) practitioners and organizations are building model repositories of pre-trained models, referred to as model zoos. These model zoos contain metadata describing the properties of the ML models and datasets. The metadata serves crucial roles for reporting, auditing, ensuring reproducibility, and enhancing interpretability....
journal article 2023
document
Sun, W. (author), Katsifodimos, A (author), Hai, R. (author)
The rapid growth of large-scale machine learning (ML) models has led numerous commercial companies to utilize ML models for generating predictive results to help business decision-making. As two primary components in traditional predictive pipelines, data processing, and model predictions often operate in separate execution environments,...
conference paper 2023
document
Ionescu, A. (author), Patroumpas, Kostas (author), Psarakis, K. (author), Chatzigeorgakidis, Georgios (author), Collarana, Diego (author), Barenscher, Kai (author), Skoutas, Dimitrios (author), Katsifodimos, A (author), Athanasiou, Spiros (author)
The increasing need for data trading across businesses nowadays has created a demand for data marketplaces. However, despite the intentions of both data providers and consumers, today’s data marketplaces remain mere data catalogs. We believe that marketplaces of the future require a set of value-added services, such as advanced search and...
conference paper 2023
document
Li, Z. (author), Hai, R. (author), Bozzon, A. (author), Katsifodimos, A (author)
Machine learning (ML) practitioners and organizations are building model zoos of pre-trained models, containing metadata describing properties of the ML models and datasets that are useful for reporting, auditing, reproducibility, and interpretability purposes. The metadata is currently not standardised; its expressivity is limited; and there is...
conference paper 2022
document
de Heus, Martijn (author), Psarakis, K. (author), Fragkoulis, M. (author), Katsifodimos, A (author)
Serverless computing is currently the fastest-growing cloud services segment. The most prominent serverless offering is Function-as-a-Service (FaaS), where users write functions and the cloud automates deployment, maintenance, and scalability. Although FaaS is a good fit for executing stateless functions, it does not adequately support...
journal article 2022
document
Verheijde, Jim (author), Karakoidas, Vassilios (author), Fragkoulis, M. (author), Katsifodimos, A (author)
Distributed streaming dataflow systems have evolved into scalable and fault-tolerant production-grade systems. Their applicability has departed from the mere analysis of streaming windows and complex-event processing, and now includes cloud applications and machine learning inference. Although the advancements in the state management of...
conference paper 2022
document
Ionescu, A. (author), Hai, R. (author), Fragkoulis, M. (author), Katsifodimos, A (author)
Machine Learning (ML) applications require high-quality datasets. Automated data augmentation techniques can help increase the richness of training data, thus increasing the ML model accuracy. Existing solutions focus on efficiency and ML model accuracy but do not exploit the richness of dataset relationships. With relational data, the challenge...
conference paper 2022
document
Hai, R. (author), Koutras, C. (author), Ionescu, A. (author), Katsifodimos, A (author)
Data science workflows often require extracting, preparing and integrating data from multiple data sources. This is a cumbersome and slow process: most of the time, data scientists prepare data in a data processing system or a data lake, and export it as a table, in order for it to be consumed by a Machine Learning (ML) algorithm. Recent...
abstract 2022
document
Ionescu, A. (author), Katsifodimos, A (author), Houben, G.J.P.M. (author)
As data is produced at an unprecedented rate, the need and expectation to make it easily available for the end-users is growing. Dataset Discovery has become an important subject in the data management community, as it represents the means of providing the data to the user and fulfilling an information need. Since the end-user is the one that...
conference paper 2021
document
Gencer, Can (author), Topolnik, Marko (author), Ďurina, Viliam (author), Demirci, Emin (author), Kahveci, Ensar B. (author), Gürbüz, Ali (author), Lukáš, Ondřej (author), Fragkoulis, M. (author), Katsifodimos, A (author)
Jet is an open-source, high-performance, distributed stream processor built at Hazelcast during the last five years. Jet was engineered with millisecond latency on the 99.99th percentile as its primary design goal. Originally Jet’s purpose was to be an execution engine that performs complex business logic on top of streams generated by...
journal article 2021