Implementation and evaluation of Ordo

A high performance data processing system

Master Thesis (2023)
Author(s)

M. Melas (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Jan Rellermeyer – Mentor (TU Delft - Data-Intensive Systems)

Lydia Y. Chen – Graduation committee member (TU Delft - Data-Intensive Systems)

A. Katsifodimos – Graduation committee member (TU Delft - Web Information Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2023 Minas Melas
More Info
Publication Year
2023
Language
English
Graduation Date
20-03-2023
Awarding Institution
Delft University of Technology
Programme
Computer Science
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Data processing systems have become increasingly important in modern computing, as the volume and complexity of the data that needs to be analyzed have grown dramatically. Many scalable, resilient and performant data processing systems have been developed, and more are under development.

However, despite the advances made in data processing technology, challenges remain in optimizing the performance, energy efficiency and practicality of these systems. One such challenge is managing the underlying system’s resources effectively, including the system’s throughput and the amount of work each operator has to do, and using data structures that lead to faster task processing.

To address this challenge, this thesis proposes the implementation of a high-performance data processing system that exposes the underlying system’s metrics to the application level and applies an innovative approach to operator communication by utilizing an efficient thread-safe data structure. By providing the underlying system’s metrics to the application’s scheduler, the scheduler can schedule tasks optimally according to the current state of the system and adjust the system’s resources at runtime. This relieves developers of having to fine-tune the system beforehand and allows the system to handle fluctuating input workloads more efficiently.
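The idea of a metrics-aware scheduler combined with thread-safe operator communication can be illustrated with a minimal sketch. This is not Ordo's implementation: Python's standard `queue.Queue` stands in for the thesis's thread-safe data structure, the queue backlog stands in for the exposed system metrics, and the names `choose_worker_count`, `run_pipeline` and `items_per_worker` are hypothetical, introduced only for illustration.

```python
import queue
import threading


def choose_worker_count(backlog, max_workers, items_per_worker=100):
    """Hypothetical policy: one worker per `items_per_worker` queued items,
    clamped to the range [1, max_workers]."""
    wanted = -(-backlog // items_per_worker)  # ceiling division
    return max(1, min(max_workers, wanted))


def run_pipeline(items, max_workers=4):
    # Thread-safe channel between the upstream and downstream operators
    # (an assumed stand-in for the data structure described in the thesis).
    channel = queue.Queue()
    results = []
    results_lock = threading.Lock()

    # Upstream operator: emit every input item into the channel.
    for item in items:
        channel.put(item)

    # "Scheduler": read a metric (the current backlog) and size the
    # downstream operator's worker pool from it instead of a fixed setting.
    backlog = channel.qsize()
    n_workers = choose_worker_count(backlog, max_workers)

    def worker():
        # Downstream operator: drain the channel until it is empty.
        while True:
            try:
                item = channel.get_nowait()
            except queue.Empty:
                return
            with results_lock:
                results.append(item * 2)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return n_workers, sorted(results)
```

In this sketch the worker count is chosen once, from the backlog observed before processing starts. A system that adjusts resources at runtime, as the abstract describes, would re-evaluate such a policy periodically while the operators are running.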

This thesis explores the design and implementation of such a system, as well as its impact on the performance, energy efficiency and resiliency of data processing applications. We provide performance measurements as well as a qualitative comparison of our system with other state-of-the-art systems, which support our hypotheses.

Files

Final_Thesis_Ordo.pdf
(pdf | 2.42 Mb)
License info not available