Searched for: contributor:"Katsifodimos, A (mentor)"
(1 - 20 of 35)

document
Reppas, Panagiotis (author)
This thesis embarks on the quest to efficiently compute similarities between data streams in real-time, a task burgeoning in importance with the advent of big data and real-time analytics. At the heart of this endeavor is the expansion of the Condor framework to accommodate new probabilistic data structures, tailored to meet the distinctive...
master thesis 2024
document
Geukes, Colin (author)
In real-world scenarios, users provide invaluable data; however, this data is inherently incoherent, incomplete, and duplicated, i.e., different data rows refer to the same real-world object. Merging duplications to a single entry broadens the knowledge of a given real-world object represented within a data set. Applying a straightforward cross...
master thesis 2023
document
Harte, Jesse (author)
In this thesis we aim to research and design different neural models for session recommendation. We investigate the fundamental neural models for session recommendation, namely BERT4Rec, SASRec and GRU4Rec and subsequently use our findings to design a simpler but performant neural model. Firstly, we address methodological errors made...
master thesis 2023
document
Veneti, Theodoros (author)
Stream Processing Engines (SPEs) are called upon to help solve problems around big and volatile data, while satisfying the needs for near real-time processing. In order for such systems to be considered effective solutions to such problems at scale, efficient elasticity and non dataflow-disturbing reconfiguration operations within are a...
master thesis 2023
document
Hernandez Quintanilla, Tomás (author)
Similarity joins are operations which involve identifying similar pairs of records within one or multiple datasets. These operations are typically time-sensitive, as timely identification of relations can lead to increased profitability. Therefore, it is advantageous to analyze them using a stream processing system, which offers real-time...
master thesis 2023
document
van Lil, Wouter (author)
Serverless computing has allowed developers to write pieces of code consisting solely of the necessary functionality whilst not having to think about the underlying infrastructure. One prominent model is Function-as-a-Service (FaaS), where the code is structured into functions that run based on incoming events. This model was initially stateless...
master thesis 2023
document
Schutte, Marcus (author)
Today's need for highly available systems leads to data partitioning and replication across multiple nodes. Providing strong transactional consistency in a distributed database requires extensive communication. For this, algorithms such as two phase commit are used. These communication algorithms add extra network latencies. For application...
master thesis 2023
document
Gavalas, Nikos (author)
The adoption of the serverless architecture and the Function-as-a-Service model has significantly increased in recent years, with more enterprises migrating their software and hardware to the cloud. However, most applications require state management, leading to the use of external databases. To alleviate the burden of state management, there...
master thesis 2023
document
Mânăstireanu, Andrei (author)
The curse of dimensionality is a common challenge in machine learning, and feature selection techniques are commonly employed to address this issue by selecting a subset of relevant features. However, there is no consistently superior approach for choosing the most significant subset of features. We conducted a comprehensive analysis comparing...
bachelor thesis 2023
document
Udilă, Andrei (author)
This paper presents a comprehensive evaluation and comparison of encoding methods for categorical data in the context of machine learning. The study focuses on five popular encoding techniques: one-hot, ordinal, target, catboost, and count encoders. These methods are evaluated using linear models, decision trees, and support vector machines ...
bachelor thesis 2023
document
Vasilev, Kiril (author)
The data used in machine learning algorithms strongly influences the algorithms' capabilities. Feature selection techniques can choose a set of columns that meet a certain learning goal. There is a wide variety of feature selection methods, however, the ones we cover in this comparative analysis are part of the information-theoretical-based...
bachelor thesis 2023
document
Buşe, Florena (author)
Thus far the democratization of machine learning, which resulted in the field of AutoML, has focused on the automation of model selection and hyperparameter optimization. Nevertheless, the need for high-quality databases to increase performance has sparked interest in correlation-based feature selection, a simple and fast, yet effective approach...
bachelor thesis 2023
document
Anceaux, Duyemo (author)
Since every day more and more data is collected, it becomes more and more expensive to process. To reduce these costs, you can use dimensionality reduction to reduce the number of features per instance in a given dataset. In this paper, we will compare four possible methods of dimensionality reduction. The feature extraction methods...
bachelor thesis 2023
document
Kant, Henk (author)
As the sharing of machine learning (ML) models has increased in popularity, more so-called model zoos are created. These repositories facilitate the sharing of models and their metadata, and allow other people to find and re-use an existing model. However, the metadata provided for models is insufficient, with little focus on practical aspects of a...
master thesis 2023
document
Comans, Martijn (author)
Serverless computing is an increasingly popular paradigm in cloud computing where many of the operational challenges of running cloud applications, like server provisioning and management, are left to the cloud provider. A popular form of serverless computing is Functions-as-a-Service (FaaS), where the user submits functions for which the...
master thesis 2023
document
Chronas, Konstantinos (author)
Matching schemas is a fundamental task in data integration and semantic web applications. However, generating labeled data for schema matching tasks is challenging, requiring an efficient and effective approach. This thesis addresses this challenge by investigating schema matching techniques and crowdsourcing solutions. We developed a prototype...
master thesis 2023
document
Wiemers, Gianni (author)
The use of data streams has increased a lot over the last two decades or so. With this increase comes the need for fast and consistent fault recovery. Rollback recovery mechanisms from traditional distributed systems have been adapted successfully for stream engines. These mechanisms can be categorized into one of three different...
master thesis 2023
document
van Schijndel, Jessie (author)
The workflow of a data science practitioner includes gathering information from different sources and applying machine learning (ML) models. Such dispersed information can be combined through a process known as Data Integration (DI), which defines relations between entities and attributes. When all information is combined in one source suited...
master thesis 2022
document
Wang, Wang Hao (author)
Current speed of data growth has exponentially increased over the past decade, highlighting the need of modern organizations for data discovery systems. Several (automated) schema matching approaches have been proposed to find related data, exploiting different parts of schema information (e.g. data type, data distribution, column name, etc.)....
master thesis 2022
document
Palakodeti, Anitej (author)
Generating synthetic images has wide applications in several fields such as creating datasets for machine learning or using these images to investigate the behaviour of machine learning models. An essential requirement when generating images is to control aspects such as the entities or objects in the image. Controlling this helps in creating...
master thesis 2022