Search results | TU Delft Repositories

Searched for: contributor:"Ionescu, A. (mentor)"

(1 - 10 of 10)

document: Human Interaction in Tabular Data Augmentation in Data Science Workflows
Mouw, Zeger (author)
The advancement of artificial intelligence (AI) has led to an increased demand for both a greater volume and quality of data. In many companies, data is dispersed across multiple tables, yet AI models typically require data in a single table format. This necessitates the merging of these tables and the selection of optimal features for the model...
master thesis 2024

document: Automatic feature discovery: A comparative study between filter and wrapper feature selection techniques
Mânăstireanu, Andrei (author)
The curse of dimensionality is a common challenge in machine learning, and feature selection techniques are commonly employed to address this issue by selecting a subset of relevant features. However, there is no consistently superior approach for choosing the most significant subset of features. We conducted a comprehensive analysis comparing...
bachelor thesis 2023

document: Encoding methods for categorical data: A comparative analysis for linear models, decision trees, and support vector machines
Udilă, Andrei (author)
This paper presents a comprehensive evaluation and comparison of encoding methods for categorical data in the context of machine learning. The study focuses on five popular encoding techniques: one-hot, ordinal, target, catboost, and count encoders. These methods are evaluated using linear models, decision trees, and support vector machines ...
bachelor thesis 2023

document: Filtering Knowledge: A Comparative Analysis of Information-Theoretical-Based Feature Selection Methods
Vasilev, Kiril (author)
The data used in machine learning algorithms strongly influences the algorithms' capabilities. Feature selection techniques can choose a set of columns that meet a certain learning goal. There is a wide variety of feature selection methods; however, the ones we cover in this comparative analysis are part of the information-theoretical-based...
bachelor thesis 2023

document: Data-Driven Empirical Analysis of Correlation-Based Feature Selection Techniques
Buşe, Florena (author)
Thus far the democratization of machine learning, which resulted in the field of AutoML, has focused on the automation of model selection and hyperparameter optimization. Nevertheless, the need for high-quality databases to increase performance has sparked interest in correlation-based feature selection, a simple and fast, yet effective approach...
bachelor thesis 2023

document: A comparative study for using PCA, LDA, GDA, and Lasso for dimensionality reduction before classification algorithms
Anceaux, Duyemo (author)
Since every day more and more data is collected, it becomes more and more expensive to process. To reduce these costs, you can use dimensionality reduction to reduce the number of features per instance in a given dataset. In this paper, we will compare four possible methods of dimensionality reduction. The feature extraction methods...
bachelor thesis 2023

document: An exploratory journey to combine schema matchers for better relevance prediction
Wang, Wang Hao (author)
The speed of data growth has increased exponentially over the past decade, highlighting modern organizations' need for data discovery systems. Several (automated) schema matching approaches have been proposed to find related data, exploiting different parts of schema information (e.g. data type, data distribution, column name, etc.)....
master thesis 2022

document: PCADA: Partial Correlation Aware Data Augmentation for random forest classifier
Lorek, Oskar (author)
Machine learning models require rich, quality data sets to achieve high accuracy. With the current exponential growth of data being generated, it is becoming increasingly hard to prepare high-quality tables within a reasonable time frame. To combat this issue, automated data augmentation methods have emerged in recent years. However, existing solutions do...
bachelor thesis 2022

document: From Feature Selection to Data Augmentation: the ADA Algorithm
Cruset Pla, Eduard (author)
The democratization of data science, and in particular of the machine learning pipeline, has focused on the automation of model selection, feature processing, and hyperparameter tuning. Nevertheless, the need for high-quality data for increased performance has sparked interest in the inclusion of data augmentation in these automatic machine...
bachelor thesis 2022

document: Automatic feature augmentation ranking: XGBoost
Neut, Oliver (author)
Automatic machine learning is a subfield of machine learning that automates the common procedures faced in predictive tasks. The problem of one such procedure is automatic data augmentation, where one desires to enrich the existing data to increase model performance. In relational data repositories, the data is stored in normal form. This causes...
bachelor thesis 2022
