Key Insights from a Feature Discovery User Study

Conference Paper (2024)
Author(s)

A. Ionescu (TU Delft - Web Information Systems)

Zeger Mouw (Student TU Delft)

E.A. Aivaloglou (TU Delft - Web Information Systems)

A. Katsifodimos (TU Delft - Data-Intensive Systems)

Research Group
Web Information Systems
DOI related publication
https://doi.org/10.1145/3665939.3665961
Publication Year
2024
Language
English
ISBN (electronic)
9798400706936
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Much work in data management research focuses on automating data augmentation and feature discovery to save users from having to perform these tasks manually. Yet this automation often leads to a disconnect with users, as it fails to consider the specific needs and preferences of the actual end-users of data management systems for machine learning. To explore this issue further, we conducted 19 semi-structured, think-aloud use-case studies based on a scenario in which data specialists were tasked with augmenting a base table with additional features to train a machine learning model. In this paper, we share key insights, derived from our user study, into how real-world data specialists perform feature discovery on tabular data. Our research uncovered differences between the assumptions about users reported in the literature and actual practice, as well as some areas where the literature and real-world practice align.