G.M. Allen | TU Delft Repository

On the Same Page?

Exploring Value Alignment in Book Recommender Systems

Master thesis (2026) - D.S.R. Doting, M.S. Pera, L. Cavalcante Siebert, G.M. Allen

Recommender systems (RSs) have become a central part of daily digital life, shaping which items, people, and opportunities users are exposed to at scale. Given that personal values are fundamental to individual identity, decision-making, and consumer behavior, and that RSs learn from interaction data that is itself shaped by these values, the question arises whether the recommendations produced by standard RSs already reflect users' personal values. To date, no prior work has empirically investigated this question.

This thesis addresses that gap by examining the extent to which user values are reflected in recommendation outcomes, and whether explicitly incorporating value information can improve this alignment. Using Schwartz's Theory of Basic Human Values as a theoretical framework, we conduct an offline experiment on the Goodreads dataset. We construct value profiles for both users and recommended items using the Personal Values Dictionary, which maps over a thousand English words to their corresponding Schwartz value. These profiles are derived from user reviews and book descriptions respectively, and are used to measure the alignment between a user's personal values and the values embedded in their recommendations.
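As a rough illustration of the dictionary-based profiling described above, the following Python sketch builds a value profile from text and compares two profiles. It rests on several assumptions: `TOY_LEXICON` is a hypothetical eight-word stand-in and not the actual Personal Values Dictionary, the function names are invented for this example, and cosine similarity is only one plausible alignment measure, since the abstract does not state which measure the thesis uses.

```python
import math
import re
from collections import Counter

# Hypothetical mini-lexicon for illustration only; this is NOT the
# actual Personal Values Dictionary, which maps over a thousand words.
TOY_LEXICON = {
    "freedom": "self-direction",
    "explore": "self-direction",
    "adventure": "stimulation",
    "thrilling": "stimulation",
    "family": "benevolence",
    "loyal": "benevolence",
    "tradition": "tradition",
    "safe": "security",
}

# Fixed ordering of the value dimensions covered by the toy lexicon.
VALUES = sorted(set(TOY_LEXICON.values()))


def value_profile(text):
    """Return the share of lexicon hits per value, in VALUES order."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(TOY_LEXICON[t] for t in tokens if t in TOY_LEXICON)
    total = sum(counts.values())
    return [counts[v] / total if total else 0.0 for v in VALUES]


def cosine_alignment(p, q):
    """Cosine similarity between two profiles (assumed alignment measure)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


# Example: a user review versus a book description (made-up texts).
user = value_profile("A thrilling adventure about freedom")
book = value_profile("An adventure story; explore the unknown")
print(round(cosine_alignment(user, book), 3))
```

In this sketch a text with no lexicon hits yields an all-zero profile and an alignment of 0.0, which is a design choice made for the example and not something taken from the thesis.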

Our results show that standard RSs exhibit a weak but positive degree of value alignment, suggesting that interaction-based optimization procedures partially capture users' values without explicitly modeling them. Furthermore, we find that explicitly incorporating user value profiles as features within the RS increases this alignment. These findings carry important implications for the design of value-aware recommender systems, and suggest that early integration of value information is a promising direction for future research. ...

Empowering Users to Handle Misinformation in Podcasts

Master thesis (2024) - E.X. Tan, U.K. Gadiraju, Z. Yue, G.M. Allen

Podcasts are a rapidly growing medium for information sharing, but their audio and one-way communication format presents unique challenges in addressing misinformation. This thesis explores how to empower podcast listeners to identify and respond to misinformation effectively. Study I investigates listening habits, user trust, confidence, and behavioral responses to misinformation in podcasts through a survey of diverse participants. Key findings highlight gaps in user confidence, the impact of demographic factors, and preferences for incentives to flag misinformation. Study II builds upon these insights to design, implement, and evaluate three interventions (PAUSE, ALERT, and VOLUNTARY) aimed at optimizing user engagement in flagging misinformation. A labeled podcast dataset was created to facilitate this task-based experiment. The findings offer insights into the design of user-centric misinformation detection systems. The interventions have shown potential in empowering users to identify misinformation in podcasts. However, whether they can address misinformation in podcasts effectively remains uncertain and needs further exploration. This work not only addresses a significant gap in the literature but also lays the groundwork for future innovations in combating misinformation in podcasts. ...

Improving User Engagement to Reduce Dropout Rates in Long Web Surveys

Exploring the Effectiveness of Achievement Primes Amongst Intrinsically and Extrinsically Motivated Respondents

Master thesis (2023) - T.A.R. van Tussenbroek, G.M. Allen, U.K. Gadiraju, F. Broz

Web surveys have increasingly been used to collect data from respondents over the years. They offer several advantages compared to other methods of obtaining data. Researchers benefit from a broad demographic representation to make generalized conclusions, and satisfaction surveys allow employees to explain shortcomings or improvements anonymously. Both examples demand comprehensive information, thereby requiring a lengthy survey. However, dropout increases with the length of a survey, which is a big problem on web surveys as it decreases the statistical significance of the results. Proposed solutions, such as reducing the number of questions or rewarding respondents with an incentive, may not always be feasible due to the preciseness of information required or limited financial capabilities.

Achievement primes have been shown to reduce dropout on short surveys targeting extrinsically motivated respondents without additional costs or the need to reduce survey length. As repeated exposure to primes reinforces the stimuli, long surveys may also benefit from achievement primes. In this study, respondents are exposed to a questionnaire of more than 15 minutes on health whilst working behind a computer containing either no prime, passive achievement primes, or active achievement primes. Besides extrinsically motivated respondents, recruited via the crowdworking platform Prolific, intrinsically motivated respondents are also targeted in this study, recruited via snowball sampling.

Through a 2×3 factorial design, we found no statistically significant difference in dropout, perceived workload, and user engagement across the three questionnaire variants when evaluating intrinsically (N=88) and extrinsically motivated respondents (N=140) individually. By comparing intrinsically with extrinsically motivated respondents, we found that extrinsically motivated respondents were more engaged and dropped out less. ...

From Clicks to Cues

Exploring user behaviour as a language in music video consumption

Master thesis (2023) - V. Mittal, U.K. Gadiraju, G.M. Allen, M.S. Pera, M. Khosla

As music video streaming occupies a significant market share in how people consume music, gaining an understanding of user behavioural patterns becomes increasingly crucial. This understanding can enable better music video streaming experiences by tailoring them towards more personalized and user-centric designs. Though prior works have emphasized user behaviour during solely listening to music, understanding user actions/clicks while consuming music videos remains largely unexplored. Given the unique experience offered by the combination of audio and visual elements, there is a need for focused research in this area.

Therefore this study attempts to bridge this research gap by collecting and analysing a large dataset of streaming sessions from a music video streaming company - XITE. In total, we analyzed 1.8 million sessions from approximately 270,000 unique users. The behaviour exhibited during those sessions is interpreted as a language and modelled using the Language Model - Doc2Vec. This facilitated the conversion of session action sequences into embeddings. Our findings suggest that music video streaming sessions exhibit cohesive user interaction patterns, which can be grouped into distinct clusters, thereby enabling the detection of distinct behavioural patterns across user sessions.

Furthermore, previous studies have indicated that user interactions with multimedia streaming platforms can be influenced by the context in which content is consumed. Extending these findings, our analysis of behavioural clusters revealed that certain user behaviours while consuming music videos are associated with specific music video genres and temporal factors. For instance, we discovered that passive sessions predominantly commence around 10 am, while sessions requiring more active engagement typically start in the evening. The insights derived from this study are valuable for improving user-centric design in music video streaming platforms and providing businesses with data-driven recommendations for strategic planning. ...

Safeguarding inclusion when using gestures in microtask crowdsourcing

Bachelor thesis (2022) - S.H. Veringa, G.M. Allen, U.K. Gadiraju, J.A. Pouwelse

Microtask crowdsource workers are negatively influenced, mentally as well as physically, by the repetitive nature of the tasks they perform. Research is ongoing on whether using a gesture-based input technique could mitigate these negative effects. This paper identifies possible ways that using gestures as an alternative input modality could lead to exclusion by analysing survey responses, where n=10. While further research is necessary, there are indications that this could lead to cultural and physical exclusion of certain groups. This paper is not meant to discourage using gestures as an alternative method of input but is solely meant to bring attention to possible risks to take into account. ...

Investigating Body Gestures as Means of Input Modalities in Crowdsourced Microtasks

Bachelor thesis (2022) - A.A. Ajanidisz, U.K. Gadiraju, G.M. Allen, J.A. Pouwelse

Microtask crowdsourcing has grown in popularity in recent years. Microtasking is a form of crowdsourcing in which typically small, simple tasks are distributed over the Internet to a large number of people, also known as workers. Workers are highly susceptible to developing musculoskeletal disorders due to prolonged computer use and the monotonous, performance-oriented nature of microtasking. Fortunately, it has been demonstrated that exercise can remedy these health issues. Since some body gestures resemble low-intensity exercise, the use of gestures as input in crowdsourced microtasks has the potential to improve the health of the worker. The purpose of this study was to determine which gestures are effective for controlling microtask workflows in terms of health benefits and usability. In an effort to maximize the positive impact on health, a total of 12 gestures were developed for four distinct microtask workflow elements. Then, we incorporated these gestures into a survey to evaluate the subjective perceptions of usability. On the basis of the survey results, we ranked these gestures for each workflow element and proposed three gesture-command dictionaries optimized for maximum efficiency. Due to the numerous limitations of this study, it is strongly recommended that the outcomes be enhanced. The primary contribution of this study is, therefore, the establishment of new research directions for gestural input in microtasking and in all human-computer interaction. ...

GANAesthetic: An experience of interactively exploring aesthetically pleasing images and incorporating the human perception of beauty to discover aesthetic latent dimensions

Bachelor thesis (2022) - T.H. Nguyen, J.D. Lomas, U.K. Gadiraju, W.L.A. van der Maden, G.M. Allen, D.H.J. Epema

Despite the fact that climate change is becoming increasingly dangerous and prevalent, there is still a lack of public engagement. This can be explained by the fact that the media portrays climate change as an abstract concept. The message can be more effectively communicated through visual art because it is more likely to invoke emotional responses in individuals. By including human perception and rating data, the generative adversarial neural network (GAN) produces better image output. Therefore, this paper explores methods for using the human perception of beauty in order to improve StyleGAN2 outputs. In GANAesthetic, UI sliders allow users to explore satellite images interactively, that is, visually appealing satellite images generated from StyleGAN2. The GANAesthetic was determined to be the most appropriate methodology for the study. The choice of GANAesthetic over other approaches will be explained in this paper, as well as its implementation. The paper will also describe an experiment to discover aesthetic latent dimensions. ...

How can crowdsourced workers effectively rate artwork images produced by Generative Adversarial Network transformers?

Bachelor thesis (2022) - M. Rahman, J.D. Lomas, U.K. Gadiraju, W.L.A. van der Maden, G.M. Allen, D.H.J. Epema

Generative Adversarial Networks (GANs) can create artwork images and we need effective ways of rating their aesthetic values. This could help us determine the most aesthetic artwork images (and identify the GANs that created them) and train GANs to produce more aesthetic artwork images in the future. In this research, we analyzed the effectiveness of using two different survey formats (binary-choice and four-choice) for displaying GAN-produced artwork images to crowdsourced workers and gathering their ratings. The artwork images were of different landscapes like the desert, arctic, coastal regions, etc. Additionally, we investigated how the choice of showing different images together (image groupings) per question affects the final rating results. Results demonstrate that the four-choice format is superior to the binary-choice format in producing more consistent, reliable, and accurate results. The effects of the different image groupings were insignificant for the results of the four-choice format. In contrast, different image groupings displayed statistically significant changes in the results for the binary-choice format. However, it was found that crowdsourced workers preferred the binary-choice format more as they found it to be less strenuous and more effective in allowing them to express their rating choices. ...

Iterative training with human rated images to improve GAN generated image aesthetics

Effects of dataset size and training length

Bachelor thesis (2022) - B.I. Çelebi, W.L.A. van der Maden, J.D. Lomas, G.M. Allen, U.K. Gadiraju, D.H.J. Epema

Generative Adversarial Networks (GANs) brought rapid developments in generating synthetic images by mimicking structures in the training data. With the list of applications of GANs growing drastically, they have lately become an exciting technology for designers to explore, both to communicate their ideas and art and to create engaging experiences for humans. Nevertheless, translating human experiences to artificial intelligence and creating visually pleasant imagery is a challenging task due to the complex semantics of human perception. To address this issue, we introduce an iterative training approach in which the generated images are curated by humans and the most pleasing ones are fed back into the network for retraining. Additionally, we do a factorial analysis to investigate how the aesthetic quality and the diversity are affected by the size of the training data and the training length. In experiments, we validate that this method can significantly improve the aesthetic quality of generated images regardless of the dataset size and training length. However, the use of smaller datasets comes at the cost of reduced diversity and novelty in the output images. The aesthetic bias towards certain contexts can also reduce the diversity and affect the model evaluations. On the other hand, no significant relationship has been found regarding the training length, although this could possibly be due to instabilities that occur as the model converges. ...

Beauty in the Eye of Machine

Using Automated Measures of Aesthetic Beauty to Improve GAN Output of Satellite Images

Bachelor thesis (2022) - J.M. Catlett, J.D. Lomas, W.L.A. van der Maden, U.K. Gadiraju, G.M. Allen, D.H.J. Epema

This paper aims to evaluate which automated measures of aesthetic beauty are the best predictors of human ratings of aesthetics and proposes that typicality and novelty may increase the correlation between the two. To study the correlation between these metrics, a literature study was performed to find a select number of potentially good predictors, a pipeline was created to extract these values from each image within our dataset, a survey was conducted in which participants voted for the images they considered most aesthetic, and finally regression analysis was performed to see which metrics offered the highest correlation with the human rating data. From this we could see there were indeed a number of automated metrics that consistently scored high as predictors of the human aesthetic ratings, and there was a slight improvement in the fit of the prediction model upon including novelty as a feature. However, at this moment, the improvement is not significant enough to conclude that these features are better at predicting human ratings. ...

To what degree can we use NLP to mine current and trending topics with respect to well-being?

Bachelor thesis (2022) - N. Manglani, W.L.A. van der Maden, U.K. Gadiraju, J.D. Lomas, G.M. Allen, Z. Erkin

The increase in global internet users brings forth a vast amount of social media users and therefore opinions that are shared online. A subset of those users, adolescents, seem to develop some sort of addiction towards social media, which could lead to low life satisfaction. This paper tries to extract trending topics and their relation to well-being in order to help organizations like MyWellnessCheck check in on adolescents and students. The results indicated that this was possible, despite the vast amount of spam that is present online. Unsurprisingly, current events made the list of trending topics with negative sentiment like "school shootings", as well as unexpected topics with positive sentiment that could potentially improve well-being, like "safe snacks". ...

Designing a dashboard for wellbeing data

A recommendation system for individual wellbeing

Bachelor thesis (2022) - M.A.A. Groenendijk, W.L.A. van der Maden, G.M. Allen, U.K. Gadiraju, J.D. Lomas, Z. Erkin

Due to COVID-19, overall wellbeing decreased worldwide, and assessing and improving wellbeing became a more important subject. This article describes design research that uses the My Wellness Check survey, created by Delft University of Technology, and aims to create a dashboard for wellbeing. This includes a way of authenticating users for access to very sensitive data. It also covers ways to improve personal wellbeing using a recommendation system based on different types of filtering, and the additional elements that such a recommendation system needs. ...

Investigating Feasibility of Webcam-Based Eye-Tracking as an Alternative Input Modality for Micro-Task Work

Bachelor thesis (2022) - D.R. Struijk, G.M. Allen, U.K. Gadiraju

People who perform work on micro-task crowdsourcing platforms often do so using a mouse and/or keyboard for many hours at a time, while alternative modes of input could potentially provide a better experience. This research investigates the feasibility of using webcam-based eye-tracking in a micro-task work environment. We accomplish this by setting up a user study (n = 20), where participants are asked to perform a series of image classification tasks using either a mouse or just their eyes. Overall, results show that participants using a webcam are generally able to complete the tasks adequately. However, they perform somewhat slower and less accurately, and are less content with their overall experience. Based on our results, we suggest that there are still limitations to overcome when applying webcam-based eye-tracking to micro-tasks. ...

Using Transformers to Generate Wellbeing Questions

Bachelor thesis (2022) - M. Trasberg, W.L.A. van der Maden, U.K. Gadiraju, G.M. Allen, J.D. Lomas, Z. Erkin

With the surge of mental health issues during COVID-19, more emphasis has turned towards assessing wellbeing. At the same time, recent advances in AI have shown huge potential in a variety of fields. However, few solutions are available at the intersection of those two fields. This research explores how the use of transformer models like GPT-3 could have a positive impact in the domain of wellbeing and proposes a solution to automatize the survey question creation process. After comparing several GPT-3 question creation methods, it was found that through clever prompt engineering and added context, it can be possible to generate syntactically and contextually correct questions about any specific wellbeing context. In addition, the paper discusses potential ways to assess such questions and offers a demo for a question generation web application. ...

Using graphics to assess wellbeing in a conversational user interface

Bachelor thesis (2022) - A. Achilleos, W.L.A. van der Maden, J.D. Lomas, U.K. Gadiraju, G.M. Allen, Z. Erkin

With the COVID-19 pandemic testing humanity worldwide in unforeseen ways, wellbeing assessment has stepped to the foreground of individual health status. Conversational User Interfaces (CUIs) are promising as an assessment tool, but lack the means to keep users engaged during the process. This research explores a solution that introduces a graphical modality to a CUI to improve user experience during wellbeing assessment. A psychological projective testing technique was researched, the H-T-P test, which exposes one's personality, mood and feelings through drawing. A system was designed that utilizes a simplified version of the test and image classification to provide a new means of wellbeing assessment. The system's performance and usability show that such an approach is indeed feasible. This paper describes the design process of the system, the resulting prototype and possible improvements for future implementations. ...

Effects of Adaptive Conversational User Interfaces on Enjoyment and Engagement while assessing Wellbeing

Bachelor thesis (2022) - C.E. Eijkelkamp, W.L.A. van der Maden, J.D. Lomas, U.K. Gadiraju, G.M. Allen, Z. Erkin

A decrease in wellbeing worldwide due to the COVID-19 pandemic called for ways to assess wellbeing in a scalable and adequate manner. Conversational User Interfaces (CUIs) seem suitable; however, applying them optimally in certain contexts remains a challenge. This study aims to find ways to make CUIs more engaging and provide a better experience by making them adaptive. A 3x2 between-subjects experiment is designed in which the effects of avatar presence, gender, and an empathic conversational style are researched. A chatbot was created in Telegram, and the visual design and conversational style were altered to measure the effects on Questionnaire Experience (QX), Enjoyment, and Empathy. In total, 30 participants chatted with a randomly assigned chatbot and filled in a survey about their experiences. There is no statistically significant preference for avatar presence or conversational style. When comparing gender, male-gendered chatbots score higher on QX, but female-gendered chatbots are perceived as more empathic. ...

Investigating Webcam-based Hand-tracking for Navigation in Micro-task Crowdsourcing

Bachelor thesis (2022) - S. El Hilali, G.M. Allen, U.K. Gadiraju

The health of micro-task crowdsourcing workers, also called crowdworkers, is something that is overlooked in the micro-task crowdsourcing literature. Due to repetitive tasks, they can develop Repetitive Strain Injuries. To look into other ways of navigating Crowdsourcing Work Environments (CSWEs) outside the mouse and keyboard paradigm, we consider webcam-based hand-tracking in this paper. The main question we considered was which hand gestures were most suitable for navigating CSWEs. By having micro-task crowdworkers (n=14) test five methods of navigating CSWEs, we found that gestures which were considered easiest and most useful were those that specified a single action in an interface catered to hand-tracking controls. Gestures which attempt to directly replace the mouse in a regular mouse-oriented interface were rated lower on usefulness and ease of use. We also found that most crowdworkers were unlikely to use hand gestures for progressing through related subtasks, since they were considered harder than using the keyboard and mouse. ...

Analyzing the health status of crowd workers compared to desk workers

Bachelor thesis (2022) - T.Y. Huang, G.M. Allen, U.K. Gadiraju, J.A. Pouwelse

Microtask crowdsourcing workers, also known as crowd workers, perform small tasks known as microtasks. These people use crowdsourcing platforms to complete these microtasks. Crowd workers have to work in front of a screen to complete these microtasks, risking musculoskeletal and mental health problems. Their working conditions look similar to those of desk workers, who are people that work remotely or at the office behind a desk. This study aims to find the health differences between crowd workers and desk workers. It will provide a general overview on the subjective well-being, experienced and mental health. In order to analyze the differences in health, a survey will be deployed on a crowdsourcing platform to recruit crowd workers, and desk workers will be recruited through snowball sampling. The questions of the survey are divided into 5 groups, each representing a health category: general health, workspace quality, physical well-being, social well-being and emotional well-being. For this study, 17 crowd workers and 9 desk workers were recruited. The results indicate that desk workers are healthier in general, have a healthier workspace because some desk workers work in ergonomically good offices, and have better physical well-being, better social well-being due to having colleagues, and better emotional well-being. Crowd workers have a lower level of stress, because microtasks are mostly very simple, while desk workers have mentally demanding deadlines and projects to work on. ...