N. Roy | TU Delft Repository

On the Effects of Automatically Generated Adjunct Questions for Search as Learning

Conference paper (2024) - Peide Zhu, Arthur Câmara, Nirmal Roy, David Maxwell, Claudia Hauff

Actively engaging learners with learning materials has been shown to be very important in the Search as Learning (SAL) setting. One active reading strategy relies on asking so-called adjunct questions, i.e., manually curated questions geared towards essential concepts of the target material. However, manual question creation is impractical given the vast online content. Recent research has explored the effects of Automatic Question Generation (AQG) on aiding human learning. These studies have primarily focused on user studies in controlled online reading scenarios with limited documents. However, the impacts of adjunct questions on learning in the SAL setting, which involves learning through web searching, are not yet well understood. This paper addresses this gap by conducting a user study with automatically generated adjunct questions integrated into the reading interface built on top of a search system. We conducted a between-subjects user study (N = 144) to investigate the incorporation of automatically generated adjunct questions on participants' learning. We employed three different question generation strategies as well as a control condition: (i) synthesis questions; (ii) factoid questions targeting random text spans; and (iii) factoid questions targeting terms and phrases relevant to the information need at hand. We present four major findings: (i) participants who received adjunct questions exhibited significantly more fine-grained reading behaviour, such as longer document dwell time and more scrolls, than those without adjunct questions. However, adjunct questions' influence on learning outcomes depends on the AQG strategy. (ii) Question types significantly influence participants' reading behaviour. (iii) The adjunct questions' target spans significantly influence learning outcomes. Lastly, (iv) participants' prior knowledge levels affect adjunct questions' effects on their learning outcomes and their reaction to different AQG strategies. 
Our findings have significant design implications for learning-oriented search systems. The data and code are available at https://github.com/zpeide/AQG-AdjunctQuestions.

Exploring the effects of interactive interfaces on user search behaviour

Doctoral thesis (2024) - N. Roy

Interactive information retrieval (IIR) is a user-centered approach to information seeking and retrieval. In this paradigm, the search process is not confined to a single query and a static set of results. Instead, it emphasises the active involvement of users in refining their information needs, iteratively modifying queries, and exploring retrieved content. IIR studies investigate how to facilitate a more tailored and practical search experience, adapting to the evolving requirements and preferences of users. In this thesis, we focus on four distinct yet interrelated areas in the domain of IIR to gain a better understanding of the interaction between the user and the information retrieval system. How users interact with a search system depends on several factors, including, but not limited to, the device on which they search, the interface, the task at hand, and their prior expertise. In Chapter 2, we explore the role of search interface layout and task complexity in user search behaviour and task effectiveness. We aim to reproduce the setup of two IIR studies conducted a decade ago that explored the effect of the search interface and task complexity on user behaviour. As search interfaces have kept evolving, we ask whether user search behaviour has remained the same. Our goal is to observe to what extent the findings from those two studies still hold today. Next, we focus on a specific aspect of IIR, called Search as Learning (SAL), where users participate in learning-oriented search tasks. These search tasks are exploratory, involving multiple iterations that require cognitive processing and sensemaking. They often require searchers to spend time scanning, viewing, comparing and understanding documents. Prior studies have shown that, in offline classroom learning scenarios, active reading tools such as highlighting and note-taking help learners better process what they read and consequently improve their learning outcomes.
In Chapter 3, we explore to what extent highlighting and note-taking tools, when we implement and incorporate them into the interface of a standard search engine, affect search behaviour and users’ learning outcomes. We intend to explore if they are also beneficial in the online SAL scenario. While designing and incorporating widgets (e.g. a note-taking tool) in a search interface, researchers face numerous design decisions regarding where to place the widgets, what they should look like, what functionalities they must have and so on. Due to budget constraints, it is not feasible to run A/B tests on all possible options. Thus, next in Chapter 4, we build a user model leveraging Search Economic Theory (SET), where we, for the first time, incorporate positional information of widgets. SET is based on micro-economic theory that assumes that users are rational agents—they aim to maximise profit and minimise cost. Previous work has utilised SET to develop models for predicting user interaction under various circumstances where widgets on the SERP are typically considered fixed, and their position is not part of the user model definition. Thus, in this thesis, we explore if we can derive a sensible hypothesis of user behaviour using our user model that incorporates positional information of widgets. Finally, having so far dealt with documents in text modality of presentation, in Chapter 5 we look into the voice modality of presentation in the context of collecting relevance judgments for building test collections by employing crowdworkers. Previous studies have explored to what extent various factors like document length, topic difficulty, cognitive aspects of crowdworkers, etc., affect their relevance judgement effectiveness. However, none of them considered the presentation modality of the documents to be judged. Audio-only devices are getting popular, and leveraging these devices can increase the scope of collecting relevance judgements. 
For example, crowdworkers can judge documents on the go, and those with visual disabilities can also participate in the judgement task. Thus, we observe how the presentation modality of documents, that is, representing them as text or voice, affects the relevance judgement effectiveness of crowdworkers. We also explore to what extent there is an interplay of document length and cognitive aspects of crowdworkers with the presentation modality. With the studies conducted in this thesis, we make scientific contributions to the field by providing novel insights covering a breadth of topics and advancing our understanding of the field. We hope our contributions pave the way for further research and exploration in the field of IIR with the ultimate goal of enhancing the web search experience and performance of users.

Viewpoint Diversity in Search Results

Conference paper (2023) - Tim Draws, Nirmal Roy, Oana Inel, Alisa Rieger, Rishav Hada, Mehmet Orcun Yalcin, Benjamin Timmermans, Nava Tintarev

Adverse phenomena such as the search engine manipulation effect (SEME), where web search users change their attitude on a topic following whatever most highly-ranked search results promote, represent crucial challenges for research and industry. However, the current lack of automatic methods to comprehensively measure or increase viewpoint diversity in search results complicates the understanding and mitigation of such effects. This paper proposes a viewpoint bias metric that evaluates the divergence from a pre-defined scenario of ideal viewpoint diversity considering two essential viewpoint dimensions (i.e., stance and logic of evaluation). In a case study, we apply this metric to actual search results and find considerable viewpoint bias in search results across queries, topics, and search engines that could lead to adverse effects such as SEME. We subsequently demonstrate that viewpoint diversity in search results can be dramatically increased using existing diversification algorithms. The methods proposed in this paper can assist researchers and practitioners in evaluating and improving viewpoint diversity in search results. ...

Hear Me Out

A Study on the Use of the Voice Modality for Crowdsourced Relevance Assessments

Conference paper (2023) - Nirmal Roy, Agathe Balayn, David Maxwell, Claudia Hauff

The creation of relevance assessments by human assessors (often nowadays crowdworkers) is a vital step when building IR test collections. Prior works have investigated assessor quality and behaviour, and tooling to support assessors in their task. We have few insights, though, into the impact of a document's presentation modality on assessor efficiency and effectiveness. Given the rise of voice-based interfaces, we investigate whether it is feasible for assessors to judge the relevance of text documents via a voice-based interface. We ran a user study (n = 49) on a crowdsourcing platform where participants judged the relevance of short and long documents (sampled from the TREC Deep Learning corpus) presented to them either in the text or voice modality. We found that: (i) participants are equally accurate in their judgements across both the text and voice modality; (ii) with increased document length it takes participants significantly longer (for documents of length > 120 words it takes almost twice as much time) to make relevance judgements in the voice condition; and (iii) the ability of assessors to ignore stimuli that are not relevant (i.e., inhibition) impacts the assessment quality in the voice modality: assessors with higher inhibition are significantly more accurate than those with lower inhibition. Our results indicate that we can reliably leverage the voice modality as a means to effectively collect relevance labels from crowdworkers. ...

Users and Contemporary SERPs

A (Re-)Investigation: Examining User Interactions and Experiences

Conference paper (2022) - N. Roy, D.M. Maxwell, C. Hauff

The Search Engine Results Page (SERP) has evolved significantly over the last two decades, moving away from the simple ten blue links paradigm to considerably more complex presentations that contain results from multiple verticals and granularities of textual information. Prior works have investigated how user interactions on the SERP are influenced by the presence or absence of heterogeneous content (e.g., images, videos, or news content), the layout of the SERP (list vs. grid layout), and task complexity. In this paper, we reproduce the user studies conducted in prior works, specifically those of Arguello et al. (2012) and Siu and Chaparro (2014), to explore to what extent the findings from research conducted five to ten years ago still hold today as the average web user has become accustomed to SERPs with ever-increasing presentational complexity. To this end, we designed and ran a user study with four different SERP interfaces: (i) a heterogeneous grid; (ii) a heterogeneous list; (iii) a simple grid; and (iv) a simple list. We collected the interactions of 41 study participants over 12 search tasks for our analyses. We observed that SERP types and task complexity affect user interactions with search results. We also find evidence to support most (6 out of 8) observations from Arguello et al. (2012) and Siu and Chaparro (2014), indicating that user interactions with different interfaces and when solving tasks of different complexity have remained mostly similar over time. ...

A many-analysts approach to the relation between religiosity and well-being

Journal article (2022) - Suzanne Hoogeveen, Alexandra Sarafoglou, AC Balazs, Yonathan Aditya, Alexandra J. Alayan, Peter J. Allen, Sacha Altay, T.A. Draws, N. Roy, More authors...

The relation between religiosity and well-being is one of the most researched topics in the psychology of religion, yet the directionality and robustness of the effect remains debated. Here, we adopted a many-analysts approach to assess the robustness of this relation based on a new cross-cultural dataset ((Formula presented.) participants from 24 countries). We recruited 120 analysis teams to investigate (1) whether religious people self-report higher well-being, and (2) whether the relation between religiosity and self-reported well-being depends on perceived cultural norms of religion (i.e., whether it is considered normal and desirable to be religious in a given country). In a two-stage procedure, the teams first created an analysis plan and then executed their planned analysis on the data. For the first research question, all but 3 teams reported positive effect sizes with credible/confidence intervals excluding zero (median reported (Formula presented.)). For the second research question, this was the case for 65% of the teams (median reported (Formula presented.)). While most teams applied (multilevel) linear regression models, there was considerable variability in the choice of items used to construct the independent variables, the dependent variable, and the included covariates. ...

How Do Active Reading Strategies Affect Learning Outcomes in Web Search?

Conference paper (2021) - N. Roy, M. Valle Torre, Ujwal Gadiraju, D.M. Maxwell, C. Hauff

Prior work in education research has shown that various active reading strategies, notably highlighting and note-taking, benefit learning outcomes. Most of these findings are based on observational studies where learners learn from a single document. In a Search as Learning (SAL) context where learners have to iteratively scan and explore a large number of documents to address their learning objective, the effect of these active reading strategies is largely unexplored. To address this research gap, we carried out a crowd-sourced user study, and explored the effects of different highlighting and note-taking strategies on learning during a complex, learning-oriented search task. Out of five hypotheses derived from the education literature we could confirm three in the SAL context. Our findings have important design implications on aiding learning through search. Learners can benefit from search interfaces equipped with active reading tools—but some learning strategies employing these tools are more effective than others. (This research has been supported by DDS (Delft Data Science) and NWO projects SearchX (639.022.722) and Aspasia (015.013.027).) ...

Searching to Learn with Instructional Scaffolding

Conference paper (2021) - A. Câmara, Nirmal Roy, David Maxwell, Claudia Hauff

Web search engines are today considered to be the primary tool to assist and empower learners in finding information relevant to their learning goals, be it learning something new, improving their existing skills, or just fulfilling a curiosity. While several approaches for improving search engines for the learning scenario have been proposed (e.g. a specific ranking function), instructional scaffolding (or simply scaffolding), a traditional learning support strategy, has not been studied in the context of search as learning, despite being shown to be effective for improving learning in both digital and traditional learning contexts. When scaffolding is employed, instructors provide learners with support throughout their autonomous learning process. We hypothesize that the usage of scaffolding techniques within a search system can be an effective way to help learners achieve their learning objectives whilst searching. As such, this paper investigates the incorporation of scaffolding into a search system employing three different strategies (as well as a control condition): (i) AQe, the automatic expansion of user queries with relevant subtopics; (ii) CURATEDsc, the presenting of a manually curated static list of relevant subtopics on the search engine result page; and (iii) FEEDBACKsc, which projects real-time feedback about a user's exploration of the topic space on top of the CURATEDsc visualization. To investigate the effectiveness of these approaches with respect to human learning, we conduct a user study (N=126) where participants were tasked with searching and learning about topics such as genetically modified organisms. We find that (i) the introduction of the proposed scaffolding methods in the proposed topics does not significantly improve learning gains. However, (ii) it does significantly impact search behavior.
Furthermore, (iii) immediate feedback of the participants' learning (FEEDBACKsc) leads to undesirable user behavior, with participants seemingly focusing on the feedback gauges instead of learning.

Note the Highlight: Incorporating Active Reading Tools in a Search as Learning Environment

Conference paper (2021) - N. Roy, M. Valle Torre, Ujwal Gadiraju, D.M. Maxwell, C. Hauff

Active reading strategies, such as content annotations (through the use of highlighting and note-taking, for example), have been shown to yield improvements to a learner's knowledge and understanding of the topic being explored. This has been especially notable in long and complex learning endeavours. With web search engines nowadays used as the primary gateway for learners (or users) to find content that helps them realise their learning goals, they are often poorly equipped with the necessary tools to aid in sense-making, an important aspect of the Search as Learning (SAL) process. Within the Information Retrieval (IR) community, research efforts have explored ways to keep track of users' search context by providing a notepad-like interface for the collection of relevant articles, and aid them during the exploratory search process. However, these studies did not explicitly measure the effect that such tools have on knowledge and understanding during a complex, learning-oriented search task. In this paper, we address this research gap by carrying out an Interactive IR experiment with highlighting and note-taking tools built into the search interface. We conducted a crowdsourced between-subjects study (N=115), where participants were assigned to one of four conditions: (i) control (a standard web search interface); (ii) high (highlighting enabled); (iii) note (note-taking enabled); and (iv) highnote (both highlighting and note-taking enabled). We assess participants' learning with a recall-oriented vocabulary learning task, and a cognitively more taxing essay writing task. We find that (i) active reading tools do not aid in the vocabulary learning task. However, (ii) participants in high covered 34% more subtopics, and participants in note covered 34% more facts in their essays when compared to control. Furthermore, (iii) we observed that incorporating active learning tools significantly changed the search behaviour of participants across a number of measures.
This is the first work that sheds light on the effect of active reading tools on the SAL process, with important design implications for learning-oriented search systems.

Incorporating Widget Positioning in Interaction Models of Search Behaviour

Conference paper (2021) - N. Roy, A. Barbosa Câmara, D.M. Maxwell, C. Hauff

Models developed to simulate user interactions with search interfaces typically do not consider the visual layout and presentation of a Search Engine Results Page (SERP). In particular, the position and size of interface widgets, such as entity cards and query suggestions, are usually considered a negligible constant. In contrast, in this work, we investigate the impact of widget positioning on user behaviour. To this end, we focus on one specific widget: the Query History Widget (QHW). It allows users to see (and thus reflect on) their recently issued queries. We build a novel simulation model based on Search Economic Theory (SET) that considers how users behave when faced with such a widget by incorporating its positioning on the SERP. We derive five hypotheses from our model and experimentally validate them based on user interaction data gathered for an ad-hoc search task, run across five different placements of the QHW on the SERP. We find partial support for three of the five hypotheses, and indeed observe that a widget's location has a significant impact on search behaviour. ...

Exploring users' learning gains within search sessions

Conference paper (2020) - Nirmal Roy, Felipe Moraes, Claudia Hauff

The area of search as learning is concerned with the optimization of search systems (that is, retrieval functions, user interface elements, etc.) for human learning; this is in contrast to the currently dominant paradigm of optimizing the search experience by optimizing for relevance. While prior work typically considers learning as something that happens at some point during the search session, we are interested in when during the search session learning occurs. In order to answer this question, we here present the results of a user study (N=64) in which searchers were tasked with learning about a topic by searching the web for 20 minutes; they were prompted at regular intervals during the search session on their knowledge about the topic. We find that for study participants with little to no prior knowledge the learning gains are sublinear, while participants with some prior knowledge have the largest knowledge gains towards the end of the search session. ...