Y. Lee

8 records found

Journal article (2025) - Yoon Lee, Gosia Migut, Marcus Specht
Learner behaviours often provide critical clues about learners' cognitive processes. However, the capacity of human intelligence to comprehend and intervene in learners' cognitive processes is often constrained by the subjective nature of human evaluation and the challenges of maintaining consistency and scalability. The recent widespread AI technology has been applied to learning analytics (LA), aiming at a more accurate, consistent and scalable understanding of learning to compensate for challenges that human intelligence faces. However, machine intelligence has been criticized for lacking contextual understanding and difficulties dealing with complex human emotions and social cues. In this work, we aim to understand learners' internal cognitive processes based on the external behavioural cues of learners in a digital reading context, using a hybrid intelligence (HI) approach, bridging human and machine intelligence. Based on the behavioural frameworks and the insights from human experts, we scope specific behavioural cues that are known to be relevant to learners' attention regulation, which is highly relevant for learners' cognitive processes. We utilize the public WEDAR dataset with 30 subjects' video data, behaviour annotation and pre–post tests on multiple choice and summarization tasks. We apply the explainable AI (XAI) approach to train the machine learning model so that human evaluators can also understand which behavioural features were essential for predicting the usage of the cognitive processes (ie, higher-order thinking skills [HOTS] and lower-order thinking skills [LOTS]) of learners, providing insights for the next-round feature engineering and intervention design. 
The result indicates that the dominant use of attention regulation behaviours is a reliable indicator of low use of LOTS with 79.33% prediction accuracy, while reading speed is a valuable indicator for predicting the overall usage of HOTS and LOTS, ranging from 60.66% to 78.66% accuracy, highly surpassing random guess of 33.33%. Our study demonstrates how various combinations of behavioural features supported by HI can inform learners' cognitive processes accurately and interpretably, integrating human and machine intelligence.
Practitioner notes
What is already known about this topic
Human attention is a cognitive process that allows us to choose and concentrate on relevant information, which leads to successful learning.
In affective computing, certain behavioural cues (eg, attention regulation behaviours) are used to indicate learners' attentional states during learning.
What this paper adds
Attention regulation behaviours during digital reading can work as predictors of different levels of cognitive processes (ie, the utilization of higher-order thinking skills [HOTS] and lower-order thinking skills [LOTS]), leveraged by computer vision and machine learning.
By developing an explainable AI model, we can predict learners' cognitive processes, which often cannot be achieved by human observations, while understanding behavioural components that lead to such machine decisions is critical.
It can provide valuable machine-driven insights into the relationship between humans' external and internal states in learning.
Based on the frameworks spanning cognitive AI, psychology and education, expert knowledge can contribute to initial feature selection and engineering for the hybrid intelligence (HI) model development and next-round intervention design.
Implications for practice and/or policy
Human and machine intelligence form an iterative cycle to build a HI to understand and intervene in learners' cognitive processes in digital reading, balancing each other's strengths and weaknesses in decision-making.
It can eventually inform automated feedback loops in widespread e-learning, a new education norm since the COVID-19 pandemic.
Our framework also has the potential to be extended to other scenarios with digital reading, providing concrete examples of where human intelligence and machine intelligence can contribute to building a HI.
It represents more systematic supports that apply to real-life practices. ...

Multimodal AI for Real-Time Interaction Loop towards Attentive E-Reading

Doctoral thesis (2024) - Y. Lee
E-learning has shifted the traditional learning paradigms in higher education, offering more flexible, ubiquitous, and personalized learning experiences. The COVID-19 pandemic of previous years required a re-calibration of education to accommodate virtual learning environments in place of traditional classroom-based education. Widespread learning platforms and digital devices have accelerated the adoption of e-learning, and it now plays a central role in formal and informal education. ...
Book chapter (2023) - Yoon Lee, Gosia Migut, Marcus Specht
This study is built upon a behavior-based framework for real-time attention evaluation of higher education learners in e-reading. Significant challenges in AI model developments for learning analytics have been 1) defining valid indicators and 2) connecting the analytics results to interventions, balancing the generalization and personalization needs. To address this, we utilized a public multimodal WEDAR dataset and trained a neural network model based on real-time features of learners, aiming at predicting learners’ moment-to-moment distractions. Real-time features for model training include 30 learners’ attention regulation behaviors annotated every second, reaction times to blur stimuli, and page numbers indicating various reading phases. Our preliminary model based on a neural network has achieved 66.26% accuracy in predicting self-reported distractions. Based on the model, we suggest a framework of a Behavior-based Feedback Loop for Attentive e-reading (BFLAe). It has text blur as feedback, a mechanism responsive to learners’ distractions that also works as data for next-round feedback. The general feedback implementation rules are established on a statistical analysis conducted on all learners. In addition, we propose a strategy for personalizing feedback using a quartile analysis of individual data, promoting learner-specific feedback. Our framework addresses the high demand for an automated e-learning assistant with non-intrusive data collection based on real-world settings and intuitive feedback provision. The feedback system aims to help learners with longer attention spans and less frequent distractions, leading to more engaging e-reading. ...
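The quartile-based personalization mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which measurement is split into quartiles or how the thresholds trigger feedback, so everything here is an assumption: the function names, the use of one learner's reaction times to blur stimuli as input, and the rule that a value above that learner's own third quartile counts as a candidate for feedback.

```python
from statistics import quantiles


def quartile_thresholds(values):
    """Return (Q1, Q2, Q3) for one learner's measurements."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q2, q3


def needs_feedback(value, learner_history):
    """Hypothetical rule: flag a value above the learner's own Q3.

    This is an illustrative assumption, not the rule used in the paper.
    """
    _, _, q3 = quartile_thresholds(learner_history)
    return value > q3


# Made-up reaction times (seconds) for a single learner.
history = [1.0, 2.0, 3.0, 4.0, 5.0]
print(quartile_thresholds(history))
print(needs_feedback(4.5, history))
```

Because the thresholds come from each learner's own data, the same absolute value can trigger feedback for one learner and not for another, which is the sense of personalization the abstract describes.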
Conference paper (2023) - Yoon Lee, Marcus Specht
Reading on digital devices has become more commonplace, while it often poses challenges to learners' attention. In this study, we hypothesized that allowing learners to reflect on their reading phases with an empathic social robot companion might enhance learners' attention in e-reading. To verify our assumption, we collected a novel dataset (SKEP) in an e-reading setting with social robot support. It contains 25 multimodal features from various sensors and logged data that are direct and indirect cues of attention. Based on the SKEP dataset, we comprehensively compared the difference between HRI-based (treatment) and GUI-based (control) feedback and obtained insights for intervention design. Based on the human annotation of the nearly 40 hours of video data streams from 60 subjects, we developed a machine learning model to capture attention-regulation behaviors in e-reading. We exploited a two-stage framework to recognize learners' observable self-regulatory behaviors and conducted attention analysis. The proposed system showed a promising performance with high prediction results of e-reading with HRI, such as 72.97% accuracy in recognizing attention regulation behaviors, 74.29% accuracy in predicting knowledge gain, 75.00% for perceived interaction experience, and 75.00% for perceived social presence. We believe our work can inspire the future design of HRI-based e-reading and its analysis. ...

Adaptive Data-Driven Persona Development and Application Based on Unsupervised Learning

Journal article (2023) - Yoon Lee, Gosia Migut, Marcus Specht
Different individual features of the learner data often work as essential indicators of learning and intervention needs. This work exploits the personas in the design thinking process as the theoretical basis to analyze and cluster learners’ learning behavior patterns as groups. To adapt to the learning practice, we develop data-driven personas by clustering learners’ features based on factual learning outcomes (i.e., knowledge gain, perceived learning experience, perceived social presence) using unsupervised learning, a more accessible and objective intervention design strategy for e-reading practices. Using the Chi-square test, we quantitatively evaluate different clusters driven by various unsupervised learning methods on the multimodal SKEP dataset. Furthermore, for a more practical real-life application, we achieved automatic persona prediction based on the attention regulation behaviors of learners. The subject-independent evaluation results indicate the best classification accuracy of 70% for the four-level classification task, differentiating three personas of learners with needs and another without feedback needs. It also shows that time-based sampling on both independent and cumulative learner behaviors works as a robust predictor of learner personas, achieving a stable accuracy range of 65%-70% throughout the e-reading with the SVM classifier. Our work inspires the design of a real-time feedback loop for e-learning based on conversational agents. ...
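As a rough illustration of how a Chi-square test can be used to evaluate a clustering, the sketch below computes the Pearson chi-square statistic for a contingency table of cluster membership against an outcome category. The abstract does not state which variables were tabulated, so the table layout and counts are made up, and the function name is hypothetical. The sketch assumes every row and column total is positive.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom.

    `table` is a list of rows of observed counts, e.g. clusters (rows)
    against outcome categories (columns).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if cluster and outcome were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Made-up counts: 2 clusters x 2 outcome categories.
stat, dof = chi_square_statistic([[10, 20], [20, 10]])
print(round(stat, 3), dof)
```

A larger statistic for the same degrees of freedom means the clusters separate the outcome categories more strongly than independence would predict, which gives one way to compare clusterings produced by different methods.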
Conference paper (2023) - Yoon Lee, Bibeg Limbu, Zoltan Rusak, Marcus Specht
Technology-enhanced learning systems, specifically multimodal learning technologies, use sensors to collect data from multiple modalities to provide personalized learning support beyond traditional learning settings. However, many studies surrounding such multimodal learning systems mostly focus on technical aspects concerning data collection and exploitation and therefore overlook theoretical and instructional design aspects such as feedback design in multimodal settings. This paper explores multimodal learning systems as a critical part of technology-enhanced learning used for capturing and analyzing the learning process to exploit the collected multimodal data to generate feedback in multimodal settings. By investigating various studies, we aim to reveal the roles of multimodality in technology-enhanced learning across various learning domains. Our scoping review outlines the conceptual landscape of multimodal learning systems, identifies potential gaps, and provides new perspectives on adaptive multimodal system design: intertwining learning data for meaningful insights into learning, designing effective feedback, and implementing them in diverse learning domains. ...

Webcam-based Attention Analysis via Attention Regulator Behavior Recognition with a Novel E-reading Dataset

Conference paper (2022) - Yoon Lee, Haoyu Chen, Guoying Zhao, Marcus Specht
Human attention is a critical yet challenging cognitive process to measure due to its diverse definitions and non-standardized evaluation. In this work, we focus on the attention self-regulation of learners, which commonly occurs as an effort to regain focus, contrary to attention loss. We focus on easy-to-observe behavioral signs in the real-world setting to grasp learners' attention in e-reading. We collected a novel dataset of 30 learners, which provides clues of learners' attentional states through various metrics, such as learner behaviors, distraction self-reports, and questionnaires for knowledge gain. To achieve automatic attention regulator behavior recognition, we annotated 931,440 frames into six behavior categories every second in the short clip form, using attention self-regulation from the literature study as our labels. The preliminary Pearson correlation coefficient analysis indicates certain correlations between distraction self-reports and unimodal attention regulator behaviors. Baseline model training has been conducted to recognize the attention regulator behaviors by applying classical neural networks to our WEDAR dataset, with the highest prediction result of 75.18% and 68.15% in subject-dependent and subject-independent settings, respectively. Furthermore, we present the baseline of using attention regulator behaviors to recognize the attentional states, showing a promising performance of 89.41% (leave-five-subject-out). Our work inspires the detection & feedback loop design for attentive e-reading, connecting multimodal interaction, learning analytics, and affective computing. ...
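The leave-five-subject-out evaluation named in this abstract can be sketched as a subject-wise split in which no learner contributes samples to both the training and the test side of a fold. The function name, the grouping of subjects in sorted order, and the toy data are assumptions for illustration; the abstract does not describe how subjects were assigned to folds.

```python
def leave_k_subjects_out(subject_ids, k):
    """Yield (train_indices, test_indices) with k subjects held out per fold.

    `subject_ids` holds one subject label per sample. Subjects are grouped
    in sorted order, which is an arbitrary choice made for this sketch.
    """
    subjects = sorted(set(subject_ids))
    for start in range(0, len(subjects), k):
        held_out = set(subjects[start:start + k])
        train = [i for i, s in enumerate(subject_ids) if s not in held_out]
        test = [i for i, s in enumerate(subject_ids) if s in held_out]
        yield train, test


# Toy data: 30 subjects with 2 samples each, matching the dataset size only.
subject_ids = [s for s in range(30) for _ in range(2)]
folds = list(leave_k_subjects_out(subject_ids, k=5))
print(len(folds))
```

With 30 subjects and five held out per fold, this yields six folds. Holding out whole subjects is what distinguishes a subject-independent setting from a subject-dependent one, where samples from the same learner may appear on both sides.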
Conference paper (2020) - H. Chen, E. Tan, Y. Lee, S. Praharaj, M. Specht, G. Zhao
Using Artificial Intelligence (AI) and machine learning technologies to automatically mine latent patterns from educational data holds great potential to inform teaching and learning practices. However, the current AI technology mostly works as a "black box": only the inputs and the corresponding outputs are available, which largely impedes researchers from gaining access to explainable feedback. This interdisciplinary work presents an explainable AI prototype with visualized explanations as feedback for computer-supported collaborative learning (CSCL). This research study seeks to provide interpretable insights with machine learning technologies for multimodal learning analytics (MMLA) by introducing two different explanatory machine learning-based models (neural network and Bayesian network) in different manners (end-to-end learning and probabilistic analysis) and for the same goal: to provide explainable and actionable feedback. The prototype is applied to the real-world collaborative learning scenario with data-driven learning based on sensor data from multiple modalities, which can assess collaborative learning processes and render explanatory real-time feedback. ...