Y. Lee
Unveiling cognitive processes in digital reading through behavioural cues
A hybrid intelligence (HI) approach
Learner behaviours often provide critical clues about learners' cognitive processes. However, the capacity of human intelligence to comprehend and intervene in these processes is constrained by the subjective nature of human evaluation and by the difficulty of maintaining consistency and scalability. AI technology, now widespread, has been applied to learning analytics (LA) to provide a more accurate, consistent and scalable understanding of learning and so compensate for these limitations. However, machine intelligence has been criticized for lacking contextual understanding and for having difficulty dealing with complex human emotions and social cues. In this work, we aim to understand learners' internal cognitive processes from their external behavioural cues in a digital reading context, using a hybrid intelligence (HI) approach that bridges human and machine intelligence. Based on behavioural frameworks and insights from human experts, we scope specific behavioural cues known to be relevant to learners' attention regulation, which is in turn highly relevant to their cognitive processes. We use the public WEDAR dataset, which contains video data from 30 subjects, behaviour annotations and pre–post tests on multiple-choice and summarization tasks. We apply an explainable AI (XAI) approach to train the machine learning model so that human evaluators can also understand which behavioural features were essential for predicting learners' use of cognitive processes (ie, higher-order thinking skills [HOTS] and lower-order thinking skills [LOTS]), providing insights for next-round feature engineering and intervention design.
The results indicate that the dominant use of attention regulation behaviours is a reliable indicator of low use of LOTS, with 79.33% prediction accuracy, while reading speed is a valuable indicator for predicting the overall usage of HOTS and LOTS, with accuracy ranging from 60.66% to 78.66%, well above the random-guess baseline of 33.33%. Our study demonstrates how various combinations of behavioural features supported by HI can inform learners' cognitive processes accurately and interpretably, integrating human and machine intelligence.
Practitioner notes
What is already known about this topic
Human attention is a cognitive process that allows us to choose and concentrate on relevant information, which leads to successful learning.
In affective computing, certain behavioural cues (eg, attention regulation behaviours) are used to indicate learners' attentional states during learning.
What this paper adds
Attention regulation behaviours during digital reading can work as predictors of different levels of cognitive processes (ie, the utilization of higher-order thinking skills [HOTS] and lower-order thinking skills [LOTS]), leveraged by computer vision and machine learning.
An explainable AI model can predict learners' cognitive processes, which human observation often cannot achieve; understanding the behavioural components that lead to such machine decisions is critical. Such a model can provide valuable machine-driven insights into the relationship between humans' external and internal states in learning.
Based on the frameworks spanning cognitive AI, psychology and education, expert knowledge can contribute to initial feature selection and engineering for the hybrid intelligence (HI) model development and next-round intervention design.
Implications for practice and/or policy
Human and machine intelligence form an iterative cycle to build an HI that can understand and intervene in learners' cognitive processes in digital reading, balancing each other's strengths and weaknesses in decision-making. It can eventually inform automated feedback loops in widespread e-learning, a new education norm since the COVID-19 pandemic.
Our framework also has the potential to be extended to other scenarios with digital reading, providing concrete examples of where human intelligence and machine intelligence can contribute to building an HI. It offers more systematic support that applies to real-life practice.
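The 33.33% random-guess figure quoted in the abstract above is what a uniform guess over three classes would give. The sketch below works through that arithmetic and the margin of each reported accuracy over chance. It assumes three equally likely classes, which is inferred from the 33.33% figure and not stated in the abstract; the function name and dictionary keys are hypothetical labels chosen for illustration.

```python
def chance_accuracy(n_classes: int) -> float:
    """Expected accuracy of a uniform random guess over n_classes balanced classes."""
    return 1.0 / n_classes


# Accuracies (in percent) quoted in the abstract; the keys are illustrative labels.
reported = {
    "attention regulation -> low LOTS": 79.33,
    "reading speed (lowest)": 60.66,
    "reading speed (highest)": 78.66,
}

# ASSUMPTION: three balanced classes, inferred from the 33.33% baseline.
baseline = 100 * chance_accuracy(3)

# Percentage-point margin of each reported accuracy over the chance baseline.
margins = {name: round(acc - baseline, 2) for name, acc in reported.items()}

for name, margin in margins.items():
    print(f"{name}: +{margin} points over chance")
```

If the classes were imbalanced, a majority-class baseline would be the more conservative comparison; the abstract does not say which applies.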
Interactive Intelligence
Multimodal AI for Real-Time Interaction Loop towards Attentive E-Reading
Reading on digital devices has become commonplace, yet it often poses challenges to learners' attention. In this study, we hypothesized that allowing learners to reflect on their reading phases with an empathic social robot companion might enhance their attention in e-reading. To test this hypothesis, we collected a novel dataset (SKEP) in an e-reading setting with social robot support. It contains 25 multimodal features from various sensors and logged data that are direct and indirect cues of attention. Using the SKEP dataset, we comprehensively compared HRI-based (treatment) and GUI-based (control) feedback and obtained insights for intervention design. Based on human annotation of nearly 40 hours of video data streams from 60 subjects, we developed a machine learning model to capture attention-regulation behaviors in e-reading. We used a two-stage framework to recognize learners' observable self-regulatory behaviors and conducted attention analysis. For e-reading with HRI, the proposed system showed promising performance: 72.97% accuracy in recognizing attention regulation behaviors, 74.29% accuracy in predicting knowledge gain, 75.00% for perceived interaction experience, and 75.00% for perceived social presence. We believe our work can inspire the future design of HRI-based e-reading and its analysis.
What Attention Regulation Behaviors Tell Us About Learners in E-Reading?
Adaptive Data-Driven Persona Development and Application Based on Unsupervised Learning
Technology-enhanced learning systems, specifically multimodal learning technologies, use sensors to collect data from multiple modalities to provide personalized learning support beyond traditional learning settings. However, studies of such multimodal learning systems mostly focus on the technical aspects of data collection and exploitation and therefore overlook theoretical and instructional design aspects such as feedback design in multimodal settings. This paper explores multimodal learning systems as a critical part of technology-enhanced learning, used to capture and analyze the learning process and to exploit the collected multimodal data to generate feedback in multimodal settings. By investigating various studies, we aim to reveal the roles of multimodality in technology-enhanced learning across various learning domains. Our scoping review outlines the conceptual landscape of multimodal learning systems, identifies potential gaps, and provides new perspectives on adaptive multimodal system design: intertwining learning data for meaningful insights into learning, designing effective feedback, and implementing them in diverse learning domains.
WEDAR
Webcam-based Attention Analysis via Attention Regulator Behavior Recognition with a Novel E-reading Dataset
Human attention is a critical yet challenging cognitive process to measure due to its diverse definitions and non-standardized evaluation. In this work, we focus on the attention self-regulation of learners, which commonly occurs as an effort to regain focus, in contrast to attention loss. We focus on easy-to-observe behavioral signs in a real-world setting to grasp learners' attention in e-reading. We collected a novel dataset of 30 learners, which provides clues to learners' attentional states through various metrics, such as learner behaviors, distraction self-reports, and questionnaires for knowledge gain. To achieve automatic attention regulator behavior recognition, we annotated 931,440 frames into six behavior categories at one-second intervals in short-clip form, using attention self-regulation behaviors identified in the literature as our labels. A preliminary Pearson correlation coefficient analysis indicates certain correlations between distraction self-reports and unimodal attention regulator behaviors. Baseline models were trained to recognize the attention regulator behaviors by applying classical neural networks to our WEDAR dataset, with the highest prediction results of 75.18% and 68.15% in subject-dependent and subject-independent settings, respectively. Furthermore, we present a baseline that uses attention regulator behaviors to recognize attentional states, showing a promising performance of 89.41% (leave-five-subject-out). Our work inspires detection-and-feedback loop design for attentive e-reading, connecting multimodal interaction, learning analytics, and affective computing.
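The subject-independent and leave-five-subject-out settings mentioned above both hold out whole subjects, so that no learner contributes samples to both training and test data. A minimal sketch of such a split follows. It is not the authors' evaluation code: the function name `leave_k_subjects_out`, the subject ID format, and the choice of contiguous groups of sorted IDs are all assumptions made for illustration.

```python
from typing import Iterator, List, Sequence, Tuple


def leave_k_subjects_out(
    subject_ids: Sequence[str], k: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) folds that hold out k subjects at a time.

    subject_ids[i] is the subject that sample i belongs to. Subjects are
    partitioned into consecutive groups of k after sorting (an assumption;
    a real protocol might shuffle or stratify instead).
    """
    unique_subjects = sorted(set(subject_ids))
    for start in range(0, len(unique_subjects), k):
        held_out = set(unique_subjects[start:start + k])
        test = [i for i, s in enumerate(subject_ids) if s in held_out]
        train = [i for i, s in enumerate(subject_ids) if s not in held_out]
        yield train, test


# Hypothetical toy data: 30 subjects with 2 samples each.
subject_ids = [f"s{n:02d}" for n in range(30) for _ in range(2)]
folds = list(leave_k_subjects_out(subject_ids, k=5))

for fold_number, (train, test) in enumerate(folds, start=1):
    print(f"fold {fold_number}: {len(train)} train samples, {len(test)} test samples")
```

With 30 subjects and k = 5 this partition yields six folds. Whether the reported 89.41% is an average over such folds is not stated in the abstract.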