P.S. Cesar Garcia
Disability, Differences, and Diversity
Revisiting Inclusive Design and Access
Over 1.3 billion people worldwide live with long-term disabilities, yet many still face systemic exclusion despite advances in accessibility policy and technology. New regulations such as the EU Accessibility Act demand comprehensive transitions, but compliance risks becoming a superficial “checklist” exercise rather than fostering meaningful inclusion. For the HCI community, this moment calls for rethinking our approaches to participation, technology, ethics, and policy. In this meetup, we bring together researchers, practitioners, and advocates to revisit inclusive design through four themes: rethinking inclusive methodologies, disentangling technological challenges, unpacking ethical implications, and navigating policy opportunities. Through interactive mapping activities, participants will share practices, identify collaboration opportunities, and co-develop future directions. Our goal is to build cross-disciplinary connections and create actionable approaches that move beyond compliance toward holistic inclusion, ensuring that accessibility remains central to HCI research and practice.
PhysioDrum
Bridging Physical and Digital Realms in Immersive Musical Interaction
Perceptual quality assessment of Dynamic Point Cloud (DPC) contents plays an important role in various Virtual Reality (VR) applications that involve human beings as the end user. Understanding and modeling perceptual quality assessment is greatly enriched by insights from visual attention. However, incorporating aspects of visual attention in DPC quality models is largely unexplored, as ground-truth visual attention data are scarcely available. Besides, testing methods and procedures for collecting visual attention data are still to be agreed on. This article presents a dataset containing subjective opinion scores and visual attention maps of DPCs, collected in a VR environment using eye-tracking technology. Both the quality score and eye-tracking data were collected during a subjective quality assessment experiment, in which subjects were instructed to watch and rate DPCs at various degradation levels under 6 Degrees of Freedom (DoF) inspection, using a head-mounted display. Qualitative interview analysis was also conducted after the experiment. The dataset consists of 50 DPCs, including 5 reference DPCs, with each reference encoded at 3 distortion levels using 3 different codecs (namely G-PCC, V-PCC, CWI-PCL), amounting to a total of 9 degraded versions per reference. Additionally, it incorporates 1,000 gaze trials from 40 participants, yielding a total of 15,000 visual attention maps across all the DPCs. We additionally benchmark objective quality metrics originally designed for static point clouds, evaluating their performance in our dataset using two temporal pooling strategies. Furthermore, we employ the visual attention data that are retrieved during our experiment to evaluate whether the performance of widely used objective quality metrics is improved by considering subjective measurements of visual attention. This dataset establishes a link between quality assessment and visual attention within the context of DPC. Moreover, thematic analysis of the interviews helps uncover user behavior and factors impacting perceptual quality for DPC in 6 DoF. This work deepens our understanding of DPC quality assessment and visual attention, driving progress in the realm of VR experiences and perception.
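The abstract above mentions applying static point cloud metrics to dynamic content through temporal pooling, without naming the two strategies used. As a rough illustration only, the sketch below shows two common pooling choices: averaging all per-frame scores, and averaging only the worst-scoring fraction of frames. Both function names, the `fraction` parameter, and the convention that a higher score means better quality are assumptions made for this example, not details taken from the article.

```python
from statistics import fmean


def mean_pooling(frame_scores):
    """Average the per-frame quality scores over the whole sequence."""
    return fmean(frame_scores)


def worst_fraction_pooling(frame_scores, fraction=0.1):
    """Average only the lowest-scoring fraction of frames.

    Assumes higher scores mean better quality, so the lowest scores
    correspond to the most degraded frames (hypothetical convention).
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Always keep at least one frame, even for very short sequences.
    k = max(1, round(len(frame_scores) * fraction))
    return fmean(sorted(frame_scores)[:k])


if __name__ == "__main__":
    # Made-up per-frame scores from some static metric, one value per frame.
    scores = [0.90, 0.80, 0.85, 0.40, 0.95, 0.90, 0.88, 0.87, 0.90, 0.86]
    print("mean pooling:", mean_pooling(scores))
    print("worst-20% pooling:", worst_fraction_pooling(scores, fraction=0.2))
```

Mean pooling treats every frame equally, whereas worst-fraction pooling lets a short burst of heavy distortion dominate the sequence-level score; which behaviour tracks subjective ratings better is an empirical question.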
Recent technological developments in AI and immersive media are transforming the artistic landscape, providing novel mechanisms for artists and audiences. Following a human-centric approach, together with a theatre company in Greece, this paper investigates how subtitle placement affects user experience and cognitive load in a live theatre performance enhanced by AR glasses. To do so, we design and develop a system for displaying subtitles in VR and AR. We evaluated the system in two conditions (N = 19; N = 12), both in a controlled environment (VR) and an actual theatre (AR). In the latter, we integrate AI solutions to provide automatic captioning and translation in real time, and VFX to further augment the experience. Our quantitative and qualitative results showed no difference between subtitle placements in terms of cognitive load and user experience, with users equally liking the two proposed approaches. Results also highlighted the perceived usefulness of AR to enhance theatre performances, indicating new paths for wider accessibility and further immersion.
UVG-CWI-DQPC
Dual-Quality Point Cloud Dataset for Volumetric Video Applications
PointPCA+
A Full-reference Point Cloud Quality Assessment Metric with PCA-based Features
The latest social VR technologies have enabled users to attend traditional media and arts performances together while being geographically removed, making such experiences accessible despite budget, distance, and other restrictions. In this work, we aim at improving the way remote performances are shared by designing and evaluating a VR theatre lobby which serves as a space for users to gather, interact, and relive the common experience of watching a virtual opera. We conducted an initial test with experts (N = 10, i.e., designers and opera enthusiasts) in pairs using our VR lobby prototype, developed based on the theoretical lobby design concept. A unique aspect of our experience is its highly realistic representation of users in the virtual space. The test results guided refinements to the VR lobby structure and implementation, aiming to improve the user experience and align it more closely with the social VR lobby's intended purpose. With the enhanced prototype, we ran a between-subject controlled study (N = 40) to compare the user experience in the social VR lobby between individuals and paired participants. To do so, we designed and validated a questionnaire to measure the user experience in the VR lobby. Results of our mixed-methods analysis, including interviews, questionnaire results, and user behavior, reveal the strength of our social VR lobby in connecting with other users, consuming the opera in a deeper manner, and exploring new possibilities beyond what is common in real life. All supplemental materials are available at https://github.com/cwi-dis/IEEEVR2024-VRLobby.
ComPEQ-MR
Compressed Point Cloud Dataset with Eye Tracking and Quality Assessment in Mixed Reality
Point clouds (PCs) have attracted researchers and developers due to their ability to provide immersive experiences with six degrees of freedom (6DoF). However, there are still several open issues in understanding the Quality of Experience (QoE) and visual attention of end users while experiencing 6DoF volumetric videos. First, encoding and decoding point clouds require a significant amount of both time and computational resources. Second, QoE prediction models for dynamic point clouds in 6DoF have not yet been developed due to the lack of visual quality databases. Third, visual attention in 6DoF is hardly explored, which impedes research into more sophisticated approaches for adaptive streaming of dynamic point clouds. In this work, we provide an open-source Compressed Point cloud dataset with Eye-tracking and Quality assessment in Mixed Reality (ComPEQ-MR). The dataset comprises four compressed dynamic point clouds processed by Moving Picture Experts Group (MPEG) reference tools (i.e., VPCC and GPCC), each with 12 distortion levels. We also conducted subjective tests to assess the quality of the compressed point clouds with different levels of distortion. The rating scores are attached to ComPEQ-MR so that they can be used to develop QoE prediction models in the context of MR environments. Additionally, eye-tracking data for visual saliency is included in this dataset, which is necessary to predict where people look when watching 3D videos in MR experiences. We collected opinion scores and eye-tracking data from 41 participants, resulting in 2132 responses and 164 visual attention maps in total. The dataset is available at https://ftp.itec.aau.at/datasets/ComPEQ-MR/.
Extended Reality (XR) has emerged as a transformative and immersive technology with versatile applications in content creation and consumption. As XR gains popularity, companies eager to adopt it often possess a surface-level understanding, investing significant resources without effectively addressing the genuine needs of end-users. This study explores the current workflows of XR production companies, and the potential of social XR in mitigating challenges throughout the XR production workflow. We present the outcomes of three respective focus group workshops conducted with three XR production companies and their experts (N=17). The results indicate that at every stage of the production, namely pre-production, production, post-production, and post-release, there are communication challenges between producers and clients, as well as different production and post-production specialists. We discuss various aspects of XR concerning the problem and propose novel opportunities offered by social XR to ameliorate those challenges, improving communication and making development more agile.
Affective computing has experienced substantial advancements in recognizing emotions through image and facial expression analysis. However, the incorporation of physiological data remains constrained. Emotion recognition with physiological data shows promising results in controlled experiments but lacks generalization to real-world settings. To address this, we present G-REx, a dataset for real-world affective computing. We collected physiological data (photoplethysmography and electrodermal activity) using a wrist-worn device during long-duration movie sessions. Emotion annotations were retrospectively performed on segments with elevated physiological responses. The dataset includes more than 31 movie sessions, totaling over 380 hours of data from more than 190 subjects. The data were collected in a group setting, which can give further context to emotion recognition systems. Our setup aims to be easily replicable in any real-life scenario, facilitating the collection of large datasets for novel affective computing systems.
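The abstract above states that annotation targeted segments with elevated physiological responses, but does not say how those segments were detected. One simple way to approximate the idea is to split a signal into fixed windows and keep the windows whose mean level stands out from the rest of the session. The sketch below does this with a z-score threshold; the function name, the non-overlapping windowing, and the threshold value are all assumptions for illustration and should not be read as the G-REx procedure.

```python
from statistics import fmean, pstdev


def elevated_windows(signal, window_size, z_threshold=1.0):
    """Return (start, end) sample indices of windows with an elevated mean level.

    The signal is cut into non-overlapping windows of `window_size` samples
    (a trailing partial window is ignored). A window is selected when its mean
    lies more than `z_threshold` standard deviations above the average of all
    window means. This rule is a hypothetical stand-in for illustration.
    """
    starts = list(range(0, len(signal) - window_size + 1, window_size))
    means = [fmean(signal[s:s + window_size]) for s in starts]
    if not means:
        return []
    mu, sigma = fmean(means), pstdev(means)
    if sigma == 0:
        # A flat session has no window that stands out.
        return []
    return [
        (s, s + window_size)
        for s, m in zip(starts, means)
        if (m - mu) / sigma > z_threshold
    ]


if __name__ == "__main__":
    # Made-up signal: a flat baseline with one burst in the third window.
    eda = [1, 1, 1, 1, 1, 1, 1, 1, 9, 9, 9, 9, 1, 1, 1, 1]
    print(elevated_windows(eda, window_size=4))
```

In practice, electrodermal activity is usually filtered and decomposed into tonic and phasic components before any such selection, so a raw windowed mean is only a starting point.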
Emotion recognition systems are typically trained to classify a given psychophysiological state into emotion categories. Current platforms for emotion ground-truth collection show limitations for real-world scenarios of long-duration content (e.g., longer than 10 minutes), namely, they: 1) rely on real-time annotation tools, which are distracting and become exhausting; 2) perform retrospective annotation of the whole content in bulk (providing highly coarse annotations); or 3) are used by external experts (depending on the number of annotators and their subjective experience). We explore a novel approach, the EmotiphAI Annotator, that allows undisturbed content visualisation and simplifies the annotation process by using segmentation algorithms that select brief clips for emotional annotation retrospectively. We compare three methods for content segmentation based on physiological data (Electrodermal Activity (EDA), emotion-based), scene (time-based), and random (control) selection. The EmotiphAI Annotator attained a B+ System Usability Scale score and low-average mental workload as per the NASA Task Load Index (40%). The reliability of the self-report was analysed by the inter-rater agreement (STD < 0.75), coherence across time segmentation methods (STD < 0.17), comparison against the state-of-the-art ground truth (STD < 0.7), and correlation to EDA (>0.3 to 0.8), where the EDA-based method obtained the overall best performance.
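For readers unfamiliar with the System Usability Scale mentioned above, the following sketch shows the standard scoring rule for a single respondent: ten items rated from 1 to 5, where odd-numbered items contribute (rating - 1), even-numbered items contribute (5 - rating), and the sum is multiplied by 2.5 to give a score from 0 to 100. The function name and the sample ratings are illustrative; how the paper mapped its scores to a letter grade is not stated in the abstract and is not reproduced here.

```python
def sus_score(ratings):
    """Compute the System Usability Scale score for one respondent.

    `ratings` holds the ten item responses in questionnaire order,
    each an integer from 1 (strongly disagree) to 5 (strongly agree).
    """
    if len(ratings) != 10:
        raise ValueError("SUS requires exactly 10 item ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("each rating must be between 1 and 5")
    total = 0
    for index, rating in enumerate(ratings):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded.
            total += rating - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded.
            total += 5 - rating
    return total * 2.5


if __name__ == "__main__":
    # Made-up responses from a single participant.
    print(sus_score([4, 2, 5, 1, 4, 2, 4, 2, 5, 1]))
```

A study-level SUS value is typically the mean of these per-respondent scores.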
The Internet of Multisensory, Multimedia and Musical Things (Io3MT) is a new concept that arises from the confluence of several areas of computer science, arts, and humanities, with the objective of grouping in a single place devices and data that explore the five human senses, besides multimedia aspects and music content. In the context of this brave new idea paper, we advance the proposition of a theoretical alignment between the emerging domain in question and the field of Artificial Intelligence (AI). The main goal of this endeavor is to tentatively delineate the inceptive trends and conceivable consequences stemming from the fusion of these domains within the sphere of artistic presentations. Our comprehensive analysis spans a spectrum of dimensions, encompassing the automated generation of multimedia content, the real-time extraction of sensory effects, and post-performance analytical strategies. In this manner, artists are equipped with quantitative metrics that can be employed to enhance future artistic performances. We assert that this cooperative amalgamation has the potential to serve as a conduit for optimizing the creative capabilities of stakeholders.