
Y.B. Eisma

34 records found

Eye tracking and response times reveal the dynamics of highway merging decisions

Merging onto a highway is a safety-critical task resulting in a large number of traffic accidents; fundamental research into merging behavior of human drivers can help reduce this toll. Two cognitive processes critical to merging, attention allocation and decision making, have been extensively studied in real-world and simulated driving scenarios. However, how these processes interact during highway merging remains poorly understood. While the relationship between attention and decision making has been widely examined in cognitive science, this work has largely relied on simple decision-making paradigms involving choices between static items on a computer screen, which limits the understanding of more dynamic and naturalistic decisions such as in driving. To address this gap, we investigated the relationship between attention and decision making in a simplified highway merging task. In a video-based experiment, participants (N=24) repeatedly made merging gap acceptance decisions based on the dynamic information about the distance and time-to-arrival to the end of the merging lane and the gap to the target-lane vehicle (available in the front view and the side mirror, respectively). Participants’ decisions, response times, and eye movements were recorded. We found that decisions to accept a gap were considerably faster than decisions to reject a gap. Decision outcomes and timing depended on the distance to and time-to-arrival of the target-lane vehicle, but also on the time pressure due to approaching the end of the merging lane. Most importantly, under high time pressure, a greater proportion of time spent looking at the side mirror was associated with a lower probability of accepting the gap. This finding indicates that differences in visual information sampling can be closely linked to decision outcomes when time budgets are constrained. 
Our results provide initial empirical insights relevant for future cognitive modeling of the interplay between decision making and attention during highway merging. This work can inform early-stage exploration of driver monitoring and support systems for partially automated driving. ...

New psychological phenomena

Journal article (2025) - Joost de Winter, P. A. Hancock, Yke Bauke Eisma
This study describes the impact of ChatGPT use on the nature of work from the perspective of academics and educators. We elucidate six phenomena: (1) the cognitive workload associated with conducting Turing tests to determine if ChatGPT has been involved in work productions; (2) the ethical void and alienation that result from recondite ChatGPT use; (3) insights into the motives of individuals who fail to disclose their ChatGPT use, while, at the same time, the recipient does not reveal their awareness of that use; (4) the sense of ennui as the meanings of texts dissipate and no longer reveal the sender’s state of understanding; (5) a redefinition of utility, wherein certain texts show redundancy with patterns already embedded in the base model, while physical measurements and personal observations are considered as unique and novel; (6) a power dynamic between sender and recipient, inadvertently leaving non-participants as disadvantaged third parties. This paper makes clear that the introduction of AI tools into society has far-reaching effects, initially most prominent in text-related fields, such as academia. Whether these implementations represent beneficial innovations for human prosperity, or a rather different line of social evolution, represents the pith of our present discussion. ...
Attention bias towards social threat has been linked to loneliness and anxiety, though findings are mixed and concerns about measurement reliability persist. This study examined whether state and trait loneliness, along with personality, self-esteem, social anxiety, and life satisfaction, are associated with attention bias towards social threat images (indicating rejection or exclusion) in young adults (N = 241). AI-generated images were used to enhance control over stimulus content and category distinctions. Participants completed an eye-tracking free-viewing task comprising 40 image matrices (four images per matrix, displayed for 6000 ms). We then computed attention bias (dwell time percentage, total fixation duration percentage, and fixation count percentage) and initial orientation of attention (first fixation percentage). The attention bias measures showed adequate-to-good internal consistency (α = 0.61–0.86). No significant associations emerged between loneliness and attention to socially threatening stimuli, suggesting that heightened vigilance to social threat may not be a feature of loneliness in non-clinical young adults. However, it was found that females exhibited greater attention to social positive images, and baseline pupil diameter was associated with social anxiety. Future research should assess whether loneliness-specific attention bias is a replicable phenomenon, ideally by using an extreme-sampling approach with very lonely individuals. ...
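The internal-consistency figures above (α = 0.61–0.86) can be illustrated with a minimal sketch, assuming α denotes Cronbach's alpha; the abstract does not say how it was computed or over which units (for example, per matrix or per block). The function name and the score matrix below are hypothetical and serve only to show the formula.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a participants-by-items score matrix.

    scores: list of rows, one row per participant, each row holding
    the same number k of item scores (k >= 2).
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two participants")
    # Sample variance of each item (column) across participants.
    item_variances = [variance(column) for column in zip(*scores)]
    # Sample variance of each participant's total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: 4 participants, 3 repeated bias measurements each.
example_scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [5, 4, 4],
]
alpha = cronbach_alpha(example_scores)
print(f"alpha = {alpha:.2f}")
```

Alpha rises when the items covary strongly relative to their individual variances, which is why it is commonly read as an index of how consistently the repeated measurements rank participants.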
Conference paper (2025) - Youssef Zahran, Gertjan Burghouts, Yke B. Eisma
Despite the significant advancements in computer vision models, their ability to generalize to novel object-attribute compositions remains limited. Existing methods for Compositional Zero-Shot Learning (CZSL) mainly focus on image classification. This paper aims to enhance CZSL in object detection without forgetting prior learned knowledge. We use Grounding DINO and incorporate Compositional Soft Prompting (CSP) into it and extend it with Compositional Anticipation. We achieve a 70.5% improvement over CSP on the harmonic mean (HM) between seen and unseen compositions on the CLEVR dataset. Furthermore, we introduce Contrastive Prompt Tuning to incrementally address model confusion between similar compositions. We demonstrate the effectiveness of this method and achieve an increase of 14.5% in HM across the pretrain, increment, and unseen sets. Collectively, these methods provide a framework for learning various compositions with limited data, as well as improving the performance of underperforming compositions when additional data becomes available. ...
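As a rough illustration of the harmonic mean (HM) metric mentioned above, the sketch below assumes the common two-term form, HM = 2·S·U / (S + U), where S and U are the scores on seen and unseen compositions. The abstract does not spell out the formula, and the numbers used here are hypothetical, not results from the paper.

```python
def harmonic_mean(seen_score: float, unseen_score: float) -> float:
    """Two-term harmonic mean of non-negative scores; 0.0 if both are zero."""
    total = seen_score + unseen_score
    if total == 0:
        return 0.0
    return 2 * seen_score * unseen_score / total


# Hypothetical scores, chosen only to show how HM penalizes imbalance.
balanced = harmonic_mean(0.6, 0.6)   # equal seen and unseen performance
lopsided = harmonic_mean(0.9, 0.3)   # same arithmetic mean, unequal split
print(balanced, lopsided)
```

Both hypothetical cases have an arithmetic mean of 0.6, but the lopsided case scores lower on HM, which is why the metric is used to reward models that do well on seen and unseen compositions at the same time.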
Recent advancements in AI have accelerated the evolution of versatile robot designs. Chess provides a standardized environment for evaluating the impact of robot behavior on human behavior. This article presents an open-source chess robot for human-robot interaction research, specifically focusing on verbal and non-verbal interactions. The OpenChessRobot recognizes chess pieces using computer vision, executes moves, and interacts with the human player through voice and robotic gestures. We detail the software design, provide quantitative evaluations of the efficacy of the robot, and offer a guide for its reproducibility. An online survey examining people’s views of the robot in three possible scenarios was conducted with 597 participants. The robot received the highest ratings in the robotics education and the chess coach scenarios, while the home entertainment scenario received the lowest scores. The code is accessible on GitHub: https://github.com/renchizhhhh/OpenChessRobot. ...
Robots are becoming more capable and can autonomously perform tasks such as navigating between locations. However, human oversight remains crucial. This study compared two touchless methods for directing mobile robots: voice control and gesture control, to investigate the efficiency of these methods and the preference of users. We tested these methods in two conditions: one in which participants remained stationary and one in which they walked freely alongside the robot. We hypothesized that walking alongside the robot would result in higher intuitiveness ratings and improved task performance, based on the idea that walking promotes spatial alignment and reduces the effort required for mental rotation. In a 2×2 within-subject design, 218 participants guided the quadruped robot Spot along a circuitous route with multiple 90° turns using rotate left, rotate right, and walk forward commands. After each trial, participants rated the intuitiveness of the command mapping, while post-experiment interviews were used to gather the participants’ preferences. Results showed that voice control combined with walking with Spot was the most favored and intuitive, whereas gesture control while standing caused confusion for left/right commands. Nevertheless, 29% of participants preferred gesture control, citing increased task engagement and visual congruence as reasons. An odometry-based analysis revealed that participants often followed behind Spot, particularly in the gesture control condition, when they were allowed to walk. In conclusion, voice control with walking produced the best outcomes. Improving physical ergonomics and adjusting gesture types could make gesture control more effective. ...
This study investigated human performance in identifying AI-generated images. In a speeded forced-choice task, 255 participants viewed paired images (one real, one AI-generated by Midjourney) of standard or futuristic cars and buildings and had to identify the AI-generated one, while eye movements were recorded using an eye-tracker. Results revealed a powerful “futurism-as-artificiality” heuristic. Specifically, participants performed poorly (55% correct) when an AI-generated standard image was paired with a real futuristic image. Conversely, accuracy was high (91% correct) when the AI-generated futuristic image was paired with a real standard image. Participants’ gaze landed first on the AI-generated image more often when it depicted a futuristic design than when it depicted a standard one. The demonstrated heuristic presents a double-edged sword for information veracity: it may lead to the uncritical acceptance of AI-generated misinformation that appears conventional, while simultaneously causing real forward-thinking designs to be dismissed as fake. ...
Journal article (2025) - J.C.F. de Winter, Y.B. Eisma
Many of the commentators use the opportunity to highlight the value of Ergonomics and Human Factors (EHF) and challenge the notion that the discipline is fading. In response, we argue that EHF science is lagging behind the rapid developments in AI, remains entrenched in past-century achievements, and is in decline. Indices such as membership counts, conference attendance numbers, and new regulations reflect activity, but not necessarily impact. Important questions about human-AI collaboration are being addressed by other disciplines, often without the involvement of EHF. Rather than advocating for systemic frameworks, we advocate for new skill development and the adoption of data-driven AI methods. To illustrate the potential of AI within EHF, we demonstrate how a vision-language model can replicate findings from a classic knobs-and-dials study. In conclusion, we acknowledge EHF’s knowledge base but foresee existential risks unless the field undergoes major reforms to remain relevant in an AI-dominated future. ...
Journal article (2024) - Joost de Winter, Dimitra Dodou, Yke Bauke Eisma
Within a year of its launch, ChatGPT has seen a surge in popularity. While many are drawn to its effectiveness and user-friendly interface, ChatGPT also introduces moral concerns, such as the temptation to present generated text as one’s own. This led us to theorize that personality traits such as Machiavellianism and sensation-seeking may be predictive of ChatGPT usage. We launched two online questionnaires with 2000 respondents each, in September 2023 and March 2024, respectively. In Questionnaire 1, 22% of respondents were students, and 54% were full-time employees; 32% indicated they used ChatGPT at least weekly. Analysis of our ChatGPT Acceptance Scale revealed two factors, Effectiveness and Concerns, which correlated positively and negatively, respectively, with ChatGPT use frequency. A specific aspect of Machiavellianism (manipulation tactics) was found to predict ChatGPT usage. Questionnaire 2 was a replication of Questionnaire 1, with 21% students and 54% full-time employees, of which 43% indicated using ChatGPT weekly. In Questionnaire 2, more extensive personality scales were used. We found a moderate correlation between Machiavellianism and ChatGPT usage (r = 0.22) and with an opportunistic attitude towards undisclosed use (r = 0.30), relationships that largely remained intact after controlling for gender, age, education level, and the respondents’ country. We conclude that covert use of ChatGPT is associated with darker personality traits, something that requires further attention. ...
Journal article (2024) - Y.B. Eisma, S.T. van Vliet, A.J. Nederveen, J.C.F. de Winter
Steady-State Visual Evoked Potentials (SSVEPs) are brain responses measurable via electroencephalography (EEG) in response to continuous visual stimulation at a constant frequency. SSVEPs have been instrumental in advancing our understanding of human vision and attention, as well as in the development of brain-computer interfaces (BCIs). Ongoing questions remain about which type of visual stimulus causes the most potent SSVEP response. The current study investigated the effects of color, size, and flicker frequency on the signal-to-noise ratio of SSVEPs, complemented by pupillary light reflex measurements obtained through an eye-tracker. Six participants were presented with visual stimuli that differed in terms of color (white, red, green), shape (circles, squares, triangles), size (10,000 to 30,000 pixels), flicker frequency (8 to 25 Hz), and grouping (one stimulus at a time versus four stimuli presented in a 2 × 2 matrix to simulate a BCI). The results indicated that larger stimuli elicited stronger SSVEP responses and more pronounced pupil constriction. Additionally, the results revealed an interaction between stimulus color and flicker frequency, with red being more effective at lower frequencies and white at higher frequencies. Future SSVEP research could focus on the recommended waveform, interactions between SSVEP and power grid frequency, a wider range of flicker frequencies, a larger sample of participants, and a systematic comparison of the information transfer obtained through SSVEPs, pupil diameter, and eye movements. ...

An Experiment Revealing the Role of Human Subjectivity

Journal article (2024) - Y.B. Eisma, R. Koerts, J.C.F. de Winter
With the growing capabilities of AI, technology is increasingly able to match or even surpass human performance. In the current study, focused on the game of chess, we investigated whether chess players could distinguish whether they were playing against a human or a computer, and how they achieved this. A total of 24 chess players each played eight 5+0 Blitz games from different starting positions. They played against (1) a human, (2) Maia, a neural network-based chess engine trained to play in a human-like manner, (3) Stockfish 16, the best chess engine available, downgraded to play at a lower level, and (4) Stockfish 16 at its maximal level. The opponent’s move time was fixed at 10 seconds. During the game, participants verbalized their thoughts, and after each game, they indicated by means of a questionnaire whether they thought they had played against a human or a machine and whether there were particular moves that revealed the nature of the opponent. The results showed that Stockfish at the highest level was usually correctly identified as an engine, while Maia was often incorrectly identified as a human. The moves of the downgraded Stockfish were relatively often labeled as ‘strange’ by the participants. In conclusion, the Turing test, as applied here in a domain where computers can perform superhumanly, is essentially a test of whether the chess computer can devise suboptimal moves that correspond to human moves, and not necessarily a test of computer intelligence. ...

Near-Perfect Performance on a Mathematics Exam

Journal article (2024) - J.C.F. de Winter, D. Dodou, Y.B. Eisma
The processes underlying human cognition are often divided into System 1, which involves fast, intuitive thinking, and System 2, which involves slow, deliberate reasoning. Previously, large language models were criticized for lacking the deeper, more analytical capabilities of System 2. In September 2024, OpenAI introduced the o1 model series, designed to handle System 2-like reasoning. While OpenAI’s benchmarks are promising, independent validation is still needed. In this study, we tested the o1-preview model twice on the Dutch ‘Mathematics B’ final exam. It scored a near-perfect 76 and 74 out of 76 points. For context, only 24 out of 16,414 students in the Netherlands achieved a perfect score. By comparison, the GPT-4o model scored 66 and 62 out of 76, well above the Dutch students’ average of 40.63 points. Neither model had access to the exam figures. Since there was a risk of model contamination (i.e., the knowledge cutoff for o1-preview and GPT-4o was after the exam was published online), we repeated the procedure with a new Mathematics B exam that was published after the cutoff date. The results again indicated that o1-preview performed strongly (97.8th percentile), which suggests that contamination was not a factor. We also show that there is some variability in the output of o1-preview, which means that sometimes there is ‘luck’ (the answer is correct) or ‘bad luck’ (the output has diverged into something that is incorrect). We demonstrate that the self-consistency approach, where repeated prompts are given and the most common answer is selected, is a useful strategy for identifying the correct answer. It is concluded that while OpenAI’s new model series holds great potential, certain risks must be considered. ...
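The self-consistency approach described above (repeat the prompt, keep the most common answer) can be sketched as a simple majority vote. This is a minimal illustration, not the authors' procedure: the function name is hypothetical, the answer strings are made up, and the step of actually querying a model is left out entirely.

```python
from collections import Counter


def self_consistent_answer(answers):
    """Return the most frequent answer and its share of all answers.

    answers: non-empty list of final-answer strings, one per repeated prompt.
    Ties are resolved in favor of the answer that appeared first.
    """
    if not answers:
        raise ValueError("at least one answer is required")
    counts = Counter(answer.strip() for answer in answers)
    best_answer, votes = counts.most_common(1)[0]
    return best_answer, votes / len(answers)


# Hypothetical final answers from five repeated runs on one exam question.
runs = ["x = 2", "x = 2", "x = 3", "x = 2 ", "x = -2"]
answer, agreement = self_consistent_answer(runs)
print(answer, agreement)
```

The agreement share is a convenient by-product: a low value flags questions where the outputs diverged and the majority answer deserves less trust.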

Fade of a discipline

Journal article (2024) - J. C.F. de Winter, Y. B. Eisma
In this commentary, we argue that the field of Ergonomics and Human Factors (EHF) has the tendency to present itself as a thriving and impactful science, while in reality, it is losing credibility. We assert that EHF science (1) has introduced terminology that is internally inconsistent and hardly predictive-valid, (2) has virtually no impact on industrial practice, which operates within frameworks of regulatory compliance and profit generation, (3) repeatedly employs the same approach of conducting lab experiments within unrealistic paradigms in order to complete deliverables, (4) suggests it is a cumulative science, but is neither a leader nor even an adopter of open-science initiatives that are characteristic of scientific progress and (5) is being assimilated by other disciplines as well as Big Tech. Recommendations are provided to reverse this trend, although we also express a certain resignation as our scientific discipline loses significance. Practitioner Summary: This paper offers criticism of the field of Ergonomics. There are issues such as unclear terminology, unrealistic experiments, insufficient impact and lack of open data. We provide recommendations to reverse the trend. This article concerns a critique of EHF as a science, and is not a critique of EHF practitioners. ...

Governed by visual complexity and centrality

Journal article (2023) - Joost C.F. de Winter, D. Dodou, Yke Bauke Eisma
Raven matrices are widely considered a pure test of cognitive abilities. Previous research has examined the extent to which cognitive strategies are predictive of the number of correct responses to Raven items. This study examined whether response times can be explained directly from the centrality and visual complexity of the matrix cells (edge density and perceived complexity). A total of 159 participants completed a 12-item version of the Raven Advanced Progressive Matrices. In addition to item number (an index of item difficulty), the findings demonstrated a positive correlation between the visual complexity of Raven items and both the mean response time and the number of fixations on the matrix (a strong correlate of response time). Moreover, more centrally placed cells as well as more complex cells received more fixations. It is concluded that response times on Raven matrices are impacted by low-level stimulus attributes, namely, visual complexity and eccentricity. ...
Journal article (2023) - Yke Bauke Eisma, Lucas van Gent, Joost de Winter
Automated vehicles need to prioritize pedestrian safety. One way to achieve this is through external human–machine interfaces (eHMIs) that send visual signals to pedestrians. eHMIs can be either text-based or light-based. However, there has been limited research on the effects of these types of eHMI on human information processing and attention allocation. This study aimed to fill this gap by using a gaze-contingent approach, which blurs the view outside a circular aperture, to test the hypothesis that text-based eHMIs, which require focused or foveal attention, result in longer response times compared to light-based eHMIs, which can be understood using peripheral vision. In this study, 23 participants watched animated video clips of traffic situations involving automated vehicles with either no eHMI, a flashing-light eHMI, or a text-based eHMI. Their eye movements were tracked, and they were asked to press the spacebar when they felt it was safe to cross the road. The results showed faster response times when an eHMI was present, with no significant difference between the two types of eHMIs. Further analysis suggested that the flashing-light eHMI captured attention briefly, while the text-based eHMI held attention for a longer period. When no eHMI was present, participants focused on the approaching vehicle for the longest time. The gaze-contingent window resulted in fewer eye movements and slower response times. In conclusion, the study showed that the gaze-contingent window negatively affected response times and eye movements, emphasizing the importance of considering peripheral vision when designing eHMIs for pedestrian safety. ...
Chunking theory and previous eye-tracking studies suggest that expert chess players use peripheral vision to judge chess positions and determine the best moves to play. However, the role of peripheral vision in chess has largely been inferred rather than tested through controlled experimentation. In this study, we used a gaze-contingent paradigm in a reconstruction task, similar to the one initially used by De Groot (1946). It was hypothesized that the smaller the gaze-contingent window while memorizing a chess position, the smaller the differences in reconstruction accuracy between novice and expert players. Participants viewed 30 chess positions for 20 seconds, after which they reconstructed this position. This was done for four different window sizes as well as for full visibility of the board. The results, as measured by Cohen’s d effect sizes between experts and novices of the proportion of correctly placed pieces, supported the above hypothesis, with experts performing much better but losing much of their performance advantage for the smallest window size. A complementary find-the-best-move task and additional eye-movement analyses showed that experts had a longer median fixation duration and more spatially concentrated scan patterns than novice players. These findings suggest a key contribution of peripheral vision and are consistent with the prevailing chunking theory. ...
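The expert-versus-novice comparison above is reported as Cohen’s d. The sketch below assumes the pooled-standard-deviation variant for two independent groups; the abstract does not state which variant was used, and the proportions shown are hypothetical rather than data from the study.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Hypothetical proportions of correctly placed pieces per participant.
experts = [0.9, 0.8, 0.7]
novices = [0.5, 0.4, 0.3]
d = cohens_d(experts, novices)
print(f"d = {d:.2f}")
```

Computing d separately for each window size would give one effect size per condition, so a shrinking d with smaller windows corresponds to the shrinking expert advantage described in the abstract.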
Journal article (2023) - Johan N. van der Meer, Yke B. Eisma, Ronald Meester, Marc Jacobs, Aart J. Nederveen
The interaction between biological tissue and electromagnetic fields (EMF) is a topic of increasing interest due to the rising prevalence of background EMF in the past decades. Previous studies have attempted to measure the effects of EMF on brainwaves using EEG recordings, but are typically hampered by experimental and environmental factors. In this study, we present a framework for measuring the impact of EMF on EEG while controlling for these factors. A Bayesian statistical approach is employed to provide robust statistical evidence of the observed EMF effects. This study included 32 healthy participants in a double-blinded crossover counterbalanced design. EEG recordings were taken from 63 electrodes across 6 brain regions. Participants underwent a measurement protocol comprising two 18-min sessions with alternating blocks of eyes open (EO) and eyes closed (EC) conditions. Group 1 (n = 16) had EMF during the first session and sham during the second session; group 2 (n = 16) had the opposite. Power spectral density plots were generated for all sessions and brain regions. The Bayesian analysis provided statistical evidence for the presence of an EMF effect in the alpha band power density in the EO condition. This measurement protocol holds potential for future research on the impact of novel transmission protocols. ...
A spectrum of control methods in human–robot interaction was investigated, ranging from direct control to telepresence with a virtual representation of the robot arm. A total of 24 participants used a setup that included a Franka Emika Panda robot arm, Varjo XR-3 head-mounted display, and Leap Motion Controller. Participants performed a box-and-block task using the bare hand (A), and under five gesture-controlled robotic operation methods: direct sight (B), sight via video-feedthrough (C), in a 3D telepresence environment with (D) and without (E) virtual representation of the robot arm, and using a 2D video feed (F). The number of grabbing attempts did not differ significantly between conditions, but local operation (B & C) yielded more transferred blocks than teleoperation (D–F). Teleoperation using a 3D presentation was advantageous compared to teleoperation using a 2D video feed, as demonstrated by lower peak forces and smaller range in gripper heights in conditions D and E compared to condition F, a finding supported by analyses of the head movement activity. Finally, the bare hand yielded the best performance and subjective ratings. In summary, teleoperation using a 3D presentation provided a smoother interaction than teleoperation with a 2D video feed. However, direct human interaction remains a benchmark yet to be surpassed. ...
Journal article (2023) - Y.B. Eisma, A. Bakay, J.C.F. de Winter
Introduction
In the 1950s and 1960s, John Senders carried out a number of influential experiments on the monitoring of multidegree-of-freedom systems. In these experiments, participants were tasked with detecting events (threshold crossings) for multiple dials, each presenting a signal with different bandwidth. Senders’ analyses showed a nearly linear relationship between signal bandwidth and the amount of attention paid to the dial, and he argued that humans sample according to bandwidth, in line with the Nyquist–Shannon sampling theorem.

Objective
The current study tested whether humans indeed sample the dials based on bandwidth alone or whether they also use salient peripheral cues.

Methods
A dial-monitoring task was performed by 33 participants. In half of the trials, a gaze-contingent window was used that blocked peripheral vision.

Results
The results showed that, without peripheral vision, humans do not effectively distribute their attention across the dials. The findings also suggest that, when given full view, humans can detect the speed of the dial using their peripheral vision.

Conclusion
It is concluded that salience and bandwidth are both drivers of distributed visual attention in a dial-monitoring task.

Application
The present findings indicate that salience plays a major role in guiding human attention. A subsequent recommendation for future human–machine interface design is that task-critical elements should be made salient. ...

The effects of automated vehicle characteristics on cyclists’ decision-making

Journal article (2022) - Pavlo Bazilinskyy, Dimitra Dodou, Yke Bauke Eisma, Willem Vlakveld, Joost de Winter
Automated vehicles (AVs) may feature blinded (i.e. blacked-out) windows and external human–machine interfaces (eHMIs), and the driver may be inattentive or absent, but how these features affect cyclists is unknown. In a crowdsourcing study, participants viewed images of approaching vehicles from a cyclist's perspective and decided whether to brake. The images depicted different combinations of traditional vehicles versus AVs, eHMI presence, vehicle approach direction, driver visibility/window-blinding, visual complexity of the surroundings, and distance to the cyclist (urgency). The results showed that the eHMI and urgency level had a strong impact on crossing decisions, whereas visual complexity had no significant influence. Blinded windows caused participants to brake for the traditional vehicle. A second crowdsourcing experiment aimed to clarify the findings of Experiment 1 by also requiring participants to detect the vehicle features. It was found that the eHMI ‘GO’ and blinded windows yielded high detection rates and that driver eye contact caused participants to continue pedalling. To conclude, blinded windows increase the probability that cyclists brake, and driver eye contact stimulates cyclists to continue cycling. Our findings, which were obtained with large international samples, may help elucidate how AVs (in which the driver may not be visible) affect cyclists’ behaviour. ...