C.R.M.M. Oertel Genannt Bierbach

19 records found

The emergence of Large Language Model (LLM)-based agents represents a significant advancement in artificial intelligence (AI), offering new possibilities for complex problem-solving and interaction within a virtual environment. Our work is based on the Voyager paper [1], whi ...

Nuances of Interrater Agreement on Automatic Affect Prediction from Physiological Signals

A Systematic Review of Datasets Presenting Various Agreement Measures and Affect Representation Schemes

This study explores the influence of interrater agreement measures and affect representation schemes in automatic affect prediction systems using physiological signals. These systems often use supervised learning and require unambiguous and objective labeling, a challenge when mu ...
Human-computer interaction has long been the focus of technological evolution; however, in order for this type of system to reach its peak potential, machines must recognize that humans are constantly influenced by emotions. Text affective content analysis models are one attempt ...
Emotional datasets for automatic affect prediction usually employ raters to annotate emotions or verify the annotations. To ensure the reliability of these raters, some use interrater agreement measures to verify the degree to which annotators agree with each other on what they r ...
With the rise in the number of human-computer interactions, the need for systems that can accurately infer and respond to users' emotions becomes increasingly important. One can achieve this by examining audio-visual signals, aiming to identify the underlying emotions from an ind ...
Understanding how users retrospectively evaluate their interactions with adaptive intelligent systems is crucial to improving their behaviours during interactions. Prior work has shown the potential to predict retrospective evaluations based on different real-time aspects of conv ...
Automatic Speech Recognition (ASR) systems have become increasingly important for society, yet their performance varies significantly across diverse speaker groups. With a significant non-native population in the Netherlands, it is crucial that ASR systems accurately re ...
Automatic Speech Recognition (ASR) systems are found in many places and are used by many people. Some groups of people, specifically older Dutch adults, are recognized less well by these systems. Given the aging population of the Netherlands, it would be beneficial to have ASR s ...

How Good Are State-of-the-Art Automatic Speech Recognition Systems in Recognizing Dutch Diverse Speech?

An Evaluation of Meta MMS and OpenAI Whisper on Native and Non-Native Dutch Speech

Automatic speech recognition (ASR) is increasingly used in daily applications, such as voice-activated virtual assistants like Siri and Alexa, real-time transcription for meetings and lectures, and voice commands for smart home devices. However, studies show that even state-of-th ...

Comparing performance of ASR systems on native Dutch children and teenagers: Google vs. Microsoft

Evaluating Speech Recognition Accuracy of state-of-the-art ASR models on Dutch child and teenager speech

Automatic Speech Recognition (ASR) technology is becoming more and more useful in everyday life, therefore also requiring higher accuracy across all user demographics. This study compares the performance of Google's and Microsoft's ASR systems on native Dutch child and t ...
Existing content-based image retrieval models work well for natural photos, but not for images of architectural floor plans. Previous work on floor plan retrieval has focused on graph-based methods, rather than image-based floor plans. Training a CNN-based representation ...

In the Netherlands, there is a shortage of primary school teachers. Due to this shortage, teachers often do not have much one-on-one time with the students. A social robot could be the solution to creating more one-on-“one” time with the students. In addi ...

In this work, we present VisuaLayered, the implementation of a combined analysis workflow for pigment identification. VisuaLayered is an integrated, interactive system that focuses on the combined visual analysis of Macro X-Ray Fluorescence (MA-XRF) and Reflectance Imaging Spectr ...
Hate speech detection on social media platforms remains a challenging task. Manual moderation by humans is the most reliable but infeasible, and machine learning models for detecting hate speech are scalable but unreliable as they often perform poorly on unseen data. Therefore, h ...
The implementation of social robots in the healthcare industry is becoming substantial as a consequence of the scarcity of healthcare professionals, rising costs of healthcare and an increase in the number of vulnerable populations. Social robots will be deployed, in increasing n ...
Continuous affective self-reports are intrusive and expensive to acquire, forcing researchers to use alternative labels for the construction of their predictive models. The most commonly used labels in the literature are continuous perceived affective labels obtained using exter ...

Word recognition in a model of visually grounded speech

An analysis using techniques inspired by human speech processing research

A Visually Grounded Speech model is a neural model which is trained to embed image caption pairs closely together in a common embedding space. As a result, such a model can retrieve semantically related images given a speech caption and vice versa. The purpose of this research is ...
There has been a big increase in the use of social robots, such as Pepper, which use verbal communication as the main method of interacting with a human. Verbal communication with a robot is performed using Automatic Speech Recognition (ASR) to recognize words from an audio strea ...
In this thesis, the automatic multimodal detection of social and task cohesion in meetings is studied. The presence of social and task cohesion benefits employee well-being, creativity and productivity, and can therefore be used to assess meeting quality. Convers ...