Laughter in Motion: Pose-Based Detection Across Annotation Modalities in Natural Social Interactions

Investigating the impact of annotation modality on detecting laughter in the wild

Bachelor Thesis (2025)
Author(s)

V. Guenov (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

H.S. Hung – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

L. Li – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

S. Tan – Mentor (TU Delft - Interactive Intelligence)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2025
Language
English
Coordinates
52.002200, 4.373600
Graduation Date
05-11-2025
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project; Multimodal Machine Learning Techniques for Analyzing Laughter and Drinking in Spontaneous Social Encounters
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Laughter is a complex multimodal behavior and one of the most essential aspects of social interaction. Although previous research has used both auditory and facial cues for laughter detection, these approaches commonly struggle in noisy, occluded, and privacy-sensitive settings. This paper explores body posture alone, captured through 2D keypoint estimation, as a robust signal for automatic laughter detection in naturalistic settings. Using the ConfLab dataset, we build a machine learning pipeline that segments pose data, extracts motion-based features, and trains Random Forest classifiers across annotation modalities (audio-only, video-only, and audiovisual) and segmentation methods (fixed and variable length). We show that, while variable-length segmentation yields the best raw performance, it is prone to overfitting. Fixed-duration segmentation with three-second windows and audiovisual annotations, by contrast, offers a pragmatic compromise, reaching F1-scores (65%) comparable to earlier efforts in ideal environments. Feature importance analysis identifies upper-body movement, especially head and arm motion, as a salient cue to laughter. Annotation modality is also found to significantly affect both classification performance and the relative importance of pose features. These findings demonstrate the viability of pose-based laughter detection and reveal how annotation choices shape model behavior, offering insights for affective computing in the wild.
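The abstract describes the pipeline only at a high level; the sketch below illustrates what the fixed-window variant could look like, assuming per-frame 2D keypoints and binary laughter labels are already available as arrays. The frame rate, joint count, feature set, and helper names (motion_features, segment_fixed) are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

FPS = 30                  # assumed frame rate; not specified in the abstract
WINDOW = 3 * FPS          # fixed three-second windows, as described in the abstract

def motion_features(window):
    """Per-joint motion statistics: mean and std of frame-to-frame displacement."""
    # window: (frames, joints, 2) array of 2D keypoints
    disp = np.linalg.norm(np.diff(window, axis=0), axis=-1)   # (frames-1, joints)
    return np.concatenate([disp.mean(axis=0), disp.std(axis=0)])

def segment_fixed(pose, labels):
    """Slice a pose sequence into non-overlapping fixed windows with majority-vote labels."""
    X, y = [], []
    for start in range(0, len(pose) - WINDOW + 1, WINDOW):
        X.append(motion_features(pose[start:start + WINDOW]))
        y.append(int(labels[start:start + WINDOW].mean() > 0.5))
    return np.array(X), np.array(y)

# Placeholder data standing in for ConfLab poses and per-frame laughter annotations.
pose = np.random.rand(9000, 17, 2)                 # five minutes at 30 fps, 17 joints
labels = (np.random.rand(9000) > 0.5).astype(int)  # hypothetical binary labels

X, y = segment_fixed(pose, labels)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
# Feature importances support the kind of analysis the abstract reports,
# i.e. which joints' motion statistics drive the classification.
print(clf.feature_importances_.argsort()[::-1][:5])
```

In the thesis's framing, variable-length segmentation and the three annotation modalities would enter as alternative versions of the segmentation and labelling steps shown above.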

Files

Guenov_-_Thesis.pdf
(PDF | 1.5 MB)
License info not available