Extracting emotions from face-to-face communication

Abstract

In our lives we have become increasingly dependent on computers, leaving less time for face-to-face social activities with friends and family. In face-to-face communication our faces convey a great deal of emotion through facial expressions, and much information is transmitted faster non-verbally than verbally. The use of several modalities, such as facial expressions, speech and (hand and body) gestures, makes communication between humans multimodal. The current communication style between human and computer, however, is still dominated by keyboard and mouse, leaving no room for emotions and facial expressions. To gain a better understanding of human-human communication and to improve human-computer interaction, it is essential to identify and describe the different modalities of human-human communication, including the collection and annotation of multimodal data. Combining facial expressions, (hand and body) gestures, speech recognition and content awareness makes the interaction multimodal and enables the computer to adapt itself to the needs of individual users.

In this report we study one of these important modalities, facial expressions, and propose an algorithm for tracking facial expressions in face-to-face communication. To discover the relation between different facial expressions and their meaning, we needed recordings of human face-to-face communication in which the facial expressions during the interaction could be analyzed. After some research we decided to make our own recordings and build our own database. Our research problem is to localize facial expressions, to label them and to investigate their impact on the communication. To facilitate the localization we placed markers on the faces of the test persons during the experiment. We asked three observers (annotators) to watch the recordings of the face-to-face communication and to label the segments that contain an emotion. We calculated the level of agreement between the annotators, compared their results with each other, and finally used the labeled segments in our model to extract the features of each emotion.

Using our model we followed the changes of the face during the facial expressions to collect facial features, and we used these features to define emotion clusters. With these clusters we can start building a tool for automatically extracting emotions from face-to-face communication. Defining enough features to recognize all emotions requires a much larger database; in this project we extracted features for a number of basic emotions and defined clusters to recognize them, for example distinguishing 'sadness' from 'happiness' and 'anger' from 'surprise'. To build more emotion clusters and recognize more emotions, more annotated data is needed.
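
The abstract mentions calculating the level of agreement between the three annotators but does not name the measure used. As an illustration only, the sketch below computes Fleiss' kappa, a common agreement statistic for three or more annotators, over per-segment emotion labels; the segment data and label set are invented for the example.

    from collections import Counter

    def fleiss_kappa(label_matrix, categories):
        # label_matrix: one list of annotator labels per annotated segment,
        # e.g. the labels the three observers gave to the same video segment.
        n_items = len(label_matrix)
        n_raters = len(label_matrix[0])

        # How many annotators chose each emotion category for each segment.
        counts = [[Counter(labels)[cat] for cat in categories]
                  for labels in label_matrix]

        # Per-segment agreement and overall category proportions.
        p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
               for row in counts]
        p_j = [sum(row[j] for row in counts) / (n_items * n_raters)
               for j in range(len(categories))]

        p_bar = sum(p_i) / n_items        # mean observed agreement
        p_e = sum(p * p for p in p_j)     # agreement expected by chance
        return (p_bar - p_e) / (1 - p_e)

    # Invented example: three annotators labeling four segments.
    segments = [
        ["happiness", "happiness", "happiness"],
        ["anger", "surprise", "anger"],
        ["sadness", "sadness", "sadness"],
        ["surprise", "surprise", "anger"],
    ]
    emotions = ["happiness", "sadness", "anger", "surprise"]
    print("Fleiss' kappa:", round(fleiss_kappa(segments, emotions), 3))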
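
The abstract also does not describe how the collected facial features are turned into emotion clusters. The following is a minimal sketch, assuming per-segment marker-displacement vectors as features and k-means (via scikit-learn) as the clustering step; both the feature layout and the choice of k-means are assumptions for illustration, not the method actually used in the project.

    import numpy as np
    from sklearn.cluster import KMeans

    # Hypothetical feature vectors: per-segment marker displacements relative
    # to a neutral face (e.g. eyebrow raise, mouth-corner pull, mouth opening),
    # one row per labeled segment from the recordings.
    features = np.array([
        [0.1, 0.8, 0.3],   # segments annotated as 'happiness'
        [0.2, 0.7, 0.4],
        [0.0, -0.6, 0.1],  # 'sadness'
        [0.1, -0.5, 0.0],
        [0.9, -0.2, 0.6],  # 'anger'
        [0.8, -0.3, 0.5],
        [0.7, 0.1, 0.9],   # 'surprise'
        [0.6, 0.2, 0.8],
    ])

    # Group the segments into emotion clusters; using one cluster per basic
    # emotion considered here is itself an assumption.
    kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(features)
    print("cluster assignment per segment:", kmeans.labels_)

    # A new, unlabeled segment would then be recognized by the cluster its
    # feature vector falls into.
    new_segment = np.array([[0.15, 0.75, 0.35]])
    print("predicted cluster:", kmeans.predict(new_segment)[0])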