Layered Regression Analysis on Multimodal Approach for Personality and Job Candidacy Prediction and Explanation

Master thesis (2017)

Authors

Sukma Achmadnoer Sukma Wicaksana (Electrical Engineering, Mathematics and Computer Science)

Contributors

C.C.S. Liem (supervisor 1)

Faculty

Electrical Engineering, Mathematics and Computer Science

Multimodal, Linear model, Regression models, Personality assessment, Video resume

To reference this document use:

http://resolver.tudelft.nl/uuid:a527395d-f42c-426d-b80b-29c3b6478802

Published Date

24-08-2017

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Video blogs (vlogs) are a popular medium for people to present themselves. If a vlogger is a job candidate, vlog content can be useful for automatically assessing the candidate's traits as well as their potential interviewability. Using a dataset from the CVPR ChaLearn competition, we build a model that predicts the Big Five personality trait scores and the invite-to-interview score of vloggers, explicitly targeting explainability of the system output to people without a technical background. To keep the system transparent, we use human-explainable features as input and linear models as the building blocks of our layered architecture. This multimodal layered architecture is an enhancement of the model we initially submitted to the ChaLearn competition. Six multimodal feature representations are constructed to capture facial expression, movement, speaking pattern, and linguistic usage. Each representation is treated individually, and the resulting predictions are combined by late fusion. For each representation, correlation analysis relates the input features to the predicted traits, taking into account the significance level of Pearson's correlation coefficient. This yields two feature sets per representation: the full feature set and a subset of features whose correlation is highly significant. Three regression techniques are fitted to both feature sets of each representation to obtain the best possible model for each. The six predictions are then combined in a second regression layer to ensure fair weighting. The layered regression architecture thus lets us select the best possible model for each representation, which improves overall accuracy. As a result, our enhanced model outperforms our initial ChaLearn competition submission and other systems in the competition. Although our simple linear model is less accurate than the more complex models in the same competition, its strength lies in a more interpretable model and report description.
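
The layered procedure described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation, and everything in it is an assumption made for illustration: the data are synthetic, the abstract does not name its three regression techniques (ordinary least squares and ridge regression are used here as two stand-ins), the significance test uses a normal-approximation cutoff on the t statistic of Pearson's r, and all function and variable names are hypothetical.

```python
import numpy as np

def significant_features(X, y, t_crit=1.96):
    """Columns of X whose Pearson correlation with y is significant.

    Uses t = r * sqrt((n - 2) / (1 - r^2)) with a normal-approximation
    cutoff (assumption: two-sided 5% level, large n).
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc.T @ yc) / (np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum()))
    t = r * np.sqrt((n - 2) / (1.0 - r ** 2))
    return np.flatnonzero(np.abs(t) > t_crit)

def fit_linear(X, y, alpha=0.0):
    """Least squares with intercept; alpha > 0 gives ridge (intercept unpenalized)."""
    A = np.hstack([np.ones((X.shape[0], 1)), X])
    penalty = alpha * np.eye(A.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(A.T @ A + penalty, A.T @ y)

def predict(w, X):
    return w[0] + X @ w[1:]

def best_model(X_tr, y_tr, X_va, y_va):
    """First layer: try every feature set and technique, keep the lowest validation MAE."""
    full = np.arange(X_tr.shape[1])
    subset = significant_features(X_tr, y_tr)
    candidates = []
    for cols in (full, subset):
        if cols.size == 0:
            continue  # no significant features for this representation
        for alpha in (0.0, 1.0):  # stand-ins: OLS and ridge
            w = fit_linear(X_tr[:, cols], y_tr, alpha)
            mae = np.mean(np.abs(predict(w, X_va[:, cols]) - y_va))
            candidates.append((mae, cols, w))
    _, cols, w = min(candidates, key=lambda c: c[0])
    return cols, w

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): 6 representations with 8 features each.
n, n_reps, n_feats = 500, 6, 8
reps = [rng.normal(size=(n, n_feats)) for _ in range(n_reps)]
# The target depends on the first feature of every representation, plus noise.
y = sum(0.5 * X[:, 0] for X in reps) + rng.normal(scale=0.1, size=n)
tr, va, te = slice(0, 300), slice(300, 400), slice(400, 500)

# First layer: one selected model per representation.
first_layer = [best_model(X[tr], y[tr], X[va], y[va]) for X in reps]

def layer_one_predictions(rows):
    """One column of predictions per representation for the given rows."""
    return np.column_stack(
        [predict(w, X[rows][:, cols]) for X, (cols, w) in zip(reps, first_layer)]
    )

# Second layer (late fusion): regress the target on the six first-layer predictions.
stack_w = fit_linear(layer_one_predictions(va), y[va])
fused = predict(stack_w, layer_one_predictions(te))

fused_mae = np.mean(np.abs(fused - y[te]))
baseline_mae = np.mean(np.abs(y[tr].mean() - y[te]))  # always predict the training mean
print("second-layer weights per representation:", np.round(stack_w[1:], 2))
print("fused MAE:", fused_mae, "baseline MAE:", baseline_mae)
```

Because both layers are linear, the second-layer weights in `stack_w[1:]` can be read directly as the contribution of each representation to the final score, and the first-layer coefficients as the contribution of each selected feature, which is the kind of transparency the abstract aims for.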

Files

Layered_Regression_Analysis_on... (.pdf)

(.pdf | 2.64 Mb)