A systematic comparison of commonsense knowledge usages between natural language processing (NLP) and computer vision (CV)
A.S. Kuiper (TU Delft - Electrical Engineering, Mathematics and Computer Science)
G. He – Mentor (TU Delft - Web Information Systems)
Jie Yang – Mentor (TU Delft - Web Information Systems)
U.K. Gadiraju – Mentor (TU Delft - Web Information Systems)
Geert-Jan Houben – Graduation committee member (TU Delft - Web Information Systems)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Commonsense knowledge is key to human intelligence: it lets people generalize what they know to deal with complex tasks. Over the past years, much research in both natural language processing (NLP) and computer vision (CV) has focused on leveraging commonsense knowledge to improve AI models. However, no systematic comparison of existing work has been made between the two domains. This survey therefore aims to provide an overview of how commonsense knowledge is used within NLP and CV, how research differs between the two domains, and what future challenges lie ahead. One observation from this survey is that leveraging commonsense knowledge is more difficult in CV than in NLP, because commonsense knowledge is mostly incorporated in textual form and datasets need to be filtered to make them more relevant for visual commonsense. With this survey, we hope to promote further research and create a better understanding of commonsense knowledge and its applications.