Searched for: subject:"automatically"
(1 - 1 of 1)
Scholten, J.S.M. (author)
A Visually Grounded Speech model is a neural model trained to embed image-caption pairs close together in a common embedding space. As a result, such a model can retrieve semantically related images given a spoken caption, and vice versa. The purpose of this research is to investigate whether and how a Visually Grounded Speech model...
master thesis 2020