Wang, X. (author), Qiao, T. (author), Zhu, Jihua (author), Hanjalic, A. (author), Scharenborg, O.E. (author)
An estimated half of the world’s languages do not have a written form, making it impossible for these languages to benefit from any existing text-based technologies. In this paper, a speech-to-image generation (S2IG) framework is proposed that translates speech descriptions to photo-realistic images without using any text information, thus...
conference paper 2020
Wang, Bokun (author), Yang, Yang (author), Xing, Xu (author), Hanjalic, A. (author), Shen, Heng Tao (author)
Cross-modal retrieval aims to enable a flexible retrieval experience across different modalities (e.g., texts vs. images). The core of cross-modal retrieval research is to learn a common subspace where the items of different modalities can be directly compared to each other. In this paper, we present a novel Adversarial Cross-Modal Retrieval ...
conference paper 2017
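
The second abstract centers on learning a common subspace in which items from different modalities can be compared directly. The sketch below (PyTorch) illustrates only that idea, with a small modality discriminator in the spirit of adversarial cross-modal retrieval; it is not the authors' ACMR method, and the feature dimensions, layer sizes, loss weights, and the single combined update (rather than a proper minimax alternation) are all illustrative assumptions.

```python
# Hedged sketch of the common-subspace idea: project image and text features into
# one shared space, compare them with cosine similarity, and add an adversarial
# modality discriminator. Not the authors' ACMR implementation; all sizes and
# weights below are assumptions made for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Projector(nn.Module):
    """Maps modality-specific features into the shared subspace."""
    def __init__(self, in_dim: int, shared_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU(),
                                 nn.Linear(512, shared_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Unit-normalize so a dot product in the shared space is a cosine similarity.
        return F.normalize(self.net(x), dim=-1)


class ModalityDiscriminator(nn.Module):
    """Predicts whether a shared-space vector came from the image or the text side."""
    def __init__(self, shared_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(shared_dim, 128), nn.ReLU(),
                                 nn.Linear(128, 1))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z).squeeze(-1)


def ranking_loss(img_z: torch.Tensor, txt_z: torch.Tensor, margin: float = 0.2) -> torch.Tensor:
    """Triplet-style loss: matched image-text pairs should score above mismatched ones."""
    sim = img_z @ txt_z.t()                          # (B, B) cosine similarities
    pos = sim.diag().unsqueeze(1)                    # similarity of each true pair
    cost = F.relu(margin + sim - pos)                # hinge on negatives vs. the true pair
    mask = torch.eye(sim.size(0), dtype=torch.bool)  # ignore the diagonal (true pairs)
    return cost.masked_fill(mask, 0.0).mean()


# Toy usage: random vectors stand in for real image (e.g. CNN) and text features.
img_feat, txt_feat = torch.randn(8, 2048), torch.randn(8, 300)
img_proj, txt_proj, disc = Projector(2048), Projector(300), ModalityDiscriminator()
img_z, txt_z = img_proj(img_feat), txt_proj(txt_feat)

# Adversarial term: the discriminator tries to separate the modalities, while the
# sign flip pushes the projectors to make the two modalities indistinguishable.
logits = torch.cat([disc(img_z), disc(txt_z)])
labels = torch.cat([torch.ones(8), torch.zeros(8)])
d_loss = F.binary_cross_entropy_with_logits(logits, labels)

loss = ranking_loss(img_z, txt_z) - 0.1 * d_loss
loss.backward()
```

Normalizing the projections makes the shared-space dot product a cosine similarity, a common choice for retrieval scoring; the actual ACMR objective and training procedure in the paper differ from this simplified combined loss.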