Wang, X. (author), Xie, Qicong (author), Xie, Lei (author), Zhu, Jihua (author), Scharenborg, O.E. (author)
Automatically generating videos in which synthesized speech is synchronized with lip movements in a talking head has great potential in many human-computer interaction scenarios. In this paper, we present an automatic method to generate synchronized speech and talking-head videos on the basis of text and a single face image of an arbitrary...
journal article 2023
Wang, X. (author), van der Hout, Justin (author), Zhu, Jihua (author), Hasegawa-Johnson, Mark (author), Scharenborg, O.E. (author)
Image captioning technology has great potential in many scenarios. However, current text-based image captioning methods cannot be applied to approximately half of the world's languages due to these languages’ lack of a written form. To solve this problem, recently the image-to-speech task was proposed, which generates spoken descriptions of...
journal article 2021