Jingkuan Song | TU Delft Repository

Unified Binary Generative Adversarial Network for Image Retrieval and Compression

Journal article (2020) - Jingkuan Song (author) , Tao He (author) , Lianli Gao (author) , Xu Xu (author) , A. Hanjalic (author) , Heng Tao Shen (author)

Binary codes have often been deployed to facilitate large-scale retrieval tasks, but not that often for image compression. In this paper, we propose a unified framework, BGAN+, that restricts the input noise variable of generative adversarial networks to be binary and conditioned ...

Matching images and text with multi-modal tensor fusion and re-ranking

Conference paper (2019) - Tan Wang (author) , A. Hanjalic (author) , Xu Xu (author) , Heng Tao Shen (author) , Yang Yang (author) , Jingkuan Song (author)

A major challenge in matching images and text is that they have intrinsically different data distributions and feature representations. Most existing approaches are based either on embedding or classification, the first one mapping image and text instances into a common embedding ...

From Deterministic to Generative

Multimodal Stochastic RNNs for Video Captioning

Journal article (2018) - Jingkuan Song (author) , Yuyu Guo (author) , Lianli Gao (author) , Xuelong Li (author) , A Hanjalic (author) , Heng Tao Shen (author)

Video captioning, in essential, is a complex natural process, which is affected by various uncertainties stemming from video content, subjective judgment, and so on. In this paper, we build on the recent progress in using encoder-decoder framework for video captioning and address ...

Binary Generative Adversarial Networks for Image Retrieval

Conference paper (2018) - Jingkuan Song (author) , Tao He (author) , Lianli Gao (author) , Xu Xu (author) , A. Hanjalic (author) , Heng Tao Shen (author)

The most striking successes in image retrieval using deep hashing have mostly involved discriminative models, which require labels. In this paper, we use binary generative adversarial networks (BGAN) to embed images to binary codes in an unsupervised way. By restricting the input ...