Yang Yang | TU Delft Repository

Joint Feature Synthesis and Embedding

Adversarial Cross-Modal Retrieval Revisited

Journal article (2022) - Xing Xu (author), Kaiyi Lin (author), Yang Yang (author), A. Hanjalic (author), Heng Tao Shen (author)

Recently, the generative adversarial network (GAN) has shown a strong ability to model data distributions via adversarial learning. Cross-modal GAN, which attempts to utilize the power of GAN to model the cross-modal joint distribution and to learn compatible cross-modal features, is becoming a research hotspot. However, the existing cross-modal GAN approaches typically 1) require labeled multimodal data, obtained at massive labor cost, to establish cross-modal correlation; 2) utilize the vanilla GAN model, which results in an unstable training procedure and meaningless synthetic features; and 3) lack extensibility for retrieving cross-modal data of new classes. In this article, we revisit the adversarial learning in existing cross-modal GAN methods and propose Joint Feature Synthesis and Embedding (JFSE), a novel method that jointly performs multimodal feature synthesis and common embedding space learning to overcome the above three shortcomings. Specifically, JFSE deploys two coupled conditional Wasserstein GAN modules for the input data of two modalities, to synthesize meaningful and correlated multimodal features under the guidance of the word embeddings of class labels. Moreover, three advanced distribution alignment schemes with advanced cycle-consistency constraints are proposed to preserve the semantic compatibility and enable knowledge transfer in the common embedding space for both the true and synthetic cross-modal features. All these add-ons in JFSE not only help to learn a more effective common embedding space that captures the cross-modal correlation but also facilitate knowledge transfer to multimodal data of new classes. Extensive experiments are conducted on four widely used cross-modal datasets, and the comparisons with more than ten state-of-the-art approaches show that our JFSE method achieves remarkable accuracy improvement on both standard retrieval and the newly explored zero-shot and generalized zero-shot retrieval tasks.

Interactions between a magnon mode and a cavity photon mode mediated by traveling photons

Journal article (2020) - J. W. Rao (author), Y Wang (author), Yang Yang (author), Tao Yu (author), Y. S. Gui (author), X. Fan (author), D. S. Xue (author), Can Ming Hu (author)

We systematically study the indirect interaction between a magnon mode and a cavity photon mode mediated by traveling photons of a waveguide. From a general Hamiltonian, we derive the effective coupling strength between two separated modes, and obtain the theoretical expression o ...

Radial Graph Convolutional Network for Visual Question Generation

Journal article (2020) - Xu Xu (author), Tan Wang (author), Yang Yang (author), A. Hanjalic (author), Heng Tao Shen (author)

In this article, we address the problem of visual question generation (VQG), a challenge in which a computer is required to generate meaningful questions about an image targeting a given answer. The existing approaches typically treat the VQG task as a reversed visual question an ...

Macro-meso dynamic analysis of railway transition zone: Hybrid DEM/FDM simulation and experimental validation

Journal article (2020) - Can Shi (author), Chunfa Zhao (author), Yang Yang (author), Y. Guo (author), Xu Zhang (author), Yang Feng (author)

To probe the mechanical behaviour of the railway transition zone from the macro-meso aspects, a numerical model of the transition zone is built that combines the Discrete Element Method (DEM) and the Finite Difference Method (FDM). The DEM is utilised to simulate the ballast bed and slee ...

Matching images and text with multi-modal tensor fusion and re-ranking

Conference paper (2019) - Tan Wang (author), A. Hanjalic (author), Xu Xu (author), Heng Tao Shen (author), Yang Yang (author), Jingkuan Song (author)

A major challenge in matching images and text is that they have intrinsically different data distributions and feature representations. Most existing approaches are based either on embedding or classification, the first one mapping image and text instances into a common embedding ...

Denoising controlled-source electromagnetic data using least-squares inversion

Journal article (2018) - Yang Yang (author), Diquan Li (author), Tiegang Tong (author), D. Zhang (author), Yatong Zhou (author), Yangkang Chen (author)

Strong noise is one of the toughest problems in the controlled-source electromagnetic (CSEM) method, which highly affects the quality of recorded data. The three main types of noise existing in CSEM data are periodic noise, Gaussian white noise, and nonperiodic noise, among which ...

Video Captioning by Adversarial LSTM

Journal article (2018) - Yang Yang (author), Jie Zhou (author), Jiangbo Ai (author), Yi Bin (author), A. Hanjalic (author), Heng Tao Shen (author)

In this paper, we propose a novel approach to video captioning based on adversarial learning and long short-term memory (LSTM). With this solution concept, we aim at compensating for the deficiencies of LSTM-based video captioning methods that generally show potential to effectiv ...

Adversarial Cross-Modal Retrieval

Conference paper (2017) - Bokun Wang (author), Yang Yang (author), Xing Xu (author), A. Hanjalic (author), Heng Tao Shen (author)

Cross-modal retrieval aims to enable a flexible retrieval experience across different modalities (e.g., texts vs. images). The core of cross-modal retrieval research is to learn a common subspace where the items of different modalities can be directly compared to each other. In this ...

An implicit switching model for distribution network reliability assessment

Conference paper (2016) - Yang Yang (author), Simon H. Tindemans (author), G. Strbac (author)

Modern active distribution networks make use of intelligent switching actions to restore supply to end users after faults. This complicates the reliability analysis of such networks, as the number of possible switching actions grows exponentially with network size. This paper pro ...