Searched for: +
(1 - 10 of 10)
document
Brownjohn, James (author), Raby, Alison (author), Au, Siu Kui (author), Zhu, Zuo (author), Wang, Xinrui (author), Antonini, A. (author)
A set of seven rock lighthouses around the British Isles was studied by a combination of forced and ambient vibration tests executed with some extreme logistical constraints. Forced vibration testing of the circular section masonry towers combined with experimental modal analysis identified modes with alignment assumed the same as the shaker...
conference paper 2019
document
Wang, X. (author), Qiao, T. (author), Zhu, Jihua (author), Hanjalic, A. (author), Scharenborg, O.E. (author)
An estimated half of the world’s languages do not have a written form, making it impossible for these languages to benefit from any existing text-based technologies. In this paper, a speech-to-image generation (S2IG) framework is proposed which translates speech descriptions to photo-realistic images without using any text information, thus...
conference paper 2020
document
Zhu, Y. (author), Wang, H. (author), Goverde, R.M.P. (author)
Real-time railway traffic management is important for the daily operations of railway systems. It predicts and resolves operational conflicts caused by events like excessive passenger boardings/alightings. Traditional optimization methods for this problem are restricted by the size of the problem instances. Therefore, this paper proposes a...
conference paper 2020
document
Wang, X. (author), Feng, S. (author), Zhu, Jihua (author), Hasegawa-Johnson, Mark (author), Scharenborg, O.E. (author)
This paper proposes a new model, referred to as the show and speak (SAS) model that, for the first time, is able to directly synthesize spoken descriptions of images, bypassing the need for any text or phonemes. The basic structure of SAS is an encoder-decoder architecture that takes an image as input and predicts the spectrogram of speech that...
conference paper 2021
document
Wang, X. (author), Tian, Tian (author), Zhu, Jihua (author), Scharenborg, O.E. (author)
In the case of unwritten languages, acoustic models cannot be trained in the standard way, i.e., using speech and textual transcriptions. Recently, several methods have been proposed to learn speech representations using images, i.e., using visual grounding. Existing studies have focused on scene images. Here, we investigate whether fine...
conference paper 2021
document
Zhu, P. (author), Wang, Z. (author), Yang, J. (author), Hauff, C. (author), Anand, A. (author)
Quality control is essential for creating extractive question answering (EQA) datasets via crowdsourcing. Aggregation across answers, i.e. word spans within passages annotated, by different crowd workers is one major focus for ensuring its quality. However, crowd workers cannot reach a consensus on a considerable portion of questions. We...
conference paper 2022
document
Xie, Jiahong (author), Cheng, Haibo (author), Zhu, Rong (author), Wang, Ping (author), Liang, K. (author)
To date there has been little research on the semantic information of passwords, which leaves a gap that prevents us from fully understanding password characteristics and security. We propose a new password probability model for semantic information based on a Markov chain, with both generalization and accuracy, called WordMarkov, that can capture the...
conference paper 2022
document
Zhu, R. (author), Yang, M. (author), Yang, J. (author), Wang, Q. (author)
Federated Learning (FL) is an important privacy-preserving learning paradigm that is expected to play an essential role in the future Intelligent Internet of Things (IoT). However, model training in FL is vulnerable to noise and the statistical heterogeneity of local data across IoT clients. In this paper, we propose FedNaWi, a “Go Narrow, Then...
conference paper 2023
document
Yu, Fuyang (author), Wang, Zhen (author), Li, Dongyuan (author), Zhu, P. (author), Liang, Xiaohui (author), Wang, Xiaochuan (author), Okumura, Manabu (author)
Cross-modal retrieval, as an important emerging foundational information retrieval task, benefits from recent advances in multimodal technologies. However, current cross-modal retrieval methods mainly focus on the interaction between textual information and 2D images, lacking research on 3D data, especially point clouds at scene level,...
conference paper 2024
document
Zhu, P. (author), Wang, Zhen (author), Okumura, Manabu (author), Yang, J. (author)
Textbook question answering is challenging as it aims to automatically answer various questions on textbook lessons with long text and complex diagrams, requiring reasoning across modalities. In this work, we propose MRHF, a novel framework that incorporates dense passage re-ranking and the mixture-of-experts architecture for TQA. MRHF...
conference paper 2024