N.D. Gaubitch | TU Delft Repository

Distributed TDOA-Based Indoor Source Localisation

Conference paper (2018) - Wangyang Yu, Nikolay D. Gaubitch, Richard Heusdens

Indoor localisation is an important research topic with several possible applications. For example, knowing a user's location can be used as a navigation aid in hospitals and malls, or for better targeted marketing. In this paper we consider the case where the environment of interest is equipped with several receivers (with known locations) from which time-difference-of-arrival (TDOA) measurements are obtained and used to localise the source. We present a distributed algorithm for localising the source. More specifically, we experimentally show that the distributed algorithm, which only uses time-of-arrival (TOA) measurements obtained from neighbouring receivers to calculate the TDOAs, performs as well as a centralised solution that has access to all TOA measurements in the network. In addition, we propose a method for discarding erroneous TOA measurements which considerably improves the performance in noisy and reverberant environments. ...
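To make the TOA/TDOA relationship concrete, here is a minimal sketch in Python. It is not the paper's distributed algorithm: it is a plain brute-force grid search over a 2-D room, using TDOAs formed only between neighbouring receivers arranged in a ring. The function names, the room size, the speed of sound and the grid step are all assumptions chosen for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def simulate_toas(source, receivers, emission_time=0.0, c=SPEED_OF_SOUND):
    """Noise-free TOAs. emission_time is unknown to the localiser."""
    return [emission_time + math.dist(source, r) / c for r in receivers]


def tdoa_cost(candidate, receivers, pairs, toas, c=SPEED_OF_SOUND):
    """Sum of squared range-difference errors over the given receiver pairs."""
    cost = 0.0
    for i, j in pairs:
        # Subtracting two TOAs cancels the unknown emission time.
        measured = c * (toas[i] - toas[j])
        predicted = (math.dist(candidate, receivers[i])
                     - math.dist(candidate, receivers[j]))
        cost += (measured - predicted) ** 2
    return cost


def localise_grid(receivers, pairs, toas, width, depth, step=0.1):
    """Return the grid point inside the room with the lowest TDOA cost."""
    nx = int(round(width / step))
    ny = int(round(depth / step))
    best, best_cost = None, math.inf
    for ix in range(nx + 1):
        for iy in range(ny + 1):
            candidate = (ix * step, iy * step)
            cost = tdoa_cost(candidate, receivers, pairs, toas)
            if cost < best_cost:
                best, best_cost = candidate, cost
    return best


if __name__ == "__main__":
    # Hypothetical 5 m x 4 m room with one receiver in each corner.
    receivers = [(0.0, 0.0), (5.0, 0.0), (5.0, 4.0), (0.0, 4.0)]
    ring_pairs = [(0, 1), (1, 2), (2, 3), (3, 0)]  # neighbours only
    toas = simulate_toas((1.5, 2.0), receivers, emission_time=0.123)
    print(localise_grid(receivers, ring_pairs, toas, width=5.0, depth=4.0))
```

A grid search is used here only because it is easy to follow; a practical system would more likely use an iterative least-squares solver, and would need some way of rejecting outlying TOAs, as the abstract notes.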

Room Geometry Estimation from Acoustic Echoes using Graph-Based Echo Labeling

Conference paper (2016) - Ingmar Jager, Richard Heusdens, Nikolay D. Gaubitch

A computer that can estimate the geometry of a room could benefit applications such as auralization, robot navigation, virtual reality and teleconferencing. When estimating the geometry of a room using multiple microphones, the main challenge is to identify which reflections, or echoes, originate from the same wall and can, therefore, be modeled by a virtual source outside the room using the mirror image source model. In this paper we present a new and efficient method to disambiguate the echoes using a graph-theoretical approach where echo combinations are modeled as nodes in a graph and the problem is stated as a maximum independent set problem. Once the echoes are correctly labeled, we know the locations of the virtual sources from which we can infer the room geometry. Experiments for shoebox-shaped rooms show that we can reliably estimate the room geometry within seconds on contemporary hardware and achieve centimeter precision on finding the vertices of the room. ...
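The link between virtual sources and walls can be illustrated with a short Python sketch of the first-order mirror image source model. It does not implement the graph-based echo labeling itself. It only shows that a first-order image is the real source reflected in a wall, so the wall is the perpendicular bisector plane between the two. The function names and the room dimensions are assumptions for illustration.

```python
import math


def first_order_images(source, room_dims):
    """Mirror the source in each wall of an axis-aligned shoebox room.

    The room is assumed to span [0, L] along every axis, with L taken
    from room_dims. Returns one image source per wall.
    """
    images = []
    for axis, length in enumerate(room_dims):
        for wall in (0.0, length):
            image = list(source)
            image[axis] = 2.0 * wall - source[axis]  # reflect in the wall
            images.append(tuple(image))
    return images


def wall_from_image(source, image):
    """Recover the wall plane as (unit normal n, offset d) with n . x = d.

    The wall is the perpendicular bisector of the segment that joins the
    real source and its first-order image.
    """
    diff = [b - a for a, b in zip(source, image)]
    norm = math.sqrt(sum(d * d for d in diff))
    normal = tuple(d / norm for d in diff)
    midpoint = [(a + b) / 2.0 for a, b in zip(source, image)]
    offset = sum(n * m for n, m in zip(normal, midpoint))
    return normal, offset


if __name__ == "__main__":
    source = (1.0, 2.0, 1.5)        # hypothetical source position in metres
    room = (5.0, 4.0, 3.0)          # hypothetical room dimensions in metres
    for image in first_order_images(source, room):
        print(image, wall_from_image(source, image))
```

In a real measurement the image positions are not known in advance; they have to be estimated from the labeled echo arrival times, which is the problem the abstract addresses.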

Estimation of Room Acoustic Parameters: The ACE Challenge

Journal article (2016) - James Eaton, N. D. Gaubitch, Alastair H. Moore, Patrick A. Naylor

Reverberation time (T₆₀) and direct-to-reverberant ratio (DRR) are important parameters which together can characterize sound captured by microphones in non-anechoic rooms. These parameters are important in speech processing applications such as speech recognition and dereverberation. The values of T₆₀ and DRR can be estimated directly from the acoustic impulse response (AIR) of the room. In practice, the AIR is not normally available, in which case these parameters must be estimated blindly from the observed speech in the microphone signal. The acoustic characterization of environments (ACE) challenge aimed to determine the state of the art in blind acoustic parameter estimation and also to stimulate research in this area. A summary of the ACE challenge and of the corpus used in it is presented together with an analysis of the results. Existing algorithms were submitted alongside novel contributions, the comparative results for which are presented in this paper. The challenge showed that T₆₀ estimation is a mature field where analytical approaches dominate whilst DRR estimation is a less mature field where machine learning approaches are currently more successful. ...
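For the non-blind case, where the AIR is available, the two parameters can be sketched in a few lines of Python. This is one common textbook route, not the procedure used in the ACE challenge: T₆₀ is extrapolated from a line fitted to the Schroeder backward-integrated energy decay curve between -5 dB and -25 dB, and DRR is the energy ratio between a short window around the strongest peak and the rest of the response. The function names, the fit range and the 2.5 ms half-window are assumptions.

```python
import math


def schroeder_edc_db(air):
    """Energy decay curve in dB via Schroeder backward integration."""
    edc = [0.0] * len(air)
    energy = 0.0
    for n in range(len(air) - 1, -1, -1):
        energy += air[n] ** 2
        edc[n] = energy
    total = edc[0]
    return [10.0 * math.log10(e / total) if e > 0.0 else -math.inf
            for e in edc]


def t60_from_air(air, fs, fit_start_db=-5.0, fit_end_db=-25.0):
    """Fit a line to the decay curve and extrapolate to a 60 dB decay."""
    edc_db = schroeder_edc_db(air)
    points = [(n / fs, y) for n, y in enumerate(edc_db)
              if fit_end_db <= y <= fit_start_db]
    t_mean = sum(t for t, _ in points) / len(points)
    y_mean = sum(y for _, y in points) / len(points)
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in points)
             / sum((t - t_mean) ** 2 for t, _ in points))  # dB per second
    return -60.0 / slope


def drr_db(air, fs, half_window_s=0.0025):
    """Direct-to-reverberant ratio, direct path taken around the peak."""
    peak = max(range(len(air)), key=lambda n: abs(air[n]))
    half = int(half_window_s * fs)
    lo, hi = max(0, peak - half), min(len(air), peak + half + 1)
    direct = sum(x * x for x in air[lo:hi])
    reverberant = (sum(x * x for x in air[:lo])
                   + sum(x * x for x in air[hi:]))
    return 10.0 * math.log10(direct / reverberant)


if __name__ == "__main__":
    fs, true_t60 = 8000, 0.5
    # Synthetic, noise-free decay whose energy falls 60 dB in true_t60 s.
    air = [10.0 ** (-3.0 * n / (fs * true_t60)) for n in range(int(0.75 * fs))]
    print(t60_from_air(air, fs), drr_db(air, fs))
```

A measured AIR has a noise floor, so the fit range usually has to be chosen to stay above it; the synthetic decay above sidesteps that issue.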

Real-time loudspeaker distance estimation with stereo audio

Conference paper (2015) - Jesper Kjær Nielsen, Nikolay D. Gaubitch, Richard Heusdens, Jorge Martinez, Tobias Lindstrøm Jensen

Knowledge of how a number of loudspeakers are positioned relative to a listening position can be used to enhance the listening experience. Usually, these loudspeaker positions are estimated using calibration signals, either audible or psycho-acoustically hidden inside the desired audio signal. In this paper, we propose to use the desired audio signal instead. Specifically, we treat the case of estimating the distance between two loudspeakers playing back a stereo music or speech signal. To this end, we develop a real-time maximum likelihood estimator and demonstrate that it has a variance in the millimetre range in a real environment for even a modest sampling frequency. ...

A fast method for high-resolution voiced/unvoiced detection and glottal closure/opening instant estimation of speech

Journal article (2015) - A.I. Koutrouvelis, GP Kafentzis, ND Gaubitch, R Heusdens

We propose a fast speech analysis method which simultaneously performs high-resolution voiced/unvoiced detection (VUD) and accurate estimation of glottal closure and glottal opening instants (GCIs and GOIs, respectively). The proposed algorithm exploits the structure of the glottal flow derivative in order to estimate GCIs and GOIs only in voiced speech using simple time-domain criteria. We compare our method with well-known GCI/GOI methods, namely, the dynamic programming projected phase-slope algorithm (DYPSA), the yet another GCI/GOI algorithm (YAGA) and the speech event detection using the residual excitation and a mean-based signal (SEDREAMS). Furthermore, we examine the performance of the aforementioned methods when combined with state-of-the-art VUD algorithms, namely, the robust algorithm for pitch tracking (RAPT) and the summation of residual harmonics (SRH). Experiments conducted on the APLAWD and SAM databases show that the proposed algorithm outperforms the state-of-the-art combinations of VUD and GCI/GOI algorithms with respect to almost all evaluation criteria for clean speech. Experiments on speech contaminated with several noise types (white Gaussian, babble, and car-interior) are also presented and discussed. The proposed algorithm outperforms the state-of-the-art combinations in most evaluation criteria for signal-to-noise ratios greater than 10 dB. ...