E.H.J. Riemens

3 records found

Conference paper (2024) - Hanyuan Ban, Ellen H.J. Riemens, Raj Thilak Rajan
Gaussian process regression (GPR) is a powerful non-parametric approach for data modeling that has garnered considerable interest in the past decade. However, its widespread application is impeded by the significant computational burden for larger datasets: the computational complexity of both inference and hyperparameter learning in GPs scales as O(N^3) for N training points. Current state-of-the-art approximations based on structured kernel interpolation (SKI), e.g., Kernel Interpolation for Scalable Structured Gaussian Process (KISSGP), have emerged to mitigate this challenge by providing a scalable inducing-point alternative. However, the number of grid points, which influences the accuracy and efficiency of the model, typically remains fixed and is chosen arbitrarily. In this work, we introduce a novel approximation framework, Malleable KISSGP (MKISSGP), which dynamically adjusts the grid points using a new model hyperparameter called density, which adapts to the changes in the kernel hyperparameters in each training iteration. In comparison with the state-of-the-art KISSGP, and irrespective of changes in hyperparameters, our proposed MKISSGP algorithm exhibits consistent error levels in the reconstruction of the kernel matrix and offers reduced computational complexity. We present extensive simulations to validate the improved performance of the proposed MKISSGP, and give directions for future research. ...
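
To illustrate why the number of grid points matters in SKI-style methods, the following minimal sketch approximates a kernel matrix as W K_UU W^T on a regular 1-D grid and measures the worst entrywise reconstruction error. It is not the authors' MKISSGP implementation: the squared-exponential kernel, the linear (rather than cubic) interpolation, and all names such as `ski_max_error` are assumptions made for illustration.

```python
import math

def rbf(a, b, lengthscale):
    """Squared-exponential kernel between two scalars (assumed kernel choice)."""
    return math.exp(-0.5 * ((a - b) / lengthscale) ** 2)

def interp_weights(x, grid):
    """Linear interpolation weights of x on a regular grid: one sparse row of W."""
    step = grid[1] - grid[0]
    i = min(int((x - grid[0]) / step), len(grid) - 2)
    frac = (x - grid[i]) / step
    return {i: 1.0 - frac, i + 1: frac}

def ski_max_error(xs, num_grid, lengthscale):
    """Largest entrywise gap between the exact kernel matrix and W K_UU W^T."""
    lo, hi = min(xs), max(xs)
    grid = [lo + (hi - lo) * j / (num_grid - 1) for j in range(num_grid)]
    rows = [interp_weights(x, grid) for x in xs]
    worst = 0.0
    for a, ra in zip(xs, rows):
        for b, rb in zip(xs, rows):
            # (W K_UU W^T)[a, b] using only the two nonzero weights per row
            approx = sum(wa * wb * rbf(grid[i], grid[j], lengthscale)
                         for i, wa in ra.items() for j, wb in rb.items())
            worst = max(worst, abs(rbf(a, b, lengthscale) - approx))
    return worst

xs = [0.1 * n for n in range(41)]  # 41 hypothetical training inputs on [0, 4]
coarse = ski_max_error(xs, num_grid=5, lengthscale=0.5)
fine = ski_max_error(xs, num_grid=41, lengthscale=0.5)
print(f"max error with 5 grid points:  {coarse:.3g}")
print(f"max error with 41 grid points: {fine:.3g}")
```

In this toy setting the coarse grid reconstructs the kernel matrix poorly because the grid spacing exceeds the lengthscale, which is the kind of coupling between grid size and kernel hyperparameters that the abstract describes.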

A LiDAR-aided approach for detection of acoustically reflective surfaces from microphone measurements

Loudspeakers are placed in an environment unknown to the loudspeaker designers. The room influences the acoustic experience for the user. Having information about the room makes it possible to better reproduce the sound field as intended. Using microphone measurements, the location of acoustic reflectors can be inferred. Current state-of-the-art methods for room boundary detection focus on a two-dimensional setting. Detection of arbitrary reflectors in three dimensions increases complexity due to practical limitations, i.e. the need for a spherical array and the higher computational cost. The presence of horizontal reflectors causes inaccuracy in wall detection due to model mismatch. Loudspeakers may not have an omnidirectional directivity pattern, as is usually assumed in the literature, which makes the detection of acoustic reflectors in some directions more challenging.

In this thesis, a LiDAR sensor is added to a smart loudspeaker to improve wall detection accuracy and robustness. This is done in two ways.
First, the LiDAR sensor is used to detect the horizontal reflectors that are not present in the acoustic model, so that their detrimental influence can be eliminated. Second, a method is proposed that uses the LiDAR sensor to compensate for the regions in which wall detection is challenging for highly directive loudspeakers. Experimental results from different simulated scenarios are shown to compare the proposed method with the state-of-the-art method, which uses acoustic information exclusively. ...
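
As background on how a reflector location relates to microphone measurements, the sketch below uses the generic image-source model: a first-order reflection off a planar wall arrives as if emitted by the source mirrored across that wall. This is a textbook illustration, not the method proposed in the thesis; the geometry, the speed of sound, and the function names are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def mirror_across_plane(point, normal, offset):
    """Mirror a 3-D point across the plane {p : normal . p = offset} (unit normal)."""
    dist = sum(n * p for n, p in zip(normal, point)) - offset
    return tuple(p - 2.0 * dist * n for p, n in zip(point, normal))

def reflection_delay(source, mic, normal, offset):
    """Arrival time in seconds of the first-order reflection off the plane."""
    image = mirror_across_plane(source, normal, offset)
    return math.dist(image, mic) / SPEED_OF_SOUND

# Hypothetical geometry: loudspeaker and microphone 0.5 m apart, wall at x = 4 m.
source = (1.0, 2.0, 1.0)
mic = (1.5, 2.0, 1.0)
wall_normal, wall_offset = (1.0, 0.0, 0.0), 4.0

direct = math.dist(source, mic) / SPEED_OF_SOUND
echo = reflection_delay(source, mic, wall_normal, wall_offset)
print(f"direct path: {direct * 1e3:.2f} ms, wall echo: {echo * 1e3:.2f} ms")
```

Inverting this relation, i.e. estimating the plane from measured echo delays, is the harder problem; a floor or ceiling that is missing from the model produces extra echoes that can be mistaken for walls.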

Investigating the influence of different speech modifications on the intelligibility of speech in near-end noise

Several algorithms to enhance the intelligibility of speech in near-end noise were analyzed and implemented. The algorithms were assessed with the intrusive instrumental intelligibility metric SIIB_Gauss. An implementation based on direct optimization of this metric is assessed, as well as an implementation based on human-induced speech modifications, including increased sound intensity, flattening of the spectral tilt, increased vowel duration and increased consonant-vowel ratio. Another implemented algorithm amplifies the transient component of speech. Results show that increased vowel duration decreased intelligibility, both in the SIIB_Gauss value and in informal listening tests. The other implementations did show an increase in intelligibility according to SIIB_Gauss at SNRs between -4 dB and 6 dB in both stationary and fluctuating noise, under a power constraint. Finally, the implementations were combined into a system that automatically selects the optimal algorithm to use under the given noise conditions. It is shown that this combined system is able to increase the intelligibility of speech in the presence of non-fluctuating noise, fluctuating noise, speech-shaped noise, and competing-speaker noise. ...
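
The power constraint mentioned above can be sketched as follows: after any modification, the speech is rescaled so that its average power matches that of the unmodified speech, which keeps the SNR unchanged and makes the comparison fair. This is a minimal illustration under assumed definitions (average sample power, synthetic sinusoids standing in for speech and noise); it does not compute SIIB_Gauss and is not the thesis implementation.

```python
import math

def power(signal):
    """Average power of a sampled signal."""
    return sum(s * s for s in signal) / len(signal)

def enforce_power_constraint(modified, original):
    """Rescale the modified speech to the average power of the original."""
    gain = math.sqrt(power(original) / power(modified))
    return [gain * s for s in modified]

def snr_db(speech, noise):
    """Signal-to-noise ratio in decibels."""
    return 10.0 * math.log10(power(speech) / power(noise))

# Synthetic stand-ins: a real experiment would use recorded speech and noise.
original = [math.sin(0.05 * n) for n in range(2000)]
noise = [0.5 * math.sin(0.31 * n + 1.0) for n in range(2000)]
boosted = [3.0 * s for s in original]  # placeholder for a speech modification
constrained = enforce_power_constraint(boosted, original)

print(f"SNR before modification: {snr_db(original, noise):.2f} dB")
print(f"SNR after constraint:    {snr_db(constrained, noise):.2f} dB")
```

Under this constraint a pure increase in level yields no gain at all, so any improvement has to come from redistributing energy over time or frequency.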