A. Koutrouvelis

14 records found

Conference paper (2019) - Andreas I. Koutrouvelis, Richard C. Hendriks, Richard Heusdens, Jesper Jensen
Factor analysis is a popular tool in multivariate statistics, applied in several areas of study such as psychology, economics, chemistry and signal processing. Given a set of observed random variables, factor analysis aims at explaining and analyzing the correlation between these random variables. This is done by finding a meaningful structural model representation for the correlation matrix of the observed random variables, and subsequently estimating the underlying model parameters. In this paper, we focus on factor analysis methods applied to a commonly used signal model for sensor array applications and use it to jointly estimate the underlying model parameters. In addition, we discuss practical considerations of these methods. ...
Conference paper (2019) - Andreas I. Koutrouvelis, Richard C. Hendriks, Richard Heusdens, Jesper Jensen, Meng Guo
While the majority of binaural beamformers aim to minimize the output noise power while (approximately) preserving the binaural cues of the sources using constraints, we propose in this paper to minimize the binaural-cue distortions of the sources in the acoustic scene, such that the output noise power is below a predefined threshold. This new problem formulation is a convex quadratically constrained quadratic program (QCQP), which leads to an efficient trade-off between noise reduction, binaural-cue preservation and complexity. In particular, the proposed beamformer provides a better trade-off between noise reduction and binaural-cue preservation (in terms of interaural level and phase differences) compared to the well-known binaural minimum variance distortionless response-η beamformer. ...
Journal article (2019) - Andreas I. Koutrouvelis, Richard C. Hendriks, Richard Heusdens, Jesper Jensen
One of the biggest challenges in multimicrophone applications is the estimation of the parameters of the signal model, such as the power spectral densities (PSDs) of the sources, the early (relative) acoustic transfer functions of the sources with respect to the microphones, the PSD of late reverberation, and the PSDs of microphone self-noise. Typically, existing methods estimate subsets of the aforementioned parameters and assume some of the other parameters to be known a priori. This may result in inconsistencies, inaccurately estimated parameters, and potential performance degradation in the applications using these estimated parameters. So far, there is no method to jointly estimate all the aforementioned parameters. In this paper, we propose a robust method for jointly estimating all the aforementioned parameters using confirmatory factor analysis. The estimation accuracy of the signal-model parameters thus obtained outperforms existing methods in most cases. We experimentally show significant performance gains in several multimicrophone applications over state-of-the-art methods. ...
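For orientation, the parameters listed in this abstract can be collected in one covariance model. The form below is an assumption: it is a structure commonly used in the multimicrophone literature, not one stated in the abstract, and the symbols (A, P, γ, Φ, Q) are chosen here for illustration only.

```latex
% Assumed per-frequency-bin model of the M x M microphone cross-PSD matrix:
%   A      : M x r matrix of early relative acoustic transfer functions (one column per source)
%   P      : r x r diagonal matrix of source PSDs
%   \gamma : PSD of late reverberation
%   \Phi   : M x M spatial coherence matrix of late reverberation (assumed known)
%   Q      : M x M diagonal matrix of microphone self-noise PSDs
\mathbf{P}_{\mathbf{y}}
  = \mathbf{A}\,\mathbf{P}\,\mathbf{A}^{H}
  + \gamma\,\boldsymbol{\Phi}
  + \mathbf{Q}
```

Under this assumed structure, joint estimation means fitting A, P, γ and Q together to a sample estimate of the cross-PSD matrix, rather than fixing some of them in advance.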
In this letter, we propose a decentralized framework for rate-distributed linearly constrained minimum variance (LCMV) beamforming in wireless acoustic sensor networks. To save energy within the network, we propose to minimize the transmission cost and put a constraint on the noise reduction performance. Subsequently, we decentralize the obtained LCMV filter structure by exploiting an imposed block diagonal form of the noise correlation matrix. As a result, the beamformer weights are calculated in a decentralized fashion and each node can determine its quantization rate locally. Finally, numerical results validate the proposed method. ...
Binaural cues are important for sound localization. In addition, spatially separated sound sources are more intelligible than when they are co-located. Binaural cue preservation in multi-microphone hearing assistive devices is therefore important for the user's listening experience and safety. A number of linearly-constrained-minimum-variance (LCMV) based methods exist for this purpose. These are all limited in the number of sources for which they can preserve the binaural cues. We propose a method of automatically selecting the most important interfering sources using convex optimization. The proposed method is compared, using simulation experiments, to existing methods in terms of noise suppression and localization errors. It improves the performance of the joint binaural LCMV beamformer by giving it more degrees of freedom for noise reduction, and it allows a larger number of (virtual) sources to be present in the scene. ...
Journal article (2019) - Andreas I. Koutrouvelis, Richard Christian Hendriks, Richard Heusdens, Jesper Jensen
The recently proposed relaxed binaural beamforming (RBB) optimization problem provides a flexible tradeoff between noise suppression and binaural-cue preservation of the sound sources in the acoustic scene. It minimizes the output noise power under constraints that guarantee that the target remains unchanged after processing and that the binaural-cue distortions of the acoustic sources are below a user-defined threshold. However, the RBB problem is a computationally demanding nonconvex optimization problem. The only existing suboptimal method that approximately solves the RBB problem is a successive convex optimization (SCO) method, which typically requires solving multiple convex optimization problems per frequency bin in order to converge. Convergence is achieved when all constraints of the RBB optimization problem are satisfied. In this paper, we propose a semidefinite convex relaxation (SDCR) of the RBB optimization problem. The proposed suboptimal SDCR method solves a single convex optimization problem per frequency bin, resulting in a much lower computational complexity than the SCO method. Unlike the SCO method, the SDCR method does not guarantee user-controlled upper-bounded binaural-cue distortions. To tackle this problem, we also propose a suboptimal hybrid method that combines the SDCR and SCO methods. Instrumental measures combined with a listening test show that the SDCR and hybrid methods achieve significantly lower computational complexity than the SCO method, and in most cases a better tradeoff between predicted intelligibility and binaural-cue preservation than the SCO method. ...
The paramount importance of good hearing in everyday life has driven an exploration into the improvement of hearing capabilities of (hearing impaired) people in acoustically challenging situations using hearing assistive devices (HADs). HADs are small portable devices, which primarily aim at improving the intelligibility of an acoustic source that has drawn the attention of the HAD user. One of the most important steps to achieve this is filtering the sound recorded using the HAD microphones, such that ideally all unwanted acoustic sources in the acoustic scene are suppressed, while the target source is maintained undistorted. Modern HAD systems often consist of two collaborative (typically wirelessly connected) HADs, each placed on a different ear. These HAD systems are commonly referred to as binaural HAD systems. In a binaural HAD system, each HAD typically has more than one microphone forming a small local microphone array. The two HADs merge their microphone arrays forming a single larger microphone array. This provides more degrees of freedom for noise reduction. The multi-microphone noise reduction filters are commonly referred to as beamformers, and the beamformers designed for binaural HAD systems are commonly referred to as binaural beamformers. ...
Conference paper (2018) - Andreas I. Koutrouvelis, Richard C. Hendriks, Richard Heusdens, Jesper Jensen, Meng Guo
In this paper, we perceptually evaluate two recently proposed binaural multi-microphone speech enhancement methods in terms of intelligibility improvement and binaural-cue preservation. We compare these two methods with the well-known binaural minimum variance distortionless response (BMVDR) method. More specifically, we measure the 50% speech reception threshold, and the localization error of all dominant point sources in three different acoustic scenes. The listening tests are divided into a parameter selection phase and a testing phase. The parameter selection phase is used to select the algorithms' parameters based on one acoustic scene. In the testing phase, the two methods are evaluated in two other acoustic scenes in order to examine their robustness. Both methods achieve significantly better intelligibility compared to the unprocessed scene, and slightly worse intelligibility than the BMVDR method. However, unlike the BMVDR method, which severely distorts the binaural cues of all interferers, the new methods achieve localization errors which are not significantly different compared to those of the unprocessed scene. ...
We propose a new robust distributed linearly constrained beamformer which utilizes a set of linear equality constraints to reduce the cross power spectral density matrix to a block-diagonal form. The proposed beamformer has a convenient objective function for use in arbitrary distributed network topologies while having identical performance to a centralized implementation. Moreover, the new optimization problem is robust to relative acoustic transfer function (RATF) estimation errors and to target activity detection (TAD) errors. Two variants of the proposed beamformer are presented and evaluated in the context of multi-microphone speech enhancement in a wireless acoustic sensor network, and are compared with other state-of-the-art distributed beamformers in terms of communication costs and robustness to RATF estimation errors and TAD errors. ...
Journal article (2017) - Andreas I. Koutrouvelis, Richard Christian Hendriks, Richard Heusdens, Jesper Jensen
In this paper, we propose a new binaural beamforming technique, which can be seen as a relaxation of the linearly constrained minimum variance (LCMV) framework. The proposed method can achieve simultaneous noise reduction and exact binaural cue preservation of the target source, similar to the binaural minimum variance distortionless response (BMVDR) method. However, unlike BMVDR, the proposed method is also able to preserve the binaural cues of multiple interferers to a certain predefined accuracy. Specifically, it is able to control the trade-off between noise reduction and binaural cue preservation of the interferers by using a separate trade-off parameter per-interferer. Moreover, we provide a robust way of selecting these trade-off parameters in such a way that the preservation accuracy for the binaural cues of the interferers is always better than the corresponding ones of the BMVDR. The relaxation of the constraints in the proposed method achieves approximate binaural cue preservation of more interferers than other previously presented LCMV-based binaural beamforming methods that use strict equality constraints. ...
Conference paper (2017) - Andreas I. Koutrouvelis, Richard C. Hendriks, Richard Heusdens, Jesper Jensen, Meng Guo
Binaural beamformers (BFs) aim to reduce the output noise power while simultaneously preserving the binaural cues of all sources. Typically, the latter is accomplished via constraints relating the output and input interaural transfer functions (ITFs). The ITF is a function of the corresponding relative acoustic transfer function (RATF), which implies that RATF estimates of all sources in the acoustic scene are required. Here, we propose an alternative way to approximately preserve the binaural cues of the entire acoustic scene without estimating RATFs. We propose to preserve the binaural cues of all sources with a set of fixed pre-determined RATFs distributed around the head. Two recently proposed binaural BFs are evaluated in the context of using pre-determined RATFs and compared to the binaural minimum variance distortionless response BF, which can only preserve the binaural cues of the target. ...
Conference paper (2017) - Andreas I. Koutrouvelis, Jesper Jensen, Meng Guo, Richard C. Hendriks, Richard Heusdens
Binaural multi-microphone noise reduction methods aim at noise suppression while preserving the spatial impression of the acoustic scene. Recently, a new binaural speech enhancement method was proposed which chooses per time-frequency (TF) tile either the enhanced target or a suppressed noisy version. The selection between the two is based on the input SNR per TF tile. In this paper we modify this method such that the selection mechanism is based on the output SNR. The proposed modification of deciding which TF tile is target- or noise-dominated leads to choices which are better aligned with simultaneous masking properties of the auditory system and, hence, improves the performance over the initial version of the algorithm. ...
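The per-tile selection described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the function name `select_tiles`, the hard threshold rule, and the fixed `noise_gain` used to form the "suppressed noisy version" are hypothetical and are not taken from the paper. Passing the input SNR or the output SNR as `snr_db` corresponds to the original and the modified selection mechanism, respectively.

```python
from typing import List

Tile = complex
Grid = List[List[Tile]]  # indexed as grid[frame][frequency_bin]


def select_tiles(
    enhanced: Grid,
    noisy: Grid,
    snr_db: List[List[float]],
    threshold_db: float = 0.0,   # hypothetical decision threshold
    noise_gain: float = 0.5,     # hypothetical attenuation of the noisy tile
) -> Grid:
    """Pick, per TF tile, the enhanced target or a scaled-down noisy tile.

    A tile whose SNR exceeds threshold_db is treated as target-dominated and
    takes the enhanced value; otherwise it is treated as noise-dominated and
    takes the noisy input scaled by noise_gain.
    """
    if not (len(enhanced) == len(noisy) == len(snr_db)):
        raise ValueError("all grids must have the same number of frames")

    output: Grid = []
    for e_row, n_row, s_row in zip(enhanced, noisy, snr_db):
        if not (len(e_row) == len(n_row) == len(s_row)):
            raise ValueError("all grids must have the same number of bins")
        output.append([
            e if s > threshold_db else noise_gain * n
            for e, n, s in zip(e_row, n_row, s_row)
        ])
    return output
```

Scaling the unprocessed noisy tile, rather than filtering it, is what lets the noise-dominated tiles keep the spatial impression of the scene in this sketch.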
Conference paper (2016) - Andreas I. Koutrouvelis, Richard C. Hendriks, Jesper Jensen, Richard Heusdens
We propose a new multi-microphone noise reduction technique for binaural cue preservation of the desired source and the interferers. This method is based on the linearly constrained minimum variance (LCMV) framework, where the constraints are used for the binaural cue preservation of the desired source and of multiple interferers. In this framework there is a trade-off between noise reduction and binaural cue preservation. The more constraints the LCMV uses for preserving binaural cues, the fewer degrees of freedom can be used for noise suppression. The recently presented binaural LCMV (BLCMV) method and the optimal BLCMV (OBLCMV) method require two constraints per interferer and introduce an additional interference rejection parameter. This unnecessarily reduces the degrees of freedom available for noise reduction and negatively influences the trade-off between noise reduction and binaural cue preservation. With the proposed method, binaural cue preservation is obtained using just a single constraint per interferer without the need of an interference rejection parameter. The proposed method can simultaneously achieve noise reduction and perfect binaural cue preservation of more than twice as many interferers as the BLCMV, while the OBLCMV can preserve the binaural cues of only one interferer. ...
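The degrees-of-freedom argument in this abstract follows from the generic LCMV problem, sketched below. This is the textbook form with notation chosen here (w, P, Λ, f, M, d are illustrative symbols); the specific constraint sets of the BLCMV, OBLCMV and proposed methods are not given in the abstract and are not reproduced.

```latex
% Generic LCMV problem for one filter w in C^M (M microphones),
% noise cross-PSD matrix P (M x M, positive definite),
% constraint matrix \Lambda (M x d, full column rank, d < M), response vector f in C^d:
\hat{\mathbf{w}}
  = \arg\min_{\mathbf{w}} \; \mathbf{w}^{H}\mathbf{P}\,\mathbf{w}
  \quad \text{s.t.} \quad \boldsymbol{\Lambda}^{H}\mathbf{w} = \mathbf{f}

% Closed-form solution:
\hat{\mathbf{w}}
  = \mathbf{P}^{-1}\boldsymbol{\Lambda}
    \left(\boldsymbol{\Lambda}^{H}\mathbf{P}^{-1}\boldsymbol{\Lambda}\right)^{-1}\mathbf{f}

% Each linearly independent constraint removes one degree of freedom,
% so M - d remain for minimizing the output noise power.
```

Read this way, spending one constraint per interferer instead of two leaves more of the M - d budget for noise suppression, or allows more interferers to be constrained for the same budget.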
Journal article (2015) - A.I. Koutrouvelis, GP Kafentzis, ND Gaubitch, R Heusdens
We propose a fast speech analysis method which simultaneously performs high-resolution voiced/unvoiced detection (VUD) and accurate estimation of glottal closure and glottal opening instants (GCIs and GOIs, respectively). The proposed algorithm exploits the structure of the glottal flow derivative in order to estimate GCIs and GOIs only in voiced speech using simple time-domain criteria. We compare our method with well-known GCI/GOI methods, namely, the dynamic programming projected phase-slope algorithm (DYPSA), the yet another GCI/GOI algorithm (YAGA) and the speech event detection using the residual excitation and a mean-based signal (SEDREAMS). Furthermore, we examine the performance of the aforementioned methods when combined with state-of-the-art VUD algorithms, namely, the robust algorithm for pitch tracking (RAPT) and the summation of residual harmonics (SRH). Experiments conducted on the APLAWD and SAM databases show that the proposed algorithm outperforms the state-of-the-art combinations of VUD and GCI/GOI algorithms with respect to almost all evaluation criteria for clean speech. Experiments on speech contaminated with several noise types (white Gaussian, babble, and car-interior) are also presented and discussed. The proposed algorithm outperforms the state-of-the-art combinations in most evaluation criteria for signal-to-noise ratios greater than 10 dB. ...