Room Shape Estimation from Acoustic Echoes using Graph-based Echo Labeling

Abstract

Some visually impaired people can hear the shape of a room using acoustic echoes. A computer that could do the same would benefit applications such as auralization, virtual reality and teleconferencing. This thesis describes the complete process of room shape estimation, from acoustic room impulse responses to the geometry inferred from image sources. Because we use multiple microphones, the biggest challenge is to find the echoes from different microphones that correspond to the same image source. We present a new method to disambiguate the echoes using graph theory. We model combinations of echoes as nodes in a graph. The maximum independent set in the graph yields the disambiguated echoes. The disambiguated echoes are transformed to time-difference-of-arrival data so that we can calculate the locations of the sources and image sources in closed form. From the estimated sources and image sources we finally infer the room geometry using the image source model. The experiments, which are limited to simulated shoebox-shaped rooms, show that we can reliably estimate room shapes within seconds on contemporary hardware. We achieve sub-centimeter precision in locating the vertices of the room.
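The last step, inferring walls from image sources, can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes a shoebox room with one corner at the origin and only first-order reflections, and the names `first_order_images` and `wall_from_image` are hypothetical. It relies on one property of the image source model: a first-order image source is the mirror image of the source in a wall, so that wall lies in the plane that perpendicularly bisects the segment between the two.

```python
from typing import List, Tuple

Vec = Tuple[float, float, float]


def first_order_images(source: Vec, room: Vec) -> List[Vec]:
    """Mirror the source across the six walls of a shoebox room.

    Assumption: the room spans [0, Lx] x [0, Ly] x [0, Lz], with
    room = (Lx, Ly, Lz).
    """
    images = []
    for axis in range(3):
        for wall_pos in (0.0, room[axis]):
            img = list(source)
            # Reflection across the plane {coordinate[axis] = wall_pos}.
            img[axis] = 2.0 * wall_pos - source[axis]
            images.append(tuple(img))
    return images


def wall_from_image(source: Vec, image: Vec) -> Tuple[Vec, Vec]:
    """Recover a wall as (point_on_wall, unit_normal).

    The wall is the perpendicular bisector plane of the segment from
    the source to its first-order image source.
    """
    diff = [i - s for s, i in zip(source, image)]
    norm = sum(d * d for d in diff) ** 0.5
    normal = tuple(d / norm for d in diff)
    midpoint = tuple((s + i) / 2.0 for s, i in zip(source, image))
    return midpoint, normal


# Hypothetical example values (metres), chosen only for illustration.
source = (1.0, 2.0, 1.5)
room = (5.0, 4.0, 3.0)

images = first_order_images(source, room)
walls = [wall_from_image(source, img) for img in images]
```

In this noise-free toy case the six recovered planes coincide with the walls of the box. With estimated source and image source positions, the same bisector construction would apply, but the recovered planes would carry the estimation error.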