Published in

Interspeech 2007, 2007

DOI: 10.21437/interspeech.2007-333

Model-driven detection of clean speech patches in noise.

Proceedings article published in 2007 by Jonathan Laidler, Martin Cooke, Neil D. Lawrence
This paper was not found in any repository; the policy of its publisher is unknown or unclear.

Full text: Unavailable

Preprint: policy unknown
Postprint: policy unknown
Published version: policy unknown

Abstract

Listeners may be able to recognise speech in adverse conditions by "glimpsing" time-frequency regions where the target speech is dominant. Previous computational attempts to identify such regions have been source-driven, using primitive cues. This paper describes a model-driven approach in which the likelihood of spectro-temporal patches of a noisy mixture representing speech is given by a generative model. The focus is on patch size and patch modelling. Small patches lead to a lack of discrimination, while large patches are more likely to contain contributions from other sources. A "cleanness" measure reveals that a good patch size is one which extends over a quarter of the speech frequency range and lasts for 40 ms. Gaussian mixture models are used to represent patches. A compact representation based on a 2D discrete cosine transform leads to reasonable speech/background discrimination.

Index Terms: speech separation, glimpsing, model-driven, spectro-temporal patches.
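
To make the idea of a compact 2D discrete cosine transform representation of a spectro-temporal patch more concrete, here is a minimal illustrative sketch. It is not the authors' implementation: the function names (`extract_patch`, `dct2`, `compact_features`), the toy spectrogram, the patch dimensions, and the number of retained coefficients are all assumptions, since the abstract specifies neither the time-frequency representation nor the coefficient count. The Gaussian mixture modelling stage is omitted here.

```python
import math


def extract_patch(spectrogram, chan0, frame0, n_chans, n_frames):
    """Cut a rectangular patch out of a spectrogram stored as a list of
    frequency channels, each a list of per-frame energies."""
    return [row[frame0:frame0 + n_frames]
            for row in spectrogram[chan0:chan0 + n_chans]]


def dct2(patch):
    """Orthonormal 2D DCT-II of a rectangular patch (direct O(N^4) form,
    fine for small patches)."""
    n_rows, n_cols = len(patch), len(patch[0])

    def scale(k, n):
        # Orthonormal scaling: the DC term is weighted differently.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n_cols for _ in range(n_rows)]
    for u in range(n_rows):
        for v in range(n_cols):
            total = 0.0
            for i in range(n_rows):
                row_basis = math.cos(math.pi * (2 * i + 1) * u / (2 * n_rows))
                for j in range(n_cols):
                    col_basis = math.cos(
                        math.pi * (2 * j + 1) * v / (2 * n_cols))
                    total += patch[i][j] * row_basis * col_basis
            coeffs[u][v] = scale(u, n_rows) * scale(v, n_cols) * total
    return coeffs


def compact_features(patch, keep_chans, keep_frames):
    """Keep only the low-order block of DCT coefficients as a feature
    vector (the block size is an assumed, tunable choice)."""
    coeffs = dct2(patch)
    return [coeffs[u][v]
            for u in range(keep_chans)
            for v in range(keep_frames)]


# Toy example with hypothetical sizes: 8 channels x 6 frames.
toy_spectrogram = [[float(c + t) for t in range(6)] for c in range(8)]
toy_patch = extract_patch(toy_spectrogram, chan0=2, frame0=1,
                          n_chans=4, n_frames=4)
toy_features = compact_features(toy_patch, keep_chans=2, keep_frames=2)
```

In a fuller pipeline, feature vectors like `toy_features` from speech and from background material would each be used to train a Gaussian mixture model, and a patch from a noisy mixture would then be scored by comparing likelihoods under the two models. A library DCT routine would normally replace the direct loops shown here.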