Improvement of speech source localization in noisy environment using overcomplete rational-dilation wavelet transforms

Di Liu*, Andy W.H. Khong

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

The generalized cross-correlation using the phase transform prefilter remains popular for the estimation of time-differences-of-arrival. However, it is not robust to noise and, as a consequence, the performance of direction-of-arrival algorithms is often degraded under low signal-to-noise conditions. We propose to address this problem through the use of a wavelet-based speech enhancement technique, since the wavelet transform can achieve good denoising performance. The over-complete rational-dilation wavelet transform is then exploited to effectively process speech signals due to its higher frequency resolution. In addition, we exploit the joint distribution of the speech in the wavelet domain and develop a novel local noise variance estimator based on the bivariate shrinkage function. As will be shown, our proposed algorithm achieves good direction-of-arrival performance in the presence of noise.
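For readers unfamiliar with the bivariate shrinkage function mentioned in the abstract, the following is a minimal sketch of the commonly cited form of that rule, in which a noisy wavelet coefficient is shrunk jointly with its parent coefficient at the next coarser scale. It is not the authors' implementation and does not reproduce their novel local noise variance estimator: the function names, the median-based noise estimate, and the way the local signal deviation is computed are all assumptions chosen for illustration.

```python
import math
import statistics


def mad_noise_sigma(finest_coeffs):
    """Robust noise standard deviation from the finest-scale coefficients.

    Assumption: the usual median-absolute-value rule, not the paper's
    local noise variance estimator.
    """
    return statistics.median(abs(c) for c in finest_coeffs) / 0.6745


def local_signal_sigma(window, sigma_n):
    """Signal standard deviation in a neighbourhood of noisy coefficients.

    The noisy variance is the mean square of the window; the noise
    variance is subtracted and the result is clipped at zero.
    """
    noisy_var = sum(c * c for c in window) / len(window)
    return math.sqrt(max(noisy_var - sigma_n ** 2, 0.0))


def bivariate_shrink(y1, y2, sigma_n, sigma):
    """Shrink the noisy coefficient y1 using its parent coefficient y2.

    sigma_n: noise standard deviation
    sigma:   local signal standard deviation for this coefficient
    """
    magnitude = math.hypot(y1, y2)
    if magnitude == 0.0 or sigma <= 0.0:
        # No signal energy estimated here: the coefficient is set to zero.
        return 0.0
    threshold = math.sqrt(3.0) * sigma_n ** 2 / sigma
    gain = max(magnitude - threshold, 0.0) / magnitude
    return gain * y1
```

In this sketch, a coefficient whose joint magnitude with its parent falls below the threshold is zeroed, while larger coefficients are scaled down smoothly. The denoised channels would then be passed to a time-difference-of-arrival estimator such as the generalized cross-correlation with phase transform.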

Original language: English
Title of host publication: Proceedings - 2010 International Conference on Cyberworlds, CW 2010
Pages: 77-81
Number of pages: 5
Publication status: Published - 2010
Externally published: Yes
Event: 2010 10th International Conference on Cyberworlds, CW 2010 - Singapore, Singapore
Duration: Oct 20 2010 to Oct 22 2010

Publication series

Name: Proceedings - 2010 International Conference on Cyberworlds, CW 2010

Conference

Conference: 2010 10th International Conference on Cyberworlds, CW 2010
Country/Territory: Singapore
City: Singapore
Period: 10/20/10 to 10/22/10

ASJC Scopus Subject Areas

  • Computer Networks and Communications
  • Computer Science Applications

Keywords

  • Denoising
  • DOA estimation
  • Speech source localization
  • Wavelet
