Multichannel Linear Prediction-Based Speech Dereverberation Considering Sparse and Low-Rank Priors

Taihui Wang, Feiran Yang*, Jun Yang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

This article addresses the multichannel linear prediction (MCLP)-based speech dereverberation problem by jointly considering the sparsity and low-rank priors of speech spectrograms. We utilize the complex generalized Gaussian (CGG) distribution as the source model and the generalized nonnegative matrix factorization (NMF) as the spectral model. The difference between the presented model and existing ones for MCLP is twofold. First, we adopt the CGG distribution with a time-frequency-variant scale parameter instead of a time-frequency-invariant one. Second, the time-frequency-variant scale parameter is approximated by NMF in a low-rank manner. Based on the maximum-likelihood criterion, speech dereverberation is formulated as an optimization problem that minimizes the prediction error weighted by the reciprocal of the sparse and low-rank parameters. A convergence-guaranteed algorithm is derived to estimate the parameters using the majorization-minimization technique. The weighted prediction error (WPE), NMF-based WPE, and CGG-based WPE methods can be treated as special cases of the proposed method with different shape and domain parameters. As a byproduct, the proposed method provides a simple and elegant way to derive the CGG-based WPE algorithm. A series of experiments shows the superiority of the proposed method over the WPE, NMF-based WPE, and CGG-based WPE methods.
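To make the weighted prediction-error idea in the abstract concrete, the following is a minimal sketch of iteratively reweighted MCLP for a single frequency bin. It is not the authors' algorithm: it omits the NMF low-rank model entirely and keeps only a per-frame weight computed from the current estimate. The function name `wpe_one_bin`, its parameters, and the weight exponent `p / 2 - 1` (a convention often used for sparse-prior WPE, with `p = 0` giving the conventional WPE weight) are assumptions made for illustration and are not taken from this record.

```python
import numpy as np

def wpe_one_bin(x, taps=4, delay=2, iters=3, p=0.0, eps=1e-8):
    """Illustrative reweighted MCLP for one frequency bin (hypothetical helper).

    x: complex array of shape (channels, frames), STFT coefficients of one bin.
    Returns the prediction residual of channel 0, shape (frames,).
    """
    n_ch, n_fr = x.shape
    # Stack delayed copies of all channels; block k holds x delayed by delay + k frames.
    xt = np.zeros((n_ch * taps, n_fr), dtype=complex)
    for k in range(taps):
        shift = delay + k
        xt[k * n_ch:(k + 1) * n_ch, shift:] = x[:, :n_fr - shift]

    d = x[0].copy()  # initial estimate of the desired signal: the observation itself
    for _ in range(iters):
        # Per-frame weight from the current estimate (assumed form, see lead-in).
        w = np.maximum(np.abs(d) ** 2, eps) ** (p / 2 - 1)
        # Weighted normal equations: R g = r.
        R = (xt * w) @ xt.conj().T
        r = (xt * w) @ x[0].conj()
        g = np.linalg.solve(R + eps * np.eye(R.shape[0]), r)
        # Subtract the predicted late reverberation g^H x_tilde from channel 0.
        d = x[0] - g.conj() @ xt
    return d
```

Calling `wpe_one_bin(X[:, f, :])` for each bin `f` of a hypothetical `(channels, bins, frames)` STFT array `X` would process a full spectrogram. In the method described above, the per-frame weight would instead come from the reciprocal of the sparse and low-rank parameters.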

Original language: English
Pages (from-to): 1724-1735
Number of pages: 12
Journal: IEEE/ACM Transactions on Audio Speech and Language Processing
Volume: 32
Publication status: Published - 2024
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2014 IEEE.

ASJC Scopus Subject Areas

  • Computer Science (miscellaneous)
  • Acoustics and Ultrasonics
  • Computational Mathematics
  • Electrical and Electronic Engineering

Keywords

  • complex generalized Gaussian
  • multichannel linear prediction
  • nonnegative matrix factorization
  • speech dereverberation
  • weighted prediction error
