Real-Time Independent Vector Analysis with a Deep-Learning-Based Source Model

Fang Kang; Feiran Yang; Jun Yang

doi:10.1109/SLT48900.2021.9383599

CAS - Institute of Acoustics

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Citations (Scopus)

Abstract

In this paper, we present a real-time blind source separation (BSS) algorithm that unifies independent vector analysis (IVA) as a spatial model with a deep neural network (DNN) as a source model. Auxiliary-function-based IVA (Aux-IVA) is used to update the demixing matrix, and the required time-varying variance of the speech source is estimated by a DNN. The DNN provides a more accurate source model, which in turn helps to optimize the spatial model. In addition, because the DNN estimates the source variance rather than the source power spectrogram, the size of the DNN can be reduced significantly. Experimental results show that jointly using the model-based and data-driven approaches provides a more efficient solution than either approach alone in terms of convergence rate and source separation performance.
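
As a rough illustration of the idea summarized in the abstract, the following is a minimal sketch of one frame of a generic online Aux-IVA update in which the per-source variance comes from a pluggable estimator. It is not the authors' implementation: the function names, the forgetting factor `alpha`, and the energy-based `energy_variance` stand-in (used here in place of a trained DNN) are all assumptions made for illustration, and the update follows the standard textbook form of Aux-IVA rather than anything stated in this record.

```python
import numpy as np


def energy_variance(y):
    """Hypothetical stand-in for the DNN source model.

    y: (F, K) separated STFT frame. Returns one variance per source, shape (K,),
    computed as the mean power over frequency.
    """
    return np.mean(np.abs(y) ** 2, axis=0)


def online_auxiva_step(W, V, x, variance_fn, alpha=0.96, eps=1e-8):
    """One frame of a generic online Aux-IVA update (illustrative sketch).

    W: (F, K, K) demixing matrices; row k holds the filter w_k^H.
    V: (F, K, K, K) weighted covariance matrices, one per frequency and source.
    x: (F, K) current mixture STFT frame (as many channels as sources).
    variance_fn: maps a separated frame (F, K) to source variances (K,).
    W and V are updated in place and also returned with the new output frame.
    """
    F, K, _ = W.shape
    y = np.einsum("fkm,fm->fk", W, x)            # separate with the current filters
    r = np.maximum(variance_fn(y), eps)          # source-model step (DNN in the paper)
    xx = np.einsum("fm,fn->fmn", x, x.conj())    # outer product x x^H per frequency
    for k in range(K):
        # Recursive estimate of the variance-weighted covariance for source k.
        V[:, k] = alpha * V[:, k] + (1.0 - alpha) * xx / r[k]
        # Iterative projection: w_k = (W V_k)^{-1} e_k.
        e_k = np.zeros((F, K, 1), dtype=W.dtype)
        e_k[:, k, 0] = 1.0
        w = np.linalg.solve(W @ V[:, k], e_k)[..., 0]
        # Scale so that w_k^H V_k w_k = 1.
        scale = np.sqrt(np.real(np.einsum("fm,fmn,fn->f", w.conj(), V[:, k], w)))
        w = w / np.maximum(scale, eps)[:, None]
        W[:, k, :] = w.conj()                    # store w_k^H as row k
    return W, V, np.einsum("fkm,fm->fk", W, x)
```

In use, `W` and each `V[:, k]` would typically start as identity matrices and the function would be called once per incoming STFT frame; swapping `energy_variance` for a trained network's prediction is where a learned source model would plug in.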

Original language	English
Title of host publication	2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	665-669
Number of pages	5
ISBN (Electronic)	9781728170664
DOIs	https://doi.org/10.1109/SLT48900.2021.9383599
Publication status	Published - Jan 19 2021
Externally published	Yes
Event	2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Virtual, Shenzhen, China Duration: Jan 19 2021 → Jan 22 2021

Publication series

Name	2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings

Conference

Conference	2021 IEEE Spoken Language Technology Workshop, SLT 2021
Country/Territory	China
City	Virtual, Shenzhen
Period	1/19/21 → 1/22/21

Bibliographical note

Publisher Copyright:
© 2021 IEEE.

ASJC Scopus Subject Areas

Linguistics and Language
Language and Linguistics
Artificial Intelligence
Computer Science Applications
Computer Vision and Pattern Recognition
Hardware and Architecture

Keywords

Blind source separation
deep neural network
real-time

Access to Document

10.1109/SLT48900.2021.9383599

Cite this

Kang, F., Yang, F., & Yang, J. (2021). Real-Time Independent Vector Analysis with a Deep-Learning-Based Source Model. In 2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings (pp. 665-669). Article 9383599 (2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/SLT48900.2021.9383599

@inproceedings{26ab90db1d3c43b2958b83e139c96756,

title = "Real-Time Independent Vector Analysis with a Deep-Learning-Based Source Model",

abstract = "In this paper, we present a real-time blind source separation (BSS) algorithm that unifies independent vector analysis (IVA) as a spatial model with a deep neural network (DNN) as a source model. Auxiliary-function-based IVA (Aux-IVA) is used to update the demixing matrix, and the required time-varying variance of the speech source is estimated by a DNN. The DNN provides a more accurate source model, which in turn helps to optimize the spatial model. In addition, because the DNN estimates the source variance rather than the source power spectrogram, the size of the DNN can be reduced significantly. Experimental results show that jointly using the model-based and data-driven approaches provides a more efficient solution than either approach alone in terms of convergence rate and source separation performance.",

keywords = "Blind source separation, deep neural network, real-time",

author = "Fang Kang and Feiran Yang and Jun Yang",

note = "Publisher Copyright: {\textcopyright} 2021 IEEE.; 2021 IEEE Spoken Language Technology Workshop, SLT 2021 ; Conference date: 19-01-2021 Through 22-01-2021",

year = "2021",

month = jan,

day = "19",

doi = "10.1109/SLT48900.2021.9383599",

language = "English",

series = "2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "665--669",

booktitle = "2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings",

address = "United States",

}

Kang, F, Yang, F & Yang, J 2021, Real-Time Independent Vector Analysis with a Deep-Learning-Based Source Model. in 2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings., 9383599, 2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings, Institute of Electrical and Electronics Engineers Inc., pp. 665-669, 2021 IEEE Spoken Language Technology Workshop, SLT 2021, Virtual, Shenzhen, China, 1/19/21. https://doi.org/10.1109/SLT48900.2021.9383599

Real-Time Independent Vector Analysis with a Deep-Learning-Based Source Model. / Kang, Fang; Yang, Feiran; Yang, Jun.
2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2021. p. 665-669 9383599 (2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings).

TY - GEN

T1 - Real-Time Independent Vector Analysis with a Deep-Learning-Based Source Model

AU - Kang, Fang

AU - Yang, Feiran

AU - Yang, Jun

PY - 2021/1/19

Y1 - 2021/1/19

N2 - In this paper, we present a real-time blind source separation (BSS) algorithm that unifies independent vector analysis (IVA) as a spatial model with a deep neural network (DNN) as a source model. Auxiliary-function-based IVA (Aux-IVA) is used to update the demixing matrix, and the required time-varying variance of the speech source is estimated by a DNN. The DNN provides a more accurate source model, which in turn helps to optimize the spatial model. In addition, because the DNN estimates the source variance rather than the source power spectrogram, the size of the DNN can be reduced significantly. Experimental results show that jointly using the model-based and data-driven approaches provides a more efficient solution than either approach alone in terms of convergence rate and source separation performance.

KW - Blind source separation

KW - deep neural network

KW - real-time

UR - http://www.scopus.com/inward/record.url?scp=85103943947&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85103943947&partnerID=8YFLogxK

U2 - 10.1109/SLT48900.2021.9383599

DO - 10.1109/SLT48900.2021.9383599

M3 - Conference contribution

AN - SCOPUS:85103943947

T3 - 2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings

SP - 665

EP - 669

BT - 2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2021 IEEE Spoken Language Technology Workshop, SLT 2021

Y2 - 19 January 2021 through 22 January 2021

ER -
