Binaural Speech Separation Using Binary and Ratio Time-Frequency Masks

Mahmoodzadeh, Azar; Abutalebi, Hamidreza; Soltanian-Zadeh, Hamid; Sheikhzadeh, Hamid

Volume 6, Issue 3 (9-2014) 2014, 6(3): 15-24

Mahmoodzadeh A, Abutalebi H, Soltanian-Zadeh H, Sheikhzadeh H. Binaural Speech Separation Using Binary and Ratio Time-Frequency Masks. International Journal of Information and Communication Technology Research 2014; 6(3): 15-24.
URL: http://ijict.itrc.ac.ir/article-1-120-en.html

Abstract:

In many speech applications, the target signal is corrupted by highly correlated noise sources. Separating the desired speaker's signal from the mixture is one of the most challenging research topics in speech signal processing. This paper proposes a binaural system combined with a monaural incoherent post-processor for speech segregation. The proposed binaural system is based on two spatial localization cues: interaural time differences (ITD) and interaural intensity differences (IID). The target speech is separated from interfering sounds by estimating time-frequency binary and ratio masks. The binary mask is estimated using the multi-level extension of the Otsu thresholding algorithm, a technique used in image segmentation. ITD and IID are important features for mask estimation at low and high frequencies, respectively. The ratio mask is estimated by the incoherent monaural speech separation system, which serves as the post-processing stage. Systematic evaluations show that the proposed system can separate the target signal with acceptable quality.
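
To make the binary-mask step more concrete, the following is a minimal sketch of multi-level Otsu thresholding applied to a cue map. It is not the authors' implementation. Several things here are assumptions made only for illustration: the IID values are synthetic, the number of classes is fixed at three, the target is assumed to be the source whose IID is near 0 dB, and the names `multi_otsu_thresholds`, `centers`, `iid`, and `mask` are hypothetical.

```python
from itertools import combinations

import numpy as np


def multi_otsu_thresholds(values, classes=3, bins=64):
    """Exhaustive multi-level Otsu search over histogram bin edges.

    Returns the (classes - 1) thresholds that maximize the
    between-class variance of the histogram of `values`.
    """
    hist, edges = np.histogram(values, bins=bins)
    prob = hist / hist.sum()
    mids = 0.5 * (edges[:-1] + edges[1:])

    best_score, best_cut = -np.inf, None
    # A cut is a tuple of bin indices; class k covers bins [bounds[k], bounds[k+1]).
    for cut in combinations(range(1, bins), classes - 1):
        bounds = (0, *cut, bins)
        score = 0.0
        for lo, hi in zip(bounds[:-1], bounds[1:]):
            weight = prob[lo:hi].sum()
            if weight > 0:
                mean = (prob[lo:hi] * mids[lo:hi]).sum() / weight
                # Sum of weight * mean^2 differs from the between-class
                # variance only by a constant, so both share the same maximizer.
                score += weight * mean ** 2
        if score > best_score:
            best_score, best_cut = score, cut
    return np.array([edges[c] for c in best_cut])


# Synthetic IID map in dB (frequency channels x time frames). ASSUMPTION:
# each time-frequency unit is dominated by one of three sources whose
# IIDs cluster around -8, 0, and +8 dB.
rng = np.random.default_rng(0)
centers = rng.choice([-8.0, 0.0, 8.0], size=(64, 100))
iid = centers + rng.normal(0.0, 1.0, size=centers.shape)

thresholds = multi_otsu_thresholds(iid, classes=3)

# ASSUMPTION: the target is the source near 0 dB, so the binary mask keeps
# the units whose IID falls between the two thresholds.
mask = (iid > thresholds[0]) & (iid < thresholds[1])
```

The exhaustive search is practical only for a small number of classes and bins; it is used here because it keeps the criterion easy to read.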

Keywords: Interaural intensity differences, interaural time differences, speech separation, time-frequency binary mask, ratio mask

Type of Study: Research | Subject: Information Technology

Rights and permissions
	This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.