Volume 15, Issue 3 (9-2023), 15(3): 53-65

Safari Dehnavi R, Seyedin S. NMF-based Improvement of DNN and LSTM Pre-Training for Speech Enhancement. International Journal of Information and Communication Technology Research 2023; 15(3): 53-65
URL: http://ijict.itrc.ac.ir/article-1-555-en.html
1- Department of Electrical Engineering, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran
2- Department of Electrical Engineering, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran, sseyedin@aut.ac.ir
Abstract:
A novel pre-training method is proposed to improve the performance of deep neural networks (DNNs) and long short-term memory (LSTM) networks for speech enhancement and to reduce the local-minimum problem. We propose initializing the last-layer weights of the DNN and LSTM with the transposed non-negative matrix factorization (NMF) basis values instead of random weights. Because NMF can extract speech features even in the presence of non-stationary noise, it yields faster and more reliable network convergence than previous pre-training methods. We also propose using the NMF basis matrix in the first layer together with another pre-training method. To achieve better results, we further propose training an individual model for each noise type based on a noise-classification strategy. Evaluation on the TIMIT corpus shows that the proposed method significantly outperforms the baselines in terms of perceptual evaluation of speech quality (PESQ) and other objective measures, improving PESQ over the baselines by up to 0.17 (a 3.4% improvement).
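Below is a minimal, hedged sketch (not the authors' implementation) of the core idea described in the abstract: learning an NMF basis from clean-speech magnitude spectra and copying its transpose into the last layer of an enhancement DNN before fine-tuning. The spectrogram dimension, hidden width, placeholder data, and the exact transpose convention are illustrative assumptions; the same idea extends to the LSTM case and to placing the basis in the first layer.

import numpy as np
import torch
import torch.nn as nn
from sklearn.decomposition import NMF

n_freq = 257    # assumed magnitude-spectrum dimension (512-point STFT)
n_basis = 257   # assumed NMF rank, matched to the last hidden layer width

# Placeholder clean-speech magnitude spectrogram, shape (n_frames, n_freq);
# in practice this would come from the clean training utterances.
V = np.abs(np.random.randn(2000, n_freq))

# Factorize V ~= H @ B, so components_ (the basis B) has shape (n_basis, n_freq).
nmf = NMF(n_components=n_basis, init="nndsvda", max_iter=200)
H = nmf.fit_transform(V)
basis = nmf.components_            # (n_basis, n_freq)

# Enhancement DNN: noisy magnitude in, clean magnitude out.
model = nn.Sequential(
    nn.Linear(n_freq, 1024), nn.ReLU(),
    nn.Linear(1024, n_basis), nn.ReLU(),
    nn.Linear(n_basis, n_freq),    # last layer: basis activations -> spectrum
)

# Initialize the last layer with the transposed NMF basis instead of random weights;
# nn.Linear stores weights as (out_features, in_features) = (n_freq, n_basis).
with torch.no_grad():
    model[-1].weight.copy_(torch.from_numpy(basis.T).float())
    model[-1].bias.zero_()

After this initialization, the network would be trained as usual (e.g., on noisy/clean magnitude pairs); the NMF-based starting point is what the paper credits for avoiding poor local minima.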
Full-Text [PDF 1238 kb]
Type of Study: Research | Subject: Communication Technology


Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.