{"title":"On-line Speech Enhancement by Time-Frequency Masking under Prior Knowledge of Source Location","authors":"Min Ah Kang, Sangbae Jeong, Minsoo Hahn","volume":7,"journal":"International Journal of Electrical and Computer Engineering","pagesStart":1933,"pagesEnd":1939,"ISSN":"1307-6892","URL":"https:\/\/publications.waset.org\/pdf\/94","abstract":"
This paper presents a source extraction system that extracts only the target signal in on-line systems, under constraints on source location. The proposed system belongs to the class of methods that enhance a target signal while suppressing interfering signals. However, it performs better than other such methods and extracts the target source comparatively completely. The method builds on a beamforming concept and uses an improved time-frequency (TF) mask-based blind source separation (BSS) algorithm to separate the target signal from multiple noise sources. The target source is assumed to be located in front, and the test data were recorded in a reverberant room. The proposed method was evaluated by the PESQ score on real recorded sentences and showed noticeable speech enhancement.","references":"[1] M. Brandstein and D. Ward, Microphone Arrays, Springer, 2001.\r\n[2] S. Haykin, Adaptive Filter Theory, Prentice Hall, 1991.\r\n[3] S. Gannot, D. Burshtein, and E. Weinstein, \"Signal enhancement using\r\nbeamforming and nonstationarity with applications to speech,\" IEEE\r\nTrans. Signal Process., vol. 49, no. 8, Aug. 2001, pp.1614-1626.\r\n[4] \u00d6. Yilmaz and S. Rickard, \"Blind separation of speech mixtures via\r\ntime-frequency masking,\" IEEE Trans. Signal Process., vol. 52, no. 7,\r\nJuly 2004, pp.1830-1846.\r\n[5] ITU-T, \"Perceptual evaluation of speech quality (PESQ), an objective\r\nmethod for end-to-end speech quality assessment of narrow-band\r\ntelephone networks and speech codecs,\" ITU-T Recommendation P.862,\r\nFebruary 2001.\r\n[6] H. Sawada, S. Araki, R. Mukai, and S. Makino, \"Blind extraction of\r\ndominant target sources using ICA and time-frequency masking,\" IEEE\r\nTrans. Signal Process., vol. 14, no. 6, Nov. 2006, pp.2165-2173.\r\n[7] H. Saruwatari, S. Kurita, and K. Takeda, \"Blind source separation\r\ncombining frequency-domain ICA and beamforming,\" in Proc.\r\nICASSP2001, pp.2733-2736.\r\n[8] G. Shi and P. 
Aarabi, \"Robust digit recognition using phase-dependent\r\ntime-frequency masking,\" in Proceedings of ICASSP, Hong Kong, Apr.\r\n2003, pp.684-687.\r\n[9] A. Bell and T. Sejnowski, \"An information maximization approach to\r\nblind separation and blind deconvolution,\" Neural Comput., vol.7, Nov.\r\n1995, pp.1129-1159.\r\n[10] J. Yang-Won, K. Hong-Goo, L. Chungyong, Y. Dae-Hee, C. Changkyu,\r\nand K. Jaywoo, \"Adaptive Microphone Array System with Two-Stage\r\nAdaptation Mode Controller,\" in IEICE Trans. Fundamentals, vol. E88-A,\r\nno. 4, Apr. 2005.","publisher":"World Academy of Science, Engineering and Technology","index":"Open Science Index 7, 2007"}