{"title":"Speech Intelligibility Improvement Using Variable Level Decomposition DWT","authors":"Samba Raju Chiluveru, Manoj Tripathy","volume":157,"journal":"International Journal of Electronics and Communication Engineering","pagesStart":23,"pagesEnd":27,"ISSN":"1307-6892","URL":"https:\/\/publications.waset.org\/pdf\/10011030","abstract":"Intelligibility is an essential characteristic of a speech
\r\nsignal that helps listeners understand the information in the
\r\nspeech signal. Background noise in the environment can degrade
\r\nthe intelligibility of recorded speech. In this paper, we present a
\r\nsimple variance-subtracted, variable-level discrete wavelet transform
\r\nthat improves the intelligibility of speech. The proposed algorithm
\r\ndoes not require an explicit estimation of noise, i.e., prior knowledge
\r\nof the noise; hence, it is easy to implement, and it reduces the
\r\ncomputational burden. The proposed algorithm decides a separate
\r\ndecomposition level for each frame based on signal-dominant and
\r\nnoise-dominant criteria. The performance of the proposed algorithm
\r\nis evaluated with the short-time objective intelligibility (STOI) measure, and the results
\r\nobtained are compared with Universal Discrete Wavelet Transform
\r\n(DWT) thresholding and Minimum Mean Square Error (MMSE)
\r\nmethods. The experimental results revealed that the proposed scheme
\r\noutperformed competing methods.","references":"[1] P. C. Loizou, Speech Enhancement: Theory and Practice. Boca Raton,\r\nFL, USA: CRC Press, 2007.\r\n[2] Y. Ephraim and D. Malah, \u201cSpeech enhancement using a\r\nMinimum-Mean Square Error Short-Time Spectral Amplitude\r\nestimator,\u201d IEEE Transactions on Acoustics, Speech, and Signal\r\nProcessing, vol. 32, no. 6, pp. 1109\u20131121, 1984.\r\n[3] S. G. Mallat, \u201cA Theory for Multiresolution Signal Decomposition: The\r\nWavelet Representation,\u201d IEEE Transactions on Pattern Analysis and\r\nMachine Intelligence, vol. 11, no. 7, pp. 674\u2013693, 1989.\r\n[4] G. Kim and P. C. Loizou, \u201cImproving Speech Intelligibility in\r\nNoise using Environment-Optimized Algorithms,\u201d IEEE Transactions on\r\nAudio, Speech, and Language Processing, vol. 18, no. 8, pp. 2080\u20132090,\r\n2010.\r\n[5] P. C. Loizou and G. Kim, \u201cReasons Why Current Speech-Enhancement\r\nAlgorithms do not Improve Speech Intelligibility and Suggested\r\nSolutions,\u201d IEEE Transactions on Audio, Speech, and Language\r\nProcessing, vol. 19, no. 1, pp. 47\u201356, 2010.\r\n[6] D. Wang and J. Chen, \u201cSupervised speech separation based on deep\r\nlearning: An overview,\u201d IEEE Transactions on Audio, Speech, and\r\nLanguage Processing, vol. 26, no. 10, pp. 1702\u20131726, 2018.\r\n[7] M. Kolb\u00e6k, Z.-H. Tan, and J. Jensen,\r\n\u201cSpeech Intelligibility Potential of General and Specialized Deep Neural\r\nNetwork based Speech Enhancement Systems,\u201d IEEE Transactions on\r\nAudio, Speech, and Language Processing, vol. 25, no. 1, pp. 153\u2013167,\r\n2017.\r\n[8] S. Y. Low, D. S. Pham, and S. Venkatesh, \u201cCompressive Speech\r\nEnhancement,\u201d Speech Communication, vol. 55, no. 6, pp. 757\u2013768,\r\n2013.\r\n[9] M. Srivastava, C. L. Anderson, and J. H. Freed, \u201cA New Wavelet\r\nDenoising Method for Selecting Decomposition Levels and Noise\r\nThresholds,\u201d IEEE Access, vol. 
4, pp. 3862\u20133877, 2016.\r\n[10] J. S. Garofolo et al., \u201cGetting started with the DARPA TIMIT CD-ROM:\r\nAn acoustic phonetic continuous speech database,\u201d National Institute of\r\nStandards and Technology (NIST), Gaithersburg, MD, vol. 107, pp.\r\n1\u20136, 1988.\r\n[11] A. Varga and H. J. Steeneken, \u201cAssessment for Automatic Speech\r\nRecognition: II. NOISEX-92: A Database and an Experiment to Study\r\nthe Effect of Additive Noise on Speech Recognition Systems,\u201d Speech\r\nCommunication, vol. 12, no. 3, pp. 247\u2013251, 1993.\r\n[12] D. L. Donoho and I. M. Johnstone, \u201cIdeal Spatial Adaptation by Wavelet\r\nShrinkage,\u201d Biometrika, vol. 81, no. 3, pp. 425\u2013455, 1994.\r\n[13] D. L. Donoho, \u201cDe-noising by soft-thresholding,\u201d IEEE Transactions on\r\nInformation Theory, vol. 41, no. 3, pp. 613\u2013627, 1995.\r\n[14] C. H. Taal, R. C. Hendriks, R. Heusdens, and J. Jensen, \u201cAn\r\nAlgorithm for Intelligibility Prediction of Time\u2013Frequency Weighted\r\nNoisy Speech,\u201d IEEE Transactions on Audio, Speech, and Language\r\nProcessing, vol. 19, no. 7, pp. 2125\u20132136, 2011.","publisher":"World Academy of Science, Engineering and Technology","index":"Open Science Index 157, 2020"}