Search results for: A. Serackis
2 Bidirectional Dynamic Time Warping Algorithm for the Recognition of Isolated Words Impacted by Transient Noise Pulses
Authors: G. Tamulevičius, A. Serackis, T. Sledevič, D. Navakauskas
Abstract:
We consider the biggest challenge in speech recognition – noise reduction. Traditionally detected transient noise pulses are removed with the corrupted speech using pulse models. In this paper we propose to cope with the problem directly in Dynamic Time Warping domain. Bidirectional Dynamic Time Warping algorithm for the recognition of isolated words impacted by transient noise pulses is proposed. It uses simple transient noise pulse detector, employs bidirectional computation of dynamic time warping and directly manipulates with warping results. Experimental investigation with several alternative solutions confirms effectiveness of the proposed algorithm in the reduction of impact of noise on recognition process – 3.9% increase of the noisy speech recognition is achieved.
Keywords: Transient noise pulses, noise reduction, dynamic time warping, speech recognition.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19461 Evaluation of Features Extraction Algorithms for a Real-Time Isolated Word Recognition System
Authors: Tomyslav Sledevič, Artūras Serackis, Gintautas Tamulevičius, Dalius Navakauskas
Abstract:
Paper presents an comparative evaluation of features extraction algorithm for a real-time isolated word recognition system based on FPGA. The Mel-frequency cepstral, linear frequency cepstral, linear predictive and their cepstral coefficients were implemented in hardware/software design. The proposed system was investigated in speaker dependent mode for 100 different Lithuanian words. The robustness of features extraction algorithms was tested recognizing the speech records at different signal to noise rates. The experiments on clean records show highest accuracy for Mel-frequency cepstral and linear frequency cepstral coefficients. For records with 15 dB signal to noise rate the linear predictive cepstral coefficients gives best result. The hard and soft part of the system is clocked on 50 MHz and 100 MHz accordingly. For the classification purpose the pipelined dynamic time warping core was implemented. The proposed word recognition system satisfy the real-time requirements and is suitable for applications in embedded systems.
Keywords: Isolated word recognition, features extraction, MFCC, LFCC, LPCC, LPC, FPGA, DTW.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3540