Search results for: Speech denoising
290 A Sparse Representation Speech Denoising Method Based on Adapted Stopping Residue Error
Authors: Qianhua He, Weili Zhou, Aiwu Chen
Abstract:
A sparse representation speech denoising method based on adapted stopping residue error was presented in this paper. Firstly, the cross-correlation between the clean speech spectrum and the noise spectrum was analyzed, and an estimation method was proposed. In the denoising method, an over-complete dictionary of the clean speech power spectrum was learned with the K-singular value decomposition (K-SVD) algorithm. In the sparse representation stage, the stopping residue error was adaptively achieved according to the estimated cross-correlation and the adjusted noise spectrum, and the orthogonal matching pursuit (OMP) approach was applied to reconstruct the clean speech spectrum from the noisy speech. Finally, the clean speech was re-synthesised via the inverse Fourier transform with the reconstructed speech spectrum and the noisy speech phase. The experiment results show that the proposed method outperforms the conventional methods in terms of subjective and objective measure.
Keywords: Speech denoising, sparse representation, K-singular value decomposition, orthogonal matching pursuit.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1014289 Spectral Entropy Employment in Speech Enhancement based on Wavelet Packet
Authors: Talbi Mourad, Salhi Lotfi, Chérif Adnen
Abstract:
In this work, we are interested in developing a speech denoising tool by using a discrete wavelet packet transform (DWPT). This speech denoising tool will be employed for applications of recognition, coding and synthesis. For noise reduction, instead of applying the classical thresholding technique, some wavelet packet nodes are set to zero and the others are thresholded. To estimate the non stationary noise level, we employ the spectral entropy. A comparison of our proposed technique to classical denoising methods based on thresholding and spectral subtraction is made in order to evaluate our approach. The experimental implementation uses speech signals corrupted by two sorts of noise, white and Volvo noises. The obtained results from listening tests show that our proposed technique is better than spectral subtraction. The obtained results from SNR computation show the superiority of our technique when compared to the classical thresholding method using the modified hard thresholding function based on u-law algorithm.
Keywords: Enhancement, spectral subtraction, SNR, discrete wavelet packet transform, spectral entropy Histogram
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1991288 Fractional Masks Based On Generalized Fractional Differential Operator for Image Denoising
Authors: Hamid A. Jalab, Rabha W. Ibrahim
Abstract:
This paper introduces an image denoising algorithm based on generalized Srivastava-Owa fractional differential operator for removing Gaussian noise in digital images. The structures of nxn fractional masks are constructed by this algorithm. Experiments show that, the capability of the denoising algorithm by fractional differential-based approach appears efficient to smooth the Gaussian noisy images for different noisy levels. The denoising performance is measured by using peak signal to noise ratio (PSNR) for the denoising images. The results showed an improved performance (higher PSNR values) when compared with standard Gaussian smoothing filter.
Keywords: Fractional calculus, fractional differential operator, fractional mask, fractional filter.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3003287 Electrocardiogram Signal Denoising Using a Hybrid Technique
Authors: R. Latif, W. Jenkal, A. Toumanari, A. Hatim
Abstract:
This paper presents an efficient method of electrocardiogram signal denoising based on a hybrid approach. Two techniques are brought together to create an efficient denoising process. The first is an Adaptive Dual Threshold Filter (ADTF) and the second is the Discrete Wavelet Transform (DWT). The presented approach is based on three steps of denoising, the DWT decomposition, the ADTF step and the highest peaks correction step. This paper presents some application of the approach on some electrocardiogram signals of the MIT-BIH database. The results of these applications are promising compared to other recently published techniques.Keywords: Hybrid technique, ADTF, DWT, tresholding, ECG signal.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1199286 Speech Enhancement Using Wavelet Coefficients Masking with Local Binary Patterns
Authors: Christian Arcos, Marley Vellasco, Abraham Alcaim
Abstract:
In this paper, we present a wavelet coefficients masking based on Local Binary Patterns (WLBP) approach to enhance the temporal spectra of the wavelet coefficients for speech enhancement. This technique exploits the wavelet denoising scheme, which splits the degraded speech into pyramidal subband components and extracts frequency information without losing temporal information. Speech enhancement in each high-frequency subband is performed by binary labels through the local binary pattern masking that encodes the ratio between the original value of each coefficient and the values of the neighbour coefficients. This approach enhances the high-frequency spectra of the wavelet transform instead of eliminating them through a threshold. A comparative analysis is carried out with conventional speech enhancement algorithms, demonstrating that the proposed technique achieves significant improvements in terms of PESQ, an international recommendation of objective measure for estimating subjective speech quality. Informal listening tests also show that the proposed method in an acoustic context improves the quality of speech, avoiding the annoying musical noise present in other speech enhancement techniques. Experimental results obtained with a DNN based speech recognizer in noisy environments corroborate the superiority of the proposed scheme in the robust speech recognition scenario.Keywords: Binary labels, local binary patterns, mask, wavelet coefficients, speech enhancement, speech recognition.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1017285 Denoising and Compression in Wavelet Domainvia Projection on to Approximation Coefficients
Authors: Mario Mastriani
Abstract:
We describe a new filtering approach in the wavelet domain for image denoising and compression, based on the projections of details subbands coefficients (resultants of the splitting procedure, typical in wavelet domain) onto the approximation subband coefficients (much less noisy). The new algorithm is called Projection Onto Approximation Coefficients (POAC). As a result of this approach, only the approximation subband coefficients and three scalars are stored and/or transmitted to the channel. Besides, with the elimination of the details subbands coefficients, we obtain a bigger compression rate. Experimental results demonstrate that our approach compares favorably to more typical methods of denoising and compression in wavelet domain.
Keywords: Compression, denoising, projections, wavelets.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1614284 An Advanced Method for Speech Recognition
Authors: Meysam Mohamad pour, Fardad Farokhi
Abstract:
In this paper in consideration of each available techniques deficiencies for speech recognition, an advanced method is presented that-s able to classify speech signals with the high accuracy (98%) at the minimum time. In the presented method, first, the recorded signal is preprocessed that this section includes denoising with Mels Frequency Cepstral Analysis and feature extraction using discrete wavelet transform (DWT) coefficients; Then these features are fed to Multilayer Perceptron (MLP) network for classification. Finally, after training of neural network effective features are selected with UTA algorithm.Keywords: Multilayer perceptron (MLP) neural network, Discrete Wavelet Transform (DWT) , Mels Scale Frequency Filter , UTA algorithm.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2365283 Performance Improvement in the Bivariate Models by using Modified Marginal Variance of Noisy Observations for Image-Denoising Applications
Authors: R. Senthilkumar
Abstract:
Most simple nonlinear thresholding rules for wavelet- based denoising assume that the wavelet coefficients are independent. However, wavelet coefficients of natural images have significant dependencies. This paper attempts to give a recipe for selecting one of the popular image-denoising algorithms based on VisuShrink, SureShrink, OracleShrink, BayesShrink and BiShrink and also this paper compares different Bivariate models used for image denoising applications. The first part of the paper compares different Shrinkage functions used for image-denoising. The second part of the paper compares different bivariate models and the third part of this paper uses the Bivariate model with modified marginal variance which is based on Laplacian assumption. This paper gives an experimental comparison on six 512x512 commonly used images, Lenna, Barbara, Goldhill, Clown, Boat and Stonehenge. The following noise powers 25dB,26dB, 27dB, 28dB and 29dB are added to the six standard images and the corresponding Peak Signal to Noise Ratio (PSNR) values are calculated for each noise level.Keywords: BiShrink, Image-Denoising, PSNR, Shrinkage function
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1346282 Microarrays Denoising via Smoothing of Coefficients in Wavelet Domain
Authors: Mario Mastriani, Alberto E. Giraldez
Abstract:
We describe a novel method for removing noise (in wavelet domain) of unknown variance from microarrays. The method is based on a smoothing of the coefficients of the highest subbands. Specifically, we decompose the noisy microarray into wavelet subbands, apply smoothing within each highest subband, and reconstruct a microarray from the modified wavelet coefficients. This process is applied a single time, and exclusively to the first level of decomposition, i.e., in most of the cases, it is not necessary a multirresoltuion analysis. Denoising results compare favorably to the most of methods in use at the moment.
Keywords: Directional smoothing, denoising, edge preservation, microarrays, thresholding, wavelets
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1501281 Empirical Mode Decomposition Based Denoising by Customized Thresholding
Authors: Wahiba Mohguen, Raïs El’hadi Bekka
Abstract:
This paper presents a denoising method called EMD-Custom that was based on Empirical Mode Decomposition (EMD) and the modified Customized Thresholding Function (Custom) algorithms. EMD was applied to decompose adaptively a noisy signal into intrinsic mode functions (IMFs). Then, all the noisy IMFs got threshold by applying the presented thresholding function to suppress noise and to improve the signal to noise ratio (SNR). The method was tested on simulated data and real ECG signal, and the results were compared to the EMD-Based signal denoising methods using the soft and hard thresholding. The results showed the superior performance of the proposed EMD-Custom denoising over the traditional approach. The performances were evaluated in terms of SNR in dB, and Mean Square Error (MSE).
Keywords: Customized thresholding, ECG signal, EMD, hard thresholding, Soft-thresholding.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1084280 Feature Preserving Nonlinear Diffusion for Ultrasonic Image Denoising and Edge Enhancement
Authors: Shujun Fu, Qiuqi Ruan, Wenqia Wang, Yu Li
Abstract:
Utilizing echoic intension and distribution from different organs and local details of human body, ultrasonic image can catch important medical pathological changes, which unfortunately may be affected by ultrasonic speckle noise. A feature preserving ultrasonic image denoising and edge enhancement scheme is put forth, which includes two terms: anisotropic diffusion and edge enhancement, controlled by the optimum smoothing time. In this scheme, the anisotropic diffusion is governed by the local coordinate transformation and the first and the second order normal derivatives of the image, while the edge enhancement is done by the hyperbolic tangent function. Experiments on real ultrasonic images indicate effective preservation of edges, local details and ultrasonic echoic bright strips on denoising by our scheme.
Keywords: anisotropic diffusion, coordinate transformationdirectional derivatives, edge enhancement, hyperbolic tangentfunction, image denoising.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1813279 Adaptive Anisotropic Diffusion for Ultrasonic Image Denoising and Edge Enhancement
Authors: Shujun Fu, Qiuqi Ruan, Wenqia Wang, Yu Li
Abstract:
Utilizing echoic intension and distribution from different organs and local details of human body, ultrasonic image can catch important medical pathological changes, which unfortunately may be affected by ultrasonic speckle noise. A feature preserving ultrasonic image denoising and edge enhancement scheme is put forth, which includes two terms: anisotropic diffusion and edge enhancement, controlled by the optimum smoothing time. In this scheme, the anisotropic diffusion is governed by the local coordinate transformation and the first and the second order normal derivatives of the image, while the edge enhancement is done by the hyperbolic tangent function. Experiments on real ultrasonic images indicate effective preservation of edges, local details and ultrasonic echoic bright strips on denoising by our scheme.
Keywords: anisotropic diffusion, coordinate transformation, directional derivatives, edge enhancement, hyperbolic tangent function, image denoising.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1899278 An Efficient Adaptive Thresholding Technique for Wavelet Based Image Denoising
Authors: D.Gnanadurai, V.Sadasivam
Abstract:
This frame work describes a computationally more efficient and adaptive threshold estimation method for image denoising in the wavelet domain based on Generalized Gaussian Distribution (GGD) modeling of subband coefficients. In this proposed method, the choice of the threshold estimation is carried out by analysing the statistical parameters of the wavelet subband coefficients like standard deviation, arithmetic mean and geometrical mean. The noisy image is first decomposed into many levels to obtain different frequency bands. Then soft thresholding method is used to remove the noisy coefficients, by fixing the optimum thresholding value by the proposed method. Experimental results on several test images by using this method show that this method yields significantly superior image quality and better Peak Signal to Noise Ratio (PSNR). Here, to prove the efficiency of this method in image denoising, we have compared this with various denoising methods like wiener filter, Average filter, VisuShrink and BayesShrink.Keywords: Wavelet Transform, Gaussian Noise, ImageDenoising, Filter Banks and Thresholding.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2906277 Complex Wavelet Transform Based Image Denoising and Zooming Under the LMMSE Framework
Authors: T. P. Athira, Gibin Chacko George
Abstract:
This paper proposes a dual tree complex wavelet transform (DT-CWT) based directional interpolation scheme for noisy images. The problems of denoising and interpolation are modelled as to estimate the noiseless and missing samples under the same framework of optimal estimation. Initially, DT-CWT is used to decompose an input low-resolution noisy image into low and high frequency subbands. The high-frequency subband images are interpolated by linear minimum mean square estimation (LMMSE) based interpolation, which preserves the edges of the interpolated images. For each noisy LR image sample, we compute multiple estimates of it along different directions and then fuse those directional estimates for a more accurate denoised LR image. The estimation parameters calculated in the denoising processing can be readily used to interpolate the missing samples. The inverse DT-CWT is applied on the denoised input and interpolated high frequency subband images to obtain the high resolution image. Compared with the conventional schemes that perform denoising and interpolation in tandem, the proposed DT-CWT based noisy image interpolation method can reduce many noise-caused interpolation artifacts and preserve well the image edge structures. The visual and quantitative results show that the proposed technique outperforms many of the existing denoising and interpolation methods.
Keywords: Dual-tree complex wavelet transform (DT-CWT), denoising, interpolation, optimal estimation, super resolution.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2163276 Automatic Recognition of Emotionally Coloured Speech
Authors: Theologos Athanaselis, Stelios Bakamidis, Ioannis Dologlou
Abstract:
Emotion in speech is an issue that has been attracting the interest of the speech community for many years, both in the context of speech synthesis as well as in automatic speech recognition (ASR). In spite of the remarkable recent progress in Large Vocabulary Recognition (LVR), it is still far behind the ultimate goal of recognising free conversational speech uttered by any speaker in any environment. Current experimental tests prove that using state of the art large vocabulary recognition systems the error rate increases substantially when applied to spontaneous/emotional speech. This paper shows that recognition rate for emotionally coloured speech can be improved by using a language model based on increased representation of emotional utterances.Keywords: Statistical language model, N-grams, emotionallycoloured speech
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1617275 Effect of Visual Speech in Sign Speech Synthesis
Authors: Zdenek Krnoul
Abstract:
This article investigates a contribution of synthesized visual speech. Synthesis of visual speech expressed by a computer consists in an animation in particular movements of lips. Visual speech is also necessary part of the non-manual component of a sign language. Appropriate methodology is proposed to determine the quality and the accuracy of synthesized visual speech. Proposed methodology is inspected on Czech speech. Hence, this article presents a procedure of recording of speech data in order to set a synthesis system as well as to evaluate synthesized speech. Furthermore, one option of the evaluation process is elaborated in the form of a perceptual test. This test procedure is verified on the measured data with two settings of the synthesis system. The results of the perceptual test are presented as a statistically significant increase of intelligibility evoked by real and synthesized visual speech. Now, the aim is to show one part of evaluation process which leads to more comprehensive evaluation of the sign speech synthesis system.
Keywords: Perception test, Sign speech synthesis, Talking head, Visual speech.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1476274 Anisotropic Total Fractional Order Variation Model in Seismic Data Denoising
Authors: Jianwei Ma, Diriba Gemechu
Abstract:
In seismic data processing, attenuation of random noise is the basic step to improve quality of data for further application of seismic data in exploration and development in different gas and oil industries. The signal-to-noise ratio of the data also highly determines quality of seismic data. This factor affects the reliability as well as the accuracy of seismic signal during interpretation for different purposes in different companies. To use seismic data for further application and interpretation, we need to improve the signal-to-noise ration while attenuating random noise effectively. To improve the signal-to-noise ration and attenuating seismic random noise by preserving important features and information about seismic signals, we introduce the concept of anisotropic total fractional order denoising algorithm. The anisotropic total fractional order variation model defined in fractional order bounded variation is proposed as a regularization in seismic denoising. The split Bregman algorithm is employed to solve the minimization problem of the anisotropic total fractional order variation model and the corresponding denoising algorithm for the proposed method is derived. We test the effectiveness of theproposed method for synthetic and real seismic data sets and the denoised result is compared with F-X deconvolution and non-local means denoising algorithm.Keywords: Anisotropic total fractional order variation, fractional order bounded variation, seismic random noise attenuation, Split Bregman Algorithm.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1013273 Systholic Boolean Orthonormalizer Network in Wavelet Domain for Microarray Denoising
Authors: Mario Mastriani
Abstract:
We describe a novel method for removing noise (in wavelet domain) of unknown variance from microarrays. The method is based on the following procedure: We apply 1) Bidimentional Discrete Wavelet Transform (DWT-2D) to the Noisy Microarray, 2) scaling and rounding to the coefficients of the highest subbands (to obtain integer and positive coefficients), 3) bit-slicing to the new highest subbands (to obtain bit-planes), 4) then we apply the Systholic Boolean Orthonormalizer Network (SBON) to the input bit-plane set and we obtain two orthonormal otput bit-plane sets (in a Boolean sense), we project a set on the other one, by means of an AND operation, and then, 5) we apply re-assembling, and, 6) rescaling. Finally, 7) we apply Inverse DWT-2D and reconstruct a microarray from the modified wavelet coefficients. Denoising results compare favorably to the most of methods in use at the moment.
Keywords: Bit-Plane, Boolean Orthonormalization Process, Denoising, Microarrays, Wavelets
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1489272 Multiscale Blind Image Restoration with a New Method
Authors: Alireza Mallahzadeh, Hamid Dehghani, Iman Elyasi
Abstract:
A new method, based on the normal shrink and modified version of Katssagelous and Lay, is proposed for multiscale blind image restoration. The method deals with the noise and blur in the images. It is shown that the normal shrink gives the highest S/N (signal to noise ratio) for image denoising process. The multiscale blind image restoration is divided in two sections. The first part of this paper proposes normal shrink for image denoising and the second part of paper proposes modified version of katssagelous and Lay for blur estimation and the combination of both methods to reach a multiscale blind image restoration.Keywords: Multiscale blind image restoration, image denoising, blur estimation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1721271 The Main Principles of Text-to-Speech Synthesis System
Authors: K.R. Aida–Zade, C. Ardil, A.M. Sharifova
Abstract:
In this paper, the main principles of text-to-speech synthesis system are presented. Associated problems which arise when developing speech synthesis system are described. Used approaches and their application in the speech synthesis systems for Azerbaijani language are shown.
Keywords: synthesis of Azerbaijani language, morphemes, phonemes, sounds, sentence, speech synthesizer, intonation, accent, pronunciation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5652270 TeleMe Speech Booster: Web-Based Speech Therapy and Training Program for Children with Articulation Disorders
Authors: C. Treerattanaphan, P. Boonpramuk, P. Singla
Abstract:
Frequent, continuous speech training has proven to be a necessary part of a successful speech therapy process, but constraints of traveling time and employment dispensation become key obstacles especially for individuals living in remote areas or for dependent children who have working parents. In order to ameliorate speech difficulties with ample guidance from speech therapists, a website has been developed that supports speech therapy and training for people with articulation disorders in the standard Thai language. This web-based program has the ability to record speech training exercises for each speech trainee. The records will be stored in a database for the speech therapist to investigate, evaluate, compare and keep track of all trainees’ progress in detail. Speech trainees can request live discussions via video conference call when needed. Communication through this web-based program facilitates and reduces training time in comparison to walk-in training or appointments. This type of training also allows people with articulation disorders to practice speech lessons whenever or wherever is convenient for them, which can lead to a more regular training processes.
Keywords: Web-Based Remote Training Program, Thai Speech Therapy, Articulation Disorders.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1859269 Blind Speech Separation Using SRP-PHAT Localization and Optimal Beamformer in Two-Speaker Environments
Authors: Hai Quang Hong Dam, Hai Ho, Minh Hoang Le Ngo
Abstract:
This paper investigates the problem of blind speech separation from the speech mixture of two speakers. A voice activity detector employing the Steered Response Power - Phase Transform (SRP-PHAT) is presented for detecting the activity information of speech sources and then the desired speech signals are extracted from the speech mixture by using an optimal beamformer. For evaluation, the algorithm effectiveness, a simulation using real speech recordings had been performed in a double-talk situation where two speakers are active all the time. Evaluations show that the proposed blind speech separation algorithm offers a good interference suppression level whilst maintaining a low distortion level of the desired signal.Keywords: Blind speech separation, voice activity detector, SRP-PHAT, optimal beamformer.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1388268 Evaluation of a Multi-Resolution Dyadic Wavelet Transform Method for usable Speech Detection
Authors: Wajdi Ghezaiel, Amel Ben Slimane Rahmouni, Ezzedine Ben Braiek
Abstract:
Many applications of speech communication and speaker identification suffer from the problem of co-channel speech. This paper deals with a multi-resolution dyadic wavelet transform method for usable segments of co-channel speech detection that could be processed by a speaker identification system. Evaluation of this method is performed on TIMIT database referring to the Target to Interferer Ratio measure. Co-channel speech is constructed by mixing all possible gender speakers. Results do not show much difference for different mixtures. For the overall mixtures 95.76% of usable speech is correctly detected with false alarms of 29.65%.Keywords: Co-channel speech, usable speech, multi-resolutionanalysis, speaker identification
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1365267 Elimination Noise by Adaptive Wavelet Threshold
Authors: Iman Elyasi, Sadegh Zarmehi
Abstract:
Due to some reasons, observed images are degraded which are mainly caused by noise. Recently image denoising using the wavelet transform has been attracting much attention. Waveletbased approach provides a particularly useful method for image denoising when the preservation of edges in the scene is of importance because the local adaptivity is based explicitly on the values of the wavelet detail coefficients. In this paper, we propose several methods of noise removal from degraded images with Gaussian noise by using adaptive wavelet threshold (Bayes Shrink, Modified Bayes Shrink and Normal Shrink). The proposed thresholds are simple and adaptive to each subband because the parameters required for estimating the threshold depend on subband data. Experimental results show that the proposed thresholds remove noise significantly and preserve the edges in the scene.
Keywords: Image denoising, Bayes Shrink, Modified Bayes Shrink, Normal Shrink.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2472266 Narrowband Speech Hiding using Vector Quantization
Authors: Driss Guerchi, Fatiha Djebbar
Abstract:
In this work we introduce an efficient method to limit the impact of the hiding process on the quality of the cover speech. Vector quantization of the speech spectral information reduces drastically the number of the secret speech parameters to be embedded in the cover signal. Compared to scalar hiding, vector quantization hiding technique provides a stego signal that is indistinguishable from the cover speech. The objective and subjective performance measures reveal that the current hiding technique attracts no suspicion about the presence of the secret message in the stego speech, while being able to recover an intelligible copy of the secret message at the receiver side.Keywords: Speech steganography, LSF vector quantization, fast Fourier transform
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1514265 Using Teager Energy Cepstrum and HMM distancesin Automatic Speech Recognition and Analysis of Unvoiced Speech
Authors: Panikos Heracleous
Abstract:
In this study, the use of silicon NAM (Non-Audible Murmur) microphone in automatic speech recognition is presented. NAM microphones are special acoustic sensors, which are attached behind the talker-s ear and can capture not only normal (audible) speech, but also very quietly uttered speech (non-audible murmur). As a result, NAM microphones can be applied in automatic speech recognition systems when privacy is desired in human-machine communication. Moreover, NAM microphones show robustness against noise and they might be used in special systems (speech recognition, speech conversion etc.) for sound-impaired people. Using a small amount of training data and adaptation approaches, 93.9% word accuracy was achieved for a 20k Japanese vocabulary dictation task. Non-audible murmur recognition in noisy environments is also investigated. In this study, further analysis of the NAM speech has been made using distance measures between hidden Markov model (HMM) pairs. It has been shown the reduced spectral space of NAM speech using a metric distance, however the location of the different phonemes of NAM are similar to the location of the phonemes of normal speech, and the NAM sounds are well discriminated. Promising results in using nonlinear features are also introduced, especially under noisy conditions.Keywords: Speech recognition, unvoiced speech, nonlinear features, HMM distance measures
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1646264 Analysis of Combined Use of NN and MFCC for Speech Recognition
Authors: Safdar Tanweer, Abdul Mobin, Afshar Alam
Abstract:
The performance and analysis of speech recognition system is illustrated in this paper. An approach to recognize the English word corresponding to digit (0-9) spoken by 2 different speakers is captured in noise free environment. For feature extraction, speech Mel frequency cepstral coefficients (MFCC) has been used which gives a set of feature vectors from recorded speech samples. Neural network model is used to enhance the recognition performance. Feed forward neural network with back propagation algorithm model is used. However other speech recognition techniques such as HMM, DTW exist. All experiments are carried out on Matlab.
Keywords: Speech Recognition, MFCC, Neural Network, classifier.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3267263 On SNR Estimation by the Likelihood of near Pitch for Speech Detection
Authors: Young-Hwan Song, Doo-Heon Kyun, Jong-Kuk Kim, Myung-Jin Bae
Abstract:
People have the habitual pitch level which is used when people say something generally. However this pitch should be changed irregularly in the presence of noise. So it is useful to estimate SNR of speech signal by pitch. In this paper, we obtain the energy of input speech signal and then we detect a stationary region on voiced speech. And we get the pitch period by NAMDF for the stationary region that is not varied pitch rapidly. After getting pitch, each frame is divided by pitch period and the likelihood of closed pitch is estimated. In this paper, we proposed new parameter, NLF, to estimate the SNR of received speech signal. The NLF is derived from the correlation of near pitch periods. The NLF is obtained for each stationary region in voiced speech. Finally we confirmed good performance of the estimation of the SNR of received input speech in the presence of noise.
Keywords: Likelihood, pitch, SNR, speech.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1574262 Speech Impact Realization via Manipulative Argumentation Techniques in Modern American Political Discourse
Authors: Zarine Avetisyan
Abstract:
The present paper presents the discussion of scholars concerning speech impact, peculiarities of its realization, speech strategies and techniques in particular. Departing from the viewpoints of many prominent linguists, the paper suggests that manipulative argumentation be viewed as a most pervasive speech strategy with a certain set of techniques which are to be found in modern American political discourse. The precedence of their occurrence allows us to regard them as pragmatic patterns of speech impact realization in effective public speaking.Keywords: Manipulative argumentation, political discourse, speech impact, technique.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2289261 Speech Enhancement Using Kalman Filter in Communication
Authors: Eng. Alaa K. Satti Salih
Abstract:
Revolutions Applications such as telecommunications, hands-free communications, recording, etc. which need at least one microphone, the signal is usually infected by noise and echo. The important application is the speech enhancement, which is done to remove suppressed noises and echoes taken by a microphone, beside preferred speech. Accordingly, the microphone signal has to be cleaned using digital signal processing DSP tools before it is played out, transmitted, or stored. Engineers have so far tried different approaches to improving the speech by get back the desired speech signal from the noisy observations. Especially Mobile communication, so in this paper will do reconstruction of the speech signal, observed in additive background noise, using the Kalman filter technique to estimate the parameters of the Autoregressive Process (AR) in the state space model and the output speech signal obtained by the MATLAB. The accurate estimation by Kalman filter on speech would enhance and reduce the noise then compare and discuss the results between actual values and estimated values which produce the reconstructed signals.
Keywords: Autoregressive Process, Kalman filter, Matlab and Noise speech.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4025