Search results for: speech signal processing.

2575 Recognition by Online Modeling – a New Approach of Recognizing Voice Signals in Linear Time

Abstract:

This work presents a novel means of extracting fixedlength parameters from voice signals, such that words can be recognized in linear time. The power and the zero crossing rate are first calculated segment by segment from a voice signal; by doing so, two feature sequences are generated. We then construct an FIR system across these two sequences. The parameters of this FIR system, used as the input of a multilayer proceptron recognizer, can be derived by recursive LSE (least-square estimation), implying that the complexity of overall process is linear to the signal size. In the second part of this work, we introduce a weighting factor λ to emphasize recent input; therefore, we can further recognize continuous speech signals. Experiments employ the voice signals of numbers, from zero to nine, spoken in Mandarin Chinese. The proposed method is verified to recognize voice signals efficiently and accurately.

Keywords: Speech Recognition, FIR system, Recursive LSE, Multilayer Perceptron

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1417

2574 High-Individuality Voice Conversion Based on Concatenative Speech Synthesis

Authors: Kei Fujii, Jun Okawa, Kaori Suigetsu

Abstract:

Concatenative speech synthesis is a method that can make speech sound which has naturalness and high-individuality of a speaker by introducing a large speech corpus. Based on this method, in this paper, we propose a voice conversion method whose conversion speech has high-individuality and naturalness. The authors also have two subjective evaluation experiments for evaluating individuality and sound quality of conversion speech. From the results, following three facts have be confirmed: (a) the proposal method can convert the individuality of speakers well, (b) employing the framework of unit selection (especially join cost) of concatenative speech synthesis into conventional voice conversion improves the sound quality of conversion speech, and (c) the proposal method is robust against the difference of genders between a source speaker and a target speaker.

Keywords: concatenative speech synthesis, join cost, speaker individuality, unit selection, voice conversion

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1939

2573 Review of Surface Electromyogram Signals: Its Analysis and Applications

Authors: Anjana Goen, D. C. Tiwari

Abstract:

Electromyography (EMG) is the study of muscles function through analysis of electrical activity produced from muscles. This electrical activity which is displayed in the form of signal is the result of neuromuscular activation associated with muscle contraction. The most common techniques of EMG signal recording are by using surface and needle/wire electrode where the latter is usually used for interest in deep muscle. This paper will focus on surface electromyogram (SEMG) signal. During SEMG recording, several problems had to been countered such as noise, motion artifact and signal instability. Thus, various signal processing techniques had been implemented to produce a reliable signal for analysis. SEMG signal finds broad application particularly in biomedical field. It had been analyzed and studied for various interests such as neuromuscular disease, enhancement of muscular function and human-computer interface.

Keywords: Evolvable hardware (EHW), Functional Electrical Simulation (FES), Hidden Markov Model (HMM), Hjorth Time Domain (HTD).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3516

2572 Comparative Study of Filter Characteristics as Statistical Vocal Correlates of Clinical Psychiatric State in Human

Authors: Thaweesak Yingthawornsuk, Chusak Thanawattano

Abstract:

Acoustical properties of speech have been shown to be related to mental states of speaker with symptoms: depression and remission. This paper describes way to address the issue of distinguishing depressed patients from remitted subjects based on measureable acoustics change of their spoken sound. The vocal-tract related frequency characteristics of speech samples from female remitted and depressed patients were analyzed via speech processing techniques and consequently, evaluated statistically by cross-validation with Support Vector Machine. Our results comparatively show the classifier's performance with effectively correct separation of 93% determined from testing with the subjectbased feature model and 88% from the frame-based model based on the same speech samples collected from hospital visiting interview sessions between patients and psychiatrists.

Keywords: Depression, SVM, Vocal Extract, Vocal Tract

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1541

2571 Oil Debris Signal Detection Based on Integral Transform and Empirical Mode Decomposition

Authors: Chuan Li, Ming Liang

Abstract:

Oil debris signal generated from the inductive oil debris monitor (ODM) is useful information for machine condition monitoring but is often spoiled by background noise. To improve the reliability in machine condition monitoring, the high-fidelity signal has to be recovered from the noisy raw data. Considering that the noise components with large amplitude often have higher frequency than that of the oil debris signal, the integral transform is proposed to enhance the detectability of the oil debris signal. To cancel out the baseline wander resulting from the integral transform, the empirical mode decomposition (EMD) method is employed to identify the trend components. An optimal reconstruction strategy including both de-trending and de-noising is presented to detect the oil debris signal with less distortion. The proposed approach is applied to detect the oil debris signal in the raw data collected from an experimental setup. The result demonstrates that this approach is able to detect the weak oil debris signal with acceptable distortion from noisy raw data.

Keywords: Integral transform, empirical mode decomposition, oil debris, signal processing, detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1716

2570 Environmentally Adaptive Acoustic Echo Suppression for Barge-in Speech Recognition

Authors: Jong Han Joo, Jeong Hun Lee, Young Sun Kim, Jae Young Kang, Seung Ho Choi

Abstract:

In this study, we propose a novel technique for acoustic echo suppression (AES) during speech recognition under barge-in conditions. Conventional AES methods based on spectral subtraction apply fixed weights to the estimated echo path transfer function (EPTF) at the current signal segment and to the EPTF estimated until the previous time interval. However, the effects of echo path changes should be considered for eliminating the undesired echoes. We describe a new approach that adaptively updates weight parameters in response to abrupt changes in the acoustic environment due to background noises or double-talk. Furthermore, we devised a voice activity detector and an initial time-delay estimator for barge-in speech recognition in communication networks. The initial time delay is estimated using log-spectral distance measure, as well as cross-correlation coefficients. The experimental results show that the developed techniques can be successfully applied in barge-in speech recognition systems.

Keywords: Acoustic echo suppression, barge-in, speech recognition, echo path transfer function, initial delay estimator, voice activity detector.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2317

2569 Complex-Valued Neural Network in Signal Processing: A Study on the Effectiveness of Complex Valued Generalized Mean Neuron Model

Authors: Anupama Pande, Ashok Kumar Thakur, Swapnoneel Roy

Abstract:

A complex valued neural network is a neural network which consists of complex valued input and/or weights and/or thresholds and/or activation functions. Complex-valued neural networks have been widening the scope of applications not only in electronics and informatics, but also in social systems. One of the most important applications of the complex valued neural network is in signal processing. In Neural networks, generalized mean neuron model (GMN) is often discussed and studied. The GMN includes a new aggregation function based on the concept of generalized mean of all the inputs to the neuron. This paper aims to present exhaustive results of using Generalized Mean Neuron model in a complex-valued neural network model that uses the back-propagation algorithm (called -Complex-BP-) for learning. Our experiments results demonstrate the effectiveness of a Generalized Mean Neuron Model in a complex plane for signal processing over a real valued neural network. We have studied and stated various observations like effect of learning rates, ranges of the initial weights randomly selected, error functions used and number of iterations for the convergence of error required on a Generalized Mean neural network model. Some inherent properties of this complex back propagation algorithm are also studied and discussed.

Keywords: Complex valued neural network, Generalized Meanneuron model, Signal processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1730

2568 An Approach to Noise Variance Estimation in Very Low Signal-to-Noise Ratio Stochastic Signals

Authors: Miljan B. Petrović, Dušan B. Petrović, Goran S. Nikolić

Abstract:

This paper describes a method for AWGN (Additive White Gaussian Noise) variance estimation in noisy stochastic signals, referred to as Multiplicative-Noising Variance Estimation (MNVE). The aim was to develop an estimation algorithm with minimal number of assumptions on the original signal structure. The provided MATLAB simulation and results analysis of the method applied on speech signals showed more accuracy than standardized AR (autoregressive) modeling noise estimation technique. In addition, great performance was observed on very low signal-to-noise ratios, which in general represents the worst case scenario for signal denoising methods. High execution time appears to be the only disadvantage of MNVE. After close examination of all the observed features of the proposed algorithm, it was concluded it is worth of exploring and that with some further adjustments and improvements can be enviably powerful.

Keywords: Noise, signal-to-noise ratio, stochastic signals, variance estimation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2258

2567 Voice Features as the Diagnostic Marker of Autism

Authors: Elena Lyakso, Olga Frolova, Yuri Matveev

Abstract:

The aim of the study is to determine the acoustic features of voice and speech of children with autism spectrum disorders (ASD) as a possible additional diagnostic criterion. The participants in the study were 95 children with ASD aged 5-16 years, 150 typically development (TD) children, and 103 adults – listening to children’s speech samples. Three types of experimental methods for speech analysis were performed: spectrographic, perceptual by listeners, and automatic recognition. In the speech of children with ASD, the pitch values, pitch range, values of frequency and intensity of the third formant (emotional) leading to the “atypical” spectrogram of vowels are higher than corresponding parameters in the speech of TD children. High values of vowel articulation index (VAI) are specific for ASD children’s speech signals. These acoustic features can be considered as diagnostic marker of autism. The ability of humans and automatic recognition of the psychoneurological state of children via their speech is determined.

Keywords: Autism spectrum disorders, biomarker of autism, child speech, voice features.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 619

2566 A Sparse Representation Speech Denoising Method Based on Adapted Stopping Residue Error

Authors: Qianhua He, Weili Zhou, Aiwu Chen

Abstract:

A sparse representation speech denoising method based on adapted stopping residue error was presented in this paper. Firstly, the cross-correlation between the clean speech spectrum and the noise spectrum was analyzed, and an estimation method was proposed. In the denoising method, an over-complete dictionary of the clean speech power spectrum was learned with the K-singular value decomposition (K-SVD) algorithm. In the sparse representation stage, the stopping residue error was adaptively achieved according to the estimated cross-correlation and the adjusted noise spectrum, and the orthogonal matching pursuit (OMP) approach was applied to reconstruct the clean speech spectrum from the noisy speech. Finally, the clean speech was re-synthesised via the inverse Fourier transform with the reconstructed speech spectrum and the noisy speech phase. The experiment results show that the proposed method outperforms the conventional methods in terms of subjective and objective measure.

Keywords: Speech denoising, sparse representation, K-singular value decomposition, orthogonal matching pursuit.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1014

2565 Eisenhower’s Farewell Speech: Initial and Continuing Communication Effects

Authors: B. Kuiper

Abstract:

When Dwight D. Eisenhower delivered his final Presidential speech in 1961, he was using the opportunity to bid farewell to America, but he was also trying to warn his fellow countrymen about deeper challenges threatening the country. In this analysis, Eisenhower’s speech is examined in light of the impact it had on American culture, communication concepts, and political ramifications. The paper initially highlights the previous literature on the speech, especially in light of its 50^thanniversary, and reveals a man whose main concern was how the speech’s words would affect his beloved country. The painstaking approach to the wording of the speech to reveal the intent is key, particularly in light of analyzing the motivations according to “virtuous communication.” This philosophical construct indicates that Eisenhower’s Farewell Address was crafted carefully according to a departing President’s deepest values and concerns, concepts that he wanted to pass along to his successor, to his country, and even to the world.

Keywords: Eisenhower, mass communication, political speech, rhetoric.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1869

2564 Piecewise Interpolation Filter for Effective Processing of Large Signal Sets

Authors: Anatoli Torokhti, Stanley Miklavcic

Abstract:

Suppose KY and KX are large sets of observed and reference signals, respectively, each containing N signals. Is it possible to construct a filter F : KY → KX that requires a priori information only on few signals, p N, from KX but performs better than the known filters based on a priori information on every reference signal from KX? It is shown that the positive answer is achievable under quite unrestrictive assumptions. The device behind the proposed method is based on a special extension of the piecewise linear interpolation technique to the case of random signal sets. The proposed technique provides a single filter to process any signal from the arbitrarily large signal set. The filter is determined in terms of pseudo-inverse matrices so that it always exists.

Keywords: Wiener filter, filtering of stochastic signals.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1412

2563 On-line Speech Enhancement by Time-Frequency Masking under Prior Knowledge of Source Location

Authors: Min Ah Kang, Sangbae Jeong, Minsoo Hahn

Abstract:

This paper presents the source extraction system which can extract only target signals with constraints on source localization in on-line systems. The proposed system is a kind of methods for enhancing a target signal and suppressing other interference signals. But, the performance of proposed system is superior to any other methods and the extraction of target source is comparatively complete. The method has a beamforming concept and uses an improved time-frequency (TF) mask-based BSS algorithm to separate a target signal from multiple noise sources. The target sources are assumed to be in front and test data was recorded in a reverberant room. The experimental results of the proposed method was evaluated by the PESQ score of real-recording sentences and showed a noticeable speech enhancement.

Keywords: Beam forming, Non-stationary noise reduction, Source separation, TF mask.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2022

2562 Wavelet-Based ECG Signal Analysis and Classification

Authors: Madina Hamiane, May Hashim Ali

Abstract:

This paper presents the processing and analysis of ECG signals. The study is based on wavelet transform and uses exclusively the MATLAB environment. This study includes removing Baseline wander and further de-noising through wavelet transform and metrics such as signal-to noise ratio (SNR), Peak signal-to-noise ratio (PSNR) and the mean squared error (MSE) are used to assess the efficiency of the de-noising techniques. Feature extraction is subsequently performed whereby signal features such as heart rate, rise and fall levels are extracted and the QRS complex was detected which helped in classifying the ECG signal. The classification is the last step in the analysis of the ECG signals and it is shown that these are successfully classified as Normal rhythm or Abnormal rhythm. The final result proved the adequacy of using wavelet transform for the analysis of ECG signals.

Keywords: ECG Signal, QRS detection, thresholding, wavelet decomposition, feature extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1273

2561 Neural Network Based Speech to Text in Malay Language

Authors: H. F. A. Abdul Ghani, R. R. Porle

Abstract:

Speech to text in Malay language is a system that converts Malay speech into text. The Malay language recognition system is still limited, thus, this paper aims to investigate the performance of ten Malay words obtained from the online Malay news. The methodology consists of three stages, which are preprocessing, feature extraction, and speech classification. In preprocessing stage, the speech samples are filtered using pre emphasis. After that, feature extraction method is applied to the samples using Mel Frequency Cepstrum Coefficient (MFCC). Lastly, speech classification is performed using Feedforward Neural Network (FFNN). The accuracy of the classification is further investigated based on the hidden layer size. From experimentation, the classifier with 40 hidden neurons shows the highest classification rate which is 94%.

Keywords: Feed-Forward Neural Network, FFNN, Malay speech recognition, Mel Frequency Cepstrum Coefficient, MFCC, speech-to-text.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 746

2560 Analysis of Electrocardiograph (ECG) Signal for the Detection of Abnormalities Using MATLAB

Authors: Durgesh Kumar Ojha, Monica Subashini

Abstract:

The proposed method is to study and analyze Electrocardiograph (ECG) waveform to detect abnormalities present with reference to P, Q, R and S peaks. The first phase includes the acquisition of real time ECG data. In the next phase, generation of signals followed by pre-processing. Thirdly, the procured ECG signal is subjected to feature extraction. The extracted features detect abnormal peaks present in the waveform Thus the normal and abnormal ECG signal could be differentiated based on the features extracted. The work is implemented in the most familiar multipurpose tool, MATLAB. This software efficiently uses algorithms and techniques for detection of any abnormalities present in the ECG signal. Proper utilization of MATLAB functions (both built-in and user defined) can lead us to work with ECG signals for processing and analysis in real time applications. The simulation would help in improving the accuracy and the hardware could be built conveniently.

Keywords: ECG Waveform, Peak Detection, Arrhythmia, Matlab.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12008

2559 A Method for Quality Inspection of Motors by Detecting Abnormal Sound

Authors: Tadatsugu Kitamoto

Abstract:

Recently, a quality of motors is inspected by human ears. In this paper, I propose two systems using a method of speech recognition for automation of the inspection. The first system is based on a method of linear processing which uses K-means and Nearest Neighbor method, and the second is based on a method of non-linear processing which uses neural networks. I used motor sounds in these systems, and I successfully recognize 86.67% of motor sounds in the linear processing system and 97.78% in the non-linear processing system.

Keywords: Acoustical diagnosis, Neural networks, K-means, Short-time Fourier transformation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1700

2558 Subjective Assessment about Super Resolution Image Resolution

Authors: Seiichi Gohshi, Hiroyuki Sekiguchi, Yoshiyasu Shimizu, Takeshi Ikenaga

Abstract:

Super resolution (SR) technologies are now being applied to video to improve resolution. Some TV sets are now equipped with SR functions. However, it is not known if super resolution image reconstruction (SRR) for TV really works or not. Super resolution with non-linear signal processing (SRNL) has recently been proposed. SRR and SRNL are the only methods for processing video signals in real time. The results from subjective assessments of SSR and SRNL are described in this paper. SRR video was produced in simulations with quarter precision motion vectors and 100 iterations. These are ideal conditions for SRR. We found that the image quality of SRNL is better than that of SRR even though SRR was processed under ideal conditions.

Keywords: Super Resolution Image Reconstruction, Super Resolution with Non-Linear Signal Processing, Subjective Assessment, Image Quality

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1695

2557 Tool Failure Detection Based on Statistical Analysis of Metal Cutting Acoustic Emission Signals

Authors: Othman Belgassim, Krzysztof Jemielniak

Abstract:

The analysis of Acoustic Emission (AE) signal generated from metal cutting processes has often approached statistically. This is due to the stochastic nature of the emission signal as a result of factors effecting the signal from its generation through transmission and sensing. Different techniques are applied in this manner, each of which is suitable for certain processes. In metal cutting where the emission generated by the deformation process is rather continuous, an appropriate method for analysing the AE signal based on the root mean square (RMS) of the signal is often used and is suitable for use with the conventional signal processing systems. The aim of this paper is to set a strategy in tool failure detection in turning processes via the statistic analysis of the AE generated from the cutting zone. The strategy is based on the investigation of the distribution moments of the AE signal at predetermined sampling. The skews and kurtosis of these distributions are the key elements in the detection. A normal (Gaussian) distribution has first been suggested then this was eliminated due to insufficiency. The so called Beta distribution was then considered, this has been used with an assumed β density function and has given promising results with regard to chipping and tool breakage detection.

Keywords: AE signal, skew, kurtosis, tool failure

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1846

2556 Practical Guidelines and Examples for the Users of the TMS320C6713 DSK

Authors: Abdullah A Wardak

Abstract:

This paper describes how the correct endian mode of the TMS320C6713 DSK board can be identified. It also explains how the TMS320C6713 DSK board can be used in the little endian and in the big endian modes for assembly language programming in particular and for signal processing in general. Similarly, it discusses how crucially important it is for a user of the TMS320C6713 DSK board to identify the mode of operation and then use it correctly during the development stages of the assembly language programming; otherwise, it will cause unnecessary confusion and erroneous results as far as storing data into the memory and loading data from the memory is concerned. Furthermore, it highlights and strongly recommends to the users of the TMS320C6713 DSK board to be aware of the availability and importance of various display options in the Code Composer Studio (CCS) for correctly interpreting and displaying the desired data in the memory. The information presented in this paper will be of great importance and interest to those practitioners and developers who wants to use the TMS320C6713 DSK board for assembly language programming as well as input-output signal processing manipulations. Finally, examples that clearly illustrate the concept are presented.

Keywords: Assembly language programming, big endian mode, little endian mode, signal processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2787

2555 Investigating Simple Multipath Compensation for Frequency Modulated Signals at Lower Frequencies

Authors: Lusungu Ndovi

Abstract:

Radio propagation from point-to-point is affected by the physical channel in many ways. A signal arriving at a destination travels through a number of different paths which are referred to as multi-paths. Research in this area of wireless communications has progressed well over the years with the research taking different angles of focus. By this is meant that some researchers focus on ways of reducing or eluding Multipath effects whilst others focus on ways of mitigating the effects of Multipath through compensation schemes. Baseband processing is seen as one field of signal processing that is cardinal to the advancement of software defined radio technology. This has led to wide research into the carrying out certain algorithms at baseband. This paper considers compensating for Multipath for Frequency Modulated signals. The compensation process is carried out at Radio frequency (RF) and at Quadrature baseband (QBB) and the results are compared. Simulations are carried out using MatLab so as to show the benefits of working at lower QBB frequencies than at RF.

Keywords: Quadrature baseband, Radio frequency, MultipathCompensation, Frequency modulation, Signal Processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1650

2554 A Novel Method for Blood Glucose Measurement by Noninvasive Technique Using Laser

Authors: V.Ashok, A.Nirmalkumar, N.Jeyashanthi

Abstract:

A method and apparatus for noninvasive measurement of blood glucose concentration based on transilluminated laser beam via the Index Finger has been reported in this paper. This method depends on atomic gas (He-Ne) laser operating at 632.8nm wavelength. During measurement, the index finger is inserted into the glucose sensing unit, the transilluminated optical signal is converted into an electrical signal, compared with the reference electrical signal, and the obtained difference signal is processed by signal processing unit which presents the results in the form of blood glucose concentration. This method would enable the monitoring blood glucose level of the diabetic patient continuously, safely and noninvasively.

Keywords: Anisotropy factor, Blood glucose, Diabetes Mellitus, Noninvasive method, Photo detectors.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3295

2553 Speech Acts and Politeness Strategies in an EFL Classroom in Georgia

Authors: Tinatin Kurdghelashvili

Abstract:

The paper deals with the usage of speech acts and politeness strategies in an EFL classroom in Georgia (Rep of). It explores the students’ and the teachers’ practice of the politeness strategies and the speech acts of apology, thanking, request, compliment / encouragement, command, agreeing / disagreeing, addressing and code switching. The research method includes observation as well as a questionnaire. The target group involves the students from Georgian public schools and two certified, experienced local English teachers. The analysis is based on Searle’s Speech Act Theory and Brown and Levinson’s politeness strategies. The findings show that the students have certain knowledge regarding politeness yet they fail to apply them in English communication. In addition, most of the speech acts from the classroom interaction are used by the teachers and not the students. Thereby, it is suggested that teachers should cultivate the students’ communicative competence and attempt to give them opportunities to practise more English speech acts than they do today.

Keywords: English as a foreign language, Georgia, politeness principles, speech acts.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6196

2552 Robust Features for Impulsive Noisy Speech Recognition Using Relative Spectral Analysis

Authors: Hajer Rahali, Zied Hajaiej, Noureddine Ellouze

Abstract:

The goal of speech parameterization is to extract the relevant information about what is being spoken from the audio signal. In speech recognition systems Mel-Frequency Cepstral Coefficients (MFCC) and Relative Spectral Mel-Frequency Cepstral Coefficients (RASTA-MFCC) are the two main techniques used. It will be shown in this paper that it presents some modifications to the original MFCC method. In our work the effectiveness of proposed changes to MFCC called Modified Function Cepstral Coefficients (MODFCC) were tested and compared against the original MFCC and RASTA-MFCC features. The prosodic features such as jitter and shimmer are added to baseline spectral features. The above-mentioned techniques were tested with impulsive signals under various noisy conditions within AURORA databases.

Keywords: Auditory filter, impulsive noise, MFCC, prosodic features, RASTA filter.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2323

2551 Numerical Analysis of All-Optical Microwave Mixing and Bandpass Filtering in an RoF Link

Authors: S. Khosroabadi, M. R. Salehi

Abstract:

In this paper, all-optical signal processors that perform both microwave mixing and bandpass filtering in a radio-over-fiber (RoF) link are presented. The key device is a Mach-Zehnder modulator (MZM) which performs all-optical microwave mixing. An up-converted microwave signal is obtained and other unwanted frequency components are suppressed at the end of the fiber span.

Keywords: Microwave mixing, bandpass filtering, all-optical, signal processing, MZM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1719

2550 Adaptive Fourier Decomposition Based Signal Instantaneous Frequency Computation Approach

Authors: Liming Zhang

Abstract:

There have been different approaches to compute the analytic instantaneous frequency with a variety of background reasoning and applicability in practice, as well as restrictions. This paper presents an adaptive Fourier decomposition and (α-counting) based instantaneous frequency computation approach. The adaptive Fourier decomposition is a recently proposed new signal decomposition approach. The instantaneous frequency can be computed through the so called mono-components decomposed by it. Due to the fast energy convergency, the highest frequency of the signal will be discarded by the adaptive Fourier decomposition, which represents the noise of the signal in most of the situation. A new instantaneous frequency definition for a large class of so-called simple waves is also proposed in this paper. Simple wave contains a wide range of signals for which the concept instantaneous frequency has a perfect physical sense. The α-counting instantaneous frequency can be used to compute the highest frequency for a signal. Combination of these two approaches one can obtain the IFs of the whole signal. An experiment is demonstrated the computation procedure with promising results.

Keywords: Adaptive Fourier decomposition, Fourier series, signal processing, instantaneous frequency

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2357

2549 Advances in Artificial Intelligence Using Speech Recognition

Authors: Khaled M. Alhawiti

Abstract:

This research study aims to present a retrospective study about speech recognition systems and artificial intelligence. Speech recognition has become one of the widely used technologies, as it offers great opportunity to interact and communicate with automated machines. Precisely, it can be affirmed that speech recognition facilitates its users and helps them to perform their daily routine tasks, in a more convenient and effective manner. This research intends to present the illustration of recent technological advancements, which are associated with artificial intelligence. Recent researches have revealed the fact that speech recognition is found to be the utmost issue, which affects the decoding of speech. In order to overcome these issues, different statistical models were developed by the researchers. Some of the most prominent statistical models include acoustic model (AM), language model (LM), lexicon model, and hidden Markov models (HMM). The research will help in understanding all of these statistical models of speech recognition. Researchers have also formulated different decoding methods, which are being utilized for realistic decoding tasks and constrained artificial languages. These decoding methods include pattern recognition, acoustic phonetic, and artificial intelligence. It has been recognized that artificial intelligence is the most efficient and reliable methods, which are being used in speech recognition.

Keywords: Speech recognition, acoustic phonetic, artificial intelligence, Hidden Markov Models (HMM), statistical models of speech recognition, human machine performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7978

2548 Signal Reconstruction Using Cepstrum of Higher Order Statistics

Authors: Adnan Al-Smadi, Mahmoud Smadi

Abstract:

This paper presents an algorithm for reconstructing phase and magnitude responses of the impulse response when only the output data are available. The system is driven by a zero-mean independent identically distributed (i.i.d) non-Gaussian sequence that is not observed. The additive noise is assumed to be Gaussian. This is an important and essential problem in many practical applications of various science and engineering areas such as biomedical, seismic, and speech processing signals. The method is based on evaluating the bicepstrum of the third-order statistics of the observed output data. Simulations results are presented that demonstrate the performance of this method.

Keywords: Cepstrum, bicepstrum, third order statistics

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2037

2547 Sensitivity Analysis for Direction of Arrival Estimation Using Capon and Music Algorithms in Mobile Radio Environment

Authors: Mustafa Abdalla, Khaled A. Madi, Rajab Farhat

Abstract:

An array antenna system with innovative signal processing can improve the resolution of a source direction of arrival (DoA) estimation. High resolution techniques take the advantage of array antenna structures to better process the incoming waves. They also have the capability to identify the direction of multiple targets. This paper investigates performance of the DOA estimation algorithm namely; Capon and MUSIC on the uniform linear array (ULA). The simulation results show that in Capon and MUSIC algorithm the resolution of the DOA techniques improves as number of snapshots, number of array elements, signal-to-noise ratio and separation angle between the two sources θ increases.

Keywords: Antenna array, Capon, MUSIC, Direction-of-arrival estimation, signal processing, uniform linear arrays.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2730

2546 Speech Enhancement Using Wavelet Coefficients Masking with Local Binary Patterns

Authors: Christian Arcos, Marley Vellasco, Abraham Alcaim

Abstract:

In this paper, we present a wavelet coefficients masking based on Local Binary Patterns (WLBP) approach to enhance the temporal spectra of the wavelet coefficients for speech enhancement. This technique exploits the wavelet denoising scheme, which splits the degraded speech into pyramidal subband components and extracts frequency information without losing temporal information. Speech enhancement in each high-frequency subband is performed by binary labels through the local binary pattern masking that encodes the ratio between the original value of each coefficient and the values of the neighbour coefficients. This approach enhances the high-frequency spectra of the wavelet transform instead of eliminating them through a threshold. A comparative analysis is carried out with conventional speech enhancement algorithms, demonstrating that the proposed technique achieves significant improvements in terms of PESQ, an international recommendation of objective measure for estimating subjective speech quality. Informal listening tests also show that the proposed method in an acoustic context improves the quality of speech, avoiding the annoying musical noise present in other speech enhancement techniques. Experimental results obtained with a DNN based speech recognizer in noisy environments corroborate the superiority of the proposed scheme in the robust speech recognition scenario.

Keywords: Binary labels, local binary patterns, mask, wavelet coefficients, speech enhancement, speech recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1017