Search results for: speech signal processing.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2665

Search results for: speech signal processing.

2605 Time Delay Estimation Using Signal Envelopes for Synchronisation of Recordings

Authors: Sergei Aleinik, Mikhail Stolbov

Abstract:

In this work, a method of time delay estimation for  dual-channel acoustic signals (speech, music, etc.) recorded under  reverberant conditions is investigated. Standard methods based on  cross-correlation of the signals show poor results in cases involving  strong reverberation, large distances between microphones and  asynchronous recordings. Under similar conditions, a method based  on cross-correlation of temporal envelopes of the signals delivers a  delay estimation of acceptable quality. This method and its properties  are described and investigated in detail, including its limits of  applicability. The method’s optimal parameter estimation and a  comparison with other known methods of time delay estimation are  also provided.

 

Keywords: Cross-correlation, delay estimation, signal envelope, signal processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3063
2604 A New Approach to Signal Processing for DC-Electromagnetic Flowmeters

Authors: Michael Schukat

Abstract:

Electromagnetic flowmeters with DC excitation are used for a wide range of fluid measurement tasks, but are rarely found in dosing applications with short measurement cycles due to the achievable accuracy. This paper will identify a number of factors that influence the accuracy of this sensor type when used for short-term measurements. Based on these results a new signal-processing algorithm will be described that overcomes the identified problems to some extend. This new method allows principally a higher accuracy of electromagnetic flowmeters with DC excitation than traditional methods.

Keywords: Electromagnetic Flowmeter, Kalman Filter, ShortMeasurement Cycles, Signal Estimation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1613
2603 Subjective Evaluation of Spectral and Time Domain Cascading Algorithm for Speech Enhancement for Mobile Communication

Authors: Harish Chander, Balwinder Singh, Ravinder Khanna

Abstract:

In this paper, we present the comparative subjective analysis of Improved Minima Controlled Recursive Averaging (IMCRA) Algorithm, the Kalman filter and the cascading of IMCRA and Kalman filter algorithms. Performance of speech enhancement algorithms can be predicted in two different ways. One is the objective method of evaluation in which the speech quality parameters are predicted computationally. The second is a subjective listening test in which the processed speech signal is subjected to the listeners who judge the quality of speech on certain parameters. The comparative objective evaluation of these algorithms was analyzed in terms of Global SNR, Segmental SNR and Perceptual Evaluation of Speech Quality (PESQ) by the authors and it was reported that with cascaded algorithms there is a substantial increase in objective parameters. Since subjective evaluation is the real test to judge the quality of speech enhancement algorithms, the authenticity of superiority of cascaded algorithms over individual IMCRA and Kalman algorithms is tested through subjective analysis in this paper. The results of subjective listening tests have confirmed that the cascaded algorithms perform better under all types of noise conditions.

Keywords: Speech enhancement, spectral domain, time domain, PESQ, subjective analysis, objective analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1231
2602 Speech Enhancement by Marginal Statistical Characterization in the Log Gabor Wavelet Domain

Authors: Suman Senapati, Goutam Saha

Abstract:

This work presents a fusion of Log Gabor Wavelet (LGW) and Maximum a Posteriori (MAP) estimator as a speech enhancement tool for acoustical background noise reduction. The probability density function (pdf) of the speech spectral amplitude is approximated by a Generalized Laplacian Distribution (GLD). Compared to earlier estimators the proposed method estimates the underlying statistical model more accurately by appropriately choosing the model parameters of GLD. Experimental results show that the proposed estimator yields a higher improvement in Segmental Signal-to-Noise Ratio (S-SNR) and lower Log-Spectral Distortion (LSD) in two different noisy environments compared to other estimators.

Keywords: Speech Enhancement, Generalized Laplacian Distribution, Log Gabor Wavelet, Bayesian MAP Marginal Estimator.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1629
2601 The Main Principles of Text-to-Speech Synthesis System

Authors: K.R. Aida–Zade, C. Ardil, A.M. Sharifova

Abstract:

In this paper, the main principles of text-to-speech synthesis system are presented. Associated problems which arise when developing speech synthesis system are described. Used approaches and their application in the speech synthesis systems for Azerbaijani language are shown.

Keywords: synthesis of Azerbaijani language, morphemes, phonemes, sounds, sentence, speech synthesizer, intonation, accent, pronunciation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5652
2600 TeleMe Speech Booster: Web-Based Speech Therapy and Training Program for Children with Articulation Disorders

Authors: C. Treerattanaphan, P. Boonpramuk, P. Singla

Abstract:

Frequent, continuous speech training has proven to be a necessary part of a successful speech therapy process, but constraints of traveling time and employment dispensation become key obstacles especially for individuals living in remote areas or for dependent children who have working parents. In order to ameliorate speech difficulties with ample guidance from speech therapists, a website has been developed that supports speech therapy and training for people with articulation disorders in the standard Thai language. This web-based program has the ability to record speech training exercises for each speech trainee. The records will be stored in a database for the speech therapist to investigate, evaluate, compare and keep track of all trainees’ progress in detail. Speech trainees can request live discussions via video conference call when needed. Communication through this web-based program facilitates and reduces training time in comparison to walk-in training or appointments. This type of training also allows people with articulation disorders to practice speech lessons whenever or wherever is convenient for them, which can lead to a more regular training processes.

Keywords: Web-Based Remote Training Program, Thai Speech Therapy, Articulation Disorders.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1859
2599 Algorithm of Measurement of Noise Signal Power in the Presence of Narrowband Interference

Authors: Alexey V. Klyuev, Valery P. Samarin, Viktor F. Klyuev

Abstract:

A power measurement algorithm of the input mix components of the noise signal and narrowband interference is considered using functional transformations of the input mix in the postdetection processing channel. The algorithm efficiency analysis has been carried out for different interference-to-signal ratio. Algorithm performance features have been explored by numerical experiment results.

Keywords: Noise signal, continuous narrowband interference, signal power, spectrum width, detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1397
2598 A System of Automatic Speech Recognition based on the Technique of Temporal Retiming

Authors: Samir Abdelhamid, Noureddine Bouguechal

Abstract:

We report in this paper the procedure of a system of automatic speech recognition based on techniques of the dynamic programming. The technique of temporal retiming is a technique used to synchronize between two forms to compare. We will see how this technique is adapted to the field of the automatic speech recognition. We will expose, in a first place, the theory of the function of retiming which is used to compare and to adjust an unknown form with a whole of forms of reference constituting the vocabulary of the application. Then we will give, in the second place, the various algorithms necessary to their implementation on machine. The algorithms which we will present were tested on part of the corpus of words in Arab language Arabdic-10 [4] and gave whole satisfaction. These algorithms are effective insofar as we apply them to the small ones or average vocabularies.

Keywords: Continuous speech recognition, temporal retiming, phonetic decoding, algorithms, vocal signal, dynamic programming.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1347
2597 A Simple Adaptive Atomic Decomposition Voice Activity Detector Implemented by Matching Pursuit

Authors: Thomas Bryan, Veton Kepuska, Ivica Kostanic

Abstract:

A simple adaptive voice activity detector (VAD) is implemented using Gabor and gammatone atomic decomposition of speech for high Gaussian noise environments. Matching pursuit is used for atomic decomposition, and is shown to achieve optimal speech detection capability at high data compression rates for low signal to noise ratios. The most active dictionary elements found by matching pursuit are used for the signal reconstruction so that the algorithm adapts to the individual speakers dominant time-frequency characteristics. Speech has a high peak to average ratio enabling matching pursuit greedy heuristic of highest inner products to isolate high energy speech components in high noise environments. Gabor and gammatone atoms are both investigated with identical logarithmically spaced center frequencies, and similar bandwidths. The algorithm performs equally well for both Gabor and gammatone atoms with no significant statistical differences. The algorithm achieves 70% accuracy at a 0 dB SNR, 90% accuracy at a 5 dB SNR and 98% accuracy at a 20dB SNR using 30d B SNR as a reference for voice activity.

Keywords: Atomic Decomposition, Gabor, Gammatone, Matching Pursuit, Voice Activity Detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1792
2596 Optimal Data Compression and Filtering: The Case of Infinite Signal Sets

Authors: Anatoli Torokhti, Phil Howlett

Abstract:

We present a theory for optimal filtering of infinite sets of random signals. There are several new distinctive features of the proposed approach. First, we provide a single optimal filter for processing any signal from a given infinite signal set. Second, the filter is presented in the special form of a sum with p terms where each term is represented as a combination of three operations. Each operation is a special stage of the filtering aimed at facilitating the associated numerical work. Third, an iterative scheme is implemented into the filter structure to provide an improvement in the filter performance at each step of the scheme. The final step of the concerns signal compression and decompression. This step is based on the solution of a new rank-constrained matrix approximation problem. The solution to the matrix problem is described in this paper. A rigorous error analysis is given for the new filter.

Keywords: stochastic signals, optimization problems in signal processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1280
2595 The Principle Probabilities of Space-Distance Resolution for a Monostatic Radar and Realization in Cylindrical Array

Authors: Anatoly D. Pluzhnikov, Elena N. Pribludova, Alexander G. Ryndyk

Abstract:

In conjunction with the problem of the target selection on a clutter background, the analysis of the scanning rate influence on the spatial-temporal signal structure, the generalized multivariate correlation function and the quality of the resolution with the increase pulse repetition frequency is made. The possibility of the object space-distance resolution, which is conditioned by the range-to-angle conversion with an increased scanning rate, is substantiated. The calculations for the real cylindrical array at high scanning rate are presented. The high scanning rate let to get the signal to noise improvement of the order of 10 dB for the space-time signal processing.

Keywords: Antenna pattern, array, signal processing, spatial resolution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1074
2594 Evaluation of a Multi-Resolution Dyadic Wavelet Transform Method for usable Speech Detection

Authors: Wajdi Ghezaiel, Amel Ben Slimane Rahmouni, Ezzedine Ben Braiek

Abstract:

Many applications of speech communication and speaker identification suffer from the problem of co-channel speech. This paper deals with a multi-resolution dyadic wavelet transform method for usable segments of co-channel speech detection that could be processed by a speaker identification system. Evaluation of this method is performed on TIMIT database referring to the Target to Interferer Ratio measure. Co-channel speech is constructed by mixing all possible gender speakers. Results do not show much difference for different mixtures. For the overall mixtures 95.76% of usable speech is correctly detected with false alarms of 29.65%.

Keywords: Co-channel speech, usable speech, multi-resolutionanalysis, speaker identification

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1366
2593 An Intelligent Text Independent Speaker Identification Using VQ-GMM Model Based Multiple Classifier System

Authors: Cheima Ben Soltane, Ittansa Yonas Kelbesa

Abstract:

Speaker Identification (SI) is the task of establishing identity of an individual based on his/her voice characteristics. The SI task is typically achieved by two-stage signal processing: training and testing. The training process calculates speaker specific feature parameters from the speech and generates speaker models accordingly. In the testing phase, speech samples from unknown speakers are compared with the models and classified. Even though performance of speaker identification systems has improved due to recent advances in speech processing techniques, there is still need of improvement. In this paper, a Closed-Set Tex-Independent Speaker Identification System (CISI) based on a Multiple Classifier System (MCS) is proposed, using Mel Frequency Cepstrum Coefficient (MFCC) as feature extraction and suitable combination of vector quantization (VQ) and Gaussian Mixture Model (GMM) together with Expectation Maximization algorithm (EM) for speaker modeling. The use of Voice Activity Detector (VAD) with a hybrid approach based on Short Time Energy (STE) and Statistical Modeling of Background Noise in the pre-processing step of the feature extraction yields a better and more robust automatic speaker identification system. Also investigation of Linde-Buzo-Gray (LBG) clustering algorithm for initialization of GMM, for estimating the underlying parameters, in the EM step improved the convergence rate and systems performance. It also uses relative index as confidence measures in case of contradiction in identification process by GMM and VQ as well. Simulation results carried out on voxforge.org speech database using MATLAB highlight the efficacy of the proposed method compared to earlier work.

Keywords: Feature Extraction, Speaker Modeling, Feature Matching, Mel Frequency Cepstrum Coefficient (MFCC), Gaussian mixture model (GMM), Vector Quantization (VQ), Linde-Buzo-Gray (LBG), Expectation Maximization (EM), pre-processing, Voice Activity Detection (VAD), Short Time Energy (STE), Background Noise Statistical Modeling, Closed-Set Tex-Independent Speaker Identification System (CISI).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1886
2592 Using Teager Energy Cepstrum and HMM distancesin Automatic Speech Recognition and Analysis of Unvoiced Speech

Authors: Panikos Heracleous

Abstract:

In this study, the use of silicon NAM (Non-Audible Murmur) microphone in automatic speech recognition is presented. NAM microphones are special acoustic sensors, which are attached behind the talker-s ear and can capture not only normal (audible) speech, but also very quietly uttered speech (non-audible murmur). As a result, NAM microphones can be applied in automatic speech recognition systems when privacy is desired in human-machine communication. Moreover, NAM microphones show robustness against noise and they might be used in special systems (speech recognition, speech conversion etc.) for sound-impaired people. Using a small amount of training data and adaptation approaches, 93.9% word accuracy was achieved for a 20k Japanese vocabulary dictation task. Non-audible murmur recognition in noisy environments is also investigated. In this study, further analysis of the NAM speech has been made using distance measures between hidden Markov model (HMM) pairs. It has been shown the reduced spectral space of NAM speech using a metric distance, however the location of the different phonemes of NAM are similar to the location of the phonemes of normal speech, and the NAM sounds are well discriminated. Promising results in using nonlinear features are also introduced, especially under noisy conditions.

Keywords: Speech recognition, unvoiced speech, nonlinear features, HMM distance measures

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1647
2591 Analysis of Combined Use of NN and MFCC for Speech Recognition

Authors: Safdar Tanweer, Abdul Mobin, Afshar Alam

Abstract:

The performance and analysis of speech recognition system is illustrated in this paper. An approach to recognize the English word corresponding to digit (0-9) spoken by 2 different speakers is captured in noise free environment. For feature extraction, speech Mel frequency cepstral coefficients (MFCC) has been used which gives a set of feature vectors from recorded speech samples. Neural network model is used to enhance the recognition performance. Feed forward neural network with back propagation algorithm model is used. However other speech recognition techniques such as HMM, DTW exist. All experiments are carried out on Matlab.

Keywords: Speech Recognition, MFCC, Neural Network, classifier.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3268
2590 SySRA: A System of a Continuous Speech Recognition in Arab Language

Authors: Samir Abdelhamid, Noureddine Bouguechal

Abstract:

We report in this paper the model adopted by our system of continuous speech recognition in Arab language SySRA and the results obtained until now. This system uses the database Arabdic-10 which is a corpus of word for the Arab language and which was manually segmented. Phonetic decoding is represented by an expert system where the knowledge base is translated in the form of production rules. This expert system transforms a vocal signal into a phonetic lattice. The higher level of the system takes care of the recognition of the lattice thus obtained by deferring it in the form of written sentences (orthographical Form). This level contains initially the lexical analyzer which is not other than the module of recognition. We subjected this analyzer to a set of spectrograms obtained by dictating a score of sentences in Arab language. The rate of recognition of these sentences is about 70% which is, to our knowledge, the best result for the recognition of the Arab language. The test set consists of twenty sentences from four speakers not having taken part in the training.

Keywords: Continuous speech recognition, lexical analyzer, phonetic decoding, phonetic lattice, vocal signal.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1389
2589 Detecting Abnormal ECG Signals Utilising Wavelet Transform and Standard Deviation

Authors: Dejan Stantic, Jun Jo

Abstract:

ECG contains very important clinical information about the cardiac activities of the heart. Often the ECG signal needs to be captured for a long period of time in order to identify abnormalities in certain situations. Such signal apart of a large volume often is characterised by low quality due to the noise and other influences. In order to extract features in the ECG signal with time-varying characteristics at first need to be preprocessed with the best parameters. Also, it is useful to identify specific parts of the long lasting signal which have certain abnormalities and to direct the practitioner to those parts of the signal. In this work we present a method based on wavelet transform, standard deviation and variable threshold which achieves 100% accuracy in identifying the ECG signal peaks and heartbeat as well as identifying the standard deviation, providing a quick reference to abnormalities.

Keywords: Electrocardiogram-ECG, Arrhythmia, Signal Processing, Wavelet Transform, Standard Deviation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2909
2588 A Smart-Visio Microphone for Audio-Visual Speech Recognition “Vmike“

Authors: Y. Ni, K. Sebri

Abstract:

The practical implementation of audio-video coupled speech recognition systems is mainly limited by the hardware complexity to integrate two radically different information capturing devices with good temporal synchronisation. In this paper, we propose a solution based on a smart CMOS image sensor in order to simplify the hardware integration difficulties. By using on-chip image processing, this smart sensor can calculate in real time the X/Y projections of the captured image. This on-chip projection reduces considerably the volume of the output data. This data-volume reduction permits a transmission of the condensed visual information via the same audio channel by using a stereophonic input available on most of the standard computation devices such as PC, PDA and mobile phones. A prototype called VMIKE (Visio-Microphone) has been designed and realised by using standard 0.35um CMOS technology. A preliminary experiment gives encouraged results. Its efficiency will be further investigated in a large variety of applications such as biometrics, speech recognition in noisy environments, and vocal control for military or disabled persons, etc.

Keywords: Audio-Visual Speech recognition, CMOS Smartsensor, On-Chip image processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1826
2587 Speech Impact Realization via Manipulative Argumentation Techniques in Modern American Political Discourse

Authors: Zarine Avetisyan

Abstract:

The present paper presents the discussion of scholars concerning speech impact, peculiarities of its realization, speech strategies and techniques in particular. Departing from the viewpoints of many prominent linguists, the paper suggests that manipulative argumentation be viewed as a most pervasive speech strategy with a certain set of techniques which are to be found in modern American political discourse. The precedence of their occurrence allows us to regard them as pragmatic patterns of speech impact realization in effective public speaking.

Keywords: Manipulative argumentation, political discourse, speech impact, technique.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2289
2586 An ICA Algorithm for Separation of Convolutive Mixture of Speech Signals

Authors: Rajkishore Prasad, Hiroshi Saruwatari, Kiyohiro Shikano

Abstract:

This paper describes Independent Component Analysis (ICA) based fixed-point algorithm for the blind separation of the convolutive mixture of speech, picked-up by a linear microphone array. The proposed algorithm extracts independent sources by non- Gaussianizing the Time-Frequency Series of Speech (TFSS) in a deflationary way. The degree of non-Gaussianization is measured by negentropy. The relative performances of algorithm under random initialization and Null beamformer (NBF) based initialization are studied. It has been found that an NBF based initial value gives speedy convergence as well as better separation performance

Keywords: Blind signal separation, independent component analysis, negentropy, convolutive mixture.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1778
2585 An Advanced Method for Speech Recognition

Authors: Meysam Mohamad pour, Fardad Farokhi

Abstract:

In this paper in consideration of each available techniques deficiencies for speech recognition, an advanced method is presented that-s able to classify speech signals with the high accuracy (98%) at the minimum time. In the presented method, first, the recorded signal is preprocessed that this section includes denoising with Mels Frequency Cepstral Analysis and feature extraction using discrete wavelet transform (DWT) coefficients; Then these features are fed to Multilayer Perceptron (MLP) network for classification. Finally, after training of neural network effective features are selected with UTA algorithm.

Keywords: Multilayer perceptron (MLP) neural network, Discrete Wavelet Transform (DWT) , Mels Scale Frequency Filter , UTA algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2366
2584 A Real-Time Signal Processing Technique for MIDI Generation

Authors: Farshad Arvin, Shyamala Doraisamy

Abstract:

This paper presents a new hardware interface using a microcontroller which processes audio music signals to standard MIDI data. A technique for processing music signals by extracting note parameters from music signals is described. An algorithm to convert the voice samples for real-time processing without complex calculations is proposed. A high frequency microcontroller as the main processor is deployed to execute the outlined algorithm. The MIDI data generated is transmitted using the EIA-232 protocol. The analyses of data generated show the feasibility of using microcontrollers for real-time MIDI generation hardware interface.

Keywords: Signal processing, MIDI, Microcontroller, EIA-232.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2127
2583 Temporal Signal Processing by Inference Bayesian Approach for Detection of Abrupt Variation of Statistical Characteristics of Noisy Signals

Authors: Farhad Asadi, Hossein Sadati

Abstract:

In fields such as neuroscience and especially in cognition modeling of mental processes, uncertainty processing in temporal zone of signal is vital. In this paper, Bayesian online inferences in estimation of change-points location in signal are constructed. This method separated the observed signal into independent series and studies the change and variation of the regime of data locally with related statistical characteristics. We give conditions on simulations of the method when the data characteristics of signals vary, and provide empirical evidence to show the performance of method. It is verified that correlation between series around the change point location and its characteristics such as Signal to Noise Ratios and mean value of signal has important factor on fluctuating in finding proper location of change point. And one of the main contributions of this study is related to representing of these influences of signal statistical characteristics for finding abrupt variation in signal. There are two different structures for simulations which in first case one abrupt change in temporal section of signal is considered with variable position and secondly multiple variations are considered. Finally, influence of statistical characteristic for changing the location of change point is explained in details in simulation results with different artificial signals.

Keywords: Time series, fluctuation in statistical characteristics, optimal learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 564
2582 Performance Analysis of a Series of Adaptive Filters in Non-Stationary Environment for Noise Cancelling Setup

Authors: Anam Rafique, Syed Sohail Ahmed

Abstract:

One of the essential components of much of DSP application is noise cancellation. Changes in real time signals are quite rapid and swift. In noise cancellation, a reference signal which is an approximation of noise signal (that corrupts the original information signal) is obtained and then subtracted from the noise bearing signal to obtain a noise free signal. This approximation of noise signal is obtained through adaptive filters which are self adjusting. As the changes in real time signals are abrupt, this needs adaptive algorithm that converges fast and is stable. Least mean square (LMS) and normalized LMS (NLMS) are two widely used algorithms because of their plainness in calculations and implementation. But their convergence rates are small. Adaptive averaging filters (AFA) are also used because they have high convergence, but they are less stable. This paper provides the comparative study of LMS and Normalized NLMS, AFA and new enhanced average adaptive (Average NLMS-ANLMS) filters for noise cancelling application using speech signals.

Keywords: AFA, ANLMS, LMS, NLMS.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1934
2581 Part of Speech Tagging Using Statistical Approach for Nepali Text

Authors: Archit Yajnik

Abstract:

Part of Speech Tagging has always been a challenging task in the era of Natural Language Processing. This article presents POS tagging for Nepali text using Hidden Markov Model and Viterbi algorithm. From the Nepali text, annotated corpus training and testing data set are randomly separated. Both methods are employed on the data sets. Viterbi algorithm is found to be computationally faster and accurate as compared to HMM. The accuracy of 95.43% is achieved using Viterbi algorithm. Error analysis where the mismatches took place is elaborately discussed.

Keywords: Hidden Markov model, Viterbi algorithm, POS tagging, natural language processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1708
2580 Control Signal from EOG Analysis and Its Application

Authors: Myoung Ro Kim, Gilwon Yoon

Abstract:

A game using electro-oculography (EOG) as control signal was introduced in this study. Various EOG signals are generated by eye movements. Even though EOG is a quite complex type of signal, distinct and separable EOG signals could be classified from horizontal and vertical, left and right eye movements. Proper signal processing was incorporated since EOG signal has very small amplitude in the order of micro volts and contains noises influenced by external conditions. Locations of the electrodes were set to be above and below as well as left and right positions of the eyes. Four control signals of up, down, left and right were generated. A microcontroller processed signals in order to simulate a DDR game. A LCD display showed arrows falling down with four different head directions. This game may be used as eye exercise for visual concentration and acuity. Our proposed EOG control signal can be utilized in many other applications of human machine interfaces such as wheelchair, computer keyboard and home automation.

Keywords: DDR game, EOG, eye movement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4795
2579 Possibilities, Challenges and the State of the Art of Automatic Speech Recognition in Air Traffic Control

Authors: Van Nhan Nguyen, Harald Holone

Abstract:

Over the past few years, a lot of research has been conducted to bring Automatic Speech Recognition (ASR) into various areas of Air Traffic Control (ATC), such as air traffic control simulation and training, monitoring live operators for with the aim of safety improvements, air traffic controller workload measurement and conducting analysis on large quantities controller-pilot speech. Due to the high accuracy requirements of the ATC context and its unique challenges, automatic speech recognition has not been widely adopted in this field. With the aim of providing a good starting point for researchers who are interested bringing automatic speech recognition into ATC, this paper gives an overview of possibilities and challenges of applying automatic speech recognition in air traffic control. To provide this overview, we present an updated literature review of speech recognition technologies in general, as well as specific approaches relevant to the ATC context. Based on this literature review, criteria for selecting speech recognition approaches for the ATC domain are presented, and remaining challenges and possible solutions are discussed.

Keywords: Automatic Speech Recognition, ASR, Air Traffic Control, ATC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4042
2578 Embedded Electrochemistry with a Miniaturized, Drone-Based, Potentiostat System for Remote Detection Chemical Warfare Agents

Authors: Amer Dawoud, Rashid Mia, Arati Biswakarma, Jesy Motchaalangaram, Wujan Miao, Karl Wallace

Abstract:

The development of an embedded miniaturized drone-based system for remote detection of Chemical Warfare Agents (CWAs) is proposed. The paper focuses on the software/hardware system design of the electrochemical Cyclic Voltammetry (CV) and Differential Pulse Voltammetry (DPV) signal processing for future deployment on drones. The paper summarizes the progress made towards hardware and electrochemical signal processing for signature detection of CWA. Also, the miniature potentiostat signal is validated by comparing it with the high-end lab potentiostat signal.

Keywords: Drone-based, remote detection chemical warfare agents, miniaturized, potentiostat.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 526
2577 Speech Data Compression using Vector Quantization

Authors: H. B. Kekre, Tanuja K. Sarode

Abstract:

Mostly transforms are used for speech data compressions which are lossy algorithms. Such algorithms are tolerable for speech data compression since the loss in quality is not perceived by the human ear. However the vector quantization (VQ) has a potential to give more data compression maintaining the same quality. In this paper we propose speech data compression algorithm using vector quantization technique. We have used VQ algorithms LBG, KPE and FCG. The results table shows computational complexity of these three algorithms. Here we have introduced a new performance parameter Average Fractional Change in Speech Sample (AFCSS). Our FCG algorithm gives far better performance considering mean absolute error, AFCSS and complexity as compared to others.

Keywords: Vector Quantization, Data Compression, Encoding, , Speech coding.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2403
2576 Performance Evaluation of Acoustic-Spectrographic Voice Identification Method in Native and Non-Native Speech

Authors: E. Krasnova, E. Bulgakova, V. Shchemelinin

Abstract:

The paper deals with acoustic-spectrographic voice identification method in terms of its performance in non-native language speech. Performance evaluation is conducted by comparing the result of the analysis of recordings containing native language speech with recordings that contain foreign language speech. Our research is based on Tajik and Russian speech of Tajik native speakers due to the character of the criminal situation with drug trafficking. We propose a pilot experiment that represents a primary attempt enter the field.

Keywords: Speaker identification, acoustic-spectrographic method, non-native speech.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 866