Search results for: interference cancellation of speech

392 Continuous Feature Adaptation for Non-Native Speech Recognition

Authors: Y. Deng, X. Li, C. Kwan, B. Raj, R. Stern

Abstract:

The current speech interfaces in many military applications may be adequate for native speakers. However, the recognition rate drops considerably for non-native speakers (people with foreign accents). This is mainly because non-native speakers have large temporal and intra-phoneme variations when they pronounce the same words. This problem is also complicated by the presence of large environmental noise such as tank noise, helicopter noise, etc. In this paper, we propose a novel continuous acoustic feature adaptation algorithm for on-line accent and environmental adaptation. Implemented by incremental singular value decomposition (SVD), the algorithm captures local acoustic variation and runs in real time. This feature-based adaptation method is then integrated with the conventional model-based maximum likelihood linear regression (MLLR) algorithm. Extensive experiments have been performed on the NATO non-native speech corpus with a baseline acoustic model trained on native American English. The proposed feature-based adaptation algorithm improved the average recognition accuracy by 15%, while the MLLR model-based adaptation achieved an 11% improvement. The corresponding word error rate (WER) reduction was 25.8% and 2.73%, as compared to that without adaptation. The combined adaptation achieved an overall recognition accuracy improvement of 29.5%, and a WER reduction of 31.8%, as compared to that without adaptation.

Keywords: speaker adaptation; environment adaptation; robust speech recognition; SVD; non-native speech recognition

Downloads: 3199

391 Investigating Interference Errors Made by Azzawia University 1st year Students of English in Learning English Prepositions

Authors: Aimen Mohamed Almaloul

Abstract:

The main focus of this study is investigating the interference of Arabic in the use of English prepositions by Libyan university students. Prepositions in the tests used in the study were categorized, according to their relation to Arabic, into similar Arabic and English prepositions (SAEP), dissimilar Arabic and English prepositions (DAEP), Arabic prepositions with no English counterparts (APEC), and English prepositions with no Arabic counterparts (EPAC).

The subjects of the study were 100 first-year students, both male and female, of the English department, Sabrata Faculty of Arts, Azzawia University. The basic tool for data collection was a test of English prepositions; students were instructed to fill in the blanks with the correct prepositions and to put a zero (0) if no preposition was needed. The test was then handed to the subjects of the study.

The test was then scored and quantitative as well as qualitative results were obtained. Quantitative results indicated the number, percentages and rank order of errors in each of the categories and qualitative results indicated the nature and significance of those errors and their possible sources. Based on the obtained results the researcher could detect that students made more errors in the EPAC category than the other three categories and these errors could be attributed to the lack of knowledge of the different meanings of English prepositions. This lack of knowledge forced the students to adopt what is called the strategy of transfer.

Keywords: Foreign language acquisition, foreign language learning, interference system, interlanguage system, mother tongue interference.

Downloads: 5021

390 Mitigation of Electromagnetic Interference Generated by GPIB Control-Network in AC-DC Transfer Measurement System

Authors: M. M. Hlakola, E. Golovins, D. V. Nicolae

Abstract:

The field of instrumentation electronics is undergoing explosive growth due to its wide range of applications. Electrical devices operating in close proximity can negatively influence each other's performance. This degradation in performance is due to electromagnetic interference (EMI). This paper investigates the negative effects of electromagnetic interference originating in the General Purpose Interface Bus (GPIB) control-network of the AC-DC transfer measurement system. Remedial measures for reducing measurement errors and failures of a range of industrial devices due to EMI have been explored. The AC-DC transfer measurement system was analysed for common-mode (CM) EMI effects. Further investigation of the coupling path, as well as more accurate identification of the noise propagation mechanism, has been outlined. To prevent the occurrence of common-mode ground loops, which were identified between the GPIB system control circuit and the measurement circuit, a microcontroller-driven GPIB switching isolator device was designed, prototyped, programmed and validated. This mitigation technique has been explored to reduce EMI effectively.

Keywords: CM, EMI, GPIB, ground loops.

Downloads: 1813

389 Automatic Detection of Syllable Repetition in Read Speech for Objective Assessment of Stuttered Disfluencies

Authors: K. M. Ravikumar, Balakrishna Reddy, R. Rajagopal, H. C. Nagaraj

Abstract:

Automatic detection of syllable repetition is one of the important parameters in assessing stuttered speech objectively. The existing method, which uses an artificial neural network (ANN), requires high levels of agreement as a prerequisite before attempting to train and test ANNs to separate fluent and non-fluent speech. We propose an automatic detection method for syllable repetition in read speech for objective assessment of stuttered disfluencies, which uses a novel approach and has four stages: segmentation, feature extraction, score matching and decision logic. Feature extraction is implemented using the well-known Mel frequency cepstral coefficients (MFCC). Score matching is done using Dynamic Time Warping (DTW) between the syllables. The decision logic is implemented by a perceptron based on the score given by score matching. Although many methods are available for segmentation, in this paper it is done manually. Here the assessment by human judges on the read speech of 10 adults who stutter is described using the corresponding method, and the result was 83%.
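The score-matching stage named in this abstract can be illustrated with a minimal DTW sketch. It assumes each syllable is already represented as a list of feature vectors (for example MFCC frames); the function names and the fixed threshold are hypothetical stand-ins, and the threshold rule replaces the perceptron decision stage only for illustration. It is not the authors' implementation.

```python
import math


def dtw_distance(a, b):
    """Return the accumulated DTW cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # frame of a repeated against b
                cost[i][j - 1],      # frame of b repeated against a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def is_repetition(a, b, threshold):
    """Hypothetical decision rule: flag a repetition when the DTW cost is small."""
    return dtw_distance(a, b) <= threshold
```

Because DTW allows frames to be stretched in time, a syllable and a slower repetition of the same syllable yield a low cost even when their lengths differ.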

Keywords: Assessment, DTW, MFCC, Objective, Perceptron, Stuttering.

Downloads: 2782

388 A Mixing Matrix Estimation Algorithm for Speech Signals under the Under-Determined Blind Source Separation Model

Authors: Jing Wu, Wei Lv, Yibing Li, Yuanfan You

Abstract:

The separation of speech signals has become a research hotspot in the field of signal processing in recent years. It has many applications in teleconferencing, hearing aids, machine speech recognition and so on. The sounds received are usually noisy. Identifying the sounds of interest and obtaining clear sounds in such an environment is a problem worth exploring, that is, the problem of blind source separation. This paper focuses on under-determined blind source separation (UBSS). Sparse component analysis is generally used for the problem of under-determined blind source separation. The method is mainly divided into two parts. Firstly, a clustering algorithm is used to estimate the mixing matrix from the observed signals. Then the signal is separated based on the known mixing matrix. In this paper, the problem of mixing matrix estimation is studied. This paper proposes an improved algorithm to estimate the mixing matrix for speech signals in the UBSS model. The traditional potential algorithm is not accurate for mixing matrix estimation, especially at low signal-to-noise ratio (SNR). In response to this problem, this paper considers an improved potential function method to estimate the mixing matrix. The algorithm not only avoids the influence of insufficient prior information in the traditional clustering algorithm, but also improves the estimation accuracy of the mixing matrix. This paper takes the mixing of four speech signals into two channels as an example. The results of simulations show that the approach in this paper not only improves the accuracy of estimation, but also applies to any mixing matrix.

Keywords: Clustering algorithm, potential function, speech signal, the UBSS model.

Downloads: 655

387 Wind Interference Effect on Tall Building

Authors: Atul K. Desai, Jigar K. Sevalia, Sandip A. Vasanwala

Abstract:

When a building is located in an urban area, it is exposed to wind with different characteristics than wind over open terrain. This is due to the development of a turbulent wake region behind an upstream building. The interaction with an upstream building can produce significant changes in the response of the tall building. In this paper, an attempt has been made to study wind-induced interference effects on a tall building. To study the wind-induced interference effect (IF) on a tall building, a tall building with a square plan shape (termed the Principal Building from here onwards) is first considered with different height-to-width ratios, and the total drag force is obtained for different terrain conditions and incident wind directions. Then the total drag force on the Principal Building is obtained in the presence of an adjacent building (termed the Interfering Building from here onwards), again for different terrain conditions and incident wind angles. To carry out the study, the Computational Fluid Dynamics (CFD) codes Fluent and Gambit have been used.

Keywords: Computational Fluid Dynamics, Tall Building, Turbulent, Wake Region, Wind.

Downloads: 3773

386 Spectral Entropy Employment in Speech Enhancement based on Wavelet Packet

Authors: Talbi Mourad, Salhi Lotfi, Chérif Adnen

Abstract:

In this work, we are interested in developing a speech denoising tool by using a discrete wavelet packet transform (DWPT). This speech denoising tool will be employed for applications of recognition, coding and synthesis. For noise reduction, instead of applying the classical thresholding technique, some wavelet packet nodes are set to zero and the others are thresholded. To estimate the non-stationary noise level, we employ the spectral entropy. Our proposed technique is compared with classical denoising methods based on thresholding and spectral subtraction in order to evaluate our approach. The experimental implementation uses speech signals corrupted by two sorts of noise, white and Volvo noises. The results obtained from listening tests show that our proposed technique is better than spectral subtraction. The results obtained from SNR computation show the superiority of our technique when compared to the classical thresholding method using the modified hard thresholding function based on the μ-law algorithm.
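As an illustration of the spectral entropy measure mentioned in this abstract, the sketch below computes a normalised Shannon entropy from a list of power-spectrum values. The function name and the normalisation to the range 0 to 1 are assumptions made for this example; how the paper maps entropy to a noise-level estimate per wavelet packet node is not specified here and is not reproduced.

```python
import math


def spectral_entropy(power_spectrum):
    """Normalised Shannon entropy of a power spectrum (0 = single peak, 1 = flat)."""
    n_bins = len(power_spectrum)
    total = sum(power_spectrum)
    if n_bins < 2 or total <= 0:
        return 0.0
    entropy = 0.0
    for p in power_spectrum:
        if p > 0:
            q = p / total  # treat the spectrum as a probability mass function
            entropy -= q * math.log2(q)
    return entropy / math.log2(n_bins)  # divide by the maximum possible entropy
```

A flat, noise-like spectrum gives a value near 1, while a spectrum dominated by a few harmonics gives a value closer to 0, which is what makes the measure useful for separating noise-dominated frames from speech-dominated ones.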

Keywords: Enhancement, spectral subtraction, SNR, discrete wavelet packet transform, spectral entropy Histogram

Downloads: 1967

385 Adaptive Filtering in Subbands for Supervised Source Separation

Authors: Bruna Luisa Ramos Prado Vasques, Mariane Rembold Petraglia, Antonio Petraglia

Abstract:

This paper investigates MIMO (Multiple-Input Multiple-Output) adaptive filtering techniques for the application of supervised source separation in the context of convolutive mixtures. From the observation that there is correlation among the signals of the different mixtures, an improvement in the NSAF (Normalized Subband Adaptive Filter) algorithm is proposed in order to accelerate its convergence rate. Simulation results with mixtures of speech signals in reverberant environments show the superior performance of the proposed algorithm with respect to the performances of the NLMS (Normalized Least-Mean-Square) and conventional NSAF, considering both the convergence speed and SIR (Signal-to-Interference Ratio) after convergence.
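For reference, the NLMS baseline named in this abstract can be sketched as follows. This is a plain single-channel, fullband NLMS filter, not the subband NSAF or the improved MIMO algorithm the paper proposes; the function name, step size `mu` and regulariser `eps` are assumptions chosen for the example.

```python
def nlms(x, d, num_taps, mu=0.5, eps=1e-8):
    """Adapt an FIR filter so that filtering x approximates d.

    Returns the final weights and the per-sample a priori errors.
    """
    w = [0.0] * num_taps
    buf = [0.0] * num_taps  # buf[k] holds x[n - k]
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))  # filter output
        e = dn - y                                  # a priori error
        norm = eps + sum(b * b for b in buf)        # input power in the window
        # Normalised update: the step is scaled by the input energy.
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        errors.append(e)
    return w, errors
```

In a supervised separation setting, `x` would be a reference source and `d` an observed mixture; the normalisation by input power is what distinguishes NLMS from plain LMS and makes the step size insensitive to signal level.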

Keywords: Adaptive filtering, multirate processing, normalized subband adaptive filter, source separation.

Downloads: 945

384 The Effect of Natural Light on the Performance of Visible Light Communication Systems

Authors: Mahmoud Beshr, Ivan Andonovic, Moustafa H. Aly

Abstract:

Visible Light Communication (VLC) offers the advantages of low energy consumption and licence-free, RF-interference-free operation. One application area for VLC is the provision of health-centred services, circumventing issues of interference with any biomedical device within the environment. VLC performance is affected by natural light, which restricts system availability and reliability. The paper presents an analysis of the performance of VLC systems under different meteorological conditions. The evaluation considered the impact of natural light as a function of different reflection surfaces in different room sizes.

Keywords: Visible light communication, impulse response, performance analysis, natural light.

Downloads: 1688

383 Bangla Vowel Characterization Based on Analysis by Synthesis

Authors: Syed Akhter Hossain, M. Lutfar Rahman, Farruk Ahmed

Abstract:

Bangla Vowel characterization determines the spectral properties of Bangla vowels for efficient synthesis as well as recognition of Bangla vowels. In this paper, Bangla vowels in isolated word have been analyzed based on speech production model within the framework of Analysis-by-Synthesis. This has led to the extraction of spectral parameters for the production model in order to produce different Bangla vowel sounds. The real and synthetic spectra are compared and a weighted square error has been computed along with the error in the formant bandwidths for efficient representation of Bangla vowels. The extracted features produced good representation of targeted Bangla vowel. Such a representation also plays essential role in low bit rate speech coding and vocoders.

Keywords: Speech, vowel, formant, synthesis, spectrum, LPC.

Downloads: 2353

382 Speech Recognition Using Scaly Neural Networks

Authors: Akram M. Othman, May H. Riadh

Abstract:

This research work is aimed at speech recognition using scaly neural networks. A small vocabulary of 11 words was established first; these words are “word, file, open, print, exit, edit, cut, copy, paste, doc1, doc2”. The chosen words are involved in executing some computer functions such as opening a file, printing a certain text document, cutting, copying, pasting, editing and exiting. The speech is introduced to the computer and then subjected to a feature extraction process using LPC (linear prediction coefficients). These features are used as input to an artificial neural network in speaker-dependent mode. Half of the words are used for training the artificial neural network and the other half are used for testing the system; those are used for information retrieval. The system consists of three parts: speech processing and feature extraction, training and testing using neural networks, and information retrieval. The retrieval process proved to be 79.5-88% successful, which is quite acceptable considering the variation in surroundings, the state of the person, and the microphone type.

Keywords: Feature extraction, Linear prediction coefficients, neural network, Speech Recognition, Scaly ANN.

Downloads: 1718

381 Forensic Speaker Verification in Noisy Environmental by Enhancing the Speech Signal Using ICA Approach

Authors: Ahmed Kamil Hasan Al-Ali, Bouchra Senadji, Ganesh Naik

Abstract:

We propose a system to address real environmental noise and channel mismatch in forensic speaker verification. The method is based on suppressing various types of real environmental noise by using the independent component analysis (ICA) algorithm. The enhanced speech signal is applied to mel frequency cepstral coefficients (MFCC) or MFCC feature warping to extract the essential characteristics of the speech signal. Channel effects are reduced using an intermediate vector (i-vector) and probabilistic linear discriminant analysis (PLDA) approach for classification. The proposed algorithm is evaluated using an Australian forensic voice comparison database, combined with car, street and home noises from QUT-NOISE at signal to noise ratios (SNR) ranging from -10 dB to 10 dB. Experimental results indicate that MFCC feature warping-ICA achieves a reduction in equal error rate of about 48.22%, 44.66%, and 50.07% over MFCC feature warping alone when the test speech signals are corrupted with random sessions of street, car, and home noises, respectively, at -10 dB SNR.

Keywords: Noisy forensic speaker verification, ICA algorithm, MFCC, MFCC feature warping.

Downloads: 968

380 A Smart-Visio Microphone for Audio-Visual Speech Recognition “Vmike“

Authors: Y. Ni, K. Sebri

Abstract:

The practical implementation of audio-video coupled speech recognition systems is mainly limited by the hardware complexity of integrating two radically different information capturing devices with good temporal synchronisation. In this paper, we propose a solution based on a smart CMOS image sensor in order to simplify the hardware integration. By using on-chip image processing, this smart sensor can calculate in real time the X/Y projections of the captured image. This on-chip projection considerably reduces the volume of the output data. This data-volume reduction permits transmission of the condensed visual information via the same audio channel by using a stereophonic input available on most standard computing devices such as PCs, PDAs and mobile phones. A prototype called VMIKE (Visio-Microphone) has been designed and realised using standard 0.35 µm CMOS technology. A preliminary experiment gives encouraging results. Its efficiency will be further investigated in a large variety of applications such as biometrics, speech recognition in noisy environments, and vocal control for military or disabled persons, etc.

Keywords: Audio-Visual Speech recognition, CMOS Smartsensor, On-Chip image processing.

Downloads: 1808

379 Computationally Efficient Signal Quality Improvement Method for VoIP System

Authors: H. P. Singh, S. Singh

Abstract:

The voice signal in a Voice over Internet Protocol (VoIP) system is processed through a best-effort IP network, which leads to network degradations including delay, packet loss and jitter. This paper presents the implementation of a finite impulse response (FIR) filter for voice quality improvement in the VoIP system through the distributed arithmetic (DA) algorithm. The VoIP simulations are conducted with AMR-NB 6.70 kbps and G.729a speech coders at different packet loss rates, and the performance of the enhanced VoIP signal is evaluated using the perceptual evaluation of speech quality (PESQ) measurement for narrowband signals. The results show a reduction in the computational complexity of the system and a significant improvement in the quality of the VoIP voice signal.
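The distributed arithmetic idea behind the FIR filter in this abstract can be sketched in software. The example assumes integer coefficients and two's-complement input samples of a fixed word length; the function names and the 8-bit default are hypothetical, and the sketch mirrors the textbook lookup-table form of DA rather than the paper's specific design.

```python
def build_lut(coeffs):
    """Precompute every partial sum of coefficients, indexed by a K-bit address."""
    k = len(coeffs)
    return [
        sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
        for addr in range(1 << k)
    ]


def da_dot(lut, samples, bits):
    """Inner product of coefficients and samples using only lookups and shifts."""
    mask = (1 << bits) - 1
    acc = 0
    for j in range(bits):
        # Address = bit j of every sample in the window (one bit per tap).
        addr = 0
        for i, s in enumerate(samples):
            addr |= (((s & mask) >> j) & 1) << i
        term = lut[addr] << j
        # The most significant bit carries negative weight in two's complement.
        acc += -term if j == bits - 1 else term
    return acc


def da_fir(coeffs, signal, bits=8):
    """Filter `signal` with an FIR filter; samples must fit in `bits` signed bits."""
    lut = build_lut(coeffs)
    window = [0] * len(coeffs)
    out = []
    for s in signal:
        window = [s] + window[:-1]  # window[k] holds the sample delayed by k
        out.append(da_dot(lut, window, bits))
    return out
```

The multiplications of a direct FIR implementation are replaced by one table lookup and one shift-accumulate per input bit, which is the source of the computational savings DA is known for in hardware.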

Keywords: VoIP, Signal Quality, Distributed Arithmetic, Packet Loss, Speech Coder.

Downloads: 1812

378 Extracting Tongue Shape Dynamics from Magnetic Resonance Image Sequences

Authors: María S. Avila-García, John N. Carter, Robert I. Damper

Abstract:

An important problem in speech research is the automatic extraction of information about the shape and dimensions of the vocal tract during real-time speech production. We have previously developed Southampton dynamic magnetic resonance imaging (SDMRI) as an approach to the solution of this problem. However, the SDMRI images are very noisy so that shape extraction is a major challenge. In this paper, we address the problem of tongue shape extraction, which poses difficulties because this is a highly deforming non-parametric shape. We show that combining active shape models with the dynamic Hough transform allows the tongue shape to be reliably tracked in the image sequence.

Keywords: Vocal tract imaging, speech production, active shape models, dynamic Hough transform, object tracking.

Downloads: 1719

377 Protocol Modifications for Improved Co-Channel Wireless LAN Goodput in Partitioned Spaces

Authors: Raymond J. Jayabal, Chiew Tong Lau

Abstract:

Partitions can play a significant role in minimising co-channel interference of Wireless LANs by attenuating signals across room boundaries. This could pave the way towards higher density deployments in home and office environments through spatial channel reuse. Yet, due to protocol limitations, the latest incarnation of the IEEE 802.11 standard is still unable to take advantage of this fact: despite having clearly adequate Signal to Interference Ratio (SIR) over co-channel neighbouring networks in other rooms, its goodput falls significantly lower than its maximum in the absence of co-channel interferers. In this paper, we describe how this situation can be remedied via modest modifications to the standard.

Keywords: IEEE 802.11 Wireless LAN, spatial channel re-use, physical layer capture.

Downloads: 1361

376 Interest of the Sequences Pseudo Noises Codes of Different Lengths for the Reduction from the Interference between Users of CDMA Network

Authors: Nerguè Kassahan Kone, Souleymane Oumtanaga

Abstract:

The third generation (3G) of cellular systems adopted spread spectrum as the solution for data transmission in the physical layer. Unlike IS-95 or CDMAOne (spread-spectrum systems of the preceding generation), the new standard, called Universal Mobile Telecommunications System (UMTS), uses long codes in the downlink. The system is designed for voice communication and data transmission. The downlink is particularly important because data demand is asymmetric, i.e., more data is downloaded to the mobiles than is sent towards the base station. Moreover, UMTS uses orthogonal spreading with a variable spreading factor in the downlink (OVSF, for Orthogonal Variable Spreading Factor). This characteristic makes it possible to increase the data rate of one or more users by reducing their spreading factor without changing the spreading factor of other users. In the current UMTS standard, two techniques were proposed to increase downlink performance: transmit antenna diversity and space-time codes. These two techniques combat only fading. The receiver proposed for the mobile station is the RAKE, but one can imagine a more sophisticated receiver, able to reduce the interference between users and the impact of coloured noise and narrowband interference. In this context, where the users have synchronized long codes with variable spreading factors and the mobile does not know the other active codes/users, the use of pseudo-noise code sequences of different lengths is presented as one of the most appropriate solutions.
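The OVSF codes mentioned in this abstract are built with a well-known binary tree construction, sketched below. The function name is chosen for this example, and the sketch covers only the orthogonal channelisation codes, not the long scrambling codes or the pseudo-noise sequences of different lengths that the paper studies.

```python
def ovsf_codes(spreading_factor):
    """Return all OVSF codes of the given spreading factor (a power of two)."""
    if spreading_factor < 1 or spreading_factor & (spreading_factor - 1):
        raise ValueError("spreading factor must be a power of two")
    codes = [[1]]  # root of the code tree
    while len(codes[0]) < spreading_factor:
        children = []
        for code in codes:
            children.append(code + code)                     # upper branch: (c, c)
            children.append(code + [-chip for chip in code])  # lower branch: (c, -c)
        codes = children
    return codes
```

Codes at the same level of the tree are mutually orthogonal, which is why one user's spreading factor can be reduced (moving up the tree for a higher data rate) without changing the codes of users on other branches.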

Keywords: DS-CDMA, multiple access interference, signal-to-interference-plus-noise ratio.

Downloads: 1335

375 Plug and Play Interferometer Configuration using Single Modulator Technique

Authors: Norshamsuri Ali, Hafizulfika, Salim Ali Al-Kathiri, Abdulla Al-Attas, Suhairi Saharudin, Mohamed Ridza Wahiddin

Abstract:

We demonstrate single-photon interference over 10 km using a plug and play system for quantum key distribution. The quality of the interferometer is measured using the interferometer visibility. The signal is phase coded, and the visibility is determined from the interference effect, which results in a number of counts. The setup gives full control of polarization inside the interferometer. The quality measurement of the interferometer is based on the number of counts per second, and the system produces 94% visibility in one of the detectors.
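Interferometer visibility is conventionally computed from the maximum and minimum detector count rates as V = (C_max - C_min) / (C_max + C_min). The sketch below applies that standard definition; the function name is hypothetical, and whether the authors subtract dark counts before computing visibility is not stated in the abstract, so no such correction is applied.

```python
def visibility(max_counts, min_counts):
    """Fringe visibility from the count rates at constructive and destructive interference."""
    total = max_counts + min_counts
    if total <= 0:
        raise ValueError("total counts must be positive")
    return (max_counts - min_counts) / total
```

For instance, illustrative count rates of 970 and 30 counts per second (values invented for this example, not taken from the paper) would correspond to a visibility of 0.94.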

Keywords: single photon, interferometer, quantum key distribution.

Downloads: 1598

374 Stress Analysis for Two Fitted Thin Walled Cylinder with High Angular Velocity

Authors: A.V. Hoseini, A. Bidi, M. H. Pol, M.Jalali azizpour

Abstract:

In this paper, stress and strain are computed for two rotating thin-walled cylinders fitted together with initial interference and overlap. The stress is also calculated for varying initial interference. The problem is first considered without rotation; the angular velocity is then increased from 0 to 50000 rev/min and the stress at each stage is calculated. The important point is that when the stress becomes very small in magnitude, the angular velocity is critical and the two cylinders will separate. The critical speed, i.e. the speed of separation, is calculated in each step.

Keywords: Thin walled cylinder, high angular velocity, two fitted thin walled

Downloads: 1405

373 Influence of Loudness Compression on Hearing with Bone Anchored Hearing Implants

Authors: Anja Kurz, Marc Flynn, Tobias Good, Marco Caversaccio, Martin Kompis

Abstract:

Bone Anchored Hearing Implants (BAHI) are routinely used in patients with conductive or mixed hearing loss, e.g. if conventional air conduction hearing aids cannot be used. New sound processors and new fitting software now allow parameters such as loudness compression ratios or maximum power output to be adjusted separately. It is currently unclear how the choice of these parameters influences aided speech understanding in BAHI users. In this prospective experimental study, the effects of varying the compression ratio and lowering the maximum power output in a BAHI were investigated. Twelve experienced adult subjects with mixed hearing loss participated in this study. Four different compression ratios (1.0; 1.3; 1.6; 2.0) were tested along with two different maximum power output settings, resulting in a total of eight different programs. Each participant tested each program for two weeks. A blinded Latin square design was used to minimize bias. For each of the eight programs, speech understanding in quiet and in noise was assessed. For speech in quiet, the Freiburg number test and the Freiburg monosyllabic word test at 50, 65, and 80 dB SPL were used. For speech in noise, the Oldenburg sentence test was administered. Speech understanding in quiet and in noise was improved significantly in the aided condition in every program when compared to the unaided condition. However, no significant differences were found between any of the eight programs. In contrast, on a subjective level there was a significant preference for medium compression ratios of 1.3 to 1.6 and higher maximum power output.

Keywords: Bone Anchored Hearing Implant, Compression, Maximum Power Output, Speech understanding.

Downloads: 2042

372 Comparison of Fricative Vocal Tract Transfer Functions Derived using Two Different Segmentation Techniques

Authors: K. S. Subari, C. H. Shadle, A. Barney, R. I. Damper

Abstract:

The acoustic and articulatory properties of fricative speech sounds are being studied using magnetic resonance imaging (MRI) and acoustic recordings from a single subject. Area functions were derived from a complete set of axial and coronal MR slices using two different methods: the Mermelstein technique and the Blum transform. Area functions derived from the two techniques were shown to differ significantly in some cases. Such differences will lead to different acoustic predictions and it is important to know which is the more accurate. The vocal tract acoustic transfer function (VTTF) was derived from these area functions for each fricative and compared with measured speech signals for the same fricative and same subject. The VTTFs for /f/ in two vowel contexts and the corresponding acoustic spectra are derived here; the Blum transform appears to show a better match between prediction and measurement than the Mermelstein technique.

Keywords: Area functions, fricatives, vocal tract transfer function, MRI, speech.

Downloads: 1636

371 Automotive 3-Microphone Noise Canceller in a Frequently Moving Noise Source Environment

Authors: Z. Qi, T. J. Moir

Abstract:

A combined three-microphone voice activity detector (VAD) and noise-canceling system is studied to enhance speech recognition in an automobile environment. A previous experiment clearly shows the ability of the composite system to cancel a single noise source outside of a defined zone. This paper investigates the performance of the composite system when there are frequently moving noise sources (noise sources that come from different locations but are not always present at the same time), e.g. speech from another passenger or from a radio while the desired speech is present. To work in an environment with frequently moving noise sources, the three-microphone VAD detects voice from a “VAD valid zone”, while the 3-microphone noise canceller uses a “noise canceller valid zone” defined in free space around the user's head. Therefore, a desired voice should be in the intersection of the noise canceller valid zone and the VAD valid zone, and all noise outside this intersection is suppressed. Experiments are shown for a real environment: all results were recorded in a car using omni-directional electret condenser microphones.

Keywords: Signal processing, voice activity detection, noise canceller, microphone array beam forming.

Downloads: 1594

370 Robust Features for Impulsive Noisy Speech Recognition Using Relative Spectral Analysis

Authors: Hajer Rahali, Zied Hajaiej, Noureddine Ellouze

Abstract:

The goal of speech parameterization is to extract the relevant information about what is being spoken from the audio signal. In speech recognition systems, Mel-Frequency Cepstral Coefficients (MFCC) and Relative Spectral Mel-Frequency Cepstral Coefficients (RASTA-MFCC) are the two main techniques used. This paper presents some modifications to the original MFCC method. The effectiveness of the proposed changes to MFCC, called Modified Function Cepstral Coefficients (MODFCC), was tested and compared against the original MFCC and RASTA-MFCC features. Prosodic features such as jitter and shimmer are added to the baseline spectral features. The above-mentioned techniques were tested with impulsive signals under various noisy conditions within the AURORA databases.

Keywords: Auditory filter, impulsive noise, MFCC, prosodic features, RASTA filter.

Downloads: 2307

369 A Simple Adaptive Atomic Decomposition Voice Activity Detector Implemented by Matching Pursuit

Authors: Thomas Bryan, Veton Kepuska, Ivica Kostanic

Abstract:

A simple adaptive voice activity detector (VAD) is implemented using Gabor and gammatone atomic decomposition of speech for high Gaussian noise environments. Matching pursuit is used for atomic decomposition, and is shown to achieve optimal speech detection capability at high data compression rates for low signal-to-noise ratios. The most active dictionary elements found by matching pursuit are used for the signal reconstruction so that the algorithm adapts to the individual speaker's dominant time-frequency characteristics. Speech has a high peak-to-average ratio, enabling the greedy matching pursuit heuristic of selecting the highest inner products to isolate high-energy speech components in high-noise environments. Gabor and gammatone atoms are both investigated with identical logarithmically spaced center frequencies and similar bandwidths. The algorithm performs equally well for both Gabor and gammatone atoms, with no statistically significant differences. The algorithm achieves 70% accuracy at 0 dB SNR, 90% accuracy at 5 dB SNR, and 98% accuracy at 20 dB SNR, using 30 dB SNR as a reference for voice activity.
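The greedy step described in this abstract can be sketched as generic matching pursuit over a small dictionary of unit-norm Gabor atoms. This is not the authors' detector: the atom parameters, the dictionary layout, and the names `gabor_atom` and `matching_pursuit` are illustrative assumptions, and the voice-activity decision itself is not shown.

```python
import math


def gabor_atom(n, center, freq, width):
    """Unit-norm Gabor atom: a Gaussian window times a cosine.

    center and width are in samples, freq is in cycles per sample.
    """
    atom = [
        math.exp(-0.5 * ((t - center) / width) ** 2)
        * math.cos(2 * math.pi * freq * (t - center))
        for t in range(n)
    ]
    norm = math.sqrt(sum(a * a for a in atom))
    return [a / norm for a in atom]


def matching_pursuit(signal, dictionary, n_iter):
    """Greedy decomposition: repeatedly pick the atom with the largest
    absolute inner product with the residual and subtract its projection."""
    residual = list(signal)
    picks = []  # (atom index, coefficient) in order of selection
    for _ in range(n_iter):
        products = [
            sum(r * a for r, a in zip(residual, atom)) for atom in dictionary
        ]
        best = max(range(len(dictionary)), key=lambda k: abs(products[k]))
        coef = products[best]
        residual = [r - coef * a for r, a in zip(residual, dictionary[best])]
        picks.append((best, coef))
    return picks, residual
```

Because each iteration keeps only the highest-energy match, a few atoms can capture the dominant speech components while broadband noise stays in the residual.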

Keywords: Atomic Decomposition, Gabor, Gammatone, Matching Pursuit, Voice Activity Detection.

Downloads 1777

368 Online Collaborative Learning System Using Speech Technology

Authors: Sid-Ahmed. Selouani, Tang-Ho Lê, Chadia Moghrabi, Benoit Lanteigne, Jean Roy

Abstract:

A Web-based learning tool, the Learn IN Context (LINC) system, designed and being used in some institution's courses in mixed-mode learning, is presented in this paper. This mode combines face-to-face and distance approaches to education. LINC can achieve both collaborative and competitive learning. In order to provide both learners and tutors with a more natural way to interact with e-learning applications, a conversational interface has been included in LINC. Hence, the components and essential features of LINC+, the voice enhanced version of LINC, are described. We report evaluation experiments of LINC/LINC+ in a real use context of a computer programming course taught at the Université de Moncton (Canada). The findings show that when the learning material is delivered in the form of a collaborative and voice-enabled presentation, the majority of learners seem to be satisfied with this new media, and confirm that it does not negatively affect their cognitive load.

Keywords: E-learning, Knowledge Network, Speech recognition, Speech synthesis.

Downloads 1695

367 Hybrid Method Using Wavelets and Predictive Method for Compression of Speech Signal

Authors: Karima Siham Aoubid, Mohamed Boulemden

Abstract:

Signal compression algorithms are progressing steadily. They are continuously improved by new tools and aim to reduce, on average, the number of bits needed to represent the signal while minimizing the reconstruction error. This article proposes the compression of Arabic speech signals by a hybrid method combining the wavelet transform and linear prediction. The adopted approach rests, on the one hand, on the decomposition of the original signal by means of analysis filters, followed by the compression stage, and, on the other hand, on the application of linear prediction of order 5 to the compressed signal coefficients. The aim of this approach is to estimate the prediction error, which is then coded and transmitted. The decoding operation is then used to reconstitute the original signal. An adequate choice of the filter bank used in the transform is therefore necessary to increase the compression rate while inducing only imperceptible distortion from an auditory point of view.

Keywords: Compression, linear prediction analysis, multiresolution analysis, speech signal.

Downloads 1316

366 An Efficient Technique for EMI Mitigation in Fluorescent Lamps using Frequency Modulation and Evolutionary Programming

Authors: V. Sekar, T. G. Palanivelu, B. Revathi

Abstract:

Electromagnetic interference (EMI) is a serious problem in most electrical and electronic appliances, including fluorescent lamps. The electronic ballast used to regulate the power flow through the lamp is the major cause of EMI, because of its high-frequency switching operation. Earlier EMI mitigation techniques were not satisfactory because of hardware complexity in the circuit design, increased parasitic components, higher power consumption, and so on. Most researchers focus only on EMI mitigation without considering other constraints such as cost and effective operation of the equipment. In this paper, we propose a technique for EMI mitigation in fluorescent lamps that integrates Frequency Modulation and Evolutionary Programming. With Frequency Modulation, the switching at a single central frequency is extended to a range of frequencies, so the power is distributed throughout that range, which mitigates EMI. However, in order to meet the operating frequency of the ballast and the operating power of the fluorescent lamp, an optimal modulation index is necessary for Frequency Modulation. The optimal modulation index is determined using Evolutionary Programming. The proposed technique thereby mitigates EMI to a satisfactory level without disturbing the operation of the fluorescent lamp.
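The spreading effect of Frequency Modulation can be illustrated with a small sketch, under simplifying assumptions: the switching waveform is reduced to a single sinusoid at its fundamental, the modulating signal is also sinusoidal, and the modulation index of 5 used in the note below is arbitrary rather than the optimum found by Evolutionary Programming. All constants and function names are hypothetical.

```python
import cmath
import math

N = 512           # samples in the analysis window
CARRIER_BIN = 64  # switching fundamental, in cycles per window (assumed)
MOD_BIN = 4       # modulating frequency, in cycles per window (assumed)


def switching_tone(mod_index):
    """Sinusoid at the carrier, frequency-modulated by a slower sinusoid.

    mod_index is the peak phase deviation in radians; 0 means no modulation.
    """
    return [
        math.cos(
            2 * math.pi * CARRIER_BIN * t / N
            + mod_index * math.sin(2 * math.pi * MOD_BIN * t / N)
        )
        for t in range(N)
    ]


def dft_magnitudes(x):
    """Magnitudes of the DFT bins from 0 up to the Nyquist bin."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def peak_and_power(x):
    """Largest spectral line and total signal power (sum of squares)."""
    return max(dft_magnitudes(x)), sum(v * v for v in x)
```

Comparing `peak_and_power(switching_tone(0.0))` with `peak_and_power(switching_tone(5.0))` shows the total power staying essentially the same while the largest spectral line drops to less than half, since the energy moves into sidebands around the carrier.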

Keywords: Ballast, Electromagnetic interference (EMI), EMI mitigation, Evolutionary programming (EP), Fluorescent lamp, Frequency Modulation (FM), Modulation index.

Downloads 2262

365 Introduce the FWA in the Band 3300-3400 MHz

Authors: Lway F. Abdulrazak, Zaid A. Shamsan, Ali K. Aswad, Tharek Abd. Rahman

Abstract:

This paper studies a solution for deploying fixed wireless access (FWA) in the 3300-3400 MHz band instead of the 3400-3600 MHz band, in order to avoid harmful interference from FWA towards the fixed satellite service (FSS) receivers present in that band. The impact of FWA services on the FSS and the boundaries of the spectrum emission mask were considered in order to calculate the guard band required in this case. In addition, a supplementary separation distance was added to improve coexistence between the two adjacent bands. Simulations were carried out in Matlab based on ITU models, relying on the most common specifications used for countries with tropical weather. A review of the current interference problem between the two systems, and of some mitigation techniques adopted in Malaysia as a case study, is also part of this research.

Keywords: Coexistence, FSS, FWA, mask.

Downloads 1793

364 Modeling Concave Globoidal Cam with Swinging Roller Follower: A Case Study

Authors: Nguyen Van Tuong, Premysl Pokorny

Abstract:

This paper describes the computer-aided design of a concave globoidal cam with cylindrical rollers and a swinging follower. Four models are built with different modeling methods from the same input data. The input data are the angular input and output displacements of the cam and the follower, and some other geometrical parameters of the globoidal cam mechanism. The best cam model is the one that has no interference with the rollers when their motions are simulated in assembly conditions. The angular output displacement of the follower for the best cam is also compared with that in the input data to check for errors. In this study, Pro/ENGINEER® Wildfire 2.0 is used for modeling the cam, simulating motions, and checking interference and errors of the system.

Keywords: Globoidal cam, sweep, pitch surface, modeling.

Downloads 3650

363 Recognition of Noisy Words Using the Time Delay Neural Networks Approach

Authors: Khenfer-Koummich Fatima, Mesbahi Larbi, Hendel Fatiha

Abstract:

This paper presents a recognition system for isolated words such as robot commands, implemented with Time Delay Neural Networks (TDNN). The aim is to teleoperate a robot for specific tasks such as turn, close, etc. in an industrial environment, taking into account the noise coming from the machine. The TDNN was chosen for its generalization in terms of accuracy; moreover, it acts as a filter that passes certain desirable frequency characteristics of speech. The goal is to determine the parameters of this filter so that the system adapts to the variability of the speech signal and, especially, to noise; for this, the backpropagation technique was used in the learning phase. The approach was applied to commands pronounced in two languages separately: French and Arabic. The results for two test sets of 300 spoken words each are 87% and 97.6% in a neutral environment, and 77.67% and 92.67% when white Gaussian noise was added at an SNR of 35 dB.
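The noisy test condition in this abstract, white Gaussian noise at a given SNR, can be reproduced roughly as in the sketch below. It assumes the SNR is defined on average signal power over the whole utterance; the abstract does not say how the authors measured it, and `add_white_noise` is a hypothetical helper.

```python
import math
import random


def add_white_noise(signal, snr_db, rng=None):
    """Return the signal plus white Gaussian noise scaled to the target SNR.

    SNR (dB) = 10 * log10(signal power / noise power), with power taken as
    the mean of the squared samples. Pass a seeded random.Random for
    repeatable results.
    """
    rng = rng or random.Random()
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)  # standard deviation of the noise
    return [s + rng.gauss(0.0, sigma) for s in signal]
```

For example, `add_white_noise(samples, 35.0, random.Random(0))` produces a repeatable noisy copy at roughly 35 dB SNR, where `samples` is any list of floats.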

Keywords: Neural networks, Noise, Speech Recognition.

Downloads 1921