Search results for: Blind speech separation

616 Development of a Computer Vision System for the Blind and Visually Impaired Person

Abstract:

Eyes are an essential and conspicuous organ of the human body. Human eyes are outward and inward portals of the body that allow one to see the outside world and provide glimpses into one's inner thoughts and feelings. Inevitable blindness and visual impairments may result from eye-related disease, trauma, or congenital or degenerative conditions that cannot be corrected by conventional means. The study emphasizes innovative tools that will serve as an aid to blind and visually impaired (VI) individuals. The researchers fabricated a prototype that utilizes the Microsoft Kinect for Windows and an Arduino microcontroller board. The prototype facilitates advanced gesture recognition, voice recognition, obstacle detection and indoor environment navigation. Open Computer Vision (OpenCV) performs image analysis and gesture tracking to transform Kinect data into the desired output. A computer vision technology device provides greater accessibility for those with vision impairments.

Keywords: Algorithms, Blind, Computer Vision, Embedded Systems, Image Analysis.

Downloads 3598

615 Analysis of Blind Decision Feedback Equalizer Convergence: Interest of a Soft Decision

Authors: S. Cherif, S. Marcos, M. Jaidane

Abstract:

In this paper the behavior of the decision feedback equalizers (DFEs) adapted by the decision-directed or the constant modulus blind algorithms is presented. An analysis of the error surface of the corresponding criterion cost functions is first developed. With the intention of avoiding the ill-convergence of the algorithm, the paper proposes to modify the shape of the cost function error surface by using a soft decision instead of the hard one. This was shown to reduce the influence of false decisions and to smooth the undesirable minima. Modified algorithms using the soft decision during a pseudo-training phase with an automatic switch to the properly tracking phase are then derived. Computer simulations show that these modified algorithms present better ability to avoid local minima than conventional ones.

Keywords: Blind DFEs, decision-directed algorithm, constant modulus algorithm, cost function analysis, convergence analysis, soft decision.

Downloads 1878

614 A Blind SLM Scheme for Reduction of PAPR in OFDM Systems

Authors: K. Kasiri, M. J. Dehghani

Abstract:

In this paper we propose a blind algorithm for peak-to-average power ratio (PAPR) reduction in OFDM systems, based on the selected mapping (SLM) algorithm as a distortionless method. The main drawback of the conventional SLM technique is the need to transmit several side information bits for each data block, which results in a loss in data transmission rate. In the proposed method, a certain number of carriers in the OFDM frame are reserved to be rotated with one of the possible phases according to the number of phase sequence blocks in the SLM algorithm. Reserving a limited number of carriers does not affect the reduction in PAPR of the OFDM signal. Simulation results show that using an ML criterion at the receiver leads to the same system performance as the conventional SLM algorithm, while there is no need to send any side information to the receiver.
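The conventional SLM step that this abstract builds on can be illustrated with a minimal sketch. It only shows how one of several candidate phase sequences is chosen by lowest PAPR; it does not reproduce the blind, reserved-carrier signalling proposed by the authors. The symbol block, the three phase sequences, and all function names are assumptions made for illustration.

```python
import cmath
import math

def idft(symbols):
    """Inverse DFT: frequency-domain symbols to time-domain samples."""
    n = len(symbols)
    return [
        sum(s * cmath.exp(2j * math.pi * k * t / n) for k, s in enumerate(symbols)) / n
        for t in range(n)
    ]

def papr(samples):
    """Peak-to-average power ratio on a linear scale."""
    powers = [abs(x) ** 2 for x in samples]
    return max(powers) / (sum(powers) / len(powers))

def slm_select(symbols, phase_sequences):
    """Return (index, PAPR) of the candidate phase sequence with the lowest PAPR."""
    best = None
    for idx, phases in enumerate(phase_sequences):
        rotated = [s * p for s, p in zip(symbols, phases)]
        value = papr(idft(rotated))
        if best is None or value < best[1]:
            best = (idx, value)
    return best

# Hypothetical data: 8 identical BPSK symbols, the worst case for PAPR.
symbols = [1 + 0j] * 8
# Hypothetical candidate phase sequences; index 0 leaves the block unchanged.
phase_sequences = [
    [1] * 8,
    [1, -1, 1, 1, -1, 1, -1, -1],
    [1, 1j, -1, -1j, 1, -1j, -1, 1j],
]
baseline_papr = papr(idft(symbols))
best_index, best_papr = slm_select(symbols, phase_sequences)
```

In conventional SLM the value of `best_index` is what would have to be sent as side information; the abstract's contribution is to avoid that transmission.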

Keywords: Orthogonal Frequency Division Multiplexing (OFDM), Peak-to-Average Power Ratio (PAPR), Selected Mapping (SLM), Blind SLM (BSLM).

Downloads 2289

613 Convergence Analysis of a Prediction based Adaptive Equalizer for IIR Channels

Authors: Miloje S. Radenkovic, Tamal Bose

Abstract:

This paper presents the convergence analysis of a prediction based blind equalizer for IIR channels. Predictor parameters are estimated by using the recursive least squares algorithm. It is shown that the prediction error converges almost surely (a.s.) toward a scalar multiple of the unknown input symbol sequence. It is also proved that the convergence rate of the parameter estimation error is of the same order as that in the iterated logarithm law.

Keywords: Adaptive blind equalizer, Recursive least squares, Adaptive Filtering, Convergence analysis.

Downloads 1448

612 Speech Enhancement by Marginal Statistical Characterization in the Log Gabor Wavelet Domain

Authors: Suman Senapati, Goutam Saha

Abstract:

This work presents a fusion of Log Gabor Wavelet (LGW) and Maximum a Posteriori (MAP) estimator as a speech enhancement tool for acoustical background noise reduction. The probability density function (pdf) of the speech spectral amplitude is approximated by a Generalized Laplacian Distribution (GLD). Compared to earlier estimators the proposed method estimates the underlying statistical model more accurately by appropriately choosing the model parameters of GLD. Experimental results show that the proposed estimator yields a higher improvement in Segmental Signal-to-Noise Ratio (S-SNR) and lower Log-Spectral Distortion (LSD) in two different noisy environments compared to other estimators.

Keywords: Speech Enhancement, Generalized Laplacian Distribution, Log Gabor Wavelet, Bayesian MAP Marginal Estimator.

Downloads 1621

611 Performance Evaluation of an Inventive CO2 Gas Separation Inorganic Ceramic Membrane

Authors: Ngozi Nwogu, Mohammed Kajama, Edward Gobina

Abstract:

Atmospheric carbon dioxide emissions are considered the greatest environmental challenge the world is facing today. The tasks to control the emissions include the recovery of CO2 from flue gas. This concern has been addressed by recent advances in materials process engineering, resulting in the development of inorganic gas separation membranes with the excellent thermal and mechanical stability required for most gas separations. This paper, therefore, evaluates the performance of a highly selective inorganic membrane for CO2 recovery applications. Analysis of the results obtained is in agreement with experimental literature data. Further results show the prediction performance of the membranes for gas separation and the future direction of research. The materials selection and the membrane preparation techniques are discussed. Methods of reducing the interface defects in the membrane and their effect on the separation performance are also reviewed, along with advances that aim to fully exploit the potential of this innovative membrane.

Keywords: Carbon dioxide, gas separation, inorganic ceramic membrane, perm selectivity.

Downloads 2967

610 Numerical Study of Flow Separation Control over a NACA2415 Airfoil

Authors: M. Tahar Bouzaher

Abstract:

This study involves numerical simulation of the flow around a NACA2415 airfoil at an 18° angle of attack, and flow separation control using a rod. It involves placing a cylindrical rod upstream of the leading edge, in vertical translation movement, in order to accelerate the transition of the boundary layer by interaction between the rod wake and the boundary layer. The viscous, non-stationary flow is simulated using ANSYS FLUENT 13. The rod movement is reproduced using the dynamic mesh technique and an in-house developed UDF (User Defined Function). The frequency varies from 75 to 450 Hz and the considered amplitudes are 2% and 3% of the foil chord. The chosen frequency is close to the frequency of separation. Our results showed a substantial modification in the flow behavior and a maximum drag reduction of 61%.

Keywords: CFD, Flow separation, Active control, Boundary layer, rod, NACA 2415.

Downloads 2993

609 Precombining Adaptive LMMSE Detection for DS-CDMA Systems in Time Varying Channels: Non Blind and Blind Approaches

Authors: M. D. Kokate, T. R. Sontakke, P. W. Wani

Abstract:

This paper deals with an adaptive multiuser detector for direct sequence code division multiple-access (DS-CDMA) systems. A modified receiver, the precombining LMMSE, is considered under a time varying channel environment. Detector updating is performed with two criteria: mean square estimation (MSE) and the MOE optimization technique. The adaptive implementation issues of these two schemes are quite different. The MSE criterion updates the filter weights by minimizing the error between the data vector and the adaptive vector. The MOE criterion, together with the canonical representation of the detector, results in a constrained optimization problem. Even though the canonical representation is very complicated under time varying channels, it is analyzed under the assumption of an average power profile of the multipath replicas of the user of interest. The performance of both schemes is studied for practical SNR conditions. Results show that for poor SNR, the MSE precombining LMMSE is better than the blind precombining LMMSE, but for greater SNR the MOE scheme gives the better result.

Keywords: LMMSE, MOE, MUD.

Downloads 1492

608 From Maskee to Audible Noise in Perceptual Speech Enhancement

Authors: Asmaa Amehraye, Dominique Pastor, Ahmed Tamtaoui, Driss Aboutajdine

Abstract:

A new analysis of perceptual speech enhancement is presented. It focuses on the fact that if only noise above the masking threshold is filtered, then noise below the masking threshold, but above the absolute threshold of hearing, can become audible after the masker filtering. This particular drawback of some perceptual filters, hereafter called the maskee-to-audible-noise (MAN) phenomenon, favours the emergence of isolated tonals that increase musical noise. Two filtering techniques that avoid or correct the MAN phenomenon are proposed to effectively suppress background noise without introducing much distortion. Experimental results, including objective and subjective measurements, show that these techniques improve the enhanced speech quality and the gain they bring emphasizes the importance of the MAN phenomenon.

Keywords: Perceptual speech filtering, maskee to audible noise, distortion, musical noise.

Downloads 1487

607 Large Eddy Simulation of Flow Separation Control over a NACA2415 Airfoil

Authors: M. Tahar Bouzaher

Abstract:

This study involves a numerical simulation of the flow around a NACA2415 airfoil at a 15° angle of attack, and flow separation control using a rod. It relies on placing a cylindrical rod upstream of the leading edge in order to accelerate the transition of the boundary layer by interaction between the rod wake and the boundary layer. The viscous, non-stationary flow is simulated using ANSYS FLUENT 13. Our results showed a substantial modification in the flow behavior and a maximum drag reduction of 51%.

Keywords: CFD, Flow separation, Active control, Boundary layer, rod, NACA 2415.

Downloads 2456

606 A System of Automatic Speech Recognition based on the Technique of Temporal Retiming

Authors: Samir Abdelhamid, Noureddine Bouguechal

Abstract:

We report in this paper the procedure of an automatic speech recognition system based on dynamic programming techniques. Temporal retiming is a technique used to synchronize two forms that are to be compared. We will see how this technique is adapted to the field of automatic speech recognition. We first present the theory of the retiming function, which is used to compare and align an unknown form with a set of reference forms constituting the vocabulary of the application. We then give the various algorithms needed to implement it on a machine. The algorithms presented were tested on part of the Arabic-language word corpus Arabdic-10 [4] and gave fully satisfactory results. These algorithms are effective insofar as they are applied to small or medium-sized vocabularies.
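The comparison of an unknown form with reference forms by dynamic programming can be sketched as follows. This assumes that "temporal retiming" corresponds to what is commonly called dynamic time warping (DTW); the abstract does not give the recurrence, so the local cost, the step pattern, and the toy one-dimensional "features" below are illustrative assumptions only.

```python
def dtw_distance(a, b):
    """Cumulative alignment cost between sequences a and b (classic DTW recurrence)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning the first i items of a with the first j of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def recognize(unknown, references):
    """Return the vocabulary word whose reference form aligns best with the unknown form."""
    return min(references, key=lambda word: dtw_distance(unknown, references[word]))

# Hypothetical vocabulary of two reference forms (one scalar feature per frame).
references = {"one": [1, 2, 3, 2, 1], "two": [3, 3, 1, 1, 3]}
# An utterance of "one" spoken more slowly: some frames are repeated.
unknown = [1, 1, 2, 3, 3, 2, 1]
best = recognize(unknown, references)
```

Because every unknown form is compared against every reference, the cost grows with vocabulary size, which is consistent with the abstract's remark that the approach suits small or medium-sized vocabularies.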

Keywords: Continuous speech recognition, temporal retiming, phonetic decoding, algorithms, vocal signal, dynamic programming.

Downloads 1341

605 A proposed High-Resolution Time-Frequency Distribution for the Analysis of Multicomponent and Speech Signals

Authors: D. Boutana, B. Barkat, F. Marir

Abstract:

In this paper, we propose a novel time-frequency distribution (TFD) for the analysis of multi-component signals. In particular, we use synthetic as well as real-life speech signals to prove the superiority of the proposed TFD in comparison to some existing ones. In the comparison, we consider the cross-terms suppression and the high energy concentration of the signal around its instantaneous frequency (IF).

Keywords: Cohen's Class, Multicomponent signal, Separable Kernel, Speech signal, Time-frequency resolution.

Downloads 1861

604 A Completed Adaptive De-mixing Algorithm on Stiefel Manifold for ICA

Authors: Jianwei Wu

Abstract:

Based on the one-bit-matching principle and by turning the de-mixing matrix into an orthogonal matrix via certain normalization, Ma et al. proposed a one-bit-matching learning algorithm on the Stiefel manifold for independent component analysis [8]. But this algorithm is not adaptive. In this paper, an algorithm which can extract the kurtosis and its sign for each independent source component directly from observation data is first introduced. With this algorithm, the one-bit-matching learning algorithm is revised so that blind separation on the Stiefel manifold can be implemented completely in the adaptive mode in the framework of the natural gradient.
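The role of the kurtosis sign can be illustrated with a small sketch. It computes the sample excess kurtosis of a single signal and labels it super-Gaussian or sub-Gaussian; it is not the authors' algorithm, which extracts the kurtosis of each source from the mixed observations. The function names and the two toy signals are assumptions.

```python
def excess_kurtosis(samples):
    """Sample excess kurtosis: m4 / m2^2 - 3, which is zero for a Gaussian."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return m4 / (m2 ** 2) - 3.0

def source_type(samples):
    """Classify a signal by the sign of its excess kurtosis."""
    return "super-gaussian" if excess_kurtosis(samples) > 0 else "sub-gaussian"

# Hypothetical signals: a binary +/-1 sequence (flat, sub-Gaussian) and a
# mostly-zero sequence with rare large values (peaky, super-Gaussian).
binary = [-1.0, 1.0] * 50
peaky = [0.0] * 8 + [3.0, -3.0]

binary_kurtosis = excess_kurtosis(binary)
peaky_kurtosis = excess_kurtosis(peaky)
```

In one-bit-matching schemes, the sign obtained this way decides which model density is paired with each source.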

Keywords: Independent component analysis, kurtosis, Stiefel manifold, super-gaussians or sub-gaussians.

Downloads 1497

603 End Point Detection for Wavelet Based Speech Compression

Authors: Jalal Karam

Abstract:

In real-field applications, the correct determination of voice segments highly improves the overall system accuracy and minimises the total computation time. This paper presents reliable measures of speech compression by detecting the end points of the speech signals prior to compressing them. The two different compression schemes used are the Global threshold and the Level-Dependent threshold techniques. The performance of the proposed method is tested with the Signal to Noise Ratio, Peak Signal to Noise Ratio and Normalized Root Mean Square Error measures.

Keywords: Wavelets, End-points Detection, Compression.

Downloads 1374

602 The Effectiveness of Cognitive Behavioural Intervention in Alleviating Social Avoidance for Blind Students

Authors: Mohamed M. Elsherbiny

Abstract:

Social Avoidance is one of the most important problems facing a good number of disabled students. It results from the negative attitudes of non-disabled students, teachers and others. Some past research has shown that non-disabled individuals hold negative attitudes toward persons with disabilities. The present study aims to alleviate Social Avoidance by applying Cognitive Behavioral Intervention. Twenty-four blind university students aged 19–24 were randomly chosen. We compared an experimental group of 12 students, who went through the intervention program, with a control group of 12 students, who did not. We used the Social Avoidance and Distress Scale (SADS) to assess social anxiety and distress behavior. The author used many techniques of cognitive behavioral intervention such as modeling, cognitive restructuring, extension, contingency contracts, self-monitoring, assertiveness training, role play, encouragement and others. Statistically, the t-test was employed to test the research hypothesis. Results showed a significant difference between the experimental group and the control group after the intervention and also at the follow-up stages on the Social Avoidance and Distress Scale. For the experimental group, there was also a significant difference between the pre-intervention and follow-up stages on the scale. Results showed a decrease in social avoidance. Accordingly, the cognitive behavioral intervention program was successful in decreasing social avoidance for blind students.

Keywords: Social avoidance, cognitive behavioral intervention, blind disability, disability.

Downloads 1983

601 Blind Low Frequency Watermarking Method

Authors: Dimitar Taskovski, Sofija Bogdanova, Momcilo Bogdanov

Abstract:

We present a low frequency watermarking method adaptive to image content. The image content is analyzed and properties of HVS are exploited to generate a visual mask of the same size as the approximation image. Using this mask we embed the watermark in the approximation image without degrading the image quality. Watermark detection is performed without using the original image. Experimental results show that the proposed watermarking method is robust against most common image processing operations, which can be easily implemented and usually do not degrade the image quality.

Keywords: Blind, digital watermarking, low frequency, visual mask.

Downloads 1535

600 Mechanical Structure Design Optimization by Blind Number Theory: Time-dependent Reliability

Authors: Zakari Yaou, Lirong Cui

Abstract:

In a product development process, understanding the functional behavior of the system, the role of components in achieving functions, and the failure modes that arise if a component or subsystem fails to perform its required function will help develop an appropriate design validation and verification program for reliability assessment. The integration of these three issues will help design and reliability engineers identify weak spots in the design and plan future actions and testing programs. This case study demonstrates the advantage of unascertained theory in describing subjective cognition uncertainty, and then applies blind number (BN) theory to describe the uncertainty of the mechanical system failure process, while also using the same theory to derive another mechanical reliability system model. The practical calculations show that the BN model is simple and requires little computation yet has better forecasting capability, which gives it some value for macroscopic discussion.

Keywords: Mechanical structure Design, time-dependent stochastic process, unascertained information, blind number theory.

Downloads 1465

599 Blind Identification Channel Using Higher Order Cumulants with Application to Equalization for MC−CDMA System

Authors: Mohammed Zidane, Said Safi, Mohamed Sabri, Ahmed Boumezzough

Abstract:

In this paper we propose an algorithm based on higher order cumulants for blind impulse response identification of frequency radio channels and downlink (MC−CDMA) system equalization. In order to test its efficiency, we compared it with another algorithm proposed in the literature. For that, we considered a theoretical channel, the Proakis’s ‘B’ channel, and a practical frequency selective fading channel called Broadband Radio Access Network (BRAN C), normalized for (MC−CDMA) systems, excited by non-Gaussian sequences. In the (MC−CDMA) part, we use the Minimum Mean Square Error (MMSE) equalizer after the channel identification to correct the channel’s distortion. Simulation results, in a noisy environment and for different signal to noise ratios (SNR), are presented to illustrate the accuracy of the proposed algorithm.

Keywords: Blind identification and equalization, Higher Order Cumulants, (MC−CDMA) system, MMSE equalizer.

Downloads 1776

598 NonStationary CMA for Decision Feedback Equalization of Markovian Time Varying Channels

Authors: S. Cherif, M. Turki-Hadj Alouane

Abstract:

In this paper, we propose a modified version of the Constant Modulus Algorithm (CMA) tailored for blind Decision Feedback Equalizer (DFE) of first order Markovian time varying channels. The proposed NonStationary CMA (NSCMA) is designed so that it explicitly takes into account the Markovian structure of the channel nonstationarity. Hence, unlike the classical CMA, the NSCMA is not blind with respect to the channel time variations. This greatly helps the equalizer in the case of realistic channels, and avoids frequent transmissions of training sequences. This paper develops a theoretical analysis of the steady state performance of the CMA and the NSCMA for DFEs within a time varying context. Therefore, approximate expressions of the mean square errors are derived. We prove that in the steady state, the NSCMA exhibits better performance than the classical CMA. These new results are confirmed by simulation. Through an experimental study, we demonstrate that the Bit Error Rate (BER) is reduced by the NSCMA-DFE, and the improvement of the BER achieved by the NSCMA-DFE is as significant as the channel time variations are severe.

Keywords: Time varying channel, Markov model, Blind DFE, CMA, NSCMA.

Downloads 1293

597 A Stereo Image Processing System for Visually Impaired

Authors: G. Balakrishnan, G. Sainarayanan, R. Nagarajan, Sazali Yaacob

Abstract:

This paper presents a review of vision aided systems and proposes an approach for visual rehabilitation using stereo vision technology. The proposed system utilizes stereo vision, image processing methodology and a sonification procedure to support blind navigation. The developed system includes a wearable computer, stereo cameras as the vision sensor and stereo earphones, all moulded into a helmet. The image of the scene in front of the visually handicapped person is captured by the vision sensors. The captured images are processed to enhance the important features in the scene in front, for navigation assistance. The image processing is designed as a model of human vision by identifying the obstacles and their depth information. The processed image is mapped onto musical stereo sound for the blind person's understanding of the scene in front. The developed method has been tested in indoor and outdoor environments and the proposed image processing methodology is found to be effective for object identification.

Keywords: Blind navigation, stereo vision, image processing, object preference, music tones.

Downloads 4104

596 Separation Characteristics of the Hollow Fiber Membrane Module Using Water Mixed with Small Sized Bubbles Composed of Synthesized Exhalations

Authors: Pil Woo Heo, Hyunse Kim

Abstract:

Fish can breathe freely under water using dissolved oxygen and survive for a long time without leaving the water. A human can also survive under water using dissolved oxygen, if it is properly used. A human needs more dissolved oxygen than a fish, so an efficient separation device is required. Since the amount of oxygen contained in water is small, a person needs a large surface area to breathe in water, which leads to a large device. Such a device could be applied in various fields if it were developed in a small, portable size. In this paper, we have carried out a study on the effective use of exhalations and present the separation characteristics of the gas containing dissolved oxygen in a mixed-gas state that reflects the components of exhalation. The system was configured to produce fine bubbles when the gas mixture was injected into the front end of the separator. While the fluid containing the fine bubbles was supplied to the separator, the dissolved gas contained in the water was separated using a vacuum pump. The amount of gas separated by the apparatus was measured with respect to the amount of mixed gas supplied. The amount of dissolved gas separated increased as the amount of mixed gas supplied increased.

Keywords: Small sized bubbles, synthesized exhalations, separation, hollow fiber module.

Downloads 683

595 Assamese Numeral Speech Recognition using Multiple Features and Cooperative LVQ-Architectures

Authors: Manash Pratim Sarma, Kandarpa Kumar Sarma

Abstract:

A set of Artificial Neural Network (ANN) based methods is presented here for the design of an effective speech recognition system for numerals of the Assamese language captured under varied recording conditions and moods. The work is related to the formulation of several ANN models configured to use Linear Predictive Code (LPC), Principal Component Analysis (PCA) and other features to tackle mood and gender variations in uttered numbers as part of an Automatic Speech Recognition (ASR) system in Assamese. The ANN models are designed using a combination of Self Organizing Map (SOM) and Multi Layer Perceptron (MLP) constituting a Learning Vector Quantization (LVQ) block trained in a cooperative environment to handle male and female speech samples of numerals of Assamese, a language spoken by a sizable population in the North-Eastern part of India. The work provides a comparative evaluation of several such combinations when handling speech samples with gender based differences captured by a microphone in four different conditions, viz. noiseless, noise mixed, stressed and stress-free.

Keywords: Assamese, Recognition, LPC, Spectral, ANN.

Downloads 1985

594 Effect of Crude Oil Particle Elasticity on the Separation Efficiency of a Hydrocyclone

Authors: M. H. Narasingha, K. Pana-Suppamassadu, P. Narataruksa

Abstract:

The separation efficiency of a hydrocyclone has extensively been considered under the rigid particle assumption. A collection of experimental studies have demonstrated discrepancies from the modeling and simulation results. These discrepancies, caused by the actual particle elasticity, have generally led to a larger amount of energy consumption in the separation process. In this paper, the influence of particle elasticity on the separation efficiency of a hydrocyclone system was investigated through Finite Element (FE) simulations using crude oil droplets as the elastic particles. A Reitema's design hydrocyclone with a diameter of 8 mm was employed to investigate the separation mechanism of the crude oil droplets from water. The cut-size diameter of the crude oil was 10 μm in order to fit the operating range of the adopted hydrocyclone model. Typical parameters influencing the performance of the hydrocyclone were varied, with the feed pressure in the range of 0.3 to 0.6 MPa and the feed concentration between 0.05 and 0.1 w%. In the simulation, the Finite Element scheme was applied to investigate the particle-flow interaction that occurred in the crude oil system during the process. The interaction of a single oil droplet of size 10 μm with the flow field was observed. The feed concentration fell in the dilute flow regime, so the particle-particle interaction was ignored in the study. The results exhibited a higher power requirement for the separation of the elastic particulate system when compared with the rigid particulate system.

Keywords: Hydrocyclone, separation efficiency, strain energy density, strain rate.

Downloads 1797

593 SMaTTS: Standard Malay Text to Speech System

Authors: Othman O. Khalifa, Zakiah Hanim Ahmad, Teddy Surya Gunawan

Abstract:

This paper presents a rule-based text-to-speech (TTS) synthesis system for Standard Malay, namely SMaTTS. The proposed system uses a sinusoidal method and some pre-recorded wave files to generate speech. The use of a phone database significantly decreases the amount of computer memory used, thus making the system very light and embeddable. The overall system comprises two phases. The first is Natural Language Processing (NLP), which consists of the high-level processing of text analysis, phonetic analysis, text normalization and a morphophonemic module. The module was designed specially for SM to overcome a few problems in defining the rules for the SM orthography system before the text can be passed to the DSP module. The second phase is Digital Signal Processing (DSP), which operates on the low-level process of speech waveform generation. An intelligible and adequately natural-sounding formant-based speech synthesis system with a light and user-friendly Graphical User Interface (GUI) is introduced. A Standard Malay Language (SM) phoneme set and an inclusive phone database have been constructed carefully for this phone-based speech synthesizer. By applying generative phonology, comprehensive letter-to-sound (LTS) rules and a pronunciation lexicon have been devised for SMaTTS. For the evaluation tests, a Diagnostic Rhyme Test (DRT) word list was compiled and several experiments were performed to evaluate the quality of the synthesized speech by analyzing the Mean Opinion Score (MOS) obtained. The overall performance of the system as well as the room for improvement is thoroughly discussed.

Keywords: Natural Language Processing, Text-To-Speech (TTS), Diphone, source filter, low-/high-level synthesis.

Downloads 1965

592 Speech Activated Automation

Authors: Rui Antunes

Abstract:

This article presents a simple way to perform programmed voice commands for the interface with commercial Digital and Analogue Input/Output PCI cards, used in Robotics and Automation applications. Robots and Automation equipment can "listen" to voice commands and perform several different tasks, approaching human behavior and improving the human-machine interfaces for the Automation Industry. Since most PCI Digital and Analogue Input/Output cards are sold with several DLLs included (for use with different programming languages), it is possible to add speech recognition capability using a standard speech recognition engine compatible with the programming languages used. In this work, a Visual Basic 6 (the world's most popular language) application was created that listens to several voice commands and is capable of communicating directly with several standard 128 Digital I/O PCI Cards, used to control complete Automation Systems, with up to (number of boards used) x 128 Sensors and/or Actuators.

Keywords: Speech Recognition, Automation, Robotics.

Downloads 1830

591 A Blind Digital Watermark in Hadamard Domain

Authors: Saeid Saryazdi, Hossein Nezamabadi-pour

Abstract:

A new blind gray-level watermarking scheme is described. In the proposed method, the host image is first divided into 4*4 non-overlapping blocks. For each block, two first AC coefficients of its Hadamard transform are then estimated using DC coefficients of its neighbor blocks. A gray-level watermark is then added into estimated values. Since embedding watermark does not change the DC coefficients, watermark extracting could be done by estimating AC coefficients and comparing them with their actual values. Several experiments are made and results suggest the robustness of the proposed algorithm.

Keywords: Digital Watermarking, Image watermarking, Information Hiding, Steganography.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2256

590 Virtual Speaking Head for Hearing Impaired Students

Authors: Eva Pajorová, Ladislav Hluchý

Abstract:

The tool presented here is one of a set of system tools for easier access to various scientific areas and for real-time interactive learning between a lecturer and hearing-impaired students. The lecturer is not required to know Sign Language (SL). Instead, the new software tools translate regular speech into SL, which is then transferred to the student. In the other direction, the student's questions (in SL) are translated and transferred to the lecturer as text or speech. The presented tool is one of these: it is a tool for developing correct speech visemes as the basis of the total communication method for hearing-impaired students.

Keywords: Impaired people, sign language, communication methods.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1840

589 Noise Estimation for Speech Enhancement in Non-Stationary Environments-A New Method

Authors: Ch. V. Rama Rao, Gowthami, Harsha, Rajkumar, M. B. Rama Murthy, K. Srinivasa Rao, K. Anitha Sheela

Abstract:

This paper presents a new method for estimating the non-stationary noise power spectral density given a noisy signal. The method is based on averaging the noisy speech power spectrum using time- and frequency-dependent smoothing factors. These factors are adjusted based on the signal-presence probability in individual frequency bins. Signal presence is determined by computing the ratio of the noisy speech power spectrum to its local minimum, which is updated continuously by averaging past values of the noisy speech power spectra with a look-ahead factor. This method adapts very quickly to highly non-stationary noise environments. The proposed method achieves significant improvements over a system that uses a voice activity detector (VAD) for noise estimation.
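The update loop described above can be sketched per frequency bin as follows. This is a minimal illustration in the style of minima-controlled recursive averaging, not the authors' exact algorithm: the recursions, the constants (`ETA`, `GAMMA`, `BETA`, `ALPHA_P`, `ALPHA_D`, `DELTA`), and the function names are assumptions, since the abstract gives neither formulas nor parameter values.

```python
# Assumed constants; none of these values come from the abstract.
ETA = 0.7        # smoothing of the noisy power spectrum
GAMMA = 0.998    # memory of the tracked local minimum
BETA = 0.96      # look-ahead factor in the minimum tracker
ALPHA_P = 0.2    # smoothing of the signal-presence probability
ALPHA_D = 0.85   # baseline smoothing of the noise estimate
DELTA = 5.0      # ratio threshold for declaring signal presence


def init_state(power):
    """Start every tracker from the first frame's power spectrum."""
    n = len(power)
    return {"P": list(power), "Pmin": list(power),
            "prob": [0.0] * n, "D": list(power)}


def update(state, power):
    """Consume one frame of |Y(k)|^2 values and return the noise estimate."""
    for k, y2 in enumerate(power):
        p_prev = state["P"][k]
        p_now = ETA * p_prev + (1 - ETA) * y2
        # Continuous minimum tracking with a look-ahead term.
        if state["Pmin"][k] < p_now:
            pmin = (GAMMA * state["Pmin"][k]
                    + (1 - GAMMA) / (1 - BETA) * (p_now - BETA * p_prev))
        else:
            pmin = p_now
        pmin = max(pmin, 1e-12)  # guard against a non-positive minimum
        present = 1.0 if p_now / pmin > DELTA else 0.0
        prob = ALPHA_P * state["prob"][k] + (1 - ALPHA_P) * present
        # Time- and frequency-dependent smoothing factor: close to 1 when
        # speech is likely present, so the noise estimate is nearly frozen.
        alpha_s = ALPHA_D + (1 - ALPHA_D) * prob
        state["D"][k] = alpha_s * state["D"][k] + (1 - alpha_s) * y2
        state["P"][k], state["Pmin"][k], state["prob"][k] = p_now, pmin, prob
    return state["D"]


# Two bins of unit-power noise, then a loud burst in bin 1 only.
state = init_state([1.0, 1.0])
for _ in range(50):
    update(state, [1.0, 1.0])
for _ in range(5):
    noise = update(state, [1.0, 100.0])
print(noise)
```

In this toy run the estimate in the burst bin stays far below the burst power, whereas a fixed smoothing factor of 0.85 would let it climb past half of that power within five frames.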

Keywords: Noise estimation, Non-stationary noise, Speech enhancement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2335

588 Automatic Distance Compensation for Robust Voice-based Human-Computer Interaction

Authors: Randy Gomez, Keisuke Nakamura, Kazuhiro Nakadai

Abstract:

Distant-talking voice-based HCI systems suffer from performance degradation due to mismatch between the acoustic speech (runtime) and the acoustic model (training). The mismatch is caused by the change in the power of the speech signal as observed at the microphones. This change is greatly influenced by the change in distance, which affects speech dynamics inside the room before the signal reaches the microphones. Moreover, as the speech signal is reflected, its acoustical characteristics are also altered by the room properties. In general, power mismatch due to distance is a complex problem. This paper presents a novel approach to dealing with distance-induced mismatch by intelligently sensing instantaneous voice power variation and compensating model parameters. First, the distant-talking speech signal is processed through microphone array processing, and the corresponding distance information is extracted. Distance-sensitive Gaussian Mixture Models (GMMs), pre-trained to capture both speech power and room properties, are used to predict the optimal distance of the speech source. Consequently, pre-computed statistical priors corresponding to the optimal distance are selected to correct the statistics of the generic model, which was frozen during training. Thus, model combinatorics are post-conditioned to match the power of instantaneous speech acoustics at runtime. This results in an improved likelihood of predicting the correct speech command at farther distances. We experiment using real data recorded inside two rooms. Experimental evaluation shows that voice recognition performance using our method is more robust to changes in distance than the conventional approach. In our experiment, under the most acoustically challenging environment (i.e., Room 2: 2.5 meters), our method achieved a 24.2% improvement in recognition performance over the best-performing conventional method.
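The distance-prediction step can be illustrated by scoring an utterance against one GMM per candidate distance and keeping the best-scoring distance. The sketch below assumes one-dimensional log-power features and made-up mixture parameters; the `MODELS` table, the candidate distances, and the function names are hypothetical, and the subsequent selection of pre-computed priors for the chosen distance is not shown.

```python
import math


def gmm_loglik(samples, components):
    """Total log-likelihood of 1-D samples under a GMM.

    components is a list of (weight, mean, variance) tuples.
    """
    total = 0.0
    for x in samples:
        terms = [
            math.log(w) - 0.5 * math.log(2 * math.pi * var)
            - (x - mu) ** 2 / (2 * var)
            for w, mu, var in components
        ]
        m = max(terms)  # log-sum-exp for numerical stability
        total += m + math.log(sum(math.exp(t - m) for t in terms))
    return total


# Hypothetical distance-sensitive models: distance in meters -> GMM over
# frame log-power (dB). Mean power drops as the talker moves away.
MODELS = {
    0.5: [(0.6, -10.0, 4.0), (0.4, -14.0, 9.0)],
    1.5: [(0.6, -18.0, 4.0), (0.4, -22.0, 9.0)],
    2.5: [(0.6, -26.0, 4.0), (0.4, -30.0, 9.0)],
}


def predict_distance(samples, models=MODELS):
    """Return the candidate distance whose GMM best explains the samples."""
    return max(models, key=lambda d: gmm_loglik(samples, models[d]))


print(predict_distance([-25.0, -27.5, -29.0]))
```

In a full system the returned distance would serve as the key for looking up the compensation statistics applied to the generic acoustic model.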

Keywords: Human Machine Interaction, Human Computer Interaction, Voice Recognition, Acoustic Model Compensation, Acoustic Speech Enhancement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1877

587 Absence of Developmental Change in Epenthetic Vowel Duration in Japanese Speakers’ English

Authors: Takayuki Konishi, Kakeru Yazawa, Mariko Kondo

Abstract:

This study examines developmental change in the production of epenthetic vowels by Japanese learners of English in relation to acquisition of L2 English speech rhythm. Seventy-two Japanese learners of English in the J-AESOP corpus were divided into lower- and higher-level learners according to their proficiency score and the frequency of vowel epenthesis. Three learners were excluded because no vowel epenthesis was observed in their utterances. The analysis of their read English speech data showed no statistical difference between lower- and higher-level learners, implying the absence of any developmental change in durations of epenthetic vowels. This result, together with the findings of previous studies, will be discussed in relation to the transfer of L1 phonology and manifestation of L2 English rhythm.

Keywords: Vowel epenthesis, Japanese learners of English, L2 speech corpus, speech rhythm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1118