Search results for: Perceptual speech filtering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 558

378 A Simple Adaptive Atomic Decomposition Voice Activity Detector Implemented by Matching Pursuit

Authors: Thomas Bryan, Veton Kepuska, Ivica Kostanic

Abstract:

A simple adaptive voice activity detector (VAD) is implemented using Gabor and gammatone atomic decomposition of speech for high Gaussian noise environments. Matching pursuit is used for atomic decomposition and is shown to achieve optimal speech detection capability at high data compression rates for low signal-to-noise ratios. The most active dictionary elements found by matching pursuit are used for the signal reconstruction so that the algorithm adapts to the individual speaker's dominant time-frequency characteristics. Speech has a high peak-to-average ratio, which enables the matching pursuit greedy heuristic of highest inner products to isolate high-energy speech components in high noise environments. Gabor and gammatone atoms are both investigated with identical logarithmically spaced center frequencies and similar bandwidths. The algorithm performs equally well for both Gabor and gammatone atoms, with no statistically significant differences. The algorithm achieves 70% accuracy at a 0 dB SNR, 90% accuracy at a 5 dB SNR and 98% accuracy at a 20 dB SNR, using 30 dB SNR as a reference for voice activity.

Keywords: Atomic Decomposition, Gabor, Gammatone, Matching Pursuit, Voice Activity Detection.

Downloads 1752
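The greedy selection step described in the abstract of entry 378 can be sketched as follows. This is a minimal illustration, not the authors' detector: the dictionary size, atom width, and frequency grid are assumptions chosen for the example, only Gabor atoms are shown, and the voice activity decision itself is not modelled.

```python
import math

def gabor_atom(n, center, freq, width):
    """Unit-norm real Gabor atom: a Gaussian-windowed cosine of length n."""
    atom = [math.exp(-0.5 * ((t - center) / width) ** 2)
            * math.cos(2 * math.pi * freq * (t - center)) for t in range(n)]
    norm = math.sqrt(sum(a * a for a in atom))
    return [a / norm for a in atom]

def matching_pursuit(signal, dictionary, n_iter):
    """Greedy decomposition: pick the atom with the largest |inner product|,
    subtract its contribution from the residual, and repeat."""
    residual = list(signal)
    picks = []
    for _ in range(n_iter):
        products = [sum(r * a for r, a in zip(residual, atom)) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(products[k]))
        coeff = products[best]
        residual = [r - coeff * a for r, a in zip(residual, dictionary[best])]
        picks.append((best, coeff))
    return picks, residual

# Hypothetical dictionary: logarithmically spaced centre frequencies (cycles/sample).
N = 64
freqs = [0.02 * 2 ** (k / 2) for k in range(8)]
dictionary = [gabor_atom(N, N / 2, f, 8.0) for f in freqs]

# A test signal built from one atom should be recovered in a single iteration.
signal = [2.0 * a for a in dictionary[3]]
picks, residual = matching_pursuit(signal, dictionary, 1)
```

Because every atom has unit norm, the inner product with the residual is also the coefficient to subtract, which is what keeps each iteration cheap.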
377 Context-aware Recommender Systems using Data Mining Techniques

Authors: Kyoung-jae Kim, Hyunchul Ahn, Sangwon Jeong

Abstract:

This study proposes a novel recommender system to provide the advertisements of context-aware services. Our proposed model is designed to apply a modified collaborative filtering (CF) algorithm with regard to several dimensions for the personalization of mobile devices: location, time and the user's needs type. In particular, we employ a classification rule to understand the user's needs type using a decision tree algorithm. In addition, we collect primary data from mobile phone users and apply them to the proposed model to validate its effectiveness. Experimental results show that the proposed system makes more accurate and satisfactory advertisements than comparative systems.

Keywords: Location-based advertisement, Recommender system, Collaborative filtering, User needs type, Mobile user.

Downloads 2135
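For orientation, plain user-based collaborative filtering, the baseline that the abstract of entry 377 modifies, can be sketched as below. The context dimensions (location, time, needs type) and the decision tree are deliberately left out, and the user names, item names, and ratings are invented for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity over the items both users have rated."""
    common = [i for i in u if i in v]
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    nu = math.sqrt(sum(u[i] ** 2 for i in common))
    nv = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (nu * nv) if nu and nv else 0.0

def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        w = cosine(ratings[user], theirs)
        num += w * theirs[item]
        den += abs(w)
    return num / den if den else None

# Hypothetical ratings of three advertisements by three users.
ratings = {
    "ann": {"ad1": 5, "ad2": 1},
    "bob": {"ad1": 5, "ad2": 1, "ad3": 4},
    "cat": {"ad1": 1, "ad2": 5, "ad3": 2},
}
score = predict(ratings, "ann", "ad3")
```

Here "bob" rates exactly like "ann" on the shared items, so his rating of "ad3" dominates the prediction.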
376 Online Collaborative Learning System Using Speech Technology

Authors: Sid-Ahmed. Selouani, Tang-Ho Lê, Chadia Moghrabi, Benoit Lanteigne, Jean Roy

Abstract:

A Web-based learning tool, the Learn IN Context (LINC) system, designed and being used in some institution's courses in mixed-mode learning, is presented in this paper. This mode combines face-to-face and distance approaches to education. LINC can achieve both collaborative and competitive learning. In order to provide both learners and tutors with a more natural way to interact with e-learning applications, a conversational interface has been included in LINC. Hence, the components and essential features of LINC+, the voice-enhanced version of LINC, are described. We report evaluation experiments of LINC/LINC+ in a real use context of a computer programming course taught at the Université de Moncton (Canada). The findings show that when the learning material is delivered in the form of a collaborative and voice-enabled presentation, the majority of learners seem to be satisfied with this new medium, and confirm that it does not negatively affect their cognitive load.

Keywords: E-learning, Knowledge Network, Speech recognition, Speech synthesis.

Downloads 1666
375 Hybrid Method Using Wavelets and Predictive Method for Compression of Speech Signal

Authors: Karima Siham Aoubid, Mohamed Boulemden

Abstract:

Signal compression algorithms are progressing considerably. These algorithms are continuously improved by new tools and aim to reduce, on average, the number of bits necessary for the signal representation while minimizing the reconstruction error. This article proposes the compression of Arabic speech signals by a hybrid method combining the wavelet transform and linear prediction. The adopted approach rests, on the one hand, on the decomposition of the original signal by means of analysis filters, followed by the compression stage, and, on the other hand, on the application of linear prediction of order 5 to the coefficients of the compressed signal. The aim of this approach is the estimation of the prediction error, which is coded and transmitted. The decoding operation is then used to reconstitute the original signal. Thus, an adequate choice of the filter bank used for the transform is necessary to increase the compression rate while inducing a distortion that is imperceptible from an auditory point of view.

Keywords: Compression, linear prediction analysis, multiresolution analysis, speech signal.

Downloads 1286
374 A Novel Tracking Method Using Filtering and Geometry

Authors: Sang Hoon Lee, Jong Sue Bae, Taewan Kim, Jin Mo Song, Jong Ju Kim

Abstract:

Image target detection and tracking methods based on target information such as intensity, shape model, histogram and target dynamics have been proven to be robust to target model variations and background clutter, as shown by recent research. However, no definitive answer has been given for targets occluded by countermeasures or by a limited field of view (FOV). In this paper, we present a novel tracking method using filtering and computational geometry. This paper has two central goals: 1) to deal with vulnerable target measurements; and 2) to maintain target tracking out of the FOV using non-target-originated information. The experimental results, obtained with airborne images, show a robust tracking ability with respect to the existing approaches. In exploring the questions of target tracking, this paper is limited to the consideration of airborne images.

Keywords: Tracking, Computational geometry, Homography, Filter

Downloads 1738
373 Minimizing of Target Localization Error using Multi-robot System and Particle Filters

Authors: Jana Puchyova

Abstract:

In recent years, the number of applications of multi-robot systems (MRS) has been growing in various areas. However, their design is often difficult in practice: algorithms are proposed on a theoretical basis and do not consider errors and noise under real conditions, so they are not usable in real environments. These errors are also clearly visible in the task of target localization, in which robots try to find the target and estimate its position using their sensors. Target localization is possible with a single robot but, as was examined, finding and localizing the target with a group of mobile robots can estimate the target position more accurately and faster. The accuracy of the target position estimate is achieved by the cooperation of the MRS and particle filtering. The advantage of using an MRS with particle filtering was tested on the task of localizing a fixed target with a group of mobile robots.

Keywords: Multi-robot system, particle filter, position estimation, target localization.

Downloads 1520
372 EMD-Based Signal Noise Reduction

Authors: A.O. Boudraa, J.C. Cexus, Z. Saidi

Abstract:

This paper introduces a new signal denoising method based on the Empirical mode decomposition (EMD) framework. The method is a fully data-driven approach. The noisy signal is decomposed adaptively into oscillatory components called Intrinsic mode functions (IMFs) by means of a process called sifting. The EMD denoising involves filtering or thresholding each IMF and reconstructs the estimated signal using the processed IMFs. The EMD can be combined with a filtering approach or with a nonlinear transformation. In this work the Savitzky-Golay filter and soft thresholding are investigated. For thresholding, IMF samples are shrunk or scaled below a threshold value. The standard deviation of the noise is estimated for every IMF. The threshold is derived for Gaussian white noise. The method is tested on simulated and real data and compared with averaging, median and wavelet approaches.

Keywords: Empirical mode decomposition, Signal denoising, nonstationary process.

Downloads 3911
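The thresholding branch described in the abstract of entry 372 can be illustrated with a short sketch. It assumes the IMFs have already been produced by sifting (not shown) and that a threshold has been chosen for each IMF; how the paper derives those thresholds from the noise estimate is not reproduced here, and the function names are illustrative.

```python
def soft_threshold(samples, tau):
    """Shrink each sample toward zero by tau; samples with |x| <= tau become 0."""
    out = []
    for x in samples:
        mag = abs(x) - tau
        out.append(0.0 if mag <= 0 else (mag if x > 0 else -mag))
    return out

def denoise(imfs, thresholds):
    """Threshold each IMF with its own tau, then sum the processed IMFs
    sample by sample to reconstruct the estimated signal."""
    processed = [soft_threshold(imf, tau) for imf, tau in zip(imfs, thresholds)]
    return [sum(column) for column in zip(*processed)]

# Two made-up IMFs of two samples each, with per-IMF thresholds.
estimate = denoise([[3.0, 0.5], [-2.0, 1.0]], [1.0, 0.5])
```

Soft thresholding shrinks the surviving samples as well as zeroing the small ones, which avoids the discontinuity that hard thresholding introduces at the threshold.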
371 Modeling and Visualizing Seismic Wave Propagation in Elastic Medium Using Multi-Dimension Wave Digital Filtering Approach

Authors: Jason Chien-Hsun Tseng, Nguyen Dong-Thai Dao, Chong-Ching Chang

Abstract:

A novel PDE solver using the multidimensional wave digital filtering (MDWDF) technique to achieve the solution of a 2D seismic wave system is presented. In essence, the continuous physical system served by a linear Kirchhoff circuit is transformed to an equivalent discrete dynamic system implemented by an MD wave digital filtering (MDWDF) circuit. This amounts to numerically approximating the differential equations used to describe elements of an MD passive electronic circuit by grid-based difference equations implemented by the so-called state quantities within the passive MDWDF circuit. So the digital model can track the wave field on a dense 3D grid of points. Details about how to transform the continuous system into a desired discrete passive system are addressed. In addition, initial and boundary conditions are properly embedded into the MDWDF circuit in terms of state quantities. Graphic results have clearly demonstrated some physical effects of seismic wave (P-wave and S-wave) propagation including radiation, reflection, and refraction from and across the hard boundaries. Comparison between the MDWDF technique and the finite difference time domain (FDTD) approach is also made in terms of the computational efficiency.

Keywords: Seismic Wave Propagation, Multi-dimension Wave Digital Filters, Partial Differential Equations.

Downloads 1388
370 Recognition of Noisy Words Using the Time Delay Neural Networks Approach

Authors: Khenfer-Koummich Fatima, Mesbahi Larbi, Hendel Fatiha

Abstract:

This paper presents a recognition system for isolated words such as robot commands. It is implemented with Time Delay Neural Networks (TDNN). The aim is to teleoperate a robot for specific tasks such as turn, close, etc. in an industrial environment, taking into account the noise coming from the machine. The choice of TDNN is based on its generalization in terms of accuracy; moreover, it acts as a filter that passes certain desirable frequency characteristics of speech. The goal is to determine the parameters of this filter so that the system adapts to the variability of the speech signal and, especially, to noise; for this, the back-propagation technique was used in the learning phase. The approach was applied to commands pronounced in two languages separately: French and Arabic. The results for two test bases of 300 spoken words each are 87% and 97.6% in a neutral environment, and 77.67% and 92.67% when white Gaussian noise was added with an SNR of 35 dB.

Keywords: Neural networks, Noise, Speech Recognition.

Downloads 1894
369 Speech Encryption and Decryption Using Linear Feedback Shift Register (LFSR)

Authors: Tin Lai Win, Nant Christina Kyaw

Abstract:

This paper considers the problem of cryptanalysis of stream ciphers. Some attempts are needed to improve the existing attacks on stream ciphers and to distinguish the portions of cipher text obtained by the encryption of plain text in which some parts of the text are random and the rest are non-random. This paper presents a tutorial introduction to symmetric cryptography. The basic information-theoretic and computational properties of classic and modern cryptographic systems are presented, followed by an examination of the application of cryptography to the security of a VoIP system in computer networks using the LFSR algorithm. The implementation program will be developed in Java 2. The LFSR algorithm is appropriate for the encryption and decryption of online streaming data, e.g. VoIP (voice chatting over IP). This paper implements an encryption module that converts speech signals to cipher text and a decryption module that converts cipher text back to speech signals.

Keywords: Linear Feedback Shift Register.

Downloads 3068
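The encrypt/decrypt symmetry mentioned in the abstract of entry 369 can be shown with a small sketch. The paper's implementation is in Java; this Python version is only an illustration, and the register width, tap positions, seed, and sample bytes are assumptions, not values taken from the paper.

```python
def lfsr_keystream(seed, nbytes, taps=(16, 14, 13, 11), width=16):
    """Fibonacci LFSR: shift right one bit at a time, feeding back the XOR
    of the tapped bits, and collect the emitted bits into bytes."""
    state = seed & ((1 << width) - 1)
    if state == 0:
        raise ValueError("seed must be non-zero")
    out = bytearray()
    for _ in range(nbytes):
        byte = 0
        for _ in range(8):
            byte = (byte << 1) | (state & 1)          # emit the low bit
            feedback = 0
            for t in taps:
                feedback ^= (state >> (width - t)) & 1
            state = (state >> 1) | (feedback << (width - 1))
        out.append(byte)
    return bytes(out)

def xor_stream(data, seed):
    """XOR data with the keystream; the same call encrypts and decrypts."""
    keystream = lfsr_keystream(seed, len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))

frame = bytes([0, 12, 250, 128, 7])      # stand-in for a block of speech samples
cipher = xor_stream(frame, seed=0xACE1)
plain = xor_stream(cipher, seed=0xACE1)
```

Since XOR is its own inverse, both ends only need to share the seed and tap configuration, which is why the approach suits streaming data processed block by block.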
368 Quality Control of Automotive Gearbox Based On Vibration Signal Analysis

Authors: Nilson Barbieri, Bruno Matos Martins, Gabriel de Sant'Anna Vitor Barbieri

Abstract:

In more complex systems, such as automotive gearboxes, a rigorous treatment of the data is necessary because there are several moving parts (gears, bearings, shafts, etc.) and therefore several possible sources of errors and noise. The basic objective of this work is the detection of damage in automotive gearboxes. The detection methods used are the wavelet method, the bispectrum, advanced filtering techniques (selective filtering) of vibrational signals and mathematical morphology. Gearbox vibration tests were performed (gearboxes in good condition and with defects) on a production line of a large vehicle assembler. The vibration signals are obtained using five accelerometers in different positions of the sample. The results obtained using the kurtosis, bispectrum, wavelet and mathematical morphology showed that it is possible to identify the existence of defects in automotive gearboxes.

Keywords: Automotive gearbox, mathematical morphology, wavelet, bispectrum.

Downloads 2272
367 Matrix-Interleaved Serially Concatenated Block Codes for Speech Transmission in Fixed Wireless Communication Systems

Authors: F. Mehran

Abstract:

In this paper, we study a class of serially concatenated block codes (SCBC) based on matrix interleavers, to be employed in fixed wireless communication systems. The performances of SCBC-coded systems are investigated under various interleaver dimensions. Numerical results reveal that the matrix interleaver could be a competitive candidate over conventional block interleaver for frame lengths of 200 bits; hence, the SCBC coding based on matrix interleaver is a promising technique to be employed for speech transmission applications in many international standards such as pan-European Global System for Mobile communications (GSM), Digital Cellular Systems (DCS) 1800, and Joint Detection Code Division Multiple Access (JD-CDMA) mobile radio systems, where the speech frame contains around 200 bits.

Keywords: Matrix Interleaver, serial concatenated block codes (SCBC), turbo codes, wireless communications.

Downloads 1893
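A row-in, column-out permutation is one common definition of a matrix interleaver, and it is assumed here to illustrate the abstract of entry 367; the paper's exact construction and dimensions may differ. The 10 x 20 shape below is simply one way to cover a 200-bit frame.

```python
def interleave(bits, rows, cols):
    """Write row by row into a rows x cols matrix, read column by column."""
    if len(bits) != rows * cols:
        raise ValueError("frame length must equal rows * cols")
    return [bits[r * cols + c] for c in range(cols) for r in range(rows)]

def deinterleave(bits, rows, cols):
    """Inverse permutation: write column by column, read row by row."""
    if len(bits) != rows * cols:
        raise ValueError("frame length must equal rows * cols")
    return [bits[c * rows + r] for r in range(rows) for c in range(cols)]

# A 200-bit frame arranged as 10 rows by 20 columns (illustrative dimensions).
frame = [i % 2 for i in range(200)]
restored = deinterleave(interleave(frame, 10, 20), 10, 20)
```

Adjacent input bits end up `rows` positions apart after interleaving, so a burst of channel errors is spread across different codewords before the inner and outer decoders see it.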
366 Hybrid Recommender Systems using Social Network Analysis

Authors: Kyoung-Jae Kim, Hyunchul Ahn

Abstract:

This study proposes a novel hybrid social network analysis and collaborative filtering approach to enhance the performance of recommender systems. The proposed model selects subgroups of users in an Internet community through social network analysis (SNA), and then performs clustering analysis using the information about the subgroups. Finally, it makes recommendations using cluster-indexing CF based on the clustering results. This study tries to use the cores in subgroups as initial seeds for a conventional clustering algorithm. The model chooses the five cores which have the highest degree centrality from SNA, and then performs clustering analysis by using the cores as initial centroids (cluster centers). Then, the model amplifies the impact of friends in the social network in the process of cluster-indexing CF.

Keywords: Social network analysis, Recommender systems, Collaborative filtering, Customer relationship management

Downloads 2720
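The seeding idea in the abstract of entry 366, picking the highest-degree users and using them as initial centroids, can be sketched as below. The friendship edges and two-dimensional user profiles are invented, two seeds are used instead of the paper's five, and the cluster-indexing CF stage is not shown.

```python
def top_degree_nodes(edges, k):
    """Return the k nodes with the highest degree (ties broken by name)."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sorted(degree, key=lambda n: (-degree[n], n))[:k]

def kmeans(points, seeds, n_iter=10):
    """Plain k-means that starts from the supplied seed points."""
    centroids = [tuple(s) for s in seeds]
    clusters = []
    for _ in range(n_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
            clusters[nearest].append(p)
        centroids = [tuple(sum(c) / len(members) for c in zip(*members)) if members
                     else centroids[i] for i, members in enumerate(clusters)]
    return centroids, clusters

# Hypothetical friendship graph and user preference vectors.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("e", "d")]
profiles = {"a": (0.0, 0.0), "b": (10.0, 10.0), "c": (0.0, 1.0),
            "d": (10.0, 9.0), "e": (1.0, 0.0)}
cores = top_degree_nodes(edges, 2)
centroids, clusters = kmeans(list(profiles.values()), [profiles[n] for n in cores])
```

Seeding from well-connected users replaces the random initialization of conventional k-means, so repeated runs on the same data give the same clusters.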
365 Application of Hardware Efficient CIC Compensation Filter in Narrow Band Filtering

Authors: Vishal Awasthi, Krishna Raj

Abstract:

In many communication and signal processing systems, it is highly desirable to implement an efficient narrow-band filter that decimates or interpolates the incoming signals. This paper presents a hardware-efficient compensated CIC filter over a narrow frequency band that increases the speed of downsampling by using multiplierless decimation filters with a polyphase FIR filter structure. The proposed work analyzes the performance of the compensated CIC filter on the basis of the improvement in frequency response with reduced hardware complexity, in terms of the number of adders and multipliers, and produces the filtered results without any alterations. The results demonstrate that using a compensator with the CIC filter improves the frequency response in the passband of interest by 26.57%, with a reduction in hardware complexity of 12.25% in multiplications per input sample (MPIS) and 23.4% in additions per input sample (APIS) with respect to the FIR filter.

Keywords: Multirate filtering, Narrow-band Signaling, Compensation Theory, CIC filter, Decimation, Compensation filter.

Downloads 2899
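To show why a CIC decimator counts as multiplierless, as the abstract of entry 365 notes, here is a minimal sketch of the uncompensated structure using only additions and subtractions. The compensation FIR stage that the paper studies is not included, and the parameter names `R`, `N`, and `M` follow common textbook usage rather than the paper.

```python
def cic_decimate(x, R, N=1, M=1):
    """N-stage CIC decimator: N integrators at the input rate, downsampling
    by R, then N comb stages with differential delay M at the output rate."""
    y = list(x)
    for _ in range(N):                       # integrator stages
        acc = 0
        integrated = []
        for v in y:
            acc += v
            integrated.append(acc)
        y = integrated
    y = y[::R]                               # keep every R-th sample
    for _ in range(N):                       # comb stages
        y = [v - (y[i - M] if i >= M else 0) for i, v in enumerate(y)]
    return y

# A constant input settles to the DC gain (R * M) ** N after the start-up transient.
out = cic_decimate([1] * 40, R=4, N=2)
```

The DC gain of `(R * M) ** N` also indicates how much the word length grows through the stages, which is one reason a hardware design must size its registers accordingly.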
364 Evaluation of Sensor Pattern Noise Estimators for Source Camera Identification

Authors: Benjamin Anderson-Sackaney, Amr Abdel-Dayem

Abstract:

This paper presents a comprehensive survey of recent source camera identification (SCI) systems. Then, the performance of various sensor pattern noise (SPN) estimators was experimentally assessed, under common photo response non-uniformity (PRNU) frameworks. The experiments used 1350 natural and 900 flat-field images, captured by 18 individual cameras. 12 different experiments, grouped into three sets, were conducted. The results were analyzed using the receiver operator characteristic (ROC) curves. The experimental results demonstrated that combining the basic SPN estimator with a wavelet-based filtering scheme provides promising results. However, the phase SPN estimator fits better with both patch-based (BM3D) and anisotropic diffusion (AD) filtering schemes.

Keywords: Sensor pattern noise, source camera identification, photo response non-uniformity, anisotropic diffusion, peak to correlation energy ratio.

Downloads 1090
363 Multimodal Database of Emotional Speech, Video and Gestures

Authors: Tomasz Sapiński, Dorota Kamińska, Adam Pelikant, Egils Avots, Cagri Ozcinar, Gholamreza Anbarjafari

Abstract:

People express emotions through different modalities. Integration of verbal and non-verbal communication channels creates a system in which the message is easier to understand. Expanding the focus to several expression forms can facilitate research on emotion recognition as well as human-machine interaction. In this article, the authors present a Polish emotional database composed of three modalities: facial expressions, body movement and gestures, and speech. The corpus contains recordings registered in studio conditions, acted out by 16 professional actors (8 male and 8 female). The data is labeled with six basic emotion categories, according to Ekman’s emotion categories. To check the quality of performance, all recordings are evaluated by experts and volunteers. The database is available to the academic community and might be useful in the study of audio-visual emotion recognition.

Keywords: Body movement, emotion recognition, emotional corpus, facial expressions, gestures, multimodal database, speech.

Downloads 1073
362 Online Prediction of Nonlinear Signal Processing Problems Based Kernel Adaptive Filtering

Authors: Hamza Nejib, Okba Taouali

Abstract:

This paper presents two of the best-known kernel adaptive filtering (KAF) approaches, kernel least mean squares and kernel recursive least squares, in order to predict a new output in nonlinear signal processing. Both methods implement a nonlinear transfer function using kernel methods in a particular space called a reproducing kernel Hilbert space (RKHS), where the model is a linear combination of kernel functions applied to transform the observed data from the input space to a high-dimensional feature space of vectors; this idea is known as the kernel trick. KAF thus amounts to developing filters in RKHS. We use two nonlinear signal processing problems, Mackey-Glass chaotic time series prediction and nonlinear channel equalization, to assess the performance of the approaches presented and finally to determine which of them is the better suited.

Keywords: KLMS, online prediction, KAF, signal processing, RKHS, Kernel methods, KRLS.

Downloads 1008
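The kernel least mean squares update mentioned in the abstract of entry 362 can be sketched in a few lines. This is the basic textbook form with a Gaussian kernel and no sparsification; the step size, kernel width, and class name are assumptions for illustration, and KRLS is not shown.

```python
import math

class KLMS:
    """Kernel least mean squares: the model is a weighted sum of kernels
    centred on past inputs, and each new sample adds one centre."""

    def __init__(self, step=0.5, sigma=1.0):
        self.step, self.sigma = step, sigma
        self.centers, self.weights = [], []

    def kernel(self, a, b):
        d2 = sum((p - q) ** 2 for p, q in zip(a, b))
        return math.exp(-d2 / (2 * self.sigma ** 2))

    def predict(self, x):
        return sum(w * self.kernel(c, x) for c, w in zip(self.centers, self.weights))

    def update(self, x, d):
        """Predict, then store x as a new centre weighted by step * error."""
        err = d - self.predict(x)
        self.centers.append(tuple(x))
        self.weights.append(self.step * err)
        return err

f = KLMS(step=0.5)
first_error = f.update((0.0,), 2.0)     # model is empty, so the error equals d
second_error = f.update((0.0,), 2.0)    # error shrinks after one update
```

The model grows by one centre per sample, so practical implementations usually add a sparsification rule to bound memory; that step is omitted here.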
361 Skyline Extraction using a Multistage Edge Filtering

Authors: Byung-Ju Kim, Jong-Jin Shin, Hwa-Jin Nam, Jin-Soo Kim

Abstract:

Skyline extraction in mountainous images can be used for navigation of vehicles or UAVs (unmanned air vehicles), but it is very hard to extract the skyline shape because of clutter such as clouds, sea lines and field borders in images. We developed an edge-based skyline extraction algorithm using a proposed multistage edge filtering (MEF) technique. In this method, characteristics of clutter in the image are first defined and then the lines classified as clutter are eliminated in stages using the proposed MEF technique. After this processing, we select the final line from the remaining lines using skyline measures. The proposed algorithm is robust in severe environments with clutter and performs well even for low-resolution infrared sensor images. We tested the proposed algorithm on images obtained in the field by an infrared camera and confirmed that it produced better performance and faster processing time than conventional algorithms.

Keywords: MEF, mountainous image, navigation, skyline

Downloads 1830
360 Impulse Noise Reduction in Brain Magnetic Resonance Imaging Using Fuzzy Filters

Authors: Benjamin Y. M. Kwan, Hon Keung Kwan

Abstract:

Noise contamination in a magnetic resonance (MR) image could occur during acquisition, storage, and transmission in which effective filtering is required to avoid repeating the MR procedure. In this paper, an iterative asymmetrical triangle fuzzy filter with moving average center (ATMAVi filter) is used to reduce different levels of salt and pepper noise in a brain MR image. Besides visual inspection on filtered images, the mean squared error (MSE) is used as an objective measurement. When compared with the median filter, simulation results indicate that the ATMAVi filter is effective especially for filtering a higher level noise (such as noise density = 0.45) using a smaller window size (such as 3x3) when operated iteratively or using a larger window size (such as 5x5) when operated non-iteratively.

Keywords: Brain images, Fuzzy filters, Magnetic resonance imaging, Salt and pepper noise reduction.

Downloads 2166
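The comparison baseline and the objective measure named in the abstract of entry 360 are both easy to sketch. The code below shows a 3x3 median filter and the mean squared error only; it does not implement the ATMAVi fuzzy filter, whose membership function is not given in the abstract, and the toy image is invented.

```python
def mse(a, b):
    """Mean squared error between two equally sized grayscale images."""
    n = len(a) * len(a[0])
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)) / n

def median_filter3(img):
    """3x3 median filter; border pixels are copied unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            window = sorted(img[i + di][j + dj]
                            for di in (-1, 0, 1) for dj in (-1, 0, 1))
            out[i][j] = window[4]            # middle of the 9 sorted values
    return out

# Flat 5x5 test image with one "pepper" (0) and one "salt" (255) pixel.
clean = [[100] * 5 for _ in range(5)]
noisy = [row[:] for row in clean]
noisy[1][1], noisy[2][2] = 0, 255
filtered = median_filter3(noisy)
```

On this toy image the median removes both impulses, so the MSE against the clean image drops to zero; at high noise densities such as the 0.45 quoted above, a 3x3 window often contains more impulses than clean pixels, which is the regime the paper targets.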
359 On the AC-Side Interface Filter in Three-Phase Shunt Active Power Filter Systems

Authors: Mihaela Popescu, Alexandru Bitoleanu, Mircea Dobriceanu

Abstract:

The proper selection of the AC-side passive filter interconnecting the voltage source converter to the power supply is essential to obtain satisfactory performances of an active power filter system. The use of the LCL-type filter has the advantage of eliminating the high frequency switching harmonics in the current injected into the power supply. This paper is mainly focused on analyzing the influence of the interface filter parameters on the active filtering performances. Some design aspects are pointed out. Thus, the design of the AC interface filter starts from transfer functions by imposing the filter performance which refers to the significant current attenuation of the switching harmonics without affecting the harmonics to be compensated. A Matlab/Simulink model of the entire active filtering system including a concrete nonlinear load has been developed to examine the system performances. It is shown that a gamma LC filter could accomplish the attenuation requirement of the current provided by converter. Moreover, the existence of an optimal value of the grid-side inductance which minimizes the total harmonic distortion factor of the power supply current is pointed out. Nevertheless, a small converter-side inductance and a damping resistance in series with the filter capacitance are absolutely needed in order to keep the ripple and oscillations of the current at the converter side within acceptable limits. The effect of change in the LCL-filter parameters is evaluated. It is concluded that good active filtering performances can be achieved with small values of the capacitance and converter-side inductance.

Keywords: Active power filter, LCL filter, Matlab/Simulink modeling, Passive filters, Transfer function.

Downloads 2966
358 SySRA: A System of a Continuous Speech Recognition in Arab Language

Authors: Samir Abdelhamid, Noureddine Bouguechal

Abstract:

We report in this paper the model adopted by SySRA, our system for continuous speech recognition in the Arabic language, and the results obtained so far. This system uses the database Arabdic-10, which is a manually segmented corpus of words for the Arabic language. Phonetic decoding is represented by an expert system in which the knowledge base is translated into production rules. This expert system transforms a vocal signal into a phonetic lattice. The higher level of the system takes care of the recognition of the lattice thus obtained by rendering it in the form of written sentences (orthographical form). This level initially contains the lexical analyzer, which is none other than the recognition module. We subjected this analyzer to a set of spectrograms obtained by dictating a score of sentences in the Arabic language. The recognition rate for these sentences is about 70%, which is, to our knowledge, the best result for the recognition of the Arabic language. The test set consists of twenty sentences from four speakers who did not take part in the training.

Keywords: Continuous speech recognition, lexical analyzer, phonetic decoding, phonetic lattice, vocal signal.

Downloads 1346
357 Collaborative and Content-based Recommender System for Social Bookmarking Website

Authors: Cheng-Lung Huang, Cheng-Wei Lin

Abstract:

This study proposes a new recommender system based on the collaborative folksonomy. The purpose of the proposed system is to recommend Internet resources (such as books, articles, documents, pictures, audio and video) to users. The proposed method includes four steps: creating the user profile based on the tags, grouping similar users into clusters using agglomerative hierarchical clustering, finding similar resources based on the user's past collections by using content-based filtering, and recommending similar items to the target user. This study examines the system's performance on a dataset collected from "del.icio.us," which is a famous social bookmarking website. Experimental results show that the proposed tag-based hybrid of collaborative and content-based filtering is promising and effective for folksonomy-based bookmarking websites.

Keywords: Collaborative recommendation, Folksonomy, Social tagging

Downloads 2199
356 Efficient System for Speech Recognition using General Regression Neural Network

Authors: Abderrahmane Amrouche, Jean Michel Rouvaen

Abstract:

In this paper we present an efficient system for speaker-independent speech recognition based on a neural network approach. The proposed architecture comprises two phases: a preprocessing phase, which consists of segmental normalization and feature extraction, and a classification phase, which uses neural networks based on nonparametric density estimation, namely the general regression neural network (GRNN). The relative performance of the proposed model is compared to that of similar recognition systems based on the Multilayer Perceptron (MLP), the Recurrent Neural Network (RNN) and the well-known Discrete Hidden Markov Model (HMM-VQ), which we have also implemented. Experimental results obtained with Arabic digits have shown that the use of nonparametric density estimation with an appropriate smoothing factor (spread) improves the generalization power of the neural network. The word error rate (WER) is reduced significantly over the baseline HMM method. GRNN computation is a successful alternative to the other neural networks and the DHMM.

Keywords: Speech Recognition, General Regression Neural Network, Hidden Markov Model, Recurrent Neural Network, Arabic Digits.

Downloads 2138
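The role of the smoothing factor (spread) mentioned in the abstract of entry 356 is easiest to see in the GRNN prediction rule itself, which is a kernel-weighted average of the training targets. The sketch below uses scalar targets and made-up training points; for word classification one would typically use one-hot target vectors, which is an assumption and not a detail from the abstract.

```python
import math

def grnn_predict(x, train_x, train_y, spread):
    """GRNN output: Gaussian-weighted average of the training targets,
    with `spread` controlling how far each training point's influence reaches."""
    weights = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2 * spread ** 2))
               for xi in train_x]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, train_y)) / total

train_x = [(0.0,), (2.0,)]
train_y = [0.0, 2.0]
midpoint = grnn_predict((1.0,), train_x, train_y, spread=1.0)   # equal weights
near_first = grnn_predict((0.1,), train_x, train_y, spread=0.1) # first point dominates
```

There is no iterative training: the training set is stored as is, and the spread is the only parameter to tune, which is why its choice governs generalization.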
355 A Background Subtraction Based Moving Object Detection around the Host Vehicle

Authors: Hyojin Lim, Cuong Nguyen Khac, Ho-Youl Jung

Abstract:

In this paper, we propose a moving object detection method that helps the driver safely take his/her car out of a parking lot. When moving objects such as motorbikes, pedestrians, other cars and some obstacles are detected at the rear side of the host vehicle, the proposed algorithm can warn the driver. We assume that the host vehicle is just about to depart. Gaussian Mixture Model (GMM) based background subtraction is applied as the basis. Pre-processing such as smoothing and post-processing such as morphological filtering are added. We examine which color space performs better for the detection of moving objects. Three color spaces, RGB, YCbCr, and Y, are applied and compared in terms of detection rate. Through simulation, we show that the RGB space is more suitable for moving object detection based on background subtraction.

Keywords: Gaussian mixture model, background subtraction, Moving object detection, color space, morphological filtering.

Downloads 2515
354 A Supervised Text-Independent Speaker Recognition Approach

Authors: Tudor Barbu

Abstract:

We provide a supervised speech-independent voice recognition technique in this paper. In the feature extraction stage we propose a mel-cepstral based approach. Our feature vector classification method uses a special nonlinear metric, derived from the Hausdorff distance for sets, and a minimum mean distance classifier.

Keywords: Text-independent speaker recognition, mel cepstral analysis, speech feature vector, Hausdorff-based metric, supervised classification.

Downloads 1788
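The classification stage in the abstract of entry 354 builds on the Hausdorff distance between sets of feature vectors. The sketch below uses the classical Hausdorff distance, not the special nonlinear metric the author derives from it, and the speaker labels and two-dimensional vectors are invented stand-ins for mel-cepstral features.

```python
import math

def directed(a_set, b_set):
    """Largest distance from a point of a_set to its nearest point in b_set."""
    return max(min(math.dist(a, b) for b in b_set) for a in a_set)

def hausdorff(a_set, b_set):
    """Classical (symmetric) Hausdorff distance between two point sets."""
    return max(directed(a_set, b_set), directed(b_set, a_set))

def classify(features, training):
    """Minimum mean distance rule: pick the speaker whose training sets are,
    on average, closest to the input feature set."""
    def mean_distance(label):
        sets = training[label]
        return sum(hausdorff(features, s) for s in sets) / len(sets)
    return min(training, key=mean_distance)

# Hypothetical training data: one feature-vector set per speaker.
training = {
    "speaker1": [[(0.0, 0.0), (1.0, 0.0)]],
    "speaker2": [[(5.0, 5.0), (6.0, 5.0)]],
}
label = classify([(0.1, 0.0), (0.9, 0.0)], training)
```

A set-to-set distance suits text-independent recognition because utterances of different length yield feature sets of different size, and no frame-by-frame alignment is needed.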
353 Combined Automatic Speech Recognition and Machine Translation in Business Correspondence Domain for English-Croatian

Authors: Sanja Seljan, Ivan Dunđer

Abstract:

The paper presents combined automatic speech recognition (ASR) of English and machine translation (MT) for the English-Croatian and Croatian-English language pairs in the domain of business correspondence. The first part presents results of training a commercial ASR system on English data sets, enriched by error analysis. The second part presents results of machine translation performed by a free online tool for the English-Croatian and Croatian-English language pairs. Human evaluation in terms of usability is conducted and internal consistency is calculated by Cronbach's alpha coefficient, enriched by error analysis. Automatic evaluation is performed by WER (Word Error Rate) and PER (Position-independent word Error Rate) metrics, followed by investigation of Pearson’s correlation with human evaluation.

Keywords: Automatic machine translation, integrated language technologies, quality evaluation, speech recognition.

Downloads 2863
352 Filtering and Reconstruction System for Gray Forensic Images

Authors: Ahd Aljarf, Saad Amin

Abstract:

Images are an important source of information used as evidence during any investigation process. Their clarity and accuracy are of the utmost importance for any investigation. Images are vulnerable to losing blocks and to having noise added to them, either after alteration or when the image was initially taken; therefore, a high-performance image processing system and its implementation are very important from a forensic point of view. This paper focuses on improving the quality of forensic images. For various reasons, packets that store data can be affected, harmed, or even lost because of noise. For example, sending an image through a wireless channel can cause loss of bits. These types of errors can degrade the visual display quality of forensic images. Two image problems are covered: noise and lost blocks. Information transmitted through any communication channel may be altered from its original state or even lose important data due to channel noise. Therefore, a system is developed to improve the quality and clarity of forensic images.

Keywords: Image Filtering, Image Reconstruction, Image Processing, Forensic Images.

Downloads 2166
351 Automatically Driven Vector for Guidewire Segmentation in 2D and Biplane Fluoroscopy

Authors: Simon Lessard, Pascal Bigras, Caroline Lau, Daniel Roy, Gilles Soulez, Jacques A. de Guise

Abstract:

The segmentation of endovascular tools in fluoroscopy images can be performed accurately, either automatically or with minimal user intervention, using known modern techniques. This has been demonstrated in the literature, but no clinical implementation exists so far because the computational time requirements of such technology have not yet been met. A classical segmentation scheme is composed of edge enhancement filtering, line detection, and segmentation. A new method is presented that consists of a vector that propagates in the image to track an edge as it advances. The filtering is performed progressively in the projected path of the vector, whose orientation allows for oriented edge detection, and a minimal image area is globally filtered. Such an algorithm is rapidly computed and can be implemented in real-time applications. It was tested on medical fluoroscopy images from an endovascular cerebral intervention. Experiments showed that the 2D tracking was limited to guidewires without intersection crosspoints, while the 3D implementation was able to cope with such planar difficulties.

Keywords: Edge detection, Line Enhancement, Segmentation, Fluoroscopy.

Downloads 1693
350 Using Speech Emotion Recognition as a Longitudinal Biomarker for Alzheimer’s Disease

Authors: Yishu Gong, Liangliang Yang, Jianyu Zhang, Zhengyu Chen, Sihong He, Xusheng Zhang, Wei Zhang

Abstract:

Alzheimer’s disease (AD) is a progressive neurodegenerative disorder that affects millions of people worldwide and is characterized by cognitive decline and behavioral changes. People living with Alzheimer’s disease often find it hard to complete routine tasks. However, there are limited objective assessments that aim to quantify the difficulty of certain tasks for AD patients compared to non-AD people. In this study, we propose to use speech emotion recognition (SER), especially the frustration level, as a potential biomarker for quantifying the difficulty patients experience when describing a picture. We build an SER model using data from the IEMOCAP dataset and apply the model to the DementiaBank data to detect the AD/non-AD group difference and to perform longitudinal analysis that tracks AD progression. Our results show that the frustration level detected by the SER model can possibly be used as a cost-effective tool for objective tracking of AD progression in addition to the Mini-Mental State Examination (MMSE) score.

Keywords: Alzheimer’s disease, Speech Emotion Recognition, longitudinal biomarker, machine learning.

Downloads 171
349 Spectral Analysis of Speech: A New Technique

Authors: Neeta Awasthy, J.P.Saini, D.S.Chauhan

Abstract:

ICA, which is generally used for the blind source separation problem, has been tested for feature extraction in a speech recognition system to replace the phoneme based approach of MFCC. Applying the generated cepstral coefficients to ICA as a preprocessing step yields a new signal processing approach. This gives much better results than MFCC or ICA alone, both for word and speaker recognition. The mixing matrix A is different before and after MFCC, as expected, since Mel is a nonlinear scale. However, cepstral coefficients generated from linear predictive coefficients, being independent, prove to be the right candidate for ICA. MATLAB is the tool used for all comparisons. The database used consists of samples from ISOLET.
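
As an illustration of the cepstral coefficients generated from linear predictive coefficients mentioned above, the standard LPC-to-cepstrum recursion can be sketched as below. This is not the authors' MATLAB code and does not include the ICA step; it assumes the all-pole model H(z) = 1 / (1 - sum_k a_k z^-k), omits the gain term c_0, and uses a hypothetical function name. Other sign conventions for the a_k change the signs in the recursion.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c_1..c_n from LPC coefficients a_1..a_p.

    Assumed model: H(z) = 1 / (1 - sum_k a_k z^-k).
    Recursion: c_n = a_n + sum_{k=1}^{n-1} (k / n) * c_k * a_{n-k},
    with a_n treated as 0 for n > p.
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        value = a[n - 1] if n <= p else 0.0
        for k in range(1, n):
            if 1 <= n - k <= p:  # a_{n-k} is zero outside 1..p
                value += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(value)
    return c
```

For a single-pole model with coefficient a, the recursion reproduces the closed form c_n = a^n / n, which is a convenient sanity check.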

Keywords: Cepstral Coefficient, Distance measures, Independent Component Analysis, Linear Predictive Coefficients.

Downloads 1914