Search results for: Parts of Speech
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1034

Search results for: Parts of Speech

944 Preliminary Study of the Phonological Development in Three- and Four-Year-Old Bulgarian Children

Authors: Tsvetomira Braynova, Miglena Simonska

Abstract:

The article presents the results of a research of phonological processes in three- and four-year-old children. A test, created for the purpose of the study, was developed and conducted among 120 children. The study included three areas of research - at the level of words (96 words), at the level of sentence repetition (10 sentences) and at the level of generating own speech from a picture (15 pictures). The test also gives us additional information about the articulation errors of the assessed children. The main purpose of the research is to analyze all phonological processes that occur at this age in Bulgarian children and to identify which are typical and atypical for this age. The results show that the most common phonology errors that children make are: sound substitution, elision of sound, metathesis of sound, elision of syllable, elision of consonants clustered in a syllable. Measuring the correlation between average length of repeated speech and average length of generated speech, the analysis does not prove that the more words a child can repeat in part “repeated speech”, the more words they can be expected to generate in part “generating sentence”. The results of this study show that the task of naming a word provides sufficient and representative information to assess the child's phonology.

Keywords: Articulation, phonology, speech, language development.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 322
943 Identifying Missing Component in the Bechdel Test Using Principal Component Analysis Method

Authors: Raghav Lakhotia, Chandra Kanth Nagesh, Krishna Madgula

Abstract:

A lot has been said and discussed regarding the rationale and significance of the Bechdel Score. It became a digital sensation in 2013, when Swedish cinemas began to showcase the Bechdel test score of a film alongside its rating. The test has drawn criticism from experts and the film fraternity regarding its use to rate the female presence in a movie. The pundits believe that the score is too simplified and the underlying criteria of a film to pass the test must include 1) at least two women, 2) who have at least one dialogue, 3) about something other than a man, is egregious. In this research, we have considered a few more parameters which highlight how we represent females in film, like the number of female dialogues in a movie, dialogue genre, and part of speech tags in the dialogue. The parameters were missing in the existing criteria to calculate the Bechdel score. The research aims to analyze 342 movies scripts to test a hypothesis if these extra parameters, above with the current Bechdel criteria, are significant in calculating the female representation score. The result of the Principal Component Analysis method concludes that the female dialogue content is a key component and should be considered while measuring the representation of women in a work of fiction.

Keywords: Bechdel test, dialogue genre, parts of speech tags, principal component analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 763
942 Continuous Feature Adaptation for Non-Native Speech Recognition

Authors: Y. Deng, X. Li, C. Kwan, B. Raj, R. Stern

Abstract:

The current speech interfaces in many military applications may be adequate for native speakers. However, the recognition rate drops quite a lot for non-native speakers (people with foreign accents). This is mainly because the nonnative speakers have large temporal and intra-phoneme variations when they pronounce the same words. This problem is also complicated by the presence of large environmental noise such as tank noise, helicopter noise, etc. In this paper, we proposed a novel continuous acoustic feature adaptation algorithm for on-line accent and environmental adaptation. Implemented by incremental singular value decomposition (SVD), the algorithm captures local acoustic variation and runs in real-time. This feature-based adaptation method is then integrated with conventional model-based maximum likelihood linear regression (MLLR) algorithm. Extensive experiments have been performed on the NATO non-native speech corpus with baseline acoustic model trained on native American English. The proposed feature-based adaptation algorithm improved the average recognition accuracy by 15%, while the MLLR model based adaptation achieved 11% improvement. The corresponding word error rate (WER) reduction was 25.8% and 2.73%, as compared to that without adaptation. The combined adaptation achieved overall recognition accuracy improvement of 29.5%, and WER reduction of 31.8%, as compared to that without adaptation.

Keywords: speaker adaptation; environment adaptation; robust speech recognition; SVD; non-native speech recognition

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3189
941 Automatic Detection of Syllable Repetition in Read Speech for Objective Assessment of Stuttered Disfluencies

Authors: K. M. Ravikumar, Balakrishna Reddy, R. Rajagopal, H. C. Nagaraj

Abstract:

Automatic detection of syllable repetition is one of the important parameter in assessing the stuttered speech objectively. The existing method which uses artificial neural network (ANN) requires high levels of agreement as prerequisite before attempting to train and test ANNs to separate fluent and nonfluent. We propose automatic detection method for syllable repetition in read speech for objective assessment of stuttered disfluencies which uses a novel approach and has four stages comprising of segmentation, feature extraction, score matching and decision logic. Feature extraction is implemented using well know Mel frequency Cepstra coefficient (MFCC). Score matching is done using Dynamic Time Warping (DTW) between the syllables. The Decision logic is implemented by Perceptron based on the score given by score matching. Although many methods are available for segmentation, in this paper it is done manually. Here the assessment by human judges on the read speech of 10 adults who stutter are described using corresponding method and the result was 83%.

Keywords: Assessment, DTW, MFCC, Objective, Perceptron, Stuttering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2764
940 Developing an Online Library for Faster Retrieval of Mold Base and Standard Parts of Injection Molding

Authors: Alan C. Lin, Ricky N. Joevan

Abstract:

This paper focuses on developing a system to transfer mold base plates and standard parts faster during the stage of injection mold design. This system not only provides a way to compare the file version, but also it utilizes Siemens NX 10 to isolate the updated information into a single executable file (.dll), and then, the file can be transferred without the need of transferring the whole file. By this way, the system can help the user to download only necessary mold base plates and standard parts, and those parts downloaded are only the updated portions.

Keywords: CAD, injection molding, mold base, data retrieval.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1138
939 Spectral Entropy Employment in Speech Enhancement based on Wavelet Packet

Authors: Talbi Mourad, Salhi Lotfi, Chérif Adnen

Abstract:

In this work, we are interested in developing a speech denoising tool by using a discrete wavelet packet transform (DWPT). This speech denoising tool will be employed for applications of recognition, coding and synthesis. For noise reduction, instead of applying the classical thresholding technique, some wavelet packet nodes are set to zero and the others are thresholded. To estimate the non stationary noise level, we employ the spectral entropy. A comparison of our proposed technique to classical denoising methods based on thresholding and spectral subtraction is made in order to evaluate our approach. The experimental implementation uses speech signals corrupted by two sorts of noise, white and Volvo noises. The obtained results from listening tests show that our proposed technique is better than spectral subtraction. The obtained results from SNR computation show the superiority of our technique when compared to the classical thresholding method using the modified hard thresholding function based on u-law algorithm.

Keywords: Enhancement, spectral subtraction, SNR, discrete wavelet packet transform, spectral entropy Histogram

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1949
938 Bangla Vowel Characterization Based on Analysis by Synthesis

Authors: Syed Akhter Hossain, M. Lutfar Rahman, Farruk Ahmed

Abstract:

Bangla Vowel characterization determines the spectral properties of Bangla vowels for efficient synthesis as well as recognition of Bangla vowels. In this paper, Bangla vowels in isolated word have been analyzed based on speech production model within the framework of Analysis-by-Synthesis. This has led to the extraction of spectral parameters for the production model in order to produce different Bangla vowel sounds. The real and synthetic spectra are compared and a weighted square error has been computed along with the error in the formant bandwidths for efficient representation of Bangla vowels. The extracted features produced good representation of targeted Bangla vowel. Such a representation also plays essential role in low bit rate speech coding and vocoders.

Keywords: Speech, vowel, formant, synthesis, spectrum, LPC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2338
937 Forensic Speaker Verification in Noisy Environmental by Enhancing the Speech Signal Using ICA Approach

Authors: Ahmed Kamil Hasan Al-Ali, Bouchra Senadji, Ganesh Naik

Abstract:

We propose a system to real environmental noise and channel mismatch for forensic speaker verification systems. This method is based on suppressing various types of real environmental noise by using independent component analysis (ICA) algorithm. The enhanced speech signal is applied to mel frequency cepstral coefficients (MFCC) or MFCC feature warping to extract the essential characteristics of the speech signal. Channel effects are reduced using an intermediate vector (i-vector) and probabilistic linear discriminant analysis (PLDA) approach for classification. The proposed algorithm is evaluated by using an Australian forensic voice comparison database, combined with car, street and home noises from QUT-NOISE at a signal to noise ratio (SNR) ranging from -10 dB to 10 dB. Experimental results indicate that the MFCC feature warping-ICA achieves a reduction in equal error rate about (48.22%, 44.66%, and 50.07%) over using MFCC feature warping when the test speech signals are corrupted with random sessions of street, car, and home noises at -10 dB SNR.

Keywords: Noisy forensic speaker verification, ICA algorithm, MFCC, MFCC feature warping.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 948
936 A Smart-Visio Microphone for Audio-Visual Speech Recognition “Vmike“

Authors: Y. Ni, K. Sebri

Abstract:

The practical implementation of audio-video coupled speech recognition systems is mainly limited by the hardware complexity to integrate two radically different information capturing devices with good temporal synchronisation. In this paper, we propose a solution based on a smart CMOS image sensor in order to simplify the hardware integration difficulties. By using on-chip image processing, this smart sensor can calculate in real time the X/Y projections of the captured image. This on-chip projection reduces considerably the volume of the output data. This data-volume reduction permits a transmission of the condensed visual information via the same audio channel by using a stereophonic input available on most of the standard computation devices such as PC, PDA and mobile phones. A prototype called VMIKE (Visio-Microphone) has been designed and realised by using standard 0.35um CMOS technology. A preliminary experiment gives encouraged results. Its efficiency will be further investigated in a large variety of applications such as biometrics, speech recognition in noisy environments, and vocal control for military or disabled persons, etc.

Keywords: Audio-Visual Speech recognition, CMOS Smartsensor, On-Chip image processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1795
935 Computationally Efficient Signal Quality Improvement Method for VoIP System

Authors: H. P. Singh, S. Singh

Abstract:

The voice signal in Voice over Internet protocol (VoIP) system is processed through the best effort policy based IP network, which leads to the network degradations including delay, packet loss jitter. The work in this paper presents the implementation of finite impulse response (FIR) filter for voice quality improvement in the VoIP system through distributed arithmetic (DA) algorithm. The VoIP simulations are conducted with AMR-NB 6.70 kbps and G.729a speech coders at different packet loss rates and the performance of the enhanced VoIP signal is evaluated using the perceptual evaluation of speech quality (PESQ) measurement for narrowband signal. The results show reduction in the computational complexity in the system and significant improvement in the quality of the VoIP voice signal.

Keywords: VoIP, Signal Quality, Distributed Arithmetic, Packet Loss, Speech Coder.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1798
934 Extracting Tongue Shape Dynamics from Magnetic Resonance Image Sequences

Authors: María S. Avila-García, John N. Carter, Robert I. Damper

Abstract:

An important problem in speech research is the automatic extraction of information about the shape and dimensions of the vocal tract during real-time speech production. We have previously developed Southampton dynamic magnetic resonance imaging (SDMRI) as an approach to the solution of this problem.However, the SDMRI images are very noisy so that shape extraction is a major challenge. In this paper, we address the problem of tongue shape extraction, which poses difficulties because this is a highly deforming non-parametric shape. We show that combining active shape models with the dynamic Hough transform allows the tongue shape to be reliably tracked in the image sequence.

Keywords: Vocal tract imaging, speech production, active shapemodels, dynamic Hough transform, object tracking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1706
933 Influence of Loudness Compression on Hearing with Bone Anchored Hearing Implants

Authors: Anja Kurz, Marc Flynn, Tobias Good, Marco Caversaccio, Martin Kompis

Abstract:

Bone Anchored Hearing Implants (BAHI) are  routinely used in patients with conductive or mixed hearing loss, e.g.  if conventional air conduction hearing aids cannot be used. New  sound processors and new fitting software now allow the adjustment  of parameters such as loudness compression ratios or maximum  power output separately. Today it is unclear, how the choice of these  parameters influences aided speech understanding in BAHI users.  In this prospective experimental study, the effect of varying the  compression ratio and lowering the maximum power output in a  BAHI were investigated.  Twelve experienced adult subjects with a mixed hearing loss  participated in this study. Four different compression ratios (1.0; 1.3;  1.6; 2.0) were tested along with two different maximum power output  settings, resulting in a total of eight different programs. Each  participant tested each program during two weeks. A blinded Latin  square design was used to minimize bias.  For each of the eight programs, speech understanding in quiet and  in noise was assessed. For speech in quiet, the Freiburg number test  and the Freiburg monosyllabic word test at 50, 65, and 80 dB SPL  were used. For speech in noise, the Oldenburg sentence test was  administered.  Speech understanding in quiet and in noise was improved  significantly in the aided condition in any program, when compared  to the unaided condition. However, no significant differences were  found between any of the eight programs. In contrast, on a subjective  level there was a significant preference for medium compression  ratios of 1.3 to 1.6 and higher maximum power output.

 

Keywords: Bone Anchored Hearing Implant, Compression, Maximum Power Output, Speech understanding.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2025
932 Comparison of Fricative Vocal Tract Transfer Functions Derived using Two Different Segmentation Techniques

Authors: K. S. Subari, C. H. Shadle, A. Barney, R. I. Damper

Abstract:

The acoustic and articulatory properties of fricative speech sounds are being studied using magnetic resonance imaging (MRI) and acoustic recordings from a single subject. Area functions were derived from a complete set of axial and coronal MR slices using two different methods: the Mermelstein technique and the Blum transform. Area functions derived from the two techniques were shown to differ significantly in some cases. Such differences will lead to different acoustic predictions and it is important to know which is the more accurate. The vocal tract acoustic transfer function (VTTF) was derived from these area functions for each fricative and compared with measured speech signals for the same fricative and same subject. The VTTFs for /f/ in two vowel contexts and the corresponding acoustic spectra are derived here; the Blum transform appears to show a better match between prediction and measurement than the Mermelstein technique.

Keywords: Area functions, fricatives, vocal tract transferfunction, MRI, speech.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1624
931 Automotive 3-Microphone Noise Canceller in a Frequently Moving Noise Source Environment

Authors: Z. Qi, T. J. Moir

Abstract:

A combined three-microphone voice activity detector (VAD) and noise-canceling system is studied to enhance speech recognition in an automobile environment. A previous experiment clearly shows the ability of the composite system to cancel a single noise source outside of a defined zone. This paper investigates the performance of the composite system when there are frequently moving noise sources (noise sources are coming from different locations but are not always presented at the same time) e.g. there is other passenger speech or speech from a radio when a desired speech is presented. To work in a frequently moving noise sources environment, whilst a three-microphone voice activity detector (VAD) detects voice from a “VAD valid zone", the 3-microphone noise canceller uses a “noise canceller valid zone" defined in freespace around the users head. Therefore, a desired voice should be in the intersection of the noise canceller valid zone and VAD valid zone. Thus all noise is suppressed outside this intersection of area. Experiments are shown for a real environment e.g. all results were recorded in a car by omni-directional electret condenser microphones.

Keywords: Signal processing, voice activity detection, noise canceller, microphone array beam forming.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1579
930 Scheduling Maintenance Actions for Gas Turbines Aircraft Engines

Authors: Anis Gharbi

Abstract:

This paper considers the problem of scheduling maintenance actions for identical aircraft gas turbine engines. Each one of the turbines consists of parts which frequently require replacement. A finite inventory of spare parts is available and all parts are ready for replacement at any time. The inventory consists of both new and refurbished parts. Hence, these parts have different field lives. The goal is to find a replacement part sequencing that maximizes the time that the aircraft will keep functioning before the inventory is replenished. The problem is formulated as an identical parallel machine scheduling problem where the minimum completion time has to be maximized. Two models have been developed. The first one is an optimization model which is based on a 0-1 linear programming formulation, while the second one is an approximate procedure which consists in decomposing the problem into several two-machine subproblems. Each subproblem is optimally solved using the first model. Both models have been implemented using Lingo and have been tested on two sets of randomly generated data with up to 150 parts and 10 turbines. Experimental results show that the optimization model is able to solve only instances with no more than 4 turbines, while the decomposition procedure often provides near-optimal solutions within a maximum CPU time of 3 seconds.

Keywords: Aircraft turbines, Scheduling, Identical parallel machines, 0-1 linear programming, Heuristic.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1972
929 Robust Features for Impulsive Noisy Speech Recognition Using Relative Spectral Analysis

Authors: Hajer Rahali, Zied Hajaiej, Noureddine Ellouze

Abstract:

The goal of speech parameterization is to extract the relevant information about what is being spoken from the audio signal. In speech recognition systems Mel-Frequency Cepstral Coefficients (MFCC) and Relative Spectral Mel-Frequency Cepstral Coefficients (RASTA-MFCC) are the two main techniques used. It will be shown in this paper that it presents some modifications to the original MFCC method. In our work the effectiveness of proposed changes to MFCC called Modified Function Cepstral Coefficients (MODFCC) were tested and compared against the original MFCC and RASTA-MFCC features. The prosodic features such as jitter and shimmer are added to baseline spectral features. The above-mentioned techniques were tested with impulsive signals under various noisy conditions within AURORA databases.

Keywords: Auditory filter, impulsive noise, MFCC, prosodic features, RASTA filter.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2293
928 A Simple Adaptive Atomic Decomposition Voice Activity Detector Implemented by Matching Pursuit

Authors: Thomas Bryan, Veton Kepuska, Ivica Kostanic

Abstract:

A simple adaptive voice activity detector (VAD) is implemented using Gabor and gammatone atomic decomposition of speech for high Gaussian noise environments. Matching pursuit is used for atomic decomposition, and is shown to achieve optimal speech detection capability at high data compression rates for low signal to noise ratios. The most active dictionary elements found by matching pursuit are used for the signal reconstruction so that the algorithm adapts to the individual speakers dominant time-frequency characteristics. Speech has a high peak to average ratio enabling matching pursuit greedy heuristic of highest inner products to isolate high energy speech components in high noise environments. Gabor and gammatone atoms are both investigated with identical logarithmically spaced center frequencies, and similar bandwidths. The algorithm performs equally well for both Gabor and gammatone atoms with no significant statistical differences. The algorithm achieves 70% accuracy at a 0 dB SNR, 90% accuracy at a 5 dB SNR and 98% accuracy at a 20dB SNR using 30d B SNR as a reference for voice activity.

Keywords: Atomic Decomposition, Gabor, Gammatone, Matching Pursuit, Voice Activity Detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1763
927 Online Collaborative Learning System Using Speech Technology

Authors: Sid-Ahmed. Selouani, Tang-Ho Lê, Chadia Moghrabi, Benoit Lanteigne, Jean Roy

Abstract:

A Web-based learning tool, the Learn IN Context (LINC) system, designed and being used in some institution-s courses in mixed-mode learning, is presented in this paper. This mode combines face-to-face and distance approaches to education. LINC can achieve both collaborative and competitive learning. In order to provide both learners and tutors with a more natural way to interact with e-learning applications, a conversational interface has been included in LINC. Hence, the components and essential features of LINC+, the voice enhanced version of LINC, are described. We report evaluation experiments of LINC/LINC+ in a real use context of a computer programming course taught at the Université de Moncton (Canada). The findings show that when the learning material is delivered in the form of a collaborative and voice-enabled presentation, the majority of learners seem to be satisfied with this new media, and confirm that it does not negatively affect their cognitive load.

Keywords: E-leaning, Knowledge Network, Speech recognition, Speech synthesis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1680
926 Hybrid Method Using Wavelets and Predictive Method for Compression of Speech Signal

Authors: Karima Siham Aoubid, Mohamed Boulemden

Abstract:

The development of the signal compression algorithms is having compressive progress. These algorithms are continuously improved by new tools and aim to reduce, an average, the number of bits necessary to the signal representation by means of minimizing the reconstruction error. The following article proposes the compression of Arabic speech signal by a hybrid method combining the wavelet transform and the linear prediction. The adopted approach rests, on one hand, on the original signal decomposition by ways of analysis filters, which is followed by the compression stage, and on the other hand, on the application of the order 5, as well as, the compression signal coefficients. The aim of this approach is the estimation of the predicted error, which will be coded and transmitted. The decoding operation is then used to reconstitute the original signal. Thus, the adequate choice of the bench of filters is useful to the transform in necessary to increase the compression rate and induce an impercevable distortion from an auditive point of view.

Keywords: Compression, linear prediction analysis, multiresolution analysis, speech signal.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1300
925 The Optimization of Engine Mounting Parts Using Hot-Cold Forging Technology

Authors: D. H. Park, Y. H. Tak, H. H. Kwon, G. J. Kwon, H. G. Kim

Abstract:

The purpose of this study is to develop a forging process of automotive parts that satisfies the deformation characteristics. The analyses of temperature variation and deformation behavior of the material are important to obtain the optimal forging products. The hot compression test was carried out to know formability at high temperature. In order to define the optimum forging conditions including material temperature, strain and forging load, the commercial finite element analysis code was used to simulate the forging procedure of engine mounting parts. Experimental results were compared with the simulation results by finite element analysis. Test results were in good agreement with the simulations.

Keywords: Cold forging, hot forging, engine mounting, automotive parts, optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1611
924 Recognition of Noisy Words Using the Time Delay Neural Networks Approach

Authors: Khenfer-Koummich Fatima, Mesbahi Larbi, Hendel Fatiha

Abstract:

This paper presents a recognition system for isolated words like robot commands. It’s carried out by Time Delay Neural Networks; TDNN. To teleoperate a robot for specific tasks as turn, close, etc… In industrial environment and taking into account the noise coming from the machine. The choice of TDNN is based on its generalization in terms of accuracy, in more it acts as a filter that allows the passage of certain desirable frequency characteristics of speech; the goal is to determine the parameters of this filter for making an adaptable system to the variability of speech signal and to noise especially, for this the back propagation technique was used in learning phase. The approach was applied on commands pronounced in two languages separately: The French and Arabic. The results for two test bases of 300 spoken words for each one are 87%, 97.6% in neutral environment and 77.67%, 92.67% when the white Gaussian noisy was added with a SNR of 35 dB.

Keywords: Neural networks, Noise, Speech Recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1909
923 Matrix-Interleaved Serially Concatenated Block Codes for Speech Transmission in Fixed Wireless Communication Systems

Authors: F. Mehran

Abstract:

In this paper, we study a class of serially concatenated block codes (SCBC) based on matrix interleavers, to be employed in fixed wireless communication systems. The performances of SCBC¬coded systems are investigated under various interleaver dimensions. Numerical results reveal that the matrix interleaver could be a competitive candidate over conventional block interleaver for frame lengths of 200 bits; hence, the SCBC coding based on matrix interleaver is a promising technique to be employed for speech transmission applications in many international standards such as pan-European Global System for Mobile communications (GSM), Digital Cellular Systems (DCS) 1800, and Joint Detection Code Division Multiple Access (JD-CDMA) mobile radio systems, where the speech frame contains around 200 bits.

Keywords: Matrix Interleaver, serial concatenated block codes (SCBC), turbo codes, wireless communications.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1906
922 Applications for Additive Manufacturing Technology for Reducing the Weight of Body Parts of Gas Turbine Engines

Authors: Liubov A. Magerramova, Mikhail A. Petrov, Vladimir V. Isakov, Liana A. Shcherbinina, Suren G. Gukasyan, Daniil V. Povalyukhin, Olga G. Klimova-Korsmik, Darya V. Volosevich

Abstract:

Aircraft engines are developing along the path of increasing resource, strength, reliability, and safety. The building of gas turbine engine body parts is a complex design and technological task. Particularly complex in the design and manufacturing are the casings of the input stages of helicopter gearboxes and central drives of aircraft engines. Traditional technologies, such as precision casting or isothermal forging, are characterized by significant limitations in parts production. For parts like housing, additive technologies guarantee spatial freedom and limitless or flexible design. This article presents the results of computational and experimental studies. These investigations justify the applicability of additive technologies (AT) to reduce the weight of aircraft housing gearbox parts by up to 32%. This is possible due to geometrical optimization compared to the classical, less flexible manufacturing methods and as-casted aircraft parts with over-insured values of safety factors. Using an example of the body of the input stage of an aircraft gearbox, visualization of the layer-by-layer manufacturing of a part based on thermal deformation was demonstrated.

Keywords: Additive technologies, gas turbine engines, geometric optimization, weight reduction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 53
921 Multimodal Database of Emotional Speech, Video and Gestures

Authors: Tomasz Sapiński, Dorota Kamińska, Adam Pelikant, Egils Avots, Cagri Ozcinar, Gholamreza Anbarjafari

Abstract:

People express emotions through different modalities. Integration of verbal and non-verbal communication channels creates a system in which the message is easier to understand. Expanding the focus to several expression forms can facilitate research on emotion recognition as well as human-machine interaction. In this article, the authors present a Polish emotional database composed of three modalities: facial expressions, body movement and gestures, and speech. The corpora contains recordings registered in studio conditions, acted out by 16 professional actors (8 male and 8 female). The data is labeled with six basic emotions categories, according to Ekman’s emotion categories. To check the quality of performance, all recordings are evaluated by experts and volunteers. The database is available to academic community and might be useful in the study on audio-visual emotion recognition.

Keywords: Body movement, emotion recognition, emotional corpus, facial expressions, gestures, multimodal database, speech.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1085
920 Extracting Multiword Expressions in Machine Translation from English to Urdu using Relational Data Approach

Authors: Kashif Bilal, Uzair Muhammad, Atif Khan, M. Nasir Khan

Abstract:

Machine Translation, (hereafter in this document referred to as the "MT") faces a lot of complex problems from its origination. Extracting multiword expressions is also one of the complex problems in MT. Finding multiword expressions during translating a sentence from English into Urdu, through existing solutions, takes a lot of time and occupies system resources. We have designed a simple relational data approach, in which we simply set a bit in dictionary (database) for multiword, to find and handle multiword expression. This approach handles multiword efficiently.

Keywords: Machine Translation, Multiword Expressions, Urdulanguage processing, POS (stands for Parts of Speech) Tagging forUrdu, Expert Systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2324
919 SySRA: A System of a Continuous Speech Recognition in Arab Language

Authors: Samir Abdelhamid, Noureddine Bouguechal

Abstract:

We report in this paper the model adopted by our system of continuous speech recognition in Arab language SySRA and the results obtained until now. This system uses the database Arabdic-10 which is a corpus of word for the Arab language and which was manually segmented. Phonetic decoding is represented by an expert system where the knowledge base is translated in the form of production rules. This expert system transforms a vocal signal into a phonetic lattice. The higher level of the system takes care of the recognition of the lattice thus obtained by deferring it in the form of written sentences (orthographical Form). This level contains initially the lexical analyzer which is not other than the module of recognition. We subjected this analyzer to a set of spectrograms obtained by dictating a score of sentences in Arab language. The rate of recognition of these sentences is about 70% which is, to our knowledge, the best result for the recognition of the Arab language. The test set consists of twenty sentences from four speakers not having taken part in the training.

Keywords: Continuous speech recognition, lexical analyzer, phonetic decoding, phonetic lattice, vocal signal.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1356
918 Efficient System for Speech Recognition using General Regression Neural Network

Authors: Abderrahmane Amrouche, Jean Michel Rouvaen

Abstract:

In this paper we present an efficient system for independent speaker speech recognition based on neural network approach. The proposed architecture comprises two phases: a preprocessing phase which consists in segmental normalization and features extraction and a classification phase which uses neural networks based on nonparametric density estimation namely the general regression neural network (GRNN). The relative performances of the proposed model are compared to the similar recognition systems based on the Multilayer Perceptron (MLP), the Recurrent Neural Network (RNN) and the well known Discrete Hidden Markov Model (HMM-VQ) that we have achieved also. Experimental results obtained with Arabic digits have shown that the use of nonparametric density estimation with an appropriate smoothing factor (spread) improves the generalization power of the neural network. The word error rate (WER) is reduced significantly over the baseline HMM method. GRNN computation is a successful alternative to the other neural network and DHMM.

Keywords: Speech Recognition, General Regression NeuralNetwork, Hidden Markov Model, Recurrent Neural Network, ArabicDigits.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2149
917 Applying Lean Principles, Tools and Techniques in Set Parts Supply Implementation

Authors: Suhartini M. Jainury, Rizauddin Ramli, Mohd Nizam A. Rahman

Abstract:

Lean, which was initially developed by Toyota, is widely implemented in other companies to improve competitiveness. This research is an attempt to identify the adoption of lean in the production system of Malaysian car manufacturer, Proton using case study approach. To gain the in-depth information regarding lean implementation, an activity on the assembly line called Set Parts Supply (SPS) was studied. The result indicates that by using lean principles, tools and techniques in the implementation of SPS enabled to achieve the goals on safety, quality, cost, delivery and morale. The implementation increased the size of the workspace, improved the quality of assembly and the delivery of parts supply, reduced the manpower, achieved cost savings on electricity and also increased the motivation of manpower in respect of attendance at work. A framework of SPS implementation is suggested as a contribution for lean practices in production system.

Keywords: Assembly line, lean manufacturing, production, parts supply.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5613
916 A Supervised Text-Independent Speaker Recognition Approach

Authors: Tudor Barbu

Abstract:

We provide a supervised speech-independent voice recognition technique in this paper. In the feature extraction stage we propose a mel-cepstral based approach. Our feature vector classification method uses a special nonlinear metric, derived from the Hausdorff distance for sets, and a minimum mean distance classifier.

Keywords: Text-independent speaker recognition, mel cepstral analysis, speech feature vector, Hausdorff-based metric, supervised classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1800
915 Combined Automatic Speech Recognition and Machine Translation in Business Correspondence Domain for English-Croatian

Authors: Sanja Seljan, Ivan Dunđer

Abstract:

The paper presents combined automatic speech recognition (ASR) of English and machine translation (MT) for English and Croatian and Croatian-English language pairs in the domain of business correspondence. The first part presents results of training the ASR commercial system on English data sets, enriched by error analysis. The second part presents results of machine translation performed by free online tool for English and Croatian and Croatian-English language pairs. Human evaluation in terms of usability is conducted and internal consistency calculated by Cronbach's alpha coefficient, enriched by error analysis. Automatic evaluation is performed by WER (Word Error Rate) and PER (Position-independent word Error Rate) metrics, followed by investigation of Pearson’s correlation with human evaluation.

Keywords: Automatic machine translation, integrated language technologies, quality evaluation, speech recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2877