Search results for: Visual speech.

682 Unit Selection Algorithm Using Bi-grams Model For Corpus-Based Speech Synthesis

Authors: Mohamed Ali KAMMOUN, Ahmed Ben HAMIDA

Abstract:

In this paper, we present a novel statistical approach to corpus-based speech synthesis. Classically, phonetic information is defined and considered as acoustic reference to be respected. In this way, many studies were elaborated for acoustical unit classification. This type of classification allows separating units according to their symbolic characteristics. Indeed, target cost and concatenation cost were classically defined for unit selection. In Corpus-Based Speech Synthesis System, when using large text corpora, cost functions were limited to a juxtaposition of symbolic criteria and the acoustic information of units is not exploited in the definition of the target cost. In this manuscript, we token in our consideration the unit phonetic information corresponding to acoustic information. This would be realized by defining a probabilistic linguistic Bi-grams model basically used for unit selection. The selected units would be extracted from the English TIMIT corpora.

Keywords: Unit selection, Corpus-based Speech Synthesis, Bigram model

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1398

681 The Effect of Sport Specific Exercises on the Visual Skills of Rugby Players

Authors: P.J. Du Toit, P. Janse Van Vuuren , S. Le Roux , E. Henning, M. Kleynhans, H.C. Terblanche, D. Crafford, C. Grobbelaar, P.S. Wood, C.C. Grant, L. Fletcher

Abstract:

Introduction: Visual performance is an important factor in sport excellence. Visual involvement in a sport varies according to environmental demands associated with that sport. These environmental demands are matched by a task specific motor response. The purpose of this study was to determine if sport specific exercises will improve the visual performance of male rugby players, in order to achieve maximal results on the sports field. Materials & Methods: Twenty six adult male rugby players, aged 16-22, were chosen as subjects. In order to evaluate the effect of sport specific exercises on visual skills, a pre-test - post-test experimental group design was adopted for the study. Results: Significant differences (p≤0.05) were seen in the focussing, tracking, vergence, sequencing, eye-hand coordination and visualisation components Discussion & Conclusions: Sport specific exercises improved visual skills in rugby players which may provide them with an advantage over their opponents. This study suggests that these training programs and participation in regular on-line EyeDrills sports vision exercises (www.eyedrills.co.za) aimed at improving the athlete-s visual coordination, concentration, focus, hand-eye co-ordination, anticipation and motor response should be incorpotated in the rugby players exercise regime.

Keywords: Rugby players, sport specific exercises, visual skills.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2204

680 Puff Noise Detection and Cancellation for Robust Speech Recognition

Authors: Sangjun Park, Jungpyo Hong, Byung-Ok Kang, Yun-keun Lee, Minsoo Hahn

Abstract:

In this paper, an algorithm for detecting and attenuating puff noises frequently generated under the mobile environment is proposed. As a baseline system, puff detection system is designed based on Gaussian Mixture Model (GMM), and 39th Mel Frequency Cepstral Coefficient (MFCC) is extracted as feature parameters. To improve the detection performance, effective acoustic features for puff detection are proposed. In addition, detected puff intervals are attenuated by high-pass filtering. The speech recognition rate was measured for evaluation and confusion matrix and ROC curve are used to confirm the validity of the proposed system.

Keywords: Gaussian mixture model, puff detection and cancellation, speech enhancement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2181

679 Analysis of Linguistic Disfluencies in Bilingual Children’s Discourse

Authors: Sheena Christabel Pravin, M. Palanivelan

Abstract:

Speech disfluencies are common in spontaneous speech. The primary purpose of this study was to distinguish linguistic disfluencies from stuttering disfluencies in bilingual Tamil–English (TE) speaking children. The secondary purpose was to determine whether their disfluencies are mediated by native language dominance and/or on an early onset of developmental stuttering at childhood. A detailed study was carried out to identify the prosodic and acoustic features that uniquely represent the disfluent regions of speech. This paper focuses on statistical modeling of repetitions, prolongations, pauses and interjections in the speech corpus encompassing bilingual spontaneous utterances from school going children – English and Tamil. Two classifiers including Hidden Markov Models (HMM) and the Multilayer Perceptron (MLP), which is a class of feed-forward artificial neural network, were compared in the classification of disfluencies. The results of the classifiers document the patterns of disfluency in spontaneous speech samples of school-aged children to distinguish between Children Who Stutter (CWS) and Children with Language Impairment CLI). The ability of the models in classifying the disfluencies was measured in terms of F-measure, Recall, and Precision.

Keywords: Bilingual, children who stutter, children with language impairment, Hidden Markov Models, multi-layer perceptron, linguistic disfluencies, stuttering disfluencies.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 976

678 Applications of Support Vector Machines on Smart Phone Systems for Emotional Speech Recognition

Authors: Wernhuar Tarng, Yuan-Yuan Chen, Chien-Lung Li, Kun-Rong Hsie, Mingteh Chen

Abstract:

An emotional speech recognition system for the applications on smart phones was proposed in this study to combine with 3G mobile communications and social networks to provide users and their groups with more interaction and care. This study developed a mechanism using the support vector machines (SVM) to recognize the emotions of speech such as happiness, anger, sadness and normal. The mechanism uses a hierarchical classifier to adjust the weights of acoustic features and divides various parameters into the categories of energy and frequency for training. In this study, 28 commonly used acoustic features including pitch and volume were proposed for training. In addition, a time-frequency parameter obtained by continuous wavelet transforms was also used to identify the accent and intonation in a sentence during the recognition process. The Berlin Database of Emotional Speech was used by dividing the speech into male and female data sets for training. According to the experimental results, the accuracies of male and female test sets were increased by 4.6% and 5.2% respectively after using the time-frequency parameter for classifying happy and angry emotions. For the classification of all emotions, the average accuracy, including male and female data, was 63.5% for the test set and 90.9% for the whole data set.

Keywords: Smart phones, emotional speech recognition, socialnetworks, support vector machines, time-frequency parameter, Mel-scale frequency cepstral coefficients (MFCC).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1802

677 Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks

Authors: L. Salhi, M. Talbi, A. Cherif

Abstract:

This paper presents a new strategy of identification and classification of pathological voices using the hybrid method based on wavelet transform and neural networks. After speech acquisition from a patient, the speech signal is analysed in order to extract the acoustic parameters such as the pitch, the formants, Jitter, and shimmer. Obtained results will be compared to those normal and standard values thanks to a programmable database. Sounds are collected from normal people and patients, and then classified into two different categories. Speech data base is consists of several pathological and normal voices collected from the national hospital “Rabta-Tunis". Speech processing algorithm is conducted in a supervised mode for discrimination of normal and pathology voices and then for classification between neural and vocal pathologies (Parkinson, Alzheimer, laryngeal, dyslexia...). Several simulation results will be presented in function of the disease and will be compared with the clinical diagnosis in order to have an objective evaluation of the developed tool.

Keywords: Formants, Neural Networks, Pathological Voices, Pitch, Wavelet Transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2793

676 Preliminary Study of the Phonological Development in Three- and Four-Year-Old Bulgarian Children

Authors: Tsvetomira Braynova, Miglena Simonska

Abstract:

The article presents the results of a research of phonological processes in three- and four-year-old children. A test, created for the purpose of the study, was developed and conducted among 120 children. The study included three areas of research - at the level of words (96 words), at the level of sentence repetition (10 sentences) and at the level of generating own speech from a picture (15 pictures). The test also gives us additional information about the articulation errors of the assessed children. The main purpose of the research is to analyze all phonological processes that occur at this age in Bulgarian children and to identify which are typical and atypical for this age. The results show that the most common phonology errors that children make are: sound substitution, elision of sound, metathesis of sound, elision of syllable, elision of consonants clustered in a syllable. Measuring the correlation between average length of repeated speech and average length of generated speech, the analysis does not prove that the more words a child can repeat in part “repeated speech”, the more words they can be expected to generate in part “generating sentence”. The results of this study show that the task of naming a word provides sufficient and representative information to assess the child's phonology.

Keywords: Articulation, phonology, speech, language development.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 302

675 Continuous Feature Adaptation for Non-Native Speech Recognition

Authors: Y. Deng, X. Li, C. Kwan, B. Raj, R. Stern

Abstract:

The current speech interfaces in many military applications may be adequate for native speakers. However, the recognition rate drops quite a lot for non-native speakers (people with foreign accents). This is mainly because the nonnative speakers have large temporal and intra-phoneme variations when they pronounce the same words. This problem is also complicated by the presence of large environmental noise such as tank noise, helicopter noise, etc. In this paper, we proposed a novel continuous acoustic feature adaptation algorithm for on-line accent and environmental adaptation. Implemented by incremental singular value decomposition (SVD), the algorithm captures local acoustic variation and runs in real-time. This feature-based adaptation method is then integrated with conventional model-based maximum likelihood linear regression (MLLR) algorithm. Extensive experiments have been performed on the NATO non-native speech corpus with baseline acoustic model trained on native American English. The proposed feature-based adaptation algorithm improved the average recognition accuracy by 15%, while the MLLR model based adaptation achieved 11% improvement. The corresponding word error rate (WER) reduction was 25.8% and 2.73%, as compared to that without adaptation. The combined adaptation achieved overall recognition accuracy improvement of 29.5%, and WER reduction of 31.8%, as compared to that without adaptation.

Keywords: speaker adaptation; environment adaptation; robust speech recognition; SVD; non-native speech recognition

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3174

674 Automatic Detection of Syllable Repetition in Read Speech for Objective Assessment of Stuttered Disfluencies

Authors: K. M. Ravikumar, Balakrishna Reddy, R. Rajagopal, H. C. Nagaraj

Abstract:

Automatic detection of syllable repetition is one of the important parameter in assessing the stuttered speech objectively. The existing method which uses artificial neural network (ANN) requires high levels of agreement as prerequisite before attempting to train and test ANNs to separate fluent and nonfluent. We propose automatic detection method for syllable repetition in read speech for objective assessment of stuttered disfluencies which uses a novel approach and has four stages comprising of segmentation, feature extraction, score matching and decision logic. Feature extraction is implemented using well know Mel frequency Cepstra coefficient (MFCC). Score matching is done using Dynamic Time Warping (DTW) between the syllables. The Decision logic is implemented by Perceptron based on the score given by score matching. Although many methods are available for segmentation, in this paper it is done manually. Here the assessment by human judges on the read speech of 10 adults who stutter are described using corresponding method and the result was 83%.

Keywords: Assessment, DTW, MFCC, Objective, Perceptron, Stuttering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2749

673 A Mixing Matrix Estimation Algorithm for Speech Signals under the Under-Determined Blind Source Separation Model

Authors: Jing Wu, Wei Lv, Yibing Li, Yuanfan You

Abstract:

The separation of speech signals has become a research hotspot in the field of signal processing in recent years. It has many applications and influences in teleconferencing, hearing aids, speech recognition of machines and so on. The sounds received are usually noisy. The issue of identifying the sounds of interest and obtaining clear sounds in such an environment becomes a problem worth exploring, that is, the problem of blind source separation. This paper focuses on the under-determined blind source separation (UBSS). Sparse component analysis is generally used for the problem of under-determined blind source separation. The method is mainly divided into two parts. Firstly, the clustering algorithm is used to estimate the mixing matrix according to the observed signals. Then the signal is separated based on the known mixing matrix. In this paper, the problem of mixing matrix estimation is studied. This paper proposes an improved algorithm to estimate the mixing matrix for speech signals in the UBSS model. The traditional potential algorithm is not accurate for the mixing matrix estimation, especially for low signal-to noise ratio (SNR).In response to this problem, this paper considers the idea of an improved potential function method to estimate the mixing matrix. The algorithm not only avoids the inuence of insufficient prior information in traditional clustering algorithm, but also improves the estimation accuracy of mixing matrix. This paper takes the mixing of four speech signals into two channels as an example. The results of simulations show that the approach in this paper not only improves the accuracy of estimation, but also applies to any mixing matrix.

Keywords: Clustering algorithm, potential function, speech signal, the UBSS model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 623

672 Building an Arithmetic Model to Assess Visual Consistency in Townscape

Authors: Dheyaa Hussein, Peter Armstrong

Abstract:

The phenomenon of visual disorder is prominent in contemporary townscapes. This paper provides a theoretical framework for the assessment of visual consistency in townscape in order to achieve more favourable outcomes for users. In this paper, visual consistency refers to the amount of similarity between adjacent components of townscape. The paper investigates parameters which relate to visual consistency in townscape, explores the relationships between them and highlights their significance. The paper uses arithmetic methods from outside the domain of urban design to enable the establishment of an objective approach of assessment which considers subjective indicators including users’ preferences. These methods involve the standard of deviation, colour distance and the distance between points. The paper identifies urban space as a key representative of the visual parameters of townscape. It focuses on its two components, geometry and colour in the evaluation of the visual consistency of townscape. Accordingly, this article proposes four measurements. The first quantifies the number of vertices, which are points in the three-dimensional space that are connected, by lines, to represent the appearance of elements. The second evaluates the visual surroundings of urban space through assessing the location of their vertices. The last two measurements calculate the visual similarity in both vertices and colour in townscape by the calculation of their variation using methods including standard of deviation and colour difference. The proposed quantitative assessment is based on users’ preferences towards these measurements. The paper offers a theoretical basis for a practical tool which can alter the current understanding of architectural form and its application in urban space. This tool is currently under development. The proposed method underpins expert subjective assessment and permits the establishment of a unified framework which adds to creativity by the achievement of a higher level of consistency and satisfaction among the citizens of evolving townscapes.

Keywords: Townscape, Urban Design, Visual Assessment, Visual Consistency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1580

671 Branding Urban Spaces as an Approach for City Branding -Case study: Cairo City, Egypt

Authors: Mohammad R. M. Abdelaal, Reeman M. R. Hussein

Abstract:

With the beginning of the new century, man still faces many challenges in how to form and develop his urban environment. To meet these challenges, many cities have tried to develop its visual image. This is by transforming their urban environment into a branded visual image; this is at the level of squares, the main roads, the borders, and the landmarks. In this realm, the paper aims at activating the role of branded urban spaces as an approach for the development of visual image of cities, especially in Egypt. It concludes the need to recognize the importance of developing the visual image in Egypt, through directing the urban planners to the important role of such spaces in achieving sustainability.

Keywords: Urban branded spaces, brand image, sustainable development, Cairo.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3047

670 A Visual Cryptography and Statistics Based Method for Ownership Identification of Digital Images

Authors: Ching-Sheng Hsu, Young-Chang Hou

Abstract:

In this paper, a novel copyright protection scheme for digital images based on Visual Cryptography and Statistics is proposed. In our scheme, the theories and properties of sampling distribution of means and visual cryptography are employed to achieve the requirements of robustness and security. Our method does not need to alter the original image and can identify the ownership without resorting to the original image. Besides, our method allows multiple watermarks to be registered for a single host image without causing any damage to other hidden watermarks. Moreover, it is also possible for our scheme to cast a larger watermark into a smaller host image. Finally, experimental results will show the robustness of our scheme against several common attacks.

Keywords: Copyright protection, digital watermarking, samplingdistribution, visual cryptography.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1839

669 Spectral Entropy Employment in Speech Enhancement based on Wavelet Packet

Authors: Talbi Mourad, Salhi Lotfi, Chérif Adnen

Abstract:

In this work, we are interested in developing a speech denoising tool by using a discrete wavelet packet transform (DWPT). This speech denoising tool will be employed for applications of recognition, coding and synthesis. For noise reduction, instead of applying the classical thresholding technique, some wavelet packet nodes are set to zero and the others are thresholded. To estimate the non stationary noise level, we employ the spectral entropy. A comparison of our proposed technique to classical denoising methods based on thresholding and spectral subtraction is made in order to evaluate our approach. The experimental implementation uses speech signals corrupted by two sorts of noise, white and Volvo noises. The obtained results from listening tests show that our proposed technique is better than spectral subtraction. The obtained results from SNR computation show the superiority of our technique when compared to the classical thresholding method using the modified hard thresholding function based on u-law algorithm.

Keywords: Enhancement, spectral subtraction, SNR, discrete wavelet packet transform, spectral entropy Histogram

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1931

668 Spatiotemporal Analysis of Visual Evoked Responses Using Dense EEG

Authors: Rima Hleiss, Elie Bitar, Mahmoud Hassan, Mohamad Khalil

Abstract:

A comprehensive study of object recognition in the human brain requires combining both spatial and temporal analysis of brain activity. Here, we are mainly interested in three issues: the time perception of visual objects, the ability of discrimination between two particular categories (objects vs. animals), and the possibility to identify a particular spatial representation of visual objects. Our experiment consisted of acquiring dense electroencephalographic (EEG) signals during a picture-naming task comprising a set of objects and animals’ images. These EEG responses were recorded from nine participants. In order to determine the time perception of the presented visual stimulus, we analyzed the Event Related Potentials (ERPs) derived from the recorded EEG signals. The analysis of these signals showed that the brain perceives animals and objects with different time instants. Concerning the discrimination of the two categories, the support vector machine (SVM) was applied on the instantaneous EEG (excellent temporal resolution: on the order of millisecond) to categorize the visual stimuli into two different classes. The spatial differences between the evoked responses of the two categories were also investigated. The results showed a variation of the neural activity with the properties of the visual input. Results showed also the existence of a spatial pattern of electrodes over particular regions of the scalp in correspondence to their responses to the visual inputs.

Keywords: Brain activity, dense EEG, evoked responses, spatiotemporal analysis, SVM, perception.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1026

667 Affirming Students’ Attention and Perceptions on Prezi Presentation via Eye Tracking System

Authors: Mona Masood, Norshazlina Shaik Othman

Abstract:

The purpose of this study was to investigate graduate students’ visual attention and perceptions of a Prezi presentation. Ten postgraduate master students were presented with a Prezi presentation at the Centre for Instructional Technology and Multimedia, Universiti Sains Malaysia (USM). The eye movement indicators such as dwell time, average fixation on the areas of interests, heat maps and focus maps were abstracted to indicate the students’ visual attention. Descriptive statistics was employed to analyze the students’ perception of the Prezi presentation in terms of text, slide design, images, layout and overall presentation. The result revealed that the students paid more attention to the text followed by the images and sub heading presented through the Prezi presentation.

Keywords: Eye tracking, Prezi, visual attention, visual perception.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2273

666 Secure E-Pay System Using Steganography and Visual Cryptography

Authors: K. Suganya Devi, P. Srinivasan, M. P. Vaishnave, G. Arutperumjothi

Abstract:

Today’s internet world is highly prone to various online attacks, of which the most harmful attack is phishing. The attackers host the fake websites which are very similar and look alike. We propose an image based authentication using steganography and visual cryptography to prevent phishing. This paper presents a secure steganographic technique for true color (RGB) images and uses Discrete Cosine Transform to compress the images. The proposed method hides the secret data inside the cover image. The use of visual cryptography is to preserve the privacy of an image by decomposing the original image into two shares. Original image can be identified only when both qualified shares are simultaneously available. Individual share does not reveal the identity of the original image. Thus, the existence of the secret message is hard to be detected by the RS steganalysis.

Keywords: Image security, random LSB, steganography, visual cryptography.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1330

665 A Perceptual Image Coding method of High Compression Rate

Authors: Fahmi Kammoun, Mohamed Salim Bouhlel

Abstract:

In the framework of the image compression by Wavelet Transforms, we propose a perceptual method by incorporating Human Visual System (HVS) characteristics in the quantization stage. Indeed, human eyes haven-t an equal sensitivity across the frequency bandwidth. Therefore, the clarity of the reconstructed images can be improved by weighting the quantization according to the Contrast Sensitivity Function (CSF). The visual artifact at low bit rate is minimized. To evaluate our method, we use the Peak Signal to Noise Ratio (PSNR) and a new evaluating criteria witch takes into account visual criteria. The experimental results illustrate that our technique shows improvement on image quality at the same compression ratio.

Keywords: Contrast Sensitivity Function, Human Visual System, Image compression, Wavelet transforms.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1825

664 Igbo Art: A Reflection of the Igbo’s Visual Culture

Authors: David Osa-Egonwa

Abstract:

Visual culture is the expression of the norms and social behavior of a society in visual images. A reflection simply shows you how you look when you stand before a mirror, a clear water or stream. The mirror does not alter, improve or distort your original appearance, neither does it show you a caricature of what stands before it, this is the case with visual images created by a tribe or society. The ‘uli’ is hand drawn body design done on Igbo women and speaks of a culture of body adornment which is a practice that is appreciated by that tribe. The use of pattern of the gliding python snake ‘ije eke’ or ‘ijeagwo’ for wall painting speaks of the Igbo culture as one that appreciates wall paintings based on these patterns. Modern life came and brought a lot of change to the Igbo-speaking people of Nigeria. Change cloaked in the garment of Westernization has influenced the culture of the Igbos. This has resulted in a problem which is a break in the cultural practice that has also affected art produced by the Igbos. Before the colonial masters arrived and changed the established culture practiced by the Igbos, visual images were created that retained the culture of this people. To bring this point to limelight, this paper has adopted a historical method. A large number of works produced during pre and post-colonial era which range from sculptural pieces, paintings and other artifacts, just to mention a few, were studied carefully and it was discovered that the visual images hold the culture or aspects of the culture of the Igbos in their renditions and can rightly serve as a mirror of the Igbo visual culture.

Keywords: Artistic renditions, historical method, Igbo visual culture, changes.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 937

663 Reflections of Utopia and the Ideal City in the Development of Physical Structure of Nikšić Aspect of Visual Perception

Authors: Svetlana Perović, Svetislav Popović

Abstract:

Aspect of visual perception occupies a central position in shaping the physical structure of a city. This paper discusses the visual characteristics of utopian cities and their impact on the shaping of real urban structures. Utopian examples of cities will not be discussed in terms of social and sociological conditions, but rather the emphasis is on urban utopias and ideal cities that have achieved or have had potential impact on the shape of the physical structure of Nikšić. It is a Renaissance-Baroque period with a touch of classicism. The paper’s emphasis is on the physical dimension, not excluding the importance of social equilibrium, studies of which are dating back to Aristotle, Plato, Thomas More, Robert Owen, Tommaso Campanella and others. The emphasis is on urban utopias and their impact on the development of sustainable physical structure of a real city in the context of visual perception. In the case of Nikšić, this paper identifies the common features of a real city and a utopian city, as well as criteria for sustainable urban development in the context of visual achievement.

Keywords: Physical Structure of Nikšić, Utopia and Ideal City, Visual Perception.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3095

662 A Comparison Study of Inspector's Performance between Regular and Complex Tasks

Authors: Santirat Nansaarng, Sittichai Kaewkuekool, Supreeya Siripattanakunkajorn

Abstract:

This research was to study a comparison of inspector-s performance between regular and complex visual inspection task. Visual task was simulated on DVD read control circuit. Inspection task was performed by using computer. Subjects were 10 undergraduate randomly selected and test for 20/20. Then, subjects were divided into two groups, five for regular inspection (control group) and five for complex inspection (treatment group) tasks. Result was showed that performance on regular and complex inspectors was significantly difference at the level of 0.05. Inspector performance on regular inspection was showed high percentage on defects detected by using equal time to complex inspection. This would be indicated that inspector performance was affected by visual inspection task.

Keywords: Visual inspection task, regular and complex task.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1210

661 Bangla Vowel Characterization Based on Analysis by Synthesis

Authors: Syed Akhter Hossain, M. Lutfar Rahman, Farruk Ahmed

Abstract:

Bangla Vowel characterization determines the spectral properties of Bangla vowels for efficient synthesis as well as recognition of Bangla vowels. In this paper, Bangla vowels in isolated word have been analyzed based on speech production model within the framework of Analysis-by-Synthesis. This has led to the extraction of spectral parameters for the production model in order to produce different Bangla vowel sounds. The real and synthetic spectra are compared and a weighted square error has been computed along with the error in the formant bandwidths for efficient representation of Bangla vowels. The extracted features produced good representation of targeted Bangla vowel. Such a representation also plays essential role in low bit rate speech coding and vocoders.

Keywords: Speech, vowel, formant, synthesis, spectrum, LPC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2325

660 A Study on Learning Styles and Academic Performance in Relation with Kinesthetic, Verbal and Visual Intelligences

Authors: Salina Budin, Nor Liawati Abu Othman, Shaira Ismail

Abstract:

This study attempts to determine kinesthetic, verbal and visual intelligences among mechanical engineering undergraduate students and explores any probable relation with students’ learning styles and academic performance. The questionnaire used in this study is based on Howard Gardner’s multiple intelligences theory comprising of five elements of learning style; environmental, sociological, emotional, physiological and psychological. Questionnaires are distributed amongst undergraduates in the Faculty of Mechanical Engineering. Additional questions on students’ perception of learning styles and their academic performance are included in the questionnaire. The results show that one third of the students are strongly dominant in the kinesthetic intelligent (33%), followed by a combination of kinesthetic and visual intelligences (29%) and 21% are strongly dominant in all three types of intelligences. There is a statistically significant correlation between kinesthetic, verbal and visual intelligences and students learning styles and academic performances. The ANOVA analysis supports that there is a significant relationship between academic performances and level of kinesthetic, verbal and visual intelligences. In addition, it has also proven a remarkable relationship between academic performances and kinesthetic, verbal and visual learning styles amongst the male and female students. Thus, it can be concluded that, academic achievements can be enhanced by understanding as well as capitalizing the students’ types of intelligences and learning styles.

Keywords: Kinesthetic intelligent, verbal intelligent, visual intelligent, learning style, academic performances.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2713

659 Speech Recognition Using Scaly Neural Networks

Authors: Akram M. Othman, May H. Riadh

Abstract:

This research work is aimed at speech recognition using scaly neural networks. A small vocabulary of 11 words were established first, these words are “word, file, open, print, exit, edit, cut, copy, paste, doc1, doc2". These chosen words involved with executing some computer functions such as opening a file, print certain text document, cutting, copying, pasting, editing and exit. It introduced to the computer then subjected to feature extraction process using LPC (linear prediction coefficients). These features are used as input to an artificial neural network in speaker dependent mode. Half of the words are used for training the artificial neural network and the other half are used for testing the system; those are used for information retrieval. The system components are consist of three parts, speech processing and feature extraction, training and testing by using neural networks and information retrieval. The retrieve process proved to be 79.5-88% successful, which is quite acceptable, considering the variation to surrounding, state of the person, and the microphone type.

Keywords: Feature extraction, Liner prediction coefficients, neural network, Speech Recognition, Scaly ANN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1695

658 Forensic Speaker Verification in Noisy Environmental by Enhancing the Speech Signal Using ICA Approach

Authors: Ahmed Kamil Hasan Al-Ali, Bouchra Senadji, Ganesh Naik

Abstract:

We propose a system to real environmental noise and channel mismatch for forensic speaker verification systems. This method is based on suppressing various types of real environmental noise by using independent component analysis (ICA) algorithm. The enhanced speech signal is applied to mel frequency cepstral coefficients (MFCC) or MFCC feature warping to extract the essential characteristics of the speech signal. Channel effects are reduced using an intermediate vector (i-vector) and probabilistic linear discriminant analysis (PLDA) approach for classification. The proposed algorithm is evaluated by using an Australian forensic voice comparison database, combined with car, street and home noises from QUT-NOISE at a signal to noise ratio (SNR) ranging from -10 dB to 10 dB. Experimental results indicate that the MFCC feature warping-ICA achieves a reduction in equal error rate about (48.22%, 44.66%, and 50.07%) over using MFCC feature warping when the test speech signals are corrupted with random sessions of street, car, and home noises at -10 dB SNR.

Keywords: Noisy forensic speaker verification, ICA algorithm, MFCC, MFCC feature warping.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 926

657 Visual Analytics of Higher Order Information for Trajectory Datasets

Authors: Ye Wang, Ickjai Lee

Abstract:

Due to the widespread of mobile sensing, there is a strong need to handle trails of moving objects, and trajectories. This paper proposes three visual analytics approaches for higher order information of trajectory datasets based on the higher order Voronoi diagram data structure. Proposed approaches reveal geometrical, topological, and directional information. Experimental resultsdemonstrate the applicability and usefulness of proposed three approaches.

Keywords: Visual Analytics, Higher Order Information, Trajectory Datasets, Spatio-temporal data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1591

656 Computationally Efficient Signal Quality Improvement Method for VoIP System

Authors: H. P. Singh, S. Singh

Abstract:

The voice signal in Voice over Internet protocol (VoIP) system is processed through the best effort policy based IP network, which leads to the network degradations including delay, packet loss jitter. The work in this paper presents the implementation of finite impulse response (FIR) filter for voice quality improvement in the VoIP system through distributed arithmetic (DA) algorithm. The VoIP simulations are conducted with AMR-NB 6.70 kbps and G.729a speech coders at different packet loss rates and the performance of the enhanced VoIP signal is evaluated using the perceptual evaluation of speech quality (PESQ) measurement for narrowband signal. The results show reduction in the computational complexity in the system and significant improvement in the quality of the VoIP voice signal.

Keywords: VoIP, Signal Quality, Distributed Arithmetic, Packet Loss, Speech Coder.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1777

655 Extracting Tongue Shape Dynamics from Magnetic Resonance Image Sequences

Authors: María S. Avila-García, John N. Carter, Robert I. Damper

Abstract:

An important problem in speech research is the automatic extraction of information about the shape and dimensions of the vocal tract during real-time speech production. We have previously developed Southampton dynamic magnetic resonance imaging (SDMRI) as an approach to the solution of this problem.However, the SDMRI images are very noisy so that shape extraction is a major challenge. In this paper, we address the problem of tongue shape extraction, which poses difficulties because this is a highly deforming non-parametric shape. We show that combining active shape models with the dynamic Hough transform allows the tongue shape to be reliably tracked in the image sequence.

Keywords: Vocal tract imaging, speech production, active shapemodels, dynamic Hough transform, object tracking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1691

654 The Importance of Bridge Health Monitoring

Authors: Punya Chupanit, Chayatan Phromsorn

Abstract:

In the past, there were many bridge-s collapses due to lack of bridge structural capacity information. Most of concrete bridge health was relied on information from visual inspection, which sometime was inadequate. This study was conducted in order to investigate relationship between bridge structural condition and bridge visual condition. This study was a part of a big project conducted at Department of Highways of Thailand. In this study, 31 bridges including slab-type bridges, plank-girder bridges, prestressed box-beam bridges, prestressed I-girder bridges and prestressed multibeam bridges were selected for visual inspection and load test. It was found a positive correlation between bridge appearance and bridge-s load carrying capacity. However, statistical characteristic revealed low correlation between them.

Keywords: Bridge, Visual Inspection, Load Test, Condition Rating, Rating Factor

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1993

653 VISUAL JESS: AN Expandable Visual Generator of Oriented Object Expert systems

Authors: Amel Grissa-Touzi, Habib Ounally, Aissa Boulila

Abstract:

The utility of expert system generators has been widely recognized in many applications. Several generators based on concept of the paradigm object, have been recently proposed. The generator of oriented object expert system (GSEOO) offers languages that are often complex and difficult to use. We propose in this paper an extension of the expert system generator, JESS, which permits a friendly use of this expert system. The new tool, called VISUAL JESS, bring two main improvements to JESS. The first improvement concerns the easiness of its utilization while giving back transparency to the syntax and semantic aspects of the JESS programming language. The second improvement permits an easy access and modification of the JESS knowledge basis. The implementation of VISUAL JESS is made so that it is extensible and portable.

Keywords: Generator of Systems Expert, Programming oriented object classifies, object, inheritance, polymorphism.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1552