Search results for: Visual speech.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 809

Search results for: Visual speech.

689 Applications of Support Vector Machines on Smart Phone Systems for Emotional Speech Recognition

Authors: Wernhuar Tarng, Yuan-Yuan Chen, Chien-Lung Li, Kun-Rong Hsie, Mingteh Chen

Abstract:

An emotional speech recognition system for the applications on smart phones was proposed in this study to combine with 3G mobile communications and social networks to provide users and their groups with more interaction and care. This study developed a mechanism using the support vector machines (SVM) to recognize the emotions of speech such as happiness, anger, sadness and normal. The mechanism uses a hierarchical classifier to adjust the weights of acoustic features and divides various parameters into the categories of energy and frequency for training. In this study, 28 commonly used acoustic features including pitch and volume were proposed for training. In addition, a time-frequency parameter obtained by continuous wavelet transforms was also used to identify the accent and intonation in a sentence during the recognition process. The Berlin Database of Emotional Speech was used by dividing the speech into male and female data sets for training. According to the experimental results, the accuracies of male and female test sets were increased by 4.6% and 5.2% respectively after using the time-frequency parameter for classifying happy and angry emotions. For the classification of all emotions, the average accuracy, including male and female data, was 63.5% for the test set and 90.9% for the whole data set.

Keywords: Smart phones, emotional speech recognition, socialnetworks, support vector machines, time-frequency parameter, Mel-scale frequency cepstral coefficients (MFCC).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1842
688 Analysis of Event-related Response in Human Visual Cortex with fMRI

Authors: Ayesha Zaman, Tanvir Atahary, Shahida Rafiq

Abstract:

Functional Magnetic Resonance Imaging(fMRI) is a noninvasive imaging technique that measures the hemodynamic response related to neural activity in the human brain. Event-related functional magnetic resonance imaging (efMRI) is a form of functional Magnetic Resonance Imaging (fMRI) in which a series of fMRI images are time-locked to a stimulus presentation and averaged together over many trials. Again an event related potential (ERP) is a measured brain response that is directly the result of a thought or perception. Here the neuronal response of human visual cortex in normal healthy patients have been studied. The patients were asked to perform a visual three choice reaction task; from the relative response of each patient corresponding neuronal activity in visual cortex was imaged. The average number of neurons in the adult human primary visual cortex, in each hemisphere has been estimated at around 140 million. Statistical analysis of this experiment was done with SPM5(Statistical Parametric Mapping version 5) software. The result shows a robust design of imaging the neuronal activity of human visual cortex.

Keywords: Echo Planner Imaging, Event related Response, General Linear Model, Visual Neuronal Response.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1457
687 Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks

Authors: L. Salhi, M. Talbi, A. Cherif

Abstract:

This paper presents a new strategy of identification and classification of pathological voices using the hybrid method based on wavelet transform and neural networks. After speech acquisition from a patient, the speech signal is analysed in order to extract the acoustic parameters such as the pitch, the formants, Jitter, and shimmer. Obtained results will be compared to those normal and standard values thanks to a programmable database. Sounds are collected from normal people and patients, and then classified into two different categories. Speech data base is consists of several pathological and normal voices collected from the national hospital “Rabta-Tunis". Speech processing algorithm is conducted in a supervised mode for discrimination of normal and pathology voices and then for classification between neural and vocal pathologies (Parkinson, Alzheimer, laryngeal, dyslexia...). Several simulation results will be presented in function of the disease and will be compared with the clinical diagnosis in order to have an objective evaluation of the developed tool.

Keywords: Formants, Neural Networks, Pathological Voices, Pitch, Wavelet Transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2842
686 Preliminary Study of the Phonological Development in Three- and Four-Year-Old Bulgarian Children

Authors: Tsvetomira Braynova, Miglena Simonska

Abstract:

The article presents the results of a research of phonological processes in three- and four-year-old children. A test, created for the purpose of the study, was developed and conducted among 120 children. The study included three areas of research - at the level of words (96 words), at the level of sentence repetition (10 sentences) and at the level of generating own speech from a picture (15 pictures). The test also gives us additional information about the articulation errors of the assessed children. The main purpose of the research is to analyze all phonological processes that occur at this age in Bulgarian children and to identify which are typical and atypical for this age. The results show that the most common phonology errors that children make are: sound substitution, elision of sound, metathesis of sound, elision of syllable, elision of consonants clustered in a syllable. Measuring the correlation between average length of repeated speech and average length of generated speech, the analysis does not prove that the more words a child can repeat in part “repeated speech”, the more words they can be expected to generate in part “generating sentence”. The results of this study show that the task of naming a word provides sufficient and representative information to assess the child's phonology.

Keywords: Articulation, phonology, speech, language development.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 384
685 Controlling 6R Robot by Visionary System

Authors: Azamossadat Nourbakhsh, Moharram Habibnezhad Korayem

Abstract:

In the visual servoing systems, the data obtained by Visionary is used for controlling robots. In this project, at first the simulator which was proposed for simulating the performance of a 6R robot before, was examined in terms of software and test, and in the proposed simulator, existing defects were obviated. In the first version of simulation, the robot was directed toward the target object only in a Position-based method using two cameras in the environment. In the new version of the software, three cameras were used simultaneously. The camera which is installed as eye-inhand on the end-effector of the robot is used for visual servoing in a Feature-based method. The target object is recognized according to its characteristics and the robot is directed toward the object in compliance with an algorithm similar to the function of human-s eyes. Then, the function and accuracy of the operation of the robot are examined through Position-based visual servoing method using two cameras installed as eye-to-hand in the environment. Finally, the obtained results are tested under ANSI-RIA R15.05-2 standard.

Keywords: 6R Robot , camera, visual servoing, Feature-based visual servoing, Position-based visual servoing, Performance tests.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1385
684 Continuous Feature Adaptation for Non-Native Speech Recognition

Authors: Y. Deng, X. Li, C. Kwan, B. Raj, R. Stern

Abstract:

The current speech interfaces in many military applications may be adequate for native speakers. However, the recognition rate drops quite a lot for non-native speakers (people with foreign accents). This is mainly because the nonnative speakers have large temporal and intra-phoneme variations when they pronounce the same words. This problem is also complicated by the presence of large environmental noise such as tank noise, helicopter noise, etc. In this paper, we proposed a novel continuous acoustic feature adaptation algorithm for on-line accent and environmental adaptation. Implemented by incremental singular value decomposition (SVD), the algorithm captures local acoustic variation and runs in real-time. This feature-based adaptation method is then integrated with conventional model-based maximum likelihood linear regression (MLLR) algorithm. Extensive experiments have been performed on the NATO non-native speech corpus with baseline acoustic model trained on native American English. The proposed feature-based adaptation algorithm improved the average recognition accuracy by 15%, while the MLLR model based adaptation achieved 11% improvement. The corresponding word error rate (WER) reduction was 25.8% and 2.73%, as compared to that without adaptation. The combined adaptation achieved overall recognition accuracy improvement of 29.5%, and WER reduction of 31.8%, as compared to that without adaptation.

Keywords: speaker adaptation; environment adaptation; robust speech recognition; SVD; non-native speech recognition

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3217
683 Pictorial Multimodal Analysis of Selected Paintings of Salvador Dali

Authors: Shaza Melies, Abeer Refky, Nihad Mansoor

Abstract:

Multimodality involves the communication between verbal and visual components in various discourses. A painting represents a form of communication between the artist and the viewer in terms of colors, shades, objects, and the title. This paper aims to present how multimodality can be used to decode the verbal and visual dimensions a painting holds. For that purpose, this study uses Kress and van Leeuwen’s theoretical framework of visual grammar for the analysis of the multimodal semiotic resources of selected paintings of Salvador Dali. This study investigates the visual decoding of the selected paintings of Salvador Dali and analyzing their social and political meanings using Kress and van Leeuwen’s framework of visual grammar. The paper attempts to answer the following questions: 1. How far can multimodality decode the verbal and non-verbal meanings of surrealistic art? 2. How can Kress and van Leeuwen’s theoretical framework of visual grammar be applied to analyze Dali’s paintings? 3. To what extent is Kress and van Leeuwen’s theoretical framework of visual grammar apt to deliver political and social messages of Dali? The paper reached the following findings: the framework’s descriptive tools (representational, interactive, and compositional meanings) can be used to analyze the paintings’ title and their visual elements. Social and political messages were delivered by appropriate usage of color, gesture, vectors, modality, and the way social actors were represented.

Keywords: Multimodality, multimodal analysis, paintings analysis, Salvador Dali, visual grammar.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 752
682 Automatic Detection of Syllable Repetition in Read Speech for Objective Assessment of Stuttered Disfluencies

Authors: K. M. Ravikumar, Balakrishna Reddy, R. Rajagopal, H. C. Nagaraj

Abstract:

Automatic detection of syllable repetition is one of the important parameter in assessing the stuttered speech objectively. The existing method which uses artificial neural network (ANN) requires high levels of agreement as prerequisite before attempting to train and test ANNs to separate fluent and nonfluent. We propose automatic detection method for syllable repetition in read speech for objective assessment of stuttered disfluencies which uses a novel approach and has four stages comprising of segmentation, feature extraction, score matching and decision logic. Feature extraction is implemented using well know Mel frequency Cepstra coefficient (MFCC). Score matching is done using Dynamic Time Warping (DTW) between the syllables. The Decision logic is implemented by Perceptron based on the score given by score matching. Although many methods are available for segmentation, in this paper it is done manually. Here the assessment by human judges on the read speech of 10 adults who stutter are described using corresponding method and the result was 83%.

Keywords: Assessment, DTW, MFCC, Objective, Perceptron, Stuttering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2812
681 The Effect of Sport Specific Exercises on the Visual Skills of Rugby Players

Authors: P.J. Du Toit, P. Janse Van Vuuren , S. Le Roux , E. Henning, M. Kleynhans, H.C. Terblanche, D. Crafford, C. Grobbelaar, P.S. Wood, C.C. Grant, L. Fletcher

Abstract:

Introduction: Visual performance is an important factor in sport excellence. Visual involvement in a sport varies according to environmental demands associated with that sport. These environmental demands are matched by a task specific motor response. The purpose of this study was to determine if sport specific exercises will improve the visual performance of male rugby players, in order to achieve maximal results on the sports field. Materials & Methods: Twenty six adult male rugby players, aged 16-22, were chosen as subjects. In order to evaluate the effect of sport specific exercises on visual skills, a pre-test - post-test experimental group design was adopted for the study. Results: Significant differences (p≤0.05) were seen in the focussing, tracking, vergence, sequencing, eye-hand coordination and visualisation components Discussion & Conclusions: Sport specific exercises improved visual skills in rugby players which may provide them with an advantage over their opponents. This study suggests that these training programs and participation in regular on-line EyeDrills sports vision exercises (www.eyedrills.co.za) aimed at improving the athlete-s visual coordination, concentration, focus, hand-eye co-ordination, anticipation and motor response should be incorpotated in the rugby players exercise regime.

Keywords: Rugby players, sport specific exercises, visual skills.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2257
680 A Mixing Matrix Estimation Algorithm for Speech Signals under the Under-Determined Blind Source Separation Model

Authors: Jing Wu, Wei Lv, Yibing Li, Yuanfan You

Abstract:

The separation of speech signals has become a research hotspot in the field of signal processing in recent years. It has many applications and influences in teleconferencing, hearing aids, speech recognition of machines and so on. The sounds received are usually noisy. The issue of identifying the sounds of interest and obtaining clear sounds in such an environment becomes a problem worth exploring, that is, the problem of blind source separation. This paper focuses on the under-determined blind source separation (UBSS). Sparse component analysis is generally used for the problem of under-determined blind source separation. The method is mainly divided into two parts. Firstly, the clustering algorithm is used to estimate the mixing matrix according to the observed signals. Then the signal is separated based on the known mixing matrix. In this paper, the problem of mixing matrix estimation is studied. This paper proposes an improved algorithm to estimate the mixing matrix for speech signals in the UBSS model. The traditional potential algorithm is not accurate for the mixing matrix estimation, especially for low signal-to noise ratio (SNR).In response to this problem, this paper considers the idea of an improved potential function method to estimate the mixing matrix. The algorithm not only avoids the inuence of insufficient prior information in traditional clustering algorithm, but also improves the estimation accuracy of mixing matrix. This paper takes the mixing of four speech signals into two channels as an example. The results of simulations show that the approach in this paper not only improves the accuracy of estimation, but also applies to any mixing matrix.

Keywords: Clustering algorithm, potential function, speech signal, the UBSS model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 680
679 Spectral Entropy Employment in Speech Enhancement based on Wavelet Packet

Authors: Talbi Mourad, Salhi Lotfi, Chérif Adnen

Abstract:

In this work, we are interested in developing a speech denoising tool by using a discrete wavelet packet transform (DWPT). This speech denoising tool will be employed for applications of recognition, coding and synthesis. For noise reduction, instead of applying the classical thresholding technique, some wavelet packet nodes are set to zero and the others are thresholded. To estimate the non stationary noise level, we employ the spectral entropy. A comparison of our proposed technique to classical denoising methods based on thresholding and spectral subtraction is made in order to evaluate our approach. The experimental implementation uses speech signals corrupted by two sorts of noise, white and Volvo noises. The obtained results from listening tests show that our proposed technique is better than spectral subtraction. The obtained results from SNR computation show the superiority of our technique when compared to the classical thresholding method using the modified hard thresholding function based on u-law algorithm.

Keywords: Enhancement, spectral subtraction, SNR, discrete wavelet packet transform, spectral entropy Histogram

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1992
678 Bangla Vowel Characterization Based on Analysis by Synthesis

Authors: Syed Akhter Hossain, M. Lutfar Rahman, Farruk Ahmed

Abstract:

Bangla Vowel characterization determines the spectral properties of Bangla vowels for efficient synthesis as well as recognition of Bangla vowels. In this paper, Bangla vowels in isolated word have been analyzed based on speech production model within the framework of Analysis-by-Synthesis. This has led to the extraction of spectral parameters for the production model in order to produce different Bangla vowel sounds. The real and synthetic spectra are compared and a weighted square error has been computed along with the error in the formant bandwidths for efficient representation of Bangla vowels. The extracted features produced good representation of targeted Bangla vowel. Such a representation also plays essential role in low bit rate speech coding and vocoders.

Keywords: Speech, vowel, formant, synthesis, spectrum, LPC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2371
677 Speech Recognition Using Scaly Neural Networks

Authors: Akram M. Othman, May H. Riadh

Abstract:

This research work is aimed at speech recognition using scaly neural networks. A small vocabulary of 11 words were established first, these words are “word, file, open, print, exit, edit, cut, copy, paste, doc1, doc2". These chosen words involved with executing some computer functions such as opening a file, print certain text document, cutting, copying, pasting, editing and exit. It introduced to the computer then subjected to feature extraction process using LPC (linear prediction coefficients). These features are used as input to an artificial neural network in speaker dependent mode. Half of the words are used for training the artificial neural network and the other half are used for testing the system; those are used for information retrieval. The system components are consist of three parts, speech processing and feature extraction, training and testing by using neural networks and information retrieval. The retrieve process proved to be 79.5-88% successful, which is quite acceptable, considering the variation to surrounding, state of the person, and the microphone type.

Keywords: Feature extraction, Liner prediction coefficients, neural network, Speech Recognition, Scaly ANN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1738
676 Building an Arithmetic Model to Assess Visual Consistency in Townscape

Authors: Dheyaa Hussein, Peter Armstrong

Abstract:

The phenomenon of visual disorder is prominent in contemporary townscapes. This paper provides a theoretical framework for the assessment of visual consistency in townscape in order to achieve more favourable outcomes for users. In this paper, visual consistency refers to the amount of similarity between adjacent components of townscape. The paper investigates parameters which relate to visual consistency in townscape, explores the relationships between them and highlights their significance. The paper uses arithmetic methods from outside the domain of urban design to enable the establishment of an objective approach of assessment which considers subjective indicators including users’ preferences. These methods involve the standard of deviation, colour distance and the distance between points. The paper identifies urban space as a key representative of the visual parameters of townscape. It focuses on its two components, geometry and colour in the evaluation of the visual consistency of townscape. Accordingly, this article proposes four measurements. The first quantifies the number of vertices, which are points in the three-dimensional space that are connected, by lines, to represent the appearance of elements. The second evaluates the visual surroundings of urban space through assessing the location of their vertices. The last two measurements calculate the visual similarity in both vertices and colour in townscape by the calculation of their variation using methods including standard of deviation and colour difference. The proposed quantitative assessment is based on users’ preferences towards these measurements. The paper offers a theoretical basis for a practical tool which can alter the current understanding of architectural form and its application in urban space. This tool is currently under development. The proposed method underpins expert subjective assessment and permits the establishment of a unified framework which adds to creativity by the achievement of a higher level of consistency and satisfaction among the citizens of evolving townscapes.

Keywords: Townscape, Urban Design, Visual Assessment, Visual Consistency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1635
675 Branding Urban Spaces as an Approach for City Branding -Case study: Cairo City, Egypt

Authors: Mohammad R. M. Abdelaal, Reeman M. R. Hussein

Abstract:

With the beginning of the new century, man still faces many challenges in how to form and develop his urban environment. To meet these challenges, many cities have tried to develop its visual image. This is by transforming their urban environment into a branded visual image; this is at the level of squares, the main roads, the borders, and the landmarks. In this realm, the paper aims at activating the role of branded urban spaces as an approach for the development of visual image of cities, especially in Egypt. It concludes the need to recognize the importance of developing the visual image in Egypt, through directing the urban planners to the important role of such spaces in achieving sustainability.

Keywords: Urban branded spaces, brand image, sustainable development, Cairo.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3093
674 A Visual Cryptography and Statistics Based Method for Ownership Identification of Digital Images

Authors: Ching-Sheng Hsu, Young-Chang Hou

Abstract:

In this paper, a novel copyright protection scheme for digital images based on Visual Cryptography and Statistics is proposed. In our scheme, the theories and properties of sampling distribution of means and visual cryptography are employed to achieve the requirements of robustness and security. Our method does not need to alter the original image and can identify the ownership without resorting to the original image. Besides, our method allows multiple watermarks to be registered for a single host image without causing any damage to other hidden watermarks. Moreover, it is also possible for our scheme to cast a larger watermark into a smaller host image. Finally, experimental results will show the robustness of our scheme against several common attacks.

Keywords: Copyright protection, digital watermarking, samplingdistribution, visual cryptography.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1883
673 Forensic Speaker Verification in Noisy Environmental by Enhancing the Speech Signal Using ICA Approach

Authors: Ahmed Kamil Hasan Al-Ali, Bouchra Senadji, Ganesh Naik

Abstract:

We propose a system to real environmental noise and channel mismatch for forensic speaker verification systems. This method is based on suppressing various types of real environmental noise by using independent component analysis (ICA) algorithm. The enhanced speech signal is applied to mel frequency cepstral coefficients (MFCC) or MFCC feature warping to extract the essential characteristics of the speech signal. Channel effects are reduced using an intermediate vector (i-vector) and probabilistic linear discriminant analysis (PLDA) approach for classification. The proposed algorithm is evaluated by using an Australian forensic voice comparison database, combined with car, street and home noises from QUT-NOISE at a signal to noise ratio (SNR) ranging from -10 dB to 10 dB. Experimental results indicate that the MFCC feature warping-ICA achieves a reduction in equal error rate about (48.22%, 44.66%, and 50.07%) over using MFCC feature warping when the test speech signals are corrupted with random sessions of street, car, and home noises at -10 dB SNR.

Keywords: Noisy forensic speaker verification, ICA algorithm, MFCC, MFCC feature warping.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 991
672 Spatiotemporal Analysis of Visual Evoked Responses Using Dense EEG

Authors: Rima Hleiss, Elie Bitar, Mahmoud Hassan, Mohamad Khalil

Abstract:

A comprehensive study of object recognition in the human brain requires combining both spatial and temporal analysis of brain activity. Here, we are mainly interested in three issues: the time perception of visual objects, the ability of discrimination between two particular categories (objects vs. animals), and the possibility to identify a particular spatial representation of visual objects. Our experiment consisted of acquiring dense electroencephalographic (EEG) signals during a picture-naming task comprising a set of objects and animals’ images. These EEG responses were recorded from nine participants. In order to determine the time perception of the presented visual stimulus, we analyzed the Event Related Potentials (ERPs) derived from the recorded EEG signals. The analysis of these signals showed that the brain perceives animals and objects with different time instants. Concerning the discrimination of the two categories, the support vector machine (SVM) was applied on the instantaneous EEG (excellent temporal resolution: on the order of millisecond) to categorize the visual stimuli into two different classes. The spatial differences between the evoked responses of the two categories were also investigated. The results showed a variation of the neural activity with the properties of the visual input. Results showed also the existence of a spatial pattern of electrodes over particular regions of the scalp in correspondence to their responses to the visual inputs.

Keywords: Brain activity, dense EEG, evoked responses, spatiotemporal analysis, SVM, perception.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1071
671 Affirming Students’ Attention and Perceptions on Prezi Presentation via Eye Tracking System

Authors: Mona Masood, Norshazlina Shaik Othman

Abstract:

The purpose of this study was to investigate graduate students’ visual attention and perceptions of a Prezi presentation. Ten postgraduate master students were presented with a Prezi presentation at the Centre for Instructional Technology and Multimedia, Universiti Sains Malaysia (USM). The eye movement indicators such as dwell time, average fixation on the areas of interests, heat maps and focus maps were abstracted to indicate the students’ visual attention. Descriptive statistics was employed to analyze the students’ perception of the Prezi presentation in terms of text, slide design, images, layout and overall presentation. The result revealed that the students paid more attention to the text followed by the images and sub heading presented through the Prezi presentation.

Keywords: Eye tracking, Prezi, visual attention, visual perception.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2336
670 Computationally Efficient Signal Quality Improvement Method for VoIP System

Authors: H. P. Singh, S. Singh

Abstract:

The voice signal in Voice over Internet protocol (VoIP) system is processed through the best effort policy based IP network, which leads to the network degradations including delay, packet loss jitter. The work in this paper presents the implementation of finite impulse response (FIR) filter for voice quality improvement in the VoIP system through distributed arithmetic (DA) algorithm. The VoIP simulations are conducted with AMR-NB 6.70 kbps and G.729a speech coders at different packet loss rates and the performance of the enhanced VoIP signal is evaluated using the perceptual evaluation of speech quality (PESQ) measurement for narrowband signal. The results show reduction in the computational complexity in the system and significant improvement in the quality of the VoIP voice signal.

Keywords: VoIP, Signal Quality, Distributed Arithmetic, Packet Loss, Speech Coder.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1830
669 Secure E-Pay System Using Steganography and Visual Cryptography

Authors: K. Suganya Devi, P. Srinivasan, M. P. Vaishnave, G. Arutperumjothi

Abstract:

Today’s internet world is highly prone to various online attacks, of which the most harmful attack is phishing. The attackers host the fake websites which are very similar and look alike. We propose an image based authentication using steganography and visual cryptography to prevent phishing. This paper presents a secure steganographic technique for true color (RGB) images and uses Discrete Cosine Transform to compress the images. The proposed method hides the secret data inside the cover image. The use of visual cryptography is to preserve the privacy of an image by decomposing the original image into two shares. Original image can be identified only when both qualified shares are simultaneously available. Individual share does not reveal the identity of the original image. Thus, the existence of the secret message is hard to be detected by the RS steganalysis.

Keywords: Image security, random LSB, steganography, visual cryptography.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1386
668 A Perceptual Image Coding method of High Compression Rate

Authors: Fahmi Kammoun, Mohamed Salim Bouhlel

Abstract:

In the framework of the image compression by Wavelet Transforms, we propose a perceptual method by incorporating Human Visual System (HVS) characteristics in the quantization stage. Indeed, human eyes haven-t an equal sensitivity across the frequency bandwidth. Therefore, the clarity of the reconstructed images can be improved by weighting the quantization according to the Contrast Sensitivity Function (CSF). The visual artifact at low bit rate is minimized. To evaluate our method, we use the Peak Signal to Noise Ratio (PSNR) and a new evaluating criteria witch takes into account visual criteria. The experimental results illustrate that our technique shows improvement on image quality at the same compression ratio.

Keywords: Contrast Sensitivity Function, Human Visual System, Image compression, Wavelet transforms.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1875
667 Extracting Tongue Shape Dynamics from Magnetic Resonance Image Sequences

Authors: María S. Avila-García, John N. Carter, Robert I. Damper

Abstract:

An important problem in speech research is the automatic extraction of information about the shape and dimensions of the vocal tract during real-time speech production. We have previously developed Southampton dynamic magnetic resonance imaging (SDMRI) as an approach to the solution of this problem.However, the SDMRI images are very noisy so that shape extraction is a major challenge. In this paper, we address the problem of tongue shape extraction, which poses difficulties because this is a highly deforming non-parametric shape. We show that combining active shape models with the dynamic Hough transform allows the tongue shape to be reliably tracked in the image sequence.

Keywords: Vocal tract imaging, speech production, active shapemodels, dynamic Hough transform, object tracking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1735
666 Igbo Art: A Reflection of the Igbo’s Visual Culture

Authors: David Osa-Egonwa

Abstract:

Visual culture is the expression of the norms and social behavior of a society in visual images. A reflection simply shows you how you look when you stand before a mirror, a clear water or stream. The mirror does not alter, improve or distort your original appearance, neither does it show you a caricature of what stands before it, this is the case with visual images created by a tribe or society. The ‘uli’ is hand drawn body design done on Igbo women and speaks of a culture of body adornment which is a practice that is appreciated by that tribe. The use of pattern of the gliding python snake ‘ije eke’ or ‘ijeagwo’ for wall painting speaks of the Igbo culture as one that appreciates wall paintings based on these patterns. Modern life came and brought a lot of change to the Igbo-speaking people of Nigeria. Change cloaked in the garment of Westernization has influenced the culture of the Igbos. This has resulted in a problem which is a break in the cultural practice that has also affected art produced by the Igbos. Before the colonial masters arrived and changed the established culture practiced by the Igbos, visual images were created that retained the culture of this people. To bring this point to limelight, this paper has adopted a historical method. A large number of works produced during pre and post-colonial era which range from sculptural pieces, paintings and other artifacts, just to mention a few, were studied carefully and it was discovered that the visual images hold the culture or aspects of the culture of the Igbos in their renditions and can rightly serve as a mirror of the Igbo visual culture.

Keywords: Artistic renditions, historical method, Igbo visual culture, changes.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1014
665 Reflections of Utopia and the Ideal City in the Development of Physical Structure of Nikšić Aspect of Visual Perception

Authors: Svetlana Perović, Svetislav Popović

Abstract:

Aspect of visual perception occupies a central position in shaping the physical structure of a city. This paper discusses the visual characteristics of utopian cities and their impact on the shaping of real urban structures. Utopian examples of cities will not be discussed in terms of social and sociological conditions, but rather the emphasis is on urban utopias and ideal cities that have achieved or have had potential impact on the shape of the physical structure of Nikšić. It is a Renaissance-Baroque period with a touch of classicism. The paper’s emphasis is on the physical dimension, not excluding the importance of social equilibrium, studies of which are dating back to Aristotle, Plato, Thomas More, Robert Owen, Tommaso Campanella and others. The emphasis is on urban utopias and their impact on the development of sustainable physical structure of a real city in the context of visual perception. In the case of Nikšić, this paper identifies the common features of a real city and a utopian city, as well as criteria for sustainable urban development in the context of visual achievement.

Keywords: Physical Structure of Nikšić, Utopia and Ideal City, Visual Perception.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3148
664 A Comparison Study of Inspector's Performance between Regular and Complex Tasks

Authors: Santirat Nansaarng, Sittichai Kaewkuekool, Supreeya Siripattanakunkajorn

Abstract:

This research was to study a comparison of inspector-s performance between regular and complex visual inspection task. Visual task was simulated on DVD read control circuit. Inspection task was performed by using computer. Subjects were 10 undergraduate randomly selected and test for 20/20. Then, subjects were divided into two groups, five for regular inspection (control group) and five for complex inspection (treatment group) tasks. Result was showed that performance on regular and complex inspectors was significantly difference at the level of 0.05. Inspector performance on regular inspection was showed high percentage on defects detected by using equal time to complex inspection. This would be indicated that inspector performance was affected by visual inspection task.

Keywords: Visual inspection task, regular and complex task.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1257
663 A Study on Learning Styles and Academic Performance in Relation with Kinesthetic, Verbal and Visual Intelligences

Authors: Salina Budin, Nor Liawati Abu Othman, Shaira Ismail

Abstract:

This study attempts to determine kinesthetic, verbal and visual intelligences among mechanical engineering undergraduate students and explores any probable relation with students’ learning styles and academic performance. The questionnaire used in this study is based on Howard Gardner’s multiple intelligences theory comprising of five elements of learning style; environmental, sociological, emotional, physiological and psychological. Questionnaires are distributed amongst undergraduates in the Faculty of Mechanical Engineering. Additional questions on students’ perception of learning styles and their academic performance are included in the questionnaire. The results show that one third of the students are strongly dominant in the kinesthetic intelligent (33%), followed by a combination of kinesthetic and visual intelligences (29%) and 21% are strongly dominant in all three types of intelligences. There is a statistically significant correlation between kinesthetic, verbal and visual intelligences and students learning styles and academic performances. The ANOVA analysis supports that there is a significant relationship between academic performances and level of kinesthetic, verbal and visual intelligences. In addition, it has also proven a remarkable relationship between academic performances and kinesthetic, verbal and visual learning styles amongst the male and female students. Thus, it can be concluded that, academic achievements can be enhanced by understanding as well as capitalizing the students’ types of intelligences and learning styles.

Keywords: Kinesthetic intelligent, verbal intelligent, visual intelligent, learning style, academic performances.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2769
662 Influence of Loudness Compression on Hearing with Bone Anchored Hearing Implants

Authors: Anja Kurz, Marc Flynn, Tobias Good, Marco Caversaccio, Martin Kompis

Abstract:

Bone Anchored Hearing Implants (BAHI) are  routinely used in patients with conductive or mixed hearing loss, e.g.  if conventional air conduction hearing aids cannot be used. New  sound processors and new fitting software now allow the adjustment  of parameters such as loudness compression ratios or maximum  power output separately. Today it is unclear, how the choice of these  parameters influences aided speech understanding in BAHI users.  In this prospective experimental study, the effect of varying the  compression ratio and lowering the maximum power output in a  BAHI were investigated.  Twelve experienced adult subjects with a mixed hearing loss  participated in this study. Four different compression ratios (1.0; 1.3;  1.6; 2.0) were tested along with two different maximum power output  settings, resulting in a total of eight different programs. Each  participant tested each program during two weeks. A blinded Latin  square design was used to minimize bias.  For each of the eight programs, speech understanding in quiet and  in noise was assessed. For speech in quiet, the Freiburg number test  and the Freiburg monosyllabic word test at 50, 65, and 80 dB SPL  were used. For speech in noise, the Oldenburg sentence test was  administered.  Speech understanding in quiet and in noise was improved  significantly in the aided condition in any program, when compared  to the unaided condition. However, no significant differences were  found between any of the eight programs. In contrast, on a subjective  level there was a significant preference for medium compression  ratios of 1.3 to 1.6 and higher maximum power output.

 

Keywords: Bone Anchored Hearing Implant, Compression, Maximum Power Output, Speech understanding.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2066
661 Comparison of Fricative Vocal Tract Transfer Functions Derived using Two Different Segmentation Techniques

Authors: K. S. Subari, C. H. Shadle, A. Barney, R. I. Damper

Abstract:

The acoustic and articulatory properties of fricative speech sounds are being studied using magnetic resonance imaging (MRI) and acoustic recordings from a single subject. Area functions were derived from a complete set of axial and coronal MR slices using two different methods: the Mermelstein technique and the Blum transform. Area functions derived from the two techniques were shown to differ significantly in some cases. Such differences will lead to different acoustic predictions and it is important to know which is the more accurate. The vocal tract acoustic transfer function (VTTF) was derived from these area functions for each fricative and compared with measured speech signals for the same fricative and same subject. The VTTFs for /f/ in two vowel contexts and the corresponding acoustic spectra are derived here; the Blum transform appears to show a better match between prediction and measurement than the Mermelstein technique.

Keywords: Area functions, fricatives, vocal tract transferfunction, MRI, speech.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1652
660 Visual Analytics of Higher Order Information for Trajectory Datasets

Authors: Ye Wang, Ickjai Lee

Abstract:

Due to the widespread of mobile sensing, there is a strong need to handle trails of moving objects, and trajectories. This paper proposes three visual analytics approaches for higher order information of trajectory datasets based on the higher order Voronoi diagram data structure. Proposed approaches reveal geometrical, topological, and directional information. Experimental resultsdemonstrate the applicability and usefulness of proposed three approaches.

Keywords: Visual Analytics, Higher Order Information, Trajectory Datasets, Spatio-temporal data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1639