Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 4467

Search results for: visual speech perception

4347 History and Its Significance in Modern Visual Graphic: Its Niche with Respect to India

Authors: Hemang Madhusudan Anglay, Akash Gaur

Abstract:

The value of visual perception in today’s context is vulnerable. Visual Graphic broadly and conveniently expresses culture, language and the science of art, and serves as a mould in which various expressions can be cast. It is one of the essential parts of communication design and can be used to approach the above areas of expression. Between the receptors and interpreters, there is an expanse of comprehension and cliché in relation to the use of Visual Graphics. There are pedagogies, commodification and honest reflections where Visual Graphic is a common area of interest. The traditional receptors, amidst the dilemma of this very situation, find themselves in a pool of media, medium and interactions. Followed by a very vague interpretation, the entire circle of communication becomes a question of comprehension vs cliché. Residing in the same ‘eco-system’, these communities, who make pedagogies and multiply their reflections sometimes with honesty and sometimes on commercial values, tend to function differently. The advent of technology has created a virtual space that allows the user to access various forms of content. This diminishes the core characteristics and creates a vacuum even though it satisfies the user. The symbolic interpretation of visual form and structure is transmitted in a culture by means of contemporary media. Starting from a very individualistic approach, today it is beyond print and electronic media. The expected outcome will be a study of Ahmedabad City, situated in the Gujarat State of India, and its identity with respect to socio-cultural as well as economic changes. The methodology will include a process to understand the evolution and the narratives behind it, encompassing the diverse community and its reflection, and will sum up the salient features of communication through the combination of visual and graphic that is relevant in the Indian context, trading its values to the global scenario.

Keywords: communication, culture, graphic, visual

Procedia PDF Downloads 275

4346 Morpheme Based Parts of Speech Tagger for Kannada Language

Authors: M. C. Padma, R. J. Prathibha

Abstract:

Parts of speech tagging is the process of assigning appropriate parts of speech tags to the words in a given text. The crucial information needed for tagging a word comes from its internal structure rather than from its neighboring words. The internal structure of a word comprises its morphological features and grammatical information. This paper presents a morpheme-based parts of speech tagger for the Kannada language. The proposed work uses a hierarchical tag set for assigning tags. The system is tested on Kannada words taken from the EMILLE corpus. Experimental results show that the performance of the proposed system is above 90%.

Keywords: hierarchical tag set, morphological analyzer, natural language processing, paradigms, parts of speech

Procedia PDF Downloads 296

4345 Emotion-Convolutional Neural Network for Perceiving Stress from Audio Signals: A Brain Chemistry Approach

Authors: Anup Anand Deshmukh, Catherine Soladie, Renaud Seguier

Abstract:

Emotion plays a key role in many applications like healthcare, to gather patients’ emotional behavior. Unlike typical ASR (Automated Speech Recognition) problems which focus on 'what was said', it is equally important to understand 'how it was said.' There are certain emotions which are given more importance due to their effectiveness in understanding human feelings. In this paper, we propose an approach that models human stress from audio signals. The research challenge in speech emotion detection is finding the appropriate set of acoustic features corresponding to an emotion. Another difficulty lies in defining the very meaning of emotion and being able to categorize it in a precise manner. Supervised Machine Learning models, including state-of-the-art Deep Learning classification methods, rely on the availability of clean and labelled data. One of the problems in affective computation is the limited amount of annotated data. The existing labelled emotion datasets are highly subject to the perception of the annotator. We address the first issue of feature selection by exploiting the use of traditional MFCC (Mel-Frequency Cepstral Coefficients) features in a Convolutional Neural Network. Our proposed Emo-CNN (Emotion-CNN) architecture treats speech representations in a manner similar to how CNNs treat images in a vision problem. Our experiments show that Emo-CNN consistently and significantly outperforms the popular existing methods over multiple datasets. It achieves 90.2% categorical accuracy on the Emo-DB dataset. We claim that Emo-CNN is robust to speaker variations and environmental distortions. The proposed approach achieves 85.5% speaker-dependent categorical accuracy on the SAVEE (Surrey Audio-Visual Expressed Emotion) dataset, beating the existing CNN-based approach by 10.2%. To tackle the second problem of subjectivity in stress labels, we use Lovheim’s cube, which is a 3-dimensional projection of emotions. Monoamine neurotransmitters are a type of chemical messenger in the brain that transmit signals on perceiving emotions. The cube aims at explaining the relationship between these neurotransmitters and the positions of emotions in 3D space. The learnt emotion representations from the Emo-CNN are mapped to the cube using three-component PCA (Principal Component Analysis), and this mapping is then used to model human stress. This proposed approach not only circumvents the need for labelled stress data but also complies with the psychological theory of emotions given by Lovheim’s cube. We believe that this work is the first step towards creating a connection between Artificial Intelligence and the chemistry of human emotions.

Keywords: deep learning, brain chemistry, emotion perception, Lovheim's cube

Procedia PDF Downloads 154

4344 The Effect of Perceived Parental Overprotection on Morality in College Students

Authors: Sunghyun Cho, Seung-Ah Lee

Abstract:

Parental overprotection is known to have negative effects on children’s development, such as low independence, immature emotion regulation, and immoral behaviors. This study investigated the effects of parental overprotection on Korean college students’ moral behaviors. In order to test the hypothesis that overprotected participants are more likely to show immoral behaviors in moral dilemma situations, we measured perceived parental overprotection using the Korean-Parental Overprotection Scale (K-POS), Helicopter Parenting Behaviors, and the Helicopter Parenting Instrument (HPI) for 200 college students. Participants’ level of morality was assessed using two types of online experimental tasks consisting of a word-searching puzzle and a visual perception task. Based on the level of perceived parental overprotection, 14 participants with high total scores on the overparenting scales and 14 participants with average total scores on the scales were assigned to a high perceived overparenting group and a control group, respectively. Results revealed that the high perceived overparenting group submitted significantly more untruthful answers than the control group in the visual perception task (t = 2.72, p < .05). However, there was no significant difference in immorality in the word-searching puzzle (t = 1.30, p > .05), yielding inconsistent results for the relationship between perceived parental overprotection and morality. These inconsistent results of the two tasks assessing morality may be because submitting untruthful answers in the word-searching puzzle evoked a larger sense of immorality than in the visual perception task. Thus, even the participants with high perceived overparenting seemingly tended not to submit immoral answers. Further implications and limitations of the study are discussed.

Keywords: college students, morality, overparenting, parental overprotection

Procedia PDF Downloads 181

4343 The Convolution Recurrent Network of Using Residual LSTM to Process the Output of the Downsampling for Monaural Speech Enhancement

Authors: Shibo Wei, Ting Jiang

Abstract:

Convolutional-recurrent neural networks (CRN) have achieved much success recently in the speech enhancement field. The common processing method is to use convolution layers to compress the feature space through multiple downsampling steps and then model the compressed features with an LSTM layer. Finally, the enhanced speech is obtained by deconvolution operations that integrate the global information of the speech sequence. However, the feature space compression process may cause a loss of information, so we propose to model the downsampling result of each step with a residual LSTM layer, then join it with the output of the deconvolution layer and feed both to the next deconvolution layer. In this way, we aim to better integrate the global information of the speech sequence. The experimental results show that the network model we introduce (RES-CRN) achieves better performance than both LSTM without residual connections and simply stacked LSTM in the original CRN, in terms of scale-invariant signal-to-noise ratio (SI-SNR), speech quality (PESQ), and intelligibility (STOI).
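To make the SI-SNR metric named in the abstract concrete, the following is a minimal sketch of one common definition: the estimate is projected onto the clean target, and the energy of that projection is compared with the energy of the residual. The function name `si_snr`, the `eps` stabiliser, and the synthetic signals are assumptions for illustration; the abstract does not state which exact formulation the authors evaluated with.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length 1-D signals."""
    n = len(target)
    if n == 0 or len(estimate) != n:
        raise ValueError("signals must be non-empty and of equal length")

    # Remove the mean so that a constant offset does not affect the score.
    mean_e = sum(estimate) / n
    mean_t = sum(target) / n
    e = [x - mean_e for x in estimate]
    t = [x - mean_t for x in target]

    # Project the estimate onto the target; the projection is the part of
    # the estimate explained by the clean signal, the rest counts as noise.
    scale = sum(a * b for a, b in zip(e, t)) / (sum(b * b for b in t) + eps)
    projection = [scale * b for b in t]
    residual = [a - p for a, p in zip(e, projection)]

    signal_energy = sum(p * p for p in projection)
    noise_energy = sum(r * r for r in residual)
    return 10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))


# Synthetic example: a sine "clean" signal plus a deterministic disturbance.
clean = [math.sin(0.05 * i) for i in range(400)]
disturbance = [math.sin(1.7 * i + 0.3) for i in range(400)]
lightly_degraded = [c + 0.05 * d for c, d in zip(clean, disturbance)]
heavily_degraded = [c + 0.5 * d for c, d in zip(clean, disturbance)]

print(si_snr(lightly_degraded, clean), si_snr(heavily_degraded, clean))
```

Because the estimate is projected onto the target before the ratio is taken, rescaling the enhanced signal leaves the score essentially unchanged, which is why the metric is called scale-invariant.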

Keywords: convolutional-recurrent neural networks, speech enhancement, residual LSTM, SI-SNR

Procedia PDF Downloads 201

4342 Temporal Characteristics of Human Perception to Significant Variation of Block Structures

Authors: Kuo-Cheng Liu

Abstract:

In recent research efforts, the structures of the image in the spatial domain have been successfully analyzed and used to deduce the visual masking for accurately estimating the visibility thresholds of the image. If the structural properties of the video sequence in the temporal domain are taken into account to estimate the temporal masking, an improvement in assessing spatio-temporal visibility thresholds can reasonably be expected. In this paper, the temporal characteristics of human perception of the change in block structures on the time axis are analyzed. The temporal characteristics of human perception are represented in terms of the significant variation in block structures for the analysis of the human visual system (HVS). Herein, the block structure in each frame is computed by combining the pattern masking and the contrast masking simultaneously. The contrast masking always overestimates the visibility thresholds of edge regions and underestimates those of texture regions, while the pattern masking is weak on a uniform background and strong on a complex background with spatial patterns. Considering the significant variation of block structures between successive frames, we extend the block structures of images in the spatial domain to those of video sequences in the temporal domain to analyze the relation between the inter-frame variation of structures and the temporal masking. Meanwhile, a subjective viewing test and a fair rating process are designed to evaluate the consistency of the temporal characteristics with the HVS under a specified viewing condition.

Keywords: temporal characteristic, block structure, pattern masking, contrast masking

Procedia PDF Downloads 413

4341 Detection of Clipped Fragments in Speech Signals

Authors: Sergei Aleinik, Yuri Matveev

Abstract:

In this paper a novel method for the detection of clipping in speech signals is described. It is shown that the new method has better performance than known clipping detection methods, is easy to implement, and is robust to changes in signal amplitude, size of data, etc. Statistical simulation results are presented.

Keywords: clipping, clipped signal, speech signal processing, digital signal processing

Procedia PDF Downloads 393

4340 Examining Predictive Coding in the Hierarchy of Visual Perception in the Autism Spectrum Using Fast Periodic Visual Stimulation

Authors: Min L. Stewart, Patrick Johnston

Abstract:

Predictive coding has been proposed as a general explanatory framework for understanding the neural mechanisms of perception. As such, an underweighting of perceptual priors has been hypothesised to underpin a range of differences in inferential and sensory processing in autism spectrum disorders. However, empirical evidence to support this has not been well established. The present study uses an electroencephalography paradigm involving changes of facial identity and person category (actors etc.) to explore how levels of autistic traits (AT) affect predictive coding at multiple stages in the visual processing hierarchy. The study uses a rapid serial presentation of faces, with hierarchically structured sequences involving both periodic and aperiodic repetitions of different stimulus attributes (i.e., person identity and person category) in order to induce contextual expectations relating to these attributes. It investigates two main predictions: (1) significantly larger and later neural responses to changes of expected visual sequences in high- relative to low-AT, and (2) significantly reduced neural responses to violations of contextually induced expectation in high- relative to low-AT. Preliminary frequency analysis data comparing high- and low-AT show greater and later event-related potentials (ERPs) in occipitotemporal and prefrontal areas in high-AT than in low-AT for periodic changes of facial identity and person category, but smaller ERPs over the same areas in response to aperiodic changes of identity and category. The research advances our understanding of how abnormalities in predictive coding might underpin aberrant perceptual experience in the autism spectrum. This is the first stage of a research project that will inform clinical practitioners in developing better diagnostic tests and interventions for people with autism.

Keywords: hierarchical visual processing, face processing, perceptual hierarchy, prediction error, predictive coding

Procedia PDF Downloads 111

4339 Automatic Landmark Selection Based on Feature Clustering for Visual Autonomous Unmanned Aerial Vehicle Navigation

Authors: Paulo Fernando Silva Filho, Elcio Hideiti Shiguemori

Abstract:

The selection of specific landmarks for an Unmanned Aerial Vehicle’s Visual Navigation system based on Automatic Landmark Recognition has a significant influence on the precision of the system’s estimated position. At the same time, manual selection of the landmarks does not guarantee a high recognition rate, which would also result in poor precision. This work aims to develop an automatic landmark selection method that takes the image of the flight area and identifies the best landmarks to be recognized by the Visual Navigation Landmark Recognition System. The criterion for selecting a landmark is based on features detected by ORB or AKAZE and on edge information for each possible landmark. Results have shown that the disposition of possible landmarks is quite different from human perception.

Keywords: clustering, edges, feature points, landmark selection, X-means

Procedia PDF Downloads 281

4338 Essential Factors of Risk Perception Crucial in Efficient Construction Management

Authors: Francis Edum-Fotwe, Tony Thorpe, Charles Afetornu

Abstract:

Risk perception informs how issues are responded to, whether in solving or overcoming a problem or in improving a situation. Risk perception is established to be affected by some key factors, reflected in the varying ways in which work is done as well as in the level of efficiency achieved. These factors would potentially influence risk perception to different extents. If these factors are said to determine risk perception, how does a change in any of them affect risk perception? Since the ability to address risk is influenced by risk perception, establishing and developing awareness of that perception should enable construction professionals to make viable decisions. Any act to improve the construction industry cannot be overemphasised, considering its contribution to national development. A questionnaire survey was conducted in Ghana to elicit data that measure risk perception and the essential factors, as well as the necessary demographics of the respondents, who are construction professionals. This study determines the sensitivity of the critical factors of risk perception. It uses the Relative Importance Index analysis tool to investigate the differential effect of these essential factors on risk perception, that is, whether a slight change in a factor makes a significant change in risk perception, having established that it is influenced by essential factors. The findings can lead to policy formation for employers on which factors to prioritise in order to improve the risk perception of employees. Another area in which this study can be useful is team formation for sensitive and complex projects where efficient risk management is critical.
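As an illustration of the Relative Importance Index mentioned above, here is a minimal sketch using the commonly cited form RII = ΣW / (A × N), where W are the respondents' ratings for a factor, A is the highest possible rating, and N is the number of respondents. The factor names and ratings below are hypothetical and are not taken from the study; the abstract does not report the scale or the factors used.

```python
def relative_importance_index(ratings, highest_weight=5):
    """Return RII = sum(ratings) / (highest_weight * number_of_ratings)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > highest_weight for r in ratings):
        raise ValueError("ratings must lie between 1 and highest_weight")
    return sum(ratings) / (highest_weight * len(ratings))


def rank_factors(responses, highest_weight=5):
    """Sort factors from highest to lowest RII."""
    scores = {
        factor: relative_importance_index(ratings, highest_weight)
        for factor, ratings in responses.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical 5-point Likert ratings from six respondents per factor.
example_responses = {
    "experience": [5, 4, 5, 4, 5, 4],
    "workload": [3, 3, 4, 2, 3, 3],
    "training": [4, 4, 3, 5, 4, 4],
}

for factor, score in rank_factors(example_responses):
    print(f"{factor}: {score:.3f}")
```

An RII close to 1 indicates that respondents rated the factor near the top of the scale, so ranking by RII gives a simple ordering of the factors by perceived importance.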

Keywords: construction industry, risk, risk management, risk perception

Procedia PDF Downloads 143

4337 Physiology of Temporal Lobe and Limbic System

Authors: Khaled A. Abdel-Sater

Abstract:

There are four areas of the temporal lobe. The primary auditory area (areas 41 and 42) is for the perception of auditory impulses. In the auditory association area (areas 22, 21, and 20), areas 21 and 20 are for the understanding and interpretation of auditory sensation, recognition of language, and long-term memories. Area 22, also called Wernicke’s area, is a sensory speech centre. It is for the interpretation of auditory and visual information, the formation of thoughts in the mind, and the choice of words to be used. Ideas and thoughts originate in it. The limbic system is a set of cortical and subcortical structures forming a ring around the brainstem. The cortical structures are the orbitofrontal area, subcallosal gyrus, cingulate gyrus, parahippocampal gyrus, and uncus. The subcortical structures are the hypothalamus, hippocampus, amygdala, septum, paraolfactory area, anterior nucleus of the thalamus, and portions of the basal ganglia. There are several physiological functions of the limbic system, including regulation of behavior, motivation, and emotion.

Keywords: limbic system, motivation, emotions, temporal lobe

Procedia PDF Downloads 201

4336 Developing an Intonation Labeled Dataset for Hindi

Authors: Esha Banerjee, Atul Kumar Ojha, Girish Nath Jha

Abstract:

This study aims to develop an intonation labeled database for Hindi. Although no single standard for prosody labeling exists in Hindi, researchers in the past have employed perceptual and statistical methods in literature to draw inferences about the behavior of prosody patterns in Hindi. Based on such existing research and largely agreed upon intonational theories in Hindi, this study attempts to develop a manually annotated prosodic corpus of Hindi speech data, which can be used for training speech models for natural-sounding speech in the future. 100 sentences ( 500 words) each for declarative and interrogative types have been labeled using Praat.

Keywords: speech dataset, Hindi, intonation, labeled corpus

Procedia PDF Downloads 200

4335 Distant Speech Recognition Using Laser Doppler Vibrometer

Authors: Yunbin Deng

Abstract:

Most existing applications of automatic speech recognition rely on cooperative subjects at a short distance from a microphone. Standoff speech recognition using microphone arrays can extend the subject-to-sensor distance somewhat, but it is still limited to only a few feet. As such, most deployed applications of standoff speech recognition are limited to indoor use at short range. Moreover, these applications require an air pathway between the subject and the sensor to achieve a reasonable signal-to-noise ratio. This study reports long-range (50 feet) automatic speech recognition experiments using a Laser Doppler Vibrometer (LDV) sensor. It shows that the LDV sensor modality can extend the speech acquisition standoff distance far beyond that of microphone arrays, to hundreds of feet. In addition, LDV enables 'listening' through windows for uncooperative subjects. This enables new capabilities in automatic audio and speech intelligence, surveillance, and reconnaissance (ISR) for law enforcement, homeland security and counter-terrorism applications. The Polytec LDV model OFV-505 is used in this study. To investigate the impact of different vibrating materials, five parallel LDV speech corpora, each consisting of 630 speakers, are collected from the vibrations of a glass window, a metal plate, a plastic box, a wood slate, and a concrete wall. These are common materials the application could encounter in daily life. These data were compared with their microphone counterpart to show the impact of the various materials on the spectrum of the LDV speech signal. State-of-the-art deep neural network modeling approaches are used to conduct continuous speaker-independent speech recognition on these LDV speech datasets. Preliminary phoneme recognition results using a time-delay neural network, bi-directional long short-term memory, and model fusion show great promise for using LDV for long-range speech recognition. To the author’s best knowledge, this is the first time an LDV has been reported for a long-distance speech recognition application.

Keywords: covert speech acquisition, distant speech recognition, DSR, laser Doppler vibrometer, LDV, speech intelligence surveillance and reconnaissance, ISR

Procedia PDF Downloads 179

4334 The Philippines’ War on Drugs: A Pragmatic Analysis on Duterte's Commemorative Speeches

Authors: Ericson O. Alieto, Aprillete C. Devanadera

Abstract:

The main objective of the study is to determine the dominant speech acts in five commemorative speeches of President Duterte. This study employed Speech Act Theory and discourse analysis to determine how the speech act features connote the pragmatic meaning of Duterte’s speeches. Identifying the speech acts is significant in elucidating the underlying message or the pragmatic meaning of the speeches. Of the 713 sentences or utterances in the speeches, assertives are the dominant speech act, with 208 occurrences or 29% of the corpus. They are followed by expressives with 177 occurrences (25%) and directives with 152 (21%), while commissives account for 104 occurrences (15%) and declaratives have the lowest share with only 72 (10%). These sentences, when uttered by Duterte, carry a certain power of language to move or influence people. Thus, the present study shows the fundamental message perceived by the listeners. Moreover, the frequent use of assertives and expressives not only explains the pragmatic message of the speeches but also reflects the personality of President Duterte.
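The shares of each speech act can be recomputed directly from the occurrence counts reported in the abstract. The short sketch below does this arithmetic, assuming the five categories are exhaustive; the function and variable names are illustrative only.

```python
def percentage_shares(counts):
    """Return each category's share of the total, rounded to whole percent."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("counts must not sum to zero")
    return {name: round(100 * n / total) for name, n in counts.items()}


# Occurrence counts as reported in the abstract.
speech_act_counts = {
    "assertive": 208,
    "expressive": 177,
    "directive": 152,
    "commissive": 104,
    "declarative": 72,
}

total_utterances = sum(speech_act_counts.values())
shares = percentage_shares(speech_act_counts)

print(total_utterances)
for act, share in shares.items():
    print(f"{act}: {share}%")
```

The counts sum to the 713 utterances stated in the abstract, which supports reading the five categories as a complete partition of the corpus.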

Keywords: commemorative speech, discourse analysis, Duterte, pragmatics

Procedia PDF Downloads 289

4333 Excitation Modeling for Hidden Markov Model-Based Speech Synthesis Based on Wavelet Analysis

Authors: M. Kiran Reddy, K. Sreenivasa Rao

Abstract:

The conventional Hidden Markov Model (HMM)-based speech synthesis system (HTS) uses only a pulse excitation model, which differs significantly from the natural excitation signal. Hence, buzziness can be perceived in the speech generated using HTS. This paper proposes an efficient excitation modeling method that can significantly reduce the buzziness and improve the quality of HMM-based speech synthesis. The proposed approach models the pitch-synchronous residual frames extracted from the residual excitation signal. Each pitch-synchronous residual frame is parameterized using 30 wavelet coefficients. These 30 wavelet coefficients are found to accurately capture the perceptually important information present in the residual waveform. In the synthesis phase, the residual frames are reconstructed from the generated wavelet coefficients and are pitch-synchronously overlap-added to generate the excitation signal. The proposed excitation modeling method is integrated into an HMM-based speech synthesis system. Evaluation results indicate that the speech synthesized by the proposed excitation model is significantly better than the speech generated using state-of-the-art excitation modeling methods.

Keywords: excitation modeling, hidden Markov models, pitch-synchronous frames, speech synthesis, wavelet coefficients

Procedia PDF Downloads 249

4332 Text-to-Speech in Azerbaijani Language via Transfer Learning in a Low Resource Environment

Authors: Dzhavidan Zeinalov, Bugra Sen, Firangiz Aslanova

Abstract:

Most text-to-speech models cannot operate well in low-resource languages and require a great amount of high-quality training data to be considered good enough. Yet, with the improvements made in ASR systems, it is now easier than ever to collect data for the design of custom text-to-speech models. This paper outlines our use of an ASR model to collect data for building a viable text-to-speech system for one of the leading financial institutions of Azerbaijan. NVIDIA’s implementation of the Tacotron 2 model was utilized along with the HiFiGAN vocoder. The model was first trained with high-quality audio data collected from the Internet, then fine-tuned on the bank’s single-speaker call center data. The results were then evaluated by 50 different listeners and received a mean opinion score of 4.17, showing that our method is indeed viable. With this, we have successfully designed the first text-to-speech model in Azerbaijani and publicly shared 12 hours of audiobook data for everyone to use.

Keywords: Azerbaijani language, HiFiGAN, Tacotron 2, text-to-speech, transfer learning, whisper

Procedia PDF Downloads 45

4331 An Experimental Investigation of the Cognitive Noise Influence on the Bistable Visual Perception

Authors: Alexander E. Hramov, Vadim V. Grubov, Alexey A. Koronovskii, Maria K. Kurovskaya, Anastasija E. Runnova

Abstract:

The perception of visual signals in the brain was among the first issues discussed in terms of multistability, which has been introduced to provide mechanisms for information processing in biological neural systems. In this work, the influence of cognitive noise on the visual perception of multistable pictures has been investigated. The study includes an experiment with the bistable Necker cube illusion and the theoretical background explaining the obtained experimental results. In our experiments, Necker cubes with different wireframe contrast were shown repeatedly to different people, and the probability of choosing one of the cube's projections was calculated for each picture. The Necker cube was placed in the middle of a computer screen as black lines on a white background. The contrast of the three middle lines centered in the left middle corner was used as one of the control parameters. Between two successive presentations of Necker cubes, another picture was shown to distract attention and to make the perception of the next Necker cube more independent of the previous one. Eleven subjects, male and female, aged 20 to 45, were studied. The choice of the Necker cube projection was detected with the electroencephalograph-recorder Encephalan-EEGR-19/26, Medicom MTD. To treat the experimental results, we carried out a theoretical consideration using the simplest double-well potential model with noise, which led to the Fokker-Planck equation for the probability density of the stochastic process. For the first time, an analytical solution for the probability of selecting one of the Necker cube projections for different values of wireframe contrast has been obtained. Furthermore, using the results of the experimental measurements and the method of least squares, we have calculated the value of the parameter corresponding to the cognitive noise of the person being studied. The range of cognitive noise parameter values for the studied subjects turned out to be [0.08; 0.55]. It should be noted that the experimental results have good reproducibility: the same person studied again on another day produces very similar data with very close levels of cognitive noise. We found an excellent agreement between the analytically deduced probability and the results obtained in the experiment. A good qualitative agreement between theoretical and experimental results indicates that even such a simple model allows simulating brain cognitive dynamics and estimating an important cognitive characteristic of the brain, such as brain noise.
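The double-well picture described above can be sketched numerically. The example below is a simplified illustration under stated assumptions, not the authors' analytical solution: it assumes overdamped dynamics dx/dt = -U'(x) + ξ(t) with white noise of intensity D, a hypothetical tilted quartic potential U(x) = x⁴/4 - x²/2 - a·x in which the tilt `a` stands in for the contrast control parameter, and it uses the stationary Fokker-Planck density p(x) ∝ exp(-2U(x)/D) to estimate the probability of ending in the right-hand well. All names and parameter values are illustrative.

```python
import math


def potential(x, tilt):
    """Hypothetical tilted double-well potential (an assumption, see lead-in)."""
    return x ** 4 / 4.0 - x ** 2 / 2.0 - tilt * x


def right_well_probability(tilt, noise, x_max=4.0, steps=4000):
    """Probability mass at x > 0 under the stationary density exp(-2U/D)."""
    if noise <= 0:
        raise ValueError("noise intensity must be positive")
    if steps % 2 != 0:
        raise ValueError("steps must be even so that x = 0 lies on the grid")

    dx = 2.0 * x_max / steps
    xs = [-x_max + i * dx for i in range(steps + 1)]
    us = [potential(x, tilt) for x in xs]
    u_min = min(us)  # shift the potential to keep the exponentials well scaled
    weights = [math.exp(-2.0 * (u - u_min) / noise) for u in us]

    def trapezoid(values):
        # Composite trapezoidal rule on the uniform grid.
        return dx * (sum(values) - 0.5 * (values[0] + values[-1]))

    middle = steps // 2  # grid index of x = 0
    return trapezoid(weights[middle:]) / trapezoid(weights)


# Illustrative sweep: a small positive tilt at several noise intensities.
for d in (0.1, 0.3, 0.5):
    print(d, right_well_probability(tilt=0.05, noise=d))
```

In this sketch, a symmetric potential gives equal probability to both wells, while for a fixed tilt a larger noise intensity pulls the probability back towards one half. This is the qualitative mechanism by which a noise parameter could be fitted to measured choice probabilities with least squares.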

Keywords: bistability, brain, noise, perception, stochastic processes

Procedia PDF Downloads 445

4330 Hate Speech Detection Using Machine Learning: A Survey

Authors: Edemealem Desalegn Kingawa, Kafte Tasew Timkete, Mekashaw Girmaw Abebe, Terefe Feyisa, Abiyot Bitew Mihretie, Senait Teklemarkos Haile

Abstract:

Currently, hate speech is a growing challenge for society, individuals, policymakers, and researchers, as social media platforms make it easy to anonymously create and grow online friends and followers and provide an online forum for debate about specific issues of community life, culture, politics, and others. Despite this, research on identifying and detecting hate speech has not yet achieved satisfactory performance, which is why future research on this issue is constantly called for. This paper provides a systematic review of the literature in this field, with a focus on approaches such as word embedding techniques, machine learning, deep learning technologies, hate speech terminology, and other state-of-the-art technologies, along with their challenges. The review covers the last six years of literature from ResearchGate and Google Scholar. Furthermore, limitations, challenges in algorithm selection and use, challenges in data collection and cleaning, and future research directions are discussed in detail.

Keywords: Amharic hate speech, deep learning approach, hate speech detection review, Afaan Oromo hate speech detection

Procedia PDF Downloads 178

4329 The Connection Between the Semiotic Theatrical System and the Aesthetic Perception

Authors: Păcurar Diana Istina

Abstract:

The indissoluble link between aesthetics and semiotics, the harmonization and semiotic understanding of the interactions between the viewer and the object being looked at, are the basis of the practical demonstration of the importance of aesthetic perception within the theater performance. The design of a theater performance includes several structures, some considered from the beginning, art forms (i.e., the text), others being represented by simple, common objects (e.g., scenographic elements), which, if reunited, can trigger a certain aesthetic perception. The audience is delivered, by the team involved in the performance, a series of auditory and visual signs with which they interact. It is necessary to explain some notions about the physiological support of the transformation of different types of stimuli at the level of the cerebral hemispheres. The cortex considered the superior integration center of extransecal and entanged stimuli, permanently processes the information received, but even if it is delivered at a constant rate, the generated response is individualized and is conditioned by a number of factors. Each changing situation represents a new opportunity for the viewer to cope with, developing feelings of different intensities that influence the generation of meanings and, therefore, the management of interactions. In this sense, aesthetic perception depends on the detection of the “correctness” of signs, the forms of which are associated with an aesthetic property. Fairness and aesthetic properties can have positive or negative values. Evaluating the emotions that generate judgment and implicitly aesthetic perception, whether we refer to visual emotions or auditory emotions, involves the integration of three areas of interest: Valence, arousal and context control. 
In this context, superior human cognitive processes, memory, interpretation, learning, attribution of meanings, etc., help trigger the mechanism of anticipation and, no less important, the identification of error. This ability to locate a short circuit produced in a series of successive events is fundamental in the process of forming an aesthetic perception. Our main purpose in this research is to investigate the possible conditions under which aesthetic perception and its minimum content are generated by all these structures and, in particular, by interactions with forms that are not commonly considered aesthetic forms. In order to demonstrate the quantitative and qualitative importance of the categories of signs used to construct a code for reading a certain message, but also to emphasize the importance of the order of using these indices, we have structured a mathematical analysis that has at its core the analysis of the percentage of signs used in a theater performance.

Keywords: semiology, aesthetics, theatre semiotics, theatre performance, structure, aesthetic perception

Procedia PDF Downloads 91

4328 Automatic Assignment of Geminate and Epenthetic Vowel for Amharic Text-to-Speech System

Authors: Tadesse Anberbir, Felix Bankole, Tomio Takara, Girma Mamo

Abstract:

In the development of a text-to-speech synthesizer, automatic derivation of correct pronunciation from the grapheme form of a text is a central problem. In particular, deriving phonological features that are not shown in orthography is challenging. In the Amharic language, geminates and epenthetic vowels are crucial for proper pronunciation, but neither is shown in orthography. In this paper, we proposed and integrated a morphological analyzer into an Amharic Text-to-Speech system, mainly to predict geminates and epenthetic vowel positions, and prepared a duration modeling method. The Amharic Text-to-Speech system (AmhTTS) is a parametric and rule-based system that adopts a cepstral method and uses a source-filter model for speech production and a Log Magnitude Approximation (LMA) filter as the vocal tract filter. The naturalness of the system after employing the duration modeling was evaluated by a sentence listening test, and we achieved an average Mean Opinion Score (MOS) of 3.4 (68%), which is moderate. By modeling the duration of geminates and controlling the locations of epenthetic vowels, we are able to synthesize good quality speech. Our system is mainly suitable to be customized for other Ethiopian languages with limited resources.

Keywords: Amharic, gemination, speech synthesis, morphology, epenthesis

Procedia PDF Downloads 87

4327 Systemic Functional Grammar Analysis of Barack Obama's Second Term Inaugural Speech

Authors: Sadiq Aminu, Ahmed Lamido

Abstract:

This research studies Barack Obama’s second inaugural speech using Halliday’s Systemic Functional Grammar (SFG). SFG is a text grammar which describes how language is used, so that the meaning of the text can be better understood. The primary source of data in this research work is Barack Obama’s second inaugural speech, which was obtained from the internet. The analysis of the speech was based on the ideational and textual metafunctions of Systemic Functional Grammar. Specifically, the researcher analyses the Process Types and Participants (ideational) and the Theme/Rheme (textual). It was found that the material process (process of doing) was the most frequently used ‘Process type’ and that ‘We’, which refers to the people of America, was the most frequently used ‘Theme’. Application of the SFG theory, therefore, gives a better understanding of the meaning of Barack Obama’s speech.

Keywords: ideational, metafunction, rheme, textual, theme

Procedia PDF Downloads 160

4326 An Automatic Speech Recognition Tool for the Filipino Language Using the HTK System

Authors: John Lorenzo Bautista, Yoon-Joong Kim

Abstract:

This paper presents the development of a Filipino speech recognition tool using the HTK System. The system was trained on a subset of the Filipino Speech Corpus developed by the DSP Laboratory of the University of the Philippines-Diliman. The speech corpus was used for both training and testing the system by estimating the parameters for phonetic HMM-based (Hidden-Markov Model) acoustic models. Experiments on different mixture-weights were incorporated in the study. The phoneme-level word-based recognition of a 5-state HMM resulted in an average accuracy rate of 80.13% for a single-Gaussian mixture model, 81.13% after implementing a phoneme-alignment, and 87.19% for the increased Gaussian-mixture weight model. The highest accuracy rate of 88.70% was obtained from a 5-state model with 6 Gaussian mixtures.

Keywords: Filipino language, Hidden Markov Model, HTK system, speech recognition

Procedia PDF Downloads 480

4325 Cinematic Liberty vs. Offending Social, Religious Beliefs: With Special Reference to the Controversial Contents in Cinema and Print Media

Authors: Govind Ji Pandey

Abstract:

Divergent opinions in a society are important for its development, but with reasonable restrictions. The world recently witnessed one of the most violent protests by a group against the editor and publisher of the magazine ‘Charlie Hebdo’ for publishing a cartoon of their religious leader. Supporters of freedom of speech and expression around the world were in shock and termed it the strongest attack against free speech. People all around the world condemned the killing of the journalists, but many soft voices from several corners also called for reasonable restrictions on the freedom of speech and expression. Of late, Indian society has witnessed many protests against, and much support for, films with controversial content. It is the beauty of Indian democracy that it gives everyone an opportunity for discussion and debate on any issue that challenges established social norms. However, many organizations as well as individuals misuse it for their personal benefit. Many film directors have faced protest from several quarters for their controversial themes. This research aims at analyzing the controversial content published in print media and shown in films. To understand the nature and frequency of such media reports, the content analysis technique is used. The research also highlights the perception of the public regarding the controversies. To gather popular opinion on the coverage of controversial content in cinema and print media, five hundred people from Lucknow, UP, India were randomly selected. The findings of this research are important for understanding the response of media and society towards the controversial content presented in cinema and print media. The research highlights how a handful of people curb free speech in a democratic country like India.

Keywords: cinema, censor board, free speech, liberty, social-religious beliefs

Procedia PDF Downloads 264

4324 A Comparison of Anger State and Trait Anger Among Adolescents with and without Visual Impairment

Authors: Sehmus Aslan, Sibel Karacaoglu, Cengiz Sevgin, Ummuhan Bas Aslan

Abstract:

Objective: Anger expression style is an important moderator of the effects on the person and the person’s environment. Anger and anger expression have become important constructs in identifying individuals at high risk for psychological difficulties. To our knowledge, there is no information about anger and anger expression of adolescents with visual impairment. The aim of this study was to compare anger and anger expression among adolescents with and without visual impairment. Methods: Thirty-eight adolescents with visual impairment (18 female, 20 male) and 44 adolescents without visual impairment (22 female, 24 male), 84 adolescents in total, aged 12 to 15 years, participated in the study. Anger and anger expression of the participants were assessed with the State-Trait Anger Scale (STAS). STAS, a self-report questionnaire, is designed to measure the experience and expression of anger. STAS has four subscales: continuous anger, anger in, anger out and anger control. Reliability and validity of the STAS have been well established among adolescents. The Mann-Whitney U test was used for statistical analysis. Results: No significant differences were found in the scores of continuous anger and anger out between adolescents with and without visual impairment (p > 0.05). On the other hand, there were differences in the scores of anger control and anger in between adolescents with and without visual impairment (p < 0.05). The anger control scores of adolescents with visual impairment were higher than those of adolescents without visual impairment. Meanwhile, the adolescents with visual impairment had lower anger in scores than adolescents without visual impairment. Conclusions: The results of this study suggest that there is no difference in anger level between adolescents with and without visual impairment, whereas there is a difference in anger expression.

Keywords: adolescent, anger, impaired, visual

Procedia PDF Downloads 414

4323 Automatic Speech Recognition Systems Performance Evaluation Using Word Error Rate Method

Authors: João Rato, Nuno Costa

Abstract:

Human verbal communication is a two-way process that requires mutual understanding. This kind of communication, also called dialogue, can take place not only between human agents but also between human agents and machines. Interaction between humans and machines by means of a natural language plays an important role in improving communication between the two. To assess the performance of some speech recognition systems, this document presents the results of tests carried out according to the Word Error Rate evaluation method. It also provides information on man-machine communication systems. Conclusions were drawn regarding the speech recognition systems, among which is their poor performance in interpreting voice in noisy environments.
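
The abstract does not state how the Word Error Rate was computed. As a minimal sketch, assuming the standard definition (substitutions, deletions and insertions counted by a word-level edit distance, divided by the number of reference words), it could look like the following. The function name `word_error_rate` and the whitespace tokenization are illustrative assumptions, not details taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace and compared exactly
    (no case folding or punctuation stripping).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

Under this definition a perfect transcript scores 0.0, and the value can exceed 1.0 when the recognizer inserts many extra words.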

Keywords: automatic speech recognition, man-machine conversation, speech recognition, spoken dialogue systems, word error rate

Procedia PDF Downloads 322

4322 Visual Improvement with Low Vision Aids in Children with Stargardt’s Disease

Authors: Anum Akhter, Sumaira Altaf

Abstract:

Purpose: To study the effect of low vision devices, i.e., telescopes and magnifying glasses, on distance visual acuity and near visual acuity of children with Stargardt’s disease. Setting: Low vision department, Alshifa Trust Eye Hospital, Rawalpindi, Pakistan. Methods: 52 children with Stargardt’s disease were included in the study. All children were diagnosed by pediatric ophthalmologists. A comprehensive low vision assessment was done by me in the low vision clinic. Visual acuity was measured using the ETDRS chart. Refraction and other supplementary tests were performed. Children with Stargardt’s disease were provided with different telescopes and magnifying glasses for improving far vision and near vision. Results: Out of 52 children, 17 were male and 35 were female. Distance visual acuity and near visual acuity improved significantly with the low vision aid trial. All children showed visual acuity better than 6/19 with a telescope of higher magnification. Improvement in near visual acuity was also significant with the magnifying glasses trial. Conclusions: Low vision aids are useful for improving visual acuity in children. Children with Stargardt’s disease who have problems in education and daily life activities can get help from low vision aids.

Keywords: Stargardt’s disease, low vision aids, telescope, magnifiers

Procedia PDF Downloads 539

4321 English Vowel Duration Affected by Voicing Contrast: A Cross Linguistic Examination of L2 English Production and Perception by Asian Learners of English

Authors: Nguyen Van Anh Le, Mafuyu Kitahara

Abstract:

It is widely acknowledged that in several languages, such as English, vowels are longer before voiced consonants than before voiceless ones. However, in Mandarin Chinese, Vietnamese, Japanese, and Korean, the distribution of voiced-voiceless stop contrasts and long-short vowel differences is vastly different from English. The purpose of this study is to determine whether these targeted learners' L2 English production and perception change in terms of vowel duration as a function of stop voicing. The production measurements in the database of Asian learners revealed a different effect from the one observed in native speakers. There were no evident vowel lengthening patterns. The results of the perceptual experiment with 24 participants indicated that individuals tended to prefer voiceless stops when preceding vowels were shortened, but there was no statistically significant difference between intermediate, upper-intermediate, and advanced-level learners. However, learners demonstrated distinct perceptual patterns for various vowels and stops. The findings have valuable implications for L2 English speech acquisition.

Keywords: voiced/voiceless stops, preceding vowel duration, voiced/voiceless perception, L2 English, L1 Mandarin Chinese, L1 Vietnamese, L1 Japanese, L1 Korean

Procedia PDF Downloads 105

4320 Velocity Profiles of Vowel Perception by Javanese and Sundanese English Language Learners

Authors: Arum Perwitasari

Abstract:

Learning L2 sounds is influenced by the first language (L1) sound system. The current study seeks to examine how listeners with a different L1 vowel system perceive L2 sounds. The fact that English has a larger vowel inventory than Javanese and Sundanese might cause problems for Javanese and Sundanese English language learners perceiving English sounds. To reveal the L2 sound perception over time, we measured the mouse trajectories related to the hand movements made by learners whose L1 is Javanese or Sundanese, two of Indonesia's local languages. Do the Javanese and Sundanese listeners show higher velocity than the English listeners when they perceive English vowels which are similar and new to their L1 system? The study aims to map the patterns of real-time processing through compatible hand movements to reveal any uncertainties when making selections. The results showed that the Javanese listeners exhibited significantly slower velocity values than the English listeners for similar vowels /I, ɛ, ʊ/ in the 826-1200 ms post-stimulus window. Unlike the Javanese, the Sundanese listeners showed slow velocity values except for similar vowel /ʊ/. For the perception of new vowels /i:, æ, ɜ:, ʌ, ɑː, u:, ɔ:/, the Javanese listeners showed slower velocity in making the lexical decision. In contrast, the Sundanese listeners showed slow velocity only for vowels /ɜ:, ɔ:, æ, I/, indicating that these vowels are hard to perceive. Our results fit well with the second language model representing how the L1 vowel system influences the L2 sound perception.
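
The abstract does not describe how velocity was derived from the mouse trajectories. As a rough sketch, assuming each trajectory is a list of time-stamped cursor positions, speed between consecutive samples can be taken as distance over elapsed time and then averaged over a post-stimulus window such as the one mentioned above. The sample format and the function names `velocity_profile` and `mean_speed_in_window` are hypothetical, introduced only for illustration.

```python
import math


def velocity_profile(samples):
    """Return (time_ms, speed) pairs for consecutive cursor samples.

    Assumption: `samples` is a list of (time_ms, x, y) tuples with strictly
    increasing timestamps; speed is in position units per millisecond and is
    stamped with the time of the later sample.
    """
    profile = []
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        distance = math.hypot(x1 - x0, y1 - y0)
        profile.append((t1, distance / dt))
    return profile


def mean_speed_in_window(profile, start_ms, end_ms):
    """Average speed over samples whose time lies in [start_ms, end_ms]."""
    speeds = [speed for t, speed in profile if start_ms <= t <= end_ms]
    if not speeds:
        return None  # no samples fell inside the window
    return sum(speeds) / len(speeds)
```

A lower mean speed in a given window would then correspond to the slower, more hesitant movements the abstract associates with vowels that are hard to perceive.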

Keywords: velocity profiles, EFL learners, speech perception, experimental linguistics

Procedia PDF Downloads 217

4319 Status of Communication and Swallowing Therapy in Patient with a Tracheostomy

Authors: Ya-Hui Wang

Abstract:

A lower rate of speech therapy among tracheostomized patients was noted in comparison with previous research. This study aims to shed light on the referral status of speech therapy for those patients in Taiwan. This study developed an analysis of the size and key characteristics of the population of tracheostomized in-patients in Taiwan. Method: We analyzed National Healthcare Insurance data (The Collaboration Center of Health Information Application, CCHIA) from Jan 1 2010 to Dec 31 2010. Results: Above age 3, the number of tracheostomized in-patients is directly proportional to age. A high service load was observed in the North region in comparison with other regions. Only 4.87% of the tracheostomized in-patients were referred for speech therapy, 1.9% for swallow examination, and 2.5% for communication evaluation.

Keywords: refer, speech therapy, training, rehabilitation

Procedia PDF Downloads 440

4318 The Effect of Symmetry on the Perception of Happiness and Boredom in Design Products

Authors: Michele Sinico

Abstract:

The present research investigates the effect of symmetry on the perception of happiness and boredom in design products. Three experiments were carried out in order to verify the degree of the visual expressive value on different models of bookcases, wall clocks, and chairs. 60 participants directly indicated the degree of happiness and boredom using 7-point rating scales. The findings show that the participants acknowledged a different value of expressive quality in the different product models. Results show also that symmetry is not a significant constraint for an emotional design project.

Keywords: product experience, emotional design, symmetry, expressive qualities

Procedia PDF Downloads 147