Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 669

Search results for: voice pitch

639 Comparing Emotion Recognition from Voice and Facial Data Using Time Invariant Features

Authors: Vesna Kirandziska, Nevena Ackovska, Ana Madevska Bogdanova

Abstract:

Emotion recognition is a challenging problem that remains open from the perspective of both intelligent systems and psychology. In this paper, both voice features and facial features are used to build an emotion recognition system. Support Vector Machine classifiers are built using raw data from video recordings. The results obtained for emotion recognition are given, and a discussion of the validity and expressiveness of different emotions is presented. A comparison is made between the classifiers built from facial data only, from voice data only, and from the combination of both. The need for a better combination of the information from facial expression and voice data is argued.

Keywords: emotion recognition, facial recognition, signal processing, machine learning

Procedia PDF Downloads 290

638 SLIITBOT: Design of a Socially Assistive Robot for SLIIT

Authors: Chandimal Jayawardena, Ridmal Mendis, Manoji Tennakoon, Theekshana Wijayathilaka, Randima Marasinghe

Abstract:

This research paper defines the research area for the implementation of a socially assistive robot (SLIITBOT). It covers the overall process implemented within the robot’s system and its limitations, along with a literature survey. The project develops a socially assistive robot called SLIITBOT that interacts with people within the university through voice outputs and a graphical user interface and benefits them with updates and tasks. The robot will be able to detect a person entering the room, navigate towards the position where the person is standing, welcome and greet that person with a simple conversation using its voice, introduce the services through its voice, and provide the person with services through an electronic input via an app while guiding the person with voice outputs.

Keywords: application, detection, dialogue, navigation

Procedia PDF Downloads 143

637 Prophylactic Replacement of Voice Prosthesis: A Study to Predict Prosthesis Lifetime

Authors: Anne Heirman, Vincent van der Noort, Rob van Son, Marije Petersen, Lisette van der Molen, Gyorgy Halmos, Richard Dirven, Michiel van den Brekel

Abstract:

Objective: Voice prosthesis leakage significantly impacts laryngectomized patients' quality of life, causing insecurity and frequent unplanned hospital visits and costs. In this study, the concept of prophylactic voice prosthesis replacement was explored to prevent leakages. Study Design: A retrospective cohort study. Setting: Tertiary hospital. Methods: Device lifetimes and voice prosthesis replacements of a retrospective cohort, including all patients with laryngectomies between 2000 and 2012 in the Netherlands Cancer Institute, were used to calculate the number of needed voice prostheses per patient per year when preventing 70% of the leakages by prophylactic replacement. Various strategies for the timing of prophylactic replacement were considered: adaptive strategies based on the individual patient’s history of replacement and fixed strategies based on the results of patients with similar voice prosthesis or treatment characteristics. Results: Patients used a median of 3.4 voice prostheses per year (range 0.1-48.1). We found a high inter- and intrapatient variability in device lifetime. When applying prophylactic replacement, this would become a median of 9.4 voice prostheses per year, which means replacement every 38 days, implying more than six additional voice prostheses per patient per year. The individual adaptive model showed that preventing 70% of the leakages was impossible for most patients, and only a median of 25% could be prevented. Monte-Carlo simulations showed that prophylactic replacement is not feasible due to the high Coefficient of Variation (Standard Deviation/Mean) in device lifetime. Conclusion: Based on our simulations, prophylactic replacement of voice prostheses is not feasible due to high inter- and intrapatient variation in device lifetime.
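The arithmetic behind these figures can be sketched briefly. The snippet below is a minimal illustration, not the authors' simulation: it converts a yearly replacement count into an average interval in days and computes the coefficient of variation (standard deviation divided by mean) named in the abstract. The function names and the example lifetimes are hypothetical, and the use of the population standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
import statistics

def replacement_interval_days(prostheses_per_year, days_per_year=365.0):
    """Average number of days between replacements for a given yearly count."""
    return days_per_year / prostheses_per_year

def coefficient_of_variation(lifetimes):
    """Standard deviation divided by mean (population SD is an assumption here)."""
    return statistics.pstdev(lifetimes) / statistics.mean(lifetimes)

# Yearly medians quoted in the abstract.
print(replacement_interval_days(9.4))  # between 38 and 39 days
print(replacement_interval_days(3.4))

# Hypothetical device lifetimes in days for one patient (illustrative only).
example_lifetimes = [20, 45, 110, 15, 200, 60]
print(coefficient_of_variation(example_lifetimes))
```

A large coefficient of variation means that any fixed replacement interval is either too late for the short-lived devices or wastefully early for the long-lived ones, which is the intuition behind the abstract's conclusion.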

Keywords: voice prosthesis, voice rehabilitation, total laryngectomy, prosthetic leakage, device lifetime

Procedia PDF Downloads 99

636 Functional Outcome of Speech, Voice and Swallowing Following Excision of Glomus Jugulare Tumor

Authors: B. S. Premalatha, Kausalya Sahani

Abstract:

Background: Glomus jugulare tumors arise within the jugular foramen and are commonly seen in females, particularly on the left side. Surgical excision of the tumor may cause lower cranial nerve deficits. Cranial nerve involvement produces hoarseness of voice, slurred speech, and dysphagia along with other physical symptoms, thereby affecting the quality of life of individuals. Though oncological clearance is mainly emphasized while treating these individuals, little importance is given to their communication, voice and swallowing problems, which play a crucial part in daily functioning. Objective: To examine the outcomes of voice, speech and swallowing functions of the subjects following excision of glomus jugulare tumor. Methods: Two female subjects, aged 56 and 62 years, who presented with complaints of change in voice, inability to swallow and reduced clarity of speech following surgery for left glomus jugulare tumor, participated in the study. Their surgical information revealed multiple cranial nerve palsies involving the left facial, left superior and recurrent branches of the vagus nerve, left pharyngeal, left soft palate, left hypoglossal and vestibular nerves. Functional outcomes of voice, speech and swallowing were evaluated by perceptual and objective assessment procedures. Assessment included the examination of oral structures and functions, dysarthria by the Frenchay Dysarthria Assessment, cranial nerve functions and swallowing functions. MDVP and Dr. Speech software were used to evaluate acoustic parameters of voice and quality of voice respectively. Results: The study revealed that both subjects, subsequent to excision of glomus jugulare tumor, showed a varied picture of affected oral structure and functions, articulation, voice and swallowing functions. The cranial nerve assessment showed impairment of the vagus, hypoglossal, facial and glossopharyngeal nerves.
Voice examination indicated vocal cord paralysis associated with breathy quality of voice, weak voluntary cough, reduced pitch and loudness range, and poor respiratory support. Perturbation parameters such as jitter and shimmer were affected, along with the s/z ratio, indicative of vocal fold pathology. Reduced MPD (Maximum Phonation Duration) of vowels indicated disturbed coordination between the respiratory and laryngeal systems. Hypernasality was found to be a prominent feature which reduced speech intelligibility. Imprecise articulation was seen in both subjects, as the hypoglossal nerve was affected following surgery. Injury to the vagus, hypoglossal, glossopharyngeal and facial nerves disturbed the function of swallowing. All the phases of swallowing were affected. Aspiration was observed before and during the swallow, confirming oropharyngeal dysphagia. All the subsystems were affected as per the Frenchay Dysarthria Assessment, signifying the diagnosis of flaccid dysarthria. Conclusion: There is an observable communication and swallowing difficulty following excision of glomus jugulare tumor. Even with complete resection, extensive rehabilitation may be necessary due to significant lower cranial nerve dysfunction. The findings of the present study stress the need for involvement of a speech and swallowing therapist for pre-operative counseling and assessment of functional outcomes.
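For readers unfamiliar with the perturbation measures named above, the following is a minimal sketch of one common way to compute them. It assumes the "local" definition of jitter and shimmer (mean absolute cycle-to-cycle difference divided by the mean value) and the usual definition of the s/z ratio (longest sustained /s/ divided by longest sustained /z/). The exact parameters reported by MDVP or Dr. Speech may be defined differently, and all numbers below are hypothetical.

```python
def relative_perturbation(values):
    """Mean absolute difference between consecutive cycles, divided by the mean.

    Applied to glottal cycle periods this gives local jitter; applied to
    cycle peak amplitudes it gives local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))

def s_z_ratio(s_seconds, z_seconds):
    """Longest sustained /s/ duration divided by longest sustained /z/ duration."""
    return s_seconds / z_seconds

# Hypothetical measurements (illustrative only).
periods = [0.0100, 0.0103, 0.0099, 0.0104, 0.0101]  # seconds per cycle
amplitudes = [0.80, 0.76, 0.83, 0.78, 0.81]         # arbitrary units
print(relative_perturbation(periods))      # jitter as a fraction
print(relative_perturbation(amplitudes))   # shimmer as a fraction
print(s_z_ratio(14.0, 9.0))
```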

Keywords: functional outcome, glomus jugulare tumor excision, multiple cranial nerve impairment, speech and swallowing

Procedia PDF Downloads 228

635 Features Dimensionality Reduction and Multi-Dimensional Voice-Processing Program to Parkinson Disease Discrimination

Authors: Djamila Meghraoui, Bachir Boudraa, Thouraya Meksen, M. Boudraa

Abstract:

Parkinson's disease is a pathology that involves characteristic perturbations in patients’ voices. This paper describes a proposed method that aims to diagnose persons with Parkinson's (PWP) by analyzing their voice signals online. First, thresholds of signal alterations are determined by the Multi-Dimensional Voice Program (MDVP). Principal Component Analysis (PCA) is exploited to select the main voice principal components that are significantly affected in a patient. The decision phase is realized by a Multinomial Bayes (MNB) classifier that categorizes an analyzed voice into one of two resulting classes: healthy or PWP. The prediction accuracy achieved, reaching 98.8%, is very promising.

Keywords: Parkinson’s disease recognition, PCA, MDVP, multinomial Naive Bayes

Procedia PDF Downloads 250

634 A Self Organized Map Method to Classify Auditory-Color Synesthesia from Frontal Lobe Brain Blood Volume

Authors: Takashi Kaburagi, Takamasa Komura, Yosuke Kurihara

Abstract:

Absolute pitch is the ability to identify a musical note without a reference tone. Training for absolute pitch often occurs in preschool education. It is necessary to clarify how well the trainee can make use of synesthesia in order to evaluate the effect of the training. To the best of our knowledge, there are no existing methods for objectively confirming whether the subject is using synesthesia. Therefore, in this study, we present a method to distinguish the use of color-auditory synesthesia from the separate use of color and audition during absolute pitch training. This method measures blood volume in the prefrontal cortex using functional Near-infrared spectroscopy (fNIRS) and assumes that the cognitive step has two parts, a non-linear step and a linear step. For the linear step, we assume a second order ordinary differential equation. For the non-linear part, it is extremely difficult, if not impossible, to create an inverse filter of such a complex system as the brain. Therefore, we apply a method based on a self-organizing map (SOM) and are guided by the available data. The presented method was tested using 15 subjects, and the estimation accuracy is reported.

Keywords: absolute pitch, functional near-infrared spectroscopy, prefrontal cortex, synesthesia

Procedia PDF Downloads 239

633 Dynamic Modeling of Wind Farms in the Jeju Power System

Authors: Dae-Hee Son, Sang-Hee Kang, Soon-Ryul Nam

Abstract:

In this paper, we develop a dynamic model of wind farms in the Jeju power system in order to study their dynamic effects on that system. PSS/E is used to develop the dynamic model of a wind farm composed of 1.5-MW doubly fed induction generators. The output of a wind farm is regulated based on pitch angle control, in which the two controllable parameters are speed and power references. The simulation results confirm that the pitch angle is successfully controlled, regardless of the variation in wind speed and output regulation.

Keywords: dynamic model, Jeju power system, online limitation, pitch angle control, wind farm

Procedia PDF Downloads 296

632 A Computational Study of Very High Turbulent Flow and Heat Transfer Characteristics in Circular Duct with Hemispherical Inline Baffles

Authors: Dipak Sen, Rajdeep Ghosh

Abstract:

This paper presents a computational study of steady state three dimensional very high turbulent flow and heat transfer characteristics in a constant temperature-surfaced circular duct fitted with 900 hemispherical inline baffles. The computations are based on the realizable k-ɛ model with standard wall function using the finite volume method, and the SIMPLE algorithm has been implemented. Computational studies are carried out for Reynolds number, Re, ranging from 80000 to 120000, Prandtl number, Pr, of 0.73, and pitch ratios, PR, of 1, 2, 3, 4 and 5 based on the hydraulic diameter of the channel, hydrodynamic entry length, thermal entry length and the test section. Ansys Fluent 15.0 software has been used to solve the flow field. The study reveals that a circular pipe with baffles has a higher Nusselt number and friction factor than a smooth circular pipe without baffles. Maximum Nusselt number and friction factor are obtained for PR=5 and PR=1 respectively. Nusselt number increases as pitch ratio increases in the range of study; however, friction factor decreases up to PR 3, after which it becomes almost constant up to PR 5. Thermal enhancement factor increases with increasing pitch ratio but decreases slightly with Reynolds number in the range of study and becomes almost constant at higher Reynolds number. The computational results reveal that the optimum thermal enhancement factor of 900 inline hemispherical baffle is about 1.23 for pitch ratio 5 at Reynolds number 120000. It also shows that the optimum pitch ratio for which the baffles can be installed in such very high turbulent flows should be 5. Results show that pitch ratio and Reynolds number play an important role in both fluid flow and heat transfer characteristics.

Keywords: friction factor, heat transfer, turbulent flow, circular duct, baffle, pitch ratio

Procedia PDF Downloads 345

631 The Phonology and Phonetics of Second Language Intonation in Case of “Downstep”

Authors: Tayebeh Norouzi

Abstract:

This study aims to investigate the acquisition process of intonation. It examines the intonation structure of Tokyo Japanese and its realization by Iranian learners of Japanese. Seven Iranian learners of Japanese, differing in fluency, and two Japanese speakers participated in the experiment. Two sentences were used to test the phonological and phonetic characteristics of lexical pitch-accent as well as the intonation patterns produced by the speakers. Both sentences consisted of similar words with the same number of syllables and lexical pitch-accents but different syntactic structure. Speakers were asked to read each sentence three times at normal speed, and the data were analyzed by Praat. The results show that lexical pitch-accent, Accentual Phrase (AP) and AP boundary tone realization vary depending on sentence type. For sentences of type XdeYwo, the lexical pitch-accent is realized properly. However, there is a rise in AP boundary tone regardless of speakers’ level of fluency. In contrast, in sentences of type XnoYwo, the lexical pitch-accent and AP boundary tone vary depending on the speakers’ fluency level. Advanced speakers are better at grouping words into phrases and produce more native-like intonation patterns, though they are not able to realize downstep properly. The non-native speakers tried to realize proper intonation patterns by making changes in lexical accent and boundary tone.

Keywords: intonation, Iranian learners, Japanese prosody, lexical accent, second language acquisition

Procedia PDF Downloads 127

630 Performance Analysis of Solar Air Heater with Fins and Perforated Twisted Tape Insert

Authors: Rajesh Kumar, Prabha Chand

Abstract:

The present paper presents an analytical investigation of the thermal and thermo-hydraulic performance of a solar air collector fitted with fins and perforated twisted tapes (PTT) of twist ratio 2 with different axial pitch ratios. The mathematical models are presented, and the effect of mass flow rate and axial pitch ratio on the thermal and effective efficiency is discussed. The results obtained are compared with the results of the solar air heater without fins and twisted tapes. The results show that the collectors with fins and perforated twisted tape perform better, but at the expense of increased pressure drop. Also, twisted tape with the minimum axial pitch ratio is found to be more efficient than the others.

Keywords: solar air heater, thermal efficiency, twisted tape, twist ratio

Procedia PDF Downloads 233

629 Numerical Simulation of Turbulent Flow around Two Cam Shaped Cylinders in Tandem Arrangement

Authors: Arash Mir Abdolah Lavasani, M. Ebrahimisabet

Abstract:

In this paper, the 2-D unsteady viscous flow around two cam shaped cylinders in tandem arrangement is numerically simulated in order to study the characteristics of the flow in turbulent regimes. The investigation covers the effects of high subcritical and supercritical Reynolds numbers and L/D ratio on total drag coefficient. The equivalent diameter of the cylinders is 27.6 mm. The center-to-center spacing of the two cam shaped cylinders is defined as the longitudinal pitch ratio, and it varies in the range 1.5 < L/D < 6. The Reynolds number based on the equivalent circular cylinder varies in the range 27×10³ < Re < 166×10³. Results show that the drag coefficient of both cylinders depends on the pitch ratio. However, the drag coefficient of the downstream cylinder is more dependent on the pitch ratio.

Keywords: cam shaped, tandem, numerical, drag coefficient, turbulent

Procedia PDF Downloads 440

628 Analysis of Vocal Fold Vibrations from High-Speed Digital Images Based on Dynamic Time Warping

Authors: A. I. A. Rahman, Sh-Hussain Salleh, K. Ahmad, K. Anuar

Abstract:

Analysis of vocal fold vibration is essential for understanding the mechanism of voice production and for improving clinical assessment of voice disorders. This paper presents a Dynamic Time Warping (DTW) based approach to analyze and objectively classify vocal fold vibration patterns. The proposed technique was designed and implemented on a Glottal Area Waveform (GAW) extracted from high-speed laryngeal images by delineating the glottal edges for each image frame. Feature extraction from the GAW was performed using Linear Predictive Coding (LPC). Several types of voice reference templates from simulations of clear, breathy, fry, pressed and hyperfunctional voice productions were used. The patterns of the reference templates were first verified using the analytical signal generated through Hilbert transformation of the GAW. Samples from normal speakers’ voice recordings were then used to evaluate and test the effectiveness of this approach. The classification of the voice patterns using LPC and DTW gave an accuracy of 81%.
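The template-matching idea described above can be sketched with the textbook DTW recurrence. This is a minimal illustration under stated assumptions, not the authors' implementation: it compares plain scalar sequences rather than LPC feature vectors of a GAW, it uses absolute difference as the local cost, and the template values and labels are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # step in a only
                                     cost[i][j - 1],      # step in b only
                                     cost[i - 1][j - 1])  # step in both
    return cost[n][m]

def classify(sample, templates):
    """Return the label of the reference template closest to the sample."""
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))

# Hypothetical reference templates (illustrative only).
templates = {
    "clear": [0.0, 1.0, 0.0, 1.0],
    "breathy": [0.0, 0.2, 0.0, 0.2],
}
print(classify([0.0, 1.0, 1.0, 0.0, 1.0], templates))
```

Because DTW allows one sequence to be stretched against the other, a sample that holds a value for an extra frame still matches its template with zero cost, which is why it suits vibration cycles of varying length.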

Keywords: dynamic time warping, glottal area waveform, linear predictive coding, high-speed laryngeal images, Hilbert transform

Procedia PDF Downloads 212

627 Pitch Processing in Autistic Mandarin-Speaking Children with Hypersensitivity and Hypo-Sensitivity: An Event-Related Potential Study

Authors: Kaiying Lai, Suiping Wang, Luodi Yu, Yang Zhang, Pengmin Qin

Abstract:

Abnormalities in auditory processing are among the most commonly reported sensory processing impairments in children with Autism Spectrum Disorder (ASD). Tonal language speakers with autism have enhanced neural sensitivity to pitch changes in pure tones. However, not all children with ASD exhibit the same performance in pitch processing, owing to differences in auditory sensitivity. The current study aimed to examine auditory change detection in children with ASD who differ in auditory sensitivity. The K-means clustering method was adopted to classify ASD participants into two groups according to the auditory processing scores of the Sensory Profile. Eleven autistic children with hypersensitivity (mean age = 11.36; SD = 1.46) and 18 with hypo-sensitivity (mean age = 10.64; SD = 1.89) participated in a passive auditory oddball paradigm designed to elicit mismatch negativity (MMN) under the pure tone condition. Results revealed that, compared to the hypersensitive group, the children with hypo-sensitivity showed smaller MMN responses to pure tone stimuli. These results suggest that children with ASD with auditory hypersensitivity and hypo-sensitivity process pure tones differently, so neural responses to pure tones hold promise for predicting auditory sensitivity in ASD and for targeted treatment in children with ASD.
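The grouping step can be illustrated with a small one-dimensional K-means sketch. It is a hedged illustration only: the abstract does not describe the clustering implementation or its initialisation, the scores below are hypothetical rather than Sensory Profile data, and the deterministic initialisation is an assumption made here for reproducibility.

```python
def kmeans_1d(scores, k=2, iterations=100):
    """Cluster scalar scores into k groups; returns (centres, labels)."""
    ordered = sorted(scores)
    # Deterministic start (an assumption): spread centres across the sorted scores.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for s in scores:
            nearest = min(range(k), key=lambda c: abs(s - centres[c]))
            clusters[nearest].append(s)
        # Keep the old centre if a cluster happens to be empty.
        updated = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
        if updated == centres:
            break
        centres = updated
    labels = [min(range(k), key=lambda c: abs(s - centres[c])) for s in scores]
    return centres, labels

# Hypothetical auditory processing scores (illustrative only).
centres, labels = kmeans_1d([12, 14, 13, 30, 28, 31, 29])
print(centres, labels)
```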

Keywords: ASD, sensory profile, pitch processing, mismatch negativity, MMN

Procedia PDF Downloads 350

626 Wolof Voice Response Recognition System: A Deep Learning Model for Wolof Audio Classification

Authors: Krishna Mohan Bathula, Fatou Bintou Loucoubar, FNU Kaleemunnisa, Christelle Scharff, Mark Anthony De Castro

Abstract:

Voice recognition algorithms such as automatic speech recognition and text-to-speech systems for African languages can play an important role in bridging the digital divide of Artificial Intelligence in Africa, contributing to the establishment of a fully inclusive information society. This paper proposes a Deep Learning model that can classify user responses as inputs for an interactive voice response system. A dataset of the Wolof language words ‘yes’ and ‘no’ is collected as audio recordings. A two-stage Data Augmentation approach is adopted to enlarge the dataset to the size required by the deep neural network. Data preprocessing and feature engineering with Mel-Frequency Cepstral Coefficients are implemented. Convolutional Neural Networks (CNNs) have proven to be very powerful in image classification and are promising for audio processing when sounds are transformed into spectra. To perform voice response classification, the recordings are transformed into sound frequency feature spectra, and an image classification methodology is then applied using a deep CNN model. The inference model of this trained and reusable Wolof voice response recognition system can be integrated with many applications on both web and mobile platforms.

Keywords: automatic speech recognition, interactive voice response, voice response recognition, wolof word classification

Procedia PDF Downloads 86

625 Work with Children's Music Group: Important Aspects of Didactic and Artistic Performance

Authors: Eudjen Cinc

Abstract:

Work with the human voice, especially with a child’s voice, and cultivating the sound of the choir is an area of crucial importance for a conductor. We use the term conductor because it needs to be understood that regardless of whether we have in front of us an amateur or a professional choir, whether they are singers with a wealth of experience or children who are still developing and educating their inner ear so that in the future they could contribute to the development of choir music, the person who stands in front of the group and works with them needs to have the characteristics of a conductor. Voice formation is a long-term process, without which there is no success in either solo or collective music performance.

Keywords: music group, conductor, collective, performance

Procedia PDF Downloads 199

624 Lovely, Lyrical, Lilting: Kubrick’s Translation of Lolita’s Voice

Authors: Taylor La Carriere

Abstract:

“What I had madly possessed was not she, but my own creation, another, fanciful Lolita perhaps, more real than Lolita; overlapping, encasing her and having no will, no consciousness indeed, no life of her own,” Vladimir Nabokov writes in his seminal work, Lolita. Throughout Nabokov’s novel, the eponymous character is rendered nonexistent through unreliable narrator Humbert Humbert’s impenetrable narrative, infused with lyrical rationalization. Instead, Lolita is “safely solipsised,” as Humbert muses, solidifying the potential for the erasure of Lolita’s agency and identity. In this literary work, Lolita’s voice is reduced to a nearly invisible presence, only seen through the eyes of her captor. However, in Stanley Kubrick’s film adaptation of Lolita (1962), the “nymphet,” as Nabokov coins, reemerges with a voice of her own, fueled by a lyric impulse, that displaces Humbert’s first-person narration. The lyric, as defined by Catherine Ing, is the voice of the invisible; it is also characterized by performance, the concentrated utterance of individual emotion, and the appearance of spontaneity. The novel’s lyricism is largely in the service of Humbert’s “seductive” voice, while the film reorients it more to Lolita’s subjectivity. Through a close analysis of Kubrick’s cinematic techniques, this paper examines the emergence and translation of Lolita’s voice in contrast with Humbert’s attempts to silence her in Nabokov’s Lolita, hypothesizing that Kubrick translates Lolita’s presence into a visual and aural voice with lyrical attributes, exemplified through the establishment of an altered power dynamic, Sue Lyon’s transformative performance as the titular character, Nelson Riddle and Bob Harris’ musical score, and the omission of Humbert’s first-person point-of-view.
In doing so, the film reclaims Lolita’s agency by taking instances of Lolita’s voice in the novel as depicted in the last half of the work and expanding upon them in a way only cinematic depictions could allow. The results of this study suggest that Lolita’s voice in Kubrick’s adaptation functions without disrupting the lyricism present in Nabokov’s source text, materializing through the actions, expressions, and performance of Sue Lyon in the film. This voice, fueled by a lyric impulse of its own, refutes the silence bestowed upon the titular character and enables its ultimate reclamation upon the silver screen.

Keywords: cinema, adaptation, Lolita, lyric voice

Procedia PDF Downloads 166

623 Reconceptualising the Voice of Children in Child Protection

Authors: Sharon Jackson, Lynn Kelly

Abstract:

This paper proposes a conceptual review of the interdisciplinary literature which has theorised the concept of ‘children’s voices’. The primary aim is to identify and consider the theoretical relevance of conceptual thought on ‘children’s voices’ for research and practice in child protection contexts. Attending to the ‘voice of the child’ has become a core principle of social work practice in contemporary child protection contexts. Discourses of voice permeate the legislative, policy and practice frameworks of child protection practices within the UK and internationally. Voice is positioned within a ‘child-centred’ moral imperative to ‘hear the voices’ of children and take their preferences and perspectives into account. This practice is now considered to be central to working in a child-centered way. The genesis of this call to voice is revealed through sociological analysis of twentieth-century child welfare reform as rooted inter alia in intersecting political, social and cultural discourses which have situated children and childhood as sites of state intervention as enshrined in the 1989 United Nations Convention on the Rights of the Child ratified by the UK government in 1991 and more specifically Article 12 of the convention. From a policy and practice perspective, the professional ‘capturing’ of children’s voices has come to saturate child protection practice. This has incited a stream of directives, resources, advisory publications and ‘how-to’ guides which attempt to articulate practice methods to ‘listen’, ‘hear’ and above all – ‘capture’ the ‘voice of the child’. The idiom ‘capturing the voice of the child’ is frequently invoked within the literature to express the requirements of the child-centered practice task to be accomplished.
Despite the centrality of voice, and an obsession with ‘capturing’ voices, evidence from research, inspection processes, serious case reviews, child abuse and death inquiries has consistently highlighted professional neglect of ‘the voice of the child’. Notable research studies have highlighted the relative absence of the child’s voice in social work assessment practices, a troubling lack of meaningful engagement with children and the need to more thoroughly examine communicative practices in child protection contexts. As a consequence, the project of capturing ‘the voice of the child’ has intensified, and there has been an increasing focus on developing methods and professional skills to attend to voice. This has been guided by a recognition that professionals often lack the skills and training to engage with children in age-appropriate ways. We argue, however, that the problem with ‘capturing’ and [re]representing ‘voice’ in child protection contexts is, more fundamentally, a failure to adequately theorise the concept of ‘voice’ in the ‘voice of the child’. For the most part, ‘the voice of the child’ incorporates psychological conceptions of child development. While these concepts are useful in the context of direct work with children, they fail to consider other strands of sociological thought, which position ‘the voice of the child’ within an agentic paradigm to emphasise the active agency of the child.

Keywords: child-centered, child protection, views of the child, voice of the child

Procedia PDF Downloads 108

622 The Effect of Speech-Shaped Noise and Speaker’s Voice Quality on First-Grade Children’s Speech Perception and Listening Comprehension

Authors: I. Schiller, D. Morsomme, A. Remacle

Abstract:

Children’s ability to process spoken language develops until the late teenage years. At school, where efficient spoken language processing is key to academic achievement, listening conditions are often unfavorable. High background noise and a teacher’s poor voice quality represent typical sources of interference. It can be assumed that these factors particularly affect primary school children, because their language and literacy skills are still low. While it is generally accepted that background noise and impaired voice impede spoken language processing, there is an increasing need for analyzing impacts within specific linguistic areas. Against this background, the aim of the study was to investigate the effect of speech-shaped noise and imitated dysphonic voice on first-grade primary school children’s speech perception and sentence comprehension. Via headphones, 5- to 6-year-old children, recruited within the French-speaking community of Belgium, listened to and performed a minimal-pair discrimination task and a sentence-picture matching task. Stimuli were randomly presented according to four experimental conditions: (1) normal voice / no noise, (2) normal voice / noise, (3) impaired voice / no noise, and (4) impaired voice / noise. The primary outcome measure was task score. How did performance vary with respect to listening condition? Preliminary results will be presented with respect to speech perception and sentence comprehension and carefully interpreted in the light of past findings. This study helps to support our understanding of children’s language processing skills under adverse conditions. Results shall serve as a starting point for probing new measures to optimize children’s learning environment.

Keywords: impaired voice, sentence comprehension, speech perception, speech-shaped noise, spoken language processing

Procedia PDF Downloads 164

621 Detection of Autistic Children's Voice Based on Artificial Neural Network

Authors: Royan Dawud Aldian, Endah Purwanti, Soegianto Soelistiono

Abstract:

In this research, we developed an automatic method to classify children's voices as normal or autistic using computation based on an artificial neural network. The strength of this computational technology is its capability for processing and storing data. Digital voice features are obtained from the coefficients of linear predictive coding with the autocorrelation method and are transformed into the frequency domain using the fast Fourier transform; these are used as input to an artificial neural network trained with back-propagation, so that normal and autistic children are distinguished automatically. The back-propagation results show a successful classification rate of 100% for the normal children's voice data and 100% for the autistic children's voice data. The success rate of the back-propagation classification system for the entire test data is 100%.
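The feature-extraction step mentioned above, linear predictive coding by the autocorrelation method, can be sketched with the standard Levinson-Durbin recursion. This is a minimal illustration and not the authors' code: windowing, the frame length, the model order, the subsequent Fourier transform and the neural network are all omitted, and the test signal is synthetic.

```python
def autocorrelation(signal, max_lag):
    """Biased autocorrelation r[0..max_lag] of a finite frame."""
    n = len(signal)
    return [sum(signal[t] * signal[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)]

def lpc(signal, order):
    """Return (a, error) where A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order."""
    r = autocorrelation(signal, order)
    if r[0] == 0:
        raise ValueError("frame has zero energy")
    a = [1.0] + [0.0] * order
    error = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i of the Levinson-Durbin recursion.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / error
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        error *= 1.0 - k * k
    return a, error

# Synthetic frame: a decaying exponential x[n] = 0.5 ** n, which a
# first-order predictor x[n] ~ 0.5 * x[n-1] describes almost exactly.
frame = [0.5 ** n for n in range(30)]
coefficients, residual_energy = lpc(frame, 2)
print(coefficients, residual_energy)
```

With this sign convention the predictor is x[n] ≈ -a[1]·x[n-1] - a[2]·x[n-2], so for the synthetic frame a[1] comes out close to -0.5 and a[2] close to zero.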

Keywords: autism, artificial neural network, backpropagation, linear predictive coding, fast Fourier transform

Procedia PDF Downloads 425

620 Excitation Modeling for Hidden Markov Model-Based Speech Synthesis Based on Wavelet Analysis

Authors: M. Kiran Reddy, K. Sreenivasa Rao

Abstract:

The conventional Hidden Markov Model (HMM)-based speech synthesis system (HTS) uses only a pulse excitation model, which differs significantly from the natural excitation signal. Hence, buzziness can be perceived in the speech generated using HTS. This paper proposes an efficient excitation modeling method that can significantly reduce the buzziness and improve the quality of HMM-based speech synthesis. The proposed approach models the pitch-synchronous residual frames extracted from the residual excitation signal. Each pitch-synchronous residual frame is parameterized using 30 wavelet coefficients. These 30 wavelet coefficients are found to accurately capture the perceptually important information present in the residual waveform. In the synthesis phase, the residual frames are reconstructed from the generated wavelet coefficients and are pitch-synchronously overlap-added to generate the excitation signal. The proposed excitation modeling method is integrated into an HMM-based speech synthesis system. Evaluation results indicate that the speech synthesized by the proposed excitation model is significantly better than the speech generated using state-of-the-art excitation modeling methods.
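The overlap-add step of the synthesis phase can be illustrated with a minimal sketch. It assumes the residual frames have already been reconstructed and that each frame's start position has been derived from a pitch mark. The function name and the toy frames are hypothetical, and windowing and the wavelet reconstruction itself are left out.

```python
def overlap_add(frames, starts, length):
    """Sum frames into an output buffer at the given start positions."""
    out = [0.0] * length
    for frame, start in zip(frames, starts):
        for offset, value in enumerate(frame):
            idx = start + offset
            if 0 <= idx < length:  # ignore samples falling outside the buffer
                out[idx] += value
    return out


# Two toy frames placed two samples apart, so they overlap by two samples.
frames = [[1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0]]
excitation = overlap_add(frames, starts=[0, 2], length=6)
```

Where the two frames overlap, their samples add, which is why a practical system would normally taper each frame with a window before summing.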

Keywords: excitation modeling, hidden Markov models, pitch-synchronous frames, speech synthesis, wavelet coefficients

Procedia PDF Downloads 219

619 Passive Voice in SLA: Armenian Learners’ Case Study

Authors: Emma Nemishalyan

Abstract:

It is believed that learners’ mother tongue (L1 hereafter) has a huge impact on their second language acquisition (L2 hereafter). This hypothesis has met with both support and criticism. Based on research results from a wide range of learners’ corpora (Chinese, Japanese, Spanish among others), the hypothesis has been either confirmed or rejected. However, no such study has been conducted on Armenian learners. The aim of this paper is to understand the implications of the hypothesis for the Armenian learners’ corpus in terms of the use of the passive voice. To this end, the method of Contrastive Interlanguage Analysis (hereafter CIA) has been applied to a native speakers’ corpus (Louvain Corpus of Native English Essays (LOCNESS)) and an Armenian learners’ corpus, which I compiled in compliance with International Corpus of Learner English (ICLE) guidelines. CIA compares the interlanguage (the language produced by learners) with the language produced by native speakers. With the help of this method, it is possible not only to highlight the mistakes that learners make, but also to identify cases of underuse or overuse. The choice of the grammar issue (passive voice) is motivated by the fact that, typologically, Armenian and English are drastically different, as they belong to different branches. Moreover, the passive voice is considered to be one of the most problematic grammar topics for learners of English to acquire. Based on this difference, we hypothesized that Armenian learners would either overuse or underuse some types of the passive voice. With the help of Lancsbox software, we identified the frequency of passive voice usage in LOCNESS and in the Armenian learners’ corpus to understand whether the latter shows the same usage pattern of the passive voice as the native speakers. Secondly, we identified the types of the passive voice used by the Armenian learners and tried to trace the reasons back to their mother tongue.
The results of the study showed that Armenian learners underused the passive voice compared with native speakers. Furthermore, the hypothesis that learners’ L1 has an impact on their L2 acquisition and production was confirmed.

Keywords: corpus linguistics, applied linguistics, second language acquisition, corpus compilation

Procedia PDF Downloads 63

618 Vocal Training and Practice Methods: A Glimpse on the South Indian Carnatic Music

Authors: Raghavi Janaswamy, Saraswathi K. Vasudev

Abstract:

Music is one of the supreme arts of expression, next to speech itself. Its evolution over centuries has given rise to a variety of training protocols and performing methods. Indian classical music is one of the most elaborate and refined systems, with immense emphasis on voice culture: range, breath control, tone quality, flexibility and diction. Several exercises, namely saraliswaram, jantaswaram, dhatuswaram, upper stayi swaram, alamkaras and varnams, lay the foundation required to develop voice culture, a deeper understanding of voice development and, further on, the intricacies of the raga system. This article describes a few Carnatic music training methods, with an emphasis on advanced practice methods for developing vocal skills, continuity in the voice, the ability to produce gamakams, and command of multiple speeds of rendering at reasonable volume. Creativity in these exercises and its impact on voice production are discussed. The outlined conscious practice methods and vocal exercises promote optimal use of the natural human vocal system, not only to enhance singing quality but also to gain health benefits.

Keywords: Carnatic music, Saraliswaram, Varnam, vocal training

Procedia PDF Downloads 150

617 Minimum Data of a Speech Signal as Special Indicators of Identification in Phonoscopy

Authors: Nazaket Gazieva

Abstract:

Voice biometric data associated with physiological, psychological and other factors are widely used in forensic phonoscopy. There are various methods for identifying and verifying a person by voice. This article explores the minimum speech signal data as individual parameters of a speech signal. Monozygotic twins are believed to be genetically identical. Using the minimum data of the speech signal, we came to the conclusion that the voice imprint of monozygotic twins is individual. Based on the experiment, we conclude that the minimum indicators of the speech signal are more stable and reliable for phonoscopic examinations.

Keywords: phonogram, speech signal, temporal characteristics, fundamental frequency, biometric fingerprints

Procedia PDF Downloads 112

616 Ultrasonic Evaluation of Periodic Rough Inaccessible Surfaces from Back Side

Authors: Chanh Nghia Nguyen, Yu Kurokawa, Hirotsugu Inoue

Abstract:

Surface roughness is an important parameter for evaluating the quality of material surfaces, since it affects the function and performance of industrial components. Although stylus and optical techniques are commonly used for measuring surface roughness, they are applicable only to accessible surfaces. In practice, surface roughness measurement from the back side is sometimes required, for example, in the inspection of safety-critical parts such as the inner surfaces of pipes. However, little attention has been paid to the measurement of back surface roughness so far. Since the back surface is usually inaccessible to stylus or optical techniques, an ultrasonic technique is one of the most effective alternatives. In this research, as a very first step, an ultrasonic pulse-echo technique is considered for evaluating the pitch and the height of a back surface having a periodic triangular profile. The pitch of the surface profile is measured by applying diffraction grating theory for oblique incidence; the height is then evaluated by numerical analysis based on the Kirchhoff theory for normal incidence. The validity of the proposed method was verified by both numerical simulation and experiment. It was confirmed that the pitch is accurately measured in most cases. The height was also evaluated with good accuracy, but only when it is smaller than half the pitch, a limit attributed to the approximation in the Kirchhoff theory.
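The abstract does not give the formula used for the pitch measurement. As a hedged illustration only, the sketch below assumes the textbook grating equation in the back-reflection configuration, where the diffracted order returns along the incident direction to the same transducer, so that 2·d·sin(θ) = m·λ. The function name, parameter names, and example values are assumptions, not taken from the paper.

```python
import math


def grating_pitch(frequency_hz, wave_speed_m_s, incidence_deg, order=1):
    """Pitch d from the assumed back-reflection condition 2*d*sin(theta) = m*lambda."""
    wavelength = wave_speed_m_s / frequency_hz
    return order * wavelength / (2.0 * math.sin(math.radians(incidence_deg)))


# Hypothetical values: 5 MHz wave, 5900 m/s wave speed, 30 degree incidence.
pitch_m = grating_pitch(5e6, 5900.0, 30.0)
```

With these example values the wavelength is 1.18 mm, and because sin(30°) is 0.5 the estimated pitch equals the wavelength.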

Keywords: back side, inaccessible surface, periodic roughness, pulse-echo technique, ultrasonic NDE

Procedia PDF Downloads 251

615 Independent Encryption Technique for Mobile Voice Calls

Authors: Nael Hirzalla

Abstract:

Whether it is legal for some countries or agencies to spy on the public's personal phone calls has become a hot topic among many social groups. Such surveillance is widely regarded as an invasion of privacy. It may be justified when it singles out specific cases, but spying without limits is unacceptable. This paper discusses the need for a technique to secure mobile voice calls that is not only simple and lightweight but also independent of any encryption standard or library. It then presents and tests an encryption algorithm based on a frequency scrambling technique, showing a fair and delay-free process that can be used to protect phone calls from such spying.
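The abstract does not describe the scrambling algorithm itself. To give a feel for what frequency scrambling means, here is a minimal sketch of one classic and easily reversed variant, spectral inversion: negating every second sample shifts the spectrum by half the sampling rate, so low frequencies become high and vice versa. The function name is hypothetical, and this is not claimed to be the paper's method.

```python
def invert_spectrum(samples):
    """Negate every odd-indexed sample, mirroring the spectrum around fs/4."""
    return [s if n % 2 == 0 else -s for n, s in enumerate(samples)]


# A constant (0 Hz) signal becomes an alternating signal at the Nyquist frequency.
scrambled = invert_spectrum([1.0, 1.0, 1.0, 1.0])

# Applying the same operation again restores the original samples.
voice = [0.1, -0.4, 0.25, 0.0, 0.3]
restored = invert_spectrum(invert_spectrum(voice))
```

The operation works sample by sample and needs no buffering, which is the kind of delay-free property the abstract asks for, although inversion alone offers very little protection.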

Keywords: frequency scrambling, mobile applications, real-time voice encryption, spying on calls

Procedia PDF Downloads 439

614 Decoding Gender Disparities in AI: An Experimental Exploration Within the Realm of AI and Trust Building

Authors: Alexander Scott English, Yilin Ma, Xiaoying Liu

Abstract:

The widespread use of artificial intelligence in everyday life has triggered a fervent discussion covering a wide range of areas. However, to date, research on the influence of gender in various segments and factors from a social science perspective is still limited. This study aims to explore whether there are gender differences in human trust in AI for its application in basic everyday life and whether this trust correlates with perceived similarity, perceived emotions (including competence and warmth), and attractiveness. We conducted a study involving 321 participants using a 2 (masculinized vs. feminized AI voice) × 2 (pitch level of the AI's voice) between-subjects experimental design. Four contexts were created for the study and randomly assigned. The results of the study showed significant gender differences in perceived similarity, trust, and perceived emotion of the AIs, with females rating them significantly higher than males. Trust was higher in relation to AIs presenting the same gender (e.g., human female to female AI, human male to male AI). Mediation modeling tests indicated that emotion perception and similarity played a sufficient mediating role in trust. Notably, although trust in AIs was strongly correlated with human gender, there was no significant effect of the AI's gender. In addition, the study discusses the effects of subjects' age, job search experience, and job type on the findings.

Keywords: artificial intelligence, gender differences, human-robot trust, mediation modeling

Procedia PDF Downloads 19

613 Empowering Leadership and Constructive Voice: A Sequential Mediation Analysis

Authors: Umamaheswara Rao Jada, Susmita Mukhopadhyay

Abstract:

In the present highly complex, dynamic and interdependent organizational environment, employees' ideas, opinions and suggestions, technically referred to as ‘constructive employee voice’, are increasingly being recognized and valued. The literature has consistently demonstrated the relevance of leadership to employee voice behavior; however, the new form of leadership, ‘empowering leadership’, has not been given much attention. The study therefore explores the impact of this new form of leadership on employee voice behavior and its interplay with leader member exchange (LMX) and psychological safety as mediators. The study uses structural equation modeling to analyze data collected from 310 Indian service industry employees through a questionnaire developed for the study. The findings demonstrate a significant impact of empowering leadership on employees’ constructive voice behavior. Additionally, the results supported the mediating role of leader member exchange (LMX) and psychological safety between empowering leadership and employees’ constructive voice behavior. The results provide insights into the intervening mechanisms linking leaders’ empowering behavior with employees’ constructive voice, while also highlighting the potential importance of the LMX relationship in organizations and of psychological safety in the context of constructive voice behavior. The study brings forth the relevance of ‘empowering leadership’ for fostering a better exchange of ideas, opinions, and suggestions between leaders and followers, which tends to benefit the organization, and provides empirical evidence of the sequential mediation of LMX and psychological safety. The work is expected to benefit leaders in organizations by providing them with a basis for adopting an empowering form of leadership in light of the results.

Keywords: constructive voice, empowering leadership, leader member exchange (LMX), psychological safety, sequential mediation, structural equation modeling

Procedia PDF Downloads 277

612 The Oppressive Boss and Employees' Authoritarianism: The Relation between Suppression of Voice by Employers and Employees' Preferences for Authoritarian Political Leadership

Authors: Antonia Stanojević, Agnes Akkerman

Abstract:

In contemporary society, economically active people typically spend most of their waking hours doing their job. Having that in mind, this research examines how socialization at the workplace shapes political preferences. Innovatively, it examines, in particular, the possible relationship between employees’ voice suppression by the employer and the formation of their political preferences. Since the employer is perceived as an authority figure, their behavior might induce spillovers to attitudes about political authorities and authoritarian governance. Therefore, a positive effect of suppression of voice by employers on employees' preference for authoritarian governance is expected. Furthermore, this relation is expected to be mediated by two mechanisms: system justification and power distance. Namely, it is expected that suppression of voice would create a power distance organizational climate and increase employees’ acceptance of unequal distribution of power, as well as evoke attempts of oppression rationalization through system justification. The hypotheses will be tested on the data gathered within the first wave of Work and Politics Dataset 2017 (N=6000), which allows for a wide range of demographic and psychological control variables. Although a cross-sectional analysis to be used at this point does not allow for causal inferences, the confirmation of expected relationships would encourage and justify further longitudinal research on the same panel dataset, in order to get a clearer image of the causal relationship between employers' suppression of voice and workers' political preferences.

Keywords: authoritarian values, political preferences, power distance, system justification, voice suppression

Procedia PDF Downloads 240

611 From Script to Film: The Fading Voice of the Screenwriter

Authors: Ana Sofia Torres Pereira

Abstract:

On January 15th 2015, Peter Bart, editor in chief of Variety Magazine, published an article in the aforementioned magazine posing the following question: “Are screenwriters becoming obsolete in Hollywood?” Is Hollywood losing its interest in well plotted, well written scripts crafted by professionals? That screenwriters have been undervalued, forgotten and left behind since the beginning of film is a well-known fact, but are they now on the brink of extinction? If fiction films are about people, stories, so, simply put, all about the script, what does it mean to say that the screenwriter is becoming obsolete? What will be the consequences of the possible death of the screenwriter for the cinema world? All of these questions lead us to an ultimate one: What is the true importance of a screenwriter? What can a screenwriter do that a director, for instance, can’t? How should a script be written and read in order not to become obsolete? And what about those countries, like Portugal, for example, in which the figure of the screenwriter is yet to be heard and known? How can screenwriters find their voice in a world driven by the tyrannical voice of the Director? In a demanding cinema world where the Director is considered the author of a film, it’s important to know where we can find the voice of the screenwriter, the true language of the screenplay and the importance this voice and specific language might have for the future of storytelling and of film. In a paper that admittedly poses more questions than answers, I will try to unveil the importance a screenplay might have in Hollywood, in Portugal and in the cinema and communication world in general.

Keywords: cinema, communication, director, language, screenplay, screenwriting, story

Procedia PDF Downloads 289

610 The Use of Voice in Online Public Access Catalog as Faster Searching Device

Authors: Maisyatus Suadaa Irfana, Nove Eka Variant Anna, Dyah Puspitasari Sri Rahayu

Abstract:

Technological developments bring convenience to everyone. Today, humans communicate with computers mainly via text. As technology has developed, human-computer communication can also be conducted by voice, as in communication between people. This makes things easier for many people, especially those with special needs. Voice search technology is applied to searching the book collection in the OPAC (Online Public Access Catalog), so that library visitors can find the books they need faster and more easily. Integration with Google is needed to convert the voice into text. To optimize search time and results, the server downloads all the book data available in the server database and converts it into JSON format. Several algorithms are then incorporated, including decomposition (parsing) of the JSON data into an array, index construction, and analysis of the results. The aim is to make searching much faster than the usual OPAC search, which queries the database directly for every search request. A Data Update menu is provided so that users can perform their own data updates and obtain the latest data.
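The pipeline described above (parse the JSON export into an array, build an index, then answer queries from the index instead of the database) can be sketched roughly as follows. The record fields, sample titles, and function names are hypothetical, and the speech-to-text step is assumed to have already produced the query string.

```python
import json
from collections import defaultdict

# Hypothetical catalog export; the real OPAC schema is not given in the abstract.
CATALOG_JSON = """[
  {"id": 1, "title": "Introduction to Signal Processing"},
  {"id": 2, "title": "Speech Processing Basics"},
  {"id": 3, "title": "Library Cataloguing"}
]"""


def build_index(records):
    """Map each lowercase title word to the set of record ids containing it."""
    index = defaultdict(set)
    for record in records:
        for word in record["title"].lower().split():
            index[word].add(record["id"])
    return index


def search(index, records_by_id, query):
    """Return titles containing every word of the query, ordered by id."""
    words = query.lower().split()
    if not words:
        return []
    ids = set.intersection(*(index.get(w, set()) for w in words))
    return [records_by_id[i]["title"] for i in sorted(ids)]


records = json.loads(CATALOG_JSON)  # parse the JSON text into a list of dicts
index = build_index(records)
by_id = {r["id"]: r for r in records}
results = search(index, by_id, "processing")
```

Because each lookup touches only the in-memory index, no database query is issued per search; the trade-off is that the index has to be rebuilt whenever the catalog data is updated.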

Keywords: OPAC, voice, searching, faster

Procedia PDF Downloads 318