Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 580

Search results for: fricative voice gestures

520 Human Gesture Recognition for Real-Time Control of Humanoid Robot

Authors: S. Aswath, Chinmaya Krishna Tilak, Amal Suresh, Ganesh Udupa

Abstract:

A humanoid robot can be controlled in many ways, but electromyogram (EMG) electrodes have particular value in a control system: EMG-based control helps to operate robotic devices with more fidelity and precision. This paper presents the development of an EMG-based interface for human gesture recognition to control a humanoid robot. To recognize control signs in the gestures, a single-channel EMG sensor is positioned on the muscles of the human body. Instead of using a remote control unit, the humanoid robot is controlled by various gestures performed by the human. The EMG electrodes attached to the muscles generate an analog signal in response to the nerve impulses produced when the muscles move. The analog signals picked up from the muscles are supplied to a differential muscle sensor, which conditions them into a signal suitable for the microcontroller that controls the humanoid robot. The microcontroller digitizes the signal from the differential muscle sensor with its ADC and sends its decision to the CM-530 humanoid robot controller over a Zigbee wireless interface. The output decision of the CM-530 processor is sent to a motor driver to drive the servo motors in the required direction for human-like actions. This method of controlling a humanoid robot could be used to perform actions with more accuracy and ease. In addition, a study has been conducted to investigate the controllability and ease of use of the interface and the employed gestures.
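The abstract does not say how the microcontroller turns digitized EMG samples into a decision. A minimal sketch of one common approach is a moving-RMS envelope with a threshold and hysteresis. The 10-bit ADC baseline of 512, the window length and the threshold are all illustrative assumptions, not values from the paper.

```python
import math
from collections import deque


def detect_contractions(adc_samples, baseline=512, window=20, threshold=80.0):
    """Return sample indices where the moving-RMS envelope rises above threshold.

    baseline, window and threshold are hypothetical values for illustration.
    A contraction ends once the envelope falls below half the threshold
    (simple hysteresis, so one burst yields one onset).
    """
    buf = deque(maxlen=window)
    sq_sum = 0.0
    active = False
    onsets = []
    for i, raw in enumerate(adc_samples):
        centered = raw - baseline
        if len(buf) == window:
            # The oldest sample is about to be dropped by the bounded deque.
            sq_sum -= buf[0] ** 2
        buf.append(centered)
        sq_sum += centered ** 2
        rms = math.sqrt(max(sq_sum, 0.0) / len(buf))
        if rms > threshold and not active:
            onsets.append(i)
            active = True
        elif rms < 0.5 * threshold and active:
            active = False
    return onsets
```

Each detected onset could then be mapped to a robot command and sent over the wireless link. That mapping is left out here because the abstract does not describe it.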

Keywords: electromyogram, gesture, muscle sensor, humanoid robot, microcontroller, Zigbee

Procedia PDF Downloads 381

519 Advanced Mouse Cursor Control and Speech Recognition Module

Authors: Prasad Kalagura, B. Veeresh kumar

Abstract:

We constructed an interface system that allows a paralyzed user to interact with a computer with almost full functional capability. A real-time tracking algorithm is implemented based on adaptive skin detection and motion analysis. Mouse clicks are triggered by the user's eye blinks, detected through a sensor. The keyboard function is implemented with a voice recognition kit.

Keywords: embedded ARM7 processor, mouse pointer control, voice recognition

Procedia PDF Downloads 549

518 Islamic Perception of Modern Democratic System

Authors: Muhammad Khubaib

Abstract:

The purport of the Holy Quran is to establish a democratic system in which Allah holds special authority as the one with supreme power and sovereignty. Allah, the supreme leader, ceded the right to govern to His prophet; whoever rules thereafter must govern as a deputy of the Prophet of Allah and has no right to deviate from the basic rules of law and constitution. Centuries before the birth of the prevailing democracy, Muslim scholars and researchers were already using the term “Jamhür” (majority) in their books. Islam gives basic importance to public opinion in establishing a government and makes public confidence necessary for the government. At present, the most effective way to gain the trust of the people and build national institutions is through the vote. A vote testifies in favor of the candidate, and the majority tells us who is more honest and talented. Each voter stands in a position of trust. To vote for a cruel person would be tantamount to treason, and even not voting would be considered a national offence. After a transparent process, the elected member of government would be seen as a fine example of the saying of Muhammad (S.A.W) that the majority of his people will never agree on misguidance. In short, this article discusses democracy from the Islamic perspective while elaborating on Western democracy, to clarify in what way the Holy Quran supports democracy, what gestures Muhammad (S.A.W) made to spread democracy, and how those gestures are followed in choosing the sacred caliphate. It is hoped that this research will help to refine the democratic system and to meet the challenges the Muslim world is facing.

Keywords: democracy, modern democratic system, respect of majority opinion, vote casting

Procedia PDF Downloads 156

517 Performance Assessment in a Voice Coil Motor for Maximizing the Energy Harvesting with Gait Motions

Authors: Hector A. Tinoco, Cesar Garcia-Diaz, Olga L. Ocampo-Lopez

Abstract:

In this study, an experimental approach is established to assess the performance of different beams coupled to a Voice Coil Motor (VCM), with the aim of mechanically maximizing the energy harvested by the inductive transducer included in it. The VCM is extracted from a recycled hard disk drive (HDD) and adapted for experimental tests of energy harvesting. Two individuals were selected to walk with the VCM-beam device in order to evaluate its performance while varying two beam parameters: the length of the beam and an added mass. Results show that energy harvesting is maximized with specific beams; however, the harvesting efficiency is improved when a mass is added to the end of the beams.

Keywords: hard disk drive, energy harvesting, voice coil motor, energy harvester, gait motions

Procedia PDF Downloads 328

516 Leader Personality Traits and Constructive Voice Behavior: Mediating Roles of Empowering Leadership and Leader-Member Exchange

Authors: Umamaheswara Rao Jada, Susmita Mukhopadhyay

Abstract:

Employee voice behavior has emerged as an important topic in understanding the paybacks within organizations. Organizations expect employees to contribute suggestions and ideas that help an organization not only to grow but also to survive turbulent times. Leadership in the organization enables and encourages an individual to offer constructive ideas. Leadership has an undeniable impact on creating an environment that promotes a free flow of thoughts and ideas in the organization, and this in turn is significantly influenced by the personality of the leader. Our study therefore examines the underlying factors that influence employee constructive voice behavior, considering in sequence the leader’s personality, the empowering form of leadership and leader-member exchange in the organization. A standardized survey questionnaire was used to collect data from a sample of 272 service executives in India. Smart PLS 2.0 was used to test the hypotheses and explore the mediation effect. The results show that the leader personality traits of agreeableness and conscientiousness were positively related to empowering leadership, whereas neuroticism was unrelated to empowering leadership. Empowering leadership significantly influenced followers’ constructive voice behavior. Furthermore, the relationship was partially mediated by the leader-member exchange relationship. Theoretical and practical implications of the findings, as well as directions for future research, are presented in the study.

Keywords: constructive voice, empowering leadership, leader member exchange (LMX), leader personality traits

Procedia PDF Downloads 274

515 Effect of Lullabies on Babies Growth and Development, Vital Signs and Hospitalization Times in the Neonatal Intensive Care Units

Authors: Işın Alkan, Meltem Kürtüncü

Abstract:

Objective: This experimental study was carried out to determine whether lullabies sung in the mother’s voice or in a stranger’s voice to babies born at term and hospitalized in a neonatal intensive care unit affected the infants’ oxygen saturation (SpO2), peak heart rate (PHR), respiration, fever, growth and development, and length of hospitalization. Method: Data were obtained from 90 newborn babies who were hospitalized in the Neonatal Intensive Care Unit of Zonguldak Maternity and Children Hospital between September 2015 and January 2016 and who met the eligibility criteria. The lullaby concert was held during one of the suitable care hours. SpO2, PHR, respiration, fever, growth and development, and hospitalization times were recorded by the researcher on the “Newborn response follow-up form” before and after care. Vital signs were recorded every day; weight, height and head circumference were measured at admission, weekly and at discharge. Results: Between the experimental and control groups, anthropometric measurements such as weight, height and head circumference did not differ significantly at admission to or discharge from the intensive care unit. Hospitalization times of babies who listened to the lullaby in the mother’s voice differed significantly from those of babies who listened to it in a stranger’s voice. Comparing pre-care and post-care values, SpO2 was significantly higher in babies who listened to the lullaby in the mother’s voice than in babies who listened to a stranger’s voice and in control-group babies. Before care, PHR did not differ significantly among the three groups, but after care it was significantly lower (within the normal range) in babies who listened to the mother’s voice than in babies who listened to a stranger’s voice.
Before care, respiration values did not differ significantly among the three groups, but after care they were significantly lower (within the normal range) in babies who listened to the lullaby in a stranger’s voice than in babies who listened to the mother’s voice and in the control group. Fever did not differ significantly among the three groups either before or after care. Conclusion: Because lullaby concerts keep infants’ vital signs within normal ranges and help to shorten hospitalization, they should be preferred in neonatal intensive care units.

Keywords: growth and development, lullaby, mother voice, vital signs

Procedia PDF Downloads 190

514 N400 Investigation of Semantic Priming Effect to Symbolic Pictures in Text

Authors: Thomas Ousterhout

Abstract:

The purpose of this study was to investigate whether incorporating meaningful pictures of gestures and facial expressions in short sentences of text could supplement the text with enough semantic information to produce an N400 effect when probe words incongruent with the picture were subsequently presented. Event-related potentials (ERPs) were recorded from a 14-channel commercial-grade EEG headset while subjects performed congruent/incongruent reaction time discrimination tasks. Since pictures of meaningful gestures have been shown to be semantically processed in the brain in a similar manner to words, it is believed that pictures will add supplementary information to text just as the inclusion of their equivalent synonymous word would. The hypothesis is that when subjects read the mixed text/picture sentences, they will process the images and words just as in face-to-face communication, and therefore probe words incongruent with the image will produce an N400.

Keywords: EEG, ERP, N400, semantics, congruency, facilitation, Emotiv

Procedia PDF Downloads 236

513 A System Architecture for Hand Gesture Control of Robotic Technology: A Case Study Using a Myo™ Arm Band, DJI Spark™ Drone, and a Staubli™ Robotic Manipulator

Authors: Sebastian van Delden, Matthew Anuszkiewicz, Jayse White, Scott Stolarski

Abstract:

Industrial robotic manipulators have been commonplace in the manufacturing world since the early 1960s, and unmanned aerial vehicles (drones) have only begun to realize their full potential in the service industry and the military. The omnipresence of these technologies in their respective fields will only become more potent in coming years. While these technologies have greatly evolved over the years, the typical approach to human interaction with these robots has not. In the industrial robotics realm, a manipulator is typically jogged around using a teach pendant and programmed using a networked computer or the teach pendant itself via a proprietary software development platform. Drones are typically controlled using a two-handed controller equipped with throttles, buttons, and sticks, an app that can be downloaded to one’s mobile device, or a combination of both. This application-oriented work offers a novel approach to human interaction with both unmanned aerial vehicles and industrial robotic manipulators via hand gestures and movements. Two systems have been implemented, both of which use a Myo™ armband to control either a drone (DJI Spark™) or a robotic arm (Stäubli™ TX40). The methodologies developed by this work present a mapping of armband gestures (fist, finger spread, swing hand in, swing hand out, swing arm left/up/down/right, etc.) to either drone or robot arm movements. The findings of this study present the efficacy and limitations (precision and ergonomic) of hand gesture control of two distinct types of robotic technology. All source code associated with this project will be open sourced and placed on GitHub. In conclusion, this study offers a framework that maps hand and arm gestures to drone and robot arm control. The system has been implemented using current ubiquitous technologies, and these software artifacts will be open sourced for future researchers or practitioners to use in their work.

Keywords: human robot interaction, drones, gestures, robotics

Procedia PDF Downloads 128

512 A Simple Adaptive Atomic Decomposition Voice Activity Detector Implemented by Matching Pursuit

Authors: Thomas Bryan, Veton Kepuska, Ivica Kostanic

Abstract:

A simple adaptive voice activity detector (VAD) is implemented using Gabor and gammatone atomic decomposition of speech for high Gaussian noise environments. Matching pursuit is used for atomic decomposition and is shown to achieve optimal speech detection capability at high data compression rates for low signal-to-noise ratios. The most active dictionary elements found by matching pursuit are used for the signal reconstruction, so that the algorithm adapts to the individual speaker's dominant time-frequency characteristics. Speech has a high peak-to-average ratio, which enables the greedy heuristic of matching pursuit (selecting the highest inner products) to isolate high-energy speech components in high-noise environments. Gabor and gammatone atoms are both investigated with identical logarithmically spaced center frequencies and similar bandwidths. The algorithm performs equally well for both Gabor and gammatone atoms, with no statistically significant differences. The algorithm achieves 70% accuracy at 0 dB SNR, 90% accuracy at 5 dB SNR and 98% accuracy at 20 dB SNR, using 30 dB SNR as a reference for voice activity.
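The greedy step described above can be illustrated with a minimal matching pursuit sketch over a small Gabor dictionary. This is not the authors' detector: the atom parameters (length, centers, widths, frequency range) are assumptions chosen only to make the example run, and gammatone atoms and the VAD decision rule are omitted.

```python
import numpy as np


def gabor_atom(n, center, freq, width):
    """Unit-norm real Gabor atom: a Gaussian window times a cosine.

    freq is in cycles per sample; center and width are in samples.
    """
    t = np.arange(n)
    g = np.exp(-0.5 * ((t - center) / width) ** 2) * np.cos(2 * np.pi * freq * (t - center))
    return g / np.linalg.norm(g)


def matching_pursuit(signal, dictionary, n_iter):
    """Greedy matching pursuit; rows of `dictionary` must be unit-norm atoms.

    Returns the list of (atom index, coefficient) picks and the final residual.
    """
    residual = np.asarray(signal, dtype=float).copy()
    picks = []
    for _ in range(n_iter):
        inner = dictionary @ residual          # inner product with every atom
        k = int(np.argmax(np.abs(inner)))      # atom with the highest correlation
        picks.append((k, float(inner[k])))
        residual -= inner[k] * dictionary[k]   # remove that atom's contribution
    return picks, residual


# Hypothetical dictionary: 3 time centers x 6 logarithmically spaced frequencies.
DICTIONARY = np.array([
    gabor_atom(256, c, f, 16.0)
    for c in (64, 128, 192)
    for f in np.geomspace(0.02, 0.2, 6)
])
```

The indices that are picked most often across frames would correspond to the "most active dictionary elements" mentioned in the abstract.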

Keywords: atomic decomposition, gabor, gammatone, matching pursuit, voice activity detection

Procedia PDF Downloads 266

511 Detection of Phoneme [S] Mispronunciation for Sigmatism Diagnosis in Adults

Authors: Michal Krecichwost, Zuzanna Miodonska, Pawel Badura

Abstract:

The diagnosis of sigmatism is mostly based on the observation of articulatory organs. It is, however, not always possible to observe the vocal apparatus precisely, in particular inside the patient's oral cavity. Speech processing can make the therapy more objective and simplify the verification of its progress. This study proposes a methodology for classifying incorrectly pronounced realizations of the phoneme [s]. The recordings come from adults and were registered with a speech recorder at a sampling rate of 44.1 kHz and a resolution of 16 bits. A database of pathological and normative speech was collected for the study, including reference assessments provided by speech therapy experts. Ten adult subjects were asked to simulate a certain type of sigmatism under the supervision of a speech therapy expert. In the recordings, the analyzed phone [s] was surrounded by vowels, viz.: ASA, ESE, ISI, SPA, USU, YSY. Thirteen MFCC (mel-frequency cepstral coefficients) and RMS (root mean square) values are calculated within each frame that is part of the analyzed phoneme. Additionally, 3 fricative formants along with their corresponding amplitudes are determined for the entire segment. To aggregate the information within the segment, the average value of each MFCC coefficient is calculated. All features of other types are aggregated by means of their 75th percentile. The proposed method of feature aggregation reduces the size of the feature vector used in the classification. A binary SVM (support vector machine) classifier is employed at the phoneme recognition stage. The first group consists of pathological phones, and the other of normative ones. The proposed feature vector yields classification sensitivity and specificity above the 90% level in the case of individual logo phones. Adding the fricative formant-based information improves the MFCC-only classification results by an average of 5 percentage points.
The study shows that employing specific parameters for the selected phones improves the efficiency of pathology detection compared with traditional methods of speech signal parameterization.
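The aggregation step can be sketched as follows, assuming the frame-level features are already available as arrays. The function name, the array shapes and the choice to pass the six segment-level formant values (3 frequencies, 3 amplitudes) through unchanged are assumptions for illustration; the abstract does not give the exact vector layout.

```python
import numpy as np


def aggregate_segment(mfcc, rms, formant_feats):
    """Collapse frame-level features of one [s] segment into a single vector.

    mfcc:          array of shape (n_frames, 13), one row per frame
    rms:           array of shape (n_frames,), one RMS value per frame
    formant_feats: array of shape (6,), segment-level formant frequencies
                   and amplitudes (assumed to need no further aggregation)
    """
    mfcc = np.asarray(mfcc, dtype=float)
    rms = np.asarray(rms, dtype=float)
    mfcc_mean = mfcc.mean(axis=0)        # average of each MFCC coefficient
    rms_p75 = np.percentile(rms, 75)     # 75th percentile for the other frame-level feature
    return np.concatenate([mfcc_mean, [rms_p75], np.asarray(formant_feats, dtype=float)])
```

Under these assumptions each segment yields a 20-element vector regardless of how many frames it spans, which is the size reduction the abstract refers to.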

Keywords: computer-aided pronunciation evaluation, sibilants, sigmatism diagnosis, speech processing

Procedia PDF Downloads 256

510 The Effect of Voice Recognition Dictation Software on Writing Quality in Third Grade Students: An Action Research Study

Authors: Timothy J. Grebec

Abstract:

This study investigated whether using a voice dictation software program (i.e., Google Voice Typing) has an impact on student writing quality. The research took place in a third-grade general education classroom in a suburban school setting. Because the study involved minors, all data was encrypted and deidentified before analysis. The students completed a series of writings prior to the beginning of the intervention to determine their thoughts on and skill level with writing. During the intervention phase, the students were introduced to the voice dictation software, given an opportunity to practice using it, and then assigned writing prompts to be completed using the software. The prompts written by nineteen student participants and surveys of student opinions on writing established a baseline for the study. The data showed that using the dictation software resulted in a 34% increase in the response quality (compared to the Pennsylvania State Standardized Assessment [PSSA] writing guidelines). Of particular interest was the increase in students' proficiency in demonstrating mastery of the English language and conventions and elaborating on the content. Although this type of research is relatively new, it has the potential to reshape the strategies educators have at their disposal when instructing students on written language.

Keywords: educational technology, accommodations, students with disabilities, writing instruction, 21st century education

Procedia PDF Downloads 41

509 Hands-off Parking: Deep Learning Gesture-based System for Individuals with Mobility Needs

Authors: Javier Romera, Alberto Justo, Ignacio Fidalgo, Joshue Perez, Javier Araluce

Abstract:

Nowadays, individuals with mobility needs face a significant challenge when docking vehicles. In many cases, after parking, they encounter insufficient space to exit, leading to two undesired outcomes: either avoiding parking in that spot or settling for improperly placed vehicles. To address this issue, this paper presents a parking control system employing gestural teleoperation. The system comprises three main phases: capturing body markers, interpreting gestures, and transmitting orders to the vehicle. The initial phase is centered around the MediaPipe framework, a versatile tool optimized for real-time gesture recognition. MediaPipe excels at detecting and tracking body markers, with a special emphasis on hand gestures. Hand detection is done by generating 21 reference points for each hand. After data capture, the project employs the MultiPerceptron Layer (MPL) for in-depth gesture classification. This tandem of MediaPipe's extraction and MPL's classification is intended to translate human gestures into actionable commands with high precision. Furthermore, the system has been trained and validated on a dataset built in-house. To test domain adaptation, a framework based on the Robot Operating System (ROS) as a communication backbone, alongside the CARLA Simulator, is used. Following successful simulations, the system is transitioned to a real-world platform, marking a significant milestone in the project. This real vehicle implementation verifies the practicality and efficiency of the system beyond theoretical constructs.
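Between landmark capture and classification, the 21 points per hand are typically turned into a fixed-length feature vector. The sketch below shows one plausible preprocessing step, which is an assumption and not something the abstract describes: coordinates are expressed relative to the wrist (landmark index 0 in MediaPipe Hands) and scaled by the largest wrist-to-landmark distance, so the features do not depend on where the hand is in the frame or how large it appears.

```python
import numpy as np


def landmarks_to_features(landmarks):
    """Convert 21 (x, y, z) hand landmarks into a 63-element feature vector.

    Hypothetical preprocessing: translate so the wrist (index 0) is the
    origin, then divide by the largest distance from the wrist.
    """
    pts = np.asarray(landmarks, dtype=float)
    if pts.shape != (21, 3):
        raise ValueError("expected 21 landmarks with 3 coordinates each")
    pts = pts - pts[0]                          # wrist becomes the origin
    scale = np.linalg.norm(pts, axis=1).max()   # hand size proxy
    if scale > 0:
        pts = pts / scale
    return pts.ravel()
```

A vector like this could then be fed to the classifier; the network architecture and the gesture classes are not specified in the abstract and are left out.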

Keywords: gesture detection, mediapipe, multiperceptron layer, robot operating system

Procedia PDF Downloads 55

508 Voice Commands Recognition of Mentor Robot in Noisy Environment Using HTK

Authors: Khenfer-Koummich Fatma, Hendel Fatiha, Mesbahi Larbi

Abstract:

This paper presents an approach based on Hidden Markov Models (HMMs) using HTK tools. The goal is to create a man-machine interface with a voice recognition system that allows the operator to tele-operate a mentor robot to execute specific tasks such as rotate, raise, close, etc. This system should take into account different levels of environmental noise. The approach has been applied to isolated words representing the robot commands, spoken in two languages: French and Arabic. The recognition rate obtained is the same for both languages, Arabic and French, on the neutral words. However, there is a slight difference in favor of the Arabic speech when Gaussian white noise is added with a signal-to-noise ratio (SNR) of 30 dB: the Arabic speech recognition rate is 69% and the French speech recognition rate is 80%. This can be explained by the ability of the phonetic context of each language when the noise is added.
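Adding Gaussian white noise at a target SNR amounts to scaling the noise power relative to the measured signal power. A minimal sketch follows; the function name and the use of NumPy's random generator are illustrative choices, not details from the paper.

```python
import numpy as np


def add_white_noise(signal, snr_db, rng):
    """Return signal plus Gaussian white noise at the requested SNR in dB.

    SNR_dB = 10 * log10(P_signal / P_noise), so the noise power is
    P_noise = P_signal / 10 ** (SNR_dB / 10).
    """
    signal = np.asarray(signal, dtype=float)
    p_signal = np.mean(signal ** 2)
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(p_noise), size=signal.shape)
    return signal + noise
```

For example, `add_white_noise(x, 30.0, np.random.default_rng(0))` would produce a test utterance at the 30 dB condition mentioned above.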

Keywords: voice command, HMM, TIMIT, noise, HTK, Arabic, speech recognition

Procedia PDF Downloads 351

507 Identity Verification Based on Multimodal Machine Learning on Red Green Blue (RGB) Red Green Blue-Depth (RGB-D) Voice Data

Authors: LuoJiaoyang, Yu Hongyang

Abstract:

In this paper, we experimented with a new approach to multimodal identification using RGB, RGB-D and voice data. The multimodal combination of RGB and voice data has been applied in tasks such as emotion recognition, where it has shown good results and stability, and the same holds in identity recognition tasks. We believe that data of different modalities can enhance the effect of the model through mutual reinforcement. We extend the dual-modality setup to three modalities and try to improve the effectiveness of the network by increasing the number of modalities. We also implemented single-modal identification systems separately, tested the data of these different modalities under clean and noisy conditions, and compared their performance with the multimodal model. In designing the multimodal model, we tried a variety of fusion strategies and finally chose the fusion method with the best performance. The experimental results show that the multimodal system performs better than the single-modality systems, especially in dealing with noise, and that the multimodal system can achieve an average improvement of 5%.

Keywords: multimodal, three modalities, RGB-D, identity verification

Procedia PDF Downloads 46

506 Features of Normative and Pathological Realizations of Sibilant Sounds for Computer-Aided Pronunciation Evaluation in Children

Authors: Zuzanna Miodonska, Michal Krecichwost, Pawel Badura

Abstract:

Sigmatism (lisping) is a speech disorder in which sibilant consonants are mispronounced. The diagnosis of this phenomenon is usually based on auditory assessment. However, progress in speech analysis techniques creates the possibility of developing computer-aided sigmatism diagnosis tools. The aim of the study is to statistically verify whether specific acoustic features of sibilant sounds may be related to pronunciation correctness. Such knowledge can be of great importance when implementing classifiers and designing novel tools for automatic evaluation of sibilant pronunciation. The study covers the analysis of various speech signal measures, including features proposed in the literature for the description of normative sibilant realization. Amplitudes and frequencies of three fricative formants (FF) are extracted based on local spectral maxima of the friction noise. Skewness, kurtosis, four normalized spectral moments (SM) and 13 mel-frequency cepstral coefficients (MFCC) with their 1st and 2nd derivatives (13 Delta and 13 Delta-Delta MFCC) are included in the analysis as well. The resulting feature vector contains 51 measures. The experiments are performed on a speech corpus containing words with selected sibilant sounds (/ʃ, ʒ/) pronounced by 60 preschool children with proper pronunciation or with natural pathologies. In total, 224 /ʃ/ segments and 191 /ʒ/ segments are employed in the study. The Mann-Whitney U test is employed to compare sigmatism and normative pronunciation. Statistically significant differences between the two groups of children are obtained for most of the proposed features at p < 0.05. All spectral moments and fricative formants appear to be distinctive between pathology and proper pronunciation. These metrics describe the friction noise characteristic of sibilants, which makes them particularly promising for use in sibilant evaluation tools.
Correspondences found between phoneme feature values and an expert evaluation of pronunciation correctness encourage the use of speech analysis tools in the diagnosis and therapy of sigmatism. The proposed feature extraction methods could be used in computer-assisted sigmatism diagnosis or therapy systems.
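To make the per-feature comparison concrete, here is a minimal pure-Python sketch of a two-sided Mann-Whitney U test for one feature measured in two groups. It uses average ranks for ties and a normal approximation without tie or continuity correction, which is a simplification; the abstract does not state which variant or software the authors used.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation).

    Returns (U, p) where U is the smaller of the two U statistics.
    Ties receive average ranks; no tie or continuity correction is applied.
    """
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        # Extend j over the run of values tied with combined[i].
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1           # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u_x, n1 * n2 - u_x)
    mean_u = n1 * n2 / 2
    std_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / std_u
    p = math.erfc(abs(z) / math.sqrt(2))     # two-sided p-value
    return u, p
```

Running this once per feature, with `x` holding the values for the normative group and `y` those for the sigmatism group, gives the kind of per-feature significance decision at p < 0.05 described in the abstract.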

Keywords: computer-aided pronunciation evaluation, sigmatism diagnosis, speech signal analysis, statistical verification

Procedia PDF Downloads 273

505 The “Prologue” in Tommy Orange’s There, There: Reinventing the Introductory Section

Authors: Kristin Murray

Abstract:

The proposed paper examines prologues in 20th and 21st century American literature in order to show how Native American writer Tommy Orange’s Prologue in his 2018 novel There, There is different. In an interview about the novel, Orange explains that he feels “a kind of burden to catch the general reader up with what really happened, because history has got it so wrong and still continue to” (Laubernds). Orange thus includes a “Prologue” in his novel to do this work, catching readers up on Native Americans and their history. Prologues are usually in the narrator’s voice, a character’s voice, or even that of a fictionalized version of the author, but the tone of Orange’s “Prologue” is that of a non-fictional first-person essayist. Examining prologues in American literature places Orange’s prologue outside the norm. This paper also examines other introductory sections, the preface in particular. The research and examination reveal that Orange is adding his personal voice in the Prologue to the multiple narrators of the novel, and his is the voice of a writer who knows that his audience comes to his novel with a plethora of misinformation. The truths he tells are horrifying and hopeful. He tells of Thanksgiving as a “land deal” and a “successful massacre,” but he also tells readers how urban Indians have found a sense of the land, even through concrete. Native American writers contributed and still contribute to the genre of autobiography in ways that have changed our understanding of this genre. This examination of Orange’s Prologue reveals a new and unexpected way to view this often under-examined introductory section, the prologue.

Keywords: native american literature, prologues, prefaces, 20th century american literature

Procedia PDF Downloads 153

504 The Impact of Vocal and Physical Attractiveness on the Employment Interview

Authors: Alexandra Roy

Abstract:

This research examines how physical and vocal attractiveness affect impressions of an applicant and whether these impressions are affected by gender or job type. Findings, based on two samples, indicate that individuals with a less attractive voice and physical appearance were viewed as less suitable job applicants and as possessing more negative characteristics than others. These negative impressions were pervasive and unaffected by either applicant gender or job type. Specifically, we found that job candidates with an attractive voice or physique were perceived as more extroverted, less agreeable, less conscientious, less trustworthy, less competent, less sociable and less recruitable. Results are robust to various sensitivity checks.

Keywords: discrimination, nonverbal, hiring, attractiveness

Procedia PDF Downloads 188

503 Adaptation and Validation of Voice Handicap Index in Telugu Language

Authors: B. S. Premalatha, Kausalya Sahani

Abstract:

Background: Voice is multidimensional and conveys emotion, feelings, and communication. Voice disorders have an adverse effect on the physical, emotional and functional domains of an individual. Self-rating by clients of their voice problem helps clinicians to plan intervention strategies. The Voice Handicap Index (VHI) is one such self-rating scale; it contains 30 questions that quantify the functional, physical and emotional impacts of a voice disorder on a patient’s quality of life, with 10 questions in each subsection. Adapted and validated versions of the VHI are available in other Indian languages but not in Telugu, a Dravidian language native to India that is mainly spoken in Andhra Pradesh and neighbouring states in southern India. Objectives: To adapt and validate the English version of the VHI in Telugu and to evaluate its internal consistency and clinical validity in a Telugu-speaking population. Materials: The study was carried out in three stages. The first stage was forward translation: the English version of the VHI was given to ten experts who were proficient in writing and reading Telugu and to five speech-language pathologists for translation into Telugu. The second stage was backward translation: the Telugu version was given to a different group of ten experts (proficient in writing and reading Telugu) and five speech-language pathologists who were native Telugu speakers with good proficiency in Telugu and English. The third stage was administration of the translated Telugu version to the target population. In total, 40 clinical subjects and 40 normal controls served as participants; each group had 26 males and 14 females aged 20 to 60 years. The clinical group comprised laryngectomees with tracheoesophageal puncture (n=18) and individuals with laryngitis (n=11), vocal nodules (n=7) and vocal fold palsy (n=4).
Participants were asked to rate each experience on a 5-point equal-appearing scale (0=never, 1=almost never, 2=sometimes, 3=almost always, 4=always), giving a maximum total score of 120. Results: Statistical analysis was performed using SPSS software (version 22.0.0). Mean, standard deviation and percentage (%) were calculated for all participants in both groups. The internal consistency of the Telugu VHI was found to be excellent, with consistency scores of 0.742, 0.934 and 0.938 for the physical, emotional and functional domains, respectively. The validity analysis showed a significant difference between the clinical population and the control group for the physical, emotional and functional domains and for total scores (p < 0.001). A negative correlation was found for age and gender with the physical, emotional and functional domain scores and total scores in the dysphonic and control groups. Conclusion: The present study indicates that the Telugu VHI is able to discriminate participants with voice pathology from the normal population, which makes it a valid tool for collecting information from participants about their voice.
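The scoring arithmetic can be sketched as follows: 30 items rated 0 to 4 give a total between 0 and 120, with 10 items per domain. The abstract does not name the internal-consistency statistic; Cronbach's alpha is assumed here because it is the usual choice. The item-to-domain mapping below (three contiguous blocks of ten) is purely hypothetical and does not reflect the actual questionnaire order.

```python
from statistics import variance

# Hypothetical mapping of item indices to domains, for illustration only.
HYPOTHETICAL_DOMAINS = {
    "functional": list(range(0, 10)),
    "physical": list(range(10, 20)),
    "emotional": list(range(20, 30)),
}


def score_vhi(responses, domains=HYPOTHETICAL_DOMAINS):
    """Return per-domain sums and the total for one 30-item response sheet."""
    if len(responses) != 30 or any(r not in (0, 1, 2, 3, 4) for r in responses):
        raise ValueError("expected 30 responses, each an integer from 0 to 4")
    scores = {name: sum(responses[i] for i in idx) for name, idx in domains.items()}
    scores["total"] = sum(responses)
    return scores


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(item_scores[0])
    item_vars = [variance([resp[i] for resp in item_scores]) for i in range(k)]
    total_var = variance([sum(resp) for resp in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Computing `cronbach_alpha` on the ten items of a single domain across all participants would give a per-domain value comparable in kind to the three consistency scores reported above.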

Keywords: adaptation, Telugu Version, translation, Voice Handicap Index (VHI)

Procedia PDF Downloads 257

502 A Comprehensive Methodology for Voice Segmentation of Large Sets of Speech Files Recorded in Naturalistic Environments

Authors: Ana Londral, Burcu Demiray, Marcus Cheetham

Abstract:

Speech recording is a methodology used in many different studies in cognitive and behavioural research. Modern advances in digital equipment have made it possible to continuously record hours of speech in naturalistic environments and to build rich sets of sound files. Speech analysis can then extract multiple features from these files for different scopes of research in language and communication. However, tools for analysing a large set of sound files and automatically extracting relevant features from them are often inaccessible to researchers who are not familiar with programming languages. Manual analysis is a common alternative, but it is costly in time and efficiency. In the analysis of long sound files, the first step is voice segmentation, i.e. detecting and labelling segments containing speech. We present a comprehensive methodology that aims to support researchers in voice segmentation as the first step in the data analysis of a large set of sound files. Praat, an open-source software package, is suggested as a tool to run a voice detection algorithm, label segments and files, and extract other quantitative features from a folder structure containing a large number of sound files. We present the validation of our methodology with a set of 5000 sound files that were collected in the daily life of a group of voluntary participants aged over 65. A smartphone was used to collect sound using the Electronically Activated Recorder (EAR), an app programmed to record 30-second sound samples that were randomly distributed throughout the day. Results demonstrated that automatic segmentation and labelling of files containing speech segments was 74% faster than a manual analysis performed by two independent coders. Furthermore, the methodology allows manual adjustment of voiced segments with visualisation of the sound signal, and the automatic extraction of quantitative information on speech. 
In conclusion, we propose a comprehensive methodology for voice segmentation, to be used by researchers that have to work with large sets of sound files and are not familiar with programming tools.
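
To illustrate what "detecting and labelling segments containing speech" involves, the sketch below uses a simple short-time energy threshold. It is not the Praat-based method of the paper, whose detection algorithm is not specified in the abstract; the function name, frame length, threshold and minimum run length are all illustrative assumptions.

```python
def detect_speech_segments(samples, sample_rate, frame_ms=10, threshold=0.01, min_frames=3):
    """Return (start_s, end_s) tuples for runs of high-energy frames.

    A frame counts as active when its mean squared amplitude exceeds
    `threshold`; runs shorter than `min_frames` frames are discarded.
    All parameter values are illustrative assumptions.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    n_frames = len(samples) // frame_len

    # Mark each full frame as active or inactive based on its mean energy.
    active = []
    for f in range(n_frames):
        frame = samples[f * frame_len:(f + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len
        active.append(energy > threshold)

    # Merge consecutive active frames into labelled segments.
    segments = []
    start = None
    for f, is_active in enumerate(active + [False]):  # sentinel closes a trailing run
        if is_active and start is None:
            start = f
        elif not is_active and start is not None:
            if f - start >= min_frames:
                segments.append((start * frame_len / sample_rate, f * frame_len / sample_rate))
            start = None
    return segments
```

In a batch setting, a function like this would be applied to every file in a folder, and the resulting segment list used to label each file as containing speech or not.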

Keywords: automatic speech analysis, behavior analysis, naturalistic environments, voice segmentation

Procedia PDF Downloads 258

501 The Nimbārka School of Vedānta and the Indian Classical Dance: The Philosophical Relevance through Rasa Theory

Authors: Shubham Arora

Abstract:

This paper illustrates a relationship between the Dvaitādvaita (dualistic non-dualistic) doctrine of the Nimbārka school of Vedānta and the philosophy of Indian classical dance, through the Rasa theory. It first focuses separately on the philosophies of both disciplines and then analyzes Rasa theory as a connection between them. The paper presents ideas regarding the similarity between the Brahman and the dancer, the manifestation of the enacted character and the Jīva (soul), the existence of the phenomenal world and the imaginary world, the classification of rasa on the basis of the three modes of nature, and the feelings and expressions depicting the Dvaita and Advaita. The reason for choosing this topic is to explore the relevance of the Vedantic philosophy of this school in practice. It is important to study the practical implications of the doctrine and its relevance to other disciplines in order to perceive it cogently. In our daily lives, we use various forms of facial expressions and bodily gestures to communicate, along with oral and written means of communication. What happens when gestures and expressions mingle with musical beats to present an idea? Indian classical dance is highly rich in expressing emotions through extraordinary expressions, unconventional bodily gestures and mesmerizing musical beats. Ancient scriptures such as the Nāṭyaśāstra of Bharata Muni and the Abhinava Bhārati by Abhinavaguptā give a well-defined and structured account of the aesthetics of acting and dancing and also reveal the grammar of rasa theory. Indian classical dance is not only entertainment; it is deeply connected with divinity. During the Bhakti movement in India, this art form was used as a means to narrate vignettes from epics such as the Rāmāyana and Mahābhārata and from the Purānas. Even in the present era, this art has a deep-rooted philosophy within.

Keywords: Advaita, Brahman, Dvaita, Jiva, Nimbarka, Rasa, Vedanta

Procedia PDF Downloads 278

500 Applying Biosensors’ Electromyography Signals through an Artificial Neural Network to Control a Small Unmanned Aerial Vehicle

Authors: Mylena McCoggle, Shyra Wilson, Andrea Rivera, Rocio Alba-Flores

Abstract:

This work introduces the use of electromyography (EMG) signals from muscle sensors to develop an Artificial Neural Network (ANN) for pattern recognition to control a small unmanned aerial vehicle. The objective is to demonstrate drone interfaces that go beyond direct manual control. The MyoWare muscle sensor contains three EMG electrodes (dual and single type), which were used to collect signals from the posterior (extensor) and anterior (flexor) forearm and the bicep. Raw voltages from each sensor were collected through an Arduino Uno, and a data processing algorithm was developed to interpret the voltage signals produced when flexing, resting, and moving the arm. Each sensor collected eight values over a two-second period for the duration of one minute, per assessment. During each two-second interval, the movements alternated between a resting reference class and an active motion class, which made it possible to control the drone with left and right movements. This paper further investigated adding up to three sensors to differentiate between hand gestures to control the principal motions of the drone (left, right, up, and land). The hand gestures chosen to execute these movements were a resting position, a thumbs up, a hand swipe right motion, and a flexing position. MATLAB was used to collect, process, and analyze the signals from the sensors. The protocol (machine learning tool) was used to classify the hand gestures. To generate the input vector to the ANN, the mean, root mean square, and standard deviation were computed for every two-second interval of the hand gestures. The neuromuscular information was then used to train an artificial neural network with one hidden layer of 10 neurons to categorize the four targets, one for each hand gesture. Once the machine learning training was completed, the resulting network interpreted the processed inputs and returned the probabilities of each class. 
Once an output probability was greater than or equal to 80% for a specific target class, the drone would perform the expected motion. Each movement was then sent from the computer to the drone through a Wi-Fi network connection. These procedures have been successfully tested and integrated into trial flights, where the drone responded successfully in real time to predefined command inputs with the machine learning algorithm through the MyoWare sensor interface. The full paper will describe in detail the database of the hand gestures, the details of the ANN architecture, and the confusion matrix results.
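
The feature extraction and the 80% decision rule described in the abstract can be sketched as below. The authors worked in MATLAB; this Python sketch is only an illustration, and the function names, the class order in `GESTURES`, and the choice of population standard deviation are assumptions. The trained network itself is not reproduced here.

```python
import math
from statistics import mean, pstdev

# Hypothetical class order; the abstract lists the gestures but not their indices.
GESTURES = ["rest", "thumbs_up", "swipe_right", "flex"]


def window_features(voltages):
    """Mean, root mean square and standard deviation of one two-second window."""
    rms = math.sqrt(sum(v * v for v in voltages) / len(voltages))
    # Population standard deviation is an assumption; the abstract does not specify.
    return [mean(voltages), rms, pstdev(voltages)]


def select_command(probabilities, threshold=0.80):
    """Return the gesture whose probability is at least `threshold`, else None."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return GESTURES[best] if probabilities[best] >= threshold else None
```

Returning `None` when no class reaches the threshold means that an ambiguous window sends no command to the drone, which matches the rule that a motion is performed only on a sufficiently confident match.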

Keywords: artificial neural network, biosensors, electromyography, machine learning, MyoWare muscle sensors, Arduino

Procedia PDF Downloads 144

499 The Effects of Culture and Language on Social Impression Formation from Voice Pleasantness: A Study with French and Iranian People

Authors: L. Bruckert, A. Mansourzadeh

Abstract:

The voice has a major influence on interpersonal communication in everyday life via the perception of pleasantness. The evolutionary perspective postulates that the mechanisms underlying pleasantness judgments are universal adaptations that have evolved in the service of choosing a mate (through the process of sexual selection). From this point of view, the preferred voices would be those with more marked sexually dimorphic characteristics, for example a lower pitch in men, and pitch would be the main criterion. On the other hand, one can postulate that the mechanisms involved are gradually established from childhood through exposure to the environment, and thus that the prosodic elements could take precedence in everyday communication, as they convey information about the speaker's attitude (willingness to communicate, interest in the interlocutors). Our study focuses on voice pleasantness and its relationship with social impression formation, exploring both the spectral aspects (pitch, timbre) and the prosodic ones. We recorded the voices of 25 French males speaking French and 25 Iranian males speaking Farsi using two vocal corpora (five vowels and a read text). French listeners (40 male/40 female) listened to the French voices and made a judgment either on the voice's pleasantness or on the speaker (judgment about his intelligence, honesty, sociability). The regression analyses of our acoustic measures showed that the prosodic elements (for example, intonation and speech rate) are the most important criteria for pleasantness, whatever the corpus or the listener's gender. Moreover, the correlation analyses showed that the speakers whose voices were judged the most pleasant were considered the most intelligent, sociable, and honest. 
The Farsi voices were judged by 80 other French listeners (40 male/40 female), and we found the same effect of intonation on the judgment of pleasantness with the «vowel» corpus, whereas with the «text» corpus pitch was more important than prosody. This may suggest that voice perception contains some elements that are invariant across cultures and languages, whereas others are influenced by the cultural and linguistic background of the listener. In the near future, Iranian listeners will be asked to listen either to the French voices (half of them) or to the Farsi voices (the other half) and to make the same judgments as the French listeners. This experimental design could make it possible to distinguish what is linked to culture from what is linked to language in the case of differences in voice perception.

Keywords: cross-cultural psychology, impression formation, pleasantness, voice perception

Procedia PDF Downloads 41

498 Android – Based Wireless Electronic Stethoscope

Authors: Aw Adi Arryansyah

Abstract:

Using an electronic stethoscope to detect heartbeat and breath sounds is an effective way to investigate cardiovascular diseases. At the same time, technology is moving towards mobile platforms. Almost everyone has a smartphone, smartphones run on many platforms, and creating mobile applications has become easier. HTML5 technology can also be used to create mobile apps. Android is the most widely used platform, which is why we built a wireless electronic stethoscope based on Android. The Android-based wireless electronic stethoscope is designed as a simple system: a sound sensor mounted on a membrane is connected to a Bluetooth module, which sends the heart auscultation voice input data over Bluetooth to an Android device. On the software side, the Android application reads the voice input, renders it as a visualization, and plays the voice output at an adjustable level. The heartbeat sound can be converted into BPM data and used for heartbeat analysis, such as identifying a normal beat, bradycardia or tachycardia.
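
The final step, turning detected heartbeats into a BPM value and a rhythm label, can be sketched as follows. The abstract does not state how beats are detected or which cut-offs are used; the function names are hypothetical, and the 60 and 100 BPM limits are the conventional resting adult thresholds, assumed here for illustration.

```python
def beats_per_minute(beat_times_s):
    """Estimate heart rate from beat timestamps (in seconds) via the mean inter-beat interval."""
    if len(beat_times_s) < 2:
        raise ValueError("need at least two beats")
    intervals = [b - a for a, b in zip(beat_times_s, beat_times_s[1:])]
    return 60.0 / (sum(intervals) / len(intervals))


def classify_rhythm(bpm, low=60.0, high=100.0):
    """Label a heart rate; the 60/100 BPM limits are assumed adult resting thresholds."""
    if bpm < low:
        return "bradycardia"
    if bpm > high:
        return "tachycardia"
    return "normal"
```

For example, beats detected every 0.5 seconds correspond to 120 BPM, which this sketch would label as tachycardia.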

Keywords: wireless, HTML 5, auscultation, bradycardia, tachycardia

Procedia PDF Downloads 325

497 Recognition of Voice Commands of Mentor Robot in Noisy Environment Using Hidden Markov Model

Authors: Khenfer Koummich Fatma, Hendel Fatiha, Mesbahi Larbi

Abstract:

This paper presents an approach based on Hidden Markov Models (HMM) using HTK tools. The goal is to create a human-machine interface with a voice recognition system that allows the operator to teleoperate a mentor robot to execute specific tasks such as rotate, raise, close, etc. This system should take into account different levels of environmental noise. The approach has been applied to isolated words representing the robot commands pronounced in two languages: French and Arabic. The recognition rate obtained is the same for both languages, Arabic and French, for the neutral words. However, there is a slight difference in favor of the Arabic speech when Gaussian white noise is added with a Signal to Noise Ratio (SNR) equal to 30 dB; in this case, the Arabic speech recognition rate is 69%, and the French speech recognition rate is 80%. This can be explained by how the phonetic context of each language behaves when the noise is added.
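
Adding Gaussian white noise at a chosen SNR, as in the noisy test condition above, follows from the definition SNR(dB) = 10 log10(P_signal / P_noise). The sketch below is a generic illustration of that definition, not the authors' HTK pipeline; the function names and the `seed` parameter are assumptions.

```python
import math
import random


def add_white_noise(signal, snr_db, seed=None):
    """Return a copy of `signal` with Gaussian white noise added at the target SNR in dB."""
    rng = random.Random(seed)
    signal_power = sum(x * x for x in signal) / len(signal)
    # Solve SNR(dB) = 10 * log10(P_signal / P_noise) for the noise power.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB between a clean signal and its noisy version."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)
```

At 30 dB the noise power is one thousandth of the signal power, so the empirical SNR of a long enough signal should come out close to 30.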

Keywords: Arabic speech recognition, Hidden Markov Model (HMM), HTK, noise, TIMIT, voice command

Procedia PDF Downloads 338

496 Third Language Perception of English Initial Plosives by Mandarin-Japanese Bilinguals

Authors: Rika Aoki

Abstract:

The aim of this paper is to investigate whether being bilingual facilitates or impedes the perception of a third language. The present study conducted a perception experiment in which Mandarin-Japanese bilinguals categorized a Voice-Onset-Time (VOT) continuum as English /b/ or /p/. The results show that early bilinguals were influenced by both Mandarin and Japanese, while late bilinguals behaved in a similar manner to Mandarin monolinguals. Thus, it can be concluded that, in the present study, having two languages did not help bilinguals to perceive the L3 stop contrast in a native-like manner.

Keywords: bilinguals, perception, third language acquisition, voice-onset-time

Procedia PDF Downloads 262

495 Hear My Voice: The Educational Experiences of Disabled Students

Authors: Karl Baker-Green, Ian Woolsey

Abstract:

Historically, a variety of methods have been used to access the student voice within higher education, including module evaluations and informal classroom feedback. Currently, however, the views articulated in student-staff committee meetings bear the most weight and can therefore have the most significant impact on departmental policy. Arguably, these forums are exclusionary, as several students, including those who experience severe anxiety, might feel unable to participate in these face-to-face (large) group activities. Similarly, students who declare a disability but are not in possession of a learning contract are more likely to withdraw from their studies than those whose additional needs have been formally recognised. It is also worth noting that whilst the number of disabled students in Higher Education has increased in recent years, the percentage of those who have been issued a learning contract has decreased. These issues foreground the need to explore the educational experiences of students with or without a learning contract in order to identify their respective aspirations and needs and therefore help shape education policy. This is in keeping with the ‘Nothing about us without us’ agenda, which recognises that disabled individuals are best placed to understand their own requirements and the most effective strategies to meet these.

Keywords: education, student voice, student experience, student retention

Procedia PDF Downloads 75

494 Hand Motion Tracking as a Human Computer Interaction for People with Cerebral Palsy

Authors: Ana Teixeira, Joao Orvalho

Abstract:

This paper describes experiments using Scratch games, carried out by students of the Master in Human Computer Interaction (HCI) of IPC Coimbra, to check the feasibility of using the gestures of users with cerebral palsy (CP) as an alternative means of interacting with a computer. The main focus of this work is to study the usability of a web camera as a motion tracking device for achieving virtual human-computer interaction by individuals with CP. An approach to human-computer interaction (HCI) is presented in which individuals with cerebral palsy react to and interact with a Scratch game using a webcam as an external interaction device. Motion tracking interaction is an emerging technology that is becoming more useful, effective and affordable. However, it raises new questions from the HCI viewpoint, for example, which environments are most suitable for interaction by users with disabilities. In our case, we put emphasis on the accessibility and usability aspects of such interaction devices to meet the special needs of people with disabilities, and specifically people with CP. Although our work has just started, preliminary results show that, in general, computer vision interaction systems are very useful; in some cases, these systems are the only way in which some people can interact with a computer. The purpose of the experiments was to verify two hypotheses: 1) people with cerebral palsy can interact with a computer using their natural gestures, 2) Scratch games can be a research tool in experiments with disabled young people. A three-level game was created in Scratch to be played using a webcam. This device permits the detection of certain key points of the user’s body, which allows the head, arms and especially the hands to be treated as the most important elements for recognition. Tests with 5 individuals of different ages and genders were carried out over 3 days in sessions of 30 minutes with each participant. 
For a more extensive and reliable statistical analysis, the number of both participants and repetitions should be increased in further investigations. However, already at this stage of the research, it is possible to draw some conclusions. The first and most important is that simple Scratch games on the computer can be a research tool for investigating how young persons with CP interact with a computer using intentional gestures. Measurements performed with the assistance of games are attractive for young disabled users. The second important conclusion is that they are able to play Scratch games using their gestures. Therefore, the proposed interaction method is promising for them as a human-computer interface. In the future, we plan to develop multimodal interfaces that combine various computer vision devices with other input devices, to improve the existing systems so that they better accommodate the special needs of individuals, and to perform experiments with a larger number of participants.

Keywords: motion tracking, cerebral palsy, rehabilitation, HCI

Procedia PDF Downloads 213

493 Stereotypical Motor Movement Recognition Using Microsoft Kinect with Artificial Neural Network

Authors: M. Jazouli, S. Elhoufi, A. Majda, A. Zarghili, R. Aalouane

Abstract:

Autism spectrum disorder is a complex developmental disability. It is defined by a certain set of behaviors. Persons with Autism Spectrum Disorders (ASD) frequently engage in stereotyped and repetitive motor movements. The objective of this article is to propose a method to automatically detect this unusual behavior. Our study provides a clinical tool that helps doctors diagnose ASD. We focus on the automatic identification, in real time, of five repetitive gestures among autistic children: body rocking, hand flapping, finger flapping, hand on the face and hands behind the back. In this paper, we present a gesture recognition system for children with autism, which consists of three modules: model-based movement tracking, feature extraction, and gesture recognition using an artificial neural network (ANN). The first module uses the Microsoft Kinect sensor, the second chooses points of interest from the 3D skeleton to characterize the gestures, and the last proposes a neural connectionist model to perform the supervised classification of the data. The experimental results show that our system can achieve a recognition rate above 93.3%.

Keywords: ASD, artificial neural network, kinect, stereotypical motor movements

Procedia PDF Downloads 279

492 Functional Outcome of Speech, Voice and Swallowing Following Excision of Glomus Jugulare Tumor

Authors: B. S. Premalatha, Kausalya Sahani

Abstract:

Background: Glomus jugulare tumors arise within the jugular foramen and are commonly seen in females, particularly on the left side. Surgical excision of the tumor may cause lower cranial nerve deficits. Cranial nerve involvement produces hoarseness of voice, slurred speech, and dysphagia along with other physical symptoms, thereby affecting the quality of life of individuals. Although oncological clearance is the main emphasis when treating these individuals, little importance is given to their communication, voice and swallowing problems, which play a crucial part in daily functioning. Objective: To examine the voice, speech and swallowing outcomes of subjects following excision of a glomus jugulare tumor. Methods: Two female subjects, aged 56 and 62 years, who presented with complaints of change in voice, inability to swallow and reduced clarity of speech following surgery for a left glomus jugulare tumor, participated in the study. Their surgical information revealed multiple cranial nerve palsies involving the left facial, left superior and recurrent branches of the vagus nerve, left pharyngeal, left soft palate, left hypoglossal and vestibular nerves. Functional outcomes of voice, speech and swallowing were evaluated by perceptual and objective assessment procedures. Assessment included the examination of oral structures and functions, dysarthria by the Frenchay Dysarthria Assessment, cranial nerve functions and swallowing functions. MDVP and Dr. Speech software were used to evaluate the acoustic parameters of voice and the quality of voice, respectively. Results: The study revealed that both subjects, subsequent to excision of the glomus jugulare tumor, showed a varied picture of affected oral structures and functions, articulation, voice and swallowing functions. The cranial nerve assessment showed impairment of the vagus, hypoglossal, facial and glossopharyngeal nerves. 
Voice examination indicated vocal cord paralysis associated with a breathy quality of voice, weak voluntary cough, reduced pitch and loudness range, and poor respiratory support. Perturbation parameters such as jitter and shimmer were affected, along with the s/z ratio, indicative of vocal fold pathology. Reduced MPD (Maximum Phonation Duration) of vowels indicated disturbed coordination between the respiratory and laryngeal systems. Hypernasality was found to be a prominent feature, which reduced speech intelligibility. Imprecise articulation was seen in both subjects, as the hypoglossal nerve was affected following surgery. Injury to the vagus, hypoglossal, glossopharyngeal and facial nerves disturbed swallowing function. All phases of the swallow were affected. Aspiration was observed before and during the swallow, confirming oropharyngeal dysphagia. All the subsystems were affected as per the Frenchay Dysarthria Assessment, signifying a diagnosis of flaccid dysarthria. Conclusion: Observable communication and swallowing difficulties are seen following excision of a glomus jugulare tumor. Even with complete resection, extensive rehabilitation may be necessary due to significant lower cranial nerve dysfunction. The findings of the present study stress the need to involve a speech and swallowing therapist in pre-operative counseling and in the assessment of functional outcomes.

Keywords: functional outcome, glomus jugulare tumor excision, multiple cranial nerve impairment, speech and swallowing

Procedia PDF Downloads 228

491 My Voice My Well-Being: A Participatory Research Study with Secondary School Students in Bangladesh

Authors: Saira Hossain

Abstract:

Well-being commonly refers to the concept of a good life. Similarly, student well-being can be understood as the notion of a good life at school. What constitutes a good life at school for students is an emerging question that attracts great interest in this area of research. Student well-being is associated not only with a student’s socio-emotional and academic development at school but also with success in life after school as an adult. Today, student well-being is a popular agenda for educators, policymakers, teachers, parents, and most importantly, for students. With the emergence of student well-being, students' voice in matters important to them at school is increasingly being given priority. However, the coin has another side too. Despite the growing importance of understanding student well-being, it is still an alien concept in countries like Bangladesh. The education system of Bangladesh is highly rigid, centralized, and exam-focused. Students' academic achievement is given the utmost priority at school, whereas their voice, as well as their well-being, is grossly neglected in practice. In this regard, the study set out to explore students' conceptualization of well-being at school in Bangladesh. The study was qualitative. It employed a participatory research approach to elicit the views of 25 secondary school students aged 14-16 in Bangladesh and explore the concept of well-being. Data analysis was conducted following the thematic analysis technique. The results suggested that students conceptualized well-being as a multidimensional concept with multiple domains, including having, being, relating, feeling, thinking, functioning, and striving. The future implications of the study findings are discussed. Additionally, the study underscores the value of the participatory approach as a research technique for exploring students' opinions in Bangladesh, where there exists a culture of silence regarding the student's voice.

Keywords: Bangladesh, participatory research, secondary school, student well-being

Procedia PDF Downloads 97