Search results for: hearing aid output speech
2967 Minimum Data of a Speech Signal as Special Indicators of Identification in Phonoscopy
Authors: Nazaket Gazieva
Abstract:
Voice biometric data, which reflect physiological, psychological and other factors, are widely used in forensic phonoscopy. There are various methods for identifying and verifying a person by voice. This article explores the minimum speech signal data as individual parameters of a speech signal. Monozygotic twins are believed to be genetically identical. Using the minimum data of the speech signal, we concluded that the voice imprint of monozygotic twins is individual. The experiment indicates that the minimum indicators of the speech signal are more stable and reliable for phonoscopic examinations.
Keywords: phonogram, speech signal, temporal characteristics, fundamental frequency, biometric fingerprints
Procedia PDF Downloads 144
2966 Audio-Visual Co-Data Processing Pipeline
Authors: Rita Chattopadhyay, Vivek Anand Thoutam
Abstract:
Speech is the most widely accepted means of communication, allowing us to exchange feelings and thoughts quickly. Quite often, people can communicate orally but cannot interact or work with computers or devices. It is easier and quicker to give speech commands than to type commands into a computer. In the same way, it is easier to listen to audio played from a device than to extract output from computers or devices. With robotics an emerging market with applications in warehouses, the hospitality industry, consumer electronics, assistive technology, etc., speech-based human-machine interaction is emerging as a lucrative feature for robot manufacturers. Considering this factor, the objective of this paper is to design the “Audio-Visual Co-Data Processing Pipeline.” This pipeline integrates automatic speech recognition, a natural language model for text understanding, object detection, and text-to-speech modules. There are many deep learning models for each of these modules, but OpenVINO Model Zoo models are used because the OpenVINO toolkit covers both computer vision and non-computer vision workloads across Intel hardware, maximizes performance, and accelerates application development. A speech command is given as input; it contains information about the target objects to be detected and the start and end times used to extract the required interval from the video. Speech is converted to text using the QuartzNet automatic speech recognition model. A summary is extracted from the text using the natural language model Generative Pre-Trained Transformer-3 (GPT-3). Based on the summary, essential frames are extracted from the video, and the You Only Look Once (YOLO) object detection model detects objects in these extracted frames. Frame numbers that contain target objects (the objects specified in the speech command) are saved as text.
Finally, this text (frame numbers) is converted to speech using a text-to-speech model and played from the device. This project is developed for 80 YOLO labels, and the user can extract frames based on only one or two target labels. The pipeline can easily be extended to more than two target labels by making appropriate changes in the object detection module. The project supports four different speech command formats, which are defined by including sample examples in the prompt used by the GPT-3 model. Based on user preference, a new speech command format can be added by including some examples of that format in the prompt. This pipeline can be used in many projects such as human-machine interfaces, human-robot interaction, and surveillance through speech commands. Any object detection project can be upgraded with this pipeline so that users give speech commands and the output is played from the device.
Keywords: OpenVINO, automatic speech recognition, natural language processing, object detection, text to speech
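The frame-selection step described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: it presumes that an object detector has already produced a list of labels for each frame, and the function name `frames_with_targets`, the `fps` argument, and the sample detections are hypothetical and not taken from the paper.

```python
def frames_with_targets(detections, targets, start_s, end_s, fps):
    """Return frame numbers inside [start_s, end_s] whose detected labels
    include at least one target label.

    detections: dict mapping frame number -> iterable of detected labels
                (assumed output of some object detector such as YOLO).
    """
    first = int(start_s * fps)
    last = int(end_s * fps)
    wanted = set(targets)
    return [
        frame
        for frame in sorted(detections)
        if first <= frame <= last and wanted & set(detections[frame])
    ]


# Hypothetical detections for a clip sampled at 10 frames per second.
sample = {0: ["person"], 12: ["dog", "person"], 25: ["car"], 31: ["dog"], 48: ["dog"]}
hits = frames_with_targets(sample, ["dog"], start_s=1.0, end_s=4.0, fps=10)

# The text that would then be handed to a text-to-speech module.
report = "Target found in frames: " + ", ".join(str(f) for f in hits)
print(report)
```

Supporting more than two target labels in this sketch only requires passing a longer `targets` list; the detector itself is outside its scope.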
Procedia PDF Downloads 80
2965 Barriers to Marital Expectation among Individuals with Hearing Impairment in Oyo State
Authors: Adebomi M. Oyewumi, Sunday Amaize
Abstract:
The study was designed to examine the barriers to marital expectations among unmarried persons with hearing impairment in Oyo State, Nigeria. A descriptive survey research design was adopted. A purposive sampling technique was used to select one hundred participants, made up of forty-four (44) males and fifty-six (56) females, all with varying degrees of hearing impairment. Eight research questions were raised and answered. The instrument used was the Marital Expectations Scale, with a reliability coefficient of 0.86. Data were analyzed using the descriptive statistics of frequency count and simple percentage as well as the inferential statistics of the t-test and ANOVA. The findings revealed a significant relationship among the main identified barriers (environmental barrier, communication barrier, hearing loss, unemployment and poor sexuality education) to the marital expectations of unmarried persons with hearing impairment. The joint contribution of the independent variables (identified barriers) to the dependent variable (marital expectations) was significant, F = 5.842, P < 0.05, accounting for about 89% of the variance. The relative contribution of the identified barriers to the marital expectations of unmarried persons with hearing impairment is as follows: environmental barrier (β = 0.808, t = 5.176, P < 0.05), communication barrier (β = 0.533, t = 3.305, P < 0.05), hearing loss (β = 0.550, t = 2.233, P < 0.05), unemployment (β = 0.431, t = 2.102, P < 0.05), poor sexuality education (β = 0.361, t = 1.985, P < 0.05). Environmental barrier proved to be the most potent contributor to the poor marital expectations among unmarried persons with hearing impairment. Therefore, it is recommended that society dismantle the nagging environmental barrier through positive identification with individuals with hearing impairment.
In this connection, members of society should change their negative attitudes and discard mistaken notions about the marital ability of individuals with hearing impairment.
Keywords: environmental barrier, hearing impairment, marriage, marital expectations
Procedia PDF Downloads 370
2964 Learning Programming for Hearing Impaired Students via an Avatar
Authors: Nihal Esam Abuzinadah, Areej Abbas Malibari, Arwa Abdulaziz Allinjawi, Paul Krause
Abstract:
Deaf and hearing-impaired students face many obstacles throughout their education, especially when learning applied sciences such as computer programming. In addition, there are no clear signs in Arabic Sign Language that can be used to identify programming logic terms such as while, for, case, switch, etc. However, hearing disabilities should not be a barrier to study nowadays, especially with the rapid growth in educational technology. In this paper, we develop an avatar-based system to teach computer programming to deaf and hearing-impaired students using Arabic Sign Language, with a new sign vocabulary that has been developed for computer programming education. The system was tested on a number of high school students, and the results showed the importance of visualization through the avatar in increasing deaf students' comprehension of concepts.
Keywords: hearing-impaired students, isolation, self-esteem, learning difficulties
Procedia PDF Downloads 145
2963 Emotional and Physiological Reaction While Listening the Speech of Adults Who Stutter
Authors: Xharavina V., Gallopeni F., Ahmeti K.
Abstract:
Stuttered speech is filled with intermittent sound prolongations and/or rapid part-word repetitions. Oftentimes, these aberrant acoustic behaviors are associated with intermittent physical tension and struggle behaviors such as head jerks, arm jerks, finger tapping, excessive eye-blinks, etc. Additionally, the jarring nature of the acoustic and physical manifestations that often accompany moderate-to-severe stuttering may induce negative emotional responses in listeners, which alters communication between the person who stutters and their listeners. However, research on the influence of negative emotions on communication and on physical reactions is limited. Therefore, it is necessary to compare the psycho-physiological responses of fluent adults while they listen to the speech of adults who speak fluently and of adults who stutter. This study used an experimental method with a total of 104 participants (average age 20 years, SD = 2.1), divided into 3 groups. All participants self-reported no impairments in speech, language, or hearing. To explore the participants' responses, two recorded speech samples were used: one of a voice speaking fluently and one of a voice that stutters. Heartbeat and pulse were measured with a digital blood pressure monitor called 'Tensoval' as a physiological response to the fluent and stuttering samples. Meanwhile, the emotional responses of participants were measured with a self-report questionnaire (Steenbarger, 2001). Results showed an increase in heartbeat during the stuttering speech compared with the fluent sample (p < 0.5). The listeners also reported feeling more alive, unhappy, nervous, repulsive, sad, tense, distracted and upset when listening to the stuttering words than to the words of the fluent adult (for which they reported experiencing positive emotions). These data support the notion that stuttered speech can bring about a psycho-physical reaction in listeners.
Speech pathologists should be aware that listeners show intolerable physiological reactions to stuttering that remain visible over time.
Keywords: emotional, physiological, stuttering, fluent speech
Procedia PDF Downloads 143
2962 Intervention of Self-Limiting L1 Inner Speech during L2 Presentations: A Study of Bangla-English Bilinguals
Authors: Abdul Wahid
Abstract:
Inner speech, also known as verbal thinking, self-talk or private speech, is characterized by the subjective experience of language in the absence of overt or audible speech. It is a psychological form of verbal activity that is rehearsed without the articulation of any sound wave. In psychology, self-limiting speech means the type of speech that contains information inhibiting the development of the self. People, in most cases, experience inner speech in their first language. It is very common in Bangladesh for Bangla (L1) speaking students to lose track of their speech during presentations in English (L2). This paper investigates the long pauses (more than 0.4 seconds long) in English (L2) presentations by Bangla-speaking students (18-21 years old) and finds the intervention of Bangla (L1) inner speech to be one of their causes. The overt speech of the presenters was loaded into the Audacity audio editing software, where the length of the pauses was measured in milliseconds. The Varieties of Inner Speech Questionnaire (VISQ) was administered randomly among the participants, of whom 20 with a similar phenomenology of inner speech were selected. They were interviewed to describe the type and content of the voices that went on in their heads during the long pauses. The qualitative interview data were then coded and converted into quantitative data. It was observed that in more than 80% of cases students experienced self-limiting inner speech/self-talk during their unwanted pauses in L2 presentations.
Keywords: Bangla-English Bilinguals, inner speech, L1 intervention in bilingualism, motor schema, pauses, phonological loop, phonological store, working memory
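The pause criterion used in this abstract (silent gaps longer than 0.4 seconds, reported in milliseconds) can be illustrated with a short sketch. It assumes that the stretches of audible speech have already been marked as (start, end) times in seconds, for example by hand in an audio editor; the function name `long_pauses` and the sample segment times are hypothetical and do not come from the study.

```python
def long_pauses(speech_segments, min_pause_s=0.4):
    """Return (pause_start_s, pause_end_s, duration_ms) for every silent gap
    between consecutive speech segments that is longer than min_pause_s.

    speech_segments: list of (start_s, end_s) tuples marking audible speech.
    """
    ordered = sorted(speech_segments)
    pauses = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        gap = next_start - prev_end
        if gap > min_pause_s:
            pauses.append((prev_end, next_start, round(gap * 1000)))
    return pauses


# Made-up segment boundaries for one short presentation excerpt.
segments = [(0.0, 2.1), (2.3, 5.0), (6.2, 8.0), (8.3, 9.0)]
print(long_pauses(segments))
```

Only the gap between 5.0 s and 6.2 s exceeds the threshold in this sample, so it is the only pause returned.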
Procedia PDF Downloads 152
2961 Personality, Coping, Quality of Life, and Distress in Persons with Hearing Loss: A Cross-Sectional Study of Patients Referred to an Audiological Service
Authors: Oyvind Nordvik, Peder O. L. Heggdal, Jonas Brannstrom, Flemming Vassbotn, Anne Kari Aarstad, Hans Jorgen Aarstad
Abstract:
Background: Hearing Loss (HL) is a condition that may affect people in all stages of life, but the prevalence increases with age, mostly because of age-related HL, generally referred to as presbyacusis. As human speech is related to relatively high frequencies, even a limited hearing loss at high frequencies may cause impaired speech intelligibility. Being diagnosed with, treated for and living with a chronic condition such as HL must for many be a disabling and stressful experience that puts one's coping resources to the test. Stress is a natural part of life, and most people will experience stressful events or periods. Chronic diseases, such as HL, are a risk factor for distress in individuals, causing anxiety and lowered mood. How an individual copes with HL may be closely connected to the level of distress he or she is experiencing and to personality, which can be defined as those characteristics of a person that account for consistent patterns of feelings, thinking, and behavior. Thus, with regard to distress in life, such as illness or disease, the available coping strategies may be more important than the challenge itself. The same line of argument applies to the level of experienced health-related quality of life (HRQoL). Aim: The aim of this study was to investigate the relationship between distress, HRQoL, reported hearing loss, personality and coping in patients with HL. Method: 158 adult (aged 18-78 years) patients with HL, referred for hearing aid (HA) fitting at Haukeland University Hospital in western Norway, participated in the study. Both first-time users and patients referred for HA renewals were included. First-time users had been pre-examined by an ENT specialist. The questionnaires were answered before the actual HA fitting procedure. The pure-tone average (PTA; frequencies 0.5, 1, 2, and 4 kHz) was determined for each ear.
The Eysenck personality inventory, neuroticism and lie scales, the Theoretically Originated Measure of the Cognitive Activation Theory of Stress (TOMCATS) measuring active coping, hopelessness and helplessness, as well as distress (General Health Questionnaire (GHQ) - 12 items) and the EORTC Quality of Life Questionnaire general part were answered. In addition, we used a revised and shortened version of the Abbreviated Profile of Hearing Aid Benefit (APHAB) as a measure of patient-reported hearing loss. Results: Significant correlations were determined between APHAB (weak), HRQoL scores (strong), distress scores (strong) on the one side and personality and choice of coping scores on the other side. As measured by stepwise regression analyses, the distress and HRQoL scores were scored secondary to the obtained personality and coping scores. The APHAB scores were, as determined by regression analyses, scored secondary to PTA (best ear), level of neuroticism and lie score. Conclusion: We found that reported employed coping style, distress/HRQoL and personality are closely connected to each other in this patient group. Patient-reported HL was associated with hearing level and personality. There is a need for further investigation of these questions and of how these associations may influence the clinical context.
Keywords: coping, distress, hearing loss, personality
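The pure-tone average mentioned in the Method section is an arithmetic mean of the hearing thresholds at 0.5, 1, 2, and 4 kHz, and the "best ear" is the ear with the lower average. A minimal sketch follows; the function names and the sample audiogram values are hypothetical, and the thresholds are assumed to be given in dB.

```python
PTA_FREQS_KHZ = (0.5, 1, 2, 4)  # frequencies named in the abstract


def pure_tone_average(thresholds_db):
    """Mean hearing threshold over the four PTA frequencies.

    thresholds_db: dict mapping frequency in kHz -> threshold in dB.
    Frequencies outside PTA_FREQS_KHZ are ignored.
    """
    return sum(thresholds_db[f] for f in PTA_FREQS_KHZ) / len(PTA_FREQS_KHZ)


def best_ear_pta(left_db, right_db):
    """PTA of the better-hearing ear (lower threshold means better hearing)."""
    return min(pure_tone_average(left_db), pure_tone_average(right_db))


# Made-up audiogram for illustration only.
left = {0.5: 20, 1: 25, 2: 35, 4: 50, 8: 60}
right = {0.5: 15, 1: 20, 2: 30, 4: 45, 8: 55}
print(pure_tone_average(left), pure_tone_average(right), best_ear_pta(left, right))
```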
Procedia PDF Downloads 146
2960 Performance Evaluation of Acoustic-Spectrographic Voice Identification Method in Native and Non-Native Speech
Authors: E. Krasnova, E. Bulgakova, V. Shchemelinin
Abstract:
The paper deals with the acoustic-spectrographic voice identification method in terms of its performance on non-native language speech. Performance evaluation is conducted by comparing the results of the analysis of recordings containing native language speech with those of recordings that contain foreign language speech. Our research is based on the Tajik and Russian speech of Tajik native speakers, owing to the character of the criminal situation with drug trafficking. We propose a pilot experiment that represents a first attempt to enter the field.
Keywords: speaker identification, acoustic-spectrographic method, non-native speech, performance evaluation
Procedia PDF Downloads 446
2959 Automatic Segmentation of the Clean Speech Signal
Authors: M. A. Ben Messaoud, A. Bouzid, N. Ellouze
Abstract:
Speech segmentation is the detection of change points in order to partition an input speech signal into regions, each of which corresponds to only one speaker. In this paper, we apply two features based on the multi-scale product (MP) of the clean speech, namely the spectral centroid of the MP and the zero-crossing rate of the MP. We focus on multi-scale product analysis as an important tool for segment extraction. The multi-scale product is obtained by multiplying the wavelet transform coefficients of the speech at three successive dyadic scales. We have evaluated our method on the Keele database. Experimental results show the effectiveness and good performance of our method: the two simple features can find word boundaries and extract the segments of the clean speech.
Keywords: multiscale product, spectral centroid, speech segmentation, zero crossings rate
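The two features named in this abstract are standard signal descriptors and can be sketched without any external library. The code below is an illustration under stated assumptions, not the authors' implementation: the wavelet coefficients at the three scales are assumed to be computed elsewhere and passed in as equal-length sequences, the spectrum is obtained with a naive discrete Fourier transform, and all function names are hypothetical.

```python
import cmath
import math


def multiscale_product(scale1, scale2, scale3):
    """Pointwise product of wavelet coefficients at three scales
    (the coefficients themselves are assumed to be computed elsewhere)."""
    return [a * b * c for a, b, c in zip(scale1, scale2, scale3)]


def zero_crossing_rate(x):
    """Fraction of consecutive sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(x, x[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(x) - 1)


def spectral_centroid(x, sample_rate):
    """Magnitude-weighted mean frequency (Hz) over bins 0..N/2, naive DFT."""
    n = len(x)
    mags = []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(coeff))
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


# A pure 8 Hz tone sampled at 64 Hz: its centroid should sit at about 8 Hz.
tone = [math.sin(2 * math.pi * 8 * t / 64) for t in range(64)]
print(spectral_centroid(tone, 64), zero_crossing_rate([1, -1, 1, -1]))
```

In a segmentation setting these functions would be applied frame by frame to the multi-scale product rather than to the raw waveform; the naive DFT is quadratic in the frame length and is used here only to keep the sketch self-contained.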
Procedia PDF Downloads 501
2958 Eisenhower’s Farewell Speech: Initial and Continuing Communication Effects
Authors: B. Kuiper
Abstract:
When Dwight D. Eisenhower delivered his final Presidential speech in 1961, he was using the opportunity to bid farewell to America, but he was also trying to warn his fellow countrymen about deeper challenges threatening the country. In this analysis, Eisenhower’s speech is examined in light of the impact it had on American culture, communication concepts, and political ramifications. The paper initially highlights the previous literature on the speech, especially in light of its 50th anniversary, and reveals a man whose main concern was how the speech’s words would affect his beloved country. The painstaking approach to the wording of the speech to reveal the intent is key, particularly in light of analyzing the motivations according to “virtuous communication.” This philosophical construct indicates that Eisenhower’s Farewell Address was crafted carefully according to a departing President’s deepest values and concerns, concepts that he wanted to pass along to his successor, to his country, and even to the world.
Keywords: Eisenhower, mass communication, political speech, rhetoric
Procedia PDF Downloads 275
2957 A Sparse Representation Speech Denoising Method Based on Adapted Stopping Residue Error
Authors: Qianhua He, Weili Zhou, Aiwu Chen
Abstract:
A sparse representation speech denoising method based on adapted stopping residue error is presented in this paper. Firstly, the cross-correlation between the clean speech spectrum and the noise spectrum was analyzed, and an estimation method was proposed. In the denoising method, an over-complete dictionary of the clean speech power spectrum was learned with the K-singular value decomposition (K-SVD) algorithm. In the sparse representation stage, the stopping residue error was adaptively set according to the estimated cross-correlation and the adjusted noise spectrum, and the orthogonal matching pursuit (OMP) approach was applied to reconstruct the clean speech spectrum from the noisy speech. Finally, the clean speech was re-synthesised via the inverse Fourier transform from the reconstructed speech spectrum and the noisy speech phase. The experimental results show that the proposed method outperforms the conventional methods in terms of subjective and objective measures.
Keywords: speech denoising, sparse representation, k-singular value decomposition, orthogonal matching pursuit
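The role of the stopping residue error in orthogonal matching pursuit can be seen in a small, generic sketch. This is textbook OMP written in plain Python, not the authors' method: the dictionary is supplied directly rather than learned with K-SVD, the stopping error is passed in as a fixed number rather than adapted from the estimated noise, the atoms are assumed to be linearly independent, and the names `omp`, `solve` and `dot` are illustrative.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x


def omp(atoms, signal, stop_error):
    """Greedy sparse coding: add atoms until the residual norm <= stop_error.

    Returns ({atom index: coefficient}, residual).
    """
    residual = signal[:]
    chosen, coeffs = [], []
    while math.sqrt(dot(residual, residual)) > stop_error and len(chosen) < len(atoms):
        # Pick the unused atom most correlated with the current residual.
        best = max(
            (i for i in range(len(atoms)) if i not in chosen),
            key=lambda i: abs(dot(atoms[i], residual)),
        )
        chosen.append(best)
        # Least-squares refit on all chosen atoms via the normal equations.
        gram = [[dot(atoms[i], atoms[j]) for j in chosen] for i in chosen]
        rhs = [dot(atoms[i], signal) for i in chosen]
        coeffs = solve(gram, rhs)
        approx = [
            sum(c * atoms[i][k] for c, i in zip(coeffs, chosen))
            for k in range(len(signal))
        ]
        residual = [s - a for s, a in zip(signal, approx)]
    return dict(zip(chosen, coeffs)), residual


atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
signal = [2.0, 0.0, 0.5]
tight, _ = omp(atoms, signal, stop_error=0.1)
loose, _ = omp(atoms, signal, stop_error=0.6)
print(tight, loose)
```

A larger stopping error ends the pursuit earlier and keeps fewer atoms, which is why tying that threshold to the noise level matters for denoising: small components that are likely to be noise are left in the residual instead of being reconstructed.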
Procedia PDF Downloads 500
2956 Co-Design of Accessible Speech Recognition for Users with Dysarthric Speech
Authors: Elizabeth Howarth, Dawn Green, Sean Connolly, Geena Vabulas, Sara Smolley
Abstract:
Through the EU Horizon 2020 Nuvoic Project, the project team recruited 70 individuals in the UK and Ireland to test the Voiceitt speech recognition app and provide user feedback to developers. The app is designed for people with dysarthric speech, to support communication with unfamiliar people and access to speech-driven technologies such as smart home equipment and smart assistants. Participants with atypical speech, due to a range of conditions such as cerebral palsy, acquired brain injury, Down syndrome, stroke and hearing impairment, were recruited, primarily through organisations supporting disabled people. Most had physical or learning disabilities in addition to dysarthric speech. The project team worked with individuals, their families and local support teams, to provide access to the app, including through additional assistive technologies where needed. Testing was user-led, with participants asked to identify and test use cases most relevant to their daily lives over a period of three months or more. Ongoing technical support and training were provided remotely and in-person throughout the testing period. Structured interviews were used to collect feedback on users' experiences, with delivery adapted to individuals' needs and preferences. Informal feedback was collected through ongoing contact between participants, their families and support teams and the project team. Focus groups were held to collect feedback on specific design proposals. User feedback shared with developers has led to improvements to the user interface and functionality, including faster voice training, simplified navigation, the introduction of gamification elements and of switch access as an alternative to touchscreen access, with other feature requests from users still in development. This work offers a case-study in successful and inclusive co-design with the disabled community.
Keywords: co-design, assistive technology, dysarthria, inclusive speech recognition
Procedia PDF Downloads 111
2955 Factors That Contribute to Noise Induced Hearing Loss Amongst Employees at the Platinum Mine in Limpopo Province, South Africa
Authors: Livhuwani Muthelo, R. N. Malema, T. M. Mothiba
Abstract:
Long-term exposure to excessive noise in the mining industry increases the risk of noise induced hearing loss (NIHL), with consequences for employees' health, productivity and overall quality of life. Objective: The objective of this study was to investigate the factors that contribute to Noise Induced Hearing Loss amongst employees at the Platinum mine in the Limpopo Province, South Africa. Study method: A qualitative, phenomenological, exploratory, descriptive, contextual design was applied in order to explore and describe the contributory factors. Purposive non-probability sampling was used to select 10 male employees who were diagnosed with NIHL in the year 2014 in four mine shafts, and 10 managers who were involved in a Hearing Conservation Programme. The data were collected using semi-structured one-on-one interviews. Tesch's approach to qualitative data analysis was followed. Results: The following themes emerged: experiences and challenges faced by employees in the work environment, hearing protective device factors, and management and leadership factors. Hearing loss was caused by partial application of guidelines, policies, and procedures from the Department of Minerals and Energy. Conclusion: The study results indicate that although guidelines, policies, and procedures are available, failure in the implementation of one element will affect the development and maintenance of employees' hearing mechanism. It is recommended that the mine management apply the guidelines, policies, and procedures and promptly repair broken hearing protective devices.
Keywords: employees, factors, noise induced hearing loss, noise exposure
Procedia PDF Downloads 128
2954 Development of an Artificial Ear for Bone-Conducted Objective Occlusion Measurement
Authors: Yu Luan
Abstract:
The bone-conducted objective occlusion effect (OE) is characterized by a discomforting sensation of fullness experienced in an occluded ear. This phenomenon arises from various external stimuli, such as human speech, chewing, and walking, which generate vibrations transmitted through the body to the ear canal walls. The bone-conducted OE occurs due to the pressure build-up inside the occluded ear caused by sound radiating into the ear canal cavity from its walls. In the hearing aid industry, artificial ears are utilized as a tool for developing hearing aids. However, the currently available commercial artificial ears primarily focus on pure acoustics measurements, neglecting the bone-conducted vibration aspect. This research endeavors to develop an artificial ear specifically designed for bone-conducted occlusion measurements. Finite element analysis (FEA) modeling has been employed to gain insights into the behavior of the artificial ear.
Keywords: artificial ear, bone conducted vibration, occlusion measurement, finite element modeling
Procedia PDF Downloads 90
2953 Concurrent Validity of Synchronous Tele-Audiology Hearing Screening
Authors: Thidilweli Denga, Bessie Malila, Lucretia Petersen
Abstract:
The Coronavirus Disease of 2019 (COVID-19) pandemic should be taken as a wake-up call on the importance of hearing health care, considering, amongst other things, the electronic methods of communication used. The World Health Organization (WHO) estimated that by 2050, there will be more than 2.5 billion people living with hearing loss. These numbers show that more people will need rehabilitation services. Studies have shown that most people living with hearing loss reside in Low- and Middle-Income Countries (LMICs). Innovative technological solutions, such as digital health interventions that can be used to deliver hearing health services to remote areas, now exist. Tele-audiology implementation can potentially enable the delivery of hearing loss services to rural and remote areas. This study aimed to establish the concurrent validity of tele-audiology practice in school-based hearing screening. The study employed a cross-sectional design with a within-group comparison. The portable KUDUwave Audiometer was used to conduct hearing screening of 50 participants (n=50). In phase I of the study, the audiologist conducted on-site hearing screening, while synchronous remote hearing screening (tele-audiology) over a 5G network was done in phase II. On-site hearing screening results were obtained first for the first 25 participants (aged between 5 and 6 years). The second half started with the synchronous tele-audiology model to avoid an order effect. Repeated-sample t-tests compared threshold results obtained in the left and right ears for on-site and remote screening. There was good correspondence between the two methods, with threshold averages within ±5 dB (decibels). The synchronous tele-audiology model has the potential to reduce audiologists' case overload while at the same time reaching populations that lack access due to distance and a shortage of hearing professionals in their areas.
With reliable broadband connectivity, tele-audiology delivers the same service quality as the conventional method while reducing the travel costs of audiologists.
Keywords: hearing screening, low-resource communities, portable audiometer, tele-audiology
Procedia PDF Downloads 121
2952 Healthcare-SignNet: Advanced Video Classification for Medical Sign Language Recognition Using CNN and RNN Models
Authors: Chithra A. V., Somoshree Datta, Sandeep Nithyanandan
Abstract:
Sign Language Recognition (SLR) is the process of interpreting and translating sign language into spoken or written language using technological systems. It involves recognizing hand gestures, facial expressions, and body movements that make up sign language communication. The primary goal of SLR is to facilitate communication between hearing- and speech-impaired communities and those who do not understand sign language. Due to the increased awareness and greater recognition of the rights and needs of the hearing- and speech-impaired community, sign language recognition has gained significant importance over the past 10 years. Technological advancements in the fields of Artificial Intelligence and Machine Learning have made it more practical and feasible to create accurate SLR systems. This paper presents a distinct approach to SLR by framing it as a video classification problem using Deep Learning (DL), whereby a combination of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) has been used. This research targets the integration of sign language recognition into healthcare settings, aiming to improve communication between medical professionals and patients with hearing impairments. The spatial features from each video frame are extracted using a CNN, which captures essential elements such as hand shapes, movements, and facial expressions. These features are then fed into an RNN that learns the temporal dependencies and patterns inherent in sign language sequences. The INCLUDE dataset has been enhanced with more videos from the healthcare domain, and the model is evaluated on this enhanced dataset. Our model achieves 91% accuracy, representing state-of-the-art performance in this domain. The results highlight the effectiveness of treating SLR as a video classification task with the CNN-RNN architecture.
This approach not only improves recognition accuracy but also offers a scalable solution for real-time SLR applications, significantly advancing the field of accessible communication technologies.
Keywords: sign language recognition, deep learning, convolution neural network, recurrent neural network
Procedia PDF Downloads 31
2951 Speech Acts and Politeness Strategies in an EFL Classroom in Georgia
Authors: Tinatin Kurdghelashvili
Abstract:
The paper deals with the usage of speech acts and politeness strategies in an EFL classroom in Georgia (Republic of). It explores the students' and the teachers' practice of the politeness strategies and the speech acts of apology, thanking, request, compliment/encouragement, command, agreeing/disagreeing, addressing and code switching. The research method includes observation as well as a questionnaire. The target group involves students from Georgian public schools and two certified, experienced local English teachers. The analysis is based on Searle's Speech Act Theory and Brown and Levinson's politeness strategies. The findings show that the students have some knowledge of politeness, yet they fail to apply it in English communication. In addition, most of the speech acts in the classroom interaction are used by the teachers and not the students. It is therefore suggested that teachers should cultivate the students' communicative competence and give them opportunities to practice more English speech acts than they do today.
Keywords: english as a foreign language, Georgia, politeness principles, speech acts
Procedia PDF Downloads 638
2950 Speech Detection Model Based on Deep Neural Networks Classifier for Speech Emotions Recognition
Authors: A. Shoiynbek, K. Kozhakhmet, P. Menezes, D. Kuanyshbay, D. Bayazitov
Abstract:
Speech emotion recognition has received increasing research interest all through current years. There was used emotional speech that was collected under controlled conditions in most research work. Actors imitating and artificially producing emotions in front of a microphone noted those records. There are four issues related to that approach, namely, (1) emotions are not natural, and it means that machines are learning to recognize fake emotions. (2) Emotions are very limited by quantity and poor in their variety of speaking. (3) There is language dependency on SER. (4) Consequently, each time when researchers want to start work with SER, they need to find a good emotional database on their language. In this paper, we propose the approach to create an automatic tool for speech emotion extraction based on facial emotion recognition and describe the sequence of actions of the proposed approach. One of the first objectives of the sequence of actions is a speech detection issue. The paper gives a detailed description of the speech detection model based on a fully connected deep neural network for Kazakh and Russian languages. Despite the high results in speech detection for Kazakh and Russian, the described process is suitable for any language. To illustrate the working capacity of the developed model, we have performed an analysis of speech detection and extraction from real tasks.Keywords: deep neural networks, speech detection, speech emotion recognition, Mel-frequency cepstrum coefficients, collecting speech emotion corpus, collecting speech emotion dataset, Kazakh speech dataset
Procedia PDF Downloads 102
2949 The Influence of Advertising Captions on the Internet through the Consumer Purchasing Decision
Authors: Suwimol Apapol, Punrapha Praditpong
Abstract:
The objectives of the study were to determine the frequency of figures of speech in fragrance advertising captions and the types of figures of speech most commonly used in them. The relation between figures of speech and fragrance was also examined in order to analyze how figures of speech were used to represent fragrance. Thirty-five fragrance advertisements were randomly selected from the Internet. Content analysis was applied to examine the relation between figures of speech and fragrance. The results showed that figures of speech were found in every fragrance advertisement except one advertisement of several Goods service. Thirty-four fragrance advertising captions used at least one kind of figure of speech. Metaphor was the most frequently used figure of speech in fragrance advertising captions, followed by alliteration, rhyme, simile and personification, and hyperbole, respectively, which is consistent with the research hypotheses.
Keywords: advertising captions, captions on internet, consumer purchasing decision, e-commerce
Procedia PDF Downloads 271
2948 The Use of Hearing Protection Devices and Hearing Loss in Steel Industry Workers in Samut Prakan Province, Thailand
Authors: Petcharat Kerdonfag, Surasak Taneepanichskul, Winai Wadwongtham
Abstract:
Background: Although there is no effective treatment for noise-induced hearing loss (NIHL), it is preventable by promoting the use of hearing protection devices (HPDs) among workers who are exposed to excessive noise over long periods. Objectives: The objectives of this study were to explore the use of HPDs among steel industry workers in the high-noise zone in Samut Prakan province, Thailand, and to examine the relationship between HPD use and hearing loss. Materials and Methods: In this cross-sectional study, ninety-three eligible participants were recruited by simple random sampling from the designated high-noise zone (> 85 dBA) of two factories. HPD use was recorded on a self-report form, then examined and confirmed by the research team. Hearing loss was assessed by audiometric screening at the regional Samut Prakan hospital. Participants were classified as having hearing loss if the average threshold level at high frequencies (4 and 6 kHz) exceeded 25 dBA in each ear. Data were collected from October to December 2016. All participants were examined by the same examiners to ensure validity. Audiometric testing was performed at least 14 hours after the participants’ last exposure to high noise levels in the workplace. Results: Sixty participants (64.5%) had a secondary level of education. The mean percent time of HPD use was 60.5% (SD = 25.34). Sixty-seven participants (72.0%) had abnormal hearing; their mean percent time of HPD use (Mean = 37.01, SD = 23.81) was lower than that of participants with normal hearing (Mean = 45.77, SD = 28.44). However, the difference in mean percent time of HPD use between the two groups was not significant. Conclusion: The findings of this study confirm that steel industry workers still need to be motivated to use HPDs regularly. Future research should pay more attention to creating meaningful innovations for steel industry workers.
Keywords: hearing protection devices, noise induced hearing loss, audiometric testing, steel industry
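The screening criterion described in the abstract above (an average threshold at 4 and 6 kHz above 25 dB) can be sketched as a small per-ear check. This is only an illustration under stated assumptions: the function name, the dictionary layout and the example values are hypothetical, and the abstract does not say how the two ears were combined into a single classification, so the sketch evaluates one ear at a time.

```python
from typing import Dict, Sequence


def ear_has_high_frequency_loss(
    thresholds_db: Dict[int, float],
    cutoff_db: float = 25.0,
    freqs_khz: Sequence[int] = (4, 6),
) -> bool:
    """Return True if the mean threshold over freqs_khz exceeds cutoff_db.

    thresholds_db maps a test frequency in kHz to the measured hearing
    threshold for one ear. The layout is an assumption made for this sketch.
    """
    mean_threshold = sum(thresholds_db[f] for f in freqs_khz) / len(freqs_khz)
    return mean_threshold > cutoff_db


# Hypothetical audiogram values for one ear, not taken from the study.
right_ear = {4: 30.0, 6: 25.0, 8: 20.0}
print(ear_has_high_frequency_loss(right_ear))
```

The comparison is strict, so an ear whose average sits exactly at the cutoff is not flagged, matching the word "exceeds" in the abstract.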
Procedia PDF Downloads 256
2947 Pediatric Hearing Aid Use: A Study Based on Data Logging Information
Authors: Mina Salamatmanesh, Elizabeth Fitzpatrick, Tim Ramsay, Josee Lagacé, Lindsey Sikora, JoAnne Whittingham
Abstract:
Introduction: Hearing loss (HL) is one of the most common disorders that presents at birth and in early childhood. Universal newborn hearing screening (UNHS) has been adopted based on the assumption that, with early identification of HL, children will have access to optimal amplification and intervention at younger ages, thereby taking advantage of the brain’s maximal plasticity. One particular challenge for parents in the early years is achieving consistent hearing aid (HA) use, which is critical to the child’s development and constitutes the first step in the rehabilitation process. This study examined the consistency of hearing aid use in young children based on data logging information documented during audiology sessions in the first three years after hearing aid fitting. Methodology: The first 100 children who were diagnosed with bilateral HL before 72 months of age between 2003 and 2015 in a pediatric audiology clinic and who had at least two hearing aid follow-up sessions with available data logging information were included in the study. Data from each audiology session (age of the child at the session, average hours of use per day for each ear in the first three years after HA fitting) were collected. Clinical characteristics (degree of hearing loss, age at HA fitting) were also documented to further the understanding of factors that impact HA use. Results: Preliminary analysis of the results of the first 20 children shows that all of them (100%) have at least one data logging session recorded in the clinical audiology system (Noah). Of the 20 children, 17 (85%) have three data logging events recorded in the first three years after HA fitting. Based on the statistical analysis of the first 20 cases, the median hours of use in the first follow-up session after the hearing aid fitting is 3.9 hours for the right ear, with an interquartile range (IQR) of 10.2 h, and 4.4 hours for the left ear, with an IQR of 9.7 h. In the first session, 47% of the children use their hearing aids ≤5 hours, 12% use them between 5 and 10 hours, and 22% use them ≥10 hours a day. However, these children showed increased use by the third follow-up session, with a median of 9.1 hours for the right ear (IQR 2.5 h) and 8.2 hours for the left ear (IQR 5.6 h). By the third follow-up session, 14% of children used hearing aids ≤5 hours, while 38% of children used them ≥10 hours. Based on the primary results, factors like age and level of HL significantly impact the hours of use. Conclusion: The use of data logging information to assess the actual hours of HA use provides an opportunity to examine (a) the challenges of families of young children with HAs and (b) the factors that impact use in very young children. Data logging, when used collaboratively with parents, can be a powerful tool to identify problems and to encourage and assist families in maximizing their child’s hearing potential.
Keywords: hearing loss, hearing aid, data logging, hours of use
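The median and interquartile range reported in the abstract above can be computed from logged daily-use hours as follows. This is a minimal sketch using Python's standard library; the sample values are hypothetical and are not the study's data, and the abstract does not state which quartile convention the authors used, so the "inclusive" method here is an assumption.

```python
import statistics
from typing import Sequence, Tuple


def median_and_iqr(hours: Sequence[float]) -> Tuple[float, float]:
    """Return (median, interquartile range) of daily hearing-aid use.

    Quartiles use the 'inclusive' convention of statistics.quantiles,
    which treats the sample as spanning the full range of the data.
    """
    q1, q2, q3 = statistics.quantiles(hours, n=4, method="inclusive")
    return q2, q3 - q1


# Hypothetical data-logging values (hours per day) for one ear.
logged_hours = [0.5, 2.0, 3.9, 6.5, 11.0]
median, iqr = median_and_iqr(logged_hours)
print(f"median = {median:.1f} h, IQR = {iqr:.1f} h")
```

Reporting the median with the IQR rather than a mean with a standard deviation suits use data like these, where a few children with very low or very high wear time would otherwise dominate the summary.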
Procedia PDF Downloads 230
2946 What Children Do and Do Not Like about Taking Part in Sport: Using Focus Groups to Investigate Thoughts and Feelings of Children with Hearing Loss
Authors: S. Somerset, D. J. Hoare, P. Leighton
Abstract:
Limited participation in physical activity and sport has been linked to poorer mental and physical health in children. Studies have shown that children who participate in sports benefit from improved social skills, self-confidence, communication skills and a better quality of life. Children who participate in sport are also more likely to continue their participation into their adult life. Deaf or hard of hearing children should have the same opportunities to participate in sport and receive the benefits as their hearing peers. Anecdotal evidence suggests this isn’t always the case. This is concerning given there are 45,000 children in the UK with permanent hearing loss. The aim of this study was to understand what encourages or discourages deaf or hard of hearing children to take part in sports. Ethical approval for the study was obtained from the University of Nottingham School of Medicine ethics committee. We conducted eight focus groups with deaf or hard of hearing children aged 10 to 15 years. A total of 45 children (19 male, 26 female) recruited from local schools and sports clubs took part. Information was gathered on the children’s thoughts and feelings about participation in sport. This included whether they played sports and who with, whether they did or did not like sport, and why they got involved in sport. Focus groups were audio recorded and transcribed. Transcripts were analysed using thematic analysis. Several key themes were identified as being associated with levels of sports participation. These included friendships, family and communication. Deaf or hard of hearing children with active siblings had participated in more sports. Communication was a common theme throughout regardless of the type of hearing-assistive technology a child used. Children found communication easier during sport if they were allowed to use their technology and had particular difficulty during sports such as swimming. 
Children expressed a desire not to have to identify themselves at a club as having a hearing loss. This affected their confidence when participating in sport. Not surprisingly, children who are deaf or hard of hearing are more likely to participate in sport if they have a good support network of parents, coaches and friends. The key barriers to participation for these children are communication, lack of visual information, lack of opportunity and a lack of awareness. By addressing these issues, more deaf and hard of hearing children will take part in sport and will continue their participation.
Keywords: barrier, children, deaf, participation, hard of hearing, sport
Procedia PDF Downloads 425
2945 Hearing Threshold Levels among Steel Industry Workers in Samut Prakan Province, Thailand
Authors: Petcharat Kerdonfag, Surasak Taneepanichskul, Winai Wadwongtham
Abstract:
Industrial noise is usually considered a major environmental health and safety concern because exposure to it can cause serious, permanent hearing damage. Despite strict hearing protection standards and extensive campaigns to raise health awareness among industrial workers in Thailand, noise-induced hearing loss remains a major obstacle to workers’ health. The aims of the study were to explore and specify the hearing threshold levels of steel industry workers in high-noise work zones and to examine the relationships between hearing loss and workers’ age and length of employment in Samut Prakan province, Thailand. A cross-sectional study design was used. Ninety-three steel industry workers with more than 1 year of employment in the designated high-noise zone (> 85 dBA) of two factories, selected by simple random sampling and available to participate, were assessed by audiometric screening at the regional Samut Prakan hospital. Screening data were collected from October to December 2016 by an occupational medicine physician and a qualified occupational nurse. All participants were examined by the same examiners to ensure validity. Audiometric testing was performed at least 14 hours after the last noise exposure in the workplace. Workers’ age and length of employment were recorded on an occupational record form developed for the study. Results: Workers’ ages ranged from 23 to 59 years (Mean = 41.67, SD = 9.69), and length of employment ranged from 1 to 39 years (Mean = 13.99, SD = 9.88). Fifty-three participants (60.0%) had been exposed to hazardous workplace noise for more than 10 years, twenty-three (24.7%) for 5 years or less, and seventeen (18.3%) for 5 to 10 years. Using a cut point of less than or equal to 25 dBA for hearing thresholds, the mean hearing thresholds at 4, 6, and 8 kHz were 31.34, 29.62, and 25.64 dB, respectively, for the right ear and 40.15, 32.20, and 25.48 dB, respectively, for the left ear. Hearing thresholds increased with workers’ age at 4, 6, and 8 kHz (p = .012, p = .026, and p = .024, respectively) for the right ear, and only at 4 kHz (p = .009) for the left ear. Conclusion: Participants’ age in the hazardous-noise work zone was significantly associated with hearing loss at different levels, whereas length of employment was not significantly associated with hearing loss. Hearing threshold levels among industrial workers should therefore be assessed regularly, and workers’ hearing should be protected from the start of employment.
Keywords: hearing threshold levels, hazard of noise, hearing loss, audiometric testing
Procedia PDF Downloads 228
2944 Prosodic Characteristics of Post Traumatic Stress Disorder Induced Speech Changes
Authors: Jarek Krajewski, Andre Wittenborn, Martin Sauerland
Abstract:
This abstract describes a promising approach for estimating post-traumatic stress disorder (PTSD) based on prosodic speech characteristics. It illustrates the validity of this method by briefly discussing results from an Arabic refugee sample (N = 47; 32 male, 15 female). A well-established standardized self-report scale, “Reaction of Adolescents to Traumatic Stress” (RATS), was used to determine the ground truth level of PTSD. The speech material was elicited by asking participants to talk about sadness-inducing autobiographical experiences (sampling rate 16 kHz, 8-bit resolution). In order to investigate PTSD-induced speech changes, a self-developed set of 136 prosodic speech features was extracted from the .wav files. This set was adapted to capture traumatization-related speech phenomena. An artificial neural network (ANN) machine learning model was applied to determine the PTSD level and reached a correlation of r = .37. These results indicate that our classifiers can achieve similar results to those seen in speech-based stress research.
Keywords: speech prosody, PTSD, machine learning, feature extraction
Procedia PDF Downloads 91
2943 An Algorithm Based on the Nonlinear Filter Generator for Speech Encryption
Authors: A. Belmeguenai, K. Mansouri, R. Djemili
Abstract:
This work presents a new algorithm based on the nonlinear filter generator for speech encryption and decryption. The proposed algorithm uses a linear feedback shift register (LFSR) whose polynomial is primitive, together with a nonlinear Boolean function. The purpose of this system is to construct a keystream that has good statistical properties but is also easily computable on a machine with limited computing capacity. The proposed speech encryption scheme is very simple, highly efficient, and fast at performing speech encryption and decryption. We conclude the paper by showing that this system can resist certain known attacks.
Keywords: nonlinear filter generator, stream ciphers, speech encryption, security analysis
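The general shape of a nonlinear filter generator, as named in the abstract above, can be sketched in a few lines. This is a toy illustration and not the authors' algorithm: the 4-bit register, the feedback polynomial x^4 + x + 1 (which is primitive over GF(2)), the filter function and all names below are assumptions chosen to keep the example small, and a register this short offers no security.

```python
from typing import Iterator, List


def lfsr_states(seed: List[int]) -> Iterator[List[int]]:
    """Yield successive states of a 4-bit LFSR for x^4 + x + 1.

    The state holds bits [s_n, s_n+1, s_n+2, s_n+3]; the recurrence is
    s_n+4 = s_n+1 XOR s_n. A nonzero seed gives the maximal period 15.
    """
    state = list(seed)
    if len(state) != 4 or not any(state):
        raise ValueError("seed must be 4 bits and not all zero")
    while True:
        yield list(state)
        feedback = state[0] ^ state[1]
        state = state[1:] + [feedback]


def filter_bit(state: List[int]) -> int:
    """Toy nonlinear Boolean filter: the AND term makes it nonlinear."""
    x0, x1, x2, x3 = state
    return x0 ^ x2 ^ (x1 & x3)


def keystream(seed: List[int], n_bits: int) -> List[int]:
    """Produce n_bits keystream bits by filtering successive LFSR states."""
    states = lfsr_states(seed)
    return [filter_bit(next(states)) for _ in range(n_bits)]


def xor_bytes(data: bytes, seed: List[int]) -> bytes:
    """Encrypt or decrypt data by XOR with the keystream (same operation)."""
    bits = keystream(seed, 8 * len(data))
    out = bytearray()
    for i, byte in enumerate(data):
        key_byte = 0
        for bit in bits[8 * i : 8 * i + 8]:
            key_byte = (key_byte << 1) | bit
        out.append(byte ^ key_byte)
    return bytes(out)


# Hypothetical 8-bit speech samples and key (the LFSR seed).
samples = bytes([0, 17, 128, 255])
key = [1, 0, 0, 0]
cipher = xor_bytes(samples, key)
print(xor_bytes(cipher, key) == samples)
```

Because encryption and decryption are the same XOR with a cheaply generated keystream, the construction suits devices with limited computing capacity, which is the property the abstract emphasizes.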
Procedia PDF Downloads 297
2942 Review of Speech Recognition Research on Low-Resource Languages
Authors: XuKe Cao
Abstract:
This paper reviews the current state of research on low-resource languages in the field of speech recognition, focusing on the challenges faced by low-resource language speech recognition, including the scarcity of data resources, the lack of linguistic resources, and the diversity of dialects and accents. The article reviews recent progress in low-resource language speech recognition, including techniques such as data augmentation, end-to-end models, transfer learning, and multi-task learning. Based on the challenges currently faced, the paper also provides an outlook on future research directions. Through these studies, it is expected that the performance of speech recognition for low-resource languages can be improved, promoting the widespread application and adoption of related technologies.
Keywords: low-resource languages, speech recognition, data augmentation techniques, NLP
Procedia PDF Downloads 18
2941 Speech Detection Model Based on Deep Neural Networks Classifier for Speech Emotions Recognition
Authors: Aisultan Shoiynbek, Darkhan Kuanyshbay, Paulo Menezes, Akbayan Bekarystankyzy, Assylbek Mukhametzhanov, Temirlan Shoiynbek
Abstract:
Speech emotion recognition (SER) has received increasing research interest in recent years. It is a common practice to utilize emotional speech collected under controlled conditions recorded by actors imitating and artificially producing emotions in front of a microphone. There are four issues related to that approach: emotions are not natural, meaning that machines are learning to recognize fake emotions; emotions are very limited in quantity and poor in variety of speaking; there is some language dependency in SER; consequently, each time researchers want to start work with SER, they need to find a good emotional database in their language. This paper proposes an approach to create an automatic tool for speech emotion extraction based on facial emotion recognition and describes the sequence of actions involved in the proposed approach. One of the first objectives in the sequence of actions is the speech detection issue. The paper provides a detailed description of the speech detection model based on a fully connected deep neural network for Kazakh and Russian. Despite the high results in speech detection for Kazakh and Russian, the described process is suitable for any language. To investigate the working capacity of the developed model, an analysis of speech detection and extraction from real tasks has been performed.
Keywords: deep neural networks, speech detection, speech emotion recognition, Mel-frequency cepstrum coefficients, collecting speech emotion corpus, collecting speech emotion dataset, Kazakh speech dataset
Procedia PDF Downloads 27
2940 A Systematic Review of Quality of Life in Older Adults with Sensory Impairments
Authors: Ya-Chuan Tseng, Hsin-Yi Liu, Meei-Fang Lou, Guey-Shiun Huang
Abstract:
Purpose: Sensory impairments are common in older adults. Hearing and visual impairments adversely affect their physical and mental health and quality of life (QOL). However, systematic reviews of the relationship between hearing impairment, visual impairment, dual sensory impairment and quality of life are scarce. The purpose of this systematic review was to determine the relationship between hearing impairment, visual impairment, dual sensory impairment and quality of life. Methods: Searches of EMBASE, PubMed, CINAHL, MEDLINE, Cochrane Library and Airiti Library were conducted between January 2006 and December 2017 using the keywords ‘quality of life,’ ‘life satisfaction,’ ‘well-being,’ ‘hearing impairment’ and ‘visual impairment.’ Two authors independently assessed methodologic quality using a modified Downs and Black tool. Data were extracted by the first author and then cross-checked by the second author. Results: Twenty-three studies consisting mostly of community-dwelling older adults were included in our review. Sensory impairment was found to be significantly associated with quality of life, with greater severity of hearing impairment or visual impairment associated with a lower quality of life. Quality of life for dual sensory impairment was worse than for hearing impairment or visual impairment individually. Conclusions: A significant association was confirmed between hearing impairment, visual impairment, dual sensory impairment and quality of life. Our review can be used to enhance health care personnel’s understanding of sensory impairment in older adults and enable them to actively assess older adults’ sensory functions so that they can help alleviate the negative impact of sensory impairments on QOL in older adults.
Keywords: nursing, older adults, quality of life, systematic review, hearing impairment, visual impairment
Procedia PDF Downloads 242
2939 Mutations in the GJB2 Gene Are the Cause of an Important Number of Non-Syndromic Deafness Cases
Authors: Habib Onsori, Somayeh Akrami, Mohammad Rahmati
Abstract:
Deafness is the most common sensory disorder, with a frequency of 1/1000 in many populations. Mutations in the GJB2 (CX26) gene at the DFNB1 locus on chromosome 13q12 are associated with congenital hearing loss. Approximately 80% of congenital hearing loss cases are recessively inherited and 15% are dominantly inherited. Mutations of the GJB2 gene, encoding the gap junction protein Connexin 26 (Cx26), are the most common cause of hereditary congenital hearing loss in many countries. This report presents two cases with different mutations in Iranian patients with bilateral hearing loss. DNA studies of the GJB2 gene were performed by PCR and sequencing. In one patient, direct sequencing of the gene showed a heterozygous T→C transition at nucleotide 604, resulting in a cysteine to arginine amino acid substitution at codon 202 (C202R) in the fourth transmembrane domain (TM4) of the protein. The analyses indicate that the C202R mutation appeared de novo in the proband with a possible dominant effect (GenBank: KF 638275). In the other patient, DNA sequencing revealed a compound heterozygous mutation (35delG, 363delC) in the Cx26 gene that is strongly associated with congenital non-syndromic hearing loss (NSHL). Screening for these mutations is therefore recommended for individuals with hearing loss who are referred to genetic counseling centers before marriage and/or pregnancy.
Keywords: CX26, deafness, GJB2, mutation
Procedia PDF Downloads 490
2938 Modern Machine Learning Conniptions for Automatic Speech Recognition
Authors: S. Jagadeesh Kumar
Abstract:
This expose presents a luculent of recent machine learning practices as employed in the modern and as pertinent to prospective automatic speech recognition schemes. The aspiration is to promote additional traverse ablution among the machine learning and automatic speech recognition factions that have transpired in the precedent. The manuscript is structured according to the chief machine learning archetypes that are furthermore trendy by now or have latency for building momentous hand-outs to automatic speech recognition expertise. The standards offered and convoluted in this article embraces adaptive and multi-task learning, active learning, Bayesian learning, discriminative learning, generative learning, supervised and unsupervised learning. These learning archetypes are aggravated and conferred in the perspective of automatic speech recognition tools and functions. This manuscript bequeaths and surveys topical advances of deep learning and learning with sparse depictions; further limelight is on their incessant significance in the evolution of automatic speech recognition.
Keywords: automatic speech recognition, deep learning methods, machine learning archetypes, Bayesian learning, supervised and unsupervised learning
Procedia PDF Downloads 448