Search results for: speech articulation
842 Eisenhower’s Farewell Speech: Initial and Continuing Communication Effects
Authors: B. Kuiper
Abstract:
When Dwight D. Eisenhower delivered his final Presidential speech in 1961, he was using the opportunity to bid farewell to America, but he was also trying to warn his fellow countrymen about deeper challenges threatening the country. In this analysis, Eisenhower’s speech is examined in light of the impact it had on American culture, communication concepts, and political ramifications. The paper initially highlights the previous literature on the speech, especially in light of its 50th anniversary, and reveals a man whose main concern was how the speech’s words would affect his beloved country. The painstaking approach to the wording of the speech to reveal the intent is key, particularly in light of analyzing the motivations according to “virtuous communication.” This philosophical construct indicates that Eisenhower’s Farewell Address was crafted carefully according to a departing President’s deepest values and concerns, concepts that he wanted to pass along to his successor, to his country, and even to the world.Keywords: Eisenhower, mass communication, political speech, rhetoric
Procedia PDF Downloads 274841 A Sparse Representation Speech Denoising Method Based on Adapted Stopping Residue Error
Authors: Qianhua He, Weili Zhou, Aiwu Chen
Abstract:
A sparse representation speech denoising method based on adapted stopping residue error was presented in this paper. Firstly, the cross-correlation between the clean speech spectrum and the noise spectrum was analyzed, and an estimation method was proposed. In the denoising method, an over-complete dictionary of the clean speech power spectrum was learned with the K-singular value decomposition (K-SVD) algorithm. In the sparse representation stage, the stopping residue error was adaptively achieved according to the estimated cross-correlation and the adjusted noise spectrum, and the orthogonal matching pursuit (OMP) approach was applied to reconstruct the clean speech spectrum from the noisy speech. Finally, the clean speech was re-synthesised via the inverse Fourier transform with the reconstructed speech spectrum and the noisy speech phase. The experiment results show that the proposed method outperforms the conventional methods in terms of subjective and objective measure.Keywords: speech denoising, sparse representation, k-singular value decomposition, orthogonal matching pursuit
Procedia PDF Downloads 499840 Speech Acts and Politeness Strategies in an EFL Classroom in Georgia
Authors: Tinatin Kurdghelashvili
Abstract:
The paper deals with the usage of speech acts and politeness strategies in an EFL classroom in Georgia (Rep of). It explores the students’ and the teachers’ practice of the politeness strategies and the speech acts of apology, thanking, request, compliment/encouragement, command, agreeing/disagreeing, addressing and code switching. The research method includes observation as well as a questionnaire. The target group involves the students from Georgian public schools and two certified, experienced local English teachers. The analysis is based on Searle’s Speech Act Theory and Brown and Levinson’s politeness strategies. The findings show that the students have certain knowledge regarding politeness yet they fail to apply them in English communication. In addition, most of the speech acts from the classroom interaction are used by the teachers and not the students. Thereby, it is suggested that teachers should cultivate the students’ communicative competence and attempt to give them opportunities to practice more English speech acts than they do today.Keywords: english as a foreign language, Georgia, politeness principles, speech acts
Procedia PDF Downloads 636839 Speech Detection Model Based on Deep Neural Networks Classifier for Speech Emotions Recognition
Authors: A. Shoiynbek, K. Kozhakhmet, P. Menezes, D. Kuanyshbay, D. Bayazitov
Abstract:
Speech emotion recognition has received increasing research interest all through current years. There was used emotional speech that was collected under controlled conditions in most research work. Actors imitating and artificially producing emotions in front of a microphone noted those records. There are four issues related to that approach, namely, (1) emotions are not natural, and it means that machines are learning to recognize fake emotions. (2) Emotions are very limited by quantity and poor in their variety of speaking. (3) There is language dependency on SER. (4) Consequently, each time when researchers want to start work with SER, they need to find a good emotional database on their language. In this paper, we propose the approach to create an automatic tool for speech emotion extraction based on facial emotion recognition and describe the sequence of actions of the proposed approach. One of the first objectives of the sequence of actions is a speech detection issue. The paper gives a detailed description of the speech detection model based on a fully connected deep neural network for Kazakh and Russian languages. Despite the high results in speech detection for Kazakh and Russian, the described process is suitable for any language. To illustrate the working capacity of the developed model, we have performed an analysis of speech detection and extraction from real tasks.Keywords: deep neural networks, speech detection, speech emotion recognition, Mel-frequency cepstrum coefficients, collecting speech emotion corpus, collecting speech emotion dataset, Kazakh speech dataset
Procedia PDF Downloads 101838 The Influence of Advertising Captions on the Internet through the Consumer Purchasing Decision
Authors: Suwimol Apapol, Punrapha Praditpong
Abstract:
The objectives of the study were to find out the frequencies of figures of speech in fragrance advertising captions as well as the types of figures of speech most commonly applied in captions. The relation between figures of speech and fragrance was also examined in order to analyze how figures of speech were used to represent fragrance. Thirty-five fragrance advertisements were randomly selected from the Internet. Content analysis was applied in order to consider the relation between figures of speech and fragrance. The results showed that figures of speech were found in almost every fragrance advertisement except one advertisement of several Goods service. Thirty-four fragrance advertising captions used at least one kind of figure of speech. Metaphor was most frequently found and also most frequently applied in fragrance advertising captions, followed by alliteration, rhyme, simile and personification, and hyperbole respectively which is in harmony with the research hypotheses as well.Keywords: advertising captions, captions on internet, consumer purchasing decision, e-commerce
Procedia PDF Downloads 270837 Representation of Phonemic Changes in Arabic Dialect of Yemen: Speech Disorder and Consonant Substitution
Authors: Sadeq Al Yaari, Muhammad Alkhunayn, Adham Al Yaari, Montaha Al Yaari, Ayman Al Yaari, Aayah Al Yaari, Sajedah Al Yaari, Fatehi Eissa
Abstract:
Introduction: Like many dialects, the Arabic dialect of Yemen (ADY) exhibited utterance phonemic distinction- vowel deletion, lengthening, and insertion- that were investigated using speakers from different dialectal backgrounds, with particular focus on the difference typically developing and achieving speakers and those suffering linguistic problems make. Phonological variations were found to be inevitable, suggesting further investigation of consonants to see to what extent they are prone to such phonemic changes. This study investigates the patterns of consonant substitution in ADY by examining if there is a clear-cut line between normal and pathological consonants to decide which of these consonants is substituted more. Methods: A total of hundred and twenty nine Yemeni male participants (age= 6-13) were enrolled in this study. Participants were preassigned into two groups (Articulation disorders (AD) group= 42 and typically developing and achieving group (TD) = 70), each of which consists of five sub-groups in decided sociolinguistic classification. In a 45 minute-session, 180 pictures of commonly used verbs (4 pics/m.) were presented to participants who were asked to impulsively describe these verbs before their production was psychoneurolinguistically and statistically analyzed. Results: There was a pattern of consonant substitution in some dialects that participants from both groups have in common: Voiceless consonants (/t/, /ṣ/,/s/, /ḥ, /k/, /ʃ/, /f//, and /k/) in northern and eastern dialects; voiced consonants (/q/, /gh/, /Ʒ/, /g/,/ḍ/, /b/, and /d/) in southern, eastern, western and central dialects; and voiceless and voiced consonants(/t/, /f/, /Ø/, /ṣ/, /s/, /q/, /gh/, /Ʒ/, /g/,/ḍ/, and /b/) in southern dialect. Voiceless consonants (/t/, /ṣ/,/s/, /ḥ, /k/, /ʃ/, /f//, /Ø/and /k/) found to be substituted more by ADY speakers of both AD and TD groups followed by voiced consonants (/q/, /gh/, /Ʒ/, /g/,/ḍ/,/d/ /b/, and /ð/), nasals (/m/, /n/), mute (/h/), semi-vowels (/w/ and /j/) and laterals (/l/ and /r/). Unexpectedly, a short vowel (/æ/) and two long vowels (/u: and /a:/) were found to substitute consonants in ADY both by AD and TD participants. Conclusions: AD and TD participants of ADY substitute consonants in their dialectal speech. Consonant substitution processes cover not only consonants but extend to include monophthongs. The finding that speakers of ADY substitute consonants in multisyllabic words is probably due to the fact that the sociolinguistic factor plays a pivotal role in the problematic substitution of consonants in ADY speakers. Larger longitudinal studies are necessary to further investigate the effect of sociolinguistic background on phonological variations, notably sound change in the speech of Yemeni TD speakers compared to those with linguistic impairments.Keywords: consonant substitution, Arabic dialect of Yemen, phonetics, phonology, syllables, articulation disorders
Procedia PDF Downloads 43836 A Novel RLS Based Adaptive Filtering Method for Speech Enhancement
Authors: Pogula Rakesh, T. Kishore Kumar
Abstract:
Speech enhancement is a long standing problem with numerous applications like teleconferencing, VoIP, hearing aids, and speech recognition. The motivation behind this research work is to obtain a clean speech signal of higher quality by applying the optimal noise cancellation technique. Real-time adaptive filtering algorithms seem to be the best candidate among all categories of the speech enhancement methods. In this paper, we propose a speech enhancement method based on Recursive Least Squares (RLS) adaptive filter of speech signals. Experiments were performed on noisy data which was prepared by adding AWGN, Babble and Pink noise to clean speech samples at -5dB, 0dB, 5dB, and 10dB SNR levels. We then compare the noise cancellation performance of proposed RLS algorithm with existing NLMS algorithm in terms of Mean Squared Error (MSE), Signal to Noise ratio (SNR), and SNR loss. Based on the performance evaluation, the proposed RLS algorithm was found to be a better optimal noise cancellation technique for speech signals.Keywords: adaptive filter, adaptive noise canceller, mean squared error, noise reduction, NLMS, RLS, SNR, SNR loss
Procedia PDF Downloads 481835 Prosodic Characteristics of Post Traumatic Stress Disorder Induced Speech Changes
Authors: Jarek Krajewski, Andre Wittenborn, Martin Sauerland
Abstract:
This abstract describes a promising approach for estimating post-traumatic stress disorder (PTSD) based on prosodic speech characteristics. It illustrates the validity of this method by briefly discussing results from an Arabic refugee sample (N= 47, 32 m, 15 f). A well-established standardized self-report scale “Reaction of Adolescents to Traumatic Stress” (RATS) was used to determine the ground truth level of PTSD. The speech material was prompted by telling about autobiographical related sadness inducing experiences (sampling rate 16 kHz, 8 bit resolution). In order to investigate PTSD-induced speech changes, a self-developed set of 136 prosodic speech features was extracted from the .wav files. This set was adapted to capture traumatization related speech phenomena. An artificial neural network (ANN) machine learning model was applied to determine the PTSD level and reached a correlation of r = .37. These results indicate that our classifiers can achieve similar results to those seen in speech-based stress research.Keywords: speech prosody, PTSD, machine learning, feature extraction
Procedia PDF Downloads 90834 An Algorithm Based on the Nonlinear Filter Generator for Speech Encryption
Authors: A. Belmeguenai, K. Mansouri, R. Djemili
Abstract:
This work present a new algorithm based on the nonlinear filter generator for speech encryption and decryption. The proposed algorithm consists on the use a linear feedback shift register (LFSR) whose polynomial is primitive and nonlinear Boolean function. The purpose of this system is to construct Keystream with good statistical properties, but also easily computable on a machine with limited capacity calculated. This proposed speech encryption scheme is very simple, highly efficient, and fast to implement the speech encryption and decryption. We conclude the paper by showing that this system can resist certain known attacks.Keywords: nonlinear filter generator, stream ciphers, speech encryption, security analysis
Procedia PDF Downloads 296833 The Effect of The Speaker's Speaking Style as A Factor of Understanding and Comfort of The Listener
Authors: Made Rahayu Putri Saron, Mochamad Nizar Palefi Ma’ady
Abstract:
Communication skills are important in everyday life, communication can be done verbally in the form of oral or written and nonverbal in the form of expressions or body movements. Good communication should be able to provide information clearly, and there is feedback from the speaker and listener. However, it is often found that the information conveyed is not clear, and there is no feedback from the listeners, so it cannot be ensured that the communication is effective and understandable. The speaker's understanding of the topic is one of the supporting factors for the listener to be able to accept the meaning of the conversation. However, based on the results of the literature review, it found that the influence factors of person speaking style are as follows: (i) environmental conditions; (ii) voice, articulation, and accent; (iii) gender; (iv) personality; (v) speech disorders (Dysarthria); when speaking also have an important influence on speaker’s speaking style. It can be concluded the factors that support understanding and comfort of the listener are dependent on the nature of the speaker (environmental conditions, voice, gender, personality) or also it the speaker have speech disorders.Keywords: listener, public speaking, speaking style, understanding, and comfortable factor
Procedia PDF Downloads 166832 Review of Speech Recognition Research on Low-Resource Languages
Authors: XuKe Cao
Abstract:
This paper reviews the current state of research on low-resource languages in the field of speech recognition, focusing on the challenges faced by low-resource language speech recognition, including the scarcity of data resources, the lack of linguistic resources, and the diversity of dialects and accents. The article reviews recent progress in low-resource language speech recognition, including techniques such as data augmentation, end to-end models, transfer learning, and multi-task learning. Based on the challenges currently faced, the paper also provides an outlook on future research directions. Through these studies, it is expected that the performance of speech recognition for low resource languages can be improved, promoting the widespread application and adoption of related technologies.Keywords: low-resource languages, speech recognition, data augmentation techniques, NLP
Procedia PDF Downloads 12831 Speech Detection Model Based on Deep Neural Networks Classifier for Speech Emotions Recognition
Authors: Aisultan Shoiynbek, Darkhan Kuanyshbay, Paulo Menezes, Akbayan Bekarystankyzy, Assylbek Mukhametzhanov, Temirlan Shoiynbek
Abstract:
Speech emotion recognition (SER) has received increasing research interest in recent years. It is a common practice to utilize emotional speech collected under controlled conditions recorded by actors imitating and artificially producing emotions in front of a microphone. There are four issues related to that approach: emotions are not natural, meaning that machines are learning to recognize fake emotions; emotions are very limited in quantity and poor in variety of speaking; there is some language dependency in SER; consequently, each time researchers want to start work with SER, they need to find a good emotional database in their language. This paper proposes an approach to create an automatic tool for speech emotion extraction based on facial emotion recognition and describes the sequence of actions involved in the proposed approach. One of the first objectives in the sequence of actions is the speech detection issue. The paper provides a detailed description of the speech detection model based on a fully connected deep neural network for Kazakh and Russian. Despite the high results in speech detection for Kazakh and Russian, the described process is suitable for any language. To investigate the working capacity of the developed model, an analysis of speech detection and extraction from real tasks has been performed.Keywords: deep neural networks, speech detection, speech emotion recognition, Mel-frequency cepstrum coefficients, collecting speech emotion corpus, collecting speech emotion dataset, Kazakh speech dataset
Procedia PDF Downloads 26830 Modern Machine Learning Conniptions for Automatic Speech Recognition
Authors: S. Jagadeesh Kumar
Abstract:
This expose presents a luculent of recent machine learning practices as employed in the modern and as pertinent to prospective automatic speech recognition schemes. The aspiration is to promote additional traverse ablution among the machine learning and automatic speech recognition factions that have transpired in the precedent. The manuscript is structured according to the chief machine learning archetypes that are furthermore trendy by now or have latency for building momentous hand-outs to automatic speech recognition expertise. The standards offered and convoluted in this article embraces adaptive and multi-task learning, active learning, Bayesian learning, discriminative learning, generative learning, supervised and unsupervised learning. These learning archetypes are aggravated and conferred in the perspective of automatic speech recognition tools and functions. This manuscript bequeaths and surveys topical advances of deep learning and learning with sparse depictions; further limelight is on their incessant significance in the evolution of automatic speech recognition.Keywords: automatic speech recognition, deep learning methods, machine learning archetypes, Bayesian learning, supervised and unsupervised learning
Procedia PDF Downloads 447829 Prosody Generation in Neutral Speech Storytelling Application Using Tilt Model
Authors: Manjare Chandraprabha A., S. D. Shirbahadurkar, Manjare Anil S., Paithne Ajay N.
Abstract:
This paper proposes Intonation Modeling for Prosody generation in Neutral speech for Marathi (language spoken in Maharashtra, India) story telling applications. Nowadays audio story telling devices are very eminent for children. In this paper, we proposed tilt model for stressed words in Marathi for speech modification. Tilt model predicts modification in tone of neutral speech. GMM is used to identify stressed words for modification.Keywords: tilt model, fundamental frequency, statistical parametric speech synthesis, GMM
Procedia PDF Downloads 392828 The Importance of Right Speech in Buddhism and Its Relevance Today
Authors: Gautam Sharda
Abstract:
The concept of right speech is the third stage of the noble eightfold path as prescribed by the Buddha and followed by millions of practicing Buddhists. The Buddha lays a lot of importance on the notion of right speech (Samma Vacca). In the Angutara Nikaya, the Buddha mentioned what constitutes right speech, which is basically four kinds of abstentions; namely abstaining from false speech, abstaining from slanderous speech, abstaining from harsh or hateful speech and abstaining from idle chatter. The Buddha gives reasons in support of his view as to why abstaining from these four kinds of speeches is favourable not only for maintaining the peace and equanimity within an individual but also within a society. It is a known fact that when we say something harsh or slanderous to others, it eventually affects our individual peace of mind too. We also know about the many examples of hate speeches which have led to senseless cases of violence and which are well documented within our country and the world. Also, indulging in false speech is not a healthy sign for individuals within a group as this kind of a social group which is based on falsities and lies cannot really survive for long and will eventually lead to chaos. Buddha also told us to refrain from idle chatter or gossip as generally we have seen that idle chatter or gossip does more harm than any good to the individual and the society. Hence, if most of us actually inculcate this third stage (namely, right speech) of the noble eightfold path of the Buddha in our daily life, it would be highly beneficial both for the individual and for the harmony of the society.Keywords: Buddhism, speech, individual, society
Procedia PDF Downloads 264827 Advances in Artificial intelligence Using Speech Recognition
Authors: Khaled M. Alhawiti
Abstract:
This research study aims to present a retrospective study about speech recognition systems and artificial intelligence. Speech recognition has become one of the widely used technologies, as it offers great opportunity to interact and communicate with automated machines. Precisely, it can be affirmed that speech recognition facilitates its users and helps them to perform their daily routine tasks, in a more convenient and effective manner. This research intends to present the illustration of recent technological advancements, which are associated with artificial intelligence. Recent researches have revealed the fact that speech recognition is found to be the utmost issue, which affects the decoding of speech. In order to overcome these issues, different statistical models were developed by the researchers. Some of the most prominent statistical models include acoustic model (AM), language model (LM), lexicon model, and hidden Markov models (HMM). The research will help in understanding all of these statistical models of speech recognition. Researchers have also formulated different decoding methods, which are being utilized for realistic decoding tasks and constrained artificial languages. These decoding methods include pattern recognition, acoustic phonetic, and artificial intelligence. It has been recognized that artificial intelligence is the most efficient and reliable methods, which are being used in speech recognition.Keywords: speech recognition, acoustic phonetic, artificial intelligence, hidden markov models (HMM), statistical models of speech recognition, human machine performance
Procedia PDF Downloads 477826 Speech Enhancement Using Wavelet Coefficients Masking with Local Binary Patterns
Authors: Christian Arcos, Marley Vellasco, Abraham Alcaim
Abstract:
In this paper, we present a wavelet coefficients masking based on Local Binary Patterns (WLBP) approach to enhance the temporal spectra of the wavelet coefficients for speech enhancement. This technique exploits the wavelet denoising scheme, which splits the degraded speech into pyramidal subband components and extracts frequency information without losing temporal information. Speech enhancement in each high-frequency subband is performed by binary labels through the local binary pattern masking that encodes the ratio between the original value of each coefficient and the values of the neighbour coefficients. This approach enhances the high-frequency spectra of the wavelet transform instead of eliminating them through a threshold. A comparative analysis is carried out with conventional speech enhancement algorithms, demonstrating that the proposed technique achieves significant improvements in terms of PESQ, an international recommendation of objective measure for estimating subjective speech quality. Informal listening tests also show that the proposed method in an acoustic context improves the quality of speech, avoiding the annoying musical noise present in other speech enhancement techniques. Experimental results obtained with a DNN based speech recognizer in noisy environments corroborate the superiority of the proposed scheme in the robust speech recognition scenario.Keywords: binary labels, local binary patterns, mask, wavelet coefficients, speech enhancement, speech recognition
Procedia PDF Downloads 229825 Application of the Bionic Wavelet Transform and Psycho-Acoustic Model for Speech Compression
Authors: Chafik Barnoussi, Mourad Talbi, Adnane Cherif
Abstract:
In this paper we propose a new speech compression system based on the application of the Bionic Wavelet Transform (BWT) combined with the psychoacoustic model. This compression system is a modified version of the compression system using a MDCT (Modified Discrete Cosine Transform) filter banks of 32 filters each and the psychoacoustic model. This modification consists in replacing the banks of the MDCT filter banks by the bionic wavelet coefficients which are obtained from the application of the BWT to the speech signal to be compressed. These two methods are evaluated and compared with each other by computing bits before and bits after compression. They are tested on different speech signals and the obtained simulation results show that the proposed technique outperforms the second technique and this in term of compressed file size. In term of SNR, PSNR and NRMSE, the outputs speech signals of the proposed compression system are with acceptable quality. In term of PESQ and speech signal intelligibility, the proposed speech compression technique permits to obtain reconstructed speech signals with good quality.Keywords: speech compression, bionic wavelet transform, filterbanks, psychoacoustic model
Procedia PDF Downloads 384824 Hate Speech Detection Using Deep Learning and Machine Learning Models
Authors: Nabil Shawkat, Jamil Saquer
Abstract:
Social media has accelerated our ability to engage with others and eliminated many communication barriers. On the other hand, the widespread use of social media resulted in an increase in online hate speech. This has drastic impacts on vulnerable individuals and societies. Therefore, it is critical to detect hate speech to prevent innocent users and vulnerable communities from becoming victims of hate speech. We investigate the performance of different deep learning and machine learning algorithms on three different datasets. Our results show that the BERT model gives the best performance among all the models by achieving an F1-score of 90.6% on one of the datasets and F1-scores of 89.7% and 88.2% on the other two datasets.Keywords: hate speech, machine learning, deep learning, abusive words, social media, text classification
Procedia PDF Downloads 136823 Speech Intelligibility Improvement Using Variable Level Decomposition DWT
Authors: Samba Raju, Chiluveru, Manoj Tripathy
Abstract:
Intelligibility is an essential characteristic of a speech signal, which is used to help in the understanding of information in speech signal. Background noise in the environment can deteriorate the intelligibility of a recorded speech. In this paper, we presented a simple variance subtracted - variable level discrete wavelet transform, which improve the intelligibility of speech. The proposed algorithm does not require an explicit estimation of noise, i.e., prior knowledge of the noise; hence, it is easy to implement, and it reduces the computational burden. The proposed algorithm decides a separate decomposition level for each frame based on signal dominant and dominant noise criteria. The performance of the proposed algorithm is evaluated with speech intelligibility measure (STOI), and results obtained are compared with Universal Discrete Wavelet Transform (DWT) thresholding and Minimum Mean Square Error (MMSE) methods. The experimental results revealed that the proposed scheme outperformed competing methodsKeywords: discrete wavelet transform, speech intelligibility, STOI, standard deviation
Procedia PDF Downloads 148822 The Language Use of Middle Eastern Freedom Activists' Speeches: A Gender Perspective
Authors: Sulistyaningtyas
Abstract:
Examining the role of Middle Eastern freedom activists’ speech based on gender perspective is considered noteworthy because the society in the Middle East is patriarchal. This research aims to examine the language use of the Middle Eastern freedom activists’ speeches through gender perspective. The data sources are from male and female Middle Eastern freedom activists’ speech videos. In analyzing the data, the theories employed are about Language Style from Gender Perspective and The Language for Speech. The result reveals that there are sets of spoken language differences between male and female speakers. In using the language for speech, both male and female speakers produce metaphor, euphemism, the ‘rule of three’, parallelism, and pronouns in random frequency of production, which cannot be separated by genders. Moreover, it cannot be concluded that one gender is more potential than the other to influence the audience in delivering speech. There are other factors, particularly non-verbal factors, existing to give impacts on how a speech can influence the audience.Keywords: gender perspective, language use, Middle Eastern freedom activists, speech
Procedia PDF Downloads 421821 Considering Cultural and Linguistic Variables When Working as a Speech-Language Pathologist with Multicultural Students
Authors: Gabriela Smeckova
Abstract:
The entire world is becoming more and more diverse. The reasons why people migrate are different and unique for each family /individual. Professionals delivering services (including speech-language pathologists) must be prepared to work with clients coming from different cultural and/or linguistic backgrounds. Well-educated speech-language pathologists will consider many factors when delivering services. Some of them will be discussed during the presentation (language spoken, beliefs about health care and disabilities, reasons for immigration, etc.). The communication styles of the client can be different than the styles of the speech-language pathologist. The goal is to become culturally responsive in service delivery.Keywords: culture, cultural competence, culturallly responsive practices, speech-language pathologist, cultural and linguistical variables, communication styles
Procedia PDF Downloads 76820 Effect of Noise Reduction Algorithms on Temporal Splitting of Speech Signal to Improve Speech Perception for Binaural Hearing Aids
Authors: Rajani S. Pujar, Pandurangarao N. Kulkarni
Abstract:
Increased temporal masking affects the speech perception in persons with sensorineural hearing impairment especially under adverse listening conditions. This paper presents a cascaded scheme, which employs a noise reduction algorithm as well as temporal splitting of the speech signal. Earlier investigations have shown that by splitting the speech temporally and presenting alternate segments to the two ears help in reducing the effect of temporal masking. In this technique, the speech signal is processed by two fading functions, complementary to each other, and presented to left and right ears for binaural dichotic presentation. In the present study, half cosine signal is used as a fading function with crossover gain of 6 dB for the perceptual balance of loudness. Temporal splitting is combined with noise reduction algorithm to improve speech perception in the background noise. Two noise reduction schemes, namely spectral subtraction and Wiener filter are used. Listening tests were conducted on six normal-hearing subjects, with sensorineural loss simulated by adding broadband noise to the speech signal at different signal-to-noise ratios (∞, 3, 0, and -3 dB). Objective evaluation using PESQ was also carried out. The MOS score for VCV syllable /asha/ for SNR values of ∞, 3, 0, and -3 dB were 5, 4.46, 4.4 and 4.05 respectively, while the corresponding MOS scores for unprocessed speech were 5, 1.2, 0.9 and 0.65, indicating significant improvement in the perceived speech quality for the proposed scheme compared to the unprocessed speech.Keywords: MOS, PESQ, spectral subtraction, temporal splitting, wiener filter
Procedia PDF Downloads 327819 Efficacy of a Wiener Filter Based Technique for Speech Enhancement in Hearing Aids
Authors: Ajish K. Abraham
Abstract:
Hearing aid is the most fundamental technology employed towards rehabilitation of persons with sensory neural hearing impairment. Hearing in noise is still a matter of major concern for many hearing aid users and thus continues to be a challenging issue for the hearing aid designers. Several techniques are being currently used to enhance the speech at the hearing aid output. Most of these techniques, when implemented, result in reduction of intelligibility of the speech signal. Thus the dissatisfaction of the hearing aid user towards comprehending the desired speech amidst noise is prevailing. Multichannel Wiener Filter is widely implemented in binaural hearing aid technology for noise reduction. In this study, Wiener filter based noise reduction approach is experimented for a single microphone based hearing aid set up. This method checks the status of the input speech signal in each frequency band and then selects the relevant noise reduction procedure. Results showed that the Wiener filter based algorithm is capable of enhancing speech even when the input acoustic signal has a very low Signal to Noise Ratio (SNR). Performance of the algorithm was compared with other similar algorithms on the basis of improvement in intelligibility and SNR of the output, at different SNR levels of the input speech. Wiener filter based algorithm provided significant improvement in SNR and intelligibility compared to other techniques.Keywords: hearing aid output speech, noise reduction, SNR improvement, Wiener filter, speech enhancement
Procedia PDF Downloads 247818 A Two-Stage Adaptation towards Automatic Speech Recognition System for Malay-Speaking Children
Authors: Mumtaz Begum Mustafa, Siti Salwah Salim, Feizal Dani Rahman
Abstract:
Recently, Automatic Speech Recognition (ASR) systems were used to assist children in language acquisition as it has the ability to detect human speech signal. Despite the benefits offered by the ASR system, there is a lack of ASR systems for Malay-speaking children. One of the contributing factors for this is the lack of continuous speech database for the target users. Though cross-lingual adaptation is a common solution for developing ASR systems for under-resourced language, it is not viable for children as there are very limited speech databases as a source model. In this research, we propose a two-stage adaptation for the development of ASR system for Malay-speaking children using a very limited database. The two stage adaptation comprises the cross-lingual adaptation (first stage) and cross-age adaptation. For the first stage, a well-known speech database that is phonetically rich and balanced, is adapted to the medium-sized Malay adults using supervised MLLR. The second stage adaptation uses the speech acoustic model generated from the first adaptation, and the target database is a small-sized database of the target users. We have measured the performance of the proposed technique using word error rate, and then compare them with the conventional benchmark adaptation. The two stage adaptation proposed in this research has better recognition accuracy as compared to the benchmark adaptation in recognizing children’s speech.Keywords: Automatic Speech Recognition System, children speech, adaptation, Malay
Procedia PDF Downloads 397817 The Complaint Speech Act Set Produced by Arab Students in the UAE
Authors: Tanju Deveci
Abstract:
It appears that the speech act of complaint has not received as much attention as other speech acts. However, the face-threatening nature of this speech act requires a special attention in multicultural contexts in particular. The teaching context in the UAE universities, where a big majority of teaching staff comes from other cultures, requires investigations into this speech act in order to improve communication between students and faculty. This session will outline the results of a study conducted with this purpose. The realization of complaints by Freshman English students in Communication courses at Petroleum Institute was investigated to identify communication patterns that seem to cause a strain. Data were collected using a role-play between a teacher and students, and a judgment scale completed by two of the instructors in the Communications Department. The initial findings reveal that the students had difficulty putting their case, produced the speech act of criticism along with a complaint and that they produced both requests and demands as candidate solutions. The judgement scales revealed that the students’ attitude was not appropriate most of the time and that the judges would behave differently from students. It is concluded that speech acts, in general, and complaint, in particular, need to be taught to learners explicitly to improve interpersonal communication in multicultural societies. Some teaching ideas are provided to help increase foreign language learners’ sociolinguistic competence.Keywords: speech act, complaint, pragmatics, sociolinguistics, language teaching
Procedia PDF Downloads 507816 On Overcoming Common Oral Speech Problems through Authentic Films
Authors: Tamara Matevosyan
Abstract:
The present paper discusses the main problems that students face while developing oral skills through authentic films. It states that special attention should be paid not only to the study of verbal speech but also to non-verbal communication. Authentic films serve as an important tool to understand both native speaker’s gestures and their culture of pausing while speaking. Various phonetic difficulties causing phonetic interference in actual speech are covered in the paper emphasizing the role of authentic films in overcoming them.Keywords: compressive speech, filled pauses, unfilled pauses, pausing culture
Procedia PDF Downloads 353815 Acoustic Characteristics of Ḫijaiyaḫ Letters Pronunciation by Indonesian Native Speaker
Authors: Romi Hardiyansyah, Raden Sugeng Joko Sarwono, Agus Samsi
Abstract:
Indonesian people have a mother language but not Arabic. Meanwhile, they must be able to pronounce the Arabic because Islam is the biggest religion in Indonesia. Arabic is composed by ḫijaiyaḫ letters which has its own pronunciation. Sound production process in humans can be divided into three physiological processes, namely: the formation of airflow from the lungs, the change in airflow from the lungs into the sound, and articulation (the modulation/sound setting into a specific sound). Ḫijaiyaḫ letters has its own articulation, some of which seem strange for most people in Indonesia. Those letters come out from the middle and upper throat so that the letters has its own acoustic characteristics. Acoustic characteristics of voice can be observed by source-filter approach that has parameters: pitch, formant, and formant bandwidth. Pitch is the basic tone in every human being. Formant is the resonance frequency of the human voice. Formant bandwidth is the time-width of a formant. After recording the sound from 21 subjects, data is processed by software Praat version 5.3.39. The analysis showed that each pronunciation, syakal (vowel changer), and the place of discharge letters has the same timbre which are determined by third and fourth formant.Keywords: ḫijaiyaḫ, articulation, pitch, formant, formant bandwidth, timbre
Procedia PDF Downloads 396814 Morpheme Based Parts of Speech Tagger for Kannada Language
Authors: M. C. Padma, R. J. Prathibha
Abstract:
Parts of speech tagging is the process of assigning appropriate parts of speech tags to the words in a given text. The critical or crucial information needed for tagging a word come from its internal structure rather from its neighboring words. The internal structure of a word comprises of its morphological features and grammatical information. This paper presents a morpheme based parts of speech tagger for Kannada language. This proposed work uses hierarchical tag set for assigning tags. The system is tested on some Kannada words taken from EMILLE corpus. Experimental result shows that the performance of the proposed system is above 90%.Keywords: hierarchical tag set, morphological analyzer, natural language processing, paradigms, parts of speech
Procedia PDF Downloads 296813 The Convolution Recurrent Network of Using Residual LSTM to Process the Output of the Downsampling for Monaural Speech Enhancement
Authors: Shibo Wei, Ting Jiang
Abstract:
Convolutional-recurrent neural networks (CRN) have achieved much success recently in the speech enhancement field. The common processing method is to use the convolution layer to compress the feature space by multiple upsampling and then model the compressed features with the LSTM layer. At last, the enhanced speech is obtained by deconvolution operation to integrate the global information of the speech sequence. However, the feature space compression process may cause the loss of information, so we propose to model the upsampling result of each step with the residual LSTM layer, then join it with the output of the deconvolution layer and input them to the next deconvolution layer, by this way, we want to integrate the global information of speech sequence better. The experimental results show the network model (RES-CRN) we introduce can achieve better performance than LSTM without residual and overlaying LSTM simply in the original CRN in terms of scale-invariant signal-to-distortion ratio (SI-SNR), speech quality (PESQ), and intelligibility (STOI).Keywords: convolutional-recurrent neural networks, speech enhancement, residual LSTM, SI-SNR
Procedia PDF Downloads 200