Search results for: speech therapy
2672 Speech and LanguageTherapists’ Advices for Multilingual Children with Developmental Language Disorders
Authors: Rudinë Fetahaj, Flaka Isufi, Kristina Hansson
Abstract:
While evidence shows that in most European countries’ multilingualism is rising, unfortunately, the focus of Speech and Language Therapy (SLT) is still monolingualism. Furthermore, there is sparse information on how the needs of multilingual children with language disorders such as Developmental Language Disorder (DLD) are being met and which factors affect the intervention approach of SLTs when treating DLD. This study aims to examine the relationship and correlation between the number of languages SLTs speak, years of experience, and length of education with the advice they give to parents of multilingual children with DLD regarding which language to be spoken. This is a cross-sectional study where a survey was completed online by 2608 SLTs across Europe and data has been used from a 2017 COST-action project. IBM-SPSS-28 was used where descriptive analysis, correlation and Kruskal-Wallis test were performed.SLTs mainly advise the parents of multilingual children with DLD to speak their native language at home. Besides years of experience, language status and the level of education showed to have no association with the type of advice SLTs give. Results showed a non-significant moderate positive correlation between SLTs years of experience and their advice regarding the native language, whereas language status and length of education showed no correlation with the advice SLTs give to parents.Keywords: quantitative study, developmental language disorders, multilingualism, speech and language therapy, children, European context
Procedia PDF Downloads 812671 Speech Acts and Politeness Strategies in an EFL Classroom in Georgia
Authors: Tinatin Kurdghelashvili
Abstract:
The paper deals with the usage of speech acts and politeness strategies in an EFL classroom in Georgia (Rep of). It explores the students’ and the teachers’ practice of the politeness strategies and the speech acts of apology, thanking, request, compliment/encouragement, command, agreeing/disagreeing, addressing and code switching. The research method includes observation as well as a questionnaire. The target group involves the students from Georgian public schools and two certified, experienced local English teachers. The analysis is based on Searle’s Speech Act Theory and Brown and Levinson’s politeness strategies. The findings show that the students have certain knowledge regarding politeness yet they fail to apply them in English communication. In addition, most of the speech acts from the classroom interaction are used by the teachers and not the students. Thereby, it is suggested that teachers should cultivate the students’ communicative competence and attempt to give them opportunities to practice more English speech acts than they do today.Keywords: english as a foreign language, Georgia, politeness principles, speech acts
Procedia PDF Downloads 6382670 Speech Detection Model Based on Deep Neural Networks Classifier for Speech Emotions Recognition
Authors: A. Shoiynbek, K. Kozhakhmet, P. Menezes, D. Kuanyshbay, D. Bayazitov
Abstract:
Speech emotion recognition has received increasing research interest all through current years. There was used emotional speech that was collected under controlled conditions in most research work. Actors imitating and artificially producing emotions in front of a microphone noted those records. There are four issues related to that approach, namely, (1) emotions are not natural, and it means that machines are learning to recognize fake emotions. (2) Emotions are very limited by quantity and poor in their variety of speaking. (3) There is language dependency on SER. (4) Consequently, each time when researchers want to start work with SER, they need to find a good emotional database on their language. In this paper, we propose the approach to create an automatic tool for speech emotion extraction based on facial emotion recognition and describe the sequence of actions of the proposed approach. One of the first objectives of the sequence of actions is a speech detection issue. The paper gives a detailed description of the speech detection model based on a fully connected deep neural network for Kazakh and Russian languages. Despite the high results in speech detection for Kazakh and Russian, the described process is suitable for any language. To illustrate the working capacity of the developed model, we have performed an analysis of speech detection and extraction from real tasks.Keywords: deep neural networks, speech detection, speech emotion recognition, Mel-frequency cepstrum coefficients, collecting speech emotion corpus, collecting speech emotion dataset, Kazakh speech dataset
Procedia PDF Downloads 1022669 The Influence of Advertising Captions on the Internet through the Consumer Purchasing Decision
Authors: Suwimol Apapol, Punrapha Praditpong
Abstract:
The objectives of the study were to find out the frequencies of figures of speech in fragrance advertising captions as well as the types of figures of speech most commonly applied in captions. The relation between figures of speech and fragrance was also examined in order to analyze how figures of speech were used to represent fragrance. Thirty-five fragrance advertisements were randomly selected from the Internet. Content analysis was applied in order to consider the relation between figures of speech and fragrance. The results showed that figures of speech were found in almost every fragrance advertisement except one advertisement of several Goods service. Thirty-four fragrance advertising captions used at least one kind of figure of speech. Metaphor was most frequently found and also most frequently applied in fragrance advertising captions, followed by alliteration, rhyme, simile and personification, and hyperbole respectively which is in harmony with the research hypotheses as well.Keywords: advertising captions, captions on internet, consumer purchasing decision, e-commerce
Procedia PDF Downloads 2712668 A Novel RLS Based Adaptive Filtering Method for Speech Enhancement
Authors: Pogula Rakesh, T. Kishore Kumar
Abstract:
Speech enhancement is a long standing problem with numerous applications like teleconferencing, VoIP, hearing aids, and speech recognition. The motivation behind this research work is to obtain a clean speech signal of higher quality by applying the optimal noise cancellation technique. Real-time adaptive filtering algorithms seem to be the best candidate among all categories of the speech enhancement methods. In this paper, we propose a speech enhancement method based on Recursive Least Squares (RLS) adaptive filter of speech signals. Experiments were performed on noisy data which was prepared by adding AWGN, Babble and Pink noise to clean speech samples at -5dB, 0dB, 5dB, and 10dB SNR levels. We then compare the noise cancellation performance of proposed RLS algorithm with existing NLMS algorithm in terms of Mean Squared Error (MSE), Signal to Noise ratio (SNR), and SNR loss. Based on the performance evaluation, the proposed RLS algorithm was found to be a better optimal noise cancellation technique for speech signals.Keywords: adaptive filter, adaptive noise canceller, mean squared error, noise reduction, NLMS, RLS, SNR, SNR loss
Procedia PDF Downloads 4832667 A Novel Machine Learning Approach to Aid Agrammatism in Non-fluent Aphasia
Authors: Rohan Bhasin
Abstract:
Agrammatism in non-fluent Aphasia Cases can be defined as a language disorder wherein a patient can only use content words ( nouns, verbs and adjectives ) for communication and their speech is devoid of functional word types like conjunctions and articles, generating speech of with extremely rudimentary grammar . Past approaches involve Speech Therapy of some order with conversation analysis used to analyse pre-therapy speech patterns and qualitative changes in conversational behaviour after therapy. We describe this approach as a novel method to generate functional words (prepositions, articles, ) around content words ( nouns, verbs and adjectives ) using a combination of Natural Language Processing and Deep Learning algorithms. The applications of this approach can be used to assist communication. The approach the paper investigates is : LSTMs or Seq2Seq: A sequence2sequence approach (seq2seq) or LSTM would take in a sequence of inputs and output sequence. This approach needs a significant amount of training data, with each training data containing pairs such as (content words, complete sentence). We generate such data by starting with complete sentences from a text source, removing functional words to get just the content words. However, this approach would require a lot of training data to get a coherent input. The assumptions of this approach is that the content words received in the inputs of both text models are to be preserved, i.e, won't alter after the functional grammar is slotted in. This is a potential limit to cases of severe Agrammatism where such order might not be inherently correct. The applications of this approach can be used to assist communication mild Agrammatism in non-fluent Aphasia Cases. Thus by generating these function words around the content words, we can provide meaningful sentence options to the patient for articulate conversations. Thus our project translates the use case of generating sentences from content-specific words into an assistive technology for non-Fluent Aphasia Patients.Keywords: aphasia, expressive aphasia, assistive algorithms, neurology, machine learning, natural language processing, language disorder, behaviour disorder, sequence to sequence, LSTM
Procedia PDF Downloads 1642666 Prosodic Characteristics of Post Traumatic Stress Disorder Induced Speech Changes
Authors: Jarek Krajewski, Andre Wittenborn, Martin Sauerland
Abstract:
This abstract describes a promising approach for estimating post-traumatic stress disorder (PTSD) based on prosodic speech characteristics. It illustrates the validity of this method by briefly discussing results from an Arabic refugee sample (N= 47, 32 m, 15 f). A well-established standardized self-report scale “Reaction of Adolescents to Traumatic Stress” (RATS) was used to determine the ground truth level of PTSD. The speech material was prompted by telling about autobiographical related sadness inducing experiences (sampling rate 16 kHz, 8 bit resolution). In order to investigate PTSD-induced speech changes, a self-developed set of 136 prosodic speech features was extracted from the .wav files. This set was adapted to capture traumatization related speech phenomena. An artificial neural network (ANN) machine learning model was applied to determine the PTSD level and reached a correlation of r = .37. These results indicate that our classifiers can achieve similar results to those seen in speech-based stress research.Keywords: speech prosody, PTSD, machine learning, feature extraction
Procedia PDF Downloads 912665 An Algorithm Based on the Nonlinear Filter Generator for Speech Encryption
Authors: A. Belmeguenai, K. Mansouri, R. Djemili
Abstract:
This work present a new algorithm based on the nonlinear filter generator for speech encryption and decryption. The proposed algorithm consists on the use a linear feedback shift register (LFSR) whose polynomial is primitive and nonlinear Boolean function. The purpose of this system is to construct Keystream with good statistical properties, but also easily computable on a machine with limited capacity calculated. This proposed speech encryption scheme is very simple, highly efficient, and fast to implement the speech encryption and decryption. We conclude the paper by showing that this system can resist certain known attacks.Keywords: nonlinear filter generator, stream ciphers, speech encryption, security analysis
Procedia PDF Downloads 2972664 Review of Speech Recognition Research on Low-Resource Languages
Authors: XuKe Cao
Abstract:
This paper reviews the current state of research on low-resource languages in the field of speech recognition, focusing on the challenges faced by low-resource language speech recognition, including the scarcity of data resources, the lack of linguistic resources, and the diversity of dialects and accents. The article reviews recent progress in low-resource language speech recognition, including techniques such as data augmentation, end to-end models, transfer learning, and multi-task learning. Based on the challenges currently faced, the paper also provides an outlook on future research directions. Through these studies, it is expected that the performance of speech recognition for low resource languages can be improved, promoting the widespread application and adoption of related technologies.Keywords: low-resource languages, speech recognition, data augmentation techniques, NLP
Procedia PDF Downloads 182663 Speech Detection Model Based on Deep Neural Networks Classifier for Speech Emotions Recognition
Authors: Aisultan Shoiynbek, Darkhan Kuanyshbay, Paulo Menezes, Akbayan Bekarystankyzy, Assylbek Mukhametzhanov, Temirlan Shoiynbek
Abstract:
Speech emotion recognition (SER) has received increasing research interest in recent years. It is a common practice to utilize emotional speech collected under controlled conditions recorded by actors imitating and artificially producing emotions in front of a microphone. There are four issues related to that approach: emotions are not natural, meaning that machines are learning to recognize fake emotions; emotions are very limited in quantity and poor in variety of speaking; there is some language dependency in SER; consequently, each time researchers want to start work with SER, they need to find a good emotional database in their language. This paper proposes an approach to create an automatic tool for speech emotion extraction based on facial emotion recognition and describes the sequence of actions involved in the proposed approach. One of the first objectives in the sequence of actions is the speech detection issue. The paper provides a detailed description of the speech detection model based on a fully connected deep neural network for Kazakh and Russian. Despite the high results in speech detection for Kazakh and Russian, the described process is suitable for any language. To investigate the working capacity of the developed model, an analysis of speech detection and extraction from real tasks has been performed.Keywords: deep neural networks, speech detection, speech emotion recognition, Mel-frequency cepstrum coefficients, collecting speech emotion corpus, collecting speech emotion dataset, Kazakh speech dataset
Procedia PDF Downloads 272662 Modern Machine Learning Conniptions for Automatic Speech Recognition
Authors: S. Jagadeesh Kumar
Abstract:
This expose presents a luculent of recent machine learning practices as employed in the modern and as pertinent to prospective automatic speech recognition schemes. The aspiration is to promote additional traverse ablution among the machine learning and automatic speech recognition factions that have transpired in the precedent. The manuscript is structured according to the chief machine learning archetypes that are furthermore trendy by now or have latency for building momentous hand-outs to automatic speech recognition expertise. The standards offered and convoluted in this article embraces adaptive and multi-task learning, active learning, Bayesian learning, discriminative learning, generative learning, supervised and unsupervised learning. These learning archetypes are aggravated and conferred in the perspective of automatic speech recognition tools and functions. This manuscript bequeaths and surveys topical advances of deep learning and learning with sparse depictions; further limelight is on their incessant significance in the evolution of automatic speech recognition.Keywords: automatic speech recognition, deep learning methods, machine learning archetypes, Bayesian learning, supervised and unsupervised learning
Procedia PDF Downloads 4482661 Prosody Generation in Neutral Speech Storytelling Application Using Tilt Model
Authors: Manjare Chandraprabha A., S. D. Shirbahadurkar, Manjare Anil S., Paithne Ajay N.
Abstract:
This paper proposes Intonation Modeling for Prosody generation in Neutral speech for Marathi (language spoken in Maharashtra, India) story telling applications. Nowadays audio story telling devices are very eminent for children. In this paper, we proposed tilt model for stressed words in Marathi for speech modification. Tilt model predicts modification in tone of neutral speech. GMM is used to identify stressed words for modification.Keywords: tilt model, fundamental frequency, statistical parametric speech synthesis, GMM
Procedia PDF Downloads 3932660 The Importance of Right Speech in Buddhism and Its Relevance Today
Authors: Gautam Sharda
Abstract:
The concept of right speech is the third stage of the noble eightfold path as prescribed by the Buddha and followed by millions of practicing Buddhists. The Buddha lays a lot of importance on the notion of right speech (Samma Vacca). In the Angutara Nikaya, the Buddha mentioned what constitutes right speech, which is basically four kinds of abstentions; namely abstaining from false speech, abstaining from slanderous speech, abstaining from harsh or hateful speech and abstaining from idle chatter. The Buddha gives reasons in support of his view as to why abstaining from these four kinds of speeches is favourable not only for maintaining the peace and equanimity within an individual but also within a society. It is a known fact that when we say something harsh or slanderous to others, it eventually affects our individual peace of mind too. We also know about the many examples of hate speeches which have led to senseless cases of violence and which are well documented within our country and the world. Also, indulging in false speech is not a healthy sign for individuals within a group as this kind of a social group which is based on falsities and lies cannot really survive for long and will eventually lead to chaos. Buddha also told us to refrain from idle chatter or gossip as generally we have seen that idle chatter or gossip does more harm than any good to the individual and the society. Hence, if most of us actually inculcate this third stage (namely, right speech) of the noble eightfold path of the Buddha in our daily life, it would be highly beneficial both for the individual and for the harmony of the society.Keywords: Buddhism, speech, individual, society
Procedia PDF Downloads 2662659 Advances in Artificial intelligence Using Speech Recognition
Authors: Khaled M. Alhawiti
Abstract:
This research study aims to present a retrospective study about speech recognition systems and artificial intelligence. Speech recognition has become one of the widely used technologies, as it offers great opportunity to interact and communicate with automated machines. Precisely, it can be affirmed that speech recognition facilitates its users and helps them to perform their daily routine tasks, in a more convenient and effective manner. This research intends to present the illustration of recent technological advancements, which are associated with artificial intelligence. Recent researches have revealed the fact that speech recognition is found to be the utmost issue, which affects the decoding of speech. In order to overcome these issues, different statistical models were developed by the researchers. Some of the most prominent statistical models include acoustic model (AM), language model (LM), lexicon model, and hidden Markov models (HMM). The research will help in understanding all of these statistical models of speech recognition. Researchers have also formulated different decoding methods, which are being utilized for realistic decoding tasks and constrained artificial languages. These decoding methods include pattern recognition, acoustic phonetic, and artificial intelligence. It has been recognized that artificial intelligence is the most efficient and reliable methods, which are being used in speech recognition.Keywords: speech recognition, acoustic phonetic, artificial intelligence, hidden markov models (HMM), statistical models of speech recognition, human machine performance
Procedia PDF Downloads 4782658 Speech Enhancement Using Wavelet Coefficients Masking with Local Binary Patterns
Authors: Christian Arcos, Marley Vellasco, Abraham Alcaim
Abstract:
In this paper, we present a wavelet coefficients masking based on Local Binary Patterns (WLBP) approach to enhance the temporal spectra of the wavelet coefficients for speech enhancement. This technique exploits the wavelet denoising scheme, which splits the degraded speech into pyramidal subband components and extracts frequency information without losing temporal information. Speech enhancement in each high-frequency subband is performed by binary labels through the local binary pattern masking that encodes the ratio between the original value of each coefficient and the values of the neighbour coefficients. This approach enhances the high-frequency spectra of the wavelet transform instead of eliminating them through a threshold. A comparative analysis is carried out with conventional speech enhancement algorithms, demonstrating that the proposed technique achieves significant improvements in terms of PESQ, an international recommendation of objective measure for estimating subjective speech quality. Informal listening tests also show that the proposed method in an acoustic context improves the quality of speech, avoiding the annoying musical noise present in other speech enhancement techniques. Experimental results obtained with a DNN based speech recognizer in noisy environments corroborate the superiority of the proposed scheme in the robust speech recognition scenario.Keywords: binary labels, local binary patterns, mask, wavelet coefficients, speech enhancement, speech recognition
Procedia PDF Downloads 2302657 Application of the Bionic Wavelet Transform and Psycho-Acoustic Model for Speech Compression
Authors: Chafik Barnoussi, Mourad Talbi, Adnane Cherif
Abstract:
In this paper we propose a new speech compression system based on the application of the Bionic Wavelet Transform (BWT) combined with the psychoacoustic model. This compression system is a modified version of the compression system using a MDCT (Modified Discrete Cosine Transform) filter banks of 32 filters each and the psychoacoustic model. This modification consists in replacing the banks of the MDCT filter banks by the bionic wavelet coefficients which are obtained from the application of the BWT to the speech signal to be compressed. These two methods are evaluated and compared with each other by computing bits before and bits after compression. They are tested on different speech signals and the obtained simulation results show that the proposed technique outperforms the second technique and this in term of compressed file size. In term of SNR, PSNR and NRMSE, the outputs speech signals of the proposed compression system are with acceptable quality. In term of PESQ and speech signal intelligibility, the proposed speech compression technique permits to obtain reconstructed speech signals with good quality.Keywords: speech compression, bionic wavelet transform, filterbanks, psychoacoustic model
Procedia PDF Downloads 3842656 Features of Normative and Pathological Realizations of Sibilant Sounds for Computer-Aided Pronunciation Evaluation in Children
Authors: Zuzanna Miodonska, Michal Krecichwost, Pawel Badura
Abstract:
Sigmatism (lisping) is a speech disorder in which sibilant consonants are mispronounced. The diagnosis of this phenomenon is usually based on the auditory assessment. However, the progress in speech analysis techniques creates a possibility of developing computer-aided sigmatism diagnosis tools. The aim of the study is to statistically verify whether specific acoustic features of sibilant sounds may be related to pronunciation correctness. Such knowledge can be of great importance while implementing classifiers and designing novel tools for automatic sibilants pronunciation evaluation. The study covers analysis of various speech signal measures, including features proposed in the literature for the description of normative sibilants realization. Amplitudes and frequencies of three fricative formants (FF) are extracted based on local spectral maxima of the friction noise. Skewness, kurtosis, four normalized spectral moments (SM) and 13 mel-frequency cepstral coefficients (MFCC) with their 1st and 2nd derivatives (13 Delta and 13 Delta-Delta MFCC) are included in the analysis as well. The resulting feature vector contains 51 measures. The experiments are performed on the speech corpus containing words with selected sibilant sounds (/ʃ, ʒ/) pronounced by 60 preschool children with proper pronunciation or with natural pathologies. In total, 224 /ʃ/ segments and 191 /ʒ/ segments are employed in the study. The Mann-Whitney U test is employed for the analysis of stigmatism and normative pronunciation. Statistically, significant differences are obtained in most of the proposed features in children divided into these two groups at p < 0.05. All spectral moments and fricative formants appear to be distinctive between pathology and proper pronunciation. These metrics describe the friction noise characteristic for sibilants, which makes them particularly promising for the use in sibilants evaluation tools. Correspondences found between phoneme feature values and an expert evaluation of the pronunciation correctness encourage to involve speech analysis tools in diagnosis and therapy of sigmatism. Proposed feature extraction methods could be used in a computer-assisted stigmatism diagnosis or therapy systems.Keywords: computer-aided pronunciation evaluation, sigmatism diagnosis, speech signal analysis, statistical verification
Procedia PDF Downloads 3022655 Examining the Effects of Increasing Lexical Retrieval Attempts in Tablet-Based Naming Therapy for Aphasia
Authors: Jeanne Gallee, Sofia Vallila-Rohter
Abstract:
Technology-based applications are increasingly being utilized in aphasia rehabilitation as a means of increasing intensity of treatment and improving accessibility to treatment. These interactive therapies, often available on tablets, lead individuals to complete language and cognitive rehabilitation tasks that draw upon skills such as the ability to name items, recognize semantic features, count syllables, rhyme, and categorize objects. Tasks involve visual and auditory stimulus cues and provide feedback about the accuracy of a person’s response. Research has begun to examine the efficacy of tablet-based therapies for aphasia, yet much remains unknown about how individuals interact with these therapy applications. Thus, the current study aims to examine the efficacy of a tablet-based therapy program for anomia, further examining how strategy training might influence the way that individuals with aphasia engage with and benefit from therapy. Individuals with aphasia are enrolled in one of two treatment paradigms: traditional therapy or strategy therapy. For ten weeks, all participants receive 2 hours of weekly in-house therapy using Constant Therapy, a tablet-based therapy application. Participants are provided with iPads and are additionally encouraged to work on therapy tasks for one hour a day at home (home logins). For those enrolled in traditional therapy, in-house sessions involve completing therapy tasks while a clinician researcher is present. For those enrolled in the strategy training group, in-house sessions focus on limiting cue use in order to maximize lexical retrieval attempts and naming opportunities. The strategy paradigm is based on the principle that retrieval attempts may foster long-term naming gains. Data have been collected from 7 participants with aphasia (3 in the traditional therapy group, 4 in the strategy training group). We examine cue use, latency of responses and accuracy through the course of therapy, comparing results across group and setting (in-house sessions vs. home logins).Keywords: aphasia, speech-language pathology, traumatic brain injury, language
Procedia PDF Downloads 2042654 Hate Speech Detection Using Deep Learning and Machine Learning Models
Authors: Nabil Shawkat, Jamil Saquer
Abstract:
Social media has accelerated our ability to engage with others and eliminated many communication barriers. On the other hand, the widespread use of social media resulted in an increase in online hate speech. This has drastic impacts on vulnerable individuals and societies. Therefore, it is critical to detect hate speech to prevent innocent users and vulnerable communities from becoming victims of hate speech. We investigate the performance of different deep learning and machine learning algorithms on three different datasets. Our results show that the BERT model gives the best performance among all the models by achieving an F1-score of 90.6% on one of the datasets and F1-scores of 89.7% and 88.2% on the other two datasets.Keywords: hate speech, machine learning, deep learning, abusive words, social media, text classification
Procedia PDF Downloads 1392653 Speech Intelligibility Improvement Using Variable Level Decomposition DWT
Authors: Samba Raju, Chiluveru, Manoj Tripathy
Abstract:
Intelligibility is an essential characteristic of a speech signal, which is used to help in the understanding of information in speech signal. Background noise in the environment can deteriorate the intelligibility of a recorded speech. In this paper, we presented a simple variance subtracted - variable level discrete wavelet transform, which improve the intelligibility of speech. The proposed algorithm does not require an explicit estimation of noise, i.e., prior knowledge of the noise; hence, it is easy to implement, and it reduces the computational burden. The proposed algorithm decides a separate decomposition level for each frame based on signal dominant and dominant noise criteria. The performance of the proposed algorithm is evaluated with speech intelligibility measure (STOI), and results obtained are compared with Universal Discrete Wavelet Transform (DWT) thresholding and Minimum Mean Square Error (MMSE) methods. The experimental results revealed that the proposed scheme outperformed competing methodsKeywords: discrete wavelet transform, speech intelligibility, STOI, standard deviation
Procedia PDF Downloads 1492652 The Language Use of Middle Eastern Freedom Activists' Speeches: A Gender Perspective
Authors: Sulistyaningtyas
Abstract:
Examining the role of Middle Eastern freedom activists’ speech based on gender perspective is considered noteworthy because the society in the Middle East is patriarchal. This research aims to examine the language use of the Middle Eastern freedom activists’ speeches through gender perspective. The data sources are from male and female Middle Eastern freedom activists’ speech videos. In analyzing the data, the theories employed are about Language Style from Gender Perspective and The Language for Speech. The result reveals that there are sets of spoken language differences between male and female speakers. In using the language for speech, both male and female speakers produce metaphor, euphemism, the ‘rule of three’, parallelism, and pronouns in random frequency of production, which cannot be separated by genders. Moreover, it cannot be concluded that one gender is more potential than the other to influence the audience in delivering speech. There are other factors, particularly non-verbal factors, existing to give impacts on how a speech can influence the audience.Keywords: gender perspective, language use, Middle Eastern freedom activists, speech
Procedia PDF Downloads 4232651 Considering Cultural and Linguistic Variables When Working as a Speech-Language Pathologist with Multicultural Students
Authors: Gabriela Smeckova
Abstract:
The entire world is becoming more and more diverse. The reasons why people migrate are different and unique for each family /individual. Professionals delivering services (including speech-language pathologists) must be prepared to work with clients coming from different cultural and/or linguistic backgrounds. Well-educated speech-language pathologists will consider many factors when delivering services. Some of them will be discussed during the presentation (language spoken, beliefs about health care and disabilities, reasons for immigration, etc.). The communication styles of the client can be different than the styles of the speech-language pathologist. The goal is to become culturally responsive in service delivery.Keywords: culture, cultural competence, culturallly responsive practices, speech-language pathologist, cultural and linguistical variables, communication styles
Procedia PDF Downloads 782650 Effect of Noise Reduction Algorithms on Temporal Splitting of Speech Signal to Improve Speech Perception for Binaural Hearing Aids
Authors: Rajani S. Pujar, Pandurangarao N. Kulkarni
Abstract:
Increased temporal masking affects the speech perception in persons with sensorineural hearing impairment especially under adverse listening conditions. This paper presents a cascaded scheme, which employs a noise reduction algorithm as well as temporal splitting of the speech signal. Earlier investigations have shown that by splitting the speech temporally and presenting alternate segments to the two ears help in reducing the effect of temporal masking. In this technique, the speech signal is processed by two fading functions, complementary to each other, and presented to left and right ears for binaural dichotic presentation. In the present study, half cosine signal is used as a fading function with crossover gain of 6 dB for the perceptual balance of loudness. Temporal splitting is combined with noise reduction algorithm to improve speech perception in the background noise. Two noise reduction schemes, namely spectral subtraction and Wiener filter are used. Listening tests were conducted on six normal-hearing subjects, with sensorineural loss simulated by adding broadband noise to the speech signal at different signal-to-noise ratios (∞, 3, 0, and -3 dB). Objective evaluation using PESQ was also carried out. The MOS score for VCV syllable /asha/ for SNR values of ∞, 3, 0, and -3 dB were 5, 4.46, 4.4 and 4.05 respectively, while the corresponding MOS scores for unprocessed speech were 5, 1.2, 0.9 and 0.65, indicating significant improvement in the perceived speech quality for the proposed scheme compared to the unprocessed speech.Keywords: MOS, PESQ, spectral subtraction, temporal splitting, wiener filter
Procedia PDF Downloads 3282649 Efficacy of a Wiener Filter Based Technique for Speech Enhancement in Hearing Aids
Authors: Ajish K. Abraham
Abstract:
Hearing aid is the most fundamental technology employed towards rehabilitation of persons with sensory neural hearing impairment. Hearing in noise is still a matter of major concern for many hearing aid users and thus continues to be a challenging issue for the hearing aid designers. Several techniques are being currently used to enhance the speech at the hearing aid output. Most of these techniques, when implemented, result in reduction of intelligibility of the speech signal. Thus the dissatisfaction of the hearing aid user towards comprehending the desired speech amidst noise is prevailing. Multichannel Wiener Filter is widely implemented in binaural hearing aid technology for noise reduction. In this study, Wiener filter based noise reduction approach is experimented for a single microphone based hearing aid set up. This method checks the status of the input speech signal in each frequency band and then selects the relevant noise reduction procedure. Results showed that the Wiener filter based algorithm is capable of enhancing speech even when the input acoustic signal has a very low Signal to Noise Ratio (SNR). Performance of the algorithm was compared with other similar algorithms on the basis of improvement in intelligibility and SNR of the output, at different SNR levels of the input speech. Wiener filter based algorithm provided significant improvement in SNR and intelligibility compared to other techniques.Keywords: hearing aid output speech, noise reduction, SNR improvement, Wiener filter, speech enhancement
Procedia PDF Downloads 2472648 A Two-Stage Adaptation towards Automatic Speech Recognition System for Malay-Speaking Children
Authors: Mumtaz Begum Mustafa, Siti Salwah Salim, Feizal Dani Rahman
Abstract:
Recently, Automatic Speech Recognition (ASR) systems were used to assist children in language acquisition as it has the ability to detect human speech signal. Despite the benefits offered by the ASR system, there is a lack of ASR systems for Malay-speaking children. One of the contributing factors for this is the lack of continuous speech database for the target users. Though cross-lingual adaptation is a common solution for developing ASR systems for under-resourced language, it is not viable for children as there are very limited speech databases as a source model. In this research, we propose a two-stage adaptation for the development of ASR system for Malay-speaking children using a very limited database. The two stage adaptation comprises the cross-lingual adaptation (first stage) and cross-age adaptation. For the first stage, a well-known speech database that is phonetically rich and balanced, is adapted to the medium-sized Malay adults using supervised MLLR. The second stage adaptation uses the speech acoustic model generated from the first adaptation, and the target database is a small-sized database of the target users. We have measured the performance of the proposed technique using word error rate, and then compare them with the conventional benchmark adaptation. The two stage adaptation proposed in this research has better recognition accuracy as compared to the benchmark adaptation in recognizing children’s speech.Keywords: Automatic Speech Recognition System, children speech, adaptation, Malay
Procedia PDF Downloads 3982647 The Complaint Speech Act Set Produced by Arab Students in the UAE
Authors: Tanju Deveci
Abstract:
It appears that the speech act of complaint has not received as much attention as other speech acts. However, the face-threatening nature of this speech act requires a special attention in multicultural contexts in particular. The teaching context in the UAE universities, where a big majority of teaching staff comes from other cultures, requires investigations into this speech act in order to improve communication between students and faculty. This session will outline the results of a study conducted with this purpose. The realization of complaints by Freshman English students in Communication courses at Petroleum Institute was investigated to identify communication patterns that seem to cause a strain. Data were collected using a role-play between a teacher and students, and a judgment scale completed by two of the instructors in the Communications Department. The initial findings reveal that the students had difficulty putting their case, produced the speech act of criticism along with a complaint and that they produced both requests and demands as candidate solutions. The judgement scales revealed that the students’ attitude was not appropriate most of the time and that the judges would behave differently from students. It is concluded that speech acts, in general, and complaint, in particular, need to be taught to learners explicitly to improve interpersonal communication in multicultural societies. Some teaching ideas are provided to help increase foreign language learners’ sociolinguistic competence.Keywords: speech act, complaint, pragmatics, sociolinguistics, language teaching
Procedia PDF Downloads 5092646 On Overcoming Common Oral Speech Problems through Authentic Films
Authors: Tamara Matevosyan
Abstract:
The present paper discusses the main problems that students face while developing oral skills through authentic films. It states that special attention should be paid not only to the study of verbal speech but also to non-verbal communication. Authentic films serve as an important tool to understand both native speaker’s gestures and their culture of pausing while speaking. Various phonetic difficulties causing phonetic interference in actual speech are covered in the paper emphasizing the role of authentic films in overcoming them.Keywords: compressive speech, filled pauses, unfilled pauses, pausing culture
Procedia PDF Downloads 3532645 Morpheme Based Parts of Speech Tagger for Kannada Language
Authors: M. C. Padma, R. J. Prathibha
Abstract:
Parts of speech tagging is the process of assigning appropriate parts of speech tags to the words in a given text. The critical or crucial information needed for tagging a word come from its internal structure rather from its neighboring words. The internal structure of a word comprises of its morphological features and grammatical information. This paper presents a morpheme based parts of speech tagger for Kannada language. This proposed work uses hierarchical tag set for assigning tags. The system is tested on some Kannada words taken from EMILLE corpus. Experimental result shows that the performance of the proposed system is above 90%.Keywords: hierarchical tag set, morphological analyzer, natural language processing, paradigms, parts of speech
Procedia PDF Downloads 2962644 Effect of Early Therapeutic Intervention for the Children with Autism Spectrum Disorders: A Quasi Experimental Design
Authors: Sultana Razia
Abstract:
The purpose of this study was to investigate the effect of early therapeutic intervention on children with an autism spectrum disorder. Participants were 140 children with autism spectrum disorder from Autism Corner in a selected rehabilitation center of Bangladesh. This study included children who are at aged of 18-month to 36-month and who were taking occupational therapy and speech and language therapy from the autism center. They were primarily screened using M-CHAT; however, children with other physical disabilities or medical conditions were excluded. 3-months interventions of 6 sessions per week are a minimum of 45-minutes long per session, one to one interaction followed by parent-led structured home-based therapy were provided. The results indicated that early intensive therapeutic intervention improves understanding, social skills and sensory skills. It can be concluded that therapeutic early intervention has a positive effect on diminishing symptoms of Autism Spectrum Disorder.Keywords: autism, m-chat, reciprocal social behavior, CRP
Procedia PDF Downloads 1222643 The Convolution Recurrent Network of Using Residual LSTM to Process the Output of the Downsampling for Monaural Speech Enhancement
Authors: Shibo Wei, Ting Jiang
Abstract:
Convolutional-recurrent neural networks (CRN) have achieved much success recently in the speech enhancement field. The common processing method is to use the convolution layer to compress the feature space by multiple upsampling and then model the compressed features with the LSTM layer. At last, the enhanced speech is obtained by deconvolution operation to integrate the global information of the speech sequence. However, the feature space compression process may cause the loss of information, so we propose to model the upsampling result of each step with the residual LSTM layer, then join it with the output of the deconvolution layer and input them to the next deconvolution layer, by this way, we want to integrate the global information of speech sequence better. The experimental results show the network model (RES-CRN) we introduce can achieve better performance than LSTM without residual and overlaying LSTM simply in the original CRN in terms of scale-invariant signal-to-distortion ratio (SI-SNR), speech quality (PESQ), and intelligibility (STOI).Keywords: convolutional-recurrent neural networks, speech enhancement, residual LSTM, SI-SNR
Procedia PDF Downloads 202