Search results for: speech understanding
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7127

Search results for: speech understanding

6947 An Event-Related Potential Investigation of Speech-in-Noise Recognition in Native and Nonnative Speakers of English

Authors: Zahra Fotovatnia, Jeffery A. Jones, Alexandra Gottardo

Abstract:

Speech communication often occurs in environments where noise conceals part of a message. Listeners should compensate for the lack of auditory information by picking up distinct acoustic cues and using semantic and sentential context to recreate the speaker’s intended message. This situation seems to be more challenging in a nonnative than native language. On the other hand, early bilinguals are expected to show an advantage over the late bilingual and monolingual speakers of a language due to their better executive functioning components. In this study, English monolingual speakers were compared with early and late nonnative speakers of English to understand speech in noise processing (SIN) and the underlying neurobiological features of this phenomenon. Auditory mismatch negativities (MMNs) were recorded using a double-oddball paradigm in response to a minimal pair that differed in their middle vowel (beat/bit) at Wilfrid Laurier University in Ontario, Canada. The results did not show any significant structural and electroneural differences across groups. However, vocabulary knowledge correlated positively with performance on tests that measured SIN processing in participants who learned English after age 6. Moreover, their performance on the test negatively correlated with the integral area amplitudes in the left superior temporal gyrus (STG). In addition, the STG was engaged before the inferior frontal gyrus (IFG) in noise-free and low-noise test conditions in all groups. We infer that the pre-attentive processing of words engages temporal lobes earlier than the fronto-central areas and that vocabulary knowledge helps the nonnative perception of degraded speech.

Keywords: degraded speech perception, event-related brain potentials, mismatch negativities, brain regions

Procedia PDF Downloads 70
6946 Using Speech Emotion Recognition as a Longitudinal Biomarker for Alzheimer’s Diseases

Authors: Yishu Gong, Liangliang Yang, Jianyu Zhang, Zhengyu Chen, Sihong He, Xusheng Zhang, Wei Zhang

Abstract:

Alzheimer’s disease (AD) is a progressive neurodegenerative disorder that affects millions of people worldwide and is characterized by cognitive decline and behavioral changes. People living with Alzheimer’s disease often find it hard to complete routine tasks. However, there are limited objective assessments that aim to quantify the difficulty of certain tasks for AD patients compared to non-AD people. In this study, we propose to use speech emotion recognition (SER), especially the frustration level, as a potential biomarker for quantifying the difficulty patients experience when describing a picture. We build an SER model using data from the IEMOCAP dataset and apply the model to the DementiaBank data to detect the AD/non-AD group difference and perform longitudinal analysis to track the AD disease progression. Our results show that the frustration level detected from the SER model can possibly be used as a cost-effective tool for objective tracking of AD progression in addition to the Mini-Mental State Examination (MMSE) score.

Keywords: Alzheimer’s disease, speech emotion recognition, longitudinal biomarker, machine learning

Procedia PDF Downloads 81
6945 Teaching Pragmatic Coherence in Literary Text: Analysis of Chimamanda Adichie’s Americanah

Authors: Joy Aworo-Okoroh

Abstract:

Literary texts are mirrors of a real-life situation. Thus, authors choose the linguistic items that would best encode their intended meanings and messages. However, words mean more than they seem. The meaning of words is not static rather, it is dynamic as they constantly enter into relationships within a context. Literary texts can only be meaningful if all pragmatic cues are identified and interpreted. Drawing upon Teun Van Djik's theory of local pragmatic coherence, it is established that words enter into relations in a text and these relations account for sequential speech acts in the texts. Comprehension of the text is dependent on the interpretation of these relations.To show the relevance of pragmatic coherence in literary text analysis, ten conversations were selected in Americanah in order to give a clear idea of the pragmatic relations used. The conversations were analysed, identifying the speech act and epistemic relations inherent in them. A subtle analysis of the structure of the conversations was also carried out. It was discovered that justification is the most commonly used relation and the meaning of the text is dependent on the interpretation of these instances' pragmatic coherence. The study concludes that to effectively teach literature in English, pragmatic coherence should be incorporated as words mean more than they say.

Keywords: pragmatic coherence, epistemic coherence, speech act, Americanah

Procedia PDF Downloads 106
6944 Deep-Learning to Generation of Weights for Image Captioning Using Part-of-Speech Approach

Authors: Tiago do Carmo Nogueira, Cássio Dener Noronha Vinhal, Gélson da Cruz Júnior, Matheus Rudolfo Diedrich Ullmann

Abstract:

Generating automatic image descriptions through natural language is a challenging task. Image captioning is a task that consistently describes an image by combining computer vision and natural language processing techniques. To accomplish this task, cutting-edge models use encoder-decoder structures. Thus, Convolutional Neural Networks (CNN) are used to extract the characteristics of the images, and Recurrent Neural Networks (RNN) generate the descriptive sentences of the images. However, cutting-edge approaches still suffer from problems of generating incorrect captions and accumulating errors in the decoders. To solve this problem, we propose a model based on the encoder-decoder structure, introducing a module that generates the weights according to the importance of the word to form the sentence, using the part-of-speech (PoS). Thus, the results demonstrate that our model surpasses state-of-the-art models.

Keywords: gated recurrent units, caption generation, convolutional neural network, part-of-speech

Procedia PDF Downloads 68
6943 Complications and Outcomes of Cochlear Implantation in Children Younger than 12 Months: A Multicenter Study

Authors: Alimohamad Asghari, Ahmad Daneshi, Mohammad Farhadi, Arash Bayat, Mohammad Ajalloueyan, Marjan Mirsalehi, Mohsen Rajati, Seyed Basir Hashemi, Nader Saki, Ali Omidvari

Abstract:

Evidence suggests that Cochlear Implantation (CI) is a beneficial approach for auditory and speech skills improvement in children with severe to profound hearing loss. However, it remains controversial if implantation in children <12 months is safe and effective compared to older children. The present study aimed to determine whether children's ages affect surgical complications and auditory and speech development. The current multicenter study enrolled 86 children who underwent CI surgery at <12 months of age (group A) and 362 children who underwent implantation between 12 and 24 months of age (group B). The Categories of Auditory Performance (CAP) and Speech Intelligibility Rating (SIR) scores were determined pre-impanation, and "one-year" and "two-year" post-implantation. Four complications (overall rate: 4.65%; three minor) occurred in group A and 12 complications (overall rate: 4.41%; nine minor) occurred in group B. We found no statistically significant difference in the complication rates between the groups (p>0.05). The mean SIR and CAP scores improved over time following CI activation in both groups. However, we did not find significant differences in CAP and SIR scores between the groups across different time points. Cochlear implantation is a safe and efficient procedure in children younger than 12 months, providing substantial auditory and speech benefits comparable to children undergoing implantation at 12 to 24 months of age. Furthermore, surgical complications in younger children are similar to those of children undergoing the CI at an older age.

Keywords: cochlear implant, Infant, complications, outcome

Procedia PDF Downloads 76
6942 English Learning Speech Assistant Speak Application in Artificial Intelligence

Authors: Albatool Al Abdulwahid, Bayan Shakally, Mariam Mohamed, Wed Almokri

Abstract:

Artificial intelligence has infiltrated every part of our life and every field we can think of. With technical developments, artificial intelligence applications are becoming more prevalent. We chose ELSA speak because it is a magnificent example of Artificial intelligent applications, ELSA speak is a smartphone application that is free to download on both IOS and Android smartphones. ELSA speak utilizes artificial intelligence to help non-native English speakers pronounce words and phrases similar to a native speaker, as well as enhance their English skills. It employs speech-recognition technology that aids the application to excel the pronunciation of its users. This remarkable feature distinguishes ELSA from other voice recognition algorithms and increase the efficiency of the application. This study focused on evaluating ELSA speak application, by testing the degree of effectiveness based on survey questions. The results of the questionnaire were variable. The generality of the participants strongly agreed that ELSA has helped them enhance their pronunciation skills. However, a few participants were unconfident about the application’s ability to assist them in their learning journey.

Keywords: ELSA speak application, artificial intelligence, speech-recognition technology, language learning, english pronunciation

Procedia PDF Downloads 73
6941 Conspiracy Theory in Discussions of the Coronavirus Pandemic in the Gulf Region

Authors: Rasha Salameh

Abstract:

In light of the tense relationship between Saudi Arabia and Iran, this research paper sheds some light on Al-Arabiya’s reporting of Coronavirus in the Gulf. Particularly because most of the cases, in the beginning, were coming from Iran, some programs of this Saudi channel embraced a conspiracy theory. Hate speech has been used in talking about the topic and discussing it. The results of these discussions will be detailed in this paper in percentages with regard to the research sample, which includes five programs on Al-Arabiya channel: ‘DNA’, ‘Marraya’ (Mirrors), ‘Panorama’, ‘Tafaolcom’ (Your Interaction) and the ‘Diplomatic Street’, in the period between January 19, that is, the date of the first case in Iran, and April 10, 2020. The research shows the use of a conspiracy theory in the programs, in addition to some professional violations. The surveyed sample also shows that the matter receded due to the Arab Gulf states' preoccupation with the successively increasing cases that have appeared there since the start of the pandemic. The results indicate that hate speech was present in the sample at a rate of 98.1% and that most of the programs that dealt with the Iranian issue under the Corona pandemic on Al Arabiya used the conspiracy theory at a rate of 75.5%.

Keywords: Al-Arabiya, Iran, Corona, hate speech, conspiracy theory, politicization of the pandemic

Procedia PDF Downloads 103
6940 Reduced Lung Volume: A Possible Cause of Stuttering

Authors: Shantanu Arya, Sachin Sakhuja, Gunjan Mehta, Sanjay Munjal

Abstract:

Stuttering may be defined as a speech disorder affecting the fluency domain of speech and characterized by covert features like word substitution, omittance and circumlocution and overt features like prolongation of sound, syllables and blocks etc. Many etiologies have been postulated to explain stuttering based on various experiments and research. Moreover, Breathlessness has also been reported by many individuals with stuttering for which breathing exercises are generally advised. However, no studies reporting objective evaluation of the pulmonary capacity and further objective assessment of the efficacy of breathing exercises have been conducted. Pulmonary Function Test which evaluates parameters like Forced Vital Capacity, Peak Expiratory Flow Rate, Forced expiratory flow Rate can be used to study the pulmonary behavior of individuals with stuttering. The study aimed: a) To identify speech motor & physiologic behaviours associated with stuttering by administering PFT. b) To recognize possible reasons for an association between speech motor behaviour & stuttering severity. In this regard, PFT tests were administered on individuals who reported signs and symptoms of stuttering and showed abnormal scores on Stuttering Severity Index. Parameters like Forced Vital Capacity, Forced Expiratory Volume, Peak Expiratory Flow Rate (L/min), Forced Expiratory Flow Rate (L/min) were evaluated and correlated with scores of Stuttering Severity Index. Results showed significant decrease in the parameters (lower than normal scores) in individuals with established stuttering. Strong correlation was also found between degree of stuttering and the degree of decrease in the pulmonary volumes. Thus, it is evident that fluent speech requires strong support of lung pressure and requisite volumes. Further research in demonstrating the efficacy of abdominal breathing exercises in this regard is needed.

Keywords: forced expiratory flow rate, forced expiratory volume, forced vital capacity, peak expiratory flow rate, stuttering

Procedia PDF Downloads 243
6939 The Analysis of Deceptive and Truthful Speech: A Computational Linguistic Based Method

Authors: Seham El Kareh, Miramar Etman

Abstract:

Recently, detecting liars and extracting features which distinguish them from truth-tellers have been the focus of a wide range of disciplines. To the author’s best knowledge, most of the work has been done on facial expressions and body gestures but only few works have been done on the language used by both liars and truth-tellers. This paper sheds light on four axes. The first axis copes with building an audio corpus for deceptive and truthful speech for Egyptian Arabic speakers. The second axis focuses on examining the human perception of lies and proving our need for computational linguistic-based methods to extract features which characterize truthful and deceptive speech. The third axis is concerned with building a linguistic analysis program that could extract from the corpus the inter- and intra-linguistic cues for deceptive and truthful speech. The program built here is based on selected categories from the Linguistic Inquiry and Word Count program. Our results demonstrated that Egyptian Arabic speakers on one hand preferred to use first-person pronouns and present tense compared to the past tense when lying and their lies lacked of second-person pronouns, and on the other hand, when telling the truth, they preferred to use the verbs related to motion and the nouns related to time. The results also showed that there is a need for bigger data to prove the significance of words related to emotions and numbers.

Keywords: Egyptian Arabic corpus, computational analysis, deceptive features, forensic linguistics, human perception, truthful features

Procedia PDF Downloads 180
6938 Intonation Salience as an Underframe to Text Intonation Models

Authors: Tatiana Stanchuliak

Abstract:

It is common knowledge that intonation is not laid over a ready text. On the contrary, intonation forms and accompanies the text on the level of its birth in the speaker’s mind. As a result, intonation plays one of the fundamental roles in the process of transferring a thought into external speech. Intonation structure can highlight the semantic significance of textual elements and become a ranging mark in understanding the information structure of the text. Intonation functions by means of prosodic characteristics, one of which is intonation salience, whose function in texts results in making some textual elements more prominent than others. This function of intonation, therefore, performs as organizing. It helps to form the frame of key elements of the text. The study under consideration made an attempt to look into the inner nature of salience and create a sort of a text intonation model. This general goal brought to some more specific intermediate results. First, there were established degrees of salience on the level of the smallest semantic element - intonation group, as well as prosodic means of creating salience, were examined. Second, the most frequent combinations of prosodic means made it possible to distinguish patterns of salience, which then became constituent elements of a text intonation model. Third, the analysis of the predicate structure allowed to divide the whole text into smaller parts, or units, which performed a specific function in the developing of the general communicative intention. It appeared that such units can be found in any text and they have common characteristics of their intonation arrangement. These findings are certainly very important both for the theory of intonation and their practical application.

Keywords: accentuation , inner speech, intention, intonation, intonation functions, models, patterns, predicate, salience, semantics, sentence stress, text

Procedia PDF Downloads 234
6937 Potential and Development of Children with Atypical Rett Syndrome (CDKL5 Gene Mutation) and Augmentative and Alternative Communication

Authors: Anna Amato

Abstract:

Every child needs communication. If spoken language is not or not fully available due to congenital or acquired limitations, those affected need appropriate ways. These can be found in many possibilities of Augmentative and Alternative Communications (AAC). In the communication promotion of severely impaired children, who can use their own body communication forms only to a limited extent for the differentiated understanding, computers with eye control play an essential role. It takes some time to understand the individual forms of communication of the child. Children who depend on the AAC need competent support to learn to communicate in a motivated way in their everyday life. The aim of the present parents' survey (n = 4), which was evaluated descriptively, is to demonstrate the development of communicative abilities as well as the motivation to use complex communication aids with eye control by patients with atypical Rett Syndrome. An increase in communication skills, well-being, self-reliance, and self-esteem, an improvement in social participation, as well as a reduction in anger and screaming events, were noted. The complex visual communication tools were available daily for 3 out of 4 patients with atypical Rett Syndrome. It raises research questions regarding speech understanding and the ability to drive eye control technology in a larger group of atypical Rett Syndrome patients.

Keywords: augmentative and alternative communications, AAC, atypical Rett-syndrome, children, development

Procedia PDF Downloads 93
6936 Features of Normative and Pathological Realizations of Sibilant Sounds for Computer-Aided Pronunciation Evaluation in Children

Authors: Zuzanna Miodonska, Michal Krecichwost, Pawel Badura

Abstract:

Sigmatism (lisping) is a speech disorder in which sibilant consonants are mispronounced. The diagnosis of this phenomenon is usually based on the auditory assessment. However, the progress in speech analysis techniques creates a possibility of developing computer-aided sigmatism diagnosis tools. The aim of the study is to statistically verify whether specific acoustic features of sibilant sounds may be related to pronunciation correctness. Such knowledge can be of great importance while implementing classifiers and designing novel tools for automatic sibilants pronunciation evaluation. The study covers analysis of various speech signal measures, including features proposed in the literature for the description of normative sibilants realization. Amplitudes and frequencies of three fricative formants (FF) are extracted based on local spectral maxima of the friction noise. Skewness, kurtosis, four normalized spectral moments (SM) and 13 mel-frequency cepstral coefficients (MFCC) with their 1st and 2nd derivatives (13 Delta and 13 Delta-Delta MFCC) are included in the analysis as well. The resulting feature vector contains 51 measures. The experiments are performed on the speech corpus containing words with selected sibilant sounds (/ʃ, ʒ/) pronounced by 60 preschool children with proper pronunciation or with natural pathologies. In total, 224 /ʃ/ segments and 191 /ʒ/ segments are employed in the study. The Mann-Whitney U test is employed for the analysis of stigmatism and normative pronunciation. Statistically, significant differences are obtained in most of the proposed features in children divided into these two groups at p < 0.05. All spectral moments and fricative formants appear to be distinctive between pathology and proper pronunciation. These metrics describe the friction noise characteristic for sibilants, which makes them particularly promising for the use in sibilants evaluation tools. Correspondences found between phoneme feature values and an expert evaluation of the pronunciation correctness encourage to involve speech analysis tools in diagnosis and therapy of sigmatism. Proposed feature extraction methods could be used in a computer-assisted stigmatism diagnosis or therapy systems.

Keywords: computer-aided pronunciation evaluation, sigmatism diagnosis, speech signal analysis, statistical verification

Procedia PDF Downloads 273
6935 Part of Speech Tagging Using Statistical Approach for Nepali Text

Authors: Archit Yajnik

Abstract:

Part of Speech Tagging has always been a challenging task in the era of Natural Language Processing. This article presents POS tagging for Nepali text using Hidden Markov Model and Viterbi algorithm. From the Nepali text, annotated corpus training and testing data set are randomly separated. Both methods are employed on the data sets. Viterbi algorithm is found to be computationally faster and accurate as compared to HMM. The accuracy of 95.43% is achieved using Viterbi algorithm. Error analysis where the mismatches took place is elaborately discussed.

Keywords: hidden markov model, natural language processing, POS tagging, viterbi algorithm

Procedia PDF Downloads 302
6934 Italian Speech Vowels Landmark Detection through the Legacy Tool 'xkl' with Integration of Combined CNNs and RNNs

Authors: Kaleem Kashif, Tayyaba Anam, Yizhi Wu

Abstract:

This paper introduces a methodology for advancing Italian speech vowels landmark detection within the distinctive feature-based speech recognition domain. Leveraging the legacy tool 'xkl' by integrating combined convolutional neural networks (CNNs) and recurrent neural networks (RNNs), the study presents a comprehensive enhancement to the 'xkl' legacy software. This integration incorporates re-assigned spectrogram methodologies, enabling meticulous acoustic analysis. Simultaneously, our proposed model, integrating combined CNNs and RNNs, demonstrates unprecedented precision and robustness in landmark detection. The augmentation of re-assigned spectrogram fusion within the 'xkl' software signifies a meticulous advancement, particularly enhancing precision related to vowel formant estimation. This augmentation catalyzes unparalleled accuracy in landmark detection, resulting in a substantial performance leap compared to conventional methods. The proposed model emerges as a state-of-the-art solution in the distinctive feature-based speech recognition systems domain. In the realm of deep learning, a synergistic integration of combined CNNs and RNNs is introduced, endowed with specialized temporal embeddings, harnessing self-attention mechanisms, and positional embeddings. The proposed model allows it to excel in capturing intricate dependencies within Italian speech vowels, rendering it highly adaptable and sophisticated in the distinctive feature domain. Furthermore, our advanced temporal modeling approach employs Bayesian temporal encoding, refining the measurement of inter-landmark intervals. Comparative analysis against state-of-the-art models reveals a substantial improvement in accuracy, highlighting the robustness and efficacy of the proposed methodology. Upon rigorous testing on a database (LaMIT) speech recorded in a silent room by four Italian native speakers, the landmark detector demonstrates exceptional performance, achieving a 95% true detection rate and a 10% false detection rate. A majority of missed landmarks were observed in proximity to reduced vowels. These promising results underscore the robust identifiability of landmarks within the speech waveform, establishing the feasibility of employing a landmark detector as a front end in a speech recognition system. The synergistic integration of re-assigned spectrogram fusion, CNNs, RNNs, and Bayesian temporal encoding not only signifies a significant advancement in Italian speech vowels landmark detection but also positions the proposed model as a leader in the field. The model offers distinct advantages, including unparalleled accuracy, adaptability, and sophistication, marking a milestone in the intersection of deep learning and distinctive feature-based speech recognition. This work contributes to the broader scientific community by presenting a methodologically rigorous framework for enhancing landmark detection accuracy in Italian speech vowels. The integration of cutting-edge techniques establishes a foundation for future advancements in speech signal processing, emphasizing the potential of the proposed model in practical applications across various domains requiring robust speech recognition systems.

Keywords: landmark detection, acoustic analysis, convolutional neural network, recurrent neural network

Procedia PDF Downloads 14
6933 The Influence of Neural Synchrony on Auditory Middle Latency and Late Latency Responses and Its Correlation with Audiological Profile in Individuals with Auditory Neuropathy

Authors: P. Renjitha, P. Hari Prakash

Abstract:

Auditory neuropathy spectrum disorder (ANSD) is an auditory disorder with normal cochlear outer hair cell function and disrupted auditory nerve function. It results in unique clinical characteristic with absent auditory brainstem response (ABR), absent acoustic reflex and the presence of otoacoustic emissions (OAE) and cochlear microphonics. The lesion site could be at cochlear inner hair cells, the synapse between the inner hair cells and type I auditory nerve fibers, and/or the auditory nerve itself. But the literatures on synchrony at higher auditory system are sporadic and are less understood. It might be interesting to see if there is a recovery of neural synchrony at higher auditory centers. Also, does the level at which the auditory system recovers with adequate synchrony to the extent of observable evoke response potentials (ERPs) can predict speech perception? In the current study, eight ANSD participants and healthy controls underwent detailed audiological assessment including ABR, auditory middle latency response (AMLR), and auditory late latency response (ALLR). AMLR was recorded for clicks and ALLR was evoked using 500Hz and 2 kHz tone bursts. Analysis revealed that the participant could be categorized into three groups. Group I (2/8) where ALLR was present only for 2kHz tone burst. Group II (4/8), where AMLR was absent and ALLR was seen for both the stimuli. Group III (2/8) consisted individuals with identifiable AMLR and ALLR for all the stimuli. The highest speech identification sore observed in ANSD group was 30% and hence considered having poor speech perception. Overall test result indicates that the site of neural synchrony recovery could be varying across individuals with ANSD. Some individuals show recovery of neural synchrony at the thalamocortical level while others show the same only at the cortical level. Within ALLR itself there could be variation across stimuli again could be related to neural synchrony. Nevertheless, none of these patterns could possible explain the speech perception ability of the individuals. Hence, it could be concluded that neural synchrony as measured by evoked potentials could not be a good clinical predictor speech perception.

Keywords: auditory late latency response, auditory middle latency response, auditory neuropathy spectrum disorder, correlation with speech identification score

Procedia PDF Downloads 117
6932 A Stylistic Analysis of the Short Story ‘The Escape’ by Qaisra Shahraz

Authors: Huma Javed

Abstract:

Stylistics is a broad term that is concerned with both literature and linguistics, due to which the significance of the stylistics increases. This research aims to analyze Qaisra Shahraz's short story ‘The Escape’ from the stylistic analysis viewpoint. The focus of this study is on three aspects grammar category, lexical category, and figure of speech of the short story. The research designs for this article are both explorative and descriptive. The analysis of the data shows that the writer has used more nouns in the story as compared to other lexical items, which suggests that story has a descriptive style rather than narrative.

Keywords: The Escape, stylistics, grammatical category, lexical category, figure of speech

Procedia PDF Downloads 191
6931 Imprecise Vowel Articulation in Down Syndrome: An Acoustic Study

Authors: Anitha Naittee Abraham, N. Sreedevi

Abstract:

Individuals with Down syndrome (DS) have relatively better expressive language compared to other individuals with intellectual disabilities. Reduced speech intelligibility is one of the major concerns of this group of individuals due to their anatomical and physiological differences. The study investigated the vowel articulation of Malayalam speaking children with DS in the age range of 5-10 years. The vowel production of 10 children with DS was compared with typically developing children in the same age range. Vowels were extracted from 3 words with the corner vowels /a/, /i/ and /u/ in the word-initial position, using Praat (version 5.3.23) software. Acoustic analysis was based on vowel space area (VSA), Formant centralization ration (FCR) and F2i/F2u. The findings revealed increased formant values for the control group except for F2a and F2u. Also, the experimental group had higher FCR, lower VSA, and F2i/F2u values suggestive of imprecise vowel articulation due to restricted tongue movements. The results of the independent t-test revealed a significant difference in F1a, F2i, F2u, VSA, FCR and F2i/F2u values between the experimental and control group. These findings support the fact that children with DS have imprecise vowel articulation that interferes with the overall speech intelligibility. Hence it is essential to target the oromotor skills to enhance the speech intelligibility which in turn benefit in the social and vocational domains of these individuals.

Keywords: Down syndrome, FCR, vowel articulation, vowel space

Procedia PDF Downloads 148
6930 Development of a Sequential Multimodal Biometric System for Web-Based Physical Access Control into a Security Safe

Authors: Babatunde Olumide Olawale, Oyebode Olumide Oyediran

Abstract:

The security safe is a place or building where classified document and precious items are kept. To prevent unauthorised persons from gaining access to this safe a lot of technologies had been used. But frequent reports of an unauthorised person gaining access into security safes with the aim of removing document and items from the safes are pointers to the fact that there is still security gap in the recent technologies used as access control for the security safe. In this paper we try to solve this problem by developing a multimodal biometric system for physical access control into a security safe using face and voice recognition. The safe is accessed by the combination of face and speech pattern recognition and also in that sequential order. User authentication is achieved through the use of camera/sensor unit and a microphone unit both attached to the door of the safe. The user face was captured by the camera/sensor while the speech was captured by the use of the microphone unit. The Scale Invariance Feature Transform (SIFT) algorithm was used to train images to form templates for the face recognition system while the Mel-Frequency Cepitral Coefficients (MFCC) algorithm was used to train the speech recognition system to recognise authorise user’s speech. Both algorithms were hosted in two separate web based servers and for automatic analysis of our work; our developed system was simulated in a MATLAB environment. The results obtained shows that the developed system was able to give access to authorise users while declining unauthorised person access to the security safe.

Keywords: access control, multimodal biometrics, pattern recognition, security safe

Procedia PDF Downloads 299
6929 Acoustic Analysis for Comparison and Identification of Normal and Disguised Speech of Individuals

Authors: Surbhi Mathur, J. M. Vyas

Abstract:

Although the rapid development of forensic speaker recognition technology has been conducted, there are still many problems to be solved. The biggest problem arises when the cases involving disguised voice samples come across for the purpose of examination and identification. Such type of voice samples of anonymous callers is frequently encountered in crimes involving kidnapping, blackmailing, hoax extortion and many more, where the speaker makes a deliberate effort to manipulate their natural voice in order to conceal their identity due to the fear of being caught. Voice disguise causes serious damage to the natural vocal parameters of the speakers and thus complicates the process of identification. The sole objective of this doctoral project is to find out the possibility of rendering definite opinions in cases involving disguised speech by experimentally determining the effects of different disguise forms on personal identification and percentage rate of speaker recognition for various voice disguise techniques such as raised pitch, lower pitch, increased nasality, covering the mouth, constricting tract, obstacle in mouth etc by analyzing and comparing the amount of phonetic and acoustic variation in of artificial (disguised) and natural sample of an individual, by auditory as well as spectrographic analysis.

Keywords: forensic, speaker recognition, voice, speech, disguise, identification

Procedia PDF Downloads 338
6928 Human Computer Interaction Using Computer Vision and Speech Processing

Authors: Shreyansh Jain Jeetmal, Shobith P. Chadaga, Shreyas H. Srinivas

Abstract:

Internet of Things (IoT) is seen as the next major step in the ongoing revolution in the Information Age. It is predicted that in the near future billions of embedded devices will be communicating with each other to perform a plethora of tasks with or without human intervention. One of the major ongoing hotbed of research activity in IoT is Human Computer Interaction (HCI). HCI is used to facilitate communication between an intelligent system and a user. An intelligent system typically comprises of a system consisting of various sensors, actuators and embedded controllers which communicate with each other to monitor data collected from the environment. Communication by the user to the system is typically done using voice. One of the major ongoing applications of HCI is in home automation as a personal assistant. The prime objective of our project is to implement a use case of HCI for home automation. Our system is designed to detect and recognize the users and personalize the appliances in the house according to their individual preferences. Our HCI system is also capable of speaking with the user when certain commands are spoken such as searching on the web for information and controlling appliances. Our system can also monitor the environment in the house such as air quality and gas leakages for added safety.

Keywords: human computer interaction, internet of things, computer vision, sensor networks, speech to text, text to speech, android

Procedia PDF Downloads 328
6927 The Importance of Visual Communication in Artificial Intelligence

Authors: Manjitsingh Rajput

Abstract:

Visual communication plays an important role in artificial intelligence (AI) because it enables machines to understand and interpret visual information, similar to how humans do. This abstract explores the importance of visual communication in AI and emphasizes the importance of various applications such as computer vision, object emphasis recognition, image classification and autonomous systems. In going deeper, with deep learning techniques and neural networks that modify visual understanding, In addition to AI programming, the abstract discusses challenges facing visual interfaces for AI, such as data scarcity, domain optimization, and interpretability. Visual communication and other approaches, such as natural language processing and speech recognition, have also been explored. Overall, this abstract highlights the critical role that visual communication plays in advancing AI capabilities and enabling machines to perceive and understand the world around them. The abstract also explores the integration of visual communication with other modalities like natural language processing and speech recognition, emphasizing the critical role of visual communication in AI capabilities. This methodology explores the importance of visual communication in AI development and implementation, highlighting its potential to enhance the effectiveness and accessibility of AI systems. It provides a comprehensive approach to integrating visual elements into AI systems, making them more user-friendly and efficient. In conclusion, Visual communication is crucial in AI systems for object recognition, facial analysis, and augmented reality, but challenges like data quality, interpretability, and ethics must be addressed. Visual communication enhances user experience, decision-making, accessibility, and collaboration. Developers can integrate visual elements for efficient and accessible AI systems.

Keywords: visual communication AI, computer vision, visual aid in communication, essence of visual communication.

Procedia PDF Downloads 50
6926 Leadership Effectiveness Compared among Three Cultures Using Voice Pitches

Authors: Asena Biber, Ates Gul Ergun, Seda Bulut

Abstract:

Based on the literature, there are large numbers of studies investigating the relationship between culture and leadership effectiveness. Although giving effective speeches is vital characteristic for a leader to be perceived as effective, to our knowledge, there is no research study the determinants of perceived effective leader speech. The aim of this study is to find the effects of both culture and voice pitch on perceptions of leader's speech effectiveness. Our hypothesis is that people from high power distance countries will perceive leaders' speech effective when the leader's voice pitch is high, comparing with people from relatively low power distance countries. The participants of the study were 36 undergraduate students (12 Pakistanis, 12 Nigerians, and 12 Turks) who are studying in Turkey. National power distance scores of Nigerians ranked as first, Turks ranked as second and Pakistanis ranked as third. There are two independent variables in this study; three nationality groups that representing three levels of power distance and voice pitch of the leader which is manipulated as high and low levels. Researchers prepared an audio to manipulate high and low conditions of voice pitch. A professional whose native language is English read the predetermined speech in high and low voice pitch conditions. Voice pitch was measured using Hertz (Hz) and Decibel (dB). Each nationality group (Pakistan, Nigeria, and Turkey) were divided into groups of six students who listened to either the low or high pitch conditions in the cubicles of the laboratory. It was expected from participants to listen to the audio and fill in the questionnaire which was measuring the leadership effectiveness on a response scale ranging from 1 to 5. To determine the effects of nationality and voice pitch on perceived effectiveness of leader' voice pitch, 3 (Pakistani, Nigerian, and Turk) x 2 (low voice pitch and high voice pitch) two way between subjects analysis of variances was carried out. The results indicated that there was no significant main effect of voice pitch and interaction effect on perceived effectiveness of the leader’s voice pitch. However, there was a significant main effect of nationality on perceived effectiveness of the leader's voice pitch. Based on the results of Turkey’s HSD post-hoc test, only the perceived effectiveness of the leader's speech difference between Pakistanis and Nigerians was statistically significant. The results show that the hypothesis of this study was not supported. As limitations of the study, it is of importance to mention that the sample size should be bigger. Also, the language of the questionnaire and speech should be in the participant’s native language in further studies.

Keywords: culture, leadership effectiveness, power distance, voice pitch

Procedia PDF Downloads 156
6925 The Feminine Speech and the Ritual of Death in Albania

Authors: Aida Lamaj

Abstract:

Death is an inevitable phenomenon in our life, in the same way, are also the ritual of death accompanied by the dirge and the keening performed by men. Keening is a phenomenon common among all peoples, the instances in which the ritual of death and keening coincide, as a special phenomenon of its, are numerous given the fact that keening is an outcome of an extremely special emotional state. However, even during the ritual of death, every people try to display through words its qualities, a multitude of characteristics preserved and transmitted with fanaticism from one generation to the other. The ritual of death constitutes an important element of our tradition and at the same time a material always interesting to be studied in minute details. In this study, we have tried to limit ourselves to the feminine speech, since keening, in general in Albania has been carried out by women. Differences and similarities among keening on the national scale, from the diachronic and synchronic point of view, can be seen clearly if we compare the Albanian creations in different regions. The similarities and differences within the Albanian culture serve as a typical paradigm to study how the ancient elements of outlook that the Albanians have had on death, history, and the social organization in these regions have been preserved and transmitted and above all, in what way these feelings have been clothed from the linguistic point of view, the typologies of keening and of all of the ritual of death, which clearly shows archaic forms as well as new developments. These data have been gathered not only by conducting various surveys but also by observing closely the linguistic behavior of women in Albania during the ritual of death. The study has encompassed the popular lyric poetry as well as new entries, whereas from the geographic point of view we focus mainly in the Southern regions, although examples from other regions where Albanian speaking people live are also present. The main results of the study show that women use much more than men dialect form, peripheral language elements and descriptive elements during their speech in the ritual of death.

Keywords: feminine speech in Albania, linguistic characteristics of the dirge, ritual of death, the typologies of keening

Procedia PDF Downloads 133
6924 Effect of Classroom Acoustic Factors on Language and Cognition in Bilinguals and Children with Mild to Moderate Hearing Loss

Authors: Douglas MacCutcheon, Florian Pausch, Robert Ljung, Lorna Halliday, Stuart Rosen

Abstract:

Contemporary classrooms are increasingly inclusive of children with mild to moderate disabilities and children from different language backgrounds (bilinguals, multilinguals), but classroom environments and standards have not yet been adapted adequately to meet these challenges brought about by this inclusivity. Additionally, classrooms are becoming noisier as a learner-centered as opposed to teacher-centered teaching paradigm is adopted, which prioritizes group work and peer-to-peer learning. Challenging listening conditions with distracting sound sources and background noise are known to have potentially negative effects on children, particularly those that are prone to struggle with speech perception in noise. Therefore, this research investigates two groups vulnerable to these environmental effects, namely children with a mild to moderate hearing loss (MMHLs) and sequential bilinguals learning in their second language. In the MMHL study, this group was assessed on speech-in-noise perception, and a number of receptive language and cognitive measures (auditory working memory, auditory attention) and correlations were evaluated. Speech reception thresholds were found to be predictive of language and cognitive ability, and the nature of correlations is discussed. In the bilinguals study, sequential bilingual children’s listening comprehension, speech-in-noise perception, listening effort and release from masking was evaluated under a number of different ecologically valid acoustic scenarios in order to pinpoint the extent of the ‘native language benefit’ for Swedish children learning in English, their second language. Scene manipulations included target-to-distractor ratios and introducing spatially separated noise. This research will contribute to the body of findings from which educational institutions can draw when designing or adapting educational environments in inclusive schools.

Keywords: sequential bilinguals, classroom acoustics, mild to moderate hearing loss, speech-in-noise, release from masking

Procedia PDF Downloads 306
6923 Assessment of the Occupancy’s Effect on Speech Intelligibility in Al-Madinah Holy Mosque

Authors: Wasim Orfali, Hesham Tolba

Abstract:

This research investigates the acoustical characteristics of Al-Madinah Holy Mosque. Extensive field measurements were conducted in different locations of Al-Madinah Holy Mosque to characterize its acoustic characteristics. The acoustical characteristics are usually evaluated by the use of objective parameters in unoccupied rooms due to practical considerations. However, under normal conditions, the room occupancy can vary such characteristics due to the effect of the additional sound absorption present in the room or by the change in signal-to-noise ratio. Based on the acoustic measurements carried out in Al-Madinah Holy Mosque with and without occupancy, and the analysis of such measurements, the existence of acoustical deficiencies has been confirmed.

Keywords: Al-Madinah Holy Mosque, mosque acoustics, speech intelligibility, worship sound

Procedia PDF Downloads 144
6922 A Sociolinguistic Study of the Outcomes of Arabic-French Contact in the Algerian Dialect Tlemcen Speech Community as a Case Study

Authors: R. Rahmoun-Mrabet

Abstract:

It is acknowledged that our style of speaking changes according to a wide range of variables such as gender, setting, the age of both the addresser and the addressee, the conversation topic, and the aim of the interaction. These differences in style are noticeable in monolingual and multilingual speech communities. Yet, they are more observable in speech communities where two or more codes coexist. The linguistic situation in Algeria reflects a state of bilingualism because of the coexistence of Arabic and French. Nevertheless, like all Arab countries, it is characterized by diglossia i.e. the concomitance of Modern Standard Arabic (MSA) and Algerian Arabic (AA), the former standing for the ‘high variety’ and the latter for the ‘low variety’. The two varieties are derived from the same source but are used to fulfil distinct functions that is, MSA is used in the domains of religion, literature, education and formal settings. AA, on the other hand, is used in informal settings, in everyday speech. French has strongly affected the Algerian language and culture because of the historical background of Algeria, thus, what can easily be noticed in Algeria is that everyday speech is characterized by code-switching from dialectal Arabic and French or by the use of borrowings. Tamazight is also very present in many regions of Algeria and is the mother tongue of many Algerians. Yet, it is not used in the west of Algeria, where the study has been conducted. The present work, which was directed in the speech community of Tlemcen-Algeria, aims at depicting some of the outcomes of the contact of Arabic with French such as code-switching, borrowing and interference. The question that has been asked is whether Algerians are aware of their use of borrowings or not. Three steps are followed in this research; the first one is to depict the sociolinguistic situation in Algeria and to describe the linguistic characteristics of the dialect of Tlemcen, which are specific to this city. The second one is concerned with data collection. Data have been collected from 57 informants who were given questionnaires and who have then been classified according to their age, gender and level of education. Information has also been collected through observation, and note taking. The third step is devoted to analysis. The results obtained reveal that most Algerians are aware of their use of borrowings. The present work clarifies how words are borrowed from French, and then adapted to Arabic. It also illustrates the way in which singular words inflect into plural. The results expose the main characteristics of borrowing as opposed to code-switching. The study also clarifies how interference occurs at the level of nouns, verbs and adjectives.

Keywords: bilingualism, borrowing, code-switching, interference, language contact

Procedia PDF Downloads 246
6921 The Beauty of Islamic Etiquette: How an Elegant Muslim Woman Represents Her Culture in a Multicultural Society

Authors: Julia A. Ermakova

Abstract:

As a member of a multicultural society, it is imperative that individuals demonstrate the highest level of decorum in order to exemplify the beauty of their culture. Adab, the practice of praiseworthy words and deeds, as well as possessing good manners and pursuing that which is considered good, is a fundamental concept that guards against all types of mistakes. In Islam, etiquette for every situation in life is taught, and it constitutes the way of life for a Muslim. In light of this, the personality of an elegant Muslim woman can be described as one who embodies the following qualities: Firstly, cultural speech and erudition are essential components. Improving one's intellect, learning new things, reading diverse literature, expanding one's vocabulary, working on articulation, and avoiding obscene speech and verbosity are crucial. Additionally, listening more than speaking and being willing to discuss one's culture when asked are commendable qualities. Conversely, it is important to avoid discussing foolish matters with foolish people and to be able to respond appropriately and change the subject if someone attempts to hurt or manipulate. Secondly, the style of speech is also of paramount importance. It is recommended to speak in a measured tone with a quiet voice and deep breathing. Avoiding rushing and shortness of breath is also recommended. Thirdly, awareness of how to greet others is essential. Combining Shariah and small talk etiquette, such as making a gesture of respect by putting one's hand to the chest and smiling slightly when a man offers a handshake, is recommended. Understanding the rules of small talk, taboo topics, and self-presentation is also important. Fourthly, knowing how to give and receive compliments without devaluing them is imperative. Knowledge of the rules of good manners and etiquette, both secular and Shariah, is also essential. Fifthly, avoiding arguments and responding elegantly to rudeness and tactlessness is a sign of an elegant Muslim woman. Treating everyone with respect and avoiding prejudices, taboo topics, inappropriate questions, and bad habits are all aspects of politeness. Sixthly, a neat appearance appropriate to Shariah and the local community, as well as a well-put-together outfit with a touch of elegance and style, are crucial. Posture, graceful movement, and a pleasant gaze are also important. Finally, good spirits and inner calm are key to projecting a harmonious image, which encourages people to listen attentively. Giving thanks to Allah in every situation in life is the key to maintaining good spirits. In conclusion, an elegant Muslim woman in a multicultural society is characterized by her high moral qualities and adherence to Islamic etiquette. These qualities, such as cultural speech and erudition, style of speech, awareness of how to greet, knowledge of good manners and etiquette, avoiding arguments, politeness, a neat appearance, and good spirits, all contribute to projecting an image of elegance and respectability. By exemplifying these qualities, Muslim women can serve as positive ambassadors for their culture and religion in diverse societies.

Keywords: adab, elegance, muslim woman, multicultural societies, good manners, etiquette

Procedia PDF Downloads 40
6920 Psycholinguistic Analysis on Stuttering Treatment through Systemic Functional Grammar in Tom Hooper’s The King’s Speech

Authors: Nurvita Wijayanti

Abstract:

The movie titled The King’s Speech is based on a true story telling an English king suffers from stuttering and how he gets the treatment from the therapist, so that he can reduce the high frequency on stuttering. The treatment uses the unique approach implying the linguistic principles. This study shows how the language works significantly in order to treat the stuttering sufferer using psychological approach. Therefore, the linguistic study is done to analyze the treatment activity. Halliday’s Systemic Functional Grammar is used as the main approach in this study along with qualitative descriptive method. The study finds that the therapist though using the orthodox approach applies the psycholinguistic method to overcome the king’s stuttering.

Keywords: psycholinguistics, stuttering, systemic functional grammar, treatment

Procedia PDF Downloads 220
6919 The Role of Media Relations in the Brand Image: Case Study in Three Brands of the Automobile Industry

Authors: Rosa Sobreira, Paula Arriscado

Abstract:

Marketers are aware that media relations is an important touch point, which is also cheaper, to bring their products and their brands to the consumer. They recognize the role of journalists as moderators and transformers of public opinion, and they realize their influence on brand image. And also, they know that readers, listeners, viewers and internet users "believe" more what they read, hear and see in the news than in an advertisement. The study is focused on the automotive industry and analyses the news published about three brands that share industrial facilities and components. We wanted to understand the role of the information created by the brand`s media team in the journalists’ work, and the impact on management, activation and differentiation of brands and their products` attributes and benefits. Based on a qualitative methodology, the analysis focused on press news, making comparison between media coverage and their “narratives” about the three cars from different brands. The results point to the fact that journalists easily integrate speech from the marks on their products. In the case of this study, we found that apart from the description of the many similarities between the three cars, the average speech also "struggled" for revealing the attributes that differentiate them. This interpretation of the results helps us to understand the "marriage" between branding and media. We believe also this paper let us to understand how journalists, through news, join the speech of the brands.

Keywords: brand management, media relations, differentiation, positioning

Procedia PDF Downloads 198
6918 South African Students' Statistical Literacy in the Conceptual Understanding about Measures of Central Tendency after Completing Their High School Studies

Authors: Lukanda Kalobo

Abstract:

In South Africa, the High School Mathematics Curriculum provides teachers with specific aims and skills to be developed which involves the understanding about the measures of central tendency. The exploration begins with the definitions of statistical literacy, measurement of central tendency and a discussion on why statistical literacy is essential today. It furthermore discusses the statistical literacy basics involved in understanding the concepts of measures of central tendency. The statistical literacy test on the measures of central tendency, was used to collect data which was administered to 78 first year students direct from high schools. The results indicated that students seemed to have forgotten about the statistical literacy in understanding the concepts of measure of central tendency after completing their high school study. The authors present inferences regarding the alignment between statistical literacy and the understanding of the concepts about the measures of central tendency, leading to the conclusion that there is a need to provide in-service and pre-service training.

Keywords: conceptual understanding, mean, median, mode, statistical literacy

Procedia PDF Downloads 268