Search results for: L2 speech corpus
937 Compensatory Articulation of Pressure Consonants in Telugu Cleft Palate Speech: A Spectrographic Analysis
Authors: Indira Kothalanka
Abstract:
For individuals born with a cleft palate (CP), there is no separation between the nasal cavity and the oral cavity, due to which they cannot build up enough air pressure in the mouth for speech. Therefore, it is common for them to have speech problems. Common cleft type speech errors include abnormal articulation (compensatory or obligatory) and abnormal resonance (hyper, hypo and mixed nasality). These are generally resolved after palate repair. However, in some individuals, articulation problems do persist even after the palate repair. Such individuals develop variant articulations in an attempt to compensate for the inability to produce the target phonemes. A spectrographic analysis is used to investigate the compensatory articulatory behaviours of pressure consonants in the speech of 10 Telugu speaking individuals aged between 7-17 years with a history of cleft palate. Telugu is a Dravidian language which is spoken in Andhra Pradesh and Telangana states in India. It is a language with the third largest number of native speakers in India and the most spoken Dravidian language. The speech of the informants is analysed using single word list, sentences, passage and conversation. Spectrographic analysis is carried out using PRAAT, speech analysis software. The place and manner of articulation of consonant sounds is studied through spectrograms with the help of various acoustic cues. The types of compensatory articulation identified are glottal stops, palatal stops, uvular, velar stops and nasal fricatives which are non-native in Telugu.Keywords: cleft palate, compensatory articulation, spectrographic analysis, PRAAT
Procedia PDF Downloads 445936 Virtual Reality Based 3D Video Games and Speech-Lip Synchronization Superseding Algebraic Code Excited Linear Prediction
Authors: P. S. Jagadeesh Kumar, S. Meenakshi Sundaram, Wenli Hu, Yang Yung
Abstract:
In 3D video games, the dominance of production is unceasingly growing with a protruding level of affordability in terms of budget. Afterward, the automation of speech-lip synchronization technique is customarily onerous and has advanced a critical research subject in virtual reality based 3D video games. This paper presents one of these automatic tools, precisely riveted on the synchronization of the speech and the lip movement of the game characters. A robust and precise speech recognition segment that systematized with Algebraic Code Excited Linear Prediction method is developed which unconventionally delivers lip sync results. The Algebraic Code Excited Linear Prediction algorithm is constructed on that used in code-excited linear prediction, but Algebraic Code Excited Linear Prediction codebooks have an explicit algebraic structure levied upon them. This affords a quicker substitute to the software enactments of lip sync algorithms and thus advances the superiority of service factors abridged production cost.Keywords: algebraic code excited linear prediction, speech-lip synchronization, video games, virtual reality
Procedia PDF Downloads 474935 VIAN-DH: Computational Multimodal Conversation Analysis Software and Infrastructure
Authors: Teodora Vukovic, Christoph Hottiger, Noah Bubenhofer
Abstract:
The development of VIAN-DH aims at bridging two linguistic approaches: conversation analysis/interactional linguistics (IL), so far a dominantly qualitative field, and computational/corpus linguistics and its quantitative and automated methods. Contemporary IL investigates the systematic organization of conversations and interactions composed of speech, gaze, gestures, and body positioning, among others. These highly integrated multimodal behaviour is analysed based on video data aimed at uncovering so called “multimodal gestalts”, patterns of linguistic and embodied conduct that reoccur in specific sequential positions employed for specific purposes. Multimodal analyses (and other disciplines using videos) are so far dependent on time and resource intensive processes of manual transcription of each component from video materials. Automating these tasks requires advanced programming skills, which is often not in the scope of IL. Moreover, the use of different tools makes the integration and analysis of different formats challenging. Consequently, IL research often deals with relatively small samples of annotated data which are suitable for qualitative analysis but not enough for making generalized empirical claims derived quantitatively. VIAN-DH aims to create a workspace where many annotation layers required for the multimodal analysis of videos can be created, processed, and correlated in one platform. VIAN-DH will provide a graphical interface that operates state-of-the-art tools for automating parts of the data processing. The integration of tools that already exist in computational linguistics and computer vision, facilitates data processing for researchers lacking programming skills, speeds up the overall research process, and enables the processing of large amounts of data. The main features to be introduced are automatic speech recognition for the transcription of language, automatic image recognition for extraction of gestures and other visual cues, as well as grammatical annotation for adding morphological and syntactic information to the verbal content. In the ongoing instance of VIAN-DH, we focus on gesture extraction (pointing gestures, in particular), making use of existing models created for sign language and adapting them for this specific purpose. In order to view and search the data, VIAN-DH will provide a unified format and enable the import of the main existing formats of annotated video data and the export to other formats used in the field, while integrating different data source formats in a way that they can be combined in research. VIAN-DH will adapt querying methods from corpus linguistics to enable parallel search of many annotation levels, combining token-level and chronological search for various types of data. VIAN-DH strives to bring crucial and potentially revolutionary innovation to the field of IL, (that can also extend to other fields using video materials). It will allow the processing of large amounts of data automatically and, the implementation of quantitative analyses, combining it with the qualitative approach. It will facilitate the investigation of correlations between linguistic patterns (lexical or grammatical) with conversational aspects (turn-taking or gestures). Users will be able to automatically transcribe and annotate visual, spoken and grammatical information from videos, and to correlate those different levels and perform queries and analyses.Keywords: multimodal analysis, corpus linguistics, computational linguistics, image recognition, speech recognition
Procedia PDF Downloads 110934 A Cross-Dialect Statistical Analysis of Final Declarative Intonation in Tuvinian
Authors: D. Beziakina, E. Bulgakova
Abstract:
This study continues the research on Tuvinian intonation and presents a general cross-dialect analysis of intonation of Tuvinian declarative utterances, specifically the character of the tone movement in order to test the hypothesis about the prevalence of level tone in some Tuvinian dialects. The results of the analysis of basic pitch characteristics of Tuvinian speech (in general and in comparison with two other Turkic languages - Uzbek and Azerbaijani) are also given in this paper. The goal of our work was to obtain the ranges of pitch parameter values typical for Tuvinian speech. Such language-specific values can be used in speaker identification systems in order to get more accurate results of ethnic speech analysis. We also present the results of a cross-dialect analysis of declarative intonation in the poorly studied Tuvinian language.Keywords: speech analysis, statistical analysis, speaker recognition, identification of person
Procedia PDF Downloads 471933 A Profile of the Patients at the Hearing and Speech Clinic at the University of Jordan: A Retrospective Study
Authors: Maisa Haj-Tas, Jehad Alaraifi
Abstract:
The significance of the study: This retrospective study examined the speech and language profiles of patients who received clinical services at the University of Jordan Hearing and Speech Clinic (UJ-HSC) from 2009 to 2014. The UJ-HSC clinic is located in the capital Amman and was established in the late 1990s. It is the first hearing and speech clinic in Jordan and one of first speech and hearing clinics in the Middle East. This clinic provides services to an annual average of 2000 patients who are diagnosed with different communication disorders. Examining the speech and language profiles of patients in this clinic could provide an insight about the most common disorders seen in patients who attend similar clinics in Jordan. It could also provide information about community awareness of the role of speech therapists in the management of speech and language disorders. Methodology: The researchers examined the clinical records of 1140 patients (797 males and 343 females) who received clinical services at the UJ-HSC between the years 2009 and 2014 for the purpose of data analysis for this study. The main variables examined in the study were disorder type and gender. Participants were divided into four age groups: children, adolescents, adults, and older adults. The examined disorders were classified as either speech disorders, language disorders, or dysphagia (i.e., swallowing problems). The disorders were further classified as childhood language impairments, articulation disorders, stuttering, cluttering, voice disorders, aphasia, and dysphagia. Results: The results indicated that the prevalence for language disorders was the highest (50.7%) followed by speech disorders (48.3%), and dysphagia (0.9%). The majority of patients who were seen at the JU-HSC were diagnosed with childhood language impairments (47.3%) followed consecutively by articulation disorders (21.1%), stuttering (16.3%), voice disorders (12.1%), aphasia (2.2%), dysphagia (0.9%), and cluttering (0.2%). As for gender, the majority of patients seen at the clinic were males in all disorders except for voice disorders and cluttering. Discussion: The results of the present study indicate that the majority of examined patients were diagnosed with childhood language impairments. Based on this result, the researchers suggest that there seems to be a high prevalence of childhood language impairments among children in Jordan compared to other types of speech and language disorders. The researchers also suggest that there is a need for further examination of the actual prevalence data on speech and language disorders in Jordan. The fact that many of the children seen at the UJ-HSC were brought to the clinic either as a result of parental concern or teacher referral indicates that there seems to an increased awareness among parents and teachers about the services speech pathologists can provide about assessment and treatment of childhood speech and language disorders. The small percentage of other disorders (i.e., stuttering, cluttering, dysphasia, aphasia, and voice disorders) seen at the UJ-HSC may indicate a little awareness by the local community about the role of speech pathologists in the assessment and treatment of these disorders.Keywords: clinic, disorders, language, profile, speech
Procedia PDF Downloads 316932 An Interdisciplinary Approach to Investigating Style: A Case Study of a Chinese Translation of Gilbert’s (2006) Eat Pray Love
Authors: Elaine Y. L. Ng
Abstract:
Elizabeth Gilbert’s (2006) biography Eat, Pray, Love describes her travels to Italy, India, and Indonesia after a painful divorce. The author’s experiences with love, loss, search for happiness, and meaning have resonated with a huge readership. As regards the translation of Gilbert’s (2006) Eat, Pray, Love into Chinese, it was first translated by a Taiwanese translator He Pei-Hua and published in Taiwan in 2007 by Make Boluo Wenhua Chubanshe with the fairly catching title “Enjoy! Traveling Alone.” The same translation was translocated to China, republished in simplified Chinese characters by Shanxi Shifan Daxue Chubanshe in 2008 and renamed in China, entitled “To Be a Girl for the Whole Life.” Later on, the same translation in simplified Chinese characters was reprinted by Hunan Wenyi Chubanshe in 2013. This study employs Munday’s (2002) systemic model for descriptive translation studies to investigate the translation of Gilbert’s (2006) Eat, Pray, Love into Chinese by the Taiwanese translator Hu Pei-Hua. It employs an interdisciplinary approach, combining systemic functional linguistics and corpus stylistics with sociohistorical research within a descriptive framework to study the translator’s discursive presence in the text. The research consists of three phases. The first phase is to locate the target text within its socio-cultural context. The target-text context concerning the para-texts, readers’ responses, and the publishers’ orientation will be explored. The second phase is to compare the source text and the target text for the categorization of translation shifts by using the methodological tools of systemic functional linguistics and corpus stylistics. The investigation concerns the rendering of mental clauses and speech and thought presentation. The final phase is an explanation of the causes of translation shifts. The linguistic findings are related to the extra-textual information collected in an effort to ascertain the motivations behind the translator’s choices. There exist sets of possible factors that may have contributed to shaping the textual features of the given translation within a specific socio-cultural context. The study finds that the translator generally reproduces the mental clauses and speech and thought presentation closely according to the original. Nevertheless, the language of the translation has been widely criticized to be unidiomatic and stiff, losing the elegance of the original. In addition, the several Chinese translations of the given text produced by one Taiwanese and two Chinese publishers are basically the same. They are repackaged slightly differently, mainly with the change of the book cover and its captions for each version. By relating the textual findings to the extra-textual data of the study, it is argued that the popularity of the Chinese translation of Gilbert’s (2006) Eat, Pray, Love may not be attributed to the quality of the translation. Instead, it may have to do with the way the work is promoted strategically by the social media manipulated by the four e-bookstores promoting and selling the book online in China.Keywords: chinese translation of eat pray love, corpus stylistics, motivations for translation shifts, systemic approach to translation studies
Procedia PDF Downloads 176931 Environmentally Adaptive Acoustic Echo Suppression for Barge-in Speech Recognition
Authors: Jong Han Joo, Jung Hoon Lee, Young Sun Kim, Jae Young Kang, Seung Ho Choi
Abstract:
In this study, we propose a novel technique for acoustic echo suppression (AES) during speech recognition under barge-in conditions. Conventional AES methods based on spectral subtraction apply fixed weights to the estimated echo path transfer function (EPTF) at the current signal segment and to the EPTF estimated until the previous time interval. We propose a new approach that adaptively updates weight parameters in response to abrupt changes in the acoustic environment due to background noises or double-talk. Furthermore, we devised a voice activity detector and an initial time-delay estimator for barge-in speech recognition in communication networks. The initial time delay is estimated using log-spectral distance measure, as well as cross-correlation coefficients. The experimental results show that the developed techniques can be successfully applied in barge-in speech recognition systems.Keywords: acoustic echo suppression, barge-in, speech recognition, echo path transfer function, initial delay estimator, voice activity detector
Procedia PDF Downloads 373930 Role of Speech Articulation in English Language Learning
Authors: Khadija Rafi, Neha Jamil, Laiba Khalid, Meerub Nawaz, Mahwish Farooq
Abstract:
Speech articulation is a complex process to produce intelligible sounds with the help of precise movements of various structures within the vocal tract. All these structures in the vocal tract are named as articulators, which comprise lips, teeth, tongue, and palate. These articulators work together to produce a range of distinct phonemes, which happen to be the basis of language. It starts with the airstream from the lungs passing through the trachea and into oral and nasal cavities. When the air passes through the mouth, the tongue and the muscles around it form such coordination it creates certain sounds. It can be seen when the tongue is placed in different positions- sometimes near the alveolar ridge, soft palate, roof of the mouth or the back of the teeth which end up creating unique qualities of each phoneme. We can articulate vowels with open vocal tracts, but the height and position of the tongue is different every time depending upon each vowel, while consonants can be pronounced when we create obstructions in the airflow. For instance, the alphabet ‘b’ is a plosive and can be produced only by briefly closing the lips. Articulation disorders can not only affect communication but can also be a hurdle in speech production. To improve articulation skills for such individuals, doctors often recommend speech therapy, which involves various kinds of exercises like jaw exercises and tongue twisters. However, this disorder is more common in children who are going through developmental articulation issues right after birth, but in adults, it can be caused by injury, neurological conditions, or other speech-related disorders. In short, speech articulation is an essential aspect of productive communication, which also includes coordination of the specific articulators to produce different intelligible sounds, which are a vital part of spoken language.Keywords: linguistics, speech articulation, speech therapy, language learning
Procedia PDF Downloads 63929 Hate Speech in Selected Nigerian Newspapers
Authors: Laurel Chikwado Madumere, Kevin O. Ugorji
Abstract:
A speech is said to be full of hate when it appropriates disparaging and vituperative locutions and/or appellations, which are riddled with prejudices and misconceptions about an antagonizing party on the grounds of gender, race, political orientation, religious affiliations, tribe, etc. Due largely to the dichotomies and polarities that exist in Nigeria across political ideological spectrum, tribal affiliations, and gender contradistinctions, there are possibilities for the existence of socioeconomic, religious and political conditions that would induce, provoke and catalyze hate speeches in Nigeria’s mainstream media. Therefore the aim of this paper is to investigate, using select daily newspapers in Nigeria, the extent and complexity of those likely hate speeches that emanate from the pluralism in Nigeria and to set in to relief, the discrepancies and contrariety in the interpretation of those hate words. To achieve the above, the paper shall be qualitative in orientation as it shall be using the Speech Act Theory of J. L. Austin and J. R. Searle to interpret and evaluate the hate speeches in the select Nigerian daily newspapers. Also this paper shall help to elucidate the conditions that generate hate, and inform the government and NGOs how best to approach those conditions and put an end to the possible violence and extremism that emanate from extreme cases of hate.Keywords: extremism, gender, hate speech, pluralism, prejudice, speech act theory
Procedia PDF Downloads 148928 The Usage of Negative Emotive Words in Twitter
Authors: Martina Katalin Szabó, István Üveges
Abstract:
In this paper, the usage of negative emotive words is examined on the basis of a large Hungarian twitter-database via NLP methods. The data is analysed from a gender point of view, as well as changes in language usage over time. The term negative emotive word refers to those words that, on their own, without context, have semantic content that can be associated with negative emotion, but in particular cases, they may function as intensifiers (e.g. rohadt jó ’damn good’) or a sentiment expression with positive polarity despite their negative prior polarity (e.g. brutális, ahogy ez a férfi rajzol ’it’s awesome (lit. brutal) how this guy draws’. Based on the findings of several authors, the same phenomenon can be found in other languages, so it is probably a language-independent feature. For the recent analysis, 67783 tweets were collected: 37818 tweets (19580 tweets written by females and 18238 tweets written by males) in 2016 and 48344 (18379 tweets written by females and 29965 tweets written by males) in 2021. The goal of the research was to make up two datasets comparable from the viewpoint of semantic changes, as well as from gender specificities. An exhaustive lexicon of Hungarian negative emotive intensifiers was also compiled (containing 214 words). After basic preprocessing steps, tweets were processed by ‘magyarlanc’, a toolkit is written in JAVA for the linguistic processing of Hungarian texts. Then, the frequency and collocation features of all these words in our corpus were automatically analyzed (via the analysis of parts-of-speech and sentiment values of the co-occurring words). Finally, the results of all four subcorpora were compared. Here some of the main outcomes of our analyses are provided: There are almost four times fewer cases in the male corpus compared to the female corpus when the negative emotive intensifier modified a negative polarity word in the tweet (e.g., damn bad). At the same time, male authors used these intensifiers more frequently, modifying a positive polarity or a neutral word (e.g., damn good and damn big). Results also pointed out that, in contrast to female authors, male authors used these words much more frequently as a positive polarity word as well (e.g., brutális, ahogy ez a férfi rajzol ’it’s awesome (lit. brutal) how this guy draws’). We also observed that male authors use significantly fewer types of emotive intensifiers than female authors, and the frequency proportion of the words is more balanced in the female corpus. As for changes in language usage over time, some notable differences in the frequency and collocation features of the words examined were identified: some of the words collocate with more positive words in the 2nd subcorpora than in the 1st, which points to the semantic change of these words over time.Keywords: gender differences, negative emotive words, semantic changes over time, twitter
Procedia PDF Downloads 206927 Grammatical and Lexical Cohesion in the Japan’s Prime Minister Shinzo Abe’s Speech Text ‘Nihon wa Modottekimashita’
Authors: Nadya Inda Syartanti
Abstract:
This research aims to identify, classify, and analyze descriptively the aspects of grammatical and lexical cohesion in the speech text of Japan’s Prime Minister Shinzo Abe entitled Nihon wa Modotte kimashita delivered in Washington DC, the United States on February 23, 2013, as a research data source. The method used is qualitative research, which uses descriptions through words that are applied by analyzing aspects of grammatical and lexical cohesion proposed by Halliday and Hasan (1976). The aspects of grammatical cohesion consist of references (personal, demonstrative, interrogative pronouns), substitution, ellipsis, and conjunction. In contrast, lexical cohesion consists of reiteration (repetition, synonym, antonym, hyponym, meronym) and collocation. Data classification is based on the 6 aspects of the cohesion. Through some aspects of cohesion, this research tries to find out the frequency of using grammatical and lexical cohesion in Shinzo Abe's speech text entitled Nihon wa Modotte kimashita. The results of this research are expected to help overcome the difficulty of understanding speech texts in Japanese. Therefore, this research can be a reference for learners, researchers, and anyone who is interested in the field of discourse analysis.Keywords: cohesion, grammatical cohesion, lexical cohesion, speech text, Shinzo Abe
Procedia PDF Downloads 162926 Designing a Corpus Database to Enhance the Learning of Old English Language
Authors: Raquel Mateo Mendaza, Carmen Novo Urraca
Abstract:
The current paper presents the elaboration of a corpus database that aligns two different corpora in order to simplify the search of information both for researchers and students of Old English. This database comprises the information contained in two main reference corpora, namely the Dictionary of Old English Corpus (DOEC), compiled at the University of Toronto, and the York-Toronto-Helsinki Parsed Corpus of Old English (YCOE). The first one provides information on all surviving texts written in the Old English language. The latter offers the syntactical and morphological annotation of several texts included in the DOEC. Although both corpora are closely related, as the YCOE includes the DOE source text identifier, the main problem detected is that there is not an alignment of texts that allows for the search of whole fragments to be further analysed in terms of morphology and syntax. The database proposed in this paper gathers all this information and presents it in a simple, more accessible, visual, and educational way. The alignment of fragments has been done in an automatized way. However, some problems have emerged during the creating process particularly related to the lack of correspondence in the division of fragments. For this reason, it has been necessary to revise the whole entries manually to obtain a truthful high-quality product and to carefully indicate the gaps encountered in these corpora. All in all, this database contains more than 60,000 entries corresponding with the DOE fragments annotated by the YCOE. The main strength of the resulting product is its research and teaching implications in the study of Old English. The use of this database will help researchers and students in the study of different aspects of the language, such as inflectional morphology, syntactic behaviour of given words, or translation studies, among others. By means of the search of words or fragments, the annotated information on morphology and syntax will be automatically displayed, automatizing, and speeding up the search of data.Keywords: alignment, corpus database, morphosyntactic analysis, Old English
Procedia PDF Downloads 134925 Speech and Swallowing Function after Tonsillo-Lingual Sulcus Resection with PMMC Flap Reconstruction: A Case Study
Authors: K. Rhea Devaiah, B. S. Premalatha
Abstract:
Background: Tonsillar Lingual sulcus is the area between the tonsils and the base of the tongue. The surgical resection of the lesions in the head and neck results in changes in speech and swallowing functions. The severity of the speech and swallowing problem depends upon the site and extent of the lesion, types and extent of surgery and also the flexibility of the remaining structures. Need of the study: This paper focuses on the importance of speech and swallowing rehabilitation in an individual with the lesion in the Tonsillar Lingual Sulcus and post-operative functions. Aim: Evaluating the speech and swallow functions post-intensive speech and swallowing rehabilitation. The objectives are to evaluate the speech intelligibility and swallowing functions after intensive therapy and assess the quality of life. Method: The present study describes a report of an individual aged 47years male, with the diagnosis of basaloid squamous cell carcinoma, left tonsillar lingual sulcus (pT2n2M0) and underwent wide local excision with left radical neck dissection with PMMC flap reconstruction. Post-surgery the patient came with a complaint of reduced speech intelligibility, and difficulty in opening the mouth and swallowing. Detailed evaluation of the speech and swallowing functions were carried out such as OPME, articulation test, speech intelligibility, different phases of swallowing and trismus evaluation. Self-reported questionnaires such as SHI-E(Speech handicap Index- Indian English), DHI (Dysphagia handicap Index) and SESEQ -K (Self Evaluation of Swallowing Efficiency in Kannada) were also administered to know what the patient feels about his problem. Based on the evaluation, the patient was diagnosed with pharyngeal phase dysphagia associated with trismus and reduced speech intelligibility. Intensive speech and swallowing therapy was advised weekly twice for the duration of 1 hour. Results: Totally the patient attended 10 intensive speech and swallowing therapy sessions. Results indicated misarticulation of speech sounds such as lingua-palatal sounds. Mouth opening was restricted to one finger width with difficulty chewing, masticating, and swallowing the bolus. Intervention strategies included Oro motor exercise, Indirect swallowing therapy, usage of a trismus device to facilitate mouth opening, and change in the food consistency to help to swallow. A practice session was held with articulation drills to improve the production of speech sounds and also improve speech intelligibility. Significant changes in articulatory production and speech intelligibility and swallowing abilities were observed. The self-rated quality of life measures such as DHI, SHI and SESE Q-K revealed no speech handicap and near-normal swallowing ability indicating the improved QOL after the intensive speech and swallowing therapy. Conclusion: Speech and swallowing therapy post carcinoma in the tonsillar lingual sulcus is crucial as the tongue plays an important role in both speech and swallowing. The role of Speech-language and swallowing therapists in oral cancer should be highlighted in treating these patients and improving the overall quality of life. With intensive speech-language and swallowing therapy post-surgery for oral cancer, there can be a significant change in the speech outcome and swallowing functions depending on the site and extent of lesions which will thereby improve the individual’s QOL.Keywords: oral cancer, speech and swallowing therapy, speech intelligibility, trismus, quality of life
Procedia PDF Downloads 112924 The Communicative Nature of Linguistic Interference in Learning and Teaching of Slavic Languages
Authors: Kseniia Fedorova
Abstract:
The article is devoted to interlinguistic homonymy and enantiosemy analysis. These phenomena belong to the process of linguistic interference, which leads to violation of the communicative utterances integrity and causes misunderstanding between foreign interlocutors - native speakers of different Slavic languages. More attention is paid to investigation of non-typical speech situations, which occurred spontaneously or created by somebody intentionally being based on described phenomenon mechanism. The classification of typical students' mistakes connected with the paradox of interference is being represented in the article. The survey contributes to speech act theory, contemporary linguodidactics, translation science and comparative lexicology of Slavonic languages.Keywords: adherent enantiosemy, interference, interslavonic homonymy, speech act
Procedia PDF Downloads 246923 Speech Emotion Recognition with Bi-GRU and Self-Attention based Feature Representation
Authors: Bubai Maji, Monorama Swain
Abstract:
Speech is considered an essential and most natural medium for the interaction between machines and humans. However, extracting effective features for speech emotion recognition (SER) is remains challenging. The present studies show that the temporal information captured but high-level temporal-feature learning is yet to be investigated. In this paper, we present an efficient novel method using the Self-attention (SA) mechanism in a combination of Convolutional Neural Network (CNN) and Bi-directional Gated Recurrent Unit (Bi-GRU) network to learn high-level temporal-feature. In order to further enhance the representation of the high-level temporal-feature, we integrate a Bi-GRU output with learnable weights features by SA, and improve the performance. We evaluate our proposed method on our created SITB-OSED and IEMOCAP databases. We report that the experimental results of our proposed method achieve state-of-the-art performance on both databases.Keywords: Bi-GRU, 1D-CNNs, self-attention, speech emotion recognition
Procedia PDF Downloads 114922 A Linguistic Product of K-Pop: A Corpus-Based Study on the Korean-Originated Chinese Neologism Simida
Authors: Hui Shi
Abstract:
This article examines the online popularity of Chinese neologism simida, which is a loanword derived from Korean declarative sentence-final suffix seumnida. Facilitated by corpus data obtained from Weibo, the Chinese counterpart of Twitter, this study analyzes the morphological and syntactical processes behind simida’s coinage, as well as the causes of its prevalence on Chinese social media. The findings show that simida is used by Weibo bloggers in two manners: (1) as an alternative word of 'Korea' and 'Korean'; (2) as a redundant sentence-final particle which adds a Korean-like speech style to a statement. Additionally, Weibo user profile analysis further reveals demographical distribution patterns concerning this neologism and highlights young Weibo users in the third-tier cities as the leading adopters of simida. These results are accounted for under the theoretical framework of social indexicality, especially how variations generate style in the indexical field. This article argues that the creation of such an ethnically-targeted neologism is a linguistic demonstration of Chinese netizen’s two-sided attitudes toward the previously heated Korean-wave. The exotic suffix seumnida is borrowed to Chinese as simida due to its high-frequency in Korean cultural exports. Therefore, it gradually becomes a replacement of Korea-related lexical items due to markedness, regardless of semantic prosody. Its innovative implantation to Chinese syntax, on the other hand, reflects Chinese netizens’ active manipulation of language for their online identity building. This study has implications for research on the linguistic construction of identity and style and lays the groundwork for linguistic creativity in the Chinese new media.Keywords: Chinese neologism, loanword, humor, new media
Procedia PDF Downloads 175921 Exploring Reading into Writing: A Corpus-Based Analysis of Postgraduate Students’ Literature Review Essays
Authors: Tanzeela Anbreen, Ammara Maqsood
Abstract:
Reading into writing is one of university students' most required academic skills. The current study explored postgraduate university students’ writing quality using a corpus-based approach. Twelve postgraduate students’ literature review essays were chosen for the corpus-based analysis. These essays were chosen because students had to incorporate multiple reading sources in these essays, which was a new writing exercise for them. The students were provided feedback at least two times which comprised of the written comments by the tutor highlighting the areas of improvement and also by using the ‘track changes’ function. This exercise was repeated two times, and students submitted two drafts. This investigation included only the finally submitted work of the students. A corpus-based approach was adopted to analyse the essays because it promotes autonomous discovery and personalised learning. The aim of this analysis was to understand the existing level of students’ writing before the start of their postgraduate thesis. Text Inspector was used to analyse the quality of essays. With the help of the Text Inspector tool, the vocabulary used in the essays was compared to the English Vocabulary Profile (EVP), which describes what learners know and can do at each Common European Framework of Reference (CEFR) level. Writing quality was also measured for the Flesch reading ease score, which is a standard to describe the ease of understanding the writing content. The results reflected that students found writing essays using multiple sources challenging. In most essays, the vocabulary level achieved was between B1-B2 of the CEFL level. The study recommends that students need extensive training in developing academic writing skills, particularly in writing the literature review type assignment, which requires multiple sources citations.Keywords: literature review essays, postgraduate students, corpus-based analysis, vocabulary proficiency
Procedia PDF Downloads 73920 Investigating the Online Effect of Language on Gesture in Advanced Bilinguals of Two Structurally Different Languages in Comparison to L1 Native Speakers of L2 and Explores Whether Bilinguals Will Follow Target L2 Patterns in Speech and Co-speech
Authors: Armita Ghobadi, Samantha Emerson, Seyda Ozcaliskan
Abstract:
Being a bilingual involves mastery of both speech and gesture patterns in a second language (L2). We know from earlier work in first language (L1) production contexts that speech and co-speech gesture form a tightly integrated system: co-speech gesture mirrors the patterns observed in speech, suggesting an online effect of language on nonverbal representation of events in gesture during the act of speaking (i.e., “thinking for speaking”). Relatively less is known about the online effect of language on gesture in bilinguals speaking structurally different languages. The few existing studies—mostly with small sample sizes—suggests inconclusive findings: some show greater achievement of L2 patterns in gesture with more advanced L2 speech production, while others show preferences for L1 gesture patterns even in advanced bilinguals. In this study, we focus on advanced bilingual speakers of two structurally different languages (Spanish L1 with English L2) in comparison to L1 English speakers. We ask whether bilingual speakers will follow target L2 patterns not only in speech but also in gesture, or alternatively, follow L2 patterns in speech but resort to L1 patterns in gesture. We examined this question by studying speech and gestures produced by 23 advanced adult Spanish (L1)-English (L2) bilinguals (Mage=22; SD=7) and 23 monolingual English speakers (Mage=20; SD=2). Participants were shown 16 animated motion event scenes that included distinct manner and path components (e.g., "run over the bridge"). We recorded and transcribed all participant responses for speech and segmented it into sentence units that included at least one motion verb and its associated arguments. We also coded all gestures that accompanied each sentence unit. We focused on motion event descriptions as it shows strong crosslinguistic differences in the packaging of motion elements in speech and co-speech gesture in first language production contexts. English speakers synthesize manner and path into a single clause or gesture (he runs over the bridge; running fingers forward), while Spanish speakers express each component separately (manner-only: el corre=he is running; circle arms next to body conveying running; path-only: el cruza el puente=he crosses the bridge; trace finger forward conveying trajectory). We tallied all responses by group and packaging type, separately for speech and co-speech gesture. Our preliminary results (n=4/group) showed that productions in English L1 and Spanish L1 differed, with greater preference for conflated packaging in L1 English and separated packaging in L1 Spanish—a pattern that was also largely evident in co-speech gesture. Bilinguals’ production in L2 English, however, followed the patterns of the target language in speech—with greater preference for conflated packaging—but not in gesture. Bilinguals used separated and conflated strategies in gesture in roughly similar rates in their L2 English, showing an effect of both L1 and L2 on co-speech gesture. Our results suggest that online production of L2 language has more limited effects on L2 gestures and that mastery of native-like patterns in L2 gesture might take longer than native-like L2 speech patterns.Keywords: bilingualism, cross-linguistic variation, gesture, second language acquisition, thinking for speaking hypothesis
Procedia PDF Downloads 76919 Cognitive Semantics Study of Conceptual and Metonymical Expressions in Johnson's Speeches about COVID-19
Authors: Hussain Hameed Mayuuf
Abstract:
The study is an attempt to investigate the conceptual metonymies is used in political discourse about COVID-19. Thus, this study tries to analyze and investigate how the conceptual metonymies in Johnson's speech about coronavirus are constructed. This study aims at: Identifying how are metonymies relevant to understand the messages in Boris Johnson speeches and to find out how can conceptual blending theory help people to understand the messages in the political speech about COVID-19. Lastly, it tries to Point out the kinds of integration networks are common in political speech. The study is based on the hypotheses that conceptual blending theory is a powerful tool for investigating the intended messages in Johnson's speech and there are different processes of blending networks and conceptual mapping that enable the listeners to identify the messages in political speech. This study presents a qualitative and quantitative analysis of four speeches about COVID-19; they are said by Boris Johnson. The selected data have been tackled from the cognitive-semantic perspective by adopting Conceptual Blending Theory as a model for the analysis. It concludes that CBT is applicable to the analysis of metonymies in political discourse. Its mechanisms enable listeners to analyze and understand these speeches. Also the listener can identify and understand the hidden messages in Biden and Johnson's discourse about COVID-19 by using different conceptual networks. Finally, it is concluded that the double scope networks are the most common types of blending of metonymies in the political speech.Keywords: cognitive, semantics, conceptual, metonymical, Covid-19
Procedia PDF Downloads 130918 Bidirectional Dynamic Time Warping Algorithm for the Recognition of Isolated Words Impacted by Transient Noise Pulses
Authors: G. Tamulevičius, A. Serackis, T. Sledevič, D. Navakauskas
Abstract:
We consider the biggest challenge in speech recognition – noise reduction. Traditionally detected transient noise pulses are removed with the corrupted speech using pulse models. In this paper we propose to cope with the problem directly in Dynamic Time Warping domain. Bidirectional Dynamic Time Warping algorithm for the recognition of isolated words impacted by transient noise pulses is proposed. It uses simple transient noise pulse detector, employs bidirectional computation of dynamic time warping and directly manipulates with warping results. Experimental investigation with several alternative solutions confirms effectiveness of the proposed algorithm in the reduction of impact of noise on recognition process – 3.9% increase of the noisy speech recognition is achieved.Keywords: transient noise pulses, noise reduction, dynamic time warping, speech recognition
Procedia PDF Downloads 559917 A Corpus-Based Study on the Lexical, Syntactic and Sequential Features across Interpreting Types
Authors: Qianxi Lv, Junying Liang
Abstract:
Among the various modes of interpreting, simultaneous interpreting (SI) is regarded as a ‘complex’ and ‘extreme condition’ of cognitive tasks while consecutive interpreters (CI) do not have to share processing capacity between tasks. Given that SI exerts great cognitive demand, it makes sense to posit that the output of SI may be more compromised than that of CI in the linguistic features. The bulk of the research has stressed the varying cognitive demand and processes involved in different modes of interpreting; however, related empirical research is sparse. In keeping with our interest in investigating the quantitative linguistic factors discriminating between SI and CI, the current study seeks to examine the potential lexical simplification, syntactic complexity and sequential organization mechanism with a self-made inter-model corpus of transcribed simultaneous and consecutive interpretation, translated speech and original speech texts with a total running word of 321960. The lexical features are extracted in terms of the lexical density, list head coverage, hapax legomena, and type-token ratio, as well as core vocabulary percentage. Dependency distance, an index for syntactic complexity and reflective of processing demand is employed. Frequency motif is a non-grammatically-bound sequential unit and is also used to visualize the local function distribution of interpreting the output. While SI is generally regarded as multitasking with high cognitive load, our findings evidently show that CI may impose heavier or taxing cognitive resource differently and hence yields more lexically and syntactically simplified output. In addition, the sequential features manifest that SI and CI organize the sequences from the source text in different ways into the output, to minimize the cognitive load respectively. We reasoned the results in the framework that cognitive demand is exerted both on maintaining and coordinating component of Working Memory. On the one hand, the information maintained in CI is inherently larger in volume compared to SI. On the other hand, time constraints directly influence the sentence reformulation process. The temporal pressure from the input in SI makes the interpreters only keep a small chunk of information in the focus of attention. Thus, SI interpreters usually produce the output by largely retaining the source structure so as to relieve the information from the working memory immediately after formulated in the target language. Conversely, CI interpreters receive at least a few sentences before reformulation, when they are more self-paced. CI interpreters may thus tend to retain and generate the information in a way to lessen the demand. In other words, interpreters cope with the high demand in the reformulation phase of CI by generating output with densely distributed function words, more content words of higher frequency values and fewer variations, simpler structures and more frequently used language sequences. We consequently propose a revised effort model based on the result for a better illustration of cognitive demand during both interpreting types.Keywords: cognitive demand, corpus-based, dependency distance, frequency motif, interpreting types, lexical simplification, sequential units distribution, syntactic complexity
Procedia PDF Downloads 181916 The Combination of the Mel Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP), JITTER and SHIMMER Coefficients for the Improvement of Automatic Recognition System for Dysarthric Speech
Authors: Brahim-Fares Zaidi, Malika Boudraa, Sid-Ahmed Selouani
Abstract:
Our work aims to improve our Automatic Recognition System for Dysarthria Speech (ARSDS) based on the Hidden Models of Markov (HMM) and the Hidden Markov Model Toolkit (HTK) to help people who are sick. With pronunciation problems, we applied two techniques of speech parameterization based on Mel Frequency Cepstral Coefficients (MFCC's) and Perceptual Linear Prediction (PLP's) and concatenated them with JITTER and SHIMMER coefficients in order to increase the recognition rate of a dysarthria speech. For our tests, we used the NEMOURS database that represents speakers with dysarthria and normal speakers.Keywords: hidden Markov model toolkit (HTK), hidden models of Markov (HMM), Mel-frequency cepstral coefficients (MFCC), perceptual linear prediction (PLP’s)
Procedia PDF Downloads 164915 Conceptual Metaphors of Responsibility in Arabic to English Translation of Political Speeches: A Corpus-Based Study
Authors: Amr Anany
Abstract:
This study offers a corpus-based analysis of the conceptual metaphors of RESPONSIBILITY inherent in the Arabic political speeches of King Abdulla II and their English translations rendered by the translators of the Royal Hashemite Court ("RHC translators"). In view of the Conceptual Metaphor Theory (CMT), the current study aims to uncover the extent to which the dominant ideology in the source Arabic speeches of King Abdulla II is conveyed into the target English translation. The study explores a bilingual corpus, including eleven authentic Arabic speeches delivered by King Abdulla II and their English translations. The study finds that both Arabic and English share several metaphorical expressions of RESPONSIBILITY that are based on bodily experience such as RESPONSIBILITY IS UP, RESPONSIBILITY IS AN OBJECT, and RESPONSIBILITY IS AN HONOR. Apparently, the study concludes that RHC translators succeed to convey the dominant ideology from the source Arabic speeches to the English ones using specific translation strategies.Keywords: cognitive linguistics, CDA, conceptual metaphor theory, ideology, responsibility
Procedia PDF Downloads 72914 Cultural-Creative Design with Language Figures of Speech
Authors: Wei Chen Chang, Ming Yu Hsiao
Abstract:
The commodity takes one kind of mark, the designer how to construction and interpretation the user how to use the process and effectively convey message in design education has always been an important issue. Cultural-creative design refers to signifying cultural heritage for product design. In terms of Peirce’s Semiotic Triangle: signifying elements-object-interpretant, signifying elements are the outcomes of design, the object is cultural heritage, and the interpretant is the positioning and description of product design. How to elaborate the positioning, design, and development of a product is a narrative issue of the interpretant, and how to shape the signifying elements of a product by modifying and adapting styles is a rhetoric matter. This study investigated the rhetoric of elements signifying products to develop a rhetoric model with cultural style. Figures of speech are a rhetoric method in narrative. By adapting figures of speech to the interpretant, this study developed the rhetoric context of cultural context by narrative means. In this two-phase study, phase I defines figures of speech and phase II analyzes existing cultural-creative products in terms of figures of speech to develop a rhetoric of style model. We expect it can reference for the future development of Cultural-creative design.Keywords: cultural-creative design, cultural-creative products, figures of speech, Peirce’s semiotic triangle, rhetoric of style model
Procedia PDF Downloads 373913 Investigating (Im)Politeness Strategies in Email Communication: The Case Algerian PhD Supervisees and Irish Supervisors
Authors: Zehor Ktitni
Abstract:
In pragmatics, politeness is regarded as a feature of paramount importance to successful interpersonal relationships. On the other hand, emails have recently become one of the indispensable means of communication in educational settings. This research puts email communication at the core of the study and analyses it from a politeness perspective. More specifically, it endeavours to look closely at how the concept of (im)politeness is reflected through students’ emails. To this end, a corpus of Algerian supervisees’ email threads, exchanged with their Irish supervisors, was compiled. Leech’s model of politeness (2014) was selected as the main theoretical framework of this study, in addition to making reference to Brown and Levinson’s model (1987) as it is one of the most influential models in the area of pragmatic politeness. Further, some follow-up interviews are to be conducted with Algerian students to reinforce the results derived from the corpus. Initial findings suggest that Algerian Ph.D. students’ emails tend to include more politeness markers than impoliteness ones, they heavily make use of academic titles when addressing their supervisors (Dr. or Prof.), and they rely on hedging devices in order to sound polite.Keywords: politeness, email communication, corpus pragmatics, Algerian PhD supervisees, Irish supervisors
Procedia PDF Downloads 70912 Exploratory Analysis of A Review of Nonexistence Polarity in Native Speech
Authors: Deawan Rakin Ahamed Remal, Sinthia Chowdhury, Sharun Akter Khushbu, Sheak Rashed Haider Noori
Abstract:
Native Speech to text synthesis has its own leverage for the purpose of mankind. The extensive nature of art to speaking different accents is common but the purpose of communication between two different accent types of people is quite difficult. This problem will be motivated by the extraction of the wrong perception of language meaning. Thus, many existing automatic speech recognition has been placed to detect text. Overall study of this paper mentions a review of NSTTR (Native Speech Text to Text Recognition) synthesis compared with Text to Text recognition. Review has exposed many text to text recognition systems that are at a very early stage to comply with the system by native speech recognition. Many discussions started about the progression of chatbots, linguistic theory another is rule based approach. In the Recent years Deep learning is an overwhelming chapter for text to text learning to detect language nature. To the best of our knowledge, In the sub continent a huge number of people speak in Bangla language but they have different accents in different regions therefore study has been elaborate contradictory discussion achievement of existing works and findings of future needs in Bangla language acoustic accent.Keywords: TTR, NSTTR, text to text recognition, deep learning, natural language processing
Procedia PDF Downloads 133911 Quantum Cum Synaptic-Neuronal Paradigm and Schema for Human Speech Output and Autism
Authors: Gobinathan Devathasan, Kezia Devathasan
Abstract:
Objective: To improve the current modified Broca-Wernicke-Lichtheim-Kussmaul speech schema and provide insight into autism. Methods: We reviewed the pertinent literature. Current findings, involving Brodmann areas 22, 46, 9,44,45,6,4 are based on neuropathology and functional MRI studies. However, in primary autism, there is no lucid explanation and changes described, whether neuropathology or functional MRI, appear consequential. Findings: We forward an enhanced model which may explain the enigma related to autism. Vowel output is subcortical and does need cortical representation whereas consonant speech is cortical in origin. Left lateralization is needed to commence the circuitry spin as our life have evolved with L-amino acids and left spin of electrons. A fundamental species difference is we are capable of three syllable-consonants and bi-syllable expression whereas cetaceans and songbirds are confined to single or dual consonants. The 4 key sites for speech are superior auditory cortex, Broca’s two areas, and the supplementary motor cortex. Using the Argand’s diagram and Reimann’s projection, we theorize that the Euclidean three dimensional synaptic neuronal circuits of speech are quantized to coherent waves, and then decoherence takes place at area 6 (spherical representation). In this quantum state complex, 3-consonant languages are instantaneously integrated and multiple languages can be learned, verbalized and differentiated. Conclusion: We postulate that evolutionary human speech is elevated to quantum interaction unlike cetaceans and birds to achieve the three consonants/bi-syllable speech. In classical primary autism, the sudden speech switches off and on noted in several cases could now be explained not by any anatomical lesion but failure of coherence. Area 6 projects directly into prefrontal saccadic area (8); and this further explains the second primary feature in autism: lack of eye contact. The third feature which is repetitive finger gestures, located adjacent to the speech/motor areas, are actual attempts to communicate with the autistic child akin to sign language for the deaf.Keywords: quantum neuronal paradigm, cetaceans and human speech, autism and rapid magnetic stimulation, coherence and decoherence of speech
Procedia PDF Downloads 195910 The Visual Side of Islamophobia: A Social-Semiotic Analysis
Authors: Carmen Aguilera-Carnerero
Abstract:
Islamophobia, the unfounded hostility towards Muslims and Islam, has been deeply studied in the last decades from different perspectives ranging from anthropology, sociology, media studies, and linguistics. In the past few years, we have witnessed how the birth of social media has transformed formerly passive audiences into an active group that not only receives and digests information but also creates and comments publicly on any event of their interest. In this way, average citizens now have been entitled with the power of becoming potential opinion leaders. This rise of social media in the last years gave way to a different way of Islamophobia, the so called ‘cyberIslamophobia’. Considerably less attention, however, has been given to the study of islamophobic images that accompany the texts in social media. This paper attempts to analyse a corpus of 300 images of islamophobic nature taken from social media (from Twitter and Facebook) from the years 2014-2017 to see: a) how hate speech is visually constructed, b) how cyberislamophobia is articulated through images and whether there are differences/similarities between the textual and the visual elements, c) the impact of those images in the audience and their reaction to it and d) whether visual cyberislamophobia has undergone any process of permeating popular culture (for example, through memes) and its real impact. To carry out this task, we have used Critical Discourse Analysis as the most suitable theoretical framework that analyses and criticizes the dominant discourses that affect inequality, injustice, and oppression. The analysis of images was studied according to the theoretical framework provided by the visual framing theory and the visual design grammar to conclude that memes are subtle but very powerful tools to spread Islamophobia and foster hate speech under the guise of humour within popular culture.Keywords: cyberIslamophobia, visual grammar, social media, popular culture
Procedia PDF Downloads 169909 Introducing Data-Driven Learning into Chinese Higher Education English for Academic Purposes Writing Instructional Settings
Authors: Jingwen Ou
Abstract:
Writing for academic purposes in a second or foreign language is one of the most important and the most demanding skills to be mastered by non-native speakers. Traditionally, the EAP writing instruction at the tertiary level encompasses the teaching of academic genre knowledge, more specifically, the disciplinary writing conventions, the rhetorical functions, and specific linguistic features. However, one of the main sources of challenges in English academic writing for L2 students at the tertiary level can still be found in proficiency in academic discourse, especially vocabulary, academic register, and organization. Data-Driven Learning (DDL) is defined as “a pedagogical approach featuring direct learner engagement with corpus data”. In the past two decades, the rising popularity of the application of the data-driven learning (DDL) approach in the field of EAP writing teaching has been noticed. Such a combination has not only transformed traditional pedagogy aided by published DDL guidebooks in classroom use but also triggered global research on corpus use in EAP classrooms. This study endeavors to delineate a systematic review of research in the intersection of DDL and EAP writing instruction by conducting a systematic literature review on both indirect and direct DDL practice in EAP writing instructional settings in China. Furthermore, the review provides a synthesis of significant discoveries emanating from prior research investigations concerning Chinese university students’ perception of Data-Driven Learning (DDL) and the subsequent impact on their academic writing performance following corpus-based training. Research papers were selected from Scopus-indexed journals and core journals from two main Chinese academic databases (CNKI and Wanfang) published in both English and Chinese over the last ten years based on keyword searches. Results indicated an insufficiency of empirical DDL research despite a noticeable upward trend in corpus research on discourse analysis and indirect corpus applications for material design by language teachers. Research on the direct use of corpora and corpus tools in DDL, particularly in combination with genre-based EAP teaching, remains a relatively small fraction of the whole body of research in Chinese higher education settings. Such scarcity is highly related to the prevailing absence of systematic training in English academic writing registers within most Chinese universities' EAP syllabi due to the Chinese English Medium Instruction policy, where only English major students are mandated to submit English dissertations. Findings also revealed that Chinese learners still held mixed attitudes towards corpus tools influenced by learner differences, limited access to language corpora, and insufficient pre-training on corpus theoretical concepts, despite their improvements in final academic writing performance.Keywords: corpus linguistics, data-driven learning, EAP, tertiary education in China
Procedia PDF Downloads 61908 Performance Analysis of VoIP Coders for Different Modulations Under Pervasive Environment
Authors: Jasbinder Singh, Harjit Pal Singh, S. A. Khan
Abstract:
The work, in this paper, presents the comparison of encoded speech signals by different VoIP narrow-band and wide-band codecs for different modulation schemes. The simulation results indicate that codec has an impact on the speech quality and also effected by modulation schemes.Keywords: VoIP, coders, modulations, BER, MOS
Procedia PDF Downloads 518