Search results for: collecting speech emotion corpus
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2030

1790 The Communicative Nature of Linguistic Interference in Learning and Teaching of Slavic Languages

Authors: Kseniia Fedorova

Abstract:

The article is devoted to the analysis of interlinguistic homonymy and enantiosemy. These phenomena belong to the process of linguistic interference, which violates the integrity of communicative utterances and causes misunderstanding between foreign interlocutors who are native speakers of different Slavic languages. Particular attention is paid to non-typical speech situations that occur spontaneously or are created intentionally on the basis of the mechanism of the described phenomena. The article presents a classification of typical student mistakes connected with the paradox of interference. The survey contributes to speech act theory, contemporary linguodidactics, translation science and comparative lexicology of Slavonic languages.

Keywords: adherent enantiosemy, interference, interslavonic homonymy, speech act

Procedia PDF Downloads 214
1789 Exploring Reading into Writing: A Corpus-Based Analysis of Postgraduate Students’ Literature Review Essays

Authors: Tanzeela Anbreen, Ammara Maqsood

Abstract:

Reading into writing is one of university students' most required academic skills. The current study explored postgraduate university students’ writing quality using a corpus-based approach. Twelve postgraduate students’ literature review essays were chosen for the corpus-based analysis. These essays were chosen because students had to incorporate multiple reading sources in these essays, which was a new writing exercise for them. The students received feedback at least twice, in the form of written comments by the tutor highlighting areas for improvement and edits made with the ‘track changes’ function. This exercise was repeated two times, and students submitted two drafts. This investigation included only the finally submitted work of the students. A corpus-based approach was adopted to analyse the essays because it promotes autonomous discovery and personalised learning. The aim of this analysis was to understand the existing level of students’ writing before the start of their postgraduate thesis. Text Inspector was used to analyse the quality of essays. With the help of the Text Inspector tool, the vocabulary used in the essays was compared to the English Vocabulary Profile (EVP), which describes what learners know and can do at each Common European Framework of Reference (CEFR) level. Writing quality was also measured with the Flesch reading ease score, a standard measure of how easy a text is to understand. The results reflected that students found writing essays using multiple sources challenging. In most essays, the vocabulary level achieved was between the B1 and B2 CEFR levels. The study recommends that students receive extensive training in developing academic writing skills, particularly in writing literature review assignments, which require citing multiple sources.
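The Flesch reading ease score mentioned in the abstract above can be illustrated with a minimal sketch. The formula itself is the standard one (206.835 − 1.015 × words per sentence − 84.6 × syllables per word); the syllable counter is a rough vowel-group heuristic assumed here for illustration, and nothing below reflects how Text Inspector computes its scores.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per group of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Standard Flesch reading ease formula; higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    print(round(flesch_reading_ease("The cat sat on the mat."), 3))
```

Because the syllable count is only approximate, scores from this sketch may differ somewhat from those of dedicated tools.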

Keywords: literature review essays, postgraduate students, corpus-based analysis, vocabulary proficiency

Procedia PDF Downloads 37
1788 Dyadic Video Evidence on How Emotions in Parent Verbal Bids Affect Child Compliance in a British Sample

Authors: Iris Sirirada Pattara-Angkoon, Rory Devine, Anja Lindberg, Wendy Browne, Sarah Foley, Gabrielle McHarg, Claire Hughes

Abstract:

Introduction: The “Terrible Twos” is a phrase used to describe toddlers 18-30 months old. It characterizes a transition from high dependency on their caregivers in infancy to more autonomy and mastery of the body and environment. Toddlers at this age may also show more willfulness and stubbornness that could predict a future trajectory leading to conduct disorders. Thus, an important goal for this age group is to promote responsiveness to their caregivers (i.e., compliance). Existing literature tends to focus on praise to increase desirable child behavior. However, this relationship is not always straightforward, as some studies have found no or a negative association between praise and child compliance. Research suggests that positive emotions and affection shown through body language (e.g., smiles) and actions (e.g., hugs, kisses), along with a positive parent-child relationship, can strengthen the association between praise and child compliance. Nonetheless, few studies have examined the influences of positive emotionality within speech. This is important because implementing verbal positive emotionality is easier than making physical adjustments. The literature also tends not to include fathers in the study sample, as mothers were traditionally the primary caregivers. However, as child-caring duties are increasingly shared equally between mothers and fathers, it is important to include fathers, since studies have frequently found differences between female and male caregiver characteristics. Thus, the study will address the gap in the literature in two ways: 1. explore the influences of positive emotionality in parental speech and 2. include an equal sample of mothers and fathers. Positive emotionality is expected to positively correlate with and predict child compliance. Methodology: This study analyzed toddlers (18-24 months) in their dyadic interactions with mothers and fathers. 
A Duplo (block) task was used in which parents had to work with their children for four minutes to build the Duplo according to a given photo. Then, they were told to clean up the blocks. Parental positive emotionality in different speech types (e.g., bids, praises, affirmations) and child compliance were measured. Results: The study found that mothers (M = 28.92, SD = 12.01) were significantly more likely than fathers (M = 23.01, SD = 12.28) to use positive verbal emotionality in their speech, t(105) = 4.35, p < .001. High positive emotionality in bids during the Duplo task and Clean Up was positively correlated with more child compliance in each task, r(273) = .35, p < .001 and r(264) = .58, p < .001, respectively. Overall, parental positive emotionality in speech significantly predicted child compliance (F(6, 218) = 13.33, p < .001, R² = .27), with emotionality in verbal bids (t = 6.20, p < .001) and affirmations (t = 3.12, p = .002) being significant predictors. Conclusion: Positive verbal emotions may be useful for increasing compliance in toddlers. This can be beneficial for compliance interventions as well as for parent-child relationship quality through reduction of conflict and child defiance. As this study is correlational in nature, it will be important for future research to test the directional influence of positive emotionality within speech.

Keywords: child temperament, compliance, positive emotion, toddler, verbal bids

Procedia PDF Downloads 145
1787 Assessment of the Validity of Sentiment Analysis as a Tool to Analyze the Emotional Content of Text

Authors: Trisha Malhotra

Abstract:

Sentiment analysis is a recent field of study that computationally assesses the emotional nature of a body of text. To assess its test-validity, sentiment analysis was carried out on the emotional corpus of text from a personal 15-day mood diary. Self-reported mood scores tracked the daily mood evaluation scores given by the software more or less accurately. On further assessment, it was found that while sentiment analysis was good at assessing ‘global’ mood, it was not able to ‘locally’ identify and differentially score synonyms of various emotional words. It is further critiqued for treating the intensity of an emotion as universal across cultures. Finally, the software is shown not to account for emotional complexity in sentences, because it treats emotions as strictly positive or negative. Hence, it is posited that a better output could be two (positive and negative) affect scores for the same body of text.
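The closing suggestion of the abstract above, reporting separate positive and negative affect scores rather than a single polarity, can be sketched as follows. The tiny lexicon and its weights are hypothetical values invented for illustration; they are not taken from the study or from any real sentiment tool.

```python
import re
from typing import Tuple

# Hypothetical toy lexicons (assumed for illustration); real tools use far
# larger, empirically weighted word lists.
POSITIVE = {"happy": 1.0, "calm": 0.5, "relieved": 0.5}
NEGATIVE = {"sad": 1.0, "anxious": 1.0, "tired": 0.5}


def affect_scores(text: str) -> Tuple[float, float]:
    """Return (positive_score, negative_score) instead of one net polarity."""
    tokens = re.findall(r"[a-z]+", text.lower())
    positive = sum(POSITIVE.get(token, 0.0) for token in tokens)
    negative = sum(NEGATIVE.get(token, 0.0) for token in tokens)
    return positive, negative


if __name__ == "__main__":
    # A mixed-emotion sentence keeps both signals instead of cancelling out.
    print(affect_scores("I felt happy but also anxious and tired."))
```

A single net score for the example sentence would be slightly negative and would hide the fact that a positive emotion was also reported, which is the kind of complexity the abstract says is lost.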

Keywords: analysis, data, diary, emotions, mood, sentiment

Procedia PDF Downloads 238
1786 The Effects of Emotional Working Memory Training on Trait Anxiety

Authors: Gabrielle Veloso, Welison Ty

Abstract:

Trait anxiety is a pervasive tendency to attend to and experience fears and worries to a disproportionate degree, across various situations. This study sought to determine if participants who undergo emotional working memory training will have significantly lower scores on the trait anxiety scales post-intervention. The study also sought to determine if emotional regulation mediated the relationship between working memory training and trait anxiety. Forty-nine participants underwent 20 days of computerized emotional working memory training called Emotional Dual n-back, which involves viewing a continuous stream of emotional content on a grid, and then remembering the location and color of items presented on the grid. Participants in the treatment group had significantly lower trait anxiety compared to controls post-intervention. Mediation analysis determined that working memory training had no significant relationship to anxiety as measured by the Beck’s Anxiety Inventory-Trait (BAIT), but was significantly related to anxiety as measured by form Y2 of the Spielberger State-Trait Anxiety Inventory (STAI-Y2). Emotion regulation, as measured by the Emotional Regulation Questionnaire (ERQ), was found not to mediate between working memory training and trait anxiety reduction. Results suggest that working memory training may be useful in reducing psychoemotional symptoms rather than somatic symptoms of trait anxiety. Moreover, the study proposes that future research look further into the mediating role of emotion regulation via neuroimaging and the development of more comprehensive measures of emotion regulation.

Keywords: anxiety, emotion regulation, working-memory, working-memory training

Procedia PDF Downloads 111
1785 A Novel Method for Face Detection

Authors: H. Abas Nejad, A. R. Teymoori

Abstract:

Facial expression recognition is one of the open problems in computer vision. Robust neutral face recognition in real time is a major challenge for various supervised learning based facial expression recognition methods. This is due to the fact that supervised methods cannot accommodate all appearance variability across the faces with respect to race, pose, lighting, facial biases, etc. in the limited amount of training data. Moreover, processing each and every frame to classify emotions is not required, as the user stays neutral for the majority of the time in usual applications like video chat or photo album/web browsing. Detecting the neutral state at an early stage, and thereby excluding those frames from emotion classification, would save computational power. In this work, we propose a light-weight neutral vs. emotion classification engine, which acts as a preprocessor to the traditional supervised emotion classification approaches. It dynamically learns neutral appearance at Key Emotion (KE) points using a textural statistical model, constructed from a set of reference neutral frames for each user. The proposed method is made robust to various types of user head motions by accounting for affine distortions based on a textural statistical model. Robustness to dynamic shift of KE points is achieved by evaluating the similarities on a subset of neighborhood patches around each KE point using the prior information regarding the directionality of specific facial action units acting on the respective KE point. The proposed method, as a result, improves emotion recognition (ER) accuracy and simultaneously reduces the computational complexity of the ER system, as validated on multiple databases.

Keywords: neutral vs. emotion classification, Constrained Local Model, procrustes analysis, Local Binary Pattern Histogram, statistical model

Procedia PDF Downloads 318
1784 School Refusal Behaviours: The Roles of Adolescent and Parental Factors

Authors: Junwen Chen, Celina Feleppa, Tingyue Sun, Satoko Sasagawa, Michael Smithson

Abstract:

School refusal behaviours refer to behaviours to avoid school attendance, chronic lateness in arriving at school, or regular early dismissal. Poor attendance in schools is highly correlated with anxiety, depression, suicide attempts, delinquency, violence, and substance use and abuse. Poor attendance is also a strong indicator of lower achievement in school, as well as problematic social-emotional development. Long-term consequences of school refusal behaviours include fewer opportunities for higher education, employment, and social difficulties, and high risks of later psychiatric illness. Given its negative impacts on youth educational outcomes and well-being, a thorough understanding of factors that are involved in the development of this phenomenon is warranted for developing effective management approaches. This study investigated parental and adolescent factors that may contribute to school refusal behaviours by specifically focusing on the role of parental and adolescents’ anxiety and depression, emotion dysregulation, and parental rearing style. Findings are expected to inform the identification of both parental and adolescents’ factors that may contribute to school refusal behaviours. This knowledge will enable novel and effective approaches that incorporate these factors to managing school refusal behaviours in adolescents, which in turn improve their school and daily functioning. Results are important for an integrative understanding of school refusal behaviours. Furthermore, findings will also provide information for policymakers to weigh the benefits of interventions targeting school refusal behaviours in adolescents. 
One-hundred-and-six adolescents aged 12-18 years (mean age = 14.79 years old, SD = 1.78, males = 44) and their parents (mean age = 47.49 years old, SD = 5.61, males = 27) completed an online questionnaire measuring both parental and adolescents’ anxiety, depression, emotion dysregulation, parental rearing styles, and adolescents’ school refusal behaviours. Adolescents with school refusal behaviours reported greater anxiety and depression, with their parents showing greater emotion dysregulation. Parental emotion dysregulation and adolescents’ anxiety and depression predicted school refusal behaviours independently. To date, only limited studies have investigated the interplay between parental and youth factors in relation to youth school refusal behaviours. Although parental emotion dysregulation has been investigated in relation to youth emotion dysregulation, little is known about its role in the context of school refusal. This study is one of the very few that investigated both parental and adolescent factors in relation to school refusal behaviours in adolescents. The findings support the theoretical models that emphasise the role of youth and parental psychopathology in school refusal behaviours. Future management of school refusal behaviours should target adolescents’ anxiety and depression while incorporating training for parental emotion regulation skills.

Keywords: adolescents, school refusal behaviors, parental factors, anxiety and depression, emotion dysregulation

Procedia PDF Downloads 91
1783 Color-Based Emotion Regulation Model: An Affective E-Learning Environment

Authors: Sabahat Nadeem, Farman Ali Khan

Abstract:

Emotions are considered a vital factor affecting the process of information handling, level of attention, memory capacity and decision making. The latest e-learning systems therefore take the affective state of learners into consideration to make the learning process more effective and enjoyable. One such use of the user’s affective information is in systems that regulate users’ emotions towards a state that is optimal for learning. So far, this objective has been pursued with the help of teaching strategies, background music, guided imagery, video clips and odors. Nevertheless, we know that colors can affect human emotions. The relationship between color and emotions has a strong influence on how we perceive our environment. Similarly, the colors of the interface can affect the user positively as well as negatively. This affective behavior of color and its use as an emotion regulation agent have not yet been exploited. Therefore, this research proposes a Color-based Emotion Regulation Model (CERM), a new framework that can automatically adapt its colors according to the user’s emotional state and personality type and can help in producing a desirable emotional effect, aiming at providing unobtrusive emotional support to the users of an e-learning environment. The evaluation of CERM is carried out by comparing it with a classical non-adaptive, static-colored learning management system. Results indicate that the colors of the interface, when carefully selected, have a significant positive impact on learners’ emotions.

Keywords: affective learning, e-learning, emotion regulation, emotional design

Procedia PDF Downloads 280
1782 A Corpus-Based Study on the Lexical, Syntactic and Sequential Features across Interpreting Types

Authors: Qianxi Lv, Junying Liang

Abstract:

Among the various modes of interpreting, simultaneous interpreting (SI) is regarded as a ‘complex’ and ‘extreme condition’ of cognitive tasks, while consecutive interpreting (CI) does not require interpreters to share processing capacity between tasks. Given that SI exerts great cognitive demand, it makes sense to posit that the output of SI may be more compromised than that of CI in its linguistic features. The bulk of the research has stressed the varying cognitive demand and processes involved in different modes of interpreting; however, related empirical research is sparse. In keeping with our interest in investigating the quantitative linguistic factors discriminating between SI and CI, the current study seeks to examine the potential lexical simplification, syntactic complexity and sequential organization mechanism with a self-made inter-model corpus of transcribed simultaneous and consecutive interpretation, translated speech and original speech texts totalling 321,960 running words. The lexical features are extracted in terms of lexical density, list head coverage, hapax legomena, and type-token ratio, as well as core vocabulary percentage. Dependency distance, an index of syntactic complexity that reflects processing demand, is employed. Frequency motif, a non-grammatically-bound sequential unit, is also used to visualize the local function distribution of the interpreting output. While SI is generally regarded as multitasking with high cognitive load, our findings show that CI may tax cognitive resources more heavily, or in a different way, and hence yield more lexically and syntactically simplified output. In addition, the sequential features show that SI and CI organize the sequences from the source text into the output in different ways, each minimizing its own cognitive load. We interpret the results within a framework in which cognitive demand is exerted on both the maintaining and the coordinating components of working memory. 
On the one hand, the information maintained in CI is inherently larger in volume compared to SI. On the other hand, time constraints directly influence the sentence reformulation process. The temporal pressure from the input in SI makes the interpreters keep only a small chunk of information in the focus of attention. Thus, SI interpreters usually produce the output by largely retaining the source structure so as to release the information from working memory immediately after it is formulated in the target language. Conversely, CI interpreters receive at least a few sentences before reformulation, at which point they are more self-paced. CI interpreters may thus tend to retain and generate the information in a way that lessens the demand. In other words, interpreters cope with the high demand in the reformulation phase of CI by generating output with densely distributed function words, more content words of higher frequency values and fewer variations, simpler structures and more frequently used language sequences. We consequently propose a revised effort model based on the results to better illustrate cognitive demand during both interpreting types.
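Several of the measures named in the abstract above have simple textbook definitions, sketched below under stated assumptions: type-token ratio as distinct types over tokens, a hapax ratio as words occurring exactly once over tokens, and mean dependency distance as the average absolute distance between each dependent and its head. The function names, the normalisations and the example parse are illustrative choices, not the authors' actual pipeline.

```python
from collections import Counter
from typing import List


def type_token_ratio(tokens: List[str]) -> float:
    """Distinct word types divided by the total number of tokens."""
    return len(set(tokens)) / len(tokens)


def hapax_ratio(tokens: List[str]) -> float:
    """Share of tokens that are hapax legomena (words occurring exactly once)."""
    counts = Counter(tokens)
    hapaxes = sum(1 for count in counts.values() if count == 1)
    return hapaxes / len(tokens)


def mean_dependency_distance(heads: List[int]) -> float:
    """heads[i] is the 1-based position of the head of word i + 1; 0 marks the root.

    The root has no head, so it is excluded from the average.
    """
    distances = [abs(head - (i + 1)) for i, head in enumerate(heads) if head != 0]
    return sum(distances) / len(distances)


if __name__ == "__main__":
    tokens = "the interpreter renders the speech".split()
    # Hand-written parse (an assumption): the->interpreter, interpreter->renders,
    # renders=root, the->speech, speech->renders.
    heads = [2, 3, 0, 5, 3]
    print(type_token_ratio(tokens), hapax_ratio(tokens), mean_dependency_distance(heads))
```

Type-token ratio is sensitive to text length, so comparisons between corpora of different sizes usually need a length-normalised variant.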

Keywords: cognitive demand, corpus-based, dependency distance, frequency motif, interpreting types, lexical simplification, sequential units distribution, syntactic complexity

Procedia PDF Downloads 140
1781 Investigating the Online Effect of Language on Gesture in Advanced Bilinguals of Two Structurally Different Languages in Comparison to L1 Native Speakers of L2 and Explores Whether Bilinguals Will Follow Target L2 Patterns in Speech and Co-speech

Authors: Armita Ghobadi, Samantha Emerson, Seyda Ozcaliskan

Abstract:

Being a bilingual involves mastery of both speech and gesture patterns in a second language (L2). We know from earlier work in first language (L1) production contexts that speech and co-speech gesture form a tightly integrated system: co-speech gesture mirrors the patterns observed in speech, suggesting an online effect of language on nonverbal representation of events in gesture during the act of speaking (i.e., “thinking for speaking”). Relatively less is known about the online effect of language on gesture in bilinguals speaking structurally different languages. The few existing studies (mostly with small sample sizes) yield inconclusive findings: some show greater achievement of L2 patterns in gesture with more advanced L2 speech production, while others show preferences for L1 gesture patterns even in advanced bilinguals. In this study, we focus on advanced bilingual speakers of two structurally different languages (Spanish L1 with English L2) in comparison to L1 English speakers. We ask whether bilingual speakers will follow target L2 patterns not only in speech but also in gesture, or alternatively, follow L2 patterns in speech but resort to L1 patterns in gesture. We examined this question by studying speech and gestures produced by 23 advanced adult Spanish (L1)-English (L2) bilinguals (Mage=22; SD=7) and 23 monolingual English speakers (Mage=20; SD=2). Participants were shown 16 animated motion event scenes that included distinct manner and path components (e.g., "run over the bridge"). We recorded and transcribed all participant responses for speech and segmented them into sentence units that included at least one motion verb and its associated arguments. We also coded all gestures that accompanied each sentence unit. We focused on motion event descriptions because they show strong crosslinguistic differences in the packaging of motion elements in speech and co-speech gesture in first language production contexts. 
English speakers synthesize manner and path into a single clause or gesture (he runs over the bridge; running fingers forward), while Spanish speakers express each component separately (manner-only: él corre=he is running; circle arms next to body conveying running; path-only: él cruza el puente=he crosses the bridge; trace finger forward conveying trajectory). We tallied all responses by group and packaging type, separately for speech and co-speech gesture. Our preliminary results (n=4/group) showed that productions in English L1 and Spanish L1 differed, with greater preference for conflated packaging in L1 English and separated packaging in L1 Spanish, a pattern that was also largely evident in co-speech gesture. Bilinguals’ production in L2 English, however, followed the patterns of the target language in speech (with greater preference for conflated packaging) but not in gesture. Bilinguals used separated and conflated strategies in gesture at roughly similar rates in their L2 English, showing an effect of both L1 and L2 on co-speech gesture. Our results suggest that online production of L2 language has more limited effects on L2 gestures and that mastery of native-like patterns in L2 gesture might take longer than native-like L2 speech patterns.

Keywords: bilingualism, cross-linguistic variation, gesture, second language acquisition, thinking for speaking hypothesis

Procedia PDF Downloads 48
1780 Conceptual Metaphors of Responsibility in Arabic to English Translation of Political Speeches: A Corpus-Based Study

Authors: Amr Anany

Abstract:

This study offers a corpus-based analysis of the conceptual metaphors of RESPONSIBILITY inherent in the Arabic political speeches of King Abdulla II and their English translations rendered by the translators of the Royal Hashemite Court ("RHC translators"). Drawing on Conceptual Metaphor Theory (CMT), the study aims to uncover the extent to which the dominant ideology in the source Arabic speeches of King Abdulla II is conveyed in the target English translations. The study explores a bilingual corpus of eleven authentic Arabic speeches delivered by King Abdulla II and their English translations. The study finds that Arabic and English share several metaphorical expressions of RESPONSIBILITY that are based on bodily experience, such as RESPONSIBILITY IS UP, RESPONSIBILITY IS AN OBJECT, and RESPONSIBILITY IS AN HONOR. The study concludes that the RHC translators succeed in conveying the dominant ideology from the source Arabic speeches to the English ones using specific translation strategies.

Keywords: cognitive linguistics, CDA, conceptual metaphor theory, ideology, responsibility

Procedia PDF Downloads 35
1779 Cognitive Semantics Study of Conceptual and Metonymical Expressions in Johnson's Speeches about COVID-19

Authors: Hussain Hameed Mayuuf

Abstract:

The study investigates the conceptual metonymies used in political discourse about COVID-19. It analyzes how the conceptual metonymies in Johnson's speeches about coronavirus are constructed. The study aims to identify how metonymies are relevant to understanding the messages in Boris Johnson's speeches and to find out how conceptual blending theory can help people understand the messages in political speech about COVID-19. Lastly, it tries to point out which kinds of integration networks are common in political speech. The study is based on the hypotheses that conceptual blending theory is a powerful tool for investigating the intended messages in Johnson's speech and that there are different processes of blending networks and conceptual mapping that enable listeners to identify the messages in political speech. This study presents a qualitative and quantitative analysis of four speeches about COVID-19 delivered by Boris Johnson. The selected data are examined from a cognitive-semantic perspective, adopting Conceptual Blending Theory (CBT) as the model for the analysis. The study concludes that CBT is applicable to the analysis of metonymies in political discourse and that its mechanisms enable listeners to analyze and understand these speeches. Listeners can also identify and understand the hidden messages in Biden and Johnson's discourse about COVID-19 by using different conceptual networks. Finally, it is concluded that double-scope networks are the most common type of metonymy blending in political speech.

Keywords: cognitive, semantics, conceptual, metonymical, Covid-19

Procedia PDF Downloads 80
1778 Development of a Social Assistive Robot for Elderly Care

Authors: Edwin Foo, Woei Wen, Lui, Meijun Zhao, Shigeru Kuchii, Chin Sai Wong, Chung Sern Goh, Yi Hao He

Abstract:

This presentation describes the development of an assistive social robot for elderly care. We named this robot JOS, and he is restricted to tabletop operation. JOS is designed to have a maximum volume of 3600 cm³ with his base restricted to 250 mm, and his mission is to provide companionship and to assist and help the elderly. To accomplish this mission, JOS will be equipped with perception, reaction and cognition capabilities. His appearance will not be human-like but rather cute and approachable. JOS will also be designed to be gender-neutral. However, the robot will still have eyes, eyelids and a mouth. His eyes and eyelids will be built entirely with Robotis Dynamixel AX18 motors. To realize this complex task, JOS will also be equipped with a microphone array, a vision camera and an Intel i5 NUC computer, and will be powered by a 12 V lithium battery that will be self-charging. His face is constructed using 1 motor for each eyelid, 2 motors for the eyeballs, 3 motors for the neck mechanism and 1 motor for the lip movement. The vision sensor will be housed on JOS's forehead and the microphone array will be somewhere below the mouth. For the vision system, Omron's latest OKAO vision sensor is used. It is a compact and versatile sensor that is only 60 mm by 40 mm in size and operates with only a 5 V supply. In addition, the OKAO vision sensor is capable of identifying the user and recognizing the user's expression. With these functions, JOS is able to track and identify the user. If he cannot recognize the user, JOS will ask whether the user would like to be remembered. If yes, JOS will store the user information together with the captured face image in a database. This will allow JOS to recognize the user the next time the user is with JOS. In addition, JOS is also able to interpret the mood of the user through the user's facial expression. This will allow the robot to understand the user's mood and behavior and react accordingly. 
Machine learning will later be incorporated to learn the behavior of the user so as to better understand the user's mood and requirements. For the speech system, the Microsoft speech and grammar engine is used for speech recognition. In order to use the speech engine, we need to build a speech grammar database that captures the words commonly used by the elderly. This database is built from research journals and literature on elderly speech and from interviews asking elderly people what they want the robot to assist them with. Using the results from the interviews and the journal research, we are able to derive a set of common words that the elderly frequently use to request help. It is from this set that we build our grammar database. In situations where there is more than one person near JOS, he is able to identify the person who is talking to him through an in-house developed microphone array structure. In order to make the robot more interactive, we have also included the capability for the robot to express his emotions to the user through facial expressions by changing the position and movement of the eyelids and mouth. All robot emotions will be in response to the user's mood and requests. Lastly, we are expecting to complete this phase of the project and test it with elderly people and delirium patients by Feb 2015.

Keywords: social robot, vision, elderly care, machine learning

Procedia PDF Downloads 416
1777 Cognitive Emotion Regulation Strategies in 9–14-Year-Old Hungarian Children with Neurotypical Development in the Light of the Hungarian Version of Cognitive Emotion Regulation Questionnaire for Children

Authors: Dorottya Horváth, Andras Lang, Diana Varro-Horvath

Abstract:

This study is part of a major research effort to gain an integrative, neuropsychological, and personality psychological understanding of Attention Deficit Hyperactivity Disorder (ADHD) and thus improve the specification of diagnostic and therapeutic care. In the past, the neuropsychology section has investigated working memory, executive function, attention, and behavioural manifestations in children. Currently, we are looking for personality psychological protective factors for ADHD and its symptomatic exacerbation. We hypothesise that secure attachment, adaptive emotion regulation, and high resilience are protective factors. The aim of this study is to measure and report the results of a Hungarian sample on the Cognitive Emotion Regulation Questionnaire for Children (CERQ-k), because before studying groups with different developmental profiles, it is essential to know the average scores of groups with neurotypical development. Until now, there was no Hungarian version of this test, so we used our own translation. The questionnaire was developed to assess children's thoughts after experiencing negative life events. It consists of four items per subscale, for a total of 36 items. The response categories for each item range from 1 (almost never) to 5 (almost always). The subscales are self-blame, blaming others, acceptance, planning, positive refocusing, rumination or thought-focusing, positive reappraisal, putting into perspective, and catastrophizing. The data for this study were collected from 120 children aged 9-14 years. The data were analysed using descriptive statistics; the mean and standard deviation values for each age group, as well as the Cronbach's alpha value, were important in testing the reliability of the questionnaire. The results showed that the questionnaire is a reliable and valid measuring instrument in a Hungarian sample as well. 
These developments and results will allow the use of a version of the Cognitive Emotion Regulation Questionnaire for children in Hungarian and pave the way for the study of different developmental groups such as children with learning disabilities and/or with ADHD.
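The reliability check mentioned in this abstract can be illustrated with a minimal sketch of Cronbach's alpha for a single subscale. The function name and the demo scores are hypothetical and are not taken from the study; the formula is the standard one, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores).

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])                       # number of items in the subscale
    items = list(zip(*rows))               # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])  # variance of summed scores
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical answers of five children to a 4-item subscale (1-5 scale).
demo = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 5, 4, 4],
    [5, 4, 5, 5],
]
print(round(cronbach_alpha(demo), 3))
```

Sample (n-1) variances are used throughout; values closer to 1 indicate higher internal consistency.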

Keywords: neurotypical development, emotion regulation, negative life events, CERQ-k, Hungarian average scores

Procedia PDF Downloads 41
1776 Bidirectional Dynamic Time Warping Algorithm for the Recognition of Isolated Words Impacted by Transient Noise Pulses

Authors: G. Tamulevičius, A. Serackis, T. Sledevič, D. Navakauskas

Abstract:

We consider the biggest challenge in speech recognition: noise reduction. Traditionally, detected transient noise pulses are removed together with the corrupted speech using pulse models. In this paper, we propose to cope with the problem directly in the Dynamic Time Warping domain. A bidirectional Dynamic Time Warping algorithm for the recognition of isolated words impacted by transient noise pulses is proposed. It uses a simple transient noise pulse detector, employs bidirectional computation of dynamic time warping, and directly manipulates the warping results. Experimental investigation with several alternative solutions confirms the effectiveness of the proposed algorithm in reducing the impact of noise on the recognition process: a 3.9% increase in noisy speech recognition is achieved.
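For orientation, the classic dynamic time warping distance that this work builds on can be sketched as follows. This is the standard textbook recursion on one-dimensional sequences, not the authors' bidirectional variant, whose details are not given in the abstract; the function name is an assumption.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences (absolute cost)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # D[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # allowed steps: insertion, deletion, match
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


# A time-stretched copy of a sequence aligns with zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

In isolated word recognition, the sequences would be frames of feature vectors and the cost a vector distance; scalars are used here only to keep the sketch short.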

Keywords: transient noise pulses, noise reduction, dynamic time warping, speech recognition

Procedia PDF Downloads 525
1775 The Combination of the Mel Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP), JITTER and SHIMMER Coefficients for the Improvement of Automatic Recognition System for Dysarthric Speech

Authors: Brahim-Fares Zaidi, Malika Boudraa, Sid-Ahmed Selouani

Abstract:

Our work aims to improve our Automatic Recognition System for Dysarthric Speech (ARSDS), based on Hidden Markov Models (HMM) and the Hidden Markov Model Toolkit (HTK), to help people with pronunciation problems. We applied two speech parameterization techniques based on Mel Frequency Cepstral Coefficients (MFCCs) and Perceptual Linear Prediction (PLP) coefficients and concatenated them with JITTER and SHIMMER coefficients in order to increase the recognition rate of dysarthric speech. For our tests, we used the NEMOURS database, which contains speakers with dysarthria and normal speakers.
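The concatenation step can be sketched as below, assuming the commonly used "local" definitions of jitter and shimmer (mean absolute difference between consecutive pitch periods or peak amplitudes, divided by their mean). The function names and the toy values are hypothetical; the abstract does not state which jitter and shimmer variants the authors used, and MFCC or PLP extraction itself is not shown.

```python
def relative_perturbation(values):
    """Mean absolute difference of consecutive values, relative to their mean.

    Applied to pitch periods this gives local jitter; applied to peak
    amplitudes it gives local shimmer.
    """
    diffs = [abs(x - y) for x, y in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def augment(frame_features, periods, amplitudes):
    """Append jitter and shimmer to an existing feature vector (e.g. MFCCs)."""
    jitter = relative_perturbation(periods)
    shimmer = relative_perturbation(amplitudes)
    return list(frame_features) + [jitter, shimmer]


# Toy example: a 3-dimensional cepstral vector extended to 5 dimensions.
vector = augment([0.1, -0.4, 0.7],
                 periods=[0.0100, 0.0102, 0.0099],
                 amplitudes=[0.80, 0.78, 0.83])
print(len(vector))
```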

Keywords: hidden Markov model toolkit (HTK), hidden Markov models (HMM), Mel-frequency cepstral coefficients (MFCC), perceptual linear prediction (PLP)

Procedia PDF Downloads 130
1774 Investigating (Im)Politeness Strategies in Email Communication: The Case Algerian PhD Supervisees and Irish Supervisors

Authors: Zehor Ktitni

Abstract:

In pragmatics, politeness is regarded as a feature of paramount importance to successful interpersonal relationships. On the other hand, emails have recently become one of the indispensable means of communication in educational settings. This research puts email communication at the core of the study and analyses it from a politeness perspective. More specifically, it endeavours to look closely at how the concept of (im)politeness is reflected through students’ emails. To this end, a corpus of Algerian supervisees’ email threads, exchanged with their Irish supervisors, was compiled. Leech’s model of politeness (2014) was selected as the main theoretical framework of this study, in addition to making reference to Brown and Levinson’s model (1987) as it is one of the most influential models in the area of pragmatic politeness. Further, some follow-up interviews are to be conducted with Algerian students to reinforce the results derived from the corpus. Initial findings suggest that Algerian Ph.D. students’ emails tend to include more politeness markers than impoliteness ones, they heavily make use of academic titles when addressing their supervisors (Dr. or Prof.), and they rely on hedging devices in order to sound polite.

Keywords: politeness, email communication, corpus pragmatics, Algerian PhD supervisees, Irish supervisors

Procedia PDF Downloads 31
1773 Cultural-Creative Design with Language Figures of Speech

Authors: Wei Chen Chang, Ming Yu Hsiao

Abstract:

A commodity acts as a kind of sign. How designers construct and interpret it, how users use it, and how its message is conveyed effectively have always been important issues in design education. Cultural-creative design refers to signifying cultural heritage for product design. In terms of Peirce’s Semiotic Triangle (signifying elements, object, interpretant), signifying elements are the outcomes of design, the object is cultural heritage, and the interpretant is the positioning and description of product design. How to elaborate the positioning, design, and development of a product is a narrative issue of the interpretant, and how to shape the signifying elements of a product by modifying and adapting styles is a rhetoric matter. This study investigated the rhetoric of elements signifying products to develop a rhetoric model with cultural style. Figures of speech are a rhetoric method in narrative. By adapting figures of speech to the interpretant, this study developed the rhetoric context of cultural context by narrative means. In this two-phase study, phase I defines figures of speech and phase II analyzes existing cultural-creative products in terms of figures of speech to develop a rhetoric of style model. We expect it to serve as a reference for the future development of cultural-creative design.

Keywords: cultural-creative design, cultural-creative products, figures of speech, Peirce’s semiotic triangle, rhetoric of style model

Procedia PDF Downloads 345
1772 Introducing Data-Driven Learning into Chinese Higher Education EAP Writing Instructional Settings

Authors: Jingwen Ou

Abstract:

Writing for academic purposes in a second or foreign language is one of the most important and the most demanding skills to be mastered by non-native speakers. Traditionally, the EAP writing instruction at the tertiary level encompasses the teaching of academic genre knowledge, more specifically, the disciplinary writing conventions, the rhetorical functions, and specific linguistic features. However, one of the main sources of challenges in English academic writing for L2 students at the tertiary level can still be found in proficiency in academic discourse, especially vocabulary, academic register, and organization. Data-Driven Learning (DDL) is defined as “a pedagogical approach featuring direct learner engagement with corpus data”. In the past two decades, the rising popularity of the application of the data-driven learning (DDL) approach in the field of EAP writing teaching has been noticed. Such a combination has not only transformed traditional pedagogy aided by published DDL guidebooks in classroom use but also triggered global research on corpus use in EAP classrooms. This study endeavors to delineate a systematic review of research in the intersection of DDL and EAP writing instruction by conducting a systematic literature review on both indirect and direct DDL practice in EAP writing instructional settings in China. Furthermore, the review provides a synthesis of significant discoveries emanating from prior research investigations concerning Chinese university students’ perception of Data-Driven Learning (DDL) and the subsequent impact on their academic writing performance following corpus-based training. Research papers were selected from Scopus-indexed journals and core journals from two main Chinese academic databases (CNKI and Wanfang) published in both English and Chinese over the last ten years based on keyword searches. 
Results indicated an insufficiency of empirical DDL research despite a noticeable upward trend in corpus research on discourse analysis and indirect corpus applications for material design by language teachers. Research on the direct use of corpora and corpus tools in DDL, particularly in combination with genre-based EAP teaching, remains a relatively small fraction of the whole body of research in Chinese higher education settings. Such scarcity is highly related to the prevailing absence of systematic training in English academic writing registers within most Chinese universities' EAP syllabi due to the Chinese English Medium Instruction policy, where only English major students are mandated to submit English dissertations. Findings also revealed that Chinese learners still held mixed attitudes towards corpus tools influenced by learner differences, limited access to language corpora, and insufficient pre-training on corpus theoretical concepts, despite their improvements in final academic writing performance.

Keywords: corpus linguistics, data-driven learning, EAP, tertiary education in China

Procedia PDF Downloads 15
1771 A Literature Review of Emotional Labor and Non-Task Behavior

Authors: Yeong-Gyeong Choi, Kyoung-Seok Kim

Abstract:

This literature review intends to deal with the problem of conceptual ambiguity in research on emotional labor and to look into the evolutionary trends and changing aspects of defining the concept of emotional labor. In addition, in existing studies, deep acting and surface acting are highly related to a positive outcome variable and a negative outcome variable, respectively. It was confirmed that for employees performing emotional labor, deep acting and surface acting are highly related to OCB and CWB, respectively. While positive emotion that employees experience during job performance can easily trigger a positive non-task behavior such as OCB, negative emotion that employees experience through excessive workload or unfair treatment can easily induce a negative behavior like CWB. The two management behaviors of emotional labor, surface acting and deep acting, can have either a positive or a negative effect on the non-task behavior of employees, depending on which one they choose. Thus, the purpose of this review paper is to clarify the relationship between emotional labor and non-task behavior more specifically.

Keywords: emotion labor, non-task behavior, OCB, CWB

Procedia PDF Downloads 319
1770 Understanding Mental Constructs of Language and Emotion

Authors: Sakshi Ghai

Abstract:

The word ‘emotion’ has been microscopically studied through psychological, anthropological and biological lenses and has indubitably been one of the most researched concepts, as, in all situations and reactions that constitute human life, emotions form the very niche of our mutual existence. While understanding the social aspects of cognition, one can realize that emotions are deeply interwoven with language and thereby are pivotal in inducing human actions and behavior. The society or the outward social structure is the result of the inward psychological structure of our human relationships, for the individual is the result of the total experience, knowledge and conduct of man. The aim of this paper is threefold: first, to establish the relation between mental representations of emotions and their neuropsychological connection with language on a conscious and sub-conscious level; secondly, to describe how innate, basic and higher cognitive emotions affect the constantly changing state of an agent and to examine their role in determining the moral compass within all beings. Lastly, in the course of this paper, the concept of the architecture of mind is explored, considering how it has developed an ability to display adaptive emotional states and responses, which are in sync with the language of thought. Because every response to the social environment is so deeply determined by the very social milieu in which one is situated, language has a fundamental role in constructing emotions and articulating behavior. Being linguistic beings, we tend to associate emotion, feelings and other aspects of inward mental states intrinsically with the language we use. This paper aims to devise a discursive approach to understand how emotions are fabricated and intertwined with the mental constructs that are further expressed and communicated through the various units of language.

Keywords: mental representation, emotion, language, psychology

Procedia PDF Downloads 260
1769 Personality Moderates the Relation Between Mother's Emotional Intelligence and Young Children's Emotion Situation Knowledge

Authors: Natalia Alonso-Alberca, Ana I. Vergara

Abstract:

From the very first years of their life, children are confronted with situations in which they need to deal with emotions. The family provides the first emotional experiences, and it is in the family context that children usually take their first steps towards acquiring emotion knowledge. Parents play a key role in this important task, helping their children develop emotional skills that they will need in challenging situations throughout their lives. Specifically, mothers are models imitated by their children. They create specific spatial and temporal contexts in which children learn about emotions, their causes, consequences, and complexity. This occurs not only through what mothers say or do directly to the child. Rather, it occurs, to a large extent, through the example that they set using their own emotional skills. The aim of the current study was to analyze how maternal abilities to perceive and to manage emotions influence children’s emotion knowledge, specifically, their emotion situation knowledge, taking into account the role played by the mother’s personality and the time spent together, and controlling for the effect of age, sex and the child’s verbal abilities. Participants were 153 children from 4 schools in Spain, and their mothers. The children (41.8% girls) ranged in age from 35 to 72 months. The mothers (N = 140) ranged in age from 27 to 49 years (M = 38.7). Twelve mothers had more than one child participating in the study. The main variables were the child's emotion situation knowledge (ESK), measured by the Emotion Matching Task (EMT), and receptive language, measured using the Picture Vocabulary Test. The mothers' Emotional Intelligence (EI) was assessed with the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT), and their personality with the Big Five Inventory.
The results showed that the predictive power of maternal emotional skills on ESK was moderated by the mother’s personality, affecting both the direction and size of the relationships detected: low neuroticism and low openness to experience led to a positive influence of maternal EI on children’s ESK, while high levels in these personality dimensions resulted in a negative influence on the child's ESK. The time that the mother and the child spend together was revealed as a positive predictor of this emotion knowledge, while it did not moderate the influence of the mother's EI on the child’s ESK. In light of the results, we can infer that maternal EI is linked to children’s emotional skills, though a high level of maternal EI does not necessarily predict a greater degree of emotion knowledge in children, which seems rather to depend on specific personality profiles. The results of the current study indicate that a good level of maternal EI does not guarantee that children will learn the emotional skills that foster prosocial adaptation. Rather, EI must be accompanied by certain psychological characteristics (personality traits in this case).

Keywords: emotional intelligence, emotion situation knowledge, mothers, personality, young children

Procedia PDF Downloads 96
1768 Exploratory Analysis of A Review of Nonexistence Polarity in Native Speech

Authors: Deawan Rakin Ahamed Remal, Sinthia Chowdhury, Sharun Akter Khushbu, Sheak Rashed Haider Noori

Abstract:

Native speech-to-text synthesis has its own value for mankind. Speaking in different accents is common, but communication between people with different accents is quite difficult. The problem arises when the meaning of the language is perceived incorrectly. Thus, many existing automatic speech recognition systems have been deployed to detect text. This paper presents a review of NSTTR (Native Speech Text to Text Recognition) synthesis compared with text-to-text recognition. The review shows that many text-to-text recognition systems are at a very early stage with respect to native speech recognition. Much of the discussion concerns the progression of chatbots and linguistic theory; another strand is the rule-based approach. In recent years, deep learning has become a dominant approach for text-to-text learning to detect the nature of language. To the best of our knowledge, a huge number of people in the subcontinent speak Bangla, but with different accents in different regions; therefore, this study discusses the contrasting achievements of existing works and identifies future needs for Bangla acoustic accents.

Keywords: TTR, NSTTR, text to text recognition, deep learning, natural language processing

Procedia PDF Downloads 99
1767 The Visual Side of Islamophobia: A Social-Semiotic Analysis

Authors: Carmen Aguilera-Carnerero

Abstract:

Islamophobia, the unfounded hostility towards Muslims and Islam, has been deeply studied in the last decades from different perspectives ranging from anthropology and sociology to media studies and linguistics. In the past few years, we have witnessed how the birth of social media has transformed formerly passive audiences into an active group that not only receives and digests information but also creates and comments publicly on any event of their interest. In this way, average citizens have now been given the power to become potential opinion leaders. This rise of social media in recent years gave way to a different form of Islamophobia, the so-called ‘cyberIslamophobia’. Considerably less attention, however, has been given to the study of islamophobic images that accompany the texts in social media. This paper attempts to analyse a corpus of 300 images of islamophobic nature taken from social media (Twitter and Facebook) from the years 2014-2017 to see: a) how hate speech is visually constructed, b) how cyberislamophobia is articulated through images and whether there are differences/similarities between the textual and the visual elements, c) the impact of those images on the audience and its reaction to them, and d) whether visual cyberislamophobia has undergone any process of permeating popular culture (for example, through memes) and its real impact. To carry out this task, we have used Critical Discourse Analysis as the most suitable theoretical framework that analyses and criticizes the dominant discourses that affect inequality, injustice, and oppression. The images were analysed according to the theoretical framework provided by visual framing theory and visual design grammar, leading to the conclusion that memes are subtle but very powerful tools to spread Islamophobia and foster hate speech under the guise of humour within popular culture.

Keywords: cyberIslamophobia, visual grammar, social media, popular culture

Procedia PDF Downloads 131
1766 Quantum Cum Synaptic-Neuronal Paradigm and Schema for Human Speech Output and Autism

Authors: Gobinathan Devathasan, Kezia Devathasan

Abstract:

Objective: To improve the current modified Broca-Wernicke-Lichtheim-Kussmaul speech schema and provide insight into autism. Methods: We reviewed the pertinent literature. Current findings, involving Brodmann areas 22, 46, 9, 44, 45, 6, and 4, are based on neuropathology and functional MRI studies. However, in primary autism, there is no lucid explanation, and the changes described, whether neuropathology or functional MRI, appear consequential. Findings: We forward an enhanced model which may explain the enigma related to autism. Vowel output is subcortical and does need cortical representation whereas consonant speech is cortical in origin. Left lateralization is needed to commence the circuitry spin, as life has evolved with L-amino acids and left spin of electrons. A fundamental species difference is that we are capable of three syllable-consonants and bi-syllable expression whereas cetaceans and songbirds are confined to single or dual consonants. The 4 key sites for speech are the superior auditory cortex, Broca’s two areas, and the supplementary motor cortex. Using Argand’s diagram and Riemann’s projection, we theorize that the Euclidean three-dimensional synaptic neuronal circuits of speech are quantized to coherent waves, and then decoherence takes place at area 6 (spherical representation). In this quantum state complex, 3-consonant languages are instantaneously integrated and multiple languages can be learned, verbalized and differentiated. Conclusion: We postulate that evolutionary human speech is elevated to quantum interaction, unlike cetaceans and birds, to achieve the three consonants/bi-syllable speech. In classical primary autism, the sudden speech switches off and on noted in several cases could now be explained not by any anatomical lesion but by failure of coherence. Area 6 projects directly into the prefrontal saccadic area (8), and this further explains the second primary feature in autism: lack of eye contact.
The third feature which is repetitive finger gestures, located adjacent to the speech/motor areas, are actual attempts to communicate with the autistic child akin to sign language for the deaf.

Keywords: quantum neuronal paradigm, cetaceans and human speech, autism and rapid magnetic stimulation, coherence and decoherence of speech

Procedia PDF Downloads 160
1765 Performance Analysis of VoIP Coders for Different Modulations Under Pervasive Environment

Authors: Jasbinder Singh, Harjit Pal Singh, S. A. Khan

Abstract:

This paper compares speech signals encoded by different narrow-band and wide-band VoIP codecs under different modulation schemes. The simulation results indicate that the codec has an impact on speech quality, which is also affected by the modulation scheme.

Keywords: VoIP, coders, modulations, BER, MOS

Procedia PDF Downloads 479
1764 Ahmad Sabzi Balkhkanloo, Motahareh Sadat Hashemi, Seyede Marzieh Hosseini, Saeedeh Shojaee-Aliabadi, Leila Mirmoghtadaie

Authors: Elyria Kemp, Kelly Cowart, My Bui

Abstract:

According to the National Institute of Mental Health, an estimated 31.9% of adolescents have had an anxiety disorder. Several environmental factors may help to contribute to high levels of anxiety and depression in young people (i.e., Generation Z, Millennials). However, as young people negotiate life on social media, they may begin to evaluate themselves using excessively high standards and adopt self-perfectionism tendencies. Broadly defined, self-perfectionism involves very critical evaluations of the self. Perfectionism may also come from others and may manifest as socially prescribed perfectionism, and young adults are reporting higher levels of socially prescribed perfectionism than previous generations. This rising perfectionism is also associated with anxiety, greater physiological reactivity, and a sense of social disconnection. However, theories from psychology suggest that improvement in emotion regulation can contribute to enhanced psychological and emotional well-being. Emotion regulation refers to the ways people manage how and when they experience and express their emotions. Cognitive reappraisal and expressive suppression are common emotion regulation strategies. Cognitive reappraisal involves changing the meaning of a stimulus, that is, construing a potentially emotion-eliciting situation in a way that changes its emotional impact. By contrast, expressive suppression involves inhibiting the behavioral expression of emotion. The purpose of this research is to examine the efficacy of social marketing initiatives which promote emotion regulation strategies to help young adults regulate their emotions. In Study 1, a single-factor (emotion regulation strategy: cognitive reappraisal, expressive suppression, control) between-subjects design was conducted using an online, non-student consumer panel (n=96). Sixty-eight percent of participants were male, and 32% were female.
Study participants belonged to the Millennial and Gen Z cohort, ranging in age from 22 to 35 (M=27). Participants were first told to spend at least three minutes writing about a public speaking appearance which made them anxious. The purpose of this exercise was to induce anxiety. Next, participants viewed one of three advertisements (randomly assigned), each of which promoted an emotion regulation strategy (cognitive reappraisal or expressive suppression) or was non-emotional in nature. After being exposed to one of the ads, participants responded to a measure composed of two items to assess their emotional state and the efficacy of the messages in fostering emotion management. Findings indicated that individuals in the cognitive reappraisal condition (M=3.91) exhibited the most positive feelings and more effective emotion regulation than the expressive suppression (M=3.39) and control conditions (M=3.72, F(1,92) = 3.3, p<.05). Results from this research can be used by institutions (e.g., schools) in taking a leadership role in attacking anxiety and other mental health issues. Social stigmas regarding mental health can be removed and a more proactive stance can be taken in promoting healthy coping behaviors and strategies to manage negative emotions.

Keywords: emotion regulation, anxiety, social marketing, generation z

Procedia PDF Downloads 177
1763 Audio-Visual Co-Data Processing Pipeline

Authors: Rita Chattopadhyay, Vivek Anand Thoutam

Abstract:

Speech is the most acceptable means of communication, through which we can quickly exchange our feelings and thoughts. Quite often, people can communicate orally but cannot interact or work with computers or devices. It is easier and quicker to give speech commands than to type commands into computers. In the same way, it is easier to listen to audio played from a device than to extract output from computers or devices. Especially with robotics being an emerging market with applications in warehouses, the hospitality industry, consumer electronics, assistive technology, etc., speech-based human-machine interaction is emerging as a lucrative feature for robot manufacturers. Considering this factor, the objective of this paper is to design the “Audio-Visual Co-Data Processing Pipeline.” This pipeline integrates automatic speech recognition, a natural language model for text understanding, object detection, and text-to-speech modules. There are many deep learning models for each type of module mentioned above, but OpenVINO Model Zoo models are used because the OpenVINO toolkit covers both computer vision and non-computer vision workloads across Intel hardware, maximizes performance, and accelerates application development. A speech command is given as input that has information about the target objects to be detected and the start and end times needed to extract the required interval from the video. Speech is converted to text using the QuartzNet automatic speech recognition model. The summary is extracted from the text using a natural language model, Generative Pre-Trained Transformer-3 (GPT-3). Based on the summary, essential frames from the video are extracted, and the You Only Look Once (YOLO) object detection model detects objects in these extracted frames. Frame numbers that have target objects (specified objects in the speech command) are saved as text.
Finally, this text (frame numbers) is converted to speech using a text-to-speech model and played from the device. This project is developed for 80 You Only Look Once (YOLO) labels, and the user can extract frames based on only one or two target labels. This pipeline can easily be extended to more than two target labels by making appropriate changes in the object detection module. This project is developed for four different speech command formats by including sample examples in the prompt used by the Generative Pre-Trained Transformer-3 (GPT-3) model. Based on user preference, one can come up with a new speech command format by including some examples of the respective format in the prompt used by the Generative Pre-Trained Transformer-3 (GPT-3) model. This pipeline can be used in many projects such as human-machine interfaces, human-robot interaction, and surveillance through speech commands. All object detection projects can be upgraded using this pipeline so that one can give speech commands and the output is played from the device.
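The frame-selection step described in this abstract (pick the requested interval, then keep the frames that contain a target label) can be sketched without any model code. The function names, the frame rate, and the detection dictionary are hypothetical stand-ins for the outputs of the speech, language, and YOLO stages; no OpenVINO or GPT-3 calls are shown.

```python
def frames_in_interval(start_s, end_s, fps):
    """Frame indices covering the interval [start_s, end_s] in seconds."""
    first = int(start_s * fps)
    last = int(end_s * fps)
    return range(first, last + 1)


def frames_with_targets(detections, targets):
    """Frames whose detected labels include at least one target label.

    detections maps a frame index to the labels an object detector
    reported for that frame (assumed to be computed elsewhere).
    """
    wanted = set(targets)
    return [frame for frame, labels in sorted(detections.items())
            if wanted & set(labels)]


# Hypothetical detector output for a 2 fps clip, seconds 1 to 2.
interval = frames_in_interval(1, 2, fps=2)
detections = {2: ["person", "car"], 3: ["dog"], 4: []}
selected = frames_with_targets(
    {f: detections.get(f, []) for f in interval}, ["person"])
print(selected)
```

The resulting list of frame numbers is what the abstract describes as being saved as text and passed to the text-to-speech stage.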

Keywords: OpenVINO, automatic speech recognition, natural language processing, object detection, text to speech

Procedia PDF Downloads 49
1762 A Method for the Extraction of the Character's Tendency from Korean Novels

Authors: Min-Ha Hong, Kee-Won Kim, Seung-Hoon Kim

Abstract:

The character in story-based content, such as novels and movies, is one of the core elements for understanding the story. In particular, the character’s tendency is an important factor in analyzing story-based content, because it has a significant influence on the storyline. If readers know the tendency of the characters before reading a novel, it will be easier to understand the structure of conflict, the episodes, and the relationships between characters in the novel. It may therefore help readers to select a novel that they want to read. In this paper, we propose a method of extracting the tendency of the characters from a novel written in Korean. In advance, we build a dictionary with pairs of emotional words in Korean and English, since the emotion words in the novel’s sentences express the characters’ feelings. We rate the degree of polarity (positive or negative) of words in our emotional words dictionary based on SenticNet. Then we extract characters and emotion words from sentences in a novel. Since the polarity of a word grows stronger or weaker due to sentence features such as quotations and modifiers, our proposed method considers them when calculating the polarity of characters. The extracted character polarity can be used in a book search service or a book recommendation service.
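The idea of scaling word polarity by modifiers and quotations can be sketched as follows. The lexicon values, the modifier weights, and the quotation weight are all made-up illustrative numbers, not SenticNet scores and not the authors' parameters, and English tokens are used in place of Korean only for readability.

```python
# Hypothetical polarity lexicon (the paper derives its ratings from SenticNet).
LEXICON = {"happy": 0.8, "sad": -0.7, "angry": -0.9}
# Hypothetical modifier weights that strengthen or weaken the next emotion word.
MODIFIERS = {"very": 1.5, "slightly": 0.5}


def sentence_polarity(tokens, in_quotation=False, quote_weight=1.2):
    """Sum modifier-scaled polarities of the emotion words in one sentence."""
    score = 0.0
    scale = 1.0
    for tok in tokens:
        if tok in MODIFIERS:
            scale = MODIFIERS[tok]      # applies to the following word only
            continue
        if tok in LEXICON:
            score += LEXICON[tok] * scale
        scale = 1.0
    # Assumption: a character's own quoted speech counts slightly more.
    return score * quote_weight if in_quotation else score


def character_polarity(sentences):
    """Accumulate polarity per character from (name, tokens, quoted) triples."""
    totals = {}
    for name, tokens, quoted in sentences:
        totals[name] = totals.get(name, 0.0) + sentence_polarity(tokens, quoted)
    return totals


print(character_polarity([
    ("Mina", ["she", "was", "very", "happy"], False),
    ("Jun", ["i", "am", "angry"], True),
]))
```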

Keywords: character tendency, data mining, emotion word, Korean novel

Procedia PDF Downloads 309
1761 Multimodal Data Fusion Techniques in Audiovisual Speech Recognition

Authors: Hadeer M. Sayed, Hesham E. El Deeb, Shereen A. Taie

Abstract:

In the big data era, we are facing a diversity of datasets from different sources in different domains that describe a single life event. These datasets consist of multiple modalities, each of which has a different representation, distribution, scale, and density. Multimodal fusion is the concept of integrating information from multiple modalities in a joint representation with the goal of predicting an outcome through a classification task or regression task. In this paper, multimodal fusion techniques are classified into two main classes: model-agnostic techniques and model-based approaches. It provides a comprehensive study of recent research in each class and outlines the benefits and limitations of each of them. Furthermore, the audiovisual speech recognition task is expressed as a case study of multimodal data fusion approaches, and the open issues through the limitations of the current studies are presented. This paper can be considered a powerful guide for interested researchers in the field of multimodal data fusion and audiovisual speech recognition particularly.
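Two common model-agnostic strategies, early (feature-level) and late (decision-level) fusion, can be sketched in a few lines. The function names, the equal default weighting, and the toy scores are illustrative assumptions rather than anything specified in the paper.

```python
def early_fusion(audio_features, visual_features):
    """Feature-level fusion: concatenate modality features into one vector."""
    return list(audio_features) + list(visual_features)


def late_fusion(audio_scores, visual_scores, audio_weight=0.5):
    """Decision-level fusion: weighted average of per-class scores."""
    return [audio_weight * a + (1 - audio_weight) * v
            for a, v in zip(audio_scores, visual_scores)]


def predict(scores):
    """Index of the highest-scoring class."""
    return max(range(len(scores)), key=scores.__getitem__)


# Toy case: audio is noisy and slightly favours class 0,
# lip-reading strongly favours class 1.
fused = late_fusion([0.6, 0.4], [0.1, 0.9])
print(predict(fused))
```

With early fusion a single classifier is trained on the joint vector, whereas with late fusion each modality keeps its own classifier and only the outputs are combined.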

Keywords: multimodal data, data fusion, audio-visual speech recognition, neural networks

Procedia PDF Downloads 78