Search results for: speech emotion recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2504

2354 The Role of Parental Stress and Emotion Regulation in Responding to Children’s Expression of Negative Emotion

Authors: Lizel Bertie, Kim Johnston

Abstract:

Parental emotion regulation plays a central role in the socialisation of emotion, especially when teaching young children to cope with negative emotions. Despite evidence that non-supportive parental responses to children’s expression of negative emotions have implications for the social and emotional development of the child, few studies have investigated risk factors that affect parental emotion socialisation processes. The current study aimed to explore the extent to which parental stress contributes to both difficulties in parental emotion regulation and non-supportive parental responses to children’s expression of negative emotions. In addition, the study examined whether parental use of expressive suppression as an emotion regulation strategy facilitates the influence of parental stress on non-supportive responses by testing the relations in a mediation model. A sample of 140 Australian adults, who identified as parents with children aged 5 to 10 years, completed an online questionnaire. The measures explored recent symptoms of depression, anxiety, and stress, the use of expressive suppression as an emotion regulation strategy, and hypothetical parental responses to scenarios related to children’s expression of negative emotions. A mediated regression indicated that parents who reported higher levels of stress also reported higher levels of expressive suppression as an emotion regulation strategy and increased use of non-supportive responses in relation to young children’s expression of negative emotions. These findings suggest that parents who experience heightened symptoms of stress are more likely to both suppress their emotions in parent-child interaction and engage in non-supportive responses. Furthermore, higher use of expressive suppression strongly predicted the use of non-supportive responses, despite the presence of parental stress. Contrary to expectation, no indirect effect of stress on non-supportive responses was observed via expressive suppression. The findings from the study suggest that parental stress may become a more salient manifestation of psychological distress in a sub-clinical population of parents while contributing to impaired parental responses. As such, the study offers support for targeting overarching factors such as difficulties in parental emotion regulation and stress management, not only as an intervention for parental psychological distress, but also for the detection and prevention of maladaptive parenting practices.

Keywords: emotion regulation, emotion socialisation, expressive suppression, non-supportive responses, parental stress

Procedia PDF Downloads 132
2353 The Discursive Construction of Emotions in the Headlines of French Newspapers on Seismic Disasters

Authors: Mirela-Gabriela Bratu

Abstract:

The main objective of this study is to highlight how emotions are constructed discursively in the French written press, particularly in the headlines of informative articles. To achieve this objective, we begin with a theoretical part that captures the characteristics of journalistic discourse, to which we add the markers of emotion that we identify in the headlines. The approach is based on empirical results from the analysis of articles published on the earthquake that took place on August 24, 2016, in Italy, as described by two French national daily newspapers: Le Monde and Le Point. The corpus contains thirty-seven headlines published between August 24, 2016, and August 24, 2017. While the textual content of the discourse offers information that respects grammatical standards and follows presentation conventions, the choice of words can touch the reader, so the journalist must draw on means other than mastery of the language to create emotion. This study aims to highlight the strategies, such as rhetorical figures, tenses, or factual data, used by journalists to create emotions in readers. By studying articles published over several days about the same event, we also try to determine whether the emotion and the sense of catastrophe dissipate as the event recedes in time. The theoretical framework is provided by works on rhetorical strategies (Perelman, 1992; Amossi, 2000; Charaudeau, 2000) and on the study of emotions (Plantin, 1997, 1998, 2004; Tetu, 2004).

Keywords: disaster, earthquake, emotion, feeling

Procedia PDF Downloads 108
2352 The Role of Cultural Expectations in Emotion Regulation among Nepali Adolescents

Authors: Martha Berg, Megan Ramaiya, Andi Schmidt, Susanna Sharma, Brandon Kohrt

Abstract:

Nepali adolescents report tension and negative emotion due to perceived expectations of both academic and social achievement. These societal goals, which are internalized through early-life socialization, drive the development of self-regulatory processes such as emotion regulation. Emotion dysregulation is linked with adverse psychological outcomes such as depression, self-harm, and suicide, which are public health concerns for organizations working with Nepali adolescents. This study examined the relation among socialization, internalized cultural goals, and emotion regulation to inform interventions for reducing depression and suicide in this population. Participants included 102 students in grades 7 through 9 in a post-earthquake school setting in rural Kathmandu valley. All participants completed a tablet-based battery of quantitative measures, comprising transculturally adapted assessments of emotion regulation, depression, and self-harm/suicide ideation and behavior. Qualitative measures included two focus groups and semi-structured interviews with 22 students and 3 parents. A notable proportion of the sample reported depression symptoms in the past 2 weeks (68%), lifetime self-harm ideation (28%), and lifetime suicide attempts (13%). Students who lived with their nuclear family reported lower levels of difficulty than those who lived with more distant relatives (z=2.16, p=.03), which suggests a link between family environment and adolescent emotion regulation, potentially mediated by socialization and internalization of cultural goals. These findings call for further research into the aspects of nuclear versus extended family environments that shape the development of emotion regulation.

Keywords: adolescent mental health, emotion regulation, Nepal, socialization

Procedia PDF Downloads 244
2351 Hate Speech Detection Using Deep Learning and Machine Learning Models

Authors: Nabil Shawkat, Jamil Saquer

Abstract:

Social media has accelerated our ability to engage with others and eliminated many communication barriers. On the other hand, the widespread use of social media has resulted in an increase in online hate speech, which has drastic impacts on vulnerable individuals and societies. Therefore, it is critical to detect hate speech to prevent innocent users and vulnerable communities from becoming its victims. We investigate the performance of different deep learning and machine learning algorithms on three different datasets. Our results show that the BERT model gives the best performance among all the models, achieving an F1-score of 90.6% on one of the datasets and F1-scores of 89.7% and 88.2% on the other two datasets.
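
The F1-score reported above is the harmonic mean of precision and recall. The following is a minimal sketch of that computation for a binary task; the function name and the toy labels are hypothetical illustrations and are not taken from the study's datasets or code.

```python
def f1_score(y_true, y_pred, positive=1):
    """Compute the F1-score of one positive class from paired label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels (1 = hate speech, 0 = not hate speech).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
score = f1_score(y_true, y_pred)
```

Because F1 penalises both missed hate speech (false negatives) and wrongly flagged posts (false positives), it is a common choice when the classes are imbalanced.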

Keywords: hate speech, machine learning, deep learning, abusive words, social media, text classification

Procedia PDF Downloads 99
2350 DBN-Based Face Recognition System Using Light Field

Authors: Bing Gu

Abstract:

Most conventional facial recognition systems are based on image features such as LBP and SIFT. Recently, some DBN-based 2D facial recognition systems have been proposed. However, we find that there are few DBN-based 3D facial recognition systems and little related research. 3D facial images include all of an individual’s biometric information, which can be used to build more accurate features, so we present a DBN-based face recognition system using the light field. A light field can be seen as another representation of a 3D image, and a light field camera provides a way to capture it. We use a commercially available light field camera as the collector of our face recognition system, and the system achieves state-of-the-art performance while being as convenient as a conventional 2D face recognition system.

Keywords: DBN, face recognition, light field, Lytro

Procedia PDF Downloads 429
2349 Development of a Social Assistive Robot for Elderly Care

Authors: Edwin Foo, Woei Wen, Lui, Meijun Zhao, Shigeru Kuchii, Chin Sai Wong, Chung Sern Goh, Yi Hao He

Abstract:

This presentation describes the development of a social assistive robot for elderly care. We named the robot JOS, and he is restricted to tabletop operation. JOS is designed to have a maximum volume of 3600 cm³ with his base restricted to 250 mm, and his mission is to provide companionship and assistance to the elderly. To accomplish this mission, he will be equipped with perception, reaction, and cognition capabilities. His appearance will not be human-like but rather cute and approachable. JOS will also be designed to be gender-neutral. However, the robot will still have eyes, eyelids, and a mouth. His eyes and eyelids will be built entirely with Robotis Dynamixel AX18 motors. To realize this complex task, JOS will also be equipped with a microphone array, a vision camera, and an Intel i5 NUC computer, and will be powered by a self-charging 12 V lithium battery. His face is constructed using 1 motor each for the eyelids, 2 motors for the eyeballs, 3 motors for the neck mechanism, and 1 motor for the lip movement. The vision sensor will be housed on JOS’s forehead, and the microphone array will be placed below the mouth. For the vision system, Omron’s latest OKAO vision sensor is used. It is a compact and versatile sensor that is only 60 mm by 40 mm in size and operates with only a 5 V supply. In addition, the OKAO vision sensor is capable of identifying the user and recognizing the user’s expression. With these functions, JOS is able to track and identify the user. If he cannot recognize the user, JOS will ask whether the user would like to be remembered. If yes, JOS will store the user information together with the captured face image in a database. This will allow JOS to recognize the user the next time the user is with JOS. In addition, JOS is also able to interpret the mood of the user through the user’s facial expression. This will allow the robot to understand the user’s mood and behavior and react accordingly.
Machine learning will later be incorporated to learn the behavior of the user so as to better understand the user’s mood and requirements. For the speech system, the Microsoft speech and grammar engine is used for speech recognition. To use the speech engine, we need to build a speech grammar database that captures the words commonly used by the elderly. This database is built from research journals and literature on elderly speech and from interviews asking the elderly what they want the robot to assist them with. Using the results from the interviews and the journal research, we are able to derive a set of common words the elderly frequently use to request help. It is from this set that we build our grammar database. In situations where there is more than one person near JOS, he is able to identify the person who is talking to him through an in-house developed microphone array structure. To make the robot more interactive, we have also included the capability for the robot to express his emotions to the user through facial expressions by changing the position and movement of the eyelids and mouth. All robot emotions will be in response to the user’s mood and requests. Lastly, we expect to complete this phase of the project and test it with elderly people and delirium patients by Feb 2015.

Keywords: social robot, vision, elderly care, machine learning

Procedia PDF Downloads 413
2348 Speech Intelligibility Improvement Using Variable Level Decomposition DWT

Authors: Samba Raju, Chiluveru, Manoj Tripathy

Abstract:

Intelligibility is an essential characteristic of a speech signal, as it determines how well the information in the signal can be understood. Background noise in the environment can deteriorate the intelligibility of recorded speech. In this paper, we present a simple variance-subtracted, variable-level discrete wavelet transform, which improves the intelligibility of speech. The proposed algorithm does not require an explicit estimation of noise, i.e., prior knowledge of the noise; hence, it is easy to implement, and it reduces the computational burden. The proposed algorithm decides a separate decomposition level for each frame based on signal-dominant and noise-dominant criteria. The performance of the proposed algorithm is evaluated with the short-time objective intelligibility (STOI) measure, and the results obtained are compared with universal Discrete Wavelet Transform (DWT) thresholding and Minimum Mean Square Error (MMSE) methods. The experimental results revealed that the proposed scheme outperformed the competing methods.
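
To make the idea of a per-frame decomposition level concrete, here is a minimal sketch of wavelet denoising with a Haar DWT and soft thresholding. It illustrates the universal-threshold baseline mentioned above, not the authors' variance-subtracted method: the Haar wavelet, the function names, and the `noise_sigma` parameter are all assumptions made for illustration, and the frame length is assumed to be divisible by 2 to the power of `level`.

```python
import math


def haar_dwt(x):
    """One-level Haar DWT of an even-length sequence: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one level of the Haar DWT."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / s)
        x.append((a - d) / s)
    return x


def soft_threshold(coeffs, thr):
    """Shrink each coefficient toward zero by thr (soft thresholding)."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]


def denoise_frame(frame, level, noise_sigma):
    """Decompose a frame to `level`, threshold the details, and reconstruct.

    Uses the universal threshold sigma * sqrt(2 ln N). `noise_sigma` is an
    assumed input here; the paper's method avoids explicit noise estimation.
    """
    thr = noise_sigma * math.sqrt(2.0 * math.log(len(frame)))
    approx, details = list(frame), []
    for _ in range(level):
        approx, d = haar_dwt(approx)
        details.append(soft_threshold(d, thr))
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx


# Hypothetical 8-sample frame; a variable-level scheme would pick `level` per frame.
frame = [1.0, 2.0, 3.0, 5.0, 4.0, 2.0, 1.0, 0.0]
smoothed = denoise_frame(frame, level=2, noise_sigma=0.5)
```

A variable-level scheme would call `denoise_frame` with a different `level` for each frame, choosing a deeper decomposition when noise dominates the frame.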

Keywords: discrete wavelet transform, speech intelligibility, STOI, standard deviation

Procedia PDF Downloads 113
2347 A Psychophysiological Evaluation of an Affective Recognition Technique Using Interactive Dynamic Virtual Environments

Authors: Mohammadhossein Moghimi, Robert Stone, Pia Rotshtein

Abstract:

Recording psychological and physiological correlates of human performance within virtual environments and interpreting their impacts on human engagement, ‘immersion’ and related emotional or ‘affective’ states is both academically and technologically challenging. By exposing participants to an affective, real-time (game-like) virtual environment, designed and evaluated in an earlier study, a psychophysiological database containing the EEG, GSR and heart rate of 30 male and female gamers, exposed to 10 games, was constructed. Some 174 features were subsequently identified and extracted from a number of windows with 28 different lengths (e.g. 2, 3, 5, etc. seconds). After reducing the number of features to 30 using a feature selection technique, K-Nearest Neighbour (KNN) and Support Vector Machine (SVM) methods were employed for the classification process. The classifiers categorised the psychophysiological database into four affective clusters (defined based on a 3-dimensional space: valence, arousal and dominance) and eight emotion labels (relaxed, content, happy, excited, angry, afraid, sad, and bored). The KNN and SVM classifiers achieved average cross-validation accuracies of 97.01% (±1.3%) and 92.84% (±3.67%), respectively. However, no significant differences were found in the classification process based on affective clusters or emotion labels.
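
As an illustration of the classification step, the sketch below shows a plain K-Nearest Neighbour vote over feature vectors. The two-dimensional features, the labels, and the function name are hypothetical stand-ins; the study itself used 30 selected psychophysiological features, and this is not the authors' implementation.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical (valence, arousal) feature vectors with emotion labels.
train_x = [(0.9, 0.8), (0.8, 0.7), (0.7, 0.9),
           (0.6, -0.7), (0.7, -0.8), (0.5, -0.6)]
train_y = ["excited", "excited", "excited",
           "relaxed", "relaxed", "relaxed"]

label = knn_predict(train_x, train_y, (0.8, 0.9))
```

In a cross-validation setup such as the one reported above, this prediction would be repeated over held-out folds and the accuracies averaged.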

Keywords: virtual reality, affective computing, affective VR, emotion-based affective physiological database

Procedia PDF Downloads 203
2346 The Language Use of Middle Eastern Freedom Activists' Speeches: A Gender Perspective

Authors: Sulistyaningtyas

Abstract:

Examining the role of Middle Eastern freedom activists’ speech from a gender perspective is considered noteworthy because society in the Middle East is patriarchal. This research aims to examine the language use of Middle Eastern freedom activists’ speeches from a gender perspective. The data sources are speech videos of male and female Middle Eastern freedom activists. In analyzing the data, the theories employed concern Language Style from a Gender Perspective and The Language for Speech. The results reveal that there are sets of spoken language differences between male and female speakers. In using the language for speech, both male and female speakers produce metaphor, euphemism, the ‘rule of three’, parallelism, and pronouns with random frequency, which cannot be separated by gender. Moreover, it cannot be concluded that one gender has more potential than the other to influence the audience in delivering a speech. Other factors, particularly non-verbal ones, also affect how a speech can influence the audience.

Keywords: gender perspective, language use, Middle Eastern freedom activists, speech

Procedia PDF Downloads 396
2345 Considering Cultural and Linguistic Variables When Working as a Speech-Language Pathologist with Multicultural Students

Authors: Gabriela Smeckova

Abstract:

The entire world is becoming more and more diverse. The reasons why people migrate are different and unique for each family or individual. Professionals delivering services (including speech-language pathologists) must be prepared to work with clients coming from different cultural and/or linguistic backgrounds. Well-educated speech-language pathologists will consider many factors when delivering services. Some of them will be discussed during the presentation (language spoken, beliefs about health care and disabilities, reasons for immigration, etc.). The communication styles of the client can differ from those of the speech-language pathologist. The goal is to become culturally responsive in service delivery.

Keywords: culture, cultural competence, culturally responsive practices, speech-language pathologist, cultural and linguistical variables, communication styles

Procedia PDF Downloads 42
2344 An Investigation of the Association between Pathological Personality Dimensions and Emotion Dysregulation among Virtual Network Users: The Mediating Role of Cyberchondria Behaviors

Authors: Mehdi Destani, Asghar Heydari

Abstract:

Objective: The present study aimed to investigate the association between pathological personality dimensions and emotion dysregulation through the mediating role of Cyberchondria behaviors among users of virtual networks. Materials and methods: A descriptive–correlational research method was used in this study, and the statistical population consisted of all people active on social network sites in 2020. The sample consisted of 300 people selected through convenience sampling. Data were collected by survey using online questionnaires, including the "Difficulties in Emotion Regulation Scale" (DERS), the Personality Inventory for DSM-5 Brief Form (PID-5-BF), and the Cyberchondria Severity Scale Brief Form (CSS-12). Data analysis was conducted using Pearson's Correlation Coefficient and Structural Equation Modeling (SEM). Findings: Findings suggested that pathological personality dimensions and Cyberchondria behaviors have a positive and significant association with emotion dysregulation (p<0.001). The presented model had a good fit with the data. The variable “pathological personality dimensions”, with an overall effect (p<0.001, β=0.658), a direct effect (p<0.001, β=0.528), and an indirect mediating effect through Cyberchondria behaviors (p<0.001, β=0.130), accounted for emotion dysregulation among virtual network users. Conclusion: The research findings showed a necessity to pay attention to the pathological personality dimensions as a determining variable and Cyberchondria behaviors as a mediator in the vulnerability of users of social network sites to emotion dysregulation.
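
In a linear mediation model, the overall (total) effect is the sum of the direct effect and the indirect effect carried by the mediator. The short check below applies that decomposition to the coefficients reported above; the function name is a hypothetical helper, and the snippet only verifies the arithmetic, not the model fit.

```python
def total_effect(direct, indirect):
    """Total effect in a linear mediation model: direct plus indirect."""
    return direct + indirect


# Standardised coefficients as reported in the abstract.
direct_beta = 0.528    # personality dimensions -> emotion dysregulation
indirect_beta = 0.130  # path through Cyberchondria behaviors
overall_beta = total_effect(direct_beta, indirect_beta)
```

The sum agrees with the reported overall effect of 0.658, so the three coefficients are mutually consistent.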

Keywords: cyberchondria, emotion dysregulation, pathological personality dimensions, social networks

Procedia PDF Downloads 73
2343 Effect of Noise Reduction Algorithms on Temporal Splitting of Speech Signal to Improve Speech Perception for Binaural Hearing Aids

Authors: Rajani S. Pujar, Pandurangarao N. Kulkarni

Abstract:

Increased temporal masking affects speech perception in persons with sensorineural hearing impairment, especially under adverse listening conditions. This paper presents a cascaded scheme that employs a noise reduction algorithm as well as temporal splitting of the speech signal. Earlier investigations have shown that splitting the speech temporally and presenting alternate segments to the two ears helps in reducing the effect of temporal masking. In this technique, the speech signal is processed by two fading functions, complementary to each other, and presented to the left and right ears for binaural dichotic presentation. In the present study, a half-cosine signal is used as the fading function, with a crossover gain of 6 dB for the perceptual balance of loudness. Temporal splitting is combined with a noise reduction algorithm to improve speech perception in background noise. Two noise reduction schemes, namely spectral subtraction and the Wiener filter, are used. Listening tests were conducted on six normal-hearing subjects, with sensorineural loss simulated by adding broadband noise to the speech signal at different signal-to-noise ratios (∞, 3, 0, and -3 dB). Objective evaluation using PESQ was also carried out. The MOS scores for the VCV syllable /asha/ for SNR values of ∞, 3, 0, and -3 dB were 5, 4.46, 4.4 and 4.05, respectively, while the corresponding MOS scores for unprocessed speech were 5, 1.2, 0.9 and 0.65, indicating significant improvement in the perceived speech quality for the proposed scheme compared to the unprocessed speech.
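
The temporal splitting step can be sketched as two complementary gain functions applied to the same signal, one per ear. The raised-cosine shape, the `period` parameter, and the function names below are assumptions for illustration; the abstract specifies only a half-cosine fading function. With this particular choice each gain equals 0.5 at the crossover, about 6 dB below unity, which is one possible reading of the 6 dB crossover figure.

```python
import math


def fading_pair(n_samples, period):
    """Complementary raised-cosine gains for the left and right ears."""
    left = [0.5 * (1.0 + math.cos(2.0 * math.pi * n / period))
            for n in range(n_samples)]
    right = [1.0 - g for g in left]  # complementary: the two gains sum to 1
    return left, right


def split_binaural(signal, period):
    """Apply the complementary gains so alternate segments go to each ear."""
    left_gain, right_gain = fading_pair(len(signal), period)
    left = [s * g for s, g in zip(signal, left_gain)]
    right = [s * g for s, g in zip(signal, right_gain)]
    return left, right


# Hypothetical 16-sample signal; a real period would span many milliseconds.
signal = [math.sin(0.3 * n) for n in range(16)]
left_ear, right_ear = split_binaural(signal, period=8)
```

Because the gains sum to one, adding the two ear signals recovers the original speech, while each ear on its own hears alternating segments.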

Keywords: MOS, PESQ, spectral subtraction, temporal splitting, wiener filter

Procedia PDF Downloads 301
2342 Facial Expression Phoenix (FePh): An Annotated Sequenced Dataset for Facial and Emotion-Specified Expressions in Sign Language

Authors: Marie Alaghband, Niloofar Yousefi, Ivan Garibay

Abstract:

Facial expressions are important parts of both gesture and sign language recognition systems. Despite the recent advances in both fields, annotated facial expression datasets in the context of sign language are still scarce resources. In this manuscript, we introduce an annotated sequenced facial expression dataset in the context of sign language, comprising over 3000 facial images extracted from the daily news and weather forecast of the public tv-station PHOENIX. Unlike the majority of currently existing facial expression datasets, FePh provides sequenced semi-blurry facial images with different head poses, orientations, and movements. In addition, in the majority of images, identities are mouthing the words, which makes the data more challenging. To annotate this dataset we consider primary, secondary, and tertiary dyads of seven basic emotions of "sad", "surprise", "fear", "angry", "neutral", "disgust", and "happy". We also considered the "None" class if the image’s facial expression could not be described by any of the aforementioned emotions. Although we provide FePh as a facial expression dataset of signers in sign language, it has a wider application in gesture recognition and Human Computer Interaction (HCI) systems.

Keywords: annotated facial expression dataset, gesture recognition, sequenced facial expression dataset, sign language recognition

Procedia PDF Downloads 128
2341 Efficacy of a Wiener Filter Based Technique for Speech Enhancement in Hearing Aids

Authors: Ajish K. Abraham

Abstract:

The hearing aid is the most fundamental technology employed in the rehabilitation of persons with sensorineural hearing impairment. Hearing in noise remains a major concern for many hearing aid users and thus continues to be a challenge for hearing aid designers. Several techniques are currently used to enhance the speech at the hearing aid output. Most of these techniques, when implemented, reduce the intelligibility of the speech signal. Thus, hearing aid users remain dissatisfied with their ability to comprehend the desired speech amidst noise. The multichannel Wiener filter is widely implemented in binaural hearing aid technology for noise reduction. In this study, a Wiener filter based noise reduction approach is tested on a single-microphone hearing aid setup. This method checks the status of the input speech signal in each frequency band and then selects the relevant noise reduction procedure. Results showed that the Wiener filter based algorithm is capable of enhancing speech even when the input acoustic signal has a very low Signal to Noise Ratio (SNR). The performance of the algorithm was compared with other similar algorithms on the basis of improvement in intelligibility and SNR of the output, at different SNR levels of the input speech. The Wiener filter based algorithm provided significant improvement in SNR and intelligibility compared to the other techniques.
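
The per-band behaviour of a Wiener filter can be sketched with the textbook gain rule, gain = SNR / (1 + SNR), where SNR is the power ratio of speech to noise in that band. The band values and function names below are hypothetical, and the sketch does not reproduce the band-status check or procedure selection described in the study.

```python
def wiener_gain(snr_db):
    """Textbook Wiener gain for one band, given its SNR in decibels."""
    snr = 10.0 ** (snr_db / 10.0)  # convert dB to a power ratio
    return snr / (1.0 + snr)


def enhance_bands(band_amplitudes, band_snrs_db):
    """Scale each frequency band by its Wiener gain."""
    return [a * wiener_gain(s) for a, s in zip(band_amplitudes, band_snrs_db)]


# Hypothetical band amplitudes and estimated per-band SNRs (in dB).
amplitudes = [1.0, 1.0, 1.0]
snrs_db = [30.0, 0.0, -30.0]
enhanced = enhance_bands(amplitudes, snrs_db)
```

Bands dominated by speech pass almost unchanged, while bands dominated by noise are strongly attenuated.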

Keywords: hearing aid output speech, noise reduction, SNR improvement, Wiener filter, speech enhancement

Procedia PDF Downloads 224
2340 The Complaint Speech Act Set Produced by Arab Students in the UAE

Authors: Tanju Deveci

Abstract:

It appears that the speech act of complaint has not received as much attention as other speech acts. However, the face-threatening nature of this speech act requires special attention, particularly in multicultural contexts. The teaching context in UAE universities, where a large majority of teaching staff come from other cultures, requires investigations into this speech act in order to improve communication between students and faculty. This session will outline the results of a study conducted with this purpose. The realization of complaints by Freshman English students in Communication courses at the Petroleum Institute was investigated to identify communication patterns that seem to cause strain. Data were collected using a role-play between a teacher and students, and a judgment scale completed by two of the instructors in the Communications Department. The initial findings reveal that the students had difficulty putting their case, that they produced the speech act of criticism along with a complaint, and that they produced both requests and demands as candidate solutions. The judgment scales revealed that the students’ attitude was not appropriate most of the time and that the judges would behave differently from the students. It is concluded that speech acts in general, and complaints in particular, need to be taught to learners explicitly to improve interpersonal communication in multicultural societies. Some teaching ideas are provided to help increase foreign language learners’ sociolinguistic competence.

Keywords: speech act, complaint, pragmatics, sociolinguistics, language teaching

Procedia PDF Downloads 479
2339 Emotional Processing Difficulties in Recovered Anorexia Nervosa Patients: State or Trait

Authors: Telma Fontao de Castro, Kylee Miller, Maria Xavier Araújo, Isabel Brandao, Sandra Torres

Abstract:

Objective: There is a dearth of research investigating the long-term emotional functioning of individuals recovered from anorexia nervosa (AN). This 15-year longitudinal study aimed to examine whether difficulties in the cognitive processing of emotions persisted after long-term AN recovery and how they are linked to anxiety and depression. Method: Twenty-four females, who were tested longitudinally during their acute and recovered AN phases, and 24 healthy control (HC) women were screened for anxiety, depression, alexithymia, and emotion regulation (ER) difficulties (only assessed in the recovery phase). Results: Anxiety, depression, and alexithymia levels decreased significantly with AN recovery. However, scores on anxiety and difficulty in identifying feelings (an alexithymia factor) remained high when compared to the HC group. Scores on emotion regulation difficulties were also lower in the HC group. The abovementioned differences between the AN recovered group and the HC group in difficulties in identifying and accepting feelings and lack of emotional clarity were no longer present when the effect of anxiety and depression was controlled. Conclusions: Findings suggest that emotional dysfunction tends to decrease in the AN recovered phase. However, using an HC group as a reference, we conclude that several emotional difficulties remain elevated after long-term AN recovery, in particular, limited access to emotion regulation strategies and difficulty controlling impulses and engaging in goal-directed behavior, which suggests that they are trait vulnerabilities. In turn, competencies related to emotional clarity and acceptance of emotional responses seem to be state-dependent phenomena linked to anxiety and depression. In sum, managing emotions remains a challenge for individuals recovered from AN. Under this circumstance, maladaptive eating behavior can serve an affect-regulatory function, increasing the risk of relapse. Emotional education and stabilization of depressive and anxious symptomatology after recovery emerge as an important avenue to protect against long-term AN relapse.

Keywords: alexithymia, anorexia nervosa, emotion recognition, emotion regulation

Procedia PDF Downloads 93
2338 Evaluation of Features Extraction Algorithms for a Real-Time Isolated Word Recognition System

Authors: Tomyslav Sledevič, Artūras Serackis, Gintautas Tamulevičius, Dalius Navakauskas

Abstract:

This paper presents a comparative evaluation of feature extraction algorithms for a real-time isolated word recognition system based on an FPGA. Mel-frequency cepstral, linear frequency cepstral, and linear predictive coefficients, together with linear predictive cepstral coefficients, were implemented in a hardware/software design. The proposed system was investigated in speaker-dependent mode for 100 different Lithuanian words. The robustness of the feature extraction algorithms was tested by recognizing speech records at different signal-to-noise ratios. Experiments on clean records show the highest accuracy for Mel-frequency cepstral and linear frequency cepstral coefficients. For records with a 15 dB signal-to-noise ratio, the linear predictive cepstral coefficients give the best result. The hard and soft parts of the system are clocked at 50 MHz and 100 MHz, respectively. For classification, a pipelined dynamic time warping core was implemented. The proposed word recognition system satisfies the real-time requirements and is suitable for applications in embedded systems.
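
Dynamic time warping (DTW), the classifier named above, aligns two sequences of different lengths and returns the cost of the best alignment. The following is a minimal software sketch, assuming scalar features and an absolute-difference local cost; the paper's pipelined FPGA core works on feature vectors such as MFCC frames, and the template names here are hypothetical.

```python
def dtw_distance(a, b):
    """Cost of the best monotonic alignment between sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning the first i items of a with the first j of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances, b repeats
                                     cost[i][j - 1],      # b advances, a repeats
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def recognize(utterance, templates):
    """Return the template word whose sequence is closest under DTW."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))


# Hypothetical one-dimensional feature templates for two words.
templates = {"word_a": [1.0, 3.0, 2.0], "word_b": [5.0, 5.0, 1.0]}
best = recognize([1.0, 3.0, 3.0, 2.0], templates)
```

Because DTW tolerates stretched or compressed timing, the same word spoken more slowly still matches its template closely, which suits speaker-dependent isolated word recognition.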

Keywords: isolated word recognition, features extraction, MFCC, LFCC, LPCC, LPC, FPGA, DTW

Procedia PDF Downloads 465
2337 The Emotional Experience of Urban Ruins and the Exploration of Urban Memory

Authors: Yan Jia China

Abstract:

Ruins are a kind of historical trace and also part of the present reality of the developing city. The Zen culture of ancient China carries a profound aesthetic emotion; similarly, the West established an aesthetics of ruins along with Romanticism’s (such as Rousseau’s) sentiment toward historical ruins at the end of the 18th century. Nowadays, with the decline of traditional industrial society and the rise of the post-industrial age, contemporary society must face the problem of the ruins and waste left by industrial society. Starting from the perspective of emotion and memory, this paper analyzes the importance of several projects for emotional needs, as well as their current status: the Capital Steelworks in Beijing (industrial devastation), the Shibati old section in Chongqing (urban slums), and the Old Hurva Synagogue in Jerusalem (ruins of war). It emphasizes urban design that starts from emotion, and the sustainable development of city memory through managing the urban ruins that people criticize, from the perspectives of ecology and art.

Keywords: cultural heritage, urban ruins, ecology, emotion, sustainable urban memory

Procedia PDF Downloads 410
2336 Changes in EEG and Emotion Regulation in the Course of Inward-Attention Meditation Training

Authors: Yuchien Lin

Abstract:

This study investigated the changes in electroencephalography (EEG) and emotion regulation following an eight-week inward-attention meditation training program. The subjects were 24 adults without meditation experience, divided into meditation and control groups. We quantitatively analyzed changes in psychophysiological parameters during inward-attention meditation and evaluated the emotion scores assessed by the State-Trait Anxiety Inventory (STAI), the Positive and Negative Affect Schedule (PANAS), and the Emotion Regulation Scale (ERS). The results were as follows: (1) During meditation, EEG activity increased significantly in the theta band over the frontal and bilateral temporal areas, in the alpha band over the left and central frontal areas, and in the gamma band over the left frontal and left temporal areas. (2) The meditation group had significantly higher positive affect in the posttest than in the pretest. (3) For the control group, there was no significant difference between posttest and pretest in EEG spectral characteristics or emotion scores. In the present study, a unique meditative concentration task with a constant level of moderate mental effort, focusing on the center of the brain, was used to enhance frontal midline theta, alpha, and gamma-band activity. These results suggest that this mental training allows individuals to reach a specific mental state of relaxed but focused awareness. In particular, the enhanced gamma-band activity over the left frontoparietal area may suggest that inward-attention meditation training involves temporal integrative mechanisms and may induce short-term and long-term emotion regulation abilities.

Keywords: meditation, EEG, emotion regulation, gamma activity

Procedia PDF Downloads 185
2335 Generating Music with More Refined Emotions

Authors: Shao-Di Feng, Von-Wun Soo

Abstract:

Generating symbolic music with specific emotions is a challenging task because symbolic music datasets with emotion labels are scarce and incomplete. This research aims to generate more refined emotions from training datasets that are labeled only with the four quadrants of Russell’s 2D emotion model. We build on the theory of Music Fadernet, map arousal and valence to low-level attributes, and construct a symbolic music generation model by combining a transformer with GM-VAE. We adopt an in-attention mechanism for the model and improve it by allowing modulation by conditional information. We show that the model can control the generation of music according to emotions specified by users in terms of high-level linguistic expressions and by manipulating the corresponding low-level musical attributes. Finally, we evaluate the model's performance using a pre-trained emotion classifier on a pop piano MIDI dataset called EMOPIA, and through subjective listening evaluation we demonstrate that the model can correctly generate music with more refined emotions.

Keywords: music generation, music emotion controlling, deep learning, semi-supervised learning

Procedia PDF Downloads 54
2334 Face Tracking and Recognition Using Deep Learning Approach

Authors: Degale Desta, Cheng Jian

Abstract:

The most important factor in identifying a person is their face. Even identical twins have their own distinct faces. As a result, identification and face recognition are needed to tell one person from another. A face recognition system is a verification tool used to establish a person's identity using biometrics. Nowadays, face recognition is a common technique used in a variety of applications, including home security systems, criminal identification, and phone unlock systems. This system is more secure because it only requires a facial image instead of other dependencies like a key or card. Face detection and face identification are the two phases that typically make up a human recognition system. This paper explains the idea behind designing and creating a face recognition system using deep learning with Azure ML and OpenCV in Python. Face recognition is a task that can be accomplished using deep learning, and given the accuracy of this method, it appears to be a suitable approach. To show how accurate the suggested face recognition system is, experimental results are given: the system reaches 98.46% accuracy using Fast-RCNN. The performance of the algorithms under different training conditions is also reported.

Keywords: deep learning, face recognition, identification, fast-RCNN

Procedia PDF Downloads 89
2333 On Overcoming Common Oral Speech Problems through Authentic Films

Authors: Tamara Matevosyan

Abstract:

The present paper discusses the main problems that students face while developing oral skills through authentic films. It states that special attention should be paid not only to the study of verbal speech but also to non-verbal communication. Authentic films serve as an important tool for understanding both native speakers’ gestures and their culture of pausing while speaking. The paper covers various phonetic difficulties that cause phonetic interference in actual speech and emphasizes the role of authentic films in overcoming them.

Keywords: compressive speech, filled pauses, unfilled pauses, pausing culture

Procedia PDF Downloads 314
2332 ViraPart: A Text Refinement Framework for Automatic Speech Recognition and Natural Language Processing Tasks in Persian

Authors: Narges Farokhshad, Milad Molazadeh, Saman Jamalabbasi, Hamed Babaei Giglou, Saeed Bibak

Abstract:

The Persian language is an inflectional subject-object-verb language. This fact makes Persian a more uncertain language. However, using techniques such as Zero-Width Non-Joiner (ZWNJ) recognition, punctuation restoration, and Persian Ezafe construction leads to more understandable and precise language. In most work on Persian, these techniques are addressed individually. Nevertheless, we believe that all of these tasks are necessary for text refinement in Persian. In this work, we proposed the ViraPart framework, which uses embedded ParsBERT at its core for text clarification. First, we used the BERT variant for Persian followed by a classifier layer for the classification procedures. Next, we combined the models' outputs to produce clear text. In the end, the proposed models for ZWNJ recognition, punctuation restoration, and Persian Ezafe construction achieve averaged macro F1 scores of 96.90%, 92.13%, and 98.50%, respectively. Experimental results show that our proposed approach is very effective in text refinement for the Persian language.

Keywords: Persian Ezafe, punctuation, ZWNJ, NLP, ParsBERT, transformers

Procedia PDF Downloads 168
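
For readers unfamiliar with the macro F1 metric reported in the abstract above, the sketch below shows the standard definition: per-class F1 averaged with equal weight per class. The function name `macro_f1` is hypothetical, and how the paper averaged its scores across runs or datasets is not stated in the abstract, so this is only the textbook formula.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all observed labels."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Because every class counts equally, macro F1 penalizes poor performance on rare labels (for example, an infrequent punctuation mark) more than accuracy does.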
2331 Morpheme Based Parts of Speech Tagger for Kannada Language

Authors: M. C. Padma, R. J. Prathibha

Abstract:

Parts of speech tagging is the process of assigning appropriate parts of speech tags to the words in a given text. The critical information needed for tagging a word comes from its internal structure rather than from its neighboring words. The internal structure of a word comprises its morphological features and grammatical information. This paper presents a morpheme-based parts of speech tagger for the Kannada language. The proposed work uses a hierarchical tag set for assigning tags. The system is tested on some Kannada words taken from the EMILLE corpus. Experimental results show that the performance of the proposed system is above 90%.

Keywords: hierarchical tag set, morphological analyzer, natural language processing, paradigms, parts of speech

Procedia PDF Downloads 261
2330 Interventions for Children with Autism Using Interactive Technologies

Authors: Maria Hopkins, Sarah Koch, Fred Biasini

Abstract:

Autism is a lifelong disorder that affects one out of every 110 Americans. The deficits that accompany Autism Spectrum Disorders (ASD), such as abnormal behaviors and social incompetence, often make it extremely difficult for these individuals to gain functional independence from caregivers. These long-term implications necessitate an immediate effort to improve social skills among children with an ASD. Any technology that could teach individuals with ASD necessary social skills would not only be invaluable for the individuals affected, but could also effect a massive saving to society in treatment programs. The overall purpose of the first study was to develop, implement, and evaluate an avatar tutor for social skills training in children with ASD. “FaceSay” was developed as a colorful computer program that contains several different activities designed to teach children specific social skills, such as eye gaze, joint attention, and facial recognition. The children with ASD were asked to attend to FaceSay or a control painting computer game for six weeks. Children with ASD who received the training had an increase in emotion recognition, F(1, 48) = 23.04, p < 0.001 (adjusted Ms 8.70 and 6.79, respectively) compared to the control group. In addition, children who received the FaceSay training had higher post-test scores in facial recognition, F(1, 48) = 5.09, p < 0.05 (adjusted Ms: 38.11 and 33.37, respectively) compared to controls. The findings provide information about the benefits of computer-based training for children with ASD. Recent research suggests the value of also using socially assistive robots with children who have an ASD. Researchers investigating robots as tools for therapy in ASD have reported increased engagement, increased levels of attention, and novel social behaviors when robots are part of the social interaction.
The overall goal of the second study was to develop a social robot designed to teach children specific social skills such as emotion recognition. The robot is approachable, with both an animal-like appearance and features of a human face (i.e., eyes, eyebrows, mouth). The feasibility of the robot is being investigated in children ages 7-12 to explore whether the social robot is capable of forming different facial expressions to accurately display emotions similar to those observed in the human face. The findings of this study will be used to create a potentially effective and cost efficient therapy for improving the cognitive-emotional skills of children with autism. Implications and study findings using the robot as an intervention tool will be discussed.

Keywords: autism, intervention, technology, emotions

Procedia PDF Downloads 346
2329 The Convolution Recurrent Network of Using Residual LSTM to Process the Output of the Downsampling for Monaural Speech Enhancement

Authors: Shibo Wei, Ting Jiang

Abstract:

Convolutional-recurrent neural networks (CRN) have recently achieved much success in the speech enhancement field. The common approach is to use convolution layers to compress the feature space through repeated downsampling and then model the compressed features with an LSTM layer. Finally, the enhanced speech is obtained by deconvolution operations that integrate the global information of the speech sequence. However, compressing the feature space may cause a loss of information, so we propose to model the downsampling result of each step with a residual LSTM layer, join it with the output of the deconvolution layer, and feed both to the next deconvolution layer. In this way, we aim to better integrate the global information of the speech sequence. The experimental results show that the network model we introduce (RES-CRN) achieves better performance than both an LSTM without residual connections and simply stacked LSTMs in the original CRN in terms of scale-invariant signal-to-noise ratio (SI-SNR), speech quality (PESQ), and intelligibility (STOI).

Keywords: convolutional-recurrent neural networks, speech enhancement, residual LSTM, SI-SNR

Procedia PDF Downloads 168
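
The SI-SNR metric named in the abstract above has a compact definition that may help when comparing enhancement results. The sketch below follows the commonly used formulation (project the estimate onto the reference, then compare target and residual energy). The zero-mean normalization, the `eps` stabilizer, and the function name `si_snr` are assumptions for illustration; the paper's own evaluation code is not given in the abstract.

```python
import math


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant signal-to-noise ratio in dB for two equal-length signals."""
    n = len(reference)
    # Remove the mean from both signals (a common, but assumed, preprocessing step).
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]

    # Project the estimate onto the reference to obtain the scaled target component.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    scale = dot / ref_energy
    target = [scale * r for r in ref]

    # Everything in the estimate that is not explained by the reference is noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(x * x for x in noise) + eps
    return 10 * math.log10(target_energy / noise_energy + eps)
```

Because the target is obtained by projection, rescaling the estimated waveform leaves the score essentially unchanged, which is the property that distinguishes this metric from a plain SNR.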
2328 Detection of Clipped Fragments in Speech Signals

Authors: Sergei Aleinik, Yuri Matveev

Abstract:

In this paper a novel method for the detection of clipping in speech signals is described. It is shown that the new method has better performance than known clipping detection methods, is easy to implement, and is robust to changes in signal amplitude, size of data, etc. Statistical simulation results are presented.

Keywords: clipping, clipped signal, speech signal processing, digital signal processing

Procedia PDF Downloads 359
2327 Direct and Indirect Effects of Childhood Traumas, Emotion Regulation Difficulties and Age on Tendency to Violence

Authors: Selin Kara-Bahçekapılı, Bengisu Nehir Aydın

Abstract:

Objective: This study aims to examine the relationship between childhood traumas (overprotection-control, emotional/physical/sexual abuse, emotional/physical neglect), age, emotion regulation difficulties, and the tendency to violence in adults. The direct and indirect effects of 6 sub-factors of childhood traumas, emotion regulation difficulties, and age on the tendency to violence are evaluated using a theoretically derived model. Method: The population of this cross-sectional study consists of individuals between the ages of 18 and 65 living in Turkey. Data from 527 participants were obtained through online surveys using convenience sampling. After applying the exclusion criteria and an outlier analysis, the data of 443 participants were included in the analysis. Data were collected with a demographic information form, a childhood trauma scale, an emotion regulation difficulty scale, and a violence tendency scale. Research data were analyzed with SPSS and AMOS using correlation, path analysis, and direct and indirect effects. Results: According to the research findings, the variables in the model explained 28.2% of the variance of the mean scores of the individuals' tendency to violence. Emotion regulation difficulties have the largest direct effect on the tendency to violence (d=.387; p<.01). The effects of the excessive protection and control, emotional neglect, and physical neglect variables on the tendency to violence are not significant. When the significant indirect effects of the variables on the tendency to violence through emotion regulation difficulties are examined, age has a negative effect, emotional neglect has a positive effect, emotional abuse has a positive effect, and overprotection-control has a positive effect. The indirect effects of sexual abuse, physical neglect, and physical abuse on the tendency to violence are not significant.
Childhood traumas and age variables in the model explained 24.1% of the variance of the mean scores of the individuals’ emotion regulation difficulties. The variable that most affects emotion regulation difficulties is age (d=-.268; p<.001). The direct effects of sexual abuse, physical neglect, and physical abuse on emotion regulation difficulties are not significant. Conclusion: The results of the research emphasize the critical role of difficulty in emotion regulation on the tendency to violence. Difficulty in emotion regulation affects the tendency to violence both directly and by mediating different variables. In addition, it is seen that some sub-factors of childhood traumas have direct and/or indirect effects on the tendency to violence. Emotional abuse and age have both direct and indirect effects on the tendency to violence over emotion regulation difficulties.

Keywords: childhood trauma, emotion regulation difficulties, tendency to violence, path analysis

Procedia PDF Downloads 60
2326 Detection of Phoneme [S] Mispronunciation for Sigmatism Diagnosis in Adults

Authors: Michal Krecichwost, Zauzanna Miodonska, Pawel Badura

Abstract:

The diagnosis of sigmatism is mostly based on the observation of articulatory organs. It is, however, not always possible to precisely observe the vocal apparatus, in particular in the oral cavity of the patient. Speech processing can make the therapy more objective and simplify the verification of its progress. In the described study, a methodology for the classification of the incorrectly pronounced phoneme [s] is proposed. The recordings come from adults. They were registered with a speech recorder at a sampling rate of 44.1 kHz and a resolution of 16 bits. A database of pathological and normative speech has been collected for the study, including reference assessments provided by speech therapy experts. Ten adult subjects were asked to simulate a certain type of sigmatism under the supervision of a speech therapy expert. In the recordings, the analyzed phone [s] was surrounded by vowels, viz: ASA, ESE, ISI, SPA, USU, YSY. Thirteen MFCC (mel-frequency cepstral coefficients) and RMS (root mean square) values are calculated within each frame that is part of the analyzed phoneme. Additionally, 3 fricative formants along with their corresponding amplitudes are determined for the entire segment. To aggregate the information within the segment, the average value of each MFCC coefficient is calculated. All features of other types are aggregated by means of their 75th percentile. The proposed method of feature aggregation reduces the size of the feature vector used in the classification. A binary SVM (support vector machine) classifier is employed at the phoneme recognition stage. One class consists of pathological phones and the other of normative ones. The proposed feature vector yields classification sensitivity and specificity above the 90% level in the case of individual logo phones. Adding fricative-formant-based information improves the MFCC-only classification results by an average of 5 percentage points.
The study shows that the employment of specific parameters for the selected phones improves the efficiency of pathology detection compared with traditional methods of speech signal parameterization.

Keywords: computer-aided pronunciation evaluation, sibilants, sigmatism diagnosis, speech processing

Procedia PDF Downloads 254
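
The feature aggregation described in the abstract above (mean of each MFCC coefficient across frames, 75th percentile for other frame-level features) can be sketched as follows. This is an illustration under stated assumptions: the function names `percentile` and `aggregate_segment` are hypothetical, the percentile uses linear interpolation (the abstract does not say which definition was used), and the segment-level formant features are simply appended as given.

```python
from statistics import mean


def percentile(values, q):
    """Percentile with linear interpolation between order statistics (q in 0..100)."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def aggregate_segment(mfcc_frames, rms_frames, segment_features=()):
    """Collapse frame-level features of one phoneme segment into a fixed-size vector."""
    # One mean per MFCC coefficient, computed column-wise over all frames.
    mfcc_means = [mean(column) for column in zip(*mfcc_frames)]
    # Non-MFCC frame-level features (here RMS) are summarized by their 75th percentile.
    rms_p75 = percentile(rms_frames, 75)
    # Segment-level features (e.g. formant frequencies and amplitudes) pass through unchanged.
    return mfcc_means + [rms_p75] + list(segment_features)
```

The resulting vector has the same length regardless of how many frames the segment contains, which is what allows it to be fed to a fixed-input classifier such as an SVM.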
2325 A Contribution to Human Activities Recognition Using Expert System Techniques

Authors: Malika Yaici, Soraya Aloui, Sara Semchaoui

Abstract:

This paper deals with human activity recognition from sensor data. It is an active research area, and the main objective is to obtain a high recognition rate. In this work, a recognition system based on expert systems is proposed; the recognition is performed using the objects, object states, and gestures and taking into account the context (the location of the objects and of the person performing the activity, the duration of the elementary actions and the activity). The system recognizes complex activities after decomposing them into simple, easy-to-recognize activities. The proposed method can be applied to any type of activity. The simulation results show the robustness of our system and its speed of decision.

Keywords: human activity recognition, ubiquitous computing, context-awareness, expert system

Procedia PDF Downloads 63