Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2479

Search results for: text emotion recognition

2479 OCR/ICR Text Recognition Using ABBYY FineReader as an Example Text

Authors: A. R. Bagirzade, A. Sh. Najafova, S. M. Yessirkepova, E. S. Albert

Abstract:

This article describes a text recognition method based on Optical Character Recognition (OCR). The features of the OCR method were examined using the ABBYY FineReader program. It describes automatic text recognition in images. OCR is necessary because optical input devices can only transmit raster graphics as a result. Text recognition describes the task of recognizing letters shown as such, to identify and assign them an assigned numerical value in accordance with the usual text encoding (ASCII, Unicode). The peculiarity of this study conducted by the authors using the example of the ABBYY FineReader, was confirmed and shown in practice, the improvement of digital text recognition platforms developed by Electronic Publication.

Keywords: ABBYY FineReader system, algorithm symbol recognition, OCR/ICR techniques, recognition technologies

Procedia PDF Downloads 75
2478 Comparing Emotion Recognition from Voice and Facial Data Using Time Invariant Features

Authors: Vesna Kirandziska, Nevena Ackovska, Ana Madevska Bogdanova

Abstract:

The problem of emotion recognition is a challenging problem. It is still an open problem from the aspect of both intelligent systems and psychology. In this paper, both voice features and facial features are used for building an emotion recognition system. A Support Vector Machine classifiers are built by using raw data from video recordings. In this paper, the results obtained for the emotion recognition are given, and a discussion about the validity and the expressiveness of different emotions is presented. A comparison between the classifiers build from facial data only, voice data only and from the combination of both data is made here. The need for a better combination of the information from facial expression and voice data is argued.

Keywords: emotion recognition, facial recognition, signal processing, machine learning

Procedia PDF Downloads 242
2477 Small Text Extraction From Documents and Chart Images

Authors: Rominkumar Busa, Shahira K. C., Lijiya A.

Abstract:

Text recognition is an important area in computer vision which deals with detecting and recognising text from an image. The Optical Character Recognition (OCR) is a saturated area these days and with very good text recognition accuracy. However the same OCR methods when applied on text with small font sizes like the text data of chart images, the recognition rate is less than 30\%. In this work, aims to extract small text in images using the deep learning model, CRNN with CTC loss. The text recognition accuracy is found to improve by applying image enhancement by super resolution prior to CRNN model. We also observe the text recognition rate further increases by 18\% by applying the proposed method, which involves super resolution and character segmentation followed by CRNN with CTC loss. The efficiency of the proposed method shows that further pre-processing on chart image text and other small text images will improve the accuracy further, thereby helping text extraction from chart images.

Keywords: small text extraction, OCR, scene text recognition, CRNN

Procedia PDF Downloads 22
2476 Composite Kernels for Public Emotion Recognition from Twitter

Authors: Chien-Hung Chen, Yan-Chun Hsing, Yung-Chun Chang

Abstract:

The Internet has grown into a powerful medium for information dispersion and social interaction that leads to a rapid growth of social media which allows users to easily post their emotions and perspectives regarding certain topics online. Our research aims at using natural language processing and text mining techniques to explore the public emotions expressed on Twitter by analyzing the sentiment behind tweets. In this paper, we propose a composite kernel method that integrates tree kernel with the linear kernel to simultaneously exploit both the tree representation and the distributed emotion keyword representation to analyze the syntactic and content information in tweets. The experiment results demonstrate that our method can effectively detect public emotion of tweets while outperforming the other compared methods.

Keywords: emotion recognition, natural language processing, composite kernel, sentiment analysis, text mining

Procedia PDF Downloads 97
2475 Emotion Recognition in Video and Images in the Wild

Authors: Faizan Tariq, Moayid Ali Zaidi

Abstract:

Facial emotion recognition algorithms are expanding rapidly now a days. People are using different algorithms with different combinations to generate best results. There are six basic emotions which are being studied in this area. Author tried to recognize the facial expressions using object detector algorithms instead of traditional algorithms. So, two object detection have been choosen for algorithms which are Faster R-CNN and YOLO. For pre-processing we used image rotation and batch normalization. The dataset, we choosed, the experiments that is Static Facial Expression in Wild (SFEW). Our approach worked well but there is still a lot of room to improve it, which will be a future direction.

Keywords: face recognition, emotion recognition, deep learning, CNN

Procedia PDF Downloads 21
2474 Text Emotion Recognition by Multi-Head Attention based Bidirectional LSTM Utilizing Multi-Level Classification

Authors: Vishwanath Pethri Kamath, Jayantha Gowda Sarapanahalli, Vishal Mishra, Siddhesh Balwant Bandgar

Abstract:

Recognition of emotional information is essential in any form of communication. Growing HCI (Human-Computer Interaction) in recent times indicates the importance of understanding of emotions expressed and becomes crucial for improving the system or the interaction itself. In this research work, textual data for emotion recognition is used. The text being the least expressive amongst the multimodal resources poses various challenges such as contextual information and also sequential nature of the language construction. In this research work, the proposal is made for a neural architecture to resolve not less than 8 emotions from textual data sources derived from multiple datasets using google pre-trained word2vec word embeddings and a Multi-head attention-based bidirectional LSTM model with a one-vs-all Multi-Level Classification. The emotions targeted in this research are Anger, Disgust, Fear, Guilt, Joy, Sadness, Shame, and Surprise. Textual data from multiple datasets were used for this research work such as ISEAR, Go Emotions, Affect datasets for creating the emotions’ dataset. Data samples overlap or conflicts were considered with careful preprocessing. Our results show a significant improvement with the modeling architecture and as good as 10 points improvement in recognizing some emotions.

Keywords: text emotion recognition, bidirectional LSTM, multi-head attention, multi-level classification, google word2vec word embeddings

Procedia PDF Downloads 21
2473 Exploratory Analysis of A Review of Nonexistence Polarity in Native Speech

Authors: Deawan Rakin Ahamed Remal, Sinthia Chowdhury, Sharun Akter Khushbu, Sheak Rashed Haider Noori

Abstract:

Native Speech to text synthesis has its own leverage for the purpose of mankind. The extensive nature of art to speaking different accents is common but the purpose of communication between two different accent types of people is quite difficult. This problem will be motivated by the extraction of the wrong perception of language meaning. Thus, many existing automatic speech recognition has been placed to detect text. Overall study of this paper mentions a review of NSTTR (Native Speech Text to Text Recognition) synthesis compared with Text to Text recognition. Review has exposed many text to text recognition systems that are at a very early stage to comply with the system by native speech recognition. Many discussions started about the progression of chatbots, linguistic theory another is rule based approach. In the Recent years Deep learning is an overwhelming chapter for text to text learning to detect language nature. To the best of our knowledge, In the sub continent a huge number of people speak in Bangla language but they have different accents in different regions therefore study has been elaborate contradictory discussion achievement of existing works and findings of future needs in Bangla language acoustic accent.

Keywords: TTR, NSTTR, text to text recognition, deep learning, natural language processing

Procedia PDF Downloads 28
2472 Characterising the Processes Underlying Emotion Recognition Deficits in Adolescents with Conduct Disorder

Authors: Nayra Martin-Key, Erich Graf, Wendy Adams, Graeme Fairchild

Abstract:

Children and adolescents with Conduct Disorder (CD) have been shown to demonstrate impairments in emotion recognition, but it is currently unclear whether this deficit is related to specific emotions or whether it represents a global deficit in emotion recognition. An emotion recognition task with concurrent eye-tracking was employed to further explore this relationship in a sample of male and female adolescents with CD. Participants made emotion categorization judgements for presented dynamic and morphed static facial expressions. The results demonstrated that males with CD, and to a lesser extent, females with CD, displayed impaired facial expression recognition in general, whereas callous-unemotional (CU) traits were linked to specific problems in sadness recognition in females with CD. A region-of-interest analysis of the eye-tracking data indicated that males with CD exhibited reduced fixation times for the eye-region of the face compared to typically-developing (TD) females, but not TD males. Females with CD did not show reduced fixation to the eye-region of the face relative to TD females. In addition, CU traits did not influence CD subjects’ attention to the eye-region of the face. These findings suggest that the emotion recognition deficits found in CD males, the worst performing group in the behavioural tasks, are partly driven by reduced attention to the eyes.

Keywords: attention, callous-unemotional traits, conduct disorder, emotion recognition, eye-region, eye-tracking, sex differences

Procedia PDF Downloads 210
2471 Distorted Document Images Dataset for Text Detection and Recognition

Authors: Ilia Zharikov, Philipp Nikitin, Ilia Vasiliev, Vladimir Dokholyan

Abstract:

With the increasing popularity of document analysis and recognition systems, text detection (TD) and optical character recognition (OCR) in document images become challenging tasks. However, according to our best knowledge, no publicly available datasets for these particular problems exist. In this paper, we introduce a Distorted Document Images dataset (DDI-100) and provide a detailed analysis of the DDI-100 in its current state. To create the dataset we collected 7000 unique document pages, and extend it by applying different types of distortions and geometric transformations. In total, DDI-100 contains more than 100,000 document images together with binary text masks, text and character locations in terms of bounding boxes. We also present an analysis of several state-of-the-art TD and OCR approaches on the presented dataset. Lastly, we demonstrate the usefulness of DDI-100 to improve accuracy and stability of the considered TD and OCR models.

Keywords: document analysis, open dataset, optical character recognition, text detection

Procedia PDF Downloads 72
2470 Recognition of Cursive Arabic Handwritten Text Using Embedded Training Based on Hidden Markov Models (HMMs)

Authors: Rabi Mouhcine, Amrouch Mustapha, Mahani Zouhir, Mammass Driss

Abstract:

In this paper, we present a system for offline recognition cursive Arabic handwritten text based on Hidden Markov Models (HMMs). The system is analytical without explicit segmentation used embedded training to perform and enhance the character models. Extraction features preceded by baseline estimation are statistical and geometric to integrate both the peculiarities of the text and the pixel distribution characteristics in the word image. These features are modelled using hidden Markov models and trained by embedded training. The experiments on images of the benchmark IFN/ENIT database show that the proposed system improves recognition.

Keywords: recognition, handwriting, Arabic text, HMMs, embedded training

Procedia PDF Downloads 274
2469 Recognition of Grocery Products in Images Captured by Cellular Phones

Authors: Farshideh Einsele, Hassan Foroosh

Abstract:

In this paper, we present a robust algorithm to recognize extracted text from grocery product images captured by mobile phone cameras. Recognition of such text is challenging since text in grocery product images varies in its size, orientation, style, illumination, and can suffer from perspective distortion. Pre-processing is performed to make the characters scale and rotation invariant. Since text degradations can not be appropriately defined using wellknown geometric transformations such as translation, rotation, affine transformation and shearing, we use the whole character black pixels as our feature vector. Classification is performed with minimum distance classifier using the maximum likelihood criterion, which delivers very promising Character Recognition Rate (CRR) of 89%. We achieve considerably higher Word Recognition Rate (WRR) of 99% when using lower level linguistic knowledge about product words during the recognition process.

Keywords: camera-based OCR, feature extraction, document, image processing, grocery products

Procedia PDF Downloads 334
2468 Facial Emotion Recognition Using Deep Learning

Authors: Ashutosh Mishra, Nikhil Goyal

Abstract:

A 3D facial emotion recognition model based on deep learning is proposed in this paper. Two convolution layers and a pooling layer are employed in the deep learning architecture. After the convolution process, the pooling is finished. The probabilities for various classes of human faces are calculated using the sigmoid activation function. To verify the efficiency of deep learning-based systems, a set of faces. The Kaggle dataset is used to verify the accuracy of a deep learning-based face recognition model. The model's accuracy is about 65 percent, which is lower than that of other facial expression recognition techniques. Despite significant gains in representation precision due to the nonlinearity of profound image representations.

Keywords: facial recognition, computational intelligence, convolutional neural network, depth map

Procedia PDF Downloads 84
2467 “Octopub”: Geographical Sentiment Analysis Using Named Entity Recognition from Social Networks for Geo-Targeted Billboard Advertising

Authors: Oussama Hafferssas, Hiba Benyahia, Amina Madani, Nassima Zeriri

Abstract:

Although data nowadays has multiple forms; from text to images, and from audio to videos, yet text is still the most used one at a public level. At an academical and research level, and unlike other forms, text can be considered as the easiest form to process. Therefore, a brunch of Data Mining researches has been always under its shadow, called "Text Mining". Its concept is just like data mining’s, finding valuable patterns in data, from large collections and tremendous volumes of data, in this case: Text. Named entity recognition (NER) is one of Text Mining’s disciplines, it aims to extract and classify references such as proper names, locations, expressions of time and dates, organizations and more in a given text. Our approach "Octopub" does not aim to find new ways to improve named entity recognition process, rather than that it’s about finding a new, and yet smart way, to use NER in a way that we can extract sentiments of millions of people using Social Networks as a limitless information source, and Marketing for product promotion as the main domain of application.

Keywords: textmining, named entity recognition(NER), sentiment analysis, social media networks (SN, SMN), business intelligence(BI), marketing

Procedia PDF Downloads 434
2466 Emotion Recognition with Occlusions Based on Facial Expression Reconstruction and Weber Local Descriptor

Authors: Jadisha Cornejo, Helio Pedrini

Abstract:

Recognition of emotions based on facial expressions has received increasing attention from the scientific community over the last years. Several fields of applications can benefit from facial emotion recognition, such as behavior prediction, interpersonal relations, human-computer interactions, recommendation systems. In this work, we develop and analyze an emotion recognition framework based on facial expressions robust to occlusions through the Weber Local Descriptor (WLD). Initially, the occluded facial expressions are reconstructed following an extension approach of Robust Principal Component Analysis (RPCA). Then, WLD features are extracted from the facial expression representation, as well as Local Binary Patterns (LBP) and Histogram of Oriented Gradients (HOG). The feature vector space is reduced using Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Finally, K-Nearest Neighbor (K-NN) and Support Vector Machine (SVM) classifiers are used to recognize the expressions. Experimental results on three public datasets demonstrated that the WLD representation achieved competitive accuracy rates for occluded and non-occluded facial expressions compared to other approaches available in the literature.

Keywords: emotion recognition, facial expression, occlusion, fiducial landmarks

Procedia PDF Downloads 105
2465 Extraction of Text Subtitles in Multimedia Systems

Authors: Amarjit Singh

Abstract:

In this paper, a method for extraction of text subtitles in large video is proposed. The video data needs to be annotated for many multimedia applications. Text is incorporated in digital video for the motive of providing useful information about that video. So need arises to detect text present in video to understanding and video indexing. This is achieved in two steps. First step is text localization and the second step is text verification. The method of text detection can be extended to text recognition which finds applications in automatic video indexing; video annotation and content based video retrieval. The method has been tested on various types of videos.

Keywords: video, subtitles, extraction, annotation, frames

Procedia PDF Downloads 519
2464 The Application of a Hybrid Neural Network for Recognition of a Handwritten Kazakh Text

Authors: Almagul Assainova , Dariya Abykenova, Liudmila Goncharenko, Sergey Sybachin, Saule Rakhimova, Abay Aman

Abstract:

The recognition of a handwritten Kazakh text is a relevant objective today for the digitization of materials. The study presents a model of a hybrid neural network for handwriting recognition, which includes a convolutional neural network and a multi-layer perceptron. Each network includes 1024 input neurons and 42 output neurons. The model is implemented in the program, written in the Python programming language using the EMNIST database, NumPy, Keras, and Tensorflow modules. The neural network training of such specific letters of the Kazakh alphabet as ә, ғ, қ, ң, ө, ұ, ү, h, і was conducted. The neural network model and the program created on its basis can be used in electronic document management systems to digitize the Kazakh text.

Keywords: handwriting recognition system, image recognition, Kazakh font, machine learning, neural networks

Procedia PDF Downloads 55
2463 End-to-End Multilingual Text Recognition Based on Byte Modeling

Authors: Jiajia Wu, Kun Zhao, Zhengyan Yang, Bing Yin, Cong Liu, Lirong Dai

Abstract:

Nowadays, multilingual text recognition is more and more widely used in computer vision. However, in practical applications, the independent modeling of each language cannot make full use of the information between different languages and consumes hardware resources very much, which makes the unified modeling of multiple languages very necessary. A natural approach to unified multilingual modeling is to combine modeling units (characters, subwords, or words) from all languages into a large vocabulary, and then use a sequence-to-sequence approach to modeling. However, this vocabulary is often very large making modeling difficult. In this paper, we propose a byte-based multilingual text recognition method, which makes the vocabulary size only 256, which effectively solves the problem of unified modeling. The experiments show that our method effectively utilizes the information between different languages and outperforms the baseline of independent modeling by a large margin.

Keywords: multilingual, end-to-end text recognition, unified modeling, byte modeling

Procedia PDF Downloads 8
2462 Multimodal Database of Emotional Speech, Video and Gestures

Authors: Tomasz Sapiński, Dorota Kamińska, Adam Pelikant, Egils Avots, Cagri Ozcinar, Gholamreza Anbarjafari

Abstract:

People express emotions through different modalities. Integration of verbal and non-verbal communication channels creates a system in which the message is easier to understand. Expanding the focus to several expression forms can facilitate research on emotion recognition as well as human-machine interaction. In this article, the authors present a Polish emotional database composed of three modalities: facial expressions, body movement and gestures, and speech. The corpora contains recordings registered in studio conditions, acted out by 16 professional actors (8 male and 8 female). The data is labeled with six basic emotions categories, according to Ekman’s emotion categories. To check the quality of performance, all recordings are evaluated by experts and volunteers. The database is available to academic community and might be useful in the study on audio-visual emotion recognition.

Keywords: body movement, emotion recognition, emotional corpus, facial expressions, gestures, multimodal database, speech

Procedia PDF Downloads 270
2461 A Conglomerate of Multiple Optical Character Recognition Table Detection and Extraction

Authors: Smita Pallavi, Raj Ratn Pranesh, Sumit Kumar

Abstract:

Information representation as tables is compact and concise method that eases searching, indexing, and storage requirements. Extracting and cloning tables from parsable documents is easier and widely used; however, industry still faces challenges in detecting and extracting tables from OCR (Optical Character Recognition) documents or images. This paper proposes an algorithm that detects and extracts multiple tables from OCR document. The algorithm uses a combination of image processing techniques, text recognition, and procedural coding to identify distinct tables in the same image and map the text to appropriate the corresponding cell in dataframe, which can be stored as comma-separated values, database, excel, and multiple other usable formats.

Keywords: table extraction, optical character recognition, image processing, text extraction, morphological transformation

Procedia PDF Downloads 66
2460 Sarcasm Recognition System Using Hybrid Tone-Word Spotting Audio Mining Technique

Authors: Sandhya Baskaran, Hari Kumar Nagabushanam

Abstract:

Sarcasm sentiment recognition is an area of natural language processing that is being probed into in the recent times. Even with the advancements in NLP, typical translations of words, sentences in its context fail to provide the exact information on a sentiment or emotion of a user. For example, if something bad happens, the statement ‘That's just what I need, great! Terrific!’ is expressed in a sarcastic tone which could be misread as a positive sign by any text-based analyzer. In this paper, we are presenting a unique real time ‘word with its tone’ spotting technique which would provide the sentiment analysis for a tone or pitch of a voice in combination with the words being expressed. This hybrid approach increases the probability for identification of special sentiment like sarcasm much closer to the real world than by mining text or speech individually. The system uses a tone analyzer such as YIN-FFT which extracts pitch segment-wise that would be used in parallel with a speech recognition system. The clustered data is classified for sentiments and sarcasm score for each of it determined. Our Simulations demonstrates the improvement in f-measure of around 12% compared to existing detection techniques with increased precision and recall.

Keywords: sarcasm recognition, tone-word spotting, natural language processing, pitch analyzer

Procedia PDF Downloads 232
2459 Switching to the Latin Alphabet in Kazakhstan: A Brief Overview of Character Recognition Methods

Authors: Ainagul Yermekova, Liudmila Goncharenko, Ali Baghirzade, Sergey Sybachin

Abstract:

In this article, we address the problem of Kazakhstan's transition to the Latin alphabet. The transition process started in 2017 and is scheduled to be completed in 2025. In connection with these events, the problem of recognizing the characters of the new alphabet is raised. Well-known character recognition programs such as ABBYY FineReader, FormReader, MyScript Stylus did not recognize specific Kazakh letters that were used in Cyrillic. The author tries to give an assessment of the well-known method of character recognition that could be in demand as part of the country's transition to the Latin alphabet. Three methods of character recognition: template, structured, and feature-based, are considered through the algorithms of operation. At the end of the article, a general conclusion is made about the possibility of applying a certain method to a particular recognition process: for example, in the process of population census, recognition of typographic text in Latin, or recognition of photos of car numbers, store signs, etc.

Keywords: text detection, template method, recognition algorithm, structured method, feature method

Procedia PDF Downloads 94
2458 Speech Detection Model Based on Deep Neural Networks Classifier for Speech Emotions Recognition

Authors: A. Shoiynbek, K. Kozhakhmet, P. Menezes, D. Kuanyshbay, D. Bayazitov

Abstract:

Speech emotion recognition has received increasing research interest all through current years. There was used emotional speech that was collected under controlled conditions in most research work. Actors imitating and artificially producing emotions in front of a microphone noted those records. There are four issues related to that approach, namely, (1) emotions are not natural, and it means that machines are learning to recognize fake emotions. (2) Emotions are very limited by quantity and poor in their variety of speaking. (3) There is language dependency on SER. (4) Consequently, each time when researchers want to start work with SER, they need to find a good emotional database on their language. In this paper, we propose the approach to create an automatic tool for speech emotion extraction based on facial emotion recognition and describe the sequence of actions of the proposed approach. One of the first objectives of the sequence of actions is a speech detection issue. The paper gives a detailed description of the speech detection model based on a fully connected deep neural network for Kazakh and Russian languages. Despite the high results in speech detection for Kazakh and Russian, the described process is suitable for any language. To illustrate the working capacity of the developed model, we have performed an analysis of speech detection and extraction from real tasks.

Keywords: deep neural networks, speech detection, speech emotion recognition, Mel-frequency cepstrum coefficients, collecting speech emotion corpus, collecting speech emotion dataset, Kazakh speech dataset

Procedia PDF Downloads 2
2457 Hand Gestures Based Emotion Identification Using Flex Sensors

Authors: S. Ali, R. Yunus, A. Arif, Y. Ayaz, M. Baber Sial, R. Asif, N. Naseer, M. Jawad Khan

Abstract:

In this study, we have proposed a gesture to emotion recognition method using flex sensors mounted on metacarpophalangeal joints. The flex sensors are fixed in a wearable glove. The data from the glove are sent to PC using Wi-Fi. Four gestures: finger pointing, thumbs up, fist open and fist close are performed by five subjects. Each gesture is categorized into sad, happy, and excited class based on the velocity and acceleration of the hand gesture. Seventeen inspectors observed the emotions and hand gestures of the five subjects. The emotional state based on the investigators assessment and acquired movement speed data is compared. Overall, we achieved 77% accurate results. Therefore, the proposed design can be used for emotional state detection applications.

Keywords: emotion identification, emotion models, gesture recognition, user perception

Procedia PDF Downloads 124
2456 Facial Emotion Recognition with Convolutional Neural Network Based Architecture

Authors: Koray U. Erbas

Abstract:

Neural networks are appealing for many applications since they are able to learn complex non-linear relationships between input and output data. As the number of neurons and layers in a neural network increase, it is possible to represent more complex relationships with automatically extracted features. Nowadays Deep Neural Networks (DNNs) are widely used in Computer Vision problems such as; classification, object detection, segmentation image editing etc. In this work, Facial Emotion Recognition task is performed by proposed Convolutional Neural Network (CNN)-based DNN architecture using FER2013 Dataset. Moreover, the effects of different hyperparameters (activation function, kernel size, initializer, batch size and network size) are investigated and ablation study results for Pooling Layer, Dropout and Batch Normalization are presented.

Keywords: convolutional neural network, deep learning, deep learning based FER, facial emotion recognition

Procedia PDF Downloads 84
2455 Speech Emotion Recognition with Bi-GRU and Self-Attention based Feature Representation

Authors: Bubai Maji, Monorama Swain

Abstract:

Speech is considered an essential and most natural medium for the interaction between machines and humans. However, extracting effective features for speech emotion recognition (SER) is remains challenging. The present studies show that the temporal information captured but high-level temporal-feature learning is yet to be investigated. In this paper, we present an efficient novel method using the Self-attention (SA) mechanism in a combination of Convolutional Neural Network (CNN) and Bi-directional Gated Recurrent Unit (Bi-GRU) network to learn high-level temporal-feature. In order to further enhance the representation of the high-level temporal-feature, we integrate a Bi-GRU output with learnable weights features by SA, and improve the performance. We evaluate our proposed method on our created SITB-OSED and IEMOCAP databases. We report that the experimental results of our proposed method achieve state-of-the-art performance on both databases.

Keywords: Bi-GRU, 1D-CNNs, self-attention, speech emotion recognition

Procedia PDF Downloads 15
2454 Individualized Emotion Recognition Through Dual-Representations and Ground-Established Ground Truth

Authors: Valentina Zhang

Abstract:

While facial expression is a complex and individualized behavior, all facial emotion recognition (FER) systems known to us rely on a single facial representation and are trained on universal data. We conjecture that: (i) different facial representations can provide different, sometimes complementing views of emotions; (ii) when employed collectively in a discussion group setting, they enable more accurate emotion reading which is highly desirable in autism care and other applications context sensitive to errors. In this paper, we first study FER using pixel-based DL vs semantics-based DL in the context of deepfake videos. Our experiment indicates that while the semantics-trained model performs better with articulated facial feature changes, the pixel-trained model outperforms on subtle or rare facial expressions. Armed with these findings, we have constructed an adaptive FER system learning from both types of models for dyadic or small interacting groups and further leveraging the synthesized group emotions as the ground truth for individualized FER training. Using a collection of group conversation videos, we demonstrate that FER accuracy and personalization can benefit from such an approach.

Keywords: neurodivergence care, facial emotion recognition, deep learning, ground truth for supervised learning

Procedia PDF Downloads 20
2453 Emotions in Health Tweets: Analysis of American Government Official Accounts

Authors: García López

Abstract:

The Government Departments of Health have the task of informing and educating citizens about public health issues. For this, they use channels like Twitter, key in the search for health information and the propagation of content. The tweets, important in the virality of the content, may contain emotions that influence the contagion and exchange of knowledge. The goal of this study is to perform an analysis of the emotional projection of health information shared on Twitter by official American accounts: the disease control account CDCgov, National Institutes of Health, NIH, the government agency HHSGov, and the professional organization PublicHealth. For this, we used Tone Analyzer, an International Business Machines Corporation (IBM) tool specialized in emotion detection in text, corresponding to the categorical model of emotion representation. For 15 days, all tweets from these accounts were analyzed with the emotional analysis tool in text. The results showed that their tweets contain an important emotional load, a determining factor in the success of their communications. This exposes that official accounts also use subjective language and contain emotions. The predominance of emotion joy over sadness and the strong presence of emotions in their tweets stimulate the virality of content, a key in the work of informing that government health departments have.

Keywords: emotions in tweets, emotion detection in the text, health information on Twitter, American health official accounts, emotions on Twitter, emotions and content

Procedia PDF Downloads 72
2452 Effects of Oxytocin on Neural Response to Facial Emotion Recognition in Schizophrenia

Authors: Avyarthana Dey, Naren P. Rao, Arpitha Jacob, Chaitra V. Hiremath, Shivarama Varambally, Ganesan Venkatasubramanian, Rose Dawn Bharath, Bangalore N. Gangadhar

Abstract:

Objective: Impaired facial emotion recognition is widely reported in schizophrenia. Neuropeptide oxytocin is known to modulate brain regions involved in facial emotion recognition, namely amygdala, in healthy volunteers. However, its effect on facial emotion recognition deficits seen in schizophrenia is not well explored. In this study, we examined the effect of intranasal OXT on processing facial emotions and its neural correlates in patients with schizophrenia. Method: 12 male patients (age= 31.08±7.61 years, education= 14.50±2.20 years) participated in this single-blind, counterbalanced functional magnetic resonance imaging (fMRI) study. All participants underwent three fMRI scans; one at baseline, one each after single dose 24IU intranasal OXT and intranasal placebo. The order of administration of OXT and placebo were counterbalanced and subject was blind to the drug administered. Participants performed a facial emotion recognition task presented in a block design with six alternating blocks of faces and shapes. The faces depicted happy, angry or fearful emotions. The images were preprocessed and analyzed using SPM 12. First level contrasts comparing recognition of emotions and shapes were modelled at individual subject level. A group level analysis was performed using the contrasts generated at the first level to compare the effects of intranasal OXT and placebo. The results were thresholded at uncorrected p < 0.001 with a cluster size of 6 voxels. Neuropeptide oxytocin is known to modulate brain regions involved in facial emotion recognition, namely amygdala, in healthy volunteers. Results: Compared to placebo, intranasal OXT attenuated activity in inferior temporal, fusiform and parahippocampal gyri (BA 20), premotor cortex (BA 6), middle frontal gyrus (BA 10) and anterior cingulate gyrus (BA 24) and enhanced activity in the middle occipital gyrus (BA 18), inferior occipital gyrus (BA 19), and superior temporal gyrus (BA 22). There were no significant differences between the conditions on the accuracy scores of emotion recognition between baseline (77.3±18.38), oxytocin (82.63 ± 10.92) or Placebo (76.62 ± 22.67). Conclusion: Our results provide further evidence to the modulatory effect of oxytocin in patients with schizophrenia. Single dose oxytocin resulted in significant changes in activity of brain regions involved in emotion processing. Future studies need to examine the effectiveness of long-term treatment with OXT for emotion recognition deficits in patients with schizophrenia.

Keywords: recognition, functional connectivity, oxytocin, schizophrenia, social cognition

Procedia PDF Downloads 139
2451 Analysis of Nonlinear and Non-Stationary Signal to Extract the Features Using Hilbert Huang Transform

Authors: A. N. Paithane, D. S. Bormane, S. D. Shirbahadurkar

Abstract:

It has been seen that emotion recognition is an important research topic in the field of Human and computer interface. A novel technique for Feature Extraction (FE) has been presented here, further a new method has been used for human emotion recognition which is based on HHT method. This method is feasible for analyzing the nonlinear and non-stationary signals. Each signal has been decomposed into the IMF using the EMD. These functions are used to extract the features using fission and fusion process. The decomposition technique which we adopt is a new technique for adaptively decomposing signals. In this perspective, we have reported here potential usefulness of EMD based techniques.We evaluated the algorithm on Augsburg University Database; the manually annotated database.

Keywords: intrinsic mode function (IMF), Hilbert-Huang transform (HHT), empirical mode decomposition (EMD), emotion detection, electrocardiogram (ECG)

Procedia PDF Downloads 477
2450 Development of an EEG-Based Real-Time Emotion Recognition System on Edge AI

Authors: James Rigor Camacho, Wansu Lim

Abstract:

Over the last few years, the development of new wearable and processing technologies has accelerated in order to harness physiological data such as electroencephalograms (EEGs) for EEG-based applications. EEG has been demonstrated to be a source of emotion recognition signals with the highest classification accuracy among physiological signals. However, when emotion recognition systems are used for real-time classification, the training unit is frequently left to run offline or in the cloud rather than working locally on the edge. That strategy has hampered research, and the full potential of using an edge AI device has yet to be realized. Edge AI devices are computers with high performance that can process complex algorithms. It is capable of collecting, processing, and storing data on its own. It can also analyze and apply complicated algorithms like localization, detection, and recognition on a real-time application, making it a powerful embedded device. The NVIDIA Jetson series, specifically the Jetson Nano device, was used in the implementation. The cEEGrid, which is integrated to the open-source brain computer-interface platform (OpenBCI), is used to collect EEG signals. An EEG-based real-time emotion recognition system on Edge AI is proposed in this paper. To perform graphical spectrogram categorization of EEG signals and to predict emotional states based on input data properties, machine learning-based classifiers were used. Until the emotional state was identified, the EEG signals were analyzed using the K-Nearest Neighbor (KNN) technique, which is a supervised learning system. In EEG signal processing, after each EEG signal has been received in real-time and translated from time to frequency domain, the Fast Fourier Transform (FFT) technique is utilized to observe the frequency bands in each EEG signal. To appropriately show the variance of each EEG frequency band, power density, standard deviation, and mean are calculated and employed. The next stage is to identify the features that have been chosen to predict emotion in EEG data using the K-Nearest Neighbors (KNN) technique. Arousal and valence datasets are used to train the parameters defined by the KNN technique.Because classification and recognition of specific classes, as well as emotion prediction, are conducted both online and locally on the edge, the KNN technique increased the performance of the emotion recognition system on the NVIDIA Jetson Nano. Finally, this implementation aims to bridge the research gap on cost-effective and efficient real-time emotion recognition using a resource constrained hardware device, like the NVIDIA Jetson Nano. On the cutting edge of AI, EEG-based emotion identification can be employed in applications that can rapidly expand the research and implementation industry's use.

Keywords: edge AI device, EEG, emotion recognition system, supervised learning algorithm, sensors

Procedia PDF Downloads 31