Search results for: facial pose classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2806

Search results for: facial pose classification

2806 Facial Pose Classification Using Hilbert Space Filling Curve and Multidimensional Scaling

Authors: Mekamı Hayet, Bounoua Nacer, Benabderrahmane Sidahmed, Taleb Ahmed

Abstract:

Pose estimation is an important task in computer vision. Though the majority of the existing solutions provide good accuracy results, they are often overly complex and computationally expensive. In this perspective, we propose the use of dimensionality reduction techniques to address the problem of facial pose estimation. Firstly, a face image is converted into one-dimensional time series using Hilbert space filling curve, then the approach converts these time series data to a symbolic representation. Furthermore, a distance matrix is calculated between symbolic series of an input learning dataset of images, to generate classifiers of frontal vs. profile face pose. The proposed method is evaluated with three public datasets. Experimental results have shown that our approach is able to achieve a correct classification rate exceeding 97% with K-NN algorithm.

Keywords: machine learning, pattern recognition, facial pose classification, time series

Procedia PDF Downloads 322
2805 Pose Normalization Network for Object Classification

Authors: Bingquan Shen

Abstract:

Convolutional Neural Networks (CNN) have demonstrated their effectiveness in synthesizing 3D views of object instances at various viewpoints. Given the problem where one have limited viewpoints of a particular object for classification, we present a pose normalization architecture to transform the object to existing viewpoints in the training dataset before classification to yield better classification performance. We have demonstrated that this Pose Normalization Network (PNN) can capture the style of the target object and is able to re-render it to a desired viewpoint. Moreover, we have shown that the PNN improves the classification result for the 3D chairs dataset and ShapeNet airplanes dataset when given only images at limited viewpoint, as compared to a CNN baseline.

Keywords: convolutional neural networks, object classification, pose normalization, viewpoint invariant

Procedia PDF Downloads 307
2804 KSVD-SVM Approach for Spontaneous Facial Expression Recognition

Authors: Dawood Al Chanti, Alice Caplier

Abstract:

Sparse representations of signals have received a great deal of attention in recent years. In this paper, the interest of using sparse representation as a mean for performing sparse discriminative analysis between spontaneous facial expressions is demonstrated. An automatic facial expressions recognition system is presented. It uses a KSVD-SVM approach which is made of three main stages: A pre-processing and feature extraction stage, which solves the problem of shared subspace distribution based on the random projection theory, to obtain low dimensional discriminative and reconstructive features; A dictionary learning and sparse coding stage, which uses the KSVD model to learn discriminative under or over dictionaries for sparse coding; Finally a classification stage, which uses a SVM classifier for facial expressions recognition. Our main concern is to be able to recognize non-basic affective states and non-acted expressions. Extensive experiments on the JAFFE static acted facial expressions database but also on the DynEmo dynamic spontaneous facial expressions database exhibit very good recognition rates.

Keywords: dictionary learning, random projection, pose and spontaneous facial expression, sparse representation

Procedia PDF Downloads 273
2803 A Novel Method for Face Detection

Authors: H. Abas Nejad, A. R. Teymoori

Abstract:

Facial expression recognition is one of the open problems in computer vision. Robust neutral face recognition in real time is a major challenge for various supervised learning based facial expression recognition methods. This is due to the fact that supervised methods cannot accommodate all appearance variability across the faces with respect to race, pose, lighting, facial biases, etc. in the limited amount of training data. Moreover, processing each and every frame to classify emotions is not required, as the user stays neutral for the majority of the time in usual applications like video chat or photo album/web browsing. Detecting neutral state at an early stage, thereby bypassing those frames from emotion classification would save the computational power. In this work, we propose a light-weight neutral vs. emotion classification engine, which acts as a preprocessor to the traditional supervised emotion classification approaches. It dynamically learns neutral appearance at Key Emotion (KE) points using a textural statistical model, constructed by a set of reference neutral frames for each user. The proposed method is made robust to various types of user head motions by accounting for affine distortions based on a textural statistical model. Robustness to dynamic shift of KE points is achieved by evaluating the similarities on a subset of neighborhood patches around each KE point using the prior information regarding the directionality of specific facial action units acting on the respective KE point. The proposed method, as a result, improves ER accuracy and simultaneously reduces the computational complexity of ER system, as validated on multiple databases.

Keywords: neutral vs. emotion classification, Constrained Local Model, procrustes analysis, Local Binary Pattern Histogram, statistical model

Procedia PDF Downloads 319
2802 Curvelet Features with Mouth and Face Edge Ratios for Facial Expression Identification

Authors: S. Kherchaoui, A. Houacine

Abstract:

This paper presents a facial expression recognition system. It performs identification and classification of the seven basic expressions; happy, surprise, fear, disgust, sadness, anger, and neutral states. It consists of three main parts. The first one is the detection of a face and the corresponding facial features to extract the most expressive portion of the face, followed by a normalization of the region of interest. Then calculus of curvelet coefficients is performed with dimensionality reduction through principal component analysis. The resulting coefficients are combined with two ratios; mouth ratio and face edge ratio to constitute the whole feature vector. The third step is the classification of the emotional state using the SVM method in the feature space.

Keywords: facial expression identification, curvelet coefficient, support vector machine (SVM), recognition system

Procedia PDF Downloads 211
2801 Tensor Deep Stacking Neural Networks and Bilinear Mapping Based Speech Emotion Classification Using Facial Electromyography

Authors: P. S. Jagadeesh Kumar, Yang Yung, Wenli Hu

Abstract:

Speech emotion classification is a dominant research field in finding a sturdy and profligate classifier appropriate for different real-life applications. This effort accentuates on classifying different emotions from speech signal quarried from the features related to pitch, formants, energy contours, jitter, shimmer, spectral, perceptual and temporal features. Tensor deep stacking neural networks were supported to examine the factors that influence the classification success rate. Facial electromyography signals were composed of several forms of focuses in a controlled atmosphere by means of audio-visual stimuli. Proficient facial electromyography signals were pre-processed using moving average filter, and a set of arithmetical features were excavated. Extracted features were mapped into consistent emotions using bilinear mapping. With facial electromyography signals, a database comprising diverse emotions will be exposed with a suitable fine-tuning of features and training data. A success rate of 92% can be attained deprived of increasing the system connivance and the computation time for sorting diverse emotional states.

Keywords: speech emotion classification, tensor deep stacking neural networks, facial electromyography, bilinear mapping, audio-visual stimuli

Procedia PDF Downloads 222
2800 Gender Recognition with Deep Belief Networks

Authors: Xiaoqi Jia, Qing Zhu, Hao Zhang, Su Yang

Abstract:

A gender recognition system is able to tell the gender of the given person through a few of frontal facial images. An effective gender recognition approach enables to improve the performance of many other applications, including security monitoring, human-computer interaction, image or video retrieval and so on. In this paper, we present an effective method for gender classification task in frontal facial images based on deep belief networks (DBNs), which can pre-train model and improve accuracy a little bit. Our experiments have shown that the pre-training method with DBNs for gender classification task is feasible and achieves a little improvement of accuracy on FERET and CAS-PEAL-R1 facial datasets.

Keywords: gender recognition, beep belief net-works, semi-supervised learning, greedy-layer wise RBMs

Procedia PDF Downloads 423
2799 Classifying Facial Expressions Based on a Motion Local Appearance Approach

Authors: Fabiola M. Villalobos-Castaldi, Nicolás C. Kemper, Esther Rojas-Krugger, Laura G. Ramírez-Sánchez

Abstract:

This paper presents the classification results about exploring the combination of a motion based approach with a local appearance method to describe the facial motion caused by the muscle contractions and expansions that are presented in facial expressions. The proposed feature extraction method take advantage of the knowledge related to which parts of the face reflects the highest deformations, so we selected 4 specific facial regions at which the appearance descriptor were applied. The most common used approaches for feature extraction are the holistic and the local strategies. In this work we present the results of using a local appearance approach estimating the correlation coefficient to the 4 corresponding landmark-localized facial templates of the expression face related to the neutral face. The results let us to probe how the proposed motion estimation scheme based on the local appearance correlation computation can simply and intuitively measure the motion parameters for some of the most relevant facial regions and how these parameters can be used to recognize facial expressions automatically.

Keywords: facial expression recognition system, feature extraction, local-appearance method, motion-based approach

Procedia PDF Downloads 387
2798 Use of Computer and Machine Learning in Facial Recognition

Authors: Neha Singh, Ananya Arora

Abstract:

Facial expression measurement plays a crucial role in the identification of emotion. Facial expression plays a key role in psychophysiology, neural bases, and emotional disorder, to name a few. The Facial Action Coding System (FACS) has proven to be the most efficient and widely used of the various systems used to describe facial expressions. Coders can manually code facial expressions with FACS and, by viewing video-recorded facial behaviour at a specified frame rate and slow motion, can decompose into action units (AUs). Action units are the most minor visually discriminable facial movements. FACS explicitly differentiates between facial actions and inferences about what the actions mean. Action units are the fundamental unit of FACS methodology. It is regarded as the standard measure for facial behaviour and finds its application in various fields of study beyond emotion science. These include facial neuromuscular disorders, neuroscience, computer vision, computer graphics and animation, and face encoding for digital processing. This paper discusses the conceptual basis for FACS, a numerical listing of discrete facial movements identified by the system, the system's psychometric evaluation, and the software's recommended training requirements.

Keywords: facial action, action units, coding, machine learning

Procedia PDF Downloads 77
2797 Management of Facial Nerve Palsy Following Physiotherapy

Authors: Bassam Band, Simon Freeman, Rohan Munir, Hisham Band

Abstract:

Objective: To determine efficacy of facial physiotherapy provided for patients with facial nerve palsy. Design: Retrospective study Subjects: 54 patients diagnosed with Facial nerve palsy were included in the study after they met the selection criteria including unilateral facial paralysis and start of therapy twelve months after the onset of facial nerve palsy. Interventions: Patients received the treatment offered at a facial physiotherapy clinic consisting of: Trophic electrical stimulation, surface electromyography with biofeedback, neuromuscular re-education and myofascial release. Main measures: The Sunnybrook facial grading scale was used to evaluate the severity of facial paralysis. Results: This study demonstrated the positive impact of physiotherapy for patient with facial nerve palsy with improvement of 24.2% on the Sunnybrook facial grading score from a mean baseline of 34.2% to 58.2%. The greatest improvement looking at different causes was seen in patient who had reconstructive surgery post Acoustic Neuroma at 31.3%. Conclusion: The therapy shows significant improvement for patients with facial nerve palsy even when started 12 months post onset of paralysis across different causes. This highlights the benefit of this non-invasive technique in managing facial nerve paralysis and possibly preventing the need for surgery.

Keywords: facial nerve palsy, treatment, physiotherapy, bells palsy, acoustic neuroma, ramsey-hunt syndrome

Procedia PDF Downloads 506
2796 Automatic Facial Skin Segmentation Using Possibilistic C-Means Algorithm for Evaluation of Facial Surgeries

Authors: Elham Alaee, Mousa Shamsi, Hossein Ahmadi, Soroosh Nazem, Mohammad Hossein Sedaaghi

Abstract:

Human face has a fundamental role in the appearance of individuals. So the importance of facial surgeries is undeniable. Thus, there is a need for the appropriate and accurate facial skin segmentation in order to extract different features. Since Fuzzy C-Means (FCM) clustering algorithm doesn’t work appropriately for noisy images and outliers, in this paper we exploit Possibilistic C-Means (PCM) algorithm in order to segment the facial skin. For this purpose, first, we convert facial images from RGB to YCbCr color space. To evaluate performance of the proposed algorithm, the database of Sahand University of Technology, Tabriz, Iran was used. In order to have a better understanding from the proposed algorithm; FCM and Expectation-Maximization (EM) algorithms are also used for facial skin segmentation. The proposed method shows better results than the other segmentation methods. Results include misclassification error (0.032) and the region’s area error (0.045) for the proposed algorithm.

Keywords: facial image, segmentation, PCM, FCM, skin error, facial surgery

Procedia PDF Downloads 555
2795 Comparative Analysis of Feature Extraction and Classification Techniques

Authors: R. L. Ujjwal, Abhishek Jain

Abstract:

In the field of computer vision, most facial variations such as identity, expression, emotions and gender have been extensively studied. Automatic age estimation has been rarely explored. With age progression of a human, the features of the face changes. This paper is providing a new comparable study of different type of algorithm to feature extraction [Hybrid features using HAAR cascade & HOG features] & classification [KNN & SVM] training dataset. By using these algorithms we are trying to find out one of the best classification algorithms. Same thing we have done on the feature selection part, we extract the feature by using HAAR cascade and HOG. This work will be done in context of age group classification model.

Keywords: computer vision, age group, face detection

Procedia PDF Downloads 332
2794 Quantification and Preference of Facial Asymmetry of the Sub-Saharan Africans' 3D Facial Models

Authors: Anas Ibrahim Yahaya, Christophe Soligo

Abstract:

A substantial body of literature has reported on facial symmetry and asymmetry and their role in human mate choice. However, major gaps persist, with nearly all data originating from the WEIRD (Western, Educated, Industrialised, Rich and Developed) populations, and results remaining largely equivocal when compared across studies. This study is aimed at quantifying facial asymmetry from the 3D faces of the Hausa of northern Nigeria and also aimed at determining their (Hausa) perceptions and judgements of standardised facial images with different levels of asymmetry using questionnaires. Data were analysed using R-studio software and results indicated that individuals with lower levels of facial asymmetry (near facial symmetry) were perceived as more attractive, more suitable as marriage partners and more caring, whereas individuals with higher levels of facial asymmetry were perceived as more aggressive. The study conclusively asserts that all faces are asymmetric including the most beautiful ones, and the preference of less asymmetric faces was not just dependent on single facial trait, but rather on multiple facial traits; thus the study supports that physical attractiveness is not just an arbitrary social construct, but at least in part a cue to general health and possibly related to environmental context.

Keywords: face, asymmetry, symmetry, Hausa, preference

Procedia PDF Downloads 161
2793 Detection and Classification Strabismus Using Convolutional Neural Network and Spatial Image Processing

Authors: Anoop T. R., Otman Basir, Robert F. Hess, Eileen E. Birch, Brooke A. Koritala, Reed M. Jost, Becky Luu, David Stager, Ben Thompson

Abstract:

Strabismus refers to a misalignment of the eyes. Early detection and treatment of strabismus in childhood can prevent the development of permanent vision loss due to abnormal development of visual brain areas. We developed a two-stage method for strabismus detection and classification based on photographs of the face. The first stage detects the presence or absence of strabismus, and the second stage classifies the type of strabismus. The first stage comprises face detection using Haar cascade, facial landmark estimation, face alignment, aligned face landmark detection, segmentation of the eye region, and detection of strabismus using VGG 16 convolution neural networks. Face alignment transforms the face to a canonical pose to ensure consistency in subsequent analysis. Using facial landmarks, the eye region is segmented from the aligned face and fed into a VGG 16 CNN model, which has been trained to classify strabismus. The CNN determines whether strabismus is present and classifies the type of strabismus (exotropia, esotropia, and vertical deviation). If stage 1 detects strabismus, the eye region image is fed into stage 2, which starts with the estimation of pupil center coordinates using mask R-CNN deep neural networks. Then, the distance between the pupil coordinates and eye landmarks is calculated along with the angle that the pupil coordinates make with the horizontal and vertical axis. The distance and angle information is used to characterize the degree and direction of the strabismic eye misalignment. This model was tested on 100 clinically labeled images of children with (n = 50) and without (n = 50) strabismus. The True Positive Rate (TPR) and False Positive Rate (FPR) of the first stage were 94% and 6% respectively. The classification stage has produced a TPR of 94.73%, 94.44%, and 100% for esotropia, exotropia, and vertical deviations, respectively. This method also had an FPR of 5.26%, 5.55%, and 0% for esotropia, exotropia, and vertical deviation, respectively. The addition of one more feature related to the location of corneal light reflections may reduce the FPR, which was primarily due to children with pseudo-strabismus (the appearance of strabismus due to a wide nasal bridge or skin folds on the nasal side of the eyes).

Keywords: strabismus, deep neural networks, face detection, facial landmarks, face alignment, segmentation, VGG 16, mask R-CNN, pupil coordinates, angle deviation, horizontal and vertical deviation

Procedia PDF Downloads 56
2792 Facial Expression Phoenix (FePh): An Annotated Sequenced Dataset for Facial and Emotion-Specified Expressions in Sign Language

Authors: Marie Alaghband, Niloofar Yousefi, Ivan Garibay

Abstract:

Facial expressions are important parts of both gesture and sign language recognition systems. Despite the recent advances in both fields, annotated facial expression datasets in the context of sign language are still scarce resources. In this manuscript, we introduce an annotated sequenced facial expression dataset in the context of sign language, comprising over 3000 facial images extracted from the daily news and weather forecast of the public tv-station PHOENIX. Unlike the majority of currently existing facial expression datasets, FePh provides sequenced semi-blurry facial images with different head poses, orientations, and movements. In addition, in the majority of images, identities are mouthing the words, which makes the data more challenging. To annotate this dataset we consider primary, secondary, and tertiary dyads of seven basic emotions of "sad", "surprise", "fear", "angry", "neutral", "disgust", and "happy". We also considered the "None" class if the image’s facial expression could not be described by any of the aforementioned emotions. Although we provide FePh as a facial expression dataset of signers in sign language, it has a wider application in gesture recognition and Human Computer Interaction (HCI) systems.

Keywords: annotated facial expression dataset, gesture recognition, sequenced facial expression dataset, sign language recognition

Procedia PDF Downloads 131
2791 Online Pose Estimation and Tracking Approach with Siamese Region Proposal Network

Authors: Cheng Fang, Lingwei Quan, Cunyue Lu

Abstract:

Human pose estimation and tracking are to accurately identify and locate the positions of human joints in the video. It is a computer vision task which is of great significance for human motion recognition, behavior understanding and scene analysis. There has been remarkable progress on human pose estimation in recent years. However, more researches are needed for human pose tracking especially for online tracking. In this paper, a framework, called PoseSRPN, is proposed for online single-person pose estimation and tracking. We use Siamese network attaching a pose estimation branch to incorporate Single-person Pose Tracking (SPT) and Visual Object Tracking (VOT) into one framework. The pose estimation branch has a simple network structure that replaces the complex upsampling and convolution network structure with deconvolution. By augmenting the loss of fully convolutional Siamese network with the pose estimation task, pose estimation and tracking can be trained in one stage. Once trained, PoseSRPN only relies on a single bounding box initialization and producing human joints location. The experimental results show that while maintaining the good accuracy of pose estimation on COCO and PoseTrack datasets, the proposed method achieves a speed of 59 frame/s, which is superior to other pose tracking frameworks.

Keywords: computer vision, pose estimation, pose tracking, Siamese network

Procedia PDF Downloads 127
2790 Tick Induced Facial Nerve Paresis: A Narrative Review

Authors: Jemma Porrett

Abstract:

Background: We present a literature review examining the research surrounding tick paralysis resulting in facial nerve palsy. A case of an intra-aural paralysis tick bite resulting in unilateral facial nerve palsy is also discussed. Methods: A novel case of otoacariasis with associated ipsilateral facial nerve involvement is presented. Additionally, we conducted a review of the literature, and we searched the MEDLINE and EMBASE databases for relevant literature published between 1915 and 2020. Utilising the following keywords; 'Ixodes', 'Facial paralysis', 'Tick bite', and 'Australia', 18 articles were deemed relevant to this study. Results: Eighteen articles included in the review comprised a total of 48 patients. Patients' ages ranged from one year to 84 years of age. Ten studies estimated the possible duration between a tick bite and facial nerve palsy, averaging 8.9 days. Forty-one patients presented with a single tick within the external auditory canal, three had a single tick located on the temple or forehead region, three had post-auricular ticks, and one patient had a remarkable 44 ticks removed from the face, scalp, neck, back, and limbs. A complete ipsilateral facial nerve palsy was present in 45 patients, notably, in 16 patients, this occurred following tick removal. House-Brackmann classification was utilised in 7 patients; four patients with grade 4, one patient with grade three, and two patients with grade 2 facial nerve palsy. Thirty-eight patients had complete recovery of facial palsy. Thirteen studies were analysed for time to recovery, with an average time of 19 days. Six patients had partial recovery at the time of follow-up. One article reported improvement in facial nerve palsy at 24 hours, but no further follow-up was reported. One patient was lost to follow up, and one article failed to mention any resolution of facial nerve palsy. One patient died from respiratory arrest following generalized paralysis. Conclusions: Tick paralysis is a severe but preventable disease. Careful examination of the face, scalp, and external auditory canal should be conducted in patients presenting with otalgia and facial nerve palsy, particularly in tropical areas, to exclude the possibility of tick infestation.

Keywords: facial nerve palsy, tick bite, intra-aural, Australia

Procedia PDF Downloads 79
2789 A Robust Spatial Feature Extraction Method for Facial Expression Recognition

Authors: H. G. C. P. Dinesh, G. Tharshini, M. P. B. Ekanayake, G. M. R. I. Godaliyadda

Abstract:

This paper presents a new spatial feature extraction method based on principle component analysis (PCA) and Fisher Discernment Analysis (FDA) for facial expression recognition. It not only extracts reliable features for classification, but also reduces the feature space dimensions of pattern samples. In this method, first each gray scale image is considered in its entirety as the measurement matrix. Then, principle components (PCs) of row vectors of this matrix and variance of these row vectors along PCs are estimated. Therefore, this method would ensure the preservation of spatial information of the facial image. Afterwards, by incorporating the spectral information of the eigen-filters derived from the PCs, a feature vector was constructed, for a given image. Finally, FDA was used to define a set of basis in a reduced dimension subspace such that the optimal clustering is achieved. The method of FDA defines an inter-class scatter matrix and intra-class scatter matrix to enhance the compactness of each cluster while maximizing the distance between cluster marginal points. In order to matching the test image with the training set, a cosine similarity based Bayesian classification was used. The proposed method was tested on the Cohn-Kanade database and JAFFE database. It was observed that the proposed method which incorporates spatial information to construct an optimal feature space outperforms the standard PCA and FDA based methods.

Keywords: facial expression recognition, principle component analysis (PCA), fisher discernment analysis (FDA), eigen-filter, cosine similarity, bayesian classifier, f-measure

Procedia PDF Downloads 403
2788 Comparing Emotion Recognition from Voice and Facial Data Using Time Invariant Features

Authors: Vesna Kirandziska, Nevena Ackovska, Ana Madevska Bogdanova

Abstract:

The problem of emotion recognition is a challenging problem. It is still an open problem from the aspect of both intelligent systems and psychology. In this paper, both voice features and facial features are used for building an emotion recognition system. A Support Vector Machine classifiers are built by using raw data from video recordings. In this paper, the results obtained for the emotion recognition are given, and a discussion about the validity and the expressiveness of different emotions is presented. A comparison between the classifiers build from facial data only, voice data only and from the combination of both data is made here. The need for a better combination of the information from facial expression and voice data is argued.

Keywords: emotion recognition, facial recognition, signal processing, machine learning

Procedia PDF Downloads 290
2787 Classification System for Soft Tissue Injuries of Face: Bringing Objectiveness to Injury Severity

Authors: Garg Ramneesh, Uppal Sanjeev, Mittal Rajinder, Shah Sheerin, Jain Vikas, Singla Bhupinder

Abstract:

Introduction: Despite advances in trauma care, a classification system for soft tissue injuries of the face still needs to be objectively defined. Aim: To develop a classification system for soft tissue injuries of the face; that is objective, easy to remember, reproducible, universally applicable, aids in surgical management and helps to develop a structured data that can be used for future use. Material and Methods: This classification system includes those patients that need surgical management of facial injuries. Associated underlying bony fractures have been intentionally excluded. Depending upon the severity of soft tissue injury, these can be graded from 0 to IV (O-Abrasions, I-lacerations, II-Avulsion injuries with no skin loss, III-Avulsion injuries with skin loss that would need graft or flap cover, and IV-complex injuries). Anatomically, the face has been divided into three zones (Zone 1/2/3), as per aesthetic subunits. Zone 1e stands for injury of eyebrows; Zones 2 a/b/c stand for nose, upper eyelid and lower eyelid respectively; Zones 3 a/b/c stand for upper lip, lower lip and cheek respectively. Suffices R and L stand for right or left involved side, B for presence of foreign body like glass or pellets, C for extensive contamination and D for depth which can be graded as D 1/2/3 if depth is still fat, muscle or bone respectively. I is for damage to facial nerve or parotid duct. Results and conclusions: This classification system is easy to remember, clinically applicable and would help in standardization of surgical management of soft tissue injuries of face. Certain inherent limitations of this classification system are inability to classify sutured wounds, hematomas and injuries along or against Langer’s lines.

Keywords: soft tissue injuries, face, avulsion, classification

Procedia PDF Downloads 360
2786 Adversarial Disentanglement Using Latent Classifier for Pose-Independent Representation

Authors: Hamed Alqahtani, Manolya Kavakli-Thorne

Abstract:

The large pose discrepancy is one of the critical challenges in face recognition during video surveillance. Due to the entanglement of pose attributes with identity information, the conventional approaches for pose-independent representation lack in providing quality results in recognizing largely posed faces. In this paper, we propose a practical approach to disentangle the pose attribute from the identity information followed by synthesis of a face using a classifier network in latent space. The proposed approach employs a modified generative adversarial network framework consisting of an encoder-decoder structure embedded with a classifier in manifold space for carrying out factorization on the latent encoding. It can be further generalized to other face and non-face attributes for real-life video frames containing faces with significant attribute variations. Experimental results and comparison with state of the art in the field prove that the learned representation of the proposed approach synthesizes more compelling perceptual images through a combination of adversarial and classification losses.

Keywords: disentanglement, face detection, generative adversarial networks, video surveillance

Procedia PDF Downloads 93
2785 Emotion Recognition with Occlusions Based on Facial Expression Reconstruction and Weber Local Descriptor

Authors: Jadisha Cornejo, Helio Pedrini

Abstract:

Recognition of emotions based on facial expressions has received increasing attention from the scientific community over the last years. Several fields of applications can benefit from facial emotion recognition, such as behavior prediction, interpersonal relations, human-computer interactions, recommendation systems. In this work, we develop and analyze an emotion recognition framework based on facial expressions robust to occlusions through the Weber Local Descriptor (WLD). Initially, the occluded facial expressions are reconstructed following an extension approach of Robust Principal Component Analysis (RPCA). Then, WLD features are extracted from the facial expression representation, as well as Local Binary Patterns (LBP) and Histogram of Oriented Gradients (HOG). The feature vector space is reduced using Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Finally, K-Nearest Neighbor (K-NN) and Support Vector Machine (SVM) classifiers are used to recognize the expressions. Experimental results on three public datasets demonstrated that the WLD representation achieved competitive accuracy rates for occluded and non-occluded facial expressions compared to other approaches available in the literature.

Keywords: emotion recognition, facial expression, occlusion, fiducial landmarks

Procedia PDF Downloads 156
2784 Emotion Recognition Using Artificial Intelligence

Authors: Rahul Mohite, Lahcen Ouarbya

Abstract:

This paper focuses on the interplay between humans and computer systems and the ability of these systems to understand and respond to human emotions, including non-verbal communication. Current emotion recognition systems are based solely on either facial or verbal expressions. The limitation of these systems is that it requires large training data sets. The paper proposes a system for recognizing human emotions that combines both speech and emotion recognition. The system utilizes advanced techniques such as deep learning and image recognition to identify facial expressions and comprehend emotions. The results show that the proposed system, based on the combination of facial expression and speech, outperforms existing ones, which are based solely either on facial or verbal expressions. The proposed system detects human emotion with an accuracy of 86%, whereas the existing systems have an accuracy of 70% using verbal expression only and 76% using facial expression only. In this paper, the increasing significance and demand for facial recognition technology in emotion recognition are also discussed.

Keywords: facial reputation, expression reputation, deep gaining knowledge of, photo reputation, facial technology, sign processing, photo type

Procedia PDF Downloads 80
2783 Facial Emotion Recognition with Convolutional Neural Network Based Architecture

Authors: Koray U. Erbas

Abstract:

Neural networks are appealing for many applications since they are able to learn complex non-linear relationships between input and output data. As the number of neurons and layers in a neural network increase, it is possible to represent more complex relationships with automatically extracted features. Nowadays Deep Neural Networks (DNNs) are widely used in Computer Vision problems such as; classification, object detection, segmentation image editing etc. In this work, Facial Emotion Recognition task is performed by proposed Convolutional Neural Network (CNN)-based DNN architecture using FER2013 Dataset. Moreover, the effects of different hyperparameters (activation function, kernel size, initializer, batch size and network size) are investigated and ablation study results for Pooling Layer, Dropout and Batch Normalization are presented.

Keywords: convolutional neural network, deep learning, deep learning based FER, facial emotion recognition

Procedia PDF Downloads 232
2782 Improving the Performance of Deep Learning in Facial Emotion Recognition with Image Sharpening

Authors: Ksheeraj Sai Vepuri, Nada Attar

Abstract:

We as humans use words with accompanying visual and facial cues to communicate effectively. Classifying facial emotion using computer vision methodologies has been an active research area in the computer vision field. In this paper, we propose a simple method for facial expression recognition that enhances accuracy. We tested our method on the FER-2013 dataset that contains static images. Instead of using Histogram equalization to preprocess the dataset, we used Unsharp Mask to emphasize texture and details and sharpened the edges. We also used ImageDataGenerator from Keras library for data augmentation. Then we used Convolutional Neural Networks (CNN) model to classify the images into 7 different facial expressions, yielding an accuracy of 69.46% on the test set. Our results show that using image preprocessing such as the sharpening technique for a CNN model can improve the performance, even when the CNN model is relatively simple.

Keywords: facial expression recognittion, image preprocessing, deep learning, CNN

Procedia PDF Downloads 106
2781 Somatosensory-Evoked Blink Reflex in Peripheral Facial Palsy

Authors: Sarah Sayed El- Tawab, Emmanuel Kamal Azix Saba

Abstract:

Objectives: Somatosensory blink reflex (SBR) is an eye blink response obtained from electrical stimulation of peripheral nerves or skin area of the body. It has been studied in various neurological diseases as well as among healthy subjects in different population. We designed this study to detect SBR positivity in patients with facial palsy and patients with post facial syndrome, to relate the facial palsy severity and the presence of SBR, and to associate between trigeminal BR changes and SBR positivity in peripheral facial palsy patients. Methods: 50 patients with peripheral facial palsy and post-facial syndrome 31 age and gender matched healthy volunteers were enrolled to this study. Facial motor conduction studies, trigeminal BR, and SBR were studied in all. Results: SBR was elicited in 67.7% of normal subjects, in 68% of PFS group, and in 32% of PFP group. On the non-paralytic side SBR was found in 28% by paralyzed side stimulation and in 24% by healthy side stimulation among PFP patients. For PFS group SBR was found on the non- paralytic side in 48%. Bilateral SBR elicitability was higher than its unilateral elicitability. Conclusion: Increased brainstem interneurons excitability is not essential to generate SBR. The hypothetical sensory-motor gating mechanism is responsible for SBR generation.

Keywords: somatosensory evoked blink reflex, post facial syndrome, blink reflex, enchanced gain

Procedia PDF Downloads 590
2780 Deep Learning Based 6D Pose Estimation for Bin-Picking Using 3D Point Clouds

Authors: Hesheng Wang, Haoyu Wang, Chungang Zhuang

Abstract:

Estimating the 6D pose of objects is a core step for robot bin-picking tasks. The problem is that various objects are usually randomly stacked with heavy occlusion in real applications. In this work, we propose a method to regress 6D poses by predicting three points for each object in the 3D point cloud through deep learning. To solve the ambiguity of symmetric pose, we propose a labeling method to help the network converge better. Based on the predicted pose, an iterative method is employed for pose optimization. In real-world experiments, our method outperforms the classical approach in both precision and recall.

Keywords: pose estimation, deep learning, point cloud, bin-picking, 3D computer vision

Procedia PDF Downloads 135
2779 A Unified Webcam Proctoring Solution on Edge

Authors: Saw Thiha, Jay Rajasekera

Abstract:

A boom in video conferencing generated millions of hours of video data daily to be analyzed. However, such enormous data pose certain scalability issues to be analyzed efficiently, let alone do it in real-time, as online conferences can involve hundreds of people and can last for hours. This paper proposes an efficient online proctoring solution that can analyze the online conferences real-time on edge devices such as Android, iOS, and desktops. Since the computation can be done upfront on the devices where online conferences take place, it can scale well without requiring intensive resources such as GPU servers and complex cloud infrastructure. According to the linear models, face orientation does indeed impact the perceived eye openness. Also, the proposed z score facial landmark standardization was proven to be functional in detecting face orientation and contributed to classifying eye blinks with single eyelid distance computation while achieving a better f1 score and accuracy than the Eye Aspect Ratio (EAR) threshold method. Last but not least, the authors implemented the solution natively in the MediaPipe framework and open-sourced it along with the reproducible experimental results on GitHub. The solution provides face orientation, eye blink, facial activity, and translation detections out of the box and is highly customizable and extensible.

Keywords: android, desktop, edge computing, blink, face orientation, facial activity and translation, MediaPipe, open source, real-time, video conference, web, iOS, Z score facial landmark standardization

Procedia PDF Downloads 72
2778 Real Time Multi Person Action Recognition Using Pose Estimates

Authors: Aishrith Rao

Abstract:

Human activity recognition is an important aspect of video analytics, and many approaches have been recommended to enable action recognition. In this approach, the model is used to identify the action of the multiple people in the frame and classify them accordingly. A few approaches use RNNs and 3D CNNs, which are computationally expensive and cannot be trained with the small datasets which are currently available. Multi-person action recognition has been performed in order to understand the positions and action of people present in the video frame. The size of the video frame can be adjusted as a hyper-parameter depending on the hardware resources available. OpenPose has been used to calculate pose estimate using CNN to produce heap-maps, one of which provides skeleton features, which are basically joint features. The features are then extracted, and a classification algorithm can be applied to classify the action.

Keywords: human activity recognition, computer vision, pose estimates, convolutional neural networks

Procedia PDF Downloads 112
2777 Individualized Emotion Recognition Through Dual-Representations and Ground-Established Ground Truth

Authors: Valentina Zhang

Abstract:

While facial expression is a complex and individualized behavior, all facial emotion recognition (FER) systems known to us rely on a single facial representation and are trained on universal data. We conjecture that: (i) different facial representations can provide different, sometimes complementing views of emotions; (ii) when employed collectively in a discussion group setting, they enable more accurate emotion reading which is highly desirable in autism care and other applications context sensitive to errors. In this paper, we first study FER using pixel-based DL vs semantics-based DL in the context of deepfake videos. Our experiment indicates that while the semantics-trained model performs better with articulated facial feature changes, the pixel-trained model outperforms on subtle or rare facial expressions. Armed with these findings, we have constructed an adaptive FER system learning from both types of models for dyadic or small interacting groups and further leveraging the synthesized group emotions as the ground truth for individualized FER training. Using a collection of group conversation videos, we demonstrate that FER accuracy and personalization can benefit from such an approach.

Keywords: neurodivergence care, facial emotion recognition, deep learning, ground truth for supervised learning

Procedia PDF Downloads 105