Search results for: maxim infringement recognition
1617 Features Vector Selection for the Recognition of the Fragmented Handwritten Numeric Chains
Authors: Salim Ouchtati, Aissa Belmeguenai, Mouldi Bedda
Abstract:
In this study, we propose an offline system for recognizing fragmented handwritten numeric chains. First, we built a recognition system for isolated handwritten digits; this part of the study is based mainly on evaluating the performance of a neural network trained with the gradient backpropagation algorithm. The parameters that form the input vector of the neural network are extracted from binary images of the isolated handwritten digits by several methods: the distribution sequence, sondes application, the Barr features, and the centered moments of the different projections and profiles. Second, the study is extended to the reading of fragmented handwritten numeric chains made up of a variable number of digits. The vertical projection is used to segment the numeric chain into isolated digits, and every digit (or segment) is presented separately to the input of the system built in the first part (the recognition system for isolated handwritten digits).
Keywords: features extraction, handwritten numeric chains, image processing, neural networks
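The segmentation step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the image is a list of rows with 1 for ink and 0 for background, and that digits are separated by at least one empty column; the function name and the toy image are hypothetical and not taken from the paper.

```python
def segment_by_vertical_projection(image):
    """Return (start, end) column ranges of ink runs in a binary image.

    `image` is a list of equally long rows; 1 = ink, 0 = background.
    `end` is exclusive. Assumes digits do not touch each other.
    """
    if not image:
        return []
    width = len(image[0])
    # Vertical projection: number of ink pixels in each column.
    projection = [sum(row[x] for row in image) for x in range(width)]

    segments = []
    start = None
    for x, count in enumerate(projection):
        if count > 0 and start is None:
            start = x                      # a new segment begins
        elif count == 0 and start is not None:
            segments.append((start, x))    # an empty column closes it
            start = None
    if start is not None:
        segments.append((start, width))    # segment touching the right edge
    return segments


toy_image = [
    [1, 1, 0, 0, 1, 0, 0, 1, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 1, 0, 1],
    [1, 1, 0, 0, 1, 0, 0, 1, 1, 1],
]
print(segment_by_vertical_projection(toy_image))
```

Each returned column range could then be cropped and passed to an isolated-digit classifier, as the abstract describes.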
Procedia PDF Downloads 267
1616 Semantic Data Schema Recognition
Authors: Aïcha Ben Salem, Faouzi Boufares, Sebastiao Correia
Abstract:
The subject covered in this paper aims at assisting users in their quality approach. The goal is to better extract, mix, interpret, and reuse data. The paper deals with the semantic schema recognition of a data source. This enables the extraction of data semantics from all the available information, including the data and the metadata. First, it consists of categorizing the data by assigning it to a category and possibly a sub-category, and second, of establishing relations between columns and possibly discovering the semantics of the manipulated data source. The links detected between columns offer a better understanding of the source and of the alternatives for correcting data. This approach allows automatic detection of a large number of syntactic and semantic anomalies.
Keywords: schema recognition, semantic data profiling, meta-categorisation, semantic dependencies inter columns
Procedia PDF Downloads 418
1615 Speech Recognition Performance by Adults: A Proposal for a Battery for Marathi
Authors: S. B. Rathna Kumar, Pranjali A Ujwane, Panchanan Mohanty
Abstract:
The present study aimed to develop a battery for assessing speech recognition performance by adults in Marathi. Four word lists were developed by considering word frequency, word familiarity, words in common use, and phonemic balance. Each word list consists of 25 words (15 monosyllabic words in CVC structure and 10 monosyllabic words in CVCV structure). Equivalence analysis and performance-intensity function testing were carried out using the four word lists on a total of 150 native speakers of Marathi from different regions of Maharashtra (Vidarbha, Marathwada, Khandesh and Northern Maharashtra, Pune, and Konkan). The subjects were divided equally into five groups based on these regions. There was no significant difference (p > 0.05) in speech recognition performance between groups for each word list or between word lists for each group. Hence, the four word lists were equally difficult for all the groups and can be used interchangeably. The performance-intensity (PI) function curve was semi-linear, and the groups' mean slope of the linear portion of the curve indicated an average increase in word recognition score of 4.64%, 4.73%, 4.68%, and 4.85% per dB for list 1, list 2, list 3, and list 4, respectively. Although no data are available on speech recognition tests for adults in Marathi, most of the findings of the study are in line with research reports on other languages. The four word lists thus developed were found to have sufficient reliability and validity for assessing speech recognition performance by adults in Marathi.
Keywords: speech recognition performance, phonemic balance, equivalence analysis, performance-intensity function testing, reliability, validity
Procedia PDF Downloads 358
1614 Face Recognition Using Body-Worn Camera: Dataset and Baseline Algorithms
Authors: Ali Almadan, Anoop Krishnan, Ajita Rattani
Abstract:
Facial recognition is a widely adopted technology in surveillance, border control, healthcare, banking services, and, lately, in mobile user authentication, with Apple introducing the “Face ID” moniker with the iPhone X. A lot of research has been conducted in the area of face recognition on datasets captured by surveillance cameras, DSLRs, and mobile devices. Recently, face recognition technology has also been deployed on body-worn cameras to keep officers safe, enabling situational awareness and providing evidence for trial. However, limited academic research has been conducted on this topic so far, and no publicly available dataset with a sufficient sample size exists. This paper aims to advance research in the area of face recognition using body-worn cameras. To this end, the contribution of this work is two-fold: (1) collection of a dataset consisting of a total of 136,939 facial images of 102 subjects captured using body-worn cameras in indoor and daylight conditions and (2) evaluation of various deep-learning architectures for face identification on the collected dataset. Experimental results suggest a maximum True Positive Rate (TPR) of 99.86% at a False Positive Rate (FPR) of 0.000, obtained by a SphereFace-based deep learning architecture in daylight conditions. The collected dataset and the baseline algorithms will promote further research and development. A download link for the dataset and the algorithms is available by contacting the authors.
Keywords: face recognition, body-worn cameras, deep learning, person identification
Procedia PDF Downloads 163
1613 Pre-Analysis of Printed Circuit Boards Based on Multispectral Imaging for Vision Based Recognition of Electronics Waste
Authors: Florian Kleber, Martin Kampel
Abstract:
The increasing demand for gallium, indium, and rare-earth elements for the production of electronics, e.g. solid-state lighting, photovoltaics, integrated circuits, and liquid crystal displays, will exceed the world-wide supply according to current forecasts. Recycling systems to reclaim these materials are not yet in place, which challenges the sustainability of these technologies. This paper proposes a multispectral imaging system as a basis for a vision based recognition system for valuable components of electronics waste. Multispectral images are intended to enhance the contrast of images of printed circuit boards (single components, as well as labels) for further analysis, such as optical character recognition and entire printed circuit board recognition. The results show that a higher contrast is achieved in the near infrared compared to ultraviolet and visible light.
Keywords: electronics waste, multispectral imaging, printed circuit boards, rare-earth elements
Procedia PDF Downloads 416
1612 The Combination of the Mel Frequency Cepstral Coefficients, Perceptual Linear Prediction, Jitter and Shimmer Coefficients for the Improvement of Automatic Recognition System for Dysarthric Speech
Authors: Brahim Fares Zaidi
Abstract:
Our work aims to improve our Automatic Recognition System for Dysarthric Speech, based on Hidden Markov Models and the Hidden Markov Model Toolkit, to help people with pronunciation problems. We applied two speech parameterization techniques, based on Mel Frequency Cepstral Coefficients and Perceptual Linear Prediction, and concatenated them with jitter and shimmer coefficients in order to increase the recognition rate for dysarthric speech. For our tests, we used the NEMOURS database, which contains speakers with dysarthria and normal speakers.
Keywords: ARSDS, HTK, HMM, MFCC, PLP
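As a rough illustration of the jitter and shimmer coefficients named above, the sketch below uses the common "local" definition: the mean absolute difference between consecutive values divided by the mean value, applied to pitch periods for jitter and to peak amplitudes for shimmer. The abstract does not say which variant the author used, so this definition, the function name, and the sample values are assumptions.

```python
def local_perturbation(values):
    """Mean absolute difference of consecutive values, relative to the mean.

    Applied to pitch periods this gives local jitter; applied to
    cycle peak amplitudes it gives local shimmer (both as fractions).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_abs_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_abs_diff / mean_value


# Hypothetical per-cycle measurements (seconds and linear amplitude).
periods = [0.010, 0.011, 0.010, 0.011]
amplitudes = [0.80, 0.80, 0.80, 0.80]

jitter = local_perturbation(periods)
shimmer = local_perturbation(amplitudes)
print(f"jitter={jitter:.4f} shimmer={shimmer:.4f}")
```

In a feature pipeline like the one described, such scalar values would be appended to each MFCC or PLP feature vector before training.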
Procedia PDF Downloads 110
1611 Multimodal Data Fusion Techniques in Audiovisual Speech Recognition
Authors: Hadeer M. Sayed, Hesham E. El Deeb, Shereen A. Taie
Abstract:
In the big data era, we are facing a diversity of datasets from different sources in different domains that describe a single life event. These datasets consist of multiple modalities, each of which has a different representation, distribution, scale, and density. Multimodal fusion is the concept of integrating information from multiple modalities in a joint representation with the goal of predicting an outcome through a classification task or regression task. In this paper, multimodal fusion techniques are classified into two main classes: model-agnostic techniques and model-based approaches. It provides a comprehensive study of recent research in each class and outlines the benefits and limitations of each of them. Furthermore, the audiovisual speech recognition task is expressed as a case study of multimodal data fusion approaches, and the open issues through the limitations of the current studies are presented. This paper can be considered a powerful guide for interested researchers in the field of multimodal data fusion and audiovisual speech recognition particularly.Keywords: multimodal data, data fusion, audio-visual speech recognition, neural networks
Procedia PDF Downloads 114
1610 Distant Speech Recognition Using Laser Doppler Vibrometer
Authors: Yunbin Deng
Abstract:
Most existing applications of automatic speech recognition rely on cooperative subjects at a short distance from a microphone. Standoff speech recognition using microphone arrays can extend the subject-to-sensor distance somewhat, but it is still limited to only a few feet. As such, most deployed applications of standoff speech recognition are limited to indoor use at short range. Moreover, these applications require an air passway between the subject and the sensor to achieve a reasonable signal-to-noise ratio. This study reports long range (50 feet) automatic speech recognition experiments using a Laser Doppler Vibrometer (LDV) sensor. It shows that the LDV sensor modality can extend the speech acquisition standoff distance far beyond microphone arrays, to hundreds of feet. In addition, LDV enables 'listening' through windows for uncooperative subjects. This enables new capabilities in automatic audio and speech intelligence, surveillance, and reconnaissance (ISR) for law enforcement, homeland security, and counter terrorism applications. The Polytec LDV model OFV-505 is used in this study. To investigate the impact of different vibrating materials, five parallel LDV speech corpora, each consisting of 630 speakers, are collected from the vibrations of a glass window, a metal plate, a plastic box, a wood slate, and a concrete wall. These are common materials the application could encounter in daily life. These data were compared with their microphone counterpart to show the impact of the various materials on the spectrum of the LDV speech signal. State-of-the-art deep neural network modeling approaches are used to conduct continuous speaker-independent speech recognition on these LDV speech datasets. Preliminary phoneme recognition results using a time-delay neural network, bi-directional long short-term memory, and model fusion show great promise for using LDV for long range speech recognition. To the author's best knowledge, this is the first time an LDV has been reported for a long distance speech recognition application.
Keywords: covert speech acquisition, distant speech recognition, DSR, laser Doppler vibrometer, LDV, speech intelligence surveillance and reconnaissance, ISR
Procedia PDF Downloads 180
1609 Interactive Shadow Play Animation System
Authors: Bo Wan, Xiu Wen, Lingling An, Xiaoling Ding
Abstract:
The paper describes a Chinese shadow play animation system based on Kinect. Users without any professional training can manipulate the shadow characters with their body actions to complete a shadow play performance and, if they wish, obtain a shadow play video by giving the record command to the system. In our system, Kinect is responsible for capturing human movement and voice command data. A gesture recognition module is used to control the change of shadow play scenes. After packaging the data from Kinect and the recognition result from the gesture recognition module, VRPN transmits them to the server side. Finally, the server side uses this information to control the motion of the shadow characters and the video recording. This system not only achieves human-computer interaction but also realizes interaction between people. It brings an entertaining experience to users and is easy to operate for all ages. Even more important, the application background of Chinese shadow play embodies the protection of the art of shadow play animation.
Keywords: shadow play animation, Kinect, gesture recognition, VRPN, HCI
Procedia PDF Downloads 402
1608 Effective Stacking of Deep Neural Models for Automated Object Recognition in Retail Stores
Authors: Ankit Sinha, Soham Banerjee, Pratik Chattopadhyay
Abstract:
Automated product recognition in retail stores is an important real-world application in the domain of Computer Vision and Pattern Recognition. In this paper, we consider the problem of automatically identifying the classes of the products placed on racks in retail stores from an image of the rack and information about the query/product images. We improve upon the existing approaches in terms of effectiveness and memory requirement by developing a two-stage object detection and recognition pipeline comprising of a Faster-RCNN-based object localizer that detects the object regions in the rack image and a ResNet-18-based image encoder that classifies the detected regions into the appropriate classes. Each of the models is fine-tuned using appropriate data sets for better prediction and data augmentation is performed on each query image to prepare an extensive gallery set for fine-tuning the ResNet-18-based product recognition model. This encoder is trained using a triplet loss function following the strategy of online-hard-negative-mining for improved prediction. The proposed models are lightweight and can be connected in an end-to-end manner during deployment to automatically identify each product object placed in a rack image. Extensive experiments using Grozi-32k and GP-180 data sets verify the effectiveness of the proposed model.Keywords: retail stores, faster-RCNN, object localization, ResNet-18, triplet loss, data augmentation, product recognition
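The triplet loss with online hard negative mining mentioned in this abstract can be sketched on a single batch in plain Python. For every anchor-positive pair in the batch, the closest embedding with a different label is taken as the negative. The margin value, the Euclidean distance, and the function name are assumptions for illustration; the paper's exact mining strategy and encoder are not reproduced here.

```python
import math


def triplet_loss_hard_negative(embeddings, labels, margin=0.2):
    """Mean triplet loss over a batch, using the hardest in-batch negative.

    loss(a, p, n) = max(0, d(a, p) - d(a, n) + margin)
    where n is the nearest embedding whose label differs from the anchor's.
    """
    losses = []
    for i, (anchor, anchor_label) in enumerate(zip(embeddings, labels)):
        negative_dists = [
            math.dist(anchor, other)
            for other, other_label in zip(embeddings, labels)
            if other_label != anchor_label
        ]
        if not negative_dists:
            continue  # no negative available for this anchor
        hardest_negative = min(negative_dists)
        for j, (positive, positive_label) in enumerate(zip(embeddings, labels)):
            if j != i and positive_label == anchor_label:
                d_ap = math.dist(anchor, positive)
                losses.append(max(0.0, d_ap - hardest_negative + margin))
    return sum(losses) / len(losses) if losses else 0.0


# Toy batch: class 1 sits between the two class-0 points, so the loss is positive.
batch = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.0)]
batch_labels = [0, 0, 1]
print(triplet_loss_hard_negative(batch, batch_labels, margin=0.5))
```

In a real training loop the same computation would be done on tensors so that gradients flow back into the encoder; this version only shows how the triplets are selected and scored.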
Procedia PDF Downloads 157
1607 Evolution of the Environmental Justice Concept
Authors: Zahra Bakhtiari
Abstract:
This article explores the development and evolution of the concept of environmental justice, which has shifted from being dominated by white and middle-class individuals to a civil struggle by marginalized communities against environmental injustices. Environmental justice aims to achieve equity in decision-making and policy-making related to the environment. The concept of justice in this context includes four fundamental aspects: distribution, procedure, recognition, and capabilities. Recent scholars have attempted to broaden the concept of justice to include dimensions of participation, recognition, and capabilities. Focusing on all four dimensions of environmental justice is crucial for effective planning and policy-making to address environmental issues. Ignoring any of these aspects can lead to the failure of efforts and the waste of resources.Keywords: environmental justice, distribution, procedure, recognition, capabilities
Procedia PDF Downloads 93
1606 Two Concurrent Convolution Neural Networks TC*CNN Model for Face Recognition Using Edge
Authors: T. Alghamdi, G. Alaghband
Abstract:
In this paper, we develop a model that couples two concurrent Convolution Neural Networks with different filters (TC*CNN) for face recognition and compare its performance to an existing sequential CNN (base model). We also test and compare the quality and performance of the models on three datasets with various levels of complexity (easy, moderate, and difficult) and show that, for the most complex datasets, edges produce the most accurate and efficient results. We further show that, in such cases, Support Vector Machine (SVM) models are fast but do not produce accurate results.
Keywords: Convolution Neural Network, Edges, Face Recognition, Support Vector Machine
Procedia PDF Downloads 156
1605 Real-Time Recognition of Dynamic Hand Postures on a Neuromorphic System
Authors: Qian Liu, Steve Furber
Abstract:
To explore how the brain may recognize objects in its general, accurate, and energy-efficient manner, this paper proposes the use of a neuromorphic hardware system formed from a Dynamic Video Sensor (DVS) silicon retina in concert with the SpiNNaker real-time Spiking Neural Network (SNN) simulator. As a first step in the exploration on this platform, a recognition system for dynamic hand postures is developed, enabling the study of the methods used in the visual pathways of the brain. Inspired by the behaviours of the primary visual cortex, Convolutional Neural Networks (CNNs) are modeled using both linear perceptrons and spiking Leaky Integrate-and-Fire (LIF) neurons. In this study's largest configuration using these approaches, a network of 74,210 neurons and 15,216,512 synapses is created and operated in real time using 290 SpiNNaker processor cores in parallel and with 93.0% accuracy. A smaller network using only 1/10th of the resources is also created, again operating in real time, and it is able to recognize the postures with an accuracy of around 86.4%, only 6.6% lower than the much larger system. The recognition rate of the smaller network developed on this neuromorphic system is sufficient for a successful hand posture recognition system and demonstrates a much-improved cost-to-performance trade-off in its approach.
Keywords: spiking neural network (SNN), convolutional neural network (CNN), posture recognition, neuromorphic system
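For readers unfamiliar with the Leaky Integrate-and-Fire (LIF) neuron named above, here is a minimal discrete-time sketch using forward Euler integration. All parameter values (time constant, threshold, reset, unit resistance) are illustrative assumptions and are not the settings used on SpiNNaker in the paper.

```python
def simulate_lif(input_current, dt=1.0, tau=10.0, v_rest=0.0,
                 v_threshold=1.0, v_reset=0.0):
    """Simulate one LIF neuron and return the indices of steps that spiked.

    Membrane update (forward Euler, unit resistance assumed):
        v += dt / tau * (-(v - v_rest) + I)
    When v reaches the threshold, a spike is recorded and v is reset.
    """
    v = v_rest
    spike_steps = []
    for step, current in enumerate(input_current):
        v += dt / tau * (-(v - v_rest) + current)
        if v >= v_threshold:
            spike_steps.append(step)
            v = v_reset
    return spike_steps


# A constant drive above threshold produces regular spiking;
# a drive below threshold leaks away and never fires.
strong = simulate_lif([2.0] * 50)
weak = simulate_lif([0.5] * 50)
print(len(strong), len(weak))
```

The leak term pulls the membrane potential back toward rest, so only sufficiently strong or sustained input produces spikes, which is the property the spiking CNN layers rely on.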
Procedia PDF Downloads 473
1604 Pattern Recognition Search: An Advancement Over Interpolation Search
Authors: Shahpar Yilmaz, Yasir Nadeem, Syed A. Mehdi
Abstract:
Searching for a record in a dataset is always a frequent task for any data structure-related application. Hence, a fast and efficient algorithm for the approach has its importance in yielding the quickest results and enhancing the overall productivity of the company. Interpolation search is one such technique used to search through a sorted set of elements. This paper proposes a new algorithm, an advancement over interpolation search for the application of search over a sorted array. Pattern Recognition Search or PR Search (PRS), like interpolation search, is a pattern-based divide and conquer algorithm whose objective is to reduce the sample size in order to quicken the process and it does so by treating the array as a perfect arithmetic progression series and thereby deducing the key element’s position. We look to highlight some of the key drawbacks of interpolation search, which are accounted for in the Pattern Recognition Search.Keywords: array, complexity, index, sorting, space, time
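For context, the baseline that this abstract builds on, classic interpolation search, can be sketched as follows. This is the textbook algorithm, not the authors' Pattern Recognition Search, whose details are not given in the abstract. The sketch assumes a sorted list of integers.

```python
def interpolation_search(arr, key):
    """Return an index of `key` in the sorted integer list `arr`, or -1.

    The probe position is estimated by assuming the values between
    arr[lo] and arr[hi] are spread roughly like an arithmetic progression.
    """
    lo, hi = 0, len(arr) - 1
    while lo <= hi and arr[lo] <= key <= arr[hi]:
        if arr[hi] == arr[lo]:
            # All remaining values are equal; avoid division by zero.
            return lo if arr[lo] == key else -1
        # Linear estimate of where `key` should sit between lo and hi.
        pos = lo + (key - arr[lo]) * (hi - lo) // (arr[hi] - arr[lo])
        if arr[pos] == key:
            return pos
        if arr[pos] < key:
            lo = pos + 1
        else:
            hi = pos - 1
    return -1


values = list(range(10, 110, 10))   # a perfect arithmetic progression
print(interpolation_search(values, 70))
print(interpolation_search(values, 75))
```

On a perfect arithmetic progression the first probe lands directly on the key, which is the situation the abstract says PRS takes as its working assumption.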
Procedia PDF Downloads 247
1603 Pattern Recognition Based on Simulation of Chemical Senses (SCS)
Authors: Nermeen El Kashef, Yasser Fouad, Khaled Mahar
Abstract:
No AI-complete system can model the human brain or behavior without looking at the totality of the whole situation and incorporating a combination of senses. This paper proposes a pattern recognition model based on Simulation of Chemical Senses (SCS) for the separation and classification of sign language. The model is based on the human taste controlling strategy. The main idea of the introduced model is motivated by the fact that the tongue first clusters an input substance into its basic tastes, and then the brain recognizes its flavor. To implement this strategy, a two-level architecture is proposed, inspired by the taste system. The separation level of the architecture focuses on hand posture clustering, while the classification level recognizes the sign language. The efficiency of the proposed model is demonstrated experimentally by recognizing an American Sign Language (ASL) data set. The recognition accuracy obtained for ASL numbers is 92.9 percent.
Keywords: artificial intelligence, biocybernetics, gustatory system, sign language recognition, taste sense
Procedia PDF Downloads 295
1602 Image Processing techniques for Surveillance in Outdoor Environment
Authors: Jayanth C., Anirudh Sai Yetikuri, Kavitha S. N.
Abstract:
This paper explores the development and application of computer vision and machine learning techniques for real-time pose detection, facial recognition, and number plate extraction. Utilizing MediaPipe for pose estimation, the research presents methods for detecting hand raises and ducking postures through real-time video analysis. Complementarily, facial recognition is employed to compare and verify individual identities using the face recognition library. Additionally, the paper demonstrates a robust approach for extracting and storing vehicle number plates from images, integrating Optical Character Recognition (OCR) with a database management system. The study highlights the effectiveness and versatility of these technologies in practical scenarios, including security and surveillance applications. The findings underscore the potential of combining computer vision techniques to address diverse challenges and enhance automated systems for both individual and vehicular identification. This research contributes to the fields of computer vision and machine learning by providing scalable solutions and demonstrating their applicability in real-world contexts.Keywords: computer vision, pose detection, facial recognition, number plate extraction, machine learning, real-time analysis, OCR, database management
Procedia PDF Downloads 27
1601 Models and Metamodels for Computer-Assisted Natural Language Grammar Learning
Authors: Evgeny Pyshkin, Maxim Mozgovoy, Vladislav Volkov
Abstract:
The paper follows a discourse on computer-assisted language learning. We examine problems of foreign language teaching and learning and introduce a metamodel that can be used to define learning models of language grammar structures in order to support teacher/student interaction. Special attention is paid to the concept of a virtual language lab. Our approach to language education encourages learners to experiment with a language and to learn by discovering patterns of grammatically correct structures created and managed by a language expert.
Keywords: computer-assisted instruction, language learning, natural language grammar models, HCI
Procedia PDF Downloads 521
1600 Defect Localization and Interaction on Surfaces with Projection Mapping and Gesture Recognition
Authors: Qiang Wang, Hongyang Yu, MingRong Lai, Miao Luo
Abstract:
This paper presents a method for accurately localizing and interacting with known surface defects by overlaying patterns onto real-world surfaces using a projection system. Given the world coordinates of the defects, we project corresponding patterns onto the surfaces, providing an intuitive visualization of the specific defect locations. To enable users to interact with and retrieve more information about individual defects, we implement a gesture recognition system based on a pruned and optimized version of YOLOv6. This lightweight model achieves an accuracy of 82.8% and is suitable for deployment on low-performance devices. Our approach demonstrates the potential for enhancing defect identification, inspection processes, and user interaction in various applications.Keywords: defect localization, projection mapping, gesture recognition, YOLOv6
Procedia PDF Downloads 90
1599 SCNet: A Vehicle Color Classification Network Based on Spatial Cluster Loss and Channel Attention Mechanism
Authors: Fei Gao, Xinyang Dong, Yisu Ge, Shufang Lu, Libo Weng
Abstract:
Vehicle color recognition plays an important role in traffic accident investigation. However, due to the influence of illumination, weather, and noise, vehicle color recognition still faces challenges. In this paper, a vehicle color classification network based on spatial cluster loss and channel attention mechanism (SCNet) is proposed for vehicle color recognition. A channel attention module is applied to extract the features of vehicle color representative regions and reduce the weight of nonrepresentative color regions in the channel. The proposed loss function, called spatial clustering loss (SC-loss), consists of two channel-specific components: a concentration component and a diversity component. The concentration component forces all feature channels belonging to the same class to be concentrated through the channel cluster. The diversity component imposes additional constraints on the channels through the mean distance coefficient, making them mutually exclusive in spatial dimensions. In the comparison experiments, the proposed method achieves state-of-the-art performance on the public datasets VCD and VeRi, with 96.1% and 96.2%, respectively. In addition, the ablation experiment further shows that SC-loss can effectively improve the accuracy of vehicle color recognition.
Keywords: feature extraction, convolutional neural networks, intelligent transportation, vehicle color recognition
Procedia PDF Downloads 185
1598 Analyzing the Use of Augmented Reality and Image Recognition in Cultural Education: Use Case of Sintra Palace Treasure Hunt Application
Authors: Marek Maruszczak
Abstract:
Gamified applications have been used successfully in education for years. The rapid development of technologies such as augmented reality and image recognition increases their availability and reduces their prices. Thus, there is an increasing possibility and need for a wide use of such applications in education. The main purpose of this article is to present the effects of work on a mobile application with augmented reality, the aim of which is to motivate tourists to pay more attention to the attractions and increase the likelihood of moving from one attraction to the next while visiting the Palácio Nacional de Sintra in Portugal. Work on the application was carried out together with the employees of Parques de Sintra from 2019 to 2021. Their effect was the preparation of a mobile application using augmented reality and image recognition. The application was tested on the palace premises by both Parques de Sintra employees and tourists visiting Palácio Nacional de Sintra. The collected conclusions allowed for the formulation of good practices and guidelines that can be used when designing gamified apps for the purpose of cultural education.Keywords: augmented reality, cultural education, gamification, image recognition, mobile games
Procedia PDF Downloads 190
1597 The Effect of Experimentally Induced Stress on Facial Recognition Ability of Security Personnel’s
Authors: Zunjarrao Kadam, Vikas Minchekar
Abstract:
Facial recognition is an important task in criminal investigation procedure. Security guards, who constantly watch people, can help to identify suspects. Forensic psychologists handle such cases in the criminal justice system. Security personnel may lose their ability to correctly identify persons due to constant stress while performing their duty. The present study aimed to identify the effect of experimentally induced stress on the facial recognition ability of security personnel. For this study, 50 security guards from the cities of Sangli, Miraj, and Jaysingpur in the state of Maharashtra, India, were recruited. A randomized two-group design was employed to carry out the research. In the initial condition, twenty identity card size photographs were shown to both groups. Afterward, artificial stress was induced in the experimental group through a difficult puzzle-solving task in a limited period. In the second condition, both groups were presented with the earlier photographs together with thirty additional new photographs. The subjects were asked to recognize the photographs that had been shown earlier. The analyzed data revealed that the control group had a higher mean facial recognition score than the experimental group. The results are discussed in the present research.
Keywords: experimentally induced stress, facial recognition, cognition, security personnel
Procedia PDF Downloads 262
1596 Optimized Dynamic Bayesian Networks and Neural Verifier Test Applied to On-Line Isolated Characters Recognition
Authors: Redouane Tlemsani, Redouane, Belkacem Kouninef, Abdelkader Benyettou
Abstract:
In this paper, our system is a Markovian system that can be viewed as a Dynamic Bayesian Network. One of the major interests of these systems lies in the complete training of the models (topology and parameters) from training data. Bayesian networks are models for representing uncertain knowledge about complex phenomena. They combine probability theory and graph theory to provide effective tools for representing a joint probability distribution over a set of random variables. Knowledge representation is based on describing, by graphs, the causal relations that exist between the variables defining the field of study. The theory of Dynamic Bayesian Networks generalizes Bayesian networks to dynamic processes. Our objective is to find the best structure that represents the relationships (dependencies) between the variables of a dynamic Bayesian network. In pattern recognition applications, the structure is usually fixed in advance, which obliges us to accept some strong assumptions (for example, independence between some variables).
Keywords: Arabic on line character recognition, dynamic Bayesian network, pattern recognition, networks
Procedia PDF Downloads 619
1595 Size-Reduction Strategies for Iris Codes
Authors: Jutta Hämmerle-Uhl, Georg Penn, Gerhard Pötzelsberger, Andreas Uhl
Abstract:
Iris codes contain bits with different entropy. This work investigates different strategies to reduce the size of iris code templates with the aim of reducing storage requirements and computational demand in the matching process. Besides simple sub-sampling schemes, a binary multi-resolution representation as used in the JBIG hierarchical coding mode is also assessed. We find that iris code template size can be reduced significantly while maintaining recognition accuracy. In addition, we propose a two-stage identification approach, using small-sized iris code templates in a pre-selection stage and full resolution templates for final identification, which shows promising recognition behaviour.
Keywords: iris recognition, compact iris code, fast matching, best bits, pre-selection identification, two-stage identification
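The two-stage idea in this abstract can be sketched with simple sub-sampling and the fractional Hamming distance that is commonly used for iris code matching. The sub-sampling step, shortlist size, function names, and toy codes are assumptions for illustration; occlusion masks, rotation compensation, and the JBIG-style representation assessed in the paper are left out.

```python
def hamming_distance(code_a, code_b):
    """Fraction of positions at which two equally long bit lists differ."""
    return sum(a != b for a, b in zip(code_a, code_b)) / len(code_a)


def subsample(code, step):
    """Keep every `step`-th bit to obtain a smaller template."""
    return code[::step]


def two_stage_identify(probe, gallery, step=4, shortlist_size=2):
    """Pre-select candidates with small templates, then match at full size.

    `gallery` maps an identity name to its full-resolution bit list.
    """
    small_probe = subsample(probe, step)
    # Stage 1: rank all identities using the cheap, reduced templates.
    ranked = sorted(
        gallery,
        key=lambda name: hamming_distance(small_probe, subsample(gallery[name], step)),
    )
    candidates = ranked[:shortlist_size]
    # Stage 2: full-resolution comparison on the shortlist only.
    return min(candidates, key=lambda name: hamming_distance(probe, gallery[name]))


toy_gallery = {
    "a": [0] * 16,
    "b": [1] * 16,
    "c": [0, 1] * 8,
}
toy_probe = [1] * 15 + [0]   # identity "b" with one flipped bit
print(two_stage_identify(toy_probe, toy_gallery))
```

The saving comes from stage 1 touching only a fraction of the bits for every gallery entry, while the full comparison runs on a handful of candidates.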
Procedia PDF Downloads 441
1594 Static and Dynamic Hand Gesture Recognition Using Convolutional Neural Network Models
Authors: Keyi Wang
Abstract:
Similar to the touchscreen, hand gesture based human-computer interaction (HCI) is a technology that could allow people to perform a variety of tasks faster and more conveniently. This paper proposes a training method of an image-based hand gesture image and video clip recognition system using a CNN (Convolutional Neural Network) with a dataset. A dataset containing 6 hand gesture images is used to train a 2D CNN model. ~98% accuracy is achieved. Furthermore, a 3D CNN model is trained on a dataset containing 4 hand gesture video clips resulting in ~83% accuracy. It is demonstrated that a Cozmo robot loaded with pre-trained models is able to recognize static and dynamic hand gestures.Keywords: deep learning, hand gesture recognition, computer vision, image processing
Procedia PDF Downloads 143
1593 Features Reduction Using Bat Algorithm for Identification and Recognition of Parkinson Disease
Authors: P. Shrivastava, A. Shukla, K. Verma, S. Rungta
Abstract:
Parkinson's disease is a chronic neurological disorder that directly affects human gait. It leads to slowness of movement and causes muscle rigidity and tremors. Gait serves as a primary outcome measure for studies aiming at early recognition of the disease. Using gait techniques, this paper implements an efficient binary bat algorithm for early detection of Parkinson's disease by selecting the optimal features required to distinguish affected patients from others. Data from 166 people, both fit and affected, are collected, and optimal feature selection is done using PSO and the bat algorithm. The reduced dataset is then classified using a neural network. The experiments indicate that the binary bat algorithm outperforms traditional PSO and the genetic algorithm and gives a fairly good recognition rate even with the reduced dataset.
Keywords: parkinson, gait, feature selection, bat algorithm
Procedia PDF Downloads 549
1592 KSVD-SVM Approach for Spontaneous Facial Expression Recognition
Authors: Dawood Al Chanti, Alice Caplier
Abstract:
Sparse representations of signals have received a great deal of attention in recent years. This paper demonstrates the value of sparse representation as a means of performing sparse discriminative analysis between spontaneous facial expressions. An automatic facial expression recognition system is presented. It uses a KSVD-SVM approach made of three main stages: a pre-processing and feature extraction stage, which solves the problem of shared subspace distribution based on random projection theory to obtain low-dimensional discriminative and reconstructive features; a dictionary learning and sparse coding stage, which uses the KSVD model to learn discriminative under- or over-complete dictionaries for sparse coding; and finally a classification stage, which uses an SVM classifier for facial expression recognition. Our main concern is to recognize non-basic affective states and non-acted expressions. Extensive experiments on the JAFFE database of static acted facial expressions and on the DynEmo database of dynamic spontaneous facial expressions show very good recognition rates.
Keywords: dictionary learning, random projection, pose and spontaneous facial expression, sparse representation
Procedia PDF Downloads 308
1591 Implementation of a Multimodal Biometrics Recognition System with Combined Palm Print and Iris Features
Authors: Rabab M. Ramadan, Elaraby A. Elgallad
Abstract:
In widespread use, unimodal biometric systems face a variety of problems such as signal and background noise, distortion, and environmental differences. Multimodal biometric systems have therefore been proposed to address these problems. This paper introduces a bimodal biometric recognition system based on features extracted from the human palm print and iris. Palm print biometrics is a fairly new, evolving technology that identifies people by their palm features. The iris is a strong competitor, together with the face and fingerprints, for inclusion in multimodal recognition systems. In this research, we introduce an algorithm that combines the features extracted from the palm and the iris using a texture-based descriptor, the Scale Invariant Feature Transform (SIFT). Since the feature sets are non-homogeneous, because features of different biometric modalities are used, they are concatenated to form a single feature vector. Particle swarm optimization (PSO) is used as a feature selection technique to reduce the dimensionality of the feature vector. The proposed algorithm will be applied to the Indian Institute of Technology Delhi (IITD) database, and its performance will be compared with various iris recognition algorithms found in the literature.
Keywords: iris recognition, particle swarm optimization, feature extraction, feature selection, palm print, the Scale Invariant Feature Transform (SIFT)
Procedia PDF Downloads 235
1590 Hand Gesture Detection via EmguCV Canny Pruning
Authors: N. N. Mosola, S. J. Molete, L. S. Masoebe, M. Letsae
Abstract:
Hand gesture recognition is a technique used to locate, detect, and recognize a hand gesture. Detection and recognition are concepts of Artificial Intelligence (AI). AI concepts are applicable in Human Computer Interaction (HCI), Expert Systems (ES), etc. Hand gesture recognition can be used in sign language interpretation. Sign language is a visual communication tool used mostly by deaf communities and people with speech disorders. Communication barriers arise when people with speech disorders interact with others. This research aims to build a hand recognition system for interpreting Lesotho's Sesotho and English languages. The system will help to bridge the communication problems encountered by these communities. The system has several processing modules: a hand detection engine, an image processing engine, feature extraction, and sign recognition. Detection is the process of identifying an object. The proposed system uses Canny pruning Haar and Haarcascade detection algorithms. Canny pruning implements Canny edge detection, an optimal image processing algorithm used to detect the edges of an object. The system employs a skin detection algorithm, which performs background subtraction and computes the convex hull and the centroid to assist in the detection process. Recognition is the process of gesture classification. Template matching classifies each hand gesture in real time. The system was tested in various experiments. The results show that time, distance, and light are factors that affect the rate of detection and, ultimately, recognition. Detection rate is directly proportional to the distance of the hand from the camera. Different lighting conditions were considered. The higher the light intensity, the faster the detection rate.
Based on the results obtained from this research, the applied methodologies are efficient and provide a plausible solution towards a lightweight, inexpensive system that can be used for sign language interpretation.
Keywords: canny pruning, hand recognition, machine learning, skin tracking
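The convex hull and centroid steps mentioned in this abstract can be illustrated in plain Python. This is a sketch under stated assumptions, not the EmguCV pipeline the authors describe: the hand is assumed to be already reduced to a list of 2D contour points (tuples), the hull is computed with Andrew's monotone chain method, and the centroid is taken as the mean of the given points rather than from image moments. The function names are hypothetical.

```python
def cross(o, a, b):
    """2D cross product of vectors OA and OB (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would make a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def centroid(points):
    """Arithmetic mean of a non-empty list of 2D points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)
```

For example, for the contour points `[(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]` the interior point `(1, 1)` is discarded and only the four corners remain on the hull.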
Procedia PDF Downloads 185
1589 Arabic Character Recognition Using Regression Curves with the Expectation Maximization Algorithm
Authors: Abdullah A. AlShaher
Abstract:
In this paper, we demonstrate how regression curves can be used to recognize 2D non-rigid handwritten shapes. Each shape is represented by a set of non-overlapping, uniformly distributed landmarks. The underlying models use second-order polynomials to model the shapes within a training set. To estimate the regression models, we need to extract the coefficients that describe the variations within a shape class. Hence, a least-squares method is used to estimate such modes. We then train these coefficients using the apparatus of the Expectation Maximization algorithm. Recognition is carried out by finding the least-error landmark displacement with respect to the model curves. Isolated handwritten Arabic characters are used to evaluate our approach.
Keywords: character recognition, regression curves, handwritten Arabic letters, expectation maximization algorithm
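The least-squares step and the least-error recognition rule in this abstract can be sketched as follows. This is a simplified illustration, not the authors' method: each shape is assumed to be a list of (x, y) landmarks modelled as y = a + b·x + c·x², the Expectation Maximization training stage is omitted entirely, and the names `fit_quadratic`, `fit_error`, and `classify` are hypothetical.

```python
def fit_quadratic(points):
    """Least-squares fit of y = a + b*x + c*x**2; returns (a, b, c)."""
    # Normal equations (A^T A) w = A^T y for the basis [1, x, x^2].
    s = [sum(x ** k for x, _ in points) for k in range(5)]
    t = [sum(y * x ** k for x, y in points) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the 3x4 system.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        if abs(m[col][col]) < 1e-12:
            raise ValueError("landmarks do not determine a quadratic")
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def fit_error(coeffs, points):
    """Sum of squared vertical displacements of landmarks from the curve."""
    a, b, c = coeffs
    return sum((y - (a + b * x + c * x * x)) ** 2 for x, y in points)


def classify(points, models):
    """Return the label whose model curve gives the least landmark error."""
    return min(models, key=lambda label: fit_error(models[label], points))
```

Here `models` maps a class label to a coefficient triple, for example one fitted per character class with `fit_quadratic`.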
Procedia PDF Downloads 145
1588 History, Challenges and Solutions for Social Work Education and Recognition in Vietnam
Authors: Thuy Bui Anh, Ngan Nguyen Thi Thanh
Abstract:
Currently, social work in Vietnam is taking its first steps towards becoming a true profession with a strong position in society. However, the spirit of helping and sharing that underlies social work has existed in the daily life of Vietnamese people for a very long time, becoming a precious heritage passed down from ancestors to later generations as they expanded the territory and built and defended the country. Over the course of history, charity work in Vietnam has gradually become more professional, especially in the last two decades. Accordingly, more than 50 universities and educational institutions in Vietnam have been licensed to provide social work training, ensuring a stronger base of human resources in this field. Despite this strong growth, the social work profession, social work education, and the recognition of the role of social workers still need further support to develop in response to the increasing demands of Vietnamese society.
Keywords: education, history, recognition, social work, Vietnam
Procedia PDF Downloads 321