Search results for: hand gesture recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5153

Search results for: hand gesture recognition

5093 Stable Diffusion, Context-to-Motion Model to Augmenting Dexterity of Prosthetic Limbs

Authors: André Augusto Ceballos Melo

Abstract:

Design to facilitate the recognition of congruent prosthetic movements, context-to-motion translations guided by image, verbal prompt, users nonverbal communication such as facial expressions, gestures, paralinguistics, scene context, and object recognition contributes to this process though it can also be applied to other tasks, such as walking, Prosthetic limbs as assistive technology through gestures, sound codes, signs, facial, body expressions, and scene context The context-to-motion model is a machine learning approach that is designed to improve the control and dexterity of prosthetic limbs. It works by using sensory input from the prosthetic limb to learn about the dynamics of the environment and then using this information to generate smooth, stable movements. This can help to improve the performance of the prosthetic limb and make it easier for the user to perform a wide range of tasks. There are several key benefits to using the context-to-motion model for prosthetic limb control. First, it can help to improve the naturalness and smoothness of prosthetic limb movements, which can make them more comfortable and easier to use for the user. Second, it can help to improve the accuracy and precision of prosthetic limb movements, which can be particularly useful for tasks that require fine motor control. Finally, the context-to-motion model can be trained using a variety of different sensory inputs, which makes it adaptable to a wide range of prosthetic limb designs and environments. Stable diffusion is a machine learning method that can be used to improve the control and stability of movements in robotic and prosthetic systems. It works by using sensory feedback to learn about the dynamics of the environment and then using this information to generate smooth, stable movements. One key aspect of stable diffusion is that it is designed to be robust to noise and uncertainty in the sensory feedback. This means that it can continue to produce stable, smooth movements even when the sensory data is noisy or unreliable. To implement stable diffusion in a robotic or prosthetic system, it is typically necessary to first collect a dataset of examples of the desired movements. This dataset can then be used to train a machine learning model to predict the appropriate control inputs for a given set of sensory observations. Once the model has been trained, it can be used to control the robotic or prosthetic system in real-time. The model receives sensory input from the system and uses it to generate control signals that drive the motors or actuators responsible for moving the system. Overall, the use of the context-to-motion model has the potential to significantly improve the dexterity and performance of prosthetic limbs, making them more useful and effective for a wide range of users Hand Gesture Body Language Influence Communication to social interaction, offering a possibility for users to maximize their quality of life, social interaction, and gesture communication.

Keywords: stable diffusion, neural interface, smart prosthetic, augmenting

Procedia PDF Downloads 71
5092 DBN-Based Face Recognition System Using Light Field

Authors: Bing Gu

Abstract:

Abstract—Most of Conventional facial recognition systems are based on image features, such as LBP, SIFT. Recently some DBN-based 2D facial recognition systems have been proposed. However, we find there are few DBN-based 3D facial recognition system and relative researches. 3D facial images include all the individual biometric information. We can use these information to build more accurate features, So we present our DBN-based face recognition system using Light Field. We can see Light Field as another presentation of 3D image, and Light Field Camera show us a way to receive a Light Field. We use the commercially available Light Field Camera to act as the collector of our face recognition system, and the system receive a state-of-art performance as convenient as conventional 2D face recognition system.

Keywords: DBN, face recognition, light field, Lytro

Procedia PDF Downloads 432
5091 Kannada HandWritten Character Recognition by Edge Hinge and Edge Distribution Techniques Using Manhatan and Minimum Distance Classifiers

Authors: C. V. Aravinda, H. N. Prakash

Abstract:

In this paper, we tried to convey fusion and state of art pertaining to SIL character recognition systems. In the first step, the text is preprocessed and normalized to perform the text identification correctly. The second step involves extracting relevant and informative features. The third step implements the classification decision. The three stages which involved are Data acquisition and preprocessing, Feature extraction, and Classification. Here we concentrated on two techniques to obtain features, Feature Extraction & Feature Selection. Edge-hinge distribution is a feature that characterizes the changes in direction of a script stroke in handwritten text. The edge-hinge distribution is extracted by means of a windowpane that is slid over an edge-detected binary handwriting image. Whenever the mid pixel of the window is on, the two edge fragments (i.e. connected sequences of pixels) emerging from this mid pixel are measured. Their directions are measured and stored as pairs. A joint probability distribution is obtained from a large sample of such pairs. Despite continuous effort, handwriting identification remains a challenging issue, due to different approaches use different varieties of features, having different. Therefore, our study will focus on handwriting recognition based on feature selection to simplify features extracting task, optimize classification system complexity, reduce running time and improve the classification accuracy.

Keywords: word segmentation and recognition, character recognition, optical character recognition, hand written character recognition, South Indian languages

Procedia PDF Downloads 471
5090 Applying Biosensors’ Electromyography Signals through an Artificial Neural Network to Control a Small Unmanned Aerial Vehicle

Authors: Mylena McCoggle, Shyra Wilson, Andrea Rivera, Rocio Alba-Flores

Abstract:

This work introduces the use of EMGs (electromyography) from muscle sensors to develop an Artificial Neural Network (ANN) for pattern recognition to control a small unmanned aerial vehicle. The objective of this endeavor exhibits interfacing drone applications beyond manual control directly. MyoWare Muscle sensor contains three EMG electrodes (dual and single type) used to collect signals from the posterior (extensor) and anterior (flexor) forearm and the bicep. Collection of raw voltages from each sensor were connected to an Arduino Uno and a data processing algorithm was developed with the purpose of interpreting the voltage signals given when performing flexing, resting, and motion of the arm. Each sensor collected eight values over a two-second period for the duration of one minute, per assessment. During each two-second interval, the movements were alternating between a resting reference class and an active motion class, resulting in controlling the motion of the drone with left and right movements. This paper further investigated adding up to three sensors to differentiate between hand gestures to control the principal motions of the drone (left, right, up, and land). The hand gestures chosen to execute these movements were: a resting position, a thumbs up, a hand swipe right motion, and a flexing position. The MATLAB software was utilized to collect, process, and analyze the signals from the sensors. The protocol (machine learning tool) was used to classify the hand gestures. To generate the input vector to the ANN, the mean, root means squared, and standard deviation was processed for every two-second interval of the hand gestures. The neuromuscular information was then trained using an artificial neural network with one hidden layer of 10 neurons to categorize the four targets, one for each hand gesture. Once the machine learning training was completed, the resulting network interpreted the processed inputs and returned the probabilities of each class. Based on the resultant probability of the application process, once an output was greater or equal to 80% of matching a specific target class, the drone would perform the motion expected. Afterward, each movement was sent from the computer to the drone through a Wi-Fi network connection. These procedures have been successfully tested and integrated into trial flights, where the drone has responded successfully in real-time to predefined command inputs with the machine learning algorithm through the MyoWare sensor interface. The full paper will describe in detail the database of the hand gestures, the details of the ANN architecture, and confusion matrices results.

Keywords: artificial neural network, biosensors, electromyography, machine learning, MyoWare muscle sensors, Arduino

Procedia PDF Downloads 144
5089 Face Tracking and Recognition Using Deep Learning Approach

Authors: Degale Desta, Cheng Jian

Abstract:

The most important factor in identifying a person is their face. Even identical twins have their own distinct faces. As a result, identification and face recognition are needed to tell one person from another. A face recognition system is a verification tool used to establish a person's identity using biometrics. Nowadays, face recognition is a common technique used in a variety of applications, including home security systems, criminal identification, and phone unlock systems. This system is more secure because it only requires a facial image instead of other dependencies like a key or card. Face detection and face identification are the two phases that typically make up a human recognition system.The idea behind designing and creating a face recognition system using deep learning with Azure ML Python's OpenCV is explained in this paper. Face recognition is a task that can be accomplished using deep learning, and given the accuracy of this method, it appears to be a suitable approach. To show how accurate the suggested face recognition system is, experimental results are given in 98.46% accuracy using Fast-RCNN Performance of algorithms under different training conditions.

Keywords: deep learning, face recognition, identification, fast-RCNN

Procedia PDF Downloads 93
5088 Pattern Recognition Based on Simulation of Chemical Senses (SCS)

Authors: Nermeen El Kashef, Yasser Fouad, Khaled Mahar

Abstract:

No AI-complete system can model the human brain or behavior, without looking at the totality of the whole situation and incorporating a combination of senses. This paper proposes a Pattern Recognition model based on Simulation of Chemical Senses (SCS) for separation and classification of sign language. The model based on human taste controlling strategy. The main idea of the introduced model is motivated by the facts that the tongue cluster input substance into its basic tastes first, and then the brain recognizes its flavor. To implement this strategy, two level architecture is proposed (this is inspired from taste system). The separation-level of the architecture focuses on hand posture cluster, while the classification-level of the architecture to recognizes the sign language. The efficiency of proposed model is demonstrated experimentally by recognizing American Sign Language (ASL) data set. The recognition accuracy obtained for numbers of ASL is 92.9 percent.

Keywords: artificial intelligence, biocybernetics, gustatory system, sign language recognition, taste sense

Procedia PDF Downloads 263
5087 Comparing Emotion Recognition from Voice and Facial Data Using Time Invariant Features

Authors: Vesna Kirandziska, Nevena Ackovska, Ana Madevska Bogdanova

Abstract:

The problem of emotion recognition is a challenging problem. It is still an open problem from the aspect of both intelligent systems and psychology. In this paper, both voice features and facial features are used for building an emotion recognition system. A Support Vector Machine classifiers are built by using raw data from video recordings. In this paper, the results obtained for the emotion recognition are given, and a discussion about the validity and the expressiveness of different emotions is presented. A comparison between the classifiers build from facial data only, voice data only and from the combination of both data is made here. The need for a better combination of the information from facial expression and voice data is argued.

Keywords: emotion recognition, facial recognition, signal processing, machine learning

Procedia PDF Downloads 290
5086 Using Augmented Reality to Enhance Doctor Patient Communication

Authors: Rutusha Bhutada, Gaurav Chavan, Sarvesh Kasat, Varsha Mujumdar

Abstract:

This software system will be an Augmented Reality application designed to maximize the doctor’s productivity by providing tools to assist in automating the patient recognition and updating patient’s records using face and voice recognition features, which would otherwise have to be performed manually. By maximizing the doctor’s work efficiency and production, the application will meet the doctor’s needs while remaining easy to understand and use. More specifically, this application is designed to allow a doctor to manage his productive time in handling the patient without losing eye-contact with him and communicate with a group of other doctors for consultation, for in-place treatments through video streaming, as a video study. The system also contains a relational database containing a list of doctor, patient and display techniques.

Keywords: augmented reality, hand-held devices, head-mounted devices, marker based systems, speech recognition, face detection

Procedia PDF Downloads 409
5085 Exploring Multimodal Communication: Intersections of Language, Gesture, and Technology

Authors: Rasha Ali Dheyab

Abstract:

In today's increasingly interconnected and technologically-driven world, communication has evolved beyond traditional verbal exchanges. This paper delves into the fascinating realm of multimodal communication, a dynamic field at the intersection of linguistics, gesture studies, and technology. The study of how humans convey meaning through a combination of spoken language, gestures, facial expressions, and digital platforms has gained prominence as our modes of interaction continue to diversify. This exploration begins by examining the foundational theories in linguistics and gesture studies, tracing their historical development and mutual influences. It further investigates the role of nonverbal cues, such as gestures and facial expressions, in augmenting and sometimes even altering the meanings conveyed by spoken language. Additionally, the paper delves into the modern technological landscape, where emojis, GIFs, and other digital symbols have emerged as new linguistic tools, reshaping the ways in which we communicate and express emotions. The interaction between traditional and digital modes of communication is a central focus of this study. The paper investigates how technology has not only introduced new modes of expression but has also influenced the adaptation of existing linguistic and gestural patterns in online discourse. The emergence of virtual reality and augmented reality environments introduces yet another layer of complexity to multimodal communication, offering new avenues for studying how humans navigate and negotiate meaning in immersive digital spaces. Through a combination of literature review, case studies, and theoretical analysis, this paper seeks to shed light on the intricate interplay between language, gesture, and technology in the realm of multimodal communication. By understanding how these diverse modes of expression intersect and interact, we gain valuable insights into the ever-evolving nature of human communication and its implications for fields ranging from linguistics and psychology to human-computer interaction and digital anthropology.

Keywords: multimodal communication, linguistics ., gesture studies., emojis., verbal communication., digital

Procedia PDF Downloads 50
5084 Possibilities, Challenges and the State of the Art of Automatic Speech Recognition in Air Traffic Control

Authors: Van Nhan Nguyen, Harald Holone

Abstract:

Over the past few years, a lot of research has been conducted to bring Automatic Speech Recognition (ASR) into various areas of Air Traffic Control (ATC), such as air traffic control simulation and training, monitoring live operators for with the aim of safety improvements, air traffic controller workload measurement and conducting analysis on large quantities controller-pilot speech. Due to the high accuracy requirements of the ATC context and its unique challenges, automatic speech recognition has not been widely adopted in this field. With the aim of providing a good starting point for researchers who are interested bringing automatic speech recognition into ATC, this paper gives an overview of possibilities and challenges of applying automatic speech recognition in air traffic control. To provide this overview, we present an updated literature review of speech recognition technologies in general, as well as specific approaches relevant to the ATC context. Based on this literature review, criteria for selecting speech recognition approaches for the ATC domain are presented, and remaining challenges and possible solutions are discussed.

Keywords: automatic speech recognition, asr, air traffic control, atc

Procedia PDF Downloads 364
5083 A Contribution to Human Activities Recognition Using Expert System Techniques

Authors: Malika Yaici, Soraya Aloui, Sara Semchaoui

Abstract:

This paper deals with human activity recognition from sensor data. It is an active research area, and the main objective is to obtain a high recognition rate. In this work, a recognition system based on expert systems is proposed; the recognition is performed using the objects, object states, and gestures and taking into account the context (the location of the objects and of the person performing the activity, the duration of the elementary actions and the activity). The system recognizes complex activities after decomposing them into simple, easy-to-recognize activities. The proposed method can be applied to any type of activity. The simulation results show the robustness of our system and its speed of decision.

Keywords: human activity recognition, ubiquitous computing, context-awareness, expert system

Procedia PDF Downloads 70
5082 Switching to the Latin Alphabet in Kazakhstan: A Brief Overview of Character Recognition Methods

Authors: Ainagul Yermekova, Liudmila Goncharenko, Ali Baghirzade, Sergey Sybachin

Abstract:

In this article, we address the problem of Kazakhstan's transition to the Latin alphabet. The transition process started in 2017 and is scheduled to be completed in 2025. In connection with these events, the problem of recognizing the characters of the new alphabet is raised. Well-known character recognition programs such as ABBYY FineReader, FormReader, MyScript Stylus did not recognize specific Kazakh letters that were used in Cyrillic. The author tries to give an assessment of the well-known method of character recognition that could be in demand as part of the country's transition to the Latin alphabet. Three methods of character recognition: template, structured, and feature-based, are considered through the algorithms of operation. At the end of the article, a general conclusion is made about the possibility of applying a certain method to a particular recognition process: for example, in the process of population census, recognition of typographic text in Latin, or recognition of photos of car numbers, store signs, etc.

Keywords: text detection, template method, recognition algorithm, structured method, feature method

Procedia PDF Downloads 156
5081 Recognizing an Individual, Their Topic of Conversation and Cultural Background from 3D Body Movement

Authors: Gheida J. Shahrour, Martin J. Russell

Abstract:

The 3D body movement signals captured during human-human conversation include clues not only to the content of people’s communication but also to their culture and personality. This paper is concerned with automatic extraction of this information from body movement signals. For the purpose of this research, we collected a novel corpus from 27 subjects, arranged them into groups according to their culture. We arranged each group into pairs and each pair communicated with each other about different topics. A state-of-art recognition system is applied to the problems of person, culture, and topic recognition. We borrowed modeling, classification, and normalization techniques from speech recognition. We used Gaussian Mixture Modeling (GMM) as the main technique for building our three systems, obtaining 77.78%, 55.47%, and 39.06% from the person, culture, and topic recognition systems respectively. In addition, we combined the above GMM systems with Support Vector Machines (SVM) to obtain 85.42%, 62.50%, and 40.63% accuracy for person, culture, and topic recognition respectively. Although direct comparison among these three recognition systems is difficult, it seems that our person recognition system performs best for both GMM and GMM-SVM, suggesting that inter-subject differences (i.e. subject’s personality traits) are a major source of variation. When removing these traits from culture and topic recognition systems using the Nuisance Attribute Projection (NAP) and the Intersession Variability Compensation (ISVC) techniques, we obtained 73.44% and 46.09% accuracy from culture and topic recognition systems respectively.

Keywords: person recognition, topic recognition, culture recognition, 3D body movement signals, variability compensation

Procedia PDF Downloads 514
5080 Human Activities Recognition Based on Expert System

Authors: Malika Yaici, Soraya Aloui, Sara Semchaoui

Abstract:

Recognition of human activities from sensor data is an active research area, and the main objective is to obtain a high recognition rate. In this work, we propose a recognition system based on expert systems. The proposed system makes the recognition based on the objects, object states, and gestures, taking into account the context (the location of the objects and of the person performing the activity, the duration of the elementary actions, and the activity). This work focuses on complex activities which are decomposed into simple easy to recognize activities. The proposed method can be applied to any type of activity. The simulation results show the robustness of our system and its speed of decision.

Keywords: human activity recognition, ubiquitous computing, context-awareness, expert system

Procedia PDF Downloads 101
5079 Comparative Study of Skeletonization and Radial Distance Methods for Automated Finger Enumeration

Authors: Mohammad Hossain Mohammadi, Saif Al Ameri, Sana Ziaei, Jinane Mounsef

Abstract:

Automated enumeration of the number of hand fingers is widely used in several motion gaming and distance control applications, and is discussed in several published papers as a starting block for hand recognition systems. The automated finger enumeration technique should not only be accurate, but also must have a fast response for a moving-picture input. The high performance of video in motion games or distance control will inhibit the program’s overall speed, for image processing software such as Matlab need to produce results at high computation speeds. Since an automated finger enumeration with minimum error and processing time is desired, a comparative study between two finger enumeration techniques is presented and analyzed in this paper. In the pre-processing stage, various image processing functions were applied on a real-time video input to obtain the final cleaned auto-cropped image of the hand to be used for the two techniques. The first technique uses the known morphological tool of skeletonization to count the number of skeleton’s endpoints for fingers. The second technique uses a radial distance method to enumerate the number of fingers in order to obtain a one dimensional hand representation. For both discussed methods, the different steps of the algorithms are explained. Then, a comparative study analyzes the accuracy and speed of both techniques. Through experimental testing in different background conditions, it was observed that the radial distance method was more accurate and responsive to a real-time video input compared to the skeletonization method. All test results were generated in Matlab and were based on displaying a human hand for three different orientations on top of a plain color background. Finally, the limitations surrounding the enumeration techniques are presented.

Keywords: comparative study, hand recognition, fingertip detection, skeletonization, radial distance, Matlab

Procedia PDF Downloads 355
5078 Enhanced Face Recognition with Daisy Descriptors Using 1BT Based Registration

Authors: Sevil Igit, Merve Meric, Sarp Erturk

Abstract:

In this paper, it is proposed to improve Daisy descriptor based face recognition using a novel One-Bit Transform (1BT) based pre-registration approach. The 1BT based pre-registration procedure is fast and has low computational complexity. It is shown that the face recognition accuracy is improved with the proposed approach. The proposed approach can facilitate highly accurate face recognition using DAISY descriptor with simple matching and thereby facilitate a low-complexity approach.

Keywords: face recognition, Daisy descriptor, One-Bit Transform, image registration

Procedia PDF Downloads 337
5077 Advances in Artificial intelligence Using Speech Recognition

Authors: Khaled M. Alhawiti

Abstract:

This research study aims to present a retrospective study about speech recognition systems and artificial intelligence. Speech recognition has become one of the widely used technologies, as it offers great opportunity to interact and communicate with automated machines. Precisely, it can be affirmed that speech recognition facilitates its users and helps them to perform their daily routine tasks, in a more convenient and effective manner. This research intends to present the illustration of recent technological advancements, which are associated with artificial intelligence. Recent researches have revealed the fact that speech recognition is found to be the utmost issue, which affects the decoding of speech. In order to overcome these issues, different statistical models were developed by the researchers. Some of the most prominent statistical models include acoustic model (AM), language model (LM), lexicon model, and hidden Markov models (HMM). The research will help in understanding all of these statistical models of speech recognition. Researchers have also formulated different decoding methods, which are being utilized for realistic decoding tasks and constrained artificial languages. These decoding methods include pattern recognition, acoustic phonetic, and artificial intelligence. It has been recognized that artificial intelligence is the most efficient and reliable methods, which are being used in speech recognition.

Keywords: speech recognition, acoustic phonetic, artificial intelligence, hidden markov models (HMM), statistical models of speech recognition, human machine performance

Procedia PDF Downloads 445
5076 Biometric Recognition Techniques: A Survey

Authors: Shabir Ahmad Sofi, Shubham Aggarwal, Sanyam Singhal, Roohie Naaz

Abstract:

Biometric recognition refers to an automatic recognition of individuals based on a feature vector(s) derived from their physiological and/or behavioral characteristic. Biometric recognition systems should provide a reliable personal recognition schemes to either confirm or determine the identity of an individual. These features are used to provide an authentication for computer based security systems. Applications of such a system include computer systems security, secure electronic banking, mobile phones, credit cards, secure access to buildings, health and social services. By using biometrics a person could be identified based on 'who she/he is' rather than 'what she/he has' (card, token, key) or 'what she/he knows' (password, PIN). In this paper, a brief overview of biometric methods, both unimodal and multimodal and their advantages and disadvantages, will be presented.

Keywords: biometric, DNA, fingerprint, ear, face, retina scan, gait, iris, voice recognition, unimodal biometric, multimodal biometric

Procedia PDF Downloads 730
5075 Segmentation of Arabic Handwritten Numeral Strings Based on Watershed Approach

Authors: Nidal F. Shilbayeh, Remah W. Al-Khatib, Sameer A. Nooh

Abstract:

Arabic offline handwriting recognition systems are considered as one of the most challenging topics. Arabic Handwritten Numeral Strings are used to automate systems that deal with numbers such as postal code, banking account numbers and numbers on car plates. Segmentation of connected numerals is the main bottleneck in the handwritten numeral recognition system.  This is in turn can increase the speed and efficiency of the recognition system. In this paper, we proposed algorithms for automatic segmentation and feature extraction of Arabic handwritten numeral strings based on Watershed approach. The algorithms have been designed and implemented to achieve the main goal of segmenting and extracting the string of numeral digits written by hand especially in a courtesy amount of bank checks. The segmentation algorithm partitions the string into multiple regions that can be associated with the properties of one or more criteria. The numeral extraction algorithm extracts the numeral string digits into separated individual digit. Both algorithms for segmentation and feature extraction have been tested successfully and efficiently for all types of numerals.

Keywords: handwritten numerals, segmentation, courtesy amount, feature extraction, numeral recognition

Procedia PDF Downloads 353
5074 Hand Hygiene Habits of Ghanaian Youths in Accra

Authors: Cecilia Amponsem-Boateng, Timothy B. Oppong, Haiyan Yang, Guangcai Duan

Abstract:

The human palm has been identified as one of the richest habitats for human microbial accommodation making hand hygiene essential to primary prevention of infection. Since the hand is in constant contact with fomites which have been proven to be mostly contaminated, building hand hygiene habits is essential for the prevention of infection. This research was conducted to assess the hand hygiene habits of Ghanaian youths in Accra. This study used a survey as a quantitative method of research. The findings of the study revealed that out of the 254 participants who fully answered the questionnaire, 22% had the habit of washing their hands after outings while only 51.6% had the habit of washing their hands after using the bathroom. However, about 60% of the participants said they sometimes ate with their hands while 28.9% had the habit of eating with the hand very often, a situation that put them at risk of infection from their hands since some participants had poor handwashing habits; prompting the need for continuous education on hand hygiene.

Keywords: hand hygiene, hand hygiene habit, hand washing, hand sanitizer use

Procedia PDF Downloads 80
5073 Printed Thai Character Recognition Using Particle Swarm Optimization Algorithm

Authors: Phawin Sangsuvan, Chutimet Srinilta

Abstract:

This Paper presents the applications of Particle Swarm Optimization (PSO) Method for Thai optical character recognition (OCR). OCR consists of the pre-processing, character recognition and post-processing. Before enter into recognition process. The Character must be “Prepped” by pre-processing process. The PSO is an optimization method that belongs to the swarm intelligence family based on the imitation of social behavior patterns of animals. Route of each particle is determined by an individual data among neighborhood particles. The interaction of the particles with neighbors is the advantage of Particle Swarm to determine the best solution. So PSO is interested by a lot of researchers in many difficult problems including character recognition. As the previous this research used a Projection Histogram to extract printed digits features and defined the simple Fitness Function for PSO. The results reveal that PSO gives 67.73% for testing dataset. So in the future there can be explored enhancement the better performance of PSO with improve the Fitness Function.

Keywords: character recognition, histogram projection, particle swarm optimization, pattern recognition techniques

Procedia PDF Downloads 441
5072 Enhanced Thai Character Recognition with Histogram Projection Feature Extraction

Authors: Benjawan Rangsikamol, Chutimet Srinilta

Abstract:

This research paper deals with extraction of Thai character features using the proposed histogram projection so as to improve the recognition performance. The process starts with transformation of image files into binary files before thinning. After character thinning, the skeletons are entered into the proposed extraction using histogram projection (horizontal and vertical) to extract unique features which are inputs of the subsequent recognition step. The recognition rate with the proposed extraction technique is as high as 97 percent since the technique works very well with the idiosyncrasies of Thai characters.

Keywords: character recognition, histogram projection, multilayer perceptron, Thai character features extraction

Procedia PDF Downloads 434
5071 Speaker Recognition Using LIRA Neural Networks

Authors: Nestor A. Garcia Fragoso, Tetyana Baydyk, Ernst Kussul

Abstract:

This article contains information from our investigation in the field of voice recognition. For this purpose, we created a voice database that contains different phrases in two languages, English and Spanish, for men and women. As a classifier, the LIRA (Limited Receptive Area) grayscale neural classifier was selected. The LIRA grayscale neural classifier was developed for image recognition tasks and demonstrated good results. Therefore, we decided to develop a recognition system using this classifier for voice recognition. From a specific set of speakers, we can recognize the speaker’s voice. For this purpose, the system uses spectrograms of the voice signals as input to the system, extracts the characteristics and identifies the speaker. The results are described and analyzed in this article. The classifier can be used for speaker identification in security system or smart buildings for different types of intelligent devices.

Keywords: extreme learning, LIRA neural classifier, speaker identification, voice recognition

Procedia PDF Downloads 142
5070 New Approaches for the Handwritten Digit Image Features Extraction for Recognition

Authors: U. Ravi Babu, Mohd Mastan

Abstract:

The present paper proposes a novel approach for handwritten digit recognition system. The present paper extract digit image features based on distance measure and derives an algorithm to classify the digit images. The distance measure can be performing on the thinned image. Thinning is the one of the preprocessing technique in image processing. The present paper mainly concentrated on an extraction of features from digit image for effective recognition of the numeral. To find the effectiveness of the proposed method tested on MNIST database, CENPARMI, CEDAR, and newly collected data. The proposed method is implemented on more than one lakh digit images and it gets good comparative recognition results. The percentage of the recognition is achieved about 97.32%.

Keywords: handwritten digit recognition, distance measure, MNIST database, image features

Procedia PDF Downloads 435
5069 Emotion Recognition in Video and Images in the Wild

Authors: Faizan Tariq, Moayid Ali Zaidi

Abstract:

Facial emotion recognition algorithms are expanding rapidly now a day. People are using different algorithms with different combinations to generate best results. There are six basic emotions which are being studied in this area. Author tried to recognize the facial expressions using object detector algorithms instead of traditional algorithms. Two object detection algorithms were chosen which are Faster R-CNN and YOLO. For pre-processing we used image rotation and batch normalization. The dataset I have chosen for the experiments is Static Facial Expression in Wild (SFEW). Our approach worked well but there is still a lot of room to improve it, which will be a future direction.

Keywords: face recognition, emotion recognition, deep learning, CNN

Procedia PDF Downloads 157
5068 An Improved Face Recognition Algorithm Using Histogram-Based Features in Spatial and Frequency Domains

Authors: Qiu Chen, Koji Kotani, Feifei Lee, Tadahiro Ohmi

Abstract:

In this paper, we propose an improved face recognition algorithm using histogram-based features in spatial and frequency domains. For adding spatial information of the face to improve recognition performance, a region-division (RD) method is utilized. The facial area is firstly divided into several regions, then feature vectors of each facial part are generated by Binary Vector Quantization (BVQ) histogram using DCT coefficients in low frequency domains, as well as Local Binary Pattern (LBP) histogram in spatial domain. Recognition results with different regions are first obtained separately and then fused by weighted averaging. Publicly available ORL database is used for the evaluation of our proposed algorithm, which is consisted of 40 subjects with 10 images per subject containing variations in lighting, posing, and expressions. It is demonstrated that face recognition using RD method can achieve much higher recognition rate.

Keywords: binary vector quantization (BVQ), DCT coefficients, face recognition, local binary patterns (LBP)

Procedia PDF Downloads 317
5067 Deep-Learning Based Approach to Facial Emotion Recognition through Convolutional Neural Network

Authors: Nouha Khediri, Mohammed Ben Ammar, Monji Kherallah

Abstract:

Recently, facial emotion recognition (FER) has become increasingly essential to understand the state of the human mind. Accurately classifying emotion from the face is a challenging task. In this paper, we present a facial emotion recognition approach named CV-FER, benefiting from deep learning, especially CNN and VGG16. First, the data is pre-processed with data cleaning and data rotation. Then, we augment the data and proceed to our FER model, which contains five convolutions layers and five pooling layers. Finally, a softmax classifier is used in the output layer to recognize emotions. Based on the above contents, this paper reviews the works of facial emotion recognition based on deep learning. Experiments show that our model outperforms the other methods using the same FER2013 database and yields a recognition rate of 92%. We also put forward some suggestions for future work.

Keywords: CNN, deep-learning, facial emotion recognition, machine learning

Procedia PDF Downloads 62
5066 Facial Emotion Recognition Using Deep Learning

Authors: Ashutosh Mishra, Nikhil Goyal

Abstract:

A 3D facial emotion recognition model based on deep learning is proposed in this paper. Two convolution layers and a pooling layer are employed in the deep learning architecture. After the convolution process, the pooling is finished. The probabilities for various classes of human faces are calculated using the sigmoid activation function. To verify the efficiency of deep learning-based systems, a set of faces. The Kaggle dataset is used to verify the accuracy of a deep learning-based face recognition model. The model's accuracy is about 65 percent, which is lower than that of other facial expression recognition techniques. Despite significant gains in representation precision due to the nonlinearity of profound image representations.

Keywords: facial recognition, computational intelligence, convolutional neural network, depth map

Procedia PDF Downloads 196
5065 Hull Detection from Handwritten Digit Image

Authors: Sriraman Kothuri, Komal Teja Mattupalli

Abstract:

In this paper we proposed a novel algorithm for recognizing hulls in a hand written digits. This is an extension to the work on “Digit Recognition Using Freeman Chain code”. In order to find out the hulls in a user given digit it is necessary to follow three steps. Those are pre-processing, Boundary Extraction and at last apply the Hull Detection system in a way to attain the better results. The detection of Hull Regions is mainly intended to increase the machine learning capability in detection of characters or digits. This can also extend this in order to get the hull regions and their intensities in Black Holes in Space Exploration.

Keywords: chain code, machine learning, hull regions, hull recognition system, SASK algorithm

Procedia PDF Downloads 378
5064 Hindi Speech Synthesis by Concatenation of Recognized Hand Written Devnagri Script Using Support Vector Machines Classifier

Authors: Saurabh Farkya, Govinda Surampudi

Abstract:

Optical Character Recognition is one of the current major research areas. This paper is focussed on recognition of Devanagari script and its sound generation. This Paper consists of two parts. First, Optical Character Recognition of Devnagari handwritten Script. Second, speech synthesis of the recognized text. This paper shows an implementation of support vector machines for the purpose of Devnagari Script recognition. The Support Vector Machines was trained with Multi Domain features; Transform Domain and Spatial Domain or Structural Domain feature. Transform Domain includes the wavelet feature of the character. Structural Domain consists of Distance Profile feature and Gradient feature. The Segmentation of the text document has been done in 3 levels-Line Segmentation, Word Segmentation, and Character Segmentation. The pre-processing of the characters has been done with the help of various Morphological operations-Otsu's Algorithm, Erosion, Dilation, Filtration and Thinning techniques. The Algorithm was tested on the self-prepared database, a collection of various handwriting. Further, Unicode was used to convert recognized Devnagari text into understandable computer document. The document so obtained is an array of codes which was used to generate digitized text and to synthesize Hindi speech. Phonemes from the self-prepared database were used to generate the speech of the scanned document using concatenation technique.

Keywords: Character Recognition (OCR), Text to Speech (TTS), Support Vector Machines (SVM), Library of Support Vector Machines (LIBSVM)

Procedia PDF Downloads 469