Search results for: novel object recognition
2507 Objects Tracking in Catadioptric Images Using Spherical Snake
Authors: Khald Anisse, Amina Radgui, Mohammed Rziza
Abstract:
Tracking objects in video sequences is a challenging task in many computer vision applications. However, no article treats this topic in catadioptric vision. This paper describes a new approach to omnidirectional image processing based on inverse stereographic projection onto the half-sphere. We use the spherical model proposed by Gayer et al. For object tracking, our work is based on the snake method, optimized with the greedy algorithm by adapting its operators. The algorithm respects the deformed geometry of omnidirectional images through a spherical neighborhood, a spherical gradient, and a reformulation of the optimization algorithm on the spherical domain. This tracking method, which we call the "spherical snake", makes it possible to follow changes in the shape and size of an object at different positions in the spherical image.
Keywords: computer vision, spherical snake, omnidirectional image, object tracking, inverse stereographic projection
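The inverse stereographic projection mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common textbook convention (unit sphere, projection from the north pole, image plane z = 0); the paper's own convention and camera calibration are not given here, so the function names and the geodesic helper are illustrative assumptions only.

```python
import math

def inverse_stereographic(x, y):
    """Map a plane point (x, y) to the unit sphere.

    Assumed convention: projection from the north pole (0, 0, 1)
    through the plane z = 0. Points inside the unit disc land on
    the lower half-sphere (z < 0).
    """
    r2 = x * x + y * y
    return (2 * x / (r2 + 1), 2 * y / (r2 + 1), (r2 - 1) / (r2 + 1))

def stereographic(X, Y, Z):
    """Forward projection: sphere point back to the plane (undefined at the pole)."""
    return (X / (1 - Z), Y / (1 - Z))

def geodesic(p, q):
    """Great-circle distance between two unit vectors, one possible way
    to define a neighborhood on the sphere."""
    dot = sum(a * b for a, b in zip(p, q))
    return math.acos(max(-1.0, min(1.0, dot)))

if __name__ == "__main__":
    P = inverse_stereographic(0.3, -0.4)
    print(P, stereographic(*P))
```

Snake control points mapped this way live on the sphere, so distances between neighbors can be measured along great circles rather than in the distorted image plane.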
Procedia PDF Downloads 402
2506 Evaluating the Performance of Color Constancy Algorithm
Authors: Damanjit Kaur, Avani Bhatia
Abstract:
Color constancy is significant for human vision since color is a pictorial cue that helps in solving different vision tasks such as tracking, object recognition, or categorization. Therefore, several computational methods have tried to simulate human color constancy abilities to stabilize machine color representations. Two different kinds of methods have been used: normalization and constancy. While color normalization creates a new representation of the image by canceling illuminant effects, color constancy directly estimates the color of the illuminant in order to map the image colors to a canonical version. Color constancy is the capability to determine the colors of objects independently of the color of the light source. This research work studies several of the well-known color constancy algorithms, such as white patch and gray world.
Keywords: color constancy, gray world, white patch, modified white patch
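The two algorithms named in the abstract can be sketched in their textbook form. This is a minimal illustration, not the evaluated implementation: pixels are assumed to be a flat list of (R, G, B) tuples in the 0 to 255 range, and the function names are hypothetical.

```python
def gray_world(pixels):
    """Gray world: assume the average scene color is gray, so scale
    each channel until all three channel means are equal."""
    n = len(pixels)
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    gray = sum(means) / 3
    gains = [gray / m if m > 0 else 1.0 for m in means]
    # Clip to the valid range after applying the per-channel gain.
    return [tuple(min(255.0, p[c] * gains[c]) for c in range(3)) for p in pixels]

def white_patch(pixels):
    """White patch: assume the brightest value in each channel comes
    from a white surface, so scale each channel maximum to 255."""
    maxima = [max(p[c] for p in pixels) for c in range(3)]
    gains = [255.0 / m if m > 0 else 1.0 for m in maxima]
    return [tuple(p[c] * gains[c] for c in range(3)) for p in pixels]

if __name__ == "__main__":
    reddish = [(200, 100, 50), (100, 50, 25)]
    print(gray_world(reddish))
    print(white_patch(reddish))
```

Both functions estimate the illuminant from simple image statistics (channel means or channel maxima) and then apply a diagonal per-channel correction.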
Procedia PDF Downloads 319
2505 Image Instance Segmentation Using Modified Mask R-CNN
Authors: Avatharam Ganivada, Krishna Shah
Abstract:
Mask R-CNN was recently introduced by the Facebook AI Research (FAIR) team and is mainly concerned with instance segmentation in images. Mask R-CNN is based on ResNet and a feature pyramid network (FPN), where a single dropout method is employed. This paper provides a modified Mask R-CNN by adding multiple dropout methods to Mask R-CNN. The proposed model also uses ResNet and FPN to extract stage-wise network feature maps, wherein a top-down network path with lateral connections is used to obtain semantically strong features. The proposed model produces three outputs for each object in the image: class label, bounding box coordinates, and object mask. The performance of the proposed network is evaluated on the segmentation of every instance in images using the COCO and Cityscapes datasets. The proposed model achieves better performance than the state-of-the-art networks on these datasets.
Keywords: instance segmentation, object detection, convolutional neural networks, deep learning, computer vision
Procedia PDF Downloads 72
2504 Recognition of Spelling Problems during the Text in Progress: A Case Study on the Comments Made by Portuguese Students Newly Literate
Authors: E. Calil, L. A. Pereira
Abstract:
The acquisition of orthography is a complex process involving both lexical and grammatical questions. This learning occurs simultaneously with the mastery of multiple textual aspects (e.g., graphs, punctuation, etc.). However, most research on orthographic acquisition treats it from an autonomous point of view, separated from the process of textual production. This means that the object of analysis is the production of words selected by the researcher, or of requested sentences, in an experimental and controlled setting. In addition, the Spelling Problems (SPs) are identified by the researcher on the sheet of paper. Adopting the perspective of Textual Genetics, from an enunciative approach, this study discusses the SPs recognized by dyads of newly literate students while they write a text collaboratively. Six proposals of textual production, requested by a 2nd-year teacher of a Portuguese primary school, were recorded between January and March 2015. In our case study we discuss the SPs recognized by the dyad B and L (7 years old). As a methodological tool we adopted the Ramos System audiovisual record. This system allows real-time capture of the text in progress and of the face-to-face dialogue between both students and their teacher, and also captures the body movements and facial expressions of the participants during textual production proposals in the classroom. In these ecological conditions of multimodal registration of collaborative writing, we could identify the emergence of SPs in two dimensions: i. In the product (finished text): identification of SPs without recursive graphic marks (without erasures) and identification of SPs with erasures, indicating the recognition of the SP by the student; ii. In the process (text in progress): identification of comments made by students about recognized SPs. Given this, we analyzed the comments on SPs identified during the text in progress.
These comments characterize a type of reformulation referred to as Commented Oral Erasure (COE). The COE has two enunciative forms: Simple Comment (SC), such as ' 'X' is written with 'Y' '; or Unfolded Comment (UC), such as ' 'X' is written with 'Y' because...'. The spelling COE may occur before or during the SP (Early Spelling Recognition - ESR) or after the SP has been entered (Later Spelling Recognition - LSR). There were 631 words entered in the 6 stories written by the B-L dyad, 145 of them containing some type of SP. During the text in progress, the students orally recognized 174 SPs, 46 of which were identified in advance (ESRs) and 128 of which were identified later (LSRs). If we consider that the 88 erasure SPs in the product indicate some form of SP recognition, we can observe that about twice as many SPs were recognized orally. The ESR was characterized by SC when students asked their colleague or teacher how to spell a given word. The LSR presented predominantly UC, verbalizing meta-orthographic arguments, mostly made by L. These results indicate that writing in dyads is an important didactic strategy for promoting metalinguistic reflection, favoring the learning of spelling.
Keywords: collaborative writing, erasure, learning, metalinguistic awareness, spelling, text production
Procedia PDF Downloads 163
2503 Rehabilitation of the Blind Using Sono-Visualization Tool
Authors: Ashwani Kumar
Abstract:
In human beings, eyes play a vital role. Very little research has been done on rehabilitation for blind people. This paper discusses work that helps blind people recognize the basic shapes of objects, such as circles, squares, triangles, horizontal lines, vertical lines, and diagonal lines, and waveforms such as sinusoidal, square, and triangular. This is largely achieved by using a digital camera, which captures the visual information in front of the blind person, and a software program, which performs the image processing operations; finally, the processed image is converted into sound. After the sound generation process, the generated sound is fed to the blind person through headphones so that they can visualize an imaginary image of the object. The blind person needs to be trained to visualize this imaginary image. Various training methods were applied for recognizing the object.
Keywords: image processing, pixel, pitch, loudness, sound generation, edge detection, brightness
Procedia PDF Downloads 388
2502 Image Processing techniques for Surveillance in Outdoor Environment
Authors: Jayanth C., Anirudh Sai Yetikuri, Kavitha S. N.
Abstract:
This paper explores the development and application of computer vision and machine learning techniques for real-time pose detection, facial recognition, and number plate extraction. Utilizing MediaPipe for pose estimation, the research presents methods for detecting hand raises and ducking postures through real-time video analysis. Complementarily, facial recognition is employed to compare and verify individual identities using the face recognition library. Additionally, the paper demonstrates a robust approach for extracting and storing vehicle number plates from images, integrating Optical Character Recognition (OCR) with a database management system. The study highlights the effectiveness and versatility of these technologies in practical scenarios, including security and surveillance applications. The findings underscore the potential of combining computer vision techniques to address diverse challenges and enhance automated systems for both individual and vehicular identification. This research contributes to the fields of computer vision and machine learning by providing scalable solutions and demonstrating their applicability in real-world contexts.
Keywords: computer vision, pose detection, facial recognition, number plate extraction, machine learning, real-time analysis, OCR, database management
Procedia PDF Downloads 26
2501 Defect Localization and Interaction on Surfaces with Projection Mapping and Gesture Recognition
Authors: Qiang Wang, Hongyang Yu, MingRong Lai, Miao Luo
Abstract:
This paper presents a method for accurately localizing and interacting with known surface defects by overlaying patterns onto real-world surfaces using a projection system. Given the world coordinates of the defects, we project corresponding patterns onto the surfaces, providing an intuitive visualization of the specific defect locations. To enable users to interact with and retrieve more information about individual defects, we implement a gesture recognition system based on a pruned and optimized version of YOLOv6. This lightweight model achieves an accuracy of 82.8% and is suitable for deployment on low-performance devices. Our approach demonstrates the potential for enhancing defect identification, inspection processes, and user interaction in various applications.
Keywords: defect localization, projection mapping, gesture recognition, YOLOv6
Procedia PDF Downloads 88
2500 A Novel Computer-Generated Hologram (CGH) Achieved Scheme Generated from Point Cloud by Using a Lens Array
Authors: Wei-Na Li, Mei-Lan Piao, Nam Kim
Abstract:
We propose a novel scheme for producing a computer-generated hologram (CGH), wherein the CGH is generated from a point cloud obtained through a mapping relationship from a series of elemental images captured from a real three-dimensional (3D) object using a lens array. The scheme is composed of three procedures: mapping from elemental images to a point cloud, hologram generation, and hologram display. A mapping method is devised to obtain virtual volume data (a point cloud) from a series of elemental images. This mapping method consists of two steps. First, the coordinate (x, y) pairs and the number of times each appears are calculated from the series of sub-images, which are generated from the elemental images. Second, a series of corresponding coordinates (x, y, z) is calculated from the elemental images. A hologram is then generated from the volume data calculated in these two steps. Finally, a spatial light modulator (SLM) and a green laser beam are used to display this hologram and reconstruct the original 3D object. In this paper, in order to show a more autostereoscopic display of a real 3D object, we successfully obtained the actual depth data of every discrete point of the real 3D object, and overcame the inherent drawbacks of the depth camera by obtaining the point cloud from the elemental images.
Keywords: elemental image, point cloud, computer-generated hologram (CGH), autostereoscopic display
Procedia PDF Downloads 584
2499 SCNet: A Vehicle Color Classification Network Based on Spatial Cluster Loss and Channel Attention Mechanism
Authors: Fei Gao, Xinyang Dong, Yisu Ge, Shufang Lu, Libo Weng
Abstract:
Vehicle color recognition plays an important role in traffic accident investigation. However, due to the influence of illumination, weather, and noise, vehicle color recognition still faces challenges. In this paper, a vehicle color classification network based on spatial cluster loss and a channel attention mechanism (SCNet) is proposed for vehicle color recognition. A channel attention module is applied to extract the features of representative vehicle color regions and reduce the weight of nonrepresentative color regions in the channel. The proposed loss function, called spatial clustering loss (SC-loss), consists of two channel-specific components: a concentration component and a diversity component. The concentration component forces all feature channels belonging to the same class to be concentrated through the channel cluster. The diversity component imposes additional constraints on the channels through the mean distance coefficient, making them mutually exclusive in spatial dimensions. In the comparison experiments, the proposed method achieves state-of-the-art performance on the public datasets VCD and VeRi, reaching 96.1% and 96.2%, respectively. In addition, the ablation experiment further shows that SC-loss can effectively improve the accuracy of vehicle color recognition.
Keywords: feature extraction, convolutional neural networks, intelligent transportation, vehicle color recognition
Procedia PDF Downloads 183
2498 Analyzing the Use of Augmented Reality and Image Recognition in Cultural Education: Use Case of Sintra Palace Treasure Hunt Application
Authors: Marek Maruszczak
Abstract:
Gamified applications have been used successfully in education for years. The rapid development of technologies such as augmented reality and image recognition increases their availability and reduces their prices. Thus, there is an increasing possibility of, and need for, wide use of such applications in education. The main purpose of this article is to present the results of work on a mobile augmented reality application whose aim is to motivate tourists to pay more attention to the attractions, and to increase the likelihood of their moving from one attraction to the next, while visiting the Palácio Nacional de Sintra in Portugal. Work on the application was carried out together with the employees of Parques de Sintra from 2019 to 2021. The result was a mobile application using augmented reality and image recognition. The application was tested on the palace premises by both Parques de Sintra employees and tourists visiting the Palácio Nacional de Sintra. The conclusions collected allowed for the formulation of good practices and guidelines that can be used when designing gamified apps for cultural education.
Keywords: augmented reality, cultural education, gamification, image recognition, mobile games
Procedia PDF Downloads 190
2497 A Relative Entropy Regularization Approach for Fuzzy C-Means Clustering Problem
Authors: Ouafa Amira, Jiangshe Zhang
Abstract:
Clustering is an unsupervised machine learning technique; its aim is to extract the structure of the data, so that similar data objects are grouped in the same cluster, whereas dissimilar objects are grouped in different clusters. Clustering methods are widely used in different fields, such as image processing, computer vision, and pattern recognition. Fuzzy c-means clustering (FCM) is one of the best-known fuzzy clustering methods. It is based on solving an optimization problem in which a given cost function is minimized. This minimization aims to decrease the dissimilarity inside clusters, where the dissimilarity is measured by the distances between data objects and cluster centers. The degree to which a data point belongs to a cluster is measured by a membership function that takes values in the interval [0, 1]. In FCM clustering, the membership degree is constrained by the condition that the sum of a data object's memberships in all clusters must equal one. This constraint can cause several problems, especially when the data objects lie in a noisy space. Regularization approaches have been applied to the fuzzy c-means clustering technique. Regularization introduces additional information in order to solve an ill-posed optimization problem. In this study, we focus on regularization by relative entropy, where in our optimization problem we aim to minimize the dissimilarity inside clusters. Finding an appropriate membership degree for each data object is our objective, because an appropriate membership degree leads to an accurate clustering result. Our clustering results on synthetic data sets, Gaussian-based data sets, and real-world data sets show that our proposed model achieves good accuracy.
Keywords: clustering, fuzzy c-means, regularization, relative entropy
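The membership constraint described above can be seen in a minimal sketch of standard fuzzy c-means on one-dimensional data. This is the classical algorithm, not the relative-entropy-regularized model the abstract proposes; the fuzzifier m = 2, the deterministic initialization, and the function name are assumptions made for illustration.

```python
def fcm(points, c, m=2.0, iters=100, tol=1e-6):
    """Standard fuzzy c-means on a list of floats.

    Returns (centers, U), where U[i][k] is the membership of point i
    in cluster k and every row of U sums to one.
    """
    lo, hi = min(points), max(points)
    # Deterministic initial centers spread across the data range.
    centers = [lo + (hi - lo) * (k + 0.5) / c for k in range(c)]
    for _ in range(iters):
        U = []
        for x in points:
            d = [abs(x - v) for v in centers]
            if any(dk == 0 for dk in d):
                # Point sits exactly on a center: give it all the membership.
                zeros = [dk == 0 for dk in d]
                row = [1.0 / sum(zeros) if z else 0.0 for z in zeros]
            else:
                # u_ik = d_ik^(-2/(m-1)) / sum_j d_ij^(-2/(m-1))
                inv = [dk ** (-2.0 / (m - 1.0)) for dk in d]
                total = sum(inv)
                row = [w / total for w in inv]
            U.append(row)
        # Center update: mean of the points weighted by u_ik^m.
        new_centers = []
        for k in range(c):
            w = [U[i][k] ** m for i in range(len(points))]
            new_centers.append(sum(wi * x for wi, x in zip(w, points)) / sum(w))
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, U

if __name__ == "__main__":
    centers, U = fcm([0.0, 0.1, 0.2, 5.0, 5.1, 5.2], c=2)
    print(sorted(centers))
```

Because each row of U must sum to one, an outlier far from every center still receives memberships that add up to one, which is the kind of behavior in noisy data that motivates regularized variants.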
Procedia PDF Downloads 259
2496 Application of Optical Method for Calcul of Deformed Object Samples
Authors: R. Daira
Abstract:
The electronic speckle interferometry technique used to measure the deformation of scattering objects is based on the subtraction of interference patterns. A first speckle image is recorded in the RAM of a computer before the object is deformed, and a second one after deformation. The square of the difference between the two images shows correlation fringes that are observable in real time directly on a monitor. Interpreting these fringes makes it possible to determine the deformation. In this paper, we present experimental results for the out-of-plane deformation of two samples in aluminum, electronic boards, and stainless steel.
Keywords: optical method, holography, interferometry, deformation
Procedia PDF Downloads 404
2495 The Effect of Experimentally Induced Stress on Facial Recognition Ability of Security Personnel’s
Authors: Zunjarrao Kadam, Vikas Minchekar
Abstract:
Facial recognition is an important task in criminal investigation procedures. Security guards, who constantly watch people, can help to identify suspects. Forensic psychologists handle such cases in the criminal justice system. Security personnel may lose their ability to correctly identify people because of constant stress while on duty. The present study aimed to identify the effect of experimentally induced stress on the facial recognition ability of security personnel. For this study, 50 security guards from the cities of Sangli, Miraj, and Jaysingpur in the Maharashtra state of India were recruited. A randomized two-group design was employed. In the initial condition, twenty identity-card-size photographs were shown to both groups. Afterward, artificial stress was induced in the experimental group through a difficult puzzle-solving task with a limited time. In the second condition, both groups were presented with the earlier photographs together with thirty new photographs. The subjects were asked to recognize the photographs that had been shown earlier. The analyzed data revealed that the control group had a higher mean facial recognition score than the experimental group. The results are discussed in the present research.
Keywords: experimentally induced stress, facial recognition, cognition, security personnel
Procedia PDF Downloads 261
2494 Optimized Dynamic Bayesian Networks and Neural Verifier Test Applied to On-Line Isolated Characters Recognition
Authors: Redouane Tlemsani, Redouane, Belkacem Kouninef, Abdelkader Benyettou
Abstract:
In this paper, our system is a Markovian system that can be viewed as a Dynamic Bayesian Network. One of the major interests of these systems lies in the complete training of the models (topology and parameters) from training data. Bayesian Networks are models for representing uncertain knowledge about complex phenomena. They combine probability theory and graph theory to provide effective tools for representing a joint probability distribution over a set of random variables. The representation of knowledge is based on describing, by graphs, the causal relations existing between the variables that define the field of study. The theory of Dynamic Bayesian Networks is a generalization of Bayesian networks to dynamic processes. Our objective is to find the best structure representing the relationships (dependencies) between the variables of a dynamic Bayesian network. In pattern recognition applications, the structure is usually fixed in advance, which obliges us to accept some strong assumptions (for example, independence between some variables).
Keywords: Arabic on line character recognition, dynamic Bayesian network, pattern recognition, networks
Procedia PDF Downloads 617
2493 Multi Object Tracking for Predictive Collision Avoidance
Authors: Bruk Gebregziabher
Abstract:
The safe and efficient operation of Autonomous Mobile Robots (AMRs) in complex environments, such as manufacturing, logistics, and agriculture, necessitates accurate multi-object tracking and predictive collision avoidance. This paper presents algorithms and techniques for addressing these challenges using lidar sensor data, with an emphasis on the ensemble Kalman filter. The developed predictive collision avoidance algorithm employs the data provided by lidar sensors to track multiple objects and predict their velocities and future positions, enabling the AMR to navigate safely and effectively. A modification to the dynamic windowing approach is introduced to enhance the performance of the collision avoidance system. The overall system architecture encompasses object detection, multi-object tracking, and predictive collision avoidance control. The experimental results, obtained from both simulation and real-world data, demonstrate the effectiveness of the proposed methods in various scenarios, which lays the foundation for future research on global planners, other controllers, and the integration of additional sensors. This thesis contributes to the ongoing development of safe and efficient autonomous systems in complex and dynamic environments.
Keywords: autonomous mobile robots, multi-object tracking, predictive collision avoidance, ensemble Kalman filter, lidar sensors
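To illustrate how a tracker can estimate an object's velocity and predict its future position from position-only measurements, here is a minimal one-dimensional constant-velocity Kalman filter. It is a plain linear Kalman filter, not the ensemble Kalman filter the abstract emphasizes; the noise values q and r, the diagonal process noise, and all function names are assumptions chosen for the sketch.

```python
def kalman_cv_step(state, cov, z, dt=1.0, q=0.01, r=1.0):
    """One predict + update cycle for state (position, velocity).

    cov is a 2x2 covariance as nested tuples; z is a position measurement.
    Process noise is assumed to be q on each diagonal entry.
    """
    p, v = state
    (a, b), (c, d) = cov
    # Predict with F = [[1, dt], [0, 1]]: x' = F x, P' = F P F^T + Q.
    p_pred = p + v * dt
    a2 = a + dt * (b + c) + dt * dt * d + q
    b2 = b + dt * d
    c2 = c + dt * d
    d2 = d + q
    # Update with H = [1, 0] (only position is observed).
    s = a2 + r                      # innovation variance
    k0, k1 = a2 / s, c2 / s         # Kalman gain
    y = z - p_pred                  # innovation
    new_state = (p_pred + k0 * y, v + k1 * y)
    # P = (I - K H) P'
    new_cov = (((1 - k0) * a2, (1 - k0) * b2),
               (c2 - k1 * a2, d2 - k1 * b2))
    return new_state, new_cov

def predict_position(state, horizon):
    """Extrapolate the tracked position `horizon` time units ahead."""
    p, v = state
    return p + v * horizon

if __name__ == "__main__":
    state, cov = (0.0, 0.0), ((10.0, 0.0), (0.0, 10.0))
    for t in range(1, 31):          # object moving at 2 units per step
        state, cov = kalman_cv_step(state, cov, z=2.0 * t)
    print(state, predict_position(state, 5.0))
```

A collision check could then compare each object's predicted position with the robot's planned path over the same horizon; in a multi-object setting, one such filter would be kept per track.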
Procedia PDF Downloads 84
2492 Size-Reduction Strategies for Iris Codes
Authors: Jutta Hämmerle-Uhl, Georg Penn, Gerhard Pötzelsberger, Andreas Uhl
Abstract:
Iris codes contain bits with different entropy. This work investigates different strategies to reduce the size of iris code templates with the aim of reducing storage requirements and computational demand in the matching process. Besides simple sub-sampling schemes, a binary multi-resolution representation as used in the JBIG hierarchical coding mode is also assessed. We find that iris code template size can be reduced significantly while maintaining recognition accuracy. In addition, we propose a two-stage identification approach, using small-sized iris code templates in a pre-selection stage and full-resolution templates for final identification, which shows promising recognition behaviour.
Keywords: iris recognition, compact iris code, fast matching, best bits, pre-selection identification, two-stage identification
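The two-stage idea can be sketched as follows: rank the gallery cheaply with sub-sampled codes, then compare only a shortlist at full resolution. This is a simplified illustration under stated assumptions: codes are plain lists of bits, sub-sampling keeps every `step`-th bit, and real iris matching details such as occlusion masks and rotation compensation are left out. All names are hypothetical.

```python
def hamming(a, b):
    """Fractional Hamming distance between two equal-length bit lists."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def subsample(code, step):
    """Simple size reduction: keep every `step`-th bit."""
    return code[::step]

def identify(probe, gallery, step=4, shortlist=2):
    """Two-stage identification.

    Stage 1 ranks all enrolled codes using the reduced templates;
    stage 2 matches the best `shortlist` candidates at full resolution.
    """
    small_probe = subsample(probe, step)
    ranked = sorted(
        gallery,
        key=lambda name: hamming(small_probe, subsample(gallery[name], step)),
    )
    candidates = ranked[:shortlist]
    return min(candidates, key=lambda name: hamming(probe, gallery[name]))

if __name__ == "__main__":
    gallery = {"a": [0] * 16, "b": [1] * 16, "c": [0, 1] * 8}
    probe = [0] * 15 + [1]
    print(identify(probe, gallery))
```

In practice the reduced gallery templates would be precomputed once at enrollment rather than sub-sampled on every query as done here.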
Procedia PDF Downloads 440
2491 Understanding the Impact of Spatial Light Distribution on Object Identification in Low Vision: A Pilot Psychophysical Study
Authors: Alexandre Faure, Yoko Mizokami, Éric Dinet
Abstract:
In recent years, the potential of light to assist visually impaired people in their indoor mobility has been demonstrated by different studies. Implementing smart lighting systems for selective visual enhancement, designed especially for low-vision people, is an approach that breaks with existing visual aids. The appearance of the surface of an object is significantly influenced by the lighting conditions and the constituent materials of the object. Objects may therefore appear different from what is expected, and lighting conditions play an important part in accurate material recognition. The main objective of this work was to investigate the effect of the spatial distribution of light on object identification in the context of low vision. The purpose was to determine whether specific lighting approaches should be preferred for visually impaired people, and which ones. A psychophysical experiment was designed to study the ability of individuals to identify the smaller cube of a pair under different lighting diffusion conditions. Participants were divided into two distinct groups: a reference group of observers with normal or corrected-to-normal visual acuity and a test group, in which observers were required to wear visual impairment simulation glasses. All participants were presented with pairs of cubes in a "miniature room" and were instructed to estimate the relative size of the two cubes. The miniature room replicates real-life settings, adorned with decorations and separated from external light sources by black curtains. The correlated color temperature was set to 6000 K, and the horizontal illuminance at the object level to approximately 240 lux. The objects presented for comparison consisted of 11 white cubes and 11 black cubes of different sizes manufactured with a 3D printer. Participants were seated 60 cm away from the objects. Two different levels of light diffuseness were implemented.
After receiving instructions, participants were asked to judge whether the two presented cubes were the same size or if one was smaller. They provided one of five possible answers: "Left one is smaller," "Left one is smaller but unsure," "Same size," "Right one is smaller," or "Right one is smaller but unsure." The method of constant stimuli was used, presenting stimulus pairs in a random order to prevent learning and expectation biases. Each pair consisted of a comparison stimulus and a reference cube. A psychometric function was constructed to link stimulus value with the frequency of correct detection, aiming to determine the 50% correct detection threshold. Collected data were analyzed through graphs illustrating participants' responses to stimuli, with accuracy increasing as the size difference between cubes grew. Statistical analyses, including 2-way ANOVA tests, showed that light diffuseness had no significant impact on the difference threshold, whereas object color had a significant influence in low vision scenarios. The first results and trends derived from this pilot experiment suggest that future investigations could explore extreme diffusion conditions to comprehensively assess the impact of diffusion on object identification. For example, the first findings related to light diffuseness may be attributed to the range of manipulation, emphasizing the need to explore how other lighting-related factors interact with diffuseness.
Keywords: Lighting, Low Vision, Visual Aid, Object Identification, Psychophysical Experiment
Procedia PDF Downloads 64
2490 Robust and Real-Time Traffic Counting System
Authors: Hossam M. Moftah, Aboul Ella Hassanien
Abstract:
In recent years, the importance of automatic traffic control has increased because of traffic jams, especially in big cities, where it is needed for signal control and efficient traffic management. Traffic counting, as a kind of traffic control, is important for knowing the road traffic density in real time. This paper presents a fast and robust traffic counting system using different image processing techniques. The proposed system is composed of four fundamental phases: image acquisition, pre-processing, object detection, and finally counting the connected objects. The object detection phase comprises five steps: subtracting the background, converting the image to binary, closing gaps and connecting nearby blobs, smoothing the image to remove noise and very small objects, and detecting the connected objects. Experimental results show the success of the proposed approach.
Keywords: traffic counting, traffic management, image processing, object detection, computer vision
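The detection and counting phases can be sketched on a tiny grayscale frame. This is a simplified illustration, not the authors' system: images are nested lists of intensities, the morphological closing step is omitted, smoothing is approximated by discarding blobs below a minimum area, and the threshold, `min_area`, and function name are assumed values.

```python
from collections import deque

def count_objects(frame, background, threshold=30, min_area=2):
    """Count foreground blobs in a grayscale frame.

    Steps: background subtraction, binarization, 4-connected component
    labeling by breadth-first search, and removal of very small blobs.
    """
    h, w = len(frame), len(frame[0])
    # Background subtraction followed by thresholding to a binary mask.
    mask = [[abs(frame[r][c] - background[r][c]) > threshold for c in range(w)]
            for r in range(h)]
    seen = [[False] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood-fill one connected component and measure its area.
            area = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:    # ignore noise-sized blobs
                count += 1
    return count

if __name__ == "__main__":
    background = [[0] * 6 for _ in range(5)]
    frame = [
        [0, 200, 200, 0, 0, 0],
        [0, 200, 200, 0, 0, 0],
        [0, 0, 0, 0, 0, 90],
        [0, 0, 0, 150, 150, 0],
        [0, 0, 0, 150, 0, 0],
    ]
    print(count_objects(frame, background))
```

On real video, libraries such as OpenCV offer optimized versions of each step; the pure-Python version here only shows how the phases fit together.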
Procedia PDF Downloads 294
2489 Vision-Based Collision Avoidance for Unmanned Aerial Vehicles by Recurrent Neural Networks
Authors: Yao-Hong Tsai
Abstract:
Thanks to sensor technology, video surveillance has become the main means of security control in every big city in the world. Surveillance is usually used by governments for intelligence gathering, the prevention of crime, the protection of a process, person, group, or object, or the investigation of crime. Many surveillance systems based on computer vision technology have been developed in recent years. Moving target tracking is the most common task for an Unmanned Aerial Vehicle (UAV), which must find and track objects of interest in mobile aerial surveillance for civilian applications. This paper focuses on vision-based collision avoidance for UAVs using recurrent neural networks. First, images from cameras on the UAV were fused using a deep convolutional neural network. Then, a recurrent neural network was constructed to obtain high-level image features for object tracking and to extract low-level image features for noise reduction. The system distributed its computation between local and cloud platforms to efficiently perform object detection, tracking, and collision avoidance based on multiple UAVs. Experiments on several challenging datasets showed that the proposed algorithm outperforms the state-of-the-art methods.
Keywords: unmanned aerial vehicle, object tracking, deep learning, collision avoidance
Procedia PDF Downloads 160
2488 RV-YOLOX: Object Detection on Inland Waterways Based on Optimized YOLOX Through Fusion of Vision and 3+1D Millimeter Wave Radar
Authors: Zixian Zhang, Shanliang Yao, Zile Huang, Zhaodong Wu, Xiaohui Zhu, Yong Yue, Jieming Ma
Abstract:
Unmanned Surface Vehicles (USVs) are valuable due to their ability to perform dangerous and time-consuming tasks on the water. Object detection tasks are significant in these applications. However, inherent challenges, such as the complex distribution of obstacles, reflections from shore structures, and water surface fog, hinder the object detection performance of USVs. To address these problems, this paper provides a fusion method for USVs to effectively detect objects in the inland surface environment, utilizing vision sensors and 3+1D millimeter-wave (MMW) radar. MMW radar is complementary to vision sensors, providing robust environmental information. The radar 3D point cloud is converted to a 2D radar pseudo-image by means of a point transformer, to unify the format of radar and vision information. We propose a multi-source object detection network (RV-YOLOX) based on radar-vision fusion for the inland waterway environment. The performance is evaluated on our self-recorded waterways dataset. Compared with the YOLOX network, our fusion network significantly improves detection accuracy, especially for objects in poor lighting conditions.
Keywords: inland waterways, YOLO, sensor fusion, self-attention
Procedia PDF Downloads 121
2487 Implementation of a Serializer to Represent PHP Objects in the Extensible Markup Language
Authors: Lidia N. Hernández-Piña, Carlos R. Jaimez-González
Abstract:
Interoperability in distributed systems is an important feature that refers to the communication of two applications written in different programming languages. This paper presents a serializer and a de-serializer of PHP objects to and from XML, which is an independent library written in the PHP programming language. The XML generated by this serializer is independent of the programming language, and can be used by other existing Web Objects in XML (WOX) serializers and de-serializers, which allow interoperability with other object-oriented programming languages.
Keywords: interoperability, PHP object serialization, PHP to XML, web objects in XML, WOX
Procedia PDF Downloads 2362486 Static and Dynamic Hand Gesture Recognition Using Convolutional Neural Network Models
Authors: Keyi Wang
Abstract:
Like the touchscreen, hand-gesture-based human-computer interaction (HCI) is a technology that could allow people to perform a variety of tasks faster and more conveniently. This paper proposes a method for training a hand gesture recognition system for images and video clips using a convolutional neural network (CNN). A 2D CNN model trained on a dataset containing 6 hand gesture images achieves ~98% accuracy. Furthermore, a 3D CNN model trained on a dataset containing 4 hand gesture video clips achieves ~83% accuracy. It is demonstrated that a Cozmo robot loaded with the pre-trained models is able to recognize static and dynamic hand gestures.
Keywords: deep learning, hand gesture recognition, computer vision, image processing
Procedia PDF Downloads 138
2485 Features Reduction Using Bat Algorithm for Identification and Recognition of Parkinson Disease
Authors: P. Shrivastava, A. Shukla, K. Verma, S. Rungta
Abstract:
Parkinson's disease is a chronic neurological disorder that directly affects human gait. It leads to slowness of movement and causes muscle rigidity and tremors. Gait serves as a primary outcome measure for studies aiming at early recognition of the disease. Using gait techniques, this paper implements an efficient binary bat algorithm for early detection of Parkinson's disease by selecting the optimal features required to distinguish affected patients from others. Data from 166 people, both fit and affected, are collected, and optimal feature selection is done using particle swarm optimization (PSO) and the bat algorithm. The reduced dataset is then classified using a neural network. The experiments indicate that the binary bat algorithm outperforms traditional PSO and the genetic algorithm and gives a fairly good recognition rate even with the reduced dataset.
Keywords: Parkinson, gait, feature selection, bat algorithm
Procedia PDF Downloads 545
2484 KSVD-SVM Approach for Spontaneous Facial Expression Recognition
Authors: Dawood Al Chanti, Alice Caplier
Abstract:
Sparse representations of signals have received a great deal of attention in recent years. This paper demonstrates the value of sparse representation as a means of performing sparse discriminative analysis between spontaneous facial expressions. An automatic facial expression recognition system is presented. It uses a KSVD-SVM approach made up of three main stages: a pre-processing and feature extraction stage, which solves the problem of shared subspace distribution based on random projection theory to obtain low-dimensional discriminative and reconstructive features; a dictionary learning and sparse coding stage, which uses the KSVD model to learn discriminative undercomplete or overcomplete dictionaries for sparse coding; and finally a classification stage, which uses an SVM classifier for facial expression recognition. Our main concern is to be able to recognize non-basic affective states and non-acted expressions. Extensive experiments on the JAFFE static acted facial expression database and on the DynEmo dynamic spontaneous facial expression database exhibit very good recognition rates.
Keywords: dictionary learning, random projection, pose and spontaneous facial expression, sparse representation
Procedia PDF Downloads 305
2483 Implementation of a Multimodal Biometrics Recognition System with Combined Palm Print and Iris Features
Authors: Rabab M. Ramadan, Elaraby A. Elgallad
Abstract:
In widespread use, unimodal biometric systems face a variety of problems such as signal and background noise, distortion, and environmental differences. Multimodal biometric systems are therefore proposed to solve these problems. This paper introduces a bimodal biometric recognition system based on features extracted from the human palm print and iris. Palm print biometrics is a fairly new, evolving technology that identifies people by their palm features. The iris is a strong competitor, together with the face and fingerprints, for inclusion in multimodal recognition systems. In this research, we introduce an algorithm that combines the features extracted from the palm and the iris using a texture-based descriptor, the Scale Invariant Feature Transform (SIFT). Because the feature sets come from different biometric modalities and are therefore non-homogeneous, they will be concatenated to form a single feature vector. Particle swarm optimization (PSO) is used as a feature selection technique to reduce the dimensionality of the feature vector. The proposed algorithm will be applied to the Indian Institute of Technology Delhi (IITD) database, and its performance will be compared with various iris recognition algorithms found in the literature.
Keywords: iris recognition, particle swarm optimization, feature extraction, feature selection, palm print, the Scale Invariant Feature Transform (SIFT)
Procedia PDF Downloads 235
2482 Problems Arising in Visual Perception
Authors: K. A. Tharanga, K. H. H. Damayanthi
Abstract:
Perception is an epistemological concept discussed in philosophy. Perception, in other words vision, is one of the ways in which human beings gain empirical knowledge through the five senses. However, we face innumerable problems when acquiring knowledge from perception, and therefore the knowledge gained through perception is uncertain. What we see in the external world is not real. These are the major issues that we face when receiving knowledge through perception. Sometimes what we see has no physical existence. In such cases, perception is relative. The following frames will be taken into consideration when perception is analyzed: illusions and delusions, the figure of a physical object, the appearance and the reality of a physical object, the time factor, and the colour of a physical object. Seeing and knowing vary according to these conceptual frames. We cannot come to a proper conclusion about what we see in the empirical world, because the things that we see are not really there. Hence the scientific knowledge gained from observation is doubtful. All the factors discussed in science remain in the physical world. There is a leap from one's own existence to the existence of a world outside one's mind. Indeed, one can suppose that what one takes to be real is just a massive deception. However, if someone begins to doubt the whole world on the basis of the above facts, his or her view unavoidably becomes scepticism or nihilism. This is a certain reality.
Keywords: empirical, perception, scepticism, nihilism
Procedia PDF Downloads 93
2481 Arabic Character Recognition Using Regression Curves with the Expectation Maximization Algorithm
Authors: Abdullah A. AlShaher
Abstract:
In this paper, we demonstrate how regression curves can be used to recognize 2D non-rigid handwritten shapes. Each shape is represented by a set of non-overlapping, uniformly distributed landmarks. The underlying models use second-order polynomials to model shapes within a training set. To estimate the regression models, we need to extract the coefficients that describe the variations within a shape class. Hence, a least squares method is used to estimate such modes. We then proceed by training these coefficients using the apparatus of the Expectation Maximization algorithm. Recognition is carried out by finding the landmark displacement with the least error with respect to the model curves. Isolated handwritten Arabic characters are used to evaluate our approach.
Keywords: character recognition, regression curves, handwritten Arabic letters, expectation maximization algorithm
Procedia PDF Downloads 145
2480 History, Challenges and Solutions for Social Work Education and Recognition in Vietnam
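The least squares step and the least-error recognition rule can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: it fits a single second-order polynomial y = a*x^2 + b*x + c per class, omits the Expectation Maximization training entirely, and uses made-up landmark data. The function names `fit_quadratic`, `residual`, and `recognize` are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_quadratic(points):
    """Least squares fit of y = a*x^2 + b*x + c to (x, y) landmarks."""
    rows = [[x * x, x, 1.0] for x, _ in points]
    ys = [y for _, y in points]
    # Normal equations: (X^T X) coeffs = X^T y
    A = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve_linear(A, b)


def residual(coeffs, points):
    """Sum of squared vertical displacements of landmarks from the curve."""
    a, b, c = coeffs
    return sum((y - (a * x * x + b * x + c)) ** 2 for x, y in points)


def recognize(models, points):
    """Return the class whose model curve gives the least error."""
    return min(models, key=lambda name: residual(models[name], points))


# Made-up training landmarks for two hypothetical shape classes.
models = {
    "arc": fit_quadratic([(x, 2 * x * x - 3 * x + 1) for x in range(-2, 3)]),
    "stroke": fit_quadratic([(x, float(x)) for x in range(-2, 3)]),
}
label = recognize(models, [(0, 1), (1, 0), (2, 3)])
```

In this sketch the query landmarks lie on the "arc" curve, so that class has the smaller residual. A fuller treatment would model variation across many training samples per class, which is where the EM step described in the abstract would come in.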
Authors: Thuy Bui Anh, Ngan Nguyen Thi Thanh
Abstract:
Social work in Vietnam is currently taking its first steps toward becoming a true profession with a strong position in society. However, the spirit of helping and sharing that underlies social work has existed in the daily life of Vietnamese people for a very long time, becoming a precious heritage passed down from ancestors to later generations while they expanded the territory and built and defended the country. Following the stream of history, charity work in Vietnam has gradually become more professional, especially in the last two decades. Accordingly, more than 50 universities and educational institutions in Vietnam have been licensed to provide social work training, ensuring a stronger human resources foundation for this field. Despite this strong growth, the social work profession, social work education, and recognition of the role of social workers still need support to develop in response to the increasing demands of Vietnamese society.
Keywords: education, history, recognition, social work, Vietnam
Procedia PDF Downloads 319
2479 Wave Energy: Efficient Conversion of the Big Waves
Authors: Md. Moniruzzaman
Abstract:
The energy of ocean waves across a large part of the earth is inexhaustible. The whole world will benefit if this endless energy can be used in an easy way, and coastal countries will easily be able to meet their own energy needs. The purpose of this article is to present a simple method for the efficient use of ocean wave energy. The paper starts by discussing the various forces acting on a floating object and then describes the method. It then presents a calculation for 73.39 MW of hydropower from the tidal wave, illustrated with sketches and pictures. Finally, the conclusion states the possibilities and advantages.
Keywords: anchor, electricity, floating object, pump, ship city, wave energy
Procedia PDF Downloads 84
2478 Sentence Structure for Free Word Order Languages in Context with Anaphora Resolution: A Case Study of Hindi
Authors: Pardeep Singh, Kamlesh Dutta
Abstract:
Many languages have a fixed sentence structure, while others have free word order. The accuracy of syntax-based anaphora resolution algorithms depends on the structure of the sentence, so it is important to analyze the structure of a language before implementing these algorithms. In this study, we analyzed sentence structure by exploiting the case markers in Hindi as well as some special tags for subject and object. We also investigated the word order of Hindi. Word order typology refers to the study of the order of the syntactic constituents of a language. We analyzed 165 news items from Ranchi Express in the plain-text EMILLE corpus, comprising 1745 sentences. Eight dialogue-based files from the same corpus, containing 1521 sentences, were also analyzed. The percentages of subject-object-verb (SOV) and object-subject-verb (OSV) structures are 66.90 and 33.10, respectively.
Keywords: anaphora resolution, free word order languages, SOV, OSV
Procedia PDF Downloads 472