Search results for: graph recognition
1850 Wolof Voice Response Recognition System: A Deep Learning Model for Wolof Audio Classification
Authors: Krishna Mohan Bathula, Fatou Bintou Loucoubar, FNU Kaleemunnisa, Christelle Scharff, Mark Anthony De Castro
Abstract:
Voice recognition algorithms such as automatic speech recognition and text-to-speech systems with African languages can play an important role in bridging the digital divide of Artificial Intelligence in Africa, contributing to the establishment of a fully inclusive information society. This paper proposes a Deep Learning model that can classify the user responses as inputs for an interactive voice response system. A dataset with Wolof language words ‘yes’ and ‘no’ is collected as audio recordings. A two stage Data Augmentation approach is adopted for enhancing the dataset size required by the deep neural network. Data preprocessing and feature engineering with Mel-Frequency Cepstral Coefficients are implemented. Convolutional Neural Networks (CNNs) have proven to be very powerful in image classification and are promising for audio processing when sounds are transformed into spectra. For performing voice response classification, the recordings are transformed into sound frequency feature spectra and then applied image classification methodology using a deep CNN model. The inference model of this trained and reusable Wolof voice response recognition system can be integrated with many applications associated with both web and mobile platforms.Keywords: automatic speech recognition, interactive voice response, voice response recognition, wolof word classification
Procedia PDF Downloads 1161849 Makhraj Recognition Using Convolutional Neural Network
Authors: Zan Azma Nasruddin, Irwan Mazlin, Nor Aziah Daud, Fauziah Redzuan, Fariza Hanis Abdul Razak
Abstract:
This paper focuses on a machine learning that learn the correct pronunciation of Makhraj Huroofs. Usually, people need to find an expert to pronounce the Huroof accurately. In this study, the researchers have developed a system that is able to learn the selected Huroofs which are ha, tsa, zho, and dza using the Convolutional Neural Network. The researchers present the chosen type of the CNN architecture to make the system that is able to learn the data (Huroofs) as quick as possible and produces high accuracy during the prediction. The researchers have experimented the system to measure the accuracy and the cross entropy in the training process.Keywords: convolutional neural network, Makhraj recognition, speech recognition, signal processing, tensorflow
Procedia PDF Downloads 3351848 Managing Cognitive Load in Accounting: An Analysis of Three Instructional Designs in Financial Accounting
Authors: Seedwell Sithole
Abstract:
One of the persistent problems in accounting education is how to effectively support students’ learning. A promising technique to this issue is to investigate the extent that learning is determined by the design of instructional material. This study examines the academic performance of students using three instructional designs in financial accounting. Student’s performance scores and reported mental effort ratings were used to determine the instructional effectiveness. The findings of this study show that accounting students prefer graph and text designs that are integrated. The results suggest that spatially separated graph and text presentations in accounting should be reorganized to align with the requirements of human cognitive architecture.Keywords: accounting, cognitive load, education, instructional preferences, students
Procedia PDF Downloads 1511847 The Artificial Intelligence Technologies Used in PhotoMath Application
Authors: Tala Toonsi, Marah Alagha, Lina Alnowaiser, Hala Rajab
Abstract:
This report is about the Photomath app, which is an AI application that uses image recognition technology, specifically optical character recognition (OCR) algorithms. The (OCR) algorithm translates the images into a mathematical equation, and the app automatically provides a step-by-step solution. The application supports decimals, basic arithmetic, fractions, linear equations, and multiple functions such as logarithms. Testing was conducted to examine the usage of this app, and results were collected by surveying ten participants. Later, the results were analyzed. This paper seeks to answer the question: To what level the artificial intelligence features are accurate and the speed of process in this app. It is hoped this study will inform about the efficiency of AI in Photomath to the users.Keywords: photomath, image recognition, app, OCR, artificial intelligence, mathematical equations.
Procedia PDF Downloads 1711846 A Human Activity Recognition System Based on Sensory Data Related to Object Usage
Authors: M. Abdullah, Al-Wadud
Abstract:
Sensor-based activity recognition systems usually accounts which sensors have been activated to perform an activity. The system then combines the conditional probabilities of those sensors to represent different activities and takes the decision based on that. However, the information about the sensors which are not activated may also be of great help in deciding which activity has been performed. This paper proposes an approach where the sensory data related to both usage and non-usage of objects are utilized to make the classification of activities. Experimental results also show the promising performance of the proposed method.Keywords: Naïve Bayesian, based classification, activity recognition, sensor data, object-usage model
Procedia PDF Downloads 3221845 Features Vector Selection for the Recognition of the Fragmented Handwritten Numeric Chains
Authors: Salim Ouchtati, Aissa Belmeguenai, Mouldi Bedda
Abstract:
In this study, we propose an offline system for the recognition of the fragmented handwritten numeric chains. Firstly, we realized a recognition system of the isolated handwritten digits, in this part; the study is based mainly on the evaluation of neural network performances, trained with the gradient backpropagation algorithm. The used parameters to form the input vector of the neural network are extracted from the binary images of the isolated handwritten digit by several methods: the distribution sequence, sondes application, the Barr features, and the centered moments of the different projections and profiles. Secondly, the study is extended for the reading of the fragmented handwritten numeric chains constituted of a variable number of digits. The vertical projection was used to segment the numeric chain at isolated digits and every digit (or segment) was presented separately to the entry of the system achieved in the first part (recognition system of the isolated handwritten digits).Keywords: features extraction, handwritten numeric chains, image processing, neural networks
Procedia PDF Downloads 2651844 Semantic Data Schema Recognition
Authors: Aïcha Ben Salem, Faouzi Boufares, Sebastiao Correia
Abstract:
The subject covered in this paper aims at assisting the user in its quality approach. The goal is to better extract, mix, interpret and reuse data. It deals with the semantic schema recognition of a data source. This enables the extraction of data semantics from all the available information, inculding the data and the metadata. Firstly, it consists of categorizing the data by assigning it to a category and possibly a sub-category, and secondly, of establishing relations between columns and possibly discovering the semantics of the manipulated data source. These links detected between columns offer a better understanding of the source and the alternatives for correcting data. This approach allows automatic detection of a large number of syntactic and semantic anomalies.Keywords: schema recognition, semantic data profiling, meta-categorisation, semantic dependencies inter columns
Procedia PDF Downloads 4181843 Speech Recognition Performance by Adults: A Proposal for a Battery for Marathi
Authors: S. B. Rathna Kumar, Pranjali A Ujwane, Panchanan Mohanty
Abstract:
The present study aimed to develop a battery for assessing speech recognition performance by adults in Marathi. A total of four word lists were developed by considering word frequency, word familiarity, words in common use, and phonemic balance. Each word list consists of 25 words (15 monosyllabic words in CVC structure and 10 monosyllabic words in CVCV structure). Equivalence analysis and performance-intensity function testing was carried using the four word lists on a total of 150 native speakers of Marathi belonging to different regions of Maharashtra (Vidarbha, Marathwada, Khandesh and Northern Maharashtra, Pune, and Konkan). The subjects were further equally divided into five groups based on above mentioned regions. It was found that there was no significant difference (p > 0.05) in the speech recognition performance between groups for each word list and between word lists for each group. Hence, the four word lists developed were equally difficult for all the groups and can be used interchangeably. The performance-intensity (PI) function curve showed semi-linear function, and the groups’ mean slope of the linear portions of the curve indicated an average linear slope of 4.64%, 4.73%, 4.68%, and 4.85% increase in word recognition score per dB for list 1, list 2, list 3 and list 4 respectively. Although, there is no data available on speech recognition tests for adults in Marathi, most of the findings of the study are in line with the findings of research reports on other languages. The four word lists, thus developed, were found to have sufficient reliability and validity in assessing speech recognition performance by adults in Marathi.Keywords: speech recognition performance, phonemic balance, equivalence analysis, performance-intensity function testing, reliability, validity
Procedia PDF Downloads 3561842 Face Recognition Using Body-Worn Camera: Dataset and Baseline Algorithms
Authors: Ali Almadan, Anoop Krishnan, Ajita Rattani
Abstract:
Facial recognition is a widely adopted technology in surveillance, border control, healthcare, banking services, and lately, in mobile user authentication with Apple introducing “Face ID” moniker with iPhone X. A lot of research has been conducted in the area of face recognition on datasets captured by surveillance cameras, DSLR, and mobile devices. Recently, face recognition technology has also been deployed on body-worn cameras to keep officers safe, enabling situational awareness and providing evidence for trial. However, limited academic research has been conducted on this topic so far, without the availability of any publicly available datasets with a sufficient sample size. This paper aims to advance research in the area of face recognition using body-worn cameras. To this aim, the contribution of this work is two-fold: (1) collection of a dataset consisting of a total of 136,939 facial images of 102 subjects captured using body-worn cameras in in-door and daylight conditions and (2) evaluation of various deep-learning architectures for face identification on the collected dataset. Experimental results suggest a maximum True Positive Rate(TPR) of 99.86% at False Positive Rate(FPR) of 0.000 obtained by SphereFace based deep learning architecture in daylight condition. The collected dataset and the baseline algorithms will promote further research and development. A downloadable link of the dataset and the algorithms is available by contacting the authors.Keywords: face recognition, body-worn cameras, deep learning, person identification
Procedia PDF Downloads 1631841 Pre-Analysis of Printed Circuit Boards Based on Multispectral Imaging for Vision Based Recognition of Electronics Waste
Authors: Florian Kleber, Martin Kampel
Abstract:
The increasing demand of gallium, indium and rare-earth elements for the production of electronics, e.g. solid state-lighting, photovoltaics, integrated circuits, and liquid crystal displays, will exceed the world-wide supply according to current forecasts. Recycling systems to reclaim these materials are not yet in place, which challenges the sustainability of these technologies. This paper proposes a multispectral imaging system as a basis for a vision based recognition system for valuable components of electronics waste. Multispectral images intend to enhance the contrast of images of printed circuit boards (single components, as well as labels) for further analysis, such as optical character recognition and entire printed circuit board recognition. The results show that a higher contrast is achieved in the near infrared compared to ultraviolet and visible light.Keywords: electronics waste, multispectral imaging, printed circuit boards, rare-earth elements
Procedia PDF Downloads 4151840 Modeling and Simulation of Underwater Flexible Manipulator as Raleigh Beam Using Bond Graph
Authors: Sumit Kumar, Sunil Kumar, Chandan Deep Singh
Abstract:
This paper presents modeling and simulation of flexible robot in an underwater environment. The underwater environment completely contrasts with ground or space environment. The robot in an underwater situation is subjected to various dynamic forces like buoyancy forces, hydrostatic and hydrodynamic forces. The underwater robot is modeled as Rayleigh beam. The developed model further allows estimating the deflection of tip in two directions. The complete dynamics of the underwater robot is analyzed, which is the main focus of this investigation. The control of robot trajectory is not discussed in this paper. Simulation is performed using Symbol Shakti software.Keywords: bond graph modeling, dynamics. modeling, rayleigh beam, underwater robot
Procedia PDF Downloads 5871839 The Combination of the Mel Frequency Cepstral Coefficients, Perceptual Linear Prediction, Jitter and Shimmer Coefficients for the Improvement of Automatic Recognition System for Dysarthric Speech
Authors: Brahim Fares Zaidi
Abstract:
Our work aims to improve our Automatic Recognition System for Dysarthria Speech based on the Hidden Models of Markov and the Hidden Markov Model Toolkit to help people who are sick. With pronunciation problems, we applied two techniques of speech parameterization based on Mel Frequency Cepstral Coefficients and Perceptual Linear Prediction and concatenated them with JITTER and SHIMMER coefficients in order to increase the recognition rate of a dysarthria speech. For our tests, we used the NEMOURS database that represents speakers with dysarthria and normal speakers.Keywords: ARSDS, HTK, HMM, MFCC, PLP
Procedia PDF Downloads 1081838 Multimodal Data Fusion Techniques in Audiovisual Speech Recognition
Authors: Hadeer M. Sayed, Hesham E. El Deeb, Shereen A. Taie
Abstract:
In the big data era, we are facing a diversity of datasets from different sources in different domains that describe a single life event. These datasets consist of multiple modalities, each of which has a different representation, distribution, scale, and density. Multimodal fusion is the concept of integrating information from multiple modalities in a joint representation with the goal of predicting an outcome through a classification task or regression task. In this paper, multimodal fusion techniques are classified into two main classes: model-agnostic techniques and model-based approaches. It provides a comprehensive study of recent research in each class and outlines the benefits and limitations of each of them. Furthermore, the audiovisual speech recognition task is expressed as a case study of multimodal data fusion approaches, and the open issues through the limitations of the current studies are presented. This paper can be considered a powerful guide for interested researchers in the field of multimodal data fusion and audiovisual speech recognition particularly.Keywords: multimodal data, data fusion, audio-visual speech recognition, neural networks
Procedia PDF Downloads 1111837 Distant Speech Recognition Using Laser Doppler Vibrometer
Authors: Yunbin Deng
Abstract:
Most existing applications of automatic speech recognition relies on cooperative subjects at a short distance to a microphone. Standoff speech recognition using microphone arrays can extend the subject to sensor distance somewhat, but it is still limited to only a few feet. As such, most deployed applications of standoff speech recognitions are limited to indoor use at short range. Moreover, these applications require air passway between the subject and the sensor to achieve reasonable signal to noise ratio. This study reports long range (50 feet) automatic speech recognition experiments using a Laser Doppler Vibrometer (LDV) sensor. This study shows that the LDV sensor modality can extend the speech acquisition standoff distance far beyond microphone arrays to hundreds of feet. In addition, LDV enables 'listening' through the windows for uncooperative subjects. This enables new capabilities in automatic audio and speech intelligence, surveillance, and reconnaissance (ISR) for law enforcement, homeland security and counter terrorism applications. The Polytec LDV model OFV-505 is used in this study. To investigate the impact of different vibrating materials, five parallel LDV speech corpora, each consisting of 630 speakers, are collected from the vibrations of a glass window, a metal plate, a plastic box, a wood slate, and a concrete wall. These are the common materials the application could encounter in a daily life. These data were compared with the microphone counterpart to manifest the impact of various materials on the spectrum of the LDV speech signal. State of the art deep neural network modeling approaches is used to conduct continuous speaker independent speech recognition on these LDV speech datasets. Preliminary phoneme recognition results using time-delay neural network, bi-directional long short term memory, and model fusion shows great promise of using LDV for long range speech recognition. To author’s best knowledge, this is the first time an LDV is reported for long distance speech recognition application.Keywords: covert speech acquisition, distant speech recognition, DSR, laser Doppler vibrometer, LDV, speech intelligence surveillance and reconnaissance, ISR
Procedia PDF Downloads 1791836 Interactive Shadow Play Animation System
Authors: Bo Wan, Xiu Wen, Lingling An, Xiaoling Ding
Abstract:
The paper describes a Chinese shadow play animation system based on Kinect. Users, without any professional training, can personally manipulate the shadow characters to finish a shadow play performance by their body actions and get a shadow play video through giving the record command to our system if they want. In our system, Kinect is responsible for capturing human movement and voice commands data. Gesture recognition module is used to control the change of the shadow play scenes. After packaging the data from Kinect and the recognition result from gesture recognition module, VRPN transmits them to the server-side. At last, the server-side uses the information to control the motion of shadow characters and video recording. This system not only achieves human-computer interaction, but also realizes the interaction between people. It brings an entertaining experience to users and easy to operate for all ages. Even more important is that the application background of Chinese shadow play embodies the protection of the art of shadow play animation.Keywords: hadow play animation, Kinect, gesture recognition, VRPN, HCI
Procedia PDF Downloads 4011835 Effective Stacking of Deep Neural Models for Automated Object Recognition in Retail Stores
Authors: Ankit Sinha, Soham Banerjee, Pratik Chattopadhyay
Abstract:
Automated product recognition in retail stores is an important real-world application in the domain of Computer Vision and Pattern Recognition. In this paper, we consider the problem of automatically identifying the classes of the products placed on racks in retail stores from an image of the rack and information about the query/product images. We improve upon the existing approaches in terms of effectiveness and memory requirement by developing a two-stage object detection and recognition pipeline comprising of a Faster-RCNN-based object localizer that detects the object regions in the rack image and a ResNet-18-based image encoder that classifies the detected regions into the appropriate classes. Each of the models is fine-tuned using appropriate data sets for better prediction and data augmentation is performed on each query image to prepare an extensive gallery set for fine-tuning the ResNet-18-based product recognition model. This encoder is trained using a triplet loss function following the strategy of online-hard-negative-mining for improved prediction. The proposed models are lightweight and can be connected in an end-to-end manner during deployment to automatically identify each product object placed in a rack image. Extensive experiments using Grozi-32k and GP-180 data sets verify the effectiveness of the proposed model.Keywords: retail stores, faster-RCNN, object localization, ResNet-18, triplet loss, data augmentation, product recognition
Procedia PDF Downloads 1561834 Evolution of the Environmental Justice Concept
Authors: Zahra Bakhtiari
Abstract:
This article explores the development and evolution of the concept of environmental justice, which has shifted from being dominated by white and middle-class individuals to a civil struggle by marginalized communities against environmental injustices. Environmental justice aims to achieve equity in decision-making and policy-making related to the environment. The concept of justice in this context includes four fundamental aspects: distribution, procedure, recognition, and capabilities. Recent scholars have attempted to broaden the concept of justice to include dimensions of participation, recognition, and capabilities. Focusing on all four dimensions of environmental justice is crucial for effective planning and policy-making to address environmental issues. Ignoring any of these aspects can lead to the failure of efforts and the waste of resources.Keywords: environmental justice, distribution, procedure, recognition, capabilities
Procedia PDF Downloads 931833 Programmed Speech to Text Summarization Using Graph-Based Algorithm
Authors: Hamsini Pulugurtha, P. V. S. L. Jagadamba
Abstract:
Programmed Speech to Text and Text Summarization Using Graph-based Algorithms can be utilized in gatherings to get the short depiction of the gathering for future reference. This gives signature check utilizing Siamese neural organization to confirm the personality of the client and convert the client gave sound record which is in English into English text utilizing the discourse acknowledgment bundle given in python. At times just the outline of the gathering is required, the answer for this text rundown. Thus, the record is then summed up utilizing the regular language preparing approaches, for example, solo extractive text outline calculationsKeywords: Siamese neural network, English speech, English text, natural language processing, unsupervised extractive text summarization
Procedia PDF Downloads 2181832 Two Concurrent Convolution Neural Networks TC*CNN Model for Face Recognition Using Edge
Authors: T. Alghamdi, G. Alaghband
Abstract:
In this paper we develop a model that couples Two Concurrent Convolution Neural Network with different filters (TC*CNN) for face recognition and compare its performance to an existing sequential CNN (base model). We also test and compare the quality and performance of the models on three datasets with various levels of complexity (easy, moderate, and difficult) and show that for the most complex datasets, edges will produce the most accurate and efficient results. We further show that in such cases while Support Vector Machine (SVM) models are fast, they do not produce accurate results.Keywords: Convolution Neural Network, Edges, Face Recognition , Support Vector Machine.
Procedia PDF Downloads 1531831 Real-Time Recognition of Dynamic Hand Postures on a Neuromorphic System
Authors: Qian Liu, Steve Furber
Abstract:
To explore how the brain may recognize objects in its general,accurate and energy-efficient manner, this paper proposes the use of a neuromorphic hardware system formed from a Dynamic Video Sensor~(DVS) silicon retina in concert with the SpiNNaker real-time Spiking Neural Network~(SNN) simulator. As a first step in the exploration on this platform a recognition system for dynamic hand postures is developed, enabling the study of the methods used in the visual pathways of the brain. Inspired by the behaviours of the primary visual cortex, Convolutional Neural Networks (CNNs) are modeled using both linear perceptrons and spiking Leaky Integrate-and-Fire (LIF) neurons. In this study's largest configuration using these approaches, a network of 74,210 neurons and 15,216,512 synapses is created and operated in real-time using 290 SpiNNaker processor cores in parallel and with 93.0% accuracy. A smaller network using only 1/10th of the resources is also created, again operating in real-time, and it is able to recognize the postures with an accuracy of around 86.4% -only 6.6% lower than the much larger system. The recognition rate of the smaller network developed on this neuromorphic system is sufficient for a successful hand posture recognition system, and demonstrates a much-improved cost to performance trade-off in its approach.Keywords: spiking neural network (SNN), convolutional neural network (CNN), posture recognition, neuromorphic system
Procedia PDF Downloads 4721830 Pattern Recognition Search: An Advancement Over Interpolation Search
Authors: Shahpar Yilmaz, Yasir Nadeem, Syed A. Mehdi
Abstract:
Searching for a record in a dataset is always a frequent task for any data structure-related application. Hence, a fast and efficient algorithm for the approach has its importance in yielding the quickest results and enhancing the overall productivity of the company. Interpolation search is one such technique used to search through a sorted set of elements. This paper proposes a new algorithm, an advancement over interpolation search for the application of search over a sorted array. Pattern Recognition Search or PR Search (PRS), like interpolation search, is a pattern-based divide and conquer algorithm whose objective is to reduce the sample size in order to quicken the process and it does so by treating the array as a perfect arithmetic progression series and thereby deducing the key element’s position. We look to highlight some of the key drawbacks of interpolation search, which are accounted for in the Pattern Recognition Search.Keywords: array, complexity, index, sorting, space, time
Procedia PDF Downloads 2431829 Pattern Recognition Based on Simulation of Chemical Senses (SCS)
Authors: Nermeen El Kashef, Yasser Fouad, Khaled Mahar
Abstract:
No AI-complete system can model the human brain or behavior, without looking at the totality of the whole situation and incorporating a combination of senses. This paper proposes a Pattern Recognition model based on Simulation of Chemical Senses (SCS) for separation and classification of sign language. The model based on human taste controlling strategy. The main idea of the introduced model is motivated by the facts that the tongue cluster input substance into its basic tastes first, and then the brain recognizes its flavor. To implement this strategy, two level architecture is proposed (this is inspired from taste system). The separation-level of the architecture focuses on hand posture cluster, while the classification-level of the architecture to recognizes the sign language. The efficiency of proposed model is demonstrated experimentally by recognizing American Sign Language (ASL) data set. The recognition accuracy obtained for numbers of ASL is 92.9 percent.Keywords: artificial intelligence, biocybernetics, gustatory system, sign language recognition, taste sense
Procedia PDF Downloads 2941828 Ultraviolet Visible Spectroscopy Analysis on Transformer Oil by Correlating It with Various Oil Parameters
Authors: Rajnish Shrivastava, Y. R. Sood, Priti Pundir, Rahul Srivastava
Abstract:
Power transformer is one of the most important devices that are used in power station. Due to several fault impending upon it or due to ageing, etc its life gets lowered. So, it becomes necessary to have diagnosis of oil for fault analysis. Due to the chemical, electrical, thermal and mechanical stress the insulating material in the power transformer degraded. It is important to regularly assess the condition of oil and the remaining life of the power transformer. In this paper UV-VIS absorption graph area is correlated with moisture content, Flash point, IFT and Density of Transformer oil. Since UV-VIS absorption graph area varies accordingly with the variation in different transformer parameters. So by obtaining the correlation among different oil parameters for oil with respect to UV-VIS absorption area, decay contents of transformer oil can be predictedKeywords: breakdown voltage (BDV), interfacial Tension (IFT), moisture content, ultra violet-visible rays spectroscopy (UV-VIS)
Procedia PDF Downloads 6421827 Jordan Curves in the Digital Plane with Respect to the Connectednesses given by Certain Adjacency Graphs
Authors: Josef Slapal
Abstract:
Digital images are approximations of real ones and, therefore, to be able to study them, we need the digital plane Z2 to be equipped with a convenient structure that behaves analogously to the Euclidean topology on the real plane. In particular, it is required that such a structure allows for a digital analogue of the Jordan curve theorem. We introduce certain adjacency graphs on the digital plane and prove digital Jordan curves for them thus showing that the graphs provide convenient structures on Z2 for the study and processing of digital images. Further convenient structures including the wellknown Khalimsky and Marcus-Wyse adjacency graphs may be obtained as quotients of the graphs introduced. Since digital Jordan curves represent borders of objects in digital images, the adjacency graphs discussed may be used as background structures on the digital plane for solving the problems of digital image processing that are closely related to borders like border detection, contour filling, pattern recognition, thinning, etc.Keywords: digital plane, adjacency graph, Jordan curve, quotient adjacency
Procedia PDF Downloads 3791826 Image Processing techniques for Surveillance in Outdoor Environment
Authors: Jayanth C., Anirudh Sai Yetikuri, Kavitha S. N.
Abstract:
This paper explores the development and application of computer vision and machine learning techniques for real-time pose detection, facial recognition, and number plate extraction. Utilizing MediaPipe for pose estimation, the research presents methods for detecting hand raises and ducking postures through real-time video analysis. Complementarily, facial recognition is employed to compare and verify individual identities using the face recognition library. Additionally, the paper demonstrates a robust approach for extracting and storing vehicle number plates from images, integrating Optical Character Recognition (OCR) with a database management system. The study highlights the effectiveness and versatility of these technologies in practical scenarios, including security and surveillance applications. The findings underscore the potential of combining computer vision techniques to address diverse challenges and enhance automated systems for both individual and vehicular identification. This research contributes to the fields of computer vision and machine learning by providing scalable solutions and demonstrating their applicability in real-world contexts.Keywords: computer vision, pose detection, facial recognition, number plate extraction, machine learning, real-time analysis, OCR, database management
Procedia PDF Downloads 261825 Defect Localization and Interaction on Surfaces with Projection Mapping and Gesture Recognition
Authors: Qiang Wang, Hongyang Yu, MingRong Lai, Miao Luo
Abstract:
This paper presents a method for accurately localizing and interacting with known surface defects by overlaying patterns onto real-world surfaces using a projection system. Given the world coordinates of the defects, we project corresponding patterns onto the surfaces, providing an intuitive visualization of the specific defect locations. To enable users to interact with and retrieve more information about individual defects, we implement a gesture recognition system based on a pruned and optimized version of YOLOv6. This lightweight model achieves an accuracy of 82.8% and is suitable for deployment on low-performance devices. Our approach demonstrates the potential for enhancing defect identification, inspection processes, and user interaction in various applications.Keywords: defect localization, projection mapping, gesture recognition, YOLOv6
Procedia PDF Downloads 881824 SCNet: A Vehicle Color Classification Network Based on Spatial Cluster Loss and Channel Attention Mechanism
Authors: Fei Gao, Xinyang Dong, Yisu Ge, Shufang Lu, Libo Weng
Abstract:
Vehicle color recognition plays an important role in traffic accident investigation. However, due to the influence of illumination, weather, and noise, vehicle color recognition still faces challenges. In this paper, a vehicle color classification network based on spatial cluster loss and channel attention mechanism (SCNet) is proposed for vehicle color recognition. A channel attention module is applied to extract the features of vehicle color representative regions and reduce the weight of nonrepresentative color regions in the channel. The proposed loss function, called spatial clustering loss (SC-loss), consists of two channel-specific components, such as a concentration component and a diversity component. The concentration component forces all feature channels belonging to the same class to be concentrated through the channel cluster. The diversity components impose additional constraints on the channels through the mean distance coefficient, making them mutually exclusive in spatial dimensions. In the comparison experiments, the proposed method can achieve state-of-the-art performance on the public datasets, VCD, and VeRi, which are 96.1% and 96.2%, respectively. In addition, the ablation experiment further proves that SC-loss can effectively improve the accuracy of vehicle color recognition.Keywords: feature extraction, convolutional neural networks, intelligent transportation, vehicle color recognition
Procedia PDF Downloads 1831823 Analyzing the Use of Augmented Reality and Image Recognition in Cultural Education: Use Case of Sintra Palace Treasure Hunt Application
Authors: Marek Maruszczak
Abstract:
Gamified applications have been used successfully in education for years. The rapid development of technologies such as augmented reality and image recognition increases their availability and reduces their prices. Thus, there is an increasing possibility and need for a wide use of such applications in education. The main purpose of this article is to present the effects of work on a mobile application with augmented reality, the aim of which is to motivate tourists to pay more attention to the attractions and increase the likelihood of moving from one attraction to the next while visiting the Palácio Nacional de Sintra in Portugal. Work on the application was carried out together with the employees of Parques de Sintra from 2019 to 2021. Their effect was the preparation of a mobile application using augmented reality and image recognition. The application was tested on the palace premises by both Parques de Sintra employees and tourists visiting Palácio Nacional de Sintra. The collected conclusions allowed for the formulation of good practices and guidelines that can be used when designing gamified apps for the purpose of cultural education.Keywords: augmented reality, cultural education, gamification, image recognition, mobile games
Procedia PDF Downloads 1901822 The Effect of Experimentally Induced Stress on Facial Recognition Ability of Security Personnel’s
Authors: Zunjarrao Kadam, Vikas Minchekar
Abstract:
The facial recognition is an important task in criminal investigation procedure. The security guards-constantly watching the persons-can help to identify the suspected accused. The forensic psychologists are tackled such cases in the criminal justice system. The security personnel may loss their ability to correctly identify the persons due to constant stress while performing the duty. The present study aimed at to identify the effect of experimentally induced stress on facial recognition ability of security personnel’s. For this study 50, security guards from Sangli, Miraj & Jaysingpur city of the Maharashtra States of India were recruited in the experimental study. The randomized two group design was employed to carry out the research. In the initial condition twenty identity card size photographs were shown to both groups. Afterward, artificial stress was induced in the experimental group through the difficultpuzzle-solvingtask in a limited period. In the second condition, both groups were presented earlier photographs with another additional thirty new photographs. The subjects were asked to recognize the photographs which are shown earliest. The analyzed data revealed that control group has ahighest mean score of facial recognition than experimental group. The results were discussed in the present research.Keywords: experimentally induced stress, facial recognition, cognition, security personnel
Procedia PDF Downloads 2611821 Size-Reduction Strategies for Iris Codes
Authors: Jutta Hämmerle-Uhl, Georg Penn, Gerhard Pötzelsberger, Andreas Uhl
Abstract:
Iris codes contain bits with different entropy. This work investigates different strategies to reduce the size of iris code templates with the aim of reducing storage requirements and computational demand in the matching process. Besides simple sub-sampling schemes, also a binary multi-resolution representation as used in the JBIG hierarchical coding mode is assessed. We find that iris code template size can be reduced significantly while maintaining recognition accuracy. Besides, we propose a two stage identification approach, using small-sized iris code templates in a pre-selection satge, and full resolution templates for final identification, which shows promising recognition behaviour.Keywords: iris recognition, compact iris code, fast matching, best bits, pre-selection identification, two-stage identification
Procedia PDF Downloads 440