Search results for: novel object recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2777

2267 Developmental Psycholinguistic Approach to Conversational Skills - A Continuum of the Sensitivity to Gricean Maxims

Authors: Zsuzsanna Schnell, Francesca Ervas

Abstract:

Background: This experimental pragmatic study confirms a basic tenet of Relevance-theoretical views in the philosophy of language. It draws up a developmental trajectory of the maxims, revealing the cognitive difficulty of their interpretation, their position relative to each other, and the order they may follow in development. A central claim of the present research is that social-cognitive skills play a significant role in inferential meaning construction. Children passing the False Belief Test are significantly more successful in tasks measuring the recognition of the infringement of conversational maxims. Aims and method: Preschoolers’ conversational skills and pragmatic competence are examined in view of their mentalization skills. To this end, the study uses a set of linguistic tasks containing 5 short scenarios for each Gricean maxim. It measures preschoolers’ ToM performance with a first- and a second-order ToM task and compares participants’ ability to recognize the infringement of the Gricean maxims in view of their social-cognitive skills. Results: Findings suggest that Theory of Mind has a predictive force of 75% concerning the ability to follow Gricean maxims efficiently. ToM proved to be a significant factor in predicting the group’s performance and success rates in 3 out of 4 maxim infringement recognition tasks: in the Quantity, Relevance and Manner conditions, but not in the Quality trial. Conclusions: The results confirm that children’s communicative competence in social contexts requires the development of higher-order social-cognitive reasoning, and reveal the cognitive effort needed for the recognition of the infringement of each maxim, yielding a continuum of their cognitive difficulty and trajectory of development.

Keywords: maxim infringement recognition, social cognition, Gricean maxims, developmental pragmatics

Procedia PDF Downloads 6
2266 Becoming a Warrior: Conspiracy, Dramaturgy, and Follower Charisma on the Far Right

Authors: Anthony Albanese

Abstract:

While much of the literature concerning Max Weber’s concept of charisma has addressed the importance of the follower’s recognition of and devotion to the charismatic leader, very little has been said about the processes that lead to the development of follower charisma. This article examines this largely overlooked aspect of the concept, as doing so (1) exacts the dynamics behind charisma’s transferability by moving beyond follower-centric models that focus on the recognition of the leader and toward one that emphasizes the follower’s generation and exhibition of charisma, (2) bridges a crucial gap between the rather wanting “losers of modernization” thesis and the social actor’s proclivity to produce stories and self-cast in said stories, (3) presents authoritarian dispositions as a reaction to the weakening effects everydayness has on charisma, and (4) complicates Weber’s formulation by reassessing the role of continually demonstrable mastery. To illustrate these dynamics, the article turns to the January 6th Capitol attack in the United States.

Keywords: Max Weber, extremism, right-wing populism, charisma

Procedia PDF Downloads 92
2265 Metallacyclodimeric Array Containing Both Suprachannels and Cages: Selective Reservoir and Recognition of Diiodomethane

Authors: Daseul Lee, Jeong Jun Lee, Ok-Sang Jung

Abstract:

Self-assembly of a series of ZnX2 (X- = Cl-, Br-, and I-) with 2,3-bis(4’-nicotinamidephenoxy)naphthalene (L) as a new bidentate pyridyl-donor ligand yields a systematic metallacyclodimeric unit, [ZnX2L]2. The supramolecule forms a characteristic stacked array containing both 1D suprachannels and cages. Weak C-H⋯π and interdigitated π⋯π interactions are the main driving forces in the formation of both suprachannels and cages. The slightly different features of the suprachannels and cages have been investigated by 1H NMR and TG analysis, which show that solvent molecules exchange quantitatively only within the suprachannels. Photo-unstable CH2I2 molecules are stabilized by capture within the suprachannels, as monitored by UV-Vis spectroscopy. Furthermore, the photoluminescence (PL) intensity from the chromophoric naphthyl moiety of [ZnCl2L]2 gradually decreases with the addition of CH2I2, and washing off the CH2I2 with dichloromethane returns the PL intensity to approximately its original level.

Keywords: metallacyclodimer, suprachannel, π⋯π interaction, molecular recognition

Procedia PDF Downloads 322
2264 Online Pose Estimation and Tracking Approach with Siamese Region Proposal Network

Authors: Cheng Fang, Lingwei Quan, Cunyue Lu

Abstract:

Human pose estimation and tracking aim to accurately identify and locate the positions of human joints in video. This computer vision task is of great significance for human motion recognition, behavior understanding and scene analysis. There has been remarkable progress on human pose estimation in recent years. However, more research is needed on human pose tracking, especially online tracking. In this paper, a framework called PoseSRPN is proposed for online single-person pose estimation and tracking. We use a Siamese network with an attached pose estimation branch to incorporate Single-person Pose Tracking (SPT) and Visual Object Tracking (VOT) into one framework. The pose estimation branch has a simple network structure that replaces the complex upsampling and convolution network structure with deconvolution. By augmenting the loss of the fully convolutional Siamese network with the pose estimation task, pose estimation and tracking can be trained in one stage. Once trained, PoseSRPN relies only on a single bounding box initialization and produces human joint locations. The experimental results show that, while maintaining good pose estimation accuracy on the COCO and PoseTrack datasets, the proposed method achieves a speed of 59 frames/s, which is superior to other pose tracking frameworks.

Keywords: computer vision, pose estimation, pose tracking, Siamese network

Procedia PDF Downloads 153
2263 Image Ranking to Assist Object Labeling for Training Detection Models

Authors: Tonislav Ivanov, Oleksii Nedashkivskyi, Denis Babeshko, Vadim Pinskiy, Matthew Putman

Abstract:

Training a machine learning model for object detection that generalizes well is known to benefit from a training dataset with diverse examples. However, training datasets usually contain many repeats of common examples of a class and lack rarely seen examples. This is due to the process commonly used during human annotation where a person would proceed sequentially through a list of images labeling a sufficiently high total number of examples. Instead, the method presented involves an active process where, after the initial labeling of several images is completed, the next subset of images for labeling is selected by an algorithm. This process of algorithmic image selection and manual labeling continues in an iterative fashion. The algorithm used for the image selection is a deep learning algorithm, based on the U-shaped architecture, which quantifies the presence of unseen data in each image in order to find images that contain the most novel examples. Moreover, the location of the unseen data in each image is highlighted, aiding the labeler in spotting these examples. Experiments performed using semiconductor wafer data show that labeling a subset of the data, curated by this algorithm, resulted in a model with a better performance than a model produced from sequentially labeling the same amount of data. Also, similar performance is achieved compared to a model trained on exhaustive labeling of the whole dataset. Overall, the proposed approach results in a dataset that has a diverse set of examples per class as well as more balanced classes, which proves beneficial when training a deep learning model.
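The iterative selection step described in this abstract can be sketched roughly as follows. This is a minimal illustration, assuming a per-image novelty score has already been computed; the function name `select_for_labeling`, the score dictionary, and the batch size `k` are hypothetical and do not come from the paper, which derives its scores from a U-shaped deep network.

```python
def select_for_labeling(novelty_scores, labeled, k):
    """Pick the k unlabeled images with the highest novelty score.

    novelty_scores: dict mapping image name -> score (assumed to be given;
                    higher means more unseen content in the image).
    labeled:        set of image names that were already labeled.
    k:              number of images to hand to the human labeler next.
    """
    # Only images that have not been labeled yet are eligible.
    unlabeled = {img: s for img, s in novelty_scores.items() if img not in labeled}
    # Rank by score, most novel first, and keep the top k.
    return sorted(unlabeled, key=unlabeled.get, reverse=True)[:k]


# Illustrative values only, not data from the paper.
scores = {"a.png": 0.1, "b.png": 0.9, "c.png": 0.5, "d.png": 0.7}
next_batch = select_for_labeling(scores, labeled={"b.png"}, k=2)
```

In a full loop, the selected batch would be labeled by hand, the model retrained, and the scores recomputed before the next round.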

Keywords: computer vision, deep learning, object detection, semiconductor

Procedia PDF Downloads 136
2262 Robotic Arm-Automated Spray Painting with One-Shot Object Detection and Region-Based Path Optimization

Authors: Iqraq Kamal, Akmal Razif, Sivadas Chandra Sekaran, Ahmad Syazwan Hisaburi

Abstract:

Painting plays a crucial role in the aerospace manufacturing industry, serving both protective and cosmetic purposes for components. However, the traditional manual painting method is time-consuming and labor-intensive, posing challenges for the sector in achieving higher efficiency. Additionally, the current automated robot path planning has been a bottleneck for spray painting processes, as typical manual teaching methods are time-consuming, error-prone, and skill-dependent. Therefore, it is essential to develop automated tool path planning methods to replace manual ones, reducing costs and improving product quality. Focusing on flat panel painting in aerospace manufacturing, this study aims to address issues related to unreliable part identification techniques caused by the high-mixture, low-volume nature of the industry. The proposed solution involves using a spray gun and a UR10 robotic arm with a vision system that utilizes one-shot object detection (OS2D) to identify parts accurately. Additionally, the research optimizes path planning by concentrating on the region of interest—specifically, the identified part, rather than uniformly covering the entire painting tray.

Keywords: aerospace manufacturing, one-shot object detection, automated spray painting, vision-based path optimization, deep learning, automation, robotic arm

Procedia PDF Downloads 81
2261 Refactoring Object Oriented Software through Community Detection Using Evolutionary Computation

Authors: R. Nagarani

Abstract:

An intrinsic property of software in a real-world environment is its need to evolve, which is usually accompanied by an increase in software complexity and a deterioration of software quality, making software maintenance a tough problem. Refactoring is regarded as an effective way to address this problem. Many refactoring approaches at the method and class level have been proposed, but research on software refactoring at the package level is less extensive. This work presents a novel approach to refactoring the package structures of object-oriented software using genetic-algorithm-based community detection. It uses software networks to represent classes and their dependencies, and a constrained community detection algorithm to obtain the optimized community structures in software networks, which also correspond to the optimized package structures. Finally, it provides a list of classes as refactoring candidates by comparing the optimized package structures with the real package structures.
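The final comparison step, in which detected communities are checked against the existing packages, might look like the sketch below. It assumes the community assignment has already been produced by some detection algorithm; the genetic algorithm itself is not shown, and the majority-package rule and all names (`refactoring_candidates`, `package_of`, `community_of`) are illustrative assumptions rather than the paper's method.

```python
from collections import Counter, defaultdict


def refactoring_candidates(package_of, community_of):
    """Suggest classes whose package disagrees with their community.

    package_of:   dict class name -> current package name.
    community_of: dict class name -> community id (assumed to come from a
                  community detection run on the class dependency network).
    Returns a dict class name -> suggested package.
    """
    members = defaultdict(list)
    for cls, comm in community_of.items():
        members[comm].append(cls)

    candidates = {}
    for classes in members.values():
        # Treat the most common package inside a community as its "home".
        majority_pkg, _ = Counter(package_of[c] for c in classes).most_common(1)[0]
        for c in classes:
            if package_of[c] != majority_pkg:
                candidates[c] = majority_pkg
    return candidates


# Illustrative toy system, not taken from the paper.
package_of = {"A": "ui", "B": "ui", "C": "core", "D": "core", "E": "core"}
community_of = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1}
moves = refactoring_candidates(package_of, community_of)
```

Here class `C` sits in package `core` but clusters with the `ui` classes, so it is reported as a candidate for moving.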

Keywords: community detection, complex network, genetic algorithm, package, refactoring

Procedia PDF Downloads 418
2260 Reed: An Approach Towards Quickly Bootstrapping Multilingual Acoustic Models

Authors: Bipasha Sen, Aditya Agarwal

Abstract:

A multilingual automatic speech recognition (ASR) system is a single entity capable of transcribing multiple languages sharing a common phone space. The performance of such a system is highly dependent on the compatibility of the languages. State-of-the-art speech recognition systems are built using sequential architectures based on recurrent neural networks (RNNs), which limits computational parallelization in training. This poses a significant challenge in terms of the time taken to bootstrap and validate the compatibility of multiple languages for building a robust multilingual system. Complex architectural choices based on self-attention networks are made to improve parallelization, thereby reducing the training time. In this work, we propose Reed, a simple system based on 1D convolutions which uses very short context to improve the training time. To improve the performance of our system, we use raw time-domain speech signals directly as input. This enables the convolutional layers to learn feature representations rather than relying on handcrafted features such as MFCC. We report improvements in training and inference times by at least a factor of 4x and 7.4x, respectively, with comparable WERs against standard RNN-based baseline systems on SpeechOcean's multilingual low-resource dataset.

Keywords: convolutional neural networks, language compatibility, low resource languages, multilingual automatic speech recognition

Procedia PDF Downloads 123
2259 Integrated Gesture and Voice-Activated Mouse Control System

Authors: Dev Pratap Singh, Harshika Hasija, Ashwini S.

Abstract:

The project aims to provide a touchless, intuitive interface for human-computer interaction, enabling users to control their computers using hand gestures and voice commands. The system leverages advanced computer vision techniques using the MediaPipe framework and OpenCV to detect and interpret real-time hand gestures, transforming them into mouse actions such as clicking, dragging, and scrolling. Additionally, the integration of a voice assistant powered by the speech recognition library allows for seamless execution of tasks like web searches, location navigation, and gesture control in the system through voice commands.

Keywords: gesture recognition, hand tracking, machine learning, convolutional neural networks, natural language processing, voice assistant

Procedia PDF Downloads 10
2258 Fast and Robust Long-term Tracking with Effective Searching Model

Authors: Thang V. Kieu, Long P. Nguyen

Abstract:

Kernelized Correlation Filter (KCF) based trackers have gained a lot of attention recently because of their accuracy and fast calculation speed. However, this algorithm is not robust in cases where the object is lost because of a sudden change of direction, occlusion, or moving out of view. In order to improve KCF performance in long-term tracking, this paper proposes an anomaly detection method that warns of target loss by analyzing the response map of each frame, and a classification algorithm based on random ferns for reliable target re-location. Tested on the Visual Tracker Benchmark and Visual Object Tracking datasets, the proposed algorithm achieved a precision and success rate 2.92 and 2.61 times higher than those of the original KCF algorithm, respectively. Moreover, the proposed tracker handles occlusion better than many state-of-the-art long-term tracking methods while running at 60 frames per second.
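The abstract does not say which statistic of the response map is used for the target loss warning. As one possible illustration, the sketch below uses the peak-to-sidelobe ratio (PSR), a measure commonly applied to correlation filter outputs; the choice of PSR, the exclusion window, and the threshold of 5.0 are assumptions, not the authors' method.

```python
from statistics import mean, pstdev


def peak_to_sidelobe_ratio(response, exclude=2):
    """PSR of a 2D response map given as a list of rows.

    The peak is the largest value; the sidelobe is everything outside a
    (2*exclude+1)-wide window around the peak. The map must be larger
    than that window, otherwise the sidelobe is empty.
    """
    rows, cols = len(response), len(response[0])
    peak, py, px = max(
        (response[y][x], y, x) for y in range(rows) for x in range(cols)
    )
    sidelobe = [
        response[y][x]
        for y in range(rows)
        for x in range(cols)
        if abs(y - py) > exclude or abs(x - px) > exclude
    ]
    # Small constant avoids division by zero on a perfectly flat sidelobe.
    return (peak - mean(sidelobe)) / (pstdev(sidelobe) + 1e-12)


def target_lost(response, threshold=5.0):
    """Flag a frame as anomalous when the response peak is not distinct."""
    return peak_to_sidelobe_ratio(response) < threshold
```

A sharp, isolated peak gives a high PSR, while a flat or noisy map gives a low one, which is the kind of signal a re-detection stage could be triggered on.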

Keywords: correlation filter, long-term tracking, random fern, real-time tracking

Procedia PDF Downloads 137
2257 GRCNN: Graph Recognition Convolutional Neural Network for Synthesizing Programs from Flow Charts

Authors: Lin Cheng, Zijiang Yang

Abstract:

Program synthesis is the task of automatically generating programs based on a user specification. In this paper, we present a framework that synthesizes programs from flow charts, which serve as an accurate and intuitive specification. To do so, we propose a deep neural network called GRCNN that recognizes the graph structure from its image. GRCNN is trained end-to-end and can predict the edge and node information of the flow chart simultaneously. Experiments show that the accuracy of synthesizing a program is 66.4%, and the accuracies of recognizing edges and nodes are 94.1% and 67.9%, respectively. On average, it takes about 60 milliseconds to synthesize a program.

Keywords: program synthesis, flow chart, specification, graph recognition, CNN

Procedia PDF Downloads 119
2256 An Approach for Reducing Morphological Operator Dataset and Recognize Optical Character Based on Significant Features

Authors: Ashis Pradhan, Mohan P. Pradhan

Abstract:

Pattern matching is useful for recognizing characters in a digital image. OCR is one such technique, which reads characters from a digital image and recognizes them. Line segmentation is initially used for identifying characters in an image, and the result is later refined by morphological operations such as binarization, erosion, and thinning. This work discusses a recognition technique that defines a set of morphological operators based on their orientation in a character. These operators are further categorized into groups having similar shape but different orientation for efficient utilization of memory. Finally, the characters are recognized according to the frequency of occurrence of those morphological operators in a hierarchy of significant patterns and by comparing them with the existing database of each character.

Keywords: binary image, morphological patterns, frequency count, priority, reduction data set and recognition

Procedia PDF Downloads 413
2255 Laser - Ultrasonic Method for the Measurement of Residual Stresses in Metals

Authors: Alexander A. Karabutov, Natalia B. Podymova, Elena B. Cherepetskaya

Abstract:

The theoretical analysis is carried out to obtain the relation between the ultrasonic wave velocity and the value of residual stresses. The laser-ultrasonic method is developed to evaluate the residual stresses and subsurface defects in metals. The method is based on the laser thermooptical excitation of longitudinal ultrasonic waves and their detection by a broadband piezoelectric detector. A laser pulse with a duration of 8 ns (full width at half maximum) and an energy of 300 µJ is absorbed in a thin layer of the special generator that is inclined relative to the object under study. The non-uniform heating of the generator causes the formation of a powerful broadband pulse of longitudinal ultrasonic waves. It is shown that the temporal profile of this pulse is the convolution of the temporal envelope of the laser pulse and the profile of the in-depth distribution of the heat sources. The ultrasonic waves reach the surface of the object through the prism that serves as an acoustic duct. At the 'laser-ultrasonic transducer-object' interface, most of the longitudinal wave energy is converted into shear, subsurface longitudinal and Rayleigh waves. They propagate within the subsurface layer of the studied object and are detected by the piezoelectric detector. The electrical signal that corresponds to the detected acoustic signal is acquired by an analog-to-digital converter and is then mathematically processed and visualized with a personal computer. The distance between the generator and the piezodetector, as well as the propagation times of acoustic waves in the acoustic ducts, are the characteristic parameters of the laser-ultrasonic transducer and are determined using calibration samples. The relative precision of the measurement of the velocity of longitudinal ultrasonic waves is 0.05%, which corresponds to approximately ±3 m/s for steels of conventional quality. This precision allows one to determine the mechanical stress in the steel samples with a minimal detection threshold of approximately 22.7 MPa. The results are presented for the measured dependencies of the velocity of longitudinal ultrasonic waves in the samples on the values of the applied compression stress in the range of 20-100 MPa.
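As a rough check of the quoted figures, the relation between relative and absolute velocity precision can be written out. The longitudinal velocity of about 5900 m/s is a typical textbook value for steel and is an assumption here, since the abstract does not state it. The second expression assumes a linear dependence of velocity on stress and only shows the sensitivity implied by the two quoted numbers; it is not a coefficient reported by the authors.

```latex
\Delta v = \delta_v \, v \approx 5 \times 10^{-4} \times 5900\ \mathrm{m/s} \approx 2.95\ \mathrm{m/s} \approx 3\ \mathrm{m/s}

\left| \frac{\mathrm{d}v}{\mathrm{d}\sigma} \right| \approx \frac{\Delta v}{\Delta \sigma_{\min}} \approx \frac{3\ \mathrm{m/s}}{22.7\ \mathrm{MPa}} \approx 0.13\ \mathrm{(m/s)/MPa}
```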

Keywords: laser-ultrasonic method, longitudinal ultrasonic waves, metals, residual stresses

Procedia PDF Downloads 325
2254 Kantian Epistemology in Examination of the Axiomatic Principles of Economics: The Synthetic a Priori in the Economic Structure of Society

Authors: Mirza Adil Ahmad Mughal

Abstract:

Transcendental analytics, in the Critique of Pure Reason, combines space and time as conditions of the possibility of the phenomenon from the transcendental aesthetic with the pure magnitude-intuition notion. The property of continuity as a qualitative result of the additive magnitude brings the possibility of connecting with experience, even though only as a potential because of the a priori necessity from assumption, as syntheticity of the a priori task of a scientific method of philosophy given by Kant, which precludes the application of categories to something not empirically reducible to the content of such a category's corresponding and possible object. This continuity, as the qualitative result of an a priori constructed notion of magnitude, lies as a fundamental assumption and property of what in microeconomic theory are called 'choice rules', which combine the potentially empirical and practical budget-price pairs with preference relations. This latter result is the purest qualitative side of the otherwise autonomously quantitative nature of the choice rules. The theoretical, barring the empirical, nature of this qualitative result is a synthetic a priori truth, as it should be, if at all, if the axiomatic structure of the economic theory is held to be correct. It has a potentially verifiable content as its possible object in the form of quantitative price-budget pairs. Yet, the object that serves the respective Kantian category is itself qualitative, namely utility. This article explores the validity of Kantian qualifications for this application of 'categories' to the economic structure of society.

Keywords: categories of understanding, continuity, convexity, psyche, revealed preferences, synthetic a priori

Procedia PDF Downloads 98
2253 NLRP3-Inflammassome Participates in the Inflammatory Response Induced by Paracoccidioides brasiliensis

Authors: Eduardo Kanagushiku Pereira, Frank Gregory Cavalcante da Silva, Barbara Soares Gonçalves, Ana Lúcia Bergamasco Galastri, Ronei Luciano Mamoni

Abstract:

The inflammatory response initiates after the recognition of pathogens by receptors expressed by innate immune cells. Among these receptors, NLRP3 has been associated with the recognition of pathogenic fungi in experimental models. NLRP3 operates by forming a multiprotein complex called the inflammasome, which activates caspase-1, responsible for the production of the inflammatory cytokines IL-1beta and IL-18. In this study, we aimed to investigate the involvement of NLRP3 in the inflammatory response elicited in macrophages against Paracoccidioides brasiliensis (Pb), the etiologic agent of paracoccidioidomycosis (PCM). Macrophages were differentiated from THP-1 cells by treatment with phorbol-myristate-acetate. Following differentiation, macrophages were stimulated by Pb yeast cells for 24 hours, after previous treatment with specific NLRP3 (3,4-methylenedioxy-beta-nitrostyrene) and/or caspase-1 (VX-765) inhibitors, or specific inhibitors of pathways involved in NLRP3 activation, such as Reactive Oxygen Species (ROS) production (N-Acetyl-L-cysteine), K+ efflux (Glibenclamide) or phagosome acidification (Bafilomycin). Quantification of IL-1beta and IL-18 in supernatants was performed by ELISA. Our results showed that the production of IL-1beta and IL-18 by THP-1-derived macrophages stimulated with Pb yeast cells was dependent on NLRP3 and caspase-1 activation, since the presence of their specific inhibitors diminished the production of these cytokines. Furthermore, we found that the major pathways involved in NLRP3 activation, after Pb recognition, were dependent on ROS production and K+ efflux. In conclusion, our results showed that NLRP3 participates in the recognition of Pb yeast cells by macrophages, leading to the activation of the NLRP3-inflammasome and production of IL-1beta and IL-18. Together, these cytokines can induce an inflammatory response against P. brasiliensis, essential for the establishment of the initial inflammatory response and for the development of the subsequent acquired immune response.

Keywords: inflammation, IL-1beta, IL-18, NLRP3, Paracoccidioidomycosis

Procedia PDF Downloads 273
2252 Analog Railway Signal Object Controller Development

Authors: Ercan Kızılay, Mustafa Demi̇rel, Selçuk Coşkun

Abstract:

Railway signaling systems consist of vital products that regulate railway traffic and provide safe route arrangements and maneuvers of trains. SIL 4 signal lamps are produced by many manufacturers today. There is a need for systems that enable these signal lamps to be controlled by commands from the interlocking. For the safe operation of railway systems from the RAMS perspective, these systems should be fail-safe and give error indications to the interlocking system when an unexpected situation occurs. In the past, driving and proving the lamp in relay-based systems was typically done via signaling relays. Today, lamps are proved by comparing the current values read over the return circuit with lower and upper threshold values. The purpose of this study is to develop an analog electronic object controller that can be easily integrated with vital systems and with the signal lamp itself. The EN 50126 standard approach was followed, and the concept, definition, risk analysis, requirements, architecture, design, and prototyping phases were carried out. FMEA (Failure Modes and Effects Analysis) and FTA (Fault Tree Analysis) were used for safety analysis in accordance with EN 50129. Based on these analyses, a 1oo2D reactive fail-safe hardware design of the controller was investigated. Electromagnetic compatibility (EMC) effects on the functional safety of equipment, insulation coordination, and over-voltage protection were considered during hardware design according to the EN 50124 and EN 50122 standards. As vital equipment for railway signaling, railway signal object controllers should be developed according to the EN 50126 and EN 50129 standards, which identify the steps and requirements of the development in accordance with the SIL 4 (Safety Integrity Level) target. As a result of this study, an analog railway signal object controller was developed in which commands from the interlocking system are processed in driver cards. Driver cards set the voltage level according to the desired visibility by means of semiconductors. Additionally, prover cards evaluate the current against upper and lower thresholds. The evaluated values are processed via logic gates arranged as 1oo2D by means of analog electronic technologies. This logic evaluates the voltage level of the lamp and mitigates the risks of undue dimming.
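The lamp proving principle mentioned in the abstract, comparing the return current against a lower and an upper threshold, can be illustrated in software as below. The controller described in the paper is analog hardware with a 1oo2D architecture, so this is only a single-channel conceptual sketch; the names `prove_lamp` and `LampState` and the threshold values are hypothetical.

```python
from enum import Enum


class LampState(Enum):
    PROVED = "proved"  # return current lies inside the allowed window
    FAULT = "fault"    # anything else is reported as an error (fail-safe)


def prove_lamp(current_a, lower_a, upper_a):
    """Window check on the lamp return current, in amperes.

    Any reading that is not positively inside [lower_a, upper_a],
    including NaN, is treated as a fault.
    """
    if lower_a > upper_a:
        raise ValueError("lower threshold must not exceed upper threshold")
    if lower_a <= current_a <= upper_a:
        return LampState.PROVED
    return LampState.FAULT
```

The check defaults to `FAULT` and returns `PROVED` only on a positive match, mirroring the fail-safe idea that an unexpected reading should never be interpreted as a healthy lamp.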

Keywords: object controller, railway electronic, analog electronic, safety, railway signal

Procedia PDF Downloads 99
2251 Setting up a Prototype for the Artificial Interactive Reality Unified System to Transform Psychosocial Intervention in Occupational Therapy

Authors: Tsang K. L. V., Lewis L. A., Griffith S., Tucker P.

Abstract:

Background: Many children with high incidence disabilities, such as autism spectrum disorder (ASD), struggle to participate in the community in a socially acceptable manner. There are limitations for clinical settings to provide natural, real-life scenarios for them to practice the life skills needed to meet their real-life challenges. Virtual reality (VR) offers potential solutions to resolve the existing limitations faced by clinicians to create simulated natural environments for their clients to generalize the facilitated skills. Research design: The research aimed to develop a prototype of an interactive VR system to provide realistic and immersive environments for clients to practice skills. The descriptive qualitative methodology is employed to design and develop the Artificial Interactive Reality Unified System (AIRUS) prototype, which provided insights on how to use advanced VR technology to create simulated real-life social scenarios and enable users to interact with the objects and people inside the virtual environment using natural eye-gazes, hand and body movements. The eye tracking (e.g., selective or joint attention), hand- or body-tracking (e.g., repetitive stimming or fidgeting), and facial tracking (e.g., emotion recognition) functions allowed behavioral data to be captured and managed in the AIRUS architecture. Impact of project: Instead of using external controllers or sensors, hand tracking software enabled the users to interact naturally with the simulated environment using daily life behavior such as handshaking and waving to control and interact with the virtual objects and people. The AIRUS protocol offers opportunities for breakthroughs in future VR-based psychosocial assessment and intervention in occupational therapy. Implications for future projects: AI technology can allow more efficient data capturing and interpretation of object identification and human facial emotion recognition at any given moment. The data points captured can be used to pinpoint our users’ focus and where their interests lie. AI can further help advance the data interpretation system.

Keywords: occupational therapy, psychosocial assessment and intervention, simulated interactive environment, virtual reality

Procedia PDF Downloads 35
2250 Patient-Friendly Hand Gesture Recognition Using AI

Authors: K. Prabhu, K. Dinesh, M. Ranjani, M. Suhitha

Abstract:

During the tough times of COVID, hospitalized patients found it difficult to always convey what they wanted or needed to the attendant. Sometimes the attendant might not be present. In that case, the patients can use simple hand gestures to control electrical appliances (set up here for a zero-watt bulb) and three other gestures for voice-note intimation. In this AI-based hand recognition project, a NodeMCU is used for the control action of the relay; it is connected to Firebase for storing the value in the cloud and is interfaced with the Python code via a Raspberry Pi. For three hand gestures, a voice clip is added for intimation to the attendant. This is done with the help of Google’s text-to-speech and the built-in audio file option in the Raspberry Pi 4. All five gestures are detected when patients show them with their hands to the webcam, which is placed for gesture detection. The personal computer is used for displaying the gestures and for running the code in the Raspberry Pi Imager.

Keywords: NodeMCU, AI technology, gesture, patient

Procedia PDF Downloads 166
2249 Real-Time Finger Tracking: Evaluating YOLOv8 and MediaPipe for Enhanced HCI

Authors: Zahra Alipour, Amirreza Moheb Afzali

Abstract:

In the field of human-computer interaction (HCI), hand gestures play a crucial role in facilitating communication by expressing emotions and intentions. The precise tracking of the index finger and the estimation of joint positions are essential for developing effective gesture recognition systems. However, various challenges, such as anatomical variations, occlusions, and environmental influences, hinder optimal functionality. This study investigates the performance of the YOLOv8m model for hand detection using the EgoHands dataset, which comprises diverse hand gesture images captured in various environments. Over three training processes, the model demonstrated significant improvements in precision (from 88.8% to 96.1%) and recall (from 83.5% to 93.5%), achieving a mean average precision (mAP) of 97.3% at an IoU threshold of 0.7. We also compared YOLOv8m with MediaPipe and an integrated YOLOv8 + MediaPipe approach. The combined method outperformed the individual models, achieving an accuracy of 99% and a recall of 99%. These findings underscore the benefits of model integration in enhancing gesture recognition accuracy and localization for real-time applications. The results suggest promising avenues for future research in HCI, particularly in augmented reality and assistive technologies, where improved gesture recognition can significantly enhance user experience.
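The detection metrics quoted in this abstract (precision, recall, and an IoU threshold of 0.7) can be made concrete with a small sketch. It assumes axis-aligned boxes given as `(x1, y1, x2, y2)` and a simple greedy one-to-one matching; the function names and the example boxes are illustrative, and this is not the evaluation code used by the authors or by any particular library.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, ground_truth, iou_threshold=0.7):
    """Greedy matching: each ground-truth box can be matched at most once."""
    matched = set()
    true_pos = 0
    for p in predictions:
        best_j, best_iou = None, 0.0
        for j, g in enumerate(ground_truth):
            if j in matched:
                continue
            value = iou(p, g)
            if value > best_iou:
                best_j, best_iou = j, value
        if best_j is not None and best_iou >= iou_threshold:
            matched.add(best_j)
            true_pos += 1
    precision = true_pos / len(predictions) if predictions else 0.0
    recall = true_pos / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Illustrative boxes only, not data from the study.
gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(0, 0, 10, 9), (50, 50, 60, 60)]
p, r = precision_recall(preds, gt)
```

In this toy case one prediction overlaps a ground-truth hand with IoU 0.9 and counts as a true positive, while the other misses entirely, giving a precision and recall of 0.5 each.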

Keywords: YOLOv8, MediaPipe, finger tracking, joint estimation, human-computer interaction (HCI)

Procedia PDF Downloads 5
2248 Hand Motion Trajectory Analysis for Dynamic Hand Gestures Used in Indian Sign Language

Authors: Daleesha M. Viswanathan, Sumam Mary Idicula

Abstract:

Dynamic hand gestures are an intrinsic component of sign language communication. Extracting spatio-temporal features of the hand gesture trajectory plays an important role in a dynamic gesture recognition system. The main concern of this paper is finding a discrete feature descriptor for the motion trajectory based on the orientation feature. A Kalman filter algorithm and Hidden Markov Models (HMMs) are incorporated into this recognition system for hand trajectory tracking and for spatio-temporal classification, respectively.
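One common way to turn a tracked trajectory into a discrete, orientation-based descriptor is to quantize the direction of each step into a small number of bins, which yields a symbol sequence that a discrete HMM can consume. The sketch below assumes an 8-direction code and 2D points; the abstract does not state the quantization scheme actually used, so `orientation_codes` and `n_bins` are illustrative names only.

```python
import math


def orientation_codes(points, n_bins=8):
    """Map each step of a 2D trajectory to a direction code in [0, n_bins).

    Code 0 is centred on the positive x axis and codes increase
    counter-clockwise (assumed convention).
    """
    codes = []
    bin_width = 2 * math.pi / n_bins
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # no movement between frames, so no orientation
        angle = math.atan2(dy, dx) % (2 * math.pi)
        # Shift by half a bin so each code is centred on its direction.
        codes.append(int((angle + bin_width / 2) // bin_width) % n_bins)
    return codes
```

For example, a square traced counter-clockwise from the origin produces one code per side, and stationary frames are skipped rather than assigned a direction.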

Keywords: orientation features, discrete feature vector, HMM, Indian sign language

Procedia PDF Downloads 368
2247 Markov Random Field-Based Segmentation Algorithm for Detection of Land Cover Changes Using Uninhabited Aerial Vehicle Synthetic Aperture Radar Polarimetric Images

Authors: Mehrnoosh Omati, Mahmod Reza Sahebi

Abstract:

Information on land use/land cover change plays an essential role in environmental assessment, planning, and management in regional development. Remotely sensed imagery is widely used for providing information in many change detection applications. Polarimetric synthetic aperture radar (PolSAR) imagery, with its capability to discriminate between different scattering mechanisms, is a powerful tool for environmental monitoring applications. This paper proposes a new boundary-based segmentation algorithm as a fundamental step for land cover change detection. In this method, two PolSAR images are first segmented by integrating a marker-controlled watershed algorithm with a coupled Markov random field (MRF). Then, object-based classification is performed to determine changed/unchanged image objects. Compared with a pixel-based support vector machine (SVM) classifier, this novel segmentation algorithm significantly reduces the speckle effect in PolSAR images and improves the accuracy of binary classification at the object level. The experimental results on Uninhabited Aerial Vehicle Synthetic Aperture Radar (UAVSAR) polarimetric images show a 3% and 6% improvement in overall accuracy and kappa coefficient, respectively. The proposed method can also correctly distinguish homogeneous image parcels.
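The two reported metrics, overall accuracy and the kappa coefficient, can both be derived from a confusion matrix. The following is a minimal sketch using the standard definition of Cohen's kappa; the function name and the matrix layout (rows are true classes, columns are predicted classes) are assumptions, and no data from the paper is reproduced.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples of true class i that were
    assigned to class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Chance agreement: product of row and column marginals for each class.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / (total * total)
    if expected == 1.0:
        return observed, 1.0  # degenerate case: only one class present
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa
```

Kappa discounts the agreement expected by chance, which is why it can improve by a different margin than overall accuracy on the same changed/unchanged classification.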

Keywords: coupled Markov random field (MRF), environment, object-based analysis, polarimetric SAR (PolSAR) images

Procedia PDF Downloads 217
2246 Optoelectronic Hardware Architecture for Recurrent Learning Algorithm in Image Processing

Authors: Abdullah Bal, Sevdenur Bal

Abstract:

This paper proposes a new type of hardware application for training cellular neural networks (CNNs) using an optical joint transform correlation (JTC) architecture for image feature extraction. CNNs require much more computation during the training stage than during testing. Since optoelectronic hardware offers parallel, high-speed processing for 2D data, the CNN training algorithm can be realized using Fourier optics techniques. JTC employs lenses and CCD cameras with a laser beam to realize 2D matrix multiplication and summation at the speed of light. Therefore, in each training iteration, JTC inherently carries the larger share of the computational burden, and the rest of the mathematical computation is realized digitally. The bipolar data is encoded by phase, and the summation of correlation operations is realized using multi-object input joint images. The overlapping properties of JTC are then utilized for the summation of two cross-correlations, which reduces the computation needed in the training stage. Phase-only JTC does not require data rearrangement, electronic pre-calculation, or strict system alignment. The proposed system can be incorporated simultaneously with various optical image processing or optical pattern recognition techniques in the same optical system.

Keywords: CNN training, image processing, joint transform correlation, optoelectronic hardware

Procedia PDF Downloads 506
2245 Analysis of Nonlinear and Non-Stationary Signal to Extract the Features Using Hilbert Huang Transform

Authors: A. N. Paithane, D. S. Bormane, S. D. Shirbahadurkar

Abstract:

Emotion recognition is an important research topic in the field of human-computer interaction. This paper presents a novel technique for feature extraction (FE) and a new method for human emotion recognition based on the Hilbert-Huang transform (HHT). This method is suitable for analyzing nonlinear and non-stationary signals. Each signal is decomposed into intrinsic mode functions (IMFs) using empirical mode decomposition (EMD). These functions are used to extract the features through a fission and fusion process. The decomposition technique we adopt is a new technique for adaptively decomposing signals. In this perspective, we report the potential usefulness of EMD-based techniques. We evaluated the algorithm on the Augsburg University database, which is manually annotated.

Keywords: intrinsic mode function (IMF), Hilbert-Huang transform (HHT), empirical mode decomposition (EMD), emotion detection, electrocardiogram (ECG)

Procedia PDF Downloads 580
2244 Integration of Wireless Sensor Networks and Radio Frequency Identification (RFID): An Assesment

Authors: Arslan Murtaza

Abstract:

RFID (radio frequency identification) and WSN (wireless sensor network) are two significant wireless technologies that have a wide variety of applications and considerable future potential. RFID is used to identify the existence and location of objects, whereas WSN is used to sense and monitor the environment. Incorporating RFID with WSN provides not only the identity and location of an object but also information regarding the condition of the object carrying the sensor-enabled RFID tag. It can be widely used in stock management, asset tracking, asset counting, security, military, environmental monitoring and forecasting, healthcare, intelligent homes, intelligent transport vehicles, warehouse management, and precision agriculture. This assessment presents a brief introduction to RFID, WSN, and the integration of WSN and RFID, followed by applications related to both RFID and WSN. It also discusses the status of projects on RFID technology carried out by different computing groups and of projects to be undertaken on WSN and RFID technology.

Keywords: wireless sensor network, RFID, embedded sensor, Wi-Fi, Bluetooth, integration, time saving, cost efficient

Procedia PDF Downloads 334
2243 Comparison Study of Machine Learning Classifiers for Speech Emotion Recognition

Authors: Aishwarya Ravindra Fursule, Shruti Kshirsagar

Abstract:

At the intersection of artificial intelligence and human-centered computing, this paper delves into speech emotion recognition (SER). It presents a comparative analysis of machine learning models such as K-Nearest Neighbors (KNN), logistic regression, support vector machines (SVM), decision trees, ensemble classifiers, and random forests, applied to SER. The research employs four datasets: CREMA-D, SAVEE, TESS, and RAVDESS. It focuses on extracting salient audio signal features such as Zero Crossing Rate (ZCR), Chroma_stft, Mel Frequency Cepstral Coefficients (MFCC), root mean square (RMS) value, and Mel spectrogram. These features are used to train and evaluate the models’ ability to recognize eight types of emotions from speech: happy, sad, neutral, angry, calm, disgust, fear, and surprise. Among the models, the Random Forest algorithm demonstrated superior performance, achieving approximately 79% accuracy. This suggests its suitability for SER within the parameters of this study. The research contributes to SER by showcasing the effectiveness of various machine learning algorithms and feature extraction techniques. The findings hold promise for the development of more precise emotion recognition systems in the future. This abstract provides a succinct overview of the paper’s content, methods, and results.
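Two of the listed features, zero crossing rate and RMS value, are simple enough to compute directly on a frame of audio samples. The sketch below uses simplified textbook definitions in plain Python; the function names are illustrative, and the paper's own extraction pipeline (which may use an audio library with different framing conventions) is not described in the abstract.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ.

    Zero is treated as non-negative (assumed convention).
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def rms(frame):
    """Root mean square amplitude of the frame."""
    if not frame:
        return 0.0
    return math.sqrt(sum(x * x for x in frame) / len(frame))
```

In practice these values are computed per short frame and then summarized (for example, averaged) over the utterance before being passed to a classifier.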

Keywords: comparison, ML classifiers, KNN, decision tree, SVM, random forest, logistic regression, ensemble classifiers

Procedia PDF Downloads 45
2242 Density Measurement of Underexpanded Jet Using Stripe Patterned Background Oriented Schlieren Method

Authors: Shinsuke Udagawa, Masato Yamagishi, Masanori Ota

Abstract:

The Schlieren method, which has conventionally been used to visualize high-speed flows, has disadvantages such as the complexity of the experimental setup and the inability to quantitatively analyze the amount of refraction of light. The Background Oriented Schlieren (BOS) method proposed by Meier is one of the measurement methods that solves these problems. Like the Schlieren method, the BOS method relies on the refraction of light. The BOS method is characterized by the use of a digital camera to capture images of the background behind the observation area. The images are later analyzed by a computer to quantitatively detect the amount of shift of the background image. The experimental setup for BOS does not require concave mirrors, pinholes, or color filters, which are necessary in the conventional Schlieren method, thus simplifying the experimental setup. However, the BOS method causes defocusing of the observed object, because the camera is focused on the background image. The defocusing of the object becomes greater as the distance between the background and the object increases. On the other hand, a larger distance yields higher sensitivity. Therefore, it is necessary to adjust the distance between the background and the object appropriately for the experiment, considering the relation between defocus and sensitivity. The purpose of this study is to experimentally clarify the effect of defocus on density field reconstruction. In this study, a visualization experiment of an underexpanded jet was performed using a BOS measurement system that we constructed, with a Ronchi ruling as the background. The reservoir pressure of the jet and the distance between the camera and the jet axis were fixed, and the distance between the background and the jet axis was varied as the parameter.
The images were later analyzed on a personal computer to quantitatively detect the amount of shift of the background image by comparing the background pattern with the captured image of the underexpanded jet. The quantitatively measured shift was reconstructed into a density field using the Abel transformation and the Gladstone-Dale equation. The experimental results show that, as the distance between the background and the axis of the underexpanded jet increases, the reconstructed density image becomes more blurred and the noise decreases. Consequently, it is clarified that the sensitivity constant should be greater than 20 and the circle of confusion diameter should be less than 2.7 mm, at least in this experimental setup.
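For reference, the Gladstone-Dale relation used in the final reconstruction step links the refractive index of a gas to its density. The symbols below are assumed notation rather than the authors' own: n is the local refractive index, ρ is the gas density, and K is the Gladstone-Dale constant, which depends on the gas and weakly on the wavelength of the light.

```latex
n - 1 = K \rho
\quad\Longrightarrow\quad
\rho = \frac{n - 1}{K}
```

In an axisymmetric flow such as a round jet, the Abel transformation recovers the radial distribution of n from the line-of-sight measurements, after which this relation converts it to density.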

Keywords: BOS method, underexpanded jet, abel transformation, density field visualization

Procedia PDF Downloads 78
2241 Developing Computational Thinking in Early Childhood Education

Authors: Kalliopi Kanaki, Michael Kalogiannakis

Abstract:

Nowadays, in the digital era, the early acquisition of basic programming skills and knowledge is encouraged, as it facilitates students’ exposure to computational thinking and empowers their creativity, problem-solving skills, and cognitive development. More and more researchers and educators investigate the introduction of computational thinking in K-12, since it is expected to be a fundamental skill for everyone by the middle of the 21st century, just as reading, writing, and arithmetic are at the moment. This paper presents doctoral research in progress that investigates the infusion of computational thinking into the science curriculum in early childhood education. The whole attempt aims to develop young children’s computational thinking by introducing them to the fundamental concepts of object-oriented programming in an enjoyable, yet educational framework. The backbone of the research is the digital environment PhysGramming (an abbreviation of Physical Science Programming), which provides children with the opportunity to create their own digital games, turning them from passive consumers into active creators of technology. PhysGramming deploys an innovative hybrid schema of visual and text-based programming techniques, with emphasis on object-orientation. Through PhysGramming, young students are familiarized with basic object-oriented programming concepts, such as classes, objects, and attributes, while, at the same time, they get a view of object-oriented programming syntax. Nevertheless, the most noteworthy feature of PhysGramming is that children create their own digital games within the context of physical science courses, in a way that provides familiarization with the basic principles of object-oriented programming and computational thinking, even though no specific reference is made to these principles. In line with the ethical guidelines of educational research, interventions were conducted in two second-grade classes.
The interventions were designed with respect to the thematic units of the curriculum of physical science courses, as a part of the learning activities of the class. PhysGramming was integrated into the classroom after short introductory sessions. During the interventions, 6- to 7-year-old children worked in pairs on computers and created their own digital games (group games, matching games, and puzzles). The authors participated in these interventions as observers in order to achieve a realistic evaluation of the proposed educational framework concerning its applicability in the classroom and its educational and pedagogical perspectives. To better examine whether the objectives of the research are met, the investigation focused on six criteria: the educational value of PhysGramming, its engaging and enjoyable characteristics, its child-friendliness, its appropriateness for the proposed purpose, its ability to monitor the user’s progress, and its individualizing features. In this paper, the functionality of PhysGramming and the philosophy of its integration in the classroom are both described in detail. Information about the implemented interventions and the results obtained is also provided. Finally, several limitations of the research that deserve attention are noted.

Keywords: computational thinking, early childhood education, object-oriented programming, physical science courses

Procedia PDF Downloads 120
2240 Best Timing for Capturing Satellite Thermal Images, Asphalt, and Concrete Objects

Authors: Toufic Abd El-Latif Sadek

Abstract:

The asphalt object represents asphalted areas such as roads, and the concrete object represents concrete areas such as concrete buildings. Asphalt and concrete objects can be extracted efficiently from a single satellite thermal image captured at a specific time, avoiding the periods in which asphalt, concrete, and other objects show close or identical brightness values. This enables efficient extraction and, in turn, better analysis. Seven sample objects were used in this study: asphalt, concrete, metal, rock, dry soil, vegetation, and water. It was found that the best timing for capturing satellite thermal images to extract the two objects, asphalt and concrete, from one satellite thermal image, saving time and money, occurs at a specific time in different months. A table is derived that shows the optimal timing for capturing satellite thermal images to extract these two objects effectively.

Keywords: asphalt, concrete, satellite thermal images, timing

Procedia PDF Downloads 322
2239 Breast Cancer Risk is Predicted Using Fuzzy Logic in MATLAB Environment

Authors: S. Valarmathi, P. B. Harathi, R. Sridhar, S. Balasubramanian

Abstract:

The use of machine learning tools in medical diagnosis is increasing due to the improved effectiveness of classification and recognition systems that help medical experts diagnose breast cancer. In this study, ID3 chooses the splitting attribute with the highest information gain, where gain is defined as the difference in entropy before versus after the split. It is applied to age, location, taluk, stage, year, period, marital status, treatment, heredity, sex, and habitat against Very Serious (VS), Very Serious Moderate (VSM), Serious (S), and Not Serious (NS) to calculate the information gain. The ranked histogram gives the gain of each field for the breast cancer data. Doctors use TNM staging, which decides the risk level of the breast cancer and serves as an important decision-making field in fuzzy logic for perception-based measurement. The spatial risk area (taluk) of breast cancer is calculated. The results clearly indicate that Coimbatore (North and South) was a higher-risk region for breast cancer than other areas at the 20% criterion. The weighted value of each taluk was compared with the criterion value and integrated with Map Object to visualize the results. The ID3 algorithm shows the high breast cancer risk regions in the study area. The study has outlined, discussed, and resolved the algorithms and techniques/methods adopted through soft computing methodology, such as the ID3 algorithm, for prognostic decision making on the seriousness of breast cancer.
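The information gain that ID3 uses to rank attributes can be sketched as follows. This is the standard entropy-based definition, not code from the study; the record layout (a list of dictionaries) and the function names are assumptions for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(rows, attribute, target):
    """Entropy of the target before the split minus the weighted entropy after.

    rows is a list of dicts; attribute and target are keys into each dict.
    """
    before = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    after = sum(
        len(group) / len(rows) * entropy(group) for group in groups.values()
    )
    return before - after
```

ID3 evaluates this gain for every candidate attribute and splits on the one with the largest value, which is also what a ranked histogram of per-field gains would display.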

Keywords: ID3 algorithm, breast cancer, fuzzy logic, MATLAB

Procedia PDF Downloads 518
2238 Curvelet Features with Mouth and Face Edge Ratios for Facial Expression Identification

Authors: S. Kherchaoui, A. Houacine

Abstract:

This paper presents a facial expression recognition system. It performs identification and classification of the seven basic expressions: happy, surprise, fear, disgust, sadness, anger, and neutral states. It consists of three main parts. The first is the detection of a face and the corresponding facial features to extract the most expressive portion of the face, followed by a normalization of the region of interest. Next, curvelet coefficients are computed, with dimensionality reduction through principal component analysis. The resulting coefficients are combined with two ratios, the mouth ratio and the face edge ratio, to constitute the whole feature vector. The third step is the classification of the emotional state using the SVM method in the feature space.

Keywords: facial expression identification, curvelet coefficient, support vector machine (SVM), recognition system

Procedia PDF Downloads 232