Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 2777

Search results for: object recognition

2687 Small Text Extraction from Documents and Chart Images

Authors: Rominkumar Busa, Shahira K. C., Lijiya A.

Abstract:

Text recognition is an important area of computer vision that deals with detecting and recognising text in an image. Optical Character Recognition (OCR) is a saturated area these days, with very good text recognition accuracy. However, when the same OCR methods are applied to text with small font sizes, such as the text in chart images, the recognition rate is less than 30%. This work aims to extract small text from images using a deep learning model, CRNN with CTC loss. Text recognition accuracy is found to improve when image enhancement by super resolution is applied prior to the CRNN model. We also observe that the text recognition rate increases by a further 18% with the proposed method, which involves super resolution and character segmentation followed by CRNN with CTC loss. The efficiency of the proposed method suggests that further pre-processing of chart image text and other small text images will improve the accuracy further, thereby helping text extraction from chart images.

Keywords: small text extraction, OCR, scene text recognition, CRNN

Procedia PDF Downloads 125
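
As an illustration of the CTC component mentioned in the abstract above: a network trained with CTC loss emits one label per time step, including a special blank, and a common way to read out text is best-path (greedy) decoding, which collapses repeated labels and then drops blanks. The sketch below is a minimal, assumed example of that rule only; the function name, the blank index 0 and the toy alphabet are hypothetical and are not taken from the paper.

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse consecutive repeats, then drop blanks (CTC best-path rule)."""
    decoded = []
    previous = None
    for label in frame_labels:
        # Keep a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


# Hypothetical toy alphabet: index 0 is reserved for the CTC blank.
alphabet = {1: "c", 2: "a", 3: "t"}
frames = [0, 1, 1, 0, 2, 2, 2, 0, 0, 3, 3]
text = "".join(alphabet[i] for i in ctc_greedy_decode(frames))
```

A blank between two identical labels is what lets CTC represent doubled characters, which is why repeats are collapsed before blanks are removed.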

2686 Design of Speed Bump Recognition System Integrated with Adjustable Shock Absorber Control

Authors: Ming-Yen Chang, Sheng-Hung Ke

Abstract:

This research focuses on the development of a speed bump identification system for real-time control of adjustable shock absorbers in vehicular suspension systems. The study initially involved the collection of images of various speed bumps, and rubber speed bump profiles found on roadways. These images were utilized for training and recognition purposes through the deep learning object detection algorithm YOLOv5. Subsequently, the trained speed bump identification program was integrated with an in-vehicle camera system for live image capture during driving. These images were instantly transmitted to a computer for processing. Using the principles of monocular vision ranging, the distance between the vehicle and an approaching speed bump was determined. The appropriate control distance was established through both practical vehicle measurements and theoretical calculations. Collaboratively, with the electronically adjustable shock absorbers equipped in the vehicle, a shock absorber control system was devised to dynamically adapt the damping force just prior to encountering a speed bump. This system effectively mitigates passenger discomfort and enhances ride quality.

Keywords: adjustable shock absorbers, image recognition, monocular vision ranging, ride

Procedia PDF Downloads 66
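
The abstract above does not say which monocular ranging formula is used. One common choice is the pinhole-camera similar-triangles relation, distance = focal length (pixels) × real size / size in the image (pixels). The sketch below assumes that relation; the function name, the parameter names and the numbers are hypothetical.

```python
def estimate_distance_m(focal_length_px, real_size_m, image_size_px):
    """Pinhole-model range estimate from the apparent size of a known object.

    Assumes the object's real size is known and the detector's bounding box
    gives its apparent size in pixels along the same axis.
    """
    if image_size_px <= 0:
        raise ValueError("image_size_px must be positive")
    return focal_length_px * real_size_m / image_size_px


# Hypothetical numbers: 800 px focal length, a 0.05 m tall bump seen as 4 px.
distance = estimate_distance_m(800.0, 0.05, 4.0)
```

In a pipeline like the one described, the pixel size would come from the detector's bounding box, and the estimate would be compared against the chosen control distance before the damping is changed.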

2685 Intelligent Campus Monitoring: YOLOv8-Based High-Accuracy Activity Recognition

Authors: A. Degale Desta, Tamirat Kebamo

Abstract:

Background: Recent advances in computer vision and pattern recognition have significantly improved activity recognition through video analysis, particularly with the application of Deep Convolutional Neural Networks (CNNs). One-stage detectors now enable efficient video-based recognition by simultaneously predicting object categories and locations. Such advancements are highly relevant in educational settings where CCTV surveillance could automatically monitor academic activities, enhancing security and classroom management. However, current datasets and recognition systems lack the specific focus on campus environments necessary for practical application in these settings. Objective: This study aims to address this gap by developing a dataset and testing an automated activity recognition system specifically tailored for educational campuses. The EthioCAD dataset was created to capture various classroom activities and teacher-student interactions, facilitating reliable recognition of academic activities using deep learning models. Method: EthioCAD, a novel video-based dataset, was created with a design science research approach to encompass teacher-student interactions across three domains and 18 distinct classroom activities. Using the Roboflow AI framework, the data was processed, with 4.224 KB of frames and 33.485 MB of images managed for frame extraction, labeling, and organization. The Ultralytics YOLOv8 model was then implemented within Google Colab to evaluate the dataset’s effectiveness, achieving high mean Average Precision (mAP) scores. Results: The YOLOv8 model demonstrated robust activity recognition within campus-like settings, achieving an mAP50 of 90.2% and an mAP50-95 of 78.6%. These results highlight the potential of EthioCAD, combined with YOLOv8, to provide reliable detection and classification of classroom activities, supporting automated surveillance needs on educational campuses.
Discussion: The high performance of YOLOv8 on the EthioCAD dataset suggests that automated activity recognition for surveillance is feasible within educational environments. This system addresses current limitations in campus-specific data and tools, offering a tailored solution for academic monitoring that could enhance the effectiveness of CCTV systems in these settings. Conclusion: The EthioCAD dataset, alongside the YOLOv8 model, provides a promising framework for automated campus activity recognition. This approach lays the groundwork for future advancements in CCTV-based educational surveillance systems, enabling more refined and reliable monitoring of classroom activities.

Keywords: deep CNN, EthioCAD, deep learning, YOLOv8, activity recognition

Procedia PDF Downloads 10

2684 Recognition and Protection of Indigenous Society in Indonesia

Authors: Triyanto, Rima Vien Permata Hartanto

Abstract:

Indonesia is a legal state. The consequence of this status is the recognition and protection of the existence of indigenous peoples. This paper aims to describe the dynamics of legal recognition and protection for indigenous peoples within the framework of Indonesian law. This paper is library research based on literature. The result states that although the constitution has normatively recognized the existence of indigenous peoples and their traditional rights, in reality, not all rights were recognized and protected. The protection and recognition for indigenous people need to be strengthened.

Keywords: indigenous peoples, customary law, state law, state of law

Procedia PDF Downloads 330

2683 Relevant LMA Features for Human Motion Recognition

Authors: Insaf Ajili, Malik Mallem, Jean-Yves Didier

Abstract:

Motion recognition from videos is a very complex task due to the high variability of motions. This paper describes the challenges of human motion recognition, especially the motion representation step with relevant features. Our descriptor vector is inspired by the Laban Movement Analysis (LMA) method. We propose discriminative features using the Random Forest algorithm in order to remove redundant features and make learning algorithms operate faster and more effectively. We validate our method on the MSRC-12 and UTKinect datasets.

Keywords: discriminative LMA features, features reduction, human motion recognition, random forest

Procedia PDF Downloads 195

2682 Designing AI-Enabled Smart Maintenance Scheduler: Enhancing Object Reliability through Automated Management

Authors: Arun Prasad Jaganathan

Abstract:

In today's rapidly evolving technological landscape, the need for efficient and proactive maintenance management solutions has become increasingly evident across various industries. Traditional approaches often suffer from drawbacks such as reactive strategies, leading to potential downtime, increased costs, and decreased operational efficiency. In response to these challenges, this paper proposes an AI-enabled approach to object-based maintenance management aimed at enhancing reliability and efficiency. The paper contributes to the growing body of research on AI-driven maintenance management systems, highlighting the transformative impact of intelligent technologies on enhancing object reliability and operational efficiency.

Keywords: AI, machine learning, predictive maintenance, object-based maintenance, expert team scheduling

Procedia PDF Downloads 58

2681 Effects of Reversible Watermarking on Iris Recognition Performance

Authors: Andrew Lock, Alastair Allen

Abstract:

Fragile watermarking has been proposed as a means of adding additional security or functionality to biometric systems, particularly for authentication and tamper detection. In this paper, we describe an experimental study of the effect that watermarking iris images with a particular class of fragile algorithm, reversible algorithms, has on the ability to correctly perform iris recognition. We investigate two scenarios: matching watermarked images to unmodified images, and matching watermarked images to watermarked images. We show that different watermarking schemes give very different results for a given capacity, highlighting the importance of investigation. At high embedding rates, most algorithms cause a significant reduction in recognition performance. However, in many cases, for low embedding rates, recognition accuracy is improved by the watermarking process.

Keywords: biometrics, iris recognition, reversible watermarking, vision engineering

Procedia PDF Downloads 456

2680 Contrastive Learning for Unsupervised Object Segmentation in Sequential Images

Authors: Tian Zhang

Abstract:

Unsupervised object segmentation aims at segmenting objects in sequential images and obtaining the mask of each object without any manual intervention. Unsupervised segmentation remains a challenging task due to the lack of prior knowledge about these objects. Previous methods often require manually specifying the action of each object, which is often difficult to obtain. In contrast, the method in this paper does not need action information for the objects and automatically learns the actions and relations among objects from the structured environment. To obtain the object segmentation of sequential images, the relationships between objects and images are extracted to infer the action and interaction of objects based on the multi-head attention mechanism. Three types of object relationships in the object segmentation task are proposed: the relationship between objects in the same frame, the relationship between objects in two frames, and the relationship between objects and historical information. Based on these relationships, the proposed model (1) is effective in multiple-object segmentation tasks, (2) needs only images as input, and (3) produces better segmentation results as more relationships are considered. The experimental results on multiple datasets show that this paper’s method achieves state-of-the-art performance. Quantitative and qualitative analyses of the results are conducted. The proposed method could be easily extended to other similar applications.

Keywords: unsupervised object segmentation, attention mechanism, contrastive learning, structured environment

Procedia PDF Downloads 109

2679 Learning Object Interface Adapted to the Learner's Learning Style

Authors: Zenaide Carvalho da Silva, Leandro Rodrigues Ferreira, Andrey Ricardo Pimentel

Abstract:

Learning styles (LS) refer to the ways and forms in which the student prefers to learn in the teaching and learning process. Each student has their own way of receiving and processing information throughout the learning process. Therefore, knowing their LS is important to better understand their individual learning preferences, and also to understand why some teaching methods and techniques give better results with some students but not with others. We believe that knowledge of these styles makes it possible to make propositions for teaching, thus reorganizing teaching methods and techniques in order to allow learning that is adapted to the individual needs of the student. Adapting learning would be possible through the creation of online educational resources adapted to the style of the student. In this context, this article presents the structure of a learning object interface adaptation based on the LS. The structure created should enable the creation of a learning object adapted to the student's LS and contribute to increasing the student’s motivation to use a learning object as an educational resource.

Keywords: adaptation, interface, learning object, learning style

Procedia PDF Downloads 406

2678 ICanny: CNN Modulation Recognition Algorithm

Authors: Jingpeng Gao, Xinrui Mao, Zhibin Deng

Abstract:

Aiming at the low recognition rate on the composite signal modulation in low signal to noise ratio (SNR), this paper proposes a modulation recognition algorithm based on ICanny-CNN. Firstly, the radar signal is transformed into the time-frequency image by Choi-Williams Distribution (CWD). Secondly, we propose an image processing algorithm using the Guided Filter and the threshold selection method, which is combined with the hole filling and the mask operation. Finally, the shallow convolutional neural network (CNN) is combined with the idea of the depth-wise convolution (Dw Conv) and the point-wise convolution (Pw Conv). The proposed CNN is designed to complete image classification and realize modulation recognition of radar signal. The simulation results show that the proposed algorithm can reach 90.83% at 0dB and 71.52% at -8dB. Therefore, the proposed algorithm has a good classification and anti-noise performance in radar signal modulation recognition and other fields.

Keywords: modulation recognition, image processing, composite signal, improved Canny algorithm

Procedia PDF Downloads 191
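
The abstract above combines depth-wise and point-wise convolutions in a shallow CNN. The usual motivation for that pairing is a lower parameter count than a standard convolution. The arithmetic below is a generic illustration of that saving, not the paper's architecture; the function names and the layer sizes (3×3 kernel, 32 input channels, 64 output channels) are assumptions, and biases are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard convolution: one k x k x c_in filter per output channel."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depth-wise (k x k per input channel) plus point-wise (1 x 1) pair."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 32 input channels, 64 output channels.
standard = standard_conv_params(3, 32, 64)
separable = separable_conv_params(3, 32, 64)
ratio = separable / standard
```

For these assumed sizes the separable pair needs 2,336 weights against 18,432 for the standard layer, roughly an eighth.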

2677 Video Based Automatic License Plate Recognition System

Authors: Ali Ganoun, Wesam Algablawi, Wasim BenAnaif

Abstract:

Video-based traffic surveillance built on a License Plate Recognition (LPR) system is an essential part of any intelligent traffic management system. The LPR system utilizes computer vision and pattern recognition technologies to obtain traffic and road information by detecting and recognizing vehicles based on their license plates. Generally, video-based LPR is a challenging area of research due to the variety of environmental conditions. LPR systems are used in a wide range of commercial applications such as collision warning systems, finding stolen cars, controlling access to car parks and automatic congestion charge systems. This paper presents an automatic LPR system for Libyan license plates. The performance of the proposed system is evaluated with three video sequences.

Keywords: license plate recognition, localization, segmentation, recognition

Procedia PDF Downloads 464

2676 Face Recognition Using Discrete Orthogonal Hahn Moments

Authors: Fatima Akhmedova, Simon Liao

Abstract:

One of the most critical decision points in the design of a face recognition system is the choice of an appropriate face representation. Effective feature descriptors are expected to convey sufficient, invariant and non-redundant facial information. In this work, we propose a set of Hahn moments as a new approach for feature description. Hahn moments have been widely used in image analysis due to their invariance, non-redundancy and the ability to extract features both globally and locally. To assess the applicability of Hahn moments to face recognition, we conduct two experiments on the Olivetti Research Laboratory (ORL) database and the University of Notre-Dame (UND) X1 biometric collection. A fusion of the global features with the features from local facial regions is used as input to a conventional k-NN classifier. The method correctly recognizes 93% of subjects in the ORL database and 94% in the UND database.

Keywords: face recognition, Hahn moments, recognition-by-parts, time-lapse

Procedia PDF Downloads 375

2675 Topology-Based Character Recognition Method for Coin Date Detection

Authors: Xingyu Pan, Laure Tougne

Abstract:

For recognizing coins, the engraved release date is important information for identifying the monetary type precisely. However, reading characters on coins faces many more obstacles than traditional character recognition tasks in other fields, such as reading scanned documents or license plates. To address this challenging issue in a numismatic context, we propose a training-free approach dedicated to detection and recognition of the release date of the coin. In the first step, the date zone is detected by comparing histogram features; in the second step, a topology-based algorithm is introduced to recognize coin numbers with various font types represented by a binary gradient map. Our method obtained a recognition rate of 92% on synthetic data and 44% on real noisy data.

Keywords: coin, detection, character recognition, topology

Procedia PDF Downloads 253

2674 Development of Sound Tactile Interface by Use of Human Sensation of Stiffness

Authors: K. Doi, T. Nishimura, M. Umeda

Abstract:

There are very few sound interfaces that both healthy people and hearing handicapped people can use to play together. In this study, we developed a sound tactile interface that makes use of the human sensation of stiffness. The interface comprises eight elastic objects having varying degrees of stiffness. Each elastic object is shaped like a column. When people with and without hearing disabilities press each elastic object, different sounds are produced depending on the stiffness of the elastic object. The types of sounds used were “Do Re Mi sounds.” The interface has a major advantage in that people with or without hearing disabilities can play with it. We found that users were able to recognize the hardness sensation and relate it to the corresponding Do Re Mi sounds.

Keywords: tactile sense, sound interface, stiffness perception, elastic object

Procedia PDF Downloads 285

2673 Predicting the Relationship Between Childhood Trauma on the Formation of Defense Mechanisms with the Mediating Role of Object Relations in Traders

Authors: Ahmadreza Jabalameli, Mohammad Ebrahimpour Borujeni

Abstract:

According to psychodynamic theories, the major personality structure of individuals is formed in the first years of life. Trauma is an inseparable and undeniable part of everyone's life and they inevitably struggle with many traumas that can have a very significant impact on their lives. The present study deals with the relationship between childhood trauma on the formation of defense mechanisms and the role of object relations. The present descriptive study is a correlation with structural equation modeling (SEM). Sample selection is available and consists of 200 knowledgeable traders in Jabalameli Information Technology Company. The results indicate that the experience of childhood trauma with a demographic moderating effect, through the mediating role of object relations can lead to vulnerability to ego reality functionality and immature and psychically disturbed defense mechanisms. In this regard, there is a significant negative relationship between childhood trauma and object relations with mature defense mechanisms.

Keywords: childhood trauma, defense mechanisms, object relations, trade

Procedia PDF Downloads 132

2672 Protective Effect of the Histamine H3 Receptor Antagonist DL77 in Behavioral Cognitive Deficits Associated with Schizophrenia

Authors: B. Sadek, N. Khan, D. Łażewska, K. Kieć-Kononowicz

Abstract:

The effects of the non-imidazole histamine H3 receptor (H3R) antagonist DL77 in passive avoidance paradigm (PAP) and novel object recognition (NOR) task in MK801-induced cognitive deficits associated with schizophrenia (CDS) in adult male rats, and applying donepezil (DOZ) as a reference drug were investigated. The results show that acute systemic administration of DL77 (2.5, 5, and 10 mg/kg, i.p.) significantly improved MK801-induced (0.1 mg/kg, i.p.) memory deficits in PAP. The ameliorating activity of DL77 (5 mg/kg, i.p.) in MK801-induced deficits was partly reversed when rats were pretreated with the centrally-acting H2R antagonist zolantidine (ZOL, 10 mg/kg, i.p.) or with the antimuscarinic antagonist scopolamine (SCO, 0.1 mg/kg, i.p.), but not with the CNS penetrant H1R antagonist pyrilamine (PYR, 10 mg/kg, i.p.). Moreover, the memory enhancing effect of DL77 (5 mg/kg, i.p.) in MK801-induced memory deficits in PAP was strongly reversed when rats were pretreated with a combination of ZOL (10 mg/kg, i.p.) and SCO (1.0 mg/kg, i.p.). Furthermore, the significant ameliorative effect of DL77 (5 mg/kg, i.p.) on MK801-induced long-term memory (LTM) impairment in NOR test was comparable to the DOZ-provided memory-enhancing effect, and was abrogated when animals were pretreated with the histamine H3R agonist R-(α)-methylhistamine (RAMH, 10 mg/kg, i.p.). However, DL77 (5 mg/kg, i.p.) failed to provide procognitive effect on MK801-induced short-term memory (STM) impairment in NOR test. In addition, DL77 (5 mg/kg) did not alter anxiety levels and locomotor activity of animals naive to elevated-plus maze (EPM), demonstrating that improved performances with DL77 (5 mg/kg) in PAP or NOR are unrelated to changes in emotional responding or spontaneous locomotor activity. These results provide evidence for the potential of H3Rs for the treatment of neurodegenerative disorders related to impaired memory function, e.g. CDS.

Keywords: histamine H3 receptor, antagonist, learning, memory impairment, passive avoidance paradigm, novel object recognition

Procedia PDF Downloads 203

2671 Construction Information Visualization System Using nD CAD Model

Authors: Hyeon-seoung Kim, Sang-mi Park, Sun-ju Han, Leen-seok Kang

Abstract:

The visualization technology of construction information using 3D and nD modeling can satisfy the visualization needs of each construction project participant. The nD CAD system is a tool in which construction information, such as construction schedule, cost and resource utilization, is simulated in 4D, 5D and 6D object formats based on a 3D object. This study developed a methodology and simulation engine for an nD CAD system for construction project management. Compared with current systems, it has improved functions such as built-in schedule generation, cost simulation of changed budgets and built-in resource allocation. To develop an integrated nD CAD system, this study attempts an integrated method to link 5D and 6D objects based on a 4D object.

Keywords: building information modeling, visual simulation, 3D object, nD CAD augmented reality

Procedia PDF Downloads 312

2670 Object Detection in Digital Images under Non-Standardized Conditions Using Illumination and Shadow Filtering

Authors: Waqqas-ur-Rehman Butt, Martin Servin, Marion Pause

Abstract:

In recent years, object detection has gained much attention and has become a very encouraging research area in the field of computer vision. Robust detection of object boundaries in an image is demanded in numerous applications of human-computer interaction and automated surveillance systems. Many methods and approaches have been developed for automatic object detection in various fields, such as automotive, quality control management and environmental services. Unfortunately, to the best of our knowledge, object detection under varying illumination with shadow consideration has not been well solved yet. Furthermore, this problem is one of the major hurdles keeping object detection methods from practical applications. This paper presents an approach to automatic object detection in images under non-standardized environmental conditions. A key challenge is how to detect the object, particularly under uneven illumination conditions. Because image capturing conditions differ, the algorithms need to consider a variety of possible environmental factors, as the colour information, lighting and shadows vary from image to image. Existing methods mostly fail to produce an appropriate result due to variation in colour information, lighting effects, threshold specifications, histogram dependencies and colour ranges. To overcome these limitations, we propose an object detection algorithm, with pre-processing methods, that reduces the interference caused by shadow and illumination effects without fixed parameters. We use the YCrCb colour model without any specific colour ranges or predefined threshold values. The segmented object regions are further classified using morphological operations (erosion and dilation) and contours. The proposed approach is applied to a large image data set acquired under various environmental conditions for wood stack detection. Experiments show promising results for the proposed approach in comparison with existing methods.

Keywords: image processing, illumination equalization, shadow filtering, object detection

Procedia PDF Downloads 216
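
The abstract above names erosion and dilation as the morphological operations applied to the segmented regions. The sketch below shows both on a small binary mask, in plain Python. The 3×3 square structuring element, the treatment of out-of-image pixels as background, and the erode-then-dilate order (an opening) are all assumptions made for illustration; the abstract does not specify them.

```python
OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def _neighbours(mask, y, x):
    """Yield the 3x3 neighbourhood values, treating out-of-image pixels as 0."""
    h, w = len(mask), len(mask[0])
    for dy, dx in OFFSETS:
        ny, nx = y + dy, x + dx
        yield mask[ny][nx] if 0 <= ny < h and 0 <= nx < w else 0


def erode(mask):
    """A pixel survives only if its whole 3x3 neighbourhood is foreground."""
    return [[int(all(_neighbours(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def dilate(mask):
    """A pixel is set if any pixel in its 3x3 neighbourhood is foreground."""
    return [[int(any(_neighbours(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


# Toy mask: a 3x3 object plus one isolated noise pixel in the top-right corner.
mask = [
    [0, 0, 0, 0, 1],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
opened = dilate(erode(mask))
```

On this toy mask the opening removes the isolated pixel and restores the 3×3 object, which is the usual reason for pairing the two operations after segmentation.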

2669 Object Tracking in Motion Blurred Images with Adaptive Mean Shift and Wavelet Feature

Authors: Iman Iraei, Mina Sharifi

Abstract:

A method for object tracking in motion-blurred images is proposed in this article. This paper shows that object tracking can be improved with this approach. We use the mean shift algorithm as the main tracker to track different objects. The problem is that mean shift cannot track the selected object accurately in blurred scenes. So, for better tracking results and increased tracking accuracy, the wavelet transform is used. We use a feature named blur extent, which helps us to get better tracking results. To calculate this feature, we use the Haar wavelet. The feature serves two purposes: determining whether an image is blurred or not, and to what extent it is blurred. In fact, this feature affects the covariance matrix of the mean shift algorithm and leads to better tracking performance. This method concentrates mostly on the motion blur parameter. The results reveal the ability of our method to achieve more accurate tracking.

Keywords: mean shift, object tracking, blur extent, wavelet transform, motion blur

Procedia PDF Downloads 210

2668 Exploring Multi-Feature Based Action Recognition Using Multi-Dimensional Dynamic Time Warping

Authors: Guoliang Lu, Changhou Lu, Xueyong Li

Abstract:

In action recognition, previous studies have demonstrated the effectiveness of using multiple features to improve the recognition performance. We focus on two practical issues: i) most studies use a direct way of concatenating/accumulating multiple features to evaluate the similarity between two actions. This way could be too strong since each kind of feature can include different dimensions, quantities, etc.; ii) in many studies, the employed classification methods lack a flexible and effective mechanism to add new feature(s) into classification. In this paper, we explore a unified scheme based on the recently proposed multi-dimensional dynamic time warping (MD-DTW). Experiments demonstrated the scheme's effectiveness in combining multiple features and its flexibility in adding new feature(s) to increase the recognition performance. In addition, the explored scheme also provides an open architecture for using new advanced classification methods in the future to enhance action recognition.

Keywords: action recognition, multi features, dynamic time warping, feature combination

Procedia PDF Downloads 437
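
To make the multi-dimensional dynamic time warping idea in the abstract above concrete, here is a minimal sketch in which each frame of an action is a feature vector and the local cost is the Euclidean distance between frames. This is a textbook DTW recurrence applied to vectors, not the authors' exact scheme; the function name, the Euclidean local cost and the absence of per-feature normalization are assumptions.

```python
import math


def md_dtw(seq_a, seq_b):
    """DTW distance between two sequences of equal-dimension feature vectors."""
    if not seq_a or not seq_b:
        raise ValueError("both sequences must be non-empty")
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = best accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])  # Euclidean frame distance
            cost[i][j] = local + min(cost[i - 1][j],      # stretch seq_b
                                     cost[i][j - 1],      # stretch seq_a
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


# Two toy "actions": the second repeats a frame, so the warped distance is zero.
a = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
b = [(0.0, 0.0), (1.0, 1.0), (1.0, 1.0), (2.0, 2.0)]
d = md_dtw(a, b)
```

Under this formulation, adding a new feature amounts to appending dimensions to every frame vector, which is one way to read the flexibility the abstract describes.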

2667 Design and Fabrication of a Programmable Stiffness-Sensitive Gripper for Object Handling

Authors: Mehdi Modabberifar, Sanaz Jabary, Mojtaba Ghodsi

Abstract:

Stiffness sensing is an important issue in medical diagnostics, robotic surgery, safe handling, and safe grasping of objects in production lines. Detecting and obtaining the characteristics of indwelling lumps embedded in soft tissue, and safely removing and handling the detected lumps, is needed in surgery. In industry, too, grasping and handling an object without damaging it, in a place that a human operator cannot access, is very important. In this paper, a method for object handling is presented. It is based on the use of an intelligent gripper to detect the object stiffness and then set a programmable force for grasping the object in order to move it. The main components of this system include sensors (for measuring force and displacement), electrical components (electrical and electronic circuits, tactile data processing and force control system), mechanical components (gripper mechanism and driving system for the gripper) and the display unit. The system uses a rotary potentiometer for measuring gripper displacement. A microcontroller, using the feedback received from the load cell mounted on the finger of the gripper, calculates the stiffness and then commands the gripper motor to apply a certain force on the object. Results of experiments on samples with different stiffness show that the gripper works successfully. The gripper can be used in haptic interfaces or robotic systems used for object handling.

Keywords: gripper, haptic, stiffness, robotic

Procedia PDF Downloads 358
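
The abstract above says the microcontroller calculates stiffness from load-cell force and potentiometer displacement, but does not give the formula. A simple assumed model is a linear spring, where stiffness is the slope of force against displacement. The sketch below fits that slope by least squares; the function name, the units and the sample readings are hypothetical.

```python
def estimate_stiffness(displacements_mm, forces_n):
    """Least-squares slope of force vs. displacement (N/mm), linear-spring model."""
    if len(displacements_mm) != len(forces_n) or len(forces_n) < 2:
        raise ValueError("need at least two paired samples")
    n = len(forces_n)
    mean_x = sum(displacements_mm) / n
    mean_f = sum(forces_n) / n
    num = sum((x - mean_x) * (f - mean_f)
              for x, f in zip(displacements_mm, forces_n))
    den = sum((x - mean_x) ** 2 for x in displacements_mm)
    if den == 0:
        raise ValueError("displacement samples must not all be equal")
    return num / den


# Hypothetical readings while the gripper closes on a sample.
k = estimate_stiffness([0.0, 1.0, 2.0, 3.0], [0.0, 2.0, 4.0, 6.0])
```

Fitting a slope over several samples, rather than dividing one force by one displacement, reduces sensitivity to sensor offset and noise; whether the original system does this is not stated.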

2666 Voice Commands Recognition of Mentor Robot in Noisy Environment Using HTK

Authors: Khenfer-Koummich Fatma, Hendel Fatiha, Mesbahi Larbi

Abstract:

This paper presents an approach based on Hidden Markov Models (HMM) using HTK tools. The goal is to create a man-machine interface with a voice recognition system that allows the operator to tele-operate a mentor robot to execute specific tasks such as rotate, raise, close, etc. This system should take into account different levels of environmental noise. This approach has been applied to isolated words representing the robot commands spoken in two languages: French and Arabic. The recognition rate obtained is the same for both languages, Arabic and French, for the neutral words. However, there is a slight difference in favor of the Arabic speech when Gaussian white noise is added with a Signal to Noise Ratio (SNR) equal to 30 dB: the recognition rate is 69% for Arabic speech and 80% for French speech. This can be explained by the ability of the phonetic context of each language when the noise is added.

Keywords: voice command, HMM, TIMIT, noise, HTK, Arabic, speech recognition

Procedia PDF Downloads 382

2665 Design of Orientation-Free Handler and Fuzzy Controller for Wire-Driven Heavy Object Lifting System

Authors: Bo-Wei Song, Yun-Jung Lee

Abstract:

This paper presents an intention interface and controller for a wire-driven heavy object lifting system that assists the operator with moving a heavy object. The handler is designed to allow a comfortable working posture for the operator. Plus, as a human assistive system, the operator is involved in the control loop, where a fuzzy control system is used to consider the human control characteristics. The effectiveness and performance of the proposed system are proved by experiments.

Keywords: fuzzy controller, handler design, heavy object lifting system, human-assistive device, human-in-the-loop system

Procedia PDF Downloads 514

2664 Multi-Modal Feature Fusion Network for Speaker Recognition Task

Authors: Xiang Shijie, Zhou Dong, Tian Dan

Abstract:

Speaker recognition is a crucial task in the field of speech processing, aimed at identifying individuals based on their vocal characteristics. However, existing speaker recognition methods face numerous challenges. Traditional methods primarily rely on audio signals, which often suffer from limitations in noisy environments, variations in speaking style, and insufficient sample sizes. Additionally, relying solely on audio features can sometimes fail to capture the unique identity of the speaker comprehensively, impacting recognition accuracy. To address these issues, we propose a multi-modal network architecture that simultaneously processes both audio and text signals. By gradually integrating audio and text features, we leverage the strengths of both modalities to enhance the robustness and accuracy of speaker recognition. Our experiments demonstrate significant improvements with this multi-modal approach, particularly in complex environments, where recognition performance has been notably enhanced. Our research not only highlights the limitations of current speaker recognition methods but also showcases the effectiveness of multi-modal fusion techniques in overcoming these limitations, providing valuable insights for future research.

Keywords: feature fusion, memory network, multimodal input, speaker recognition

Procedia PDF Downloads 32

2663 Improved Dynamic Bayesian Networks Applied to Arabic On Line Characters Recognition

Authors: Redouane Tlemsani, Abdelkader Benyettou

Abstract:

This work addresses online Arabic character recognition; the principal motivation is to study Arabic handwriting with online technology. The system is Markovian and can be viewed as a Dynamic Bayesian Network (DBN). One of the major advantages of such systems is that the complete model (topology and parameters) can be trained from training data. Our approach is based on the dynamic Bayesian network formalism; DBN theory generalizes Bayesian networks to dynamic processes. One of our objectives is to find better parameters, which represent the links (dependencies) between the variables of the dynamic network. In pattern recognition applications, the structure is usually fixed in advance, which forces us to accept some strong assumptions (for example, independence between some variables). Our application is the online recognition of isolated Arabic characters using our laboratory database, NOUN. A neural tester is proposed for external optimization of the DBN. The DBN scores and DBN mixed are 70.24% and 62.50%, respectively, which suggests room for further development; other approaches that take time into account were considered and implemented until a significant recognition rate of 94.79% was obtained.

Keywords: Arabic on line character recognition, dynamic Bayesian network, pattern recognition, computer vision

Procedia PDF Downloads 428

2662 Bidirectional Dynamic Time Warping Algorithm for the Recognition of Isolated Words Impacted by Transient Noise Pulses

Authors: G. Tamulevičius, A. Serackis, T. Sledevič, D. Navakauskas

Abstract:

We consider the biggest challenge in speech recognition: noise reduction. Traditionally, detected transient noise pulses are removed together with the corrupted speech using pulse models. In this paper we propose to cope with the problem directly in the Dynamic Time Warping domain. A bidirectional Dynamic Time Warping algorithm for the recognition of isolated words impacted by transient noise pulses is proposed. It uses a simple transient noise pulse detector, employs bidirectional computation of dynamic time warping, and directly manipulates the warping results. Experimental investigation with several alternative solutions confirms the effectiveness of the proposed algorithm in reducing the impact of noise on the recognition process: a 3.9% increase in noisy speech recognition is achieved.
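
For readers unfamiliar with the baseline, the following is a minimal sketch of classic (unidirectional) Dynamic Time Warping used for isolated-word template matching. It is not the bidirectional algorithm of the paper, whose details the abstract does not give. The function names, the one-dimensional features, and the absolute-difference local cost are assumptions made for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Classic DTW cost between two 1-D feature sequences (illustrative)."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def recognize(word, templates):
    """Return the label of the template with the smallest DTW cost."""
    return min(templates, key=lambda label: dtw_distance(word, templates[label]))
```

In a real recognizer each frame would be a feature vector (for example, cepstral coefficients) rather than a single number, and the local cost would be a vector distance.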

Keywords: transient noise pulses, noise reduction, dynamic time warping, speech recognition

Procedia PDF Downloads 558

2661 Advanced Mouse Cursor Control and Speech Recognition Module

Authors: Prasad Kalagura, B. Veeresh kumar

Abstract:

We constructed an interface system that allows a paralyzed user to interact with a computer with almost full functional capability. A real-time tracking algorithm is implemented based on adaptive skin detection and motion analysis. Mouse clicks are triggered by the user's eye blinks, which are detected by a sensor. The keyboard function is implemented by a voice recognition kit.

Keywords: embedded ARM7 processor, mouse pointer control, voice recognition

Procedia PDF Downloads 578

2660 Real Time Multi Person Action Recognition Using Pose Estimates

Authors: Aishrith Rao

Abstract:

Human activity recognition is an important aspect of video analytics, and many approaches have been proposed to enable action recognition. In this approach, the model identifies the actions of multiple people in the frame and classifies them accordingly. A few approaches use RNNs and 3D CNNs, which are computationally expensive and cannot be trained with the small datasets that are currently available. Multi-person action recognition is performed in order to understand the positions and actions of the people present in the video frame. The size of the video frame can be adjusted as a hyper-parameter depending on the hardware resources available. OpenPose is used to compute pose estimates with a CNN that produces heat-maps, one of which provides skeleton features, which are essentially joint features. The features are then extracted, and a classification algorithm can be applied to classify the action.
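
The last step, classifying an action from skeleton joint features, can be sketched as follows. The abstract does not name the classifier, so this sketch assumes a 1-nearest-neighbour rule over normalized 2-D joint coordinates. The function names, the choice of root joint, and the toy poses are all hypothetical, and the joints are assumed to come from a pose estimator run beforehand.

```python
from math import hypot


def normalize_pose(joints, root=0):
    """Move the root joint to the origin and scale by the farthest joint."""
    rx, ry = joints[root]
    centered = [(x - rx, y - ry) for x, y in joints]
    # Fall back to 1.0 when every joint coincides with the root.
    scale = max(hypot(x, y) for x, y in centered) or 1.0
    return [(x / scale, y / scale) for x, y in centered]


def pose_distance(p, q):
    """Sum of Euclidean distances between corresponding joints."""
    return sum(hypot(px - qx, py - qy) for (px, py), (qx, qy) in zip(p, q))


def classify_pose(joints, labelled_examples):
    """1-nearest-neighbour over normalized poses (assumed classifier).

    labelled_examples is a list of (label, joints) pairs.
    """
    query = normalize_pose(joints)
    best_label, _ = min(
        ((label, pose_distance(query, normalize_pose(example)))
         for label, example in labelled_examples),
        key=lambda pair: pair[1],
    )
    return best_label
```

Normalizing each person's skeleton separately makes the features independent of where the person stands in the frame and how large they appear, which matters when several people are classified in the same image.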

Keywords: human activity recognition, computer vision, pose estimates, convolutional neural networks

Procedia PDF Downloads 139

2659 A Neural Approach for the Offline Recognition of the Arabic Handwritten Words of the Algerian Departments

Authors: Salim Ouchtati, Jean Sequeira, Mouldi Bedda

Abstract:

In this work we present an offline system for the recognition of the Arabic handwritten words of the Algerian departments. The study is based mainly on the evaluation of the performance of a neural network trained with the gradient back propagation algorithm. The parameters used to form the input vector of the neural network are extracted from the binary images of the handwritten word by several methods: the parameters of distribution, the centered moments of the different projections, and the Barr features. It should be noted that these methods are applied to the segments obtained by dividing the binary image of the word into six segments. The classification is achieved by a multilayer perceptron. Detailed experiments are carried out and satisfactory recognition results are reported.
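
As a rough illustration of segment-wise feature extraction, the sketch below splits a binary word image into six segments and computes one simple feature per segment. The abstract does not say how the image is divided or define the features precisely, so the equal-width vertical strips, the ink-density feature, and the function name are assumptions; the paper's distribution parameters, centered moments, and Barr features are not reproduced here.

```python
def segment_densities(image, n_segments=6):
    """Fraction of foreground pixels in each vertical strip of a binary image.

    image is a non-empty list of equal-length rows containing 0 or 1.
    """
    height = len(image)
    width = len(image[0])
    densities = []
    for k in range(n_segments):
        # Integer bounds so the strips cover every column exactly once.
        start = k * width // n_segments
        end = (k + 1) * width // n_segments
        ink = sum(row[c] for row in image for c in range(start, end))
        area = height * (end - start)
        # Strips can be empty when the image is narrower than n_segments.
        densities.append(ink / area if area else 0.0)
    return densities
```

The resulting six-element vector could be concatenated with other per-segment features to form the input vector of a classifier such as a multilayer perceptron.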

Keywords: handwritten word recognition, neural networks, image processing, pattern recognition, features extraction

Procedia PDF Downloads 513

2658 The Ameliorative Effects of the Histamine H3 Receptor Antagonist/Inverse Agonist DL77 on MK801-Induced Memory Deficits in Rats

Authors: B. Sadek, N. Khan, Shreesh K. Ojha, Adel Sadeq, D. Lazewska, K. Kiec-Kononowicz

Abstract:

The involvement of histamine H3 receptors (H3Rs) in memory and the potential role of H3R antagonists in the pharmacological control of neurodegenerative disorders, such as Alzheimer's disease (AD), are well established. Therefore, the memory-enhancing effects of the H3R antagonist DL77 on MK801-induced cognitive deficits were evaluated in the passive avoidance paradigm (PAP) and novel object recognition (NOR) tasks in adult male rats, with donepezil (DOZ) as a reference drug. Acute systemic pretreatment with DL77 (2.5, 5, and 10 mg/kg, i.p.) significantly ameliorated MK801-induced memory deficits in PAP. The ameliorative effect of the most effective dose of DL77 (5 mg/kg, i.p.) was abrogated when animals were pretreated with a co-injection of the H3R agonist R-(α)-methylhistamine (RAMH, 10 mg/kg, i.p.). Moreover, in the NOR paradigm, DL77 (5 mg/kg, i.p.) reversed MK801-induced deficits in long-term memory (LTM); this procognitive effect was comparable to that of the reference drug DOZ and was reversed when animals were co-injected with RAMH (10 mg/kg, i.p.). However, DL77 (5 mg/kg, i.p.) failed to alter short-term memory (STM) impairment in the NOR test. Furthermore, DL77 (5 mg/kg) failed to induce any alterations in the anxiety and locomotor behaviors of animals naive to the elevated-plus maze (EPM), indicating that the ameliorative effects observed in the PAP and NOR tests were not associated with alterations in emotions or in the natural locomotion of the tested animals. These results reveal the potential contribution of H3Rs in modulating CNS neurotransmission systems associated with neurodegenerative disorders such as AD.

Keywords: histamine H3 receptor, antagonist, learning and memory, Alzheimer's disease, neurodegeneration, passive avoidance paradigm, novel object recognition, behavioral research

Procedia PDF Downloads 155