Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 4584

Search results for: cosine margin face recognition

4404 Gender Recognition with Deep Belief Networks

Authors: Xiaoqi Jia, Qing Zhu, Hao Zhang, Su Yang

Abstract:

A gender recognition system is able to tell the gender of the given person through a few of frontal facial images. An effective gender recognition approach enables to improve the performance of many other applications, including security monitoring, human-computer interaction, image or video retrieval and so on. In this paper, we present an effective method for gender classification task in frontal facial images based on deep belief networks (DBNs), which can pre-train model and improve accuracy a little bit. Our experiments have shown that the pre-training method with DBNs for gender classification task is feasible and achieves a little improvement of accuracy on FERET and CAS-PEAL-R1 facial datasets.

Keywords: gender recognition, beep belief net-works, semi-supervised learning, greedy-layer wise RBMs

Procedia PDF Downloads 451

4403 Emotion Recognition Using Artificial Intelligence

Authors: Rahul Mohite, Lahcen Ouarbya

Abstract:

This paper focuses on the interplay between humans and computer systems and the ability of these systems to understand and respond to human emotions, including non-verbal communication. Current emotion recognition systems are based solely on either facial or verbal expressions. The limitation of these systems is that it requires large training data sets. The paper proposes a system for recognizing human emotions that combines both speech and emotion recognition. The system utilizes advanced techniques such as deep learning and image recognition to identify facial expressions and comprehend emotions. The results show that the proposed system, based on the combination of facial expression and speech, outperforms existing ones, which are based solely either on facial or verbal expressions. The proposed system detects human emotion with an accuracy of 86%, whereas the existing systems have an accuracy of 70% using verbal expression only and 76% using facial expression only. In this paper, the increasing significance and demand for facial recognition technology in emotion recognition are also discussed.

Keywords: facial reputation, expression reputation, deep gaining knowledge of, photo reputation, facial technology, sign processing, photo type

Procedia PDF Downloads 117

4402 Improving Activity Recognition Classification of Repetitious Beginner Swimming Using a 2-Step Peak/Valley Segmentation Method with Smoothing and Resampling for Machine Learning

Authors: Larry Powell, Seth Polsley, Drew Casey, Tracy Hammond

Abstract:

Human activity recognition (HAR) systems have shown positive performance when recognizing repetitive activities like walking, running, and sleeping. Water-based activities are a reasonably new area for activity recognition. However, water-based activity recognition has largely focused on supporting the elite and competitive swimming population, which already has amazing coordination and proper form. Beginner swimmers are not perfect, and activity recognition needs to support the individual motions to help beginners. Activity recognition algorithms are traditionally built around short segments of timed sensor data. Using a time window input can cause performance issues in the machine learning model. The window’s size can be too small or large, requiring careful tuning and precise data segmentation. In this work, we present a method that uses a time window as the initial segmentation, then separates the data based on the change in the sensor value. Our system uses a multi-phase segmentation method that pulls all peaks and valleys for each axis of an accelerometer placed on the swimmer’s lower back. This results in high recognition performance using leave-one-subject-out validation on our study with 20 beginner swimmers, with our model optimized from our final dataset resulting in an F-Score of 0.95.

Keywords: time window, peak/valley segmentation, feature extraction, beginner swimming, activity recognition

Procedia PDF Downloads 121

4401 A Framework for Chinese Domain-Specific Distant Supervised Named Entity Recognition

Authors: Qin Long, Li Xiaoge

Abstract:

The Knowledge Graphs have now become a new form of knowledge representation. However, there is no consensus in regard to a plausible and definition of entities and relationships in the domain-specific knowledge graph. Further, in conjunction with several limitations and deficiencies, various domain-specific entities and relationships recognition approaches are far from perfect. Specifically, named entity recognition in Chinese domain is a critical task for the natural language process applications. However, a bottleneck problem with Chinese named entity recognition in new domains is the lack of annotated data. To address this challenge, a domain distant supervised named entity recognition framework is proposed. The framework is divided into two stages: first, the distant supervised corpus is generated based on the entity linking model of graph attention neural network; secondly, the generated corpus is trained as the input of the distant supervised named entity recognition model to train to obtain named entities. The link model is verified in the ccks2019 entity link corpus, and the F1 value is 2% higher than that of the benchmark method. The re-pre-trained BERT language model is added to the benchmark method, and the results show that it is more suitable for distant supervised named entity recognition tasks. Finally, it is applied in the computer field, and the results show that this framework can obtain domain named entities.

Keywords: distant named entity recognition, entity linking, knowledge graph, graph attention neural network

Procedia PDF Downloads 90

4400 Recognition of Spelling Problems during the Text in Progress: A Case Study on the Comments Made by Portuguese Students Newly Literate

Authors: E. Calil, L. A. Pereira

Abstract:

The acquisition of orthography is a complex process, involving both lexical and grammatical questions. This learning occurs simultaneously with the domain of multiple textual aspects (e.g.: graphs, punctuation, etc.). However, most of the research on orthographic acquisition focus on this acquisition from an autonomous point of view, separated from the process of textual production. This means that their object of analysis is the production of words selected by the researcher or the requested sentences in an experimental and controlled setting. In addition, the analysis of the Spelling Problems (SP) are identified by the researcher on the sheet of paper. Considering the perspective of Textual Genetics, from an enunciative approach, this study will discuss the SPs recognized by dyads of newly literate students, while they are writing a text collaboratively. Six proposals of textual production were registered, requested by a 2nd year teacher of a Portuguese Primary School between January and March 2015. In our case study we discuss the SPs recognized by the dyad B and L (7 years old). We adopted as a methodological tool the Ramos System audiovisual record. This system allows real-time capture of the text in process and of the face-to-face dialogue between both students and their teacher, and also captures the body movements and facial expressions of the participants during textual production proposals in the classroom. In these ecological conditions of multimodal registration of collaborative writing, we could identify the emergence of SP in two dimensions: i. In the product (finished text): SP identification without recursive graphic marks (without erasures) and the identification of SPs with erasures, indicating the recognition of SP by the student; ii. In the process (text in progress): identification of comments made by students about recognized SPs. Given this, we’ve analyzed the comments on identified SPs during the text in progress. These comments characterize a type of reformulation referred to as Commented Oral Erasure (COE). The COE has two enunciative forms: Simple Comment (SC) such as ' 'X' is written with 'Y' '; or Unfolded Comment (UC), such as ' 'X' is written with 'Y' because...'. The spelling COE may also occur before or during the SP (Early Spelling Recognition - ESR) or after the SP has been entered (Later Spelling Recognition - LSR). There were 631 words entered in the 6 stories written by the B-L dyad, 145 of them containing some type of SP. During the text in progress, the students recognized orally 174 SP, 46 of which were identified in advance (ESRs) and 128 were identified later (LSPs). If we consider that the 88 erasure SPs in the product indicate some form of SP recognition, we can observe that there were twice as many SPs recognized orally. The ESR was characterized by SC when students asked their colleague or teacher how to spell a given word. The LSR presented predominantly UC, verbalizing meta-orthographic arguments, mostly made by L. These results indicate that writing in dyad is an important didactic strategy for the promotion of metalinguistic reflection, favoring the learning of spelling.

Keywords: collaborative writing, erasure, learning, metalinguistic awareness, spelling, text production

Procedia PDF Downloads 161

4399 The Effectiveness of Exchange of Tacit and Explicit Knowledge Using Digital and Face to Face Sharing

Authors: Delio I. Castaneda, Paul Toulson

Abstract:

The purpose of this study was to investigate the knowledge sharing effectiveness of two types of knowledge, tacit and explicit, depending on two channels: face to face or digital. Participants were 217 knowledge workers in New Zealand and researchers who attended a knowledge management conference in the United Kingdom. In the study, it was found that digital tools are effective to share explicit knowledge. In addition, digital tools that facilitated dialogue were effective to share tacit knowledge. It was also found that face to face communication was an effective way to share tacit and explicit knowledge. Results of this study contribute to clarify in what cases digital tools are effective to share tacit knowledge. Additionally, even though explicit knowledge can be easily shared using digital tools, this type of knowledge is also possible to be shared through dialogue. Result of this study may support practitioners to redesign programs and activities based on knowledge sharing to make strategies more effective.

Keywords: digital knowledge, explicit knowledge, knowledge sharing, tacit knowledge

Procedia PDF Downloads 253

4398 DG Power Plants Placement and Evaluation of its Effect on Improving Voltage Security Margin in Radial Distribution Networks

Authors: Atabak Faramarzpour, Mohsen Mohammadian

Abstract:

In this article, we introduce the stability of power system voltage and state DG power plants placement and its effect on improving voltage security margin in radial distribution networks. For this purpose, first, important definitions in voltage stability area such as small and big voltage disturbances, instability, and voltage collapse, and voltage security definitions are stated. Then, according to voltage collapse time, voltage stability is classified and each one's characteristics are stated.

Keywords: DG power plants, evaluation, voltage security, radial distribution networks

Procedia PDF Downloads 669

4397 Hand Gesture Recognition for Sign Language: A New Higher Order Fuzzy HMM Approach

Authors: Saad M. Darwish, Magda M. Madbouly, Murad B. Khorsheed

Abstract:

Sign Languages (SL) are the most accomplished forms of gestural communication. Therefore, their automatic analysis is a real challenge, which is interestingly implied to their lexical and syntactic organization levels. Hidden Markov models (HMM’s) have been used prominently and successfully in speech recognition and, more recently, in handwriting recognition. Consequently, they seem ideal for visual recognition of complex, structured hand gestures such as are found in sign language. In this paper, several results concerning static hand gesture recognition using an algorithm based on Type-2 Fuzzy HMM (T2FHMM) are presented. The features used as observables in the training as well as in the recognition phases are based on Singular Value Decomposition (SVD). SVD is an extension of Eigen decomposition to suit non-square matrices to reduce multi attribute hand gesture data to feature vectors. SVD optimally exposes the geometric structure of a matrix. In our approach, we replace the basic HMM arithmetic operators by some adequate Type-2 fuzzy operators that permits us to relax the additive constraint of probability measures. Therefore, T2FHMMs are able to handle both random and fuzzy uncertainties existing universally in the sequential data. Experimental results show that T2FHMMs can effectively handle noise and dialect uncertainties in hand signals besides a better classification performance than the classical HMMs. The recognition rate of the proposed system is 100% for uniform hand images and 86.21% for cluttered hand images.

Keywords: hand gesture recognition, hand detection, type-2 fuzzy logic, hidden Markov Model

Procedia PDF Downloads 460

4396 Communication and Devices: Face to Face Communication versus Communication with Mobile Technologies

Authors: Nuran Öze

Abstract:

With the rapid changes occurring in the last twenty five years, mobile phone technology has influenced every aspect of life. Technological developments within the Internet and mobile phone areas have not only changed communication practices; it has also changed the everyday life practices of individuals. This article has focused on understanding how people’s communication practices and everyday life practices have changed with the smartphone usage. The study was conducted by using in-depth interview method and the research was conducted on twenty Turkish Cypriots who live in Northern Cyprus. According to the research results, communicating via Internet has rapidly replaced face to face communication in recent years. However, results have changed according to generations. Younger generations can easily adapt themselves to technological changes because they are already gaining everyday life practices right now. However, the older generations practices are already present in their everyday life.

Keywords: face to face communication, internet, mobile technologies, north Cyprus

Procedia PDF Downloads 394

4395 Fine Grained Action Recognition of Skateboarding Tricks

Authors: Frederik Calsius, Mirela Popa, Alexia Briassouli

Abstract:

In the field of machine learning, it is common practice to use benchmark datasets to prove the working of a method. The domain of action recognition in videos often uses datasets like Kinet-ics, Something-Something, UCF-101 and HMDB-51 to report results. Considering the properties of the datasets, there are no datasets that focus solely on very short clips (2 to 3 seconds), and on highly-similar fine-grained actions within one specific domain. This paper researches how current state-of-the-art action recognition methods perform on a dataset that consists of highly similar, fine-grained actions. To do so, a dataset of skateboarding tricks was created. The performed analysis highlights both benefits and limitations of state-of-the-art methods, while proposing future research directions in the activity recognition domain. The conducted research shows that the best results are obtained by fusing RGB data with OpenPose data for the Temporal Shift Module.

Keywords: activity recognition, fused deep representations, fine-grained dataset, temporal modeling

Procedia PDF Downloads 229

4394 Developing an AI-Driven Application for Real-Time Emotion Recognition from Human Vocal Patterns

Authors: Sayor Ajfar Aaron, Mushfiqur Rahman, Sajjat Hossain Abir, Ashif Newaz

Abstract:

This study delves into the development of an artificial intelligence application designed for real-time emotion recognition from human vocal patterns. Utilizing advanced machine learning algorithms, including deep learning and neural networks, the paper highlights both the technical challenges and potential opportunities in accurately interpreting emotional cues from speech. Key findings demonstrate the critical role of diverse training datasets and the impact of ambient noise on recognition accuracy, offering insights into future directions for improving robustness and applicability in real-world scenarios.

Keywords: artificial intelligence, convolutional neural network, emotion recognition, vocal patterns

Procedia PDF Downloads 49

4393 Face Shield Design with Additive Manufacturing Practice Combating COVID-19 Pandemic

Authors: May M. Youssef

Abstract:

This article introduces a design, for additive manufacturing technology, face shield as Personal Protective Equipment from the respiratory viruses such as coronavirus 2. The face shields help to reduce ocular exposure and play a vital role in diverting away from the respiratory COVID-19 air droplets around the users' face. The proposed face shield comprises three assembled polymer parts. The frame with a transparency overhead projector sheet visor is suitable for frontline health care workers and ordinary citizens. The frame design allows tightening the shield around the user’s head and permits rubber elastic straps to be used if required. That ergonomically designed with a unique face mask support used in case of wearing extra protective mask was created using computer aided design (CAD) software package. The finite element analysis (FEA) structural verification of the proposed design is performed by an advanced simulation technique. Subsequently, the prototype model was fabricated by a 3D printing using Fused Deposition Modeling (FDM) as a globally developed face shield product. This study provides a different face shield designs for global production, which showed to be suitable and effective toward supply chain shortages and frequent needs of personal protective goods during coronavirus disease and similar viruses.

Keywords: additive manufacturing, Coronavirus-19, face shield, personal protective equipment, 3D printing

Procedia PDF Downloads 200

4392 Facial Biometric Privacy Using Visual Cryptography: A Fundamental Approach to Enhance the Security of Facial Biometric Data

Authors: Devika Tanna

Abstract:

'Biometrics' means 'life measurement' but the term is usually associated with the use of unique physiological characteristics to identify an individual. It is important to secure the privacy of digital face image that is stored in central database. To impart privacy to such biometric face images, first, the digital face image is split into two host face images such that, each of it gives no idea of existence of the original face image and, then each cover image is stored in two different databases geographically apart. When both the cover images are simultaneously available then only we can access that original image. This can be achieved by using the XM2VTS and IMM face database, an adaptive algorithm for spatial greyscale. The algorithm helps to select the appropriate host images which are most likely to be compatible with the secret image stored in the central database based on its geometry and appearance. The encryption is done using GEVCS which results in a reconstructed image identical to the original private image.

Keywords: adaptive algorithm, database, host images, privacy, visual cryptography

Procedia PDF Downloads 129

4391 Myanmar Character Recognition Using Eight Direction Chain Code Frequency Features

Authors: Kyi Pyar Zaw, Zin Mar Kyu

Abstract:

Character recognition is the process of converting a text image file into editable and searchable text file. Feature Extraction is the heart of any character recognition system. The character recognition rate may be low or high depending on the extracted features. In the proposed paper, 25 features for one character are used in character recognition. Basically, there are three steps of character recognition such as character segmentation, feature extraction and classification. In segmentation step, horizontal cropping method is used for line segmentation and vertical cropping method is used for character segmentation. In the Feature extraction step, features are extracted in two ways. The first way is that the 8 features are extracted from the entire input character using eight direction chain code frequency extraction. The second way is that the input character is divided into 16 blocks. For each block, although 8 feature values are obtained through eight-direction chain code frequency extraction method, we define the sum of these 8 feature values as a feature for one block. Therefore, 16 features are extracted from that 16 blocks in the second way. We use the number of holes feature to cluster the similar characters. We can recognize the almost Myanmar common characters with various font sizes by using these features. All these 25 features are used in both training part and testing part. In the classification step, the characters are classified by matching the all features of input character with already trained features of characters.

Keywords: chain code frequency, character recognition, feature extraction, features matching, segmentation

Procedia PDF Downloads 318

4390 Analysis of Interpolation Factor in Pulse Shaping Filter on MRC for CDMA 2000 Systems

Authors: Pankaj Verma, Gagandeep Singh Walia, Padma Devi, H. P. Singh

Abstract:

Code Division Multiple Access 2000 operates on various RF channel bandwidths 1.2288 or 3.6864 Mcps. CDMA offers high bandwidth and wireless broadband services but the efficiency gets decreased because of many interfering factors like fading, interference, scattering, diffraction, refraction, reflection etc. To reduce the spectral bandwidth is one of the major concerns in modern day technology and this is achieved by pulse shaping filter. This paper investigates the effect of diversity (MRC), interpolation factor in Root Raised Cosine (RRC) filter for the QPSK and BPSK modulation schemes. It is made possible to send information with minimum inter symbol interference and within limited bandwidth with proper pulse shaping technique. Bit error rate (BER) performance is analyzed by applying diversity technique by varying the interpolation factor for Binary Phase Shift Keying (BPSK) and Quadrature Phase Shift Keying (QPSK). Interpolation factor increases the original sampling rate of a sequence to a higher rate and reduces the interference and diversity reduces the fading.

Keywords: CDMA2000, root raised cosine, roll off factor, ISI, diversity, interference, fading

Procedia PDF Downloads 473

4389 Facial Pose Classification Using Hilbert Space Filling Curve and Multidimensional Scaling

Authors: Mekamı Hayet, Bounoua Nacer, Benabderrahmane Sidahmed, Taleb Ahmed

Abstract:

Pose estimation is an important task in computer vision. Though the majority of the existing solutions provide good accuracy results, they are often overly complex and computationally expensive. In this perspective, we propose the use of dimensionality reduction techniques to address the problem of facial pose estimation. Firstly, a face image is converted into one-dimensional time series using Hilbert space filling curve, then the approach converts these time series data to a symbolic representation. Furthermore, a distance matrix is calculated between symbolic series of an input learning dataset of images, to generate classifiers of frontal vs. profile face pose. The proposed method is evaluated with three public datasets. Experimental results have shown that our approach is able to achieve a correct classification rate exceeding 97% with K-NN algorithm.

Keywords: machine learning, pattern recognition, facial pose classification, time series

Procedia PDF Downloads 349

4388 Intelligent Human Pose Recognition Based on EMG Signal Analysis and Machine 3D Model

Authors: Si Chen, Quanhong Jiang

Abstract:

In the increasingly mature posture recognition technology, human movement information is widely used in sports rehabilitation, human-computer interaction, medical health, human posture assessment, and other fields today; this project uses the most original ideas; it is proposed to use the collection equipment for the collection of myoelectric data, reflect the muscle posture change on a degree of freedom through data processing, carry out data-muscle three-dimensional model joint adjustment, and realize basic pose recognition. Based on this, bionic aids or medical rehabilitation equipment can be further developed with the help of robotic arms and cutting-edge technology, which has a bright future and unlimited development space.

Keywords: pose recognition, 3D animation, electromyography, machine learning, bionics

Procedia PDF Downloads 76

4387 The Effect of Microgrid on Power System Oscillatory Stability

Authors: Burak Yildirim, Muhsin Tunay Gencoglu

Abstract:

This publication shows the effects of Microgrid (MG) integration on the power systems oscillating stability. Generated MG model power systems were applied to the IEEE 14 bus test system which is widely used in stability studies. Stability studies were carried out with the help of eigenvalue analysis over linearized system models. In addition, Hopf bifurcation point detection was performed to show the effect of MGs on the system loadability margin. In the study results, it is seen that MGs affect system stability positively by increasing system loadability margin and has a damper effect on the critical modes of the system and the electromechanical local modes, but they make the damping amount of the electromechanical interarea modes reduce.

Keywords: Eigenvalue analysis, microgrid, Hopf bifurcation, oscillatory stability

Procedia PDF Downloads 289

4386 Smartphone-Based Human Activity Recognition by Machine Learning Methods

Authors: Yanting Cao, Kazumitsu Nawata

Abstract:

As smartphones upgrading, their software and hardware are getting smarter, so the smartphone-based human activity recognition will be described as more refined, complex, and detailed. In this context, we analyzed a set of experimental data obtained by observing and measuring 30 volunteers with six activities of daily living (ADL). Due to the large sample size, especially a 561-feature vector with time and frequency domain variables, cleaning these intractable features and training a proper model becomes extremely challenging. After a series of feature selection and parameters adjustment, a well-performed SVM classifier has been trained.

Keywords: smart sensors, human activity recognition, artificial intelligence, SVM

Procedia PDF Downloads 141

4385 Design and Tooth Contact Analysis of Face Gear Drive with Modified Tooth Surface in Helicopter Transmission

Authors: Kazumasa Kawasaki, Isamu Tsuji, Hiroshi Gunbara

Abstract:

A face gear drive is actually composed of a spur or helical pinion that is in mesh with a face gear and transfers power and motion between intersecting or skew axes. Due to the peculiarity of the face gear drive in shunt and confluence drive, it shows potential advantages in the application in the helicopter transmission. The advantages of such applications are the possibility of the split of the torque that appears to be significant where a pinion drives two face gears to provide an accurate division of power and motion. This mechanism greatly reduces the weight and cost compared to conventional design. Therefore, this has been led to revived interest and the face gear drive has been utilized in substitution for bevel and hypoid gears in limited cases. The face gear drive with a spur or a helical pinion is newly designed in order to determine an effective meshing area under the design parameters and specific design dimensions. The face gear has two unique dimensions which control the face width of the tooth, and the outside and inside diameters of the face gear. On the other hand, it is necessary to modify the tooth surfaces of face gear drive in order to avoid the influences of alignment errors on the tooth contact patterns in practical use. In this case, the pinion tooth surfaces are usually modified in the conventional method. However, it is hard to control the tooth contact pattern intentionally and adjust the position of the pinion axis in meshing of the gear pair. Therefore, a method of the modification of the tooth surfaces of the face gear is proposed. Moreover, based on tooth contact analysis, the tooth contact pattern and transmission errors of the designed face gear drive are analyzed, and the influences of alignment errors on the tooth contact patterns and transmission errors are investigated. These results showed that the tooth contact patterns and transmission errors were controllable and the face gear drive which is insensitive to alignment errors can be obtained.

Keywords: alignment error, face gear, gear design, helicopter transmission, tooth contact analysis

Procedia PDF Downloads 436

4384 Iterative Replanning of Diesel Generator and Energy Storage System for Stable Operation of an Isolated Microgrid

Authors: Jiin Jeong, Taekwang Kim, Kwang Ryel Ryu

Abstract:

The target microgrid in this paper is isolated from the large central power system and is assumed to consist of wind generators, photovoltaic power generators, an energy storage system (ESS), a diesel power generator, the community load, and a dump load. The operation of such a microgrid can be hazardous because of the uncertain prediction of power supply and demand and especially due to the high fluctuation of the output from the wind generators. In this paper, we propose an iterative replanning method for determining the appropriate level of diesel generation and the charging/discharging cycles of the ESS for the upcoming one-hour horizon. To cope with the uncertainty of the estimation of supply and demand, the one-hour plan is built repeatedly in the regular interval of one minute by rolling the one-hour horizon. Since the plan should be built with a sufficiently large safe margin to avoid any possible black-out, some energy waste through the dump load is inevitable. In our approach, the level of safe margin is optimized through learning from the past experience. The simulation experiments show that our method combined with the margin optimization can reduce the dump load compared to the method without such optimization.

Keywords: microgrid, operation planning, power efficiency optimization, supply and demand prediction

Procedia PDF Downloads 431

4383 Human Gait Recognition Using Moment with Fuzzy

Authors: Jyoti Bharti, Navneet Manjhi, M. K.Gupta, Bimi Jain

Abstract:

A reliable gait features are required to extract the gait sequences from an images. In this paper suggested a simple method for gait identification which is based on moments. Moment values are extracted on different number of frames of gray scale and silhouette images of CASIA database. These moment values are considered as feature values. Fuzzy logic and nearest neighbour classifier are used for classification. Both achieved higher recognition.

Keywords: gait, fuzzy logic, nearest neighbour, recognition rate, moments

Procedia PDF Downloads 756

4382 A Conglomerate of Multiple Optical Character Recognition Table Detection and Extraction

Authors: Smita Pallavi, Raj Ratn Pranesh, Sumit Kumar

Abstract:

Information representation as tables is compact and concise method that eases searching, indexing, and storage requirements. Extracting and cloning tables from parsable documents is easier and widely used; however, industry still faces challenges in detecting and extracting tables from OCR (Optical Character Recognition) documents or images. This paper proposes an algorithm that detects and extracts multiple tables from OCR document. The algorithm uses a combination of image processing techniques, text recognition, and procedural coding to identify distinct tables in the same image and map the text to appropriate the corresponding cell in dataframe, which can be stored as comma-separated values, database, excel, and multiple other usable formats.

Keywords: table extraction, optical character recognition, image processing, text extraction, morphological transformation

Procedia PDF Downloads 140

4381 Image Recognition and Anomaly Detection Powered by GANs: A Systematic Review

Authors: Agastya Pratap Singh

Abstract:

Generative Adversarial Networks (GANs) have emerged as powerful tools in the fields of image recognition and anomaly detection due to their ability to model complex data distributions and generate realistic images. This systematic review explores recent advancements and applications of GANs in both image recognition and anomaly detection tasks. We discuss various GAN architectures, such as DCGAN, CycleGAN, and StyleGAN, which have been tailored to improve accuracy, robustness, and efficiency in visual data analysis. In image recognition, GANs have been used to enhance data augmentation, improve classification models, and generate high-quality synthetic images. In anomaly detection, GANs have proven effective in identifying rare and subtle abnormalities across various domains, including medical imaging, cybersecurity, and industrial inspection. The review also highlights the challenges and limitations associated with GAN-based methods, such as instability during training and mode collapse, and suggests future research directions to overcome these issues. Through this review, we aim to provide researchers with a comprehensive understanding of the capabilities and potential of GANs in transforming image recognition and anomaly detection practices.

Keywords: generative adversarial networks, image recognition, anomaly detection, DCGAN, CycleGAN, StyleGAN, data augmentation

Procedia PDF Downloads 19

4380 Recognition of Cursive Arabic Handwritten Text Using Embedded Training Based on Hidden Markov Models (HMMs)

Authors: Rabi Mouhcine, Amrouch Mustapha, Mahani Zouhir, Mammass Driss

Abstract:

In this paper, we present a system for offline recognition cursive Arabic handwritten text based on Hidden Markov Models (HMMs). The system is analytical without explicit segmentation used embedded training to perform and enhance the character models. Extraction features preceded by baseline estimation are statistical and geometric to integrate both the peculiarities of the text and the pixel distribution characteristics in the word image. These features are modelled using hidden Markov models and trained by embedded training. The experiments on images of the benchmark IFN/ENIT database show that the proposed system improves recognition.

Keywords: recognition, handwriting, Arabic text, HMMs, embedded training

Procedia PDF Downloads 352

4379 A Memristive Device with Intrinsic Rectification Behavior and Performace of Crossbar Arrays

Authors: Yansong Gao, Damith C.Ranasinghe, Siad F. Al-Sarawi, Omid Kavehei, Derek Abbott

Abstract:

Passive crossbar arrays is in principle the simplest functional electrical circuit, together with memristive device in cross-point, holding great promise in future high-density, non-volatile memories. However, the greatest problem of crossbar array is the sneak path current. In this paper, we investigate one type of memristive device with intrinsic rectification behavior to address the sneak path currents. Firstly, a SPICE behavior model written in Verilog-A language of the memristive device is presented to fit experimental data published in literature. Next, systematic performance simulations including read margin and power consumption of crossbar array, which uses the self-rectifying memristive device as storage element at cross-point, with respect to different crossbar sizes, interconnect resistance, ratio of HRS/LRS (High Resistance State/ Low Resistance State), rectification ratio and different read schemes are conducted. Subsequently, Trade-offs among reading margin, power consumption, and reading schemes are analyzed to provide guidelines for circuit design. Finally, performance comparison between the memristive device with/without intrinsic rectification behavior is given to show the worthiness of this intrinsic rectification behavior.

Keywords: memristive device, memristor, crossbar, RRAM, read margin, power consumption

Procedia PDF Downloads 435

4378 A Multimodal Approach to Improve the Performance of Biometric System

Authors: Chander Kant, Arun Kumar

Abstract:

Biometric systems automatically recognize an individual based on his/her physiological and behavioral characteristics. There are also some traits like weight, age, height etc. that may not provide reliable user recognition because of there common and temporary nature. These traits are called soft bio metric traits. Although soft bio metric traits are lack of permanence to uniquely and reliably identify an individual, yet they provide some beneficial evidence about the user identity and may improve the system performance. Here in this paper, we have proposed an approach for integrating the soft bio metrics with fingerprint and face to improve the performance of personal authentication system. In our approach we have proposed a combined architecture of three different sensors to elevate the system performance. The approach includes, soft bio metrics, fingerprint and face traits. We have also proven the efficiency of proposed system regarding FAR (False Acceptance Ratio) and total response time, with the help of MUBI (Multimodal Bio metrics Integration) software.

Keywords: FAR, minutiae point, multimodal bio metrics, primary bio metric, soft bio metric

Procedia PDF Downloads 344

4377 Improving Learning and Teaching of Software Packages among Engineering Students

Authors: Sara Moridpour

Abstract:

To meet emerging industry needs, engineering students must learn different software packages and enhance their computational skills. Traditionally, face-to-face is selected as the preferred approach to teaching software packages. Face-to-face tutorials and workshops provide an interactive environment for learning software packages where the students can communicate with the teacher and interact with other students, evaluate their skills, and receive feedback. However, COVID-19 significantly limited face-to-face learning and teaching activities at universities. Worldwide lockdowns and the shift to online and remote learning and teaching provided the opportunity to introduce different strategies to enhance the interaction among students and teachers in online and virtual environments and improve the learning and teaching of software packages in online and blended teaching methods. This paper introduces a blended strategy to teach engineering software packages to undergraduate students. This article evaluates the effectiveness of the proposed blended learning and teaching strategy in students’ learning by comparing the impact of face-to-face, online and the proposed blended environments on students’ software skills. The paper evaluates the students’ software skills and their software learning through an authentic assignment. According to the results, the proposed blended teaching strategy successfully improves the software learning experience among undergraduate engineering students.

Keywords: teaching software packages, undergraduate students, blended learning and teaching, authentic assessment

Procedia PDF Downloads 115

4376 Fitness Action Recognition Based on MediaPipe

Authors: Zixuan Xu, Yichun Lou, Yang Song, Zihuai Lin

Abstract:

MediaPipe is an open-source machine learning computer vision framework that can be ported into a multi-platform environment, which makes it easier to use it to recognize the human activity. Based on this framework, many human recognition systems have been created, but the fundamental issue is the recognition of human behavior and posture. In this paper, two methods are proposed to recognize human gestures based on MediaPipe, the first one uses the Adaptive Boosting algorithm to recognize a series of fitness gestures, and the second one uses the Fast Dynamic Time Warping algorithm to recognize 413 continuous fitness actions. These two methods are also applicable to any human posture movement recognition.

Keywords: computer vision, MediaPipe, adaptive boosting, fast dynamic time warping

Procedia PDF Downloads 116

4375 Words Spotting in the Images Handwritten Historical Documents

Authors: Issam Ben Jami

Abstract:

Information retrieval in digital libraries is very important because most famous historical documents occupy a significant value. The word spotting in historical documents is a very difficult notion, because automatic recognition of such documents is naturally cursive, it represents a wide variability in the level scale and translation words in the same documents. We first present a system for the automatic recognition, based on the extraction of interest points words from the image model. The extraction phase of the key points is chosen from the representation of the image as a synthetic description of the shape recognition in a multidimensional space. As a result, we use advanced methods that can find and describe interesting points invariant to scale, rotation and lighting which are linked to local configurations of pixels. We test this approach on documents of the 15th century. Our experiments give important results.

Keywords: feature matching, historical documents, pattern recognition, word spotting

Procedia PDF Downloads 273