Search results for: generative named entity recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2708

2378 Study on the Characteristics of Chinese Urban Network Space from the Perspective of Innovative Collaboration

Authors: Wei Wang, Yilun Xu

Abstract:

With the development of the knowledge economy, deepening cooperation mechanisms and adhering to sharing and win-win cooperation have become a new direction of urban development. In recent years, innovative collaborations between cities have become more and more frequent, and their influence on urban network space has attracted the attention of many scholars. Taking 46 cities in China as the research object, the paper measures the connectivity of the innovative network between cities and the linkages of urban external innovation using patent cooperation data among cities, and explores urban network space in China using GIS, which is a useful contribution to the study of social network space in China in the era of information networks. The results show that urban innovative network space and geographical entity space differ, and that the linkages of external innovation are not entirely related to a city's innovative capacity or level of economic development. However, urban innovative network space and geographical entity space are similar in hierarchical clustering. Both have formed three metropolitan areas (Beijing-Tianjin-Hebei, the Yangtze River Delta, and the Pearl River Delta) and four core cities (Beijing, Shenzhen, Shanghai, and Hangzhou), which lead the development of innovative network space in China.

Keywords: innovative collaboration, urban network space, the connectivity of innovative network, the linkages of external innovation

Procedia PDF Downloads 173
2377 The Influence of Job Recognition and Job Motivation on Organizational Commitment in Public Sector: The Mediation Role of Employee Engagement

Authors: Muhammad Tayyab, Saba Saira

Abstract:

It is an established fact that organizations across the globe consider employees as their assets and try to advance their well-being. However, local firms in developing countries are mostly profit oriented and show little concern for their employees’ engagement or commitment. As in other developing countries, local organizations in Pakistan are also less concerned about the well-being of their employees. Public sector organizations in particular lack concern regarding the engagement, satisfaction, or commitment of employees. Therefore, this study aimed to investigate the impact of job recognition and job motivation on organizational commitment, with employee engagement as a mediator. The data were collected from land record officers of the Board of Revenue, Punjab, Pakistan. A structured questionnaire was used to collect data by physically visiting land record officers and also through the internet. A total of 318 land record officers’ responses were finalized for data analysis. The data were analyzed through confirmatory factor analysis and structural equation modeling. The findings revealed that job recognition and job motivation have direct as well as indirect positive and significant impacts on organizational commitment. The limitations, practical implications, and directions for future research are also explained.

Keywords: job motivation, job recognition, employee engagement, employee commitment, public sector, land record officers

Procedia PDF Downloads 129
2376 An Automatic Speech Recognition of Conversational Telephone Speech in Malay Language

Authors: M. Draman, S. Z. Muhamad Yassin, M. S. Alias, Z. Lambak, M. I. Zulkifli, S. N. Padhi, K. N. Baharim, F. Maskuriy, A. I. A. Rahim

Abstract:

The performance of a Malay automatic speech recognition (ASR) system for the call centre environment is presented. The system uses the Kaldi toolkit as the platform for the libraries and algorithms used in performing the ASR task. The acoustic model implemented in this system uses a deep neural network (DNN) to model the acoustic signal and a standard n-gram model for language modelling. With 80 hours of training data from call centre recordings, the ASR system achieves 72% accuracy, which corresponds to a word error rate (WER) of 28%. The testing was done using 20 hours of audio data. Despite the use of a DNN, the system shows low accuracy owing to the variety of noise, accents, and dialects that typically occur in the Malaysian call centre environment. This significant variation between speakers is reflected in the large standard deviation of the average word error rate (WERav) (i.e., ~ 10%). The lowest WER (13.8%) was obtained from a recording sample of a native speaker with a standard Malay dialect (central Malaysia), compared with 49% for the sample with the highest WER, which contains conversation by a speaker who uses a non-standard Malay dialect.
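The relation between the 72% accuracy and the 28% WER quoted above follows from the usual definition of word accuracy as 1 minus WER. As a minimal sketch (not the Kaldi scoring tool the authors would have used), WER can be computed as the word-level edit distance divided by the number of reference words. The function name and the example strings are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: one substitution and one deletion over 4 words.
wer = word_error_rate("a b c d", "a x c")
word_accuracy = 1.0 - wer
```

Because insertions count as errors, WER can exceed 100% for a very noisy hypothesis, in which case this definition of word accuracy becomes negative.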

Keywords: conversational speech recognition, deep neural network, Malay language, speech recognition

Procedia PDF Downloads 318
2375 Local Image Features Emerging from Brain Inspired Multi-Layer Neural Network

Authors: Hui Wei, Zheng Dong

Abstract:

Object recognition has long been a challenging task in computer vision. Yet the human brain, with the ability to rapidly and accurately recognize visual stimuli, manages this task effortlessly. In the past decades, advances in neuroscience have revealed some neural mechanisms underlying visual processing. In this paper, we present a novel model inspired by the visual pathway in primate brains. This multi-layer neural network model imitates the hierarchical convergent processing mechanism in the visual pathway. We show that local image features generated by this model exhibit robust discrimination and even better generalization ability compared with some existing image descriptors. We also demonstrate the application of this model in an object recognition task on image data sets. The result provides strong support for the potential of this model.

Keywords: biological model, feature extraction, multi-layer neural network, object recognition

Procedia PDF Downloads 539
2374 Development of a Sequential Multimodal Biometric System for Web-Based Physical Access Control into a Security Safe

Authors: Babatunde Olumide Olawale, Oyebode Olumide Oyediran

Abstract:

The security safe is a place or building where classified documents and precious items are kept. Many technologies have been used to prevent unauthorised persons from gaining access to such a safe. However, frequent reports of unauthorised persons gaining access to security safes in order to remove documents and items indicate that there is still a security gap in the technologies currently used for access control. In this paper, we try to solve this problem by developing a multimodal biometric system for physical access control into a security safe using face and voice recognition. The safe is accessed by a combination of face and speech pattern recognition, in that sequential order. User authentication is achieved through a camera/sensor unit and a microphone unit, both attached to the door of the safe. The user’s face is captured by the camera/sensor, while the speech is captured by the microphone unit. The Scale-Invariant Feature Transform (SIFT) algorithm was used to train images to form templates for the face recognition system, while the Mel-Frequency Cepstral Coefficients (MFCC) algorithm was used to train the speech recognition system to recognise authorised users’ speech. Both algorithms were hosted on two separate web-based servers, and for automatic analysis the developed system was simulated in a MATLAB environment. The results show that the developed system was able to give access to authorised users while denying unauthorised persons access to the security safe.

Keywords: access control, multimodal biometrics, pattern recognition, security safe

Procedia PDF Downloads 324
2373 The Combination of the Mel Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP), JITTER and SHIMMER Coefficients for the Improvement of Automatic Recognition System for Dysarthric Speech

Authors: Brahim-Fares Zaidi, Malika Boudraa, Sid-Ahmed Selouani

Abstract:

Our work aims to improve our Automatic Recognition System for Dysarthric Speech (ARSDS), based on hidden Markov models (HMM) and the Hidden Markov Model Toolkit (HTK), to help people with pronunciation problems. We applied two speech parameterization techniques, based on Mel Frequency Cepstral Coefficients (MFCC) and Perceptual Linear Prediction (PLP), and concatenated them with JITTER and SHIMMER coefficients in order to increase the recognition rate for dysarthric speech. For our tests, we used the NEMOURS database, which contains speakers with dysarthria and normal speakers.
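The concatenation step can be illustrated with a small sketch. It assumes the commonly used "local" definitions of jitter and shimmer (mean absolute difference between consecutive pitch periods or peak amplitudes, divided by their mean); the abstract does not say which variants the authors used, and all names and numbers below are hypothetical.

```python
def local_perturbation(values):
    """Mean absolute difference of consecutive values, relative to their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def append_voice_quality(spectral_features, periods, amplitudes):
    """Concatenate a spectral feature vector (e.g. MFCC or PLP) with jitter and shimmer."""
    jitter = local_perturbation(periods)       # cycle-to-cycle period variation
    shimmer = local_perturbation(amplitudes)   # cycle-to-cycle amplitude variation
    return list(spectral_features) + [jitter, shimmer]


# Hypothetical values: 3 glottal cycles and a 4-dimensional spectral vector.
periods_s = [0.010, 0.012, 0.010]
peak_amplitudes = [0.80, 0.70, 0.75]
features = append_voice_quality([1.2, -0.3, 0.5, 0.1], periods_s, peak_amplitudes)
```

The resulting vector has two more dimensions than the spectral vector alone, which is the form a recognizer would consume if the coefficients are simply appended.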

Keywords: hidden Markov model toolkit (HTK), hidden Markov models (HMM), Mel-frequency cepstral coefficients (MFCC), perceptual linear prediction (PLP)

Procedia PDF Downloads 157
2372 A Two-Stage Adaptation towards Automatic Speech Recognition System for Malay-Speaking Children

Authors: Mumtaz Begum Mustafa, Siti Salwah Salim, Feizal Dani Rahman

Abstract:

Recently, automatic speech recognition (ASR) systems have been used to assist children in language acquisition, as they can detect human speech signals. Despite the benefits offered by ASR, there is a lack of ASR systems for Malay-speaking children. One contributing factor is the lack of a continuous speech database for the target users. Though cross-lingual adaptation is a common solution for developing ASR systems for under-resourced languages, it is not viable for children, as there are very few speech databases that can serve as a source model. In this research, we propose a two-stage adaptation for the development of an ASR system for Malay-speaking children using a very limited database. The two-stage adaptation comprises cross-lingual adaptation (first stage) and cross-age adaptation (second stage). In the first stage, a well-known speech database that is phonetically rich and balanced is adapted to a medium-sized database of Malay adult speech using supervised MLLR. The second stage uses the acoustic model generated by the first adaptation, and the target database is a small database of the target users. We measured the performance of the proposed technique using word error rate and compared it with the conventional benchmark adaptation. The two-stage adaptation proposed in this research has better recognition accuracy than the benchmark adaptation in recognizing children’s speech.

Keywords: Automatic Speech Recognition System, children speech, adaptation, Malay

Procedia PDF Downloads 390
2371 Facial Expression Phoenix (FePh): An Annotated Sequenced Dataset for Facial and Emotion-Specified Expressions in Sign Language

Authors: Marie Alaghband, Niloofar Yousefi, Ivan Garibay

Abstract:

Facial expressions are important parts of both gesture and sign language recognition systems. Despite the recent advances in both fields, annotated facial expression datasets in the context of sign language are still scarce resources. In this manuscript, we introduce an annotated sequenced facial expression dataset in the context of sign language, comprising over 3000 facial images extracted from the daily news and weather forecast of the public tv-station PHOENIX. Unlike the majority of currently existing facial expression datasets, FePh provides sequenced semi-blurry facial images with different head poses, orientations, and movements. In addition, in the majority of images, identities are mouthing the words, which makes the data more challenging. To annotate this dataset we consider primary, secondary, and tertiary dyads of seven basic emotions of "sad", "surprise", "fear", "angry", "neutral", "disgust", and "happy". We also considered the "None" class if the image’s facial expression could not be described by any of the aforementioned emotions. Although we provide FePh as a facial expression dataset of signers in sign language, it has a wider application in gesture recognition and Human Computer Interaction (HCI) systems.

Keywords: annotated facial expression dataset, gesture recognition, sequenced facial expression dataset, sign language recognition

Procedia PDF Downloads 156
2370 Lip Localization Technique for Myanmar Consonants Recognition Based on Lip Movements

Authors: Thein Thein, Kalyar Myo San

Abstract:

A lip reading system is one of several supportive technologies for hearing-impaired people, elderly people, and non-native speakers. For people with normal hearing in noisy environments, or in conditions where the audio signal is not available, lip reading techniques can be used to improve their understanding of spoken language. Hearing-impaired persons use lip reading as an important tool to find out what other people have said without hearing their voice. Thus, visual speech information is important and has become an active research area. Using visual information from lip movements can improve the accuracy and robustness of a speech recognition system, and the need for lip reading systems is increasing for every language. However, recognizing lip movement is a difficult task because the region of interest (ROI) is nonlinear and noisy. Therefore, this paper proposes a method to detect the accurate lip shape and to localize lip movement towards automatic lip tracking by combining the Otsu global thresholding technique and the Moore neighborhood tracing algorithm. The proposed method shows accurate lip localization and tracking, which is useful for speech recognition. In this work, experiments will be carried out on automatic localization of the lip shape for Myanmar consonants using only visual information from lip movements, which is useful for visual speech of the Myanmar language.
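For readers unfamiliar with the first of the two techniques named above, the following is a minimal sketch of Otsu global thresholding, which chooses the grey level that maximizes the between-class variance of the two resulting pixel groups. It operates on a flat list of 8-bit grey values; the function names and the toy pixel data are hypothetical, and the contour-tracing stage is not shown.

```python
def otsu_threshold(pixels):
    """Return the grey level t (0..255) maximizing between-class variance.

    Pixels with value <= t form the dark class, the rest the bright class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_between = 0, -1.0
    w0 = 0      # number of pixels in the dark class so far
    sum0 = 0    # sum of grey values in the dark class so far
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break               # bright class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_between:
            best_between, best_t = between, t
    return best_t


def binarize(pixels, threshold):
    """Map each pixel to 1 (bright) or 0 (dark)."""
    return [1 if p > threshold else 0 for p in pixels]


# Toy example: three dark and three bright pixels.
toy_pixels = [10, 10, 12, 200, 210, 205]
t = otsu_threshold(toy_pixels)
mask = binarize(toy_pixels, t)
```

In a lip localization pipeline, a binary mask of this kind would typically be the input to a boundary-following step such as Moore neighborhood tracing.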

Keywords: lip reading, lip localization, lip tracking, Moore neighborhood tracing algorithm

Procedia PDF Downloads 349
2369 Fusion of Finger Inner Knuckle Print and Hand Geometry Features to Enhance the Performance of Biometric Verification System

Authors: M. L. Anitha, K. A. Radhakrishna Rao

Abstract:

With the advent of modern computing technology, there is an increased demand for recognition systems capable of verifying the identity of individuals. Recognition systems are required by several civilian and commercial applications for providing access to secured resources. Traditional recognition systems based on physical identities are not sufficiently reliable to satisfy security requirements, owing to advances in forgery and identity impersonation methods. Recognizing individuals based on their unique physiological characteristics, known as biometric traits, is a reliable technique, since these traits are not transferable and cannot be stolen or lost. Since the performance of a biometric recognition system depends on the particular trait that is used, the present work proposes a fusion approach that combines the inner knuckle print (IKP) trait of the middle, ring, and index fingers with the geometrical features of the hand. The hand image captured by a digital camera is preprocessed to find the finger IKP as the region of interest (ROI) and the hand geometry features. Geometrical features are represented as the distances between different key points, and IKP features are extracted by applying a local binary pattern descriptor to the IKP ROI. Decision-level AND fusion was adopted, which improved the performance of the combined scheme. The proposed approach is tested on a database collected at our institute. The approach is significant because both hand geometry and IKP features can be extracted from the palm region of the hand. The fusion of these features yields a false acceptance rate of 0.75% and a false rejection rate of 0.86% in the verification tests conducted, which are lower than the results obtained using individual traits.
The results confirm the usefulness of the proposed approach and the suitability of the selected features for developing a biometric recognition system based on features from the palmar region of the hand.
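A short sketch may help make decision-level AND fusion and the two reported error rates concrete. It assumes each matcher produces a similarity score that is thresholded into an accept/reject decision; the thresholds, scores, and function names are hypothetical and are not taken from the paper.

```python
def and_fusion(ikp_accept: bool, geometry_accept: bool) -> bool:
    """Accept a claimed identity only if both matchers accept it."""
    return ikp_accept and geometry_accept


def far_frr(decisions, is_genuine):
    """False acceptance rate over impostor trials, false rejection rate over genuine trials."""
    pairs = list(zip(decisions, is_genuine))
    impostors = sum(1 for _, genuine in pairs if not genuine)
    genuines = sum(1 for _, genuine in pairs if genuine)
    false_accepts = sum(1 for accepted, genuine in pairs if accepted and not genuine)
    false_rejects = sum(1 for accepted, genuine in pairs if not accepted and genuine)
    return false_accepts / impostors, false_rejects / genuines


# Hypothetical trials: (IKP score, hand geometry score, genuine claim?)
trials = [
    (0.9, 0.8, True), (0.7, 0.4, True), (0.9, 0.9, True),
    (0.6, 0.3, False), (0.2, 0.9, False), (0.8, 0.7, False),
]
THRESHOLD = 0.5  # assumed common threshold for both matchers

genuine_flags = [g for _, _, g in trials]
ikp_only = [ikp > THRESHOLD for ikp, _, _ in trials]
fused = [and_fusion(ikp > THRESHOLD, geo > THRESHOLD) for ikp, geo, _ in trials]

far_ikp, frr_ikp = far_frr(ikp_only, genuine_flags)
far_fused, frr_fused = far_frr(fused, genuine_flags)
```

At fixed thresholds, AND fusion can never accept a trial that either matcher rejects, so its false acceptance rate is at most that of each individual matcher, while its false rejection rate can rise. How the thresholds are tuned therefore determines whether both rates improve together.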

Keywords: biometrics, hand geometry features, inner knuckle print, recognition

Procedia PDF Downloads 217
2368 Investigating the Role of Artificial Intelligence in Developing Creativity in Architecture Education in Egypt: A Case Study of Design Studios

Authors: Ahmed Radwan, Ahmed Abdel Ghaney

Abstract:

This paper delves into the transformative potential of artificial intelligence (AI) in fostering creativity within the domain of architecture education, with a specific emphasis on its implications within design studios. The convergence of AI and architectural pedagogy has introduced avenues for redefining the boundaries of creative expression and problem-solving. By harnessing AI-driven tools, students and educators can collaboratively explore a spectrum of design possibilities, stimulate innovative ideation, and engage in multidimensional design processes. This paper investigates the ways in which AI contributes to architectural creativity by facilitating generative design, pattern recognition, virtual reality experiences, and sustainable design optimization. Furthermore, the study examines the balance between AI-enhanced creativity and the preservation of core principles of architectural design/education, ensuring that technology is harnessed to augment rather than replace foundational design skills. Through an exploration of Egypt's architectural heritage and contemporary challenges, this research underscores how AI can synergize with cultural context and historical insights to inspire cutting-edge architectural solutions. By analyzing AI's impact on nurturing creativity among Egyptian architecture students, this paper seeks to contribute to the ongoing discourse on the integration of technology within global architectural education paradigms. It is hoped that this research will guide the thoughtful incorporation of AI in fostering creativity while preserving the authenticity and richness of architectural design education in Egypt and beyond.

Keywords: architecture, artificial intelligence, architecture education, Egypt

Procedia PDF Downloads 72
2367 Humanitarian Emergency of the Refugee Condition for Central American Immigrants in Irregular Situation

Authors: María de los Ángeles Cerda González, Itzel Arriaga Hurtado, Pascacio José Martínez Pichardo

Abstract:

In México, recognition of the refugee condition is a fundamental right that the country, as host State, is obliged to respect, protect, and fulfill for foreigners, including immigrants in an irregular situation, who cannot return to their country of origin for humanitarian reasons. Recognition of the refugee condition as a fundamental right in the Mexican legal system proceeds under these situations: 1. The immigrant applies for the refugee condition, even without the evidence needed to prove the humanitarian character of his departure from his country of origin. 2. The immigrant does not apply for recognition as a refugee because he does not know he has the right to, even if he fits the profile to apply. 3. The immigrant who applies fulfills the requirements of the administrative procedure and has access to refugee recognition. Of the three situations above, only the last is counted in the national indexes of refugee status; the first two demonstrate the inefficiency of the governmental system, seen in its lack of sensitivity resulting from the absence of education in human rights matters, which leads to the legal vulnerability of immigrants in an irregular situation because they do not have access to the procurement and administration of justice. To determine the causes and consequences of the non-recognition of refugee status, this investigation was structured as a systemic analysis whose objective is to show the advances in research on the Central American humanitarian emergency, the Mexican State's actions to protect, respect, and fulfil the fundamental right to refugee status of immigrants in an irregular situation, and the social and legal vulnerabilities suffered by Central Americans in Mexico.
Therefore, to deduce the legal nature of the humanitarian emergency from human rights as a branch of public international law, a conceptual framework is structured using the inductive-deductive method. The problem statement is made from a legal framework in order to approach a theoretical scheme under the theory of social systems, based on an analysis of the lack of communication between the governmental and normative subsystems of the Mexican legal system with respect to the process undertaken by Central American immigrants to achieve recognition of refugee status as a human right. Accordingly, it is determined that fulfilling the State's obligation to grant the right to recognition of the refugee condition would set a guideline for a new stage in Mexican law, because it would extend constitutional benefits to everyone whose right to recognition as a refugee has been denied; as a consequence, a great advance in human rights would be achieved.

Keywords: central American immigrants in irregular situation, humanitarian emergency, human rights, refugee

Procedia PDF Downloads 286
2366 Hand Symbol Recognition Using Canny Edge Algorithm and Convolutional Neural Network

Authors: Harshit Mittal, Neeraj Garg

Abstract:

Hand symbol recognition is a pivotal component in the domain of computer vision, with far-reaching applications spanning sign language interpretation, human-computer interaction, and accessibility. This research paper discusses an approach that integrates the Canny edge algorithm and a convolutional neural network. The significance of this study lies in its potential to enhance communication and accessibility for individuals with hearing impairments or those engaged in gesture-based interactions with technology. In the experiment, the data were collected manually by the authors from a webcam using Python code. To enlarge the dataset, augmentation is applied to the original images, which makes the model more compatible and advanced. Further, the dataset of about 6000 colour images, distributed equally across 5 classes (i.e., 1, 2, 3, 4, 5), is first converted to grayscale and then processed by the Canny edge algorithm with thresholds 1 and 2 both set to 150. After the data are prepared, a convolutional neural network model is trained on them, giving accuracy: 0.97834, precision: 0.97841, recall: 0.9783, and F1 score: 0.97832. For user purposes, a block of code is built in Python to open a window for hand symbol recognition. This research, at its core, seeks to advance the field of computer vision by providing an advanced perspective on hand sign recognition. By leveraging the capabilities of the Canny edge algorithm and a convolutional neural network, this study contributes to the ongoing efforts to create more accurate, efficient, and accessible solutions for individuals with diverse communication needs.
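The four reported figures can be computed from predicted and true class labels. The sketch below assumes macro averaging over the classes (the abstract does not state which averaging was used) and runs on a hypothetical handful of labels rather than the authors' data.

```python
def macro_metrics(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall, macro F1)."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical labels for three of the hand symbol classes.
true_labels = [1, 1, 2, 2, 3]
predicted_labels = [1, 2, 2, 2, 3]
accuracy, precision, recall, f1 = macro_metrics(true_labels, predicted_labels)
```

With classes of equal size, as in the dataset described above, macro recall equals accuracy, which is consistent with the reported accuracy and recall being nearly identical.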

Keywords: hand symbol recognition, computer vision, Canny edge algorithm, convolutional neural network

Procedia PDF Downloads 59
2365 Multimodal Database of Emotional Speech, Video and Gestures

Authors: Tomasz Sapiński, Dorota Kamińska, Adam Pelikant, Egils Avots, Cagri Ozcinar, Gholamreza Anbarjafari

Abstract:

People express emotions through different modalities. Integration of verbal and non-verbal communication channels creates a system in which the message is easier to understand. Expanding the focus to several forms of expression can facilitate research on emotion recognition as well as human-machine interaction. In this article, the authors present a Polish emotional database composed of three modalities: facial expressions, body movement and gestures, and speech. The corpus contains recordings made in studio conditions, acted out by 16 professional actors (8 male and 8 female). The data are labeled with six basic emotion categories, according to Ekman’s emotion categories. To check the quality of performance, all recordings are evaluated by experts and volunteers. The database is available to the academic community and might be useful in the study of audio-visual emotion recognition.

Keywords: body movement, emotion recognition, emotional corpus, facial expressions, gestures, multimodal database, speech

Procedia PDF Downloads 348
2364 An Approach for Vocal Register Recognition Based on Spectral Analysis of Singing

Authors: Aleksandra Zysk, Pawel Badura

Abstract:

Recognizing and controlling vocal registers during singing is a difficult task for a beginner vocalist. It requires, among other things, identifying which part of the natural resonators is being used when a sound propagates through the body. Thus, an application has been designed that provides sound recording, automatic vocal register recognition (VRR), and a graphical user interface with real-time visualization of the signal and recognition results. Six spectral features are determined for each time frame and passed to a support vector machine classifier, which yields a binary decision on the head or chest register assignment of the segment. The classification training and testing data were recorded by ten professional female singers (soprano, aged 19-29) performing sounds in both chest and head register. The classification accuracy exceeded 93% in each of the various validation schemes. Apart from a hard two-class decision, the support vector classifier also returns information on the distance between a particular feature vector and the discrimination hyperplane in the feature space. Such information reflects the level of certainty of the vocal register classification in a fuzzy way. Thus, the designed recognition and training application is able to assess and visualize the continuous trend in singing in a user-friendly graphical mode, providing an easy way to control the vocal emission.
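The idea of turning the distance to the hyperplane into a graded certainty can be sketched as follows. The example assumes a linear decision function with known weights and bias and squashes the signed distance with a logistic function. The kernel, the mapping, and all numbers are assumptions for illustration, not details given in the abstract.

```python
import math


def signed_distance(x, w, b):
    """Signed Euclidean distance from feature vector x to the hyperplane w.x + b = 0."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (dot + b) / norm


def head_register_certainty(x, w, b, scale=1.0):
    """Map the signed distance to (0, 1); 0.5 means the frame lies on the boundary.

    Treating the positive side as the head register is an arbitrary choice here.
    """
    return 1.0 / (1.0 + math.exp(-scale * signed_distance(x, w, b)))


# Hypothetical 2-D example (the application described above uses six spectral features).
weights, bias = [3.0, 4.0], -5.0
far_frame = [3.0, 4.0]        # well inside the positive side
boundary_frame = [0.6, 0.8]   # lies on the hyperplane

d_far = signed_distance(far_frame, weights, bias)
c_far = head_register_certainty(far_frame, weights, bias)
c_boundary = head_register_certainty(boundary_frame, weights, bias)
```

Plotting such a value frame by frame would give the kind of continuous trend the application visualizes, in contrast to a hard head/chest label.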

Keywords: classification, singing, spectral analysis, vocal emission, vocal register

Procedia PDF Downloads 299
2363 ViraPart: A Text Refinement Framework for Automatic Speech Recognition and Natural Language Processing Tasks in Persian

Authors: Narges Farokhshad, Milad Molazadeh, Saman Jamalabbasi, Hamed Babaei Giglou, Saeed Bibak

Abstract:

The Persian language is an inflectional subject-object-verb language. This fact makes Persian a more uncertain language. However, using techniques such as Zero-Width Non-Joiner (ZWNJ) recognition, punctuation restoration, and Persian Ezafe construction leads to a more understandable and precise language. In most work on Persian, these techniques are addressed individually. Despite that, we believe that all of these tasks are necessary for text refinement in Persian. In this work, we propose the ViraPart framework, which uses embedded ParsBERT at its core for text clarification. First, we used the BERT variant for Persian followed by a classifier layer for the classification procedures. Next, we combined the models' outputs to produce clear text. In the end, the proposed models for ZWNJ recognition, punctuation restoration, and Persian Ezafe construction achieve averaged macro F1 scores of 96.90%, 92.13%, and 98.50%, respectively. Experimental results show that our proposed approach is very effective in text refinement for the Persian language.

Keywords: Persian Ezafe, punctuation, ZWNJ, NLP, ParsBERT, transformers

Procedia PDF Downloads 209
2362 Cross Attention Fusion for Dual-Stream Speech Emotion Recognition

Authors: Shaode Yu, Jiajian Meng, Bing Zhu, Hang Yu, Qiurui Sun

Abstract:

Speech emotion recognition (SER) aims to recognize subjective human emotions through in-depth analysis of audio data. How to comprehensively extract emotional information from speech audio and how to effectively fuse the extracted features remain challenging. This paper presents a dual-stream SER framework that embraces both full training and transfer learning of different networks for thorough feature encoding. In addition, a plug-and-play cross-attention fusion (CAF) module is implemented for the valid integration of the dual-stream encoder output. The effectiveness of the proposed CAF module is compared with three other fusion modules (feature summation, feature concatenation, and feature-wise linear modulation) on two databases (RAVDESS and IEMOCAP) using different dual-stream encoders (full training network, DPCNN or TextRCNN; transfer learning network, HuBERT or Wav2Vec2). Experimental results suggest that the CAF module can effectively reconcile conflicts between features from different encoders and outperforms the other three feature fusion modules on the SER task. In the future, the plug-and-play CAF module can be extended for multi-branch feature fusion, and the dual-stream SER framework can be widened for multi-stream data representation to improve recognition performance and generalization capacity.
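The abstract does not describe the internals of the CAF module, so the following is only a generic sketch of scaled dot-product cross-attention between two encoder outputs, without learned projections. Each stream is a list of feature vectors of equal width; the residual sum at the end and all names are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """For each query vector, return an attention-weighted average of the value vectors."""
    dim = len(queries[0])
    attended = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        attended.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return attended


def fuse(stream_a, stream_b):
    """Let stream A attend to stream B, then add the result back to A (residual sum)."""
    from_b = cross_attention(stream_a, stream_b, stream_b)
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(stream_a, from_b)]


# Hypothetical encoder outputs: 2 frames from one encoder, 1 frame from the other.
trained_stream = [[1.0, 0.0], [0.0, 1.0]]
pretrained_stream = [[2.0, 3.0]]
fused = fuse(trained_stream, pretrained_stream)
```

Unlike plain summation or concatenation, the attention weights depend on the content of both streams, which is the property the comparison in the abstract is testing.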

Keywords: speech emotion recognition, cross-attention fusion, dual-stream, pre-trained

Procedia PDF Downloads 70
2361 Generative Adversarial Network Based Fingerprint Anti-Spoofing Limitations

Authors: Yehjune Heo

Abstract:

Fingerprint anti-spoofing approaches have been actively developed and applied in real-world applications. One of the main problems of fingerprint anti-spoofing is that it is not robust to unseen samples, especially in real-world scenarios. A possible solution is to generate artificial but realistic fingerprint samples and use them for training in order to achieve good generalization. This paper presents experimental and comparative results with currently popular GAN-based methods and uses realistic synthesized fingerprints in training in order to increase performance. Among various GAN models, the most popular, StyleGAN, is used for the experiments. The CNN models were first trained with a dataset that did not contain generated fake images, and the accuracy and the mean average error rate were recorded. Then, the generated fake images (fake images of live fingerprints and fake images of spoof fingerprints) were each combined with the original images (real images of live fingerprints and real images of spoof fingerprints), and various CNN models were trained. For each CNN model trained with a dataset containing generated fake images, the best performance was recorded, along with the accuracy and the mean average error rate each time. We observe that current GAN-based approaches need significant improvement in anti-spoofing performance, although the overall quality of the synthesized fingerprints seems reasonable. We include an analysis of this performance degradation, especially with a small number of samples. In addition, we suggest several approaches towards improved generalization with a small number of samples, focusing on what GAN-based approaches should and should not learn.

Keywords: anti-spoofing, CNN, fingerprint recognition, GAN

Procedia PDF Downloads 180
2360 Data Augmentation for Early-Stage Lung Nodules Using Deep Image Prior and Pix2pix

Authors: Qasim Munye, Juned Islam, Haseeb Qureshi, Syed Jung

Abstract:

Lung nodules are commonly identified in computed tomography (CT) scans by experienced radiologists at a relatively late stage. Early diagnosis can greatly increase survival. We propose using a pix2pix conditional generative adversarial network to generate realistic images simulating early-stage lung nodule growth. We applied deep image prior to 2341 slices from 895 computed tomography (CT) scans from the Lung Image Database Consortium (LIDC) dataset to generate pseudo-healthy medical images. From these images, 819 were chosen to train a pix2pix network. We observed that for most of the images, the pix2pix network was able to generate images where the nodule increased in size and intensity across epochs. To evaluate the images, 400 generated images were chosen at random and shown to a medical student beside their corresponding original image. Of these 400 generated images, 384 were judged satisfactory, meaning they resembled a nodule and were visually similar to the corresponding image. We believe that this generated dataset could be used as training data for neural networks to detect lung nodules at an early stage or to improve the accuracy of such networks. This is particularly significant as datasets containing the growth of early-stage nodules are scarce. This project shows that the combination of deep image prior and generative models could potentially open the door to creating larger datasets than currently possible and has the potential to increase the accuracy of medical classification tasks.

Keywords: medical technology, artificial intelligence, radiology, lung cancer

Procedia PDF Downloads 64
2359 Algorithm for Path Recognition in-between Tree Rows for Agricultural Wheeled-Mobile Robots

Authors: Anderson Rocha, Pedro Miguel de Figueiredo Dinis Oliveira Gaspar

Abstract:

Machine vision has been widely used in recent years in agriculture, as a tool to promote the automation of processes and increase the levels of productivity. The aim of this work is the development of a path recognition algorithm based on image processing to guide a terrestrial robot in-between tree rows. The proposed algorithm was developed using the software MATLAB, and it uses several image processing operations, such as threshold detection, morphological erosion, histogram equalization and the Hough transform, to find edge lines along tree rows on an image and to create a path to be followed by a mobile robot. To develop the algorithm, a set of images of different types of orchards was used, which made possible the construction of a method capable of identifying paths between trees of different heights and aspects. The algorithm was evaluated using several images with different characteristics of quality and the results showed that the proposed method can successfully detect a path in different types of environments.
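
The Hough transform step named in the abstract above can be illustrated with a minimal sketch. This is not the authors' MATLAB implementation: it is a plain-Python illustration that assumes the thresholding and erosion steps have already produced a binary edge image, and the function name hough_strongest_line is hypothetical.

```python
import math


def hough_strongest_line(edges, n_theta=180):
    """Return (rho, theta_degrees) of the line that collects the most votes.

    edges is a 2D list of 0/1 values (assumed output of earlier thresholding).
    A line is parameterised as rho = x*cos(theta) + y*sin(theta),
    with x the column index and y the row index.
    """
    # Collect the coordinates of all edge pixels once.
    points = [(x, y)
              for y, row in enumerate(edges)
              for x, value in enumerate(row) if value]
    votes = {}
    for theta_idx in range(n_theta):
        theta = math.pi * theta_idx / n_theta
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        for x, y in points:
            # Each edge pixel votes for the (rho, theta) cell it lies on.
            rho = round(x * cos_t + y * sin_t)
            key = (rho, theta_idx)
            votes[key] = votes.get(key, 0) + 1
    (rho, theta_idx), _ = max(votes.items(), key=lambda item: item[1])
    return rho, 180.0 * theta_idx / n_theta
```

In a row-following setting, one would typically keep the two strongest near-vertical lines (one per tree row) and steer along their midline; that selection logic is left out here because the abstract does not describe it.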

Keywords: agricultural mobile robot, image processing, path recognition, hough transform

Procedia PDF Downloads 143
2358 Deep Learning Application for Object Image Recognition and Robot Automatic Grasping

Authors: Shiuh-Jer Huang, Chen-Zon Yan, C. K. Huang, Chun-Chien Ting

Abstract:

Because vision systems are in strong demand for autonomous operation in industrial environments, image recognition has become an important research topic. Here, a deep learning algorithm is employed in the vision system to recognize industrial objects and is integrated with a 7A6 Series Manipulator for an automatic object gripping task. A PC and a Graphics Processing Unit (GPU) are chosen to construct the 3D Vision Recognition System. A depth camera (Intel RealSense SR300) is employed to capture the image for object recognition and coordinate derivation. The YOLOv2 scheme is adopted in a convolutional neural network (CNN) structure for object classification and center point prediction. Additionally, an image processing strategy is used to find the object contour for calculating the object orientation angle. Then, the specified object location and orientation information are sent to the robotic controller. Finally, a six-axis manipulator can grasp the specific object in a random environment based on the user command and the extracted image information. The experimental results show that YOLOv2 has been successfully employed to detect the object location and category with confidence near 0.9 and a 3D position error of less than 0.4 mm. It is useful for future intelligent robotic applications in an Industry 4.0 environment.
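
The abstract above does not say how the orientation angle is derived from the object contour. One common option, shown here only as a hedged sketch, is to use second-order central image moments of a binary object mask. The function name orientation_degrees and the use of a filled mask instead of a contour are assumptions for illustration.

```python
import math


def orientation_degrees(mask):
    """Estimate the principal-axis angle of a binary mask, in degrees.

    Uses theta = 0.5 * atan2(2*mu11, mu20 - mu02), where mu_pq are
    central moments. x is the column index and y is the row index,
    so the angle is measured in array coordinates (y grows downward).
    """
    points = [(x, y)
              for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    mu20 = sum((x - mean_x) ** 2 for x, _ in points)
    mu02 = sum((y - mean_y) ** 2 for _, y in points)
    mu11 = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return math.degrees(0.5 * math.atan2(2 * mu11, mu20 - mu02))
```

The angle would then be combined with the detected center point before being sent to the robot controller; converting from image coordinates to the manipulator frame needs a camera calibration that is outside this sketch.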

Keywords: deep learning, image processing, convolution neural network, YOLOv2, 7A6 series manipulator

Procedia PDF Downloads 242
2357 'Value-Based Re-Framing' in Identity-Based Conflicts: A Skill for Mediators in Multi-Cultural Societies

Authors: Hami-Ziniman Revital, Ashwall Rachelly

Abstract:

The conflict resolution realm has developed tremendously during the last half-decade. Three main approaches should be mentioned. Alternative Dispute Resolution (ADR), which suggests processes such as Arbitration or Interests-based Negotiation, was developed as an answer to obligations and rights-based conflicts. The Pragmatic mediation approach focuses on the gap between the interests and needs of disputants. The Transformative mediation approach focuses on relations and suits identity-based conflicts. In the current study, we examine the conflictual relations between religious and non-religious Jews in Israel and the impact of three transformative mechanisms on the relations between the participants: Inter-group recognition, In-group empowerment, and Value-based reframing. The research was conducted during four facilitated joint mediation classes. A unique finding emerged. Using both transformative mechanisms and the Contact Hypothesis criteria, we identify a transformation in participants’ relations and a considerable change from anger, alienation, and suspiciousness to increased understanding, affection, and interpersonal concern towards the out-group members. Intergroup Recognition, In-group empowerment, and Value-based reframing were the skills discovered as the main enablers of the change in relations, and they fostered the research participants’ mutual recognition of the out-group values and identity-based issues. We conclude that this transformation was possible due to constant intergroup contact, based on the Contact Hypothesis criteria. In addition, as Interests-based mediation uses “Reframing” as a skill to acknowledge both mutual and opposite needs of the disputants, we suggest the use of “Value-based Reframing” in intergroup identity-based conflicts, as a skill that contributes to the empowerment and the recognition of both mutual and different out-group values. We propose implementing these insights and skills to assist conflict resolution facilitators in various intergroup identity-based conflict resolution efforts and to establish further research and knowledge.

Keywords: empowerment, identity-based conflict, intergroup recognition, intergroup relations, mediation skills, multi-cultural society, reframing, value-based recognition

Procedia PDF Downloads 338
2356 Facial Recognition Technology in Institutions of Higher Learning: Exploring the Use in Kenya

Authors: Samuel Mwangi, Josephine K. Mule

Abstract:

Access control as a security technique regulates who or what can access resources. It is a fundamental concept in security that minimizes risks to the institutions that use access control. Regulating access to institutions of higher learning is key to ensure only authorized personnel and students are allowed into the institutions. The use of biometrics has been criticized due to the setup and maintenance costs, hygiene concerns, and trepidations regarding data privacy, among other apprehensions. Facial recognition is arguably a fast and accurate way of validating identity in order to guard protected areas. It guarantees that only authorized individuals gain access to secure locations while requiring far less personal information whilst providing an additional layer of security beyond keys, fobs, or identity cards. This exploratory study sought to investigate the use of facial recognition in controlling access in institutions of higher learning in Kenya. The sample population was drawn from both private and public higher learning institutions. The data is based on responses from staff and students. Questionnaires were used for data collection and follow up interviews conducted to understand responses from the questionnaires. 80% of the sampled population indicated that there were many security breaches by unauthorized people, with some resulting in terror attacks. These security breaches were attributed to stolen identity cases, where staff or student identity cards were stolen and used by criminals to access the institutions. These unauthorized accesses have resulted in losses to the institutions, including reputational damages. The findings indicate that security breaches are a major problem in institutions of higher learning in Kenya. Consequently, access control would be beneficial if employed to curb security breaches. We suggest the use of facial recognition technology, given its uniqueness in identifying users and its non-repudiation capabilities.

Keywords: facial recognition, access control, technology, learning

Procedia PDF Downloads 122
2355 AI-Powered Models for Real-Time Fraud Detection in Financial Transactions to Improve Financial Security

Authors: Shanshan Zhu, Mohammad Nasim

Abstract:

Financial fraud continues to be a major threat to financial institutions across the world, causing colossal monetary losses and undermining public trust. Fraud prevention techniques based on hard rules have become ineffective due to the evolving patterns of fraud in recent times. Against this background, the present study examines distinct methodologies that exploit emerging AI-driven techniques to further strengthen fraud detection. We compare the performance of generative adversarial networks and graph neural networks with other popular techniques, such as gradient boosting, random forests, and neural networks. To this end, we recommend integrating these state-of-the-art models into one robust, flexible, and smart system for real-time anomaly and fraud detection. To overcome the challenge, we designed synthetic data and then conducted pattern recognition and unsupervised and supervised learning analyses on the transaction data to identify which activities were suspicious. Using actual financial statistics, we compare the performance of our model with conventional models in terms of accuracy, speed, and adaptability. The results of this study indicate a strong need to integrate state-of-the-art, AI-driven fraud detection solutions into frameworks that are highly relevant to the financial domain. They also point to the urgency with which banks and related financial institutions must implement these advanced technologies to maintain a high level of security.

Keywords: AI-driven fraud detection, financial security, machine learning, anomaly detection, real-time fraud detection

Procedia PDF Downloads 29
2354 Face Recognition Using Eigen Faces Algorithm

Authors: Shweta Pinjarkar, Shrutika Yawale, Mayuri Patil, Reshma Adagale

Abstract:

Face recognition is a technique that can be applied to a wide variety of problems, such as image and film processing, human-computer interaction, and criminal identification. This has motivated researchers to develop computational models for identifying faces that are easy and simple to implement. This paper demonstrates a face recognition system on an Android device using eigenfaces. The system can be used as the basis for the development of human identity recognition. Test images and training images are taken directly with the camera of the Android device. The test results showed that the system produces high accuracy. The goal is to implement a model for a particular face and distinguish it from a large number of stored faces. The face recognition system detects the faces in a picture taken by a web camera or digital camera, and these images are then checked against a training image dataset based on descriptive features. Further, this algorithm can be extended to recognize the facial expressions of a person. Recognition could be carried out under widely varying conditions, such as frontal view, scaled frontal view, and subjects with spectacles. The algorithm models real-time varying lighting conditions. The implemented system is able to perform real-time face detection and face recognition, and can give feedback by showing a window with the subject's information from the database and sending an e-mail notification to interested institutions using an Android application.
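
The eigenface idea described above (project faces onto the principal components of the training set, then match by distance) can be sketched as follows. This is a generic illustration using NumPy, not the authors' Android implementation; the function names train_eigenfaces and recognize are hypothetical, and faces are assumed to be flattened grayscale images of equal size.

```python
import numpy as np


def train_eigenfaces(faces, n_components):
    """Compute the mean face, the top eigenfaces, and training weights.

    faces is an (n_samples, n_pixels) array of flattened images.
    """
    mean_face = faces.mean(axis=0)
    centered = faces - mean_face
    # Rows of vt are the principal directions of the centered data,
    # i.e. the eigenfaces, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    eigenfaces = vt[:n_components]
    weights = centered @ eigenfaces.T
    return mean_face, eigenfaces, weights


def recognize(face, mean_face, eigenfaces, weights, labels):
    """Return the label of the training face closest in eigenface space."""
    query = (face - mean_face) @ eigenfaces.T
    distances = np.linalg.norm(weights - query, axis=1)
    return labels[int(np.argmin(distances))]
```

A real system would add face detection and cropping before this step and a distance threshold to reject unknown faces; neither is specified in the abstract, so they are omitted.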

Keywords: face detection, face recognition, eigen faces, algorithm

Procedia PDF Downloads 355
2353 “Laws Drifting Off While Artificial Intelligence Thriving” – A Comparative Study with Special Reference to Computer Science and Information Technology

Authors: Amarendar Reddy Addula

Abstract:

Definition of Artificial Intelligence: Artificial intelligence is the simulation of human intelligence processes by machines, especially computer systems. Specific applications of AI include expert systems, natural language processing, speech recognition, and machine vision. Artificial Intelligence (AI) is an original medium for digital business, according to a new report by Gartner. The last 10 years represent an advance period in AI’s development, spurred by the confluence of factors, including the rise of big data, advancements in compute infrastructure, new machine learning techniques, the emergence of cloud computing, and the vibrant open-source ecosystem. The extension of AI to a broader set of use cases and users is gaining popularity because it improves AI’s versatility, efficiency, and adaptability. Edge AI will enable digital moments by employing AI for real-time analytics closer to data sources. Gartner predicts that by 2025, more than 50% of all data analysis by deep neural networks will occur at the edge, up from less than 10% in 2021. Responsible AI is an umbrella term for making appropriate business and ethical choices when adopting AI. It requires considering business and societal value, risk, trust, transparency, fairness, bias mitigation, explainability, accountability, safety, privacy, and regulatory compliance. Responsible AI is ever more significant amidst growing regulatory oversight, consumer expectations, and rising sustainability goals. Generative AI is the use of AI to generate new artifacts and produce innovative products. To date, generative AI efforts have concentrated on creating media content such as photorealistic images of people and things, but it can also be used for code generation, creating synthetic tabular data, and designing pharmaceuticals and materials with specific properties. AI is the subject of a wide-ranging debate in which there is a growing concern about its ethical and legal aspects. Frequently, the two are mixed and confused despite being different issues and areas of knowledge. The ethical debate raises two main problems: the first, conceptual, relates to the idea and content of ethics; the second, functional, concerns its relationship with the law. Both set up models of social behavior, but they are different in scope and nature. The juridical analysis is grounded on a non-formalistic scientific methodology. This means that it is essential to consider the nature and characteristics of AI as a primary step to the definition of its legal paradigm. In this regard, there are two main issues: the relationship between artificial and human intelligence, and the question of the unitary or diverse nature of AI. From that theoretical and practical base, the study of the legal system is carried out by examining its foundations, the governance model, and the regulatory bases. According to this analysis, throughout the work and in the conclusions, International Law is identified as the principal legal framework for the regulation of AI.

Keywords: artificial intelligence, ethics & human rights issues, laws, international laws

Procedia PDF Downloads 92
2352 Burnout Recognition for Call Center Agents by Using Skin Color Detection with Hand Poses

Authors: El Sayed A. Sharara, A. Tsuji, K. Terada

Abstract:

Call centers have been expanding, and they increasingly influence activity in various markets. Call center work is known as one of the most demanding and stressful jobs. In this paper, we propose a fatigue detection system to detect burnout of call center agents in cases of neck pain and upper back pain. Our proposed system is based on a computer vision technique that combines skin color detection with the Viola-Jones object detector. To recognize hand pose gestures that are signs of stress, the YCbCr color space is used to detect the skin color region, including the face and hand poses around the areas related to neck ache and upper back pain. A cascade of classifiers by Viola-Jones is used for face recognition to extract the face from the skin color region. Hand poses are detected, and neck pain and upper back pain are evaluated, using the skin color detection and face recognition methods. System performance is evaluated using two groups of datasets created in the laboratory to simulate a call center environment. Our call center agent burnout detection system has been implemented using a web camera and processed in MATLAB. In the experimental results, our system achieved 96.3% for upper back pain detection and 94.2% for neck pain detection.
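
Skin color detection in the YCbCr color space, as mentioned above, can be sketched in a few lines. The conversion uses the standard full-range BT.601 (JPEG) formulas; the Cb and Cr threshold ranges below are commonly cited default values and are an assumption here, since the abstract does not report the thresholds the authors used. The function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range YCbCr (BT.601 / JPEG convention)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def is_skin(r, g, b, cb_range=(77, 127), cr_range=(133, 173)):
    """Classify a pixel as skin if its chroma falls inside both ranges.

    The default ranges are assumed, illustrative values; luminance (Y)
    is ignored so the test is less sensitive to brightness.
    """
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return (cb_range[0] <= cb <= cb_range[1]
            and cr_range[0] <= cr <= cr_range[1])


def skin_mask(image):
    """Build a boolean mask for an image given as rows of (r, g, b) tuples."""
    return [[is_skin(*pixel) for pixel in row] for row in image]
```

In a pipeline like the one described, the face region found by the Viola-Jones detector would be removed from this mask so that the remaining skin regions near the neck and upper back can be treated as hand candidates.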

Keywords: call center agents, fatigue, skin color detection, face recognition

Procedia PDF Downloads 293
2351 Development of an EEG-Based Real-Time Emotion Recognition System on Edge AI

Authors: James Rigor Camacho, Wansu Lim

Abstract:

Over the last few years, the development of new wearable and processing technologies has accelerated in order to harness physiological data such as electroencephalograms (EEGs) for EEG-based applications. EEG has been demonstrated to be a source of emotion recognition signals with the highest classification accuracy among physiological signals. However, when emotion recognition systems are used for real-time classification, the training unit is frequently left to run offline or in the cloud rather than working locally on the edge. That strategy has hampered research, and the full potential of using an edge AI device has yet to be realized. Edge AI devices are high-performance computers that can process complex algorithms. They are capable of collecting, processing, and storing data on their own. They can also analyze and apply complicated algorithms such as localization, detection, and recognition in a real-time application, making them powerful embedded devices. The NVIDIA Jetson series, specifically the Jetson Nano device, was used in the implementation. The cEEGrid, which is integrated with the open-source brain-computer interface platform (OpenBCI), is used to collect EEG signals. An EEG-based real-time emotion recognition system on Edge AI is proposed in this paper. To perform graphical spectrogram categorization of EEG signals and to predict emotional states based on input data properties, machine learning-based classifiers were used. To identify the emotional state, the EEG signals were analyzed using the K-Nearest Neighbor (KNN) technique, which is a supervised learning method. In EEG signal processing, after each EEG signal has been received in real time and translated from the time domain to the frequency domain, the Fast Fourier Transform (FFT) technique is utilized to observe the frequency bands in each EEG signal. To appropriately show the variance of each EEG frequency band, power density, standard deviation, and mean are calculated and employed. The next stage is to identify the features that have been chosen to predict emotion in EEG data using the KNN technique. Arousal and valence datasets are used to train the parameters defined by the KNN technique. Because classification and recognition of specific classes, as well as emotion prediction, are conducted both online and locally on the edge, the KNN technique increased the performance of the emotion recognition system on the NVIDIA Jetson Nano. Finally, this implementation aims to bridge the research gap on cost-effective and efficient real-time emotion recognition using a resource-constrained hardware device, like the NVIDIA Jetson Nano. On the cutting edge of AI, EEG-based emotion identification can be employed in applications that can rapidly expand the research and implementation industry's use.
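
The processing chain described above (frequency-domain band power as features, then KNN classification) can be sketched as follows. This is a simplified stdlib-only illustration, not the authors' Jetson Nano implementation: it computes a plain discrete Fourier transform instead of an optimized FFT, the band limits in BANDS are conventional textbook ranges assumed for illustration, and the function names are hypothetical.

```python
import cmath
import math
from collections import Counter

# Assumed conventional EEG band limits in Hz (not taken from the abstract).
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}


def band_powers(signal, fs):
    """Sum spectral power per EEG band for a signal sampled at fs Hz.

    Uses a direct DFT for clarity; an FFT gives the same coefficients faster.
    """
    n = len(signal)
    powers = {name: 0.0 for name in BANDS}
    for k in range(1, n // 2):
        freq = k * fs / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        power = abs(coeff) ** 2 / n
        for name, (low, high) in BANDS.items():
            if low <= freq < high:
                powers[name] += power
    return powers


def knn_predict(train_x, train_y, query, k=3):
    """Predict a label by majority vote among the k nearest training points."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

A feature vector for the classifier could be built from the band powers (optionally with the mean and standard deviation mentioned in the abstract), with arousal or valence labels as the training targets.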

Keywords: edge AI device, EEG, emotion recognition system, supervised learning algorithm, sensors

Procedia PDF Downloads 103
2350 GenAI Agents in Product Management: A Case Study from the Manufacturing Sector

Authors: Aron Witkowski, Andrzej Wodecki

Abstract:

Purpose: This study aims to explore the feasibility and effectiveness of utilizing Generative Artificial Intelligence (GenAI) agents as product managers within the manufacturing sector. It seeks to evaluate whether current GenAI capabilities can fulfill the complex requirements of product management and deliver comparable outcomes to human counterparts. Study Design/Methodology/Approach: This research involved the creation of a support application for product managers, utilizing high-quality sources on product management and generative AI technologies. The application was designed to assist in various aspects of product management tasks. To evaluate its effectiveness, a study was conducted involving 10 experienced product managers from the manufacturing sector. These professionals were tasked with using the application and providing feedback on the tool's responses to common questions and challenges they encounter in their daily work. The study employed a mixed-methods approach, combining quantitative assessments of the tool's performance with qualitative interviews to gather detailed insights into the user experience and perceived value of the application. Findings: The findings reveal that GenAI-based product management agents exhibit significant potential in handling routine tasks, data analysis, and predictive modeling. However, there are notable limitations in areas requiring nuanced decision-making, creativity, and complex stakeholder interactions. The case study demonstrates that while GenAI can augment human capabilities, it is not yet fully equipped to independently manage the holistic responsibilities of a product manager in the manufacturing sector. Originality/Value: This research provides an analysis of GenAI's role in product management within the manufacturing industry, contributing to the limited body of literature on the application of GenAI agents in this domain. 
It offers practical insights into the current capabilities and limitations of GenAI, helping organizations make informed decisions about integrating AI into their product management strategies. Implications for Academic and Practical Fields: For academia, the study suggests new avenues for research in AI-human collaboration and the development of advanced AI systems capable of higher-level managerial functions. Practically, it provides industry professionals with a nuanced understanding of how GenAI can be leveraged to enhance product management, guiding investments in AI technologies and training programs to bridge identified gaps.

Keywords: generative artificial intelligence, GenAI, NPD, new product development, product management, manufacturing

Procedia PDF Downloads 45
2349 Towards Logical Inference for the Arabic Question-Answering

Authors: Wided Bakari, Patrice Bellot, Omar Trigui, Mahmoud Neji

Abstract:

This article opens a line of thinking on the modeling and analysis of Arabic texts in the context of a question-answering system. The aim is to go beyond traditional approaches focused on morphosyntactic analysis. We present a new approach that analyzes a text in order to extract correct answers and then transforms it into logical predicates. In addition, we would like to represent different levels of information within a text to answer a question and choose an answer among several proposed. To do so, we transform both the question and the text into logical forms. Then, we try to recognize all entailments between them. The result of recognizing the entailments is a set of text sentences that can imply the user’s question. Our work is now concentrated on an implementation step in order to develop a question-answering system in Arabic using techniques to recognize textual implications. In this context, the extraction of text features (keywords, named entities, and the relationships that link them) is considered the first step in our process of text modeling. The second step is the use of textual implication techniques that rely on the notion of inference and logic representation to extract candidate answers. The last step is the extraction and selection of the desired answer.

Keywords: NLP, Arabic language, question-answering, recognition text entailment, logic forms

Procedia PDF Downloads 338