Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 2672

Search results for: generative named entity recognition

2642 Use of Generative Adversarial Networks (GANs) in Neuroimaging and Clinical Neuroscience Applications

Abstract:

GANs are a potent form of deep learning models that have found success in various fields. They are part of the larger group of generative techniques, which aim to produce authentic data using a probabilistic model that learns distributions from actual samples. In clinical settings, GANs have demonstrated improved abilities in capturing spatially intricate, nonlinear, and possibly subtle disease impacts in contrast to conventional generative techniques. This review critically evaluates the current research on how GANs are being used in imaging studies of different neurological conditions like Alzheimer's disease, brain tumors, aging of the brain, and multiple sclerosis. We offer a clear explanation of different GAN techniques for each use case in neuroimaging and delve into the key hurdles, unanswered queries, and potential advancements in utilizing GANs in this field. Our goal is to connect advanced deep learning techniques with neurology studies, showcasing how GANs can assist in clinical decision-making and enhance our comprehension of the structural and functional aspects of brain disorders.

Keywords: GAN, pathology, generative adversarial network, neuro imaging

Procedia PDF Downloads 21

2641 Generating Swarm Satellite Data Using Long Short-Term Memory and Generative Adversarial Networks for the Detection of Seismic Precursors

Authors: Yaxin Bi

Abstract:

Accurate prediction and understanding of the evolution mechanisms of earthquakes remain challenging in the fields of geology, geophysics, and seismology. This study leverages Long Short-Term Memory (LSTM) networks and Generative Adversarial Networks (GANs), a generative model tailored to time-series data, for generating synthetic time series data based on Swarm satellite data, which will be used for detecting seismic anomalies. LSTMs demonstrated commendable predictive performance in generating synthetic data across multiple countries. In contrast, the GAN models struggled to generate synthetic data, often producing non-informative values, although they were able to capture the data distribution of the time series. These findings highlight both the promise and challenges associated with applying deep learning techniques to generate synthetic data, underscoring the potential of deep learning in generating synthetic electromagnetic satellite data.

Keywords: LSTM, GAN, earthquake, synthetic data, generative AI, seismic precursors

Procedia PDF Downloads 26

2640 Facial Recognition on the Basis of Facial Fragments

Authors: Tetyana Baydyk, Ernst Kussul, Sandra Bonilla Meza

Abstract:

There are many articles that attempt to establish the role of different facial fragments in face recognition. Various approaches are used to estimate this role. Frequently, authors calculate the entropy corresponding to the fragment. This approach can only give approximate estimation. In this paper, we propose to use a more direct measure of the importance of different fragments for face recognition. We propose to select a recognition method and a face database and experimentally investigate the recognition rate using different fragments of faces. We present two such experiments in the paper. We selected the PCNC neural classifier as a method for face recognition and parts of the LFW (Labeled Faces in the Wild) face database as training and testing sets. The recognition rate of the best experiment is comparable with the recognition rate obtained using the whole face.

Keywords: face recognition, labeled faces in the wild (LFW) database, random local descriptor (RLD), random features

Procedia PDF Downloads 352

2639 Adversarial Disentanglement Using Latent Classifier for Pose-Independent Representation

Authors: Hamed Alqahtani, Manolya Kavakli-Thorne

Abstract:

The large pose discrepancy is one of the critical challenges in face recognition during video surveillance. Due to the entanglement of pose attributes with identity information, the conventional approaches for pose-independent representation lack in providing quality results in recognizing largely posed faces. In this paper, we propose a practical approach to disentangle the pose attribute from the identity information followed by synthesis of a face using a classifier network in latent space. The proposed approach employs a modified generative adversarial network framework consisting of an encoder-decoder structure embedded with a classifier in manifold space for carrying out factorization on the latent encoding. It can be further generalized to other face and non-face attributes for real-life video frames containing faces with significant attribute variations. Experimental results and comparison with state of the art in the field prove that the learned representation of the proposed approach synthesizes more compelling perceptual images through a combination of adversarial and classification losses.

Keywords: disentanglement, face detection, generative adversarial networks, video surveillance

Procedia PDF Downloads 120

2638 Artificial Intelligence for Generative Modelling

Authors: Shryas Bhurat, Aryan Vashistha, Sampreet Dinakar Nayak, Ayush Gupta

Abstract:

As the technology is advancing more towards high computational resources, there is a paradigm shift in the usage of these resources to optimize the design process. This paper discusses the usage of ‘Generative Design using Artificial Intelligence’ to build better models that adapt the operations like selection, mutation, and crossover to generate results. The human mind thinks of the simplest approach while designing an object, but the intelligence learns from the past & designs the complex optimized CAD Models. Generative Design takes the boundary conditions and comes up with multiple solutions with iterations to come up with a sturdy design with the most optimal parameter that is given, saving huge amounts of time & resources. The new production techniques that are at our disposal allow us to use additive manufacturing, 3D printing, and other innovative manufacturing techniques to save resources and design artistically engineered CAD Models. Also, this paper discusses the Genetic Algorithm, the Non-Domination technique to choose the right results using biomimicry that has evolved for current habitation for millions of years. The computer uses parametric models to generate newer models using an iterative approach & uses cloud computing to store these iterative designs. The later part of the paper compares the topology optimization technology with Generative Design that is previously being used to generate CAD Models. Finally, this paper shows the performance of algorithms and how these algorithms help in designing resource-efficient models.

Keywords: genetic algorithm, bio mimicry, generative modeling, non-dominant techniques

Procedia PDF Downloads 143

2637 A Neuron Model of Facial Recognition and Detection of an Authorized Entity Using Machine Learning System

Authors: J. K. Adedeji, M. O. Oyekanmi

Abstract:

This paper has critically examined the use of Machine Learning procedures in curbing unauthorized access into valuable areas of an organization. The use of passwords, pin codes, user’s identification in recent times has been partially successful in curbing crimes involving identities, hence the need for the design of a system which incorporates biometric characteristics such as DNA and pattern recognition of variations in facial expressions. The facial model used is the OpenCV library which is based on the use of certain physiological features, the Raspberry Pi 3 module is used to compile the OpenCV library, which extracts and stores the detected faces into the datasets directory through the use of camera. The model is trained with 50 epoch run in the database and recognized by the Local Binary Pattern Histogram (LBPH) recognizer contained in the OpenCV. The training algorithm used by the neural network is back propagation coded using python algorithmic language with 200 epoch runs to identify specific resemblance in the exclusive OR (XOR) output neurons. The research however confirmed that physiological parameters are better effective measures to curb crimes relating to identities.

Keywords: biometric characters, facial recognition, neural network, OpenCV

Procedia PDF Downloads 249

2636 DBN-Based Face Recognition System Using Light Field

Authors: Bing Gu

Abstract:

Abstract—Most of Conventional facial recognition systems are based on image features, such as LBP, SIFT. Recently some DBN-based 2D facial recognition systems have been proposed. However, we find there are few DBN-based 3D facial recognition system and relative researches. 3D facial images include all the individual biometric information. We can use these information to build more accurate features, So we present our DBN-based face recognition system using Light Field. We can see Light Field as another presentation of 3D image, and Light Field Camera show us a way to receive a Light Field. We use the commercially available Light Field Camera to act as the collector of our face recognition system, and the system receive a state-of-art performance as convenient as conventional 2D face recognition system.

Keywords: DBN, face recognition, light field, Lytro

Procedia PDF Downloads 459

2635 Wearable Music: Generation of Costumes from Music and Generative Art and Wearing Them by 3-Way Projectors

Authors: Noriki Amano

Abstract:

The final goal of this study is to create another way in which people enjoy music through the performance of 'Wearable Music'. Concretely speaking, we generate colorful costumes in real- time from music and to realize their dressing by projecting them to a person. For this purpose, we propose three methods in this study. First, a method of giving color to music in a three-dimensionally way. Second, a method of generating images of costumes from music. Third, a method of wearing the images of music. In particular, this study stands out from other related work in that we generate images of unique costumes from music and realize to wear them. In this study, we use the technique of generative arts to generate images of unique costumes and project the images to the fog generated around a person from 3-way using projectors. From this study, we can get how to enjoy music as 'wearable'. Furthermore, we are also able to have the prospect of unconventional entertainment based on the fusion between music and costumes.

Keywords: entertainment computing, costumes, music, generative programming

Procedia PDF Downloads 168

2634 Artificial Intelligence and the Next Generation Journalistic Practice: Prospects, Issues and Challenges

Authors: Shola Abidemi Olabode

Abstract:

The technological revolution over the years has impacted journalistic practice. As a matter of fact, journalistic practice has evolved alongside technologies of every generation transforming news and reporting, entertainment, and politics. Alongside these developments, the emergence of new kinds of risks and harms associated with generative AI has become rife with implications for media and journalism. Despite their numerous benefits for research and development, generative AI technologies like ChatGPT introduce new practical, ethical, and regulatory complexities in the practice of media and journalism. This paper presents a preliminary overview of the new kinds of challenges and issues for journalism and media practice in the era of generative AI, the implications for Nigeria, and invites a consideration of methods to mitigate the evolving complexity. It draws mainly on desk-based research underscoring the literature in both developed and developing non-western contexts as a contribution to knowledge.

Keywords: AI, journalism, media, online harms

Procedia PDF Downloads 73

2633 A Named Data Networking Stack for Contiki-NG-OS

Authors: Sedat Bilgili, Alper K. Demir

Abstract:

The current Internet has become the dominant use with continuing growth in the home, medical, health, smart cities and industrial automation applications. Internet of Things (IoT) is an emerging technology to enable such applications in our lives. Moreover, Named Data Networking (NDN) is also emerging as a Future Internet architecture where it fits the communication needs of IoT networks. The aim of this study is to provide an NDN protocol stack implementation running on the Contiki operating system (OS). Contiki OS is an OS that is developed for constrained IoT devices. In this study, an NDN protocol stack that can work on top of IEEE 802.15.4 link and physical layers have been developed and presented.

Keywords: internet of things (IoT), named-data, named data networking (NDN), operating system

Procedia PDF Downloads 166

2632 Crafting Robust Business Model Innovation Path with Generative Artificial Intelligence in Start-up SMEs

Authors: Ignitia Motjolopane

Abstract:

Small and medium enterprises (SMEs) play an important role in economies by contributing to economic growth and employment. In the fourth industrial revolution, the convergence of technologies and the changing nature of work created pressures on economies globally. Generative artificial intelligence (AI) may support SMEs in exploring, exploiting, and transforming business models to align with their growth aspirations. SMEs' growth aspirations fall into four categories: subsistence, income, growth, and speculative. Subsistence-oriented firms focus on meeting basic financial obligations and show less motivation for business model innovation. SMEs focused on income, growth, and speculation are more likely to pursue business model innovation to support growth strategies. SMEs' strategic goals link to distinct business model innovation paths depending on whether SMEs are starting a new business, pursuing growth, or seeking profitability. Integrating generative artificial intelligence in start-up SME business model innovation enhances value creation, user-oriented innovation, and SMEs' ability to adapt to dynamic changes in the business environment. The existing literature may lack comprehensive frameworks and guidelines for effectively integrating generative AI in start-up reiterative business model innovation paths. This paper examines start-up business model innovation path with generative artificial intelligence. A theoretical approach is used to examine start-up-focused SME reiterative business model innovation path with generative AI. Articulating how generative AI may be used to support SMEs to systematically and cyclically build the business model covering most or all business model components and analyse and test the BM's viability throughout the process. As such, the paper explores generative AI usage in market exploration. Moreover, market exploration poses unique challenges for start-ups compared to established companies due to a lack of extensive customer data, sales history, and market knowledge. Furthermore, the paper examines the use of generative AI in developing and testing viable value propositions and business models. In addition, the paper looks into identifying and selecting partners with generative AI support. Selecting the right partners is crucial for start-ups and may significantly impact success. The paper will examine generative AI usage in choosing the right information technology, funding process, revenue model determination, and stress testing business models. Stress testing business models validate strong and weak points by applying scenarios and evaluating the robustness of individual business model components and the interrelation between components. Thus, the stress testing business model may address these uncertainties, as misalignment between an organisation and its environment has been recognised as the leading cause of company failure. Generative AI may be used to generate business model stress-testing scenarios. The paper is expected to make a theoretical and practical contribution to theory and approaches in crafting a robust business model innovation path with generative artificial intelligence in start-up SMEs.

Keywords: business models, innovation, generative AI, small medium enterprises

Procedia PDF Downloads 65

2631 LuMee: A Centralized Smart Protector for School Children who are Using Online Education

Authors: Lumindu Dilumka, Ranaweera I. D., Sudusinghe S. P., Sanduni Kanchana A. M. K.

Abstract:

This study was motivated by the challenges experienced by parents and guardians in ensuring the safety of children in cyberspace. In the last two or three years, online education has become very popular all over the world due to the Covid 19 pandemic. Therefore, parents, guardians and teachers must ensure the safety of children in cyberspace. Children are more likely to go astray and there are plenty of online programs are waiting to get them on the wrong track and also, children who are engaging in the online education can be distracted at any moment. Therefore, parents should keep a close check on their children's online activity. Apart from that, due to the unawareness of children, they tempt to share their sensitive information, causing a chance of being a victim of phishing attacks from outsiders. These problems can be overcome through the proposed web-based system. We use feature extraction, web tracking and analysis mechanisms, image processing and name entity recognition to implement this web-based system.

Keywords: online education, cyber bullying, social media, face recognition, web tracker, privacy data

Procedia PDF Downloads 80

2630 Image Inpainting Model with Small-Sample Size Based on Generative Adversary Network and Genetic Algorithm

Authors: Jiawen Wang, Qijun Chen

Abstract:

The performance of most machine-learning methods for image inpainting depends on the quantity and quality of the training samples. However, it is very expensive or even impossible to obtain a great number of training samples in many scenarios. In this paper, an image inpainting model based on a generative adversary network (GAN) is constructed for the cases when the number of training samples is small. Firstly, a feature extraction network (F-net) is incorporated into the GAN network to utilize the available information of the inpainting image. The weighted sum of the extracted feature and the random noise acts as the input to the generative network (G-net). The proposed network can be trained well even when the sample size is very small. Secondly, in the phase of the completion for each damaged image, a genetic algorithm is designed to search an optimized noise input for G-net; based on this optimized input, the parameters of the G-net and F-net are further learned (Once the completion for a certain damaged image ends, the parameters restore to its original values obtained in the training phase) to generate an image patch that not only can fill the missing part of the damaged image smoothly but also has visual semantics.

Keywords: image inpainting, generative adversary nets, genetic algorithm, small-sample size

Procedia PDF Downloads 125

2629 Generative Behaviors and Psychological Well-Being in Mexican Elders

Authors: Ana L. Gonzalez-Celis, Edgardo Ruiz-Carrillo, Karina Reyes-Jarquin, Margarita Chavez-Becerra

Abstract:

Since recent decades, the aging has been viewed from a more positive perspective, where is not only about losses and damage, but also about being on a stage where you can enjoy life and live with well-being and quality of life. The challenge to feel better is to find those resources that seniors have. For that reason, psychological well-being has shown interest in the study of the affect and life satisfaction (hedonic well-being), while from a more recent tradition, focus on the development of capabilities and the personal growth, considering both as the main indicators of the quality of life. A resource that can be used in the later age is generativity, which refers to the ability of older people to develop and grow through activities that contribute with the improvement of the context in which they live and participate. In this way the generative interest is understood as a favourable attitude that contribute to the common benefit while strengthening and enriching the social institutions, to ensure continuity between generations and social development. On the other hand, generative behavior, differentiating from generative interest, is the expression of that attitude reflected in activities that make a social contribution and a benefit for generations to come. Hence the purpose of the research was to test if there is an association between the generative behaviour type and the psychological well-being with their dimensions. For this reason 188 Mexican adults from 60 to 94 years old (M = 69.78), 67% women, 33% men, completed two instruments: The Ryff’s Well-Being Scales to measure psychological well-being with 39 items with two dimensions (Hedonic and Eudaimonic well-being), and the Loyola’s Generative Behaviors Scale, grouped in five categories: Knowledge transmitted to the next generation, things to be remember, creativity, be productive, contribution to the community, and responsibility of other people. In addition, the socio-demographic data sheet was tested, and self-reported health status. The results indicated that the psychological well-being and its dimensions were significantly associated with the presence of generative behavior, where the level of well-being was higher when the frequency of some generative behaviour excelled; finding that the behavior with greater psychological well-being (M = 81.04, SD = 8.18) was "things to be remembered"; while with greater hedonic well-being (M = 73.39, SD = 12.19) was the behavior "responsibility of other people"; and with greater Eudaimonic well-being (M = 84.61, SD = 6.63), was the behavior "things to be remembered”. The most important findings highlight the importance of generative behaviors in adulthood, finding empirical evidence that the generativity in the last stage of life is associated with well-being. However, by finding differences in the types of generative behaviors at the level of well-being, is proposed the idea that generativity is not situated as an isolated construct, but needs other contextualized and related constructs that can simultaneously operate at different levels, taking into account the relationship between the environment and the individual, encompassing both the social and psychological dimension.

Keywords: eudaimonic well-being, generativity, hedonic well-being, Mexican elders, psychological well-being

Procedia PDF Downloads 264

2628 Face Tracking and Recognition Using Deep Learning Approach

Authors: Degale Desta, Cheng Jian

Abstract:

The most important factor in identifying a person is their face. Even identical twins have their own distinct faces. As a result, identification and face recognition are needed to tell one person from another. A face recognition system is a verification tool used to establish a person's identity using biometrics. Nowadays, face recognition is a common technique used in a variety of applications, including home security systems, criminal identification, and phone unlock systems. This system is more secure because it only requires a facial image instead of other dependencies like a key or card. Face detection and face identification are the two phases that typically make up a human recognition system.The idea behind designing and creating a face recognition system using deep learning with Azure ML Python's OpenCV is explained in this paper. Face recognition is a task that can be accomplished using deep learning, and given the accuracy of this method, it appears to be a suitable approach. To show how accurate the suggested face recognition system is, experimental results are given in 98.46% accuracy using Fast-RCNN Performance of algorithms under different training conditions.

Keywords: deep learning, face recognition, identification, fast-RCNN

Procedia PDF Downloads 134

2627 Audio-Visual Co-Data Processing Pipeline

Authors: Rita Chattopadhyay, Vivek Anand Thoutam

Abstract:

Speech is the most acceptable means of communication where we can quickly exchange our feelings and thoughts. Quite often, people can communicate orally but cannot interact or work with computers or devices. It’s easy and quick to give speech commands than typing commands to computers. In the same way, it’s easy listening to audio played from a device than extract output from computers or devices. Especially with Robotics being an emerging market with applications in warehouses, the hospitality industry, consumer electronics, assistive technology, etc., speech-based human-machine interaction is emerging as a lucrative feature for robot manufacturers. Considering this factor, the objective of this paper is to design the “Audio-Visual Co-Data Processing Pipeline.” This pipeline is an integrated version of Automatic speech recognition, a Natural language model for text understanding, object detection, and text-to-speech modules. There are many Deep Learning models for each type of the modules mentioned above, but OpenVINO Model Zoo models are used because the OpenVINO toolkit covers both computer vision and non-computer vision workloads across Intel hardware and maximizes performance, and accelerates application development. A speech command is given as input that has information about target objects to be detected and start and end times to extract the required interval from the video. Speech is converted to text using the Automatic speech recognition QuartzNet model. The summary is extracted from text using a natural language model Generative Pre-Trained Transformer-3 (GPT-3). Based on the summary, essential frames from the video are extracted, and the You Only Look Once (YOLO) object detection model detects You Only Look Once (YOLO) objects on these extracted frames. Frame numbers that have target objects (specified objects in the speech command) are saved as text. Finally, this text (frame numbers) is converted to speech using text to speech model and will be played from the device. This project is developed for 80 You Only Look Once (YOLO) labels, and the user can extract frames based on only one or two target labels. This pipeline can be extended for more than two target labels easily by making appropriate changes in the object detection module. This project is developed for four different speech command formats by including sample examples in the prompt used by Generative Pre-Trained Transformer-3 (GPT-3) model. Based on user preference, one can come up with a new speech command format by including some examples of the respective format in the prompt used by the Generative Pre-Trained Transformer-3 (GPT-3) model. This pipeline can be used in many projects like human-machine interface, human-robot interaction, and surveillance through speech commands. All object detection projects can be upgraded using this pipeline so that one can give speech commands and output is played from the device.

Keywords: OpenVINO, automatic speech recognition, natural language processing, object detection, text to speech

Procedia PDF Downloads 75

2626 Enhancing Residential Architecture through Generative Design: Balancing Aesthetics, Legal Constraints, and Environmental Considerations

Authors: Milena Nanova, Radul Shishkov, Damyan Damov, Martin Georgiev

Abstract:

This research paper presents an in-depth exploration of the use of generative design in urban residential architecture, with a dual focus on aligning aesthetic values with legal and environmental constraints. The study aims to demonstrate how generative design methodologies can innovate residential building designs that are not only legally compliant and environmentally conscious but also aesthetically compelling. At the core of our research is a specially developed generative design framework tailored for urban residential settings. This framework employs computational algorithms to produce diverse design solutions, meticulously balancing aesthetic appeal with practical considerations. By integrating site-specific features, urban legal restrictions, and environmental factors, our approach generates designs that resonate with the unique character of urban landscapes while adhering to regulatory frameworks. The paper places emphasis on algorithmic implementation of the logical constraint and intricacies in residential architecture by exploring the potential of generative design to create visually engaging and contextually harmonious structures. This exploration also contains an analysis of how these designs align with legal building parameters, showcasing the potential for creative solutions within the confines of urban building regulations. Concurrently, our methodology integrates functional, economic, and environmental factors. We investigate how generative design can be utilized to optimize buildings' performance, considering them, aiming to achieve a symbiotic relationship between the built environment and its natural surroundings. Through a blend of theoretical research and practical case studies, this research highlights the multifaceted capabilities of generative design and demonstrates practical applications of our framework. Our findings illustrate the rich possibilities that arise from an algorithmic design approach in the context of a vibrant urban landscape. This study contributes an alternative perspective to residential architecture, suggesting that the future of urban development lies in embracing the complex interplay between computational design innovation, regulatory adherence, and environmental responsibility.

Keywords: generative design, computational design, parametric design, algorithmic modeling

Procedia PDF Downloads 49

2625 Semi-Supervised Outlier Detection Using a Generative and Adversary Framework

Authors: Jindong Gu, Matthias Schubert, Volker Tresp

Abstract:

In many outlier detection tasks, only training data belonging to one class, i.e., the positive class, is available. The task is then to predict a new data point as belonging either to the positive class or to the negative class, in which case the data point is considered an outlier. For this task, we propose a novel corrupted Generative Adversarial Network (CorGAN). In the adversarial process of training CorGAN, the Generator generates outlier samples for the negative class, and the Discriminator is trained to distinguish the positive training data from the generated negative data. The proposed framework is evaluated using an image dataset and a real-world network intrusion dataset. Our outlier-detection method achieves state-of-the-art performance on both tasks.

Keywords: one-class classification, outlier detection, generative adversary networks, semi-supervised learning

Procedia PDF Downloads 144

2624 Comparing Emotion Recognition from Voice and Facial Data Using Time Invariant Features

Authors: Vesna Kirandziska, Nevena Ackovska, Ana Madevska Bogdanova

Abstract:

The problem of emotion recognition is a challenging problem. It is still an open problem from the aspect of both intelligent systems and psychology. In this paper, both voice features and facial features are used for building an emotion recognition system. A Support Vector Machine classifiers are built by using raw data from video recordings. In this paper, the results obtained for the emotion recognition are given, and a discussion about the validity and the expressiveness of different emotions is presented. A comparison between the classifiers build from facial data only, voice data only and from the combination of both data is made here. The need for a better combination of the information from facial expression and voice data is argued.

Keywords: emotion recognition, facial recognition, signal processing, machine learning

Procedia PDF Downloads 312

2623 Possibilities, Challenges and the State of the Art of Automatic Speech Recognition in Air Traffic Control

Authors: Van Nhan Nguyen, Harald Holone

Abstract:

Over the past few years, a lot of research has been conducted to bring Automatic Speech Recognition (ASR) into various areas of Air Trafﬁc Control (ATC), such as air trafﬁc control simulation and training, monitoring live operators for with the aim of safety improvements, air trafﬁc controller workload measurement and conducting analysis on large quantities controller-pilot speech. Due to the high accuracy requirements of the ATC context and its unique challenges, automatic speech recognition has not been widely adopted in this ﬁeld. With the aim of providing a good starting point for researchers who are interested bringing automatic speech recognition into ATC, this paper gives an overview of possibilities and challenges of applying automatic speech recognition in air trafﬁc control. To provide this overview, we present an updated literature review of speech recognition technologies in general, as well as speciﬁc approaches relevant to the ATC context. Based on this literature review, criteria for selecting speech recognition approaches for the ATC domain are presented, and remaining challenges and possible solutions are discussed.

Keywords: automatic speech recognition, asr, air traffic control, atc

Procedia PDF Downloads 390

2622 A Contribution to Human Activities Recognition Using Expert System Techniques

Authors: Malika Yaici, Soraya Aloui, Sara Semchaoui

Abstract:

This paper deals with human activity recognition from sensor data. It is an active research area, and the main objective is to obtain a high recognition rate. In this work, a recognition system based on expert systems is proposed; the recognition is performed using the objects, object states, and gestures and taking into account the context (the location of the objects and of the person performing the activity, the duration of the elementary actions and the activity). The system recognizes complex activities after decomposing them into simple, easy-to-recognize activities. The proposed method can be applied to any type of activity. The simulation results show the robustness of our system and its speed of decision.

Keywords: human activity recognition, ubiquitous computing, context-awareness, expert system

Procedia PDF Downloads 102

2621 Switching to the Latin Alphabet in Kazakhstan: A Brief Overview of Character Recognition Methods

Authors: Ainagul Yermekova, Liudmila Goncharenko, Ali Baghirzade, Sergey Sybachin

Abstract:

In this article, we address the problem of Kazakhstan's transition to the Latin alphabet. The transition process started in 2017 and is scheduled to be completed in 2025. In connection with these events, the problem of recognizing the characters of the new alphabet is raised. Well-known character recognition programs such as ABBYY FineReader, FormReader, MyScript Stylus did not recognize specific Kazakh letters that were used in Cyrillic. The author tries to give an assessment of the well-known method of character recognition that could be in demand as part of the country's transition to the Latin alphabet. Three methods of character recognition: template, structured, and feature-based, are considered through the algorithms of operation. At the end of the article, a general conclusion is made about the possibility of applying a certain method to a particular recognition process: for example, in the process of population census, recognition of typographic text in Latin, or recognition of photos of car numbers, store signs, etc.

Keywords: text detection, template method, recognition algorithm, structured method, feature method

Procedia PDF Downloads 178

2620 Recognizing an Individual, Their Topic of Conversation and Cultural Background from 3D Body Movement

Authors: Gheida J. Shahrour, Martin J. Russell

Abstract:

The 3D body movement signals captured during human-human conversation include clues not only to the content of people’s communication but also to their culture and personality. This paper is concerned with automatic extraction of this information from body movement signals. For the purpose of this research, we collected a novel corpus from 27 subjects, arranged them into groups according to their culture. We arranged each group into pairs and each pair communicated with each other about different topics. A state-of-art recognition system is applied to the problems of person, culture, and topic recognition. We borrowed modeling, classification, and normalization techniques from speech recognition. We used Gaussian Mixture Modeling (GMM) as the main technique for building our three systems, obtaining 77.78%, 55.47%, and 39.06% from the person, culture, and topic recognition systems respectively. In addition, we combined the above GMM systems with Support Vector Machines (SVM) to obtain 85.42%, 62.50%, and 40.63% accuracy for person, culture, and topic recognition respectively. Although direct comparison among these three recognition systems is difficult, it seems that our person recognition system performs best for both GMM and GMM-SVM, suggesting that inter-subject differences (i.e. subject’s personality traits) are a major source of variation. When removing these traits from culture and topic recognition systems using the Nuisance Attribute Projection (NAP) and the Intersession Variability Compensation (ISVC) techniques, we obtained 73.44% and 46.09% accuracy from culture and topic recognition systems respectively.

Keywords: person recognition, topic recognition, culture recognition, 3D body movement signals, variability compensation

Procedia PDF Downloads 535

2619 Reed: An Approach Towards Quickly Bootstrapping Multilingual Acoustic Models

Authors: Bipasha Sen, Aditya Agarwal

Abstract:

Multilingual automatic speech recognition (ASR) system is a single entity capable of transcribing multiple languages sharing a common phone space. Performance of such a system is highly dependent on the compatibility of the languages. State of the art speech recognition systems are built using sequential architectures based on recurrent neural networks (RNN) limiting the computational parallelization in training. This poses a significant challenge in terms of time taken to bootstrap and validate the compatibility of multiple languages for building a robust multilingual system. Complex architectural choices based on self-attention networks are made to improve the parallelization thereby reducing the training time. In this work, we propose Reed, a simple system based on 1D convolutions which uses very short context to improve the training time. To improve the performance of our system, we use raw time-domain speech signals directly as input. This enables the convolutional layers to learn feature representations rather than relying on handcrafted features such as MFCC. We report improvement on training and inference times by atleast a factor of 4x and 7.4x respectively with comparable WERs against standard RNN based baseline systems on SpeechOcean's multilingual low resource dataset.

Keywords: convolutional neural networks, language compatibility, low resource languages, multilingual automatic speech recognition

Procedia PDF Downloads 116

2618 Human Activities Recognition Based on Expert System

Authors: Malika Yaici, Soraya Aloui, Sara Semchaoui

Abstract:

Recognition of human activities from sensor data is an active research area, and the main objective is to obtain a high recognition rate. In this work, we propose a recognition system based on expert systems. The proposed system makes the recognition based on the objects, object states, and gestures, taking into account the context (the location of the objects and of the person performing the activity, the duration of the elementary actions, and the activity). This work focuses on complex activities which are decomposed into simple easy to recognize activities. The proposed method can be applied to any type of activity. The simulation results show the robustness of our system and its speed of decision.

Keywords: human activity recognition, ubiquitous computing, context-awareness, expert system

Procedia PDF Downloads 133

2617 Surface to the Deeper: A Universal Entity Alignment Approach Focusing on Surface Information

Authors: Zheng Baichuan, Li Shenghui, Li Bingqian, Zhang Ning, Chen Kai

Abstract:

Entity alignment (EA) tasks in knowledge graphs often play a pivotal role in the integration of knowledge graphs, where structural differences often exist between the source and target graphs, such as the presence or absence of attribute information and the types of attribute information (text, timestamps, images, etc.). However, most current research efforts are focused on improving alignment accuracy, often along with an increased reliance on specific structures -a dependency that inevitably diminishes their practical value and causes difficulties when facing knowledge graph alignment tasks with varying structures. Therefore, we propose a universal knowledge graph alignment approach that only utilizes the common basic structures shared by knowledge graphs. We have demonstrated through experiments that our method achieves state-of-the-art performance in fair comparisons.

Keywords: knowledge graph, entity alignment, transformer, deep learning

Procedia PDF Downloads 37

2616 Enhanced Face Recognition with Daisy Descriptors Using 1BT Based Registration

Authors: Sevil Igit, Merve Meric, Sarp Erturk

Abstract:

In this paper, it is proposed to improve Daisy descriptor based face recognition using a novel One-Bit Transform (1BT) based pre-registration approach. The 1BT based pre-registration procedure is fast and has low computational complexity. It is shown that the face recognition accuracy is improved with the proposed approach. The proposed approach can facilitate highly accurate face recognition using DAISY descriptor with simple matching and thereby facilitate a low-complexity approach.

Keywords: face recognition, Daisy descriptor, One-Bit Transform, image registration

Procedia PDF Downloads 358

2615 Management and Agreement Protocol in Computer Security

Authors: Abdulameer K. Hussain

Abstract:

When dealing with a cryptographic system we note that there are many activities performed by parties of this cryptographic system and the most prominent of these activities is the process of agreement between the parties involved in the cryptographic system on how to deal and perform the cryptographic system tasks to be more secure, more confident and reliable. The most common agreement among parties is a key agreement and other types of agreements. Despite the fact that there is an attempt from some quarters to find other effective agreement methods but these methods are limited to the traditional agreements. This paper presents different parameters to perform more effectively the task of the agreement, including the key alternative, the agreement on the encryption method used and the agreement to prevent the denial of the services. To manage and achieve these goals, this method proposes the existence of an control and monitoring entity to manage these agreements by collecting different statistical information of the opinions of the authorized parties in the cryptographic system. These statistics help this entity to take the proper decision about the agreement factors. This entity is called Agreement Manager (AM).

Keywords: agreement parameters, key agreement, key exchange, security management

Procedia PDF Downloads 415

2614 Generative Adversarial Network for Bidirectional Mappings between Retinal Fundus Images and Vessel Segmented Images

Authors: Haoqi Gao, Koichi Ogawara

Abstract:

Retinal vascular segmentation of color fundus is the basis of ophthalmic computer-aided diagnosis and large-scale disease screening systems. Early screening of fundus diseases has great value for clinical medical diagnosis. The traditional methods depend on the experience of the doctor, which is time-consuming, labor-intensive, and inefficient. Furthermore, medical images are scarce and fraught with legal concerns regarding patient privacy. In this paper, we propose a new Generative Adversarial Network based on CycleGAN for retinal fundus images. This method can generate not only synthetic fundus images but also generate corresponding segmentation masks, which has certain application value and challenge in computer vision and computer graphics. In the results, we evaluate our proposed method from both quantitative and qualitative. For generated segmented images, our method achieves dice coefficient of 0.81 and PR of 0.89 on DRIVE dataset. For generated synthetic fundus images, we use ”Toy Experiment” to verify the state-of-the-art performance of our method.

Keywords: retinal vascular segmentations, generative ad-versarial network, cyclegan, fundus images

Procedia PDF Downloads 138

2613 From Shallow Semantic Representation to Deeper One: Verb Decomposition Approach

Authors: Aliaksandr Huminski

Abstract:

Semantic Role Labeling (SRL) as shallow semantic parsing approach includes recognition and labeling arguments of a verb in a sentence. Verb participants are linked with specific semantic roles (Agent, Patient, Instrument, Location, etc.). Thus, SRL can answer on key questions such as ‘Who’, ‘When’, ‘What’, ‘Where’ in a text and it is widely applied in dialog systems, question-answering, named entity recognition, information retrieval, and other fields of NLP. However, SRL has the following flaw: Two sentences with identical (or almost identical) meaning can have different semantic role structures. Let consider 2 sentences: (1) John put butter on the bread. (2) John buttered the bread. SRL for (1) and (2) will be significantly different. For the verb put in (1) it is [Agent + Patient + Goal], but for the verb butter in (2) it is [Agent + Goal]. It happens because of one of the most interesting and intriguing features of a verb: Its ability to capture participants as in the case of the verb butter, or their features as, say, in the case of the verb drink where the participant’s feature being liquid is shared with the verb. This capture looks like a total fusion of meaning and cannot be decomposed in direct way (in comparison with compound verbs like babysit or breastfeed). From this perspective, SRL looks really shallow to represent semantic structure. If the key point in semantic representation is an opportunity to use it for making inferences and finding hidden reasons, it assumes by default that two different but semantically identical sentences must have the same semantic structure. Otherwise we will have different inferences from the same meaning. To overcome the above-mentioned flaw, the following approach is suggested. Assume that: P is a participant of relation; F is a feature of a participant; Vcp is a verb that captures a participant; Vcf is a verb that captures a feature of a participant; Vpr is a primitive verb or a verb that does not capture any participant and represents only a relation. In another word, a primitive verb is a verb whose meaning does not include meanings from its surroundings. Then Vcp and Vcf can be decomposed as: Vcp = Vpr +P; Vcf = Vpr +F. If all Vcp and Vcf will be represented this way, then primitive verbs Vpr can be considered as a canonical form for SRL. As a result of that, there will be no hidden participants caught by a verb since all participants will be explicitly unfolded. An obvious example of Vpr is the verb go, which represents pure movement. In this case the verb drink can be represented as man-made movement of liquid into specific direction. Extraction and using primitive verbs for SRL create a canonical representation unique for semantically identical sentences. It leads to the unification of semantic representation. In this case, the critical flaw related to SRL will be resolved.

Keywords: decomposition, labeling, primitive verbs, semantic roles

Procedia PDF Downloads 360