Search results for: spatio-temporal action recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4142

Search results for: spatio-temporal action recognition

3842 Clinical Outcomes and Symptom Management in Pediatric Patients Following Eczema Action Plans: A Quality Improvement Project

Authors: Karla Lebedoff, Susan Walsh, Michelle Bain

Abstract:

Eczema is a chronic atopy condition requiring long-term daily management in children. Written action plans for other chronic atopic conditions, such as asthma and food allergies, are widely recommended and distributed to pediatric patients' parents and caregivers, seeking to improve clinical outcomes and become empowered to manage the patient's ever-changing symptoms. Written action plans for eczema, referred to as "asthma of the skin," are not routinely used in practice. Parents of children suffering from eczema rarely receive a written action plan to follow, and commendations supporting eczema action plans are inconsistent. Pediatric patients between birth and 18 years old who were followed for eczema at an urban Midwest community hospital were eligible to participate in this quality improvement project. At the initial visit, parents received instructions on individualized eczema action plans for their child and completed two validated surveys: Health Confidence Score (HCS) and Patient-Oriented Eczema Measure (POEM). Pre- and post-survey responses were collected, and clinical symptom presentation at follow-up were outcome determinants. Project implementation was guided by Institute for Healthcare Improvement's Step-up Framework and the Plan-Do-Study-Act cycle. This project measured clinical outcomes and parent confidence in self-management of their child's eczema symptoms with the responses from 26 participant surveys. Pre-survey responses were collected from 36 participants, though ten were lost to follow-up. Average POEM scores improved by 53%, while average HCS scores remained unchanged. Of seven completed in-person follow-up visits, six clinical progress notes documented improvement. Individualized eczema action plans can be seamlessly incorporated into primary and specialty care visits for pediatric patients suffering from eczema. Following a patient-specific eczema action plan may lessen the daily physical and mental burdens of uncontrolled eczema for children and parents, managing symptoms that chronically flare and recede. Furthermore, incorporating eczema action plans into practice potentially reduces the likely underestimated $5.3 billion economic disease burden of eczema on the U.S. healthcare system.

Keywords: atopic dermatitis, eczema action plan, eczema symptom management, pediatric eczema

Procedia PDF Downloads 132
3841 An Automatic Speech Recognition of Conversational Telephone Speech in Malay Language

Authors: M. Draman, S. Z. Muhamad Yassin, M. S. Alias, Z. Lambak, M. I. Zulkifli, S. N. Padhi, K. N. Baharim, F. Maskuriy, A. I. A. Rahim

Abstract:

The performance of Malay automatic speech recognition (ASR) system for the call centre environment is presented. The system utilizes Kaldi toolkit as the platform to the entire library and algorithm used in performing the ASR task. The acoustic model implemented in this system uses a deep neural network (DNN) method to model the acoustic signal and the standard (n-gram) model for language modelling. With 80 hours of training data from the call centre recordings, the ASR system can achieve 72% of accuracy that corresponds to 28% of word error rate (WER). The testing was done using 20 hours of audio data. Despite the implementation of DNN, the system shows a low accuracy owing to the varieties of noises, accent and dialect that typically occurs in Malaysian call centre environment. This significant variation of speakers is reflected by the large standard deviation of the average word error rate (WERav) (i.e., ~ 10%). It is observed that the lowest WER (13.8%) was obtained from recording sample with a standard Malay dialect (central Malaysia) of native speaker as compared to 49% of the sample with the highest WER that contains conversation of the speaker that uses non-standard Malay dialect.

Keywords: conversational speech recognition, deep neural network, Malay language, speech recognition

Procedia PDF Downloads 318
3840 Treatment and Diagnostic Imaging Methods of Fetal Heart Function in Radiology

Authors: Mahdi Farajzadeh Ajirlou

Abstract:

Prior evidence of normal cardiac anatomy is desirable to relieve the anxiety of cases with a family history of congenital heart disease or to offer the option of early gestation termination or close follow-up should a cardiac anomaly be proved. Fetal heart discovery plays an important part in the opinion of the fetus, and it can reflect the fetal heart function of the fetus, which is regulated by the central nervous system. Acquisition of ventricular volume and inflow data would be useful to quantify more valve regurgitation and ventricular function to determine the degree of cardiovascular concession in fetal conditions at threat for hydrops fetalis. This study discusses imaging the fetal heart with transvaginal ultrasound, Doppler ultrasound, three-dimensional ultrasound (3DUS) and four-dimensional (4D) ultrasound, spatiotemporal image correlation (STIC), glamorous resonance imaging and cardiac catheterization. Doppler ultrasound (DUS) image is a kind of real- time image with a better imaging effect on blood vessels and soft tissues. DUS imaging can observe the shape of the fetus, but it cannot show whether the fetus is hypoxic or distressed. Spatiotemporal image correlation (STIC) enables the acquisition of a volume of data concomitant with the beating heart. The automated volume accession is made possible by the array in the transducer performing a slow single reach, recording a single 3D data set conforming to numerous 2D frames one behind the other. The volume accession can be done in a stationary 3D, either online 4D (direct volume scan, live 3D ultrasound or a so-called 4D (3D/ 4D)), or either spatiotemporal image correlation-STIC (off-line 4D, which is a circular volume check-up). Fetal cardiovascular MRI would appear to be an ideal approach to the noninvasive disquisition of the impact of abnormal cardiovascular hemodynamics on antenatal brain growth and development. Still, there are practical limitations to the use of conventional MRI for fetal cardiovascular assessment, including the small size and high heart rate of the mortal fetus, the lack of conventional cardiac gating styles to attend data accession, and the implicit corruption of MRI data due to motherly respiration and unpredictable fetal movements. Fetal cardiac MRI has the implicit to complement ultrasound in detecting cardiovascular deformations and extracardiac lesions. Fetal cardiac intervention (FCI), minimally invasive catheter interventions, is a new and evolving fashion that allows for in-utero treatment of a subset of severe forms of congenital heart deficiency. In special cases, it may be possible to modify the natural history of congenital heart disorders. It's entirely possible that future generations will ‘repair’ congenital heart deficiency in utero using nanotechnologies or remote computer-guided micro-robots that work in the cellular layer.

Keywords: fetal, cardiac MRI, ultrasound, 3D, 4D, heart disease, invasive, noninvasive, catheter

Procedia PDF Downloads 32
3839 Design and Development of Automatic Onion Harvester

Authors: P. Revathi, T. Mrunalini, K. Padma Priya, P. Ramya, R. Saranya

Abstract:

During the tough times of covid, those people who were hospitalized found it difficult to always convey what they wanted to or needed to the attendee. Sometimes the attendees might also not be there. In that case, the patients can use simple hand gestures to control electrical appliances (like its set it for a zero watts bulb)and three other gestures for voice note intimation. In this AI-based hand recognition project, NodeMCU is used for the control action of the relay, and it is connected to the firebase for storing the value in the cloud and is interfaced with the python code via raspberry pi. For three hand gestures, a voice clip is added for intimation to the attendee. This is done with the help of Google’s text to speech and the inbuilt audio file option in the raspberry pi 4. All the 5 gestures will be detected when shown with their hands via a webcam which is placed for gesture detection. A personal computer is used for displaying the gestures and for running the code in the raspberry pi imager.

Keywords: onion harvesting, automatic pluging, camera, raspberry pi

Procedia PDF Downloads 194
3838 Local Image Features Emerging from Brain Inspired Multi-Layer Neural Network

Authors: Hui Wei, Zheng Dong

Abstract:

Object recognition has long been a challenging task in computer vision. Yet the human brain, with the ability to rapidly and accurately recognize visual stimuli, manages this task effortlessly. In the past decades, advances in neuroscience have revealed some neural mechanisms underlying visual processing. In this paper, we present a novel model inspired by the visual pathway in primate brains. This multi-layer neural network model imitates the hierarchical convergent processing mechanism in the visual pathway. We show that local image features generated by this model exhibit robust discrimination and even better generalization ability compared with some existing image descriptors. We also demonstrate the application of this model in an object recognition task on image data sets. The result provides strong support for the potential of this model.

Keywords: biological model, feature extraction, multi-layer neural network, object recognition

Procedia PDF Downloads 538
3837 Optimization of Marine Waste Collection Considering Dynamic Transport and Ship’s Wake Impact

Authors: Guillaume Richard, Sarra Zaied

Abstract:

Marine waste quantities increase more and more, 5 million tons of plastic waste enter the ocean every year. Their spatiotemporal distribution is never homogeneous and depends mainly on the hydrodynamic characteristics of the environment, as well as the size and location of the waste. As part of optimizing collect of marine plastic wastes, it is important to measure and monitor their evolution over time. In this context, diverse studies have been dedicated to describing waste behavior in order to identify its accumulation in ocean areas. None of the existing tools which track objects at sea had the objective of tracking down a slick of waste. Moreover, the applications related to marine waste are in the minority compared to rescue applications or oil slicks tracking applications. These approaches are able to accurately simulate an object's behavior over time but not during the collection mission of a waste sheet. This paper presents numerical modeling of a boat’s wake impact on the floating marine waste behavior during a collection mission. The aim is to predict the trajectory of a marine waste slick to optimize its collection using meteorological data of ocean currents, wind, and possibly waves. We have made the choice to use Ocean Parcels which is a Python library suitable for trajectoring particles in the ocean. The modeling results showed the important role of advection and diffusion processes in the spatiotemporal distribution of floating plastic litter. The performance of the proposed method was evaluated on real data collected from the Copernicus Marine Environment Monitoring Service (CMEMS). The results of the evaluation in Cape of Good Hope (South Africa) prove that the proposed approach can effectively predict the position and velocity of marine litter during collection, which allowed for optimizing time and more than $90\%$ of the amount of collected waste.

Keywords: marine litter, advection-diffusion equation, sea current, numerical model

Procedia PDF Downloads 85
3836 Development of a Sequential Multimodal Biometric System for Web-Based Physical Access Control into a Security Safe

Authors: Babatunde Olumide Olawale, Oyebode Olumide Oyediran

Abstract:

The security safe is a place or building where classified document and precious items are kept. To prevent unauthorised persons from gaining access to this safe a lot of technologies had been used. But frequent reports of an unauthorised person gaining access into security safes with the aim of removing document and items from the safes are pointers to the fact that there is still security gap in the recent technologies used as access control for the security safe. In this paper we try to solve this problem by developing a multimodal biometric system for physical access control into a security safe using face and voice recognition. The safe is accessed by the combination of face and speech pattern recognition and also in that sequential order. User authentication is achieved through the use of camera/sensor unit and a microphone unit both attached to the door of the safe. The user face was captured by the camera/sensor while the speech was captured by the use of the microphone unit. The Scale Invariance Feature Transform (SIFT) algorithm was used to train images to form templates for the face recognition system while the Mel-Frequency Cepitral Coefficients (MFCC) algorithm was used to train the speech recognition system to recognise authorise user’s speech. Both algorithms were hosted in two separate web based servers and for automatic analysis of our work; our developed system was simulated in a MATLAB environment. The results obtained shows that the developed system was able to give access to authorise users while declining unauthorised person access to the security safe.

Keywords: access control, multimodal biometrics, pattern recognition, security safe

Procedia PDF Downloads 324
3835 Preschool Story Retelling: Actions and Verb Use

Authors: Eva Nwokah, Casey Taliancich-Klinger, Lauren Luna, Sarah Rodriguez

Abstract:

Story-retelling is a technique frequently used to assess children’s language skills and support their development of narratives. Fourteen preschool children listened to one of two stories from the wordless, illustrated Frog book series and then retold the story using the pictures. A comparison of three verb types (action, mental and other) in the original story model, and children's verb use in their retold stories revealed the salience of action events. The children's stories contained a similar proportion of verb types to the original story. However, the action verbs they used were rarely those they had heard in the original. The implications for the process of lexical encoding and narrative recall are discussed, as well as suggestions for the use of wordless picture books and the language teaching of new verbs.

Keywords: story re-telling, verb use, preschool language, wordless picture books

Procedia PDF Downloads 266
3834 Theory of the Optimum Signal Approximation Clarifying the Importance in the Recognition of Parallel World and Application to Secure Signal Communication with Feedback

Authors: Takuro Kida, Yuichi Kida

Abstract:

In this paper, it is shown a base of the new trend of algorithm mathematically that treats a historical reason of continuous discrimination in the world as well as its solution by introducing new concepts of parallel world that includes an invisible set of errors as its companion. With respect to a matrix operator-filter bank that the matrix operator-analysis-filter bank H and the matrix operator-sampling-filter bank S are given, firstly, we introduce the detail algorithm to derive the optimum matrix operator-synthesis-filter bank Z that minimizes all the worst-case measures of the matrix operator-error-signals E(ω) = F(ω) − Y(ω) between the matrix operator-input-signals F(ω) and the matrix operator-output-signals Y(ω) of the matrix operator-filter bank at the same time. Further, feedback is introduced to the above approximation theory, and it is indicated that introducing conversations with feedback do not superior automatically to the accumulation of existing knowledge of signal prediction. Secondly, the concept of category in the field of mathematics is applied to the above optimum signal approximation and is indicated that the category-based approximation theory is applied to the set-theoretic consideration of the recognition of humans. Based on this discussion, it is shown naturally why the narrow perception that tends to create isolation shows an apparent advantage in the short term and, often, why such narrow thinking becomes intimate with discriminatory action in a human group. Throughout these considerations, it is presented that, in order to abolish easy and intimate discriminatory behavior, it is important to create a parallel world of conception where we share the set of invisible error signals, including the words and the consciousness of both worlds.

Keywords: matrix filterbank, optimum signal approximation, category theory, simultaneous minimization

Procedia PDF Downloads 135
3833 The Combination of the Mel Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP), JITTER and SHIMMER Coefficients for the Improvement of Automatic Recognition System for Dysarthric Speech

Authors: Brahim-Fares Zaidi, Malika Boudraa, Sid-Ahmed Selouani

Abstract:

Our work aims to improve our Automatic Recognition System for Dysarthria Speech (ARSDS) based on the Hidden Models of Markov (HMM) and the Hidden Markov Model Toolkit (HTK) to help people who are sick. With pronunciation problems, we applied two techniques of speech parameterization based on Mel Frequency Cepstral Coefficients (MFCC's) and Perceptual Linear Prediction (PLP's) and concatenated them with JITTER and SHIMMER coefficients in order to increase the recognition rate of a dysarthria speech. For our tests, we used the NEMOURS database that represents speakers with dysarthria and normal speakers.

Keywords: hidden Markov model toolkit (HTK), hidden models of Markov (HMM), Mel-frequency cepstral coefficients (MFCC), perceptual linear prediction (PLP’s)

Procedia PDF Downloads 154
3832 Powerful Media: Reflection of Professional Audience

Authors: Hamide Farshad, Mohammadreza Javidi Abdollah Zadeh Aval

Abstract:

As a result of the growing penetration of the media into human life, a new role under the title of "audience" is defined in the social life .A kind of role which is dramatically changed since its formation. This article aims to define the audience position in the new media equations which is concluded to the transformation of the media role. By using the Library and Attributive method to study the history, the evolutionary outlook to the audience and the recognition of the audience and the media relation in the new media context is studied. It was perceived in past that public communication would result in receiving the audience. But after the emergence of the interactional media and transformation in the audience social life, a new kind of public communication is formed, and also the imaginary picture of the audience is replaced by the audience impact on the communication process. Part of this impact can be seen in the form of feedback which is one of the public communication elements. In public communication, the audience feedback is completely accepted. But in many cases, and along with the audience feedback, the media changes its direction; this direction shift is known as media feedback. At this state, the media and the audience are both doers and consistently change their positions in an interaction. With the greater number of the audience and the media, this process has taken a new role, and the role of this doer is sometimes taken by an audience while influencing another audience, or a media while influencing another media. In this article, this multiple public communication process is shown through representing a model under the title of ”The bilateral influence of the audience and the media.” Based on this model, the audience and the media power are not the two sides of a coin, and as a result, by accepting these two as the doers, the bilateral power of the audience and the media will be complementary to each other. Also more, the compatibility between the media and the audience is analyzed in the bilateral and interactional relation hypothesis, and by analyzing the action law hypothesis, the dos and don’ts of this role are defined, and media is obliged to know and accept them in order to be able to survive. They also have a determining role in the strategic studies of a media.

Keywords: audience, effect, media, interaction, action laws

Procedia PDF Downloads 483
3831 A Two-Stage Adaptation towards Automatic Speech Recognition System for Malay-Speaking Children

Authors: Mumtaz Begum Mustafa, Siti Salwah Salim, Feizal Dani Rahman

Abstract:

Recently, Automatic Speech Recognition (ASR) systems were used to assist children in language acquisition as it has the ability to detect human speech signal. Despite the benefits offered by the ASR system, there is a lack of ASR systems for Malay-speaking children. One of the contributing factors for this is the lack of continuous speech database for the target users. Though cross-lingual adaptation is a common solution for developing ASR systems for under-resourced language, it is not viable for children as there are very limited speech databases as a source model. In this research, we propose a two-stage adaptation for the development of ASR system for Malay-speaking children using a very limited database. The two stage adaptation comprises the cross-lingual adaptation (first stage) and cross-age adaptation. For the first stage, a well-known speech database that is phonetically rich and balanced, is adapted to the medium-sized Malay adults using supervised MLLR. The second stage adaptation uses the speech acoustic model generated from the first adaptation, and the target database is a small-sized database of the target users. We have measured the performance of the proposed technique using word error rate, and then compare them with the conventional benchmark adaptation. The two stage adaptation proposed in this research has better recognition accuracy as compared to the benchmark adaptation in recognizing children’s speech.

Keywords: Automatic Speech Recognition System, children speech, adaptation, Malay

Procedia PDF Downloads 389
3830 Facial Expression Phoenix (FePh): An Annotated Sequenced Dataset for Facial and Emotion-Specified Expressions in Sign Language

Authors: Marie Alaghband, Niloofar Yousefi, Ivan Garibay

Abstract:

Facial expressions are important parts of both gesture and sign language recognition systems. Despite the recent advances in both fields, annotated facial expression datasets in the context of sign language are still scarce resources. In this manuscript, we introduce an annotated sequenced facial expression dataset in the context of sign language, comprising over 3000 facial images extracted from the daily news and weather forecast of the public tv-station PHOENIX. Unlike the majority of currently existing facial expression datasets, FePh provides sequenced semi-blurry facial images with different head poses, orientations, and movements. In addition, in the majority of images, identities are mouthing the words, which makes the data more challenging. To annotate this dataset we consider primary, secondary, and tertiary dyads of seven basic emotions of "sad", "surprise", "fear", "angry", "neutral", "disgust", and "happy". We also considered the "None" class if the image’s facial expression could not be described by any of the aforementioned emotions. Although we provide FePh as a facial expression dataset of signers in sign language, it has a wider application in gesture recognition and Human Computer Interaction (HCI) systems.

Keywords: annotated facial expression dataset, gesture recognition, sequenced facial expression dataset, sign language recognition

Procedia PDF Downloads 155
3829 Lip Localization Technique for Myanmar Consonants Recognition Based on Lip Movements

Authors: Thein Thein, Kalyar Myo San

Abstract:

Lip reading system is one of the different supportive technologies for hearing impaired, or elderly people or non-native speakers. For normal hearing persons in noisy environments or in conditions where the audio signal is not available, lip reading techniques can be used to increase their understanding of spoken language. Hearing impaired persons have used lip reading techniques as important tools to find out what was said by other people without hearing voice. Thus, visual speech information is important and become active research area. Using visual information from lip movements can improve the accuracy and robustness of a speech recognition system and the need for lip reading system is ever increasing for every language. However, the recognition of lip movement is a difficult task because of the region of interest (ROI) is nonlinear and noisy. Therefore, this paper proposes method to detect the accurate lips shape and to localize lip movement towards automatic lip tracking by using the combination of Otsu global thresholding technique and Moore Neighborhood Tracing Algorithm. Proposed method shows how accurate lip localization and tracking which is useful for speech recognition. In this work of study and experiments will be carried out the automatic lip localizing the lip shape for Myanmar consonants using the only visual information from lip movements which is useful for visual speech of Myanmar languages.

Keywords: lip reading, lip localization, lip tracking, Moore neighborhood tracing algorithm

Procedia PDF Downloads 348
3828 Fusion of Finger Inner Knuckle Print and Hand Geometry Features to Enhance the Performance of Biometric Verification System

Authors: M. L. Anitha, K. A. Radhakrishna Rao

Abstract:

With the advent of modern computing technology, there is an increased demand for developing recognition systems that have the capability of verifying the identity of individuals. Recognition systems are required by several civilian and commercial applications for providing access to secured resources. Traditional recognition systems which are based on physical identities are not sufficiently reliable to satisfy the security requirements due to the use of several advances of forgery and identity impersonation methods. Recognizing individuals based on his/her unique physiological characteristics known as biometric traits is a reliable technique, since these traits are not transferable and they cannot be stolen or lost. Since the performance of biometric based recognition system depends on the particular trait that is utilized, the present work proposes a fusion approach which combines Inner knuckle print (IKP) trait of the middle, ring and index fingers with the geometrical features of hand. The hand image captured from a digital camera is preprocessed to find finger IKP as region of interest (ROI) and hand geometry features. Geometrical features are represented as the distances between different key points and IKP features are extracted by applying local binary pattern descriptor on the IKP ROI. The decision level AND fusion was adopted, which has shown improvement in performance of the combined scheme. The proposed approach is tested on the database collected at our institute. Proposed approach is of significance since both hand geometry and IKP features can be extracted from the palm region of the hand. The fusion of these features yields a false acceptance rate of 0.75%, false rejection rate of 0.86% for verification tests conducted, which is less when compared to the results obtained using individual traits. The results obtained confirm the usefulness of proposed approach and suitability of the selected features for developing biometric based recognition system based on features from palmar region of hand.

Keywords: biometrics, hand geometry features, inner knuckle print, recognition

Procedia PDF Downloads 217
3827 Humanitarian Emergency of the Refugee Condition for Central American Immigrants in Irregular Situation

Authors: María de los Ángeles Cerda González, Itzel Arriaga Hurtado, Pascacio José Martínez Pichardo

Abstract:

In México, the recognition of refugee condition is a fundamental right which, as host State, has the obligation of respect, protect, and fulfill to the foreigners – where we can find the figure of immigrants in irregular situation-, that cannot return to their country of origin for humanitarian reasons. The recognition of the refugee condition as a fundamental right in the Mexican law system proceeds under these situations: 1. The immigrant applies for the refugee condition, even without the necessary proving elements to accredit the humanitarian character of his departure from his country of origin. 2. The immigrant does not apply for the recognition of refugee because he does not know he has the right to, even if he has the profile to apply for. 3. The immigrant who applies fulfills the requirements of the administrative procedure and has access to the refugee recognition. Of the three situations above, only the last one is contemplated for the national indexes of the status refugee; and the first two prove the inefficiency of the governmental system viewed from its lack of sensibility consequence of the no education in human rights matter and which results in the legal vulnerability of the immigrants in irregular situation because they do not have access to the procuration and administration of justice. In the aim of determining the causes and consequences of the no recognition of the refugee status, this investigation was structured from a systemic analysis which objective is to show the advances in Central American humanitarian emergency investigation, the Mexican States actions to protect, respect and fulfil the fundamental right of refugee of immigrants in irregular situation and the social and legal vulnerabilities suffered by Central Americans in Mexico. Therefore, to achieve the deduction of the legal nature of the humanitarian emergency from the Human Rights as a branch of the International Public Law, a conceptual framework is structured using the inductive deductive method. The problem statement is made from a legal framework to approach a theoretical scheme under the theory of social systems, from the analysis of the lack of communication of the governmental and normative subsystems of the Mexican legal system relative to the process undertaken by the Central American immigrants to achieve the recognition of the refugee status as a human right. Accordingly, is determined that fulfilling the obligations of the State referent to grant the right of the recognition of the refugee condition, would mean a guideline for a new stage in Mexican Law, because it would enlarge the constitutional benefits to everyone whose right to the recognition of refugee has been denied an as consequence, a great advance in human rights matter would be achieved.

Keywords: central American immigrants in irregular situation, humanitarian emergency, human rights, refugee

Procedia PDF Downloads 285
3826 Hand Symbol Recognition Using Canny Edge Algorithm and Convolutional Neural Network

Authors: Harshit Mittal, Neeraj Garg

Abstract:

Hand symbol recognition is a pivotal component in the domain of computer vision, with far-reaching applications spanning sign language interpretation, human-computer interaction, and accessibility. This research paper discusses the approach with the integration of the Canny Edge algorithm and convolutional neural network. The significance of this study lies in its potential to enhance communication and accessibility for individuals with hearing impairments or those engaged in gesture-based interactions with technology. In the experiment mentioned, the data is manually collected by the authors from the webcam using Python codes, to increase the dataset augmentation, is applied to original images, which makes the model more compatible and advanced. Further, the dataset of about 6000 coloured images distributed equally in 5 classes (i.e., 1, 2, 3, 4, 5) are pre-processed first to gray images and then by the Canny Edge algorithm with threshold 1 and 2 as 150 each. After successful data building, this data is trained on the Convolutional Neural Network model, giving accuracy: 0.97834, precision: 0.97841, recall: 0.9783, and F1 score: 0.97832. For user purposes, a block of codes is built in Python to enable a window for hand symbol recognition. This research, at its core, seeks to advance the field of computer vision by providing an advanced perspective on hand sign recognition. By leveraging the capabilities of the Canny Edge algorithm and convolutional neural network, this study contributes to the ongoing efforts to create more accurate, efficient, and accessible solutions for individuals with diverse communication needs.

Keywords: hand symbol recognition, computer vision, Canny edge algorithm, convolutional neural network

Procedia PDF Downloads 57
3825 Multimodal Database of Emotional Speech, Video and Gestures

Authors: Tomasz Sapiński, Dorota Kamińska, Adam Pelikant, Egils Avots, Cagri Ozcinar, Gholamreza Anbarjafari

Abstract:

People express emotions through different modalities. Integration of verbal and non-verbal communication channels creates a system in which the message is easier to understand. Expanding the focus to several expression forms can facilitate research on emotion recognition as well as human-machine interaction. In this article, the authors present a Polish emotional database composed of three modalities: facial expressions, body movement and gestures, and speech. The corpora contains recordings registered in studio conditions, acted out by 16 professional actors (8 male and 8 female). The data is labeled with six basic emotions categories, according to Ekman’s emotion categories. To check the quality of performance, all recordings are evaluated by experts and volunteers. The database is available to academic community and might be useful in the study on audio-visual emotion recognition.

Keywords: body movement, emotion recognition, emotional corpus, facial expressions, gestures, multimodal database, speech

Procedia PDF Downloads 348
3824 Analysis of the Significance of Multimedia Channels Using Sparse PCA and Regularized SVD

Authors: Kourosh Modarresi

Abstract:

The abundance of media channels and devices has given users a variety of options to extract, discover, and explore information in the digital world. Since, often, there is a long and complicated path that a typical user may venture before taking any (significant) action (such as purchasing goods and services), it is critical to know how each node (media channel) in the path of user has contributed to the final action. In this work, the significance of each media channel is computed using statistical analysis and machine learning techniques. More specifically, “Regularized Singular Value Decomposition”, and “Sparse Principal Component” has been used to compute the significance of each channel toward the final action. The results of this work are a considerable improvement compared to the present approaches.

Keywords: multimedia attribution, sparse principal component, regularization, singular value decomposition, feature significance, machine learning, linear systems, variable shrinkage

Procedia PDF Downloads 306
3823 An Approach for Vocal Register Recognition Based on Spectral Analysis of Singing

Authors: Aleksandra Zysk, Pawel Badura

Abstract:

Recognizing and controlling vocal registers during singing is a difficult task for beginner vocalist. It requires among others identifying which part of natural resonators is being used when a sound propagates through the body. Thus, an application has been designed allowing for sound recording, automatic vocal register recognition (VRR), and a graphical user interface providing real-time visualization of the signal and recognition results. Six spectral features are determined for each time frame and passed to the support vector machine classifier yielding a binary decision on the head or chest register assignment of the segment. The classification training and testing data have been recorded by ten professional female singers (soprano, aged 19-29) performing sounds for both chest and head register. The classification accuracy exceeded 93% in each of various validation schemes. Apart from a hard two-class clustering, the support vector classifier returns also information on the distance between particular feature vector and the discrimination hyperplane in a feature space. Such an information reflects the level of certainty of the vocal register classification in a fuzzy way. Thus, the designed recognition and training application is able to assess and visualize the continuous trend in singing in a user-friendly graphical mode providing an easy way to control the vocal emission.

Keywords: classification, singing, spectral analysis, vocal emission, vocal register

Procedia PDF Downloads 297
3822 Innovative Business Education Pedagogy: A Case Study of Action Learning at NITIE, Mumbai

Authors: Sudheer Dhume, T. Prasad

Abstract:

There are distinct signs of Business Education losing its sheen. It is more so in developing countries. One of the reasons is the value addition at the end of 2 year MBA program is not matching with the requirements of present times and expectations of the students. In this backdrop, Pedagogy Innovation has become prerequisite for making our MBA programs relevant and useful. This paper is the description and analysis of innovative Action Learning pedagogical approach adopted by a group of faculty members at NITIE Mumbai. It not only promotes multidisciplinary research but also enhances integration of the functional areas skillsets in the students. The paper discusses the theoretical bases of this pedagogy and evaluates the effectiveness of it vis-à-vis conventional pedagogical tools. The evaluation research using Bloom’s taxonomy framework showed that this blended method of Business Education is much superior as compared to conventional pedagogy.

Keywords: action learning, blooms taxonomy, business education, innovation, pedagogy

Procedia PDF Downloads 266
3821 ViraPart: A Text Refinement Framework for Automatic Speech Recognition and Natural Language Processing Tasks in Persian

Authors: Narges Farokhshad, Milad Molazadeh, Saman Jamalabbasi, Hamed Babaei Giglou, Saeed Bibak

Abstract:

The Persian language is an inflectional subject-object-verb language. This fact makes Persian a more uncertain language. However, using techniques such as Zero-Width Non-Joiner (ZWNJ) recognition, punctuation restoration, and Persian Ezafe construction will lead us to a more understandable and precise language. In most of the works in Persian, these techniques are addressed individually. Despite that, we believe that for text refinement in Persian, all of these tasks are necessary. In this work, we proposed a ViraPart framework that uses embedded ParsBERT in its core for text clarifications. First, used the BERT variant for Persian followed by a classifier layer for classification procedures. Next, we combined models outputs to output cleartext. In the end, the proposed model for ZWNJ recognition, punctuation restoration, and Persian Ezafe construction performs the averaged F1 macro scores of 96.90%, 92.13%, and 98.50%, respectively. Experimental results show that our proposed approach is very effective in text refinement for the Persian language.

Keywords: Persian Ezafe, punctuation, ZWNJ, NLP, ParsBERT, transformers

Procedia PDF Downloads 207
3820 Urban Flood Risk Mapping–a Review

Authors: Sherly M. A., Subhankar Karmakar, Terence Chan, Christian Rau

Abstract:

Floods are one of the most frequent natural disasters, causing widespread devastation, economic damage and threat to human lives. Hydrologic impacts of climate change and intensification of urbanization are two root causes of increased flood occurrences, and recent research trends are oriented towards understanding these aspects. Due to rapid urbanization, population of cities across the world has increased exponentially leading to improperly planned developments. Climate change due to natural and anthropogenic activities on our environment has resulted in spatiotemporal changes in rainfall patterns. The combined effect of both aggravates the vulnerability of urban populations to floods. In this context, an efficient and effective flood risk management with its core component as flood risk mapping is essential in prevention and mitigation of flood disasters. Urban flood risk mapping involves zoning of an urban region based on its flood risk, which depicts the spatiotemporal pattern of frequency and severity of hazards, exposure to hazards, and degree of vulnerability of the population in terms of socio-economic, environmental and infrastructural aspects. Although vulnerability is a key component of risk, its assessment and mapping is often less advanced than hazard mapping and quantification. A synergic effort from technical experts and social scientists is vital for the effectiveness of flood risk management programs. Despite an increasing volume of quality research conducted on urban flood risk, a comprehensive multidisciplinary approach towards flood risk mapping still remains neglected due to which many of the input parameters and definitions of flood risk concepts are imprecise. Thus, the objectives of this review are to introduce and precisely define the relevant input parameters, concepts and terms in urban flood risk mapping, along with its methodology, current status and limitations. The review also aims at providing thought-provoking insights to potential future researchers and flood management professionals.

Keywords: flood risk, flood hazard, flood vulnerability, flood modeling, urban flooding, urban flood risk mapping

Procedia PDF Downloads 584
3819 Cross Attention Fusion for Dual-Stream Speech Emotion Recognition

Authors: Shaode Yu, Jiajian Meng, Bing Zhu, Hang Yu, Qiurui Sun

Abstract:

Speech emotion recognition (SER) is for recognizing human subjective emotions through audio data in-depth analysis. From speech audios, how to comprehensively extract emotional information and how to effectively fuse extracted features remain challenging. This paper presents a dual-stream SER framework that embraces both full training and transfer learning of different networks for thorough feature encoding. Besides, a plug-and-play cross-attention fusion (CAF) module is implemented for the valid integration of the dual-stream encoder output. The effectiveness of the proposed CAF module is compared to the other three fusion modules (feature summation, feature concatenation, and feature-wise linear modulation) on two databases (RAVDESS and IEMO-CAP) using different dual-stream encoders (full training network, DPCNN or TextRCNN; transfer learning network, HuBERT or Wav2Vec2). Experimental results suggest that the CAF module can effectively reconcile conflicts between features from different encoders and outperform the other three feature fusion modules on the SER task. In the future, the plug-and-play CAF module can be extended for multi-branch feature fusion, and the dual-stream SER framework can be widened for multi-stream data representation to improve the recognition performance and generalization capacity.

Keywords: speech emotion recognition, cross-attention fusion, dual-stream, pre-trained

Procedia PDF Downloads 69
3818 The Effects of Online Video Gaming on Creativity

Authors: Chloe Shu-Hua Yeh

Abstract:

Effects of videogame play on players cognitive abilities is a growing research field in the recent decades, however, little is known about how ‘out-of-school’ use of videogame influences creativity. This interdisciplinary research explores the cognitive and emotional effects of two different types of online videogames (an action videogame and a non-action videogame) on subsequent creativity performances using a within-participant design study with 36 participants. Results showed that after playing the action game participants performed higher originality, elaboration and flexibility than after playing the causal game. The results explored effects of emotional states elicited during playing the games suggesting that arousal may be a significant emotional factor which influence subsequent creativity performance. The cognitive and emotional effects of videogame were discussed followed with implications for emotion-creativity-videogame play research, game designers, educational practitioners and parents.

Keywords: attentional breadth, creativity, emotion, videogame play

Procedia PDF Downloads 524
3817 Category-Base Theory of the Optimum Signal Approximation Clarifying the Importance of Parallel Worlds in the Recognition of Human and Application to Secure Signal Communication with Feedback

Authors: Takuro Kida, Yuichi Kida

Abstract:

We show a base of the new trend of algorithm mathematically that treats a historical reason of continuous discrimination in the world as well as its solution by introducing new concepts of parallel world that includes an invisible set of errors as its companion. With respect to a matrix operator-filter bank that the matrix operator-analysis-filter bank H and the matrix operator-sampling-filter bank S are given, firstly, we introduce the detailed algorithm to derive the optimum matrix operator-synthesis-filter bank Z that minimizes all the worst-case measures of the matrix operator-error-signals E(ω) = F(ω) − Y(ω) between the matrix operator-input-signals F(ω) and the matrix operator-output signals Y(ω) of the matrix operator-filter bank at the same time. Further, feedback is introduced to the above approximation theory and it is indicated that introducing conversations with feedback does not superior automatically to the accumulation of existing knowledge of signal prediction. Secondly, the concept of category in the field of mathematics is applied to the above optimum signal approximation and is indicated that the category-based approximation theory is applied to the set-theoretic consideration of the recognition of humans. Based on this discussion, it is shown naturally why the narrow perception that tends to create isolation shows an apparent advantage in the short term and, often, why such narrow thinking becomes intimate with discriminatory action in a human group. Throughout these considerations, it is presented that, in order to abolish easy and intimate discriminatory behavior, it is important to create a parallel world of conception where we share the set of invisible error signals, including the words and the consciousness of both worlds.

Keywords: signal prediction, pseudo inverse matrix, artificial intelligence, conditional optimization

Procedia PDF Downloads 148
3816 Algorithm for Path Recognition in-between Tree Rows for Agricultural Wheeled-Mobile Robots

Authors: Anderson Rocha, Pedro Miguel de Figueiredo Dinis Oliveira Gaspar

Abstract:

Machine vision has been widely used in recent years in agriculture, as a tool to promote the automation of processes and increase the levels of productivity. The aim of this work is the development of a path recognition algorithm based on image processing to guide a terrestrial robot in-between tree rows. The proposed algorithm was developed using the software MATLAB, and it uses several image processing operations, such as threshold detection, morphological erosion, histogram equalization and the Hough transform, to find edge lines along tree rows on an image and to create a path to be followed by a mobile robot. To develop the algorithm, a set of images of different types of orchards was used, which made possible the construction of a method capable of identifying paths between trees of different heights and aspects. The algorithm was evaluated using several images with different characteristics of quality and the results showed that the proposed method can successfully detect a path in different types of environments.

Keywords: agricultural mobile robot, image processing, path recognition, hough transform

Procedia PDF Downloads 141
3815 Deep Learning Application for Object Image Recognition and Robot Automatic Grasping

Authors: Shiuh-Jer Huang, Chen-Zon Yan, C. K. Huang, Chun-Chien Ting

Abstract:

Since the vision system application in industrial environment for autonomous purposes is required intensely, the image recognition technique becomes an important research topic. Here, deep learning algorithm is employed in image system to recognize the industrial object and integrate with a 7A6 Series Manipulator for object automatic gripping task. PC and Graphic Processing Unit (GPU) are chosen to construct the 3D Vision Recognition System. Depth Camera (Intel RealSense SR300) is employed to extract the image for object recognition and coordinate derivation. The YOLOv2 scheme is adopted in Convolution neural network (CNN) structure for object classification and center point prediction. Additionally, image processing strategy is used to find the object contour for calculating the object orientation angle. Then, the specified object location and orientation information are sent to robotic controller. Finally, a six-axis manipulator can grasp the specific object in a random environment based on the user command and the extracted image information. The experimental results show that YOLOv2 has been successfully employed to detect the object location and category with confidence near 0.9 and 3D position error less than 0.4 mm. It is useful for future intelligent robotic application in industrial 4.0 environment.

Keywords: deep learning, image processing, convolution neural network, YOLOv2, 7A6 series manipulator

Procedia PDF Downloads 239
3814 'Value-Based Re-Framing' in Identity-Based Conflicts: A Skill for Mediators in Multi-Cultural Societies

Authors: Hami-Ziniman Revital, Ashwall Rachelly

Abstract:

The conflict resolution realm has developed tremendously during the last half-decade. Three main approaches should be mentioned: an Alternative Dispute Resolution (ADR) suggesting processes such as Arbitration or Interests-based Negotiation was developed as an answer to obligations and rights-based conflicts. The Pragmatic mediation approach focuses on the gap between interests and needs of disputants. The Transformative mediation approach focusses on relations and suits identity-based conflicts. In the current study, we examine the conflictual relations between religious and non-religious Jews in Israel and the impact of three transformative mechanisms: Inter-group recognition, In-group empowerment and Value-based reframing on the relations between the participants. The research was conducted during four facilitated joint mediation classes. A unique finding was found. Using both transformative mechanisms and the Contact Hypothesis criteria, we identify transformation in participants’ relations and a considerable change from anger, alienation, and suspiciousness to an increased understanding, affection and interpersonal concern towards the out-group members. Intergroup Recognition, In-group empowerment, and Values-based reframing were the skills discovered as the main enablers of the change in the relations and the research participants’ fostered mutual recognition of the out-group values and identity-based issues. We conclude this transformation was possible due to a constant intergroup contact, based on the Contact Hypothesis criteria. In addition, as Interests-based mediation uses “Reframing” as a skill to acknowledge both mutual and opposite needs of the disputants, we suggest the use of “Value-based Reframing” in intergroup identity-based conflicts, as a skill contributes to the empowerment and the recognition of both mutual and different out-group values. We offer to implement those insights and skills to assist conflict resolution facilitators in various intergroup identity-based conflicts resolution efforts and to establish further research and knowledge.

Keywords: empowerment, identity-based conflict, intergroup recognition, intergroup relations, mediation skills, multi-cultural society, reframing, value-based recognition

Procedia PDF Downloads 336
3813 Facial Recognition Technology in Institutions of Higher Learning: Exploring the Use in Kenya

Authors: Samuel Mwangi, Josephine K. Mule

Abstract:

Access control as a security technique regulates who or what can access resources. It is a fundamental concept in security that minimizes risks to the institutions that use access control. Regulating access to institutions of higher learning is key to ensure only authorized personnel and students are allowed into the institutions. The use of biometrics has been criticized due to the setup and maintenance costs, hygiene concerns, and trepidations regarding data privacy, among other apprehensions. Facial recognition is arguably a fast and accurate way of validating identity in order to guard protected areas. It guarantees that only authorized individuals gain access to secure locations while requiring far less personal information whilst providing an additional layer of security beyond keys, fobs, or identity cards. This exploratory study sought to investigate the use of facial recognition in controlling access in institutions of higher learning in Kenya. The sample population was drawn from both private and public higher learning institutions. The data is based on responses from staff and students. Questionnaires were used for data collection and follow up interviews conducted to understand responses from the questionnaires. 80% of the sampled population indicated that there were many security breaches by unauthorized people, with some resulting in terror attacks. These security breaches were attributed to stolen identity cases, where staff or student identity cards were stolen and used by criminals to access the institutions. These unauthorized accesses have resulted in losses to the institutions, including reputational damages. The findings indicate that security breaches are a major problem in institutions of higher learning in Kenya. Consequently, access control would be beneficial if employed to curb security breaches. We suggest the use of facial recognition technology, given its uniqueness in identifying users and its non-repudiation capabilities.

Keywords: facial recognition, access control, technology, learning

Procedia PDF Downloads 121