Search results for: cosine margin face recognition
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4590

4230 Recognition of Noisy Words Using the Time Delay Neural Networks Approach

Authors: Khenfer-Koummich Fatima, Mesbahi Larbi, Hendel Fatiha

Abstract:

This paper presents a recognition system for isolated words such as robot commands, implemented with Time Delay Neural Networks (TDNN). The aim is to teleoperate a robot for specific tasks (turn, close, etc.) in an industrial environment, taking into account the noise coming from the machine. The TDNN was chosen for its generalization in terms of accuracy; moreover, it acts as a filter that passes certain desirable frequency characteristics of speech. The goal is to determine the parameters of this filter so that the system adapts to the variability of the speech signal and, especially, to noise; for this, the backpropagation technique was used in the learning phase. The approach was applied to commands pronounced in two languages separately: French and Arabic. The results for two test bases of 300 spoken words each are 87% and 97.6% in a neutral environment, and 77.67% and 92.67% when white Gaussian noise was added with an SNR of 35 dB.
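
The noisy test condition described above (white Gaussian noise at an SNR of 35 dB) can be illustrated with a short sketch. This is not the authors' code: the function names `add_white_gaussian_noise` and `measured_snr_db` are hypothetical, and the sketch only assumes the usual definition SNR_dB = 10 log10(P_signal / P_noise).

```python
import math
import random


def add_white_gaussian_noise(signal, snr_db, rng=None):
    """Return a noisy copy of `signal` whose expected SNR is `snr_db` decibels.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = rng or random.Random()
    # Average power of the clean signal.
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise_std = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, noise_std) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB from a clean signal and its noisy version."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)
```

Because the noise is random, the measured SNR of any single realization only approximates the target; longer signals give a closer match.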

Keywords: TDNN, neural networks, noise, speech recognition

Procedia PDF Downloads 289
4229 Usability Testing on Information Design through Single-Lens Wearable Device

Authors: Jae-Hyun Choi, Sung-Soo Bae, Sangyoung Yoon, Hong-Ku Yun, Jiyoung Kwahk

Abstract:

This study was conducted to investigate the effect of ocular dominance on recognition performance using a single-lens smart display designed for cycling. A total of 36 bicycle riders who have been cycling consistently were recruited and participated in the experiment. The participants were asked to perform tasks riding a bicycle on a stationary stand for safety reasons. Independent variables of interest include ocular dominance, bike usage, age group, and information layout. Recognition time (i.e., the time required to identify specific information measured with an eye-tracker), error rate (i.e. false answer or failure to identify the information in 5 seconds), and user preference scores were measured and statistical tests were conducted to identify significant results. Recognition time and error ratio showed significant difference by ocular dominance factor, while the preference score did not. Recognition time was faster when the single-lens see-through display on the dominant eye (average 1.12sec) than on the non-dominant eye (average 1.38sec). Error ratio of the information recognition task was significantly lower when the see-through display was worn on the dominant eye (average 4.86%) than on the non-dominant eye (average 14.04%). The interaction effect of ocular dominance and age group was significant with respect to recognition time and error ratio. The recognition time of the users in their 40s was significantly longer than the other age groups when the display was placed on the non-dominant eye, while no difference was observed on the dominant eye. Error ratio also showed the same pattern. Although no difference was observed for the main effect of ocular dominance and bike usage, the interaction effect between the two variables was significant with respect to preference score. 
Preference score of daily bike users was higher when the display was placed on the dominant eye, whereas participants who use bikes for leisure purposes showed the opposite preference patterns. It was found more effective and efficient to wear a see-through display on the dominant eye than on the non-dominant eye, although user preference was not affected by ocular dominance. It is recommended to wear a see-through display on the dominant eye since it is safer by helping the user recognize the presented information faster and more accurately, even if the user may not notice the difference.

Keywords: eye tracking, information recognition, ocular dominance, smart headware, wearable device

Procedia PDF Downloads 272
4228 Robust Recognition of Locomotion Patterns via Data-Driven Machine Learning in the Cloud Environment

Authors: Shinoy Vengaramkode Bhaskaran, Kaushik Sathupadi, Sandesh Achar

Abstract:

Human locomotion recognition is important in a variety of sectors, such as robotics, security, healthcare, fitness tracking and cloud computing. With the increasing pervasiveness of peripheral devices, particularly Inertial Measurement Unit (IMU) sensors, researchers have attempted to exploit these advancements in order to precisely and efficiently identify and categorize human activities. This research paper introduces a state-of-the-art methodology for the recognition of human locomotion patterns in a cloud environment. The methodology is based on a publicly available benchmark dataset. The investigation implements a denoising and windowing strategy to deal with the unprocessed data. Next, feature extraction is adopted to abstract the main cues from the data, and the SelectKBest strategy is used to select the optimal features. Furthermore, state-of-the-art ML classifiers, including logistic regression, random forest, gradient boosting and SVM, are used to evaluate the performance of the system and to accomplish precise locomotion classification. Finally, a detailed comparative analysis of results is presented to reveal the performance of the recognition models.
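
The windowing and feature-extraction steps mentioned in this abstract can be sketched roughly as follows. The abstract gives no window length, overlap, or feature list, so `sliding_windows`, `window_features`, and the statistics chosen here are illustrative assumptions; denoising and the SelectKBest selection step (available in scikit-learn) are not shown.

```python
import statistics


def sliding_windows(samples, size, step):
    """Split a 1-D sensor stream into fixed-size windows.

    `size` and `step` are hypothetical parameters; step < size gives overlap.
    Trailing samples that do not fill a whole window are dropped.
    """
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, step)]


def window_features(window):
    """Compute a few simple statistical features for one window (illustrative only)."""
    return {
        "mean": statistics.fmean(window),
        "std": statistics.pstdev(window),
        "min": min(window),
        "max": max(window),
    }


# Example: 50% overlap on a toy accelerometer-like stream.
stream = [0.0, 0.1, 0.4, 0.9, 1.6, 2.5, 3.6, 4.9, 6.4, 8.1]
features = [window_features(w) for w in sliding_windows(stream, size=4, step=2)]
```

Each feature dictionary would become one row of the matrix passed to feature selection and then to the classifiers.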

Keywords: artificial intelligence, cloud computing, IoT, human locomotion, gradient boosting, random forest, neural networks, body-worn sensors

Procedia PDF Downloads 11
4227 Feasibility of Voluntary Deep Inspiration Breath-Hold Radiotherapy Technique Implementation without Deep Inspiration Breath-Hold-Assisting Device

Authors: Auwal Abubakar, Shazril Imran Shaukat, Noor Khairiah A. Karim, Mohammed Zakir Kassim, Gokula Kumar Appalanaido, Hafiz Mohd Zin

Abstract:

Background: Voluntary deep inspiration breath-hold radiotherapy (vDIBH-RT) is an effective cardiac dose reduction technique during left breast radiotherapy. This study aimed to assess the accuracy of the implementation of the vDIBH technique among left breast cancer patients without the use of a special device such as a surface-guided imaging system. Methods: The vDIBH-RT technique was implemented among thirteen (13) left breast cancer patients at the Advanced Medical and Dental Institute (AMDI), Universiti Sains Malaysia. Breath-hold monitoring was performed based on breath-hold skin marks and laser light congruence observed on zoomed CCTV images from the control console during each delivery. The initial setup was verified using cone beam computed tomography (CBCT) during breath-hold. Each field was delivered using multiple beam segments to allow a delivery time of 20 seconds, which can be tolerated by patients in breath-hold. The data were analysed using an in-house developed MATLAB algorithm. PTV margin was computed based on van Herk's margin recipe. Results: The setup error analysed from CBCT shows that the population systematic error in lateral (x), longitudinal (y), and vertical (z) axes was 2.28 mm, 3.35 mm, and 3.10 mm, respectively. Based on the CBCT image guidance, the Planning target volume (PTV) margin that would be required for vDIBH-RT using CCTV/Laser monitoring technique is 7.77 mm, 10.85 mm, and 10.93 mm in x, y, and z axes, respectively. Conclusion: It is feasible to safely implement vDIBH-RT among left breast cancer patients without special equipment. The breath-hold monitoring technique is cost-effective, radiation-free, easy to implement, and allows real-time breath-hold monitoring.
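
For readers unfamiliar with van Herk's margin recipe, its commonly cited form is M = 2.5Σ + 0.7σ, where Σ is the population systematic error and σ the population random error. The sketch below applies that form; the abstract reports Σ but not σ, so the random term is left as an input, and the function name `van_herk_margin` is hypothetical.

```python
def van_herk_margin(systematic_sd_mm, random_sd_mm):
    """PTV margin in mm using the commonly cited recipe M = 2.5 * Sigma + 0.7 * sigma.

    Assumes the unmodified recipe; the study's in-house MATLAB code may differ.
    """
    return 2.5 * systematic_sd_mm + 0.7 * random_sd_mm


# Systematic errors reported in the abstract for the x, y, z axes (mm).
reported_systematic = {"x": 2.28, "y": 3.35, "z": 3.10}

# Contribution of the systematic term alone (random error set to zero).
systematic_part = {
    axis: van_herk_margin(sd, 0.0) for axis, sd in reported_systematic.items()
}
```

Under this assumption the systematic term alone contributes 5.70 mm, 8.375 mm, and 7.75 mm in x, y, and z; the rest of each reported margin would come from the random-error term, which the abstract does not state.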

Keywords: vDIBH, cone beam computed tomography, radiotherapy, left breast cancer

Procedia PDF Downloads 57
4226 Effects of Oxytocin on Neural Response to Facial Emotion Recognition in Schizophrenia

Authors: Avyarthana Dey, Naren P. Rao, Arpitha Jacob, Chaitra V. Hiremath, Shivarama Varambally, Ganesan Venkatasubramanian, Rose Dawn Bharath, Bangalore N. Gangadhar

Abstract:

Objective: Impaired facial emotion recognition is widely reported in schizophrenia. Neuropeptide oxytocin is known to modulate brain regions involved in facial emotion recognition, namely amygdala, in healthy volunteers. However, its effect on facial emotion recognition deficits seen in schizophrenia is not well explored. In this study, we examined the effect of intranasal OXT on processing facial emotions and its neural correlates in patients with schizophrenia. Method: 12 male patients (age= 31.08±7.61 years, education= 14.50±2.20 years) participated in this single-blind, counterbalanced functional magnetic resonance imaging (fMRI) study. All participants underwent three fMRI scans; one at baseline, one each after single dose 24IU intranasal OXT and intranasal placebo. The order of administration of OXT and placebo was counterbalanced and the subject was blind to the drug administered. Participants performed a facial emotion recognition task presented in a block design with six alternating blocks of faces and shapes. The faces depicted happy, angry or fearful emotions. The images were preprocessed and analyzed using SPM 12. First level contrasts comparing recognition of emotions and shapes were modelled at individual subject level. A group level analysis was performed using the contrasts generated at the first level to compare the effects of intranasal OXT and placebo. The results were thresholded at uncorrected p < 0.001 with a cluster size of 6 voxels. Results: Compared to placebo, intranasal OXT attenuated activity in inferior temporal, fusiform and parahippocampal gyri (BA 20), premotor cortex (BA 6), middle frontal gyrus (BA 10) and anterior cingulate gyrus (BA 24) and enhanced activity in the middle occipital gyrus (BA 18), inferior occipital gyrus (BA 19), and superior temporal gyrus (BA 22). 
There were no significant differences between the conditions on the accuracy scores of emotion recognition between baseline (77.3±18.38), oxytocin (82.63 ± 10.92) or Placebo (76.62 ± 22.67). Conclusion: Our results provide further evidence to the modulatory effect of oxytocin in patients with schizophrenia. Single dose oxytocin resulted in significant changes in activity of brain regions involved in emotion processing. Future studies need to examine the effectiveness of long-term treatment with OXT for emotion recognition deficits in patients with schizophrenia.

Keywords: recognition, functional connectivity, oxytocin, schizophrenia, social cognition

Procedia PDF Downloads 220
4225 Dermatomyositis: It is Not Always an Allergic Reaction

Authors: Irfan Abdulrahman Sheth, Sohil Pothiawala

Abstract:

Dermatomyositis is an idiopathic inflammatory myopathy, traditionally characterized by a progressive, symmetrical proximal muscle weakness and pathognomonic or characteristic cutaneous manifestations. We report a case of a 60-year old Chinese female who was referred from polyclinic for allergic rash over the body after applying hair dye 3 weeks ago. It was associated with puffiness of face, shortness of breath and hoarse voice since last 2 weeks with decrease effort tolerance. She also complained of dysphagia/ myalgia with progressive weakness of proximal muscles and palpitations. She denied chest pain, loss of appetite, weight loss, orthopnea or fever. She had stable vital signs and appeared cushingoid. She was noted to have rash over the scalp/ face and ecchymosis over the right arm with puffiness of face and periorbital oedema. There was symmetrical muscle weakness and other neurological examination was normal. Initial impression was of allergic reaction and underlying nephrotic syndrome and Cushing’s syndrome from TCM use. Diagnostic tests showed high Creatinine kinase (CK) of 1463 u/l, CK–MB of 18.7 ug/l and Troponin –T of 0.09 ug/l. The Full blood count and renal panel was normal. EMG showed inflammatory myositis. Patient was managed by rheumatologist and discharged on oral prednisolone with methotrexate/ ergocalciferol capsule and calcium carb, vitamin D tablets and outpatient follow up. In some patients, cutaneous disease exists in the absence of objective evidence of muscle inflammation. Management of dermatomyositis begins with careful investigation for the presence of muscle disease or of additional systemic involvement, particularly of the pulmonary, cardiac or gastrointestinal systems, and for the possibility of an accompanying malignancy. Muscle disease and systemic involvement can be refractory and may require multiple sequential therapeutic interventions or, at times, combinations of therapies. 
Thus, we want to highlight to the physicians that the cutaneous disease of dermatomyositis should not be confused with allergic reaction. It can be particularly challenging to diagnose. Early recognition aids appropriate management of this group of patients.

Keywords: dermatomyositis, myopathy, allergy, cutaneous disease

Procedia PDF Downloads 335
4224 Using Optical Character Recognition to Manage the Unstructured Disaster Data into Smart Disaster Management System

Authors: Dong Seop Lee, Byung Sik Kim

Abstract:

In the 4th Industrial Revolution, various intelligent technologies have been developed in many fields. These artificial intelligence technologies are applied in various services, including disaster management. Disaster information management does not just support disaster work, but it is also the foundation of smart disaster management. Furthermore, it gets historical disaster information using artificial intelligence technology. Disaster information is one of important elements of entire disaster cycle. Disaster information management refers to the act of managing and processing electronic data about disaster cycle from its’ occurrence to progress, response, and plan. However, information about status control, response, recovery from natural and social disaster events, etc. is mainly managed in the structured and unstructured form of reports. Those exist as handouts or hard-copies of reports. Such unstructured form of data is often lost or destroyed due to inefficient management. It is necessary to manage unstructured data for disaster information. In this paper, the Optical Character Recognition approach is used to convert handout, hard-copies, images or reports, which is printed or generated by scanners, etc. into electronic documents. Following that, the converted disaster data is organized into the disaster code system as disaster information. Those data are stored in the disaster database system. Gathering and creating disaster information based on Optical Character Recognition for unstructured data is important element as realm of the smart disaster management. In this paper, Korean characters were improved to over 90% character recognition rate by using upgraded OCR. In the case of character recognition, the recognition rate depends on the fonts, size, and special symbols of character. We improved it through the machine learning algorithm. 
The converted structured data are managed in a standardized disaster information form connected with the disaster code system. The disaster code system ensures that the structured information can be stored and retrieved across the entire disaster cycle, including historical disaster progress, damages, response, and recovery. The expected effect of this research is that it can be applied to smart disaster management and decision making by combining artificial intelligence technologies and historical big data.

Keywords: disaster information management, unstructured data, optical character recognition, machine learning

Procedia PDF Downloads 129
4223 Gene Names Identity Recognition Using Siamese Network for Biomedical Publications

Authors: Micheal Olaolu Arowolo, Muhammad Azam, Fei He, Mihail Popescu, Dong Xu

Abstract:

As the quantity of biological articles rises, so does the number of biological pathway figures. Each pathway figure shows gene names and relationships. Annotating pathway diagrams manually is time-consuming. Advanced image understanding models could speed up curation, but they must be more precise. There is rich information in biological pathway figures. The first step to performing image understanding of these figures is to recognize gene names automatically. Classical optical character recognition methods have been employed for gene name recognition, but they are not optimized for literature mining data. This study devised a method that treats the bounding box of a gene name as an image and recognizes it using deep Siamese neural network models built on ResNet, DenseNet and Inception architectures, with the aim of outperforming existing methods; the results reached about 84% accuracy.

Keywords: biological pathway, gene identification, object detection, Siamese network

Procedia PDF Downloads 292
4222 Online Versus Face-To-Face – How Do Video Consultations Change The Doctor-Patient-Interaction

Authors: Markus Feufel, Friederike Kendel, Caren Hilger, Selamawit Woldai

Abstract:

Since the corona pandemic, the use of video consultation has increased remarkably. For vulnerable groups such as oncological patients, the advantages seem obvious. But how does video consultation potentially change the doctor-patient relationship compared to face-to-face consultation? Which barriers may hinder the effective use of this consultation format in practice? We are presenting first results from a mixed-methods field study, funded by Federal Ministry of Health, which will provide the basis for a hands-on guide for both physicians and patients on how to improve the quality of video consultations. We use a quasi-experimental design to analyze qualitative and quantitative differences between face-to-face and video consultations based on video recordings of N = 64 actual counseling sessions (n = 32 for each consultation format). Data will be recorded from n = 32 gynecological and n = 32 urological cancer patients at two clinics. After the consultation, all patients will be asked to fill out a questionnaire about their consultation experience. For quantitative analyses, the counseling sessions will be systematically compared in terms of verbal and nonverbal communication patterns. Relative frequencies of eye contact and the information exchanged will be compared using 𝝌2 -tests. The validated questionnaire MAPPIN'Obsdyad will be used to assess the expression of shared decision-making parameters. In addition, semi-structured interviews will be conducted with n = 10 physicians and n = 10 patients experienced with video consultation, for which a qualitative content analysis will be conducted. We will elaborate the comprehensive methodological approach we used to compare video vs. face-to-face consultations and present first evidence on how video consultations change the doctor-patient interaction. We will also outline possible barriers of video consultations and best practices on how they may be overcome. 
Based on the results, we will present and discuss recommendations outlining best practices for how to prepare and conduct high-quality video consultations from the perspective of both physicians and patients.

Keywords: video consultation, patient-doctor-relationship, digital applications, technical barriers

Procedia PDF Downloads 140
4221 E Learning/Teaching and the Impact on Student Performance at the Postgraduate Level

Authors: Charles Lemckert

Abstract:

E-Learning and E-Teaching can mean many things to different people. For some, the implication is that all material must be delivered in an E way, while for others it only forms part of the learning/teaching process, and (unfortunately) for some it is considered too much work. However, just look around and you will see all generations learning using E devices. In this study we used different forms of teaching, including E, to look at how students responded to set activities and how they performed academically. The particular context was set around a postgraduate university course where students were either present at a face-to-face intensive workshop (on water treatment plant design) or where they were not. For the latter, students needed to make sole use of E media. It is relevant to note that even though some were at the face-to-face class, they were still exposed to E material as the lecturer did use PC projections. Additionally, some also accessed the associate E material (pdf slides and video recordings) to assist their required activities. Analysis of the student performance, in their set assignment, showed that the actual form of delivery did not affect the student performance. This is because, in the end, all the students had access to the recorded/presented E material. The study also showed (somewhat expectedly) that when the material they required for the assignment was clear, the student performance did drop. Therefore, it is possible to enhance future delivery of courses through careful reflection and appropriate support. In the end, we must remember innovation is not just restricted to E.

Keywords: postgraduate, engineering, assignment, performance

Procedia PDF Downloads 332
4220 Economics of Sugandhakokila (Cinnamomum Glaucescens (Nees) Dury) in Dang District of Nepal: A Value Chain Perspective

Authors: Keshav Raj Acharya, Prabina Sharma

Abstract:

Sugandhakokila (Cinnamomum glaucescens Nees. Dury) is a large evergreen native tree species, mostly confined naturally to the mid-hills of the Rapti Zone of Nepal. The species is identified as prioritized for agro-technology development as well as for research and development by a department of plant resources. The government of Nepal has banned the export of this species without processing in order to encourage value addition within the country. The present study was carried out in Chillikot village of Dang district to find out the economic contribution of C. glaucescens to the local economy and to document the major conservation threats for this species. Participatory Rural Appraisal (PRA) tools such as household surveys, key informant interviews and focus group discussions were used to collect the data. The present study reveals that about 1.7 million Nepalese rupees (NPR) are contributed annually to the local economy of 29 households from the collection of C. glaucescens berries in the study area. The average annual income of each family was around NPR 67,165.38 (US$ 569.19) from the sale of the berries, which contributes about 53% of the total household income. Six different value chain actors are involved in the C. glaucescens business. The maximum profit margin was taken by the collector, followed by the producer, exporter and processor. The profit margin was found to be lowest for regional and village traders. The total profit margin for producers was NPR 138.86/kg, and regional traders gained NPR 17/kg. However, there is a possibility to increase the profit of producers by NPR 8.00 more for each kg of berries through the initiation of a community forest user group and village cooperatives in the area. Open access to the resource, insect infestation of over-mature trees and browsing by goats were identified as major conservation threats for this species. 
Handing over the national forest as a community forest, linking the producers with the processor through organized market channel and replacing the old tree through new plantation has been recommended for future.

Keywords: community forest, conservation threats, C. glaucescens, value chain analysis

Procedia PDF Downloads 140
4219 An Inviscid Compressible Flow Solver Based on Unstructured OpenFOAM Mesh Format

Authors: Utkan Caliskan

Abstract:

Two types of numerical codes based on the finite volume method are developed to solve the compressible Euler equations and simulate the flow through a forward-facing step channel. Both algorithms use the AUSM+-up (Advection Upstream Splitting Method) scheme for flux splitting and a two-stage Runge-Kutta scheme for time stepping. In this study, the flux calculations differ between the algorithm based on the OpenFOAM mesh format, called the 'face-based' algorithm, and the basic algorithm, called the 'element-based' algorithm. The face-based algorithm avoids redundant flux computations and is also more flexible with hybrid grids. Moreover, some of OpenFOAM’s preprocessing utilities can be used on the mesh. Parallelization of the face-based algorithm, for which atomic operations are needed due to the shared memory model, is also presented. For several mesh sizes, a 2.13x speedup is obtained with the face-based approach over the element-based approach.
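
The idea behind a face-based loop can be shown with a minimal sketch. It assumes the owner/neighbour face addressing used by the OpenFOAM mesh format and covers internal faces only; `accumulate_face_fluxes` and the `face_flux` callback are hypothetical, and the callback stands in for a real flux scheme such as AUSM+-up, which is not implemented here.

```python
def accumulate_face_fluxes(n_cells, owner, neighbour, face_flux):
    """Accumulate per-cell residuals by visiting each internal face exactly once.

    owner[f] and neighbour[f] are the cell indices on either side of face f.
    face_flux(f) returns the flux through face f, directed from owner to neighbour.
    Boundary faces are omitted from this sketch.
    """
    residual = [0.0] * n_cells
    for f in range(len(owner)):
        flux = face_flux(f)             # evaluated once per face, not once per cell
        residual[owner[f]] -= flux      # flux leaves the owner cell
        residual[neighbour[f]] += flux  # and enters the neighbour cell
    return residual


# Three cells in a row, joined by two internal faces, with a constant unit flux.
residual = accumulate_face_fluxes(
    n_cells=3, owner=[0, 1], neighbour=[1, 2], face_flux=lambda f: 1.0
)
```

An element-based loop would instead visit every cell and evaluate the flux on each of its faces, so each internal face is computed twice. The sketch also shows why a shared-memory parallel version needs atomic updates: two faces handled by different threads can write to the same cell's residual.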

Keywords: cell centered finite volume method, compressible Euler equations, OpenFOAM mesh format, OpenMP

Procedia PDF Downloads 319
4218 The Influence of Job Recognition and Job Motivation on Organizational Commitment in Public Sector: The Mediation Role of Employee Engagement

Authors: Muhammad Tayyab, Saba Saira

Abstract:

It is an established fact that organizations across the globe consider employees as their assets and try to advance their well-being. However, the local firms of developing countries are mostly profit oriented and do not have much concern about their employees’ engagement or commitment. Like other developing countries, the local organizations of Pakistan are also less concerned about the well-being of their employees. Especially public sector organizations lack concern regarding engagement, satisfaction or commitment of the employees. Therefore, this study aimed at investigating the impact of job recognition and job motivation on organizational commitment in the mediation role of employee engagement. The data were collected from land record officers of board of revenue, Punjab, Pakistan. Structured questionnaire was used to collect data through physically visiting land record officers and also through the internet. A total of 318 land record officers’ responses were finalized to perform data analysis. The data were analyzed through confirmatory factor analysis and structural equation modeling technique. The findings revealed that job recognition and job motivation have direct as well as indirect positive and significant impact on organizational commitment. The limitations, practical implications and future research indications are also explained.

Keywords: job motivation, job recognition, employee engagement, employee commitment, public sector, land record officers

Procedia PDF Downloads 132
4217 An Automatic Speech Recognition of Conversational Telephone Speech in Malay Language

Authors: M. Draman, S. Z. Muhamad Yassin, M. S. Alias, Z. Lambak, M. I. Zulkifli, S. N. Padhi, K. N. Baharim, F. Maskuriy, A. I. A. Rahim

Abstract:

The performance of a Malay automatic speech recognition (ASR) system for the call centre environment is presented. The system uses the Kaldi toolkit as the platform for the libraries and algorithms that perform the ASR task. The acoustic model implemented in this system uses a deep neural network (DNN) to model the acoustic signal and a standard n-gram model for language modelling. With 80 hours of training data from call centre recordings, the ASR system achieves 72% accuracy, which corresponds to a word error rate (WER) of 28%. The testing was done using 20 hours of audio data. Despite the use of a DNN, the system shows low accuracy owing to the variety of noise, accents and dialects that typically occur in the Malaysian call centre environment. This significant variation among speakers is reflected in the large standard deviation of the average word error rate (WERav) (i.e., ~ 10%). The lowest WER (13.8%) was obtained from a recording sample of a native speaker with a standard Malay dialect (central Malaysia), whereas the sample with the highest WER (49%) contains conversation by a speaker who uses a non-standard Malay dialect.
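
Word error rate is conventionally the word-level edit distance (substitutions, deletions, insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below follows that standard definition; it is not the scoring tool used in the study, and the function name `word_error_rate` is hypothetical. The abstract's pairing of 72% accuracy with 28% WER corresponds to accuracy = 1 - WER.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] = edit distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, WER can exceed 100% on very poor output, so the accuracy figure derived from it can be negative.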

Keywords: conversational speech recognition, deep neural network, Malay language, speech recognition

Procedia PDF Downloads 322
4216 Local Image Features Emerging from Brain Inspired Multi-Layer Neural Network

Authors: Hui Wei, Zheng Dong

Abstract:

Object recognition has long been a challenging task in computer vision. Yet the human brain, with the ability to rapidly and accurately recognize visual stimuli, manages this task effortlessly. In the past decades, advances in neuroscience have revealed some neural mechanisms underlying visual processing. In this paper, we present a novel model inspired by the visual pathway in primate brains. This multi-layer neural network model imitates the hierarchical convergent processing mechanism in the visual pathway. We show that local image features generated by this model exhibit robust discrimination and even better generalization ability compared with some existing image descriptors. We also demonstrate the application of this model in an object recognition task on image data sets. The result provides strong support for the potential of this model.

Keywords: biological model, feature extraction, multi-layer neural network, object recognition

Procedia PDF Downloads 542
4215 New Innovation and Sustainability in a Developing Country: The Case of Cameroon

Authors: Lema Catherine Forje

Abstract:

Innovation activates the system of an economy to a new level. Innovation follows a process. The first step in innovation is the idea-generation process. There is widespread appreciation that people go to great lengths, incur expenses: energy and materials to generate innovative ideas. People get inspired, create, and connect. The inspiration also enables the building of a culture of innovation. Data collection was done through a face-to-face interview with the producer of the first Cameroon beer that came out in the early 1960s, a rice producing company, a cement producing company, and 100 women following a type of dressing commonly worn by Cameroonian women (wrappa). There were a total number of one hundred and three interviewees. The implication of this study is for everybody. It sheds light on the factors that are likely to sustain an innovation. Conclusion emphasises continuous research to keep giving the innovation a face lift.

Keywords: entrepreneurship, ideas, innovation, sustainability

Procedia PDF Downloads 296
4214 Efficient Residual Road Condition Segmentation Network Based on Reconstructed Images

Authors: Xiang Shijie, Zhou Dong, Tian Dan

Abstract:

This paper focuses on the application of real-time semantic segmentation technology in complex road condition recognition, aiming to address the critical issue of how to improve segmentation accuracy while ensuring real-time performance. Semantic segmentation technology has broad application prospects in fields such as autonomous vehicle navigation and remote sensing image recognition. However, current real-time semantic segmentation networks face significant technical challenges and optimization gaps in balancing speed and accuracy. To tackle this problem, this paper conducts an in-depth study and proposes an innovative Guided Image Reconstruction Module. By resampling high-resolution images into a set of low-resolution images, this module effectively reduces computational complexity, allowing the network to more efficiently extract features within limited resources, thereby improving the performance of real-time segmentation tasks. In addition, a dual-branch network structure is designed in this paper to fully leverage the advantages of different feature layers. A novel Hybrid Attention Mechanism is also introduced, which can dynamically capture multi-scale contextual information and effectively enhance the focus on important features, thus improving the segmentation accuracy of the network in complex road condition. Compared with traditional methods, the proposed model achieves a better balance between accuracy and real-time performance and demonstrates competitive results in road condition segmentation tasks, showcasing its superiority. Experimental results show that this method not only significantly improves segmentation accuracy while maintaining real-time performance, but also remains stable across diverse and complex road conditions, making it highly applicable in practical scenarios. 
By incorporating the Guided Image Reconstruction Module, dual-branch structure, and Hybrid Attention Mechanism, this paper presents a novel approach to real-time semantic segmentation tasks, which is expected to further advance the development of this field.

Keywords: hybrid attention mechanism, image reconstruction, real-time, road status recognition

Procedia PDF Downloads 23
4213 Loss Function Optimization for CNN-Based Fingerprint Anti-Spoofing

Authors: Yehjune Heo

Abstract:

As biometric systems become widely deployed, identification systems can easily be attacked with various spoof materials. This paper contributes to finding a reliable and practical anti-spoofing method using Convolutional Neural Networks (CNNs) based on the types of loss functions and optimizers. The types of CNNs used in this paper include AlexNet, VGGNet, and ResNet. By using various loss functions, including Cross-Entropy, Center Loss, Cosine Proximity, and Hinge Loss, and various optimizers, including Adam, SGD, RMSProp, Adadelta, Adagrad, and Nadam, we obtained significant performance changes. We find that choosing the correct loss function for each model is crucial, since different loss functions lead to different errors on the same evaluation. By using a subset of the LivDet 2017 database, we validate our approach to compare the generalization power. It is important to note that we use a subset of LivDet and that the database is the same across all training and testing for each model. This way, we can compare the performance, in terms of generalization, on unseen data across all the different models. The best CNN (AlexNet) with the appropriate loss function and optimizer yields a performance gain of more than 3% over the other CNN models with the default loss function and optimizer. In addition to the highest generalization performance, this paper also reports the models with high accuracy together with their parameters and mean average error rates, in order to find the model that consumes the least memory and computation time for training and testing. Although AlexNet is less complex than the other CNN models, it proves to be very efficient. For practical anti-spoofing systems, the deployed version should use a small amount of memory and should run very fast with high anti-spoofing performance. For our deployed version on smartphones, additional processing steps, such as quantization and pruning algorithms, have been applied in our final model.
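To make the comparison of loss functions concrete, the following is a minimal sketch of three of the losses named in the abstract, written for a single sample in plain Python. It is an illustration under stated assumptions, not the author's implementation: the function names, the two-class (live vs. spoof) setup, and the sample values are all hypothetical, and Center Loss is left out because it needs learned class centers.

```python
import math

def cross_entropy(probs, target_index):
    """Negative log-likelihood of the true class given predicted probabilities."""
    return -math.log(probs[target_index])

def hinge(score, label):
    """Binary hinge loss; label is +1 or -1 and score is the raw model output."""
    return max(0.0, 1.0 - label * score)

def cosine_proximity(pred, target):
    """Negative cosine similarity, so minimising it aligns pred with target."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm_pred = math.sqrt(sum(p * p for p in pred))
    norm_target = math.sqrt(sum(t * t for t in target))
    return -dot / (norm_pred * norm_target)

# Hypothetical model output for one fingerprint image (class 0 = live).
probs = [0.9, 0.1]
one_hot = [1.0, 0.0]
print("cross-entropy:", cross_entropy(probs, 0))
print("hinge:", hinge(0.8, +1))
print("cosine proximity:", cosine_proximity(probs, one_hot))
```

The three losses penalise the same prediction on different scales, which is one reason the same network can reach different error rates depending on the loss it is trained with.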

Keywords: anti-spoofing, CNN, fingerprint recognition, loss function, optimizer

Procedia PDF Downloads 136
4212 The Combination of the Mel Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP), JITTER and SHIMMER Coefficients for the Improvement of Automatic Recognition System for Dysarthric Speech

Authors: Brahim-Fares Zaidi, Malika Boudraa, Sid-Ahmed Selouani

Abstract:

Our work aims to improve our Automatic Recognition System for Dysarthric Speech (ARSDS), based on Hidden Markov Models (HMM) and the Hidden Markov Model Toolkit (HTK), to help people with pronunciation problems. We applied two speech parameterization techniques, Mel Frequency Cepstral Coefficients (MFCCs) and Perceptual Linear Prediction (PLP) coefficients, and concatenated them with JITTER and SHIMMER coefficients in order to increase the recognition rate for dysarthric speech. For our tests, we used the NEMOURS database, which contains speakers with dysarthria and normal speakers.

Keywords: hidden Markov model toolkit (HTK), hidden Markov models (HMM), Mel-frequency cepstral coefficients (MFCC), perceptual linear prediction (PLP)

Procedia PDF Downloads 161
4211 Income and Factor Analysis of Small Scale Broiler Production in Imo State, Nigeria

Authors: Ubon Asuquo Essien, Okwudili Bismark Ibeagwa, Daberechi Peace Ubabuko

Abstract:

The broiler poultry subsector is dominated by small scale production with low aggregate output. The high cost of inputs currently experienced in Nigeria tends to aggravate the situation; hence many broiler farmers struggle to break even. This study was designed to examine income and input factors in small scale deep litter broiler production in Imo State, Nigeria. Specifically, the study examined the socio-economic characteristics of small scale deep litter broiler farmers, estimated the costs and returns of broiler production in the area, analyzed input factors in broiler production in the area, and examined the marketability age and profitability of the enterprise. A multi-stage sampling technique was adopted in selecting 60 small scale broiler farmers who use the deep litter system from 6 communities, using a structured questionnaire. The socioeconomic characteristics of the broiler farmers and the profitability/marketability age of the birds were described using descriptive statistical tools such as frequencies, means and percentages. Gross margin analysis was used to analyze the costs and returns of broiler production, while the Cobb-Douglas production function was employed to analyze input factors in broiler production. The results revealed that the cost of feed (P<0.1), deep litter material (P<0.05) and medication (P<0.05) had a significant positive relationship with the gross return of broiler farmers in the study area, while the costs of labour, fuel and day-old chicks were not significant. Furthermore, the gross profit margin of farmers who market their broilers at the 8th week of rearing was 80.7%, compared with 78.7% and 60.8% for farmers who market at the 10th week and 12th week of rearing, respectively. The business is, therefore, profitable, but to varying degrees.
Government and development partners should make deliberate efforts to curb the current rise in the prices of poultry feeds, drugs and timber materials used as bedding so as to widen the profit margin and encourage more farmers to go into the business. The farmers equally need more technical assistance from extension agents with regard to timely and profitable marketing.
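The two analytical tools named in this abstract can be sketched in a few lines. The code below is only an illustration under assumptions: it takes the gross profit margin to be gross margin (revenue minus variable cost) as a percentage of revenue, uses the textbook multiplicative Cobb-Douglas form, and all function names and figures are hypothetical rather than taken from the study.

```python
import math

def gross_margin_percent(revenue, variable_cost):
    """Gross margin (revenue minus variable cost) as a percentage of revenue."""
    return 100.0 * (revenue - variable_cost) / revenue

def cobb_douglas(inputs, elasticities, scale=1.0):
    """Textbook Cobb-Douglas form: Y = A * x1^b1 * x2^b2 * ..."""
    output = scale
    for x, b in zip(inputs, elasticities):
        output *= x ** b
    return output

# Hypothetical figures: revenue of 1000 against variable costs of 250.
print(gross_margin_percent(1000.0, 250.0))

# Taking logs turns the model into a linear regression:
# ln Y = ln A + b1 * ln x1 + b2 * ln x2
y = cobb_douglas([4.0, 9.0], [0.5, 0.5], scale=2.0)
print(math.log(y), math.log(2.0) + 0.5 * math.log(4.0) + 0.5 * math.log(9.0))
```

The log-linear form is what allows the elasticities to be estimated by ordinary least squares and tested for significance per input.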

Keywords: broilers, factor analysis, income, small scale

Procedia PDF Downloads 80
4210 A Two-Stage Adaptation towards Automatic Speech Recognition System for Malay-Speaking Children

Authors: Mumtaz Begum Mustafa, Siti Salwah Salim, Feizal Dani Rahman

Abstract:

Recently, Automatic Speech Recognition (ASR) systems have been used to assist children in language acquisition, as they can detect human speech signals. Despite the benefits offered by ASR systems, there is a lack of ASR systems for Malay-speaking children. One of the contributing factors is the lack of a continuous speech database for the target users. Though cross-lingual adaptation is a common solution for developing ASR systems for under-resourced languages, it is not viable for children, as there are very few speech databases that can serve as a source model. In this research, we propose a two-stage adaptation for the development of an ASR system for Malay-speaking children using a very limited database. The two-stage adaptation comprises cross-lingual adaptation (first stage) and cross-age adaptation (second stage). In the first stage, a well-known speech database that is phonetically rich and balanced is adapted to a medium-sized Malay adult database using supervised MLLR. The second stage uses the acoustic model generated by the first adaptation, and the target database is a small database of the target users. We measured the performance of the proposed technique using word error rate and compared it with the conventional benchmark adaptation. The two-stage adaptation proposed in this research has better recognition accuracy than the benchmark adaptation in recognizing children's speech.
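Word error rate, the metric used above, is the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch; the function name and the example sentences are illustrative assumptions, not material from the paper.

```python
def word_error_rate(reference, hypothesis):
    """(substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the ref words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)

# Illustrative transcripts: one substitution and one deletion in four words.
print(word_error_rate("open the red door", "open a red"))
```

A lower value is better, and the rate can exceed 1.0 when the hypothesis contains many inserted words.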

Keywords: Automatic Speech Recognition System, children speech, adaptation, Malay

Procedia PDF Downloads 397
4209 The Effect of Size and Tumor Depth on Histological Clearance Margins of Basal Cell Carcinomas

Authors: Martin Van, Mohammed Javed, Sarah Hemington-Gorse

Abstract:

Aim: Our aim was to determine the effect of size and tumor depth of basal cell carcinomas (BCCs) on surgical margin clearance. Methods: A retrospective study was conducted at the Welsh Centre for Burns and Plastic Surgery (WCBPS), Morriston Hospital, between 1 Jan 2016 and 31 July 2016. Only patients with confirmed BCC on histopathological analysis were included. Patient data including anatomical region treated, lesion size, histopathological clearance margins and histological sub-types were recorded. An independent t-test was performed to determine statistical significance. Results: A total of 228 BCCs were excised in 160 patients. Eleven lesions (4.8%) were incompletely excised. The nose area had the highest rate of incomplete excision. The mean diameter of incompletely excised lesions was 11.4 mm vs. 11.5 mm in completely excised lesions (p=0.959), and the mean histological depth of incompletely excised lesions was 4.1 mm vs. 2.5 mm for completely excised BCCs (p < 0.05). Conclusions: BCC tumor depth of > 4.1 mm was associated with a high rate of incomplete margin clearance. Hence, in prospective patients, a BCC tumor depth (> 4 mm) on tissue biopsy should alert the surgeon to a potentially higher risk of incomplete excision of the lesion.

Keywords: basal cell carcinoma, excision margins, plastic surgery, treatment

Procedia PDF Downloads 238
4208 Facial Expression Phoenix (FePh): An Annotated Sequenced Dataset for Facial and Emotion-Specified Expressions in Sign Language

Authors: Marie Alaghband, Niloofar Yousefi, Ivan Garibay

Abstract:

Facial expressions are important parts of both gesture and sign language recognition systems. Despite the recent advances in both fields, annotated facial expression datasets in the context of sign language are still scarce resources. In this manuscript, we introduce an annotated sequenced facial expression dataset in the context of sign language, comprising over 3000 facial images extracted from the daily news and weather forecast of the public tv-station PHOENIX. Unlike the majority of currently existing facial expression datasets, FePh provides sequenced semi-blurry facial images with different head poses, orientations, and movements. In addition, in the majority of images, identities are mouthing the words, which makes the data more challenging. To annotate this dataset we consider primary, secondary, and tertiary dyads of seven basic emotions of "sad", "surprise", "fear", "angry", "neutral", "disgust", and "happy". We also considered the "None" class if the image’s facial expression could not be described by any of the aforementioned emotions. Although we provide FePh as a facial expression dataset of signers in sign language, it has a wider application in gesture recognition and Human Computer Interaction (HCI) systems.

Keywords: annotated facial expression dataset, gesture recognition, sequenced facial expression dataset, sign language recognition

Procedia PDF Downloads 159
4207 Lip Localization Technique for Myanmar Consonants Recognition Based on Lip Movements

Authors: Thein Thein, Kalyar Myo San

Abstract:

A lip reading system is one of the supportive technologies for hearing-impaired people, elderly people, and non-native speakers. For people with normal hearing in noisy environments, or in conditions where the audio signal is not available, lip reading techniques can be used to increase their understanding of spoken language. Hearing-impaired persons have used lip reading as an important tool to find out what other people said without hearing their voices. Thus, visual speech information is important and has become an active research area. Using visual information from lip movements can improve the accuracy and robustness of a speech recognition system, and the need for lip reading systems is ever increasing for every language. However, the recognition of lip movement is a difficult task because the region of interest (ROI) is nonlinear and noisy. Therefore, this paper proposes a method to detect the accurate lip shape and to localize lip movement towards automatic lip tracking by combining the Otsu global thresholding technique and the Moore Neighborhood Tracing Algorithm. The proposed method shows accurate lip localization and tracking, which is useful for speech recognition. In this work, experiments will be carried out on automatically localizing the lip shape for Myanmar consonants using only the visual information from lip movements, which is useful for visual speech in the Myanmar language.
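The Otsu step mentioned above picks the gray level that maximises the between-class variance of the image histogram. Below is a minimal pure-Python sketch of that idea only; it is not the authors' pipeline, the function name and the toy pixel values are hypothetical, and the Moore neighborhood contour tracing that would follow on the binary mask is not shown.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximises between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight0, sum0 = 0, 0
    for t in range(levels):
        weight0 += hist[t]
        sum0 += t * hist[t]
        weight1 = total - weight0
        if weight0 == 0:
            continue  # no pixels in the lower class yet
        if weight1 == 0:
            break     # all pixels are in the lower class
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        between_var = weight0 * weight1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t

# Toy "image": dark skin-like pixels and brighter lip-like pixels (made up).
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
mask = [p > t for p in pixels]  # binary mask passed on to contour tracing
print(t, sum(mask))
```

Because the threshold is computed from the whole histogram, it is a global threshold: one value is applied to every pixel of the ROI.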

Keywords: lip reading, lip localization, lip tracking, Moore neighborhood tracing algorithm

Procedia PDF Downloads 352
4206 Fusion of Finger Inner Knuckle Print and Hand Geometry Features to Enhance the Performance of Biometric Verification System

Authors: M. L. Anitha, K. A. Radhakrishna Rao

Abstract:

With the advent of modern computing technology, there is an increased demand for recognition systems capable of verifying the identity of individuals. Recognition systems are required by several civilian and commercial applications to provide access to secured resources. Traditional recognition systems based on physical identities are not sufficiently reliable to satisfy security requirements because of advances in forgery and identity impersonation methods. Recognizing individuals based on their unique physiological characteristics, known as biometric traits, is a reliable technique, since these traits are not transferable and cannot be stolen or lost. Since the performance of a biometric-based recognition system depends on the particular trait that is utilized, the present work proposes a fusion approach which combines the inner knuckle print (IKP) trait of the middle, ring and index fingers with the geometrical features of the hand. The hand image captured by a digital camera is preprocessed to find the finger IKP as the region of interest (ROI) and the hand geometry features. Geometrical features are represented as the distances between different key points, and IKP features are extracted by applying a local binary pattern descriptor to the IKP ROI. Decision-level AND fusion was adopted, which has shown an improvement in the performance of the combined scheme. The proposed approach is tested on a database collected at our institute. The proposed approach is significant since both hand geometry and IKP features can be extracted from the palm region of the hand. The fusion of these features yields a false acceptance rate of 0.75% and a false rejection rate of 0.86% for the verification tests conducted, which are lower than the results obtained using individual traits.
The results obtained confirm the usefulness of the proposed approach and the suitability of the selected features for developing a biometric-based recognition system using features from the palmar region of the hand.
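Decision-level AND fusion and the two error rates quoted above can be sketched as follows. This is an illustrative sketch only: the function names and the six verification attempts are hypothetical and are not data from the paper.

```python
def and_fusion(accept_ikp, accept_geometry):
    """Decision-level AND fusion: accept only if both matchers accept."""
    return accept_ikp and accept_geometry

def error_rates(decisions, is_genuine):
    """Return (FAR, FRR).

    FAR = accepted impostor attempts / all impostor attempts
    FRR = rejected genuine attempts / all genuine attempts
    """
    impostor = [d for d, g in zip(decisions, is_genuine) if not g]
    genuine = [d for d, g in zip(decisions, is_genuine) if g]
    far = sum(1 for d in impostor if d) / len(impostor)
    frr = sum(1 for d in genuine if not d) / len(genuine)
    return far, frr

# Hypothetical per-attempt decisions from the two individual matchers.
ikp      = [True, True, False, True,  True,  False]
geometry = [True, True, True,  False, True,  False]
genuine  = [True, True, True,  False, False, False]

fused = [and_fusion(a, b) for a, b in zip(ikp, geometry)]
print("IKP only:", error_rates(ikp, genuine))
print("fused:   ", error_rates(fused, genuine))
```

With fixed per-trait decisions, the fused system accepts a subset of what each matcher accepts, so its false acceptance rate cannot exceed that of either trait alone; the false rejection rate depends on how the individual thresholds are set.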

Keywords: biometrics, hand geometry features, inner knuckle print, recognition

Procedia PDF Downloads 220
4205 A Dynamic Neural Network Model for Accurate Detection of Masked Faces

Authors: Oladapo Tolulope Ibitoye

Abstract:

Neural networks have become prominent and are widely used in machine learning. They are well suited to solving day-to-day problems to a certain extent. Neural networks are computing systems with several interconnected nodes. One of the numerous areas of application of neural networks is object detection. This area has become prominent because of the coronavirus disease pandemic and the post-pandemic phases. According to experts, wearing a face mask in public slows the spread of the virus. This calls for the development of a reliable and effective model for detecting face masks on people's faces during compliance checks. The existing neural network models for face mask detection are characterized by their black-box nature and large dataset requirement. These challenges have compromised the performance of the existing models. The proposed model utilizes a Faster R-CNN model on an Inception V3 backbone to reduce system complexity and dataset requirement. The model was trained and validated with a small dataset, and evaluation results show an overall accuracy of 96% regardless of skin tone.

Keywords: convolutional neural network, face detection, face mask, masked faces

Procedia PDF Downloads 68
4204 Influence of the Seat Arrangement in Public Reading Spaces on Individual Subjective Perceptions

Authors: Jo-Han Chang, Chung-Jung Wu

Abstract:

This study involves a design proposal. The objective is to create a seat arrangement model for public reading spaces that enables free arrangement without disturbing the users. Through a subjective perception scale, this study explored whether the distance between seats and the direction of seats influence individual subjective perceptions in a public reading space. This study also analyzes users' subjective perceptions when reading in settings with 3 seat directions and 5 distances between seats. The results may be applied to public chair design. This study investigated (a) whether different directions of seats and distances between seats influence individual subjective perceptions and (b) the acceptable personal space between 2 strangers in a public reading space. The results are as follows: (a) the directions of seats and distances between seats influenced individual subjective perceptions. (b) Subjective evaluation scores were higher for back-to-back seat directions with Distances A (10 cm) and B (62 cm) compared with face-to-face and side-by-side seat directions; however, when the seat distance exceeded 114 cm (Distance C), no difference existed among the directions of seats. (c) Regarding reading in public spaces, when the distance between seats is only 10 cm, we recommend arranging the seats in a back-to-back fashion to increase user comfort, and face-to-face and side-by-side seat directions should be avoided. When the seat arrangement is limited to a face-to-face design, the distance between seats should be increased to at least 62 cm. Moreover, the distance between seats should be increased to at least 114 cm for side-by-side seats to improve user comfort.

Keywords: individual subjective perceptions, personal space, seat arrangement, direction, distances

Procedia PDF Downloads 427
4203 Humanitarian Emergency of the Refugee Condition for Central American Immigrants in Irregular Situation

Authors: María de los Ángeles Cerda González, Itzel Arriaga Hurtado, Pascacio José Martínez Pichardo

Abstract:

In México, the recognition of the refugee condition is a fundamental right that the host State is obliged to respect, protect, and fulfill for foreigners, including immigrants in an irregular situation, who cannot return to their country of origin for humanitarian reasons. The recognition of the refugee condition as a fundamental right in the Mexican legal system arises in three situations: 1. The immigrant applies for the refugee condition, even without the evidence needed to prove the humanitarian character of the departure from the country of origin. 2. The immigrant does not apply for recognition as a refugee because they do not know they have the right to, even though they fit the profile. 3. The immigrant who applies fulfills the requirements of the administrative procedure and obtains refugee recognition. Of these three situations, only the last is reflected in the national indexes of refugee status; the first two demonstrate the inefficiency of the governmental system, which stems from a lack of sensitivity caused by the absence of human rights education and results in the legal vulnerability of immigrants in an irregular situation, because they do not have access to the procuration and administration of justice. To determine the causes and consequences of the non-recognition of refugee status, this investigation was structured as a systemic analysis whose objective is to show the advances in research on the Central American humanitarian emergency, the Mexican State's actions to protect, respect and fulfill the fundamental right to refuge of immigrants in an irregular situation, and the social and legal vulnerabilities suffered by Central Americans in Mexico.
Therefore, to deduce the legal nature of the humanitarian emergency from human rights as a branch of public international law, a conceptual framework is structured using the inductive-deductive method. The problem statement is made from a legal framework in order to approach a theoretical scheme under the theory of social systems, based on an analysis of the lack of communication between the governmental and normative subsystems of the Mexican legal system with respect to the process undertaken by Central American immigrants to achieve recognition of refugee status as a human right. Accordingly, it is determined that fulfilling the State's obligations to grant the right to recognition of the refugee condition would be a guideline for a new stage in Mexican law, because it would extend constitutional benefits to everyone whose right to recognition as a refugee has been denied, and, as a consequence, a great advance in human rights would be achieved.

Keywords: central American immigrants in irregular situation, humanitarian emergency, human rights, refugee

Procedia PDF Downloads 289
4202 Hand Symbol Recognition Using Canny Edge Algorithm and Convolutional Neural Network

Authors: Harshit Mittal, Neeraj Garg

Abstract:

Hand symbol recognition is a pivotal component in the domain of computer vision, with far-reaching applications spanning sign language interpretation, human-computer interaction, and accessibility. This research paper discusses an approach that integrates the Canny Edge algorithm with a convolutional neural network. The significance of this study lies in its potential to enhance communication and accessibility for individuals with hearing impairments or those engaged in gesture-based interactions with technology. In the experiment, the data is collected manually by the authors from a webcam using Python code; to enlarge the dataset, augmentation is applied to the original images, which makes the model more compatible and advanced. Further, the dataset of about 6000 coloured images, distributed equally across 5 classes (i.e., 1, 2, 3, 4, 5), is first converted to grayscale images and then processed by the Canny Edge algorithm with thresholds 1 and 2 both set to 150. After successful data building, the Convolutional Neural Network model is trained on this data, giving accuracy: 0.97834, precision: 0.97841, recall: 0.9783, and F1 score: 0.97832. For user purposes, a block of code is built in Python to open a window for hand symbol recognition. This research, at its core, seeks to advance the field of computer vision by providing an advanced perspective on hand sign recognition. By leveraging the capabilities of the Canny Edge algorithm and the convolutional neural network, this study contributes to the ongoing efforts to create more accurate, efficient, and accessible solutions for individuals with diverse communication needs.
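The four scores reported above can all be derived from a multi-class confusion matrix. The sketch below assumes macro averaging (the unweighted mean of per-class scores), which the abstract does not state, and the function name and the 2x2 example matrix are hypothetical rather than results from the paper.

```python
def macro_metrics(confusion):
    """Return (accuracy, macro precision, macro recall, macro F1).

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return (correct / total,
            sum(precisions) / n,
            sum(recalls) / n,
            sum(f1s) / n)

# Hypothetical two-class confusion matrix, for illustration only.
accuracy, precision, recall, f1 = macro_metrics([[8, 2], [1, 9]])
print(accuracy, precision, recall, f1)
```

With classes of equal size, as in the dataset described, macro recall equals accuracy, which is consistent with the two values in the abstract being nearly identical.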

Keywords: hand symbol recognition, computer vision, Canny edge algorithm, convolutional neural network

Procedia PDF Downloads 64
4201 Multimodal Database of Emotional Speech, Video and Gestures

Authors: Tomasz Sapiński, Dorota Kamińska, Adam Pelikant, Egils Avots, Cagri Ozcinar, Gholamreza Anbarjafari

Abstract:

People express emotions through different modalities. Integration of verbal and non-verbal communication channels creates a system in which the message is easier to understand. Expanding the focus to several forms of expression can facilitate research on emotion recognition as well as human-machine interaction. In this article, the authors present a Polish emotional database composed of three modalities: facial expressions, body movement and gestures, and speech. The corpus contains recordings registered in studio conditions, acted out by 16 professional actors (8 male and 8 female). The data is labeled with six basic emotion categories, according to Ekman's emotion categories. To check the quality of performance, all recordings are evaluated by experts and volunteers. The database is available to the academic community and might be useful in the study of audio-visual emotion recognition.

Keywords: body movement, emotion recognition, emotional corpus, facial expressions, gestures, multimodal database, speech

Procedia PDF Downloads 349