Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 1868

Search results for: speech dataset

818 Challenges to Developing a Trans-European Programme for Health Professionals to Recognize and Respond to Survivors of Domestic Violence and Abuse

Authors: June Keeling, Christina Athanasiades, Vaiva Hendrixson, Delyth Wyndham

Abstract:

Recognition and education in violence, abuse, and neglect for medical and healthcare practitioners (REVAMP) is a trans-European project aiming to introduce a training programme that has been specifically developed by partners across seven European countries to meet the needs of medical and healthcare practitioners. Amalgamating the knowledge and experience of clinicians, researchers, and educators from interdisciplinary and multi-professional backgrounds, REVAMP has tackled the under-resourced and underdeveloped area of domestic violence and abuse. The team designed an online training programme to support medical and healthcare practitioners to recognise and respond appropriately to survivors of domestic violence and abuse at their point of contact with a health provider. The REVAMP partner countries are France, Lithuania, Germany, Greece, Iceland, Norway, and the UK. The training is delivered through a series of interactive online modules, adapting evidence-based pedagogical approaches to learning. Capturing and addressing the complexities of the project impacted the methodological decisions and approaches to evaluation. The challenge was to find an evaluation methodology that captured valid data across all partner languages to demonstrate the extent of the change in knowledge and understanding. Co-development by all team members was a lengthy iterative process, challenged by a lack of consistency in terminology. A mixed methods approach enabled both qualitative and quantitative data to be collected at the start, during, and at the conclusion of the training for the purposes of evaluation. The module content and evaluation instrument were accessible in each partner country's language. Collecting both types of data provided a high-level snapshot of attainment via the quantitative dataset and an in-depth understanding of the impact of the training from the qualitative dataset. The analysis was mixed methods, with integration at multiple interfaces.
The primary focus of the analysis was to support the overall project evaluation for the funding agency. A key project outcome was identifying that the trans-European approach posed several challenges. Firstly, the project partners did not share a first language or a legal or professional approach to domestic abuse and neglect. This was negotiated through complex, systematic, and iterative interaction between team members so that consensus could be achieved. Secondly, the context of the data collection in several different cultural, educational, and healthcare systems across Europe challenged the development of a robust evaluation. The participants in the pilot evaluation shared that the training was contemporary, well-designed, and of great relevance to inform practice. Initial results from the evaluation indicated that the participants were drawn from more than eight partner countries due to the online nature of the training. The primary results indicated a high level of engagement with the content and achievement through the online assessment. The main finding was that the participants perceived the impact of domestic abuse and neglect in very different ways in their individual professional contexts. Most significantly, the participants recognised the need for the training and the gap that existed previously. It is notable that a mixed-methods evaluation of a trans-European project is unusual at this scale.

Keywords: domestic violence, e-learning, health professionals, trans-European

Procedia PDF Downloads 80

817 Deepnic, A Method to Transform Each Variable into Image for Deep Learning

Authors: Nguyen J. M., Lucas G., Brunner M., Ruan S., Antonioli D.

Abstract:

Deep learning based on convolutional neural networks (CNN) is a very powerful technique for classifying information from an image. We propose a new method, DeepNic, to transform each variable of a tabular dataset into an image where each pixel represents a set of conditions that allow the variable to make an error-free prediction. The contrast of each pixel is proportional to its prediction performance, and the color of each pixel corresponds to a sub-family of NICs. NICs are probabilities that depend on the number of inputs to each neuron and the range of coefficients of the inputs. Each variable can therefore be expressed as a function of a matrix of 2 vectors corresponding to an image whose pixels express predictive capabilities. Our objective is to transform each variable of tabular data into an image that can be analysed by CNNs, unlike other methods, which use all the variables to construct a single image. We analyse the NIC information of each variable and express it as a function of the number of neurons and the range of coefficients used. The predictive value and the category of the NIC are expressed by the contrast and the color of the pixel. We have developed a pipeline to implement this technology and have successfully applied it to genomic expressions on an Affymetrix chip.

Keywords: tabular data, deep learning, perfect trees, NICS

Procedia PDF Downloads 86

816 Online Yoga Asana Trainer Using Deep Learning

Authors: Venkata Narayana Chejarla, Nafisa Parvez Shaik, Gopi Vara Prasad Marabathula, Deva Kumar Bejjam

Abstract:

Yoga is an advanced, well-recognized method with roots in Indian philosophy. Yoga benefits both the body and the psyche. Yoga is a regular exercise that helps people relax and sleep better while also enhancing their balance, endurance, and concentration. Yoga can be learned in a variety of settings, including at home with the aid of books and the internet as well as in yoga studios with the guidance of an instructor. Self-learning does not teach the proper yoga poses, and doing them without the right instruction could result in significant injuries. We developed "Online Yoga Asana Trainer using Deep Learning" so that people could practice yoga without a teacher. Our project is developed using TensorFlow, MoveNet, and Keras models. The system makes use of data from Kaggle that includes 25 different yoga poses. The first part of the process involves applying the MoveNet model to extract the 17 key points of the body from the dataset, and the next part involves preprocessing and building a pose classification model using neural networks. The system achieves an accuracy of 98.3%. The system is developed to work with live videos.

Keywords: yoga, deep learning, movenet, tensorflow, keras, CNN

Procedia PDF Downloads 238

815 Evaluating Contextually Targeted Advertising with Attention Measurement

Authors: John Hawkins, Graham Burton

Abstract:

Contextual targeting is a common strategy for advertising that places marketing messages in media locations that are expected to be aligned with the target audience. There are multiple major challenges to contextual targeting: the ideal categorisation scheme needs to be known, as well as the most appropriate subsections of that scheme for a given campaign or creative. In addition, the campaign reach is typically limited when targeting becomes narrow, so a balance must be struck between requirements. Finally, refinement of the process is limited by the use of evaluation methods that are either rapid but non-specific (click-through rates), or reliable but slow and costly (conversions or brand recall studies). In this study we evaluate the use of attention measurement as a technique for understanding the performance of targeting on the basis of specific contextual topics. We perform the analysis using a large-scale dataset of impressions categorised using the IAB V2.0 taxonomy. We evaluate multiple levels of the categorisation hierarchy, using categories at different positions within an initial creative-specific ranking. The results illustrate that measuring attention time is an effective signal for the performance of a specific creative within a specific context. Performance is sustained across a ranking of categories from one period to another.

Keywords: contextual targeting, digital advertising, attention measurement, marketing performance

Procedia PDF Downloads 103

814 A Comparison of the First Language Vocabulary Used by Indonesian Year 4 Students and the Vocabulary Taught to Them in English Language Textbooks

Authors: Fitria Ningsih

Abstract:

This study concerns the process of building a corpus from Indonesian year 4 students’ free writing and comparing it with the vocabulary taught in English language textbooks. Writing samples from 369 students at 19 public elementary schools in Malang, East Java, Indonesia, and 5 selected English textbooks were analyzed with corpus linguistics methods using the Adelaide Text Analysis Tool (AdTAT). The analysis produced wordlists of the top 100 words most frequently used by students and the top 100 words given in English textbooks. There was a 45% match between the two lists. Furthermore, classifying the top 100 most frequent words from the two corpora by part of speech showed that the Indonesian and English wordlists made similar use of nouns, verbs, adjectives, and prepositions. Moreover, to see how well the vocabulary of the learning materials is contextualized to the students’ needs, an in-depth analysis of the content and cultural views of the vocabulary taught in the textbooks was carried out using criteria developed from the checklist. Lastly, suggestions are addressed to language teachers to understand the students’ background, such as recognizing the basic words students acquire before teaching them new vocabulary, in order to achieve successful learning of the target language.

Keywords: corpus, frequency, English, Indonesian, linguistics, textbooks, vocabulary, wordlists, writing

Procedia PDF Downloads 183

813 Shifted Window Based Self-Attention via Swin Transformer for Zero-Shot Learning

Authors: Yasaswi Palagummi, Sareh Rowlands

Abstract:

Generalised Zero-Shot Learning, often known as GZSL, is an advanced variant of zero-shot learning in which test samples may belong to either seen or unseen classes. GZSL methods typically have a bias towards the seen classes because they learn a model to perform recognition for both the seen and unseen classes using data samples from the seen classes. This frequently leads to the misclassification of data from the unseen classes into the seen classes, making the task of GZSL more challenging. In this work, to solve the GZSL problem, we propose an approach leveraging the Shifted Window based Self-Attention in the Swin Transformer (Swin-GZSL) to work in the inductive GZSL problem setting. We run experiments on three popular benchmark datasets, CUB, SUN, and AWA2, which are specifically used for ZSL and its other variants. The results show that our model based on Swin Transformer has achieved a state-of-the-art harmonic mean for two datasets, AWA2 and SUN, and a near-state-of-the-art harmonic mean for the other dataset, CUB. More importantly, this technique has a linear computational complexity, which reduces training time significantly. We have also observed less bias than most of the existing GZSL models.
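
The harmonic mean mentioned in this abstract is the usual summary metric in GZSL evaluation: it combines accuracy on seen classes and accuracy on unseen classes so that a model biased towards the seen classes scores poorly. The following is a minimal sketch of that metric only, not the authors' evaluation code; the function name and the example accuracies are hypothetical.

```python
def harmonic_mean(seen_acc: float, unseen_acc: float) -> float:
    """Harmonic mean H = 2*S*U / (S + U) of seen and unseen class accuracies."""
    total = seen_acc + unseen_acc
    if total == 0:
        # Both accuracies are zero; define H as zero to avoid dividing by zero.
        return 0.0
    return 2.0 * seen_acc * unseen_acc / total


# Hypothetical accuracies: a strong seen-class score cannot hide a weak unseen-class score.
biased = harmonic_mean(0.8, 0.2)
balanced = harmonic_mean(0.5, 0.5)
```

With these made-up numbers the balanced model obtains the higher harmonic mean even though both models have the same arithmetic mean accuracy, which is why the metric penalises seen-class bias.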

Keywords: generalised, zero-shot learning, inductive learning, shifted-window attention, Swin transformer, vision transformer

Procedia PDF Downloads 68

812 Risk Screening in Digital Insurance Distribution: Evidence and Explanations

Authors: Finbarr Murphy, Wei Xu, Xian Xu

Abstract:

The embedding of digital technologies in the global economy has attracted increasing attention from economists. With a large and detailed dataset, this study examines the specific case where consumers have a choice between offline and digital channels in the context of insurance purchases. We find that digital channels screen consumers with lower unobserved risk. For the term life, endowment, and disease insurance products, the average risk of the policies purchased through digital channels was 75%, 21%, and 31%, respectively, lower than those purchased offline. As a consequence, the lower unobserved risk leads to weaker information asymmetry and higher profitability of digital channels. We highlight three mechanisms of the risk screening effect: heterogeneous marginal influence of channel features on insurance demand, the channel features directly related to risk control, and the link between the digital divide and risk. We also find that the risk screening effect mainly comes from the extensive margin, i.e., from new consumers. This paper contributes to three connected areas in the insurance context: the heterogeneous economic impacts of digital technology adoption, insurer-side risk selection, and insurance marketing.

Keywords: digital economy, information asymmetry, insurance, mobile application, risk screening

Procedia PDF Downloads 68

811 Dynamic Interaction between Renewable Energy Consumption and Sustainable Development: Evidence from ECOWAS Region

Authors: Maman Ali M. Moustapha, Qian Yu, Benjamin Adjei Danquah

Abstract:

This paper investigates the dynamic interaction between renewable energy consumption (REC) and economic growth using a dataset from the Economic Community of West African States (ECOWAS) covering 2002 to 2016. The Autoregressive Distributed Lag (ARDL) bounds test approach was used to examine the long-run relationship between real gross domestic product and REC, while a VECM-based Granger causality test was used to examine the direction of causality. Our empirical findings indicate that REC has a significant and positive impact on real gross domestic product. In addition, we found that REC and the percentage of access to electricity had unidirectional Granger causality to economic growth, while carbon dioxide emission has bidirectional Granger causality with economic growth. Our findings also indicate that a 1 per cent increase in REC leads to an increase in real GDP of 0.009 in the long run. Thus, REC can be a means to ensure sustainable economic growth in the ECOWAS sub-region. However, it is necessary to further increase support for and investment in renewable energy production in order to speed up sustainable economic development throughout the region.

Keywords: Economic Growth, Renewable Energy, Sustainable Development, Sustainable Energy

Procedia PDF Downloads 207

810 Advancing in Cricket Analytics: Novel Approaches for Pitch and Ball Detection Employing OpenCV and YOLOV8

Authors: Pratham Madnur, Prathamkumar Shetty, Sneha Varur, Gouri Parashetti

Abstract:

In order to overcome conventional obstacles, this research paper investigates novel approaches for cricket pitch and ball detection that make use of cutting-edge technologies. The research integrates OpenCV for pitch inspection and modifies the YOLOv8 model for cricket ball detection in order to overcome the shortcomings of manual pitch assessment and traditional ball detection techniques. To ensure flexibility in a range of pitch environments, the pitch detection method leverages OpenCV’s color space transformation, contour extraction, and accurate color range defining features. Regarding ball detection, the YOLOv8 model emphasizes the preservation of minor object details to improve accuracy and is specifically trained to the unique properties of cricket balls. The methods are more reliable because of the careful preparation of the datasets, which include novel ball and pitch information. These cutting-edge methods not only improve cricket analytics but also set the stage for flexible methods in more general sports technology applications.

Keywords: OpenCV, YOLOv8, cricket, custom dataset, computer vision, sports

Procedia PDF Downloads 68

809 Evaluation of Features Extraction Algorithms for a Real-Time Isolated Word Recognition System

Authors: Tomyslav Sledevič, Artūras Serackis, Gintautas Tamulevičius, Dalius Navakauskas

Abstract:

This paper presents a comparative evaluation of feature extraction algorithms for a real-time isolated word recognition system based on FPGA. The Mel-frequency cepstral, linear frequency cepstral, linear predictive, and linear predictive cepstral coefficients were implemented in a hardware/software design. The proposed system was investigated in the speaker-dependent mode for 100 different Lithuanian words. The robustness of the feature extraction algorithms was tested by recognizing speech records at different signal-to-noise ratios. The experiments on clean records show the highest accuracy for Mel-frequency cepstral and linear frequency cepstral coefficients. For records with a 15 dB signal-to-noise ratio, the linear predictive cepstral coefficients give the best result. The hardware and software parts of the system are clocked at 50 MHz and 100 MHz, respectively. For classification, a pipelined dynamic time warping core was implemented. The proposed word recognition system satisfies the real-time requirements and is suitable for applications in embedded systems.
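
The classification step in this abstract relies on dynamic time warping (DTW), which aligns two feature sequences of different lengths before comparing them. The sketch below is a plain software illustration of textbook DTW with nearest-template classification; it is not the pipelined FPGA core described by the authors, and the function names, the Euclidean frame distance, and the toy templates are assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of equal-dimension feature vectors."""
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            frame_dist = math.dist(a[i - 1], b[j - 1])  # Euclidean distance (assumed)
            cost[i][j] = frame_dist + min(
                cost[i - 1][j],      # a advances, b repeats a frame
                cost[i][j - 1],      # b advances, a repeats a frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify(query, templates):
    """Return the label of the stored template with the smallest DTW cost to the query."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Hypothetical one-dimensional "feature vectors" standing in for cepstral frames.
templates = {
    "yes": [(0.0,), (1.0,), (2.0,)],
    "no": [(5.0,), (5.0,), (5.0,)],
}
query = [(0.0,), (0.0,), (1.0,), (2.0,)]  # a time-stretched version of "yes"
label = classify(query, templates)
```

In a speaker-dependent setup such as the one described, each word would have at least one recorded template, and the feature vectors would be MFCC, LFCC, or LPCC frames rather than the single numbers used here.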

Keywords: isolated word recognition, features extraction, MFCC, LFCC, LPCC, LPC, FPGA, DTW

Procedia PDF Downloads 488

808 Hyperspectral Image Classification Using Tree Search Algorithm

Authors: Shreya Pare, Parvin Akhter

Abstract:

Remote sensing image classification is a very challenging task owing to the high dimensionality of hyperspectral images. Pixel-wise classification methods fail to take the spatial structure information of an image into account. Therefore, to improve classification performance, spatial information can be integrated into the classification process. In this paper, a multilevel thresholding algorithm based on a modified fuzzy entropy (MFE) function is used to perform the segmentation of hyperspectral images. The fuzzy parameters of the MFE function have been optimized by using a new meta-heuristic algorithm based on the Tree-Search algorithm. The segmented image is classified by a large distribution machine (LDM) classifier. Experimental results are shown on a hyperspectral image dataset. The experimental outputs indicate that the proposed technique (MFE-TSA-LDM) achieves much higher classification accuracy for hyperspectral images when compared to state-of-the-art classification techniques. The proposed algorithm provides accurate segmentation and classification maps, thus becoming more suitable for image classification with large spatial structures.

Keywords: classification, hyperspectral images, large distribution margin, modified fuzzy entropy function, multilevel thresholding, tree search algorithm, hyperspectral image classification using tree search algorithm

Procedia PDF Downloads 172

807 Internationalization Strategies and Firm Productivity: Manufacturing Firm-Level Evidence from Ethiopia

Authors: Soressa Tolcha Jarra

Abstract:

Looking into firm-level internationalization strategies and their effects on firms' productivity is needed in order to understand the role of firms’ participation in trading activities on the one hand and the effects of firms’ internationalization strategies on firm-level productivity on the other. Thus, this study aims to investigate firms' imports of intermediates and export strategies and their impact on firm productivity using an establishment-level panel dataset from Ethiopian manufacturing firms over the period 2011–2020. Methodologically, the firm’s joint decision to import intermediates and to export is estimated by system GMM using Wooldridge's approach. The translog production function is used to estimate firm-level productivity by considering a general Markov process. The size of the firm is used in a mediating role. The result indicates evidence of the self-selection of more productive firms into exporting and importing intermediates, which is indicative of sizable export and import market entry costs. Furthermore, there is evidence in favor of the learning by exporting (LBE) and learning by importing (LBI) hypotheses for small and medium Ethiopian manufacturing firms. However, for large firms, there is only evidence in support of the LBE hypothesis.

Keywords: Ethiopia, export, firm productivity, intermediate imports

Procedia PDF Downloads 29

806 Cross-Dialectal Study of Issues in Dagbanli Phonology

Authors: Abdul-Razak Inusah

Abstract:

The study is a cross-sectional investigation of issues in the phonology of Dagbanli, a Mabia language spoken in the Northern Region of Ghana. The issues investigated and assessed are the status of the velar fricatives [x, ɣ] and the flap [ɾ] across Dagbanli dialects. The ethnographic approach is employed to solicit the primary data from bucolic Dagbanli speech communities. The descriptive method is used for the analysis of the primary data available. The investigation reveals that the dialects have the velar fricatives [x, ɣ] confined to specific segmental contexts with a particular inventory stricture. The flap [ɾ] is noticed to occur mostly in intervocalic position but is entirely absent word-initially in indigenous Dagbanli words. The velar fricatives [x, ɣ] and the flap [ɾ] are observed to be non-contrastive and only suffice as dialectal allophones in the language. The paper shows evidence of coalescence of the non-coronal labial /m/ and the coronal fricative /s/ to produce the dorsal fricative [x] in intervocalic position, and coalescence of the stem-final stop /ɡ/ and the suffix-onset fricative /s/ to yield the dorsal fricative [x], a finding which shows the status of the segment [x] in Dagbanli phonology. The paper concludes that the segments [x], [ɣ] and [ɾ] are positional variants of /ɡ+s/ or /m+s/, /ɡ/ and /d/.

Keywords: Dagbani, phonology, dialect, segment, fricatives, coalesce

Procedia PDF Downloads 46

805 Drinking Water Quality Assessment Using Fuzzy Inference System Method: A Case Study of Rome, Italy

Authors: Yas Barzegar, Atrin Barzegar

Abstract:

Drinking water quality assessment is a major issue today. Technology and practices are continuously improving, and Artificial Intelligence (AI) methods have proven their efficiency in this domain. The current research seeks a hierarchical fuzzy model for predicting drinking water quality in Rome (Italy). The Mamdani fuzzy inference system (FIS) is applied with different defuzzification methods. The proposed model includes three fuzzy intermediate models and one fuzzy final model. Each fuzzy model consists of three input parameters and 27 fuzzy rules. The model is developed for water quality assessment with a dataset considering nine parameters (Alkalinity, Hardness, pH, Ca, Mg, Fluoride, Sulphate, Nitrates, and Iron). Fuzzy-logic-based methods have been demonstrated to be appropriate for addressing uncertainty and subjectivity in drinking water quality assessment; they are an effective means of managing complicated, uncertain water systems and predicting drinking water quality. The FIS method can provide an effective solution for complex systems and can be modified easily to improve performance.
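
A rule count of 27 for three inputs is what results if each input has three linguistic levels (3 × 3 × 3 = 27); the abstract does not state this, so it is an assumption here. Under that assumption, one Mamdani sub-model can be sketched as below. The triangular membership functions, the inputs normalised to [0, 1], the rule consequents, and centroid defuzzification are all hypothetical choices for illustration and are not taken from the paper.

```python
from itertools import product

LEVELS = ("low", "medium", "high")
PEAKS = {"low": 0.0, "medium": 0.5, "high": 1.0}  # assumed peaks on a normalised [0, 1] scale


def tri(x, a, b, c):
    """Triangular membership function with feet a and c and peak b."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)


def membership(x, level):
    peak = PEAKS[level]
    return tri(x, peak - 0.5, peak, peak + 0.5)


# Hypothetical rule base: one rule per combination of input levels (27 in total).
# The consequent is simply the rounded average level of the three antecedents.
RULES = {
    combo: LEVELS[round(sum(LEVELS.index(level) for level in combo) / 3)]
    for combo in product(LEVELS, repeat=3)
}


def infer(inputs, steps=100):
    """Mamdani inference: min for AND, max for aggregation, centroid defuzzification."""
    strength = {level: 0.0 for level in LEVELS}
    for combo, out_level in RULES.items():
        firing = min(membership(x, level) for x, level in zip(inputs, combo))
        strength[out_level] = max(strength[out_level], firing)
    numerator = denominator = 0.0
    for k in range(steps + 1):
        y = k / steps
        # Each output set is clipped at its rule strength, then the sets are merged by max.
        mu = max(min(strength[level], membership(y, level)) for level in LEVELS)
        numerator += y * mu
        denominator += mu
    return numerator / denominator if denominator else 0.0


score = infer((0.5, 0.5, 0.5))
```

In a hierarchy like the one described, the nine measured parameters could feed three such sub-models, whose three outputs then feed a final model of the same shape; that arrangement is one plausible reading of the abstract, not a confirmed design.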

Keywords: water quality, fuzzy logic, smart cities, water attribute, fuzzy inference system, membership function

Procedia PDF Downloads 74

804 Tibyan Automated Arabic Correction Using Machine-Learning in Detecting Syntactical Mistakes

Authors: Ashwag O. Maghraby, Nida N. Khan, Hosnia A. Ahmed, Ghufran N. Brohi, Hind F. Assouli, Jawaher S. Melibari

Abstract:

The Arabic language is one of the most important languages. Learning it is important for many people around the world because of its religious and economic importance, and the real challenge lies in practicing it without grammatical or syntactical mistakes. This research focused on detecting and correcting syntactic mistakes in Arabic according to their position in the sentence, and on two of the main syntactical rules in Arabic: Dual and Plural. It analyzes each sentence in the text, using the Stanford CoreNLP morphological analyzer and a machine-learning approach, in order to detect the syntactical mistakes and then correct them. A prototype of the proposed system was implemented and evaluated. It uses a support vector machine (SVM) algorithm to detect Arabic grammatical errors and corrects them using a rule-based approach. The prototype system achieves an accuracy of 81%. In general, it shows a set of useful grammatical suggestions that the user may forget about while writing, due to lack of familiarity with grammar or as a result of the speed of writing, such as alerting the user when a plural term is used to indicate one person.

Keywords: Arabic language acquisition and learning, natural language processing, morphological analyzer, part-of-speech

Procedia PDF Downloads 147

803 Developing a Web GIS Tool for the Evaluation of Soil Erosion of a Watershed

Authors: Y. Fekir, K. Mederbal, M. A. Hamadouche, D. Anteur

Abstract:

Soil erosion by water has become one of the biggest environmental problems in the world, threatening the majority of countries. There are several models to evaluate erosion. These models are still a simplified representation of reality. They permit the analysis of complex systems, complement measurements by allowing extrapolation in time and space, and may combine different factors. The empirical model of soil loss proposed by Wischmeier and Smith (the Universal Soil Loss Equation, USLE) is widely used in many countries. It considers erosion to be a multiplicative function of five factors: rainfall erosivity (R), soil erodibility (K), topography (LS), erosion control practices (P), and vegetation cover and agricultural practices (C). In this work, we developed a tool based on Web GIS functionality to evaluate soil losses caused by erosion, taking these five factors into account. This tool allows the user to integrate all the data needed for the evaluation (DEM, land use, rainfall ...) in the form of digital layers to calculate the five factors of the USLE (R, K, C, P, LS). After processing of the integrated dataset, a map of the soil losses is produced as a result. We tested the proposed tool on a watershed located in the west of Algeria, where a dataset was collected and prepared.
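
The multiplicative form of the USLE described in this abstract is A = R · K · LS · C · P, where A is the estimated soil loss. The sketch below applies it cell by cell to small raster layers, which is roughly what a GIS tool does when it combines factor layers into a soil-loss map. It is an illustration only, not the authors' tool; the function names and all factor values are hypothetical, and the units of A depend on the unit convention chosen for R and K.

```python
def usle_soil_loss(r, k, ls, c, p):
    """Universal Soil Loss Equation: A = R * K * LS * C * P."""
    return r * k * ls * c * p


def soil_loss_map(r_layer, k_layer, ls_layer, c_layer, p_layer):
    """Apply the USLE cell by cell to five equally shaped 2D grids (lists of rows)."""
    return [
        [usle_soil_loss(*cell) for cell in zip(*rows)]
        for rows in zip(r_layer, k_layer, ls_layer, c_layer, p_layer)
    ]


# Hypothetical 1 x 2 factor layers (one row, two cells).
r_layer = [[100.0, 100.0]]   # rainfall erosivity
k_layer = [[0.3, 0.2]]       # soil erodibility
ls_layer = [[1.5, 2.0]]      # slope length and steepness
c_layer = [[0.2, 0.5]]       # cover and management
p_layer = [[1.0, 0.5]]       # erosion control practice

loss = soil_loss_map(r_layer, k_layer, ls_layer, c_layer, p_layer)
```

Because the model is a pure product, a value of zero in any layer (for example, full protective cover) yields zero estimated loss for that cell.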

Keywords: USLE, erosion, web gis, Algeria

Procedia PDF Downloads 326

802 A Comparative Study of Additive and Nonparametric Regression Estimators and Variable Selection Procedures

Authors: Adriano Z. Zambom, Preethi Ravikumar

Abstract:

One of the biggest challenges in nonparametric regression is the curse of dimensionality. Additive models are known to overcome this problem by estimating only the individual additive effects of each covariate. However, if the model is misspecified, the accuracy of the estimator compared to the fully nonparametric one is unknown. In this work, the efficiency of completely nonparametric regression estimators such as Loess is compared to that of estimators that assume additivity in several situations, including additive and non-additive regression scenarios. The comparison is done by computing the oracle mean square error of the estimators with regard to the true nonparametric regression function. Then, a backward elimination selection procedure based on the Akaike Information Criterion is proposed, which is computed from either the additive or the nonparametric model. Simulations show that if the additive model is misspecified, the percentage of time it fails to select important variables can be higher than that of the fully nonparametric approach. A dimension reduction step is included when the nonparametric estimator cannot be computed due to the curse of dimensionality. Finally, the Boston housing dataset is analyzed using the proposed backward elimination procedure, and the selected variables are identified.

Keywords: additive model, nonparametric regression, variable selection, Akaike Information Criteria

Procedia PDF Downloads 261

801 Teaching How to Speak ‘Correct’ English in No Time: An Assessment of the ‘Success’ of Professor Higgins’ Motivation in George Bernard Shaw’s Pygmalion

Authors: Armel Mbon

Abstract:

This paper examines the ‘success’ of George Bernard Shaw's main character Professor Higgins' motivation in teaching Eliza Doolittle, a young Cockney flower girl, how to speak 'correct' English in no time in Pygmalion. It should be noted that Shaw, in whose writings language issues feature prominently, does not believe there is such a thing as perfectly correct English, but believes in the varieties of spoken English as a source of its richness. Indeed, along with his fellow phonetician Colonel Pickering, Henry Higgins succeeds in teaching Eliza, whom he first judges unfairly, the dialect of the upper classes and Received Pronunciation, to facilitate her social advancement. So, after six months of rigorous learning, Eliza's speech and manners are transformed, and she is able to pass herself off as a lady. Such is the success of Professor Higgins’ motivation in linguistically transforming his learner in record time. On the other hand, his motivation is unsuccessful since, by the end of the play, he cannot have Eliza, whom he believes he has shaped to his so-called good image, as his wife. So, this paper aims to show, in support of the psychological approach, that in motivation, feelings, pride and prejudice cannot be combined, and that one should not pre-judge someone’s attitude based purely on how well they speak English.

Keywords: teaching, speak, in no time, success

Procedia PDF Downloads 64

800 Ensemble of Deep CNN Architecture for Classifying the Source and Quality of Teff Cereal

Authors: Belayneh Matebie, Michael Melese

Abstract:

The study focuses on addressing the challenges in classifying and ensuring the quality of Eragrostis Teff, a small and round grain that is the smallest cereal grain. Employing a traditional classification method is challenging because of its small size and the similarity of its environmental characteristics. To overcome this, this study employs a machine learning approach to develop a source and quality classification system for Teff cereal. Data is collected from various production areas in the Amhara regions, considering two types of cereal (high and low quality) across eight classes. A total of 5,920 images are collected, with 740 images for each class. Image enhancement techniques, including scaling, data augmentation, histogram equalization, and noise removal, are applied to preprocess the data. Convolutional Neural Network (CNN) is then used to extract relevant features and reduce dimensionality. The dataset is split into 80% for training and 20% for testing. Different classifiers, including FVGG16, FINCV3, QSCTC, EMQSCTC, SVM, and RF, are employed for classification, achieving accuracy rates ranging from 86.91% to 97.72%. The ensemble of FVGG16, FINCV3, and QSCTC using the Max-Voting approach outperforms individual algorithms.
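
The Max-Voting approach named in this abstract combines classifiers by taking, for each sample, the label predicted by the most models. The sketch below shows that idea for three models; it is not the authors' implementation, and the function name, the class labels, and the tie-breaking rule (the earliest model in the list wins a tie) are assumptions made for illustration.

```python
from collections import Counter


def max_vote(model_predictions):
    """Combine per-model label lists into one list by majority vote per sample.

    model_predictions: one list of predicted labels per model, all of equal length.
    On a tie, the label voted for by the earliest model in the list is returned,
    because Counter.most_common keeps first-seen order among equal counts.
    """
    combined = []
    for votes in zip(*model_predictions):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical predictions of three models for three images.
predictions = [
    ["A", "B", "C"],  # stand-in for the first model
    ["A", "C", "C"],  # stand-in for the second model
    ["B", "B", "D"],  # stand-in for the third model
]
ensemble = max_vote(predictions)
```

An ensemble of this kind can beat each individual model when the models make different mistakes, since a single wrong vote is outvoted by the other two.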

Keywords: Teff, ensemble learning, max-voting, CNN, SVM, RF

Procedia PDF Downloads 43

799 Accuracy Improvement of Traffic Participant Classification Using Millimeter-Wave Radar by Leveraging Simulator Based on Domain Adaptation

Authors: Tokihiko Akita, Seiichi Mita

Abstract:

A millimeter-wave radar is the most robust against adverse environments, making it an essential environment recognition sensor for automated driving. However, the reflection signal is sparse and unstable, so it is difficult to obtain high recognition accuracy. Deep learning provides high recognition accuracy even for such signals, but requires large-scale datasets with ground truth. In particular, annotating millimeter-wave radar data is very costly. As a solution, utilizing a simulator that can generate a huge annotated dataset is effective. However, simulation of the radar is more difficult to match with real-world data than simulation of camera images, and recognition by deep learning with higher-order features using the simulator causes further deviation. We have attempted to improve the accuracy of traffic participant classification by fusing simulator and real-world data with a domain adaptation technique. Experimental results with our domain adaptation network show that classification accuracy can be improved even with a small amount of real-world data.

Keywords: millimeter-wave radar, object classification, deep learning, simulation, domain adaptation

Procedia PDF Downloads 88

798 Braille Lab: A New Design Approach for Social Entrepreneurship and Innovation in Assistive Tools for the Visually Impaired

Authors: Claudio Loconsole, Daniele Leonardis, Antonio Brunetti, Gianpaolo Francesco Trotta, Nicholas Caporusso, Vitoantonio Bevilacqua

Abstract:

Unfortunately, many people still do not have access to communication, particularly reading and writing. Among them, people who are blind or visually impaired have several difficulties in accessing the world compared to the sighted. Indeed, despite technological advancement and cost reduction, assistive devices such as Braille-based input/output systems, which enable reading and writing texts (e.g., personal notes, documents), are still expensive. As a consequence, the affordability of assistive technology is fundamental in supporting the visually impaired in communication, learning, and social inclusion. This, in turn, has serious consequences in terms of equal access to opportunities, freedom of expression, and actual and independent participation in a society designed for the sighted. Moreover, the visually impaired experience difficulties in recognizing objects and interacting with devices in activities of daily living. It is no coincidence that Braille indications are commonly found only on medicine boxes and elevator keypads. Several software applications for the automatic translation of written text into speech (e.g., Text-To-Speech, TTS) enable reading pieces of documents. However, apart from simple tasks, in many circumstances TTS software is not suitable for understanding very complicated pieces of text that require dwelling on specific portions (e.g., mathematical formulas or Greek text). In addition, the experience of reading/writing text is completely different both in terms of engagement and from an educational perspective. Statistics on the employment rate of blind people show that learning to read and write provides the visually impaired with up to 80% more opportunities of finding a job.
Especially at higher educational levels, where the ability to digest very complex text is key, the accessibility and availability of Braille play a fundamental role in reducing the drop-out rate of the visually impaired, thus affecting the effectiveness of the constitutional right to access education. In this context, the Braille Lab project aims to address these social needs by including affordability in the design and development of assistive tools for visually impaired people. In detail, our awarded project focuses on a technological innovation in the operating principle of existing assistive tools for the visually impaired, leaving the Human-Machine Interface unchanged. This can result in a significant reduction in production costs and consequently in tool selling prices, thus representing an important opportunity for social entrepreneurship. The first two assistive tools designed within the Braille Lab project following the proposed approach aim, respectively, to make it possible to personally print documents and handouts and to read texts written in Braille using a refreshable Braille display. The former, named ‘Braille Cartridge’, represents an alternative solution for printing in Braille and consists of an electronically controlled printing dispenser (cartridge) which can be integrated within traditional ink-jet printers, in order to leverage the efficiency and cost of the mechanical structure of devices already in use. The latter, named ‘Braille Cursor’, is an innovative Braille display featuring a substantial technological innovation by means of a unique cursor virtualizing Braille cells, thus limiting the number of active pins needed for Braille characters.

Keywords: Human rights, social challenges and technology innovations, visually impaired, affordability, assistive tools

Procedia PDF Downloads 272

797 Analysis of Facial Expressions with Amazon Rekognition

Authors: Kashika P. H.

Abstract:

The development of computer vision systems has been greatly aided by efficient and precise detection in images and videos. Although recognizing and comprehending images is a strength of the human brain, using technology to tackle this task is exceedingly challenging. In the past few years, the use of deep learning algorithms for object detection has expanded dramatically. One of the key issues in image recognition is the recognition and detection of particular notable people in randomly acquired photographs. Face recognition is a way to identify, assess, and compare faces for a variety of purposes, including user identification, user counting, and classification. With the aid of an accessible deep learning-based API, this article aims to recognize people's faces and their facial descriptors more accurately. The purpose of this study is to locate the relevant individuals and deliver accurate information about them by using the Amazon Rekognition system to identify a specific person in a vast image dataset. We chose the Amazon Rekognition system, which allows for more accurate face analysis, face comparison, and face search, to tackle this difficulty.

Keywords: Amazon Rekognition, API, deep learning, computer vision, face detection, text detection

Procedia PDF Downloads 102

796 Diabetes Diagnosis Model Using Rough Set and K-Nearest Neighbor Classifier

Authors: Usiobaifo Agharese Rosemary, Osaseri Roseline Oghogho

Abstract:

Diabetes is a complex group of diseases with a variety of causes; it is a disorder of the body's metabolism in the digestion of carbohydrates. The application of machine learning to medical diagnosis has been the focus of many researchers, and the use of recognition and classification models as decision support tools has helped medical experts in diagnosing diseases. Considering the large volume of medical data, which requires special techniques, experience, and high diagnostic skill, an artificial intelligence system that assists medical personnel and enhances their efficiency and accuracy in diagnosis would be an invaluable tool. This study proposes a diabetes diagnosis model using rough set and the K-nearest neighbor classifier algorithm. The system consists of two modules, a feature extraction module and a predictor module: rough set is used to preprocess the attributes, while the K-nearest neighbor classifier is used to classify the given data. The dataset used for this model was taken from the University of Benin Teaching Hospital (UBTH) database. Half of the data was used for training, while the other half was used for testing the system. The proposed model achieved over 80% accuracy.
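The predictor module described here can be sketched as a plain K-nearest neighbor classifier with a half/half split between training and testing. This is only an illustration: the two attributes (a glucose reading and a BMI value), the records, and k = 3 are hypothetical, and the rough set attribute preprocessing is not reproduced.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    top_k = [label for _, label in dists[:k]]
    return Counter(top_k).most_common(1)[0][0]

# Hypothetical records: ((glucose, BMI), label) with 1 = diabetic, 0 = not
records = [
    ((85, 22.0), 0), ((90, 24.5), 0), ((150, 31.0), 1), ((165, 33.5), 1),
    ((95, 23.0), 0), ((88, 21.5), 0), ((160, 32.0), 1), ((155, 30.5), 1),
]

# Half of the data for training, the other half for testing
half = len(records) // 2
train, test = records[:half], records[half:]
train_X = [x for x, _ in train]
train_y = [y for _, y in train]

correct = sum(knn_predict(train_X, train_y, x) == y for x, y in test)
accuracy = correct / len(test)
print(f"accuracy on held-out half: {accuracy:.2f}")
```

In practice the attributes would usually be scaled before computing distances, since a feature with a large numeric range (here, glucose) otherwise dominates the Euclidean distance.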

Keywords: classifier algorithm, diabetes, diagnostic model, machine learning

Procedia PDF Downloads 332

795 Track Initiation Method Based on Multi-Algorithm Fusion Learning of 1DCNN and Bi-LSTM

Authors: Zhe Li, Aihua Cai

Abstract:

High-density clutter and interference affect radar target track initiation in ECM and complex radar missions, and traditional radar target track initiation methods have difficulty adapting. To this end, we propose a multi-algorithm fusion learning track initiation algorithm, which transforms the track initiation problem into a true-false track discrimination problem, and design an algorithm based on 1DCNN (One-Dimensional CNN) combined with Bi-LSTM (Bi-Directional Long Short-Term Memory) for fusion classification. The experimental dataset consists of real trajectories obtained from measurements of a certain type of three-coordinate radar, and the proposed method is compared with traditional track initiation methods such as the rule-based method, the logic-based method, and the Hough-transform-based method. The simulation results show that the overall performance of the multi-algorithm fusion learning track initiation algorithm is significantly better than that of the traditional methods, and that the real track initiation rate can be effectively improved under high clutter density, with an average initiation time similar to that of the logic-based method.

Keywords: track initiation, multi-algorithm fusion, 1DCNN, Bi-LSTM

Procedia PDF Downloads 77

794 Fast Adjustable Threshold for Uniform Neural Network Quantization

Authors: Alexander Goncharenko, Andrey Denisov, Sergey Alyamkin, Evgeny Terentev

Abstract:

Neural network quantization is a highly desirable procedure to perform before running neural networks on mobile devices. Quantization without fine-tuning leads to a drop in model accuracy, whereas the commonly used training with quantization is done on the full set of labeled data and is therefore both time- and resource-consuming. Real-life applications require a simpler and faster quantization procedure that maintains the accuracy of the full-precision neural network, especially for modern mobile neural network architectures like MobileNet-v1, MobileNet-v2, and MNAS. Here we present a method that significantly optimizes the training-with-quantization procedure by introducing trained scale factors for the discretization thresholds, which are separate for each filter. Using the proposed technique, we quantize modern mobile neural network architectures with a training set of only ∼10% of the total ImageNet 2012 sample. This reduction in training dataset size, together with the small number of trainable parameters, allows the network to be fine-tuned within several hours while maintaining the high accuracy of the quantized model (the accuracy drop was less than 0.5%). Ready-for-use models and code are available in the GitHub repository.
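The idea of a per-filter discretization threshold adjusted by a scale factor can be sketched roughly as follows. This is an illustrative assumption, not the authors' code: the function name, the symmetric 8-bit scheme, and the choice of threshold as `alpha * max|w|` are hypothetical, and `alpha` is fixed by hand here rather than trained.

```python
def quantize_filter(weights, alpha=1.0, bits=8):
    """Symmetric uniform quantization of one filter's weights.

    The clipping threshold is alpha * max|w|; alpha stands in for a
    per-filter scale factor that would be learned during fine-tuning.
    """
    levels = 2 ** (bits - 1) - 1              # 127 positive levels for 8 bits
    threshold = alpha * max(abs(w) for w in weights)
    if threshold == 0:
        return [0.0 for _ in weights]
    step = threshold / levels                 # width of one quantization bin
    quantized = []
    for w in weights:
        clipped = max(-threshold, min(threshold, w))
        quantized.append(round(clipped / step) * step)
    return quantized

# Two hypothetical filters, each with its own scale factor
filters = [[0.5, -1.0, 0.25], [0.02, 0.01, -0.03]]
alphas = [1.0, 0.5]
quantized_filters = [quantize_filter(f, a) for f, a in zip(filters, alphas)]
print(quantized_filters)
```

A scale factor below 1 narrows the threshold, trading clipping of outlier weights for finer resolution on the remaining ones; making that trade-off per filter is what the abstract attributes to the trained scale factors.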

Keywords: distillation, machine learning, neural networks, quantization

Procedia PDF Downloads 320

793 Cross Line of Causality in Childhood Stuttering between Psychology and Neurolinguistics: Systematic Literature Review and Meta-Analysis

Authors: Sadeq Al Yaari, Muhammad Alkhunayn, Ayman Al Yaari, Montaha Al Yaari, Aayah Al Yaari, Adham Al Yaari, Sajedah Al Yaari, Fatehi Eissa

Abstract:

Stuttering is a multidimensional disorder that is influenced by different factors. Because the genuine reasons behind stuttering are poorly understood, psychiatrists and Speech and Language Pathologists/Therapists (SLP/Ts) are often unfamiliar with the psychoneurolinguistic characteristics, support needs, and disability measurement that affect the rehabilitation requested by the stuttering population. In this Systematic Literature Review and Meta-analysis (SLR and Meta-analysis), searches of PubMed, PsycInfo, Web of Science, Scopus, and Google Scholar, in addition to some unpublished literature, were conducted to identify whether stuttering has psychological or neurological causes. The study concluded that psychological, not neurolinguistic, factors were most significant for the causality of childhood stuttering. Stutterers have intact language skills, but their ability to communicate with others is more impaired than their ability to form letters in the brain or to articulate them. The study recommends future research that sheds light on the adult stuttering population, which is often left out of the focus of diagnosis and needs further exploration regarding the issues they encounter and the possible ways to deal with them psychoneurolinguistically.

Keywords: causality, childhood stuttering, psychology, neurolinguistics, systematic literature review, meta-analysis

Procedia PDF Downloads 46

792 An Event Relationship Extraction Method Incorporating Deep Feedback Recurrent Neural Network and Bidirectional Long Short-Term Memory

Authors: Yin Yuanling

Abstract:

An event relationship extraction method combining a Deep Feedback Recurrent Neural Network (DFRNN) and Bidirectional Long Short-Term Memory (BiLSTM) is designed to address the low accuracy of traditional relationship extraction models. The method combines DFRNN, which extracts local features of text based on a deep feedback recurrent mechanism; BiLSTM, which better extracts global features of text; and Self-Attention, which extracts semantic information. Experiments show that the method achieves an F1 value of 76.69% on the CEC dataset, which is 0.0652 (6.52 percentage points) better than the BiLSTM+Self-ATT model, thus improving the performance of the deep learning method in the event relationship extraction task.

Keywords: event relations, deep learning, DFRNN models, bi-directional long and short-term memory networks

Procedia PDF Downloads 140

791 On the Implementation of the Pulse Coupled Neural Network (PCNN) in the Vision of Cognitive Systems

Authors: Hala Zaghloul, Taymoor Nazmy

Abstract:

One of the great challenges of the 21st century is to build a robot that can perceive and act within its environment and communicate with people, while also exhibiting the cognitive capabilities that lead to performance like that of people. The Pulse Coupled Neural Network (PCNN) is a relatively new ANN model, derived from a mammalian neural model, with great potential in image processing as well as in target recognition, feature extraction, speech recognition, combinatorial optimization, and compressed encoding. PCNN has unique features among neural network types, which make it a candidate to be an important approach for perception in cognitive systems. This work shows and emphasizes the potential of PCNN to perform different tasks related to image processing. The main obstacle that prevents the direct implementation of this technique is the need to find a way to control the PCNN parameters so that the network performs a specific task. This paper evaluates the performance of the standard PCNN model for processing images with different properties, selects the important parameters that give a significant result, and discusses approaches for adapting the PCNN parameters to perform a specific task.
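To make the role of the PCNN parameters concrete, the following is a minimal sketch of one iteration of a simplified PCNN, in which the feeding input is taken to be the pixel intensity itself. It is not the paper's implementation: the uniform 8-neighbour linking weights and every parameter value (`beta`, `a_l`, `a_t`, `v_l`, `v_t`) are assumptions chosen only for illustration.

```python
import math

def pcnn_step(S, L, theta, Y, beta=0.2, a_l=1.0, a_t=0.2, v_l=1.0, v_t=20.0):
    """One iteration of a simplified PCNN over a 2D image S (values in [0, 1]).

    L: linking input, theta: dynamic threshold, Y: previous binary pulses.
    Returns the updated (L, theta, Y).
    """
    rows, cols = len(S), len(S[0])
    new_L = [[0.0] * cols for _ in range(rows)]
    new_theta = [[0.0] * cols for _ in range(rows)]
    new_Y = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # pulses emitted by the 8 neighbours in the previous iteration
            link = sum(
                Y[a][b]
                for a in range(max(0, i - 1), min(rows, i + 2))
                for b in range(max(0, j - 1), min(cols, j + 2))
                if (a, b) != (i, j)
            )
            new_L[i][j] = math.exp(-a_l) * L[i][j] + v_l * link
            # internal activity: stimulus modulated by the linking input
            U = S[i][j] * (1 + beta * new_L[i][j])
            new_Y[i][j] = 1 if U > theta[i][j] else 0
            # the threshold decays, then jumps for neurons that just fired
            new_theta[i][j] = math.exp(-a_t) * theta[i][j] + v_t * new_Y[i][j]
    return new_L, new_theta, new_Y

# Hypothetical 3x3 image: a bright region on a dark background
S = [[0.9, 0.9, 0.1], [0.9, 0.9, 0.1], [0.1, 0.1, 0.1]]
L0 = [[0.0] * 3 for _ in range(3)]
theta0 = [[0.5] * 3 for _ in range(3)]
Y0 = [[0] * 3 for _ in range(3)]

L1, theta1, Y1 = pcnn_step(S, L0, theta0, Y0)
print(Y1)  # binary pulse map after the first iteration
```

Even in this reduced form, the output depends on five interacting parameters plus the initial threshold, which illustrates why the abstract singles out parameter control as the main obstacle.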

Keywords: cognitive system, image processing, segmentation, PCNN kernels

Procedia PDF Downloads 273

790 Multi-Spectral Deep Learning Models for Forest Fire Detection

Authors: Smitha Haridasan, Zelalem Demissie, Atri Dutta, Ajita Rattani

Abstract:

Aided by the wind, all it takes is one ember and a few minutes to create a wildfire. Wildfires are growing in frequency and size due to climate change. Wildfires and their consequences are among the major environmental concerns. Every year, millions of hectares of forest are destroyed around the world, causing mass destruction and human casualties. Early detection of wildfire is therefore a critical component in mitigating this threat. Many computer vision-based techniques have been proposed for the early detection of forest fires using video surveillance. Several of these methods predict and detect forest fires in various spectrums, namely RGB, HSV, and YCbCr. The aim of this paper is to propose a multi-spectral deep learning model that combines information from different spectrums at intermediate layers for accurate fire detection. A heterogeneous dataset assembled from publicly available datasets is used for model training and evaluation in this study. The experimental results show that multi-spectral deep learning models could obtain an improvement of about 4.68% over those based on a single spectrum for fire detection.
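As a small illustration of the three representations named in this abstract, the sketch below converts a single RGB pixel to HSV and YCbCr. It assumes 8-bit RGB input and the full-range BT.601 (JPEG-style) YCbCr coefficients; the helper names are hypothetical, and the fusion of these inputs at intermediate network layers is not shown.

```python
import colorsys

def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JPEG-style) conversion for 8-bit RGB values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def pixel_views(r, g, b):
    """Return the same pixel in the RGB, HSV, and YCbCr representations."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return {
        "rgb": (r, g, b),
        "hsv": (h, s, v),          # each component in [0, 1]
        "ycbcr": rgb_to_ycbcr(r, g, b),
    }

views = pixel_views(255, 120, 0)   # an orange, flame-like pixel
for name, value in views.items():
    print(name, value)
```

Each representation separates brightness from colour differently, which is one plausible reason a model fed several of them could do better than a model fed only one.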

Keywords: deep learning, forest fire detection, multi-spectral learning, natural hazard detection

Procedia PDF Downloads 233

789 Prediction of the Crustal Deformation of Volcán Nevado del Ruíz in the Year 2020 Using Tropomi Tropospheric Information, DInSAR Technique, and Neural Networks

Authors: Juan Sebastián Hernández

Abstract:

The Nevado del Ruíz volcano, located on the border between the Departments of Caldas and Tolima in Colombia, showed unstable behaviour during 2020. This volcanic activity led to secondary effects on the crust, which is why predicting deformations has become a task for geoscientists. In this article, tropospheric variables such as evapotranspiration, UV aerosol index, carbon monoxide, nitrogen dioxide, methane, and surface temperature, among others, are used to train a set of neural networks that can predict the behaviour of the resulting phase of an unwrapped interferogram obtained with the DInSAR technique, with the main objective of identifying and characterising the behaviour of the crust based on environmental conditions. For this purpose, variables were collected, a generalised linear model was created, and a set of neural networks was created. After training, validation was carried out with the test data, giving an MSE of 0.17598 and an associated r-squared of approximately 0.88454. The resulting model provided a dataset with good thematic accuracy, reflecting the behaviour of the volcano in 2020, given a set of environmental characteristics.
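The two validation figures quoted in this abstract, MSE and r-squared, can be computed as in the sketch below. The observed and predicted values are made up for illustration and are unrelated to the study's data.

```python
def mse(y_true, y_pred):
    """Mean squared error between observations and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical test-set phase values and model predictions
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print(f"MSE = {mse(observed, predicted):.3f}")
print(f"R^2 = {r_squared(observed, predicted):.3f}")
```

MSE is expressed in the squared units of the target, so it is only meaningful relative to the target's scale, whereas r-squared is scale-free; reporting both, as the abstract does, gives complementary views of the fit.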

Keywords: crustal deformation, Tropomi, neural networks (ANN), volcanic activity, DInSAR

Procedia PDF Downloads 98