Search results for: supervised classifiers
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 514

244 Deep learning with Noisy Labels : Learning True Labels as Discrete Latent Variable

Authors: Azeddine El-Hassouny, Chandrashekhar Meshram, Geraldin Nanfack

Abstract:

In recent years, learning from data with noisy labels (label noise) has been a major concern in supervised learning. The problem is even more pressing in deep learning, whose generalization capabilities have recently been questioned. Indeed, deep learning requires a large amount of data, which is generally collected through search engines that frequently return data with unreliable labels. In this paper, we investigate label noise in deep learning using variational inference. Our contributions are: (1) exploiting the label noise concept, where the true labels are learnt using reparameterization variational inference while the observed labels are learnt discriminatively; (2) learning the noise transition matrix during training without any particular process, neither heuristics nor preliminary phases. The theoretical results show how the true label distribution can be learned by variational inference in any discriminative neural network, and the effectiveness of our approach is demonstrated on several target datasets, such as MNIST and CIFAR32.
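
As a rough illustration of the noise transition matrix mentioned in the abstract, the sketch below maps a distribution over true labels to a distribution over observed (noisy) labels. It is not the authors' code: the function name, the 3-class setup, and the 20% symmetric noise values are all assumptions made for this example.

```python
# Illustrative sketch only; names and numbers are assumptions, not the paper's code.

def noisy_label_distribution(true_probs, transition):
    """Return p(observed label) from p(true label) and a noise transition matrix.

    transition[i][j] is the assumed probability that true class i is
    observed as class j, so every row sums to 1.
    """
    num_classes = len(true_probs)
    return [
        sum(true_probs[i] * transition[i][j] for i in range(num_classes))
        for j in range(num_classes)
    ]

# Hypothetical 3-class example with 20% symmetric label noise.
T = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]
p_true = [0.7, 0.2, 0.1]  # a classifier's belief about the true label
p_observed = noisy_label_distribution(p_true, T)
```

In the abstract's setting the matrix is learned during training rather than fixed by hand as it is here.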

Keywords: label noise, deep learning, discrete latent variable, variational inference, MNIST, CIFAR32

Procedia PDF Downloads 82
243 Supervised-Component-Based Generalised Linear Regression with Multiple Explanatory Blocks: THEME-SCGLR

Authors: Bry X., Trottier C., Mortier F., Cornu G., Verron T.

Abstract:

We address component-based regularization of a Multivariate Generalized Linear Model (MGLM). A set of random responses Y is assumed to depend, through a GLM, on a set X of explanatory variables, as well as on a set T of additional covariates. X is partitioned into R conceptually homogeneous blocks X1, ... , XR , viewed as explanatory themes. Variables in each Xr are assumed many and redundant. Thus, Generalised Linear Regression (GLR) demands regularization with respect to each Xr. By contrast, variables in T are assumed selected so as to demand no regularization. Regularization is performed searching each Xr for an appropriate number of orthogonal components that both contribute to model Y and capture relevant structural information in Xr. We propose a very general criterion to measure structural relevance (SR) of a component in a block, and show how to take SR into account within a Fisher-scoring-type algorithm in order to estimate the model. We show how to deal with mixed-type explanatory variables. The method, named THEME-SCGLR, is tested on simulated data.

Keywords: Component-Model, Fisher Scoring Algorithm, GLM, PLS Regression, SCGLR, SEER, THEME

Procedia PDF Downloads 371
242 Educational Data Mining: The Case of the Department of Mathematics and Computing in the Period 2009-2018

Authors: Mário Ernesto Sitoe, Orlando Zacarias

Abstract:

University education is influenced by several factors that range from the adoption of strategies to strengthen the whole process to the improvement of the students' own academic performance. This work uses data mining techniques to develop a predictive model that identifies students with a tendency to evasion and retention. To this end, a database of real students’ data from the Department of University Admission (DAU) and the Department of Mathematics and Informatics (DMI) was used. The data comprised 388 undergraduate students admitted in the years 2009 to 2014. The Weka tool was used for model building, using three different techniques, namely K-nearest neighbor, random forest, and logistic regression. To allow for training on multiple train-test splits, a cross-validation approach was employed with a varying number of folds. To reduce bias and variance and improve the performance of the models, the ensemble methods of Bagging and Stacking were used. After comparing the results obtained by the three classifiers, Logistic Regression using Bagging with seven folds obtained the best performance, showing results above 90% in all evaluated metrics: accuracy, true positive rate, and precision. Retention is the most common tendency.
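
The cross-validation step can be sketched as follows for the 388 students and seven folds reported in the abstract. This is a plain, unshuffled split written for illustration; the study used Weka, whose own fold assignment may differ, and the function name is hypothetical.

```python
# Minimal k-fold split sketch (assumption: contiguous, unshuffled folds).

def k_fold_indices(n_samples, n_folds):
    """Return a list of (train_indices, test_indices) pairs, one per fold."""
    base, extra = divmod(n_samples, n_folds)
    # The first `extra` folds get one additional sample each.
    fold_sizes = [base + 1 if i < extra else base for i in range(n_folds)]
    folds, start = [], 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, test))
        start = stop
    return folds

splits = k_fold_indices(388, 7)
```

Each sample appears in exactly one test fold, so every student is used for evaluation once and for training in the remaining six folds.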

Keywords: evasion and retention, cross-validation, bagging, stacking

Procedia PDF Downloads 52
241 An Integrated Approach to Find the Effect of Strain Rate on Ultimate Tensile Strength of Randomly Oriented Short Glass Fiber Composite in Combination with Artificial Neural Network

Authors: Sharad Shrivastava, Arun Jalan

Abstract:

In this study, tensile testing was performed on randomly oriented short glass fiber/epoxy resin composite specimens prepared using the hand lay-up method. Samples were tested over a wide range of strain rates/loading rates, from 2 mm/min to 40 mm/min, to see the effect on the ultimate tensile strength (UTS) of the composite. A multi-layered back-propagation artificial neural network of the supervised learning type was used to analyze and predict the tensile properties, with strain rate and temperature as inputs and UTS as the output. Various network structures were designed and investigated with varying parameters and network sizes, and an optimized network structure was proposed to predict the UTS of short glass fiber/epoxy resin composite specimens with reasonably good accuracy.

Keywords: glass fiber composite, mechanical properties, strain rate, artificial neural network

Procedia PDF Downloads 413
240 Automatic Classification of Periodic Heart Sounds Using Convolutional Neural Network

Authors: Jia Xin Low, Keng Wah Choo

Abstract:

This paper presents an automatic classification model for normal and abnormal heart sounds based on a deep learning algorithm. The MITHSDB heart sound datasets obtained from the 2016 PhysioNet/Computing in Cardiology Challenge database were used in this research, with the assumption that the electrocardiograms (ECG) were recorded simultaneously with the heart sounds (phonocardiogram, PCG). The PCG time series are segmented per heart beat, and each sub-segment is converted into a square intensity matrix and classified using convolutional neural network (CNN) models. This approach removes the need to provide classification features for the supervised machine learning algorithm. Instead, the features are determined automatically through training from the time series provided. The results show that the prediction model provides reasonable and comparable classification accuracy despite its simple implementation. This approach can be used for real-time classification of heart sounds in the Internet of Medical Things (IoMT), e.g. remote monitoring applications of PCG signals.
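
The abstract does not say how a per-beat segment is turned into a square intensity matrix. One plausible way, sketched below purely as an assumption, is to resample each segment to a fixed length, scale it to the range 0 to 1, and reshape it row by row. The function name, the nearest-neighbour resampling, and the 8 × 8 size are all hypothetical.

```python
import math

def segment_to_square(samples, side):
    """Resample a 1-D segment to side*side values and reshape it row by row.

    Assumption: nearest-neighbour resampling and min-max scaling; the paper
    may use a different conversion.
    """
    target = side * side
    n = len(samples)
    # Nearest-neighbour resampling so every segment yields the same length.
    resampled = [samples[(k * n) // target] for k in range(target)]
    lo, hi = min(resampled), max(resampled)
    span = (hi - lo) or 1.0  # avoid division by zero on a flat segment
    scaled = [(v - lo) / span for v in resampled]
    return [scaled[r * side:(r + 1) * side] for r in range(side)]

# Synthetic stand-in for one heart-beat segment (not real PCG data).
beat = [math.sin(2 * math.pi * k / 50) for k in range(50)]
image = segment_to_square(beat, side=8)
```

The resulting matrix has the fixed shape a CNN input layer expects, regardless of the original segment length.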

Keywords: convolutional neural network, discrete wavelet transform, deep learning, heart sound classification

Procedia PDF Downloads 316
239 Definite Article Errors and Effect of L1 Transfer

Authors: Bimrisha Mali

Abstract:

The present study investigates the types of errors that English as a second language (ESL) learners produce when using the definite article ‘the’. The participants were given a questionnaire testing learner ability. The questionnaire consists of three cloze tests and two free composition tests. Each participant's response was received in the form of written data. A total of 78 participants from three government schools participated in the study. The participants are high-school students from rural Assam, a north-eastern state of India. Their ages ranged from 14 to 15. The medium of instruction and the communication among the students take place in the local language, i.e., Assamese. Pit Corder’s steps for conducting error analysis were followed for the analysis procedure. Four types of errors were found: (1) deletion of the definite article, (2) use of the definite article as modifiers as adjectives, (3) incorrect use of the definite article with singular proper nouns, (4) substitution of the definite article by the indefinite article ‘a’. Classifiers in Assamese that express definiteness are used with nouns, adjectives, and numerals. It is found that native language (L1) transfer plays a pivotal role in the learners’ errors. The analysis reveals the learners' inability to acquire the semantic connotation of definiteness in English due to native language (L1) interference.

Keywords: definite article error, l1 transfer, error analysis, ESL

Procedia PDF Downloads 99
238 Isolation Preserving Medical Conclusion Hold Structure via C5 Algorithm

Authors: Swati Kishor Zode, Rahul Ambekar

Abstract:

Data mining is the extraction of interesting patterns or knowledge from large amounts of data, and decisions are made according to the relevant knowledge extracted. Recently, with the explosive development of the internet, data storage, and processing techniques, privacy preservation has become one of the major concerns in data mining. Various techniques and methods have been developed for privacy-preserving data mining. In a Clinical Decision Support System, the decision is made on the basis of data retrieved from remote servers via the Internet in order to diagnose the patient. In this paper, the main idea is to increase the accuracy of a Decision Support System for multiple diseases and also to protect patient data during communication between the clinician side (client side) and the server side. A privacy-preserving protocol for a clinical decision support network is proposed so that patients' information always remains encrypted during the diagnosis process while accuracy is maintained. C5.0 classifiers are used to improve the accuracy of the Decision Support System for multiple diseases, and a homomorphic encryption algorithm, the Paillier cryptosystem, is used to preserve privacy.
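
The additive homomorphism of the Paillier cryptosystem, which lets a server combine encrypted values without decrypting them, can be illustrated with a toy implementation. This is a sketch for intuition only: the primes are far too small to be secure, the function names are hypothetical, and nothing here reflects the protocol actually proposed in the paper.

```python
import math
import random

def keygen(p, q):
    """Build a toy Paillier key pair from two primes, using g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; valid because g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under public key n."""
    n_sq = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(n, private_key, c):
    """Recover the plaintext from ciphertext c."""
    lam, mu = private_key
    l_value = (pow(c, lam, n * n) - 1) // n
    return (l_value * mu) % n

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)

# Insecure demonstration primes, chosen only to keep the numbers small.
n, private_key = keygen(293, 433)
total = decrypt(n, private_key, add_encrypted(n, encrypt(n, 17), encrypt(n, 25)))
```

Here `total` equals 17 + 25 even though the party calling `add_encrypted` never sees either plaintext, which is the property a privacy-preserving decision support protocol can build on.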

Keywords: classification, homomorphic encryption, clinical decision support, privacy

Procedia PDF Downloads 307
237 Selection of Appropriate Classification Technique for Lithological Mapping of Gali Jagir Area, Pakistan

Authors: Khunsa Fatima, Umar K. Khattak, Allah Bakhsh Kausar

Abstract:

Satellite image interpretation and analysis assist geologists by providing valuable information about the geology and minerals of an area to be surveyed. A test site in Fatejang of district Attock has been studied using Landsat ETM+ and ASTER satellite images for lithological mapping. Five different supervised image classification techniques, namely maximum likelihood, parallelepiped, minimum distance to mean, Mahalanobis distance, and spectral angle mapper, were applied to both satellite images to find the most suitable classification technique for lithological mapping in the study area. The results of these five image classification techniques were compared with the geological map produced by the Geological Survey of Pakistan. The maximum likelihood classification applied to the ASTER satellite image has the highest correlation, 0.66, with the geological map. Field observations and XRD spectra of field samples also verified the results. A lithological map was then prepared based on the maximum likelihood classification of the ASTER satellite image.
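
Of the five techniques listed, the spectral angle mapper is the simplest to show in a few lines: each pixel is assigned to the class whose reference spectrum forms the smallest angle with the pixel's spectrum. The sketch below uses made-up four-band reflectance values and hypothetical class names; it is not derived from the study's data.

```python
import math

def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(a * b for a, b in zip(pixel, reference))
    norm = math.sqrt(sum(a * a for a in pixel)) * math.sqrt(sum(b * b for b in reference))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def classify(pixel, references):
    """Return the class name whose reference spectrum has the smallest angle."""
    return min(references, key=lambda name: spectral_angle(pixel, references[name]))

# Hypothetical reference spectra (four bands each), for illustration only.
references = {
    "limestone": [0.30, 0.35, 0.40, 0.45],
    "shale": [0.20, 0.18, 0.15, 0.12],
}
pixel = [0.15, 0.17, 0.21, 0.22]
label = classify(pixel, references)
```

Because the angle ignores vector length, the darker pixel above is still matched to the brighter reference with the same spectral shape.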

Keywords: ASTER, Landsat-ETM+, satellite, image classification

Procedia PDF Downloads 358
236 A Supervised Approach for Word Sense Disambiguation Based on Arabic Diacritics

Authors: Alaa Alrakaf, Sk. Md. Mizanur Rahman

Abstract:

Over the last two decades, Arabic natural language processing (ANLP) has become increasingly important. One of the key issues related to ANLP is ambiguity. In Arabic, different pronunciations of one word may have different meanings. Furthermore, ambiguity also has an impact on the effectiveness and efficiency of Machine Translation (MT). The issue of ambiguity has limited the usefulness and accuracy of translation from Arabic to English. The lack of Arabic resources makes the ambiguity problem more complicated. Additionally, the orthographic level of representation cannot specify the exact meaning of a word. This paper looks at the diacritics of the Arabic language and uses them to disambiguate a word. The proposed approach to word sense disambiguation uses a Diacritizer application to diacritize Arabic text and then finds the most accurate sense of an ambiguous word using a Naïve Bayes classifier. Our experimental study shows that using Arabic diacritics with a Naïve Bayes classifier improves the accuracy of choosing the appropriate sense by 23% and also decreases the ambiguity in machine translation.

Keywords: Arabic natural language processing, machine learning, machine translation, Naive bayes classifier, word sense disambiguation

Procedia PDF Downloads 324
235 Microfungi on Sandy Beaches: Potential Threats for People Enjoying Lakeside Recreation

Authors: Tomasz Balabanski, Anna Biedunkiewicz

Abstract:

Research on basic bacteriological and physicochemical parameters conducted by state institutions (Provincial Sanitary and Epidemiological Station and District Sanitary and Epidemiological Station) are limited to bathing waters under constant sanitary and epidemiological supervision. Unfortunately, no routine or monitoring tests are carried out for the presence of microfungi. This also applies to beach sand used for recreational purposes. The purpose of the planned own research was to determine the diversity of the mycobiota present on supervised and unsupervised sandy beaches, on the shores of lakes, of municipal baths used for recreation. The research material consisted of microfungi isolated from April to October 2019 from sandy beaches of supervised and unsupervised lakes located within the administrative boundaries of the city of Olsztyn (North-Eastern Poland, Europe). Four lakes, out of the fifteen available (Tyrsko, Kortowskie, Skanda, and Ukiel), whose bathing waters are subjected to routine bacteriological tests, were selected for testing. To compare the diversity of the mycobiota composition on the surface and below the sand mixing layer, samples were taken from two depths (10 cm and 50 cm), using a soil auger. Micro-fungi from sand samples were obtained by surface inoculation on an RBC medium from the 1st dilution (1:10). After incubation at 25°C for 96-144 h, the average number of CFU/dm³ was counted. Morphologically differing yeast colonies were passaged into Sabouraud agar slants with gentamicin and incubated again. For detailed laboratory analyses, culture methods (macro- and micro-cultures) and identification methods recommended in diagnostic mycological laboratories were used. The conducted research allowed obtaining 140 yeast isolates. 
The total average population was 1.37 × 10⁻² CFU/dm³ before the bathing season (April 2019), 1.64 × 10⁻³ CFU/dm³ during the season (May-September 2019), and 1.60 × 10⁻² CFU/dm³ after the end of the season (October 2019). More microfungi were obtained from the surface layer of sand (100 isolates) than from the deeper layer (40 isolates). The reported microfungi may circulate seasonally between individual elements of the lake ecosystem. From the sand/soil of the catchment area beaches, they can get into bathing waters, stopping periodically on the coastal phyllosphere. The sand of the beaches and the phyllosphere act as a kind of filter for the water reservoir. The presence of microfungi with various pathogenicity potential in these places is of major epidemiological importance. Therefore, full monitoring of not only recreational waters but also sandy beaches should be treated as an element of constant control by the supervisory institutions that approve recreational areas for public use, so that the use of these places does not involve the risk of infection. Acknowledgment: 'Development Program of the University of Warmia and Mazury in Olsztyn', POWR.03.05.00-00-Z310/17, co-financed by the European Union under the European Social Fund from the Operational Program Knowledge Education Development. Tomasz Bałabański is a recipient of a scholarship from the Programme Interdisciplinary Doctoral Studies in Biology and Biotechnology (POWR.03.05.00-00-Z310/17), which is funded by the 'European Social Fund'.

Keywords: beach, microfungi, sand, yeasts

Procedia PDF Downloads 73
234 Research on Knowledge Graph Inference Technology Based on Proximal Policy Optimization

Authors: Yihao Kuang, Bowen Ding

Abstract:

With the increasing scale and complexity of knowledge graphs, a modern knowledge graph contains more and more types of entity, relationship, and attribute information. Therefore, in recent years, it has been a trend for knowledge graph inference to use reinforcement learning to deal with large-scale, incomplete, and noisy knowledge graphs and to improve the inference effect and interpretability. The Proximal Policy Optimization (PPO) algorithm restricts each policy update to a proximal region around the current policy. This allows for larger updates of the policy parameters while constraining the update extent to maintain training stability. This characteristic enables PPO to converge to improved policies more rapidly, often demonstrating enhanced performance early in the training process. Furthermore, PPO has the advantage of offline learning, effectively utilizing historical experience data for training and enhancing sample utilization. This means that even with limited resources, PPO can efficiently train for reinforcement learning tasks. Based on these characteristics, this paper aims to obtain a better and more efficient inference effect by introducing PPO into knowledge inference technology.
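
The constraint on the update extent is usually realised through PPO's clipped surrogate objective, shown below for a single sample. The function name and the default clipping range of 0.2 are assumptions for this sketch, which is unrelated to the paper's knowledge graph implementation.

```python
def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """PPO clipped surrogate objective for one sample.

    ratio is pi_new(a|s) / pi_old(a|s); advantage is the estimated advantage.
    The clip removes any incentive to move the ratio outside
    [1 - epsilon, 1 + epsilon].
    """
    clipped_ratio = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)

gain_capped = clipped_surrogate(1.5, 1.0)       # capped at (1 + epsilon) * advantage
penalty_kept = clipped_surrogate(1.5, -1.0)     # unclipped term is smaller, so it is kept
penalty_capped = clipped_surrogate(0.5, -1.0)   # capped at (1 - epsilon) * advantage
```

Taking the minimum of the clipped and unclipped terms makes the objective a pessimistic bound, which is what keeps successive policies close to one another.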

Keywords: reinforcement learning, PPO, knowledge inference, supervised learning

Procedia PDF Downloads 32
233 Stock Market Prediction Using Convolutional Neural Network That Learns from a Graph

Authors: Mo-Se Lee, Cheol-Hwi Ahn, Kee-Young Kwahk, Hyunchul Ahn

Abstract:

Over the past decade, deep learning has been in the spotlight among various machine learning algorithms. In particular, the CNN (Convolutional Neural Network), which is known as an effective solution for recognizing and classifying images, has been widely applied to classification and prediction problems in various fields. In this study, we apply a CNN to stock market prediction, one of the most challenging tasks in machine learning research. Specifically, we propose to apply a CNN as a binary classifier that predicts stock market direction (up or down) using a graph as its input. That is, our proposal is to build a machine learning algorithm that mimics a person who looks at the graph and predicts whether the trend will go up or down. Our proposed model consists of four steps. In the first step, it divides the dataset into intervals of 5 days, 10 days, 15 days, and 20 days. In step 2, it creates graphs for each interval. In the next step, CNN classifiers are trained using the graphs generated in the previous step. In step 4, it optimizes the hyperparameters of the trained model using the validation dataset. To validate our model, we will apply it to the prediction of KOSPI200 for 1,986 days in eight years (from 2009 to 2016). The experimental dataset will include 14 technical indicators, such as CCI, Momentum, and ROC, and the daily closing price of KOSPI200 of the Korean stock market.
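
The first step of the pipeline, cutting the price series into fixed-length intervals and attaching an up/down label, might look like the sketch below. The labelling rule (next close compared with the last close in the window), the function name, and the price values are assumptions; the abstract does not specify them, and the chart-rendering step that produces the CNN input is omitted.

```python
def make_windows(closes, window):
    """Return (history, label) pairs from a list of daily closing prices.

    Assumed labelling rule: label is 1 if the close on the day after the
    window is higher than the last close inside the window, otherwise 0.
    """
    samples = []
    for start in range(len(closes) - window):
        history = closes[start:start + window]
        next_close = closes[start + window]
        label = 1 if next_close > history[-1] else 0
        samples.append((history, label))
    return samples

# Made-up closing prices, not KOSPI200 data.
closes = [100, 101, 103, 102, 104, 107, 106, 108]
samples = make_windows(closes, window=5)
labels = [label for _, label in samples]
```

Running the same function with `window` set to 10, 15, and 20 would give the other interval lengths named in the abstract.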

Keywords: convolutional neural network, deep learning, Korean stock market, stock market prediction

Procedia PDF Downloads 402
232 ANN Based Simulation of PWM Scheme for Seven Phase Voltage Source Inverter Using MATLAB/Simulink

Authors: Mohammad Arif Khan

Abstract:

This paper analyzes and presents the development of an Artificial Neural Network based controller for space vector modulation (ANN-SVPWM) for a seven-phase voltage source inverter. First, the conventional method of producing a sinusoidal output voltage, in which six active and one zero space vectors are used to synthesize the input reference, is elaborated, and then a new PWM scheme called Artificial Neural Network Based PWM is presented. The ANN-based controller has the advantage of very fast implementation and analysis of the algorithms, and it avoids the direct computation of trigonometric and non-linear functions. The ANN controller uses the individual training strategy with fixed-weight and supervised models. A computer simulation program has been developed using Matlab/Simulink together with the neural network toolbox for training the ANN controller. A comparison of the proposed scheme with the conventional scheme is presented based on various performance indices. Extensive simulation results are provided to validate the findings.

Keywords: space vector PWM, total harmonic distortion, seven-phase, voltage source inverter, multi-phase, artificial neural network

Procedia PDF Downloads 430
231 The Use of Boosted Multivariate Trees in Medical Decision-Making for Repeated Measurements

Authors: Ebru Turgal, Beyza Doganay Erdogan

Abstract:

Machine learning aims to model the relationship between the response and features. Medical decision-making researchers would like to make decisions about patients’ course and treatment by examining repeated measurements over time. The boosting approach is now used in the machine learning area as an influential tool for these aims. The aim of this study is to show the usage of multivariate tree boosting in this field. The main reason for utilizing this approach in the field of decision-making is the ease with which it handles complex relationships. To show how the multivariate tree boosting method can be used to identify important features and feature-time interactions, we used data collected retrospectively from Ankara University Chest Diseases Department records. The dataset includes repeated PF ratio measurements. The follow-up time is planned for 120 hours. A set of different models is tested. In conclusion, classification with a weighted combination of classifiers is a reliable method, as has been shown several times with simulations. Furthermore, time-varying variables will be taken into consideration within this concept, and it could be possible to make accurate decisions about regression and survival problems.

Keywords: boosted multivariate trees, longitudinal data, multivariate regression tree, panel data

Procedia PDF Downloads 168
230 Non-Targeted Adversarial Image Classification Attack-Region Modification Methods

Authors: Bandar Alahmadi, Lethia Jackson

Abstract:

Machine learning models are used today in many real-life applications. The safety and security of such models are important, so that their results are as accurate as possible. One challenge of machine learning model security is the adversarial examples attack. Adversarial examples are designed by the attacker to cause the machine learning model to misclassify the input. We propose a method to generate adversarial examples to attack image classifiers. We modify successfully classified images so that a classifier misclassifies them after the modification. In our method, we do not update the whole image; instead, we detect the important region, modify it, place it back into the original image, and then run it through a classifier. The algorithm modifies the detected region using two methods. First, it adds an abstract image matrix behind the detected image matrix. Then, it performs a rotation attack to rotate the detected region around its axes and embeds the trace of the image in the image background. Finally, the attacked region is placed in its original position, from where it was removed, and a smoothing filter is applied to blend the background with the foreground. We tested our method on a cascade classifier, where the algorithm was effective: the classifier's confidence dropped to almost zero. We also tried it on a CNN (convolutional neural network) with a higher setting, and the algorithm worked successfully.

Keywords: adversarial examples, attack, computer vision, image processing

Procedia PDF Downloads 305
229 The Best Prediction Data Mining Model for Breast Cancer Probability in Women Residents in Kabul

Authors: Mina Jafari, Kobra Hamraee, Saied Hossein Hosseini

Abstract:

The prediction of breast cancer is one of the challenges in medicine. In this paper, we collected 528 records of women who live in Kabul, including demographic, lifestyle, diet, and pregnancy data. There are many classification algorithms for breast cancer prediction, and we tried to find the best model with the most accurate result and the lowest error rate. We evaluated some common supervised algorithms in data mining to find the best model for predicting breast cancer among Afghan women living in Kabul, with the mammography result as the target variable. For evaluating these algorithms we used cross-validation, which is a reliable method for measuring the performance of models. After comparing the error rate and accuracy of three models, Decision Tree, Naive Bayes, and Rule Induction, Decision Tree, with an accuracy of 94.06% and an error rate of 15%, was found to be the best model for predicting breast cancer based on the health care records.

Keywords: decision tree, breast cancer, probability, data mining

Procedia PDF Downloads 107
228 Effects of Resistance Exercise Training on Blood Profile and CRP in Men with Type 2 Diabetes Mellitus

Authors: Mohsen Salesi, Seyyed Zoheir Rabei

Abstract:

Exercise has been considered a cornerstone of diabetes prevention and treatment for decades, but the benefits of resistance training are less clear. The purpose of this study was to determine the impact of resistance training on the blood profile and an inflammatory marker (CRP) in people with type 2 diabetes mellitus. Thirty diabetic males were recruited (age: 50.34±10.28 years) and randomly assigned to an 8-week resistance exercise training group (n=15) or a control group (n=15). Before and after training, blood pressure, weight, lipid profile (TC, TG, LDL-c, and HDL-c), and hs-CRP were measured. The resistance exercise training group took part in supervised 50–80 minute resistance training sessions, three days a week on non-consecutive days for 8 weeks. Each exercise session included approximately 10 min of warm-up and cool-down periods. Results showed that TG significantly decreased (pre 210.19±9.31 vs. 101.12±7.25, p=0.03) and HDL-c significantly increased (pre 42.37±3.15 vs. 47.50±2.19, p=0.01) after exercise training. However, there was no difference between groups in TC, LDL-c, BMI, and weight. In addition, the decrease in fasting blood glucose levels differed significantly between groups (pre 144.65±5.73 vs. 124.21±6.48, p=0.04). Regular resistance exercise training can improve the lipid profile and reduce the cardiovascular risk factors in T2DM patients.

Keywords: lipid profile, resistance exercise, type 2 diabetes mellitus, men

Procedia PDF Downloads 373
227 A Geographical Framework for Studying the Territorial Sustainability Based on Land Use Change

Authors: Miguel Ramirez, Ivan Lizarazo

Abstract:

The emergence of various interpretations of sustainability, including weak and strong paradigms, can be traced back to the definition of sustainable development provided in the 1987 Brundtland report and the subsequent evolution of the sustainability concept. However, there has been limited scholarly attention given to clarifying the concept of sustainability within the theoretical and conceptual framework of geography. The discipline has predominantly been focused on understanding the diverse conceptions of sustainability within its epistemological boundaries, resulting in tensions between sustainability paradigms and their associated dimensions, including the incorporation of political perspectives, with particular emphasis on environmental geography's epistemology. In response to this gap, a conceptual framework for sustainability is proposed, effectively integrating spatial and territorial concepts. This framework aims to enhance geography's role in contributing to sustainability by utilizing the land system theory, which is based on the dynamics of land use change. Such an integrated conceptual framework enables incorporating methodological tools such as remote sensing, encompassing various earth observations and fusion methods, and supervised classification techniques. Additionally, it looks for better integration of socioecological information, thereby capturing essential population-related features.

Keywords: geography, sustainability, land change science, territorial sustainability

Procedia PDF Downloads 41
226 Meta Mask Correction for Nuclei Segmentation in Histopathological Image

Authors: Jiangbo Shi, Zeyu Gao, Chen Li

Abstract:

Nuclei segmentation is a fundamental task in digital pathology analysis and can be automated by deep learning-based methods. However, the development of such an automated method requires a large amount of data with precisely annotated masks, which is hard to obtain. Training with weakly labeled data is a popular solution for reducing the annotation workload. In this paper, we propose a novel meta-learning-based nuclei segmentation method that follows the label correction paradigm to leverage data with noisy masks. Specifically, we design a fully convolutional meta-model that can correct noisy masks by using a small amount of clean meta-data. The corrected masks are then used to supervise the training of the segmentation model. Meanwhile, a bi-level optimization method is adopted to alternately update the parameters of the main segmentation model and the meta-model. Extensive experimental results on two nuclear segmentation datasets show that our method achieves state-of-the-art results. In particular, in some noise scenarios, it even exceeds the performance of training on supervised data.

Keywords: deep learning, histopathological image, meta-learning, nuclei segmentation, weak annotations

Procedia PDF Downloads 108
225 Harnessing Artificial Intelligence and Machine Learning for Advanced Fraud Detection and Prevention

Authors: Avinash Malladhi

Abstract:

Forensic accounting is a specialized field that involves the application of accounting principles, investigative skills, and legal knowledge to detect and prevent fraud. With the rise of big data and technological advancements, artificial intelligence (AI) and machine learning (ML) algorithms have emerged as powerful tools for forensic accountants to enhance their fraud detection capabilities. In this paper, we review and analyze various AI/ML algorithms that are commonly used in forensic accounting, including supervised and unsupervised learning, deep learning, natural language processing, Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Support Vector Machines (SVMs), Decision Trees, and Random Forests. We discuss their underlying principles, strengths, and limitations and provide empirical evidence from existing research studies demonstrating their effectiveness in detecting financial fraud. We also highlight potential ethical considerations and challenges associated with using AI/ML in forensic accounting. Furthermore, we highlight the benefits of these technologies in improving fraud detection and prevention in forensic accounting.

Keywords: AI, machine learning, forensic accounting & fraud detection, anti money laundering, Benford's law, fraud triangle theory

Procedia PDF Downloads 54
224 Extraction of Urban Land Features from TM Landsat Image Using the Land Features Index and Tasseled Cap Transformation

Authors: R. Bouhennache, T. Bouden, A. A. Taleb, A. Chaddad

Abstract:

In this paper we propose a method to map urban areas. The method uses an arithmetic calculation based on land feature indexes and the Tasseled Cap (TC) transformation of a multispectral Landsat Thematic Mapper (TM) image. For this purpose, index images derived from the original image, namely the soil adjusted vegetation index (SAVI), the urban index (UI), and the enhanced built-up and bareness index (EBBI), were stacked to form a new image, and the bands were uncorrelated. The Spectral Angle Mapper (SAM) and Spectral Information Divergence (SID) supervised classification approaches were first applied to the new image using reference spectra from the spectral library, and subsequently the four land cover categories (urban, vegetation, water, and soil) were extracted with their accuracy assessment. The urban features were represented using a logic calculation applied to the brightness, UI-SAVI, NDBI-greenness, and EBBI-brightness data sets. The study, applied to Blida, showed that urban features can be mapped with an accuracy ranging from 92% to 95%.
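
Two of the indexes named in the abstract have widely used closed forms, sketched below for a single pixel. The soil adjustment factor of 0.5 is the commonly used default and an assumption here, as are the reflectance values and the mapping of Landsat TM bands to `nir`, `red`, and `swir`; the paper's exact band choices are not stated.

```python
def savi(nir, red, soil_factor=0.5):
    """Soil adjusted vegetation index for one pixel (reflectance inputs)."""
    return (nir - red) * (1.0 + soil_factor) / (nir + red + soil_factor)

def ndbi(swir, nir):
    """Normalized difference built-up index for one pixel."""
    return (swir - nir) / (swir + nir)

# Made-up reflectance values for a vegetated pixel and a built-up pixel.
vegetation_savi = savi(nir=0.5, red=0.1)
built_up_ndbi = ndbi(swir=0.3, nir=0.2)
```

Vegetation pushes SAVI up and NDBI down, while built-up surfaces tend to do the opposite, which is why differences such as UI-SAVI can help separate urban pixels.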

Keywords: EBBI, SAVI, Tasseled Cap Transformation, UI

Procedia PDF Downloads 452
223 Efficacy and Safety of Sublingual Sufentanil for the Management of Acute Pain

Authors: Neil Singla, Derek Muse, Karen DiDonato, Pamela Palmer

Abstract:

Introduction: Pain is the most common reason people visit emergency rooms. Studies indicate, however, that Emergency Department (ED) physicians often do not provide adequate analgesia to their patients as a result of gender and age bias, opiophobia and insufficient knowledge of and formal training in acute pain management. Novel classes of analgesics have recently been introduced, but many patients suffer from acute pain in settings where the availability of intravenous (IV) access may be limited, so there remains a clinical need for rapid-acting, potent analgesics that do not require an invasive route of delivery. A sublingual sufentanil tablet (SST), dispensed using a single-dose applicator, is in development for treatment of moderate-to-severe acute pain in a medically-supervised setting. Objective: The primary objective of this study was to demonstrate the repeat-dose efficacy, safety and tolerability of sufentanil 20 mcg and 30 mcg sublingual tablets compared to placebo for the management of acute pain as determined by the time-weighted sum of pain intensity differences (SPID) to baseline over the 12-hour study period (SPID12). Key secondary efficacy variables included SPID over the first hour (SPID1), total pain relief over the 12-hour study period (TOTPAR12), time to perceived pain relief (PR) and time to meaningful PR. Safety variables consisted of adverse events (AE), vital signs, oxygen saturation and early termination. Methods: In this Phase 2, double-blind, dose-finding study, an equal number of male and female patients were randomly assigned in a 2:2:1 ratio to SST 20 mcg, SST 30 mcg or placebo, respectively, following bunionectomy. Study drug was dosed as needed, but not more frequently than hourly. Rescue medication was available as needed. The primary endpoint was the Summed Pain Intensity Difference to baseline over 12 h (SPID12). Safety was assessed by continuous oxygen saturation monitoring and adverse event reporting.
Results: 101 patients (51 male/50 female) were randomized, 100 received study treatment (intent-to-treat [ITT] population), and 91 completed the study. Reasons for early discontinuation were lack of efficacy (6), adverse events (2) and drug-dosing error (1). Mean age was 42.5 years. For the ITT population, SST 30 mcg was superior to placebo (p=0.003) for the SPID12. SPID12 scores in the active groups were superior for both male (ANOVA overall p-value=0.038) and female (ANOVA overall p-value=0.005) patients. Statistically significant differences in favour of sublingual sufentanil were also observed between the SST 30 mcg and placebo groups for SPID1 (p<0.001), TOTPAR12 (p=0.002), time to perceived PR (p=0.023) and time to meaningful PR (p=0.010). Nausea, vomiting and somnolence were more frequent in the sufentanil groups but there were no significant differences between treatment arms for the proportion of patients who prematurely terminated due to AE or inadequate analgesia. Conclusions: Sufentanil tablets dispensed sublingually using a single-dose applicator are in development for treatment of patients with moderate-to-severe acute pain in a medically-supervised setting where immediate IV access is limited. When administered sublingually, sufentanil’s pharmacokinetic profile and non-invasive delivery make it a useful alternative to IM or IV dosing.

Keywords: acute pain, pain management, sublingual, sufentanil

Procedia PDF Downloads 328
222 Exploring Data Leakage in EEG Based Brain-Computer Interfaces: Overfitting Challenges

Authors: Khalida Douibi, Rodrigo Balp, Solène Le Bars

Abstract:

In the medical field, applications involving human experiments often have small sample sizes, which makes the training of machine learning models sensitive and the resulting models neither robust nor generalizable. This is notably the case in Brain-Computer Interface (BCI) studies, where the sample size rarely exceeds 20 subjects or a small number of trials. To address this problem, several resampling approaches are often used during the data preparation phase, which is a critical step in a data science analysis process. One naive approach commonly applied by data scientists is to transform the entire database before the resampling phase. However, this can cause the model's performance to be incorrectly estimated when making predictions on unseen data. In this paper, we explore the effect of data leakage observed during our BCI experiments for device control through the real-time classification of SSVEPs (Steady State Visually Evoked Potentials). We also study potential ways to ensure sound validation of the classifiers during the calibration phase to avoid overfitting. The results show that the scaling step is crucial for some algorithms and that it should be applied after the resampling phase to avoid data leakage and improve results.
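
The ordering issue described in this abstract can be sketched in a few lines. The example below assumes that "resampling" means splitting the data into a training fold and a held-out fold, and that "scaling" means standardization; the data values and function names are hypothetical and do not come from the paper.

```python
import statistics


def fit_scaler(values):
    """Estimate standardization parameters (mean, population std) from the given values only."""
    return statistics.mean(values), statistics.pstdev(values)


def transform(values, mean, std):
    """Standardize values with previously estimated parameters."""
    return [(v - mean) / std for v in values]


# Hypothetical one-dimensional feature; the last two values form the held-out fold.
data = [1.0, 2.0, 3.0, 4.0, 100.0, 120.0]
train, test = data[:4], data[4:]

# Leaky ordering: the statistics are computed on all data before the split,
# so the held-out fold influences how the training inputs are scaled.
leaky_mean, leaky_std = fit_scaler(data)
leaky_train = transform(train, leaky_mean, leaky_std)

# Leak-free ordering: split first, fit the scaler on the training fold only,
# then reuse those same parameters on the held-out fold.
mean, std = fit_scaler(train)
clean_train = transform(train, mean, std)
clean_test = transform(test, mean, std)
```

In the leaky variant the training inputs already encode the range of the held-out values, which is how a performance estimate on "unseen" data can end up too optimistic.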

Keywords: data leakage, data science, machine learning, SSVEP, BCI, overfitting

Procedia PDF Downloads 120
221 Performance Analysis of Traffic Classification with Machine Learning

Authors: Htay Htay Yi, Zin May Aye

Abstract:

Network security plays a key role in the ICT environment because the number of malicious users keeps growing in education, business, and other ICT-related fields. Network security violations are typically described and examined centrally using a security event management system. Firewalls, Intrusion Detection Systems (IDS), and Intrusion Prevention Systems are becoming essential for monitoring or preventing potential violations, attack incidents, and imminent threats. In this system, firewall rules are set only where the system policies require them. The datasets used in this system are derived from the testbed environment. DoS and PortScan traffic is generated in the testbed, which implements a firewall and an IDS. Network traffic is classified as normal or attack in the existing testbed environment using six machine learning classification methods. The testbed must be exercised with DoS and PortScan traffic to obtain the datasets. The dataset is based on CICIDS2017, with some features added. The system was tested with 26 features from this dataset. It aims to reduce false positive rates and improve accuracy in the implemented testbed design. It also shows good performance when important features are selected and when it is compared against an existing dataset using machine learning classifiers.
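
The evaluation measures named in this abstract and its keywords (accuracy, false positive rate, false negative rate) can be computed from a confusion matrix as sketched below. The labels and predictions are hypothetical and are not results from the paper; "attack" is assumed to be the positive class.

```python
def confusion_counts(y_true, y_pred, positive="attack"):
    """Count true/false positives and negatives, treating `positive` as the attack class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Fraction of all flows that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def false_positive_rate(fp, tn):
    """Fraction of normal flows that were wrongly flagged as attacks."""
    return fp / (fp + tn)


def false_negative_rate(fn, tp):
    """Fraction of attack flows that were missed."""
    return fn / (fn + tp)


# Hypothetical ground truth and classifier output for six flows.
y_true = ["normal", "normal", "normal", "normal", "attack", "attack"]
y_pred = ["normal", "attack", "normal", "normal", "attack", "normal"]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
```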

Keywords: false negative rate, intrusion detection system, machine learning methods, performance

Procedia PDF Downloads 95
220 DISGAN: Efficient Generative Adversarial Network-Based Method for Cyber-Intrusion Detection

Authors: Hongyu Chen, Li Jiang

Abstract:

Ubiquitous anomalies constantly endanger the security of our system. They may cause irreversible damage to the system and leakage of private data. Thus, it is of vital importance to promptly detect these anomalies. Traditional supervised methods such as Decision Trees and Support Vector Machines (SVMs) are used to classify normality and abnormality. However, in some cases, abnormal statuses are far rarer than normal ones, which biases the decisions of these methods. Generative adversarial networks (GANs) have been proposed to handle this case. With its strong generative ability, a GAN only needs to learn the distribution of normal statuses and identifies abnormal statuses through the gap between them and the learned distribution. Nevertheless, existing GAN-based models are not suited to processing data with discrete values, leading to severe degradation of detection performance. To cope with discrete features, in this paper we propose an efficient GAN-based model with a specifically designed loss function. Experimental results show that our model outperforms state-of-the-art models on discrete datasets and remarkably reduces the overhead.

Keywords: GAN, discrete feature, Wasserstein distance, multiple intermediate layers

Procedia PDF Downloads 93
219 Individualized Emotion Recognition Through Dual-Representations and Ground-Established Ground Truth

Authors: Valentina Zhang

Abstract:

While facial expression is a complex and individualized behavior, all facial emotion recognition (FER) systems known to us rely on a single facial representation and are trained on universal data. We conjecture that: (i) different facial representations can provide different, sometimes complementary views of emotions; (ii) when employed collectively in a discussion group setting, they enable more accurate emotion reading, which is highly desirable in autism care and other application contexts sensitive to errors. In this paper, we first study FER using pixel-based DL vs semantics-based DL in the context of deepfake videos. Our experiment indicates that while the semantics-trained model performs better with articulated facial feature changes, the pixel-trained model performs better on subtle or rare facial expressions. Armed with these findings, we have constructed an adaptive FER system that learns from both types of models for dyadic or small interacting groups and further leverages the synthesized group emotions as the ground truth for individualized FER training. Using a collection of group conversation videos, we demonstrate that FER accuracy and personalization can benefit from such an approach.

Keywords: neurodivergence care, facial emotion recognition, deep learning, ground truth for supervised learning

Procedia PDF Downloads 99
218 Online Handwritten Character Recognition for South Indian Scripts Using Support Vector Machines

Authors: Steffy Maria Joseph, Abdu Rahiman V, Abdul Hameed K. M.

Abstract:

Online handwritten character recognition is a challenging field in Artificial Intelligence. The classification success rate of current techniques decreases when the dataset involves similarity and complexity in stroke styles, number of strokes and variations in stroke characteristics. Malayalam is a complex South Indian language spoken by about 35 million people, especially in Kerala and the Lakshadweep islands. In this paper, we consider significant feature extraction for the similar stroke styles of Malayalam. The extracted feature set is also suitable for the recognition of other handwritten South Indian languages such as Tamil, Telugu and Kannada. A classification scheme based on support vector machines (SVM) is proposed to improve the accuracy of classification and recognition of online Malayalam handwritten characters. SVM classifiers are well suited to real-world applications. The contribution of various features to recognition accuracy is analysed. The performance of different SVM kernels is also studied. A graphical user interface has been developed for reading and displaying the character. Different writing styles are taken for each of the 44 alphabets. Various features are extracted and used for classification after preprocessing of the input data samples. The highest recognition accuracy, 97%, is obtained experimentally with the best feature combination and a polynomial kernel in the SVM.
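
For readers unfamiliar with the polynomial kernel mentioned in this abstract, the sketch below shows its commonly used form, K(x, y) = (gamma * <x, y> + coef0) ** degree. The parameter names and default values are assumptions for illustration; the abstract does not report the settings used by the authors.

```python
def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


def gram_matrix(samples, **kernel_params):
    """Pairwise kernel values between all feature vectors; an SVM is trained on this matrix."""
    return [[polynomial_kernel(a, b, **kernel_params) for b in samples] for a in samples]


# Hypothetical two-dimensional stroke feature vectors.
samples = [[1.0, 2.0], [3.0, 4.0]]
K = gram_matrix(samples, degree=2)
```

Raising the degree lets the decision boundary bend more, which can help separate characters with similar stroke styles at the cost of a higher risk of overfitting.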

Keywords: SVM, MATLAB, Malayalam, South Indian scripts, online handwritten character recognition

Procedia PDF Downloads 545
217 Probing Language Models for Multiple Linguistic Information

Authors: Bowen Ding, Yihao Kuang

Abstract:

In recent years, large-scale pre-trained language models have achieved state-of-the-art performance on a variety of natural language processing tasks. The word vectors produced by these language models can be viewed as dense encoded representations of natural language text. However, it is unknown how much linguistic information is encoded and how. In this paper, we construct several probing tasks for multiple kinds of linguistic information to clarify the encoding capabilities of different language models, and we visualize the results. We first obtain word representations in vector form from different language models, including BERT, ELMo, RoBERTa and GPT. Classifiers with a small number of parameters and unsupervised tasks are then applied to these word vectors to assess their capability to encode the corresponding linguistic information. The constructed probing tasks cover both semantic and syntactic aspects. The semantic aspect includes the ability of the model to understand semantic entities such as numbers, time, and characters, and the syntactic aspect includes the ability of the language model to understand grammatical structures such as dependency relationships and reference relationships. We also compare the encoding capabilities of different layers in the same language model to infer how linguistic information is encoded in the model.

Keywords: language models, probing task, text presentation, linguistic information

Procedia PDF Downloads 65
216 Comprehensive Review of Adversarial Machine Learning in PDF Malware

Authors: Preston Nabors, Nasseh Tabrizi

Abstract:

Portable Document Format (PDF) files have gained significant popularity for sharing and distributing documents due to their universal compatibility. However, the widespread use of PDF files has made them attractive targets for cybercriminals, who exploit vulnerabilities to deliver malware and compromise the security of end-user systems. This paper reviews notable contributions in PDF malware detection, including static, dynamic, signature-based, and hybrid analysis. It presents a comprehensive examination of PDF malware detection techniques, focusing on the emerging threat of adversarial sampling and the need for robust defense mechanisms. The paper highlights the vulnerability of machine learning classifiers to evasion attacks. It explores adversarial sampling techniques in PDF malware detection to produce mimicry and reverse mimicry evasion attacks, which aim to bypass detection systems. Improvements for future research are identified, including accessible methods, applying adversarial sampling techniques to malicious payloads, evaluating other models, evaluating the importance of features to malware, implementing adversarial defense techniques, and conducting comprehensive examination across various scenarios. By addressing these opportunities, researchers can enhance PDF malware detection and develop more resilient defense mechanisms against adversarial attacks.

Keywords: adversarial attacks, adversarial defense, adversarial machine learning, intrusion detection, PDF malware, malware detection, malware detection evasion

Procedia PDF Downloads 11
215 An Innovative Auditory Impulsed EEG and Neural Network Based Biometric Identification System

Authors: Ritesh Kumar, Gitanjali Chhetri, Mandira Bhatia, Mohit Mishra, Abhijith Bailur, Abhinav

Abstract:

The prevalence of the internet and technology in our day-to-day lives is creating more security issues than ever. The need to protect and provide secure access to private and business data has led to the development of many security systems. One potential solution is to employ biometric authentication. In this paper, we present an innovative biometric authentication method that utilizes a person’s EEG signal, which is acquired in response to an auditory stimulus and transferred wirelessly to a computer running the necessary ANN algorithm. A multilayer perceptron (MLP) neural network is used because of its ability to differentiate between information that is not linearly separable. To determine the weights of the hidden layer, we use Gaussian random weight initialization. The MLP uses a supervised learning technique called backpropagation to train the network. The complex algorithm used for EEG classification reduces the chances of intrusion into the protected public or private data.

Keywords: EEG signal, auditory evoked potential, biometrics, multilayer perceptron neural network, back propagation rule, Gaussian random weight initialization

Procedia PDF Downloads 361