Search results for: Supervised machine learning
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2914

2794 Classification Based on Deep Neural Cellular Automata Model

Authors: Yasser F. Hassan

Abstract:

Deep learning is a branch of machine learning and a great achievement in research and applications. Cellular neural networks are regarded as arrays of nonlinear analog processors, called cells, connected in a way that allows parallel computation. The paper discusses how to use a deep learning structure to represent a neural cellular automata model. The proposed learning technique for the cellular automata model is examined from the perspective of deep learning structure. A deep neural cellular automata system modifies each neuron based on the behavior of the individual and its decision as a result of multi-level deep structure learning. The paper presents the architecture of the model and gives simulation results for the approach. Results from the implementation enrich the deep neural cellular automata system and shed light on the concept formulation of the model and the learning in it.

Keywords: Cellular automata, neural cellular automata, deep learning, classification.

Downloads: 866
2793 Urban Big Data: An Experimental Approach to Building-Value Estimation Using Web-Based Data

Authors: Sun-Young Jang, Sung-Ah Kim, Dongyoun Shin

Abstract:

Current real-estate value estimation, difficult for laymen, is usually performed by specialists. This paper presents an automated estimation process based on big data and machine-learning technology that calculates the influence of building conditions on real-estate price measurement. The present study analyzed actual building sales sample data for Nonhyeon-dong, Gangnam-gu, Seoul, Korea, measuring the major influencing factors among the various building conditions. Further to that analysis, a prediction model was established and applied using RapidMiner Studio, a graphical user interface (GUI)-based tool for derivation of machine-learning prototypes. The prediction model is formulated by reference to previous examples. When new examples are applied, it analyzes them and predicts accordingly. The analysis process discerns the crucial factors affecting price increases by calculating weighted values. The model was verified, and its accuracy determined, by comparing its predicted values with actual price increases.

Keywords: Big data, building-value analysis, machine learning, price prediction.

Downloads: 1164
2792 Symmetrical Analysis of a Six-Phase Induction Machine Under Fault Conditions

Authors: E. K. Appiah, G. M'boungui, A. A. Jimoh, J. L. Munda, A. S. O. Ogunjuyigbe

Abstract:

The operational behavior of a six-phase squirrel cage induction machine with faulted stator terminals is presented in this paper. The study is carried out using the derived mathematical model of the machine in the arbitrary reference frame. Tests are conducted on a 1 kW experimental machine. Steady-state and dynamic performance are analyzed for the machine under unloaded and loaded conditions. The results show that with one of the stator phases experiencing either an open-circuit or a short-circuit fault, the machine still produces starting torque, although the running performance is significantly derated.

Keywords: Performance, fault conditions, six-phase induction machine.

Downloads: 2830
2791 Hand Gesture Interpretation Using Sensing Glove Integrated with Machine Learning Algorithms

Authors: Aqsa Ali, Aleem Mushtaq, Attaullah Memon, Monna

Abstract:

In this paper, we present a low-cost design for a smart glove that can perform sign language recognition to assist speech-impaired people. Specifically, we have designed and developed an Assistive Hand Gesture Interpreter that recognizes hand movements relevant to American Sign Language (ASL) and translates them into text for display on a Thin-Film-Transistor Liquid Crystal Display (TFT LCD) screen as well as synthetic speech. Linear Bayes Classifiers and Multilayer Neural Networks have been used to classify 11 feature vectors obtained from the sensors on the glove into one of the 27 ASL alphabets and a predefined gesture for space. Three types of features are used: bending, using six bend sensors; orientation in three dimensions, using accelerometers; and contacts at vital points, using contact sensors. To gauge the performance of the presented design, the training database was prepared using five volunteers. The accuracy of the current version on the prepared dataset was found to be up to 99.3% for the target user. The solution combines electronics, e-textile technology, sensor technology, embedded systems and machine learning techniques to build a low-cost wearable glove that is scrupulous, elegant and portable.

Keywords: American sign language, assistive hand gesture interpreter, human-machine interface, machine learning, sensing glove.

Downloads: 2731
2790 Comparative Evaluation of Accuracy of Selected Machine Learning Classification Techniques for Diagnosis of Cancer: A Data Mining Approach

Authors: Rajvir Kaur, Jeewani Anupama Ginige

Abstract:

With recent trends in Big Data and advancements in Information and Communication Technologies, the healthcare industry is in transition from clinician-oriented to technology-oriented. Many people around the world die of cancer because the disease was not diagnosed at an early stage. Nowadays, computational methods in the form of Machine Learning (ML) are used to develop automated decision support systems that can diagnose cancer with high confidence in a timely manner. This paper aims to carry out a comparative evaluation of a selected set of ML classifiers on two existing datasets: breast cancer and cervical cancer. The ML classifiers compared in this study are Decision Tree (DT), Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Logistic Regression, Ensemble (Bagged Tree) and Artificial Neural Networks (ANN). The evaluation is based on the standard evaluation metrics Precision (P), Recall (R), F1-score and Accuracy. The experimental results show that ANN achieved the highest accuracy (99.4%) when tested with the breast cancer dataset. On the other hand, when these ML classifiers were tested with the cervical cancer dataset, the Ensemble (Bagged Tree) technique gave better accuracy (93.1%) than the other classifiers.

Keywords: Artificial neural networks, breast cancer, cancer dataset, classifiers, cervical cancer, F-score, logistic regression, machine learning, precision, recall, support vector machine.

Downloads: 1553
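
Illustrative note on entry 2790 (not taken from the paper): the four metrics named in the abstract can all be derived from confusion-matrix counts. The following is a minimal Python sketch for the binary case; the function name `binary_metrics` and the toy labels are hypothetical and do not come from the study or its datasets.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, F1-score and accuracy for binary labels.

    Hypothetical helper for illustration only; it is not the evaluation
    code used in the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Toy labels (made up): 1 = malignant, 0 = benign.
example = binary_metrics([1, 1, 1, 0, 0, 0, 1, 0],
                         [1, 1, 0, 0, 0, 1, 1, 0])
print(example)
```

In a diagnostic setting, recall on the positive class is often examined alongside accuracy, because a missed positive case and a false alarm usually carry different costs.
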
2789 A Comparison of YOLO Family for Apple Detection and Counting in Orchards

Authors: Yuanqing Li, Changyi Lei, Zhaopeng Xue, Zhuo Zheng, Yanbo Long

Abstract:

In agricultural production and breeding, implementing automatic picking robots in orchard farming to reduce human labour and error is challenging. Their core function is automatic identification based on machine vision. This paper focuses on apple detection and counting in orchards and implements several deep learning methods. Extensive datasets are used and a semi-automatic annotation method is proposed. The proposed deep learning models belong to the state-of-the-art YOLO family. Because the models use various backbones, a detailed multi-dimensional comparison is made in terms of counting accuracy, mAP and model memory, laying the foundation for realising automatic precision agriculture.

Keywords: Agricultural object detection, Deep learning, machine vision, YOLO family.

Downloads: 1099
2788 Design Optimization of a Double Stator Cup-Rotor Machine

Authors: E. Diryak, P. Lefley, L. Petkovska, G. Cvetkovski

Abstract:

This paper presents the optimum design for a double stator, cup rotor machine, a novel type of BLDC PM machine. The optimization approach is divided into two stages: the first stage calculates the machine configuration using Matlab, and the second stage optimizes the machine using Finite Element Modeling (FEM). Under the design specifications, the machine model is selected from three pole numbers, namely 8, 10 and 12, with an appropriate slot number. A double stator brushless DC permanent magnet machine is designed to achieve low cogging torque, high electromagnetic torque and low ripple torque.

Keywords: Permanent magnet machine, low-cogging torque, low-ripple torque, high-electromagnetic torque, design optimization.

Downloads: 2166
2787 Automated Process Quality Monitoring with Prediction of Fault Condition Using Measurement Data

Authors: Hyun-Woo Cho

Abstract:

Detection of incipient abnormal events is important to improve the safety and reliability of machine operations and to reduce losses caused by failures. Improper set-up or alignment of parts often leads to severe problems in many machines. Constructing models that predict faulty conditions is essential for deciding when to perform machine maintenance. This paper presents a multivariate calibration monitoring approach based on the statistical analysis of machine measurement data. The calibration model is used to predict two faulty conditions from historical reference data. This approach utilizes genetic algorithm (GA)-based variable selection, and we evaluate the predictive performance of several prediction methods using real data. The results show that the calibration model based on supervised probabilistic principal component analysis (SPPCA) yielded the best performance in this work. By adopting a proper variable selection scheme in calibration models, the prediction performance can be improved by excluding non-informative variables from the model building steps.

Keywords: Prediction, operation monitoring, on-line data, nonlinear statistical methods, empirical model.

Downloads: 1658
2786 The Role of Synthetic Data in Aerial Object Detection

Authors: Ava Dodd, Jonathan Adams

Abstract:

The purpose of this study is to explore the characteristics of developing a machine learning application using synthetic data. The study is structured to develop the application for the purpose of deploying the computer vision model. The findings discuss the realities of attempting to develop a computer vision model for practical purpose, and detail the processes, tools and techniques that were used to meet accuracy requirements. The research reveals that synthetic data represent another variable that can be adjusted to improve the performance of a computer vision model. Further, a suite of tools and tuning recommendations are provided.

Keywords: computer vision, machine learning, synthetic data, YOLOv4

Downloads: 852
2785 A Machine Learning Based Framework for Education Levelling in Multicultural Countries: UAE as a Case Study

Authors: Shatha Ghareeb, Rawaa Al-Jumeily, Thar Baker

Abstract:

In Abu Dhabi, the private schools and quality assurance sector supervises many private schools serving many nationalities, and many different education curriculums are offered to meet expats’ needs. These curriculums have different requirements for registration and success, and different starting ages. Each curriculum also has a different number of years, assessment techniques, reassessment rules, and exam boards. Currently, students who transfer between curriculums are often not placed in the right year group, because the start and end dates of the academic year and the date-of-birth range for each year group differ between curriculums. As a result, students may be younger or older than their year group, which creates gaps in their learning and performance. In addition, there is no way of storing student data throughout their academic journey so that schools can track the student learning process. In this paper, we propose to develop a computational framework applicable in multicultural countries such as the UAE, in which multiple education systems are implemented. The ultimate goal is to use cloud and fog computing technology integrated with Artificial Intelligence techniques, specifically Machine Learning, to aid a smooth transition when assigning students to their year groups, and to provide leveling and differentiation information for students who relocate from one education curriculum to another, whilst also making it possible to store and access student data from anywhere throughout their academic journey.

Keywords: Admissions, algorithms, cloud computing, differentiation, fog computing, leveling, machine learning.

Downloads: 724
2784 Stackelberg Security Game for Optimizing Security of Federated Internet of Things Platform Instances

Authors: Violeta Damjanovic-Behrendt

Abstract:

This paper presents an approach for optimal cyber security decisions to protect instances of a federated Internet of Things (IoT) platform in the cloud. The presented solution implements the repeated Stackelberg Security Game (SSG) and a model called Stochastic Human behaviour model with AttRactiveness and Probability weighting (SHARP). SHARP employs the Subjective Utility Quantal Response (SUQR) for formulating a subjective utility function, which is based on the evaluations of alternative solutions during decision-making. We augment the repeated SSG (including SHARP and SUQR) with a reinforced learning algorithm called Naïve Q-Learning. Naïve Q-Learning belongs to the category of active and model-free Machine Learning (ML) techniques in which the agent (either the defender or the attacker) attempts to find an optimal security solution. In this way, we combine GT and ML algorithms for discovering optimal cyber security policies. The proposed security optimization components will be validated in a collaborative cloud platform that is based on the Industrial Internet Reference Architecture (IIRA) and its recently published security model.

Keywords: Security, internet of things, cloud computing, Stackelberg security game, machine learning, Naïve Q-learning.

Downloads: 1644
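
Illustrative note on entry 2784 (not taken from the paper): the abstract describes Q-learning as a model-free technique in which an agent searches for an optimal policy. As a rough sketch, the standard tabular Q-learning update is shown below in Python. The "Naïve Q-Learning" variant used in the paper may differ; the state names, action names, and reward value here are hypothetical.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """Apply one standard tabular Q-learning update and return the new value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))

    alpha is the learning rate and gamma the discount factor. This is the
    textbook rule, assumed here for illustration, not the paper's algorithm.
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical defender agent with two actions.
actions = ["patch", "monitor"]
q_table = defaultdict(float)          # unseen (state, action) pairs start at 0.0
q_table[("s1", "patch")] = 2.0        # pretend this value was learned earlier
new_value = q_update(q_table, "s0", "patch", reward=1.0,
                     next_state="s1", actions=actions)
print(new_value)
```

The update needs no model of the environment's transition probabilities, which is what "model-free" refers to: only observed (state, action, reward, next state) samples are used.
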
2783 Pruning Method of Belief Decision Trees

Authors: Salsabil Trabelsi, Zied Elouedi, Khaled Mellouli

Abstract:

The belief decision tree (BDT) approach is a decision tree in an uncertain environment where the uncertainty is represented through the Transferable Belief Model (TBM), one interpretation of the belief function theory. The uncertainty can appear either in the actual class of training objects or in the attribute values of objects to classify. In this paper, we develop a post-pruning method for belief decision trees in order to reduce their size and improve classification accuracy on unseen cases. The pruning of decision trees has received considerable attention in machine learning.

Keywords: machine learning, uncertainty, belief function theory, belief decision tree, pruning.

Downloads: 1910
2782 Gaits Stability Analysis for a Pneumatic Quadruped Robot Using Reinforcement Learning

Authors: Soofiyan Atar, Adil Shaikh, Sahil Rajpurkar, Pragnesh Bhalala, Aniket Desai, Irfan Siddavatam

Abstract:

Deep reinforcement learning (deep RL) algorithms leverage the symbolic power of complex controllers by automatically mapping sensory inputs to low-level actions. Deep RL removes the need to model complex robot dynamics, with minimal engineering. However, deploying deep RL directly in real-world scenarios involves high risk, and it is highly sensitive to hyperparameters. Tuning hyperparameters on a pneumatic quadruped robot through trial-and-error learning becomes very expensive. This paper presents an automated learning control for a pneumatic quadruped robot using sample-efficient deep Q-learning, enabling minimal tuning and very few trials to train the neural network. Long training hours may degrade the pneumatic cylinders because of jerky actions originating from stochastic weights. We applied this method to the pneumatic quadruped robot, which resulted in a hopping gait. In our process, we eliminated the use of a simulator and acquired a stable gait. The approach evolves so that the resulting gait becomes more robust to stochastic changes in the environment. We further show that our algorithm performed very well compared to a programmed gait using robot dynamics.

Keywords: model-based reinforcement learning, gait stability, supervised learning, pneumatic quadruped

Downloads: 588
2781 An Educational Data Mining System for Advising Higher Education Students

Authors: Heba Mohammed Nagy, Walid Mohamed Aly, Osama Fathy Hegazy

Abstract:

Educational data mining is a specific data mining field applied to data originating from educational environments. It relies on different approaches to discover hidden knowledge from the available data. Among these approaches are machine learning techniques, which are used to build a system that learns from previous data. Machine learning can be applied to solve different regression, classification, clustering and optimization problems.

In our research, we propose a “Student Advisory Framework” that utilizes classification and clustering to build an intelligent system. This system can be used to advise a first-year university student to pursue an education track in which he/she is likely to succeed, aiming to decrease the high rate of academic failure among these students. A real case study at the Cairo Higher Institute for Engineering, Computer Science and Management is presented using a real dataset collected from 2000 to 2012. The dataset has two main components: a pre-higher education dataset and a first-year course results dataset. Results have proved the efficiency of the suggested framework.

Keywords: Classification, Clustering, Educational Data Mining (EDM), Machine Learning.

Downloads: 5213
2780 Consumer Load Profile Determination with Entropy-Based K-Means Algorithm

Authors: Ioannis P. Panapakidis, Marios N. Moschakis

Abstract:

With the continuous increase in smart meter installations across the globe, the need for processing of the load data is evident. Clustering-based load profiling is built upon unsupervised machine learning tools for the purpose of formulating typical load curves or load profiles. The most commonly used algorithm in the load profiling literature is K-means. While the algorithm has been successfully tested in a variety of applications, its drawback is its strong dependence on the initialization phase. This paper proposes a novel modified form of K-means that addresses this problem. Simulation results indicate the superiority of the proposed algorithm compared to standard K-means.

Keywords: Clustering, load profiling, load modeling, machine learning, energy efficiency and quality.

Downloads: 1211
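
Illustrative note on entry 2780 (not taken from the paper): the dependence of K-means on its initialization, mentioned in the abstract, can be seen on a tiny one-dimensional example. The Python sketch below runs plain K-means (Lloyd's algorithm), not the entropy-based variant the paper proposes; the data points and starting centroids are made up.

```python
def kmeans_1d(points, initial_centroids, max_iter=100):
    """Plain 1-D K-means (Lloyd's algorithm).

    Returns the final centroids and the within-cluster sum of squared
    errors (SSE). Illustration only; not the paper's modified algorithm.
    """
    centroids = [float(c) for c in initial_centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for x in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(x - centroids[i]))
            clusters[nearest].append(x)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [sum(c) / len(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    sse = sum((x - centroids[i]) ** 2
              for i, c in enumerate(clusters) for x in c)
    return centroids, sse


# Made-up "load" values forming three natural groups, clustered with k = 2.
data = [0, 1, 10, 11, 20, 21]
centroids_a, sse_a = kmeans_1d(data, [0, 1])    # both seeds in the first group
centroids_b, sse_b = kmeans_1d(data, [0, 21])   # seeds at the two extremes
print(centroids_a, sse_a)
print(centroids_b, sse_b)
```

The two runs converge to different partitions with different SSE values, even though the data and the number of clusters are identical. This is the behaviour that initialization-aware variants aim to reduce.
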
2779 Comprehensive Analysis of Data Mining Tools

Authors: S. Sarumathi, N. Shanthi

Abstract:

Due to fast and flawless technological innovation, a tremendous amount of data is being accumulated all over the world in every domain, such as Pattern Recognition, Machine Learning, Spatial Data Mining, Image Analysis, Fraudulent Analysis, and the World Wide Web. This makes the development of tools for data mining functionalities increasingly essential. The major aim of this paper is to analyze various tools used to build a resourceful analytical or descriptive model for handling large amounts of information more efficiently and in a more user-friendly way. In this survey, the diverse tools are illustrated with their extensive technical paradigms, outstanding graphical interfaces and built-in multipath algorithms, which are very useful for handling significant amounts of data.

Keywords: Classification, Clustering, Data Mining, Machine learning, Visualization.

Downloads: 2439
2778 Distributed System Computing Resource Scheduling Algorithm Based on Deep Reinforcement Learning

Authors: Yitao Lei, Xingxiang Zhai, Burra Venkata Durga Kumar

Abstract:

As the quantity and complexity of computing in large-scale software systems increase, distributed system computing becomes increasingly important. A distributed system achieves high-performance computing through collaboration between different computing resources. Without efficient resource scheduling, the abuse of distributed computing may cause resource waste and high costs. Resource scheduling is usually an NP-hard problem, so a general solution cannot be found. Some optimization algorithms exist, such as genetic algorithms and ant colony optimization, but the large scale of distributed systems makes these traditional optimization algorithms challenging to work with. Heuristic and machine learning algorithms are usually applied in this situation to ease the computing load. We therefore review traditional resource scheduling optimization algorithms and introduce a deep reinforcement learning method that utilizes the perceptual ability of neural networks and the decision-making ability of reinforcement learning. Using the machine learning method, we try to find important factors that influence the performance of distributed system computing and to help the distributed system schedule computing resources efficiently. This paper surveys the application of deep reinforcement learning to distributed system computing resource scheduling. The research proposes a deep reinforcement learning method that uses a recurrent neural network to optimize resource scheduling. The paper concludes with the challenges and improvement directions for deep reinforcement learning-based resource scheduling algorithms.

Keywords: Resource scheduling, deep reinforcement learning, distributed system, artificial intelligence.

Downloads: 495
2777 Improving the Performance of Back-Propagation Training Algorithm by Using ANN

Authors: Vishnu Pratap Singh Kirar

Abstract:

An Artificial Neural Network (ANN) can be trained using back propagation (BP). It is the most widely used algorithm for supervised learning with multi-layered feed-forward networks. Efficient learning by the BP algorithm is required for many practical applications. The BP algorithm calculates the weight changes of artificial neural networks, and a common approach is to use a two-term algorithm consisting of a learning rate (LR) and a momentum factor (MF). The major drawbacks of the two-term BP learning algorithm are the problems of local minima and slow convergence speeds, which limit the scope for real-time applications. Recently, the addition of an extra term, called a proportional factor (PF), to the two-term BP algorithm was proposed. The third term increases the speed of the BP algorithm. However, the PF term also reduces the convergence of the BP algorithm, and criteria for evaluating convergence are required to facilitate the application of the three-term BP algorithm. Although these two seem to be closely related, as described later, we summarize various improvements to overcome the drawbacks. Here we compare the different methods of convergence of the new three-term BP algorithm.

Keywords: Neural Network, Backpropagation, Local Minima, Fast Convergence Rate.

Downloads: 3559
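
Illustrative note on entry 2777 (not taken from the paper): the abstract names the three terms (learning rate, momentum factor, proportional factor) but does not state the weight-update rule. The Python sketch below assumes one commonly described form, in which the weight change is the sum of a gradient term, a momentum term, and a term proportional to the current output error. The exact form and sign convention of the proportional term are assumptions, and all numeric values are made up.

```python
def three_term_step(weight, prev_delta, gradient, error,
                    learning_rate=0.1, momentum=0.9, proportional=0.05):
    """One assumed three-term back-propagation update for a single weight.

    delta = -learning_rate * gradient      (LR term: gradient descent)
            + momentum * prev_delta        (MF term: reuse of the last step)
            + proportional * error         (PF term: assumed error-driven form)

    Setting proportional=0 recovers the usual two-term (LR + MF) update.
    Returns the updated weight and the delta to carry into the next step.
    """
    delta = (-learning_rate * gradient
             + momentum * prev_delta
             + proportional * error)
    return weight + delta, delta


# Made-up values for a single update step.
new_weight, delta = three_term_step(weight=1.0, prev_delta=0.5,
                                    gradient=2.0, error=-1.0)
print(new_weight, delta)
```

Because the proportional term adds to the step regardless of the local gradient, it can speed up progress on flat regions of the error surface, but it can also make the iteration harder to keep stable, which is consistent with the convergence concern raised in the abstract.
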
2776 Application of Granular Computing Paradigm in Knowledge Induction

Authors: Iftikhar U. Sikder

Abstract:

This paper illustrates an application of a granular computing approach, namely rough set theory, in data mining. The paper outlines the formalism of granular computing and elucidates the mathematical underpinning of rough set theory, which has been widely used by the data mining and machine learning communities. A real-world application is illustrated, and the classification performance is compared with other contending machine learning algorithms. The predictive performance of the rough set rule induction model shows comparative success with respect to other contending algorithms.

Keywords: Concept approximation, granular computing, reducts, rough set theory, rule induction.

Downloads: 834
2775 Adaptive Educational Hypermedia System for High School Students Based on Learning Styles

Authors: Stephen Akuma, Timothy Ndera

Abstract:

Information seekers get “lost in hyperspace” due to the voluminous documents updated daily on the internet. Adaptive Hypermedia Systems (AHS) are used to direct learners to their target goals. One of the most common AHS designed to help information seekers overcome the problem of information overload is the Adaptive Education Hypermedia System (AEHS). This paper focuses on an AEHS that adopts the learning preference of high school students and delivers learning content according to this preference throughout their learning experience. The research developed a prototype system for predicting students’ learning preference from the Visual, Aural, Read-Write and Kinesthetic (VARK) learning style model and adopting the learning content suitable to their preference. The predictive strength of several classifiers was compared, and we found Support Vector Machine (SVM) to be the most accurate in predicting learning style based on users’ preferences.

Keywords: Hypermedia, adaptive education, learning style, lesson content, user profile, prediction, feedback, adaptive hypermedia.

Downloads: 847
2774 Comparison of Machine Learning Techniques for Single Imputation on Audiograms

Authors: Sarah Beaver, Renee Bryce

Abstract:

Audiograms detect hearing impairment, but missing values pose problems. This work explores imputation in an attempt to improve accuracy. It implements Linear Regression, Lasso, Linear Support Vector Regression, Bayesian Ridge, K Nearest Neighbors (KNN), and Random Forest machine learning techniques to impute audiogram frequencies ranging from 125 Hz to 8000 Hz. The data contain patients who had or were candidates for cochlear implants. Accuracy is compared across two different Nested Cross-Validation k values. Over 4000 audiograms were used from 800 unique patients. Additionally, training on combined left and right ear audiograms is compared with training on single-ear audiograms. The Root Mean Square Error (RMSE) values for the best Random Forest models range from 4.74 to 6.37, and their R² values range from .91 to .96. The RMSE values for the best KNN models range from 5.00 to 7.72, and their R² values range from .89 to .95. The best imputation models achieved R² between .89 and .96 and RMSE values less than 8 dB. We also show that classification predictive models were two percent more accurate with our imputation models than with constant imputations.

Keywords: Machine Learning, audiograms, data imputations, single imputations.

Downloads: 161
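
Illustrative note on entry 2774 (not taken from the paper): the abstract reports imputation quality as RMSE and R². A minimal Python sketch of both metrics follows, using their standard definitions; the threshold values in the example are made up and are not from the study's audiograms.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error: the square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant (otherwise SS_tot would be zero).
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up hearing thresholds in dB: measured values vs. imputed values.
measured = [20, 30, 40, 50]
imputed = [25, 30, 35, 50]
print(rmse(measured, imputed), r_squared(measured, imputed))
```

RMSE is expressed in the same unit as the target (dB here), so it reads as a typical error size, while R² is unitless and describes the share of variance in the measured values that the imputed values account for.
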
2773 Neural-Symbolic Machine-Learning for Knowledge Discovery and Adaptive Information Retrieval

Authors: Hager Kammoun, Jean Charles Lamirel, Mohamed Ben Ahmed

Abstract:

In this paper, a model for an information retrieval system is proposed which takes into account that knowledge about documents and the information needs of users are dynamic. Two methods are combined, one qualitative or symbolic and the other quantitative or numeric, which are deemed suitable for many clustering contexts, data analysis, concept exploration and knowledge discovery. These two methods may be classified as inductive learning techniques. In this model, they are introduced to build “long-term” knowledge about past queries and concepts in a collection of documents. The “long-term” knowledge can guide and assist the user in formulating an initial query and can be exploited in the process of retrieving relevant information. The different kinds of knowledge are organized in different points of view. This may be considered an enrichment of the exploration level which is coherent with the concept of document/query structure.

Keywords: Information Retrieval Systems, machine learning, classification, Galois lattices, Self Organizing Map.

Downloads: 1189
2772 Forecasting Fraudulent Financial Statements using Data Mining

Authors: S. Kotsiantis, E. Koumanakos, D. Tzelepis, V. Tampakas

Abstract:

This paper explores the effectiveness of machine learning techniques in detecting firms that issue fraudulent financial statements (FFS) and deals with the identification of factors associated with FFS. To this end, a number of experiments have been conducted using representative learning algorithms, which were trained using a data set of 164 fraud and non-fraud Greek firms in the period 2001-2002. The decision of which particular method to choose is a complicated problem. A good alternative to choosing only one method is to create a hybrid forecasting system incorporating a number of possible solution methods as components (an ensemble of classifiers). For this purpose, we have implemented a hybrid decision support system that combines the representative algorithms using a stacking variant methodology and achieves better performance than any examined simple and ensemble method. To sum up, this study indicates that the investigation of financial information can be used in the identification of FFS and underlines the importance of financial ratios.

Keywords: Machine learning, stacking, classifier.

Downloads: 3053
2771 Comparison of Machine Learning and Deep Learning Algorithms for Automatic Classification of 80 Different Pollen Species

Authors: Endrick Barnacin, Jean-Luc Henry, Jimmy Nagau, Jack Molinié

Abstract:

Palynology is a field of interest in many disciplines due to its multiple applications: chronological dating, climatology, allergy treatment, and honey characterization. Unfortunately, the analysis of a pollen slide is a complicated and time-consuming task that requires the intervention of experts in the field, who are becoming increasingly rare due to economic and social conditions. In this context, the automation of this task is urgent. In this work, we compare classical feature extraction methods (Shape, GLCM, LBP, and others) and Deep Learning (CNN and Transfer Learning) to perform a recognition task over 80 regional pollen species. It has been found that the use of Transfer Learning seems to be more precise than the other approaches.

Keywords: Image segmentation, stuck particles separation, Sobel operator, thresholding.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 201
2770 Hybrid Approach for Software Defect Prediction Using Machine Learning with Optimization Technique

Authors: C. Manjula, Lilly Florence

Abstract:

Software technology is developing rapidly, which leads to the growth of various industries. Nowadays, software-based applications are widely adopted for business purposes. For any software company, developing reliable software is a challenging task because a faulty software module may harm the growth of industry and business. Hence, there is a need for techniques that can predict software defects early. Due to the complexity of manual prediction, automated software defect prediction techniques have been introduced. These techniques learn patterns from previous software versions and use them to find defects in the current version. They have attracted researchers because identifying bugs in software has a significant impact on industrial growth. Several studies have been carried out, but achieving desirable defect prediction performance is still a challenging task. To address this issue, we present a machine learning based hybrid technique for software defect prediction. First, a Genetic Algorithm (GA) is presented in which an improved fitness function is used for better optimization of features in data sets. These features are then processed through a Decision Tree (DT) classification model. Finally, an experimental study is presented in which results from the proposed GA-DT based hybrid approach are compared with those from the DT classification technique. The results show that the proposed hybrid approach achieves better classification accuracy.

Keywords: Decision tree, genetic algorithm, machine learning, software defect prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1465
2769 Application of Extreme Learning Machine Method for Time Series Analysis

Authors: Rampal Singh, S. Balasundaram

Abstract:

In this paper, we study the application of the Extreme Learning Machine (ELM) algorithm for single-hidden-layer feedforward neural networks to non-linear chaotic time series problems. In this algorithm, the input weights and the hidden layer biases are randomly chosen. The ELM formulation leads to solving a system of linear equations in terms of the unknown weights connecting the hidden layer to the output layer. The solution of this general system of linear equations is obtained using the Moore-Penrose generalized pseudoinverse. To study the application of the method, we consider the time series generated by the Mackey-Glass delay differential equation with different time delays, the Santa Fe A series, and the UCR heart beat rate ECG time series. For the sigmoid, sin and hardlim activation functions, the optimal values for the memory order and the number of hidden neurons that give the best prediction performance in terms of root mean square error are determined. The results obtained are in close agreement with the exact solution of the problems considered, which shows that ELM is a very promising alternative method for time series prediction.
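The training procedure described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`make_windows`, `train_elm`, `predict_elm`), the uniform weight range, the hidden layer size, and the toy sine series (a stand-in for the Mackey-Glass, Santa Fe A and ECG data) are all assumptions made for illustration. Only the sigmoid activation is shown.

```python
import numpy as np


def make_windows(series, order):
    """Turn a 1-D series into (inputs, targets) using `order` past values."""
    X = np.array([series[i:i + order] for i in range(len(series) - order)])
    y = series[order:]
    return X, y


def hidden_output(X, W, b):
    """Sigmoid activations of the hidden layer."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def train_elm(X, y, n_hidden=20, seed=0):
    """Fit an ELM: random hidden layer, least-squares output weights."""
    rng = np.random.default_rng(seed)
    # Input weights and hidden biases are drawn at random and never updated.
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = hidden_output(X, W, b)
    # Output weights solve H @ beta = y via the Moore-Penrose pseudoinverse.
    beta = np.linalg.pinv(H) @ y
    return W, b, beta


def predict_elm(X, W, b, beta):
    """One-step-ahead predictions from a trained ELM."""
    return hidden_output(X, W, b) @ beta


# Toy series (assumption): a sine wave, used only to exercise the code.
series = np.sin(0.2 * np.arange(300))
X, y = make_windows(series, order=4)   # memory order 4 is arbitrary
W, b, beta = train_elm(X, y, n_hidden=20, seed=0)
rmse = float(np.sqrt(np.mean((predict_elm(X, W, b, beta) - y) ** 2)))
print(f"training RMSE: {rmse:.6f}")
```

Searching over `order` and `n_hidden` and comparing the RMSE on held-out data would correspond to the selection of memory order and hidden neuron count mentioned in the abstract.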

Keywords: Chaotic time series, Extreme learning machine, Generalization performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3519
2768 Hybrid Machine Learning Approach for Text Categorization

Authors: Nerijus Remeikis, Ignas Skucas, Vida Melninkaite

Abstract:

Text categorization, the assignment of natural language documents to one or more predefined categories based on their semantic content, is an important component in many information organization and management tasks. The learning performance of neural networks is known to be sensitive to the initial weights and architecture. This paper discusses the use of a decision tree classifier to initialize a multilayer neural network in order to improve text categorization accuracy. An adaptation of the algorithm is proposed in which a decision tree, from the root node to a final leaf, is used to initialize the multilayer neural network. The experimental evaluation demonstrates that this approach provides better classification accuracy on the Reuters-21578 corpus, one of the standard benchmarks for text categorization tasks. We present results comparing the accuracy of this approach with that of a multilayer neural network initialized with the traditional random method and with that of decision tree classifiers.

Keywords: Text categorization, decision trees, neural networks, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1806
2767 3D Human Reconstruction over Cloud Based Image Data via AI and Machine Learning

Authors: Kaushik Sathupadi, Sandesh Achar

Abstract:

Human action recognition (HAR) modeling is a critical task in machine learning. These systems require better techniques for recognizing body parts and selecting optimal features based on vision sensors to identify complex action patterns efficiently. Still, considerable gaps and challenges remain when working with images and videos, such as brightness, motion variation, and random clutter. This paper proposes a robust approach for classifying human actions over cloud-based image data. First, we apply pre-processing, followed by human and outer shape detection techniques. Next, we extract valuable information in terms of cues. We extract two distinct features: fuzzy local binary patterns and sequence representation. Then, we apply a greedy randomized adaptive search procedure for data optimization and dimension reduction, and we use a random forest for classification. We tested our model on two benchmark datasets, AAMAZ and KTH Multi-view Football. Our HAR framework significantly outperforms other state-of-the-art approaches, achieving recognition rates of 91% and 89.6% on the AAMAZ and KTH Multi-view Football datasets, respectively.

Keywords: Computer vision, human motion analysis, random forest, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 36
2766 A Hybrid Gene Selection Technique Using Improved Mutual Information and Fisher Score for Cancer Classification Using Microarrays

Authors: M. Anidha, K. Premalatha

Abstract:

Feature selection is essential for effective classification in cancer diagnosis. However, in microarray gene expression datasets, the large number of features relative to the number of samples makes classification computationally very hard and prone to errors. In this paper, we present an innovative method for selecting highly informative gene subsets from gene expression data that effectively classifies the cancer data into tumorous and non-tumorous. The hybrid gene selection technique combines Mutual Information and Fisher score to select informative genes. The gene selection is validated by classification using a Support Vector Machine (SVM), a supervised learning algorithm capable of solving complex classification problems. The improved Mutual Information and F-Score, with SVM as the classifier, produced efficient results.
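As an illustration of the Fisher score component only, the sketch below ranks features by the commonly used ratio of between-class scatter to within-class scatter. The abstract does not specify which Fisher score variant, which "improved" Mutual Information measure, or how the two are combined, so the formula, the function names (`fisher_scores`, `top_k_genes`) and the toy expression matrix are all assumptions, not the authors' method.

```python
from statistics import fmean, pvariance


def fisher_scores(X, y):
    """Fisher score per feature: between-class over within-class scatter.

    X is a list of samples (each a list of feature values), y the class labels.
    """
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = fmean(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            between += len(values) * (fmean(values) - overall_mean) ** 2
            within += len(values) * pvariance(values)
        # A feature that is constant within every class separates perfectly.
        scores.append(between / within if within > 0 else float("inf"))
    return scores


def top_k_genes(X, y, k):
    """Indices of the k features with the highest Fisher score."""
    scores = fisher_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


# Toy data (assumption): 6 samples, 2 "genes"; gene 0 separates the classes,
# gene 1 has identical class means and carries no class information.
X = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0],
     [3.0, 4.5], [3.2, 3.5], [2.8, 4.0]]
y = [0, 0, 0, 1, 1, 1]
print(fisher_scores(X, y))
print(top_k_genes(X, y, k=1))
```

In a pipeline like the one the abstract describes, the selected feature indices would then be passed to a classifier such as an SVM to check whether the reduced gene subset still separates the classes.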

Keywords: Gene selection, mutual information, Fisher score, classification, SVM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1152
2765 Extraction of Significant Phrases from Text

Authors: Yuan J. Lui

Abstract:

Prospective readers can quickly determine whether a document is relevant to their information need if the significant phrases (or keyphrases) in this document are provided. Although keyphrases are useful, not many documents have keyphrases assigned to them, and manually assigning keyphrases to existing documents is costly. Therefore, there is a need for automatic keyphrase extraction. This paper introduces a new domain independent keyphrase extraction algorithm. The algorithm approaches the problem of keyphrase extraction as a classification task, and uses a combination of statistical and computational linguistics techniques, a new set of attributes, and a new machine learning method to distinguish keyphrases from non-keyphrases. The experiments indicate that this algorithm performs better than other keyphrase extraction tools and that it significantly outperforms Microsoft Word 2000's AutoSummarize feature. The domain independence of this algorithm has also been confirmed in our experiments.

Keywords: Classification, keyphrase extraction, machine learning, summarization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2051