Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 2315

Search results for: classifier algorithms

2225 M-Machine Assembly Scheduling Problem to Minimize Total Tardiness with Non-Zero Setup Times

Authors: Harun Aydilek, Asiye Aydilek, Ali Allahverdi

Abstract:

Our objective is to minimize the total tardiness in an m-machine two-stage assembly flowshop scheduling problem. The objective is an important performance measure because of the fact that the fulfillment of due dates of customers has to be taken into account while making scheduling decisions. In the literature, the problem is considered with zero setup times which may not be realistic and appropriate for some scheduling environments. Considering separate setup times from processing times increases machine utilization by decreasing the idle time and reduces total tardiness. We propose two new algorithms and adapt four existing algorithms in the literature which are different versions of simulated annealing and genetic algorithms. Moreover, a dominance relation is developed based on the mathematical formulation of the problem. The developed dominance relation is incorporated in our proposed algorithms. Computational experiments are conducted to investigate the performance of the newly proposed algorithms. We find that one of the proposed algorithms performs significantly better than the others, i.e., the error of the best algorithm is less than those of the other algorithms by minimum 50%. The newly proposed algorithm is also efficient for the case of zero setup times and performs better than the best existing algorithm in the literature.

Keywords: algorithm, assembly flowshop, scheduling, simulation, total tardiness

Procedia PDF Downloads 329

2224 Application of Machine Learning Techniques in Forest Cover-Type Prediction

Authors: Saba Ebrahimi, Hedieh Ashrafi

Abstract:

Predicting the cover type of forests is a challenge for natural resource managers. In this project, we aim to perform a comprehensive comparative study of two well-known classification methods, support vector machine (SVM) and decision tree (DT). The comparison is first performed among different types of each classifier, and then the best of each classifier will be compared by considering different evaluation metrics. The effect of boosting and bagging for decision trees is also explored. Furthermore, the effect of principal component analysis (PCA) and feature selection is also investigated. During the project, the forest cover-type dataset from the remote sensing and GIS program is used in all computations.

Keywords: classification methods, support vector machine, decision tree, forest cover-type dataset

Procedia PDF Downloads 217

2223 The Modelling of Real Time Series Data

Authors: Valeria Bondarenko

Abstract:

We proposed algorithms for: estimation of parameters fBm (volatility and Hurst exponent) and for the approximation of random time series by functional of fBm. We proved the consistency of the estimators, which constitute the above algorithms, and proved the optimal forecast of approximated time series. The adequacy of estimation algorithms, approximation, and forecasting is proved by numerical experiment. During the process of creating software, the system has been created, which is displayed by the hierarchical structure. The comparative analysis of proposed algorithms with the other methods gives evidence of the advantage of approximation method. The results can be used to develop methods for the analysis and modeling of time series describing the economic, physical, biological and other processes.

Keywords: mathematical model, random process, Wiener process, fractional Brownian motion

Procedia PDF Downloads 358

2222 Double Clustering as an Unsupervised Approach for Order Picking of Distributed Warehouses

Authors: Hsin-Yi Huang, Ming-Sheng Liu, Jiun-Yan Shiau

Abstract:

Planning the order picking lists of warehouses to achieve when the costs associated with logistics on the operational performance is a significant challenge. In e-commerce era, this task is especially important productive processes are high. Nowadays, many order planning techniques employ supervised machine learning algorithms. However, the definition of which features should be processed by such algorithms is not a simple task, being crucial to the proposed technique’s success. Against this background, we consider whether unsupervised algorithms can enhance the planning of order-picking lists. A Zone2 picking approach, which is based on using clustering algorithms twice, is developed. A simplified example is given to demonstrate the merit of our approach.

Keywords: order picking, warehouse, clustering, unsupervised learning

Procedia PDF Downloads 159

2221 Preliminary Study of Hand Gesture Classification in Upper-Limb Prosthetics Using Machine Learning with EMG Signals

Authors: Linghui Meng, James Atlas, Deborah Munro

Abstract:

There is an increasing demand for prosthetics capable of mimicking natural limb movements and hand gestures, but precise movement control of prosthetics using only electrode signals continues to be challenging. This study considers the implementation of machine learning as a means of improving accuracy and presents an initial investigation into hand gesture recognition using models based on electromyographic (EMG) signals. EMG signals, which capture muscle activity, are used as inputs to machine learning algorithms to improve prosthetic control accuracy, functionality and adaptivity. Using logistic regression, a machine learning classifier, this study evaluates the accuracy of classifying two hand gestures from the publicly available Ninapro dataset using two-time series feature extraction algorithms: Time Series Feature Extraction (TSFE) and Convolutional Neural Networks (CNNs). Trials were conducted using varying numbers of EMG channels from one to eight to determine the impact of channel quantity on classification accuracy. The results suggest that although both algorithms can successfully distinguish between hand gesture EMG signals, CNNs outperform TSFE in extracting useful information for both accuracy and computational efficiency. In addition, although more channels of EMG signals provide more useful information, they also require more complex and computationally intensive feature extractors and consequently do not perform as well as lower numbers of channels. The findings also underscore the potential of machine learning techniques in developing more effective and adaptive prosthetic control systems.

Keywords: EMG, machine learning, prosthetic control, electromyographic prosthetics, hand gesture classification, CNN, computational neural networks, TSFE, time series feature extraction, channel count, logistic regression, ninapro, classifiers

Procedia PDF Downloads 29

2220 Algorithms Minimizing Total Tardiness

Authors: Harun Aydilek, Asiye Aydilek, Ali Allahverdi

Abstract:

The total tardiness is a widely used performance measure in the scheduling literature. This performance measure is particularly important in situations where there is a cost to complete a job beyond its due date. The cost of scheduling increases as the gap between a job's due date and its completion time increases. Such costs may also be penalty costs in contracts, loss of goodwill. This performance measure is important as the fulfillment of due dates of customers has to be taken into account while making scheduling decisions. The problem is addressed in the literature, however, it has been assumed zero setup times. Even though this assumption may be valid for some environments, it is not valid for some other scheduling environments. When setup times are treated as separate from processing times, it is possible to increase machine utilization and to reduce total tardiness. Therefore, non-zero setup times need to be considered as separate. A dominance relation is developed and several algorithms are proposed. The developed dominance relation is utilized in the proposed algorithms. Extensive computational experiments are conducted for the evaluation of the algorithms. The experiments indicated that the developed algorithms perform much better than the existing algorithms in the literature. More specifically, one of the newly proposed algorithms reduces the error of the best existing algorithm in the literature by 40 percent.

Keywords: algorithm, assembly flowshop, dominance relation, total tardiness

Procedia PDF Downloads 354

2219 Classification of Forest Types Using Remote Sensing and Self-Organizing Maps

Authors: Wanderson Goncalves e Goncalves, José Alberto Silva de Sá

Abstract:

Human actions are a threat to the balance and conservation of the Amazon forest. Therefore the environmental monitoring services play an important role as the preservation and maintenance of this environment. This study classified forest types using data from a forest inventory provided by the 'Florestal e da Biodiversidade do Estado do Pará' (IDEFLOR-BIO), located between the municipalities of Santarém, Juruti and Aveiro, in the state of Pará, Brazil, covering an area approximately of 600,000 hectares, Bands 3, 4 and 5 of the TM-Landsat satellite image, and Self - Organizing Maps. The information from the satellite images was extracted using QGIS software 2.8.1 Wien and was used as a database for training the neural network. The midpoints of each sample of forest inventory have been linked to images. Later the Digital Numbers of the pixels have been extracted, composing the database that fed the training process and testing of the classifier. The neural network was trained to classify two forest types: Rain Forest of Lowland Emerging Canopy (Dbe) and Rain Forest of Lowland Emerging Canopy plus Open with palm trees (Dbe + Abp) in the Mamuru Arapiuns glebes of Pará State, and the number of examples in the training data set was 400, 200 examples for each class (Dbe and Dbe + Abp), and the size of the test data set was 100, with 50 examples for each class (Dbe and Dbe + Abp). Therefore, total mass of data consisted of 500 examples. The classifier was compiled in Orange Data Mining 2.7 Software and was evaluated in terms of the confusion matrix indicators. The results of the classifier were considered satisfactory, and being obtained values of the global accuracy equal to 89% and Kappa coefficient equal to 78% and F1 score equal to 0,88. It evaluated also the efficiency of the classifier by the ROC plot (receiver operating characteristics), obtaining results close to ideal ratings, showing it to be a very good classifier, and demonstrating the potential of this methodology to provide ecosystem services, particularly in anthropogenic areas in the Amazon.

Keywords: artificial neural network, computational intelligence, pattern recognition, unsupervised learning

Procedia PDF Downloads 361

2218 From Electroencephalogram to Epileptic Seizures Detection by Using Artificial Neural Networks

Authors: Gaetano Zazzaro, Angelo Martone, Roberto V. Montaquila, Luigi Pavone

Abstract:

Seizure is the main factor that affects the quality of life of epileptic patients. The diagnosis of epilepsy, and hence the identification of epileptogenic zone, is commonly made by using continuous Electroencephalogram (EEG) signal monitoring. Seizure identification on EEG signals is made manually by epileptologists and this process is usually very long and error prone. The aim of this paper is to describe an automated method able to detect seizures in EEG signals, using knowledge discovery in database process and data mining methods and algorithms, which can support physicians during the seizure detection process. Our detection method is based on Artificial Neural Network classifier, trained by applying the multilayer perceptron algorithm, and by using a software application, called Training Builder that has been developed for the massive extraction of features from EEG signals. This tool is able to cover all the data preparation steps ranging from signal processing to data analysis techniques, including the sliding window paradigm, the dimensionality reduction algorithms, information theory, and feature selection measures. The final model shows excellent performances, reaching an accuracy of over 99% during tests on data of a single patient retrieved from a publicly available EEG dataset.

Keywords: artificial neural network, data mining, electroencephalogram, epilepsy, feature extraction, seizure detection, signal processing

Procedia PDF Downloads 188

2217 Design and Implementation of Generative Models for Odor Classification Using Electronic Nose

Authors: Kumar Shashvat, Amol P. Bhondekar

Abstract:

In the midst of the five senses, odor is the most reminiscent and least understood. Odor testing has been mysterious and odor data fabled to most practitioners. The delinquent of recognition and classification of odor is important to achieve. The facility to smell and predict whether the artifact is of further use or it has become undesirable for consumption; the imitation of this problem hooked on a model is of consideration. The general industrial standard for this classification is color based anyhow; odor can be improved classifier than color based classification and if incorporated in machine will be awfully constructive. For cataloging of odor for peas, trees and cashews various discriminative approaches have been used Discriminative approaches offer good prognostic performance and have been widely used in many applications but are incapable to make effectual use of the unlabeled information. In such scenarios, generative approaches have better applicability, as they are able to knob glitches, such as in set-ups where variability in the series of possible input vectors is enormous. Generative models are integrated in machine learning for either modeling data directly or as a transitional step to form an indeterminate probability density function. The algorithms or models Linear Discriminant Analysis and Naive Bayes Classifier have been used for classification of the odor of cashews. Linear Discriminant Analysis is a method used in data classification, pattern recognition, and machine learning to discover a linear combination of features that typifies or divides two or more classes of objects or procedures. The Naive Bayes algorithm is a classification approach base on Bayes rule and a set of qualified independence theory. Naive Bayes classifiers are highly scalable, requiring a number of restraints linear in the number of variables (features/predictors) in a learning predicament. The main recompenses of using the generative models are generally a Generative Models make stronger assumptions about the data, specifically, about the distribution of predictors given the response variables. The Electronic instrument which is used for artificial odor sensing and classification is an electronic nose. This device is designed to imitate the anthropological sense of odor by providing an analysis of individual chemicals or chemical mixtures. The experimental results have been evaluated in the form of the performance measures i.e. are accuracy, precision and recall. The investigational results have proven that the overall performance of the Linear Discriminant Analysis was better in assessment to the Naive Bayes Classifier on cashew dataset.

Keywords: odor classification, generative models, naive bayes, linear discriminant analysis

Procedia PDF Downloads 387

2216 An Adaptive Oversampling Technique for Imbalanced Datasets

Authors: Shaukat Ali Shahee, Usha Ananthakumar

Abstract:

A data set exhibits class imbalance problem when one class has very few examples compared to the other class, and this is also referred to as between class imbalance. The traditional classifiers fail to classify the minority class examples correctly due to its bias towards the majority class. Apart from between-class imbalance, imbalance within classes where classes are composed of a different number of sub-clusters with these sub-clusters containing different number of examples also deteriorates the performance of the classifier. Previously, many methods have been proposed for handling imbalanced dataset problem. These methods can be classified into four categories: data preprocessing, algorithmic based, cost-based methods and ensemble of classifier. Data preprocessing techniques have shown great potential as they attempt to improve data distribution rather than the classifier. Data preprocessing technique handles class imbalance either by increasing the minority class examples or by decreasing the majority class examples. Decreasing the majority class examples lead to loss of information and also when minority class has an absolute rarity, removing the majority class examples is generally not recommended. Existing methods available for handling class imbalance do not address both between-class imbalance and within-class imbalance simultaneously. In this paper, we propose a method that handles between class imbalance and within class imbalance simultaneously for binary classification problem. Removing between class imbalance and within class imbalance simultaneously eliminates the biases of the classifier towards bigger sub-clusters by minimizing the error domination of bigger sub-clusters in total error. The proposed method uses model-based clustering to find the presence of sub-clusters or sub-concepts in the dataset. The number of examples oversampled among the sub-clusters is determined based on the complexity of sub-clusters. The method also takes into consideration the scatter of the data in the feature space and also adaptively copes up with unseen test data using Lowner-John ellipsoid for increasing the accuracy of the classifier. In this study, neural network is being used as this is one such classifier where the total error is minimized and removing the between-class imbalance and within class imbalance simultaneously help the classifier in giving equal weight to all the sub-clusters irrespective of the classes. The proposed method is validated on 9 publicly available data sets and compared with three existing oversampling techniques that rely on the spatial location of minority class examples in the euclidean feature space. The experimental results show the proposed method to be statistically significantly superior to other methods in terms of various accuracy measures. Thus the proposed method can serve as a good alternative to handle various problem domains like credit scoring, customer churn prediction, financial distress, etc., that typically involve imbalanced data sets.

Keywords: classification, imbalanced dataset, Lowner-John ellipsoid, model based clustering, oversampling

Procedia PDF Downloads 418

2215 Principal Component Analysis Applied to the Electric Power Systems – Practical Guide; Practical Guide for Algorithms

Authors: John Morales, Eduardo Orduña

Abstract:

Currently the Principal Component Analysis (PCA) theory has been used to develop algorithms regarding to Electric Power Systems (EPS). In this context, this paper presents a practical tutorial of this technique detailed their concept, on-line and off-line mathematical foundations, which are necessary and desirables in EPS algorithms. Thus, features of their eigenvectors which are very useful to real-time process are explained, showing how it is possible to select these parameters through a direct optimization. On the other hand, in this work in order to show the application of PCA to off-line and on-line signals, an example step to step using Matlab commands is presented. Finally, a list of different approaches using PCA is presented, and some works which could be analyzed using this tutorial are presented.

Keywords: practical guide; on-line; off-line, algorithms, faults

Procedia PDF Downloads 563

2214 Comparing Community Detection Algorithms in Bipartite Networks

Authors: Ehsan Khademi, Mahdi Jalili

Abstract:

Despite the special features of bipartite networks, they are common in many systems. Real-world bipartite networks may show community structure, similar to what one can find in one-mode networks. However, the interpretation of the community structure in bipartite networks is different as compared to one-mode networks. In this manuscript, we compare a number of available methods that are frequently used to discover community structure of bipartite networks. These networks are categorized into two broad classes. One class is the methods that, first, transfer the network into a one-mode network, and then apply community detection algorithms. The other class is the algorithms that have been developed specifically for bipartite networks. These algorithms are applied on a model network with prescribed community structure.

Keywords: community detection, bipartite networks, co-clustering, modularity, network projection, complex networks

Procedia PDF Downloads 625

2213 Machine Learning Techniques in Bank Credit Analysis

Authors: Fernanda M. Assef, Maria Teresinha A. Steiner

Abstract:

The aim of this paper is to compare and discuss better classifier algorithm options for credit risk assessment by applying different Machine Learning techniques. Using records from a Brazilian financial institution, this study uses a database of 5,432 companies that are clients of the bank, where 2,600 clients are classified as non-defaulters, 1,551 are classified as defaulters and 1,281 are temporarily defaulters, meaning that the clients are overdue on their payments for up 180 days. For each case, a total of 15 attributes was considered for a one-against-all assessment using four different techniques: Artificial Neural Networks Multilayer Perceptron (ANN-MLP), Artificial Neural Networks Radial Basis Functions (ANN-RBF), Logistic Regression (LR) and finally Support Vector Machines (SVM). For each method, different parameters were analyzed in order to obtain different results when the best of each technique was compared. Initially the data were coded in thermometer code (numerical attributes) or dummy coding (for nominal attributes). The methods were then evaluated for each parameter and the best result of each technique was compared in terms of accuracy, false positives, false negatives, true positives and true negatives. This comparison showed that the best method, in terms of accuracy, was ANN-RBF (79.20% for non-defaulter classification, 97.74% for defaulters and 75.37% for the temporarily defaulter classification). However, the best accuracy does not always represent the best technique. For instance, on the classification of temporarily defaulters, this technique, in terms of false positives, was surpassed by SVM, which had the lowest rate (0.07%) of false positive classifications. All these intrinsic details are discussed considering the results found, and an overview of what was presented is shown in the conclusion of this study.

Keywords: artificial neural networks (ANNs), classifier algorithms, credit risk assessment, logistic regression, machine Learning, support vector machines

Procedia PDF Downloads 103

2212 Data Mining Algorithms Analysis: Case Study of Price Predictions of Lands

Authors: Julio Albuja, David Zaldumbide

Abstract:

Data analysis is an important step before taking a decision about money. The aim of this work is to analyze the factors that influence the final price of the houses through data mining algorithms. To our best knowledge, previous work was researched just to compare results. Furthermore, before using the data of the data set, the Z-Transformation were used to standardize the data in the same range. Hence, the data was classified into two groups to visualize them in a readability format. A decision tree was built, and graphical data is displayed where clearly is easy to see the results and the factors' influence in these graphics. The definitions of these methods are described, as well as the descriptions of the results. Finally, conclusions and recommendations are presented related to the released results that our research showed making it easier to apply these algorithms using a customized data set.

Keywords: algorithms, data, decision tree, transformation

Procedia PDF Downloads 374

2211 A Review on Comparative Analysis of Path Planning and Collision Avoidance Algorithms

Authors: Divya Agarwal, Pushpendra S. Bharti

Abstract:

Autonomous mobile robots (AMR) are expected as smart tools for operations in every automation industry. Path planning and obstacle avoidance is the backbone of AMR as robots have to reach their goal location avoiding obstacles while traversing through optimized path defined according to some criteria such as distance, time or energy. Path planning can be classified into global and local path planning where environmental information is known and unknown/partially known, respectively. A number of sensors are used for data collection. A number of algorithms such as artificial potential field (APF), rapidly exploring random trees (RRT), bidirectional RRT, Fuzzy approach, Purepursuit, A* algorithm, vector field histogram (VFH) and modified local path planning algorithm, etc. have been used in the last three decades for path planning and obstacle avoidance for AMR. This paper makes an attempt to review some of the path planning and obstacle avoidance algorithms used in the field of AMR. The review includes comparative analysis of simulation and mathematical computations of path planning and obstacle avoidance algorithms using MATLAB 2018a. From the review, it could be concluded that different algorithms may complete the same task (i.e. with a different set of instructions) in less or more time, space, effort, etc.

Keywords: path planning, obstacle avoidance, autonomous mobile robots, algorithms

Procedia PDF Downloads 232

2210 A Supervised Approach for Detection of Singleton Spam Reviews

Authors: Atefeh Heydari, Mohammadali Tavakoli, Naomie Salim

Abstract:

In recent years, we have witnessed that online reviews are the most important source of customers’ opinion. They are progressively more used by individuals and organisations to make purchase and business decisions. Unfortunately, for the reason of profit or fame, frauds produce deceptive reviews to hoodwink potential customers. Their activities mislead not only potential customers to make appropriate purchasing decisions and organisations to reshape their business, but also opinion mining techniques by preventing them from reaching accurate results. Spam reviews could be divided into two main groups, i.e. multiple and singleton spam reviews. Detecting a singleton spam review that is the only review written by a user ID is extremely challenging due to lack of clue for detection purposes. Singleton spam reviews are very harmful and various features and proofs used in multiple spam reviews detection are not applicable in this case. Current research aims to propose a novel supervised technique to detect singleton spam reviews. To achieve this, various features are proposed in this study and are to be combined with the most appropriate features extracted from literature and employed in a classifier. In order to compare the performance of different classifiers, SVM and naive Bayes classification algorithms were used for model building. The results revealed that SVM was more accurate than naive Bayes and our proposed technique is capable to detect singleton spam reviews effectively.

Keywords: classification algorithms, Naïve Bayes, opinion review spam detection, singleton review spam detection, support vector machine

Procedia PDF Downloads 309

2209 A Novel Heuristic for Analysis of Large Datasets by Selecting Wrapper-Based Features

Authors: Bushra Zafar, Usman Qamar

Abstract:

Large data sample size and dimensions render the effectiveness of conventional data mining methodologies. A data mining technique are important tools for collection of knowledgeable information from variety of databases and provides supervised learning in the form of classification to design models to describe vital data classes while structure of the classifier is based on class attribute. Classification efficiency and accuracy are often influenced to great extent by noisy and undesirable features in real application data sets. The inherent natures of data set greatly masks its quality analysis and leave us with quite few practical approaches to use. To our knowledge first time, we present a new approach for investigation of structure and quality of datasets by providing a targeted analysis of localization of noisy and irrelevant features of data sets. Machine learning is based primarily on feature selection as pre-processing step which offers us to select few features from number of features as a subset by reducing the space according to certain evaluation criterion. The primary objective of this study is to trim down the scope of the given data sample by searching a small set of important features which may results into good classification performance. For this purpose, a heuristic for wrapper-based feature selection using genetic algorithm and for discriminative feature selection an external classifier are used. Selection of feature based on its number of occurrence in the chosen chromosomes. Sample dataset has been used to demonstrate proposed idea effectively. A proposed method has improved average accuracy of different datasets is about 95%. Experimental results illustrate that proposed algorithm increases the accuracy of prediction of different diseases.

Keywords: data mining, generic algorithm, KNN algorithms, wrapper based feature selection

Procedia PDF Downloads 316

2208 Secure Hashing Algorithm and Advance Encryption Algorithm in Cloud Computing

Authors: Jaimin Patel

Abstract:

Cloud computing is one of the most sharp and important movement in various computing technologies. It provides flexibility to users, cost effectiveness, location independence, easy maintenance, enables multitenancy, drastic performance improvements, and increased productivity. On the other hand, there are also major issues like security. Being a common server, security for a cloud is a major issue; it is important to provide security to protect user’s private data, and it is especially important in e-commerce and social networks. In this paper, encryption algorithms such as Advanced Encryption Standard algorithms, their vulnerabilities, risk of attacks, optimal time and complexity management and comparison with other algorithms based on software implementation is proposed. Encryption techniques to improve the performance of AES algorithms and to reduce risk management are given. Secure Hash Algorithms, their vulnerabilities, software implementations, risk of attacks and comparison with other hashing algorithms as well as the advantages and disadvantages between hashing techniques and encryption are given.

Keywords: Cloud computing, encryption algorithm, secure hashing algorithm, brute force attack, birthday attack, plaintext attack, man in middle attack

Procedia PDF Downloads 280

2207 Rank-Based Chain-Mode Ensemble for Binary Classification

Authors: Chongya Song, Kang Yen, Alexander Pons, Jin Liu

Abstract:

In the field of machine learning, the ensemble has been employed as a common methodology to improve the performance upon multiple base classifiers. However, the true predictions are often canceled out by the false ones during consensus due to a phenomenon called “curse of correlation” which is represented as the strong interferences among the predictions produced by the base classifiers. In addition, the existing practices are still not able to effectively mitigate the problem of imbalanced classification. Based on the analysis on our experiment results, we conclude that the two problems are caused by some inherent deficiencies in the approach of consensus. Therefore, we create an enhanced ensemble algorithm which adopts a designed rank-based chain-mode consensus to overcome the two problems. In order to evaluate the proposed ensemble algorithm, we employ a well-known benchmark data set NSL-KDD (the improved version of dataset KDDCup99 produced by University of New Brunswick) to make comparisons between the proposed and 8 common ensemble algorithms. Particularly, each compared ensemble classifier uses the same 22 base classifiers, so that the differences in terms of the improvements toward the accuracy and reliability upon the base classifiers can be truly revealed. As a result, the proposed rank-based chain-mode consensus is proved to be a more effective ensemble solution than the traditional consensus approach, which outperforms the 8 ensemble algorithms by 20% on almost all compared metrices which include accuracy, precision, recall, F1-score and area under receiver operating characteristic curve.

Keywords: consensus, curse of correlation, imbalance classification, rank-based chain-mode ensemble

Procedia PDF Downloads 138

2206 Contourlet Transform and Local Binary Pattern Based Feature Extraction for Bleeding Detection in Endoscopic Images

Authors: Mekha Mathew, Varun P Gopi

Abstract:

Wireless Capsule Endoscopy (WCE) has become a great device in Gastrointestinal (GI) tract diagnosis, which can examine the entire GI tract, especially the small intestine without invasiveness and sedation. Bleeding in the digestive tract is a symptom of a disease rather than a disease itself. Hence the detection of bleeding is important in diagnosing many diseases. In this paper we proposes a novel method for distinguishing bleeding regions from normal regions based on Contourlet transform and Local Binary Pattern (LBP). Experiments show that this method provides a high accuracy rate of 96.38% in CIE XYZ colour space for k-Nearest Neighbour (k-NN) classifier.

Keywords: Wireless Capsule Endoscopy, local binary pattern, k-NN classifier, contourlet transform

Procedia PDF Downloads 485

2205 An Efficient Algorithm for Solving the Transmission Network Expansion Planning Problem Integrating Machine Learning with Mathematical Decomposition

Authors: Pablo Oteiza, Ricardo Alvarez, Mehrdad Pirnia, Fuat Can

Abstract:

To effectively combat climate change, many countries around the world have committed to a decarbonisation of their electricity, along with promoting a large-scale integration of renewable energy sources (RES). While this trend represents a unique opportunity to effectively combat climate change, achieving a sound and cost-efficient energy transition towards low-carbon power systems poses significant challenges for the multi-year Transmission Network Expansion Planning (TNEP) problem. The objective of the multi-year TNEP is to determine the necessary network infrastructure to supply the projected demand in a cost-efficient way, considering the evolution of the new generation mix, including the integration of RES. The rapid integration of large-scale RES increases the variability and uncertainty in the power system operation, which in turn increases short-term flexibility requirements. To meet these requirements, flexible generating technologies such as energy storage systems must be considered within the TNEP as well, along with proper models for capturing the operational challenges of future power systems. As a consequence, TNEP formulations are becoming more complex and difficult to solve, especially for its application in realistic-sized power system models. To meet these challenges, there is an increasing need for developing efficient algorithms capable of solving the TNEP problem with reasonable computational time and resources. In this regard, a promising research area is the use of artificial intelligence (AI) techniques for solving large-scale mixed-integer optimization problems, such as the TNEP. In particular, the use of AI along with mathematical optimization strategies based on decomposition has shown great potential. In this context, this paper presents an efficient algorithm for solving the multi-year TNEP problem. The algorithm combines AI techniques with Column Generation, a traditional decomposition-based mathematical optimization method. One of the challenges of using Column Generation for solving the TNEP problem is that the subproblems are of mixed-integer nature, and therefore solving them requires significant amounts of time and resources. Hence, in this proposal we solve a linearly relaxed version of the subproblems, and trained a binary classifier that determines the value of the binary variables, based on the results obtained from the linearized version. A key feature of the proposal is that we integrate the binary classifier into the optimization algorithm in such a way that the optimality of the solution can be guaranteed. The results of a study case based on the HRP 38-bus test system shows that the binary classifier has an accuracy above 97% for estimating the value of the binary variables. Since the linearly relaxed version of the subproblems can be solved with significantly less time than the integer programming counterpart, the integration of the binary classifier into the Column Generation algorithm allowed us to reduce the computational time required for solving the problem by 50%. The final version of this paper will contain a detailed description of the proposed algorithm, the AI-based binary classifier technique and its integration into the CG algorithm. To demonstrate the capabilities of the proposal, we evaluate the algorithm in case studies with different scenarios, as well as in other power system models.

Keywords: integer optimization, machine learning, mathematical decomposition, transmission planning

Procedia PDF Downloads 85

2204 Prediction of Bariatric Surgery Publications by Using Different Machine Learning Algorithms

Authors: Senol Dogan, Gunay Karli

Abstract:

Identification of relevant publications based on a Medline query is time-consuming and error-prone. An all based process has the potential to solve this problem without any manual work. To the best of our knowledge, our study is the first to investigate the ability of machine learning to identify relevant articles accurately. 5 different machine learning algorithms were tested using 23 predictors based on several metadata fields attached to publications. We find that the Boosted model is the best-performing algorithm and its overall accuracy is 96%. In addition, specificity and sensitivity of the algorithm is 97 and 93%, respectively. As a result of the work, we understood that we can apply the same procedure to understand cancer gene expression big data.

Keywords: prediction of publications, machine learning, algorithms, bariatric surgery, comparison of algorithms, boosted, tree, logistic regression, ANN model

Procedia PDF Downloads 209

2203 An Improved Approach to Solve Two-Level Hierarchical Time Minimization Transportation Problem

Authors: Kalpana Dahiya

Abstract:

This paper discusses a two-level hierarchical time minimization transportation problem, which is an important class of transportation problems arising in industries. This problem has been studied by various researchers, and a number of polynomial time iterative algorithms are available to find its solution. All the existing algorithms, though efficient, have some shortcomings. The current study proposes an alternate solution algorithm for the problem that is more efficient in terms of computational time than the existing algorithms. The results justifying the underlying theory of the proposed algorithm are given. Further, a detailed comparison of the computational behaviour of all the algorithms for randomly generated instances of this problem of different sizes validates the efficiency of the proposed algorithm.

Keywords: global optimization, hierarchical optimization, transportation problem, concave minimization

Procedia PDF Downloads 162

2202 Enunciation on Complexities of Selected Tree Searching Algorithms

Authors: Parag Bhalchandra, S. D. Khamitkar

Abstract:

Searching trees is a most interesting application of Artificial Intelligence. Over the period of time, many innovative methods have been evolved to better search trees with respect to computational complexities. Tree searches are difficult to understand due to the exponential growth of possibilities when increasing the number of nodes or levels in the tree. Usually it is understood when we traverse down in the tree, traverse down to greater depth, in the search of a solution or a goal. However, this does not happen in reality as explicit enumeration is not a very efficient method and there are many algorithmic speedups that will find the optimal solution without the burden of evaluating all possible trees. It was a common question before all researchers where they often wonder what algorithms will yield the best and fastest result The intention of this paper is two folds, one to review selected tree search algorithms and search strategies that can be applied to a problem space and the second objective is to stimulate to implement recent developments in the complexity behavior of search strategies. The algorithms discussed here apply in general to both brute force and heuristic searches.

Keywords: trees search, asymptotic complexity, brute force, heuristics algorithms

Procedia PDF Downloads 304

2201 Analysis of DC\DC Converter of Photovoltaic System with MPPT Algorithms Comparison

Authors: Badr M. Alshammari, Mohamed A. Khlifi

Abstract:

This paper presents the analysis of DC/DC converter including a comparative study of control methods to extract the maximum power and to track the maximum power point (MPP) from photovoltaic (PV) systems under changeable environmental conditions. This paper proposes two methods of maximum power point tracking algorithm for photovoltaic systems, based on the first hand on P&O control and the other hand on the first order IC. The MPPT system ensures that solar cells can deliver the maximum power possible to the load. Different algorithms are used to design it. Here we compare them and simulate the photovoltaic system with two algorithms. The algorithms are used to control the duty cycle of a DC-DC converter in order to boost the output voltage of the PV generator and guarantee the operation of the solar panels in the Maximum Power Point (MPP). Simulation and experimental results show that the proposed algorithms can effectively improve the efficiency of a photovoltaic array output.

Keywords: solar cell, DC/DC boost converter, MPPT, photovoltaic system

Procedia PDF Downloads 202

2200 Random Forest Classification for Population Segmentation

Authors: Regina Chua

Abstract:

To reduce the costs of re-fielding a large survey, a Random Forest classifier was applied to measure the accuracy of classifying individuals into their assigned segments with the fewest possible questions. Given a long survey, one needed to determine the most predictive ten or fewer questions that would accurately assign new individuals to custom segments. Furthermore, the solution needed to be quick in its classification and usable in non-Python environments. In this paper, a supervised Random Forest classifier was modeled on a dataset with 7,000 individuals, 60 questions, and 254 features. The Random Forest consisted of an iterative collection of individual decision trees that result in a predicted segment with robust precision and recall scores compared to a single tree. A random 70-30 stratified sampling for training the algorithm was used, and accuracy trade-offs at different depths for each segment were identified. Ultimately, the Random Forest classifier performed at 87% accuracy at a depth of 10 with 20 instead of 254 features and 10 instead of 60 questions. With an acceptable accuracy in prioritizing feature selection, new tools were developed for non-Python environments: a worksheet with a formulaic version of the algorithm and an embedded function to predict the segment of an individual in real-time. Random Forest was determined to be an optimal classification model by its feature selection, performance, processing speed, and flexible application in other environments.

Keywords: machine learning, supervised learning, data science, random forest, classification, prediction, predictive modeling

Procedia PDF Downloads 94

2199 Segmentation of Liver Using Random Forest Classifier

Authors: Gajendra Kumar Mourya, Dinesh Bhatia, Akash Handique, Sunita Warjri, Syed Achaab Amir

Abstract:

Nowadays, Medical imaging has become an integral part of modern healthcare. Abdominal CT images are an invaluable mean for abdominal organ investigation and have been widely studied in the recent years. Diagnosis of liver pathologies is one of the major areas of current interests in the field of medical image processing and is still an open problem. To deeply study and diagnose the liver, segmentation of liver is done to identify which part of the liver is mostly affected. Manual segmentation of the liver in CT images is time-consuming and suﬀers from inter- and intra-observer diﬀerences. However, automatic or semi-automatic computer aided segmentation of the Liver is a challenging task due to inter-patient Liver shape and size variability. In this paper, we present a technique for automatic segmenting the liver from CT images using Random Forest Classifier. Random forests or random decision forests are an ensemble learning method for classification that operate by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes of the individual trees. After comparing with various other techniques, it was found that Random Forest Classifier provide a better segmentation results with respect to accuracy and speed. We have done the validation of our results using various techniques and it shows above 89% accuracy in all the cases.

Keywords: CT images, image validation, random forest, segmentation

Procedia PDF Downloads 313

2198 Scheduling in Cloud Networks Using Chakoos Algorithm

Authors: Masoumeh Ali Pouri, Hamid Haj Seyyed Javadi

Abstract:

Nowadays, cloud processing is one of the important issues in information technology. Since scheduling of tasks graph is an NP-hard problem, considering approaches based on undeterminisitic methods such as evolutionary processing, mostly genetic and cuckoo algorithms, will be effective. Therefore, an efficient algorithm has been proposed for scheduling of tasks graph to obtain an appropriate scheduling with minimum time. In this algorithm, the new approach is based on making the length of the critical path shorter and reducing the cost of communication. Finally, the results obtained from the implementation of the presented method show that this algorithm acts the same as other algorithms when it faces graphs without communication cost. It performs quicker and better than some algorithms like DSC and MCP algorithms when it faces the graphs involving communication cost.

Keywords: cloud computing, scheduling, tasks graph, chakoos algorithm

Procedia PDF Downloads 64

2197 Classification of Multiple Cancer Types with Deep Convolutional Neural Network

Authors: Nan Deng, Zhenqiu Liu

Abstract:

Thousands of patients with metastatic tumors were diagnosed with cancers of unknown primary sites each year. The inability to identify the primary cancer site may lead to inappropriate treatment and unexpected prognosis. Nowadays, a large amount of genomics and transcriptomics cancer data has been generated by next-generation sequencing (NGS) technologies, and The Cancer Genome Atlas (TCGA) database has accrued thousands of human cancer tumors and healthy controls, which provides an abundance of resource to differentiate cancer types. Meanwhile, deep convolutional neural networks (CNNs) have shown high accuracy on classification among a large number of image object categories. Here, we utilize 25 cancer primary tumors and 3 normal tissues from TCGA and convert their RNA-Seq gene expression profiling to color images; train, validate and test a CNN classifier directly from these images. The performance result shows that our CNN classifier can archive >80% test accuracy on most of the tumors and normal tissues. Since the gene expression pattern of distant metastases is similar to their primary tumors, the CNN classifier may provide a potential computational strategy on identifying the unknown primary origin of metastatic cancer in order to plan appropriate treatment for patients.

Keywords: bioinformatics, cancer, convolutional neural network, deep leaning, gene expression pattern

Procedia PDF Downloads 299

2196 Feature Analysis of Predictive Maintenance Models

Authors: Zhaoan Wang

Abstract:

Research in predictive maintenance modeling has improved in the recent years to predict failures and needed maintenance with high accuracy, saving cost and improving manufacturing efficiency. However, classic prediction models provide little valuable insight towards the most important features contributing to the failure. By analyzing and quantifying feature importance in predictive maintenance models, cost saving can be optimized based on business goals. First, multiple classifiers are evaluated with cross-validation to predict the multi-class of failures. Second, predictive performance with features provided by different feature selection algorithms are further analyzed. Third, features selected by different algorithms are ranked and combined based on their predictive power. Finally, linear explainer SHAP (SHapley Additive exPlanations) is applied to interpret classifier behavior and provide further insight towards the specific roles of features in both local predictions and global model behavior. The results of the experiments suggest that certain features play dominant roles in predictive models while others have significantly less impact on the overall performance. Moreover, for multi-class prediction of machine failures, the most important features vary with type of machine failures. The results may lead to improved productivity and cost saving by prioritizing sensor deployment, data collection, and data processing of more important features over less importance features.

Keywords: automated supply chain, intelligent manufacturing, predictive maintenance machine learning, feature engineering, model interpretation

Procedia PDF Downloads 133