Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 28838

Search results for: predictive analysis algorithms

28688 A Mega-Analysis of the Predictive Power of Initial Contact within Minimal Social Network

Authors: Cathal Ffrench, Ryan Barrett, Mike Quayle

Abstract:

It is accepted in social psychology that categorization leads to ingroup favoritism, without further thought given to the processes that may co-occur or even precede categorization. These categorizations move away from the conceptualization of the self as a unique social being toward an increasingly collective identity. Subsequently, many individuals derive much of their self-evaluations from these collective identities. The seminal literature on this topic argues that it is primarily categorization that evokes instances of ingroup favoritism. Apropos of these theories, we argue that categorization acts to enhance and further intergroup processes rather than defining them. More accurately, we propose categorization aids initial ingroup contact and this first contact is predictive of subsequent favoritism on individual and collective levels. This analysis focuses on Virtual Interaction APPLication (VIAPPL) based studies, a software interface that addresses the flaws of the original minimal group studies. The VIAPPL allows the exchange of tokens in an intra- and inter-group manner. This token exchange is how we classified the first contact. The study involves binary longitudinal analysis to better understand the subsequent exchanges of individuals based on who they first interacted with. Studies were selected on the criteria of evidence of explicit first interactions and two-group designs. Our findings paint a compelling picture in support of a motivated contact hypothesis, which suggests that an individual’s first motivated contact toward another has strong predictive capabilities for future behavior. This contact can lead to habit formation and specific favoritism towards individuals where contact has been established. This has important implications for understanding how group conflict occurs, and how intra-group individual bias can develop.

Keywords: categorization, group dynamics, initial contact, minimal social networks, momentary contact

Procedia PDF Downloads 125

28687 Automated Test Data Generation For some types of Algorithm

Authors: Hitesh Tahbildar

Abstract:

The cost of test data generation for a program is computationally very high. In the general case, no algorithm that generates test data for all types of algorithms has been found. The cost of generating test data differs between types of algorithm. To date, researchers have emphasized generating test data for different types of programming constructs rather than for different types of algorithms. The test data generation methods have been implemented to find heuristics for different types of algorithms. Several types of algorithm, including divide and conquer, backtracking, greedy, and dynamic programming, have been tested to find the minimum cost of test data generation. Our experimental results indicate that some of these algorithm types can serve as a necessary condition for selecting heuristics, while programming constructs are a sufficient condition for selecting our heuristics. Finally, we recommend different heuristics for test data generation for different types of algorithms.

Keywords: longest path, saturation point, lmax, kL, kS

Procedia PDF Downloads 382

28686 Ensemble Methods in Machine Learning: An Algorithmic Approach to Derive Distinctive Behaviors of Criminal Activity Applied to the Poaching Domain

Authors: Zachary Blanks, Solomon Sonya

Abstract:

Poaching presents a serious threat to endangered animal species, environmental conservation, and human life. Additionally, some poaching activity has even been linked to supplying funds to support terrorist networks elsewhere around the world. Consequently, agencies dedicated to protecting wildlife habitats have a near intractable task of adequately patrolling an entire area (spanning several thousand kilometers) given the limited resources, funds, and personnel at their disposal. Thus, agencies need predictive tools that are both high-performing and easily implementable by the user to help in learning how the significant features (e.g. animal population densities, topography, behavior patterns of the criminals within the area, etc.) interact with each other in hopes of abating poaching. This research develops a classification model using machine learning algorithms to aid in forecasting future attacks that is both easy to train and performs well when compared to other models. In this research, we demonstrate how data imputation methods (specifically predictive mean matching, gradient boosting, and random forest multiple imputation) can be applied to analyze data and create significant predictions across a varied data set. Specifically, we apply these methods to improve the accuracy of adopted prediction models (Logistic Regression, Support Vector Machine, etc.). Finally, we assess the performance of the model and the accuracy of our data imputation methods by learning on a real-world data set constituting four years of imputed data and testing on one year of non-imputed data. This paper provides three main contributions. First, we extend work done by the Teamcore and CREATE (Center for Risk and Economic Analysis of Terrorism Events) research group at the University of Southern California (USC) working in conjunction with the Department of Homeland Security to apply game theory and machine learning algorithms to develop more efficient ways of reducing poaching. This research introduces ensemble methods (Random Forests and Stochastic Gradient Boosting) and applies them to real-world poaching data gathered from the Ugandan rain forest park rangers. Next, we consider the effect of data imputation on both the performance of various algorithms and the general accuracy of the method itself when applied to a dependent variable where a large number of observations are missing. Third, we provide an alternate approach to predict the probability of observing poaching both by season and by month. The results from this research are very promising. We conclude that by using Stochastic Gradient Boosting to predict observations for non-commercial poaching by season, we are able to produce statistically equivalent results while being orders of magnitude faster in computation time and complexity. Additionally, when predicting potential poaching incidents by individual month rather than by entire season, boosting techniques produce a mean area under the curve increase of approximately 3% relative to previous prediction schedules by entire seasons.

Keywords: ensemble methods, imputation, machine learning, random forests, statistical analysis, stochastic gradient boosting, wildlife protection

Procedia PDF Downloads 266

28685 M-Machine Assembly Scheduling Problem to Minimize Total Tardiness with Non-Zero Setup Times

Authors: Harun Aydilek, Asiye Aydilek, Ali Allahverdi

Abstract:

Our objective is to minimize the total tardiness in an m-machine two-stage assembly flowshop scheduling problem. The objective is an important performance measure because the fulfillment of customers' due dates has to be taken into account while making scheduling decisions. In the literature, the problem is considered with zero setup times, which may not be realistic or appropriate for some scheduling environments. Treating setup times as separate from processing times increases machine utilization by decreasing the idle time and reduces total tardiness. We propose two new algorithms and adapt four existing algorithms in the literature, which are different versions of simulated annealing and genetic algorithms. Moreover, a dominance relation is developed based on the mathematical formulation of the problem. The developed dominance relation is incorporated in our proposed algorithms. Computational experiments are conducted to investigate the performance of the newly proposed algorithms. We find that one of the proposed algorithms performs significantly better than the others, i.e., the error of the best algorithm is less than those of the other algorithms by at least 50%. The newly proposed algorithm is also efficient for the case of zero setup times and performs better than the best existing algorithm in the literature.

Keywords: algorithm, assembly flowshop, scheduling, simulation, total tardiness

Procedia PDF Downloads 304

28684 Double Clustering as an Unsupervised Approach for Order Picking of Distributed Warehouses

Authors: Hsin-Yi Huang, Ming-Sheng Liu, Jiun-Yan Shiau

Abstract:

Planning the order picking lists of warehouses to achieve productive processes is a significant challenge. In the e-commerce era, this task is especially important when the costs associated with logistics on the operational performance are high. Nowadays, many order planning techniques employ supervised machine learning algorithms. However, defining which features should be processed by such algorithms is not a simple task, and it is crucial to the proposed technique’s success. Against this background, we consider whether unsupervised algorithms can enhance the planning of order-picking lists. A Zone2 picking approach, which is based on using clustering algorithms twice, is developed. A simplified example is given to demonstrate the merit of our approach.

Keywords: order picking, warehouse, clustering, unsupervised learning

Procedia PDF Downloads 133

28683 Stochastic Model Predictive Control for Linear Discrete-Time Systems with Random Dither Quantization

Authors: Tomoaki Hashimoto

Abstract:

Recently, feedback control systems using random dither quantizers have been proposed for linear discrete-time systems. However, the constraints imposed on state and control variables have not yet been taken into account for the design of feedback control systems with random dither quantization. Model predictive control is a kind of optimal feedback control in which control performance over a finite future is optimized with a performance index that has a moving initial and terminal time. An important advantage of model predictive control is its ability to handle constraints imposed on state and control variables. Based on the model predictive control approach, the objective of this paper is to present a control method that satisfies probabilistic state constraints for linear discrete-time feedback control systems with random dither quantization. In other words, this paper provides a method for solving the optimal control problems subject to probabilistic state constraints for linear discrete-time feedback control systems with random dither quantization.

Keywords: optimal control, stochastic systems, random dither, quantization

Procedia PDF Downloads 422

28682 Optimization of a High-Growth Investment Portfolio for the South African Market Using Predictive Analytics

Authors: Mia Françoise

Abstract:

This report aims to develop a strategy for assisting short-term investors to benefit from the current economic climate in South Africa by utilizing technical analysis techniques and predictive analytics. As part of this research, value investing and technical analysis principles will be combined to maximize returns for South African investors while optimizing volatility. As an emerging market, South Africa offers many opportunities for high growth in sectors where other developed countries cannot grow at the same rate. Investing in South African companies with significant growth potential can be extremely rewarding. Although the risk involved is more significant in countries with less developed markets and infrastructure, there is more room for growth in these countries. According to recent research, the offshore market is expected to outperform the local market over the long term; however, short-term investments in the local market will likely be more profitable, as the Johannesburg Stock Exchange is predicted to outperform the S&P500 over the short term. The instabilities in the economy contribute to increased market volatility, which can benefit investors if appropriately utilized. Price prediction and portfolio optimization comprise the two primary components of this methodology. As part of this process, statistics and other predictive modeling techniques will be used to predict the future performance of stocks listed on the Johannesburg Stock Exchange. Following predictive data analysis, Modern Portfolio Theory, based on Markowitz's Mean-Variance Theorem, will be applied to optimize the allocation of assets within an investment portfolio. By combining different assets within an investment portfolio, this optimization method produces a portfolio with an optimal ratio of expected risk to expected return. 
This methodology aims to provide a short-term investment with a stock portfolio that offers the best risk-to-return profile for stocks listed on the JSE by combining price prediction and portfolio optimization.
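The portfolio optimization step described above can be illustrated with a small sketch. It is not the author's implementation: the two-asset setup, the function names `portfolio_stats` and `best_two_asset_mix`, the sample returns and covariances, and the simple grid scan (standing in for a proper optimizer) are all assumptions made for illustration.

```python
import math


def portfolio_stats(weights, mean_returns, cov):
    """Expected return and standard deviation (risk) of a weighted portfolio."""
    n = len(weights)
    expected = sum(weights[i] * mean_returns[i] for i in range(n))
    variance = sum(weights[i] * weights[j] * cov[i][j]
                   for i in range(n) for j in range(n))
    return expected, math.sqrt(variance)


def best_two_asset_mix(mean_returns, cov, steps=100):
    """Scan long-only two-asset weights and keep the best return-to-risk ratio.

    Assumes the covariance matrix gives a strictly positive risk for every
    weight on the grid, so the ratio is always defined.
    """
    best = None
    for k in range(steps + 1):
        w = [k / steps, 1 - k / steps]
        expected, risk = portfolio_stats(w, mean_returns, cov)
        ratio = expected / risk
        if best is None or ratio > best[0]:
            best = (ratio, w, expected, risk)
    return best


# Hypothetical inputs: predicted returns and a covariance matrix for two stocks.
mean_returns = [0.10, 0.06]
cov = [[0.04, 0.00],
       [0.00, 0.01]]

ratio, weights, expected, risk = best_two_asset_mix(mean_returns, cov)
print(weights, round(expected, 4), round(risk, 4), round(ratio, 4))
```

In this sketch the predicted returns would come from the price prediction component, and the mix of assets improves the return-to-risk ratio beyond what either asset offers alone.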

Keywords: financial stocks, optimized asset allocation, prediction modelling, South Africa

Procedia PDF Downloads 68

28681 Algorithms Minimizing Total Tardiness

Authors: Harun Aydilek, Asiye Aydilek, Ali Allahverdi

Abstract:

The total tardiness is a widely used performance measure in the scheduling literature. This performance measure is particularly important in situations where there is a cost to complete a job beyond its due date. The cost of scheduling increases as the gap between a job's due date and its completion time increases. Such costs may also be penalty costs in contracts or loss of goodwill. This performance measure is important because the fulfillment of customers' due dates has to be taken into account while making scheduling decisions. The problem has been addressed in the literature; however, zero setup times have been assumed. Even though this assumption may be valid for some environments, it is not valid for other scheduling environments. When setup times are treated as separate from processing times, it is possible to increase machine utilization and to reduce total tardiness. Therefore, non-zero setup times need to be considered as separate. A dominance relation is developed and several algorithms are proposed. The developed dominance relation is utilized in the proposed algorithms. Extensive computational experiments are conducted for the evaluation of the algorithms. The experiments indicate that the developed algorithms perform much better than the existing algorithms in the literature. More specifically, one of the newly proposed algorithms reduces the error of the best existing algorithm in the literature by 40 percent.
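As a minimal illustration of the performance measure discussed above, the sketch below computes total tardiness for a given job order. It assumes a simplified single-machine setting with a separate, sequence-independent setup time per job, which is much simpler than the assembly flowshop studied in the paper; the function name and the sample data are hypothetical.

```python
def total_tardiness(sequence, processing, setup, due):
    """Sum of max(0, completion time - due date) over jobs in the given order.

    Simplifying assumption: one machine, and each job needs its own setup
    time immediately before its processing time.
    """
    clock = 0
    total = 0
    for job in sequence:
        clock += setup[job] + processing[job]  # completion time of this job
        total += max(0, clock - due[job])      # tardiness is never negative
    return total


# Hypothetical data for three jobs.
processing = {"A": 4, "B": 2, "C": 3}
setup = {"A": 1, "B": 1, "C": 2}
due = {"A": 6, "B": 4, "C": 12}

print(total_tardiness(["A", "B", "C"], processing, setup, due))  # 5
print(total_tardiness(["B", "A", "C"], processing, setup, due))  # 3
```

The two calls show that the job order alone changes the total tardiness, which is why sequencing algorithms and dominance relations matter for this objective.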

Keywords: algorithm, assembly flowshop, dominance relation, total tardiness

Procedia PDF Downloads 332

28680 Minimizing Total Completion Time in No-Wait Flowshops with Setup Times

Authors: Ali Allahverdi

Abstract:

The m-machine no-wait flowshop scheduling problem is addressed in this paper. The objective is to minimize total completion time subject to the constraint that the makespan value is not greater than a certain value. Setup times are treated as separate from processing times. Several recent algorithms are adapted and proposed for the problem. An extensive computational analysis has been conducted for the evaluation of the proposed algorithms. The computational analysis indicates that the best proposed algorithm performs significantly better than the earlier existing best algorithm.

Keywords: scheduling, no-wait flowshop, algorithm, setup times, total completion time, makespan

Procedia PDF Downloads 324

28679 A Model Predictive Control Based Virtual Active Power Filter Using V2G Technology

Authors: Mahdi Zolfaghari, Seyed Hossein Hosseinian, Hossein Askarian Abyaneh, Mehrdad Abedi

Abstract:

This paper presents a virtual active power filter (VAPF) using vehicle to grid (V2G) technology to maintain power quality requirements. The optimal discrete operation of the power converter of electric vehicle (EV) is based on recognizing desired switching states using the model predictive control (MPC) algorithm. A fast dynamic response, lower total harmonic distortion (THD) and good reference tracking performance are realized through the presented control strategy. The simulation results using MATLAB/Simulink validate the effectiveness of the scheme in improving power quality as well as good dynamic response in power transferring capability.

Keywords: electric vehicle, model predictive control, power quality, V2G technology, virtual active power filter

Procedia PDF Downloads 394

28678 Comparing Community Detection Algorithms in Bipartite Networks

Authors: Ehsan Khademi, Mahdi Jalili

Abstract:

Despite the special features of bipartite networks, they are common in many systems. Real-world bipartite networks may show community structure, similar to what one can find in one-mode networks. However, the interpretation of the community structure in bipartite networks differs from that in one-mode networks. In this manuscript, we compare a number of available methods that are frequently used to discover the community structure of bipartite networks. These methods are categorized into two broad classes. One class comprises the methods that first transform the network into a one-mode network and then apply community detection algorithms. The other class comprises the algorithms that have been developed specifically for bipartite networks. These algorithms are applied to a model network with prescribed community structure.
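The first class of methods relies on turning the bipartite network into a one-mode network. One common way to do this is a weighted projection, sketched below under the assumption that two nodes of the same set are linked whenever they share at least one neighbor in the other set. The function name `project_one_mode` and the toy edge list are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def project_one_mode(edges):
    """Project (top, bottom) edges onto the top nodes.

    Returns a dict mapping each pair of top nodes to the number of bottom
    nodes they share, which serves as the weight of the projected edge.
    """
    tops_of = defaultdict(set)
    for top, bottom in edges:
        tops_of[bottom].add(top)

    weights = defaultdict(int)
    for tops in tops_of.values():
        for u, v in combinations(sorted(tops), 2):
            weights[(u, v)] += 1
    return dict(weights)


# Hypothetical bipartite network: people (a, b, c) attending events (x, y).
edges = [("a", "x"), ("b", "x"), ("a", "y"), ("b", "y"), ("c", "y")]
print(project_one_mode(edges))
```

Any standard one-mode community detection algorithm could then be run on the projected edges, at the cost of losing some of the information carried by the original two-mode structure.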

Keywords: community detection, bipartite networks, co-clustering, modularity, network projection, complex networks

Procedia PDF Downloads 595

28677 Predicting the Human Impact of Natural Onset Disasters Using Pattern Recognition Techniques and Rule Based Clustering

Authors: Sara Hasani

Abstract:

This research focuses on natural sudden onset disasters characterised as ‘occurring with little or no warning and often cause excessive injuries far surpassing the national response capacities’. Based on the panel analysis of the historic record of 4,252 natural onset disasters between 1980 and 2015, a predictive method was developed to predict the human impact of the disaster (fatality, injured, homeless) with less than 3% error. The geographical dispersion of the disasters includes every country where the data were available and cross-examined from various humanitarian sources. The records were then filtered into 4,252 records of the disasters where the five predictive variables (disaster type, HDI, DRI, population, and population density) were clearly stated. The procedure was designed based on a combination of pattern recognition techniques and rule-based clustering for prediction, with discrimination analysis used to validate the results further. The result indicates that there is a relationship between the human impact of a disaster and the five socio-economic characteristics of the affected country mentioned above. As a result, a framework was put forward that could predict the disaster’s human impact based on its severity rank in the early hours after the disaster strikes. The predictions in this model were outlined in two scenarios, worst case and best case, which respectively inform the lower range and higher range of the prediction. The need for such a predictive framework is highlighted by the fact that, despite the existing research in the literature, a framework for predicting the human impact and estimating the needs at the time of the disaster is yet to be developed. This can further be used to allocate resources at the response phase of the disaster, when data is scarce.

Keywords: disaster management, natural disaster, pattern recognition, prediction

Procedia PDF Downloads 136

28676 EnumTree: An Enumerative Biclustering Algorithm for DNA Microarray Data

Authors: Haifa Ben Saber, Mourad Elloumi

Abstract:

In a number of domains, such as DNA microarray data analysis, we need to cluster simultaneously rows (genes) and columns (conditions) of a data matrix to identify groups of constant rows with a group of columns. This kind of clustering is called biclustering. Biclustering algorithms are extensively used in DNA microarray data analysis. More effective biclustering algorithms are highly desirable and needed. We introduce a new algorithm called Enumerative tree (EnumTree) for biclustering of binary microarray data. EnumTree is an algorithm adopting the approach of enumerating biclusters. This algorithm extracts all biclusters of consistently good quality. The main idea of EnumTree is the construction of a new tree structure to represent adequately the different biclusters discovered during the process of enumeration. This algorithm adopts the all-biclusters-at-a-time strategy. The performance of the proposed algorithm is assessed using both synthetic and real DNA microarray data; our algorithm outperforms other biclustering algorithms for binary microarray data. Biclusters with different numbers of rows. Moreover, we test the biological significance using a gene annotation web tool to show that our proposed method is able to produce biologically relevant biclusters.

Keywords: DNA microarray, biclustering, gene expression data, tree, data mining

Procedia PDF Downloads 353

28675 Development of a Predictive Model to Prevent Financial Crisis

Authors: Tengqin Han

Abstract:

Delinquency has been a crucial factor in economics throughout the years. Commonly seen in credit cards and mortgages, it played a crucial role in causing the most recent financial crisis in 2008. In each case, a delinquency is a sign that the borrower is unable to pay off the debt, and it may thus cause a loss of property in the end. Individually, one case of delinquency seems unimportant compared to the entire credit system. China is an emerging economic entity whose national and economic strength has grown rapidly, and its gross domestic product (GDP) growth rate has remained as high as 8% in the past decades. However, potential risks exist behind the appearance of prosperity. Among these risks, the credit system is the most significant one. Because mortgages have long terms and large balances, it is critical to monitor the risk during the performance period. In this project, about 300,000 mortgage accounts are analyzed in order to develop a model that predicts the probability of delinquency. Through univariate analysis, the data is cleaned up, and through bivariate analysis, the variables with strong predictive power are detected. The project is divided into two parts. In the first part, the analysis data of 2005 are split into two parts: 60% for model development and 40% for in-time model validation. The KS of model development is 31, and the KS for in-time validation is 31, indicating the model is stable. In addition, the model is further validated by out-of-time validation, which uses 40% of 2006 data, and KS is 33. This indicates the model is still stable and robust. In the second part, the model is improved by the addition of macroeconomic indexes, including GDP, consumer price index, unemployment rate, inflation rate, etc. The data of 2005 to 2010 is used for model development and validation. Compared with the base model (without macroeconomic variables), KS is increased from 41 to 44, indicating that the macroeconomic variables can be used to improve the separation power of the model and make the prediction more accurate.
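The KS values quoted above measure how well the model's scores separate delinquent from non-delinquent accounts. A minimal sketch of one common way to compute such a statistic follows; it assumes KS is reported on a 0 to 100 scale as the largest gap between the two cumulative score distributions, and the function name and toy data are hypothetical rather than taken from the project.

```python
def ks_statistic(scores, labels):
    """Largest gap (0-100 scale) between the cumulative score distributions
    of delinquent (label 1) and non-delinquent (label 0) accounts.

    Assumes both classes are present in `labels`.
    """
    pairs = sorted(zip(scores, labels))
    n_bad = sum(labels)
    n_good = len(labels) - n_bad
    cum_bad = cum_good = 0
    ks = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Accounts with identical scores are added to the running totals together.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_bad += 1
            else:
                cum_good += 1
            j += 1
        ks = max(ks, abs(cum_bad / n_bad - cum_good / n_good))
        i = j
    return 100 * ks


# Hypothetical scores: a model that separates perfectly, and one that does not.
print(ks_statistic([0.1, 0.2, 0.3, 0.4], [0, 0, 1, 1]))  # 100.0
print(ks_statistic([1, 2, 3, 4], [1, 0, 1, 0]))          # 50.0
```

Similar KS values on development, in-time, and out-of-time samples are what the abstract uses as evidence that the model is stable.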

Keywords: delinquency, mortgage, model development, model validation

Procedia PDF Downloads 204

28674 Secure Hashing Algorithm and Advance Encryption Algorithm in Cloud Computing

Authors: Jaimin Patel

Abstract:

Cloud computing is one of the sharpest and most important movements among computing technologies. It provides flexibility to users, cost effectiveness, location independence, easy maintenance, multitenancy, drastic performance improvements, and increased productivity. On the other hand, there are also major issues such as security. Because a cloud is a shared server, security is a major issue; it is important to protect users’ private data, and this is especially important in e-commerce and social networks. In this paper, encryption algorithms such as the Advanced Encryption Standard (AES) are discussed, covering their vulnerabilities, risk of attacks, optimal time and complexity management, and a comparison with other algorithms based on software implementation. Encryption techniques to improve the performance of AES algorithms and to reduce risk are given. Secure Hash Algorithms, their vulnerabilities, software implementations, risk of attacks, and comparison with other hashing algorithms, as well as the advantages and disadvantages of hashing techniques relative to encryption, are also given.

Keywords: Cloud computing, encryption algorithm, secure hashing algorithm, brute force attack, birthday attack, plaintext attack, man in middle attack

Procedia PDF Downloads 258

28673 Aggregate Angularity on the Permanent Deformation Zones of Hot Mix Asphalt

Authors: Lee P. Leon, Raymond Charles

Abstract:

This paper presents a method of evaluating the effect of aggregate angularity on hot mix asphalt (HMA) properties and its relationship to permanent deformation resistance. The research concluded that aggregate particle angularity had a significant effect on permanent deformation performance, and that an increase in coarse aggregate angularity increased the resistance of mixes to permanent deformation. A comparison between the measured data and the output of permanent deformation predictive models showed the limits of existing prediction models. The numerical analysis described the permanent deformation zones and concluded that angularity has an effect on the onset of these zones. Prediction of permanent deformation helps road agencies and, by extension, economists and engineers determine the best approach for maintenance, rehabilitation, and new construction works of the road infrastructure.

Keywords: aggregate angularity, asphalt concrete, permanent deformation, rutting prediction

Procedia PDF Downloads 379

28672 Unsupervised Segmentation Technique for Acute Leukemia Cells Using Clustering Algorithms

Authors: N. H. Harun, A. S. Abdul Nasir, M. Y. Mashor, R. Hassan

Abstract:

Leukaemia is a blood cancer that contributes to the increase in the mortality rate in Malaysia each year. There are two main categories of leukaemia: acute and chronic. The production and development of acute leukaemia cells occurs rapidly and uncontrollably. Therefore, if acute leukaemia cells could be identified quickly and effectively, proper treatment and medicine could be delivered. Because leukaemia requires prompt and accurate diagnosis, the current study proposes unsupervised pixel segmentation based on clustering algorithms in order to obtain a fully segmented abnormal white blood cell (blast) in an acute leukaemia image. To obtain the segmented blast, three clustering algorithms, namely k-means, fuzzy c-means, and moving k-means, have been applied to the saturation component image. Then, median filter and seeded region growing area extraction algorithms have been applied to smooth the region of the segmented blast and to remove the large unwanted regions from the image, respectively. The three clustering algorithms are compared in order to measure the performance of each in segmenting the blast area. Based on the good sensitivity value obtained, the results indicate that the moving k-means clustering algorithm successfully produced the fully segmented blast region in the acute leukaemia image. Hence, the resultant images could be helpful to haematologists for further analysis of acute leukaemia.

Keywords: acute leukaemia images, clustering algorithms, image segmentation, moving k-means

Procedia PDF Downloads 267

28671 Application of Latent Class Analysis and Self-Organizing Maps for the Prediction of Treatment Outcomes for Chronic Fatigue Syndrome

Authors: Ben Clapperton, Daniel Stahl, Kimberley Goldsmith, Trudie Chalder

Abstract:

Chronic fatigue syndrome (CFS) is a condition characterised by chronic disabling fatigue and other symptoms that currently can't be explained by any underlying medical condition. Although clinical trials support the effectiveness of cognitive behaviour therapy (CBT), the success rate for individual patients is modest. Patients vary in their response, and little is known about which factors predict or moderate treatment outcomes. The aim of the project is to develop a prediction model from baseline characteristics of patients, such as demographic, clinical, and psychological variables, which may predict likely treatment outcome, provide guidance for clinical decision making, and help clinicians to recommend the best treatment. The project is aimed at identifying subgroups of patients with similar baseline characteristics that are predictive of treatment effects using modern cluster analyses and data mining machine learning algorithms. The characteristics of these groups will then be used to inform the types of individuals who benefit from a specific treatment. In addition, results will provide a better understanding of the patients for whom the treatment works. The suitability of different clustering methods for identifying subgroups of CFS patients and their response to different treatments is compared.

Keywords: chronic fatigue syndrome, latent class analysis, prediction modelling, self-organizing maps

Procedia PDF Downloads 201

28670 Artificial Steady-State-Based Nonlinear MPC for Wheeled Mobile Robot

Authors: M. H. Korayem, Sh. Ameri, N. Yousefi Lademakhi

Abstract:

To ensure the stability of closed-loop nonlinear model predictive control (NMPC) within a finite horizon, appropriately designed terminal ingredients are needed, and designing them can be a time-consuming and challenging effort. Otherwise, in order to ensure the stability of the control system, it is necessary to consider an infinite prediction horizon. Increasing the prediction horizon increases computational demand and slows down the implementation of the method. In this study, a new technique is proposed to ensure system stability without terminal ingredients. This technique has been employed in the design of the NMPC algorithm, removing the complexity of designing terminal ingredients and reducing the computational burden. The studied system is a wheeled mobile robot (WMR) subjected to non-holonomic constraints. Simulations have been carried out for two problems: trajectory tracking and adjustment mode.

Keywords: wheeled mobile robot, nonlinear model predictive control, stability, without terminal ingredients

Procedia PDF Downloads 57

28669 Time Series Regression with Meta-Clusters

Authors: Monika Chuchro

Abstract:

This paper presents a preliminary attempt to apply classification of time series using meta-clusters in order to improve the quality of regression models. In this case, clustering was performed to obtain subgroups of normally distributed time series data from inflow data of a waste water treatment plant, which is composed of several groups differing by mean value. Two simple algorithms, K-means and EM, were chosen as clustering methods. The Rand index was used to measure the similarity. After simple meta-clustering, a regression model was built for each subgroup. The final model was a sum of the subgroup models. The quality of the obtained model was compared with that of a regression model built using the same explanatory variables but with no clustering of data. Results were compared by the coefficient of determination (R2), the mean absolute percentage error (MAPE) as a measure of prediction accuracy, and a comparison on a line chart. Preliminary results suggest the potential of the presented technique.
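The two numerical quality measures named above can be written down compactly. The sketch below uses their standard textbook definitions; the function names and the sample values are hypothetical and do not come from the paper's data.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero, since each error is divided by it.
    """
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100 * sum(errors) / len(errors)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical inflow observations and model predictions.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print(round(mape(actual, predicted), 2))
print(round(r_squared(actual, predicted), 4))
```

A lower MAPE and a higher R2 for the clustered model than for the unclustered one would indicate that the meta-clustering step improved the regression.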

Keywords: clustering, data analysis, data mining, predictive models

Procedia PDF Downloads 445

28668 A Hierarchical Method for Multi-Class Probabilistic Classification Vector Machines

Authors: P. Byrnes, F. A. DiazDelaO

Abstract:

The Support Vector Machine (SVM) has become widely recognised as one of the leading algorithms in machine learning for both regression and binary classification. It expresses predictions in terms of a linear combination of kernel functions centred on a subset of the training points, referred to as support vectors. Despite its popularity amongst practitioners, SVM has some limitations, the most significant being that it generates point predictions rather than predictive distributions. To address this issue, a probabilistic model, namely Probabilistic Classification Vector Machines (PCVM), has been proposed, which respects the original functional form of SVM whilst also providing a predictive distribution. As physical system designs become more complex, an increasing number of classification tasks in industrial applications involve more than two classes. Consequently, this research proposes a framework which allows for the extension of PCVM to a multi-class setting. Additionally, the original PCVM framework relies on the use of type II maximum likelihood to provide estimates for both the kernel hyperparameters and model evidence. In a high-dimensional multi-class setting, however, this approach has been shown to be ineffective due to poor scaling as the number of classes increases. Accordingly, we propose the application of Markov Chain Monte Carlo (MCMC) based methods to provide a posterior distribution over both parameters and hyperparameters. The proposed framework will be validated against current multi-class classifiers through synthetic and real-life implementations.

Keywords: probabilistic classification vector machines, multi class classification, MCMC, support vector machines

Procedia PDF Downloads 206

28667 Meteosat Second Generation Image Compression Based on the Radon Transform and Linear Predictive Coding: Comparison and Performance

Authors: Cherifi Mehdi, Lahdir Mourad, Ameur Soltane

Abstract:

Image compression is used to reduce the number of bits required to represent an image. The Meteosat Second Generation (MSG) satellite allows the acquisition of 12 image files every 15 minutes, which results in large database sizes. The transform selected for image compression should help reduce the data representing the images. The Radon transform retrieves the Radon points that represent the sum of the pixels in a given angle for each direction. Linear predictive coding (LPC) with filtering provides a good decorrelation of Radon points using a predictor constituted by the Symmetric Nearest Neighbor (SNN) filter coefficients, which results in losses during decompression. Finally, Run Length Coding (RLC) gives a high and fixed compression ratio regardless of the input image. In this paper, a novel image compression method based on the Radon transform and LPC for MSG images is proposed. MSG image compression based on the Radon transform and LPC provides a good compromise between compression and quality of reconstruction. Our method is compared with three others, two based on the DCT and one on DWT bi-orthogonal filtering, to show the robustness of the Radon transform against quantization noise and to evaluate the performance of our method. Evaluation criteria such as PSNR and the compression ratio demonstrate the efficiency of our compression method.
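
The interplay of prediction and run-length coding described above can be illustrated with a small sketch. It makes two loud assumptions: the predictor is a plain previous-sample predictor (not the SNN-based predictor of the paper), and the pipeline is lossless, whereas the abstract describes a lossy scheme. All function names are hypothetical.

```python
from itertools import groupby


def predict_residuals(samples):
    """First-order prediction: each sample is predicted by the previous one."""
    previous = 0
    residuals = []
    for s in samples:
        residuals.append(s - previous)
        previous = s
    return residuals


def reconstruct(residuals):
    """Invert predict_residuals by accumulating the residuals."""
    samples, previous = [], 0
    for r in residuals:
        previous += r
        samples.append(previous)
    return samples


def rle_encode(values):
    """Run-length encode a sequence as (value, run_length) pairs."""
    return [(v, len(list(run))) for v, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into a flat list."""
    return [v for v, n in pairs for _ in range(n)]


# Hypothetical row of slowly varying values, for illustration only.
row = [10, 10, 10, 10, 12, 12, 12]
residuals = predict_residuals(row)
encoded = rle_encode(residuals)
print(residuals)
print(encoded)
print(reconstruct(rle_decode(encoded)) == row)
```

Prediction turns smooth regions into runs of zeros, which is what makes the subsequent run-length stage effective.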

Keywords: image compression, radon transform, linear predictive coding (LPC), run length coding (RLC), meteosat second generation (MSG)

Procedia PDF Downloads 392

28666 Optimizing Microgrid Operations: A Framework of Adaptive Model Predictive Control

Authors: Ruben Lopez-Rodriguez

Abstract:

In a microgrid, diverse energy sources (both renewable and non-renewable) are combined with energy storage units to form a localized power system. Microgrids function as independent entities, capable of meeting the energy needs of specific areas or communities. This paper introduces a Model Predictive Control (MPC) approach tailored for grid-connected microgrids, aiming to optimize their operation. The formulation employs Mixed-Integer Programming (MIP) to find optimal trajectories. This entails the fulfillment of continuous and binary constraints, all while accounting for commutations between various operating conditions such as storage unit charge/discharge, import/export from/towards the main grid, as well as asset connection/disconnection. To validate the proposed approach, a microgrid case study is conducted, and the simulation results are compared with those obtained using a rule-based strategy.

Keywords: microgrids, mixed logical dynamical systems, mixed-integer optimization, model predictive control

Procedia PDF Downloads 27

28665 Predictive Modelling of Aircraft Component Replacement Using Imbalanced Learning and Ensemble Method

Authors: Dangut Maren David, Skaf Zakwan

Abstract:

Adequate monitoring of vehicle components in order to obtain high uptime is the goal of predictive maintenance. The major challenge faced by businesses in industry is the significant cost associated with a delay in service delivery due to system downtime. Most of those businesses are interested in predicting such problems and proactively preventing them before they occur, which is the core advantage of Prognostic Health Management (PHM) applications. The recent emergence of Industry 4.0, or the industrial internet of things (IIoT), has led to the need for monitoring system activities and enhancing system-to-system or component-to-component interactions, which has resulted in the generation of large volumes of data known as big data. Analysis of big data is increasingly important; however, due to complexity inherent in the datasets, such as imbalance classification problems, it becomes extremely difficult to build a model with high precision. Data-driven predictive modeling for condition-based maintenance (CBM) has recently drawn research interest, with growing attention from both academia and industry. The large data generated from industrial processes inherently comes with different degrees of complexity, which poses a challenge for analytics. Thus, the imbalance classification problem exists pervasively in industrial datasets and can affect the performance of learning algorithms, yielding poor classifier accuracy in model development. Misclassification of faults can result in unplanned breakdowns leading to economic loss.
In this paper, an advanced approach for handling the imbalance classification problem is proposed, and a prognostic model is then developed to predict aircraft component replacement in advance by exploring aircraft historical data. The approach is based on a hybrid ensemble-based method which improves the prediction of the minority class during learning. We also investigate the impact of our approach on the multiclass imbalance problem. We validate the feasibility and effectiveness of our approach in terms of performance using real-world aircraft operation and maintenance datasets, which span over 7 years. Our approach shows better performance compared to other similar approaches. We also validate the strength of our approach for handling multiclass imbalanced datasets; our results also show good performance compared to other base classifiers.

Keywords: prognostics, data-driven, imbalance classification, deep learning

Procedia PDF Downloads 151

28664 Comparison of Techniques for Detection and Diagnosis of Eccentricity in the Air-Gap Fault in Induction Motors

Authors: Abrahão S. Fontes, Carlos A. V. Cardoso, Levi P. B. Oliveira

Abstract:

Induction motors are used worldwide in various industries. Several maintenance techniques are applied to increase the operating time and the lifespan of these motors. Among these, predictive maintenance techniques such as Motor Current Signature Analysis (MCSA), Motor Square Current Signature Analysis (MSCSA), Park's Vector Approach (PVA) and Park's Vector Square Modulus (PVSM) are used to detect and diagnose faults in electric motors, characterized by patterns in the stator current frequency spectrum. In this article, these techniques are applied and compared on a real motor with an air-gap eccentricity fault. A theoretical model of a fault-free electric induction motor was used to assist the comparison between the stator current frequency spectrum patterns with and without faults. Metrics were proposed and applied to evaluate the fault-detection sensitivity of each technique. The results presented here show that the above techniques are suitable for detecting air-gap eccentricity faults, and the comparison between them showed the suitability of each one.

Keywords: eccentricity in the air-gap, fault diagnosis, induction motors, predictive maintenance

Procedia PDF Downloads 326

28663 Prediction of Bariatric Surgery Publications by Using Different Machine Learning Algorithms

Authors: Senol Dogan, Gunay Karli

Abstract:

Identification of relevant publications based on a Medline query is time-consuming and error-prone. An automated process has the potential to solve this problem without any manual work. To the best of our knowledge, our study is the first to investigate the ability of machine learning to identify relevant articles accurately. Five different machine learning algorithms were tested using 23 predictors based on several metadata fields attached to publications. We find that the Boosted model is the best-performing algorithm, with an overall accuracy of 96%. In addition, the specificity and sensitivity of the algorithm are 97% and 93%, respectively. As a result of this work, we conclude that the same procedure can be applied to the analysis of cancer gene expression big data.
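
For readers unfamiliar with the three reported measures, the sketch below shows how accuracy, sensitivity and specificity follow from a binary confusion matrix. The counts are made up for illustration and are not taken from the study; the function name is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)  # share of relevant articles that were found
    specificity = tn / (tn + fp)  # share of irrelevant articles that were rejected
    return accuracy, sensitivity, specificity


# Hypothetical counts: 50 relevant and 50 irrelevant publications.
accuracy, sensitivity, specificity = classification_metrics(tp=40, fn=10, tn=45, fp=5)
print(accuracy, sensitivity, specificity)
```

Because the three measures weight the two classes differently, overall accuracy can sit between sensitivity and specificity, as it does in the figures quoted in the abstract.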

Keywords: prediction of publications, machine learning, algorithms, bariatric surgery, comparison of algorithms, boosted, tree, logistic regression, ANN model

Procedia PDF Downloads 187

28662 Predictive Analytics for Theory Building

Authors: Ho-Won Jung, Donghun Lee, Hyung-Jin Kim

Abstract:

Predictive analytics (data analysis) uses a subset of measurements (the features, predictors, or independent variables) to predict another measurement (the outcome, target, or dependent variable) on a single person or unit. It applies empirical methods in statistics, operations research, and machine learning to predict future or otherwise unknown events or outcomes on a single person or unit, based on patterns in data. Most analyses of metabolic syndrome are not predictive analytics but statistical explanatory studies that build a proposed model (theory building) and then validate the hypothesized metabolic syndrome predictors (theory testing). A proposed theoretical model is formed with causal hypotheses that specify how and why certain empirical phenomena occur. Predictive analytics and explanatory modeling have their own territories in analysis. However, predictive analytics can perform vital roles in explanatory studies, i.e., scientific activities such as theory building, theory testing, and relevance assessment. In this context, this study demonstrates how to use our predictive analytics to support theory building (i.e., hypothesis generation). For this purpose, this study utilized a big data predictive analytics platform TM based on a co-occurrence graph. The co-occurrence graph is depicted with nodes (e.g., items in a basket) and arcs (direct connections between two nodes), where items in a basket are fully connected. A cluster is a collection of fully connected items, where the specific group of items has co-occurred in several rows in a data set. Clusters can be ranked using importance metrics, such as node size (number of items), frequency, and surprise (observed frequency vs. expected), among others. The size of a graph can be represented by the numbers of nodes and arcs. Since the size of a co-occurrence graph does not depend directly on the number of observations (transactions), huge amounts of transactions can be represented and processed efficiently.
For a demonstration, a training dataset of 13,254 metabolic syndrome observations is plugged into the analytics platform to generate rules (potential hypotheses). Each observation includes 31 predictors associated with, for example, sociodemographics, habits, and activities. Some, such as cancer examination, house type, and vaccination, are intentionally included to gain predictive analytics insights on variable selection. The platform automatically generates plausible hypotheses (rules) without statistical modeling. The rules are then validated with an external testing dataset of 4,090 observations. The results, as a kind of inductive reasoning, show potential hypotheses extracted as a set of association rules. Most statistical models generate just one estimated equation. On the other hand, a set of rules (many estimated equations from a statistical perspective) in this study may imply heterogeneity in a population (i.e., different subpopulations with unique features are aggregated). The next step of theory development, i.e., theory testing, statistically tests whether a proposed theoretical model is a plausible explanation of a phenomenon of interest. If the generated hypotheses are tested statistically with several thousand observations, most of the variables will become significant as the p-values approach zero. Thus, theory validation needs statistical methods that use only a part of the observations, such as bootstrap resampling with an appropriate sample size.
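
A co-occurrence graph of the kind described above can be sketched in a few lines. This is not the platform used in the study: the function names are hypothetical, the rows are invented, and "surprise" is assumed here to be the ratio of observed to expected pair frequency under independence, which is only one possible reading of "observed frequency vs. expected".

```python
from collections import Counter
from itertools import combinations


def cooccurrence_graph(rows):
    """Count node and arc frequencies; the items of each row are fully connected."""
    node_freq = Counter()
    arc_freq = Counter()
    for row in rows:
        items = sorted(set(row))
        node_freq.update(items)
        arc_freq.update(combinations(items, 2))
    return node_freq, arc_freq


def surprise(pair, node_freq, arc_freq, n_rows):
    """Observed pair frequency divided by the frequency expected under independence."""
    a, b = pair
    expected = node_freq[a] * node_freq[b] / n_rows
    return arc_freq[pair] / expected


# Hypothetical observations, for illustration only.
rows = [
    ["smoker", "obese", "mets"],
    ["smoker", "obese"],
    ["active"],
    ["obese", "mets"],
]
node_freq, arc_freq = cooccurrence_graph(rows)
print(len(node_freq), len(arc_freq))
print(surprise(("mets", "obese"), node_freq, arc_freq, len(rows)))
```

Adding more rows that contain the same items only increments the counters, so the number of nodes and arcs stays fixed, which matches the abstract's remark that graph size does not depend directly on the number of observations.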

Keywords: explanatory modeling, metabolic syndrome, predictive analytics, theory building

Procedia PDF Downloads 248

28661 The Effect of Acute Rejection and Delayed Graft Function on Renal Transplant Fibrosis in Live Donor Renal Transplantation

Authors: Wisam Ismail, Sarah Hosgood, Michael Nicholson

Abstract:

The research hypothesis is that early post-transplant allograft fibrosis will be linked to donor factors and that acute rejection and/or delayed graft function in the recipient will be independent risk factors for the development of fibrosis. This research explores whether acute rejection and/or delayed graft function has an effect on renal transplant fibrosis within the first year after live donor kidney transplant between 1998 and 2009. Methods: The study was designed to identify five time points of the renal transplant biopsies [0 (pre-transplant), 1 month, 3 months, 6 months and 12 months] for 300 live donor renal transplant patients over a 12-year period between March 1997 – August 2009. Paraffin-fixed slides were collected from Leicester General Hospital and Leicester Royal Infirmary. These were routinely sectioned at a thickness of 4 micrometres for standardization. Conclusions: Fibrosis at 1 month after the transplant was found to be significantly associated with baseline fibrosis (p<0.001) and HTN in the transplant recipient (p<0.001). Dialysis after the transplant showed a weak association with fibrosis at 1 month (p=0.07). The negative coefficient for HTN (-0.05) suggests a reduction in fibrosis in the absence of HTN. Fibrosis at 1 month was significantly associated with fibrosis at baseline (p=0.01, 95% CI 0.11 to 0.67). Fibrosis at 3, 6 or 12 months was not found to be associated with fibrosis at baseline (p=0.70, 0.65 and 0.50, respectively). The amount of fibrosis at 1 month is significantly associated with graft survival (p=0.01, 95% CI 0.02 to 0.14). Rejection and severity of rejection were not found to be associated with fibrosis at 1 month. The amount of fibrosis at 1 month was significantly associated with graft survival (p=0.02) after adjusting for baseline fibrosis (p=0.01). Both baseline fibrosis and graft survival were significant predictive factors.
The amount of fibrosis at 1 month was not found to be significantly associated with rejection (p=0.64) after adjusting for baseline fibrosis (p=0.01). The amount of fibrosis at 1 month was not found to be significantly associated with rejection severity (p=0.29) after adjusting for baseline fibrosis (p=0.04). Fibrosis at baseline and HTN in the recipient were found to be predictive factors of fibrosis at 1 month (p=0.02 and p<0.001, respectively). Age of the donor, their relation to the patient, the pre-op creatinine, artery, kidney weight and warm time were not found to be significantly associated with fibrosis at 1 month. In this complex model, baseline fibrosis, HTN in the recipient and cold time were found to be predictive factors of fibrosis at 1 month (p=0.01, <0.001 and 0.03, respectively). Donor age was found to be a predictive factor of fibrosis at 6 months. The above analysis was repeated for 3, 6 and 12 months. No associations were detected between fibrosis and any of the explanatory variables, with the exception of donor age, which was found to be a predictive factor of fibrosis at 6 months.

Keywords: fibrosis, transplant, renal, rejection

Procedia PDF Downloads 210

28660 Predictive Analytics of Student Performance Determinants

Authors: Mahtab Davari, Charles Edward Okon, Somayeh Aghanavesi

Abstract:

Every institute of learning is usually interested in the performance of enrolled students. The level of this performance determines the approach an institute may adopt in rendering academic services. The focus of this paper is to evaluate students' academic performance in given courses of study using machine learning methods. This study evaluated various supervised machine learning classification algorithms, namely Logistic Regression (LR), Support Vector Machine (SVM), Random Forest, Decision Tree, K-Nearest Neighbors, Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis, using selected features to predict student performance. The accuracy, precision, recall, and F1 score obtained from 5-fold cross-validation were used to determine the best classification algorithm for predicting students' performance. SVM (using a linear kernel), LDA, and LR were identified as the best-performing machine learning methods. Also, using the LR model, this study identified students' educational habits, such as reading and paying attention in class, as strong determinants of above-average performance. Other important features include the student's academic history and work. Demographic factors such as age, gender, high school graduation, etc., had no significant effect on a student's performance.
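
The evaluation protocol mentioned above (k-fold cross-validation scored with precision, recall and F1) can be sketched without any library. The helper names are hypothetical, the folds are contiguous with no shuffling or stratification (an assumption; the abstract does not say how folds were drawn), and the labels in the example are invented.

```python
def kfold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous, near-equal folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


def precision_recall_f1(y_true, y_pred):
    """Binary precision, recall and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Each of the 5 folds serves once as the held-out test set.
for train, test in kfold_indices(10, 5):
    print(test)

# Hypothetical true and predicted labels for one fold.
print(precision_recall_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

In practice a classifier would be fitted on each `train` split and scored on the matching `test` split, and the per-fold scores averaged to compare algorithms.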

Keywords: student performance, supervised machine learning, classification, cross-validation, prediction

Procedia PDF Downloads 94

28659 Efficient Prediction of Surface Roughness Using Box Behnken Design

Authors: Ajay Kumar Sarathe, Abhinay Kumar

Abstract:

Production of quality products required for specific engineering applications is an important issue. Surface roughness plays an important role in product quality, and using appropriate machining parameters eliminates wastage due to over-machining. To increase the quality of the surface, the optimum machining parameter setting is crucial during the machining operation. The effect of key machining parameters (spindle speed, feed rate, and depth of cut) on surface roughness has been evaluated. Experimental work was carried out using a High Speed Steel tool and AISI 1018 as the workpiece material. In this study, the predictive model has been developed using a Box-Behnken Design (BBD). An experimental investigation was carried out using BBD for three factors, and it was observed that the Ra value given by the predictive model is close to the measured value, with a marginal error of 2.8648%. The developed model establishes a correlation between the selected key machining parameters that influence surface roughness in AISI 1018.
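
For orientation, the run layout of a three-factor Box-Behnken design can be generated as in the sketch below. The pairwise construction shown is intended for the three-factor case discussed here; the number of centre points and the parameter ranges are assumptions chosen for illustration, not values from the study.

```python
from itertools import combinations, product


def box_behnken(n_factors=3, n_center=3):
    """Coded runs: +/-1 on each pair of factors, 0 on the rest, plus centre points."""
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0] * n_factors
            run[i], run[j] = a, b
            runs.append(tuple(run))
    runs.extend([(0,) * n_factors] * n_center)
    return runs


def decode(run, ranges):
    """Map coded levels (-1, 0, +1) to physical values given (low, high) ranges."""
    return tuple(
        (low + high) / 2 + level * (high - low) / 2
        for level, (low, high) in zip(run, ranges)
    )


# Hypothetical ranges: spindle speed (rpm), feed rate (mm/rev), depth of cut (mm).
ranges = [(500.0, 1500.0), (0.1, 0.3), (0.5, 1.5)]
design = box_behnken()
print(len(design))
print(decode(design[0], ranges))
```

Every non-centre run keeps one factor at its mid level, so the design never tests all three parameters at their extremes simultaneously.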

Keywords: ANOVA, BBD, optimisation, response surface methodology

Procedia PDF Downloads 135