Search results for: ensemble models
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 6544

Search results for: ensemble models

6484 Production Optimization under Geological Uncertainty Using Distance-Based Clustering

Authors: Byeongcheol Kang, Junyi Kim, Hyungsik Jung, Hyungjun Yang, Jaewoo An, Jonggeun Choe

Abstract:

It is important to figure out reservoir properties for better production management. Due to the limited information, there are geological uncertainties on very heterogeneous or channel reservoir. One of the solutions is to generate multiple equi-probable realizations using geostatistical methods. However, some models have wrong properties, which need to be excluded for simulation efficiency and reliability. We propose a novel method of model selection scheme, based on distance-based clustering for reliable application of production optimization algorithm. Distance is defined as a degree of dissimilarity between the data. We calculate Hausdorff distance to classify the models based on their similarity. Hausdorff distance is useful for shape matching of the reservoir models. We use multi-dimensional scaling (MDS) to describe the models on two dimensional space and group them by K-means clustering. Rather than simulating all models, we choose one representative model from each cluster and find out the best model, which has the similar production rates with the true values. From the process, we can select good reservoir models near the best model with high confidence. We make 100 channel reservoir models using single normal equation simulation (SNESIM). Since oil and gas prefer to flow through the sand facies, it is critical to characterize pattern and connectivity of the channels in the reservoir. After calculating Hausdorff distances and projecting the models by MDS, we can see that the models assemble depending on their channel patterns. These channel distributions affect operation controls of each production well so that the model selection scheme improves management optimization process. We use one of useful global search algorithms, particle swarm optimization (PSO), for our production optimization. PSO is good to find global optimum of objective function, but it takes too much time due to its usage of many particles and iterations. In addition, if we use multiple reservoir models, the simulation time for PSO will be soared. By using the proposed method, we can select good and reliable models that already matches production data. Considering geological uncertainty of the reservoir, we can get well-optimized production controls for maximum net present value. The proposed method shows one of novel solutions to select good cases among the various probabilities. The model selection schemes can be applied to not only production optimization but also history matching or other ensemble-based methods for efficient simulations.

Keywords: distance-based clustering, geological uncertainty, particle swarm optimization (PSO), production optimization

Procedia PDF Downloads 112
6483 Decision Trees Constructing Based on K-Means Clustering Algorithm

Authors: Loai Abdallah, Malik Yousef

Abstract:

A domain space for the data should reflect the actual similarity between objects. Since objects belonging to the same cluster usually share some common traits even though their geometric distance might be relatively large. In general, the Euclidean distance of data points that represented by large number of features is not capturing the actual relation between those points. In this study, we propose a new method to construct a different space that is based on clustering to form a new distance metric. The new distance space is based on ensemble clustering (EC). The EC distance space is defined by tracking the membership of the points over multiple runs of clustering algorithm metric. Over this distance, we train the decision trees classifier (DT-EC). The results obtained by applying DT-EC on 10 datasets confirm our hypotheses that embedding the EC space as a distance metric would improve the performance.

Keywords: ensemble clustering, decision trees, classification, K nearest neighbors

Procedia PDF Downloads 160
6482 Optimizing Approach for Sifting Process to Solve a Common Type of Empirical Mode Decomposition Mode Mixing

Authors: Saad Al-Baddai, Karema Al-Subari, Elmar Lang, Bernd Ludwig

Abstract:

Empirical mode decomposition (EMD), a new data-driven of time-series decomposition, has the advantage of supposing that a time series is non-linear or non-stationary, as is implicitly achieved in Fourier decomposition. However, the EMD suffers of mode mixing problem in some cases. The aim of this paper is to present a solution for a common type of signals causing of EMD mode mixing problem, in case a signal suffers of an intermittency. By an artificial example, the solution shows superior performance in terms of cope EMD mode mixing problem comparing with the conventional EMD and Ensemble Empirical Mode decomposition (EEMD). Furthermore, the over-sifting problem is also completely avoided; and computation load is reduced roughly six times compared with EEMD, an ensemble number of 50.

Keywords: empirical mode decomposition (EMD), mode mixing, sifting process, over-sifting

Procedia PDF Downloads 362
6481 A Dataset of Program Educational Objectives Mapped to ABET Outcomes: Data Cleansing, Exploratory Data Analysis and Modeling

Authors: Addin Osman, Anwar Ali Yahya, Mohammed Basit Kamal

Abstract:

Datasets or collections are becoming important assets by themselves and now they can be accepted as a primary intellectual output of a research. The quality and usage of the datasets depend mainly on the context under which they have been collected, processed, analyzed, validated, and interpreted. This paper aims to present a collection of program educational objectives mapped to student’s outcomes collected from self-study reports prepared by 32 engineering programs accredited by ABET. The manual mapping (classification) of this data is a notoriously tedious, time consuming process. In addition, it requires experts in the area, which are mostly not available. It has been shown the operational settings under which the collection has been produced. The collection has been cleansed, preprocessed, some features have been selected and preliminary exploratory data analysis has been performed so as to illustrate the properties and usefulness of the collection. At the end, the collection has been benchmarked using nine of the most widely used supervised multiclass classification techniques (Binary Relevance, Label Powerset, Classifier Chains, Pruned Sets, Random k-label sets, Ensemble of Classifier Chains, Ensemble of Pruned Sets, Multi-Label k-Nearest Neighbors and Back-Propagation Multi-Label Learning). The techniques have been compared to each other using five well-known measurements (Accuracy, Hamming Loss, Micro-F, Macro-F, and Macro-F). The Ensemble of Classifier Chains and Ensemble of Pruned Sets have achieved encouraging performance compared to other experimented multi-label classification methods. The Classifier Chains method has shown the worst performance. To recap, the benchmark has achieved promising results by utilizing preliminary exploratory data analysis performed on the collection, proposing new trends for research and providing a baseline for future studies.

Keywords: ABET, accreditation, benchmark collection, machine learning, program educational objectives, student outcomes, supervised multi-class classification, text mining

Procedia PDF Downloads 142
6480 Evaluation of Machine Learning Algorithms and Ensemble Methods for Prediction of Students’ Graduation

Authors: Soha A. Bahanshal, Vaibhav Verdhan, Bayong Kim

Abstract:

Graduation rates at six-year colleges are becoming a more essential indicator for incoming fresh students and for university rankings. Predicting student graduation is extremely beneficial to schools and has a huge potential for targeted intervention. It is important for educational institutions since it enables the development of strategic plans that will assist or improve students' performance in achieving their degrees on time (GOT). A first step and a helping hand in extracting useful information from these data and gaining insights into the prediction of students' progress and performance is offered by machine learning techniques. Data analysis and visualization techniques are applied to understand and interpret the data. The data used for the analysis contains students who have graduated in 6 years in the academic year 2017-2018 for science majors. This analysis can be used to predict the graduation of students in the next academic year. Different Predictive modelings such as logistic regression, decision trees, support vector machines, Random Forest, Naïve Bayes, and KNeighborsClassifier are applied to predict whether a student will graduate. These classifiers were evaluated with k folds of 5. The performance of these classifiers was compared based on accuracy measurement. The results indicated that Ensemble Classifier achieves better accuracy, about 91.12%. This GOT prediction model would hopefully be useful to university administration and academics in developing measures for assisting and boosting students' academic performance and ensuring they graduate on time.

Keywords: prediction, decision trees, machine learning, support vector machine, ensemble model, student graduation, GOT graduate on time

Procedia PDF Downloads 48
6479 Predicting Aggregation Propensity from Low-Temperature Conformational Fluctuations

Authors: Hamza Javar Magnier, Robin Curtis

Abstract:

There have been rapid advances in the upstream processing of protein therapeutics, which has shifted the bottleneck to downstream purification and formulation. Finding liquid formulations with shelf lives of up to two years is increasingly difficult for some of the newer therapeutics, which have been engineered for activity, but their formulations are often viscous, can phase separate, and have a high propensity for irreversible aggregation1. We explore means to develop improved predictive ability from a better understanding of how protein-protein interactions on formulation conditions (pH, ionic strength, buffer type, presence of excipients) and how these impact upon the initial steps in protein self-association and aggregation. In this work, we study the initial steps in the aggregation pathways using a minimal protein model based on square-well potentials and discontinuous molecular dynamics. The effect of model parameters, including range of interaction, stiffness, chain length, and chain sequence, implies that protein models fold according to various pathways. By reducing the range of interactions, the folding- and collapse- transition come together, and follow a single-step folding pathway from the denatured to the native state2. After parameterizing the model interaction-parameters, we developed an understanding of low-temperature conformational properties and fluctuations, and the correlation to the folding transition of proteins in isolation. The model fluctuations increase with temperature. We observe a low-temperature point, below which large fluctuations are frozen out. This implies that fluctuations at low-temperature can be correlated to the folding transition at the melting temperature. Because proteins “breath” at low temperatures, defining a native-state as a single structure with conserved contacts and a fixed three-dimensional structure is misleading. Rather, we introduce a new definition of a native-state ensemble based on our understanding of the core conservation, which takes into account the native fluctuations at low temperatures. This approach permits the study of a large range of length and time scales needed to link the molecular interactions to the macroscopically observed behaviour. In addition, these models studied are parameterized by fitting to experimentally observed protein-protein interactions characterized in terms of osmotic second virial coefficients.

Keywords: protein folding, native-ensemble, conformational fluctuation, aggregation

Procedia PDF Downloads 333
6478 MIMIC: A Multi Input Micro-Influencers Classifier

Authors: Simone Leonardi, Luca Ardito

Abstract:

Micro-influencers are effective elements in the marketing strategies of companies and institutions because of their capability to create an hyper-engaged audience around a specific topic of interest. In recent years, many scientific approaches and commercial tools have handled the task of detecting this type of social media users. These strategies adopt solutions ranging from rule based machine learning models to deep neural networks and graph analysis on text, images, and account information. This work compares the existing solutions and proposes an ensemble method to generalize them with different input data and social media platforms. The deployed solution combines deep learning models on unstructured data with statistical machine learning models on structured data. We retrieve both social media accounts information and multimedia posts on Twitter and Instagram. These data are mapped into feature vectors for an eXtreme Gradient Boosting (XGBoost) classifier. Sixty different topics have been analyzed to build a rule based gold standard dataset and to compare the performances of our approach against baseline classifiers. We prove the effectiveness of our work by comparing the accuracy, precision, recall, and f1 score of our model with different configurations and architectures. We obtained an accuracy of 0.91 with our best performing model.

Keywords: deep learning, gradient boosting, image processing, micro-influencers, NLP, social media

Procedia PDF Downloads 139
6477 A Review on Water Models of Surface Water Environment

Authors: Shahbaz G. Hassan

Abstract:

Water quality models are very important to predict the changes in surface water quality for environmental management. The aim of this paper is to give an overview of the water qualities, and to provide directions for selecting models in specific situation. Water quality models include one kind of model based on a mechanistic approach, while other models simulate water quality without considering a mechanism. Mechanistic models can be widely applied and have capabilities for long-time simulation, with highly complexity. Therefore, more spaces are provided to explain the principle and application experience of mechanistic models. Mechanism models have certain assumptions on rivers, lakes and estuaries, which limits the application range of the model, this paper introduces the principles and applications of water quality model based on the above three scenarios. On the other hand, mechanistic models are more easily to compute, and with no limit to the geographical conditions, but they cannot be used with confidence to simulate long term changes. This paper divides the empirical models into two broad categories according to the difference of mathematical algorithm, models based on artificial intelligence and models based on statistical methods.

Keywords: empirical models, mathematical, statistical, water quality

Procedia PDF Downloads 231
6476 An Intrusion Detection Systems Based on K-Means, K-Medoids and Support Vector Clustering Using Ensemble

Authors: A. Mohammadpour, Ebrahim Najafi Kajabad, Ghazale Ipakchi

Abstract:

Presently, computer networks’ security rise in importance and many studies have also been conducted in this field. By the penetration of the internet networks in different fields, many things need to be done to provide a secure industrial and non-industrial network. Fire walls, appropriate Intrusion Detection Systems (IDS), encryption protocols for information sending and receiving, and use of authentication certificated are among things, which should be considered for system security. The aim of the present study is to use the outcome of several algorithms, which cause decline in IDS errors, in the way that improves system security and prevents additional overload to the system. Finally, regarding the obtained result we can also detect the amount and percentage of more sub attacks. By running the proposed system, which is based on the use of multi-algorithmic outcome and comparing that by the proposed single algorithmic methods, we observed a 78.64% result in attack detection that is improved by 3.14% than the proposed algorithms.

Keywords: intrusion detection systems, clustering, k-means, k-medoids, SV clustering, ensemble

Procedia PDF Downloads 192
6475 Multi-Sensor Target Tracking Using Ensemble Learning

Authors: Bhekisipho Twala, Mantepu Masetshaba, Ramapulana Nkoana

Abstract:

Multiple classifier systems combine several individual classifiers to deliver a final classification decision. However, an increasingly controversial question is whether such systems can outperform the single best classifier, and if so, what form of multiple classifiers system yields the most significant benefit. Also, multi-target tracking detection using multiple sensors is an important research field in mobile techniques and military applications. In this paper, several multiple classifiers systems are evaluated in terms of their ability to predict a system’s failure or success for multi-sensor target tracking tasks. The Bristol Eden project dataset is utilised for this task. Experimental and simulation results show that the human activity identification system can fulfill requirements of target tracking due to improved sensors classification performances with multiple classifier systems constructed using boosting achieving higher accuracy rates.

Keywords: single classifier, ensemble learning, multi-target tracking, multiple classifiers

Procedia PDF Downloads 230
6474 Management and Marketing Implications of Tourism Gravity Models

Authors: Clive L. Morley

Abstract:

Gravity models and panel data modelling of tourism flows are receiving renewed attention, after decades of general neglect. Such models have quite different underpinnings from conventional demand models derived from micro-economic theory. They operate at a different level of data and with different theoretical bases. These differences have important consequences for the interpretation of the results and their policy and managerial implications. This review compares and contrasts the two model forms, clarifying the distinguishing features and the estimation requirements of each. In general, gravity models are not recommended for use to address specific management and marketing purposes.

Keywords: gravity models, micro-economics, demand models, marketing

Procedia PDF Downloads 411
6473 Electric Models for Crosstalk Predection: Analysis and Performance Evaluation

Authors: Kachout Mnaouer, Bel Hadj Tahar Jamel, Choubani Fethi

Abstract:

In this paper, three electric equivalent models to evaluate crosstalk between three-conductor transmission lines are proposed. First, electric equivalent models for three-conductor transmission lines are presented. Secondly, rigorous equations to calculate the per-unit length inductive and capacitive parameters are developed. These models allow us to calculate crosstalk between conductors. Finally, to validate the presented models, we compare the theoretical results with simulation data. Obtained results show that proposed models can be used to predict crosstalk performance.

Keywords: near-end crosstalk, inductive parameter, L, Π, T models

Procedia PDF Downloads 422
6472 The Grit in the Glamour: A Qualitative Study of the Well-Being of Fashion Models

Authors: Emily Fortune Super, Ameerah Khadaroo, Aurore Bardey

Abstract:

Fashion models are often assumed to have a glamorous job with limited consideration for their well-being. This study aims to assess the well-being of models through semi-structured interviews with six professional fashion models and six industry professionals. Thematic analysis revealed that although models experienced improved self-confidence, they also reported heightened anxiety levels, body image issues, and the negative influence of modelling on their self-esteem. By contrast, industry professionals reported no or minimum concerns about anxious behaviours or the general well-being of fashion models. Being resilient as a model was perceived as an essential attribute to have by both models and industry professionals as they face recurrent rejection in this industry. These results demonstrate a significant gap in the current understanding of the well-being of fashion models between industry professionals and the models themselves. Findings imply that there is an inherent need for change in the modelling industry to promote and enhance their well-being.

Keywords: body image, fashion industry, modelling, well-being

Procedia PDF Downloads 141
6471 Real-Time Radar Tracking Based on Nonlinear Kalman Filter

Authors: Milca F. Coelho, K. Bousson, Kawser Ahmed

Abstract:

To accurately track an aerospace vehicle in a time-critical situation and in a highly nonlinear environment, is one of the strongest interests within the aerospace community. The tracking is achieved by estimating accurately the state of a moving target, which is composed of a set of variables that can provide a complete status of the system at a given time. One of the main ingredients for a good estimation performance is the use of efficient estimation algorithms. A well-known framework is the Kalman filtering methods, designed for prediction and estimation problems. The success of the Kalman Filter (KF) in engineering applications is mostly due to the Extended Kalman Filter (EKF), which is based on local linearization. Besides its popularity, the EKF presents several limitations. To address these limitations and as a possible solution to tracking problems, this paper proposes the use of the Ensemble Kalman Filter (EnKF). Although the EnKF is being extensively used in the context of weather forecasting and it is being recognized for producing accurate and computationally effective estimation on systems with a very high dimension, it is almost unknown by the tracking community. The EnKF was initially proposed as an attempt to improve the error covariance calculation, which on the classic Kalman Filter is difficult to implement. Also, in the EnKF method the prediction and analysis error covariances have ensemble representations. These ensembles have sizes which limit the number of degrees of freedom, in a way that the filter error covariance calculations are a lot more practical for modest ensemble sizes. In this paper, a realistic simulation of a radar tracking was performed, where the EnKF was applied and compared with the Extended Kalman Filter. The results suggested that the EnKF is a promising tool for tracking applications, offering more advantages in terms of performance.

Keywords: Kalman filter, nonlinear state estimation, optimal tracking, stochastic environment

Procedia PDF Downloads 105
6470 An Ensemble System of Classifiers for Computer-Aided Volcano Monitoring

Authors: Flavio Cannavo

Abstract:

Continuous evaluation of the status of potentially hazardous volcanos plays a key role for civil protection purposes. The importance of monitoring volcanic activity, especially for energetic paroxysms that usually come with tephra emissions, is crucial not only for exposures to the local population but also for airline traffic. Presently, real-time surveillance of most volcanoes worldwide is essentially delegated to one or more human experts in volcanology, who interpret data coming from different kind of monitoring networks. Unfavorably, the high nonlinearity of the complex and coupled volcanic dynamics leads to a large variety of different volcanic behaviors. Moreover, continuously measured parameters (e.g. seismic, deformation, infrasonic and geochemical signals) are often not able to fully explain the ongoing phenomenon, thus making the fast volcano state assessment a very puzzling task for the personnel on duty at the control rooms. With the aim of aiding the personnel on duty in volcano surveillance, here we introduce a system based on an ensemble of data-driven classifiers to infer automatically the ongoing volcano status from all the available different kind of measurements. The system consists of a heterogeneous set of independent classifiers, each one built with its own data and algorithm. Each classifier gives an output about the volcanic status. The ensemble technique allows weighting the single classifier output to combine all the classifications into a single status that maximizes the performance. We tested the model on the Mt. Etna (Italy) case study by considering a long record of multivariate data from 2011 to 2015 and cross-validated it. Results indicate that the proposed model is effective and of great power for decision-making purposes.

Keywords: Bayesian networks, expert system, mount Etna, volcano monitoring

Procedia PDF Downloads 219
6469 Application of Complete Ensemble Empirical Mode Decomposition with Adaptive Noise and Multipoint Optimal Minimum Entropy Deconvolution in Railway Bearings Fault Diagnosis

Authors: Yao Cheng, Weihua Zhang

Abstract:

Although the measured vibration signal contains rich information on machine health conditions, the white noise interferences and the discrete harmonic coming from blade, shaft and mash make the fault diagnosis of rolling element bearings difficult. In order to overcome the interferences of useless signals, a new fault diagnosis method combining Complete Ensemble Empirical Mode Decomposition with adaptive noise (CEEMDAN) and Multipoint Optimal Minimum Entropy Deconvolution (MOMED) is proposed for the fault diagnosis of high-speed train bearings. Firstly, the CEEMDAN technique is applied to adaptively decompose the raw vibration signal into a series of finite intrinsic mode functions (IMFs) and a residue. Compared with Ensemble Empirical Mode Decomposition (EEMD), the CEEMDAN can provide an exact reconstruction of the original signal and a better spectral separation of the modes, which improves the accuracy of fault diagnosis. An effective sensitivity index based on the Pearson's correlation coefficients between IMFs and raw signal is adopted to select sensitive IMFs that contain bearing fault information. The composite signal of the sensitive IMFs is applied to further analysis of fault identification. Next, for propose of identifying the fault information precisely, the MOMED is utilized to enhance the periodic impulses in composite signal. As a non-iterative method, the MOMED has better deconvolution performance than the classical deconvolution methods such Minimum Entropy Deconvolution (MED) and Maximum Correlated Kurtosis Deconvolution (MCKD). Third, the envelope spectrum analysis is applied to detect the existence of bearing fault. The simulated bearing fault signals with white noise and discrete harmonic interferences are used to validate the effectiveness of the proposed method. Finally, the superiorities of the proposed method are further demonstrated by high-speed train bearing fault datasets measured from test rig. The analysis results indicate that the proposed method has strong practicability.

Keywords: bearing, complete ensemble empirical mode decomposition with adaptive noise, fault diagnosis, multipoint optimal minimum entropy deconvolution

Procedia PDF Downloads 339
6468 Cirrhosis Mortality Prediction as Classification using Frequent Subgraph Mining

Authors: Abdolghani Ebrahimi, Diego Klabjan, Chenxi Ge, Daniela Ladner, Parker Stride

Abstract:

In this work, we use machine learning and novel data analysis techniques to predict the one-year mortality of cirrhotic patients. Data from 2,322 patients with liver cirrhosis are collected at a single medical center. Different machine learning models are applied to predict one-year mortality. A comprehensive feature space including demographic information, comorbidity, clinical procedure and laboratory tests is being analyzed. A temporal pattern mining technic called Frequent Subgraph Mining (FSM) is being used. Model for End-stage liver disease (MELD) prediction of mortality is used as a comparator. All of our models statistically significantly outperform the MELD-score model and show an average 10% improvement of the area under the curve (AUC). The FSM technic itself does not improve the model significantly, but FSM, together with a machine learning technique called an ensemble, further improves the model performance. With the abundance of data available in healthcare through electronic health records (EHR), existing predictive models can be refined to identify and treat patients at risk for higher mortality. However, due to the sparsity of the temporal information needed by FSM, the FSM model does not yield significant improvements. To the best of our knowledge, this is the first work to apply modern machine learning algorithms and data analysis methods on predicting one-year mortality of cirrhotic patients and builds a model that predicts one-year mortality significantly more accurate than the MELD score. We have also tested the potential of FSM and provided a new perspective of the importance of clinical features.

Keywords: machine learning, liver cirrhosis, subgraph mining, supervised learning

Procedia PDF Downloads 108
6467 Ensemble Methods in Machine Learning: An Algorithmic Approach to Derive Distinctive Behaviors of Criminal Activity Applied to the Poaching Domain

Authors: Zachary Blanks, Solomon Sonya

Abstract:

Poaching presents a serious threat to endangered animal species, environment conservations, and human life. Additionally, some poaching activity has even been linked to supplying funds to support terrorist networks elsewhere around the world. Consequently, agencies dedicated to protecting wildlife habitats have a near intractable task of adequately patrolling an entire area (spanning several thousand kilometers) given limited resources, funds, and personnel at their disposal. Thus, agencies need predictive tools that are both high-performing and easily implementable by the user to help in learning how the significant features (e.g. animal population densities, topography, behavior patterns of the criminals within the area, etc) interact with each other in hopes of abating poaching. This research develops a classification model using machine learning algorithms to aid in forecasting future attacks that is both easy to train and performs well when compared to other models. In this research, we demonstrate how data imputation methods (specifically predictive mean matching, gradient boosting, and random forest multiple imputation) can be applied to analyze data and create significant predictions across a varied data set. Specifically, we apply these methods to improve the accuracy of adopted prediction models (Logistic Regression, Support Vector Machine, etc). Finally, we assess the performance of the model and the accuracy of our data imputation methods by learning on a real-world data set constituting four years of imputed data and testing on one year of non-imputed data. This paper provides three main contributions. First, we extend work done by the Teamcore and CREATE (Center for Risk and Economic Analysis of Terrorism Events) research group at the University of Southern California (USC) working in conjunction with the Department of Homeland Security to apply game theory and machine learning algorithms to develop more efficient ways of reducing poaching. This research introduces ensemble methods (Random Forests and Stochastic Gradient Boosting) and applies it to real-world poaching data gathered from the Ugandan rain forest park rangers. Next, we consider the effect of data imputation on both the performance of various algorithms and the general accuracy of the method itself when applied to a dependent variable where a large number of observations are missing. Third, we provide an alternate approach to predict the probability of observing poaching both by season and by month. The results from this research are very promising. We conclude that by using Stochastic Gradient Boosting to predict observations for non-commercial poaching by season, we are able to produce statistically equivalent results while being orders of magnitude faster in computation time and complexity. Additionally, when predicting potential poaching incidents by individual month vice entire seasons, boosting techniques produce a mean area under the curve increase of approximately 3% relative to previous prediction schedules by entire seasons.

Keywords: ensemble methods, imputation, machine learning, random forests, statistical analysis, stochastic gradient boosting, wildlife protection

Procedia PDF Downloads 263
6466 Lean Models Classification: Towards a Holistic View

Authors: Y. Tiamaz, N. Souissi

Abstract:

The purpose of this paper is to present a classification of Lean models which aims to capture all the concepts related to this approach and thus facilitate its implementation. This classification allows the identification of the most relevant models according to several dimensions. From this perspective, we present a review and an analysis of Lean models literature and we propose dimensions for the classification of the current proposals while respecting among others the axes of the Lean approach, the maturity of the models as well as their application domains. This classification allowed us to conclude that researchers essentially consider the Lean approach as a toolbox also they design their models to solve problems related to a specific environment. Since Lean approach is no longer intended only for the automotive sector where it was invented, but to all fields (IT, Hospital, ...), we consider that this approach requires a generic model that is capable of being implemented in all areas.

Keywords: lean approach, lean models, classification, dimensions, holistic view

Procedia PDF Downloads 406
6465 Short Answer Grading Using Multi-Context Features

Authors: S. Sharan Sundar, Nithish B. Moudhgalya, Nidhi Bhandari, Vineeth Vijayaraghavan

Abstract:

Automatic Short Answer Grading is one of the prime applications of artificial intelligence in education. Several approaches involving the utilization of selective handcrafted features, graphical matching techniques, concept identification and mapping, complex deep frameworks, sentence embeddings, etc. have been explored over the years. However, keeping in mind the real-world application of the task, these solutions present a slight overhead in terms of computations and resources in achieving high performances. In this work, a simple and effective solution making use of elemental features based on statistical, linguistic properties, and word-based similarity measures in conjunction with tree-based classifiers and regressors is proposed. The results for classification tasks show improvements ranging from 1%-30%, while the regression task shows a stark improvement of 35%. The authors attribute these improvements to the addition of multiple similarity scores to provide ensemble of scoring criteria to the models. The authors also believe the work could reinstate that classical natural language processing techniques and simple machine learning models can be used to achieve high results for short answer grading.

Keywords: artificial intelligence, intelligent systems, natural language processing, text mining

Procedia PDF Downloads 112
6464 The Effect of Particle Porosity in Mixed Matrix Membrane Permeation Models

Authors: Z. Sadeghi, M. R. Omidkhah, M. E. Masoomi

Abstract:

The purpose of this paper is to examine gas transport behavior of mixed matrix membranes (MMMs) combined with porous particles. Main existing models are categorized in two main groups; two-phase (ideal contact) and three-phase (non-ideal contact). A new coefficient, J, was obtained to express equations for estimating effect of the particle porosity in two-phase and three-phase models. Modified models evaluates with existing models and experimental data using Matlab software. Comparison of gas permeability of proposed modified models with existing models in different MMMs shows a better prediction of gas permeability in MMMs.

Keywords: mixed matrix membrane, permeation models, porous particles, porosity

Procedia PDF Downloads 347
6463 Using Swarm Intelligence to Forecast Outcomes of English Premier League Matches

Authors: Hans Schumann, Colin Domnauer, Louis Rosenberg

Abstract:

In this study, machine learning techniques were deployed on real-time human swarm data to forecast the likelihood of outcomes for English Premier League matches in the 2020/21 season. These techniques included ensemble models in combination with neural networks and were tested against an industry standard of Vegas Oddsmakers. Predictions made from the collective intelligence of human swarm participants managed to achieve a positive return on investment over a full season on matches, empirically proving the usefulness of a new artificial intelligence valuing human instinct and intelligence.

Keywords: artificial intelligence, data science, English Premier League, human swarming, machine learning, sports betting, swarm intelligence

Procedia PDF Downloads 180
6462 Exploring Students’ Visual Conception of Matter and Its Implications to Teaching and Learning Chemistry

Authors: Allen A. Espinosa, Arlyne C. Marasigan, Janir T. Datukan

Abstract:

The study explored how students visualize the states and classifications of matter using scientific models. It also identified misconceptions of students in using scientific models. In general, high percentage of students was able to use scientific models correctly and only a little misconception was identified. From the result of the study, a teaching framework was formulated wherein scientific models should be employed in classroom instruction to visualize abstract concepts in chemistry and for better conceptual understanding.

Keywords: visual conception, scientific models, mental models, states of matter, classification of matter

Procedia PDF Downloads 368
6461 Multi-Class Text Classification Using Ensembles of Classifiers

Authors: Syed Basit Ali Shah Bukhari, Yan Qiang, Saad Abdul Rauf, Syed Saqlaina Bukhari

Abstract:

Text Classification is the methodology to classify any given text into the respective category from a given set of categories. It is highly important and vital to use proper set of pre-processing , feature selection and classification techniques to achieve this purpose. In this paper we have used different ensemble techniques along with variance in feature selection parameters to see the change in overall accuracy of the result and also on some other individual class based features which include precision value of each individual category of the text. After subjecting our data through pre-processing and feature selection techniques , different individual classifiers were tested first and after that classifiers were combined to form ensembles to increase their accuracy. Later we also studied the impact of decreasing the classification categories on over all accuracy of data. Text classification is highly used in sentiment analysis on social media sites such as twitter for realizing people’s opinions about any cause or it is also used to analyze customer’s reviews about certain products or services. Opinion mining is a vital task in data mining and text categorization is a back-bone to opinion mining.

Keywords: Natural Language Processing, Ensemble Classifier, Bagging Classifier, AdaBoost

Procedia PDF Downloads 205
6460 Preserving Privacy in Workflow Delegation Models

Authors: Noha Nagy, Hoda Mokhtar, Mohamed El Sherkawi

Abstract:

The popularity of workflow delegation models and the increasing number of workflow provenance-aware systems motivate the need for finding more strict delegation models. Such models combine different approaches for enhanced security and respecting workflow privacy. Although modern enterprises seek conformance to workflow constraints to ensure correctness of their work, these constraints pose a threat to security, because these constraints can be good seeds for attacking privacy even in secure models. This paper introduces a comprehensive Workflow Delegation Model (WFDM) that utilizes provenance and workflow constraints to prevent malicious delegate from attacking workflow privacy as well as extending the delegation functionalities. In addition, we argue the need for exploiting workflow constraints to improve workflow security models.

Keywords: workflow delegation models, secure workflow, workflow privacy, workflow provenance

Procedia PDF Downloads 305
6459 A Method to Saturation Modeling of Synchronous Machines in d-q Axes

Authors: Mohamed Arbi Khlifi, Badr M. Alshammari

Abstract:

This paper discusses the general methods to saturation in the steady-state, two axis (d & q) frame models of synchronous machines. In particular, the important role of the magnetic coupling between the d-q axes (cross-magnetizing phenomenon), is demonstrated. For that purpose, distinct methods of saturation modeling of dumper synchronous machine with cross-saturation are identified, and detailed models synthesis in d-q axes. A number of models are given in the final developed form. The procedure and the novel models are verified by a critical application to prove the validity of the method and the equivalence between all developed models is reported. Advantages of some of the models over the existing ones and their applicability are discussed.

Keywords: cross-magnetizing, models synthesis, synchronous machine, saturated modeling, state-space vectors

Procedia PDF Downloads 425
6458 Ensemble Sampler For Infinite-Dimensional Inverse Problems

Authors: Jeremie Coullon, Robert J. Webber

Abstract:

We introduce a Markov chain Monte Carlo (MCMC) sam-pler for infinite-dimensional inverse problems. Our sam-pler is based on the affine invariant ensemble sampler, which uses interacting walkers to adapt to the covariance structure of the target distribution. We extend this ensem-ble sampler for the first time to infinite-dimensional func-tion spaces, yielding a highly efficient gradient-free MCMC algorithm. Because our ensemble sampler does not require gradients or posterior covariance estimates, it is simple to implement and broadly applicable. In many Bayes-ian inverse problems, Markov chain Monte Carlo (MCMC) meth-ods are needed to approximate distributions on infinite-dimensional function spaces, for example, in groundwater flow, medical imaging, and traffic flow. Yet designing efficient MCMC methods for function spaces has proved challenging. Recent gradi-ent-based MCMC methods preconditioned MCMC methods, and SMC methods have improved the computational efficiency of functional random walk. However, these samplers require gradi-ents or posterior covariance estimates that may be challenging to obtain. Calculating gradients is difficult or impossible in many high-dimensional inverse problems involving a numerical integra-tor with a black-box code base. Additionally, accurately estimating posterior covariances can require a lengthy pilot run or adaptation period. These concerns raise the question: is there a functional sampler that outperforms functional random walk without requir-ing gradients or posterior covariance estimates? To address this question, we consider a gradient-free sampler that avoids explicit covariance estimation yet adapts naturally to the covariance struc-ture of the sampled distribution. This sampler works by consider-ing an ensemble of walkers and interpolating and extrapolating between walkers to make a proposal. This is called the affine in-variant ensemble sampler (AIES), which is easy to tune, easy to parallelize, and efficient at sampling spaces of moderate dimen-sionality (less than 20). The main contribution of this work is to propose a functional ensemble sampler (FES) that combines func-tional random walk and AIES. To apply this sampler, we first cal-culate the Karhunen–Loeve (KL) expansion for the Bayesian prior distribution, assumed to be Gaussian and trace-class. Then, we use AIES to sample the posterior distribution on the low-wavenumber KL components and use the functional random walk to sample the posterior distribution on the high-wavenumber KL components. Alternating between AIES and functional random walk updates, we obtain our functional ensemble sampler that is efficient and easy to use without requiring detailed knowledge of the target dis-tribution. In past work, several authors have proposed splitting the Bayesian posterior into low-wavenumber and high-wavenumber components and then applying enhanced sampling to the low-wavenumber components. Yet compared to these other samplers, FES is unique in its simplicity and broad applicability. FES does not require any derivatives, and the need for derivative-free sam-plers has previously been emphasized. FES also eliminates the requirement for posterior covariance estimates. Lastly, FES is more efficient than other gradient-free samplers in our tests. In two nu-merical examples, we apply FES to challenging inverse problems that involve estimating a functional parameter and one or more scalar parameters. We compare the performance of functional random walk, FES, and an alternative derivative-free sampler that explicitly estimates the posterior covariance matrix. We conclude that FES is the fastest available gradient-free sampler for these challenging and multimodal test problems.

Keywords: Bayesian inverse problems, Markov chain Monte Carlo, infinite-dimensional inverse problems, dimensionality reduction

Procedia PDF Downloads 127
6457 Robot Spatial Reasoning via 3D Models

Authors: John Allard, Alex Rich, Iris Aguilar, Zachary Dodds

Abstract:

With this paper we present several experiences deploying novel, low-cost resources for computing with 3D spatial models. Certainly, computing with 3D models undergirds some of our field’s most important contributions to the human experience. Most often, those are contrived artifacts. This work extends that tradition by focusing on novel resources that deliver uncontrived models of a system’s current surroundings. Atop this new capability, we present several projects investigating the student-accessibility of the computational tools for reasoning about the 3D space around us. We conclude that, with current scaffolding, real-world 3D models are now an accessible and viable foundation for creative computational work.

Keywords: 3D vision, matterport model, real-world 3D models, mathematical and computational methods

Procedia PDF Downloads 510
6456 Uncertainty in Near-Term Global Surface Warming Linked to Pacific Trade Wind Variability

Authors: M. Hadi Bordbar, Matthew England, Alex Sen Gupta, Agus Santoso, Andrea Taschetto, Thomas Martin, Wonsun Park, Mojib Latif

Abstract:

Climate models generally simulate long-term reductions in the Pacific Walker Circulation with increasing atmospheric greenhouse gases. However, over two recent decades (1992-2011) there was a strong intensification of the Pacific Trade Winds that is linked with a slowdown in global surface warming. Using large ensembles of multiple climate models forced by increasing atmospheric greenhouse gas concentrations and starting from different ocean and/or atmospheric initial conditions, we reveal very diverse 20-year trends in the tropical Pacific climate associated with a considerable uncertainty in the globally averaged surface air temperature (SAT) in each model ensemble. This result suggests low confidence in our ability to accurately predict SAT trends over 20-year timescale only from external forcing. We show, however, that the uncertainty can be reduced when the initial oceanic state is adequately known and well represented in the model. Our analyses suggest that internal variability in the Pacific trade winds can mask the anthropogenic signal over a 20-year time frame, and drive transitions between periods of accelerated global warming and temporary slowdown periods.

Keywords: trade winds, walker circulation, hiatus in the global surface warming, internal climate variability

Procedia PDF Downloads 232
6455 Exploratory Study of Contemporary Models of Leadership

Authors: Gadah Alkeniah

Abstract:

Leadership is acknowledged internationally as fundamental to school efficiency and school enhancement nevertheless there are various understandings of what leadership is and how it is realised in practice. There are a number of educational leadership models that are considered important. However, the present study uses a systematic review method to examine and compare five models of the most well-known contemporary models of leadership as well as introduces the dimension of each model. Our results reveal that recently the distributed leadership has grown in popularity within the field of education. The study concludes by suggesting future directions in leadership development and education research.

Keywords: distributed leadership, instructional leadership, leadership models, moral leadership, strategic leadership, transformational leadership

Procedia PDF Downloads 176