Search results for: ensemble forecast
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 591

531 Time Series Modelling for Forecasting Wheat Production and Consumption of South Africa in Time of War

Authors: Yiseyon Hosu, Joseph Akande

Abstract:

Wheat has been one of the most important staple food grains for humans for centuries and is widely consumed in South Africa. It has a special place in the South African economy because of its significance for food security, trade, and industry. This paper models and forecasts the production and consumption of wheat in South Africa during COVID-19 and the ongoing Russia-Ukraine war, using annual time series data from 1940–2021 and ARIMA models. Both the averaged forecast and the forecasts from the selected models indicate a possible increase in production. The minimum and maximum growth in production are projected to be 3 million and 10 million tons, respectively. However, the model also forecasts a possible decline in consumption in South Africa. Although COVID-19 and the war between Ukraine and Russia, two major producers and exporters of global wheat, are currently affecting price volatility, wheat production in South Africa is expected to increase and meet consumption demand, providing an opportunity to increase exports relative to domestic consumption. Forecasting the production and consumption behaviour of major crops plays an important role in food and nutrition security. These findings can assist policymakers and provide them with insights into the production and pricing policy of wheat in South Africa.

Keywords: ARIMA, food security, price volatility, staple food, South Africa

Procedia PDF Downloads 75
530 Ethnic Identity Formation in Diaspora of Bajau Samah: An Ethnomusicological Study of Bertitik Music Ensemble in the Northwest Coast of Sabah, Malaysia

Authors: Mohd Hassan Abdullah, Mohd Azam Sulong, Mohd Nizam Nasrifan, Nor Azman Mohd Ramli, Suflan Faidzal Arshad

Abstract:

The Bajau Samah is a maritime ethnic community that inhabits the west coast of Sabah, Malaysia. The majority of its members embrace Islam and practice their own culture. The bertitik music ensemble is one of the musical practices performed at various social events, especially weddings. The ensemble, which combines several musical instruments including gongs, drums and kulintangan, is played by six musicians to accompany various social events in the community. The position of the Bajau Samah in a multi-ethnic community that includes the Kadazandusun, Rungus, Suluk, Malay, Iranun and others exposes it to cultural activities with various artistic elements of the surrounding communities. Western influences have also played an important role in the process of hybridity and acculturation in this society. Cultural change and the influx of foreign cultures have threatened the sustainability of this musical practice. This study aims to musicologically analyze the elements of the bertitik ensemble that form the uniqueness of the cultural identity of the Bajau Samah ethnic group. An ethnomusicological approach has been used to parse the essence of the bertitik music repertoire in depth. An ethnographic study design, comprising fieldwork, interviews, observations and document analysis as the main methods, was used to collect data. Music recordings were transcribed in the form of musical notation and then analyzed based on the theory of "the norms of musical styles". This study reveals that the musical elements featured in the ensemble represent the symbol and cultural identity of this ethnic group. The findings of the study were documented in the form of musicological analysis, audio and video, as well as transcriptions of the musical notation of the ensemble's repertoire. 
This study is in line with the National Cultural Policy gazetted by the government, which is "Conservation, preservation and development of culture towards strengthening the foundations of National Culture through joint research, development, education, expansion and cultural relations". It will benefit various parties, including students, teachers, academics, cultural arts activists and others, in preserving the nation's cultural heritage as well as strengthening the spirit of nationhood among the people of various races and ethnic groups in Malaysia.

Keywords: ethnomusicology, ethnic music, Malaysian music, cultural identity

Procedia PDF Downloads 115
529 Multiple Relaxation Times in the Gibbs Ensemble Monte Carlo Simulation of Phase Separation

Authors: Bina Kumari, Subir K. Sarkar, Pradipta Bandyopadhyay

Abstract:

The autocorrelation function of the density fluctuation is studied in each of the two phases in a Gibbs Ensemble Monte Carlo (GEMC) simulation of the problem of phase separation for a square well potential with various values of its range. We find that the normalized autocorrelation function is described very well as a linear combination of an exponential function with a time scale τ₂ and a stretched exponential function with a time scale τ₁ and an exponent α. Dependence of (α, τ₁, τ₂) on the parameters of the GEMC algorithm and the range of the square well potential is investigated and interpreted. We also analyse the issue of how to choose the parameters of the GEMC simulation optimally.
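The functional form described in this abstract can be written down in a few lines. The sketch below is illustrative only: the function name, the single mixing parameter `weight`, and the choice to make the two weights sum to one (so that the normalized function equals 1 at t = 0) are assumptions, not details taken from the paper.

```python
import math


def autocorrelation_model(t, weight, tau1, tau2, alpha):
    """Exponential plus stretched exponential, normalized so the value at t = 0 is 1.

    weight : share of the plain exponential term (assumed parametrization)
    tau1   : time scale of the stretched exponential
    tau2   : time scale of the plain exponential
    alpha  : stretching exponent
    """
    plain = math.exp(-t / tau2)
    stretched = math.exp(-((t / tau1) ** alpha))
    return weight * plain + (1.0 - weight) * stretched


# Evaluate the model on a few lag times with hypothetical parameter values.
values = [autocorrelation_model(t, 0.3, 5.0, 2.0, 0.5) for t in (0.0, 1.0, 10.0)]
```

In practice the four parameters would be fitted to the measured autocorrelation data; the fitting step is omitted here.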

Keywords: autocorrelation function, density fluctuation, GEMC, simulation

Procedia PDF Downloads 167
528 Design of an Ensemble Learning Behavior Anomaly Detection Framework

Authors: Abdoulaye Diop, Nahid Emad, Thierry Winter, Mohamed Hilia

Abstract:

Protecting data assets is a crucial issue in cybersecurity. Companies use logical access control tools to vault their information assets and protect them against external threats, but they lack solutions to counter insider threats. Nowadays, insider threats are the most significant concern of security analysts. Insiders are mainly individuals with legitimate access to a company's information systems who use their rights with malicious intent. In several fields, behavior anomaly detection is the method used by cyber specialists to counter the threat of malicious user activity effectively. In this paper, we present a step toward the construction of a user and entity behavior analysis framework by proposing a behavior anomaly detection model. This model combines machine learning classification techniques and graph-based methods, relying on linear algebra and parallel computing techniques. We show the utility of an ensemble learning approach in this context. We present test results for some detection methods on a representative access control dataset. Some of the explored classifiers achieve up to 99% accuracy.

Keywords: cybersecurity, data protection, access control, insider threat, user behavior analysis, ensemble learning, high performance computing

Procedia PDF Downloads 104
527 Objective-Based System Dynamics Modeling to Forecast the Number of Health Professionals in Pudong New Area of Shanghai

Authors: Jie Ji, Jing Xu, Yuehong Zhuang, Xiangqing Kang, Ying Qian, Ping Zhou, Di Xue

Abstract:

Background: In 2014, there were 28,341 health professionals in Pudong new area of Shanghai and the number per 1000 population was 5.199, 55.55% higher than that in 2006. But it was always less than the average number of health professionals per 1000 population in Shanghai from 2006 to 2014. Therefore, allocation planning for the health professionals in Pudong new area has become a high priority task in order to meet the future demands of health care. In this study, we constructed an objective-based system dynamics model to forecast the number of health professionals in Pudong new area of Shanghai in 2020. Methods: We collected the data from health statistics reports and previous survey of human resources in Pudong new area of Shanghai. Nine experts, who were from health administrative departments, public hospitals and community health service centers, were consulted to estimate the current and future status of nine variables used in the system dynamics model. Based on the objective of the number of health professionals per 1000 population (8.0) in Shanghai for 2020, the system dynamics model for health professionals in Pudong new area of Shanghai was constructed to forecast the number of health professionals needed in Pudong new area in 2020. Results: The system dynamics model for health professionals in Pudong new area of Shanghai was constructed. The model forecasted that there will be 37,330 health professionals (6.433 per 1000 population) in 2020. If the success rate of health professional recruitment changed from 20% to 70%, the number of health professionals per 1000 population would be changed from 5.269 to 6.919. If this rate changed from 20% to 70% and the success rate of building new beds changed from 5% to 30% at the same time, the number of health professionals per 1000 population would be changed from 5.269 to 6.923. Conclusions: The system dynamics model could be used to simulate and forecast the health professionals. 
But, if there were no significant changes in health policies and management system, the number of health professionals per 1000 population would not reach the objectives in Pudong new area in 2020.

Keywords: allocation planning, forecast, health professional, system dynamics

Procedia PDF Downloads 363
526 Pipat Ensemble and Music for Ligkey in Amphur Muaeng, Chachoengsao Province

Authors: Prasan Briboonnanggoul

Abstract:

The major objective of this research study was to explore some aspects of the performance culture of the musical folk drama called Ligkey. The study focused on the specific functions of the orchestra that accompanies Ligkey on Thai musical instruments in Chachoengsao Province. Data were collected through questionnaires, interviews, tape recordings of interviews and photographs of performances, all of which were analyzed to produce the findings. The information obtained indicated that Ligkey still received steady attention from people, despite fewer performances owing to the economic crisis. Almost all of the performances were organized and supported by both the public and private sectors. The findings were as follows: a) there were ten Ligkey ensembles and ten orchestras, all of which were Mon orchestras rather than their predecessor, known as the Thai orchestra; b) the various functions performed by musicians required discipline, punctuality, patience, diligence and proficiency in performance; c) folklore melodies known as Plengnapad were performed as usual, but folklore melodies and songs known as Plangsongchan were performed less often and tended towards extinction because the plots followed market-driven entertainment. Therefore, a purpose-built scheme for preserving Thai folklore songs would require that they be recognized by both performers and audiences and patronized by the public sector, with government media publicizing the value of this popular art form.

Keywords: Pipat Ensemble, Ligkey, Amphur Muaeng, Chachoengsao Province

Procedia PDF Downloads 308
525 SEMCPRA-Sar-Esembled Model for Climate Prediction in Remote Area

Authors: Kamalpreet Kaur, Renu Dhir

Abstract:

Climate prediction is an essential component of climate research, which helps evaluate possible effects on economies, communities, and ecosystems. Climate prediction involves short-term weather prediction, seasonal prediction, and long-term climate change prediction. It can use information gathered from satellites, ground-based stations, and ocean buoys, among other sources. In this paper, four architectures, ResNet50, VGG19, Inception-v3, and Xception, are combined using an ensemble approach to improve overall performance and robustness. Each model in the ensemble makes a prediction, and the majority vote determines the final prediction. The architectures classify the RSI-CB256 dataset, which contains satellite images, into cloudy and non-cloudy classes. The resulting ensembled S-E model (Sar-ensembled model) provides an accuracy of 99.25%.
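The majority-vote step mentioned in this abstract can be illustrated with a minimal sketch. The function name and the tie-breaking rule are assumptions: with four voters a 2-2 tie is possible, and the abstract does not say how ties are resolved, so this sketch simply returns the tied label that appears first in the input.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most models.

    Ties go to the tied label seen first in `predictions`
    (an assumed rule, not taken from the paper).
    """
    counts = Counter(predictions)
    return counts.most_common(1)[0][0]


# Hypothetical per-model outputs for one satellite image, in the order
# ResNet50, VGG19, Inception-v3, Xception.
votes = ["cloudy", "cloudy", "non-cloudy", "cloudy"]
final_label = majority_vote(votes)  # "cloudy"
```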

Keywords: climate, satellite images, prediction, classification

Procedia PDF Downloads 45
524 Performance Assessment of Multi-Level Ensemble for Multi-Class Problems

Authors: Rodolfo Lorbieski, Silvia Modesto Nassar

Abstract:

Many supervised machine learning tasks require decision making across numerous different classes. Multi-class classification has several applications, such as face recognition, text recognition and medical diagnostics. The objective of this article is to analyze an adapted method of Stacking in multi-class problems, which combines ensembles within the ensemble itself. For this purpose, a training similar to Stacking was used, but with three levels, where the final decision-maker (level 2) performs its training by combining outputs from the tree-based pair of meta-classifiers (level 1) from Bayesian families. These are in turn trained by pairs of base classifiers (level 0) of the same family. This strategy seeks to promote diversity among the ensembles forming the meta-classifier level 2. Three performance measures were used: (1) accuracy, (2) area under the ROC curve, and (3) time for three factors: (a) datasets, (b) experiments and (c) levels. To compare the factors, ANOVA three-way test was executed for each performance measure, considering 5 datasets by 25 experiments by 3 levels. A triple interaction between factors was observed only in time. The accuracy and area under the ROC curve presented similar results, showing a double interaction between level and experiment, as well as for the dataset factor. It was concluded that level 2 had an average performance above the other levels and that the proposed method is especially efficient for multi-class problems when compared to binary problems.

Keywords: stacking, multi-layers, ensemble, multi-class

Procedia PDF Downloads 251
523 Decision Trees Constructing Based on K-Means Clustering Algorithm

Authors: Loai Abdallah, Malik Yousef

Abstract:

A domain space for the data should reflect the actual similarity between objects, since objects belonging to the same cluster usually share some common traits even though their geometric distance might be relatively large. In general, the Euclidean distance between data points that are represented by a large number of features does not capture the actual relation between those points. In this study, we propose a new method to construct a different space that is based on clustering to form a new distance metric. The new distance space is based on ensemble clustering (EC). The EC distance space is defined by tracking the membership of the points over multiple runs of a clustering algorithm. Over this distance, we train a decision tree classifier (DT-EC). The results obtained by applying DT-EC to 10 datasets confirm our hypothesis that embedding the EC space as a distance metric improves performance.
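One plausible reading of "tracking the membership of the points over multiple runs" is a co-association distance: the fraction of clustering runs in which two points land in different clusters. The sketch below assumes that definition; the paper's exact formula may differ, and the function name and input layout are hypothetical.

```python
def ec_distance(labelings, i, j):
    """Fraction of clustering runs in which points i and j fall in different clusters.

    labelings : list of runs, each run a list of cluster labels, one per point
    """
    disagreements = sum(1 for labels in labelings if labels[i] != labels[j])
    return disagreements / len(labelings)


# Three hypothetical clustering runs over three points.
runs = [
    [0, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
]
d01 = ec_distance(runs, 0, 1)  # points 0 and 1 are separated in one run of three
d02 = ec_distance(runs, 0, 2)  # points 0 and 2 are separated in every run
```

Under this definition, points that are far apart in Euclidean terms but repeatedly share a cluster end up close in the EC space.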

Keywords: ensemble clustering, decision trees, classification, K nearest neighbors

Procedia PDF Downloads 165
522 D-Wave Quantum Computing Ising Model: A Case Study for Forecasting of Heat Waves

Authors: Dmytro Zubov, Francesco Volponi

Abstract:

In this paper, D-Wave quantum computing Ising model is used for the forecasting of positive extremes of daily mean air temperature. Forecast models are designed with two to five qubits, which represent 2-, 3-, 4-, and 5-day historical data respectively. Ising model’s real-valued weights and dimensionless coefficients are calculated using daily mean air temperatures from 119 places around the world, as well as sea level (Aburatsu, Japan). In comparison with current methods, this approach is better suited to predict heat wave values because it does not require the estimation of a probability distribution from scarce observations. Proposed forecast quantum computing algorithm is simulated based on traditional computer architecture and combinatorial optimization of Ising model parameters for the Ronald Reagan Washington National Airport dataset with 1-day lead-time on learning sample (1975-2010 yr). Analysis of the forecast accuracy (ratio of successful predictions to total number of predictions) on the validation sample (2011-2014 yr) shows that Ising model with three qubits has 100 % accuracy, which is quite significant as compared to other methods. However, number of identified heat waves is small (only one out of nineteen in this case). Other models with 2, 4, and 5 qubits have 20 %, 3.8 %, and 3.8 % accuracy respectively. Presented three-qubit forecast model is applied for prediction of heat waves at other five locations: Aurel Vlaicu, Romania – accuracy is 28.6 %; Bratislava, Slovakia – accuracy is 21.7 %; Brussels, Belgium – accuracy is 33.3 %; Sofia, Bulgaria – accuracy is 50 %; Akhisar, Turkey – accuracy is 21.4 %. These predictions are not ideal, but not zeros. They can be used independently or together with other predictions generated by different method(s). The loss of human life, as well as environmental, economic, and material damage, from extreme air temperatures could be reduced if some of heat waves are predicted. 
Even a small success rate implies a large socio-economic benefit.

Keywords: heat wave, D-wave, forecast, Ising model, quantum computing

Procedia PDF Downloads 476
521 Machine Learning Model to Predict TB Bacteria-Resistant Drugs from TB Isolates

Authors: Rosa Tsegaye Aga, Xuan Jiang, Pavel Vazquez Faci, Siqing Liu, Simon Rayner, Endalkachew Alemu, Markos Abebe

Abstract:

Tuberculosis (TB) is a major cause of disease globally. In most cases, TB is treatable and curable, but only with the proper treatment. Drug-resistant TB occurs when bacteria become resistant to the drugs that are used to treat TB. Current strategies to identify drug-resistant TB bacteria are laboratory-based, and it takes a long time to identify the drug-resistant bacteria and treat the patient accordingly. Machine learning (ML) and data science can offer new approaches to the problem. In this study, we propose to develop an ML-based model to predict the antibiotic resistance phenotypes of TB isolates in minutes so that the right treatment can be given to the patient immediately. The study uses whole genome sequences (WGS) of TB isolates, extracted from the NCBI repository and containing samples from different countries, as training data to build the ML models. Samples from different countries are included so that the model generalizes to the large group of TB isolates from different regions of the world. This helps the model learn different behaviors of the TB bacteria and makes it robust. Model training considers three pieces of information extracted from the WGS data. These are all variants found within the candidate genes (F1), predetermined resistance-associated variants (F2), and resistance-associated gene information for the particular drug. Two major datasets have been constructed using these three pieces of information. F1 and F2 are treated as two independent datasets, and the third piece of information is used as the class to label the two datasets. Five machine learning algorithms have been considered to train the model. These are Support Vector Machine (SVM), Random Forest (RF), Logistic Regression (LR), Gradient Boosting, and AdaBoost. 
The models have been trained on the datasets F1, F2, and F1F2, which is the F1 and F2 datasets merged. Additionally, an ensemble approach has been used to train the model. The ensemble approach runs the F1 and F2 datasets through the gradient boosting algorithm, uses the output as one dataset, called the F1F2 ensemble dataset, and trains a model on this dataset with each of the five algorithms. As the experiment shows, the ensemble approach model trained with the Gradient Boosting algorithm outperformed the rest of the models. In conclusion, this study suggests the ensemble approach, that is, the RF + Gradient boosting model, to predict the antibiotic resistance phenotypes of TB isolates, as it outperforms the rest of the models.

Keywords: machine learning, MTB, WGS, drug resistant TB

Procedia PDF Downloads 28
520 Optimizing Approach for Sifting Process to Solve a Common Type of Empirical Mode Decomposition Mode Mixing

Authors: Saad Al-Baddai, Karema Al-Subari, Elmar Lang, Bernd Ludwig

Abstract:

Empirical mode decomposition (EMD), a new data-driven method of time-series decomposition, has the advantage of not assuming that a time series is linear or stationary, as is implicitly assumed in Fourier decomposition. However, EMD suffers from the mode mixing problem in some cases. The aim of this paper is to present a solution for a common type of signal that causes EMD mode mixing, namely a signal that suffers from intermittency. On an artificial example, the solution shows superior performance in coping with the EMD mode mixing problem compared with conventional EMD and ensemble empirical mode decomposition (EEMD). Furthermore, the over-sifting problem is completely avoided, and the computational load is reduced roughly six times compared with EEMD with an ensemble number of 50.

Keywords: empirical mode decomposition (EMD), mode mixing, sifting process, over-sifting

Procedia PDF Downloads 367
519 Forecasting the Influences of Information and Communication Technology on the Structural Changes of Japanese Industrial Sectors: A Study Using Statistical Analysis

Authors: Ubaidillah Zuhdi, Shunsuke Mori, Kazuhisa Kamegai

Abstract:

The purpose of this study is to forecast the influences of Information and Communication Technology (ICT) on the structural changes of Japanese economies based on Leontief Input-Output (IO) coefficients. This study establishes a statistical analysis to predict the future interrelationships among industries. We employ the Constrained Multivariate Regression (CMR) model to analyze the historical changes of input-output coefficients. Statistical significance of the model is then tested by Likelihood Ratio Test (LRT). In our model, ICT is represented by two explanatory variables, i.e. computers (including main parts and accessories) and telecommunications equipment. A previous study, which analyzed the influences of these variables on the structural changes of Japanese industrial sectors from 1985-2005, concluded that these variables had significant influences on the changes in the business circumstances of Japanese commerce, business services and office supplies, and personal services sectors. The projected future Japanese economic structure based on the above forecast generates the differentiated direct and indirect outcomes of ICT penetration.

Keywords: forecast, ICT, industrial structural changes, statistical analysis

Procedia PDF Downloads 356
518 The Design of a Vehicle Traffic Flow Prediction Model for a Gauteng Freeway Based on an Ensemble of Multi-Layer Perceptron

Authors: Tebogo Emma Makaba, Barnabas Ndlovu Gatsheni

Abstract:

The cities of Johannesburg and Pretoria, both located in Gauteng province, are separated by a distance of 58 km. The traffic queues on the Ben Schoeman freeway, which connects these two cities, can stretch for almost 1.5 km. Vehicle traffic congestion impacts negatively on business and on commuters’ quality of life. The goal of this paper is to identify variables that influence the flow of traffic and to design a vehicle traffic prediction model that predicts the traffic flow pattern in advance. The model will enable motorists to make appropriate travel decisions ahead of time. The data used was collected by Mikro’s Traffic Monitoring (MTM). A multi-layer perceptron (MLP) was used individually to construct the model, and the MLP was also combined with the bagging ensemble method to train on the data. Cross-validation was used to evaluate the models. The results obtained from the techniques were compared using predictive and prediction costs. The cost was computed using a combination of the loss matrix and the confusion matrix. The designed models show that the status of the traffic flow on the freeway can be predicted using the following parameters: travel time, average speed, traffic volume and day of month. The implication of this work is that commuters will be able to spend less time travelling on the route and more time with their families. The logistics industry will save more than twice what they are currently spending.
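A common way to combine a loss matrix with a confusion matrix is to weight each cell count by the cost of that kind of error and sum the results. The abstract does not give the exact formula, so the sketch below assumes this elementwise weighting; the function name and all numbers are hypothetical.

```python
def prediction_cost(confusion, loss):
    """Total cost: each confusion-matrix count multiplied by its loss-matrix entry.

    confusion[i][j] : number of samples of true class i predicted as class j
    loss[i][j]      : cost of predicting class j when the true class is i
    """
    return sum(
        count * cost
        for count_row, cost_row in zip(confusion, loss)
        for count, cost in zip(count_row, cost_row)
    )


# Hypothetical two-class example (free-flowing vs congested traffic).
confusion = [[50, 5],
             [10, 35]]
loss = [[0, 1],   # a false alarm costs 1 unit
        [4, 0]]   # a missed congestion event costs 4 units
total_cost = prediction_cost(confusion, loss)  # 5 * 1 + 10 * 4 = 45
```

Comparing this total across models favours the one whose errors are cheapest, which need not be the one with the highest accuracy.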

Keywords: bagging ensemble methods, confusion matrix, multi-layer perceptron, vehicle traffic flow

Procedia PDF Downloads 320
517 Enhancing Sell-In and Sell-Out Forecasting Using Ensemble Machine Learning Method

Authors: Vishal Das, Tianyi Mao, Zhicheng Geng, Carmen Flores, Diego Pelloso, Fang Wang

Abstract:

Accurate sell-in and sell-out forecasting is a ubiquitous problem in the retail industry. It is an important element of any demand planning activity. As a global food and beverage company, Nestlé has hundreds of products in each geographical location that they operate in. Each product has its sell-in and sell-out time series data, which are forecasted on a weekly and monthly scale for demand and financial planning. To address this challenge, Nestlé Chile, in collaboration with Amazon Machine Learning Solutions Lab, has developed their in-house solution of using machine learning models for forecasting. Similar products are combined together such that there is one model for each product category. In this way, the models learn from a larger set of data, and there are fewer models to maintain. The solution is scalable to all product categories and is developed to be flexible enough to include any new product or eliminate any existing product in a product category based on requirements. We show how we can use the machine learning development environment on Amazon Web Services (AWS) to explore a set of forecasting models and create business intelligence dashboards that can be used with the existing demand planning tools in Nestlé. We explored recent deep learning networks (DNN), which show promising results for a variety of time series forecasting problems. Specifically, we used a DeepAR autoregressive model that can group similar time series together and provide robust predictions. To further enhance the accuracy of the predictions and include domain-specific knowledge, we designed an ensemble approach using DeepAR and XGBoost regression model. As part of the ensemble approach, we interlinked the sell-out and sell-in information to ensure that a future sell-out influences the current sell-in predictions. Our approach outperforms the benchmark statistical models by more than 50%. 
The machine learning (ML) pipeline implemented in the cloud is currently being extended for other product categories and is getting adopted by other geomarkets.

Keywords: sell-in and sell-out forecasting, demand planning, DeepAR, retail, ensemble machine learning, time-series

Procedia PDF Downloads 222
516 Using Gaussian Process in Wind Power Forecasting

Authors: Hacene Benkhoula, Mohamed Badreddine Benabdella, Hamid Bouzeboudja, Abderrahmane Asraoui

Abstract:

Wind is a random variable that is difficult to master; for this reason, we developed mathematical and statistical methods for modeling and forecasting wind power. Gaussian processes (GP) are one of the most widely used families of stochastic processes for modeling dependent data observed over time, over space, or over both. GP is an underlying process formed by unrecognized operator’s uses to solve a problem. The purpose of this paper is to present how to forecast wind power using the GP. The Gaussian process method for forecasting is presented. To validate the presented approach, a simulation in the MATLAB environment is given.

Keywords: wind power, Gaussian process, modelling, forecasting

Procedia PDF Downloads 383
515 The Role of Inventory Classification in Supply Chain Responsiveness in a Build-to-Order and Build-To-Forecast Manufacturing Environment: A Comparative Analysis

Authors: Qamar Iqbal

Abstract:

Companies strive to improve their forecasting methods to predict fluctuations in customer demand. These fluctuations and variations in demand affect manufacturing operations and can limit a company’s ability to fulfill customer demand on time. Companies keep an inventory buffer and maintain stocking levels to reduce the impact of demand variation. A mid-size company deals with thousands of stock keeping units (SKUs). It is neither easy nor efficient to control and manage each SKU. Inventory classification gives management a tool to increase its ability to support customer demand. The paper presents a framework that shows how inventory classification can help increase supply chain responsiveness. A case study is presented to further elaborate the method for both build-to-order and build-to-forecast manufacturing environments. The results are compared to show which manufacturing setting has the advantage over the other under different circumstances. The outcome of this study is useful to management because it gives insight into how inventory classification can be used to increase the ability to respond to changing customer needs.

Keywords: inventory classification, supply chain responsiveness, forecast, manufacturing environment

Procedia PDF Downloads 577
514 Natural Gas Production Forecasts Using Diffusion Models

Authors: Md. Abud Darda

Abstract:

Different options for natural gas production in wide geographic areas may be described through diffusion of innovation models. This type of modeling approach provides an indirect estimate of the ultimately recoverable resource (URR), captures the quantitative effects of observed strategic interventions, and allows ex-ante assessments of future scenarios over time. In order to ensure a sustainable energy policy, it is important to forecast the availability of this natural resource. Considering a finite life cycle, in this paper we investigate the natural gas production of Myanmar and Algeria, two important natural gas providers in the world energy market. A number of homogeneous and heterogeneous diffusion models, with convenient extensions, have been used. Model validation has also been performed in terms of prediction capability.
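The abstract does not name the specific diffusion models used. As an illustration only, the sketch below uses the standard Bass model, a well-known diffusion of innovation model, in which the saturation level `m` plays the role of the URR. The function name and the parameter values are hypothetical.

```python
import math


def bass_cumulative(t, m, p, q):
    """Cumulative production at time t under the standard Bass diffusion model.

    m : saturation level (interpreted here as the URR)
    p : innovation coefficient
    q : imitation coefficient
    """
    decay = math.exp(-(p + q) * t)
    return m * (1.0 - decay) / (1.0 + (q / p) * decay)


# Hypothetical parameters: cumulative output starts at zero and saturates at m.
m, p, q = 100.0, 0.01, 0.2
trajectory = [bass_cumulative(t, m, p, q) for t in range(0, 60, 10)]
```

Fitting `m`, `p` and `q` to observed cumulative production yields the indirect URR estimate the abstract refers to; the fitting step is omitted here.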

Keywords: diffusion models, energy forecast, natural gas, nonlinear production

Procedia PDF Downloads 207
513 A Dataset of Program Educational Objectives Mapped to ABET Outcomes: Data Cleansing, Exploratory Data Analysis and Modeling

Authors: Addin Osman, Anwar Ali Yahya, Mohammed Basit Kamal

Abstract:

Datasets or collections are becoming important assets in themselves and can now be accepted as a primary intellectual output of research. The quality and usage of datasets depend mainly on the context in which they have been collected, processed, analyzed, validated, and interpreted. This paper presents a collection of program educational objectives mapped to student outcomes, collected from self-study reports prepared by 32 engineering programs accredited by ABET. The manual mapping (classification) of this data is a notoriously tedious, time-consuming process. In addition, it requires experts in the area, who are mostly not available. The operational settings under which the collection has been produced are described. The collection has been cleansed and preprocessed, some features have been selected, and preliminary exploratory data analysis has been performed to illustrate the properties and usefulness of the collection. Finally, the collection has been benchmarked using nine of the most widely used supervised multiclass classification techniques (Binary Relevance, Label Powerset, Classifier Chains, Pruned Sets, Random k-label sets, Ensemble of Classifier Chains, Ensemble of Pruned Sets, Multi-Label k-Nearest Neighbors and Back-Propagation Multi-Label Learning). The techniques have been compared to each other using five well-known measurements (Accuracy, Hamming Loss, Micro-F, Macro-F, and Macro-F). The Ensemble of Classifier Chains and Ensemble of Pruned Sets have achieved encouraging performance compared to the other multi-label classification methods tested. The Classifier Chains method has shown the worst performance. To recap, the benchmark has achieved promising results by utilizing preliminary exploratory data analysis performed on the collection, proposing new trends for research and providing a baseline for future studies.
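Of the measurements listed in this abstract, Hamming loss is the simplest to state: it is the fraction of individual label slots that are predicted incorrectly. The sketch below is a minimal stdlib version; the function name and the example label matrices are hypothetical.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of label slots where the prediction differs from the truth.

    y_true, y_pred : lists of equal-length binary label vectors, one per sample
    """
    total = 0
    wrong = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for true_label, pred_label in zip(true_row, pred_row):
            total += 1
            if true_label != pred_label:
                wrong += 1
    return wrong / total


# Two hypothetical objectives, each mapped to three possible outcomes.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 1, 1], [0, 1, 1]]
loss_value = hamming_loss(y_true, y_pred)  # 2 wrong slots out of 6
```

Lower is better, and unlike subset accuracy it gives partial credit when only some of a sample's labels are wrong.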

Keywords: ABET, accreditation, benchmark collection, machine learning, program educational objectives, student outcomes, supervised multi-class classification, text mining

Procedia PDF Downloads 148
512 Evaluation of Machine Learning Algorithms and Ensemble Methods for Prediction of Students’ Graduation

Authors: Soha A. Bahanshal, Vaibhav Verdhan, Bayong Kim

Abstract:

Graduation rates at six-year colleges are becoming an increasingly essential indicator for incoming students and for university rankings. Predicting student graduation is extremely beneficial to schools and has huge potential for targeted intervention. It is important for educational institutions because it enables the development of strategic plans that help students improve their performance and graduate on time (GOT). Machine learning techniques offer a first step in extracting useful information from these data and gaining insights into the prediction of students' progress and performance. Data analysis and visualization techniques are applied to understand and interpret the data. The data used for the analysis covers science majors who graduated within 6 years in the academic year 2017-2018. This analysis can be used to predict the graduation of students in the next academic year. Several predictive models, such as logistic regression, decision trees, support vector machines, Random Forest, Naïve Bayes, and KNeighborsClassifier, are applied to predict whether a student will graduate. These classifiers were evaluated with 5-fold cross-validation, and their performance was compared on accuracy. The results indicated that the Ensemble Classifier achieves better accuracy, about 91.12%. This GOT prediction model could help university administrators and academics develop measures for assisting and boosting students' academic performance and ensuring they graduate on time.
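The evaluation protocol described above can be sketched as follows. The abstract does not say how the ensemble combines its member classifiers, so majority voting is an assumption here, as are all function names; this is a minimal stdlib sketch, not the authors' pipeline.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds; return (train, test) pairs."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return [
        ([i for j, fold in enumerate(folds) if j != t for i in fold], folds[t])
        for t in range(k)
    ]


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine predictions (one row per model) by majority vote per sample."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*predictions)]


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of samples whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In practice each fold's training indices would be used to fit every member model, the held-out fold would be scored with `accuracy`, and the five scores would be averaged.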

Keywords: prediction, decision trees, machine learning, support vector machine, ensemble model, student graduation, GOT graduate on time

Procedia PDF Downloads 58
511 Load Forecasting Using Neural Network Integrated with Economic Dispatch Problem

Authors: Mariyam Arif, Ye Liu, Israr Ul Haq, Ahsan Ashfaq

Abstract:

The high cost of fossil fuels and the growing number of alternative energy generation installations pose major challenges for power systems, making accurate load forecasting an important and challenging task for optimal energy planning and management on both the distribution and generation sides. There are many techniques for forecasting load, but each comes with its own limitations and requires data to predict the load accurately. The Artificial Neural Network (ANN) is one such technique for forecasting load efficiently. Two different ranges of input datasets were applied to a dynamic ANN using the MATLAB Neural Network Toolbox and compared. The selection of input data for training the network was observed to have a significant effect on the forecast results: day-wise input data forecast the load more accurately than year-wise input data. The forecast load is then distributed among the six generators using linear programming to obtain the optimal point of generation. The algorithm is then verified by comparing the result for each generator with its generation limits.

Keywords: artificial neural networks, demand-side management, economic dispatch, linear programming, power generation dispatch

Procedia PDF Downloads 168
510 Stacking Ensemble Approach for Combining Different Methods in Real Estate Prediction

Authors: Sol Girouard, Zona Kostic

Abstract:

A home is often the largest and most expensive purchase a person makes. Whether the decision leads to a successful outcome is determined by a combination of critical factors. In this paper, we propose a method that efficiently handles all the factors in residential real estate and performs predictions given a feature space with high dimensionality while controlling for overfitting. The proposed method is built on gradient descent and boosting algorithms and uses a mixed optimizing technique to improve predictive power. Usually, a single model cannot handle all the cases; our approach therefore builds multiple models based on different subsets of the predictors. The algorithm was tested on 3 million homes across the U.S., and the experimental results demonstrate the efficiency of this approach, which outperforms techniques currently used in forecasting prices. With everyday changes in the real estate market, our proposed algorithm capitalizes on new events, allowing more efficient predictions.

Keywords: real estate prediction, gradient descent, boosting, ensemble methods, active learning, training

Procedia PDF Downloads 253
509 Forecasting Age-Specific Mortality Rates and Life Expectancy at Births for Malaysian Sub-Populations

Authors: Syazreen N. Shair, Saiful A. Ishak, Aida Y. Yusof, Azizah Murad

Abstract:

In this paper, we forecast age-specific Malaysian mortality rates and life expectancy at birth by gender and ethnic group, including Malay, Chinese and Indian. Two mortality forecasting models are adopted: the original Lee-Carter model and its recent modified version, the product ratio coherent model. While the former forecasts the mortality rates for each subpopulation independently, the latter accounts for the relationship between subpopulations. Both models are evaluated using out-of-sample forecast errors, namely mean absolute percentage errors (MAPE) for mortality rates and mean forecast errors (MFE) for life expectancy at birth. The best model is then used to produce long-term forecasts up to the year 2030, the year when Malaysia is expected to become an aged nation. Results suggest that, in terms of overall accuracy, the product ratio model performs better than the original Lee-Carter model. Including the lower-mortality group (Chinese) in the subpopulation model can improve the forecasts for the higher-mortality groups (Malay and Indian).
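The two error measures named in the abstract can be written down in a few lines. This is a minimal sketch; the function names are hypothetical, and the sign convention for MFE (actual minus forecast, so a positive value means under-forecasting) is an assumption, since the abstract does not state one.

```python
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def mfe(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean forecast error (actual - forecast); measures bias, not spread."""
    return sum(a - f for a, f in zip(actual, forecast)) / len(actual)
```

MAPE suits mortality rates because they span several orders of magnitude across ages, whereas MFE on life expectancy shows whether a model is systematically too high or too low.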

Keywords: coherent forecasts, life expectancy at births, Lee-Carter model, product-ratio model, mortality rates

Procedia PDF Downloads 200
508 Volatility Model with Markov Regime Switching to Forecast Baht/USD

Authors: Nop Sopipan

Abstract:

In this paper, we forecast the volatility of the Baht/USD exchange rate using Markov Regime Switching GARCH (MRS-GARCH) models. These models allow volatility to have different dynamics according to unobserved regime variables. The main purpose of this paper is to find out whether MRS-GARCH models improve on GARCH-type models in modeling and forecasting Baht/USD volatility. The MRS-GARCH model performs best for Baht/USD volatility in the short term, but the GARCH model performs best in the long term.

Keywords: volatility, Markov Regime Switching, forecasting, Baht/USD

Procedia PDF Downloads 286
507 An Intrusion Detection Systems Based on K-Means, K-Medoids and Support Vector Clustering Using Ensemble

Authors: A. Mohammadpour, Ebrahim Najafi Kajabad, Ghazale Ipakchi

Abstract:

Computer network security is growing in importance, and many studies have been conducted in this field. As internet networks penetrate different fields, much needs to be done to provide secure industrial and non-industrial networks. Firewalls, appropriate Intrusion Detection Systems (IDS), encryption protocols for sending and receiving information, and the use of authentication certificates are among the things that should be considered for system security. The aim of the present study is to combine the outcomes of several algorithms to reduce IDS errors in a way that improves system security and avoids additional load on the system. Finally, from the obtained results, we can also detect the amount and percentage of more sub-attacks. By running the proposed system, which is based on the multi-algorithm outcome, and comparing it with the single-algorithm methods, we observed a 78.64% attack detection result, an improvement of 3.14% over the single algorithms.

Keywords: intrusion detection systems, clustering, k-means, k-medoids, SV clustering, ensemble

Procedia PDF Downloads 199
506 Multi-Sensor Target Tracking Using Ensemble Learning

Authors: Bhekisipho Twala, Mantepu Masetshaba, Ramapulana Nkoana

Abstract:

Multiple classifier systems combine several individual classifiers to deliver a final classification decision. However, an increasingly controversial question is whether such systems can outperform the single best classifier and, if so, what form of multiple classifier system yields the most significant benefit. In addition, multi-target tracking and detection using multiple sensors is an important research field in mobile techniques and military applications. In this paper, several multiple classifier systems are evaluated in terms of their ability to predict a system's failure or success in multi-sensor target tracking tasks. The Bristol Eden project dataset is utilised for this task. Experimental and simulation results show that the human activity identification system can fulfill the requirements of target tracking owing to improved sensor classification performance, with multiple classifier systems constructed using boosting achieving higher accuracy rates.

Keywords: single classifier, ensemble learning, multi-target tracking, multiple classifiers

Procedia PDF Downloads 240
505 Forecasting Lake Malawi Water Level Fluctuations Using Stochastic Models

Authors: M. Mulumpwa, W. W. L. Jere, M. Lazaro, A. H. N. Mtethiwa

Abstract:

The study considered Seasonal Autoregressive Integrated Moving Average (SARIMA) processes to select an appropriate stochastic model for forecasting monthly Lake Malawi water levels for the period 1986 through 2015. The appropriate model was chosen from the SARIMA (p, d, q) (P, D, Q)S family. The autocorrelation function (ACF), partial autocorrelation function (PACF), Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Box–Ljung statistics, correlogram and distribution of residual errors were estimated. The SARIMA (1, 1, 0) (1, 1, 1)12 model was selected to forecast the monthly Lake Malawi water levels from August 2015 to December 2021. The plotted time series showed that the Lake Malawi water levels have been decreasing since 2010, but not as much as in 1995 through 1997. The forecast of the Lake Malawi water levels until 2021 showed a mean of 474.47 m, ranging from 473.93 m to 475.02 m with confidence intervals of 80% and 90%, against registered means of 473.398 m in 1997 and 475.475 m in 1989, which were the lowest and highest water levels in the lake, respectively, since 1986. The forecast also showed that the water levels of Lake Malawi will drop by 0.57 m compared with the mean water levels recorded in previous years. These results suggest that the Lake Malawi water level is unlikely to fall below that recorded in 1997. Therefore, the utilisation and management of water-related activities and programs on the lake should provide room for such scenarios. The findings suggest a need to manage Lake Malawi jointly and prudently with other stakeholders, starting from the catchment area. This will reduce the impacts of anthropogenic activities on the lake’s water quality, water level, and aquatic and adjacent terrestrial ecosystems, thereby ensuring its resilience to climate change impacts.
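For reference, the selected SARIMA (1, 1, 0) (1, 1, 1)12 model can be written in backshift-operator form. The symbols below are not given in the abstract and are introduced here as assumptions: y_t is the monthly water level, B the backshift operator (B y_t = y_{t-1}), phi_1 and Phi_1 the non-seasonal and seasonal autoregressive coefficients, Theta_1 the seasonal moving-average coefficient, and epsilon_t white noise. The plus sign on the moving-average term follows one common convention; some texts use a minus sign.

```latex
\[
(1 - \phi_1 B)\,(1 - \Phi_1 B^{12})\,(1 - B)\,(1 - B^{12})\, y_t
  = (1 + \Theta_1 B^{12})\,\varepsilon_t
\]
```

The factors (1 - B) and (1 - B^{12}) are the first-order non-seasonal and seasonal differences (d = 1, D = 1, S = 12), and there is no non-seasonal moving-average factor because q = 0.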

Keywords: forecasting, Lake Malawi, water levels, water level fluctuation, climate change, anthropogenic activities

Procedia PDF Downloads 204
504 Demand Forecasting to Reduce Dead Stock and Loss Sales: A Case Study of the Wholesale Electric Equipment and Part Company

Authors: Korpapa Srisamai, Pawee Siriruk

Abstract:

The purpose of this study is to forecast product demand and develop appropriate and adequate procurement plans to meet customer needs and reduce costs. When stock exceeds customer demand or does not move, the company has to cope with insufficient storage space. Moreover, some items deteriorate into dead stock when stored for a long period of time. A case study of a wholesale company of electronic equipment and components, which faces uncertain customer demand, is considered. The actual purchase orders of customers do not match the forecasts the customers provide. In some cases, customers demand more than estimated, leaving the company with insufficient product to meet their needs. Other customers demand less than estimated, causing insufficient storage space and dead stock. This study aims to reduce the loss of sales opportunities and the number of goods remaining in the warehouse, using a sample of 30 of the company's most popular products. The data were collected from January to October 2022. The forecasting methods used are the simple moving average, the weighted moving average, and exponential smoothing. The economic order quantity and reorder point are then calculated to meet customer needs and to track results. The research results are very beneficial to the company. The company can reduce the loss of sales opportunities by 20%, so that it has enough products to meet customer needs, and can reduce unused products (dead stock) by up to 10%. This enables the company to order products more accurately, increasing profits and storage space.
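The forecasting and inventory formulas named in the abstract are standard and can be sketched briefly. The function names, parameter names and sample numbers below are hypothetical; the abstract does not give the window sizes, weights, smoothing constant or cost figures the authors used.

```python
import math
from typing import Sequence


def simple_moving_average(demand: Sequence[float], window: int) -> float:
    """Forecast next period as the mean of the last `window` observations."""
    return sum(demand[-window:]) / window


def weighted_moving_average(demand: Sequence[float], weights: Sequence[float]) -> float:
    """Weights are ordered oldest to newest and are normalised by their sum."""
    recent = demand[-len(weights):]
    return sum(w * d for w, d in zip(weights, recent)) / sum(weights)


def exponential_smoothing(demand: Sequence[float], alpha: float) -> float:
    """Simple exponential smoothing, initialised with the first observation."""
    level = demand[0]
    for d in demand[1:]:
        level = alpha * d + (1 - alpha) * level
    return level


def economic_order_quantity(annual_demand: float, order_cost: float, holding_cost: float) -> float:
    """Classic EOQ: sqrt(2 * D * S / H), with H the holding cost per unit per year."""
    return math.sqrt(2 * annual_demand * order_cost / holding_cost)


def reorder_point(daily_demand: float, lead_time_days: float, safety_stock: float = 0.0) -> float:
    """Stock level at which a new order should be placed."""
    return daily_demand * lead_time_days + safety_stock


# Hypothetical monthly demand for one product.
DEMAND = [100.0, 120.0, 110.0, 130.0]
```

With the sample series, a 3-period simple moving average and exponential smoothing with alpha = 0.5 both forecast 120 units, while weights of 1, 2, 3 give a slightly higher figure because the latest month counts most.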

Keywords: demand forecast, reorder point, lost sale, dead stock

Procedia PDF Downloads 90
503 Assessment of the Impacts of Climate Change on Climatic Zones over the Korean Peninsula for Natural Disaster Management Information

Authors: Sejin Jung, Dongho Kang, Byungsik Kim

Abstract:

Assessing the impact of climate change requires the use of a multi-model ensemble (MME) to quantify uncertainties between scenarios and to produce downscaled outlines for simulating climate under the influence of different factors, including topography. This study downscales climate change scenarios from 13 global climate models (GCMs) to assess the impacts of future climate change. Unlike South Korea, North Korea lacks studies using the climate change scenarios of the Coupled Model Intercomparison Project (CMIP5), and only recently did the country start projecting extreme precipitation episodes. One of the main purposes of this study is to predict changes in the average climatic conditions of North Korea in the future. Comparing the downscaled climate change scenarios with observation data for a reference period indicates high applicability of the MME. Furthermore, the study classifies climatic zones by applying the Köppen-Geiger climate classification system to the MME, which is validated for future precipitation and temperature. The result suggests that the continental climate (D) that covers the inland area in the reference climate is expected to shift to the temperate climate (C). The coefficient of variation (CV) of the temperature ensemble is particularly low for the southern coast of the Korean peninsula, and accordingly, a high possibility of a shift in the climatic zone of the coast is predicted. This research was supported by a grant (MOIS-DP-2015-05) of the Disaster Prediction and Mitigation Technology Development Program funded by the Ministry of Interior and Safety (MOIS, Korea).
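The coefficient of variation used above as a measure of ensemble agreement is the standard deviation divided by the mean. The sketch below computes it across ensemble members at a single grid cell; the function name and the use of the sample standard deviation are assumptions, since the abstract does not specify the estimator.

```python
from statistics import mean, stdev
from typing import Sequence


def ensemble_cv(member_values: Sequence[float]) -> float:
    """Coefficient of variation across ensemble members at one grid cell.

    Assumes at least two members and a non-zero ensemble mean.
    """
    centre = mean(member_values)
    if centre == 0:
        raise ValueError("CV is undefined when the ensemble mean is zero")
    return stdev(member_values) / abs(centre)
```

A low CV means the members agree closely at that location, so a projected zone shift there is supported by most of the ensemble. For temperature, the value depends on the unit, because the mean changes with the zero point of the scale.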

Keywords: MME, North Korea, Koppen–Geiger, climatic zones, coefficient of variation, CV

Procedia PDF Downloads 94
502 Real-Time Radar Tracking Based on Nonlinear Kalman Filter

Authors: Milca F. Coelho, K. Bousson, Kawser Ahmed

Abstract:

Accurately tracking an aerospace vehicle in a time-critical situation and in a highly nonlinear environment is one of the strongest interests within the aerospace community. Tracking is achieved by accurately estimating the state of a moving target, which is composed of a set of variables that provide a complete status of the system at a given time. One of the main ingredients of good estimation performance is the use of efficient estimation algorithms. A well-known framework is Kalman filtering, designed for prediction and estimation problems. The success of the Kalman Filter (KF) in engineering applications is mostly due to the Extended Kalman Filter (EKF), which is based on local linearization. Despite its popularity, the EKF presents several limitations. To address these limitations, and as a possible solution to tracking problems, this paper proposes the use of the Ensemble Kalman Filter (EnKF). Although the EnKF is used extensively in weather forecasting and is recognized for producing accurate and computationally effective estimates on systems with a very high dimension, it is almost unknown in the tracking community. The EnKF was initially proposed as an attempt to improve the error covariance calculation, which is difficult to implement in the classic Kalman Filter. In the EnKF method, the prediction and analysis error covariances have ensemble representations. The sizes of these ensembles limit the number of degrees of freedom, so that the filter error covariance calculations are much more practical for modest ensemble sizes. In this paper, a realistic simulation of radar tracking was performed, in which the EnKF was applied and compared with the Extended Kalman Filter. The results suggest that the EnKF is a promising tool for tracking applications, offering more advantages in terms of performance.
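To make the ensemble representation of the error covariance concrete, here is a minimal sketch of one EnKF analysis step in its perturbed-observation form. It is deliberately reduced to a scalar state that is observed directly (observation operator equal to 1), which is an assumption for illustration and not the radar model used in the paper; all names are hypothetical.

```python
import random
from statistics import variance
from typing import List, Optional, Sequence


def enkf_analysis(
    forecast: Sequence[float],
    observation: float,
    obs_variance: float,
    perturbations: Optional[Sequence[float]] = None,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Update a scalar forecast ensemble with one direct observation."""
    if perturbations is None:
        # Draw one observation perturbation per member from N(0, obs_variance).
        rng = rng or random.Random(0)
        perturbations = [rng.gauss(0.0, obs_variance ** 0.5) for _ in forecast]
    if len(perturbations) != len(forecast):
        raise ValueError("need exactly one perturbation per ensemble member")
    # The forecast error variance is estimated from the ensemble itself.
    p_forecast = variance(forecast)
    gain = p_forecast / (p_forecast + obs_variance)
    # Each member is nudged towards its own perturbed copy of the observation.
    return [x + gain * (observation + e - x) for x, e in zip(forecast, perturbations)]
```

No linearization of the dynamics is needed: the covariance comes from the spread of the members, which is what distinguishes this step from the EKF update. Passing explicit `perturbations` makes the step deterministic, which is convenient for checking the arithmetic.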

Keywords: Kalman filter, nonlinear state estimation, optimal tracking, stochastic environment

Procedia PDF Downloads 119