Search results for: Prediction Based Data Reduction

16823 Representing Data without Lost Compression Properties in Time Series: A Review

Authors: Nabilah Filzah Mohd Radzuan, Zalinda Othman, Azuraliza Abu Bakar, Abdul Razak Hamdan

Abstract:

Uncertain data is believed to be an important issue in building up a prediction model. The main objective in the time series uncertainty analysis is to formulate uncertain data in order to gain knowledge and fit low dimensional model prior to a prediction task. This paper discusses the performance of a number of techniques in dealing with uncertain data specifically those which solve uncertain data condition by minimizing the loss of compression properties.

Keywords: Compression properties, uncertainty, uncertain time series, mining technique, weather prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1620

16822 Urban Big Data: An Experimental Approach to Building-Value Estimation Using Web-Based Data

Authors: Sun-Young Jang, Sung-Ah Kim, Dongyoun Shin

Abstract:

Current real-estate value estimation, difficult for laymen, usually is performed by specialists. This paper presents an automated estimation process based on big data and machine-learning technology that calculates influences of building conditions on real-estate price measurement. The present study analyzed actual building sales sample data for Nonhyeon-dong, Gangnam-gu, Seoul, Korea, measuring the major influencing factors among the various building conditions. Further to that analysis, a prediction model was established and applied using RapidMiner Studio, a graphical user interface (GUI)-based tool for derivation of machine-learning prototypes. The prediction model is formulated by reference to previous examples. When new examples are applied, it analyses and predicts accordingly. The analysis process discerns the crucial factors effecting price increases by calculation of weighted values. The model was verified, and its accuracy determined, by comparing its predicted values with actual price increases.

Keywords: Big data, building-value analysis, machine learning, price prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1165

16821 Two States Mapping Based Neural Network Model for Decreasing of Prediction Residual Error

Authors: Insung Jung, lockjo Koo, Gi-Nam Wang

Abstract:

The objective of this paper is to design a model of human vital sign prediction for decreasing prediction error by using two states mapping based time series neural network BP (back-propagation) model. Normally, lot of industries has been applying the neural network model by training them in a supervised manner with the error back-propagation algorithm for time series prediction systems. However, it still has a residual error between real value and prediction output. Therefore, we designed two states of neural network model for compensation of residual error which is possible to use in the prevention of sudden death and metabolic syndrome disease such as hypertension disease and obesity. We found that most of simulations cases were satisfied by the two states mapping based time series prediction model compared to normal BP. In particular, small sample size of times series were more accurate than the standard MLP model. We expect that this algorithm can be available to sudden death prevention and monitoring AGENT system in a ubiquitous homecare environment.

Keywords: Neural network, U-healthcare, prediction, timeseries, computer aided prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1984

16820 Hybrid Approach for Software Defect Prediction Using Machine Learning with Optimization Technique

Authors: C. Manjula, Lilly Florence

Abstract:

Software technology is developing rapidly which leads to the growth of various industries. Now-a-days, software-based applications have been adopted widely for business purposes. For any software industry, development of reliable software is becoming a challenging task because a faulty software module may be harmful for the growth of industry and business. Hence there is a need to develop techniques which can be used for early prediction of software defects. Due to complexities in manual prediction, automated software defect prediction techniques have been introduced. These techniques are based on the pattern learning from the previous software versions and finding the defects in the current version. These techniques have attracted researchers due to their significant impact on industrial growth by identifying the bugs in software. Based on this, several researches have been carried out but achieving desirable defect prediction performance is still a challenging task. To address this issue, here we present a machine learning based hybrid technique for software defect prediction. First of all, Genetic Algorithm (GA) is presented where an improved fitness function is used for better optimization of features in data sets. Later, these features are processed through Decision Tree (DT) classification model. Finally, an experimental study is presented where results from the proposed GA-DT based hybrid approach is compared with those from the DT classification technique. The results show that the proposed hybrid approach achieves better classification accuracy.

Keywords: Decision tree, genetic algorithm, machine learning, software defect prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1465

16819 Artificial Neural Networks Technique for Seismic Hazard Prediction Using Seismic Bumps

Authors: Belkacem Selma, Boumediene Selma, Samira Chouraqui, Hanifi Missoum, Tourkia Guerzou

Abstract:

Natural disasters have occurred and will continue to cause human and material damage. Therefore, the idea of "preventing" natural disasters will never be possible. However, their prediction is possible with the advancement of technology. Even if natural disasters are effectively inevitable, their consequences may be partly controlled. The rapid growth and progress of artificial intelligence (AI) had a major impact on the prediction of natural disasters and risk assessment which are necessary for effective disaster reduction. Earthquake prediction to prevent the loss of human lives and even property damage is an important factor; that, is why it is crucial to develop techniques for predicting this natural disaster. This study aims to analyze the ability of artificial neural networks (ANNs) to predict earthquakes that occur in a given area. The used data describe the problem of high energy (higher than 104 J) seismic bumps forecasting in a coal mine using two long walls as an example. For this purpose, seismic bumps data obtained from mines have been analyzed. The results obtained show that the ANN is able to predict earthquake parameters with high accuracy; the classification accuracy through neural networks is more than 94%, and the models developed are efficient and robust and depend only weakly on the initial database.

Keywords: Earthquake prediction, artificial intelligence, AI, Artificial Neural Network, ANN, seismic bumps.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1187

16818 Estimation of Relative Subsidence of Collapsible Soils Using Electromagnetic Measurements

Authors: Henok Hailemariam, Frank Wuttke

Abstract:

Collapsible soils are weak soils that appear to be stable in their natural state, normally dry condition, but rapidly deform under saturation (wetting), thus generating large and unexpected settlements which often yield disastrous consequences for structures unwittingly built on such deposits. In this study, a prediction model for the relative subsidence of stressed collapsible soils based on dielectric permittivity measurement is presented. Unlike most existing methods for soil subsidence prediction, this model does not require moisture content as an input parameter, thus providing the opportunity to obtain accurate estimation of the relative subsidence of collapsible soils using dielectric measurement only. The prediction model is developed based on an existing relative subsidence prediction model (which is dependent on soil moisture condition) and an advanced theoretical frequency and temperature-dependent electromagnetic mixing equation (which effectively removes the moisture content dependence of the original relative subsidence prediction model). For large scale sub-surface soil exploration purposes, the spatial sub-surface soil dielectric data over wide areas and high depths of weak (collapsible) soil deposits can be obtained using non-destructive high frequency electromagnetic (HF-EM) measurement techniques such as ground penetrating radar (GPR). For laboratory or small scale in-situ measurements, techniques such as an open-ended coaxial line with widely applicable time domain reflectometry (TDR) or vector network analysers (VNAs) are usually employed to obtain the soil dielectric data. By using soil dielectric data obtained from small or large scale non-destructive HF-EM investigations, the new model can effectively predict the relative subsidence of weak soils without the need to extract samples for moisture content measurement. Some of the resulting benefits are the preservation of the undisturbed nature of the soil as well as a reduction in the investigation costs and analysis time in the identification of weak (problematic) soils. The accuracy of prediction of the presented model is assessed by conducting relative subsidence tests on a collapsible soil at various initial soil conditions and a good match between the model prediction and experimental results is obtained.

Keywords: Collapsible soil, relative subsidence, dielectric permittivity, moisture content.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1117

16817 Certain Data Dimension Reduction Techniques for application with ANN based MCS for Study of High Energy Shower

Authors: Gitanjali Devi, Kandarpa Kumar Sarma, Pranayee Datta, Anjana Kakoti Mahanta

Abstract:

Cosmic showers, from their places of origin in space, after entering earth generate secondary particles called Extensive Air Shower (EAS). Detection and analysis of EAS and similar High Energy Particle Showers involve a plethora of experimental setups with certain constraints for which soft-computational tools like Artificial Neural Network (ANN)s can be adopted. The optimality of ANN classifiers can be enhanced further by the use of Multiple Classifier System (MCS) and certain data - dimension reduction techniques. This work describes the performance of certain data dimension reduction techniques like Principal Component Analysis (PCA), Independent Component Analysis (ICA) and Self Organizing Map (SOM) approximators for application with an MCS formed using Multi Layer Perceptron (MLP), Recurrent Neural Network (RNN) and Probabilistic Neural Network (PNN). The data inputs are obtained from an array of detectors placed in a circular arrangement resembling a practical detector grid which have a higher dimension and greater correlation among themselves. The PCA, ICA and SOM blocks reduce the correlation and generate a form suitable for real time practical applications for prediction of primary energy and location of EAS from density values captured using detectors in a circular grid.

Keywords: EAS, Shower, Core, ANN, Location.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1608

16816 A Prediction-Based Reversible Watermarking for MRI Images

Authors: Nuha Omran Abokhdair, Azizah Bt Abdul Manaf

Abstract:

Reversible watermarking is a special branch of image watermarking, that is able to recover the original image after extracting the watermark from the image. In this paper, an adaptive prediction-based reversible watermarking scheme is presented, in order to increase the payload capacity of MRI medical images. The scheme divides the image into two parts, Region of Interest (ROI) and Region of Non-Interest (RONI). Two bits are embedded in each embeddable pixel of RONI and one bit is embedded in each embeddable pixel of ROI. The experimental results demonstrate that the proposed scheme is able to achieve high embedding capacity. This is mainly caused by two reasons. First, the pixels that were excluded from data embedding due to overflow/underflow are used for data embedding. Second, large location map that need to be added to watermark data as overhead is eliminated and thus lower data embedding capacity is prevented. Moreover, the scheme provides good visual quality to the watermarked image.

Keywords: Medical image watermarking, reversible watermarking, Difference Expansion, Prediction-Error Expansion.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1916

16815 Review and Comparison of Associative Classification Data Mining Approaches

Authors: Suzan Wedyan

Abstract:

Associative classification (AC) is a data mining approach that combines association rule and classification to build classification models (classifiers). AC has attracted a significant attention from several researchers mainly because it derives accurate classifiers that contain simple yet effective rules. In the last decade, a number of associative classification algorithms have been proposed such as Classification based Association (CBA), Classification based on Multiple Association Rules (CMAR), Class based Associative Classification (CACA), and Classification based on Predicted Association Rule (CPAR). This paper surveys major AC algorithms and compares the steps and methods performed in each algorithm including: rule learning, rule sorting, rule pruning, classifier building, and class prediction.

Keywords: Associative Classification, Classification, Data Mining, Learning, Rule Ranking, Rule Pruning, Prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6634

16814 Integration of Big Data to Predict Transportation for Smart Cities

Authors: Sun-Young Jang, Sung-Ah Kim, Dongyoun Shin

Abstract:

The Intelligent transportation system is essential to build smarter cities. Machine learning based transportation prediction could be highly promising approach by delivering invisible aspect visible. In this context, this research aims to make a prototype model that predicts transportation network by using big data and machine learning technology. In detail, among urban transportation systems this research chooses bus system. The research problem that existing headway model cannot response dynamic transportation conditions. Thus, bus delay problem is often occurred. To overcome this problem, a prediction model is presented to fine patterns of bus delay by using a machine learning implementing the following data sets; traffics, weathers, and bus statues. This research presents a flexible headway model to predict bus delay and analyze the result. The prototyping model is composed by real-time data of buses. The data are gathered through public data portals and real time Application Program Interface (API) by the government. These data are fundamental resources to organize interval pattern models of bus operations as traffic environment factors (road speeds, station conditions, weathers, and bus information of operating in real-time). The prototyping model is designed by the machine learning tool (RapidMiner Studio) and conducted tests for bus delays prediction. This research presents experiments to increase prediction accuracy for bus headway by analyzing the urban big data. The big data analysis is important to predict the future and to find correlations by processing huge amount of data. Therefore, based on the analysis method, this research represents an effective use of the machine learning and urban big data to understand urban dynamics.

Keywords: Big data, bus headway prediction, machine learning, public transportation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1563

16813 Dust Storm Prediction Using ANNs Technique (A Case Study: Zabol City)

Authors: Jamalizadeh, M.R., Moghaddamnia, A., Piri, J., Arbabi, V., Homayounifar, M., Shahryari, A.

Abstract:

Dust storms are one of the most costly and destructive events in many desert regions. They can cause massive damages both in natural environments and human lives. This paper is aimed at presenting a preliminary study on dust storms, as a major natural hazard in arid and semi-arid regions. As a case study, dust storm events occurred in Zabol city located in Sistan Region of Iran was analyzed to diagnose and predict dust storms. The identification and prediction of dust storm events could have significant impacts on damages reduction. Present models for this purpose are complicated and not appropriate for many areas with poor-data environments. The present study explores Gamma test for identifying inputs of ANNs model, for dust storm prediction. Results indicate that more attempts must be carried out concerning dust storms identification and segregate between various dust storm types.

Keywords: Dust Storm, Gamma Test, Prediction, ANNs, Zabol.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2153

16812 An Improved Prediction Model of Ozone Concentration Time Series Based On Chaotic Approach

Authors: N. Z. A. Hamid, M. S. M. Noorani

Abstract:

This study is focused on the development of prediction models of the Ozone concentration time series. Prediction model is built based on chaotic approach. Firstly, the chaotic nature of the time series is detected by means of phase space plot and the Cao method. Then, the prediction model is built and the local linear approximation method is used for the forecasting purposes. Traditional prediction of autoregressive linear model is also built. Moreover, an improvement in local linear approximation method is also performed. Prediction models are applied to the hourly Ozone time series observed at the benchmark station in Malaysia. Comparison of all models through the calculation of mean absolute error, root mean squared error and correlation coefficient shows that the one with improved prediction method is the best. Thus, chaotic approach is a good approach to be used to develop a prediction model for the Ozone concentration time series.

Keywords: Chaotic approach, phase space, Cao method, local linear approximation method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1784

16811 A Novel Prediction Method for Tag SNP Selection using Genetic Algorithm based on KNN

Authors: Li-Yeh Chuang, Yu-Jen Hou, Jr., Cheng-Hong Yang

Abstract:

Single nucleotide polymorphisms (SNPs) hold much promise as a basis for disease-gene association. However, research is limited by the cost of genotyping the tremendous number of SNPs. Therefore, it is important to identify a small subset of informative SNPs, the so-called tag SNPs. This subset consists of selected SNPs of the genotypes, and accurately represents the rest of the SNPs. Furthermore, an effective evaluation method is needed to evaluate prediction accuracy of a set of tag SNPs. In this paper, a genetic algorithm (GA) is applied to tag SNP problems, and the K-nearest neighbor (K-NN) serves as a prediction method of tag SNP selection. The experimental data used was taken from the HapMap project; it consists of genotype data rather than haplotype data. The proposed method consistently identified tag SNPs with considerably better prediction accuracy than methods from the literature. At the same time, the number of tag SNPs identified was smaller than the number of tag SNPs in the other methods. The run time of the proposed method was much shorter than the run time of the SVM/STSA method when the same accuracy was reached.

Keywords: Genetic Algorithm (GA), Genotype, Single nucleotide polymorphism (SNP), tag SNPs.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1771

16810 Model Order Reduction for Frequency Response and Effect of Order of Method for Matching Condition

Authors: Aref Ghafouri, Mohammad Javad Mollakazemi, Farhad Asadi

Abstract:

In this paper, model order reduction method is used for approximation in linear and nonlinearity aspects in some experimental data. This method can be used for obtaining offline reduced model for approximation of experimental data and can produce and follow the data and order of system and also it can match to experimental data in some frequency ratios. In this study, the method is compared in different experimental data and influence of choosing of order of the model reduction for obtaining the best and sufficient matching condition for following the data is investigated in format of imaginary and reality part of the frequency response curve and finally the effect and important parameter of number of order reduction in nonlinear experimental data is explained further.

Keywords: Frequency response, Order of model reduction, frequency matching condition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2058

16809 A Spatial Information Network Traffic Prediction Method Based on Hybrid Model

Authors: Jingling Li, Yi Zhang, Wei Liang, Tao Cui, Jun Li

Abstract:

Compared with terrestrial network, the traffic of spatial information network has both self-similarity and short correlation characteristics. By studying its traffic prediction method, the resource utilization of spatial information network can be improved, and the method can provide an important basis for traffic planning of a spatial information network. In this paper, considering the accuracy and complexity of the algorithm, the spatial information network traffic is decomposed into approximate component with long correlation and detail component with short correlation, and a time series hybrid prediction model based on wavelet decomposition is proposed to predict the spatial network traffic. Firstly, the original traffic data are decomposed to approximate components and detail components by using wavelet decomposition algorithm. According to the autocorrelation and partial correlation smearing and truncation characteristics of each component, the corresponding model (AR/MA/ARMA) of each detail component can be directly established, while the type of approximate component modeling can be established by ARIMA model after smoothing. Finally, the prediction results of the multiple models are fitted to obtain the prediction results of the original data. The method not only considers the self-similarity of a spatial information network, but also takes into account the short correlation caused by network burst information, which is verified by using the measured data of a certain back bone network released by the MAWI working group in 2018. Compared with the typical time series model, the predicted data of hybrid model is closer to the real traffic data and has a smaller relative root means square error, which is more suitable for a spatial information network.

Keywords: Spatial Information Network, Traffic prediction, Wavelet decomposition, Time series model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 640

16808 Churn Prediction for Telecommunication Industry Using Artificial Neural Networks

Authors: Ulas Vural, M. Ergun Okay, E. Mesut Yildiz

Abstract:

Telecommunication service providers demand accurate and precise prediction of customer churn probabilities to increase the effectiveness of their customer relation services. The large amount of customer data owned by the service providers is suitable for analysis by machine learning methods. In this study, expenditure data of customers are analyzed by using an artificial neural network (ANN). The ANN model is applied to the data of customers with different billing duration. The proposed model successfully predicts the churn probabilities at 83% accuracy for only three months expenditure data and the prediction accuracy increases up to 89% when the nine month data is used. The experiments also show that the accuracy of ANN model increases on an extended feature set with information of the changes on the bill amounts.

Keywords: Customer relationship management, churn prediction, telecom industry, deep learning, Artificial Neural Networks, ANN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 760

16807 Evaluation of Low-Reducible Sinter in Blast Furnace Technology by Mathematical Model Developed at Centre ENET, VSB – Technical University of Ostrava

Authors: S. Jursová, P. Pustějovská, S. Brožová, J. Bilík

Abstract:

The paper deals with possibilities of interpretation of iron ore reducibility tests. It presents a mathematical model developed at Centre ENET, VŠB – Technical University of Ostrava, Czech Republic for an evaluation of metallurgical material of blast furnace feedstock such as iron ore, sinter or pellets. According to the data from the test, the model predicts its usage in blast furnace technology and its effects on production parameters of shaft aggregate. At the beginning, the paper sums up the general concept and experience in mathematical modelling of iron ore reduction. It presents basic equation for the calculation and the main parts of the developed model. In the experimental part, there is an example of usage of the mathematical model. The paper describes the usage of data for some predictive calculation. There are presented material, method of carried test of iron ore reducibility. Then there are graphically interpreted effects of used material on carbon consumption, rate of direct reduction and the whole reduction process.

Keywords: Blast furnace technology, iron ore reduction, mathematical model, prediction of iron ore reduction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1991

16806 A New History Based Method to Handle the Recurring Concept Shifts in Data Streams

Authors: Hossein Morshedlou, Ahmad Abdollahzade Barforoush

Abstract:

Recent developments in storage technology and networking architectures have made it possible for broad areas of applications to rely on data streams for quick response and accurate decision making. Data streams are generated from events of real world so existence of associations, which are among the occurrence of these events in real world, among concepts of data streams is logical. Extraction of these hidden associations can be useful for prediction of subsequent concepts in concept shifting data streams. In this paper we present a new method for learning association among concepts of data stream and prediction of what the next concept will be. Knowing the next concept, an informed update of data model will be possible. The results of conducted experiments show that the proposed method is proper for classification of concept shifting data streams.

Keywords: Data Stream, Classification, Concept Shift, History.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1278

16805 Equity Risk Premiums and Risk Free Rates in Modelling and Prediction of Financial Markets

Authors: Mohammad Ghavami, Reza S. Dilmaghani

Abstract:

This paper presents an adaptive framework for modelling financial markets using equity risk premiums, risk free rates and volatilities. The recorded economic factors are initially used to train four adaptive filters for a certain limited period of time in the past. Once the systems are trained, the adjusted coefficients are used for modelling and prediction of an important financial market index. Two different approaches based on least mean squares (LMS) and recursive least squares (RLS) algorithms are investigated. Performance analysis of each method in terms of the mean squared error (MSE) is presented and the results are discussed. Computer simulations carried out using recorded data show MSEs of 4% and 3.4% for the next month prediction using LMS and RLS adaptive algorithms, respectively. In terms of twelve months prediction, RLS method shows a better tendency estimation compared to the LMS algorithm.

Keywords: Prediction of financial markets, Adaptive methods, MSE, LSE.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1022

16804 Novel Hybrid Method for Gene Selection and Cancer Prediction

Authors: Liping Jing, Michael K. Ng, Tieyong Zeng

Abstract:

Microarray data profiles gene expression on a whole genome scale, therefore, it provides a good way to study associations between gene expression and occurrence or progression of cancer. More and more researchers realized that microarray data is helpful to predict cancer sample. However, the high dimension of gene expressions is much larger than the sample size, which makes this task very difficult. Therefore, how to identify the significant genes causing cancer becomes emergency and also a hot and hard research topic. Many feature selection algorithms have been proposed in the past focusing on improving cancer predictive accuracy at the expense of ignoring the correlations between the features. In this work, a novel framework (named by SGS) is presented for stable gene selection and efficient cancer prediction . The proposed framework first performs clustering algorithm to find the gene groups where genes in each group have higher correlation coefficient, and then selects the significant genes in each group with Bayesian Lasso and important gene groups with group Lasso, and finally builds prediction model based on the shrinkage gene space with efficient classification algorithm (such as, SVM, 1NN, Regression and etc.). Experiment results on real world data show that the proposed framework often outperforms the existing feature selection and prediction methods, say SAM, IG and Lasso-type prediction model.

Keywords: Gene Selection, Cancer Prediction, Lasso, Clustering, Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2044

16803 A Comparison of Grey Model and Fuzzy Predictive Model for Time Series

Authors: A. I. Dounis, P. Tiropanis, D. Tseles, G. Nikolaou, G. P. Syrcos

Abstract:

The prediction of meteorological parameters at a meteorological station is an interesting and open problem. A firstorder linear dynamic model GM(1,1) is the main component of the grey system theory. The grey model requires only a few previous data points in order to make a real-time forecast. In this paper, we consider the daily average ambient temperature as a time series and the grey model GM(1,1) applied to local prediction (short-term prediction) of the temperature. In the same case study we use a fuzzy predictive model for global prediction. We conclude the paper with a comparison between local and global prediction schemes.

Keywords: Fuzzy predictive model, grey model, local andglobal prediction, meteorological forecasting, time series.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2156

16802 Intelligent Earthquake Prediction System Based On Neural Network

Authors: Emad Amar, Tawfik Khattab, Fatma Zada

Abstract:

Predicting earthquakes is an important issue in the study of geography. Accurate prediction of earthquakes can help people to take effective measures to minimize the loss of personal and economic damage, such as large casualties, destruction of buildings and broken of traffic, occurred within a few seconds. United States Geological Survey (USGS) science organization provides reliable scientific information about Earthquake Existed throughout history & the Preliminary database from the National Center Earthquake Information (NEIC) show some useful factors to predict an earthquake in a seismic area like Aleutian Arc in the U.S. state of Alaska. The main advantage of this prediction method that it does not require any assumption, it makes prediction according to the future evolution of the object's time series. The article compares between simulation data result from trained BP and RBF neural network versus actual output result from the system calculations. Therefore, this article focuses on analysis of data relating to real earthquakes. Evaluation results show better accuracy and higher speed by using radial basis functions (RBF) neural network.

Keywords: BP neural network, Prediction, RBF neural network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3219

16801 Integration of Educational Data Mining Models to a Web-Based Support System for Predicting High School Student Performance

Authors: Sokkhey Phauk, Takeo Okazaki

Abstract:

The challenging task in educational institutions is to maximize the high performance of students and minimize the failure rate of poor-performing students. An effective method to leverage this task is to know student learning patterns with highly influencing factors and get an early prediction of student learning outcomes at the timely stage for setting up policies for improvement. Educational data mining (EDM) is an emerging disciplinary field of data mining, statistics, and machine learning concerned with extracting useful knowledge and information for the sake of improvement and development in the education environment. The study is of this work is to propose techniques in EDM and integrate it into a web-based system for predicting poor-performing students. A comparative study of prediction models is conducted. Subsequently, high performing models are developed to get higher performance. The hybrid random forest (Hybrid RF) produces the most successful classification. For the context of intervention and improving the learning outcomes, a feature selection method MICHI, which is the combination of mutual information (MI) and chi-square (CHI) algorithms based on the ranked feature scores, is introduced to select a dominant feature set that improves the performance of prediction and uses the obtained dominant set as information for intervention. By using the proposed techniques of EDM, an academic performance prediction system (APPS) is subsequently developed for educational stockholders to get an early prediction of student learning outcomes for timely intervention. Experimental outcomes and evaluation surveys report the effectiveness and usefulness of the developed system. The system is used to help educational stakeholders and related individuals for intervening and improving student performance.

Keywords: Academic performance prediction system, prediction model, educational data mining, dominant factors, feature selection methods, student performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 981

16800 Modified Naïve Bayes Based Prediction Modeling for Crop Yield Prediction

Authors: Kefaya Qaddoum

Abstract:

Most of greenhouse growers desire a determined amount of yields in order to accurately meet market requirements. The purpose of this paper is to model a simple but often satisfactory supervised classification method. The original naive Bayes have a serious weakness, which is producing redundant predictors. In this paper, utilized regularization technique was used to obtain a computationally efficient classifier based on naive Bayes. The suggested construction, utilized L1-penalty, is capable of clearing redundant predictors, where a modification of the LARS algorithm is devised to solve this problem, making this method applicable to a wide range of data. In the experimental section, a study conducted to examine the effect of redundant and irrelevant predictors, and test the method on WSG data set for tomato yields, where there are many more predictors than data, and the urge need to predict weekly yield is the goal of this approach. Finally, the modified approach is compared with several naive Bayes variants and other classification algorithms (SVM and kNN), and is shown to be fairly good.

Keywords: Tomato yields prediction, naive Bayes, redundancy

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5110

16799 Breast Cancer Prediction Using Score-Level Fusion of Machine Learning and Deep Learning Models

Authors: [email protected]

Abstract:

Breast cancer is one of the most common types in women. Early prediction of breast cancer helps physicians detect cancer in its early stages. Big cancer data need a very powerful tool to analyze and extract predictions. Machine learning and deep learning are two of the most efficient tools for predicting cancer based on textual data. In this study, we developed a fusion model of two machine learning and deep learning models. To obtain the final prediction, Long-Short Term Memory (LSTM), ensemble learning with hyper parameters optimization, and score-level fusion is used. Experiments are done on the Breast Cancer Surveillance Consortium (BCSC) dataset after balancing and grouping the class categories. Five different training scenarios are used, and the tests show that the designed fusion model improved the performance by 3.3% compared to the individual models.

Keywords: Machine learning, Deep learning, cancer prediction, breast cancer, LSTM, Score-Level Fusion.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 406

16798 Analysis of Physicochemical Properties on Prediction of R5, X4 and R5X4 HIV-1 Coreceptor Usage

Authors: Kai-Ti Hsu, Hui-Ling Huang, Chun-Wei Tung, Yi-Hsiung Chen, Shinn-Ying Ho

Abstract:

Bioinformatics methods for predicting the T cell coreceptor usage from the array of membrane protein of HIV-1 are investigated. In this study, we aim to propose an effective prediction method for dealing with the three-class classification problem of CXCR4 (X4), CCR5 (R5) and CCR5/CXCR4 (R5X4). We made efforts in investigating the coreceptor prediction problem as follows: 1) proposing a feature set of informative physicochemical properties which is cooperated with SVM to achieve high prediction test accuracy of 81.48%, compared with the existing method with accuracy of 70.00%; 2) establishing a large up-to-date data set by increasing the size from 159 to 1225 sequences to verify the proposed prediction method where the mean test accuracy is 88.59%, and 3) analyzing the set of 14 informative physicochemical properties to further understand the characteristics of HIV-1coreceptors.

Keywords: Coreceptor, genetic algorithm, HIV-1, SVM, physicochemical properties, prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2387

16797 Pattern Recognition Using Feature Based Die-Map Clusteringin the Semiconductor Manufacturing Process

Authors: Seung Hwan Park, Cheng-Sool Park, Jun Seok Kim, Youngji Yoo, Daewoong An, Jun-Geol Baek

Abstract:

Depending on the big data analysis becomes important, yield prediction using data from the semiconductor process is essential. In general, yield prediction and analysis of the causes of the failure are closely related. The purpose of this study is to analyze pattern affects the final test results using a die map based clustering. Many researches have been conducted using die data from the semiconductor test process. However, analysis has limitation as the test data is less directly related to the final test results. Therefore, this study proposes a framework for analysis through clustering using more detailed data than existing die data. This study consists of three phases. In the first phase, die map is created through fail bit data in each sub-area of die. In the second phase, clustering using map data is performed. And the third stage is to find patterns that affect final test result. Finally, the proposed three steps are applied to actual industrial data and experimental results showed the potential field application.

Keywords: Die-Map Clustering, Feature Extraction, Pattern Recognition, Semiconductor Manufacturing Process.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3151

16796 Prediction of Coast Down Time for Mechanical Faults in Rotating Machinery Using Artificial Neural Networks

Authors: G. R. Rameshkumar, B. V. A Rao, K. P. Ramachandran

Abstract:

Misalignment and unbalance are the major concerns in rotating machinery. When the power supply to any rotating system is cutoff, the system begins to lose the momentum gained during sustained operation and finally comes to rest. The exact time period from when the power is cutoff until the rotor comes to rest is called Coast Down Time. The CDTs for different shaft cutoff speeds were recorded at various misalignment and unbalance conditions. The CDT reduction percentages were calculated for each fault and there is a specific correlation between the CDT reduction percentage and the severity of the fault. In this paper, radial basis network, a new generation of artificial neural networks, has been successfully incorporated for the prediction of CDT for misalignment and unbalance conditions. Radial basis network has been found to be successful in the prediction of CDT for mechanical faults in rotating machinery.

Keywords: Coast Down Time, Misalignment, Unbalance, Artificial Neural Networks, Radial Basis Network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2989

16795 Automatic Flood Prediction Using Rainfall Runoff Model in Moravian-Silesian Region

Authors: B. Sir, M. Podhoranyi, S. Kuchar, T. Kocyan

Abstract:

Rainfall runoff models play important role in hydrological predictions. However, the model is only one part of the process for creation of flood prediction. The aim of this paper is to show the process of successful prediction for flood event (May 15 – May 18 2014). Prediction was performed by rainfall runoff model HEC–HMS, one of the models computed within Floreon+ system. The paper briefly evaluates the results of automatic hydrologic prediction on the river Olše catchment and its gages Český Těšín and Věřňovice.

Keywords: Flood, HEC-HMS, Prediction, Rainfall – Runoff.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2228

16794 High Capacity Data Hiding based on Predictor and Histogram Modification

Authors: Hui-Yu Huang, Shih-Hsu Chang

Abstract:

In this paper, we propose a high capacity image hiding technology based on pixel prediction and the difference of modified histogram. This approach is used the pixel prediction and the difference of modified histogram to calculate the best embedding point. This approach can improve the predictive accuracy and increase the pixel difference to advance the hiding capacity. We also use the histogram modification to prevent the overflow and underflow. Experimental results demonstrate that our proposed method within the same average hiding capacity can still keep high quality of image and low distortion

Keywords: data hiding, predictor

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1886