Search results for: statistical weather prediction
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2279

Search results for: statistical weather prediction

1889 Machine Learning Development Audit Framework: Assessment and Inspection of Risk and Quality of Data, Model and Development Process

Authors: Jan Stodt, Christoph Reich

Abstract:

The usage of machine learning models for prediction is growing rapidly and proof that the intended requirements are met is essential. Audits are a proven method to determine whether requirements or guidelines are met. However, machine learning models have intrinsic characteristics, such as the quality of training data, that make it difficult to demonstrate the required behavior and make audits more challenging. This paper describes an ML audit framework that evaluates and reviews the risks of machine learning applications, the quality of the training data, and the machine learning model. We evaluate and demonstrate the functionality of the proposed framework by auditing an steel plate fault prediction model.

Keywords: Audit, machine learning, assessment, metrics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 902
1888 Lexicon-Based Sentiment Analysis for Stock Movement Prediction

Authors: Zane Turner, Kevin Labille, Susan Gauch

Abstract:

Sentiment analysis is a broad and expanding field that aims to extract and classify opinions from textual data. Lexicon-based approaches are based on the use of a sentiment lexicon, i.e., a list of words each mapped to a sentiment score, to rate the sentiment of a text chunk. Our work focuses on predicting stock price change using a sentiment lexicon built from financial conference call logs. We present a method to generate a sentiment lexicon based upon an existing probabilistic approach. By using a domain-specific lexicon, we outperform traditional techniques and demonstrate that domain-specific sentiment lexicons provide higher accuracy than generic sentiment lexicons when predicting stock price change.

Keywords: Lexicon, sentiment analysis, stock movement prediction., computational finance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 722
1887 Prediction of the Thermal Parameters of a High-Temperature Metallurgical Reactor Using Inverse Heat Transfer

Authors: Mohamed Hafid, Marcel Lacroix

Abstract:

This study presents an inverse analysis for predicting the thermal conductivities and the heat flux of a high-temperature metallurgical reactor simultaneously. Once these thermal parameters are predicted, the time-varying thickness of the protective phase-change bank that covers the inside surface of the brick walls of a metallurgical reactor can be calculated. The enthalpy method is used to solve the melting/solidification process of the protective bank. The inverse model rests on the Levenberg-Marquardt Method (LMM) combined with the Broyden method (BM). A statistical analysis for the thermal parameter estimation is carried out. The effect of the position of the temperature sensors, total number of measurements and measurement noise on the accuracy of inverse predictions is investigated. Recommendations are made concerning the location of temperature sensors.

Keywords: Inverse heat transfer, phase change, metallurgical reactor, Levenberg–Marquardt method, Broyden method, bank thickness.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1658
1886 Using Simulation for Prediction of Units Movements in Case of Communication Failure

Authors: J. Hodicky, P. Frantis

Abstract:

Command and Control (C2) system and its interfacethe Common Operational Picture (COP) are main means that supports commander in its decision making process. COP contains information about friendly and enemy unit positions. The friendly position is gathered via tactical network. In the case of tactical network failure the information about units are not available. The tactical simulator can be used as a tool that is capable to predict movements of units in respect of terrain features. Article deals with an experiment that was based on Czech C2 system that is in the case of connectivity lost fed by VR Forces simulator. Article analyzes maximum time interval in which the position created by simulator is still usable and truthful for commander in real time.

Keywords: command and control system, movement prediction, simulation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1247
1885 Lexicon-Based Sentiment Analysis for Stock Movement Prediction

Authors: Zane Turner, Kevin Labille, Susan Gauch

Abstract:

Sentiment analysis is a broad and expanding field that aims to extract and classify opinions from textual data. Lexicon-based approaches are based on the use of a sentiment lexicon, i.e., a list of words each mapped to a sentiment score, to rate the sentiment of a text chunk. Our work focuses on predicting stock price change using a sentiment lexicon built from financial conference call logs. We introduce a method to generate a sentiment lexicon based upon an existing probabilistic approach. By using a domain-specific lexicon, we outperform traditional techniques and demonstrate that domain-specific sentiment lexicons provide higher accuracy than generic sentiment lexicons when predicting stock price change.

Keywords: Computational finance, sentiment analysis, sentiment lexicon, stock movement prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1078
1884 Prediction the Limiting Drawing Ratio in Deep Drawing Process by Back Propagation Artificial Neural Network

Authors: H.Mohammadi Majd, M.Jalali Azizpour, M. Goodarzi

Abstract:

In this paper back-propagation artificial neural network (BPANN) with Levenberg–Marquardt algorithm is employed to predict the limiting drawing ratio (LDR) of the deep drawing process. To prepare a training set for BPANN, some finite element simulations were carried out. die and punch radius, die arc radius, friction coefficient, thickness, yield strength of sheet and strain hardening exponent were used as the input data and the LDR as the specified output used in the training of neural network. As a result of the specified parameters, the program will be able to estimate the LDR for any new given condition. Comparing FEM and BPANN results, an acceptable correlation was found.

Keywords: BPANN, deep drawing, prediction, limiting drawingratio (LDR), Levenberg–Marquardt algorithm

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1824
1883 Application of EEG Wavelet Power to Prediction of Antidepressant Treatment Response

Authors: Dorota Witkowska, Paweł Gosek, Lukasz Swiecicki, Wojciech Jernajczyk, Bruce J. West, Miroslaw Latka

Abstract:

In clinical practice, the selection of an antidepressant often degrades to lengthy trial-and-error. In this work we employ a normalized wavelet power of alpha waves as a biomarker of antidepressant treatment response. This novel EEG metric takes into account both non-stationarity and intersubject variability of alpha waves. We recorded resting, 19-channel EEG (closed eyes) in 22 inpatients suffering from unipolar (UD, n=10) or bipolar (BD, n=12) depression. The EEG measurement was done at the end of the short washout period which followed previously unsuccessful pharmacotherapy. The normalized alpha wavelet power of 11 responders was markedly different than that of 11 nonresponders at several, mostly temporoparietal sites. Using the prediction of treatment response based on the normalized alpha wavelet power, we achieved 81.8% sensitivity and 81.8% specificity for channel T4.

Keywords: Alpha waves, antidepressant, treatment outcome, wavelet.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1942
1882 Movie Genre Preference Prediction Using Machine Learning for Customer-Based Information

Authors: Haifeng Wang, Haili Zhang

Abstract:

Most movie recommendation systems have been developed for customers to find items of interest. This work introduces a predictive model usable by small and medium-sized enterprises (SMEs) who are in need of a data-based and analytical approach to stock proper movies for local audiences and retain more customers. We used classification models to extract features from thousands of customers’ demographic, behavioral and social information to predict their movie genre preference. In the implementation, a Gaussian kernel support vector machine (SVM) classification model and a logistic regression model were established to extract features from sample data and their test error-in-sample were compared. Comparison of error-out-sample was also made under different Vapnik–Chervonenkis (VC) dimensions in the machine learning algorithm to find and prevent overfitting. Gaussian kernel SVM prediction model can correctly predict movie genre preferences in 85% of positive cases. The accuracy of the algorithm increased to 93% with a smaller VC dimension and less overfitting. These findings advance our understanding of how to use machine learning approach to predict customers’ preferences with a small data set and design prediction tools for these enterprises.

Keywords: Computational social science, movie preference, machine learning, SVM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1611
1881 A Fuzzy Predictive Filter for Sinusoidal Signals with Time-Varying Frequencies

Authors: X. Z. Gao, S. J. Ovaska, X. Wang

Abstract:

Prediction of sinusoidal signals with time-varying frequencies has been an important research topic in power electronics systems. To solve this problem, we propose a new fuzzy predictive filtering scheme, which is based on a Finite Impulse Response (FIR) filter bank. Fuzzy logic is introduced here to provide appropriate interpolation of individual filter outputs. Therefore, instead of regular 'hard' switching, our method has the advantageous 'soft' switching among different filters. Simulation comparisons between the fuzzy predictive filtering and conventional filter bank-based approach are made to demonstrate that the new scheme can achieve an enhanced prediction performance for slowly changing sinusoidal input signals.

Keywords: Predictive filtering, fuzzy logic, sinusoidal signals, time-varying frequencies.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1449
1880 Structural Parsing of Natural Language Text in Tamil Using Phrase Structure Hybrid Language Model

Authors: Selvam M, Natarajan. A M, Thangarajan R

Abstract:

Parsing is important in Linguistics and Natural Language Processing to understand the syntax and semantics of a natural language grammar. Parsing natural language text is challenging because of the problems like ambiguity and inefficiency. Also the interpretation of natural language text depends on context based techniques. A probabilistic component is essential to resolve ambiguity in both syntax and semantics thereby increasing accuracy and efficiency of the parser. Tamil language has some inherent features which are more challenging. In order to obtain the solutions, lexicalized and statistical approach is to be applied in the parsing with the aid of a language model. Statistical models mainly focus on semantics of the language which are suitable for large vocabulary tasks where as structural methods focus on syntax which models small vocabulary tasks. A statistical language model based on Trigram for Tamil language with medium vocabulary of 5000 words has been built. Though statistical parsing gives better performance through tri-gram probabilities and large vocabulary size, it has some disadvantages like focus on semantics rather than syntax, lack of support in free ordering of words and long term relationship. To overcome the disadvantages a structural component is to be incorporated in statistical language models which leads to the implementation of hybrid language models. This paper has attempted to build phrase structured hybrid language model which resolves above mentioned disadvantages. In the development of hybrid language model, new part of speech tag set for Tamil language has been developed with more than 500 tags which have the wider coverage. A phrase structured Treebank has been developed with 326 Tamil sentences which covers more than 5000 words. A hybrid language model has been trained with the phrase structured Treebank using immediate head parsing technique. Lexicalized and statistical parser which employs this hybrid language model and immediate head parsing technique gives better results than pure grammar and trigram based model.

Keywords: Hybrid Language Model, Immediate Head Parsing, Lexicalized and Statistical Parsing, Natural Language Processing, Parts of Speech, Probabilistic Context Free Grammar, Tamil Language, Tree Bank.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3607
1879 Review and Comparison of Associative Classification Data Mining Approaches

Authors: Suzan Wedyan

Abstract:

Associative classification (AC) is a data mining approach that combines association rule and classification to build classification models (classifiers). AC has attracted a significant attention from several researchers mainly because it derives accurate classifiers that contain simple yet effective rules. In the last decade, a number of associative classification algorithms have been proposed such as Classification based Association (CBA), Classification based on Multiple Association Rules (CMAR), Class based Associative Classification (CACA), and Classification based on Predicted Association Rule (CPAR). This paper surveys major AC algorithms and compares the steps and methods performed in each algorithm including: rule learning, rule sorting, rule pruning, classifier building, and class prediction.

Keywords: Associative Classification, Classification, Data Mining, Learning, Rule Ranking, Rule Pruning, Prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6584
1878 Fuzzy Estimation of Parameters in Statistical Models

Authors: A. Falsafain, S. M. Taheri, M. Mashinchi

Abstract:

Using a set of confidence intervals, we develop a common approach, to construct a fuzzy set as an estimator for unknown parameters in statistical models. We investigate a method to derive the explicit and unique membership function of such fuzzy estimators. The proposed method has been used to derive the fuzzy estimators of the parameters of a Normal distribution and some functions of parameters of two Normal distributions, as well as the parameters of the Exponential and Poisson distributions.

Keywords: Confidence interval. Fuzzy number. Fuzzy estimation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2228
1877 Air Quality Forecast Based on Principal Component Analysis-Genetic Algorithm and Back Propagation Model

Authors: Bin Mu, Site Li, Shijin Yuan

Abstract:

Under the circumstance of environment deterioration, people are increasingly concerned about the quality of the environment, especially air quality. As a result, it is of great value to give accurate and timely forecast of AQI (air quality index). In order to simplify influencing factors of air quality in a city, and forecast the city’s AQI tomorrow, this study used MATLAB software and adopted the method of constructing a mathematic model of PCA-GABP to provide a solution. To be specific, this study firstly made principal component analysis (PCA) of influencing factors of AQI tomorrow including aspects of weather, industry waste gas and IAQI data today. Then, we used the back propagation neural network model (BP), which is optimized by genetic algorithm (GA), to give forecast of AQI tomorrow. In order to verify validity and accuracy of PCA-GABP model’s forecast capability. The study uses two statistical indices to evaluate AQI forecast results (normalized mean square error and fractional bias). Eventually, this study reduces mean square error by optimizing individual gene structure in genetic algorithm and adjusting the parameters of back propagation model. To conclude, the performance of the model to forecast AQI is comparatively convincing and the model is expected to take positive effect in AQI forecast in the future.

Keywords: AQI forecast, principal component analysis, genetic algorithm, back propagation neural network model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 981
1876 Selecting Negative Examples for Protein-Protein Interaction

Authors: Mohammad Shoyaib, M. Abdullah-Al-Wadud, Oksam Chae

Abstract:

Proteomics is one of the largest areas of research for bioinformatics and medical science. An ambitious goal of proteomics is to elucidate the structure, interactions and functions of all proteins within cells and organisms. Predicting Protein-Protein Interaction (PPI) is one of the crucial and decisive problems in current research. Genomic data offer a great opportunity and at the same time a lot of challenges for the identification of these interactions. Many methods have already been proposed in this regard. In case of in-silico identification, most of the methods require both positive and negative examples of protein interaction and the perfection of these examples are very much crucial for the final prediction accuracy. Positive examples are relatively easy to obtain from well known databases. But the generation of negative examples is not a trivial task. Current PPI identification methods generate negative examples based on some assumptions, which are likely to affect their prediction accuracy. Hence, if more reliable negative examples are used, the PPI prediction methods may achieve even more accuracy. Focusing on this issue, a graph based negative example generation method is proposed, which is simple and more accurate than the existing approaches. An interaction graph of the protein sequences is created. The basic assumption is that the longer the shortest path between two protein-sequences in the interaction graph, the less is the possibility of their interaction. A well established PPI detection algorithm is employed with our negative examples and in most cases it increases the accuracy more than 10% in comparison with the negative pair selection method in that paper.

Keywords: Interaction graph, Negative training data, Protein-Protein interaction, Support vector machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1668
1875 Analysis of Web User Identification Methods

Authors: Renáta Iváncsy, Sándor Juhász

Abstract:

Web usage mining has become a popular research area, as a huge amount of data is available online. These data can be used for several purposes, such as web personalization, web structure enhancement, web navigation prediction etc. However, the raw log files are not directly usable; they have to be preprocessed in order to transform them into a suitable format for different data mining tasks. One of the key issues in the preprocessing phase is to identify web users. Identifying users based on web log files is not a straightforward problem, thus various methods have been developed. There are several difficulties that have to be overcome, such as client side caching, changing and shared IP addresses and so on. This paper presents three different methods for identifying web users. Two of them are the most commonly used methods in web log mining systems, whereas the third on is our novel approach that uses a complex cookie-based method to identify web users. Furthermore we also take steps towards identifying the individuals behind the impersonal web users. To demonstrate the efficiency of the new method we developed an implementation called Web Activity Tracking (WAT) system that aims at a more precise distinction of web users based on log data. We present some statistical analysis created by the WAT on real data about the behavior of the Hungarian web users and a comprehensive analysis and comparison of the three methods

Keywords: Data preparation, Tracking individuals, Web useridentification, Web usage mining

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4361
1874 Investigation of the Main Trends of Tourist Expenses in Georgia

Authors: Nino Abesadze, Marine Mindorashvili, Nino Paresashvili

Abstract:

The main purpose of the article is to make complex statistical analysis of tourist expenses of foreign visitors. We used mixed technique of selection that implies rules of random and proportional selection. Computer software SPSS was used to compute statistical data for corresponding analysis. Corresponding methodology of tourism statistics was implemented according to international standards. Important information was collected and grouped from the major Georgian airports. Techniques of statistical observation were prepared. A representative population of foreign visitors and a rule of selection of respondents were determined. We have a trend of growth of tourist numbers and share of tourists from post-soviet countries constantly increases. Level of satisfaction with tourist facilities and quality of service has grown, but still we have a problem of disparity between quality of service and prices. The design of tourist expenses of foreign visitors is diverse; competitiveness of tourist products of Georgian tourist companies is higher.

Keywords: Tourist, expenses, methods, statistics, analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 911
1873 Immobilization of Lipase Enzyme by Low Cost Material: A Statistical Approach

Authors: Md. Z. Alam, Devi R. Asih, Md. N. Salleh

Abstract:

Immobilization of lipase enzyme produced from palm oil mill effluent (POME) by the activated carbon (AC) among the low cost support materials was optimized. The results indicated that immobilization of 94% was achieved by AC as the most suitable support material. A sequential optimization strategy based on a statistical experimental design, including one-factor-at-a-time (OFAT) method was used to determine the equilibrium time. Three components influencing lipase immobilization were optimized by the response surface methodology (RSM) based on the face-centered central composite design (FCCCD). On the statistical analysis of the results, the optimum enzyme concentration loading, agitation rate and carbon active dosage were found to be 30 U/ml, 300 rpm and 8 g/L respectively, with a maximum immobilization activity of 3732.9 U/g-AC after 2 hrs of immobilization. Analysis of variance (ANOVA) showed a high regression coefficient (R2) of 0.999, which indicated a satisfactory fit of the model with the experimental data. The parameters were statistically significant at p<0.05.

Keywords: Activated carbon, adsorption, immobilization, POME based lipase.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2529
1872 Learning to Recommend with Negative Ratings Based on Factorization Machine

Authors: Caihong Sun, Xizi Zhang

Abstract:

Rating prediction is an important problem for recommender systems. The task is to predict the rating for an item that a user would give. Most of the existing algorithms for the task ignore the effect of negative ratings rated by users on items, but the negative ratings have a significant impact on users’ purchasing decisions in practice. In this paper, we present a rating prediction algorithm based on factorization machines that consider the effect of negative ratings inspired by Loss Aversion theory. The aim of this paper is to develop a concave and a convex negative disgust function to evaluate the negative ratings respectively. Experiments are conducted on MovieLens dataset. The experimental results demonstrate the effectiveness of the proposed methods by comparing with other four the state-of-the-art approaches. The negative ratings showed much importance in the accuracy of ratings predictions.

Keywords: Factorization machines, feature engineering, negative ratings, recommendation systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 910
1871 Development of a Real-Time Energy Models for Photovoltaic Water Pumping System

Authors: Ammar Mahjoubi, Ridha Fethi Mechlouch, Belgacem Mahdhaoui, Ammar Ben Brahim

Abstract:

This purpose of this paper is to develop and validate a model to accurately predict the cell temperature of a PV module that adapts to various mounting configurations, mounting locations, and climates while only requiring readily available data from the module manufacturer. Results from this model are also compared to results from published cell temperature models. The models were used to predict real-time performance from a PV water pumping systems in the desert of Medenine, south of Tunisia using 60-min intervals of measured performance data during one complete year. Statistical analysis of the predicted results and measured data highlight possible sources of errors and the limitations and/or adequacy of existing models, to describe the temperature and efficiency of PV-cells and consequently, the accuracy of performance of PV water pumping systems prediction models.

Keywords: Temperature of a photovoltaic module, Predicted models, PV water pumping systems efficiency, Simulation, Desert of southern Tunisia.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1807
1870 Prediction of Cutting Tool Life in Drilling of Reinforced Aluminum Alloy Composite Using a Fuzzy Method

Authors: Mohammed T. Hayajneh

Abstract:

Machining of Metal Matrix Composites (MMCs) is very significant process and has been a main problem that draws many researchers to investigate the characteristics of MMCs during different machining process. The poor machining properties of hard particles reinforced MMCs make drilling process a rather interesting task. Unlike drilling of conventional materials, many problems can be seriously encountered during drilling of MMCs, such as tool wear and cutting forces. Cutting tool wear is a very significant concern in industries. Cutting tool wear not only influences the quality of the drilled hole, but also affects the cutting tool life. Prediction the cutting tool life during drilling is essential for optimizing the cutting conditions. However, the relationship between tool life and cutting conditions, tool geometrical factors and workpiece material properties has not yet been established by any machining theory. In this research work, fuzzy subtractive clustering system has been used to model the cutting tool life in drilling of Al2O3 particle reinforced aluminum alloy composite to investigate of the effect of cutting conditions on cutting tool life. This investigation can help in controlling and optimizing of cutting conditions when the process parameters are adjusted. The built model for prediction the tool life is identified by using drill diameter, cutting speed, and cutting feed rate as input data. The validity of the model was confirmed by the examinations under various cutting conditions. Experimental results have shown the efficiency of the model to predict cutting tool life.

Keywords: Composite, fuzzy, tool life, wear.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2057
1869 PM10 Prediction and Forecasting Using CART: A Case Study for Pleven, Bulgaria

Authors: Snezhana G. Gocheva-Ilieva, Maya P. Stoimenova

Abstract:

Ambient air pollution with fine particulate matter (PM10) is a systematic permanent problem in many countries around the world. The accumulation of a large number of measurements of both the PM10 concentrations and the accompanying atmospheric factors allow for their statistical modeling to detect dependencies and forecast future pollution. This study applies the classification and regression trees (CART) method for building and analyzing PM10 models. In the empirical study, average daily air data for the city of Pleven, Bulgaria for a period of 5 years are used. Predictors in the models are seven meteorological variables, time variables, as well as lagged PM10 variables and some lagged meteorological variables, delayed by 1 or 2 days with respect to the initial time series, respectively. The degree of influence of the predictors in the models is determined. The selected best CART models are used to forecast future PM10 concentrations for two days ahead after the last date in the modeling procedure and show very accurate results.

Keywords: Cross-validation, decision tree, lagged variables, short-term forecasting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 679
1868 A New History Based Method to Handle the Recurring Concept Shifts in Data Streams

Authors: Hossein Morshedlou, Ahmad Abdollahzade Barforoush

Abstract:

Recent developments in storage technology and networking architectures have made it possible for broad areas of applications to rely on data streams for quick response and accurate decision making. Data streams are generated from events of real world so existence of associations, which are among the occurrence of these events in real world, among concepts of data streams is logical. Extraction of these hidden associations can be useful for prediction of subsequent concepts in concept shifting data streams. In this paper we present a new method for learning association among concepts of data stream and prediction of what the next concept will be. Knowing the next concept, an informed update of data model will be possible. The results of conducted experiments show that the proposed method is proper for classification of concept shifting data streams.

Keywords: Data Stream, Classification, Concept Shift, History.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1250
1867 An Investigation into the Application of Artificial Neural Networks to the Prediction of Injuries in Sport

Authors: J. McCullagh, T. Whitfort

Abstract:

Artificial Neural Networks (ANNs) have been used successfully in many scientific, industrial and business domains as a method for extracting knowledge from vast amounts of data. However the use of ANN techniques in the sporting domain has been limited. In professional sport, data is stored on many aspects of teams, games, training and players. Sporting organisations have begun to realise that there is a wealth of untapped knowledge contained in the data and there is great interest in techniques to utilise this data. This study will use player data from the elite Australian Football League (AFL) competition to train and test ANNs with the aim to predict the onset of injuries. The results demonstrate that an accuracy of 82.9% was achieved by the ANNs’ predictions across all examples with 94.5% of all injuries correctly predicted. These initial findings suggest that ANNs may have the potential to assist sporting clubs in the prediction of injuries.

Keywords: Artificial Neural Networks, data, injuries, sport

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2813
1866 Grid-HPA: Predicting Resource Requirements of a Job in the Grid Computing Environment

Authors: M. Bohlouli, M. Analoui

Abstract:

For complete support of Quality of Service, it is better that environment itself predicts resource requirements of a job by using special methods in the Grid computing. The exact and correct prediction causes exact matching of required resources with available resources. After the execution of each job, the used resources will be saved in the active database named "History". At first some of the attributes will be exploit from the main job and according to a defined similarity algorithm the most similar executed job will be exploited from "History" using statistic terms such as linear regression or average, resource requirements will be predicted. The new idea in this research is based on active database and centralized history maintenance. Implementation and testing of the proposed architecture results in accuracy percentage of 96.68% to predict CPU usage of jobs and 91.29% of memory usage and 89.80% of the band width usage.

Keywords: Active Database, Grid Computing, ResourceRequirement Prediction, Scheduling,

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1405
1865 Data Mining on the Router Logs for Statistical Application Classification

Authors: M. Rahmati, S.M. Mirzababaei

Abstract:

With the advance of information technology in the new era the applications of Internet to access data resources has steadily increased and huge amount of data have become accessible in various forms. Obviously, the network providers and agencies, look after to prevent electronic attacks that may be harmful or may be related to terrorist applications. Thus, these have facilitated the authorities to under take a variety of methods to protect the special regions from harmful data. One of the most important approaches is to use firewall in the network facilities. The main objectives of firewalls are to stop the transfer of suspicious packets in several ways. However because of its blind packet stopping, high process power requirements and expensive prices some of the providers are reluctant to use the firewall. In this paper we proposed a method to find a discriminate function to distinguish between usual packets and harmful ones by the statistical processing on the network router logs. By discriminating these data, an administrator may take an approach action against the user. This method is very fast and can be used simply in adjacent with the Internet routers.

Keywords: Data Mining, Firewall, Optimization, Packetclassification, Statistical Pattern Recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1611
1864 Feature Selection Approaches with Missing Values Handling for Data Mining - A Case Study of Heart Failure Dataset

Authors: N.Poolsawad, C.Kambhampati, J. G. F. Cleland

Abstract:

In this paper, we investigated the characteristic of a clinical dataseton the feature selection and classification measurements which deal with missing values problem.And also posed the appropriated techniques to achieve the aim of the activity; in this research aims to find features that have high effect to mortality and mortality time frame. We quantify the complexity of a clinical dataset. According to the complexity of the dataset, we proposed the data mining processto cope their complexity; missing values, high dimensionality, and the prediction problem by using the methods of missing value replacement, feature selection, and classification.The experimental results will extend to develop the prediction model for cardiology.

Keywords: feature selection, missing values, classification, clinical dataset, heart failure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3181
1863 The Role and Importance of Genome Sequencing in Prediction of Cancer Risk

Authors: M. Sadeghi, H. Pezeshk, R. Tusserkani, A. Sharifi Zarchi, A. Malekpour, M. Foroughmand, S. Goliaei, M. Totonchi, N. Ansari–Pour

Abstract:

The role and relative importance of intrinsic and extrinsic factors in the development of complex diseases such as cancer still remains a controversial issue. Determining the amount of variation explained by these factors needs experimental data and statistical models. These models are nevertheless based on the occurrence and accumulation of random mutational events during stem cell division, thus rendering cancer development a stochastic outcome. We demonstrate that not only individual genome sequencing is uninformative in determining cancer risk, but also assigning a unique genome sequence to any given individual (healthy or affected) is not meaningful. Current whole-genome sequencing approaches are therefore unlikely to realize the promise of personalized medicine. In conclusion, since genome sequence differs from cell to cell and changes over time, it seems that determining the risk factor of complex diseases based on genome sequence is somewhat unrealistic, and therefore, the resulting data are likely to be inherently uninformative.

Keywords: Cancer risk, extrinsic factors, genome sequencing, intrinsic factors.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1076
1862 Improved Computational Efficiency of Machine Learning Algorithms Based on Evaluation Metrics to Control the Spread of Coronavirus in the UK

Authors: Swathi Ganesan, Nalinda Somasiri, Rebecca Jeyavadhanam, Gayathri Karthick

Abstract:

The COVID-19 crisis presents a substantial and critical hazard to worldwide health. Since the occurrence of the disease in late January 2020 in the UK, the number of infected people confirmed to acquire the illness has increased tremendously across the country, and the number of individuals affected is undoubtedly considerably high. The purpose of this research is to figure out a predictive machine learning (ML) archetypal that could forecast the COVID-19 cases within the UK. This study concentrates on the statistical data collected from 31st January 2020 to 31st March 2021 in the United Kingdom. Information on total COVID-19 cases registered, new cases encountered on a daily basis, total death registered, and patients’ death per day due to Coronavirus is collected from World Health Organization (WHO). Data preprocessing is carried out to identify any missing values, outliers, or anomalies in the dataset. The data are split into 8:2 ratio for training and testing purposes to forecast future new COVID-19 cases. Support Vector Machine (SVM), Random Forest (RF), and linear regression (LR) algorithms are chosen to study the model performance in the prediction of new COVID-19 cases. From the evaluation metrics such as r-squared value and mean squared error, the statistical performance of the model in predicting the new COVID-19 cases is evaluated. RF outperformed the other two ML algorithms with a training accuracy of 99.47% and testing accuracy of 98.26% when n = 30. The mean square error obtained for RF is 4.05e11, which is lesser compared to the other predictive models used for this study. From the experimental analysis, RF algorithm can perform more effectively and efficiently in predicting the new COVID-19 cases, which could help the health sector to take relevant control measures for the spread of the virus.

Keywords: COVID-19, machine learning, supervised learning, unsupervised learning, linear regression, support vector machine, random forest.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 115
1861 The Application of the Queuing Theory in the Traffic Flow of Intersection

Authors: Shuguo Yang, Xiaoyan Yang

Abstract:

It is practically significant to research the traffic flow of intersection because the capacity of intersection affects the efficiency of highway network directly. This paper analyzes the traffic conditions of an intersection in certain urban by the methods of queuing theory and statistical experiment, sets up a corresponding mathematical model and compares it with the actual values. The result shows that queuing theory is applied in the study of intersection traffic flow and it can provide references for the other similar designs.

Keywords: Intersection, Queuing theory, Statistical experiment, System metrics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7486
1860 Parameter Sensitivity Analysis of Artificial Neural Network for Predicting Water Turbidity

Authors: Chia-Ling Chang, Chung-Sheng Liao

Abstract:

The present study focuses on the discussion over the parameter of Artificial Neural Network (ANN). Sensitivity analysis is applied to assess the effect of the parameters of ANN on the prediction of turbidity of raw water in the water treatment plant. The result shows that transfer function of hidden layer is a critical parameter of ANN. When the transfer function changes, the reliability of prediction of water turbidity is greatly different. Moreover, the estimated water turbidity is less sensitive to training times and learning velocity than the number of neurons in the hidden layer. Therefore, it is important to select an appropriate transfer function and suitable number of neurons in the hidden layer in the process of parameter training and validation.

Keywords: Artificial Neural Network (ANN), sensitivity analysis, turbidity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2770