Search results for: Prediction Based Data Reduction

16643 Prediction of Protein Subchloroplast Locations using Random Forests

Authors: Chun-Wei Tung, Chyn Liaw, Shinn-Jang Ho, Shinn-Ying Ho

Abstract:

Protein subchloroplast locations are correlated with protein function. In contrast to the large number of available protein sequences, much less is known about their locations and functions. Experimental identification of protein locations and functions is costly and time-consuming. Accurate prediction of protein subchloroplast locations can accelerate the study of protein function in the chloroplast. This study proposes a Random Forest based method, ChloroRF, to predict protein subchloroplast locations using interpretable physicochemical properties. In addition to achieving high prediction accuracy, ChloroRF is able to select important physicochemical properties. These properties are also analyzed to provide insights into the underlying mechanism.

Keywords: Chloroplast, Physicochemical properties, Protein locations, Random Forests.

Downloads: 1677

16642 Association Rule and Decision Tree based Methods for Fuzzy Rule Base Generation

Authors: Ferenc Peter Pach, Janos Abonyi

Abstract:

This paper focuses on the data-driven generation of fuzzy IF...THEN rules. The resulting fuzzy rule base can be applied to build a classifier, a model used for prediction, or a decision support system. Among the wide range of possible approaches, the decision tree and association rule based algorithms are reviewed, and two new approaches are presented based on a priori fuzzy clustering based partitioning of the continuous input variables. An application study is also presented, in which the developed methods are tested on the well-known Wisconsin Breast Cancer classification problem.

Keywords:

Downloads: 2305

16641 Profitability Assessment of Granite Aggregate Production and the Development of a Profit Assessment Model

Authors: Melodi Mbuyi Mata, Blessing Olamide Taiwo, Afolabi Ayodele David

Abstract:

The purpose of this research is to create empirical models for assessing the profitability of granite aggregate production in aggregate quarries in Akure, Ondo State. In addition, an Artificial Neural Network (ANN) model and multivariate prediction models for granite profitability were developed. A formal survey questionnaire was used to collect data for the study. The data extracted from the case study mine include granite marketing operations, royalty, production costs, and mine production information. Descriptive statistics, MATLAB 2017, and SPSS 16.0 were used to analyze and model the data collected from granite traders in the study areas. The prediction accuracy of the ANN and multivariate regression models was compared using the coefficient of determination (R²), root mean square error (RMSE), and mean square error (MSE). Due to the high prediction error, the model evaluation indices revealed that the ANN model was suitable for predicting generated profit in a typical quarry. More quarries in Nigeria's southwest region and other geopolitical zones should be considered to improve ANN prediction accuracy.
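The three evaluation indices named in this abstract have standard definitions, which can be sketched as follows. This is a minimal illustration only: the function name and the sample numbers are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(actual, predicted):
    """Return (R^2, RMSE, MSE) for paired lists of observed and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    mean_actual = sum(actual) / n
    # Residual and total sums of squares.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    mse = ss_res / n
    rmse = math.sqrt(mse)
    r2 = 1.0 - ss_res / ss_tot  # undefined when all observed values are equal
    return r2, rmse, mse


# Hypothetical profit figures, for illustration only (not data from the study).
observed = [10.0, 12.0, 14.0, 16.0]
model_output = [11.0, 12.0, 13.0, 16.0]
r2, rmse, mse = regression_metrics(observed, model_output)
```

A higher R² and a lower RMSE or MSE indicate a closer fit, so two models are compared by computing the same indices on the same held-out data.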

Keywords: National development, granite, profitability assessment, ANN models.

Downloads: 84

16640 Calibration Model of %Titratable Acidity (Citric Acid) for Intact Tomato by Transmittance SW-NIR Spectroscopy

Authors: K. Petcharaporn, S. Kumchoo

Abstract:

Acidity (citric acid) is one of the chemical contents that indicates internal quality, and it is a maturity index of tomato. The titratable acidity (%TA) can be predicted non-destructively using transmittance short-wavelength near-infrared (SW-NIR) spectroscopy in the wavelength range 665-955 nm. A set of 167 tomato samples, divided into a training set of 117 samples and a test set of 50 samples, was used to establish a calibration model for predicting %TA by partial least squares regression (PLSR). Spectra pretreated with MSC gave the optimal calibration result (R = 0.92, RMSEC = 0.03%), and the model achieved high accuracy for %TA prediction on the test set (R = 0.81, RMSEP = 0.05%). The test-set results show that transmittance SW-NIR spectroscopy can be used as a non-destructive method for %TA prediction in tomato.

Keywords: Tomato, quality, prediction, transmittance, titratable acidity, citric acid.

Downloads: 2700

16639 Determining the Width and Depths of Cut in Milling on the Basis of a Multi-Dexel Model

Authors: Jens Friedrich, Matthias A. Gebele, Armin Lechler, Alexander Verl

Abstract:

Chatter vibrations and process instabilities are the most important factors limiting the productivity of the milling process. Chatter can lead to damage of the tool, the part, or the machine tool. Therefore, the estimation and prediction of process stability is very important. The process stability depends on the spindle speed, the depth of cut, and the width of cut. In milling, the process conditions are defined in the NC-program. While the spindle speed is directly coded in the NC-program, the depth and width of cut are unknown. This paper presents a new simulation-based approach for predicting the depth and width of cut of a milling process. The prediction is based on a material removal simulation with an analytically represented tool shape and a multi-dexel approach for the workpiece. The new calculation method allows direct estimation of the depth and width of cut, which are the parameters that influence process stability, instead of the removed volume as in existing approaches. This knowledge can be used to predict the stability of new, unknown parts. Moreover, with an additional vibration sensor, the stability lobe diagram of a milling process can be estimated and improved based on the estimated depth and width of cut.

Keywords: Dexel, process stability, material removal, milling.

Downloads: 2261

16638 Identification, Prediction and Detection of the Process Fault in a Cement Rotary Kiln by Locally Linear Neuro-Fuzzy Technique

Authors: Masoud Sadeghian, Alireza Fatehi

Abstract:

In this paper, we use a nonlinear system identification method to predict and detect process faults in a cement rotary kiln. After selecting proper inputs and outputs, an input-output model is identified for the plant. To identify the various operating points in the kiln, a Locally Linear Neuro-Fuzzy (LLNF) model is used. This model is trained by the LOLIMOT algorithm, which is an incremental tree-structure algorithm. Using this method, we obtained three distinct models for the normal and faulty situations in the kiln. One of the models is for the normal condition of the kiln, with a 15-minute prediction horizon. The other two models are for the two faulty situations in the kiln, with a 7-minute prediction horizon. Finally, we detect these faults in validation data. The data used in this study were collected from White Saveh Cement Company.

Keywords: Cement Rotary Kiln, Fault Detection, Delay Estimation Method, Locally Linear Neuro Fuzzy Model, LOLIMOT.

Downloads: 1673

16637 Model-free Prediction based on Tracking Theory and Newton Form of Polynomial

Authors: Guoyuan Qi, Yskandar Hamam, Barend Jacobus van Wyk, Shengzhi Du

Abstract:

The majority of existing predictors for time series are model-dependent and therefore require some prior knowledge for the identification of complex systems, usually involving system identification, extensive training, or online adaptation in the case of time-varying systems. Additionally, since a time series is usually generated by complex processes such as the stock market or other chaotic systems, identification, modeling or the online updating of parameters can be problematic. In this paper a model-free predictor (MFP) for a time series produced by an unknown nonlinear system or process is derived using tracking theory. An identical derivation of the MFP using the property of the Newton form of the interpolating polynomial is also presented. The MFP is able to accurately predict future values of a time series, is stable, has few tuning parameters and is desirable for engineering applications due to its simplicity, fast prediction speed and extremely low computational load. The performance of the proposed MFP is demonstrated using the prediction of the Dow Jones Industrial Average stock index.
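The link between prediction and the Newton form of the interpolating polynomial can be illustrated with plain one-step polynomial extrapolation. The sketch below is an assumption-laden illustration, not the authors' MFP (which is derived from tracking theory and has its own tuning parameters): for equally spaced samples, extrapolating the Newton backward-difference polynomial one step ahead reduces to summing the backward differences at the latest sample. The function name is hypothetical.

```python
def newton_one_step_prediction(series, order):
    """Extrapolate the next sample from the last (order + 1) equally spaced samples.

    Uses the Newton backward-difference form: the degree-`order` interpolating
    polynomial evaluated one step ahead equals the sum of the backward
    differences of orders 0..order taken at the most recent sample.
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    if len(series) < order + 1:
        raise ValueError("need at least order + 1 samples")
    diffs = list(series[-(order + 1):])
    prediction = 0.0
    for _ in range(order + 1):
        prediction += diffs[-1]  # backward difference of the current order
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
    return prediction


# A quadratic sequence is reproduced exactly by a second-order extrapolation.
next_value = newton_one_step_prediction([1, 4, 9], order=2)
```

No model identification or training is involved, which is the sense in which such a predictor is model-free; higher orders follow curvature more closely but amplify noise in the samples.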

Keywords: Forecast, model-free predictor, prediction, time series

Downloads: 1784

16636 A Mathematical Modelling to Predict Rhamnolipid Production by Pseudomonas aeruginosa under Nitrogen Limiting Fed-Batch Fermentation

Authors: Seyed Ali Jafari, Mohammad Ghomi Avili, Emad Benhelal

Abstract:

In this study, a mathematical model was proposed and the accuracy of this model was assessed to predict the growth of Pseudomonas aeruginosa and rhamnolipid production under nitrogen limiting (sodium nitrate) fed-batch fermentation. All of the parameters used in this model were achieved individually without using any data from the literature. The overall growth kinetic of the strain was evaluated using a dual-parallel substrate Monod equation which was described by several batch experimental data. Fed-batch data under different glycerol (as the sole carbon source, C/N=10) concentrations and feed flow rates were used to describe the proposed fed-batch model and other parameters. In order to verify the accuracy of the proposed model several verification experiments were performed in a vast range of initial glycerol concentrations. While the results showed an acceptable prediction for rhamnolipid production (less than 10% error), in case of biomass prediction the errors were less than 23%. It was also found that the rhamnolipid production by P. aeruginosa was more sensitive at low glycerol concentrations. Based on the findings of this work, it was concluded that the proposed model could effectively be employed for rhamnolipid production by this strain under fed-batch fermentation on up to 80 g l⁻¹ glycerol.

Keywords: Fed-batch culture, glycerol, kinetic parameters, modelling, Pseudomonas aeruginosa, rhamnolipid.

Downloads: 2453

16635 Motion Prediction and Motion Vector Cost Reduction during Fast Block Motion Estimation in MCTF

Authors: Karunakar A K, Manohara Pai M M

Abstract:

In the 3D-wavelet video coding framework, temporal filtering is done along the trajectory of motion using Motion Compensated Temporal Filtering (MCTF). Hence MCTF needs a computationally efficient motion estimation technique. In this paper a predictive technique is proposed to reduce the computational complexity of the MCTF framework by exploiting the high correlation among the frames in a Group of Pictures (GOP). The proposed technique applies the coarse and fine searches of any fast block based motion estimation only to the first pair of frames in a GOP. The generated motion vectors are supplied to the next consecutive frames, and even to subsequent temporal levels, and only a fine search is carried out around those predicted motion vectors. Hence the coarse search is skipped for all motion estimation in a GOP except for the first pair of frames. The technique has been tested for different fast block based motion estimation algorithms over different standard test sequences using MC-EZBC, a state-of-the-art scalable video coder. The simulation results reveal a substantial reduction (i.e. 20.75% to 38.24%) in the number of search points during motion estimation, without compromising the quality of the reconstructed video compared to non-predictive techniques. Since the motion vectors of all pairs of frames in a GOP except the first pair will have a value within ±1 of the motion vectors of the previous pair of frames, the number of bits required for motion vectors is also reduced by 50%.
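The idea of skipping the coarse search and refining only around a predicted motion vector can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' implementation: frames are plain 2D lists of luma values, the matching cost is the sum of absolute differences (SAD), and the function names and toy frames are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in cur and a displaced block in ref."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def fine_search(cur, ref, bx, by, size, predicted_mv, radius=1):
    """Test only the (2 * radius + 1)^2 candidates centred on the predicted vector."""
    height, width = len(ref), len(ref[0])
    pdx, pdy = predicted_mv
    best_mv, best_cost = None, None
    for dy in range(pdy - radius, pdy + radius + 1):
        for dx in range(pdx - radius, pdx + radius + 1):
            # Skip candidates whose reference block would fall outside the frame.
            inside_x = 0 <= bx + dx and bx + dx + size <= width
            inside_y = 0 <= by + dy and by + dy + size <= height
            if not (inside_x and inside_y):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


# Toy 8x8 frames: the current frame is the reference shifted by (dx, dy) = (2, 1).
ref = [[10 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[min(y + 1, 7)][min(x + 2, 7)] for x in range(8)] for y in range(8)]
# The vector inherited from the previous frame pair, (1, 1), is off by one in x.
mv, cost = fine_search(cur, ref, bx=2, by=2, size=2, predicted_mv=(1, 1))
```

With `radius=1` only nine candidates are evaluated per block, which is where the reduction in search points comes from when the predicted vector is already close to the true one.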

Keywords: Motion Compensated Temporal Filtering, predictive motion estimation, lifted wavelet transform, motion vector

Downloads: 1619

16634 An Enhanced Artificial Neural Network for Air Temperature Prediction

Authors: Brian A. Smith, Ronald W. McClendon, Gerrit Hoogenboom

Abstract:

The mitigation of crop loss due to damaging freezes requires accurate air temperature prediction models. An improved model for temperature prediction in Georgia was developed by including information on seasonality and modifying parameters of an existing artificial neural network model. Alternative models were compared by instantiating and training multiple networks for each model. The inclusion of up to 24 hours of prior weather information and inputs reflecting the day of year were among improvements that reduced average four-hour prediction error by 0.18°C compared to the prior model. Results strongly suggest model developers should instantiate and train multiple networks with different initial weights to establish appropriate model parameters.

Keywords: Time-series forecasting, weather modeling.

Downloads: 1868

16633 Low Complexity Hybrid Scheme for PAPR Reduction in OFDM Systems Based on SLM and Clipping

Authors: V. Sudha, D. Sriram Kumar

Abstract:

In this paper, we present a low complexity hybrid scheme using conventional selective mapping (C-SLM) and clipping algorithms to reduce the high peak-to-average power ratio (PAPR) of the orthogonal frequency division multiplexing (OFDM) signal. In the proposed scheme, the input data sequence (X) is divided into two sub-blocks; the clipping algorithm is then applied to the first sub-block, whereas the C-SLM algorithm is applied to the second sub-block, in order to reduce both computational complexity and PAPR. The resultant time domain OFDM signal is obtained by combining the outputs of the two sub-blocks. The simulation results show that the proposed hybrid scheme provides a 0.45 dB PAPR reduction gain at a CCDF value of 10⁻² and a 52% reduction in computational complexity compared to the C-SLM scheme, at the expense of slight degradation in bit error rate (BER) performance.
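The quantity being reduced, and the effect of the clipping step, can be sketched in a few lines. This is only an illustration of the PAPR definition and of amplitude clipping, not the hybrid C-SLM scheme itself; the function names, the symbol values, and the clipping threshold are assumptions chosen for the example.

```python
import cmath
import math


def idft(symbols):
    """Inverse DFT: map frequency-domain symbols to time-domain OFDM samples."""
    n = len(symbols)
    return [
        sum(s * cmath.exp(2j * math.pi * k * t / n) for k, s in enumerate(symbols)) / n
        for t in range(n)
    ]


def papr_db(signal):
    """Peak-to-average power ratio of a sampled signal, in dB."""
    powers = [abs(v) ** 2 for v in signal]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))


def clip(signal, threshold):
    """Limit each sample's magnitude to `threshold` while preserving its phase."""
    return [v if abs(v) <= threshold else threshold * v / abs(v) for v in signal]


# Identical symbols on every subcarrier add coherently: the worst case for PAPR.
worst_case_db = papr_db(idft([1] * 8))

# Hypothetical QPSK symbols; clip at 80% of the peak magnitude (arbitrary choice).
time_signal = idft([1 + 1j, 1 - 1j, -1 + 1j, 1 + 1j, 1 + 1j, -1 - 1j, 1 + 1j, 1 - 1j])
limit = 0.8 * max(abs(v) for v in time_signal)
clipped = clip(time_signal, limit)
```

Clipping can never raise the PAPR of a signal, but it distorts the samples it touches, which is consistent with the BER degradation the abstract reports as the trade-off.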

Keywords: CCDF, Clipping, OFDM, PAPR, SLM.

Downloads: 1274

16632 MIMO System Order Reduction Using Real-Coded Genetic Algorithm

Authors: Swadhin Ku. Mishra, Sidhartha Panda, Simanchala Padhy, C. Ardil

Abstract:

In this paper, the real-coded genetic algorithm (RCGA) optimization technique is applied to the order reduction of large-scale linear dynamic multi-input-multi-output (MIMO) systems. The method is based on an error minimization technique in which the integral square error between the transient responses of the original and reduced order models is minimized by RCGA. The reduction procedure is simple and computer-oriented, and the approach is comparable in quality with other well-known reduction techniques. Also, the proposed method guarantees stability of the reduced model if the original high-order MIMO system is stable. The proposed approach to MIMO system order reduction is illustrated with the help of an example, and the results are compared with other recently published, well-known reduction techniques to show its superiority.

Keywords: Multi-input-multi-output (MIMO) system, Model order reduction, Integral squared error (ISE), Real-coded genetic algorithm

Downloads: 2263

16631 A Proposed Performance Prediction Approach for Manufacturing Processes using ANNs

Authors: M. S. Abdelwahed, M. A. El-Baz, T. T. El-Midany

Abstract:

This paper aims to provide an approach for predicting the performance of a product after multiple stages of manufacturing processes, as well as assembly. The approach aims to control and subsequently identify the relationship between the process inputs and outputs so that a process engineer can more accurately predict how the process output will perform based on the system inputs. The approach is guided by a six-sigma methodology to obtain improved performance. A case study of the manufacture of a hermetic reciprocating compressor is presented. The artificial neural network (ANN) technique is introduced to improve performance prediction within this manufacturing environment. The results demonstrate that the approach predicts accurately and effectively.

Keywords: Artificial neural networks, Reciprocating compressor manufacturing, Performance prediction, Quality improvement

Downloads: 1782

16630 Impact of Faults in Different Software Systems: A Survey

Authors: Neeraj Mohan, Parvinder S. Sandhu, Hardeep Singh

Abstract:

Software maintenance is an extremely important activity in the software development life cycle. It involves a great deal of human effort, cost, and time. Software maintenance may be further subdivided into activities such as fault prediction, fault detection, fault prevention, and fault correction. This topic has gained substantial attention due to sophisticated and complex applications, commercial hardware, clustered architecture, and artificial intelligence. In this paper we survey the work done in the field of software maintenance. Software fault prediction has been studied in the context of fault-prone modules, self-healing systems, developer information, maintenance models, etc. Still, much remains to be explored in the field of fault severity, such as modeling and weighting the impact of different kinds of faults in various types of software systems.

Keywords: Fault prediction, Software Maintenance, Automated Fault Prediction, and Failure Mode Analysis

Downloads: 2079

16629 Methods for Distinction of Cattle Using Supervised Learning

Authors: Radoslav Židek, Veronika Šidlová, Radovan Kasarda, Birgit Fuerst-Waltl

Abstract:

Machine learning represents a set of topics dealing with the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction, based on models derived from existing data. The data can present identification patterns which are used to classify into groups. The result of the analysis is a pattern which can be used to identify a data set without the need to obtain the input data used to create this pattern. An important requirement in this process is careful data preparation, validation of the model used, and its suitable interpretation. For breeders, it is important to know the origin of animals from the point of view of genetic diversity. In the case of missing pedigree information, other methods can be used to trace an animal's origin. Genetic diversity recorded in genetic data holds relatively useful information for identifying animals originating from individual countries. We can conclude that the application of data mining to molecular genetic data using supervised learning is an appropriate tool for hypothesis testing and identifying an individual.

Keywords: Genetic data, Pinzgau cattle, supervised learning.

Downloads: 2318

16628 Dry Relaxation Shrinkage Prediction of Bordeaux Fiber Using a Feed Forward Neural

Authors: Baeza S. Roberto

Abstract:

The knitted fabric suffers a deformation in its dimensions due to stretching and tension factors, transverse and longitudinal respectively, during the process in rectilinear knitting machines so it performs a dry relaxation shrinkage procedure and thermal action of prefixed to obtain stable conditions in the knitting. This paper presents a dry relaxation shrinkage prediction of Bordeaux fiber using a feed forward neural network and linear regression models. Six operational alternatives of shrinkage were predicted. A comparison of the results was performed finding neural network models with higher levels of explanation of the variability and prediction. The presence of different reposes is included. The models were obtained through a neural toolbox of Matlab and Minitab software with real data in a knitting company of Southern Guanajuato. The results allow predicting dry relaxation shrinkage of each alternative operation.

Keywords: Neural network, dry relaxation, knitting, linear regression.

Downloads: 1760

16627 Using High Performance Computing for Online Flood Monitoring and Prediction

Authors: Stepan Kuchar, Martin Golasowski, Radim Vavrik, Michal Podhoranyi, Boris Sir, Jan Martinovic

Abstract:

The main goal of this article is to describe the online flood monitoring and prediction system Floreon+ primarily developed for the Moravian-Silesian region in the Czech Republic and the basic process it uses for running automatic rainfall-runoff and hydrodynamic simulations along with their calibration and uncertainty modeling. It takes a long time to execute such process sequentially, which is not acceptable in the online scenario, so the use of a high performance computing environment is proposed for all parts of the process to shorten their duration. Finally, a case study on the Ostravice River catchment is presented that shows actual durations and their gain from the parallel implementation.

Keywords: Flood prediction process, High performance computing, Online flood prediction system, Parallelization.

Downloads: 2331

16626 Copper Price Prediction Model for Various Economic Situations

Authors: Haidy S. Ghali, Engy Serag, A. Samer Ezeldin

Abstract:

Copper is an essential raw material used in the construction industry. During 2021 and the first half of 2022, the global market suffered from a significant fluctuation in copper raw material prices due to the aftermath of both the COVID-19 pandemic and the Russia-Ukraine war, which exposed its consumers to an unexpected financial risk. Therefore, this paper aims to develop two hybrid price prediction models using artificial neural network and long short-term memory (ANN-LSTM), implemented in Python, that can forecast the average monthly copper prices traded on the London Metal Exchange; the first model is a multivariate model that forecasts the copper price of the next month, and the second is a univariate model that predicts the copper prices of the upcoming three months. Historical data of average monthly London Metal Exchange copper prices are collected from January 2009 till July 2022, and potential external factors are identified and employed in the multivariate model. These factors lie under three main categories: energy prices, and economic indicators of the three major exporting countries of copper depending on the data availability. Before developing the LSTM models, the collected external parameters are analyzed with respect to the copper prices using correlation and multicollinearity tests in R software; then, the parameters are further screened to select those that influence the copper prices. Then, the two LSTM models are developed, and the dataset is divided into training, validation, and testing sets. The results show that the performance of the 3-month prediction model is better than that of the 1-month prediction model; still, both models can act as predicting tools for diverse economic situations.

Keywords: Copper prices, prediction model, neural network, time series forecasting.

Downloads: 188

16625 Design Based Performance Prediction of Component Based Software Products

Authors: K. S. Jasmine, R. Vasantha

Abstract:

Component-based software engineering provides an opportunity for better quality and increased productivity in software development by using reusable software components [10]. One of the most critical aspects of the quality of a software system is its performance. The systematic application of software performance engineering techniques throughout the development process can help to identify design alternatives that preserve desirable qualities such as extensibility and reusability while meeting performance objectives [1]. In the present scenario, software engineering methodologies strongly focus on the functionality of the system, while applying a "fix-it-later" approach to software performance aspects [3]. As a result, lengthy fine-tuning, expensive extra hardware, or even redesigns are necessary for the system to meet the performance requirements. In this paper, we propose a design-based, implementation-independent performance prediction approach to reduce the overhead incurred in the later phases while developing a performance-guaranteed software product with the help of the Unified Modeling Language (UML).

Keywords: Software Reuse, Component-based development, Unified Modeling Language, Software performance, Software components, Performance engineering, Software engineering.

Downloads: 1867

16624 Critical Assessment of Scoring Schemes for Protein-Protein Docking Predictions

Authors: Dhananjay C. Joshi, Jung-Hsin Lin

Abstract:

Protein-protein interactions (PPI) play a crucial role in many biological processes such as cell signalling, transcription, translation, replication, signal transduction, and drug targeting. Structural information about protein-protein interaction is essential for understanding the molecular mechanisms of these processes. Structures of protein-protein complexes are still difficult to obtain by biophysical methods such as NMR and X-ray crystallography, and therefore protein-protein docking computation is considered an important approach for understanding protein-protein interactions. However, reliable prediction of protein-protein complexes is still under way. In the past decades, several grid-based docking algorithms based on the Katchalski-Katzir scoring scheme were developed, e.g., FTDock, ZDOCK, HADDOCK, RosettaDock, HEX, etc. However, the success rate of protein-protein docking prediction is still far from ideal. In this work, we first propose a more practical measure for evaluating the success of protein-protein docking predictions, the rate of first success (RFS), which is similar to the concept of mean first passage time (MFPT). Accordingly, we have assessed the ZDOCK bound and unbound benchmarks 2.0 and 3.0. We also created a new benchmark set for protein-protein docking predictions, in which the complexes have experimentally determined binding affinity data. We performed free energy calculations based on the solution of the non-linear Poisson-Boltzmann equation (nlPBE) to improve the binding mode prediction. We used the well-studied barnase-barstar system to validate the parameters for free energy calculations. In addition, the nlPBE-based free energy calculations were conducted for the cases badly predicted by ZDOCK and ZRANK. We found that direct molecular mechanics energetics cannot be used to discriminate the native binding pose from the decoys. Our results indicate that nlPBE-based calculations appear to be one of the promising approaches for improving the success rate of binding pose predictions.

Keywords: protein-protein docking, protein-protein interaction, molecular mechanics energetics, Poisson-Boltzmann calculations

Downloads: 1805

16623 Iraqi Short Term Electrical Load Forecasting Based On Interval Type-2 Fuzzy Logic

Authors: Firas M. Tuaimah, Huda M. Abdul Abbas

Abstract:

Accurate Short Term Load Forecasting (STLF) is essential for a variety of decision making processes. However, forecasting accuracy can drop due to the presence of uncertainty in the operation of energy systems or unexpected behavior of exogenous variables. The Interval Type 2 Fuzzy Logic System (IT2 FLS), with its additional degrees of freedom, is an excellent tool for handling uncertainties, and it improves the prediction accuracy. The training data used in this study cover the period from January 1, 2012 to February 1, 2012 for the winter season and the period from July 1, 2012 to August 1, 2012 for the summer season. The actual load forecasting period runs from January 22 to 28, 2012 for the winter model and from July 22 to 28, 2012 for the summer model. The real data are from the Iraqi power system, which belongs to the Ministry of Electricity.

Keywords: Short term load forecasting, prediction interval, type 2 fuzzy logic systems.

Downloads: 1888

16622 The Design of a Vehicle Traffic Flow Prediction Model for a Gauteng Freeway Based on an Ensemble of Multi-Layer Perceptron

Authors: Tebogo Emma Makaba, Barnabas Ndlovu Gatsheni

Abstract:

The cities of Johannesburg and Pretoria, both located in Gauteng province, are separated by a distance of 58 km. The traffic queues on the Ben Schoeman freeway, which connects these two cities, can stretch for almost 1.5 km. Vehicle traffic congestion impacts negatively on business and on commuters' quality of life. The goal of this paper is to identify variables that influence the flow of traffic and to design a vehicle traffic prediction model that will predict the traffic flow pattern in advance. The model will enable motorists to make appropriate travel decisions ahead of time. The data used were collected by Mikro's Traffic Monitoring (MTM). A Multi-Layer Perceptron (MLP) was used individually to construct the model, and the MLP was also combined with the Bagging ensemble method to train on the data. The cross-validation method was used for evaluating the models. The results obtained from the techniques were compared using predictive and prediction costs. The cost was computed using a combination of the loss matrix and the confusion matrix. The prediction models designed show that the status of the traffic flow on the freeway can be predicted using the following parameters: travel time, average speed, traffic volume, and day of month. The implications of this work are that commuters will be able to spend less time travelling on the route and more time with their families. The logistics industry will save more than twice what it is currently spending.
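The abstract does not spell out how the loss matrix and the confusion matrix are combined. One common convention, assumed here purely for illustration, is to weight each confusion-matrix count by the loss assigned to that (actual, predicted) pair and sum the results. The function name, class labels, and numbers below are hypothetical.

```python
def total_cost(confusion, loss):
    """Sum of confusion counts weighted by the loss for each (actual, predicted) cell."""
    total = 0
    for count_row, loss_row in zip(confusion, loss):
        for count, penalty in zip(count_row, loss_row):
            total += count * penalty
    return total


# Rows are actual classes, columns are predicted classes: [congested, free-flow].
confusion = [
    [40, 10],  # actual congested: 40 predicted correctly, 10 missed
    [5, 45],   # actual free-flow: 5 false alarms, 45 predicted correctly
]
# Hypothetical losses: missing congestion costs five times more than a false alarm.
loss = [
    [0, 5],
    [1, 0],
]
cost = total_cost(confusion, loss)
average_cost = cost / sum(sum(row) for row in confusion)
```

Under this convention two classifiers with the same accuracy can have different costs, because the loss matrix penalises some kinds of error more heavily than others.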

Keywords: Bagging ensemble methods, confusion matrix, multi-layer perceptron, vehicle traffic flow.

Downloads: 1777

16621 Electricity Consumption Prediction Model using Neuro-Fuzzy System

Authors: Rahib Abiyev, Vasif H. Abiyev, C. Ardil

Abstract:

In this paper, the development of a neural network based fuzzy inference system for electricity consumption prediction is considered. Electricity consumption depends on a number of factors, such as the number of customers, the season, the types of customers, the number of plants, etc. It is a nonlinear process and can be described by chaotic time series. The structure and algorithms of a neuro-fuzzy system for predicting future values of electricity consumption are described. To determine the unknown coefficients of the system, a supervised learning algorithm is used. As a result of learning, the rules of the neuro-fuzzy system are formed. The developed system is applied to predicting future values of electricity consumption in Northern Cyprus. A simulation of the neuro-fuzzy system has been performed.

Keywords: Fuzzy logic, neural network, neuro-fuzzy system, neuro-fuzzy prediction.

Downloads: 2012

16620 Towards End-To-End Disease Prediction from Raw Metagenomic Data

Authors: Maxence Queyrel, Edi Prifti, Alexandre Templier, Jean-Daniel Zucker

Abstract:

Analysis of the human microbiome using metagenomic sequencing data has demonstrated high ability in discriminating various human diseases. Raw metagenomic sequencing data require multiple complex and computationally heavy bioinformatics steps prior to data analysis. Such data contain millions of short sequences read from the fragmented DNA sequences and stored as fastq files. Conventional processing pipelines consist of multiple steps including quality control, filtering, and alignment of sequences against genomic catalogs (genes, species, taxonomic levels, functional pathways, etc.). These pipelines are complex to use, time consuming, and rely on a large number of parameters that often introduce variability and affect the estimation of the microbiome elements. Training Deep Neural Networks directly from raw sequencing data is a promising approach to bypass some of the challenges associated with mainstream bioinformatics pipelines. Most of these methods use the concept of word and sentence embeddings that create a meaningful numerical representation of DNA sequences, while extracting features and reducing the dimensionality of the data. In this paper we present an end-to-end approach that classifies patients into disease groups directly from raw metagenomic reads: metagenome2vec. This approach is composed of four steps: (i) generating a vocabulary of k-mers and learning their numerical embeddings; (ii) learning DNA sequence (read) embeddings; (iii) identifying the genome from which the sequence is most likely to come; and (iv) training a multiple instance learning classifier which predicts the phenotype based on the vector representation of the raw data. An attention mechanism is applied in the network so that the model can be interpreted, assigning a weight to the influence of each genome on the prediction.
Using two public real-life datasets as well as a simulated one, we demonstrated that this original approach reaches high performance, comparable with the state-of-the-art methods applied directly on processed data through mainstream bioinformatics workflows. These results are encouraging for this proof-of-concept work. We believe that with further dedication, the DNN models have the potential to surpass mainstream bioinformatics workflows in disease classification tasks.
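The first half of step (i), turning reads into a k-mer vocabulary, can be sketched as follows. This is an illustration of k-mer tokenisation in general, not the metagenome2vec code: the function names, the rule of skipping k-mers with ambiguous bases, and the toy reads are all assumptions, and the embedding learning itself is not shown.

```python
from collections import Counter

VALID_BASES = set("ACGT")

def kmers(read, k):
    """Return the overlapping k-mers of a read, skipping any with non-ACGT bases."""
    read = read.upper()
    return [
        read[i:i + k]
        for i in range(len(read) - k + 1)
        if set(read[i:i + k]) <= VALID_BASES
    ]

def build_vocabulary(reads, k):
    """Map each distinct k-mer to an integer id, most frequent first."""
    counts = Counter(km for read in reads for km in kmers(read, k))
    # Ties are broken alphabetically so the ids are deterministic.
    ordered = sorted(counts, key=lambda km: (-counts[km], km))
    return {km: idx for idx, km in enumerate(ordered)}

def encode(read, k, vocab):
    """Turn a read into the sequence of k-mer ids an embedding model would consume."""
    return [vocab[km] for km in kmers(read, k) if km in vocab]

# Toy reads; real input would be parsed from fastq files.
reads = ["ACGT", "CGTA"]
vocab = build_vocabulary(reads, k=3)
print(vocab)
print(encode("ACGTA", 3, vocab))
```

The integer sequences produced by `encode` play the role of tokenised sentences, so a word-embedding model can be trained on them in the usual way.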

Keywords: Metagenomics, phenotype prediction, deep learning, embeddings, multiple instance learning.

Downloads: 911

16619 Pre-Operative Tool for Facial-Post-Surgical Estimation and Detection

Authors: Ayat E. Ali, Christeen R. Aziz, Merna A. Helmy, Mohammed M. Malek, Sherif H. El-Gohary

Abstract:

Goal: The purpose of the project was to predict the outcome of plastic surgery from patients’ pre-operative images and to show this prediction on a screen, so that the current appearance can be compared with the appearance after the surgery. Methods: To this aim, we implemented software that uses data from the internet on facial skin diseases, skin burns, and pre- and post-operative images of plastic surgeries; the post-surgical prediction is then made using K-nearest neighbor (KNN). We also designed and fabricated a smart mirror divided into two parts, a screen and a reflective mirror, so that the patient's pre- and post-operative appearance are shown at the same time. Results: We worked on some skin conditions such as vitiligo, skin burns and wrinkles. We classified the three degrees of burns using a KNN classifier with an accuracy of 60%. We also succeeded in segmenting the area of vitiligo. Our future work will include working on more skin diseases, classifying them and giving a prediction of the look after surgery. We will also go deeper into facial deformities and plastic surgeries such as nose reshaping and face slimming. Conclusion: Our project will give a prediction that relates strongly to the real look after surgery and will reduce differences in diagnosis among doctors. Significance: The mirror may have broad societal appeal, as it will narrow the gap between patients' satisfaction and medical standards.
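For readers unfamiliar with KNN, a minimal classifier is sketched below. It assumes each image has already been reduced to a short numeric feature vector; the two features, their values and the burn-degree labels are invented for illustration and do not come from the paper.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each Euclidean distance with its label, then sort nearest first.
    neighbours = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D features extracted from burn images.
train_X = [(0.1, 0.2), (0.2, 0.1), (0.5, 0.5), (0.6, 0.5), (0.9, 0.8), (0.8, 0.9)]
train_y = ["first", "first", "second", "second", "third", "third"]

print(knn_predict(train_X, train_y, (0.55, 0.45), k=3))
```

The choice of `k` and of the distance metric strongly affects accuracy, so both would normally be tuned on held-out data.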

Keywords: K-nearest neighbor, face detection, vitiligo, bone deformity.

Downloads: 701

16618 Performance Optimization of Data Mining Application Using Radial Basis Function Classifier

Authors: M. Govindarajan, R. M. Chandrasekaran

Abstract:

Text data mining is a process of exploratory data analysis. Classification maps data into predefined groups or classes. It is often referred to as supervised learning because the classes are determined before examining the data. This paper describes a proposed radial basis function classifier that is compared, using comparative cross-validation, with an existing radial basis function classifier. The feasibility and the benefits of the proposed approach are demonstrated by means of a data mining problem: direct marketing. Direct marketing has become an important application field of data mining. Comparative cross-validation involves estimation of accuracy by either stratified k-fold cross-validation or equivalent repeated random subsampling. While the proposed method may have high bias, its performance (accuracy estimation in our case) may be poor due to high variance. Thus the accuracy with the proposed radial basis function classifier was lower than with the existing radial basis function classifier. However, there is a smaller improvement in runtime and a larger improvement in precision and recall. In the proposed method, classification accuracy and prediction accuracy are determined, and the prediction accuracy is comparatively high.
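Stratified k-fold cross-validation, as mentioned above, splits the data so that every fold keeps roughly the same class proportions as the whole set. The sketch below shows the idea under simplifying assumptions: samples are dealt round-robin per class without shuffling, and the `evaluate` callback and the majority-class baseline are hypothetical stand-ins for a real classifier.

```python
from collections import Counter, defaultdict

def stratified_folds(labels, k):
    """Assign sample indices to k folds, dealing each class out round-robin."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds

def cross_validate(labels, k, evaluate):
    """Average the score returned by evaluate(train_idx, test_idx) over k folds."""
    folds = stratified_folds(labels, k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        scores.append(evaluate(train_idx, test_idx))
    return sum(scores) / k

# Hypothetical direct-marketing labels: did the customer respond?
labels = ["yes"] * 4 + ["no"] * 8

def majority_baseline(train_idx, test_idx):
    """Accuracy of always predicting the most common training label."""
    majority = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
    return sum(labels[i] == majority for i in test_idx) / len(test_idx)

folds = stratified_folds(labels, 4)
print(folds)
print(cross_validate(labels, 4, majority_baseline))
```

In practice the indices would be shuffled within each class before dealing, and `evaluate` would train and score the classifier under study.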

Keywords: Text Data Mining, Comparative Cross-validation, Radial Basis Function, runtime, accuracy.

Downloads: 1555

16617 Injury Prediction for Soccer Players Using Machine Learning

Authors: Amiel Satvedi, Richard Pyne

Abstract:

Injuries in professional sports occur on a regular basis. Some may be minor, while others can have a huge impact on a player’s career and earning potential. In soccer, there is a high risk of players picking up injuries during game time. This research work seeks to help soccer players reduce the risk of getting injured by predicting the likelihood of injury while playing in the near future and then providing recommendations for intervention. The injury prediction tool will use a soccer player’s number of minutes played on the field, number of appearances, distance covered and performance data for the current and previous seasons as variables to conduct statistical analysis and provide injury predictions using a machine learning linear regression model.

Keywords: Injury predictor, soccer injury prevention, machine learning in soccer, big data in soccer.

Downloads: 1749

16616 Comparison of Bayesian and Regression Schemes to Model Public Health Services

Authors: Sotirios Raptis

Abstract:

Bayesian reasoning (BR) or Linear (Auto) Regression (AR/LR) can predict different sources of data using priors or other data, and can link social service demands in cohorts, whereas considering them in isolation (self-prediction) may lead to service misuse by ignoring the context. The paper advocates that BR with Binomial (BD) or Normal (ND) models, or raw data (.D), as probabilistic updates can be compared to AR/LR to link services in Scotland and reduce cost by sharing healthcare (HC) resources. Clustering and cross-correlation, along with BR, LR, and AR, can better predict demand. Insurance companies and policymakers can link such services; examples include those offered to the elderly and low-income people, smoking-related services linked to mental health services, or epidemiological weight in children. The study uses 22 service packs published by Public Health Services (PHS) Scotland and the Scottish Government (SG) from 1981 to 2019, broken into 110 year series (factors) and joined using LR, AR, and BR. Principal component analysis found 11 significant factors, while C-Means (CM) clustering gave five major clusters.

Keywords: Bayesian probability, cohorts, data frames, regression, services, prediction.

Downloads: 225

16615 Reliability Analysis for Cyclic Fatigue Life Prediction in Railroad Bolt Hole

Authors: Hasan Keshavarzian, Tayebeh Nesari

Abstract:

The bolted rail joint is one of the most vulnerable areas in railway track. A comprehensive approach was developed for studying the reliability of fatigue crack initiation at a railroad bolt hole under random axle loads and random material properties. The operating conditions were also treated as stochastic variables. In order to obtain a comprehensive probability model for fatigue crack initiation life prediction in the railroad bolt hole, we used FEM, the response surface method (RSM), and reliability analysis. A combined energy-density based and critical-plane based fatigue concept is used for the fatigue crack prediction. The dynamic loads were calculated according to the axle load, speed, and track properties. The results show that fatigue crack initiation life is more sensitive to axle load than to Poisson’s ratio. Also, the reliability index decreases slowly due to the high-cycle fatigue regime in this area.

Keywords: Rail-wheel tribology, rolling contact mechanic, finite element modeling, reliability analysis.

Downloads: 1112

16614 Application of Data Mining Tools to Predicate Completion Time of a Project

Authors: Seyed Hossein Iranmanesh, Zahra Mokhtari

Abstract:

Estimating the time and cost of work completion in a project, and following them up during execution, contribute to the success or failure of a project and are very important for the project management team. Delivering on time and within the budgeted cost requires managing and controlling projects well. To deal with the complex task of controlling and modifying the baseline project schedule during execution, earned value management systems have been set up and widely used to measure and communicate the real physical progress of a project. However, they often fail to predict the total duration of the project. In this paper, data mining techniques are used to predict the total project duration in terms of the time Estimate At Completion, EAC(t). For this purpose, we used a project with 90 activities, updated day by day. Regular indexes from the literature and the Earned Duration Method were then applied to calculate the time estimate at completion, and these were set as input data for prediction and for identifying the major parameters among them using Clem software. By using data mining, the parameters that affect EAC and the relationships between them can be extracted, which is very useful for managing a project with minimum delay risk. As we show, this could be a simple, safe and applicable method for predicting the completion time of a project during execution.

Keywords: Data Mining Techniques, Earned Duration Method, Earned Value, Estimate At Completion.

Downloads: 1803