Search results for: panel data regression models
9377 Studding of Number of Dataset on Precision of Estimated Saturated Hydraulic Conductivity
Authors: M. Siosemarde, M. Byzedi
Abstract:
Saturated hydraulic conductivity of Soil is an important property in processes involving water and solute flow in soils. Saturated hydraulic conductivity of soil is difficult to measure and can be highly variable, requiring a large number of replicate samples. In this study, 60 sets of soil samples were collected at Saqhez region of Kurdistan province-IRAN. The statistics such as Correlation Coefficient (R), Root Mean Square Error (RMSE), Mean Bias Error (MBE) and Mean Absolute Error (MAE) were used to evaluation the multiple linear regression models varied with number of dataset. In this study the multiple linear regression models were evaluated when only percentage of sand, silt, and clay content (SSC) were used as inputs, and when SSC and bulk density, Bd, (SSC+Bd) were used as inputs. The R, RMSE, MBE and MAE values of the 50 dataset for method (SSC), were calculated 0.925, 15.29, -1.03 and 12.51 and for method (SSC+Bd), were calculated 0.927, 15.28,-1.11 and 12.92, respectively, for relationship obtained from multiple linear regressions on data. Also the R, RMSE, MBE and MAE values of the 10 dataset for method (SSC), were calculated 0.725, 19.62, - 9.87 and 18.91 and for method (SSC+Bd), were calculated 0.618, 24.69, -17.37 and 22.16, respectively, which shows when number of dataset increase, precision of estimated saturated hydraulic conductivity, increases.Keywords: dataset, precision, saturated hydraulic conductivity, soil and statistics.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17929376 PM10 Prediction and Forecasting Using CART: A Case Study for Pleven, Bulgaria
Authors: Snezhana G. Gocheva-Ilieva, Maya P. Stoimenova
Abstract:
Ambient air pollution with fine particulate matter (PM10) is a systematic permanent problem in many countries around the world. The accumulation of a large number of measurements of both the PM10 concentrations and the accompanying atmospheric factors allow for their statistical modeling to detect dependencies and forecast future pollution. This study applies the classification and regression trees (CART) method for building and analyzing PM10 models. In the empirical study, average daily air data for the city of Pleven, Bulgaria for a period of 5 years are used. Predictors in the models are seven meteorological variables, time variables, as well as lagged PM10 variables and some lagged meteorological variables, delayed by 1 or 2 days with respect to the initial time series, respectively. The degree of influence of the predictors in the models is determined. The selected best CART models are used to forecast future PM10 concentrations for two days ahead after the last date in the modeling procedure and show very accurate results.Keywords: Cross-validation, decision tree, lagged variables, short-term forecasting.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7379375 LabVIEW with Fuzzy Logic Controller Simulation Panel for Condition Monitoring of Oil and Dry Type Transformer
Authors: N. A. Muhamad, S.A.M. Ali
Abstract:
Condition monitoring of electrical power equipment has attracted considerable attention for many years. The aim of this paper is to use Labview with Fuzzy Logic controller to build a simulation system to diagnose transformer faults and monitor its condition. The front panel of the system was designed using LabVIEW to enable computer to act as customer-designed instrument. The dissolved gas-in-oil analysis (DGA) method was used as technique for oil type transformer diagnosis; meanwhile terminal voltages and currents analysis method was used for dry type transformer. Fuzzy Logic was used as expert system that assesses all information keyed in at the front panel to diagnose and predict the condition of the transformer. The outcome of the Fuzzy Logic interpretation will be displayed at front panel of LabVIEW to show the user the conditions of the transformer at any time.Keywords: LabVIEW, Fuzzy Logic, condition monitoring, oiltransformer, dry transformer, DGA, terminal values.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 32329374 Automatic Generation of Ontology from Data Source Directed by Meta Models
Authors: Widad Jakjoud, Mohamed Bahaj, Jamal Bakkas
Abstract:
Through this paper we present a method for automatic generation of ontological model from any data source using Model Driven Architecture (MDA), this generation is dedicated to the cooperation of the knowledge engineering and software engineering. Indeed, reverse engineering of a data source generates a software model (schema of data) that will undergo transformations to generate the ontological model. This method uses the meta-models to validate software and ontological models.
Keywords: Meta model, model, ontology, data source.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19989373 Operating Live E! Digital Meteorological Equipments Using Solar Photovoltaics
Authors: Eiko Takaoka, Ryohei Takahashi, Takashi Toyoda
Abstract:
We installed solar panels and digital meteorological equipments whose electrical power is supplied using PV on July 13, 2011. Then, the relationship between the electric power generation and the irradiation, air temperature, and wind velocity was investigated on a roof at a university. The electrical power generation, irradiation, air temperature, and wind velocity were monitored over two years. By analyzing the measured meteorological data and electric power generation data using PTC, we calculated the size of the solar panel that is most suitable for this system. We also calculated the wasted power generation using PTC with the measured meteorological data obtained in this study. In conclusion, to reduce the "wasted power generation", a smaller-size solar panel is required for stable operation.
Keywords: Digital meteorological equipments, PV, photovoltaic, irradiation, PTC.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15449372 Identifying Factors Contributing to the Spread of Lyme Disease: A Regression Analysis of Virginia’s Data
Authors: Fatemeh Valizadeh Gamchi, Edward L. Boone
Abstract:
This research focuses on Lyme disease, a widespread infectious condition in the United States caused by the bacterium Borrelia burgdorferi sensu stricto. It is critical to identify environmental and economic elements that are contributing to the spread of the disease. This study examined data from Virginia to identify a subset of explanatory variables significant for Lyme disease case numbers. To identify relevant variables and avoid overfitting, linear poisson, and regularization regression methods such as ridge, lasso, and elastic net penalty were employed. Cross-validation was performed to acquire tuning parameters. The methods proposed can automatically identify relevant disease count covariates. The efficacy of the techniques was assessed using four criteria on three simulated datasets. Finally, using the Virginia Department of Health’s Lyme disease dataset, the study successfully identified key factors, and the results were consistent with previous studies.
Keywords: Lyme disease, Poisson generalized linear model, Ridge regression, Lasso Regression, elastic net regression.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1229371 Development of Rock Engineering System-Based Models for Tunneling Progress Analysis and Evaluation: Case Study of Tailrace Tunnel of Azad Power Plant Project
Authors: S. Golmohammadi, M. Noorian Bidgoli
Abstract:
Tunneling progress is a key parameter in the blasting method of tunneling. Taking measures to enhance tunneling advance can limit the progress distance without a supporting system, subsequently reducing or eliminating the risk of damage. This paper focuses on modeling tunneling progress using three main groups of parameters (tunneling geometry, blasting pattern, and rock mass specifications) based on the Rock Engineering Systems (RES) methodology. In the proposed models, four main effective parameters on tunneling progress are considered as inputs (RMR, Q-system, Specific charge of blasting, Area), with progress as the output. Data from 86 blasts conducted at the tailrace tunnel in the Azad Dam, western Iran, were used to evaluate the progress value for each blast. The results indicated that, for the 86 blasts, the progress of the estimated model aligns mostly with the measured progress. This paper presents a method for building the interaction matrix (statistical base) of the RES model. Additionally, a comparison was made between the results of the new RES-based model and a Multi-Linear Regression (MLR) analysis model. In the RES-based model, the effective parameters are RMR (35.62%), Q (28.6%), q (specific charge of blasting) (20.35%), and A (15.42%), respectively, whereas for MLR analysis, the main parameters are RMR, Q (system), q, and A. These findings confirm the superior performance of the RES-based model over the other proposed models.
Keywords: Rock Engineering Systems, tunneling progress, Multi Linear Regression, Specific charge of blasting.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1419370 Decision Trees for Predicting Risk of Mortality using Routinely Collected Data
Authors: Tessy Badriyah, Jim S. Briggs, Dave R. Prytherch
Abstract:
It is well known that Logistic Regression is the gold standard method for predicting clinical outcome, especially predicting risk of mortality. In this paper, the Decision Tree method has been proposed to solve specific problems that commonly use Logistic Regression as a solution. The Biochemistry and Haematology Outcome Model (BHOM) dataset obtained from Portsmouth NHS Hospital from 1 January to 31 December 2001 was divided into four subsets. One subset of training data was used to generate a model, and the model obtained was then applied to three testing datasets. The performance of each model from both methods was then compared using calibration (the χ2 test or chi-test) and discrimination (area under ROC curve or c-index). The experiment presented that both methods have reasonable results in the case of the c-index. However, in some cases the calibration value (χ2) obtained quite a high result. After conducting experiments and investigating the advantages and disadvantages of each method, we can conclude that Decision Trees can be seen as a worthy alternative to Logistic Regression in the area of Data Mining.Keywords: Decision Trees, Logistic Regression, clinical outcome, risk of mortality.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25239369 Derivation of Empirical Formulae to Predict Pressure and Impulsive Asymptotes for P-I Diagrams of One-way RC Panels
Authors: Azrul A. Mutalib, Masoud Abedini, Shahrizan Baharom, Hong Hao
Abstract:
There are only limited studies that directly correlate the increase in reinforced concrete (RC) panel structural capacities in resisting the blast loads with different RC panel structural properties in terms of blast loading characteristics, RC panel dimensions, steel reinforcement ratio and concrete material strength. In this paper, numerical analyses of dynamic response and damage of the one-way RC panel to blast loads are carried out using the commercial software LS-DYNA. A series of simulations are performed to predict the blast response and damage of columns with different level and magnitude of blast loads. The numerical results are used to develop pressureimpulse (P-I) diagrams of one-way RC panels. Based on the numerical results, the empirical formulae are derived to calculate the pressure and impulse asymptotes of the P-I diagrams of RC panels. The results presented in this paper can be used to construct P-I diagrams of RC panels with different concrete and reinforcement properties. The P-I diagrams are very useful to assess panel capacities in resisting different blast loads.
Keywords: One-way reinforced concrete (RC) panels, Explosive loads, LS-DYNA Software, Pressure-Impulse (P-I) diagram, Numerical.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 27629368 Applying the Regression Technique for Prediction of the Acute Heart Attack
Authors: Paria Soleimani, Arezoo Neshati
Abstract:
Myocardial infarction is one of the leading causes of death in the world. Some of these deaths occur even before the patient reaches the hospital. Myocardial infarction occurs as a result of impaired blood supply. Because the most of these deaths are due to coronary artery disease, hence the awareness of the warning signs of a heart attack is essential. Some heart attacks are sudden and intense, but most of them start slowly, with mild pain or discomfort, then early detection and successful treatment of these symptoms is vital to save them. Therefore, importance and usefulness of a system designing to assist physicians in early diagnosis of the acute heart attacks is obvious. The main purpose of this study would be to enable patients to become better informed about their condition and to encourage them to seek professional care at an earlier stage in the appropriate situations. For this purpose, the data were collected on 711 heart patients in Iran hospitals. 28 attributes of clinical factors can be reported by patients; were studied. Three logistic regression models were made on the basis of the 28 features to predict the risk of heart attacks. The best logistic regression model in terms of performance had a C-index of 0.955 and with an accuracy of 94.9%. The variables, severe chest pain, back pain, cold sweats, shortness of breath, nausea and vomiting, were selected as the main features.
Keywords: Coronary heart disease, acute heart attacks, prediction, logistic regression.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24259367 The Classification Model for Hard Disk Drive Functional Tests under Sparse Data Conditions
Authors: S. Pattanapairoj, D. Chetchotsak
Abstract:
This paper proposed classification models that would be used as a proxy for hard disk drive (HDD) functional test equitant which required approximately more than two weeks to perform the HDD status classification in either “Pass" or “Fail". These models were constructed by using committee network which consisted of a number of single neural networks. This paper also included the method to solve the problem of sparseness data in failed part, which was called “enforce learning method". Our results reveal that the constructed classification models with the proposed method could perform well in the sparse data conditions and thus the models, which used a few seconds for HDD classification, could be used to substitute the HDD functional tests.Keywords: Sparse data, Classifications, Committee network
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17369366 Effect of a Linear-Exponential Penalty Functionon the GA-s Efficiency in Optimization of a Laminated Composite Panel
Authors: A. Abedian, M. H. Ghiasi, B. Dehghan-Manshadi
Abstract:
A stiffened laminated composite panel (1 m length × 0.5m width) was optimized for minimum weight and deflection under several constraints using genetic algorithm. Here, a significant study on the performance of a penalty function with two kinds of static and dynamic penalty factors was conducted. The results have shown that linear dynamic penalty factors are more effective than the static ones. Also, a specially combined linear-exponential function has shown to perform more effective than the previously mentioned penalty functions. This was then resulted in the less sensitivity of the GA to the amount of penalty factor.Keywords: Genetic algorithms, penalty function, stiffenedcomposite panel, finite element method.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16789365 Ultimate Shear Resistance of Plate Girders Part 2- Höglund Theory
Authors: Ahmed S. Elamary
Abstract:
Ultimate shear resistance (USR) of slender plate girders can be predicted theoretically using Cardiff theory or Höglund theory. This paper will be concerned with predicting the USR using Höglund theory and EC3. Two main factors can affect the USR, the panel width “b” and the web depth “d”, consequently, the panel aspect ratio (b/d) has to be identified by limits. In most of the previous study, there is no limit for panel aspect ratio indicated. In this paper theoretical analysis has been conducted to study the effect of (b/d) on the USR. The analysis based on ninety six test results of steel plate girders subjected to shear executed and collected by others. New formula proposed to predict the percentage of the distance between the plastic hinges form in the flanges “c” to panel width “b”. Conservative limits of (c/b) have been suggested to get a consistent value of USR.
Keywords: Ultimate shear resistance, Plate Girder, Höglund’s theory, EC3.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 49989364 Predictability of the Two Commonly Used Models to Represent the Thin-layer Re-wetting Characteristics of Barley
Authors: M. A. Basunia
Abstract:
Thirty three re-wetting tests were conducted at different combinations of temperatures (5.7- 46.30C) and relative humidites (48.2-88.6%) with barley. Two most commonly used thinlayer drying and rewetting models i.e. Page and Diffusion were compared for their ability to the fit the experimental re-wetting data based on the standard error of estimate (SEE) of the measured and simulated moisture contents. The comparison shows both the Page and Diffusion models fit the re-wetting experimental data of barley well. The average SEE values for the Page and Diffusion models were 0.176 % d.b. and 0.199 % d.b., respectively. The Page and Diffusion models were found to be most suitable equations, to describe the thin-layer re-wetting characteristics of barley over a typically five day re-wetting. These two models can be used for the simulation of deep-bed re-wetting of barley occurring during ventilated storage and deep bed drying.Keywords: Thin-layer, barley, re-wetting parameters, temperature, relative humidity.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14959363 Speaker Independent Quranic Recognizer Basedon Maximum Likelihood Linear Regression
Authors: Ehab Mourtaga, Ahmad Sharieh, Mousa Abdallah
Abstract:
An automatic speech recognition system for the formal Arabic language is needed. The Quran is the most formal spoken book in Arabic, it is spoken all over the world. In this research, an automatic speech recognizer for Quranic based speakerindependent was developed and tested. The system was developed based on the tri-phone Hidden Markov Model and Maximum Likelihood Linear Regression (MLLR). The MLLR computes a set of transformations which reduces the mismatch between an initial model set and the adaptation data. It uses the regression class tree, as well as, estimates a set of linear transformations for the mean and variance parameters of a Gaussian mixture HMM system. The 30th Chapter of the Quran, with five of the most famous readers of the Quran, was used for the training and testing of the data. The chapter includes about 2000 distinct words. The advantages of using the Quranic verses as the database in this developed recognizer are the uniqueness of the words and the high level of orderliness between verses. The level of accuracy from the tested data ranged 68 to 85%.Keywords: Hidden Markov Model (HMM), MaximumLikelihood Linear Regression (MLLR), Quran, Regression ClassTree, Speech Recognition, Speaker-independent.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19159362 Density Estimation using Generalized Linear Model and a Linear Combination of Gaussians
Authors: Aly Farag, Ayman El-Baz, Refaat Mohamed
Abstract:
In this paper we present a novel approach for density estimation. The proposed approach is based on using the logistic regression model to get initial density estimation for the given empirical density. The empirical data does not exactly follow the logistic regression model, so, there will be a deviation between the empirical density and the density estimated using logistic regression model. This deviation may be positive and/or negative. In this paper we use a linear combination of Gaussian (LCG) with positive and negative components as a model for this deviation. Also, we will use the expectation maximization (EM) algorithm to estimate the parameters of LCG. Experiments on real images demonstrate the accuracy of our approach.
Keywords: Logistic regression model, Expectationmaximization, Segmentation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17339361 Acute Coronary Syndrome Prediction Using Data Mining Techniques- An Application
Authors: Tahseen A. Jilani, Huda Yasin, Madiha Yasin, C. Ardil
Abstract:
In this paper we use data mining techniques to investigate factors that contribute significantly to enhancing the risk of acute coronary syndrome. We assume that the dependent variable is diagnosis – with dichotomous values showing presence or absence of disease. We have applied binary regression to the factors affecting the dependent variable. The data set has been taken from two different cardiac hospitals of Karachi, Pakistan. We have total sixteen variables out of which one is assumed dependent and other 15 are independent variables. For better performance of the regression model in predicting acute coronary syndrome, data reduction techniques like principle component analysis is applied. Based on results of data reduction, we have considered only 14 out of sixteen factors.
Keywords: Acute coronary syndrome (ACS), binary logistic regression analyses, myocardial ischemia (MI), principle component analysis, unstable angina (U.A.).
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21149360 Consumption Insurance against the Chronic Illness: Evidence from Thailand
Authors: Yuthapoom Thanakijborisut
Abstract:
This paper studies consumption insurance against the chronic illness in Thailand. The study estimates the impact of household consumption in the chronic illness on consumption growth. Chronic illness is the health care costs of a person or a household’s decision in treatment for the long term; the causes and effects of the household’s ability for smooth consumption. The chronic illnesses are measured in health status when at least one member within the household faces the chronic illness. The data used is from the Household Social Economic Panel Survey conducted during 2007 and 2012. The survey collected data from approximately 6,000 households from every province, both inside and outside municipal areas in Thailand. The study estimates the change in household consumption by using an ordinary least squares (OLS) regression model. The result shows that the members within the household facing the chronic illness would reduce the consumption by around 4%. This case indicates that consumption insurance in Thailand is quite sufficient against chronic illness.
Keywords: Consumption insurance, chronic illness, health care, Thailand.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10989359 Statistical Analysis for Overdispersed Medical Count Data
Authors: Y. N. Phang, E. F. Loh
Abstract:
Many researchers have suggested the use of zero inflated Poisson (ZIP) and zero inflated negative binomial (ZINB) models in modeling overdispersed medical count data with extra variations caused by extra zeros and unobserved heterogeneity. The studies indicate that ZIP and ZINB always provide better fit than using the normal Poisson and negative binomial models in modeling overdispersed medical count data. In this study, we proposed the use of Zero Inflated Inverse Trinomial (ZIIT), Zero Inflated Poisson Inverse Gaussian (ZIPIG) and zero inflated strict arcsine models in modeling overdispered medical count data. These proposed models are not widely used by many researchers especially in the medical field. The results show that these three suggested models can serve as alternative models in modeling overdispersed medical count data. This is supported by the application of these suggested models to a real life medical data set. Inverse trinomial, Poisson inverse Gaussian and strict arcsine are discrete distributions with cubic variance function of mean. Therefore, ZIIT, ZIPIG and ZISA are able to accommodate data with excess zeros and very heavy tailed. They are recommended to be used in modeling overdispersed medical count data when ZIP and ZINB are inadequate.
Keywords: Zero inflated, inverse trinomial distribution, Poisson inverse Gaussian distribution, strict arcsine distribution, Pearson’s goodness of fit.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 33159358 Designing Social Care Policies in the Long Term: A Study Using Regression, Clustering and Backpropagation Neural Nets
Authors: Sotirios Raptis
Abstract:
Linking social needs to social classes using different criteria may lead to social services misuse. The paper discusses using ML and Neural Networks (NNs) in linking public services in Scotland in the long term and advocates, this can result in a reduction of the services cost connecting resources needed in groups for similar services. The paper combines typical regression models with clustering and cross-correlation as complementary constituents to predict the demand. Insurance companies and public policymakers can pack linked services such as those offered to the elderly or to low-income people in the longer term. The work is based on public data from 22 services offered by Public Health Services (PHS) Scotland and from the Scottish Government (SG) from 1981 to 2019 that are broken into 110 years series called factors and uses Linear Regression (LR), Autoregression (ARMA) and 3 types of back-propagation (BP) Neural Networks (BPNN) to link them under specific conditions. Relationships found were between smoking related healthcare provision, mental health-related health services, and epidemiological weight in Primary 1(Education) Body Mass Index (BMI) in children. Primary component analysis (PCA) found 11 significant factors while C-Means (CM) clustering gave 5 major factors clusters.
Keywords: Probability, cohorts, data frames, services, prediction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4609357 On the outlier Detection in Nonlinear Regression
Authors: Hossein Riazoshams, Midi Habshah, Jr., Mohamad Bakri Adam
Abstract:
The detection of outliers is very essential because of their responsibility for producing huge interpretative problem in linear as well as in nonlinear regression analysis. Much work has been accomplished on the identification of outlier in linear regression, but not in nonlinear regression. In this article we propose several outlier detection techniques for nonlinear regression. The main idea is to use the linear approximation of a nonlinear model and consider the gradient as the design matrix. Subsequently, the detection techniques are formulated. Six detection measures are developed that combined with three estimation techniques such as the Least-Squares, M and MM-estimators. The study shows that among the six measures, only the studentized residual and Cook Distance which combined with the MM estimator, consistently capable of identifying the correct outliers.Keywords: Nonlinear Regression, outliers, Gradient, LeastSquare, M-estimate, MM-estimate.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 31799356 Agile Methodology for Modeling and Design of Data Warehouses -AM4DW-
Authors: Nieto Bernal Wilson, Carmona Suarez Edgar
Abstract:
The organizations have structured and unstructured information in different formats, sources, and systems. Part of these come from ERP under OLTP processing that support the information system, however these organizations in OLAP processing level, presented some deficiencies, part of this problematic lies in that does not exist interesting into extract knowledge from their data sources, as also the absence of operational capabilities to tackle with these kind of projects. Data Warehouse and its applications are considered as non-proprietary tools, which are of great interest to business intelligence, since they are repositories basis for creating models or patterns (behavior of customers, suppliers, products, social networks and genomics) and facilitate corporate decision making and research. The following paper present a structured methodology, simple, inspired from the agile development models as Scrum, XP and AUP. Also the models object relational, spatial data models, and the base line of data modeling under UML and Big data, from this way sought to deliver an agile methodology for the developing of data warehouses, simple and of easy application. The methodology naturally take into account the application of process for the respectively information analysis, visualization and data mining, particularly for patterns generation and derived models from the objects facts structured.
Keywords: Data warehouse, model data, big data, object fact, object relational fact, process developed data warehouse.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14789355 Optimization of Multifunctional Battery Structures for Mars
Authors: James A Foster, Guglielmo S Aglietti
Abstract:
Multifunctional structures are a potentially disruptive technology that allows for significant mass savings on spacecraft. The specific concept addressed herein is that of a multifunctional power structure. In this paper, a parametric optimisation of the design of such a structure that uses commercially available battery cells is presented. Using numerical modelling, it was found that there exists several trade-offs aboutthe conflict between the capacity of the panel and its mechanical properties. It was found that there is no universal optimal location for the cells. Placing them close to the mechanical interfaces increases loading in the mechanically weak cells whereas placing them at the centre of the panel increases the stress inthe panel and reduces the stiffness of the structure.Keywords: Design Optimization, Multifunctional Structures, Power Storage.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16349354 Probabilistic Characteristics of older PR Frames in the Mid-America Earthquake Region
Authors: Do-Hwan Kim, Roberto Leon
Abstract:
Probabilistic characteristics of seismic responses of the Partially Restrained connection rotation (PRCR) and panel zone deformation (PZD) installed in older steel moment frames were investigated in accordance with statistical inference in decision-making process. The 4, 6 and 8 story older steel moment frames with clip angle and T-stub connections were designed and analyzed using 2%/50yrs ground motions in four cities of the Mid-America earthquake region. The probability density function and cumulative distribution function of PRCR and PZD were determined by the goodness-of-fit tests based on probabilistic parameters measured from the results of the nonlinear time-history analyses. The obtained probabilistic parameters and distributions can be used to find out what performance level mainly PR connections and panel zones satisfy and how many PR connections and panel zones experience a serious damage under the Mid-America ground motions.Keywords: Mid-America earthquake, Panel zone, PR connection, Probabilistic characteristics, seismic performance
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14129353 The Effect of Particle Porosity in Mixed Matrix Membrane Permeation Models
Authors: Z. Sadeghi, M. R. Omidkhah, M. E. Masoomi
Abstract:
The purpose of this paper is to examine gas transport behavior of mixed matrix membranes (MMMs) combined with porous particles. Main existing models are categorized in two main groups; two-phase (ideal contact) and three-phase (non-ideal contact). A new coefficient, J, was obtained to express equations for estimating effect of the particle porosity in two-phase and three-phase models. Modified models evaluates with existing models and experimental data using Matlab software. Comparison of gas permeability of proposed modified models with existing models in different MMMs shows a better prediction of gas permeability in MMMs.
Keywords: Mixed Matrix Membrane, Permeation Models, Porous particles, Porosity.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 26349352 Comparison of Different k-NN Models for Speed Prediction in an Urban Traffic Network
Authors: Seyoung Kim, Jeongmin Kim, Kwang Ryel Ryu
Abstract:
A database that records average traffic speeds measured at five-minute intervals for all the links in the traffic network of a metropolitan city. While learning from this data the models that can predict future traffic speed would be beneficial for the applications such as the car navigation system, building predictive models for every link becomes a nontrivial job if the number of links in a given network is huge. An advantage of adopting k-nearest neighbor (k-NN) as predictive models is that it does not require any explicit model building. Instead, k-NN takes a long time to make a prediction because it needs to search for the k-nearest neighbors in the database at prediction time. In this paper, we investigate how much we can speed up k-NN in making traffic speed predictions by reducing the amount of data to be searched for without a significant sacrifice of prediction accuracy. The rationale behind this is that we had a better look at only the recent data because the traffic patterns not only repeat daily or weekly but also change over time. In our experiments, we build several different k-NN models employing different sets of features which are the current and past traffic speeds of the target link and the neighbor links in its up/down-stream. The performances of these models are compared by measuring the average prediction accuracy and the average time taken to make a prediction using various amounts of data.Keywords: Big data, k-NN, machine learning, traffic speed prediction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13769351 Forecasting Stock Indexes Using Bayesian Additive Regression Tree
Authors: Darren Zou
Abstract:
Forecasting the stock market is a very challenging task. Various economic indicators such as GDP, exchange rates, interest rates, and unemployment have a substantial impact on the stock market. Time series models are the traditional methods used to predict stock market changes. In this paper, a machine learning method, Bayesian Additive Regression Tree (BART) is used in predicting stock market indexes based on multiple economic indicators. BART can be used to model heterogeneous treatment effects, and thereby works well when models are misspecified. It also has the capability to handle non-linear main effects and multi-way interactions without much input from financial analysts. In this research, BART is proposed to provide a reliable prediction on day-to-day stock market activities. By comparing the analysis results from BART and with time series method, BART can perform well and has better prediction capability than the traditional methods.
Keywords: Bayesian, Forecast, Stock, BART.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7349350 Comparing Machine Learning Estimation of Fuel Consumption of Heavy-Duty Vehicles
Authors: Victor Bodell, Lukas Ekstrom, Somayeh Aghanavesi
Abstract:
Fuel consumption (FC) is one of the key factors in determining expenses of operating a heavy-duty vehicle. A customer may therefore request an estimate of the FC of a desired vehicle. The modular design of heavy-duty vehicles allows their construction by specifying the building blocks, such as gear box, engine and chassis type. If the combination of building blocks is unprecedented, it is unfeasible to measure the FC, since this would first r equire the construction of the vehicle. This paper proposes a machine learning approach to predict FC. This study uses around 40,000 vehicles specific and o perational e nvironmental c onditions i nformation, such as road slopes and driver profiles. A ll v ehicles h ave d iesel engines and a mileage of more than 20,000 km. The data is used to investigate the accuracy of machine learning algorithms Linear regression (LR), K-nearest neighbor (KNN) and Artificial n eural n etworks (ANN) in predicting fuel consumption for heavy-duty vehicles. Performance of the algorithms is evaluated by reporting the prediction error on both simulated data and operational measurements. The performance of the algorithms is compared using nested cross-validation and statistical hypothesis testing. The statistical evaluation procedure finds that ANNs have the lowest prediction error compared to LR and KNN in estimating fuel consumption on both simulated and operational data. The models have a mean relative prediction error of 0.3% on simulated data, and 4.2% on operational data.Keywords: Artificial neural networks, fuel consumption, machine learning, regression, statistical tests.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 8299349 Regression Test Selection Technique for Multi-Programming Language
Authors: Walid S. Abd El-hamid, Sherif S. El-Etriby, Mohiy M. Hadhoud
Abstract:
Regression testing is a maintenance activity applied to modified software to provide confidence that the changed parts are correct and that the unchanged parts have not been adversely affected by the modifications. Regression test selection techniques reduce the cost of regression testing, by selecting a subset of an existing test suite to use in retesting modified programs. This paper presents the first general regression-test-selection technique, which based on code and allows selecting test cases for any programs written in any programming language. Then it handles incomplete program. We also describe RTSDiff, a regression-test-selection system that implements the proposed technique. The results of the empirical studied that performed in four programming languages java, C#, Cµ and Visual basic show that the efficiency and effective in reducing the size of test suit.Keywords: Regression testing, testing, test selection, softwareevolution, software maintenance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15339348 The Maximum Likelihood Method of Random Coefficient Dynamic Regression Model
Authors: Autcha Araveeporn
Abstract:
The Random Coefficient Dynamic Regression (RCDR) model is to developed from Random Coefficient Autoregressive (RCA) model and Autoregressive (AR) model. The RCDR model is considered by adding exogenous variables to RCA model. In this paper, the concept of the Maximum Likelihood (ML) method is used to estimate the parameter of RCDR(1,1) model. Simulation results have shown the AIC and BIC criterion to compare the performance of the the RCDR(1,1) model. The variables as the stationary and weakly stationary data are good estimates where the exogenous variables are weakly stationary. However, the model selection indicated that variables are nonstationarity data based on the stationary data of the exogenous variables.Keywords: Autoregressive, Maximum Likelihood Method, Nonstationarity, Random Coefficient Dynamic Regression, Stationary.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1647