Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 3383

Search results for: auto regression

3353 Application Difference between Cox and Logistic Regression Models

Abstract:

The logistic regression and Cox regression models (proportional hazard model) at present are being employed in the analysis of prospective epidemiologic research looking into risk factors in their application on chronic diseases. However, a theoretical relationship between the two models has been studied. By definition, Cox regression model also called Cox proportional hazard model is a procedure that is used in modeling data regarding time leading up to an event where censored cases exist. Whereas the Logistic regression model is mostly applicable in cases where the independent variables consist of numerical as well as nominal values while the resultant variable is binary (dichotomous). Arguments and findings of many researchers focused on the overview of Cox and Logistic regression models and their different applications in different areas. In this work, the analysis is done on secondary data whose source is SPSS exercise data on BREAST CANCER with a sample size of 1121 women where the main objective is to show the application difference between Cox regression model and logistic regression model based on factors that cause women to die due to breast cancer. Thus we did some analysis manually i.e. on lymph nodes status, and SPSS software helped to analyze the mentioned data. This study found out that there is an application difference between Cox and Logistic regression models which is Cox regression model is used if one wishes to analyze data which also include the follow-up time whereas Logistic regression model analyzes data without follow-up-time. Also, they have measurements of association which is different: hazard ratio and odds ratio for Cox and logistic regression models respectively. A similarity between the two models is that they are both applicable in the prediction of the upshot of a categorical variable i.e. a variable that can accommodate only a restricted number of categories. In conclusion, Cox regression model differs from logistic regression by assessing a rate instead of proportion. The two models can be applied in many other researches since they are suitable methods for analyzing data but the more recommended is the Cox, regression model.

Keywords: logistic regression model, Cox regression model, survival analysis, hazard ratio

Procedia PDF Downloads 423

3352 Stock Market Prediction by Regression Model with Social Moods

Authors: Masahiro Ohmura, Koh Kakusho, Takeshi Okadome

Abstract:

This paper presents a regression model with autocorrelated errors in which the inputs are social moods obtained by analyzing the adjectives in Twitter posts using a document topic model. The regression model predicts Dow Jones Industrial Average (DJIA) more precisely than autoregressive moving-average models.

Keywords: stock market prediction, social moods, regression model, DJIA

Procedia PDF Downloads 519

3351 Model-Based Software Regression Test Suite Reduction

Authors: Shiwei Deng, Yang Bao

Abstract:

In this paper, we present a model-based regression test suite reducing approach that uses EFSM model dependence analysis and probability-driven greedy algorithm to reduce software regression test suites. The approach automatically identifies the difference between the original model and the modified model as a set of elementary model modifications. The EFSM dependence analysis is performed for each elementary modification to reduce the regression test suite, and then the probability-driven greedy algorithm is adopted to select the minimum set of test cases from the reduced regression test suite that cover all interaction patterns. Our initial experience shows that the approach may significantly reduce the size of regression test suites.

Keywords: dependence analysis, EFSM model, greedy algorithm, regression test

Procedia PDF Downloads 398

3350 Segmentation of Piecewise Polynomial Regression Model by Using Reversible Jump MCMC Algorithm

Authors: Suparman

Abstract:

Piecewise polynomial regression model is very flexible model for modeling the data. If the piecewise polynomial regression model is matched against the data, its parameters are not generally known. This paper studies the parameter estimation problem of piecewise polynomial regression model. The method which is used to estimate the parameters of the piecewise polynomial regression model is Bayesian method. Unfortunately, the Bayes estimator cannot be found analytically. Reversible jump MCMC algorithm is proposed to solve this problem. Reversible jump MCMC algorithm generates the Markov chain that converges to the limit distribution of the posterior distribution of piecewise polynomial regression model parameter. The resulting Markov chain is used to calculate the Bayes estimator for the parameters of piecewise polynomial regression model.

Keywords: piecewise regression, bayesian, reversible jump MCMC, segmentation

Procedia PDF Downloads 340

3349 Attempt to Reuse Used-PCs as Distributed Storage

Authors: Toshiya Kawato, Shin-ichi Motomura, Masayuki Higashino, Takao Kawamura

Abstract:

Storage for storing data is indispensable. If a storage capacity becomes insufficient, we can increase its capacity by adding new disks. It is, however, difficult to add a new disk when a budget is not enough. On the other hand, there are many unused idle resources such as used personal computers despite those use value. In order to solve those problems, used personal computers can be reused as storage. In this paper, we attempt to reuse used-PCs as a distributed storage. First, we list up the characteristics of used-PCs and design a storage system that utilizes its characteristics. Next, we experimentally implement an auto-construction system that automatically constructs a distributed storage environment in used-PCs.

Keywords: distributed storage, used personal computer, idle resource, auto construction

Procedia PDF Downloads 221

3348 A Fuzzy Linear Regression Model Based on Dissemblance Index

Authors: Shih-Pin Chen, Shih-Syuan You

Abstract:

Fuzzy regression models are useful for investigating the relationship between explanatory variables and responses in fuzzy environments. To overcome the deficiencies of previous models and increase the explanatory power of fuzzy data, the graded mean integration (GMI) representation is applied to determine representative crisp regression coefficients. A fuzzy regression model is constructed based on the modified dissemblance index (MDI), which can precisely measure the actual total error. Compared with previous studies based on the proposed MDI and distance criterion, the results from commonly used test examples show that the proposed fuzzy linear regression model has higher explanatory power and forecasting accuracy.

Keywords: dissemblance index, fuzzy linear regression, graded mean integration, mathematical programming

Procedia PDF Downloads 407

3347 The Theory behind Logistic Regression

Authors: Jan Henrik Wosnitza

Abstract:

The logistic regression has developed into a standard approach for estimating conditional probabilities in a wide range of applications including credit risk prediction. The article at hand contributes to the current literature on logistic regression fourfold: First, it is demonstrated that the binary logistic regression automatically meets its model assumptions under very general conditions. This result explains, at least in part, the logistic regression's popularity. Second, the requirement of homoscedasticity in the context of binary logistic regression is theoretically substantiated. The variances among the groups of defaulted and non-defaulted obligors have to be the same across the level of the aggregated default indicators in order to achieve linear logits. Third, this article sheds some light on the question why nonlinear logits might be superior to linear logits in case of a small amount of data. Fourth, an innovative methodology for estimating correlations between obligor-specific log-odds is proposed. In order to crystallize the key ideas, this paper focuses on the example of credit risk prediction. However, the results presented in this paper can easily be transferred to any other field of application.

Keywords: correlation, credit risk estimation, default correlation, homoscedasticity, logistic regression, nonlinear logistic regression

Procedia PDF Downloads 394

3346 The Influence of Applying Mechanical Chest Compression Systems on the Effectiveness of Cardiopulmonary Resuscitation in Out-of-Hospital Cardiac Arrest

Authors: Slawomir Pilip, Michal Wasilewski, Daniel Celinski, Leszek Szpakowski, Grzegorz Michalak

Abstract:

The aim of the study was to evaluate the effectiveness of cardiopulmonary resuscitation taken by Medical Emergency Teams (MET) at the place of an accident including the usage of mechanical chest compression systems. In the period of January-May 2017, there were 137 cases of a sudden cardiac arrest in a chosen region of Eastern Poland with 360.000 inhabitants. Medical records and questionnaires filled by METs were analysed to prove the effectiveness of cardiopulmonary resuscitations that were considered to be effective when an early indication of spontaneous circulation was provided and the patient was taken to hospital. A chest compression system used by METs was applied in 60 cases (Lucas3 - 34 patients; Auto Pulse - 24 patients). The effectiveness of cardiopulmonary resuscitation among patients who were employed a chest compression system was much higher (43,3%) than the manual cardiac massage (36,4%). Thus, the usage of Lucas3 chest compression system resulted in 47% while Auto Pulse was 33,3%. The average ambulance arrival time could have had a significant impact on the subsequent effectiveness of cardiopulmonary resuscitation in these cases. Ambulances equipped with Lucas3 reached the destination within 8 minutes, and those with Auto Pulse needed 12,1 minutes. Moreover, taking effective basic life support (BLS) by bystanders before the ambulance arrival was much more frequent for ambulances with Lucas3 than Auto Pulse. Therefore, the percentage of BLS among the group of patients who were employed Lucas3 by METs was 26,5%, and 20,8% for Auto Pulse. The total percentage of taking BLS by bystanders before the ambulance arrival resulted in 25% of patients who were later applied a chest compression system by METs. Not only was shockable cardiac rhythm obtained in 47% of these cases, but an early indication of spontaneous circulation was also provided in all these patients. Both Lucas3 and Auto Pulse were evaluated to be significantly useful in improving the effectiveness of cardiopulmonary resuscitation by 97% of Medical Emergency Teams. Therefore, implementation of chest compression systems essentially makes the cardiopulmonary resuscitation even more effective. The ambulance arrival time, taking successful BLS by bystanders before the ambulance arrival and the presence of shockable cardiac rhythm determine an early indication of spontaneous circulation among patients after a sudden cardiac arrest.

Keywords: cardiac arrest, effectiveness, mechanical chest compression systems, resuscitation

Procedia PDF Downloads 222

3345 Learning Dynamic Representations of Nodes in Temporally Variant Graphs

Authors: Sandra Mitrovic, Gaurav Singh

Abstract:

In many industries, including telecommunications, churn prediction has been a topic of active research. A lot of attention has been drawn on devising the most informative features, and this area of research has gained even more focus with spread of (social) network analytics. The call detail records (CDRs) have been used to construct customer networks and extract potentially useful features. However, to the best of our knowledge, no studies including network features have yet proposed a generic way of representing network information. Instead, ad-hoc and dataset dependent solutions have been suggested. In this work, we build upon a recently presented method (node2vec) to obtain representations for nodes in observed network. The proposed approach is generic and applicable to any network and domain. Unlike node2vec, which assumes a static network, we consider a dynamic and time-evolving network. To account for this, we propose an approach that constructs the feature representation of each node by generating its node2vec representations at different timestamps, concatenating them and finally compressing using an auto-encoder-like method in order to retain reasonably long and informative feature vectors. We test the proposed method on churn prediction task in telco domain. To predict churners at timestamp ts+1, we construct training and testing datasets consisting of feature vectors from time intervals [t1, ts-1] and [t2, ts] respectively, and use traditional supervised classification models like SVM and Logistic Regression. Observed results show the effectiveness of proposed approach as compared to ad-hoc feature selection based approaches and static node2vec.

Keywords: churn prediction, dynamic networks, node2vec, auto-encoders

Procedia PDF Downloads 286

3344 Application of Stochastic Models to Annual Extreme Streamflow Data

Authors: Karim Hamidi Machekposhti, Hossein Sedghi

Abstract:

This study was designed to find the best stochastic model (using of time series analysis) for annual extreme streamflow (peak and maximum streamflow) of Karkheh River at Iran. The Auto-regressive Integrated Moving Average (ARIMA) model used to simulate these series and forecast those in future. For the analysis, annual extreme streamflow data of Jelogir Majin station (above of Karkheh dam reservoir) for the years 1958–2005 were used. A visual inspection of the time plot gives a little increasing trend; therefore, series is not stationary. The stationarity observed in Auto-Correlation Function (ACF) and Partial Auto-Correlation Function (PACF) plots of annual extreme streamflow was removed using first order differencing (d=1) in order to the development of the ARIMA model. Interestingly, the ARIMA(4,1,1) model developed was found to be most suitable for simulating annual extreme streamflow for Karkheh River. The model was found to be appropriate to forecast ten years of annual extreme streamflow and assist decision makers to establish priorities for water demand. The Statistical Analysis System (SAS) and Statistical Package for the Social Sciences (SPSS) codes were used to determinate of the best model for this series.

Keywords: stochastic models, ARIMA, extreme streamflow, Karkheh river

Procedia PDF Downloads 121

3343 Model Averaging for Poisson Regression

Authors: Zhou Jianhong

Abstract:

Model averaging is a desirable approach to deal with model uncertainty, which, however, has rarely been explored for Poisson regression. In this paper, we propose a model averaging procedure based on an unbiased estimator of the expected Kullback-Leibler distance for the Poisson regression. Simulation study shows that the proposed model average estimator outperforms some other commonly used model selection and model average estimators in some situations. Our proposed methods are further applied to a real data example and the advantage of this method is demonstrated again.

Keywords: model averaging, poission regression, Kullback-Leibler distance, statistics

Procedia PDF Downloads 484

3342 Establishment of the Regression Uncertainty of the Critical Heat Flux Power Correlation for an Advanced Fuel Bundle

Authors: L. Q. Yuan, J. Yang, A. Siddiqui

Abstract:

A new regression uncertainty analysis methodology was applied to determine the uncertainties of the critical heat flux (CHF) power correlation for an advanced 43-element bundle design, which was developed by Canadian Nuclear Laboratories (CNL) to achieve improved economics, resource utilization and energy sustainability. The new methodology is considered more appropriate than the traditional methodology in the assessment of the experimental uncertainty associated with regressions. The methodology was first assessed using both the Monte Carlo Method (MCM) and the Taylor Series Method (TSM) for a simple linear regression model, and then extended successfully to a non-linear CHF power regression model (CHF power as a function of inlet temperature, outlet pressure and mass flow rate). The regression uncertainty assessed by MCM agrees well with that by TSM. An equation to evaluate the CHF power regression uncertainty was developed and expressed as a function of independent variables that determine the CHF power.

Keywords: CHF experiment, CHF correlation, regression uncertainty, Monte Carlo Method, Taylor Series Method

Procedia PDF Downloads 388

3341 Non-Parametric Regression over Its Parametric Couterparts with Large Sample Size

Authors: Jude Opara, Esemokumo Perewarebo Akpos

Abstract:

This paper is on non-parametric linear regression over its parametric counterparts with large sample size. Data set on anthropometric measurement of primary school pupils was taken for the analysis. The study used 50 randomly selected pupils for the study. The set of data was subjected to normality test, and it was discovered that the residuals are not normally distributed (i.e. they do not follow a Gaussian distribution) for the commonly used least squares regression method for fitting an equation into a set of (x,y)-data points using the Anderson-Darling technique. The algorithms for the nonparametric Theil’s regression are stated in this paper as well as its parametric OLS counterpart. The use of a programming language software known as “R Development” was used in this paper. From the analysis, the result showed that there exists a significant relationship between the response and the explanatory variable for both the parametric and non-parametric regression. To know the efficiency of one method over the other, the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) are used, and it is discovered that the nonparametric regression performs better than its parametric regression counterparts due to their lower values in both the AIC and BIC. The study however recommends that future researchers should study a similar work by examining the presence of outliers in the data set, and probably expunge it if detected and re-analyze to compare results.

Keywords: Theil’s regression, Bayesian information criterion, Akaike information criterion, OLS

Procedia PDF Downloads 276

3340 Impact Study on a Load Rich Island and Development of Frequency Based Auto-Load Shedding Scheme to Improve Service Reliability of the Island

Authors: Md. Shafiullah, M. Shafiul Alam, Bandar Suliman Alsharif

Abstract:

Electrical quantities such as frequency, voltage, current are being fluctuated due to abnormalities in power system. Most of the abnormalities cause fluctuation in system frequency and sometimes extreme abnormalities lead to system blackout. To protect the system from complete blackout planned and proper islanding plays a very important role even in case of extreme abnormalities. Islanding operation not only helps stabilizing a faulted system but also supports power supplies to critical and important loads, in extreme emergency. But the islanding systems are weaker than integrated system so the stability of islands is the prime concern when an integrated system is disintegrated. In this paper, different impacts on a load rich island have been studied and a frequency based auto-load shedding scheme has been developed for sudden load addition, generation outage and combined effect of both to the island. The developed scheme has been applied to Khulna-Barisal Island to validate the effectiveness of the developed technique. Various types of abnormalities to the test system have been simulated and for the simulation purpose CYME PSAF (Power System Analysis Framework) has been used.

Keywords: auto load shedding, FS&FD relay, impact study, island, PSAF, ROCOF

Procedia PDF Downloads 431

3339 Prediction of Phonon Thermal Conductivity of F.C.C. Al by Molecular Dynamics Simulation

Authors: Leila Momenzadeh, Alexander V. Evteev, Elena V. Levchenko, Tanvir Ahmed, Irina Belova, Graeme Murch

Abstract:

In this work, the phonon thermal conductivity of f.c.c. Al is investigated in detail in the temperature range 100 – 900 K within the framework of equilibrium molecular dynamics simulations making use of the Green-Kubo formalism and one of the most reliable embedded-atom method potentials. It is found that the heat current auto-correlation function of the f.c.c. Al model demonstrates a two-stage temporal decay similar to the previously observed for f.c.c Cu model. After the first stage of decay, the heat current auto-correlation function of the f.c.c. Al model demonstrates a peak in the temperature range 100-800 K. The intensity of the peak decreases as the temperature increases. At 900 K, it transforms to a shoulder. To describe the observed two-stage decay of the heat current auto-correlation function of the f.c.c. Al model, we employ decomposition model recently developed for phonon-mediated thermal transport in a monoatomic lattice. We found that the electronic contribution to the total thermal conductivity of f.c.c. Al dominates over the whole studied temperature range. However, the phonon contribution to the total thermal conductivity of f.c.c. Al increases as temperature decreases. It is about 1.05% at 900 K and about 12.5% at 100 K.

Keywords: aluminum, gGreen-Kubo formalism, molecular dynamics, phonon thermal conductivity

Procedia PDF Downloads 384

3338 Use of Multistage Transition Regression Models for Credit Card Income Prediction

Authors: Denys Osipenko, Jonathan Crook

Abstract:

Because of the variety of the card holders’ behaviour types and income sources each consumer account can be transferred to a variety of states. Each consumer account can be inactive, transactor, revolver, delinquent, defaulted and requires an individual model for the income prediction. The estimation of transition probabilities between statuses at the account level helps to avoid the memorylessness of the Markov Chains approach. This paper investigates the transition probabilities estimation approaches to credit cards income prediction at the account level. The key question of empirical research is which approach gives more accurate results: multinomial logistic regression or multistage conditional logistic regression with binary target. Both models have shown moderate predictive power. Prediction accuracy for conditional logistic regression depends on the order of stages for the conditional binary logistic regression. On the other hand, multinomial logistic regression is easier for usage and gives integrate estimations for all states without priorities. Thus further investigations can be concentrated on alternative modeling approaches such as discrete choice models.

Keywords: multinomial regression, conditional logistic regression, credit account state, transition probability

Procedia PDF Downloads 460

3337 Internet Purchases in European Union Countries: Multiple Linear Regression Approach

Authors: Ksenija Dumičić, Anita Čeh Časni, Irena Palić

Abstract:

This paper examines economic and Information and Communication Technology (ICT) development influence on recently increasing Internet purchases by individuals for European Union member states. After a growing trend for Internet purchases in EU27 was noticed, all possible regression analysis was applied using nine independent variables in 2011. Finally, two linear regression models were studied in detail. Conducted simple linear regression analysis confirmed the research hypothesis that the Internet purchases in analysed EU countries is positively correlated with statistically significant variable Gross Domestic Product per capita (GDPpc). Also, analysed multiple linear regression model with four regressors, showing ICT development level, indicates that ICT development is crucial for explaining the Internet purchases by individuals, confirming the research hypothesis.

Keywords: European union, Internet purchases, multiple linear regression model, outlier

Procedia PDF Downloads 277

3336 Optimization of Slider Crank Mechanism Using Design of Experiments and Multi-Linear Regression

Authors: Galal Elkobrosy, Amr M. Abdelrazek, Bassuny M. Elsouhily, Mohamed E. Khidr

Abstract:

Crank shaft length, connecting rod length, crank angle, engine rpm, cylinder bore, mass of piston and compression ratio are the inputs that can control the performance of the slider crank mechanism and then its efficiency. Several combinations of these seven inputs are used and compared. The throughput engine torque predicted by the simulation is analyzed through two different regression models, with and without interaction terms, developed according to multi-linear regression using LU decomposition to solve system of algebraic equations. These models are validated. A regression model in seven inputs including their interaction terms lowered the polynomial degree from 3^rd degree to 1^stdegree and suggested valid predictions and stable explanations.

Keywords: design of experiments, regression analysis, SI engine, statistical modeling

Procedia PDF Downloads 154

3335 An Epsilon Hierarchical Fuzzy Twin Support Vector Regression

Authors: Arindam Chaudhuri

Abstract:

The research presents epsilon- hierarchical fuzzy twin support vector regression (epsilon-HFTSVR) based on epsilon-fuzzy twin support vector regression (epsilon-FTSVR) and epsilon-twin support vector regression (epsilon-TSVR). Epsilon-FTSVR is achieved by incorporating trapezoidal fuzzy numbers to epsilon-TSVR which takes care of uncertainty existing in forecasting problems. Epsilon-FTSVR determines a pair of epsilon-insensitive proximal functions by solving two related quadratic programming problems. The structural risk minimization principle is implemented by introducing regularization term in primal problems of epsilon-FTSVR. This yields dual stable positive definite problems which improves regression performance. Epsilon-FTSVR is then reformulated as epsilon-HFTSVR consisting of a set of hierarchical layers each containing epsilon-FTSVR. Experimental results on both synthetic and real datasets reveal that epsilon-HFTSVR has remarkable generalization performance with minimum training time.

Keywords: regression, epsilon-TSVR, epsilon-FTSVR, epsilon-HFTSVR

Procedia PDF Downloads 331

3334 Determining the Direction of Causality between Creating Innovation and Technology Market

Authors: Liubov Evstigneeva

Abstract:

In this paper an attempt is made to establish causal nexuses between innovation and international trade in Russia. The topicality of this issue is determined by the necessity of choosing policy instruments for economic modernization and transition to innovative development. The vector auto regression (VAR) model and Granger test are applied for the Russian monthly data from 2005 until the second quartile of 2015. Both lagged import and export at the national level cause innovation, the latter starts to stimulate foreign trade since it is a remote lag. In comparison to aggregate data, the results by patent’s categories are more diverse. Importing technologies from foreign countries stimulates patent activity, while innovations created in Russia are only Granger causality for import to Commonwealth of Independent States.

Keywords: export, import, innovation, patents

Procedia PDF Downloads 295

3333 Polarization Effects in Cosmic-Ray Acceleration by Cyclotron Auto-Resonance

Authors: Yousef I. Salamin

Abstract:

Theoretical investigations, analytical as well as numerical, have shown that electrons can be accelerated to GeV energies by the process of cyclotron auto-resonance acceleration (CARA). In CARA, the particle would be injected along the lines of a uniform magnetic field aligned parallel to the direction of propagation of a plane-wave radiation field. Unfortunately, an accelerator based on CARA would be prohibitively too long and too expensive to build and maintain. However, the process stands a better chance of success near the polar cap of a compact object (such as a neutron star, a black hole or a magnetar) or in an environment created in the wake of a binary neutron-star or blackhole merger. Dynamics of the nuclides ₁H¹, ₂He⁴, ₂₆Fe⁵⁶, and ₂₈Ni⁶², in such astrophysical conditions, have been investigated by single-particle calculations and many-particle simulations. The investigations show that these nuclides can reach ZeV energies (1 ZeV = 10²¹ eV) due to interaction with super-intense radiation of wavelengths = 1 and 10 m and = 50 pm and magnetic fields of strengths at the mega- and giga-tesla levels. Examples employing radiation intensities in the range 10³²-10⁴² W/m² have been used. Employing a two-parameter model for representing the radiation field, CARA is analytically generalized to include any state of polarization, and the basic working equations are derived rigorously and in closed analytic form.

Keywords: compact objects, cosmic-ray acceleration, cyclotron auto-resonance, polarization effects, zevatron

Procedia PDF Downloads 89

3332 Nonparametric Truncated Spline Regression Model on the Data of Human Development Index in Indonesia

Authors: Kornelius Ronald Demu, Dewi Retno Sari Saputro, Purnami Widyaningsih

Abstract:

Human Development Index (HDI) is a standard measurement for a country's human development. Several factors may have influenced it, such as life expectancy, gross domestic product (GDP) based on the province's annual expenditure, the number of poor people, and the percentage of an illiterate people. The scatter plot between HDI and the influenced factors show that the plot does not follow a specific pattern or form. Therefore, the HDI's data in Indonesia can be applied with a nonparametric regression model. The estimation of the regression curve in the nonparametric regression model is flexible because it follows the shape of the data pattern. One of the nonparametric regression's method is a truncated spline. Truncated spline regression is one of the nonparametric approach, which is a modification of the segmented polynomial functions. The estimator of a truncated spline regression model was affected by the selection of the optimal knots point. Knot points is a focus point of spline truncated functions. The optimal knots point was determined by the minimum value of generalized cross validation (GCV). In this article were applied the data of Human Development Index with a truncated spline nonparametric regression model. The results of this research were obtained the best-truncated spline regression model to the HDI's data in Indonesia with the combination of optimal knots point 5-5-5-4. Life expectancy and the percentage of an illiterate people were the significant factors depend to the HDI in Indonesia. The coefficient of determination is 94.54%. This means the regression model is good enough to applied on the data of HDI in Indonesia.

Keywords: generalized cross validation (GCV), Human Development Index (HDI), knots point, nonparametric regression, truncated spline

Procedia PDF Downloads 305

3331 Regression Model Evaluation on Depth Camera Data for Gaze Estimation

Authors: James Purnama, Riri Fitri Sari

Abstract:

We investigate the machine learning algorithm selection problem in the term of a depth image based eye gaze estimation, with respect to its essential difficulty in reducing the number of required training samples and duration time of training. Statistics based prediction accuracy are increasingly used to assess and evaluate prediction or estimation in gaze estimation. This article evaluates Root Mean Squared Error (RMSE) and R-Squared statistical analysis to assess machine learning methods on depth camera data for gaze estimation. There are 4 machines learning methods have been evaluated: Random Forest Regression, Regression Tree, Support Vector Machine (SVM), and Linear Regression. The experiment results show that the Random Forest Regression has the lowest RMSE and the highest R-Squared, which means that it is the best among other methods.

Keywords: gaze estimation, gaze tracking, eye tracking, kinect, regression model, orange python

Procedia PDF Downloads 508

3330 Estimating 3D-Position of a Stationary Random Acoustic Source Using Bispectral Analysis of 4-Point Detected Signals

Authors: Katsumi Hirata

Abstract:

To develop the useful acoustic environmental recognition system, the method of estimating 3D-position of a stationary random acoustic source using bispectral analysis of 4-point detected signals is proposed. The method uses information about amplitude attenuation and propagation delay extracted from amplitude ratios and angles of auto- and cross-bispectra of the detected signals. It is expected that using bispectral analysis affects less influence of Gaussian noises than using conventional power spectral one. In this paper, the basic principle of the method is mentioned first, and its validity and features are considered from results of the fundamental experiments assumed ideal circumstances.

Keywords: 4-point detection, a stationary random acoustic source, auto- and cross-bispectra, estimation of 3D-position

Procedia PDF Downloads 331

3329 Generalized Extreme Value Regression with Binary Dependent Variable: An Application for Predicting Meteorological Drought Probabilities

Authors: Retius Chifurira

Abstract:

Logistic regression model is the most used regression model to predict meteorological drought probabilities. When the dependent variable is extreme, the logistic model fails to adequately capture drought probabilities. In order to adequately predict drought probabilities, we use the generalized linear model (GLM) with the quantile function of the generalized extreme value distribution (GEVD) as the link function. The method maximum likelihood estimation is used to estimate the parameters of the generalized extreme value (GEV) regression model. We compare the performance of the logistic and the GEV regression models in predicting drought probabilities for Zimbabwe. The performance of the regression models are assessed using the goodness-of-fit tests, namely; relative root mean square error (RRMSE) and relative mean absolute error (RMAE). Results show that the GEV regression model performs better than the logistic model, thereby providing a good alternative candidate for predicting drought probabilities. This paper provides the first application of GLM derived from extreme value theory to predict drought probabilities for a drought-prone country such as Zimbabwe.

Keywords: generalized extreme value distribution, general linear model, mean annual rainfall, meteorological drought probabilities

Procedia PDF Downloads 159

3328 Structural and Magnetic Properties of CoFe2-xNdxO4 Spinel Ferrite Nanoparticles

Authors: R. S. Yadav, J. Havlica, I. Kuřitka, Z. Kozakova, J. Masilko, M. Hajdúchová, V. Enev, J. Wasserbauer

Abstract:

In this present work, CoFe2-xNdxO4 (0.0 ≤ x ≥0.1) spinel ferrite nanoparticles were synthesized by starch-assisted sol-gel auto-combustion method. Powder X-ray diffraction patterns were revealed the formation of cubic spinel ferrite with the signature of NdFeO3 phase at higher Nd3+ concentration. The field emission scanning electron microscopy study demonstrated the spherical nanoparticle in the size range between 5-15 nm. Raman and Fourier Transform Infrared spectra supported the formation of the spinel ferrite structure in the nanocrystalline form. The X-ray photoelectron spectroscopy (XPS) analysis confirmed the presence of Co2+ and Fe3+ at octahedral as well as a tetrahedral site in CoFe2-xNdxO4 nanoparticles. The change in magnetic properties with a variation of concentration of Nd3+ ions in cobalt ferrite nanoparticles was observed.

Keywords: nanoparticles, spinel ferrites, sol-gel auto-combustion method, CoFe2-xNdxO4

Procedia PDF Downloads 467

3327 The Extended Skew Gaussian Process for Regression

Authors: M. T. Alodat

Abstract:

In this paper, we propose a generalization to the Gaussian process regression(GPR) model called the extended skew Gaussian process for regression(ESGPr) model. The ESGPR model works better than the GPR model when the errors are skewed. We derive the predictive distribution for the ESGPR model at a new input. Also we apply the ESGPR model to FOREX data and we find that it fits the Forex data better than the GPR model.

Keywords: extended skew normal distribution, Gaussian process for regression, predictive distribution, ESGPr model

Procedia PDF Downloads 520

3326 Integrated Nested Laplace Approximations For Quantile Regression

Authors: Kajingulu Malandala, Ranganai Edmore

Abstract:

The asymmetric Laplace distribution (ADL) is commonly used as the likelihood function of the Bayesian quantile regression, and it offers different families of likelihood method for quantile regression. Notwithstanding their popularity and practicality, ADL is not smooth and thus making it difficult to maximize its likelihood. Furthermore, Bayesian inference is time consuming and the selection of likelihood may mislead the inference, as the Bayes theorem does not automatically establish the posterior inference. Furthermore, ADL does not account for greater skewness and Kurtosis. This paper develops a new aspect of quantile regression approach for count data based on inverse of the cumulative density function of the Poisson, binomial and Delaporte distributions using the integrated nested Laplace Approximations. Our result validates the benefit of using the integrated nested Laplace Approximations and support the approach for count data.

Keywords: quantile regression, Delaporte distribution, count data, integrated nested Laplace approximation

Procedia PDF Downloads 134

3325 Modelling of Reactive Methodologies in Auto-Scaling Time-Sensitive Services With a MAPE-K Architecture

Authors: Óscar Muñoz Garrigós, José Manuel Bernabeu Aubán

Abstract:

Time-sensitive services are the base of the cloud services industry. Keeping low service saturation is essential for controlling response time. All auto-scalable services make use of reactive auto-scaling. However, reactive auto-scaling has few in-depth studies. This presentation shows a model for reactive auto-scaling methodologies with a MAPE-k architecture. Queuing theory can compute different properties of static services but lacks some parameters related to the transition between models. Our model uses queuing theory parameters to relate the transition between models. It associates MAPE-k related times, the sampling frequency, the cooldown period, the number of requests that an instance can handle per unit of time, the number of incoming requests at a time instant, and a function that describes the acceleration in the service's ability to handle more requests. This model is later used as a solution to horizontally auto-scale time-sensitive services composed of microservices, reevaluating the model’s parameters periodically to allocate resources. The solution requires limiting the acceleration of the growth in the number of incoming requests to keep a constrained response time. Business benefits determine such limits. The solution can add a dynamic number of instances and remains valid under different system sizes. The study includes performance recommendations to improve results according to the incoming load shape and business benefits. The exposed methodology is tested in a simulation. The simulator contains a load generator and a service composed of two microservices, where the frontend microservice depends on a backend microservice with a 1:1 request relation ratio. A common request takes 2.3 seconds to be computed by the service and is discarded if it takes more than 7 seconds. Both microservices contain a load balancer that assigns requests to the less loaded instance and preemptively discards requests if they are not finished in time to prevent resource saturation. When load decreases, instances with lower load are kept in the backlog where no more requests are assigned. If the load grows and an instance in the backlog is required, it returns to the running state, but if it finishes the computation of all requests and is no longer required, it is permanently deallocated. A few load patterns are required to represent the worst-case scenario for reactive systems: the following scenarios test response times, resource consumption and business costs. The first scenario is a burst-load scenario. All methodologies will discard requests if the rapidness of the burst is high enough. This scenario focuses on the number of discarded requests and the variance of the response time. The second scenario contains sudden load drops followed by bursts to observe how the methodology behaves when releasing resources that are lately required. The third scenario contains diverse growth accelerations in the number of incoming requests to observe how approaches that add a different number of instances can handle the load with less business cost. The exposed methodology is compared against a multiple threshold CPU methodology allocating/deallocating 10 or 20 instances, outperforming the competitor in all studied metrics.

Keywords: reactive auto-scaling, auto-scaling, microservices, cloud computing

Procedia PDF Downloads 66

3324 The Use of Geographically Weighted Regression for Deforestation Analysis: Case Study in Brazilian Cerrado

Authors: Ana Paula Camelo, Keila Sanches

Abstract:

The Geographically Weighted Regression (GWR) was proposed in geography literature to allow relationship in a regression model to vary over space. In Brazil, the agricultural exploitation of the Cerrado Biome is the main cause of deforestation. In this study, we propose a methodology using geostatistical methods to characterize the spatial dependence of deforestation in the Cerrado based on agricultural production indicators. Therefore, it was used the set of exploratory spatial data analysis tools (ESDA) and confirmatory analysis using GWR. It was made the calibration a non-spatial model, evaluation the nature of the regression curve, election of the variables by stepwise process and multicollinearity analysis. After the evaluation of the non-spatial model was processed the spatial-regression model, statistic evaluation of the intercept and verification of its effect on calibration. In an analysis of Spearman’s correlation the results between deforestation and livestock was +0.783 and with soybeans +0.405. The model presented R²=0.936 and showed a strong spatial dependence of agricultural activity of soybeans associated to maize and cotton crops. The GWR is a very effective tool presenting results closer to the reality of deforestation in the Cerrado when compared with other analysis.

Keywords: deforestation, geographically weighted regression, land use, spatial analysis

Procedia PDF Downloads 328