Search results for: generalized linear regression
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 6314

Search results for: generalized linear regression

6224 Machine Learning Approach for Predicting Students’ Academic Performance and Study Strategies Based on Their Motivation

Authors: Fidelia A. Orji, Julita Vassileva

Abstract:

This research aims to develop machine learning models for students' academic performance and study strategy prediction, which could be generalized to all courses in higher education. Key learning attributes (intrinsic, extrinsic, autonomy, relatedness, competence, and self-esteem) used in building the models are chosen based on prior studies, which revealed that the attributes are essential in students’ learning process. Previous studies revealed the individual effects of each of these attributes on students’ learning progress. However, few studies have investigated the combined effect of the attributes in predicting student study strategy and academic performance to reduce the dropout rate. To bridge this gap, we used Scikit-learn in python to build five machine learning models (Decision Tree, K-Nearest Neighbour, Random Forest, Linear/Logistic Regression, and Support Vector Machine) for both regression and classification tasks to perform our analysis. The models were trained, evaluated, and tested for accuracy using 924 university dentistry students' data collected by Chilean authors through quantitative research design. A comparative analysis of the models revealed that the tree-based models such as the random forest (with prediction accuracy of 94.9%) and decision tree show the best results compared to the linear, support vector, and k-nearest neighbours. The models built in this research can be used in predicting student performance and study strategy so that appropriate interventions could be implemented to improve student learning progress. Thus, incorporating strategies that could improve diverse student learning attributes in the design of online educational systems may increase the likelihood of students continuing with their learning tasks as required. Moreover, the results show that the attributes could be modelled together and used to adapt/personalize the learning process.

Keywords: classification models, learning strategy, predictive modeling, regression models, student academic performance, student motivation, supervised machine learning

Procedia PDF Downloads 98
6223 A Fuzzy Nonlinear Regression Model for Interval Type-2 Fuzzy Sets

Authors: O. Poleshchuk, E. Komarov

Abstract:

This paper presents a regression model for interval type-2 fuzzy sets based on the least squares estimation technique. Unknown coefficients are assumed to be triangular fuzzy numbers. The basic idea is to determine aggregation intervals for type-1 fuzzy sets, membership functions of whose are low membership function and upper membership function of interval type-2 fuzzy set. These aggregation intervals were called weighted intervals. Low and upper membership functions of input and output interval type-2 fuzzy sets for developed regression models are considered as piecewise linear functions.

Keywords: interval type-2 fuzzy sets, fuzzy regression, weighted interval

Procedia PDF Downloads 338
6222 Physical Activity and Mental Health: A Cross-Sectional Investigation into the Relationship of Specific Physical Activity Domains and Mental Well-Being

Authors: Katja Siefken, Astrid Junge

Abstract:

Background: Research indicates that physical activity (PA) protects us from developing mental disorders. The knowledge regarding optimal domain, intensity, type, context, and amount of PA promotion for the prevention of mental disorders is sparse and incoherent. The objective of this study is to determine the relationship between PA domains and mental well-being, and whether associations vary by domain, amount, context, intensity, and type of PA. Methods: 310 individuals (age: 25 yrs., SD 7; 73% female) completed a questionnaire on personal patterns of their PA behaviour (IPQA) and their mental health (Centre of Epidemiologic Studies Depression Scale (CES-D), Generalized Anxiety Disorder (GAD-7) scale, the subjective physical well-being (FEW-16)). Linear and multiple regression were used for analysis. Findings: Individuals who met the PA recommendation (N=269) reported higher scores on subjective physical well-being than those who did not meet the PA recommendations (N=41). Whilst vigorous intensity PA predicts subjective well-being (β = .122, p = .028), it also correlates with depression. The more vigorously physically active a person is, the higher the depression score (β = .127, p = .026). The strongest impact of PA on mental well-being can be seen in the transport domain. A positive linear correlation on subjective physical well-being (β =.175, p = .002), and a negative linear correlation for anxiety (β =-.142, p = .011) and depression (β = -.164, p = .004) was found. Multiple regression analysis indicates similar results: Time spent in active transport on the bicycle significantly lowers anxiety and depression scores and enhances subjective physical well-being. The more time a participant spends using the bicycle for transport, the lower the depression (β = -.143, p = .013) and anxiety scores (β = -.111,p = .050). Conclusions: Meeting the PA recommendations enhances subjective physical well-being. Active transport has a substantial impact on mental well-being. Findings have implications for policymakers, employers, public health experts and civil society. A stronger focus on the promotion and protection of health through active transport is recommended. Inter-sectoral exchange, outside the health sector, is required. Health systems must engage other sectors in adopting policies that maximize possible health gains.

Keywords: active transport, mental well-being, health promotion, psychological disorders

Procedia PDF Downloads 298
6221 Apricot Insurance Portfolio Risk

Authors: Kasirga Yildirak, Ismail Gur

Abstract:

We propose a model to measure hail risk of an Agricultural Insurance portfolio. Hail is one of the major catastrophic event that causes big amount of loss to an insurer. Moreover, it is very hard to predict due to its strange atmospheric characteristics. We make use of parcel based claims data on apricot damage collected by the Turkish Agricultural Insurance Pool (TARSIM). As our ultimate aim is to compute the loadings assigned to specific parcels, we build a portfolio risk model that makes use of PD and the severity of the exposures. PD is computed by Spherical-Linear and Circular –Linear regression models as the data carries coordinate information and seasonality. Severity is mapped into integer brackets so that Probability Generation Function could be employed. Individual regressions are run on each clusters estimated on different criteria. Loss distribution is constructed by Panjer Recursion technique. We also show that one risk-one crop model can easily be extended to the multi risk–multi crop model by assuming conditional independency.

Keywords: hail insurance, spherical regression, circular regression, spherical clustering

Procedia PDF Downloads 232
6220 Durrmeyer Type Modification of q-Generalized Bernstein Operators

Authors: Ruchi, A. M. Acu, Purshottam N. Agrawal

Abstract:

The purpose of this paper to introduce the Durrmeyer type modification of q-generalized-Bernstein operators which include the Bernstein polynomials in the particular α = 0. We investigate the rate of convergence by means of the Lipschitz class and the Peetre’s K-functional. Also, we define the bivariate case of Durrmeyer type modification of q-generalized-Bernstein operators and study the degree of approximation with the aid of the partial modulus of continuity and the Peetre’s K-functional. Finally, we introduce the GBS (Generalized Boolean Sum) of the Durrmeyer type modification of q- generalized-Bernstein operators and investigate the approximation of the Bögel continuous and Bögel differentiable functions with the aid of the Lipschitz class and the mixed modulus of smoothness.

Keywords: Bögel continuous, Bögel differentiable, generalized Boolean sum, Peetre’s K-functional, Lipschitz class, mixed modulus of smoothness

Procedia PDF Downloads 175
6219 Adomian’s Decomposition Method to Generalized Magneto-Thermoelasticity

Authors: Hamdy M. Youssef, Eman A. Al-Lehaibi

Abstract:

Due to many applications and problems in the fields of plasma physics, geophysics, and other many topics, the interaction between the strain field and the magnetic field has to be considered. Adomian introduced the decomposition method for solving linear and nonlinear functional equations. This method leads to accurate, computable, approximately convergent solutions of linear and nonlinear partial and ordinary differential equations even the equations with variable coefficients. This paper is dealing with a mathematical model of generalized thermoelasticity of a half-space conducting medium. A magnetic field with constant intensity acts normal to the bounding plane has been assumed. Adomian’s decomposition method has been used to solve the model when the bounding plane is taken to be traction free and thermally loaded by harmonic heating. The numerical results for the temperature increment, the stress, the strain, the displacement, the induced magnetic, and the electric fields have been represented in figures. The magnetic field, the relaxation time, and the angular thermal load have significant effects on all the studied fields.

Keywords: Adomian’s decomposition method, magneto-thermoelasticity, finite conductivity, iteration method, thermal load

Procedia PDF Downloads 121
6218 Optimization of Machine Learning Regression Results: An Application on Health Expenditures

Authors: Songul Cinaroglu

Abstract:

Machine learning regression methods are recommended as an alternative to classical regression methods in the existence of variables which are difficult to model. Data for health expenditure is typically non-normal and have a heavily skewed distribution. This study aims to compare machine learning regression methods by hyperparameter tuning to predict health expenditure per capita. A multiple regression model was conducted and performance results of Lasso Regression, Random Forest Regression and Support Vector Machine Regression recorded when different hyperparameters are assigned. Lambda (λ) value for Lasso Regression, number of trees for Random Forest Regression, epsilon (ε) value for Support Vector Regression was determined as hyperparameters. Study results performed by using 'k' fold cross validation changed from 5 to 50, indicate the difference between machine learning regression results in terms of R², RMSE and MAE values that are statistically significant (p < 0.001). Study results reveal that Random Forest Regression (R² ˃ 0.7500, RMSE ≤ 0.6000 ve MAE ≤ 0.4000) outperforms other machine learning regression methods. It is highly advisable to use machine learning regression methods for modelling health expenditures.

Keywords: machine learning, lasso regression, random forest regression, support vector regression, hyperparameter tuning, health expenditure

Procedia PDF Downloads 192
6217 Forecasting Electricity Spot Price with Generalized Long Memory Modeling: Wavelet and Neural Network

Authors: Souhir Ben Amor, Heni Boubaker, Lotfi Belkacem

Abstract:

This aims of this paper is to forecast the electricity spot prices. First, we focus on modeling the conditional mean of the series so we adopt a generalized fractional -factor Gegenbauer process (k-factor GARMA). Secondly, the residual from the -factor GARMA model has used as a proxy for the conditional variance; these residuals were predicted using two different approaches. In the first approach, a local linear wavelet neural network model (LLWNN) has developed to predict the conditional variance using the Back Propagation learning algorithms. In the second approach, the Gegenbauer generalized autoregressive conditional heteroscedasticity process (G-GARCH) has adopted, and the parameters of the k-factor GARMA-G-GARCH model has estimated using the wavelet methodology based on the discrete wavelet packet transform (DWPT) approach. The empirical results have shown that the k-factor GARMA-G-GARCH model outperform the hybrid k-factor GARMA-LLWNN model, and find it is more appropriate for forecasts.

Keywords: electricity price, k-factor GARMA, LLWNN, G-GARCH, forecasting

Procedia PDF Downloads 205
6216 Spectral Clustering from the Discrepancy View and Generalized Quasirandomness

Authors: Marianna Bolla

Abstract:

The aim of this paper is to compare spectral, discrepancy, and degree properties of expanding graph sequences. As we can prove equivalences and implications between them and the definition of the generalized (multiclass) quasirandomness of Lovasz–Sos (2008), they can be regarded as generalized quasirandom properties akin to the equivalent quasirandom properties of the seminal Chung-Graham-Wilson paper (1989) in the one-class scenario. Since these properties are valid for deterministic graph sequences, irrespective of stochastic models, the partial implications also justify for low-dimensional embedding of large-scale graphs and for discrepancy minimizing spectral clustering.

Keywords: generalized random graphs, multiway discrepancy, normalized modularity spectra, spectral clustering

Procedia PDF Downloads 169
6215 Formulating a Flexible-Spread Fuzzy Regression Model Based on Dissemblance Index

Authors: Shih-Pin Chen, Shih-Syuan You

Abstract:

This study proposes a regression model with flexible spreads for fuzzy input-output data to cope with the situation that the existing measures cannot reflect the actual estimation error. The main idea is that a dissemblance index (DI) is carefully identified and defined for precisely measuring the actual estimation error. Moreover, the graded mean integration (GMI) representation is adopted for determining more representative numeric regression coefficients. Notably, to comprehensively compare the performance of the proposed model with other ones, three different criteria are adopted. The results from commonly used test numerical examples and an application to Taiwan's business monitoring indicator illustrate that the proposed dissemblance index method not only produces valid fuzzy regression models for fuzzy input-output data, but also has satisfactory and stable performance in terms of the total estimation error based on these three criteria.

Keywords: dissemblance index, forecasting, fuzzy sets, linear regression

Procedia PDF Downloads 332
6214 Transverse Vibration of Non-Homogeneous Rectangular Plates of Variable Thickness Using GDQ

Authors: R. Saini, R. Lal

Abstract:

The effect of non-homogeneity on the free transverse vibration of thin rectangular plates of bilinearly varying thickness has been analyzed using generalized differential quadrature (GDQ) method. The non-homogeneity of the plate material is assumed to arise due to linear variations in Young’s modulus and density of the plate material with the in-plane coordinates x and y. Numerical results have been computed for fully clamped and fully simply supported boundary conditions. The solution procedure by means of GDQ method has been implemented in a MATLAB code. The effect of various plate parameters has been investigated for the first three modes of vibration. A comparison of results with those available in literature has been presented.

Keywords: rectangular, non-homogeneous, bilinear thickness, generalized differential quadrature (GDQ)

Procedia PDF Downloads 368
6213 Switched System Diagnosis Based on Intelligent State Filtering with Unknown Models

Authors: Nada Slimane, Foued Theljani, Faouzi Bouani

Abstract:

The paper addresses the problem of fault diagnosis for systems operating in several modes (normal or faulty) based on states assessment. We use, for this purpose, a methodology consisting of three main processes: 1) sequential data clustering, 2) linear model regression and 3) state filtering. Typically, Kalman Filter (KF) is an algorithm that provides estimation of unknown states using a sequence of I/O measurements. Inevitably, although it is an efficient technique for state estimation, it presents two main weaknesses. First, it merely predicts states without being able to isolate/classify them according to their different operating modes, whether normal or faulty modes. To deal with this dilemma, the KF is endowed with an extra clustering step based fully on sequential version of the k-means algorithm. Second, to provide state estimation, KF requires state space models, which can be unknown. A linear regularized regression is used to identify the required models. To prove its effectiveness, the proposed approach is assessed on a simulated benchmark.

Keywords: clustering, diagnosis, Kalman Filtering, k-means, regularized regression

Procedia PDF Downloads 153
6212 Image Compression Based on Regression SVM and Biorthogonal Wavelets

Authors: Zikiou Nadia, Lahdir Mourad, Ameur Soltane

Abstract:

In this paper, we propose an effective method for image compression based on SVM Regression (SVR), with three different kernels, and biorthogonal 2D Discrete Wavelet Transform. SVM regression could learn dependency from training data and compressed using fewer training points (support vectors) to represent the original data and eliminate the redundancy. Biorthogonal wavelet has been used to transform the image and the coefficients acquired are then trained with different kernels SVM (Gaussian, Polynomial, and Linear). Run-length and Arithmetic coders are used to encode the support vectors and its corresponding weights, obtained from the SVM regression. The peak signal noise ratio (PSNR) and their compression ratios of several test images, compressed with our algorithm, with different kernels are presented. Compared with other kernels, Gaussian kernel achieves better image quality. Experimental results show that the compression performance of our method gains much improvement.

Keywords: image compression, 2D discrete wavelet transform (DWT-2D), support vector regression (SVR), SVM Kernels, run-length, arithmetic coding

Procedia PDF Downloads 355
6211 An Extension of the Generalized Extreme Value Distribution

Authors: Serge Provost, Abdous Saboor

Abstract:

A q-analogue of the generalized extreme value distribution which includes the Gumbel distribution is introduced. The additional parameter q allows for increased modeling flexibility. The resulting distribution can have a finite, semi-infinite or infinite support. It can also produce several types of hazard rate functions. The model parameters are determined by making use of the method of maximum likelihood. It will be shown that it compares favourably to three related distributions in connection with the modeling of a certain hydrological data set.

Keywords: extreme value theory, generalized extreme value distribution, goodness-of-fit statistics, Gumbel distribution

Procedia PDF Downloads 315
6210 Least Squares Method Identification of Corona Current-Voltage Characteristics and Electromagnetic Field in Electrostatic Precipitator

Authors: H. Nouri, I. E. Achouri, A. Grimes, H. Ait Said, M. Aissou, Y. Zebboudj

Abstract:

This paper aims to analysis the behaviour of DC corona discharge in wire-to-plate electrostatic precipitators (ESP). Current-voltage curves are particularly analysed. Experimental results show that discharge current is strongly affected by the applied voltage. The proposed method of current identification is to use the method of least squares. Least squares problems that of into two categories: linear or ordinary least squares and non-linear least squares, depending on whether or not the residuals are linear in all unknowns. The linear least-squares problem occurs in statistical regression analysis; it has a closed-form solution. A closed-form solution (or closed form expression) is any formula that can be evaluated in a finite number of standard operations. The non-linear problem has no closed-form solution and is usually solved by iterative.

Keywords: electrostatic precipitator, current-voltage characteristics, least squares method, electric field, magnetic field

Procedia PDF Downloads 407
6209 Forecast of the Small Wind Turbines Sales with Replacement Purchases and with or without Account of Price Changes

Authors: V. Churkin, M. Lopatin

Abstract:

The purpose of the paper is to estimate the US small wind turbines market potential and forecast the small wind turbines sales in the US. The forecasting method is based on the application of the Bass model and the generalized Bass model of innovations diffusion under replacement purchases. In the work an exponential distribution is used for modeling of replacement purchases. Only one parameter of such distribution is determined by average lifetime of small wind turbines. The identification of the model parameters is based on nonlinear regression analysis on the basis of the annual sales statistics which has been published by the American Wind Energy Association (AWEA) since 2001 up to 2012. The estimation of the US average market potential of small wind turbines (for adoption purchases) without account of price changes is 57080 (confidence interval from 49294 to 64866 at P = 0.95) under average lifetime of wind turbines 15 years, and 62402 (confidence interval from 54154 to 70648 at P = 0.95) under average lifetime of wind turbines 20 years. In the first case the explained variance is 90,7%, while in the second - 91,8%. The effect of the wind turbines price changes on their sales was estimated using generalized Bass model. This required a price forecast. To do this, the polynomial regression function, which is based on the Berkeley Lab statistics, was used. The estimation of the US average market potential of small wind turbines (for adoption purchases) in that case is 42542 (confidence interval from 32863 to 52221 at P = 0.95) under average lifetime of wind turbines 15 years, and 47426 (confidence interval from 36092 to 58760 at P = 0.95) under average lifetime of wind turbines 20 years. In the first case the explained variance is 95,3%, while in the second –95,3%.

Keywords: bass model, generalized bass model, replacement purchases, sales forecasting of innovations, statistics of sales of small wind turbines in the United States

Procedia PDF Downloads 328
6208 A Comparison of Smoothing Spline Method and Penalized Spline Regression Method Based on Nonparametric Regression Model

Authors: Autcha Araveeporn

Abstract:

This paper presents a study about a nonparametric regression model consisting of a smoothing spline method and a penalized spline regression method. We also compare the techniques used for estimation and prediction of nonparametric regression model. We tried both methods with crude oil prices in dollars per barrel and the Stock Exchange of Thailand (SET) index. According to the results, it is concluded that smoothing spline method performs better than that of penalized spline regression method.

Keywords: nonparametric regression model, penalized spline regression method, smoothing spline method, Stock Exchange of Thailand (SET)

Procedia PDF Downloads 402
6207 An Overbooking Model for Car Rental Service with Different Types of Cars

Authors: Naragain Phumchusri, Kittitach Pongpairoj

Abstract:

Overbooking is a very useful revenue management technique that could help reduce costs caused by either undersales or oversales. In this paper, we propose an overbooking model for two types of cars that can minimize the total cost for car rental service. With two types of cars, there is an upgrade possibility for lower type to upper type. This makes the model more complex than one type of cars scenario. We have found that convexity can be proved in this case. Sensitivity analysis of the parameters is conducted to observe the effects of relevant parameters on the optimal solution. Model simplification is proposed using multiple linear regression analysis, which can help estimate the optimal overbooking level using appropriate independent variables. The results show that the overbooking level from multiple linear regression model is relatively close to the optimal solution (with the adjusted R-squared value of at least 72.8%). To evaluate the performance of the proposed model, the total cost was compared with the case where the decision maker uses a naïve method for the overbooking level. It was found that the total cost from optimal solution is only 0.5 to 1 percent (on average) lower than the cost from regression model, while it is approximately 67% lower than the cost obtained by the naïve method. It indicates that our proposed simplification method using regression analysis can effectively perform in estimating the overbooking level.

Keywords: overbooking, car rental industry, revenue management, stochastic model

Procedia PDF Downloads 149
6206 Rd-PLS Regression: From the Analysis of Two Blocks of Variables to Path Modeling

Authors: E. Tchandao Mangamana, V. Cariou, E. Vigneau, R. Glele Kakai, E. M. Qannari

Abstract:

A new definition of a latent variable associated with a dataset makes it possible to propose variants of the PLS2 regression and the multi-block PLS (MB-PLS). We shall refer to these variants as Rd-PLS regression and Rd-MB-PLS respectively because they are inspired by both Redundancy analysis and PLS regression. Usually, a latent variable t associated with a dataset Z is defined as a linear combination of the variables of Z with the constraint that the length of the loading weights vector equals 1. Formally, t=Zw with ‖w‖=1. Denoting by Z' the transpose of Z, we define herein, a latent variable by t=ZZ’q with the constraint that the auxiliary variable q has a norm equal to 1. This new definition of a latent variable entails that, as previously, t is a linear combination of the variables in Z and, in addition, the loading vector w=Z’q is constrained to be a linear combination of the rows of Z. More importantly, t could be interpreted as a kind of projection of the auxiliary variable q onto the space generated by the variables in Z, since it is collinear to the first PLS1 component of q onto Z. Consider the situation in which we aim to predict a dataset Y from another dataset X. These two datasets relate to the same individuals and are assumed to be centered. Let us consider a latent variable u=YY’q to which we associate the variable t= XX’YY’q. Rd-PLS consists in seeking q (and therefore u and t) so that the covariance between t and u is maximum. The solution to this problem is straightforward and consists in setting q to the eigenvector of YY’XX’YY’ associated with the largest eigenvalue. For the determination of higher order components, we deflate X and Y with respect to the latent variable t. Extending Rd-PLS to the context of multi-block data is relatively easy. Starting from a latent variable u=YY’q, we consider its ‘projection’ on the space generated by the variables of each block Xk (k=1, ..., K) namely, tk= XkXk'YY’q. Thereafter, Rd-MB-PLS seeks q in order to maximize the average of the covariances of u with tk (k=1, ..., K). The solution to this problem is given by q, eigenvector of YY’XX’YY’, where X is the dataset obtained by horizontally merging datasets Xk (k=1, ..., K). For the determination of latent variables of order higher than 1, we use a deflation of Y and Xk with respect to the variable t= XX’YY’q. In the same vein, extending Rd-MB-PLS to the path modeling setting is straightforward. Methods are illustrated on the basis of case studies and performance of Rd-PLS and Rd-MB-PLS in terms of prediction is compared to that of PLS2 and MB-PLS.

Keywords: multiblock data analysis, partial least squares regression, path modeling, redundancy analysis

Procedia PDF Downloads 113
6205 Parameter Estimation in Dynamical Systems Based on Latent Variables

Authors: Arcady Ponosov

Abstract:

A novel mathematical approach is suggested, which facilitates a compressed representation and efficient validation of parameter-rich ordinary differential equation models describing the dynamics of complex, especially biology-related, systems and which is based on identification of the system's latent variables. In particular, an efficient parameter estimation method for the compressed non-linear dynamical systems is developed. The method is applied to the so-called 'power-law systems' being non-linear differential equations typically used in Biochemical System Theory.

Keywords: generalized law of mass action, metamodels, principal components, synergetic systems

Procedia PDF Downloads 328
6204 Glucose Monitoring System Using Machine Learning Algorithms

Authors: Sangeeta Palekar, Neeraj Rangwani, Akash Poddar, Jayu Kalambe

Abstract:

The bio-medical analysis is an indispensable procedure for identifying health-related diseases like diabetes. Monitoring the glucose level in our body regularly helps us identify hyperglycemia and hypoglycemia, which can cause severe medical problems like nerve damage or kidney diseases. This paper presents a method for predicting the glucose concentration in blood samples using image processing and machine learning algorithms. The glucose solution is prepared by the glucose oxidase (GOD) and peroxidase (POD) method. An experimental database is generated based on the colorimetric technique. The image of the glucose solution is captured by the raspberry pi camera and analyzed using image processing by extracting the RGB, HSV, LUX color space values. Regression algorithms like multiple linear regression, decision tree, RandomForest, and XGBoost were used to predict the unknown glucose concentration. The multiple linear regression algorithm predicts the results with 97% accuracy. The image processing and machine learning-based approach reduce the hardware complexities of existing platforms.

Keywords: artificial intelligence glucose detection, glucose oxidase, peroxidase, image processing, machine learning

Procedia PDF Downloads 171
6203 Estimation of Coefficients of Ridge and Principal Components Regressions with Multicollinear Data

Authors: Rajeshwar Singh

Abstract:

The presence of multicollinearity is common in handling with several explanatory variables simultaneously due to exhibiting a linear relationship among them. A great problem arises in understanding the impact of explanatory variables on the dependent variable. Thus, the method of least squares estimation gives inexact estimates. In this case, it is advised to detect its presence first before proceeding further. Using the ridge regression degree of its occurrence is reduced but principal components regression gives good estimates in this situation. This paper discusses well-known techniques of the ridge and principal components regressions and applies to get the estimates of coefficients by both techniques. In addition to it, this paper also discusses the conflicting claim on the discovery of the method of ridge regression based on available documents.

Keywords: conflicting claim on credit of discovery of ridge regression, multicollinearity, principal components and ridge regressions, variance inflation factor

Procedia PDF Downloads 379
6202 Regional Hydrological Extremes Frequency Analysis Based on Statistical and Hydrological Models

Authors: Hadush Kidane Meresa

Abstract:

The hydrological extremes frequency analysis is the foundation for the hydraulic engineering design, flood protection, drought management and water resources management and planning to utilize the available water resource to meet the desired objectives of different organizations and sectors in a country. This spatial variation of the statistical characteristics of the extreme flood and drought events are key practice for regional flood and drought analysis and mitigation management. For different hydro-climate of the regions, where the data set is short, scarcity, poor quality and insufficient, the regionalization methods are applied to transfer at-site data to a region. This study aims in regional high and low flow frequency analysis for Poland River Basins. Due to high frequent occurring of hydrological extremes in the region and rapid water resources development in this basin have caused serious concerns over the flood and drought magnitude and frequencies of the river in Poland. The magnitude and frequency result of high and low flows in the basin is needed for flood and drought planning, management and protection at present and future. Hydrological homogeneous high and low flow regions are formed by the cluster analysis of site characteristics, using the hierarchical and C- mean clustering and PCA method. Statistical tests for regional homogeneity are utilized, by Discordancy and Heterogeneity measure tests. In compliance with results of the tests, the region river basin has been divided into ten homogeneous regions. In this study, frequency analysis of high and low flows using AM for high flow and 7-day minimum low flow series is conducted using six statistical distributions. The use of L-moment and LL-moment method showed a homogeneous region over entire province with Generalized logistic (GLOG), Generalized extreme value (GEV), Pearson type III (P-III), Generalized Pareto (GPAR), Weibull (WEI) and Power (PR) distributions as the regional drought and flood frequency distributions. The 95% percentile and Flow duration curves of 1, 7, 10, 30 days have been plotted for 10 stations. However, the cluster analysis performed two regions in west and east of the province where L-moment and LL-moment method demonstrated the homogeneity of the regions and GLOG and Pearson Type III (PIII) distributions as regional frequency distributions for each region, respectively. The spatial variation and regional frequency distribution of flood and drought characteristics for 10 best catchment from the whole region was selected and beside the main variable (streamflow: high and low) we used variables which are more related to physiographic and drainage characteristics for identify and delineate homogeneous pools and to derive best regression models for ungauged sites. Those are mean annual rainfall, seasonal flow, average slope, NDVI, aspect, flow length, flow direction, maximum soil moisture, elevation, and drainage order. The regional high-flow or low-flow relationship among one streamflow characteristics with (AM or 7-day mean annual low flows) some basin characteristics is developed using Generalized Linear Mixed Model (GLMM) and Generalized Least Square (GLS) regression model, providing a simple and effective method for estimation of flood and drought of desired return periods for ungauged catchments.

Keywords: flood , drought, frequency, magnitude, regionalization, stochastic, ungauged, Poland

Procedia PDF Downloads 568
6201 Chaotic Search Optimal Design and Modeling of Permanent Magnet Synchronous Linear Motor

Authors: Yang Yi-Fei, Luo Min-Zhou, Zhang Fu-Chun, He Nai-Bao, Xing Shao-Bang

Abstract:

This paper presents an electromagnetic finite element model of permanent magnet synchronous linear motor and distortion rate of the air gap flux density waveform is analyzed in detail. By designing the sample space of the parameters, nonlinear regression modeling of the orthogonal experimental design is introduced. We put forward for possible air gap flux density waveform sine electromagnetic scheme. Parameters optimization of the permanent magnet synchronous linear motor is also introduced which is based on chaotic search and adaptation function. Simulation results prove that the pole shifting does not affect the motor back electromotive symmetry based on the structural parameters, it provides a novel way for the optimum design of permanent magnet synchronous linear motor and other engineering.

Keywords: permanent magnet synchronous linear motor, finite element analysis, chaotic search, optimization design

Procedia PDF Downloads 387
6200 Inverse Scattering of Two-Dimensional Objects Using an Enhancement Method

Authors: A.R. Eskandari, M.R. Eskandari

Abstract:

A 2D complete identification algorithm for dielectric and multiple objects immersed in air is presented. The employed technique consists of initially retrieving the shape and position of the scattering object using a linear sampling method and then determining the electric permittivity and conductivity of the scatterer using adjoint sensitivity analysis. This inversion algorithm results in high computational speed and efficiency, and it can be generalized for any scatterer structure. Also, this method is robust with respect to noise. The numerical results clearly show that this hybrid approach provides accurate reconstructions of various objects.

Keywords: inverse scattering, microwave imaging, two-dimensional objects, Linear Sampling Method (LSM)

Procedia PDF Downloads 362
6199 Dry Relaxation Shrinkage Prediction of Bordeaux Fiber Using a Feed Forward Neural

Authors: Baeza S. Roberto

Abstract:

The knitted fabric suffers a deformation in its dimensions due to stretching and tension factors, transverse and longitudinal respectively, during the process in rectilinear knitting machines so it performs a dry relaxation shrinkage procedure and thermal action of prefixed to obtain stable conditions in the knitting. This paper presents a dry relaxation shrinkage prediction of Bordeaux fiber using a feed forward neural network and linear regression models. Six operational alternatives of shrinkage were predicted. A comparison of the results was performed finding neural network models with higher levels of explanation of the variability and prediction. The presence of different reposes are included. The models were obtained through a neural toolbox of Matlab and Minitab software with real data in a knitting company of Southern Guanajuato. The results allow predicting dry relaxation shrinkage of each alternative operation.

Keywords: neural network, dry relaxation, knitting, linear regression

Procedia PDF Downloads 555
6198 A Bivariate Inverse Generalized Exponential Distribution and Its Applications in Dependent Competing Risks Model

Authors: Fatemah A. Alqallaf, Debasis Kundu

Abstract:

The aim of this paper is to introduce a bivariate inverse generalized exponential distribution which has a singular component. The proposed bivariate distribution can be used when the marginals have heavy-tailed distributions, and they have non-monotone hazard functions. Due to the presence of the singular component, it can be used quite effectively when there are ties in the data. Since it has four parameters, it is a very flexible bivariate distribution, and it can be used quite effectively for analyzing various bivariate data sets. Several dependency properties and dependency measures have been obtained. The maximum likelihood estimators cannot be obtained in closed form, and it involves solving a four-dimensional optimization problem. To avoid that, we have proposed to use an EM algorithm, and it involves solving only one non-linear equation at each `E'-step. Hence, the implementation of the proposed EM algorithm is very straight forward in practice. Extensive simulation experiments and the analysis of one data set have been performed. We have observed that the proposed bivariate inverse generalized exponential distribution can be used for modeling dependent competing risks data. One data set has been analyzed to show the effectiveness of the proposed model.

Keywords: Block and Basu bivariate distributions, competing risks, EM algorithm, Marshall-Olkin bivariate exponential distribution, maximum likelihood estimators

Procedia PDF Downloads 114
6197 Statistical Analysis and Impact Forecasting of Connected and Autonomous Vehicles on the Environment: Case Study in the State of Maryland

Authors: Alireza Ansariyar, Safieh Laaly

Abstract:

Over the last decades, the vehicle industry has shown increased interest in integrating autonomous, connected, and electrical technologies in vehicle design with the primary hope of improving mobility and road safety while reducing transportation’s environmental impact. Using the State of Maryland (M.D.) in the United States as a pilot study, this research investigates CAVs’ fuel consumption and air pollutants (C.O., PM, and NOx) and utilizes meaningful linear regression models to predict CAV’s environmental effects. Maryland transportation network was simulated in VISUM software, and data on a set of variables were collected through a comprehensive survey. The number of pollutants and fuel consumption were obtained for the time interval 2010 to 2021 from the macro simulation. Eventually, four linear regression models were proposed to predict the amount of C.O., NOx, PM pollutants, and fuel consumption in the future. The results highlighted that CAVs’ pollutants and fuel consumption have a significant correlation with the income, age, and race of the CAV customers. Furthermore, the reliability of four statistical models was compared with the reliability of macro simulation model outputs in the year 2030. The error of three pollutants and fuel consumption was obtained at less than 9% by statistical models in SPSS. This study is expected to assist researchers and policymakers with planning decisions to reduce CAV environmental impacts in M.D.

Keywords: connected and autonomous vehicles, statistical model, environmental effects, pollutants and fuel consumption, VISUM, linear regression models

Procedia PDF Downloads 420
6196 Count Regression Modelling on Number of Migrants in Households

Authors: Tsedeke Lambore Gemecho, Ayele Taye Goshu

Abstract:

The main objective of this study is to identify the determinants of the number of international migrants in a household and to compare regression models for count response. This study is done by collecting data from total of 2288 household heads of 16 randomly sampled districts in Hadiya and Kembata-Tembaro zones of Southern Ethiopia. The Poisson mixed models, as special cases of the generalized linear mixed model, is explored to determine effects of the predictors: age of household head, farm land size, and household size. Two ethnicities Hadiya and Kembata are included in the final model as dummy variables. Stepwise variable selection has indentified four predictors: age of head, farm land size, family size and dummy variable ethnic2 (0=other, 1=Kembata). These predictors are significant at 5% significance level with count response number of migrant. The Poisson mixed model consisting of the four predictors with random effects districts. Area specific random effects are significant with the variance of about 0.5105 and standard deviation of 0.7145. The results show that the number of migrant increases with heads age, family size, and farm land size. In conclusion, there is a significantly high number of international migration per household in the area. Age of household head, family size, and farm land size are determinants that increase the number of international migrant in households. Community-based intervention is needed so as to monitor and regulate the international migration for the benefits of the society.

Keywords: Poisson regression, GLM, number of migrant, Hadiya and Kembata Tembaro zones

Procedia PDF Downloads 265
6195 Linear Stability of Convection in an Inclined Channel with Nanofluid Saturated Porous Medium

Authors: D. Srinivasacharya, Nidhi Humnekar

Abstract:

The goal of this research is to numerically investigate the convection of nanofluid flow in an inclined porous channel. Brownian motion and thermophoresis effects are accounted for by nanofluid. In addition, the flow in the porous region governs Brinkman’s equation. The perturbed state of the generalized eigenvalue problem is obtained using normal mode analysis, and Chebyshev spectral collocation was used to solve this problem. For various values of the governing parameters, the critical wavenumber and critical Rayleigh number are calculated, and preferred modes are identified.

Keywords: Brinkman model, inclined channel, nanofluid, linear stability, porous media

Procedia PDF Downloads 92