Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 7846

Search results for: weibull regression distribution

7726 The Use of Geographically Weighted Regression for Deforestation Analysis: Case Study in Brazilian Cerrado

Authors: Ana Paula Camelo, Keila Sanches

Abstract:

The Geographically Weighted Regression (GWR) was proposed in geography literature to allow relationship in a regression model to vary over space. In Brazil, the agricultural exploitation of the Cerrado Biome is the main cause of deforestation. In this study, we propose a methodology using geostatistical methods to characterize the spatial dependence of deforestation in the Cerrado based on agricultural production indicators. Therefore, it was used the set of exploratory spatial data analysis tools (ESDA) and confirmatory analysis using GWR. It was made the calibration a non-spatial model, evaluation the nature of the regression curve, election of the variables by stepwise process and multicollinearity analysis. After the evaluation of the non-spatial model was processed the spatial-regression model, statistic evaluation of the intercept and verification of its effect on calibration. In an analysis of Spearman’s correlation the results between deforestation and livestock was +0.783 and with soybeans +0.405. The model presented R²=0.936 and showed a strong spatial dependence of agricultural activity of soybeans associated to maize and cotton crops. The GWR is a very effective tool presenting results closer to the reality of deforestation in the Cerrado when compared with other analysis.

Keywords: deforestation, geographically weighted regression, land use, spatial analysis

Procedia PDF Downloads 329

7725 Investigating the Effects of Data Transformations on a Bi-Dimensional Chi-Square Test

Authors: Alexandru George Vaduva, Adriana Vlad, Bogdan Badea

Abstract:

In this research, we conduct a Monte Carlo analysis on a two-dimensional χ2 test, which is used to determine the minimum distance required for independent sampling in the context of chaotic signals. We investigate the impact of transforming initial data sets from any probability distribution to new signals with a uniform distribution using the Spearman rank correlation on the χ2 test. This transformation removes the randomness of the data pairs, and as a result, the observed distribution of χ2 test values differs from the expected distribution. We propose a solution to this problem and evaluate it using another chaotic signal.

Keywords: chaotic signals, logistic map, Pearson’s test, Chi Square test, bivariate distribution, statistical independence

Procedia PDF Downloads 57

7724 Modeling the Current and Future Distribution of Anthus Pratensis under Climate Change

Authors: Zahira Belkacemi

Abstract:

One of the most important tools in conservation biology is information on the geographic distribution of species and the variables determining those patterns. In this study, we used maximum-entropy niche modeling (Maxent) to predict the current and future distribution of Anthus pratensis using climatic variables. The results showed that the species would not be highly affected by the climate change in shifting its distribution; however, the results of this study should be improved by taking into account other predictors, and that the NATURA 2000 protected sites will be efficient at 42% in protecting the species.

Keywords: anthus pratensis, climate change, Europe, species distribution model

Procedia PDF Downloads 107

7723 Application of the Quantile Regression Approach to the Heterogeneity of the Fine Wine Prices

Authors: Charles-Olivier Amédée-Manesme, Benoit Faye, Eric Le Fur

Abstract:

In this paper, the heterogeneity of the Bordeaux Legends 50 wine market price segment is addressed. For this purpose, quantile regression is applied – with market segmentation based on wine bottle price quantile – and the hedonic price of wine attributes is computed for various price segments of the market. The approach is applied to a major privately held data set which consists of approximately 30,000 transactions over the 2003–2014 period. The findings suggest that the relative hedonic prices of several wine attributes differ significantly among deciles. In particular, the elasticity coefficient of the expert ratings shows strong variation among prices. If - as suggested in the literature - expert ratings have a positive influence on wine price on average, they have a clearly decreasing impact over the quantiles. Finally, the lower the wine price, the higher the potential for price appreciation over time. Other variables such as chateaux or vintage are also shown to vary across the distribution of wine prices. While enhancing our understanding of the complex market dynamics that underlie Bordeaux wines’ price, this research provides empirical evidence that the QR approach adequately captures heterogeneity among wine price ranges, which simultaneously applies to wine stock, vintage and auctions’ house.

Keywords: hedonics, market segmentation, quantile regression, heterogeneity, wine economics

Procedia PDF Downloads 309

7722 Applying the Crystal Model Approach on Light Nuclei for Calculating Radii and Density Distribution

Authors: A. Amar

Abstract:

A new model, namely the crystal model, has been modified to calculate the radius and density distribution of light nuclei up to ⁸Be. The crystal model has been modified according to solid-state physics, which uses the analogy between nucleon distribution and atoms distribution in the crystal. The model has analytical analysis to calculate the radius where the density distribution of light nuclei has obtained from analogy of crystal lattice. The distribution of nucleons over crystal has been discussed in a general form. The equation that has been used to calculate binding energy was taken from the solid-state model of repulsive and attractive force. The numbers of the protons were taken to control repulsive force, where the atomic number was responsible for the attractive force. The parameter has been calculated from the crystal model was found to be proportional to the radius of the nucleus. The density distribution of light nuclei was taken as a summation of two clusters distribution as in ⁶Li=alpha+deuteron configuration. A test has been done on the data obtained for radius and density distribution using double folding for d+⁶,⁷Li with M3Y nucleon-nucleon interaction. Good agreement has been obtained for both the radius and density distribution of light nuclei. The model failed to calculate the radius of ⁹Be, so modifications should be done to overcome discrepancy.

Keywords: nuclear physics, nuclear lattice, study nucleus as crystal, light nuclei till to ⁸Be

Procedia PDF Downloads 142

7721 A Study on Reliability of Gender and Stature Determination by Odontometric and Craniofacial Anthropometric Parameters

Authors: Churamani Pokhrel, C. B. Jha, S. R. Niraula, P. R. Pokharel

Abstract:

Human identification is one of the most challenging subjects that man has confronted. The determination of adult sex and stature are two of the four key factors (sex, stature, age, and race) in identification of an individual. Craniofacial and odontometric parameters are important tools for forensic anthropologists when it is not possible to apply advanced techniques for identification purposes. The present study provides anthropometric correlation of the parameters with stature and gender and also devises regression formulae for reconstruction of stature. A total of 312 Nepalese students with equal distribution of sex i.e., 156 male and 156 female students of age 18-35 years were taken for the study. Total of 10 parameters were measured (age, sex, stature, head circumference, head length, head breadth, facial height, bi-zygomatic width, mesio-distal canine width and inter-canine distance of both maxilla and mandible). Co-relation and regression analysis was done to find the association between the parameters. All parameters were found to be greater in males than females and each was found to be statistically significant. Out of total 312 samples, the best regressor for the determination of stature was head circumference and mandibular inter-canine width and that for gender was head circumference and right mandibular teeth. The accuracy of prediction was 83%. Regression equations and analysis generated from craniofacial and odontometric parameters can be a supplementary approach for the estimation of stature and gender when extremities are not available.

Keywords: craniofacial, gender, odontometric, stature

Procedia PDF Downloads 155

7720 Social Media as a Distribution Channel for Thailand’s Rice Berry Product

Authors: Phutthiwat Waiyawuththanapoom, Wannapong Waiyawuththanapoom, Pimploi Tirastittam

Abstract:

Nowadays, it is a globalization era which social media plays an important role to the lifestyle as an information source, tools to connect people together and etc. This research is object to find out about the significant level of the social media as a distribution channel to the agriculture product of Thailand. In this research, the agriculture product is the Rice Berry which is the cross-bred unmilled rice producing dark violet grain, is a combination of Hom Nin Rice and Thai Jasmine/ Fragrant Rice 105. Rice Berry has a very high nutrition and nice aroma so the product is in the growth stage of the product cycle. The problem for the Rice Berry product in Thailand is the production and the distribution channel. This study is to confirm that the social media is another option as the distribution channel for the product which is not a mass production product. This will be the role model for the other niche market product to select the distribution channel.

Keywords: distribution, social media, rice berry, distribution channel

Procedia PDF Downloads 402

7719 Weighted Rank Regression with Adaptive Penalty Function

Authors: Kang-Mo Jung

Abstract:

The use of regularization for statistical methods has become popular. The least absolute shrinkage and selection operator (LASSO) framework has become the standard tool for sparse regression. However, it is well known that the LASSO is sensitive to outliers or leverage points. We consider a new robust estimation which is composed of the weighted loss function of the pairwise difference of residuals and the adaptive penalty function regulating the tuning parameter for each variable. Rank regression is resistant to regression outliers, but not to leverage points. By adopting a weighted loss function, the proposed method is robust to leverage points of the predictor variable. Furthermore, the adaptive penalty function gives us good statistical properties in variable selection such as oracle property and consistency. We develop an efficient algorithm to compute the proposed estimator using basic functions in program R. We used an optimal tuning parameter based on the Bayesian information criterion (BIC). Numerical simulation shows that the proposed estimator is effective for analyzing real data set and contaminated data.

Keywords: adaptive penalty function, robust penalized regression, variable selection, weighted rank regression

Procedia PDF Downloads 432

7718 MapReduce Logistic Regression Algorithms with RHadoop

Authors: Byung Ho Jung, Dong Hoon Lim

Abstract:

Logistic regression is a statistical method for analyzing a dataset in which there are one or more independent variables that determine an outcome. Logistic regression is used extensively in numerous disciplines, including the medical and social science fields. In this paper, we address the problem of estimating parameters in the logistic regression based on MapReduce framework with RHadoop that integrates R and Hadoop environment applicable to large scale data. There exist three learning algorithms for logistic regression, namely Gradient descent method, Cost minimization method and Newton-Rhapson's method. The Newton-Rhapson's method does not require a learning rate, while gradient descent and cost minimization methods need to manually pick a learning rate. The experimental results demonstrated that our learning algorithms using RHadoop can scale well and efficiently process large data sets on commodity hardware. We also compared the performance of our Newton-Rhapson's method with gradient descent and cost minimization methods. The results showed that our newton's method appeared to be the most robust to all data tested.

Keywords: big data, logistic regression, MapReduce, RHadoop

Procedia PDF Downloads 246

7717 A Generalized Weighted Loss for Support Vextor Classification and Multilayer Perceptron

Authors: Filippo Portera

Abstract:

Usually standard algorithms employ a loss where each error is the mere absolute difference between the true value and the prediction, in case of a regression task. In the present, we present several error weighting schemes that are a generalization of the consolidated routine. We study both a binary classification model for Support Vextor Classification and a regression net for Multylayer Perceptron. Results proves that the error is never worse than the standard procedure and several times it is better.

Keywords: loss, binary-classification, MLP, weights, regression

Procedia PDF Downloads 64

7716 Spatial Distribution of Local Sheep Breeds in Antalya Province

Authors: Serife Gulden Yilmaz, Suleyman Karaman

Abstract:

Sheep breeding is important in terms of meeting both the demand of red meat consumption and the availability of industrial raw materials and the employment of the rural sector in Turkey. It is also very important to ensure the selection and continuity of the breeds that are raised in order to increase quality and productive products related to sheep breeding. The protection of local breeds and crossbreds also enables the development of the sector in the region and the reduction of imports. In this study, the data were obtained from the records of the Turkish Statistical Institute and Antalya Sheep & Goat Breeders' Association. Spatial distribution of sheep breeds in Antalya is reviewed statistically in terms of concentration at the local level for 2015 period spatially. For this reason; mapping, box plot, linear regression are used in this study. Concentration is introduced by means of studbook data on sheep breeding as locals and total sheep farm by mapping. It is observed that Pırlak breed (17.5%) and Merinos crossbreed (16.3%) have the highest concentration in the region. These breeds are respectively followed by Akkaraman breed (11%), Pirlak crossbreed (8%), Merinos breed (7.9%) Akkaraman crossbreed (7.9%) and Ivesi breed (7.2%).

Keywords: sheep breeds, local, spatial distribution, agglomeration, Antalya

Procedia PDF Downloads 256

7715 Interference among Lambsquarters and Oil Rapeseed Cultivars

Authors: Reza Siyami, Bahram Mirshekari

Abstract:

Seed and oil yield of rapeseed is considerably affected by weeds interference including mustard (Sinapis arvensis L.), lambsquarters (Chenopodium album L.) and redroot pigweed (Amaranthus retroflexus L.) throughout the East Azerbaijan province in Iran. To formulate the relationship between four independent growth variables measured in our experiment with a dependent variable, multiple regression analysis was carried out for the weed leaves number per plant (X1), green cover percentage (X2), LAI (X3) and leaf area per plant (X4) as independent variables and rapeseed oil yield as a dependent variable. The multiple regression equation is shown as follows: Seed essential oil yield (kg/ha) = 0.156 + 0.0325 (X1) + 0.0489 (X2) + 0.0415 (X3) + 0.133 (X4). Furthermore, the stepwise regression analysis was also carried out for the data obtained to test the significance of the independent variables affecting the oil yield as a dependent variable. The resulted stepwise regression equation is shown as follows: Oil yield = 4.42 + 0.0841 (X2) + 0.0801 (X3); R2 = 81.5. The stepwise regression analysis verified that the green cover percentage and LAI of weed had a marked increasing effect on the oil yield of rapeseed.

Keywords: green cover percentage, independent variable, interference, regression

Procedia PDF Downloads 389

7714 Copula-Based Estimation of Direct and Indirect Effects in Path Analysis Model

Authors: Alam Ali, Ashok Kumar Pathak

Abstract:

Path analysis is a statistical technique used to evaluate the strength of the direct and indirect effects of variables. One or more structural regression equations are used to estimate a series of parameters in order to find the better fit of data. Sometimes, exogenous variables do not show a significant strength of their direct and indirect effect when the assumption of classical regression (ordinary least squares (OLS)) are violated by the nature of the data. The main motive of this article is to investigate the efficacy of the copula-based regression approach over the classical regression approach and calculate the direct and indirect effects of variables when data violates the OLS assumption and variables are linked through an elliptical copula. We perform this study using a well-organized numerical scheme. Finally, a real data application is also presented to demonstrate the performance of the superiority of the copula approach.

Keywords: path analysis, copula-based regression models, direct and indirect effects, k-fold cross validation technique

Procedia PDF Downloads 46

7713 Analysis of Spatial Heterogeneity of Residential Prices in Guangzhou: An Actual Study Based on Poi Geographically Weighted Regression Model

Authors: Zichun Guo

Abstract:

Guangzhou's housing prices have declined for a long time compared with the other three major cities. As Guangzhou's housing price ladder increases, the influencing factors of housing prices have gradually attracted attention. This article attempts to use housing price data and POI data and uses the Kriging spatial interpolation method and the geographically weighted regression model to explore the distribution of housing prices and the impact of factors. Caused, especially located in Huadu District and the city center. The response is mainly obvious in surrounding areas, which may be related to housing positioning. Economic POIs close to the city center have stronger responses. The factors affecting housing prices provide this method, which is conducive to the management and macro-control of relevant departments, better meets the demand for home purchases, and realizes financing-side reforms.

Keywords: housing prices, spatial heterogeneity, Guangzhou, POI

Procedia PDF Downloads 6

7712 Performance Analysis of Proprietary and Non-Proprietary Tools for Regression Testing Using Genetic Algorithm

Authors: K. Hema Shankari, R. Thirumalaiselvi, N. V. Balasubramanian

Abstract:

The present paper addresses to the research in the area of regression testing with emphasis on automated tools as well as prioritization of test cases. The uniqueness of regression testing and its cyclic nature is pointed out. The difference in approach between industry, with business model as basis, and academia, with focus on data mining, is highlighted. Test Metrics are discussed as a prelude to our formula for prioritization; a case study is further discussed to illustrate this methodology. An industrial case study is also described in the paper, where the number of test cases is so large that they have to be grouped as Test Suites. In such situations, a genetic algorithm proposed by us can be used to reconfigure these Test Suites in each cycle of regression testing. The comparison is made between a proprietary tool and an open source tool using the above-mentioned metrics. Our approach is clarified through several tables.

Keywords: APFD metric, genetic algorithm, regression testing, RFT tool, test case prioritization, selenium tool

Procedia PDF Downloads 401

7711 Pre-Shared Key Distribution Algorithms' Attacks for Body Area Networks: A Survey

Authors: Priti Kumari, Tricha Anjali

Abstract:

Body Area Networks (BANs) have emerged as the most promising technology for pervasive health care applications. Since they facilitate communication of very sensitive health data, information leakage in such networks can put human life at risk, and hence security inside BANs is a critical issue. Safe distribution and periodic refreshment of cryptographic keys are needed to ensure the highest level of security. In this paper, we focus on the key distribution techniques and how they are categorized for BAN. The state-of-art pre-shared key distribution algorithms are surveyed. Possible attacks on algorithms are demonstrated with examples.

Keywords: attacks, body area network, key distribution, key refreshment, pre-shared keys

Procedia PDF Downloads 334

7710 A Hybrid Model Tree and Logistic Regression Model for Prediction of Soil Shear Strength in Clay

Authors: Ehsan Mehryaar, Seyed Armin Motahari Tabari

Abstract:

Without a doubt, soil shear strength is the most important property of the soil. The majority of fatal and catastrophic geological accidents are related to shear strength failure of the soil. Therefore, its prediction is a matter of high importance. However, acquiring the shear strength is usually a cumbersome task that might need complicated laboratory testing. Therefore, prediction of it based on common and easy to get soil properties can simplify the projects substantially. In this paper, A hybrid model based on the classification and regression tree algorithm and logistic regression is proposed where each leaf of the tree is an independent regression model. A database of 189 points for clay soil, including Moisture content, liquid limit, plastic limit, clay content, and shear strength, is collected. The performance of the developed model compared to the existing models and equations using root mean squared error and coefficient of correlation.

Keywords: model tree, CART, logistic regression, soil shear strength

Procedia PDF Downloads 165

7709 A Regression Model for Residual-State Creep Failure

Authors: Deepak Raj Bhat, Ryuichi Yatabe

Abstract:

In this study, a residual-state creep failure model was developed based on the residual-state creep test results of clayey soils. To develop the proposed model, the regression analyses were done by using the R. The model results of the failure time (tf) and critical displacement (δc) were compared with experimental results and found in close agreements to each others. It is expected that the proposed regression model for residual-state creep failure will be more useful for the prediction of displacement of different clayey soils in the future.

Keywords: regression model, residual-state creep failure, displacement prediction, clayey soils

Procedia PDF Downloads 376

7708 Climate Changes in Albania and Their Effect on Cereal Yield

Authors: Lule Basha, Eralda Gjika

Abstract:

This study is focused on analyzing climate change in Albania and its potential effects on cereal yields. Initially, monthly temperature and rainfalls in Albania were studied for the period 1960-2021. Climacteric variables are important variables when trying to model cereal yield behavior, especially when significant changes in weather conditions are observed. For this purpose, in the second part of the study, linear and nonlinear models explaining cereal yield are constructed for the same period, 1960-2021. The multiple linear regression analysis and lasso regression method are applied to the data between cereal yield and each independent variable: average temperature, average rainfall, fertilizer consumption, arable land, land under cereal production, and nitrous oxide emissions. In our regression model, heteroscedasticity is not observed, data follow a normal distribution, and there is a low correlation between factors, so we do not have the problem of multicollinearity. Machine-learning methods, such as random forest, are used to predict cereal yield responses to climacteric and other variables. Random Forest showed high accuracy compared to the other statistical models in the prediction of cereal yield. We found that changes in average temperature negatively affect cereal yield. The coefficients of fertilizer consumption, arable land, and land under cereal production are positively affecting production. Our results show that the Random Forest method is an effective and versatile machine-learning method for cereal yield prediction compared to the other two methods.

Keywords: cereal yield, climate change, machine learning, multiple regression model, random forest

Procedia PDF Downloads 58

7707 A Fuzzy Nonlinear Regression Model for Interval Type-2 Fuzzy Sets

Authors: O. Poleshchuk, E. Komarov

Abstract:

This paper presents a regression model for interval type-2 fuzzy sets based on the least squares estimation technique. Unknown coefficients are assumed to be triangular fuzzy numbers. The basic idea is to determine aggregation intervals for type-1 fuzzy sets, membership functions of whose are low membership function and upper membership function of interval type-2 fuzzy set. These aggregation intervals were called weighted intervals. Low and upper membership functions of input and output interval type-2 fuzzy sets for developed regression models are considered as piecewise linear functions.

Keywords: interval type-2 fuzzy sets, fuzzy regression, weighted interval

Procedia PDF Downloads 335

7706 Formulating a Flexible-Spread Fuzzy Regression Model Based on Dissemblance Index

Authors: Shih-Pin Chen, Shih-Syuan You

Abstract:

This study proposes a regression model with flexible spreads for fuzzy input-output data to cope with the situation that the existing measures cannot reflect the actual estimation error. The main idea is that a dissemblance index (DI) is carefully identified and defined for precisely measuring the actual estimation error. Moreover, the graded mean integration (GMI) representation is adopted for determining more representative numeric regression coefficients. Notably, to comprehensively compare the performance of the proposed model with other ones, three different criteria are adopted. The results from commonly used test numerical examples and an application to Taiwan's business monitoring indicator illustrate that the proposed dissemblance index method not only produces valid fuzzy regression models for fuzzy input-output data, but also has satisfactory and stable performance in terms of the total estimation error based on these three criteria.

Keywords: dissemblance index, forecasting, fuzzy sets, linear regression

Procedia PDF Downloads 330

7705 Bayesian Value at Risk Forecast Using Realized Conditional Autoregressive Expectiel Mdodel with an Application of Cryptocurrency

Authors: Niya Chen, Jennifer Chan

Abstract:

In the financial market, risk management helps to minimize potential loss and maximize profit. There are two ways to assess risks; the first way is to calculate the risk directly based on the volatility. The most common risk measurements are Value at Risk (VaR), sharp ratio, and beta. Alternatively, we could look at the quantile of the return to assess the risk. Popular return models such as GARCH and stochastic volatility (SV) focus on modeling the mean of the return distribution via capturing the volatility dynamics; however, the quantile/expectile method will give us an idea of the distribution with the extreme return value. It will allow us to forecast VaR using return which is direct information. The advantage of using these non-parametric methods is that it is not bounded by the distribution assumptions from the parametric method. But the difference between them is that expectile uses a second-order loss function while quantile regression uses a first-order loss function. We consider several quantile functions, different volatility measures, and estimates from some volatility models. To estimate the expectile of the model, we use Realized Conditional Autoregressive Expectile (CARE) model with the bayesian method to achieve this. We would like to see if our proposed models outperform existing models in cryptocurrency, and we will test it by using Bitcoin mainly as well as Ethereum.

Keywords: expectile, CARE Model, CARR Model, quantile, cryptocurrency, Value at Risk

Procedia PDF Downloads 82

7704 Evaluation of Best-Fit Probability Distribution for Prediction of Extreme Hydrologic Phenomena

Authors: Karim Hamidi Machekposhti, Hossein Sedghi

Abstract:

The probability distributions are the best method for forecasting of extreme hydrologic phenomena such as rainfall and flood flows. In this research, in order to determine suitable probability distribution for estimating of annual extreme rainfall and flood flows (discharge) series with different return periods, precipitation with 40 and discharge with 58 years time period had been collected from Karkheh River at Iran. After homogeneity and adequacy tests, data have been analyzed by Stormwater Management and Design Aid (SMADA) software and residual sum of squares (R.S.S). The best probability distribution was Log Pearson Type III with R.S.S value (145.91) and value (13.67) for peak discharge and Log Pearson Type III with R.S.S values (141.08) and (8.95) for maximum discharge in Jelogir Majin and Pole Zal stations, respectively. The best distribution for maximum precipitation in Jelogir Majin and Pole Zal stations was Log Pearson Type III distribution with R.S.S values (1.74&1.90) and then Pearson Type III distribution with R.S.S values (1.53&1.69). Overall, the Log Pearson Type III distributions are acceptable distribution types for representing statistics of extreme hydrologic phenomena in Karkheh River at Iran with the Pearson Type III distribution as a potential alternative.

Keywords: Karkheh River, Log Pearson Type III, probability distribution, residual sum of squares

Procedia PDF Downloads 170

7703 Image Compression Based on Regression SVM and Biorthogonal Wavelets

Authors: Zikiou Nadia, Lahdir Mourad, Ameur Soltane

Abstract:

In this paper, we propose an effective method for image compression based on SVM Regression (SVR), with three different kernels, and biorthogonal 2D Discrete Wavelet Transform. SVM regression could learn dependency from training data and compressed using fewer training points (support vectors) to represent the original data and eliminate the redundancy. Biorthogonal wavelet has been used to transform the image and the coefficients acquired are then trained with different kernels SVM (Gaussian, Polynomial, and Linear). Run-length and Arithmetic coders are used to encode the support vectors and its corresponding weights, obtained from the SVM regression. The peak signal noise ratio (PSNR) and their compression ratios of several test images, compressed with our algorithm, with different kernels are presented. Compared with other kernels, Gaussian kernel achieves better image quality. Experimental results show that the compression performance of our method gains much improvement.

Keywords: image compression, 2D discrete wavelet transform (DWT-2D), support vector regression (SVR), SVM Kernels, run-length, arithmetic coding

Procedia PDF Downloads 352

7702 A Comparative Study of Additive and Nonparametric Regression Estimators and Variable Selection Procedures

Authors: Adriano Z. Zambom, Preethi Ravikumar

Abstract:

One of the biggest challenges in nonparametric regression is the curse of dimensionality. Additive models are known to overcome this problem by estimating only the individual additive effects of each covariate. However, if the model is misspecified, the accuracy of the estimator compared to the fully nonparametric one is unknown. In this work the efficiency of completely nonparametric regression estimators such as the Loess is compared to the estimators that assume additivity in several situations, including additive and non-additive regression scenarios. The comparison is done by computing the oracle mean square error of the estimators with regards to the true nonparametric regression function. Then, a backward elimination selection procedure based on the Akaike Information Criteria is proposed, which is computed from either the additive or the nonparametric model. Simulations show that if the additive model is misspecified, the percentage of time it fails to select important variables can be higher than that of the fully nonparametric approach. A dimension reduction step is included when nonparametric estimator cannot be computed due to the curse of dimensionality. Finally, the Boston housing dataset is analyzed using the proposed backward elimination procedure and the selected variables are identified.

Keywords: additive model, nonparametric regression, variable selection, Akaike Information Criteria

Procedia PDF Downloads 240

7701 Wind Resource Estimation and Economic Analysis for Rakiraki, Fiji

Authors: Kaushal Kishore

Abstract:

Immense amount of imported fuels are used in Fiji for electricity generation, transportation and for carrying out miscellaneous household work. To alleviate its dependency on fossil fuel, paramount importance has been given to instigate the utilization of renewable energy sources for power generation and to reduce the environmental dilapidation. Amongst the many renewable energy sources, wind has been considered as one of the best identified renewable sources that are comprehensively available in Fiji. In this study the wind resource assessment for three locations in Rakiraki, Fiji has been carried out. The wind resource estimation at Rokavukavu, Navolau and at Tuvavatu has been analyzed. The average wind speed at 55 m above ground level (a.g.l) at Rokavukavu, Navolau, and Tuvavatu sites are 5.91 m/s, 8.94 m/s and 8.13 m/s with the turbulence intensity of 14.9%, 17.1%, and 11.7% respectively. The moment fitting method has been used to estimate the Weibull parameter and the power density at each sites. A high resolution wind resource map for the three locations has been developed by using Wind Atlas Analysis and Application Program (WAsP). The results obtained from WAsP exhibited good wind potential at Navolau and Tuvavatu sites. A wind farm has been proposed at Navolau and Tuvavatu site that comprises six Vergnet 275 kW wind turbines at each site. The annual energy production (AEP) for each wind farm is estimated and an economic analysis is performed. The economic analysis for the proposed wind farms at Navolau and Tuvavatu sites showed a payback period of 5 and 6 years respectively.

Keywords: annual energy production, Rakiraki Fiji, turbulence intensity, Weibull parameter, wind speed, Wind Atlas Analysis and Application Program

Procedia PDF Downloads 163

7700 Partial Least Square Regression for High-Dimentional and High-Correlated Data

Authors: Mohammed Abdullah Alshahrani

Abstract:

The research focuses on investigating the use of partial least squares (PLS) methodology for addressing challenges associated with high-dimensional correlated data. Recent technological advancements have led to experiments producing data characterized by a large number of variables compared to observations, with substantial inter-variable correlations. Such data patterns are common in chemometrics, where near-infrared (NIR) spectrometer calibrations record chemical absorbance levels across hundreds of wavelengths, and in genomics, where thousands of genomic regions' copy number alterations (CNA) are recorded from cancer patients. PLS serves as a widely used method for analyzing high-dimensional data, functioning as a regression tool in chemometrics and a classification method in genomics. It handles data complexity by creating latent variables (components) from original variables. However, applying PLS can present challenges. The study investigates key areas to address these challenges, including unifying interpretations across three main PLS algorithms and exploring unusual negative shrinkage factors encountered during model fitting. The research presents an alternative approach to addressing the interpretation challenge of predictor weights associated with PLS. Sparse estimation of predictor weights is employed using a penalty function combining a lasso penalty for sparsity and a Cauchy distribution-based penalty to account for variable dependencies. The results demonstrate sparse and grouped weight estimates, aiding interpretation and prediction tasks in genomic data analysis. High-dimensional data scenarios, where predictors outnumber observations, are common in regression analysis applications. Ordinary least squares regression (OLS), the standard method, performs inadequately with high-dimensional and highly correlated data. Copy number alterations (CNA) in key genes have been linked to disease phenotypes, highlighting the importance of accurate classification of gene expression data in bioinformatics and biology using regularized methods like PLS for regression and classification.

Keywords: partial least square regression, genetics data, negative filter factors, high dimensional data, high correlated data

Procedia PDF Downloads 9

7699 Application and Verification of Regression Model to Landslide Susceptibility Mapping

Authors: Masood Beheshtirad

Abstract:

Identification of regions having potential for landslide occurrence is one of the basic measures in natural resources management. Different landslide hazard mapping models are proposed based on the environmental condition and goals. In this research landslide hazard map using multiple regression model were provided and applicability of this model is investigated in Baghdasht watershed. Dependent variable is landslide inventory map and independent variables consist of information layers as Geology, slope, aspect, distance from river, distance from road, fault and land use. For doing this, existing landslides have been identified and an inventory map made. The landslide hazard map is based on the multiple regression provided. The level of similarity potential hazard classes and figures of this model were compared with the landslide inventory map in the SPSS environments. Results of research showed that there is a significant correlation between the potential hazard classes and figures with area of the landslides. The multiple regression model is suitable for application in the Baghdasht Watershed.

Keywords: landslide, mapping, multiple model, regression

Procedia PDF Downloads 302

7698 Predicting Bridge Pier Scour Depth with SVM

Authors: Arun Goel

Abstract:

Prediction of maximum local scour is necessary for the safety and economical design of the bridges. A number of equations have been developed over the years to predict local scour depth using laboratory data and a few pier equations have also been proposed using field data. Most of these equations are empirical in nature as indicated by the past publications. In this paper, attempts have been made to compute local depth of scour around bridge pier in dimensional and non-dimensional form by using linear regression, simple regression and SVM (Poly and Rbf) techniques along with few conventional empirical equations. The outcome of this study suggests that the SVM (Poly and Rbf) based modeling can be employed as an alternate to linear regression, simple regression and the conventional empirical equations in predicting scour depth of bridge piers. The results of present study on the basis of non-dimensional form of bridge pier scour indicates the improvement in the performance of SVM (Poly and Rbf) in comparison to dimensional form of scour.

Keywords: modeling, pier scour, regression, prediction, SVM (Poly and Rbf kernels)

Procedia PDF Downloads 425

7697 Application of Universal Distribution Factors for Real-Time Complex Power Flow Calculation

Authors: Abdullah M. Alodhaiani, Yasir A. Alturki, Mohamed A. Elkady

Abstract:

Complex power flow distribution factors, which relate line complex power flows to the bus injected complex powers, have been widely used in various power system planning and analysis studies. In particular, AC distribution factors have been used extensively in the recent power and energy pricing studies in free electricity market field. As was demonstrated in the existing literature, many of the electricity market related costing studies rely on the use of the distribution factors. These known distribution factors, whether the injection shift factors (ISF’s) or power transfer distribution factors (PTDF’s), are linear approximations of the first order sensitivities of the active power flows with respect to various variables. This paper presents a novel model for evaluating the universal distribution factors (UDF’s), which are appropriate for an extensive range of power systems analysis and free electricity market studies. These distribution factors are used for the calculations of lines complex power flows and its independent of bus power injections, they are compact matrix-form expressions with total flexibility in determining the position on the line at which line flows are measured. The proposed approach was tested on IEEE 9-Bus system. Numerical results demonstrate that the proposed approach is very accurate compared with exact method.

Keywords: distribution factors, power system, sensitivity factors, electricity market

Procedia PDF Downloads 438