Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 17749

Search results for: Gaussian process for regression

17479 Agile Software Effort Estimation Using Regression Techniques

Abstract:

Effort estimation is among the activities carried out in software development processes. An accurate model of estimation leads to project success. The method of agile effort estimation is a complex task because of the dynamic nature of software development. Researchers are still conducting studies on agile effort estimation to enhance prediction accuracy. Due to these reasons, we investigated and proposed a model on LASSO and Elastic Net regression to enhance estimation accuracy. The proposed model has major components: preprocessing, train-test split, training with default parameters, and cross-validation. During the preprocessing phase, the entire dataset is normalized. After normalization, a train-test split is performed on the dataset, setting training at 80% and testing set to 20%. We chose two different phases for training the two algorithms (Elastic Net and LASSO) regression following the train-test-split. In the first phase, the two algorithms are trained using their default parameters and evaluated on the testing data. In the second phase, the grid search technique (the grid is used to search for tuning and select optimum parameters) and 5-fold cross-validation to get the final trained model. Finally, the final trained model is evaluated using the testing set. The experimental work is applied to the agile story point dataset of 21 software projects collected from six firms. The results show that both Elastic Net and LASSO regression outperformed the compared ones. Compared to the proposed algorithms, LASSO regression achieved better predictive performance and has acquired PRED (8%) and PRED (25%) results of 100.0, MMRE of 0.0491, MMER of 0.0551, MdMRE of 0.0593, MdMER of 0.063, and MSE of 0.0007. The result implies LASSO regression algorithm trained model is the most acceptable, and higher estimation performance exists in the literature.

Keywords: agile software development, effort estimation, elastic net regression, LASSO

Procedia PDF Downloads 16

17478 Modeling Studies on the Elevated Temperatures Formability of Tube Ends Using RSM

Authors: M. J. Davidson, N. Selvaraj, L. Venugopal

Abstract:

The elevated temperature forming studies on the expansion of thin walled tubes have been studied in the present work. The influence of process parameters namely the die angle, the die ratio and the operating temperatures on the expansion of tube ends at elevated temperatures is carried out. The range of operating parameters have been identified by perfoming extensive simulation studies. The hot forming parameters have been evaluated for AA2014 alloy for performing the simulation studies. Experimental matrix has been developed from the feasible range got from the simulation results. The design of experiments is used for the optimization of process parameters. Response Surface Method’s (RSM) and Box-Behenken design (BBD) is used for developing the mathematical model for expansion. Analysis of variance (ANOVA) is used to analyze the influence of process parameters on the expansion of tube ends. The effect of various process combinations of expansion are analyzed through graphical representations. The developed model is found to be appropriate as the coefficient of determination value is very high and is equal to 0.9726. The predicted values are found to coincide well with the experimental results, within acceptable error limits.

Keywords: expansion, optimization, Response Surface Method (RSM), ANOVA, bbd, residuals, regression, tube

Procedia PDF Downloads 477

17477 Robustified Asymmetric Logistic Regression Model for Global Fish Stock Assessment

Authors: Osamu Komori, Shinto Eguchi, Hiroshi Okamura, Momoko Ichinokawa

Abstract:

The long time-series data on population assessments are essential for global ecosystem assessment because the temporal change of biomass in such a database reflects the status of global ecosystem properly. However, the available assessment data usually have limited sample sizes and the ratio of populations with low abundance of biomass (collapsed) to those with high abundance (non-collapsed) is highly imbalanced. To allow for the imbalance and uncertainty involved in the ecological data, we propose a binary regression model with mixed effects for inferring ecosystem status through an asymmetric logistic model. In the estimation equation, we observe that the weights for the non-collapsed populations are relatively reduced, which in turn puts more importance on the small number of observations of collapsed populations. Moreover, we extend the asymmetric logistic regression model using propensity score to allow for the sample biases observed in the labeled and unlabeled datasets. It robustified the estimation procedure and improved the model fitting.

Keywords: double robust estimation, ecological binary data, mixed effect logistic regression model, propensity score

Procedia PDF Downloads 231

17476 Urban-Rural Inequality in Mexico after Nafta: A Quantile Regression Analysis

Authors: Rene Valdiviezo-Issa

Abstract:

In this paper, we use Mexico’s Households Income and Expenditures (ENIGH) survey to explain the behaviour that the urban-rural expenditure gap has had since Mexico’s incorporation to the North American Free Trade Agreement (NAFTA) in 1994 and we compare it with the latest available survey, which took place in 2014. We use real trimestral expenditure per capita (RTEPC) as the measure of welfare. We use quantile regressions and a quantile regression decomposition to describe the gap between urban and rural distributions of log RTEPC. We discover that the decrease in the difference between the urban and rural distributions of log RTEPC, or inequality, is motivated because of a deprivation of the urban areas, in very specific characteristics, rather than an improvement of the urban areas. When using the decomposition we observe that the gap is primarily brought about because differences in returns to covariates between the urban and rural areas.

Keywords: quantile regression, urban-rural inequality, inequality in Mexico, income decompositon

Procedia PDF Downloads 251

17475 Evaluation of the Beach Erosion Process in Varadero, Matanzas, Cuba: Effects of Different Hurricane Trajectories

Authors: Ana Gabriela Diaz, Luis Fermín Córdova, Jr., Roberto Lamazares

Abstract:

The island of Cuba, the largest of the Greater Antilles, is located in the tropical North Atlantic. It is annually affected by numerous weather events, which have caused severe damage to our coastal areas. In the same way that many other coastlines around the world, the beautiful beaches of the Hicacos Peninsula also suffer from erosion. This leads to a structural regression of the coastline. If measures are not taken, the hotels will be exposed to the advance of the sea, and it will be a serious problem for the economy. With the aim of studying the intensity of this type of activity, specialists of group of coastal and marine engineering from CIH, in the framework of the research conducted within the project MEGACOSTAS 2, provide their research to simulate extreme events and assess their impact in coastal areas, mainly regarding the definition of flood volumes and morphodynamic changes in sandy beaches. The main objective of this work is the evaluation of the process of Varadero beach erosion (the coastal sector has an important impact in the country's economy) on the Hicacos Peninsula for different paths of hurricanes. The mathematical model XBeach, which was integrated into the Coastal engineering system introduced by the project of MEGACOSTA 2 to determine the area and the more critical profiles for the path of hurricanes under study, was applied. The results of this project have shown that Center area is the greatest dynamic area in the simulation of the three paths of hurricanes under study, showing high erosion volumes and the greatest average length of regression of the coastline, from 15- 22 m.

Keywords: beach, erosion, mathematical model, coastal areas

Procedia PDF Downloads 199

17474 The Influence of Air Temperature Controls in Estimation of Air Temperature over Homogeneous Terrain

Authors: Fariza Yunus, Jasmee Jaafar, Zamalia Mahmud, Nurul Nisa’ Khairul Azmi, Nursalleh K. Chang, Nursalleh K. Chang

Abstract:

Variation of air temperature from one place to another is cause by air temperature controls. In general, the most important control of air temperature is elevation. Another significant independent variable in estimating air temperature is the location of meteorological stations. Distances to coastline and land use type are also contributed to significant variations in the air temperature. On the other hand, in homogeneous terrain direct interpolation of discrete points of air temperature work well to estimate air temperature values in un-sampled area. In this process the estimation is solely based on discrete points of air temperature. However, this study presents that air temperature controls also play significant roles in estimating air temperature over homogenous terrain of Peninsular Malaysia. An Inverse Distance Weighting (IDW) interpolation technique was adopted to generate continuous data of air temperature. This study compared two different datasets, observed mean monthly data of T, and estimation error of T–T’, where T’ estimated value from a multiple regression model. The multiple regression model considered eight independent variables of elevation, latitude, longitude, coastline, and four land use types of water bodies, forest, agriculture and build up areas, to represent the role of air temperature controls. Cross validation analysis was conducted to review accuracy of the estimation values. Final results show, estimation values of T–T’ produced lower errors for mean monthly mean air temperature over homogeneous terrain in Peninsular Malaysia.

Keywords: air temperature control, interpolation analysis, peninsular Malaysia, regression model, air temperature

Procedia PDF Downloads 347

17473 BART Matching Method: Using Bayesian Additive Regression Tree for Data Matching

Authors: Gianna Zou

Abstract:

Propensity score matching (PSM), introduced by Paul R. Rosenbaum and Donald Rubin in 1983, is a popular statistical matching technique which tries to estimate the treatment effects by taking into account covariates that could impact the efficacy of study medication in clinical trials. PSM can be used to reduce the bias due to confounding variables. However, PSM assumes that the response values are normally distributed. In some cases, this assumption may not be held. In this paper, a machine learning method - Bayesian Additive Regression Tree (BART), is used as a more robust method of matching. BART can work well when models are misspecified since it can be used to model heterogeneous treatment effects. Moreover, it has the capability to handle non-linear main effects and multiway interactions. In this research, a BART Matching Method (BMM) is proposed to provide a more reliable matching method over PSM. By comparing the analysis results from PSM and BMM, BMM can perform well and has better prediction capability when the response values are not normally distributed.

Keywords: BART, Bayesian, matching, regression

Procedia PDF Downloads 113

17472 Separating Landform from Noise in High-Resolution Digital Elevation Models through Scale-Adaptive Window-Based Regression

Authors: Anne M. Denton, Rahul Gomes, David W. Franzen

Abstract:

High-resolution elevation data are becoming increasingly available, but typical approaches for computing topographic features, like slope and curvature, still assume small sliding windows, for example, of size 3x3. That means that the digital elevation model (DEM) has to be resampled to the scale of the landform features that are of interest. Any higher resolution is lost in this resampling. When the topographic features are computed through regression that is performed at the resolution of the original data, the accuracy can be much higher, and the reported result can be adjusted to the length scale that is relevant locally. Slope and variance are calculated for overlapping windows, meaning that one regression result is computed per raster point. The number of window centers per area is the same for the output as for the original DEM. Slope and variance are computed by performing regression on the points in the surrounding window. Such an approach is computationally feasible because of the additive nature of regression parameters and variance. Any doubling of window size in each direction only takes a single pass over the data, corresponding to a logarithmic scaling of the resulting algorithm as a function of the window size. Slope and variance are stored for each aggregation step, allowing the reported slope to be selected to minimize variance. The approach thereby adjusts the effective window size to the landform features that are characteristic to the area within the DEM. Starting with a window size of 2x2, each iteration aggregates 2x2 non-overlapping windows from the previous iteration. Regression results are stored for each iteration, and the slope at minimal variance is reported in the final result. As such, the reported slope is adjusted to the length scale that is characteristic of the landform locally. The length scale itself and the variance at that length scale are also visualized to aid in interpreting the results for slope. The relevant length scale is taken to be half of the window size of the window over which the minimum variance was achieved. The resulting process was evaluated for 1-meter DEM data and for artificial data that was constructed to have defined length scales and added noise. A comparison with ESRI ArcMap was performed and showed the potential of the proposed algorithm. The resolution of the resulting output is much higher and the slope and aspect much less affected by noise. Additionally, the algorithm adjusts to the scale of interest within the region of the image. These benefits are gained without additional computational cost in comparison with resampling the DEM and computing the slope over 3x3 images in ESRI ArcMap for each resolution. In summary, the proposed approach extracts slope and aspect of DEMs at the lengths scales that are characteristic locally. The result is of higher resolution and less affected by noise than existing techniques.

Keywords: high resolution digital elevation models, multi-scale analysis, slope calculation, window-based regression

Procedia PDF Downloads 94

17471 The Relationship Between Hourly Compensation and Unemployment Rate Using the Panel Data Regression Analysis

Authors: S. K. Ashiquer Rahman

Abstract:

the paper concentrations on the importance of hourly compensation, emphasizing the significance of the unemployment rate. There are the two most important factors of a nation these are its unemployment rate and hourly compensation. These are not merely statistics but they have profound effects on individual, families, and the economy. They are inversely related to one another. When we consider the unemployment rate that will probably decline as hourly compensations in manufacturing rise. But when we reduced the unemployment rates and increased job prospects could result from higher compensation. That’s why, the increased hourly compensation in the manufacturing sector that could have a favorable effect on job changing issues. Moreover, the relationship between hourly compensation and unemployment is complex and influenced by broader economic factors. In this paper, we use panel data regression models to evaluate the expected link between hourly compensation and unemployment rate in order to determine the effect of hourly compensation on unemployment rate. We estimate the fixed effects model, evaluate the error components, and determine which model (the FEM or ECM) is better by pooling all 60 observations. We then analysis and review the data by comparing 3 several countries (United States, Canada and the United Kingdom) using panel data regression models. Finally, we provide result, analysis and a summary of the extensive research on how the hourly compensation effects on the unemployment rate. Additionally, this paper offers relevant and useful informational to help the government and academic community use an econometrics and social approach to lessen on the effect of the hourly compensation on Unemployment rate to eliminate the problem.

Keywords: hourly compensation, Unemployment rate, panel data regression models, dummy variables, random effects model, fixed effects model, the linear regression model

Procedia PDF Downloads 31

17470 Assessing the Impacts of Urbanization on Urban Precincts: A Case of Golconda Precinct, Hyderabad

Authors: Sai AKhila Budaraju

Abstract:

Heritage sites are an integral part of cities and carry a sense of identity to the cities/ towns, but the process of urbanization is a carrying potential threat for the loss of these heritage sites/monuments. Both Central and State Governments listed the historic Golconda fort as National Important Monument and the Heritage precinct with eight heritage-listed buildings and two historical sites respectively, for conservation and preservation, due to the presence of IT Corridor 6kms away accommodating more people in the precinct is under constant pressure. The heritage precinct possesses high property values, being a prime location connecting the IT corridor and CBD (central business district )areas. The primary objective of the study was to assess and identify the factors that are affecting the heritage precinct through Mapping and documentation, Identifying and assessing the factors through empirical analysis, Ordinal regression analysis and Hedonic Pricing Model. Ordinal regression analysis was used to identify the factors that contribute to the changes in the precinct due to urbanization. Hedonic Pricing Model was used to understand and establish a relation whether the presence of historical monuments is also a contributing factor to the property value and to what extent this influence can contribute. The above methods and field visit indicates the Physical, socio-economic factors and the neighborhood characteristics of the precinct contributing to the property values. The outturns and the potential elements derived from the analysis of the Development Control Rules were derived as recommendations to Integrate both Old and newly built environments.

Keywords: heritage planning, heritage conservation, hedonic pricing model, ordinal regression analysis

Procedia PDF Downloads 160

17469 Generating 3D Battery Cathode Microstructures using Gaussian Mixture Models and Pix2Pix

Authors: Wesley Teskey, Vedran Glavas, Julian Wegener

Abstract:

Generating battery cathode microstructures is an important area of research, given the proliferation of the use of automotive batteries. Currently, finite element analysis (FEA) is often used for simulations of battery cathode microstructures before physical batteries can be manufactured and tested to verify the simulation results. Unfortunately, a key drawback of using FEA is that this method of simulation is very slow in terms of computational runtime. Generative AI offers the key advantage of speed when compared to FEA, and because of this, generative AI is capable of evaluating very large numbers of candidate microstructures. Given AI generated candidate microstructures, a subset of the promising microstructures can be selected for further validation using FEA. Leveraging the speed advantage of AI allows for a better final microstructural selection because high speed allows for the evaluation of many more candidate microstructures. For the approach presented, battery cathode 3D candidate microstructures are generated using Gaussian Mixture Models (GMMs) and pix2pix. This approach first uses GMMs to generate a population of spheres (representing the “active material” of the cathode). Once spheres have been sampled from the GMM, they are placed within a microstructure. Subsequently, the pix2pix sweeps over the 3D microstructure (iteratively) slice by slice and adds details to the microstructure to determine what portions of the microstructure will become electrolyte and what part of the microstructure will become binder. In this manner, each subsequent slice of the microstructure is evaluated using pix2pix, where the inputs into pix2pix are the previously processed layers of the microstructure. By feeding into pix2pix previously fully processed layers of the microstructure, pix2pix can be used to ensure candidate microstructures represent a realistic physical reality. More specifically, in order for the microstructure to represent a realistic physical reality, the locations of electrolyte and binder in each layer of the microstructure must reasonably match the locations of electrolyte and binder in previous layers to ensure geometric continuity. Using the above outlined approach, a 10x to 100x speed increase was possible when generating candidate microstructures using AI when compared to using a FEA only approach for this task. A key metric for evaluating microstructures was the battery specific power value that the microstructures would be able to produce. The best generative AI result obtained was a 12% increase in specific power for a candidate microstructure when compared to what a FEA only approach was capable of producing. This 12% increase in specific power was verified by FEA simulation.

Keywords: finite element analysis, gaussian mixture models, generative design, Pix2Pix, structural design

Procedia PDF Downloads 73

17468 Simulation of a Fluid Catalytic Cracking Process

Authors: Sungho Kim, Dae Shik Kim, Jong Min Lee

Abstract:

Fluid catalytic cracking (FCC) process is one of the most important process in modern refinery indusrty. This paper focuses on the fluid catalytic cracking (FCC) process. As the FCC process is difficult to model well, due to its nonlinearities and various interactions between its process variables, rigorous process modeling of whole FCC plant is demanded for control and plant-wide optimization of the plant. In this study, a process design for the FCC plant includes riser reactor, main fractionator, and gas processing unit was developed. A reactor model was described based on four-lumped kinetic scheme. Main fractionator, gas processing unit and other process units are designed to simulate real plant data, using a process flowsheet simulator, Aspen PLUS. The custom reactor model was integrated with the process flowsheet simulator to develop an integrated process model.

Keywords: fluid catalytic cracking, simulation, plant data, process design

Procedia PDF Downloads 421

17467 Chemometric QSRR Evaluation of Behavior of s-Triazine Pesticides in Liquid Chromatography

Authors: Lidija R. Jevrić, Sanja O. Podunavac-Kuzmanović, Strahinja Z. Kovačević

Abstract:

This study considers the selection of the most suitable in silico molecular descriptors that could be used for s-triazine pesticides characterization. Suitable descriptors among topological, geometrical and physicochemical are used for quantitative structure-retention relationships (QSRR) model establishment. Established models were obtained using linear regression (LR) and multiple linear regression (MLR) analysis. In this paper, MLR models were established avoiding multicollinearity among the selected molecular descriptors. Statistical quality of established models was evaluated by standard and cross-validation statistical parameters. For detection of similarity or dissimilarity among investigated s-triazine pesticides and their classification, principal component analysis (PCA) and hierarchical cluster analysis (HCA) were used and gave similar grouping. This study is financially supported by COST action TD1305.

Keywords: chemometrics, classification analysis, molecular descriptors, pesticides, regression analysis

Procedia PDF Downloads 352

17466 Statistic Regression and Open Data Approach for Identifying Economic Indicators That Influence e-Commerce

Authors: Apollinaire Barme, Simon Tamayo, Arthur Gaudron

Abstract:

This paper presents a statistical approach to identify explanatory variables linearly related to e-commerce sales. The proposed methodology allows specifying a regression model in order to quantify the relevance between openly available data (economic and demographic) and national e-commerce sales. The proposed methodology consists in collecting data, preselecting input variables, performing regressions for choosing variables and models, testing and validating. The usefulness of the proposed approach is twofold: on the one hand, it allows identifying the variables that influence e- commerce sales with an accessible approach. And on the other hand, it can be used to model future sales from the input variables. Results show that e-commerce is linearly dependent on 11 economic and demographic indicators.

Keywords: e-commerce, statistical modeling, regression, empirical research

Procedia PDF Downloads 192

17465 Modified Clusterwise Regression for Pavement Management

Authors: Mukesh Khadka, Alexander Paz, Hanns de la Fuente-Mella

Abstract:

Typically, pavement performance models are developed in two steps: (i) pavement segments with similar characteristics are grouped together to form a cluster, and (ii) the corresponding performance models are developed using statistical techniques. A challenge is to select the characteristics that define clusters and the segments associated with them. If inappropriate characteristics are used, clusters may include homogeneous segments with different performance behavior or heterogeneous segments with similar performance behavior. Prediction accuracy of performance models can be improved by grouping the pavement segments into more uniform clusters by including both characteristics and a performance measure. This grouping is not always possible due to limited information. It is impractical to include all the potential significant factors because some of them are potentially unobserved or difficult to measure. Historical performance of pavement segments could be used as a proxy to incorporate the effect of the missing potential significant factors in clustering process. The current state-of-the-art proposes Clusterwise Linear Regression (CLR) to determine the pavement clusters and the associated performance models simultaneously. CLR incorporates the effect of significant factors as well as a performance measure. In this study, a mathematical program was formulated for CLR models including multiple explanatory variables. Pavement data collected recently over the entire state of Nevada were used. International Roughness Index (IRI) was used as a pavement performance measure because it serves as a unified standard that is widely accepted for evaluating pavement performance, especially in terms of riding quality. Results illustrate the advantage of the using CLR. Previous studies have used CLR along with experimental data. This study uses actual field data collected across a variety of environmental, traffic, design, and construction and maintenance conditions.

Keywords: clusterwise regression, pavement management system, performance model, optimization

Procedia PDF Downloads 225

17464 Evaluation of Organizational Culture and Its Effects on Innovation in the IT Sector: A Case Study from UAE

Authors: Amir M. Shikhli, Refaat H. Abdel-Razek, Salaheddine Bendak

Abstract:

Innovation is considered to be one of the key factors that influence long-term success of any company. The problem of many organizations in developing countries is trying to implement innovation without a strong basis within the organizational culture to support it. The objective of this study is to assess the effects of organizational culture on innovation in one of the biggest information technology organizations in UAE, Injazat Data System. First, an Organizational Culture Assessment Instrument (OCAI) was used as a survey and Competing Value Framework as a model to analyze the existing culture within the organization and determine its characteristics. Following that, a modified version of the Community Innovation Survey (CIS) was used to determine innovation types introduced by the organization. Then multiple linear regression analysis was used to find out the effects of existing organizational culture on innovation. Results show that existing organizational culture is composed of a combination of Hierarchy (29.4%), Clan (25.8%), Market (24.9%) and Adhocracy (19.9%). Results of the second survey show that the organization focuses on organizational innovation (26.8%) followed by market and product innovations (25.6%) and finally process innovation (22.0%). Regression analysis results reveal that for each innovation type there is a recommended combination of the four culture types. For product innovation, the combination is 47.4% Clan, 17.9% Adhocracy, 1.0% Market and 33.3% Hierarchy; for process innovation it is 19.7% Clan, 45.2% Adhocracy, 32.0% Market and 3.1% Hierarchy; for organizational innovation the combination is 5.4% Clan, 32.7% Adhocracy, 6.0% Market and 55.9% Hierarchy; and for market innovation it is 25.5% Clan, 42.6% Adhocracy, 32.6% Market and 8.4% Hierarchy. Based on these recommended combinations, this study suggests two ways to enhance the innovation culture in the organization. First, if the management decides on the innovation type to be enhanced, a comparison between the existing culture and the recommended combination of selected innovation types will lead to difference in percentages of each culture type. Then further analysis should show how to modify the existing culture to match the recommended combination. Second, if the innovation type is not selected, but the management wants to enhance innovation culture in the organization, the difference in percentages of each culture type will lead to finding out the recommended combination of culture types that gives the narrowest gap between existing culture and recommended combination.

Keywords: developing countries, organizational culture, innovation types, product innovation, process innovation, organizational innovation, marketing innovation

Procedia PDF Downloads 239

17463 Quasistationary States and Mean Field Model

Authors: Sergio Curilef, Boris Atenas

Abstract:

Systems with long-range interactions are very common in nature. They are observed from the atomic scale to the astronomical scale and exhibit anomalies, such as inequivalence of ensembles, negative heat capacity, ergodicity breaking, nonequilibrium phase transitions, quasistationary states, and anomalous diffusion. These anomalies are exacerbated when special initial conditions are imposed; in particular, we use the so-called water bag initial conditions that stand for a uniform distribution. Several theoretical and practical implications are discussed here. A potential energy inspired by dipole-dipole interactions is proposed to build the dipole-type Hamiltonian mean-field model. As expected, the dynamics is novel and general to the behavior of systems with long-range interactions, which is obtained through molecular dynamics technique. Two plateaus sequentially emerge before arriving at equilibrium, which are corresponding to two different quasistationary states. The first plateau is a type of quasistationary state the lifetime of which depends on a power law of N and the second plateau seems to be a true quasistationary state as reported in the literature. The general behavior of the model according to its dynamics and thermodynamics is described. Using numerical simulation we characterize the mean kinetic energy, caloric curve, and the diffusion law through the mean square of displacement. The present challenge is to characterize the distributions in phase space. Certainly, the equilibrium state is well characterized by the Gaussian distribution, but quasistationary states in general depart from any Gaussian function.

Keywords: dipole-type interactions, dynamics and thermodynamics, mean field model, quasistationary states

Procedia PDF Downloads 177

17462 Support Vector Regression for Retrieval of Soil Moisture Using Bistatic Scatterometer Data at X-Band

Authors: Dileep Kumar Gupta, Rajendra Prasad, Pradeep Kumar, Varun Narayan Mishra, Ajeet Kumar Vishwakarma, Prashant K. Srivastava

Abstract:

An approach was evaluated for the retrieval of soil moisture of bare soil surface using bistatic scatterometer data in the angular range of 200 to 700 at VV- and HH- polarization. The microwave data was acquired by specially designed X-band (10 GHz) bistatic scatterometer. The linear regression analysis was done between scattering coefficients and soil moisture content to select the suitable incidence angle for retrieval of soil moisture content. The 250 incidence angle was found more suitable. The support vector regression analysis was used to approximate the function described by the input-output relationship between the scattering coefficient and corresponding measured values of the soil moisture content. The performance of support vector regression algorithm was evaluated by comparing the observed and the estimated soil moisture content by statistical performance indices %Bias, root mean squared error (RMSE) and Nash-Sutcliffe Efficiency (NSE). The values of %Bias, root mean squared error (RMSE) and Nash-Sutcliffe Efficiency (NSE) were found 2.9451, 1.0986, and 0.9214, respectively at HH-polarization. At VV- polarization, the values of %Bias, root mean squared error (RMSE) and Nash-Sutcliffe Efficiency (NSE) were found 3.6186, 0.9373, and 0.9428, respectively.

Keywords: bistatic scatterometer, soil moisture, support vector regression, RMSE, %Bias, NSE

Procedia PDF Downloads 386

17461 Optimization of Springback Prediction in U-Channel Process Using Response Surface Methodology

Authors: Muhamad Sani Buang, Shahrul Azam Abdullah, Juri Saedon

Abstract:

There is not much effective guideline on development of design parameters selection on springback for advanced high strength steel sheet metal in U-channel process during cold forming process. This paper presents the development of predictive model for springback in U-channel process on advanced high strength steel sheet employing Response Surface Methodology (RSM). The experimental was performed on dual phase steel sheet, DP590 in U-channel forming process while design of experiment (DoE) approach was used to investigates the effects of four factors namely blank holder force (BHF), clearance (C) and punch travel (Tp) and rolling direction (R) were used as input parameters using two level values by applying Full Factorial design (24). From a statistical analysis of variant (ANOVA), result showed that blank holder force (BHF), clearance (C) and punch travel (Tp) displayed significant effect on springback of flange angle (β2) and wall opening angle (β1), while rolling direction (R) factor is insignificant. The significant parameters are optimized in order to reduce the springback behavior using Central Composite Design (CCD) in RSM and the optimum parameters were determined. A regression model for springback was developed. The effect of individual parameters and their response was also evaluated. The results obtained from optimum model are in agreement with the experimental values

Keywords: advance high strength steel, u-channel process, springback, design of experiment, optimization, response surface methodology (rsm)

Procedia PDF Downloads 511

17460 Reliability Modeling on Drivers’ Decision during Yellow Phase

Authors: Sabyasachi Biswas, Indrajit Ghosh

Abstract:

The random and heterogeneous behavior of vehicles in India puts up a greater challenge for researchers. Stop-and-go modeling at signalized intersections under heterogeneous traffic conditions has remained one of the most sought-after fields. Vehicles are often caught up in the dilemma zone and are unable to take quick decisions whether to stop or cross the intersection. This hampers the traffic movement and may lead to accidents. The purpose of this work is to develop a stop and go prediction model that depicts the drivers’ decision during the yellow time at signalised intersections. To accomplish this, certain traffic parameters were taken into account to develop surrogate model. This research investigated the Stop and Go behavior of the drivers by collecting data from 4-signalized intersections located in two major Indian cities. Model was developed to predict the drivers’ decision making during the yellow phase of the traffic signal. The parameters used for modeling included distance to stop line, time to stop line, speed, and length of the vehicle. A Kriging base surrogate model has been developed to investigate the drivers’ decision-making behavior in amber phase. It is observed that the proposed approach yields a highly accurate result (97.4 percent) by Gaussian function. It was observed that the accuracy for the crossing probability was 95.45, 90.9 and 86.36.11 percent respectively as predicted by the Kriging models with Gaussian, Exponential and Linear functions.

Keywords: decision-making decision, dilemma zone, surrogate model, Kriging

Procedia PDF Downloads 281

17459 Ground Motion Modeling Using the Least Absolute Shrinkage and Selection Operator

Authors: Yildiz Stella Dak, Jale Tezcan

Abstract:

Ground motion models that relate a strong motion parameter of interest to a set of predictive seismological variables describing the earthquake source, the propagation path of the seismic wave, and the local site conditions constitute a critical component of seismic hazard analyses. When a sufficient number of strong motion records are available, ground motion relations are developed using statistical analysis of the recorded ground motion data. In regions lacking a sufficient number of recordings, a synthetic database is developed using stochastic, theoretical or hybrid approaches. Regardless of the manner the database was developed, ground motion relations are developed using regression analysis. Development of a ground motion relation is a challenging process which inevitably requires the modeler to make subjective decisions regarding the inclusion criteria of the recordings, the functional form of the model and the set of seismological variables to be included in the model. Because these decisions are critically important to the validity and the applicability of the model, there is a continuous interest on procedures that will facilitate the development of ground motion models. This paper proposes the use of the Least Absolute Shrinkage and Selection Operator (LASSO) in selecting the set predictive seismological variables to be used in developing a ground motion relation. The LASSO can be described as a penalized regression technique with a built-in capability of variable selection. Similar to the ridge regression, the LASSO is based on the idea of shrinking the regression coefficients to reduce the variance of the model. Unlike ridge regression, where the coefficients are shrunk but never set equal to zero, the LASSO sets some of the coefficients exactly to zero, effectively performing variable selection. Given a set of candidate input variables and the output variable of interest, LASSO allows ranking the input variables in terms of their relative importance, thereby facilitating the selection of the set of variables to be included in the model. Because the risk of overfitting increases as the ratio of the number of predictors to the number of recordings increases, selection of a compact set of variables is important in cases where a small number of recordings are available. In addition, identification of a small set of variables can improve the interpretability of the resulting model, especially when there is a large number of candidate predictors. A practical application of the proposed approach is presented, using more than 600 recordings from the National Geospatial-Intelligence Agency (NGA) database, where the effect of a set of seismological predictors on the 5% damped maximum direction spectral acceleration is investigated. The set of candidate predictors considered are Magnitude, Rrup, Vs30. Using LASSO, the relative importance of the candidate predictors has been ranked. Regression models with increasing levels of complexity were constructed using one, two, three, and four best predictors, and the models’ ability to explain the observed variance in the target variable have been compared. The bias-variance trade-off in the context of model selection is discussed.

Keywords: ground motion modeling, least absolute shrinkage and selection operator, penalized regression, variable selection

Procedia PDF Downloads 297

17458 Predictive Analysis for Big Data: Extension of Classification and Regression Trees Algorithm

Authors: Ameur Abdelkader, Abed Bouarfa Hafida

Abstract:

Since its inception, predictive analysis has revolutionized the IT industry through its robustness and decision-making facilities. It involves the application of a set of data processing techniques and algorithms in order to create predictive models. Its principle is based on finding relationships between explanatory variables and the predicted variables. Past occurrences are exploited to predict and to derive the unknown outcome. With the advent of big data, many studies have suggested the use of predictive analytics in order to process and analyze big data. Nevertheless, they have been curbed by the limits of classical methods of predictive analysis in case of a large amount of data. In fact, because of their volumes, their nature (semi or unstructured) and their variety, it is impossible to analyze efficiently big data via classical methods of predictive analysis. The authors attribute this weakness to the fact that predictive analysis algorithms do not allow the parallelization and distribution of calculation. In this paper, we propose to extend the predictive analysis algorithm, Classification And Regression Trees (CART), in order to adapt it for big data analysis. The major changes of this algorithm are presented and then a version of the extended algorithm is defined in order to make it applicable for a huge quantity of data.

Keywords: predictive analysis, big data, predictive analysis algorithms, CART algorithm

Procedia PDF Downloads 110

17457 Forecasting Equity Premium Out-of-Sample with Sophisticated Regression Training Techniques

Authors: Jonathan Iworiso

Abstract:

Forecasting the equity premium out-of-sample is a major concern to researchers in finance and emerging markets. The quest for a superior model that can forecast the equity premium with significant economic gains has resulted in several controversies on the choice of variables and suitable techniques among scholars. This research focuses mainly on the application of Regression Training (RT) techniques to forecast monthly equity premium out-of-sample recursively with an expanding window method. A broad category of sophisticated regression models involving model complexity was employed. The RT models include Ridge, Forward-Backward (FOBA) Ridge, Least Absolute Shrinkage and Selection Operator (LASSO), Relaxed LASSO, Elastic Net, and Least Angle Regression were trained and used to forecast the equity premium out-of-sample. In this study, the empirical investigation of the RT models demonstrates significant evidence of equity premium predictability both statistically and economically relative to the benchmark historical average, delivering significant utility gains. They seek to provide meaningful economic information on mean-variance portfolio investment for investors who are timing the market to earn future gains at minimal risk. Thus, the forecasting models appeared to guarantee an investor in a market setting who optimally reallocates a monthly portfolio between equities and risk-free treasury bills using equity premium forecasts at minimal risk.

Keywords: regression training, out-of-sample forecasts, expanding window, statistical predictability, economic significance, utility gains

Procedia PDF Downloads 70

17456 Self-Image of Police Officers

Authors: Leo Carlo B. Rondina

Abstract:

Self-image is an important factor to improve the self-esteem of the personnel. The purpose of the study is to determine the self-image of the police. The respondents were the 503 policemen assigned in different Police Station in Davao City, and they were chosen with the used of random sampling. With the used of Exploratory Factor Analysis (EFA), latent construct variables of police image were identified as follows; professionalism, obedience, morality and justice and fairness. Further, ordinal regression indicates statistical characteristics on ages 21-40 which means the age of the respondent statistically improves self-image.

Keywords: police image, exploratory factor analysis, ordinal regression, Galatea effect

Procedia PDF Downloads 248

17455 Regression Analysis of Travel Indicators and Public Transport Usage in Urban Areas

Authors: Mehdi Moeinaddini, Zohreh Asadi-Shekari, Muhammad Zaly Shah, Amran Hamzah

Abstract:

Currently, planners try to have more green travel options to decrease economic, social and environmental problems. Therefore, this study tries to find significant urban travel factors to be used to increase the usage of alternative urban travel modes. This paper attempts to identify the relationship between prominent urban mobility indicators and daily trips by public transport in 30 cities from various parts of the world. Different travel modes, infrastructures and cost indicators were evaluated in this research as mobility indicators. The results of multi-linear regression analysis indicate that there is a significant relationship between mobility indicators and the daily usage of public transport.

Keywords: green travel modes, urban travel indicators, daily trips by public transport, multi-linear regression analysis

Procedia PDF Downloads 519

17454 Development of Generalized Correlation for Liquid Thermal Conductivity of N-Alkane and Olefin

Authors: A. Ishag Mohamed, A. A. Rabah

Abstract:

The objective of this research is to develop a generalized correlation for the prediction of thermal conductivity of n-Alkanes and Alkenes. There is a minority of research and lack of correlation for thermal conductivity of liquids in the open literature. The available experimental data are collected covering the groups of n-Alkanes and Alkenes.The data were assumed to correlate to temperature using Filippov correlation. Nonparametric regression of Grace Algorithm was used to develop the generalized correlation model. A spread sheet program based on Microsoft Excel was used to plot and calculate the value of the coefficients. The results obtained were compared with the data that found in Perry's Chemical Engineering Hand Book. The experimental data correlated to the temperature ranged "between" 273.15 to 673.15 K, with R2 = 0.99.The developed correlation reproduced experimental data that which were not included in regression with absolute average percent deviation (AAPD) of less than 7 %. Thus the spread sheet was quite accurate which produces reliable data.

Keywords: N-Alkanes, N-Alkenes, nonparametric, regression

Procedia PDF Downloads 621

17453 Modeling and Simulation of Fluid Catalytic Cracking Process

Authors: Sungho Kim, Dae Shik Kim, Jong Min Lee

Abstract:

Fluid catalytic cracking (FCC) process is one of the most important process in modern refinery industry. This paper focuses on the fluid catalytic cracking (FCC) process. As the FCC process is difficult to model well, due to its non linearities and various interactions between its process variables, rigorous process modeling of whole FCC plant is demanded for control and plant-wide optimization of the plant. In this study, a process design for the FCC plant includes riser reactor, main fractionator, and gas processing unit was developed. A reactor model was described based on four-lumped kinetic scheme. Main fractionator, gas processing unit and other process units are designed to simulate real plant data, using a process flow sheet simulator, Aspen PLUS. The custom reactor model was integrated with the process flow sheet simulator to develop an integrated process model.

Keywords: fluid catalytic cracking, simulation, plant data, process design

Procedia PDF Downloads 486

17452 Response Surface Methodology for the Optimization of Paddy Husker by Medium Brown Rice Peeling Machine 6 Rubber Type

Authors: S. Bangphan, P. Bangphan, C. Ketsombun, T. Sammana

Abstract:

Optimization of response surface methodology (RSM) was employed to study the effects of three factor (rubber of clearance, spindle of speed, and rice of moisture) in brown rice peeling machine of the optimal good rice yield (99.67, average of three repeats). The optimized composition derived from RSM regression was analyzed using Regression analysis and Analysis of Variance (ANOVA). At a significant level α=0.05, the values of Regression coefficient, R2 adjust were 96.55% and standard deviation were 1.05056. The independent variables are initial rubber of clearance, spindle of speed and rice of moisture parameters namely. The investigating responses are final rubber clearance, spindle of speed and moisture of rice.

Keywords: brown rice, response surface methodology (RSM), peeling machine, optimization, paddy husker

Procedia PDF Downloads 542

17451 The Effect of Peer Support on Adaptation to University Life in First Year Students of the University

Authors: Bilgen Ozluk, Ayfer Karaaslan

Abstract:

Introduction: Adaptation to university life is a difficult process for students. In peer support, students are expected to help other students or sometimes adults using their helping skills. Therefore, it is expected that peer support will have significant effect on students’ adaptation to university life. Aim: This study was conducted with the aim of determining the effect of peer support on adaptation to university life in the first year students of the faculty of health sciences. Methods: The population consists of 340 first year university students receiving education in the departments of nursing, health management, social services, nutrition and dietetics, physiotherapy and rehabilitation at an university located in the province of Konya. The sample of the study consisted of 274 students who voluntarily participated in the study. The data were collected between the dates 23 May 2016 and 3 June 2016. The data were collected using the socio-demographic information, the peer support scale and the university life adaptation scale. Ethical approvals for the study and permission from the university were taken. Numbers, percentages, averages, one-Way ANOVA, pearson correlation analysis and regression analysis have been used in assessing the data. Findings: When the problems most frequently encountered by students just starting the university were ordered, problems regarding their classes took the first place by 41.6%, socio-cultural problems took the second place by 38.7%, and economic problems took the third place by 37.6%. The mean total score of the Adaptation to University Life Scale was found to be 216.78±32.15. Considering that the lowest and highest scores that can be gained from the scale are 132 and 289 respectively, it was found that the adaptation to university life levels of the students were higher than the average. The mean adaptation to university life score of the nursing students was higher than those of the students of other departments. The mean score of ‘the Peer Support Scale’ was found to be 47.24±10.27. Considering that the lowest and highest scores that can be gained from the scale are 17 and 68 respectively, it was found that the peer support levels of the students were higher than the average. As a result of the regression analysis, it was found that 20% of the total variance regarding adaptation to university life was explained by peer support. Conclution: Receiving the support peer groups becomes highly important in the university adaptation process of first-year students. Peer support will create the means for easier completion of this difficult transition process.

Keywords: adaptation to university life, first years, peer support, university student

Procedia PDF Downloads 176

17450 Competitors’ Influence Analysis of a Retailer by Using Customer Value and Huff’s Gravity Model

Authors: Yepeng Cheng, Yasuhiko Morimoto

Abstract:

Customer relationship analysis is vital for retail stores, especially for supermarkets. The point of sale (POS) systems make it possible to record the daily purchasing behaviors of customers as an identification point of sale (ID-POS) database, which can be used to analyze customer behaviors of a supermarket. The customer value is an indicator based on ID-POS database for detecting the customer loyalty of a store. In general, there are many supermarkets in a city, and other nearby competitor supermarkets significantly affect the customer value of customers of a supermarket. However, it is impossible to get detailed ID-POS databases of competitor supermarkets. This study firstly focused on the customer value and distance between a customer's home and supermarkets in a city, and then constructed the models based on logistic regression analysis to analyze correlations between distance and purchasing behaviors only from a POS database of a supermarket chain. During the modeling process, there are three primary problems existed, including the incomparable problem of customer values, the multicollinearity problem among customer value and distance data, and the number of valid partial regression coefficients. The improved customer value, Huff’s gravity model, and inverse attractiveness frequency are considered to solve these problems. This paper presents three types of models based on these three methods for loyal customer classification and competitors’ influence analysis. In numerical experiments, all types of models are useful for loyal customer classification. The type of model, including all three methods, is the most superior one for evaluating the influence of the other nearby supermarkets on customers' purchasing of a supermarket chain from the viewpoint of valid partial regression coefficients and accuracy.

Keywords: customer value, Huff's Gravity Model, POS, Retailer

Procedia PDF Downloads 92