Search results for: simple regression analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 30161

29981 The Potential Factors Relating to the Decision of Return Migration of Myanmar Migrant Workers: A Case Study in Prachuap Khiri Khan Province

Authors: Musthaya Patchanee

Abstract:

The aim of this research is to study the potential factors relating to the decision of return migration among Myanmar migrant workers in Prachuap Khiri Khan Province, based on a random sample of 400 people aged 15-59 who migrated from Myanmar. The information collected through interviews was analyzed using percentages, means, and stepwise multiple regression analysis. The results show that 33.25% of Myanmar migrant workers want to return to their home country within the next 1-5 years, 46.25% in 6-10 years, and the rest in more than 10 years. The scale of the decision of return migration has a positive relationship, statistically significant at the 0.05 level, with conformity with friends and relatives (r=0.886), relationship with family and community (r=0.782), possession of land in the hometown (r=0.756) and educational level (r=0.699). Property possession in Prachuap Khiri Khan is the only factor with a high negative relationship (r=-0.537). The stepwise multiple regression analysis shows that conformity with friends and relatives and educational level influence the decision of return migration of Myanmar migrant workers in Prachuap Khiri Khan Province; together they predict the decision at 86.60%, and the resulting multiple regression equation is Y = 6.744 + 1.198 conformity + 0.647 education.
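
As a minimal sketch, the regression equation reported in this abstract can be evaluated directly. The function name and the example scores are hypothetical; the abstract does not state the scales on which conformity and education were measured.

```python
def predict_return_decision(conformity: float, education: float) -> float:
    """Evaluate Y = 6.744 + 1.198*conformity + 0.647*education.

    The three coefficients are the ones quoted in the abstract.
    """
    return 6.744 + 1.198 * conformity + 0.647 * education


# Hypothetical input scores, chosen only to show the call.
example = predict_return_decision(conformity=3.0, education=2.0)
```

With both scores at zero the function returns the intercept, 6.744.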

Keywords: decision of return migration, factors of return migration, Myanmar migrant workers, Prachuap Khiri Khan Province

Procedia PDF Downloads 513
29980 The Molecular Characteristic of Heliotropium digynum in Saudi Arabia by Inter-Simple Sequence Repeat (ISSR) Analysis

Authors: Mona Alwhibi, Najat Bukhary

Abstract:

Heliotropium digynum is a member of the Boraginaceae family. The growth of the plant, as well as its size, length of inflorescence, and speed of development, depends on the amount of rain in its habitat. In this study, we examined the applicability of inter-simple sequence repeat (ISSR) polymorphism in Heliotropium digynum from different regions of Saudi Arabia. Fifteen primers were used in ISSR-PCR optimization trials; the five primers (UBC810, UBC811, UBC818, UBC834, and UBC849) that gave the best amplification results produced a total of 43 polymorphic bands. The number of polymorphic loci was 20 and the percentage of polymorphism was 90.47%. The similarity result indicates a high level of genetic diversity between populations, and a dendrogram was constructed by the UPGMA method.

Keywords: genetic differentiation, genetic diversity, Heliotropium digynum, ISSR

Procedia PDF Downloads 458
29979 Stability-Indicating High-Performance Thin-Layer Chromatography Method for Estimation of Naftopidil

Authors: P. S. Jain, K. D. Bobade, S. J. Surana

Abstract:

A simple, selective, precise and stability-indicating high-performance thin-layer chromatographic (HPTLC) method for the analysis of Naftopidil, both in bulk and in pharmaceutical formulation, has been developed and validated. The method employed HPTLC aluminium plates precoated with silica gel as the stationary phase. The solvent system consisted of hexane: ethyl acetate: glacial acetic acid (4:4:2 v/v). The system was found to give a compact spot for Naftopidil (Rf value of 0.43±0.02). Densitometric analysis of Naftopidil was carried out in the absorbance mode at 253 nm. The linear regression analysis data for the calibration plots showed a good linear relationship, with r2 = 0.999±0.0001 with respect to peak area in the concentration range 200-1200 ng per spot. The method was validated for precision, recovery and robustness. The limits of detection and quantification were 20.35 and 61.68 ng per spot, respectively. Naftopidil was subjected to acid and alkali hydrolysis, oxidation and thermal degradation. The drug degrades under acidic, basic, oxidative and thermal conditions, which indicates that it is susceptible to all four. The degraded product was well resolved from the pure drug, with a significantly different Rf value. Statistical analysis shows that the method is repeatable, selective and accurate for the estimation of the investigated drug. The proposed HPTLC method can be applied for identification and quantitative determination of Naftopidil in bulk drug and pharmaceutical formulation.
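
The abstract does not say how the limits of detection (LOD) and quantification (LOQ) were derived. One widely used convention, assumed here, takes LOD = 3.3σ/S and LOQ = 10σ/S, where σ is the standard deviation of the response and S is the slope of the calibration line. The sketch below uses hypothetical values for σ and S, and then checks that the ratio of the two reported limits is close to 10/3.3.

```python
def detection_limits(sigma: float, slope: float):
    """Return (LOD, LOQ) under the assumed 3.3*sigma/S and 10*sigma/S convention."""
    lod = 3.3 * sigma / slope
    loq = 10.0 * sigma / slope
    return lod, loq


# Hypothetical sigma and slope, only to show the call.
lod, loq = detection_limits(sigma=2.0, slope=0.5)

# Ratio of the limits reported in the abstract (61.68 and 20.35 ng per spot).
reported_ratio = 61.68 / 20.35
```

Under this convention LOQ/LOD is always 10/3.3, so the reported values are at least consistent with it, although that does not confirm the authors used it.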

Keywords: naftopidil, HPTLC, validation, stability, degradation

Procedia PDF Downloads 378
29978 Solving Dimensionality Problem and Finding Statistical Constructs on Latent Regression Models: A Novel Methodology with Real Data Application

Authors: Sergio Paez Moncaleano, Alvaro Mauricio Montenegro

Abstract:

This paper presents a novel statistical methodology for measuring and finding constructs in Latent Regression Analysis. This approach uses the qualities of Factor Analysis in binary data with interpretations from Item Response Theory (IRT). In addition, based on the fundamentals of submodel theory and drawing together many ideas from IRT, we propose an algorithm that not only solves the dimensionality problem (currently an open discussion) but also opens a new research field that promises fairer and more realistic qualifications for examiners and a revolution in IRT and educational research. Finally, the methodology is applied to a real data set, with impressive results for coherence, speed and precision. Acknowledgments: This research was financed by Colciencias through the project 'Multidimensional Item Response Theory Models for Practical Application in Large Test Designed to Measure Multiple Constructs', and both authors belong to the SICS Research Group at Universidad Nacional de Colombia.

Keywords: item response theory, dimensionality, submodel theory, factorial analysis

Procedia PDF Downloads 344
29977 Analysis of Effect of Microfinance on the Profit Level of Small and Medium Scale Enterprises in Lagos State, Nigeria

Authors: Saheed Olakunle Sanusi, Israel Ajibade Adedeji

Abstract:

The study analysed the effect of microfinance on the profit level of small and medium scale enterprises in Lagos. The data for the study were obtained by simple random sampling, and a total of one hundred and fifty (150) small and medium scale enterprises (SMEs) were sampled. Seventy-five (75) are microfinance users and seventy-five (75) are non-users. Data were analysed using descriptive statistics, a logit model, a t-test and ordinary least squares (OLS) regression. The mean profit of the enterprises using microfinance is ₦16.8m, while that of the non-users is ₦5.9m. The mean profit of microfinance users is statistically different from that of the non-users. The result of the logit model specified for the determinants of access to microfinance showed that three of the specified variables (educational status of the enterprise head, credit utilisation and volume of business investment) are significant at P < 0.01. Enterprises with many years of experience, highly educated enterprise heads and a high volume of business investment have more potential access to microfinance. The OLS regression model indicated that three parameters, namely number of school years, volume of business investment and (dummy) participation in microfinance, were significant at P < 0.05. These variables are therefore significant determinants of the impact of microfinance on profit level in the study area. The study therefore concludes and recommends that, to improve the status of small and medium scale enterprises and increase their profit, the full benefit of access to microfinance can be enhanced through investment in social infrastructure and human capital development. Also, concerted efforts should be made to encourage non-users of microfinance among SMEs to use it in order to boost their profit.

Keywords: credit utilisation, logit model, microfinance, small and medium enterprises

Procedia PDF Downloads 177
29976 A Kolmogorov-Smirnov Type Goodness-Of-Fit Test of Multinomial Logistic Regression Model in Case-Control Studies

Authors: Chen Li-Ching

Abstract:

The multinomial logistic regression model is widely used for inferring the relationship between risk factors and a disease with multiple categories. This study proposes a Kolmogorov-Smirnov type test statistic, based on the discrepancy between the nonparametric maximum likelihood estimator and the semiparametric maximum likelihood estimator of the cumulative distribution function, to assess the adequacy of the multinomial logistic regression model for case-control data. A bootstrap procedure is presented to calculate the critical value of the proposed test statistic. Empirical type I error rates and powers of the test are evaluated by simulation studies. Some examples illustrate the implementation of the test.

Keywords: case-control studies, goodness-of-fit test, Kolmogorov-Smirnov test, multinomial logistic regression

Procedia PDF Downloads 428
29975 Indian Premier League (IPL) Score Prediction: Comparative Analysis of Machine Learning Models

Authors: Rohini Hariharan, Yazhini R, Bhamidipati Naga Shrikarti

Abstract:

In the realm of cricket, particularly within the context of the Indian Premier League (IPL), the ability to predict team scores accurately holds significant importance for both cricket enthusiasts and stakeholders alike. This paper presents a comprehensive study on IPL score prediction utilizing various machine learning algorithms, including Support Vector Machines (SVM), XGBoost, Multiple Regression, Linear Regression, K-nearest neighbors (KNN), and Random Forest. Through meticulous data preprocessing, feature engineering, and model selection, we aimed to develop a robust predictive framework capable of forecasting team scores with high precision. Our experimentation involved the analysis of historical IPL match data encompassing diverse match and player statistics. Leveraging this data, we employed state-of-the-art machine learning techniques to train and evaluate the performance of each model. Notably, Multiple Regression emerged as the top-performing algorithm, achieving an impressive accuracy of 77.19% and a precision of 54.05% (within a threshold of +/- 10 runs). This research contributes to the advancement of sports analytics by demonstrating the efficacy of machine learning in predicting IPL team scores. The findings underscore the potential of advanced predictive modeling techniques to provide valuable insights for cricket enthusiasts, team management, and betting agencies. Additionally, this study serves as a benchmark for future research endeavors aimed at enhancing the accuracy and interpretability of IPL score prediction models.
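
The abstract does not define how its accuracy and precision figures were computed. One plausible reading of "within a threshold of +/- 10 runs", assumed here, is the share of predictions that fall within 10 runs of the actual score. The function name and the scores below are hypothetical.

```python
def within_threshold_rate(actual, predicted, threshold=10):
    """Fraction of predictions whose absolute error is at most `threshold` runs."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= threshold)
    return hits / len(actual)


# Hypothetical innings totals and model predictions.
actual_scores = [165, 180, 142, 201]
predicted_scores = [170, 168, 150, 199]
rate = within_threshold_rate(actual_scores, predicted_scores)
```

Here three of the four predictions are within 10 runs, so the rate is 0.75.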

Keywords: indian premier league (IPL), cricket, score prediction, machine learning, support vector machines (SVM), xgboost, multiple regression, linear regression, k-nearest neighbors (KNN), random forest, sports analytics

Procedia PDF Downloads 24
29974 An Infinite Mixture Model for Modelling Stutter Ratio in Forensic Data Analysis

Authors: M. A. C. S. Sampath Fernando, James M. Curran, Renate Meyer

Abstract:

Forensic DNA analysis has received much attention over the last three decades due to its incredible usefulness in human identification. The statistical interpretation of DNA evidence is recognised as one of the most mature fields in forensic science. Peak heights in an electropherogram (EPG) are approximately proportional to the amount of template DNA in the original sample being tested. A stutter is a minor peak in an EPG that is not masking as an allele of a potential contributor and is considered an artefact presumed to have arisen from miscopying or slippage during the PCR. Stutter peaks are mostly analysed in terms of the stutter ratio, which is calculated relative to the corresponding parent allele height. Analysis of mixture profiles has always been problematic in evidence interpretation, especially in the presence of PCR artefacts such as stutters. Unlike binary and semi-continuous models, continuous models assign a probability (as a continuous weight) to each possible genotype combination and significantly enhance the use of continuous peak height information, resulting in more efficient and reliable interpretations. Therefore, a sound methodology to distinguish between stutters and real alleles is essential for the accuracy of the interpretation. Sensibly, any such method has to be able to focus on modelling stutter peaks. Bayesian nonparametric methods provide increased flexibility in applied statistical modelling. Mixture models are frequently employed as fundamental data analysis tools in clustering and classification of data and assume unidentified heterogeneous sources for the data. In model-based clustering, each unknown source is reflected by a cluster, and the clusters are modelled using parametric models. Specifying the number of components in finite mixture models, however, is practically difficult even though the calculations are relatively simple. Infinite mixture models, in contrast, do not require the user to specify the number of components. Instead, a Dirichlet process, which is an infinite-dimensional generalization of the Dirichlet distribution, is used to deal with the problem of the number of components. The Chinese restaurant process (CRP), the stick-breaking process and the Pólya urn scheme are frequently used as Dirichlet priors in Bayesian mixture models. In this study, we illustrate an infinite mixture of simple linear regression models for modelling stutter ratio and introduce some modifications to overcome weaknesses associated with the CRP.
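
To make the Chinese restaurant process concrete, here is a minimal sketch of the standard seating rule: with n customers already seated, the next customer joins an existing table with probability proportional to its size and opens a new table with probability proportional to a concentration parameter alpha. This is the textbook CRP, not the modified version the authors propose; the function name and parameter values are assumptions.

```python
import random


def chinese_restaurant_process(n_customers, alpha, seed=0):
    """Simulate table sizes under the standard CRP with concentration alpha."""
    rng = random.Random(seed)
    table_sizes = []
    for n in range(n_customers):  # n customers are already seated
        # Draw a point on [0, n + alpha): the first n units belong to
        # existing tables (by size), the last alpha units to a new table.
        u = rng.random() * (n + alpha)
        cumulative = 0.0
        for k, size in enumerate(table_sizes):
            cumulative += size
            if u < cumulative:
                table_sizes[k] += 1
                break
        else:
            table_sizes.append(1)  # open a new table (cluster)
    return table_sizes


tables = chinese_restaurant_process(n_customers=100, alpha=1.0)
```

The number of occupied tables plays the role of the number of mixture components, which is why it does not have to be fixed in advance.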

Keywords: Chinese restaurant process, Dirichlet prior, infinite mixture model, PCR stutter

Procedia PDF Downloads 306
29973 Model of Optimal Centroids Approach for Multivariate Data Classification

Authors: Pham Van Nha, Le Cam Binh

Abstract:

Particle swarm optimization (PSO) is a population-based stochastic optimization algorithm. PSO was inspired by the natural behavior of birds and fish in migration and foraging for food. PSO is considered a multidisciplinary optimization model that can be applied to various optimization problems. PSO's ideas are simple and easy to understand, but PSO is only applied to simple model problems. We think that, in order to expand the applicability of PSO to complex problems, PSO should be described more explicitly in the form of a mathematical model. In this paper, we represent PSO as a mathematical model and apply it to multivariate data classification. First, PSO's general mathematical model (MPSO) is analyzed as a universal optimization model. Then, the Model of Optimal Centroids (MOC) is proposed for multivariate data classification. Experiments were conducted on some benchmark data sets to demonstrate the effectiveness of MOC compared with several previously proposed schemes.
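
For readers unfamiliar with PSO, the sketch below shows the usual velocity and position update on a simple test function. It is a generic global-best PSO, not the MPSO or MOC models of the paper; the objective, parameter values and names are assumptions chosen for illustration.

```python
import random


def sphere(x):
    """Simple test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


def pso_minimize(objective, dim=2, n_particles=20, n_iters=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Generic global-best PSO; returns (best position, best value, history)."""
    rng = random.Random(seed)
    positions = [[rng.uniform(-5, 5) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_val = [objective(p) for p in positions]
    g = min(range(n_particles), key=lambda i: personal_val[i])
    global_best, global_val = personal_best[g][:], personal_val[g]
    history = [global_val]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the particle's own best and the swarm's best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_personal * r1 * (personal_best[i][d] - positions[i][d])
                    + c_global * r2 * (global_best[d] - positions[i][d])
                )
                positions[i][d] += velocities[i][d]
            value = objective(positions[i])
            if value < personal_val[i]:
                personal_best[i], personal_val[i] = positions[i][:], value
                if value < global_val:
                    global_best, global_val = positions[i][:], value
        history.append(global_val)
    return global_best, global_val, history


best_x, best_val, history = pso_minimize(sphere)
```

Because the global best is only replaced by a strictly better value, `history` never increases from one iteration to the next.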

Keywords: analysis of optimization, artificial intelligence based optimization, optimization for learning and data analysis, global optimization

Procedia PDF Downloads 182
29972 Forecasting of Grape Juice Flavor by Using Support Vector Regression

Authors: Ren-Jieh Kuo, Chun-Shou Huang

Abstract:

Research on juice flavor forecasting has become more important in China. Due to the fast economic growth in China, many different kinds of juices have been introduced to the market. If a beverage company understands its customers' preferences well, its juice can be made more attractive to them. Thus, this study introduces the basic theory and computing process of grape juice flavor forecasting based on support vector regression (SVR). Applying SVR, BPN and LR to forecast the flavor of grape juice on real data, the results show that SVR is more suitable and effective in terms of predictive performance.

Keywords: flavor forecasting, artificial neural networks, Support Vector Regression, China

Procedia PDF Downloads 456
29971 The Impact of Governance on Happiness: Evidence from Quantile Regressions

Authors: Chiung-Ju Huang

Abstract:

This study uses quantile regression analysis to examine the impact of governance (including democratic quality and technical quality) on happiness in 101 countries worldwide, classified as “developed countries” and “developing countries”. The empirical results show that the impact of democratic quality and technical quality on happiness is significantly positive for “developed countries”, while it is insignificant for “developing countries”. The results suggest that the authorities in developed countries can enhance the level of individual happiness by improving democratic quality and technical quality. However, for developing countries, promoting the quality of governance in order to enhance the level of happiness may not be effective. Policy makers in developed countries may pay more attention to increasing real GDP per capita instead of promoting the quality of governance to enhance individual happiness.

Keywords: governance, happiness, multiple regression, quantile regression

Procedia PDF Downloads 253
29970 Relationship between Gender and Performance with Respect to a Basic Math Skills Quiz in Statistics Courses in Lebanon

Authors: Hiba Naccache

Abstract:

The present research investigated whether gender differences affect performance on a simple math quiz in a statistics course. Participants in this study comprised a sample of 567 statistics students in two different universities in Lebanon. Data were collected through a simple math quiz. Analysis of the quantitative data indicated that there was no significant difference in math performance between males and females. The results suggest that improvements in student performance may depend on improved mastery of basic algebra, especially for females. The implications of these findings and further recommendations are discussed.

Keywords: gender, education, math, statistics

Procedia PDF Downloads 350
29969 An Automated Stock Investment System Using Machine Learning Techniques: An Application in Australia

Authors: Carol Anne Hargreaves

Abstract:

A key issue in stock investment is how to select representative features for stock selection. The first objective of this paper is to determine whether an automated stock investment system, using machine learning techniques, may be used to identify a portfolio of growth stocks that are highly likely to provide returns better than the stock market index. The second objective is to identify the technical features that best characterize whether a stock’s price is likely to go up and to identify the most important factors and their contribution to predicting the likelihood of the stock price going up. First, unsupervised machine learning techniques, such as cluster analysis, were applied to the stock data to identify a cluster of stocks that was likely to go up in price (portfolio 1). Next, the principal component analysis technique was used to select stocks that were rated high on component one and component two (portfolio 2). Thirdly, a supervised machine learning technique, the logistic regression method, was used to select stocks with a high probability of their price going up (portfolio 3). The predictive models were validated with metrics such as sensitivity (recall), specificity and overall accuracy. All accuracy measures were above 70%. All portfolios outperformed the market by more than eight times. The top three stocks were selected for each of the three stock portfolios and traded in the market for one month. After one month, the return for each stock portfolio was computed and compared with the stock market index return. The returns were 23.87% for the principal component analysis portfolio, 11.65% for the logistic regression portfolio and 8.88% for the K-means cluster portfolio, while the stock market return was 0.38%. This study confirms that an automated stock investment system using machine learning techniques can identify top performing stock portfolios that outperform the stock market.

Keywords: machine learning, stock market trading, logistic regression, cluster analysis, factor analysis, decision trees, neural networks, automated stock investment system

Procedia PDF Downloads 131
29968 Estimation of Coefficients of Ridge and Principal Components Regressions with Multicollinear Data

Authors: Rajeshwar Singh

Abstract:

Multicollinearity is common when handling several explanatory variables simultaneously, because the variables exhibit a linear relationship among themselves. This creates a serious problem in understanding the impact of the explanatory variables on the dependent variable, and the method of least squares then gives inexact estimates. In this case, it is advisable to detect the presence of multicollinearity before proceeding further. Ridge regression reduces the degree of its occurrence, but principal components regression gives good estimates in this situation. This paper discusses the well-known techniques of ridge and principal components regression and applies both to obtain estimates of the coefficients. In addition, the paper discusses the conflicting claims on the discovery of the method of ridge regression, based on available documents.
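
As a small illustration of why ridge regression helps with collinear predictors, the sketch below solves the ridge normal equations (X'X + λI)b = X'y for two centered predictors and no intercept. The data are hypothetical and nearly collinear; setting λ = 0 gives ordinary least squares. This is a generic textbook formulation, not the procedure of the paper.

```python
import math


def ridge_two_predictors(x1, x2, y, lam):
    """Solve (X'X + lam*I) b = X'y for two centered predictors, no intercept."""
    s11 = sum(a * a for a in x1) + lam
    s22 = sum(b * b for b in x2) + lam
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("singular system; use a larger lam")
    # Closed-form inverse of the 2x2 matrix applied to (t1, t2).
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)


# Hypothetical, nearly collinear centered predictors with y = x1 + x2.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.0, 1.1, 1.9]
y = [a + b for a, b in zip(x1, x2)]

b_ols = ridge_two_predictors(x1, x2, y, lam=0.0)
b_ridge = ridge_two_predictors(x1, x2, y, lam=1.0)
norm_ols = math.hypot(*b_ols)
norm_ridge = math.hypot(*b_ridge)
```

With λ > 0 the determinant moves away from zero and the coefficient vector shrinks, which is the stabilising effect that ridge regression trades against some bias.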

Keywords: conflicting claim on credit of discovery of ridge regression, multicollinearity, principal components and ridge regressions, variance inflation factor

Procedia PDF Downloads 381
29967 A Multinomial Logistic Regression Analysis of Factors Influencing Couples' Fertility Preferences in Kenya

Authors: Naomi W. Maina

Abstract:

Fertility preference is a subject of great significance in developing countries. Studies reveal that fertility preferences are significant in determining a society’s fertility levels, because future fertility behavior is highly likely to be influenced by currently observed fertility inclinations. The objective of this study was to establish the factors associated with fertility preference amongst couples in Kenya by fitting a multinomial logistic regression model to data on 5,265 couples obtained from the Kenya Demographic and Health Survey 2014. Results revealed that the type of place of residence, the region of residence, age and spousal age gap significantly influence the desire for additional children among couples in Kenya. Couples living in rural settlements were notably more likely to have similar fertility preferences than those living in urban settlements. Moreover, geographical disparities, such as in northern Kenya, revealed significant differences in a couple’s desire to have additional children compared to Nairobi. The odds of a couple’s desire for additional children were further observed to vary depending on either the wife’s or the husband’s age and, to a large extent, on the spousal age gap. The study also showed that as the spousal age gap increases, the desire for more children amongst couples decreases. Insights derived from this study would be of interest to demographers, health practitioners, policymakers, and non-governmental organizations implementing fertility-related interventions in Kenya, among other stakeholders. Moreover, with the adoption of devolution, there is a clear need for population policies that are county specific, as opposed to a national population policy as is the current practice in Kenya. Additionally, researchers or students who have little understanding of the application of multinomial logistic regression, in terms of theory, practical analysis in SPSS and application to real datasets, will find this article useful.

Keywords: couples' desire, fertility, fertility preference, multinomial regression analysis

Procedia PDF Downloads 152
29966 Comparison of Multivariate Adaptive Regression Splines and Random Forest Regression in Predicting Forced Expiratory Volume in One Second

Authors: P. V. Pramila, V. Mahesh

Abstract:

Pulmonary function tests are important non-invasive diagnostic tests that assess respiratory impairments and provide quantifiable measures of lung function. Spirometry is the most frequently used measure of lung function and plays an essential role in the diagnosis and management of pulmonary diseases. However, the test requires considerable patient effort and cooperation, markedly related to the age of patients, resulting in incomplete data sets. This paper presents a nonlinear model built using multivariate adaptive regression splines (MARS) and a random forest (RF) regression model to predict the missing spirometric features. Random forest based feature selection is used to enhance both the generalization capability and the model interpretability. In the present study, flow-volume data are recorded for N = 198 subjects. The ranked order of the feature importance index calculated by the random forest model shows that the spirometric features FVC, FEF 25, PEF, FEF 25-75, FEF 50, and the demographic parameter height are the important descriptors. A comparison of the performance of both models shows that the prediction ability of MARS with the top two ranked features, namely FVC and FEF 25, is higher, yielding a model fit of R2 = 0.96 and R2 = 0.99 for normal and abnormal subjects, respectively. The root mean square error analysis of the RF model and the MARS model also shows that the latter is capable of predicting the missing values of FEV1 with a notably lower error value of 0.0191 (normal subjects) and 0.0106 (abnormal subjects). It is concluded that combining feature selection with a prediction model provides a minimum subset of predominant features to train the model, yielding better prediction performance. This analysis can assist clinicians with an intelligence support system in medical diagnosis and the improvement of clinical care.

Keywords: FEV, multivariate adaptive regression splines, pulmonary function test, random forest

Procedia PDF Downloads 281
29965 Relationship Between Family Factors and Tendency to Addiction

Authors: Farzaneh Golshekoh

Abstract:

The aim of this study was to examine the relationship of religious beliefs, family responsibility and emotional atmosphere with the tendency to addiction in female high school students in Ahvaz. The sample consisted of 250 students selected by cluster random sampling from among all female high school students in Ahvaz. The measuring tools covered Iranian tendency towards addiction (IAPS), responsibility from the California Psychological Inventory (CPI), emotional family atmosphere (AFC) and religious beliefs. The simple correlation coefficient at α=0.05 showed a significant negative relationship between religious beliefs, family responsibility and emotional atmosphere on the one hand and the tendency to addiction in female students on the other. The regression analysis showed that the emotional atmosphere of the family and religious beliefs are predictors of female students' tendency to addiction.

Keywords: emotional atmosphere, family responsibility, religious beliefs, tendency to addiction

Procedia PDF Downloads 412
29964 Psychosocial Factors in Relation to Musculoskeletal Disorders among Nursing Professionals in Kurdistan Region, Iraq

Authors: Karwan Khudhir

Abstract:

A cross-sectional study was carried out to determine the prevalence of musculoskeletal disorders (MSDs), and the psychosocial factors associated with them, among Kurdistan nursing professionals. Simple random sampling was used to select 220 nurses, and data were collected by self-administered questionnaire. The results showed that the overall prevalence of MSDs among Kurdistan nurses was 74% across different body regions; by body region, neck pain was the most frequently reported twelve-month MSD complaint (48.4%). Logistic regression analysis indicated 6 variables that are significantly associated with musculoskeletal disorders: smoking (OR=19.472, 95% CI: 5.396, 70.273), BMI (OR=5.106, 95% CI: 1.735, 15.025), physical activity (OR=8.639, 95% CI: 3.075, 24.271), psychological demand (OR=6.685, 95% CI: 3.318, 13.468), social support (OR=3.143, 95% CI: 1.202, 4.814) and job satisfaction (OR=2.44, 95% CI: 1.04, 5.63). Prevention strategies and health education that emphasize psychosocial risk factors and how to improve working conditions should be introduced.
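
The odds ratios above come from a logistic regression. As a hedged sketch, an odds ratio and a Wald-type 95% confidence interval are commonly obtained from a coefficient β and its standard error as exp(β) and exp(β ± 1.96·SE). The abstract does not state how its intervals were computed, so the Wald form is an assumption, and the function names are hypothetical. Under that assumption the geometric mean of the interval bounds equals the odds ratio, which the last lines check for the psychological demand figures.

```python
import math


def odds_ratio_with_ci(beta, std_error, z=1.96):
    """Return (OR, lower, upper) for a logistic coefficient, Wald interval assumed."""
    return (math.exp(beta),
            math.exp(beta - z * std_error),
            math.exp(beta + z * std_error))


def log_scale_midpoint(lower, upper):
    """Geometric mean of the bounds; equals the OR for a Wald interval."""
    return math.sqrt(lower * upper)


# Hypothetical coefficient and standard error, only to show the call.
or_value, ci_low, ci_high = odds_ratio_with_ci(beta=0.0, std_error=1.0)

# Psychological demand as reported: OR=6.685, 95% CI 3.318 to 13.468.
midpoint = log_scale_midpoint(3.318, 13.468)
```

The computed midpoint is within rounding of the reported 6.685, which is consistent with a Wald-type interval for that variable.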

Keywords: Kurdistan Region, Iraq, musculoskeletal disorders, nurses, psycho-social factors

Procedia PDF Downloads 200
29963 Application of Multilinear Regression Analysis for Prediction of Synthetic Shear Wave Velocity Logs in Upper Assam Basin

Authors: Triveni Gogoi, Rima Chatterjee

Abstract:

Shear wave velocity (Vs) estimation is an important approach in the seismic exploration and characterization of a hydrocarbon reservoir. There are various methods for predicting S-wave velocity when a recorded S-wave log is not available, but all of the available methods for Vs prediction are empirical mathematical models. Shear wave velocity can be estimated from P-wave velocity by applying Castagna’s equation, which is the most common approach. The constants used in Castagna’s equation vary for different lithologies and geological set-ups. In this study, multiple regression analysis has been used for estimation of S-wave velocity. The EMERGE module of the Hampson-Russell software has been used to generate the S-wave log. Both single-attribute and multi-attribute analyses have been carried out to generate synthetic S-wave logs in the Upper Assam basin. The Upper Assam basin, situated in North Eastern India, is one of the most important petroleum provinces of India. The present study was carried out using four wells of the study area. S-wave velocity was available for three of these wells. The main objective of the present study is the prediction of shear wave velocities for wells where S-wave velocity information is not available. The three wells with S-wave velocity were first used to test the reliability of the method, and the generated S-wave log was compared with the actual S-wave log. Single-attribute analysis has been carried out for these three wells within the depth range 1700-2100 m, which corresponds to the Barail Group of Oligocene age. The Barail Group, the primary producing reservoir of the basin, is the main target zone in this study. A system-generated list of attributes with varying degrees of correlation was produced, and the attribute with the highest correlation was considered for the single-attribute analysis. A crossplot between the attributes shows the scatter of points about the line of best fit. The final result of the analysis was compared with the available S-wave log and shows a good visual fit, with a correlation of 72%. Next, multi-attribute analysis was carried out for the same data using all the wells within the same analysis window. A high correlation of 85% was observed between the output log from the analysis and the recorded S-wave log. The almost perfect fit between the synthetic S-wave log and the recorded S-wave log validates the reliability of the method. For further authentication, the generated S-wave data from the wells have been tied to the seismic data and correlated with it. A synthetic shear wave log has been generated for well M2, where S-wave data are not available, and it shows a good correlation with the seismic data. Neutron porosity, density, AI and P-wave velocity proved to be the most significant variables in this statistical method for S-wave generation. The multilinear regression method can thus be considered a reliable technique for generating shear wave velocity logs in this study.
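
For context on the Castagna approach mentioned in this abstract, the sketch below inverts the commonly quoted mudrock-line form Vp = 1.16·Vs + 1.36 (velocities in km/s) to estimate Vs from Vp. These default constants are the widely cited values for water-saturated clastic rocks, not the ones used in the study, which the abstract notes vary with lithology and geological set-up. The function name and the example velocity are hypothetical.

```python
def castagna_vs(vp_km_s, slope=1.16, intercept=1.36):
    """Estimate Vs (km/s) by inverting Vp = slope * Vs + intercept.

    Default constants are the commonly quoted mudrock-line values; they are
    assumptions here and should be recalibrated for a given basin.
    """
    return (vp_km_s - intercept) / slope


# Hypothetical P-wave velocity of 3.68 km/s.
vs_example = castagna_vs(3.68)
```

Passing locally calibrated `slope` and `intercept` values is how the lithology dependence would enter such a function.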

Keywords: Castagna's equation, multi linear regression, multi attribute analysis, shear wave logs

Procedia PDF Downloads 201
29962 Valuation of Caps and Floors in a LIBOR Market Model with Markov Jump Risks

Authors: Shih-Kuei Lin

Abstract:

The characterization of the arbitrage-free dynamics of interest rates is developed in this study under the presence of Markov jump risks, when the term structure of the interest rates is modeled through simple forward rates. We consider Markov jump risks by allowing randomness in jump sizes and independence between jump sizes and jump times. The Markov jump diffusion model is used to capture empirical phenomena and to accurately describe interest jump risks in a financial market. We derive the arbitrage-free model of simple forward rates under the spot measure. Moreover, the analytical pricing formulas for a cap and a floor are derived under the forward measure when the jump size follows a lognormal distribution. In our empirical analysis, we find that the LIBOR market model with Markov jump risk better accounts for changes from/to different states and different rates.

Keywords: arbitrage-free, cap and floor, Markov jump diffusion model, simple forward rate model, volatility smile, EM algorithm

Procedia PDF Downloads 397
29961 Partial Least Square Regression for High-Dimensional and High-Correlated Data

Authors: Mohammed Abdullah Alshahrani

Abstract:

The research focuses on investigating the use of partial least squares (PLS) methodology for addressing challenges associated with high-dimensional correlated data. Recent technological advancements have led to experiments producing data characterized by a large number of variables compared to observations, with substantial inter-variable correlations. Such data patterns are common in chemometrics, where near-infrared (NIR) spectrometer calibrations record chemical absorbance levels across hundreds of wavelengths, and in genomics, where thousands of genomic regions' copy number alterations (CNA) are recorded from cancer patients. PLS serves as a widely used method for analyzing high-dimensional data, functioning as a regression tool in chemometrics and a classification method in genomics. It handles data complexity by creating latent variables (components) from original variables. However, applying PLS can present challenges. The study investigates key areas to address these challenges, including unifying interpretations across three main PLS algorithms and exploring unusual negative shrinkage factors encountered during model fitting. The research presents an alternative approach to addressing the interpretation challenge of predictor weights associated with PLS. Sparse estimation of predictor weights is employed using a penalty function combining a lasso penalty for sparsity and a Cauchy distribution-based penalty to account for variable dependencies. The results demonstrate sparse and grouped weight estimates, aiding interpretation and prediction tasks in genomic data analysis. High-dimensional data scenarios, where predictors outnumber observations, are common in regression analysis applications. Ordinary least squares regression (OLS), the standard method, performs inadequately with high-dimensional and highly correlated data. 
Copy number alterations (CNA) in key genes have been linked to disease phenotypes, highlighting the importance of accurate classification of gene expression data in bioinformatics and biology using regularized methods like PLS for regression and classification.
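
To make the latent-variable idea in this abstract concrete, the following is a minimal sketch of single-response PLS regression (the NIPALS-style PLS1 recursion) in NumPy, applied to synthetic data with more predictors than observations. The function names and data are hypothetical, and the sketch covers only plain PLS; it does not implement the sparse lasso plus Cauchy-based penalty that the abstract proposes.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit PLS1 and return (beta, x_mean, y_mean) for y ~ y_mean + (X - x_mean) @ beta."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                 # weight: direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                    # latent component (scores)
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        c = yk @ t / tt               # regression of y on the component
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - c * t               # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)
    return beta, x_mean, y_mean

def pls1_predict(X, beta, x_mean, y_mean):
    """Predict the response for new rows of X."""
    return y_mean + (np.asarray(X, dtype=float) - x_mean) @ beta

# Synthetic high-dimensional, highly correlated data: 40 observations, 200 predictors
# driven by 3 latent factors (assumed setup, for illustration only).
rng = np.random.default_rng(1)
n, p = 40, 200
latent = rng.normal(size=(n, 3))
X = latent @ rng.normal(size=(3, p)) + 0.1 * rng.normal(size=(n, p))
y = latent @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)

beta, x_mean, y_mean = pls1_fit(X, y, n_components=3)
train_corr = np.corrcoef(y, pls1_predict(X, beta, x_mean, y_mean))[0, 1]
print(f"training correlation with 3 components: {train_corr:.3f}")
```

Ordinary least squares has no unique solution here because the predictors outnumber the observations, whereas the three components give a well-defined fit. The number of components would normally be chosen by cross-validation.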

Keywords: partial least square regression, genetics data, negative filter factors, high dimensional data, high correlated data

Procedia PDF Downloads 16
29960 Estimate of Maximum Expected Intensity of One-Half-Wave Lines Dancing

Authors: A. Bekbaev, M. Dzhamanbaev, R. Abitaeva, A. Karbozova, G. Nabyeva

Abstract:

In this paper, the regression dependence of dancing intensity on wind speed and span length was established from statistical data obtained from multi-year observations of line wire dancing accumulated by the power systems of Kazakhstan and the Russian Federation. The lower and upper limits of the equation parameters were estimated, as well as the adequacy of the regression model. The constructed model will be used in research on dancing phenomena, in the development of methods and means of protection against dancing, and in zoning territories by line wire dancing.

Keywords: power lines, line wire dancing, dancing intensity, regression equation, dancing area intensity

Procedia PDF Downloads 290
29959 Analytical Modelling of Surface Roughness during Compacted Graphite Iron Milling Using Ceramic Inserts

Authors: Ş. Karabulut, A. Güllü, A. Güldaş, R. Gürbüz

Abstract:

This study investigates the effects of the lead angle and chip thickness variation on surface roughness during the machining of compacted graphite iron using ceramic cutting tools under dry cutting conditions. Analytical models were developed for predicting the surface roughness values of the specimens after the face milling process. Experimental data was collected and imported to the artificial neural network model. A multilayer perceptron model was used with the back propagation algorithm employing the input parameters of lead angle, cutting speed and feed rate in connection with chip thickness. Furthermore, analysis of variance was employed to determine the effects of the cutting parameters on surface roughness. Artificial neural network and regression analysis were used to predict surface roughness. The values thus predicted were compared with the collected experimental data, and the corresponding percentage error was computed. Analysis results revealed that the lead angle is the dominant factor affecting surface roughness. Experimental results indicated an improvement in the surface roughness value with decreasing lead angle value from 88° to 45°.

Keywords: CGI, milling, surface roughness, ANN, regression, modeling, analysis

Procedia PDF Downloads 426
29958 Supply Chain Risk Management (SCRM): A Simplified Alternative for Implementing SCRM for Small and Medium Enterprises

Authors: Paul W. Murray, Marco Barajas

Abstract:

Recent changes in supply chains, especially globalization and collaboration, have created new risks for enterprises of all sizes. A variety of complex frameworks, often based on enterprise risk management strategies, have been presented under the heading of Supply Chain Risk Management (SCRM). The literature promotes the benefits of a robust SCRM strategy; however, implementing SCRM is difficult and resource demanding for Large Enterprises (LEs), and essentially out of reach for Small and Medium Enterprises (SMEs). This research debunks the idea that SCRM is necessary for all enterprises and instead proposes a simple and effective Vendor Selection Template (VST). Empirical testing and a survey of supply chain practitioners provide a measure of validation to the VST. The resulting VST is a valuable contribution because it is easy to use, provides practical results, and is sufficiently flexible to be universally applied to SMEs.

Keywords: multiple regression analysis, supply chain management, risk assessment, vendor selection

Procedia PDF Downloads 432
29957 Drivers of Liking: Probiotic Petit Suisse Cheese

Authors: Helena Bolini, Erick Esmerino, Adriano Cruz, Juliana Paixao

Abstract:

The current concern for health has increased demand for low-calorie ingredients and functional foods such as probiotics. Understanding the reasons that influence food choice, besides being a challenging task, is an important step in the development and/or reformulation of food products. The use of appropriate multivariate statistical techniques, such as External Preference Map (PrefMap), associated with Partial Least Squares (PLS) regression, can help in determining those factors. Thus, this study aimed to determine, through PLS regression analysis, the sensory attributes considered drivers of liking in strawberry-flavored probiotic petit suisse cheeses sweetened with different sweeteners. Five samples at the same equivalent sweetness, PROB1 (Sucralose 0.0243%), PROB2 (Stevia 0.1520%), PROB3 (Aspartame 0.0877%), PROB4 (Neotame 0.0025%) and PROB5 (Sucrose 15.2%), determined by just-about-right and magnitude estimation methods, and three commercial samples, COM1, COM2 and COM3, were studied. The analysis used data from QDA, performed by 12 experts (highly trained assessors) on 20 descriptor terms, correlated with overall liking data from an acceptance test carried out by 125 consumers on all samples. Subsequently, the results were submitted to PLS regression using XLSTAT software from Byossistemes. The results showed that three sensory descriptor terms might be considered drivers of liking of the probiotic petit suisse cheese samples with added sweeteners (p<0.05). Milk flavor was a sensory characteristic with a positive impact on acceptance, while bitter taste and sweet aftertaste were descriptor terms with a negative impact on acceptance of the probiotic petit suisse cheeses. 
It was possible to conclude that PLS regression analysis is a practical and useful tool for determining drivers of liking of probiotic petit suisse cheeses sweetened with artificial and natural sweeteners, allowing the food industry to understand and improve its formulations and maximize the acceptability of its products.

Keywords: acceptance, consumer, quantitative descriptive analysis, sweetener

Procedia PDF Downloads 420
29956 Incorporating Anomaly Detection in a Digital Twin Scenario Using Symbolic Regression

Authors: Manuel Alves, Angelica Reis, Armindo Lobo, Valdemar Leiras

Abstract:

In Industry 4.0, it is common to have a lot of sensor data. In this deluge of data, hints of possible problems are difficult to spot. The digital twin concept aims to help answer this problem, but it is mainly used as a monitoring tool to handle the visualisation of data. Failure detection is of paramount importance in any industry, and it consumes a lot of resources. Any improvement in this regard is of tangible value to the organisation. The aim of this paper is to add the ability to forecast test failures, curtailing detection times. To achieve this, several anomaly detection algorithms were compared with a symbolic regression approach. To this end, Isolation Forest, One-Class SVM and an auto-encoder have been explored. For the symbolic regression, the PySR library was used. The first results show that this approach is valid and can be added to the tools available in this context as a low-resource anomaly detection method since, after training, the only requirement is the calculation of a polynomial, a useful feature in the digital twin context.
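
The low-resource property mentioned at the end of this abstract can be sketched as follows: once an expression has been learned, scoring a new reading only requires evaluating a polynomial and comparing the residual with a threshold. This is a sketch under stated assumptions, not the authors' pipeline. `np.polyfit` stands in for the symbolic regression step (PySR is not used here), and the sensor relationship, the 4-sigma threshold and the function name `flag_anomalies` are all hypothetical.

```python
import numpy as np

def flag_anomalies(coeffs, x, y, threshold):
    """Flag readings whose distance from the learned polynomial exceeds the threshold."""
    residual = np.asarray(y, dtype=float) - np.polyval(coeffs, x)
    return np.abs(residual) > threshold

# "Training": learn the nominal input/output relationship from healthy data.
# np.polyfit is only a stand-in for a symbolic regression search.
rng = np.random.default_rng(3)
x_train = np.linspace(0.0, 4.0, 200)
y_train = 0.5 * x_train**2 - x_train + 2.0 + rng.normal(0.0, 0.05, x_train.size)
coeffs = np.polyfit(x_train, y_train, 2)

# Threshold taken from the spread of the training residuals (assumed rule: 4 sigma).
threshold = 4.0 * np.std(y_train - np.polyval(coeffs, x_train))

# "Monitoring": new readings with one injected fault at index 10.
x_new = np.linspace(0.0, 4.0, 50)
y_new = 0.5 * x_new**2 - x_new + 2.0 + rng.normal(0.0, 0.05, x_new.size)
y_new[10] += 1.0
flags = flag_anomalies(coeffs, x_new, y_new, threshold)
print("flagged indices:", np.flatnonzero(flags))
```

At run time only `coeffs` and `threshold` need to be stored, which is what makes this kind of detector cheap to embed in a digital twin.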

Keywords: anomaly detection, digital twin, industry 4.0, symbolic regression

Procedia PDF Downloads 96
29955 Competitors’ Influence Analysis of a Retailer by Using Customer Value and Huff’s Gravity Model

Authors: Yepeng Cheng, Yasuhiko Morimoto

Abstract:

Customer relationship analysis is vital for retail stores, especially for supermarkets. Point of sale (POS) systems make it possible to record the daily purchasing behaviors of customers as an identification point of sale (ID-POS) database, which can be used to analyze the customer behaviors of a supermarket. The customer value is an indicator based on the ID-POS database for detecting customer loyalty to a store. In general, there are many supermarkets in a city, and nearby competitor supermarkets significantly affect the customer value of a supermarket's customers. However, it is impossible to get detailed ID-POS databases of competitor supermarkets. This study first focused on the customer value and the distance between a customer's home and the supermarkets in a city, and then constructed models based on logistic regression analysis to analyze correlations between distance and purchasing behaviors using only the POS database of one supermarket chain. During the modeling process, three primary problems arose: the incomparability of customer values, the multicollinearity between customer value and distance data, and the number of valid partial regression coefficients. The improved customer value, Huff’s gravity model, and inverse attractiveness frequency are considered to solve these problems. This paper presents three types of models based on these three methods for loyal customer classification and competitors’ influence analysis. In numerical experiments, all types of models are useful for loyal customer classification. The model type that includes all three methods is the best for evaluating the influence of nearby supermarkets on customers' purchasing at a supermarket chain, from the viewpoint of valid partial regression coefficients and accuracy.
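
For readers unfamiliar with Huff’s gravity model, the sketch below computes the standard Huff patronage probabilities: the probability that a customer chooses a store is the store's attractiveness divided by a power of the distance, normalised over all stores. The attractiveness values (for example floor area), the distances and the decay exponent are assumed for illustration. How the paper combines these probabilities with customer value and logistic regression is not specified in the abstract and is not reproduced here.

```python
import numpy as np

def huff_probabilities(attractiveness, distances, decay=2.0):
    """Huff model: P[i, j] = (S_j / d_ij**decay) / sum_k (S_k / d_ik**decay).

    attractiveness: shape (n_stores,), e.g. floor area of each store.
    distances:      shape (n_customers, n_stores), home-to-store distances (> 0).
    """
    s = np.asarray(attractiveness, dtype=float)
    d = np.asarray(distances, dtype=float)
    utility = s / d**decay
    return utility / utility.sum(axis=1, keepdims=True)

# Three stores and two customers (hypothetical numbers).
floor_area = [1000.0, 2000.0, 1500.0]
dist_km = [[1.0, 2.0, 3.0],
           [2.0, 1.0, 1.0]]
probs = huff_probabilities(floor_area, dist_km)
print(probs.round(3))
```

Each row sums to one, so a row can be read as the share of a customer's purchases expected to go to each store, including competitors for which no ID-POS data is available.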

Keywords: customer value, Huff's Gravity Model, POS, Retailer

Procedia PDF Downloads 99
29954 Microstructural Characterization and Mechanical Properties of Al-2Mn-5Fe Ternary Eutectic Alloy

Authors: Emin Çadirli, Izzettin Yilmazer, Uğur Büyük, Hasan Kaya

Abstract:

Al-2Mn-5Fe eutectic alloy (wt.%) was prepared in a graphite crucible under vacuum atmosphere. The samples were directionally solidified upward at a constant temperature gradient and at four different growth rates using the Bridgman method. The values of eutectic spacing were measured from longitudinal and transverse sections of the samples. The dependence of eutectic spacing on the growth rate was determined by using linear regression analysis. The microhardness and tensile strength of the studied alloy were also measured from the directionally solidified samples. The dependence of the microhardness and tensile strength of the directionally solidified Al-2Mn-5Fe eutectic alloy on the growth rate was investigated, and the relationships between them were obtained experimentally by using regression analysis. The results obtained in the present work were compared with previous similar experimental results obtained for binary and ternary alloys.
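
One common way to apply linear regression to spacing versus growth-rate data is to assume a power law, spacing = k * V^(-n), and fit a straight line in log-log space. The abstract does not state the functional form the authors used, so the power-law form, the function name and the numbers below are all assumptions made for illustration.

```python
import numpy as np

def fit_power_law(growth_rate, spacing):
    """Fit spacing = k * growth_rate**(-n) by linear regression on log-transformed data."""
    slope, intercept = np.polyfit(np.log(growth_rate), np.log(spacing), 1)
    return np.exp(intercept), -slope   # (k, n)

# Four hypothetical growth rates and spacings generated from an exact power law.
v = np.array([8.0, 16.0, 40.0, 80.0])
lam = 10.0 * v**-0.5
k, n = fit_power_law(v, lam)
print(f"k = {k:.3f}, n = {n:.3f}")
```

The same log-log fit can be reused for microhardness or tensile strength against growth rate, where the fitted exponent would typically have the opposite sign.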

Keywords: eutectic alloy, microhardness, microstructure, tensile strength

Procedia PDF Downloads 447
29953 Regional Flood Frequency Analysis in Narmada Basin: A Case Study

Authors: Ankit Shah, R. K. Shrivastava

Abstract:

Floods and droughts are two main features of hydrology that affect human life. Floods are natural disasters that cause millions of rupees’ worth of damage each year in India and around the world, destroying life and property. An accurate estimate of the flood damage potential is a key element of an effective, nationwide flood damage abatement program. Also, the rising demand for water due to population, industrial and agricultural growth has shown that water, though a renewable resource, cannot be taken for granted. We have to optimize the use of water according to circumstances and conditions and need to harness it, which can be done by constructing hydraulic structures. For the safe and proper functioning of hydraulic structures, we need to predict the flood magnitude and its impact. Hydraulic structures play a key role in harnessing and optimizing flood water, which in turn results in safe and maximum use of the water available. Hydraulic structures are mainly constructed on ungauged sites. There are two methods by which we can estimate floods, viz. generation of Unit Hydrographs and Flood Frequency Analysis. In this study, Regional Flood Frequency Analysis has been employed. There are many methods for Regional Flood Frequency Analysis, viz. the Index Flood Method, the Natural Environment Research Council (NERC) method, the Multiple Regression Method, etc. However, none of the methods can be considered universal for every situation and location. The Narmada basin is located in Central India. It is drained by many tributaries, most of which are ungauged. Therefore, it is very difficult to estimate floods on these tributaries and in the main river. In this study, Artificial Neural Networks (ANNs) and the Multiple Regression Method are used to determine regional flood frequency. 
The annual peak flood data of 20 gauging sites in the Narmada Basin are used in the present study to determine the regional flood relationships. Homogeneity of the considered sites is determined using the Index Flood Method. The flood relationships obtained by the two methods are compared with each other, and ANN is found to be more reliable than the Multiple Regression Method for the present study area.

Keywords: artificial neural network, index flood method, multi layer perceptrons, multiple regression, Narmada basin, regional flood frequency

Procedia PDF Downloads 390
29952 Impact of Infrastructural Development on Socio-Economic Growth: An Empirical Investigation in India

Authors: Jonardan Koner

Abstract:

The study attempts to find out the impact of infrastructural investment on state economic growth in India. It further tries to determine the magnitude of that impact on an economic indicator, per-capita income (PCI), in Indian states. The study uses a panel regression technique to measure the impact of infrastructural investment on PCI, since panel regression incorporates both the cross-section and time-series aspects of the dataset. In order to analyze the difference in impact of the explanatory variables on the explained variable across states, the study uses a Fixed Effect Panel Regression Model. We analyze time series data (annual frequency) ranging from 1991 to 2010. The conclusions of the study are that infrastructural investment has a desirable impact on economic development and that the impact differs across states in India. The study reveals that infrastructural investment significantly explains the variation of economic indicators.
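
The fixed-effect idea in this abstract can be illustrated with the within estimator, which is numerically equivalent to including one dummy variable per state: each variable is demeaned within its state before an ordinary least-squares fit. The sketch below uses synthetic data with a common slope. The state effects, the variable names and the true coefficient of 0.7 are assumptions for illustration, and the state-specific slopes discussed in the abstract are not modelled.

```python
import numpy as np

def within_estimator(y, X, groups):
    """Fixed-effects (within) estimator: demean y and X per group, then run OLS."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    y_d, X_d = y.copy(), X.copy()
    for g in np.unique(groups):
        mask = groups == g
        y_d[mask] -= y[mask].mean()
        X_d[mask] -= X[mask].mean(axis=0)
    beta, *_ = np.linalg.lstsq(X_d, y_d, rcond=None)
    return beta

# Synthetic panel: 5 states observed over 20 years (hypothetical data).
rng = np.random.default_rng(2)
n_states, n_years = 5, 20
state = np.repeat(np.arange(n_states), n_years)
alpha = np.array([1.0, 3.0, -2.0, 0.5, 4.0])[state]     # unobserved state effects
invest = alpha + rng.normal(size=state.size)            # regressor correlated with them
pci = alpha + 0.7 * invest + 0.1 * rng.normal(size=state.size)

beta_fe = within_estimator(pci, invest[:, None], state)
beta_pooled = np.polyfit(invest, pci, 1)[0]             # ignores state effects
print(f"fixed-effects slope: {beta_fe[0]:.3f}")
print(f"pooled OLS slope:    {beta_pooled:.3f}")
```

Because the regressor is correlated with the state effects, the pooled slope absorbs part of them, while the within estimator recovers a value close to the slope used to generate the data.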

Keywords: infrastructural investment, multiple regression, panel regression techniques, economic development, fixed effect dummy variable model

Procedia PDF Downloads 350