Search results for: hierarchical regression analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 28311

Search results for: hierarchical regression analysis

28041 A Kolmogorov-Smirnov Type Goodness-Of-Fit Test of Multinomial Logistic Regression Model in Case-Control Studies

Authors: Chen Li-Ching

Abstract:

The multinomial logistic regression model is used popularly for inferring the relationship of risk factors and disease with multiple categories. This study based on the discrepancy between the nonparametric maximum likelihood estimator and semiparametric maximum likelihood estimator of the cumulative distribution function to propose a Kolmogorov-Smirnov type test statistic to assess adequacy of the multinomial logistic regression model for case-control data. A bootstrap procedure is presented to calculate the critical value of the proposed test statistic. Empirical type I error rates and powers of the test are performed by simulation studies. Some examples will be illustrated the implementation of the test.

Keywords: case-control studies, goodness-of-fit test, Kolmogorov-Smirnov test, multinomial logistic regression

Procedia PDF Downloads 427
28040 Hierarchical Surface Inspired by Lotus-Leaf for Electrical Generators from Waterdrop

Authors: Jaewook Ha, Jin-beak Kim, Seongmin Kim

Abstract:

In order to solve global warming and climate change issues, increased efforts have been devoted towards clean and sustainable energy sources as well as new energy generating devices. Nanogenerator is a device that converts mechanical/thermal energy as produced by small-scale physical change into electricity. Here we propose that nature-leaf surface could be used for preparation of a triboelectric nanogenerator. The nature-leaf surface consists of polydimethylsiloxane microscale pillars and polytetrafluoroethylene nanoparticles. Interaction between the nature-leaf surface and water was studied and the electrical outputs from the motion of single water drop were measured. A 40-μL water drop can generate a peak voltage of 1 V and a peak current of 0.7 μA. This nanogenerator might be used to drive electric devices in the outdoor environments in a sustainable manner.

Keywords: hierarchical surface, lotus-leaf, electrical generator, waterdrop

Procedia PDF Downloads 255
28039 The Impact of Governance on Happiness: Evidence from Quantile Regressions

Authors: Chiung-Ju Huang

Abstract:

This study utilizes the quantile regression analysis to examine the impact of governance (including democratic quality and technical quality) on happiness in 101 countries worldwide, classified as “developed countries” and “developing countries”. The empirical results show that the impact of democratic quality and technical quality on happiness is significantly positive for “developed countries”, while is insignificant for “developing countries”. The results suggest that the authorities in developed countries can enhance the level of individual happiness by means of improving the democracy quality and technical quality. However, for developing countries, promoting the quality of governance in order to enhance the level of happiness may not be effective. Policy makers in developed countries may pay more attention on increasing real GDP per capita instead of promoting the quality of governance to enhance individual happiness.

Keywords: governance, happiness, multiple regression, quantile regression

Procedia PDF Downloads 250
28038 An Automated Stock Investment System Using Machine Learning Techniques: An Application in Australia

Authors: Carol Anne Hargreaves

Abstract:

A key issue in stock investment is how to select representative features for stock selection. The objective of this paper is to firstly determine whether an automated stock investment system, using machine learning techniques, may be used to identify a portfolio of growth stocks that are highly likely to provide returns better than the stock market index. The second objective is to identify the technical features that best characterize whether a stock’s price is likely to go up and to identify the most important factors and their contribution to predicting the likelihood of the stock price going up. Unsupervised machine learning techniques, such as cluster analysis, were applied to the stock data to identify a cluster of stocks that was likely to go up in price – portfolio 1. Next, the principal component analysis technique was used to select stocks that were rated high on component one and component two – portfolio 2. Thirdly, a supervised machine learning technique, the logistic regression method, was used to select stocks with a high probability of their price going up – portfolio 3. The predictive models were validated with metrics such as, sensitivity (recall), specificity and overall accuracy for all models. All accuracy measures were above 70%. All portfolios outperformed the market by more than eight times. The top three stocks were selected for each of the three stock portfolios and traded in the market for one month. After one month the return for each stock portfolio was computed and compared with the stock market index returns. The returns for all three stock portfolios was 23.87% for the principal component analysis stock portfolio, 11.65% for the logistic regression portfolio and 8.88% for the K-means cluster portfolio while the stock market performance was 0.38%. This study confirms that an automated stock investment system using machine learning techniques can identify top performing stock portfolios that outperform the stock market.

Keywords: machine learning, stock market trading, logistic regression, cluster analysis, factor analysis, decision trees, neural networks, automated stock investment system

Procedia PDF Downloads 128
28037 Forecasting of Grape Juice Flavor by Using Support Vector Regression

Authors: Ren-Jieh Kuo, Chun-Shou Huang

Abstract:

The research of juice flavor forecasting has become more important in China. Due to the fast economic growth in China, many different kinds of juices have been introduced to the market. If a beverage company can understand their customers’ preference well, the juice can be served more attractively. Thus, this study intends to introduce the basic theory and computing process of grapes juice flavor forecasting based on support vector regression (SVR). Applying SVR, BPN and LR to forecast the flavor of grapes juice in real data, the result shows that SVR is more suitable and effective at predicting performance.

Keywords: flavor forecasting, artificial neural networks, Support Vector Regression, China

Procedia PDF Downloads 452
28036 A Multinomial Logistic Regression Analysis of Factors Influencing Couples' Fertility Preferences in Kenya

Authors: Naomi W. Maina

Abstract:

Fertility preference is a subject of great significance in developing countries. Studies reveal that the preferences of fertility are actually significant in determining the society’s fertility levels because the fertility behavior of the future has a high likelihood of falling under the effect of currently observed fertility inclinations. The objective of this study was to establish the factors associated with fertility preference amongst couples in Kenya by fitting a multinomial logistic regression model against 5,265 couple data obtained from Kenya demographic health survey 2014. Results revealed that the type of place of residence, the region of residence, age and spousal age gap significantly influence desire for additional children among couples in Kenya. There was the notable high likelihood of couples living in rural settlements having similar fertility preference compared to those living in urban settlements. Moreover, geographical disparities such as in northern Kenya revealed significant differences in a couples desire to have additional children compared to Nairobi. The odds of a couple’s desire for additional children were further observed to vary dependent on either the wife or husbands age and to a large extent the spousal age gap. Evidenced from the study, was the fact that as spousal age gap increases, the desire for more children amongst couples decreases. Insights derived from this study would be attractive to demographers, health practitioners, policymakers, and non-governmental organizations implementing fertility related interventions in Kenya among other stakeholders. Moreover, with the adoption of devolution, there is a clear need for adoption of population policies that are County specific as opposed to a national population policy as is the current practice in Kenya. Additionally, researchers or students who have little understanding in the application of multinomial logistic regression, both theoretical understanding and practical analysis in SPSS as well as application on real datasets, will find this article useful.

Keywords: couples' desire, fertility, fertility preference, multinomial regression analysis

Procedia PDF Downloads 151
28035 Comparison of Multivariate Adaptive Regression Splines and Random Forest Regression in Predicting Forced Expiratory Volume in One Second

Authors: P. V. Pramila , V. Mahesh

Abstract:

Pulmonary Function Tests are important non-invasive diagnostic tests to assess respiratory impairments and provides quantifiable measures of lung function. Spirometry is the most frequently used measure of lung function and plays an essential role in the diagnosis and management of pulmonary diseases. However, the test requires considerable patient effort and cooperation, markedly related to the age of patients esulting in incomplete data sets. This paper presents, a nonlinear model built using Multivariate adaptive regression splines and Random forest regression model to predict the missing spirometric features. Random forest based feature selection is used to enhance both the generalization capability and the model interpretability. In the present study, flow-volume data are recorded for N= 198 subjects. The ranked order of feature importance index calculated by the random forests model shows that the spirometric features FVC, FEF 25, PEF,FEF 25-75, FEF50, and the demographic parameter height are the important descriptors. A comparison of performance assessment of both models prove that, the prediction ability of MARS with the `top two ranked features namely the FVC and FEF 25 is higher, yielding a model fit of R2= 0.96 and R2= 0.99 for normal and abnormal subjects. The Root Mean Square Error analysis of the RF model and the MARS model also shows that the latter is capable of predicting the missing values of FEV1 with a notably lower error value of 0.0191 (normal subjects) and 0.0106 (abnormal subjects). It is concluded that combining feature selection with a prediction model provides a minimum subset of predominant features to train the model, yielding better prediction performance. This analysis can assist clinicians with a intelligence support system in the medical diagnosis and improvement of clinical care.

Keywords: FEV, multivariate adaptive regression splines pulmonary function test, random forest

Procedia PDF Downloads 276
28034 Estimation of Coefficients of Ridge and Principal Components Regressions with Multicollinear Data

Authors: Rajeshwar Singh

Abstract:

The presence of multicollinearity is common in handling with several explanatory variables simultaneously due to exhibiting a linear relationship among them. A great problem arises in understanding the impact of explanatory variables on the dependent variable. Thus, the method of least squares estimation gives inexact estimates. In this case, it is advised to detect its presence first before proceeding further. Using the ridge regression degree of its occurrence is reduced but principal components regression gives good estimates in this situation. This paper discusses well-known techniques of the ridge and principal components regressions and applies to get the estimates of coefficients by both techniques. In addition to it, this paper also discusses the conflicting claim on the discovery of the method of ridge regression based on available documents.

Keywords: conflicting claim on credit of discovery of ridge regression, multicollinearity, principal components and ridge regressions, variance inflation factor

Procedia PDF Downloads 375
28033 Application of Multilinear Regression Analysis for Prediction of Synthetic Shear Wave Velocity Logs in Upper Assam Basin

Authors: Triveni Gogoi, Rima Chatterjee

Abstract:

Shear wave velocity (Vs) estimation is an important approach in the seismic exploration and characterization of a hydrocarbon reservoir. There are varying methods for prediction of S-wave velocity, if recorded S-wave log is not available. But all the available methods for Vs prediction are empirical mathematical models. Shear wave velocity can be estimated using P-wave velocity by applying Castagna’s equation, which is the most common approach. The constants used in Castagna’s equation vary for different lithologies and geological set-ups. In this study, multiple regression analysis has been used for estimation of S-wave velocity. The EMERGE module from Hampson-Russel software has been used here for generation of S-wave log. Both single attribute and multi attributes analysis have been carried out for generation of synthetic S-wave log in Upper Assam basin. Upper Assam basin situated in North Eastern India is one of the most important petroleum provinces of India. The present study was carried out using four wells of the study area. Out of these wells, S-wave velocity was available for three wells. The main objective of the present study is a prediction of shear wave velocities for wells where S-wave velocity information is not available. The three wells having S-wave velocity were first used to test the reliability of the method and the generated S-wave log was compared with actual S-wave log. Single attribute analysis has been carried out for these three wells within the depth range 1700-2100m, which corresponds to Barail group of Oligocene age. The Barail Group is the main target zone in this study, which is the primary producing reservoir of the basin. A system generated list of attributes with varying degrees of correlation appeared and the attribute with the highest correlation was concerned for the single attribute analysis. Crossplot between the attributes shows the variation of points from line of best fit. The final result of the analysis was compared with the available S-wave log, which shows a good visual fit with a correlation of 72%. Next multi-attribute analysis has been carried out for the same data using all the wells within the same analysis window. A high correlation of 85% has been observed between the output log from the analysis and the recorded S-wave. The almost perfect fit between the synthetic S-wave and the recorded S-wave log validates the reliability of the method. For further authentication, the generated S-wave data from the wells have been tied to the seismic and correlated them. Synthetic share wave log has been generated for the well M2 where S-wave is not available and it shows a good correlation with the seismic. Neutron porosity, density, AI and P-wave velocity are proved to be the most significant variables in this statistical method for S-wave generation. Multilinear regression method thus can be considered as a reliable technique for generation of shear wave velocity log in this study.

Keywords: Castagna's equation, multi linear regression, multi attribute analysis, shear wave logs

Procedia PDF Downloads 199
28032 Relationship of Organizational Culture, Teacher Psychological Empowerment, and Organizational Citizenship Behavior in Universities in Bangkalan District

Authors: Iqbal Abd. Muhbir Hadi Anam

Abstract:

The purpose of the study is to discuss the relationship between organizational culture, teacher psychological empowerment, and organizational citizenship behavior at the University of Bangkalan District. The data was obtained using a survey of 100 respondents tested for validity and reliability. The analytical technique used is a hierarchical regression test. The results showed that the organizational culture of the university had a strong influence on the psychological empowerment of teachers and the psychological empowerment of teachers and that the organizational culture and psychological empowerment of teachers provided effective predictions of the psychological empowerment of the university. In addition, organizational culture directly or indirectly influences teachers' organizational citizenship behavior through psychological empowerment. Given these results, universities need to build an organizational culture that reflects the nature of the university.

Keywords: organizational behavior, teacher psychological empowerment, organizational citizenship behavior, universities

Procedia PDF Downloads 171
28031 Using Mind Mapping and Morphological Analysis within a New Methodology for Teaching Students of Products’ Design

Authors: Kareem Saber

Abstract:

Many products’ design instructors search for how to help students to develop their designs simply by reducing design stages and extrapolating simple design process forms to achieve design creativity. So, the researcher extrapolated a new design process form called “hierarchical design” which reduced design process into three stages and he had tried that methodology on about two hundred students. That trial had led to great results as students could develop their designs which characterized by creativity and innovation. That proved the success and effectiveness of the proposed methodology.

Keywords: mind mapping, morphological analysis, product design, design process

Procedia PDF Downloads 144
28030 Partial Least Square Regression for High-Dimentional and High-Correlated Data

Authors: Mohammed Abdullah Alshahrani

Abstract:

The research focuses on investigating the use of partial least squares (PLS) methodology for addressing challenges associated with high-dimensional correlated data. Recent technological advancements have led to experiments producing data characterized by a large number of variables compared to observations, with substantial inter-variable correlations. Such data patterns are common in chemometrics, where near-infrared (NIR) spectrometer calibrations record chemical absorbance levels across hundreds of wavelengths, and in genomics, where thousands of genomic regions' copy number alterations (CNA) are recorded from cancer patients. PLS serves as a widely used method for analyzing high-dimensional data, functioning as a regression tool in chemometrics and a classification method in genomics. It handles data complexity by creating latent variables (components) from original variables. However, applying PLS can present challenges. The study investigates key areas to address these challenges, including unifying interpretations across three main PLS algorithms and exploring unusual negative shrinkage factors encountered during model fitting. The research presents an alternative approach to addressing the interpretation challenge of predictor weights associated with PLS. Sparse estimation of predictor weights is employed using a penalty function combining a lasso penalty for sparsity and a Cauchy distribution-based penalty to account for variable dependencies. The results demonstrate sparse and grouped weight estimates, aiding interpretation and prediction tasks in genomic data analysis. High-dimensional data scenarios, where predictors outnumber observations, are common in regression analysis applications. Ordinary least squares regression (OLS), the standard method, performs inadequately with high-dimensional and highly correlated data. Copy number alterations (CNA) in key genes have been linked to disease phenotypes, highlighting the importance of accurate classification of gene expression data in bioinformatics and biology using regularized methods like PLS for regression and classification.

Keywords: partial least square regression, genetics data, negative filter factors, high dimensional data, high correlated data

Procedia PDF Downloads 9
28029 Analytical Modelling of Surface Roughness during Compacted Graphite Iron Milling Using Ceramic Inserts

Authors: Ş. Karabulut, A. Güllü, A. Güldaş, R. Gürbüz

Abstract:

This study investigates the effects of the lead angle and chip thickness variation on surface roughness during the machining of compacted graphite iron using ceramic cutting tools under dry cutting conditions. Analytical models were developed for predicting the surface roughness values of the specimens after the face milling process. Experimental data was collected and imported to the artificial neural network model. A multilayer perceptron model was used with the back propagation algorithm employing the input parameters of lead angle, cutting speed and feed rate in connection with chip thickness. Furthermore, analysis of variance was employed to determine the effects of the cutting parameters on surface roughness. Artificial neural network and regression analysis were used to predict surface roughness. The values thus predicted were compared with the collected experimental data, and the corresponding percentage error was computed. Analysis results revealed that the lead angle is the dominant factor affecting surface roughness. Experimental results indicated an improvement in the surface roughness value with decreasing lead angle value from 88° to 45°.

Keywords: CGI, milling, surface roughness, ANN, regression, modeling, analysis

Procedia PDF Downloads 423
28028 Decision Making during the Project Management Life Cycle of Infrastructure Projects

Authors: Karrar Raoof Kareem Kamoona, Enas Fathi Taher AlHares, Zeynep Isik

Abstract:

The various disciplines in the construction industry and the co-existence of the people in the various disciplines are what builds well-developed, closely-knit interpersonal skills at various hierarchical levels thus leading to a varied way of leadership. The varied decision making aspects during the lifecycle of a project include: autocratic, participatory and last but not least, free-rein. We can classify some of the decision makers in the construction industry in a hierarchical manner as follows: project executive, project manager, superintendent, office engineer and finally the field engineer. This survey looked at how decisions are made during the construction period by the key stakeholders in the project. From the paper it is evident that the three decision making aspects can be used at different times or at times together in order to bring out the best leadership decision. A blend of different leadership styles should be used to enhance the success rate during the project lifecycle.

Keywords: leadership style, construction, decision-making, built environment

Procedia PDF Downloads 334
28027 Drivers of Liking: Probiotic Petit Suisse Cheese

Authors: Helena Bolini, Erick Esmerino, Adriano Cruz, Juliana Paixao

Abstract:

The currently concern for health has increased demand for low-calorie ingredients and functional foods as probiotics. Understand the reasons that infer on food choice, besides a challenging task, it is important step for development and/or reformulation of existing food products. The use of appropriate multivariate statistical techniques, such as External Preference Map (PrefMap), associated with regression by Partial Least Squares (PLS) can help in determining those factors. Thus, this study aimed to determine, through PLS regression analysis, the sensory attributes considered drivers of liking in probiotic petit suisse cheeses, strawberry flavor, sweetened with different sweeteners. Five samples in same equivalent sweetness: PROB1 (Sucralose 0.0243%), PROB2 (Stevia 0.1520%), PROB3 (Aspartame 0.0877%), PROB4 (Neotame 0.0025%) and PROB5 (Sucrose 15.2%) determined by just-about-right and magnitude estimation methods, and three commercial samples COM1, COM2 and COM3, were studied. Analysis was done over data coming from QDA, performed by 12 expert (highly trained assessors) on 20 descriptor terms, correlated with data from assessment of overall liking in acceptance test, carried out by 125 consumers, on all samples. Sequentially, results were submitted to PLS regression using XLSTAT software from Byossistemes. As shown in results, it was possible determine, that three sensory descriptor terms might be considered drivers of liking of probiotic petit suisse cheese samples added with sweeteners (p<0.05). The milk flavor was noticed as a sensory characteristic with positive impact on acceptance, while descriptors bitter taste and sweet aftertaste were perceived as descriptor terms with negative impact on acceptance of petit suisse probiotic cheeses. It was possible conclude that PLS regression analysis is a practical and useful tool in determining drivers of liking of probiotic petit suisse cheeses sweetened with artificial and natural sweeteners, allowing food industry to understand and improve their formulations maximizing the acceptability of their products.

Keywords: acceptance, consumer, quantitative descriptive analysis, sweetener

Procedia PDF Downloads 414
28026 Competitors’ Influence Analysis of a Retailer by Using Customer Value and Huff’s Gravity Model

Authors: Yepeng Cheng, Yasuhiko Morimoto

Abstract:

Customer relationship analysis is vital for retail stores, especially for supermarkets. The point of sale (POS) systems make it possible to record the daily purchasing behaviors of customers as an identification point of sale (ID-POS) database, which can be used to analyze customer behaviors of a supermarket. The customer value is an indicator based on ID-POS database for detecting the customer loyalty of a store. In general, there are many supermarkets in a city, and other nearby competitor supermarkets significantly affect the customer value of customers of a supermarket. However, it is impossible to get detailed ID-POS databases of competitor supermarkets. This study firstly focused on the customer value and distance between a customer's home and supermarkets in a city, and then constructed the models based on logistic regression analysis to analyze correlations between distance and purchasing behaviors only from a POS database of a supermarket chain. During the modeling process, there are three primary problems existed, including the incomparable problem of customer values, the multicollinearity problem among customer value and distance data, and the number of valid partial regression coefficients. The improved customer value, Huff’s gravity model, and inverse attractiveness frequency are considered to solve these problems. This paper presents three types of models based on these three methods for loyal customer classification and competitors’ influence analysis. In numerical experiments, all types of models are useful for loyal customer classification. The type of model, including all three methods, is the most superior one for evaluating the influence of the other nearby supermarkets on customers' purchasing of a supermarket chain from the viewpoint of valid partial regression coefficients and accuracy.

Keywords: customer value, Huff's Gravity Model, POS, Retailer

Procedia PDF Downloads 93
28025 Estimate of Maximum Expected Intensity of One-Half-Wave Lines Dancing

Authors: A. Bekbaev, M. Dzhamanbaev, R. Abitaeva, A. Karbozova, G. Nabyeva

Abstract:

In this paper, the regression dependence of dancing intensity from wind speed and length of span was established due to the statistic data obtained from multi-year observations on line wires dancing accumulated by power systems of Kazakhstan and the Russian Federation. The lower and upper limitations of the equations parameters were estimated, as well as the adequacy of the regression model. The constructed model will be used in research of dancing phenomena for the development of methods and means of protection against dancing and for zoning plan of the territories of line wire dancing.

Keywords: power lines, line wire dancing, dancing intensity, regression equation, dancing area intensity

Procedia PDF Downloads 287
28024 Perceived Organizational Justice, Trust and Employee Engagement in Bank Managers

Authors: Seemal Mazhar Khan, Tahira Mubashar

Abstract:

The present research aimed to investigate the relationship in perceived organizational justice, organizational trust and employee engagement in bank employees. It was hypothesized: there is likely to be a relationship in perceived organizational justices, organizational trust and employee engagement; perceived organizational justice and organizational trust are likely to predict employee engagement; there is likely to be effect of bank type and designation on perceived organizational justice, organizational trust and employee engagement. The sample consisted of 150 bank employees (50 from government, 50 from private and 50 from privatized banks) selected from different banks in Lahore, Pakistan. Correlational research design was used to conduct this study. Perceived Organizational Justices Questionnaire, Organizational Trust Questionnaire and Employee Engagement Scale were used for assessment. Pearson product moment correlation, hierarchical regression and multivariate analysis of covariance were applied. Results showed a positive significant relationship in perceived organizational justice and organizational engagement and there were also a positive significant relation between organizational trust and job and organizational engagement. Results showed that organizational trust predicts organizational engagement after controlling the effect of age, marital status and socio-economic status and there is a significant interaction effect of bank type and designation level on organizational trust in bank employees. The findings of the research can serve as a platform for the awareness of important antecedents of employee engagement and organizations can inculcate trust for better and improved engagement of its employees, thereby, enhancing the productivity of their employees.

Keywords: bank employees, organizational engagement, perceived organizational justice, trust

Procedia PDF Downloads 363
28023 Microstructural Characterization and Mechanical Properties of Al-2Mn-5Fe Ternary Eutectic Alloy

Authors: Emin Çadirli, Izzettin Yilmazer, Uğur Büyük, Hasan Kaya

Abstract:

Al-2Mn-5Fe eutectic alloy (wt.%) was prepared in a graphite crucible under vacuum atmosphere. The samples were directionally solidified upward at a constant temperature gradient in four different of growth rates by using a Bridgman method. The values of eutectic spacing were measured from longitudinal and transverse sections of the samples. The dependence of eutectic spacing on the growth rate was determined by using linear regression analysis. The microhardness and tensile strength of the studied alloy also were measured from directionally solidified samples. The dependency of the microhardness and tensile strength for directionally solidified Al-2Mn-5Fe eutectic alloy on the growth rate were investigated and the relationships between them were experimentally obtained by using regression analysis. The results obtained in present work were compared with the previous similar experimental results obtained for binary and ternary alloys.

Keywords: eutectic alloy, microhardness, microstructure, tensile strength

Procedia PDF Downloads 444
28022 Incorporating Anomaly Detection in a Digital Twin Scenario Using Symbolic Regression

Authors: Manuel Alves, Angelica Reis, Armindo Lobo, Valdemar Leiras

Abstract:

In industry 4.0, it is common to have a lot of sensor data. In this deluge of data, hints of possible problems are difficult to spot. The digital twin concept aims to help answer this problem, but it is mainly used as a monitoring tool to handle the visualisation of data. Failure detection is of paramount importance in any industry, and it consumes a lot of resources. Any improvement in this regard is of tangible value to the organisation. The aim of this paper is to add the ability to forecast test failures, curtailing detection times. To achieve this, several anomaly detection algorithms were compared with a symbolic regression approach. To this end, Isolation Forest, One-Class SVM and an auto-encoder have been explored. For the symbolic regression PySR library was used. The first results show that this approach is valid and can be added to the tools available in this context as a low resource anomaly detection method since, after training, the only requirement is the calculation of a polynomial, a useful feature in the digital twin context.

Keywords: anomaly detection, digital twin, industry 4.0, symbolic regression

Procedia PDF Downloads 89
28021 Bayesian Estimation of Hierarchical Models for Genotypic Differentiation of Arabidopsis thaliana

Authors: Gautier Viaud, Paul-Henry Cournède

Abstract:

Plant growth models have been used extensively for the prediction of the phenotypic performance of plants. However, they remain most often calibrated for a given genotype and therefore do not take into account genotype by environment interactions. One way of achieving such an objective is to consider Bayesian hierarchical models. Three levels can be identified in such models: The first level describes how a given growth model describes the phenotype of the plant as a function of individual parameters, the second level describes how these individual parameters are distributed within a plant population, the third level corresponds to the attribution of priors on population parameters. Thanks to the Bayesian framework, choosing appropriate priors for the population parameters permits to derive analytical expressions for the full conditional distributions of these population parameters. As plant growth models are of a nonlinear nature, individual parameters cannot be sampled explicitly, and a Metropolis step must be performed. This allows for the use of a hybrid Gibbs--Metropolis sampler. A generic approach was devised for the implementation of both general state space models and estimation algorithms within a programming platform. It was designed using the Julia language, which combines an elegant syntax, metaprogramming capabilities and exhibits high efficiency. Results were obtained for Arabidopsis thaliana on both simulated and real data. An organ-scale Greenlab model for the latter is thus presented, where the surface areas of each individual leaf can be simulated. It is assumed that the error made on the measurement of leaf areas is proportional to the leaf area itself; multiplicative normal noises for the observations are therefore used. Real data were obtained via image analysis of zenithal images of Arabidopsis thaliana over a period of 21 days using a two-step segmentation and tracking algorithm which notably takes advantage of the Arabidopsis thaliana phyllotaxy. Since the model formulation is rather flexible, there is no need that the data for a single individual be available at all times, nor that the times at which data is available be the same for all the different individuals. This allows to discard data from image analysis when it is not considered reliable enough, thereby providing low-biased data in large quantity for leaf areas. The proposed model precisely reproduces the dynamics of Arabidopsis thaliana’s growth while accounting for the variability between genotypes. In addition to the estimation of the population parameters, the level of variability is an interesting indicator of the genotypic stability of model parameters. A promising perspective is to test whether some of the latter should be considered as fixed effects.

Keywords: bayesian, genotypic differentiation, hierarchical models, plant growth models

Procedia PDF Downloads 279
28020 Regional Flood Frequency Analysis in Narmada Basin: A Case Study

Authors: Ankit Shah, R. K. Shrivastava

Abstract:

Flood and drought are two main features of hydrology which affect the human life. Floods are natural disasters which cause millions of rupees’ worth of damage each year in India and the whole world. Flood causes destruction in form of life and property. An accurate estimate of the flood damage potential is a key element to an effective, nationwide flood damage abatement program. Also, the increase in demand of water due to increase in population, industrial and agricultural growth, has let us know that though being a renewable resource it cannot be taken for granted. We have to optimize the use of water according to circumstances and conditions and need to harness it which can be done by construction of hydraulic structures. For their safe and proper functioning of hydraulic structures, we need to predict the flood magnitude and its impact. Hydraulic structures play a key role in harnessing and optimization of flood water which in turn results in safe and maximum use of water available. Mainly hydraulic structures are constructed on ungauged sites. There are two methods by which we can estimate flood viz. generation of Unit Hydrographs and Flood Frequency Analysis. In this study, Regional Flood Frequency Analysis has been employed. There are many methods for estimating the ‘Regional Flood Frequency Analysis’ viz. Index Flood Method. National Environmental and Research Council (NERC Methods), Multiple Regression Method, etc. However, none of the methods can be considered universal for every situation and location. The Narmada basin is located in Central India. It is drained by most of the tributaries, most of which are ungauged. Therefore it is very difficult to estimate flood on these tributaries and in the main river. As mentioned above Artificial Neural Network (ANN)s and Multiple Regression Method is used for determination of Regional flood Frequency. The annual peak flood data of 20 sites gauging sites of Narmada Basin is used in the present study to determine the Regional Flood relationships. Homogeneity of the considered sites is determined by using the Index Flood Method. Flood relationships obtained by both the methods are compared with each other, and it is found that ANN is more reliable than Multiple Regression Method for the present study area.

Keywords: artificial neural network, index flood method, multi layer perceptrons, multiple regression, Narmada basin, regional flood frequency

Procedia PDF Downloads 387
28019 Facies Analysis and Depositional Environment of Late Cretaceous (Cenomanian) Lidam Formation, South East Sirt Basin, Libya

Authors: Miloud M. Abugares

Abstract:

This study concentrates on the facies analysis, cyclicity and depositional environment of the Upper Cretaceous (Cenomanian) carbonate ramp deposits of the Lidam Formation. Core description, petrographic analysis data from five wells in Hamid and 3V areas in the SE Sirt Basin, Libya were studied in detail. The Lidam Formation is one of the main oil producing carbonate reservoirs in Southeast Sirt Basin and this study represents one of the key detailed studies of this Formation. In this study, ten main facies have been identified. These facies are; Chicken-Wire Anhydrite Facies, Fine Replacive Dolomite Facies, Bioclastic Sandstone Facies, Laminated Shale Facies, Stromatolitic Laminated Mudstone Facies, Ostracod Bioturbated Wackestone Facies, Bioturbated Mollusc Packstone Facies, Foraminifera Bioclastic Packstone/Grainstone Facies Peloidal Ooidal Packstone/Grainstone Facies and Squamariacean/Coralline Algae Bindstone Facies. These deposits are inferred to have formed in supratidal sabkha, intertidal, semi-open restricted shallow lagoon and higher energy shallow shoal environments. The overall depositional setting is interpreted as have been deposited in inner carbonate ramp deposits. The best reservoir quality is encountered in Peloidal- Ooidal Packstone/Grainstone facies, these facies represents storm - dominated shoal to back shoal deposits and constitute the inner part of carbonate ramp deposits. The succession shows a conspicuous hierarchical cyclicity. Porous shoal and backshoal deposits form during maximum transgression system and early regression hemi-cycle of the Lidam Fm. However; oil producing from shoal and backshoal deposits which only occur in the upper intervals 15 - 20 feet, which forms the large scale transgressive cycle of the Upper Lidam Formation.

Keywords: Lidam Fm. Sirt Basin, Wackestone Facies, petrographic, intertidal

Procedia PDF Downloads 482
28018 Agriculture Yield Prediction Using Predictive Analytic Techniques

Authors: Nagini Sabbineni, Rajini T. V. Kanth, B. V. Kiranmayee

Abstract:

India’s economy primarily depends on agriculture yield growth and their allied agro industry products. The agriculture yield prediction is the toughest task for agricultural departments across the globe. The agriculture yield depends on various factors. Particularly countries like India, majority of agriculture growth depends on rain water, which is highly unpredictable. Agriculture growth depends on different parameters, namely Water, Nitrogen, Weather, Soil characteristics, Crop rotation, Soil moisture, Surface temperature and Rain water etc. In our paper, lot of Explorative Data Analysis is done and various predictive models were designed. Further various regression models like Linear, Multiple Linear, Non-linear models are tested for the effective prediction or the forecast of the agriculture yield for various crops in Andhra Pradesh and Telangana states.

Keywords: agriculture yield growth, agriculture yield prediction, explorative data analysis, predictive models, regression models

Procedia PDF Downloads 274
28017 Impact of Infrastructural Development on Socio-Economic Growth: An Empirical Investigation in India

Authors: Jonardan Koner

Abstract:

The study attempts to find out the impact of infrastructural investment on state economic growth in India. It further tries to determine the magnitude of the impact of infrastructural investment on economic indicator, i.e., per-capita income (PCI) in Indian States. The study uses panel regression technique to measure the impact of infrastructural investment on per-capita income (PCI) in Indian States. Panel regression technique helps incorporate both the cross-section and time-series aspects of the dataset. In order to analyze the difference in impact of the explanatory variables on the explained variables across states, the study uses Fixed Effect Panel Regression Model. The conclusions of the study are that infrastructural investment has a desirable impact on economic development and that the impact is different for different states in India. We analyze time series data (annual frequency) ranging from 1991 to 2010. The study reveals that the infrastructural investment significantly explains the variation of economic indicators.

Keywords: infrastructural investment, multiple regression, panel regression techniques, economic development, fixed effect dummy variable model

Procedia PDF Downloads 345
28016 Antibacterial Evaluation, in Silico ADME and QSAR Studies of Some Benzimidazole Derivatives

Authors: Strahinja Kovačević, Lidija Jevrić, Miloš Kuzmanović, Sanja Podunavac-Kuzmanović

Abstract:

In this paper, various derivatives of benzimidazole have been evaluated against Gram-negative bacteria Escherichia coli. For all investigated compounds the minimum inhibitory concentration (MIC) was determined. Quantitative structure-activity relationships (QSAR) attempts to find consistent relationships between the variations in the values of molecular properties and the biological activity for a series of compounds so that these rules can be used to evaluate new chemical entities. The correlation between MIC and some absorption, distribution, metabolism and excretion (ADME) parameters was investigated, and the mathematical models for predicting the antibacterial activity of this class of compounds were developed. The quality of the multiple linear regression (MLR) models was validated by the leave-one-out (LOO) technique, as well as by the calculation of the statistical parameters for the developed models and the results are discussed on the basis of the statistical data. The results of this study indicate that ADME parameters have a significant effect on the antibacterial activity of this class of compounds. Principal component analysis (PCA) and agglomerative hierarchical clustering algorithms (HCA) confirmed that the investigated molecules can be classified into groups on the basis of the ADME parameters: Madin-Darby Canine Kidney cell permeability (MDCK), Plasma protein binding (PPB%), human intestinal absorption (HIA%) and human colon carcinoma cell permeability (Caco-2).

Keywords: benzimidazoles, QSAR, ADME, in silico

Procedia PDF Downloads 345
28015 The Role of Environmental Analysis in Managing Knowledge in Small and Medium Sized Enterprises

Authors: Liu Yao, B. T. Wan Maseri, Wan Mohd, B. T. Nurul Izzah, Mohd Shah, Wei Wei

Abstract:

Effectively managing knowledge has become a vital weapon for businesses to survive or to succeed in the increasingly competitive market. But do they perform environmental analysis when managing knowledge? If yes, how is the level and significance? This paper established a conceptual framework covering the basic knowledge management activities (KMA) to examine their contribution towards organizational performance (OP). Environmental analysis (EA) was then investigated from both internal and external aspects, to identify its effects on that contribution. Data was collected from 400 Chinese SMEs by questionnaires. Cronbach's α and factor analysis were conducted. Regression results show that the external analysis presents higher level than internal analysis. However, the internal analysis mediates the effects of external analysis on the KMA-OP relation and plays more significant role in the relation comparing with the external analysis. Thus, firms shall improve environmental analysis especially the internal analysis to enhance their KM practices.

Keywords: knowledge management, environmental analysis, performance, mediating, small sized enterprises, medium sized enterprises

Procedia PDF Downloads 579
28014 Hierarchical Operation Strategies for Grid Connected Building Microgrid with Energy Storage and Photovoltatic Source

Authors: Seon-Ho Yoon, Jin-Young Choi, Dong-Jun Won

Abstract:

This paper presents hierarchical operation strategies which are minimizing operation error between day ahead operation plan and real time operation. Operating power systems between centralized and decentralized approaches can be represented as hierarchical control scheme, featured as primary control, secondary control and tertiary control. Primary control is known as local control, featuring fast response. Secondary control is referred to as microgrid Energy Management System (EMS). Tertiary control is responsible of coordinating the operations of multi-microgrids. In this paper, we formulated 3 stage microgrid operation strategies which are similar to hierarchical control scheme. First stage is to set a day ahead scheduled output power of Battery Energy Storage System (BESS) which is only controllable source in microgrid and it is optimized to minimize cost of exchanged power with main grid using Particle Swarm Optimization (PSO) method. Second stage is to control the active and reactive power of BESS to be operated in day ahead scheduled plan in case that State of Charge (SOC) error occurs between real time and scheduled plan. The third is rescheduling the system when the predicted error is over the limited value. The first stage can be compared with the secondary control in that it adjusts the active power. The second stage is comparable to the primary control in that it controls the error in local manner. The third stage is compared with the secondary control in that it manages power balancing. The proposed strategies will be applied to one of the buildings in Electronics and Telecommunication Research Institute (ETRI). The building microgrid is composed of Photovoltaic (PV) generation, BESS and load and it will be interconnected with the main grid. Main purpose of that is minimizing operation cost and to be operated in scheduled plan. Simulation results support validation of proposed strategies.

Keywords: Battery Energy Storage System (BESS), Energy Management System (EMS), Microgrid (MG), Particle Swarm Optimization (PSO)

Procedia PDF Downloads 229
28013 A Quadratic Model to Early Predict the Blastocyst Stage with a Time Lapse Incubator

Authors: Cecile Edel, Sandrine Giscard D'Estaing, Elsa Labrune, Jacqueline Lornage, Mehdi Benchaib

Abstract:

Introduction: The use of incubator equipped with time-lapse technology in Artificial Reproductive Technology (ART) allows a continuous surveillance. With morphocinetic parameters, algorithms are available to predict the potential outcome of an embryo. However, the different proposed time-lapse algorithms do not take account the missing data, and then some embryos could not be classified. The aim of this work is to construct a predictive model even in the case of missing data. Materials and methods: Patients: A retrospective study was performed, in biology laboratory of reproduction at the hospital ‘Femme Mère Enfant’ (Lyon, France) between 1 May 2013 and 30 April 2015. Embryos (n= 557) obtained from couples (n=108) were cultured in a time-lapse incubator (Embryoscope®, Vitrolife, Goteborg, Sweden). Time-lapse incubator: The morphocinetic parameters obtained during the three first days of embryo life were used to build the predictive model. Predictive model: A quadratic regression was performed between the number of cells and time. N = a. T² + b. T + c. N: number of cells at T time (T in hours). The regression coefficients were calculated with Excel software (Microsoft, Redmond, WA, USA), a program with Visual Basic for Application (VBA) (Microsoft) was written for this purpose. The quadratic equation was used to find a value that allows to predict the blastocyst formation: the synthetize value. The area under the curve (AUC) obtained from the ROC curve was used to appreciate the performance of the regression coefficients and the synthetize value. A cut-off value has been calculated for each regression coefficient and for the synthetize value to obtain two groups where the difference of blastocyst formation rate according to the cut-off values was maximal. The data were analyzed with SPSS (IBM, Il, Chicago, USA). Results: Among the 557 embryos, 79.7% had reached the blastocyst stage. The synthetize value corresponds to the value calculated with time value equal to 99, the highest AUC was then obtained. The AUC for regression coefficient ‘a’ was 0.648 (p < 0.001), 0.363 (p < 0.001) for the regression coefficient ‘b’, 0.633 (p < 0.001) for the regression coefficient ‘c’, and 0.659 (p < 0.001) for the synthetize value. The results are presented as follow: blastocyst formation rate under cut-off value versus blastocyst rate formation above cut-off value. For the regression coefficient ‘a’ the optimum cut-off value was -1.14.10-3 (61.3% versus 84.3%, p < 0.001), 0.26 for the regression coefficient ‘b’ (83.9% versus 63.1%, p < 0.001), -4.4 for the regression coefficient ‘c’ (62.2% versus 83.1%, p < 0.001) and 8.89 for the synthetize value (58.6% versus 85.0%, p < 0.001). Conclusion: This quadratic regression allows to predict the outcome of an embryo even in case of missing data. Three regression coefficients and a synthetize value could represent the identity card of an embryo. ‘a’ regression coefficient represents the acceleration of cells division, ‘b’ regression coefficient represents the speed of cell division. We could hypothesize that ‘c’ regression coefficient could represent the intrinsic potential of an embryo. This intrinsic potential could be dependent from oocyte originating the embryo. These hypotheses should be confirmed by studies analyzing relationship between regression coefficients and ART parameters.

Keywords: ART procedure, blastocyst formation, time-lapse incubator, quadratic model

Procedia PDF Downloads 284
28012 A Hybrid Fuzzy Clustering Approach for Fertile and Unfertile Analysis

Authors: Shima Soltanzadeh, Mohammad Hosain Fazel Zarandi, Mojtaba Barzegar Astanjin

Abstract:

Diagnosis of male infertility by the laboratory tests is expensive and, sometimes it is intolerable for patients. Filling out the questionnaire and then using classification method can be the first step in decision-making process, so only in the cases with a high probability of infertility we can use the laboratory tests. In this paper, we evaluated the performance of four classification methods including naive Bayesian, neural network, logistic regression and fuzzy c-means clustering as a classification, in the diagnosis of male infertility due to environmental factors. Since the data are unbalanced, the ROC curves are most suitable method for the comparison. In this paper, we also have selected the more important features using a filtering method and examined the impact of this feature reduction on the performance of each methods; generally, most of the methods had better performance after applying the filter. We have showed that using fuzzy c-means clustering as a classification has a good performance according to the ROC curves and its performance is comparable to other classification methods like logistic regression.

Keywords: classification, fuzzy c-means, logistic regression, Naive Bayesian, neural network, ROC curve

Procedia PDF Downloads 304