Search results for: large margin nearest neighbor regression
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 10152

9912 Machine Vision System for Measuring the Quality of Bulk Sun-dried Organic Raisins

Authors: Navab Karimi, Tohid Alizadeh

Abstract:

An intelligent vision-based system was designed to measure the quality and purity of raisins. A machine vision setup was used to capture images of bulk raisins in 5-50% mixtures of pure and impure berries. The textural features of bulk raisins were extracted using Grey-level Histograms, Grey-Level Co-occurrence Matrix (GLCM), and Local Binary Pattern (a total of 108 features). A Genetic Algorithm and neural network regression were used for selecting and ranking the best features (21 features). As a result, the GLCM feature set was found to have the highest accuracy (92.4%) of all the sets. Subsequently, multiple feature combinations from the previous stage were fed into a second regression (linear regression) to increase accuracy, wherein a combination of 16 features was found to be the optimum. Finally, a Support Vector Machine (SVM) classifier was used to differentiate the mixtures, producing the best efficiency and accuracy of 96.2% and 97.35%, respectively.

Keywords: sun-dried organic raisin, genetic algorithm, feature extraction, ann regression, linear regression, support vector machine, south azerbaijan.

Procedia PDF Downloads 46
9911 Analysis of Factors Affecting the Number of Infant and Maternal Mortality in East Java with Geographically Weighted Bivariate Generalized Poisson Regression Method

Authors: Luh Eka Suryani, Purhadi

Abstract:

Poisson regression is a non-linear regression model for a response variable in the form of count data that follows a Poisson distribution. A pair of highly correlated count variables can be modeled with Bivariate Poisson Regression. The numbers of infant and maternal mortality cases are count data that can be analyzed in this way. Poisson regression assumes equidispersion, meaning that the mean and variance are equal. However, actual count data often have a variance greater or less than the mean (overdispersion and underdispersion). Violations of this assumption can be overcome by applying Generalized Poisson Regression. The characteristics of each regency can also affect the number of cases that occur. This issue can be addressed by a spatial analysis called geographically weighted regression. This study analyzes the numbers of infant and maternal mortality cases based on conditions in East Java in 2016 using the Geographically Weighted Bivariate Generalized Poisson Regression (GWBGPR) method. Modeling is done with adaptive bisquare kernel weighting, which produces 3 regency groups based on infant mortality rate and 5 regency groups based on maternal mortality rate. The variables that significantly influence the numbers of infant and maternal mortality cases are the percentages of pregnant women who visit health workers at least 4 times during pregnancy, pregnant women who receive Fe3 tablets, obstetric complications handled, households with clean and healthy behavior, and married women whose age at first marriage was under 18 years.
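
The equidispersion assumption described above can be checked informally with a dispersion index, the sample variance divided by the sample mean. The following is a minimal sketch only: the function name and the toy counts are assumptions made for illustration and are not taken from the East Java data or from the authors' GWBGPR procedure.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return sample variance / sample mean for a list of counts."""
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m


# Toy counts (hypothetical, not from the study).
toy_counts = [0, 4, 0, 4]
ratio = dispersion_index(toy_counts)
# A ratio well above 1 suggests overdispersion; well below 1, underdispersion.
print(f"dispersion index = {ratio:.3f}")
```

A ratio close to 1 is consistent with the plain Poisson assumption; a clear departure in either direction is the situation in which a generalized Poisson model is usually considered.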

Keywords: adaptive bisquare kernel, GWBGPR, infant mortality, maternal mortality, overdispersion

Procedia PDF Downloads 131
9910 Seismic Stratigraphy of the First Deposits of the Kribi-Campo Offshore Sub-basin (Gulf of Guinea): Pre-Cretaceous Early Marine Incursion and Source Rocks Modeling

Authors: Mike-Franck Mienlam Essi, Joseph Quentin Yene Atangana, Mbida Yem

Abstract:

The Kribi-Campo sub-basin belongs to the southern domain of the Cameroon Atlantic Margin in the Gulf of Guinea. It is the African homologous segment of the Sergipe-Alagoas Basin, located at the northeast side of the Brazil margin. The onset of the seafloor spreading period in the Southwest African Margin in general, and in the study area in particular, remains controversial. Various studies locate this event in Cretaceous times (Early Aptian to Late Albian), while others suggest that it occurred during the Pre-Cretaceous period (Palaeozoic or Jurassic). This work analyses two Cameroon Span seismic lines to re-examine the early marine incursion period of the study area for a better understanding of the margin evolution. The methodology is based on the delineation of the first seismic sequence, using the tracking of reflector terminations and the analysis of its internal reflections in relation to the external configuration of the package. The results indicate, from the bottom upwards, that the first deposits overlie a first seismic horizon (H1) associated with “onlap” terminations at its top and underlie a second horizon (H2) which shows “downlap” terminations at its top. The external configuration of this package features a prograded fill pattern, and it is observed within the depocenter area with discontinuous reflections that pinch out against the basement. From east to west, this sequence shows two seismic facies (SF1 and SF2). SF1 has parallel to subparallel reflections, characterized by high amplitude, and SF2 shows parallel and stratified reflections, characterized by low amplitude. The distribution of these seismic facies reveals a lateral facies variation. Based on the fundamental works on seismic stratigraphy and the literature on the geological context of the study area, the stratigraphic nature of the identified horizons and seismic facies has been established. The seismic horizons H1 and H2 correspond to the top basement and the “downlap surface,” respectively. SF1 indicates continental sediments (sands/sandstone) and SF2 marine deposits (shales, clays). The prograding configuration observed suggests a marine regression. The correlation of these results with the lithochronostratigraphic chart of the Sergipe-Alagoas Basin reveals that the first marine deposits in the study area date from Pre-Cretaceous times (Palaeozoic or Jurassic). The first deposits onto the basement represent the end of a cycle of sedimentation. The hypothesized Cretaceous seafloor spreading through the study area is the onset of another cycle of sedimentation. Furthermore, the presence of marine sediments in the first deposits implies that this package could contain marine source rocks. The spatial tracking of these deposits reveals that they could be found in some onshore parts of the Kribi-Campo area or even on the northern side.

Author affiliations: Mike F. Mienlam Essi, Joseph Q. Yene Atangana, and Mbida Yem are with the Earth Sciences Department of the Faculty of Science of the University of Yaoundé I, P.O. Box 812, Cameroon.

Keywords: cameroon span seismic, early marine incursion, kribi-campo sub-basin, pre-cretaceous period, sergipe-alagoas basin

Procedia PDF Downloads 82
9909 Determining the Causality Variables in Female Genital Mutilation: A Factor Screening Approach

Authors: Ekele Alih, Enejo Jalija

Abstract:

Female Genital Mutilation (FGM) comprises three types, namely Clitoridectomy, Excision, and Infibulation. In this study, we examine the factors responsible for FGM in order to identify the causality variables using a logistic regression approach. From the results of a survey conducted by the Public Health Division, Nigeria Institute of Medical Research, Yaba, Lagos State, the tau statistic, τ, was used to screen 9 factors that cause FGM in order to select a few of the predictors before the multiple regression equation is obtained. Such screening may be needed because the sample size may not be able to sustain a regression with all the predictors, or to avoid multi-collinearity. A total of 300 respondents, comprising 150 adult males and 150 adult females, were selected for the household survey based on a multi-stage sampling procedure. The tau statistic,

Keywords: female genital mutilation, logistic regression, tau statistic, African society

Procedia PDF Downloads 227
9908 A Monte Carlo Fuzzy Logistic Regression Framework against Imbalance and Separation

Authors: Georgios Charizanos, Haydar Demirhan, Duygu Icen

Abstract:

Two of the most impactful issues in classical logistic regression are class imbalance and complete separation. These can result in model predictions heavily leaning towards the imbalanced class on the binary response variable or over-fitting issues. Fuzzy methodology offers key solutions for handling these problems. However, most studies propose the transformation of the binary responses into a continuous format limited within [0,1]. This is called the possibilistic approach within fuzzy logistic regression. Following this approach is more aligned with straightforward regression since a logit-link function is not utilized, and fuzzy probabilities are not generated. In contrast, we propose a method of fuzzifying binary response variables that allows for the use of the logit-link function; hence, a probabilistic fuzzy logistic regression model with the Monte Carlo method. The fuzzy probabilities are then classified by selecting a fuzzy threshold. Different combinations of fuzzy and crisp input, output, and coefficients are explored, aiming to understand which of these perform better under different conditions of imbalance and separation. We conduct numerical experiments using both synthetic and real datasets to demonstrate the performance of the fuzzy logistic regression framework against seven crisp machine learning methods. The proposed framework shows better performance irrespective of the degree of imbalance and presence of separation in the data, while the considered machine learning methods are significantly impacted.

Keywords: fuzzy logistic regression, fuzzy, logistic, machine learning

Procedia PDF Downloads 36
9907 Naïve Bayes: A Classical Approach for the Epileptic Seizures Recognition

Authors: Bhaveek Maini, Sanjay Dhanka, Surita Maini

Abstract:

Electroencephalography (EEG) is used worldwide to classify epileptic seizures. Identifying an epileptic seizure through manual EEG analysis is a crucial but demanding task for the neurologist, as it takes a great deal of effort and time. The risk of human error is always high in EEG analysis, as acquiring signals requires manual intervention. Disease diagnosis using machine learning (ML) has been explored continuously since its inception. Moreover, where a large number of datasets have to be analyzed, ML is a boon for doctors. In this research paper, the authors propose two different ML models, i.e., logistic regression (LR) and Naïve Bayes (NB), to predict epileptic seizures based on general parameters. These two techniques are applied to the epileptic seizures recognition dataset, available on the UCI ML repository. The algorithms are implemented with an 80:20 train-test ratio (80% for training and 20% for testing), and the performance of the models was validated by 10-fold cross-validation. The study reports accuracies of 81.87% and 95.49% for LR and NB, respectively.
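
The evaluation protocol described above (an 80:20 split followed by 10-fold cross-validation) can be sketched in plain Python as index bookkeeping. This is an illustration under stated assumptions: the helper names, the fixed seed, the sample count of 100, and the choice to run the folds on the training portion are all hypothetical, since the abstract does not specify those details.

```python
import random


def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and split them into train and test lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k=10):
    """Yield (train, validation) index lists for k roughly equal folds."""
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Hypothetical dataset size; only the index bookkeeping is shown here.
train_idx, test_idx = train_test_split_indices(100)
splits = list(k_fold_indices(train_idx, k=10))
print(len(train_idx), len(test_idx), len(splits))
```

A classifier would be fitted on each `train` list and scored on the matching `validation` list, with the held-out `test_idx` used only once at the end.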

Keywords: epileptic seizure recognition, logistic regression, Naïve Bayes, machine learning

Procedia PDF Downloads 34
9906 Landslide Susceptibility Mapping: A Comparison between Logistic Regression and Multivariate Adaptive Regression Spline Models in the Municipality of Oudka, Northern of Morocco

Authors: S. Benchelha, H. C. Aoudjehane, M. Hakdaoui, R. El Hamdouni, H. Mansouri, T. Benchelha, M. Layelmam, M. Alaoui

Abstract:

Logistic regression (LR) and multivariate adaptive regression spline (MarSpline) models are applied and verified for landslide susceptibility mapping in Oudka, Morocco, using a geographical information system. From a spatial database containing data such as landslide mapping, topography, soil, hydrology, and lithology, eight factors related to landslides were calculated or extracted: elevation, slope, aspect, distance to streams, distance to roads, distance to faults, lithology map, and Normalized Difference Vegetation Index (NDVI). Using these factors, landslide susceptibility indexes were calculated by the two methods. Before the calculation, the database was divided into two parts, the first for training the model and the second for validation. The results of the landslide susceptibility analysis were verified using success and prediction rates to evaluate the quality of these probabilistic models. The verification showed that the MarSpline model is the better model, with a success rate (AUC = 0.963) and a prediction rate (AUC = 0.951) higher than those of the LR model (success rate AUC = 0.918, prediction rate AUC = 0.901).

Keywords: landslide susceptibility mapping, logistic regression, multivariate adaptive regression spline, Oudka, Taounate

Procedia PDF Downloads 160
9905 Islamic Equity Markets Response to Volatility of Bitcoin

Authors: Zakaria S. G. Hegazy, Walid M. A. Ahmed

Abstract:

This paper examines the dependence structure of Islamic stock markets on Bitcoin’s realized volatility components in bear, normal, and bull market periods. A quantile regression approach is employed, after adjusting raw returns with respect to a broad set of relevant global factors and accounting for structural breaks in the data. The results reveal that upside volatility tends to exert negative influences on Islamic developed-market returns more in bear than in bull market conditions, while downside volatility positively affects returns during bear and bull conditions. For emerging markets, we find that the upside (downside) component exerts lagged negative (positive) effects on returns in bear (all) market regimes. By and large, the dependence structures turn out to be asymmetric. Our evidence provides essential implications for investors.
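
Quantile regression, as used above, fits a chosen quantile of the response by minimizing the check (pinball) loss rather than squared error. The sketch below shows only that loss on toy numbers; the function name and the toy return series are assumptions for illustration and have nothing to do with the paper's data or its adjustment for global factors.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean check (pinball) loss at quantile level tau in (0, 1)."""
    if not 0 < tau < 1:
        raise ValueError("tau must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        residual = y - q
        # Under-predictions are weighted by tau, over-predictions by 1 - tau.
        total += tau * residual if residual >= 0 else (tau - 1) * residual
    return total / len(y_true)


# Toy returns (hypothetical). For each tau, find the best constant prediction
# among the observed values; low tau mimics bear regimes, high tau bull regimes.
returns = [-3.0, -1.0, 0.0, 1.0, 3.0]
best_by_tau = {}
for tau in (0.1, 0.5, 0.9):
    losses = {q: pinball_loss(returns, [q] * len(returns), tau) for q in returns}
    best_by_tau[tau] = min(losses, key=losses.get)
    print(f"tau={tau}: best constant prediction = {best_by_tau[tau]}")
```

The asymmetric weighting is what lets the same covariate have different estimated effects in the lower and upper tails of the return distribution.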

Keywords: cryptocurrency markets, bitcoin, realized volatility measures, asymmetry, quantile regression

Procedia PDF Downloads 153
9904 Partial Least Square Regression for High-Dimensional and High-Correlated Data

Authors: Mohammed Abdullah Alshahrani

Abstract:

The research focuses on investigating the use of partial least squares (PLS) methodology for addressing challenges associated with high-dimensional correlated data. Recent technological advancements have led to experiments producing data characterized by a large number of variables compared to observations, with substantial inter-variable correlations. Such data patterns are common in chemometrics, where near-infrared (NIR) spectrometer calibrations record chemical absorbance levels across hundreds of wavelengths, and in genomics, where thousands of genomic regions' copy number alterations (CNA) are recorded from cancer patients. PLS serves as a widely used method for analyzing high-dimensional data, functioning as a regression tool in chemometrics and a classification method in genomics. It handles data complexity by creating latent variables (components) from original variables. However, applying PLS can present challenges. The study investigates key areas to address these challenges, including unifying interpretations across three main PLS algorithms and exploring unusual negative shrinkage factors encountered during model fitting. The research presents an alternative approach to addressing the interpretation challenge of predictor weights associated with PLS. Sparse estimation of predictor weights is employed using a penalty function combining a lasso penalty for sparsity and a Cauchy distribution-based penalty to account for variable dependencies. The results demonstrate sparse and grouped weight estimates, aiding interpretation and prediction tasks in genomic data analysis. High-dimensional data scenarios, where predictors outnumber observations, are common in regression analysis applications. Ordinary least squares regression (OLS), the standard method, performs inadequately with high-dimensional and highly correlated data. 
Copy number alterations (CNA) in key genes have been linked to disease phenotypes, highlighting the importance of accurate classification of gene expression data in bioinformatics and biology using regularized methods like PLS for regression and classification.

Keywords: partial least square regression, genetics data, negative filter factors, high dimensional data, high correlated data

Procedia PDF Downloads 9
9903 Tectono-Stratigraphic Architecture, Depositional Systems and Salt Tectonics to Strike-Slip Faulting in Kribi-Campo-Cameroon Atlantic Margin with an Unsupervised Machine Learning Approach (West African Margin)

Authors: Joseph Bertrand Iboum Kissaaka, Charles Fonyuy Ngum Tchioben, Paul Gustave Fowe Kwetche, Jeannette Ngo Elogan Ntem, Joseph Binyet Njebakal, Ribert Yvan Makosso-Tchapi, François Mvondo Owono, Marie Joseph Ntamak-Nida

Abstract:

Located in the Gulf of Guinea, the Kribi-Campo sub-basin belongs to the Aptian salt basins along the West African Margin. In this paper, we investigated the tectono-stratigraphic architecture of the basin, focusing on the role of salt tectonics and strike-slip faults along the Kribi Fracture Zone with implications for reservoir prediction. Using 2D seismic data and well data interpreted through sequence stratigraphy with integrated seismic attributes analysis with Python Programming and unsupervised Machine Learning, at least six second-order sequences, indicating three main stages of tectono-stratigraphic evolution, were determined: pre-salt syn-rift, post-salt rift climax and post-rift stages. The pre-salt syn-rift stage with KTS1 tectonosequence (Barremian-Aptian) reveals a transform rifting along NE-SW transfer faults associated with N-S to NNE-SSW syn-rift longitudinal faults bounding a NW-SE half-graben filled with alluvial to lacustrine-fan delta deposits. The post-salt rift-climax stage (Lower to Upper Cretaceous) includes two second-order tectonosequences (KTS2 and KTS3) associated with the salt tectonics and Campo High uplift. During the rift-climax stage, the growth of salt diapirs developed syncline withdrawal basins filled by early forced regression, mid transgressive and late normal regressive systems tracts. The early rift climax underlines some fine-grained hangingwall fans or delta deposits and coarse-grained fans from the footwall of fault scarps. The post-rift stage (Paleogene to Neogene) contains at least three main tectonosequences KTS4, KTS5 and KTS6-7. The first one developed some turbiditic lobe complexes considered as mass transport complexes and feeder channel-lobe complexes cutting the unstable shelf edge of the Campo High. The last two developed submarine Channel Complexes associated with lobes towards the southern part and braided delta to tidal channels towards the northern part of the Kribi-Campo sub-basin. 
The reservoir distribution in the Kribi-Campo sub-basin reveals some channels, fan lobes reservoirs and stacked channels reaching up to the polygonal fault systems.

Keywords: tectono-stratigraphic architecture, Kribi-Campo sub-basin, machine learning, pre-salt sequences, post-salt sequences

Procedia PDF Downloads 16
9902 Robust Variable Selection Based on Schwarz Information Criterion for Linear Regression Models

Authors: Shokrya Saleh A. Alshqaq, Abdullah Ali H. Ahmadini

Abstract:

The Schwarz information criterion (SIC) is a popular tool for selecting the best variables in regression datasets. However, SIC is defined using an unbounded estimator, namely, the least-squares (LS) estimator, which is highly sensitive to outlying observations, especially bad leverage points. A method for robust variable selection based on SIC for linear regression models is thus needed. This study investigates the robustness properties of SIC by deriving its influence function and proposes a robust SIC based on the MM-estimation scale. The aim of this study is to produce a criterion that can effectively select accurate models in the presence of vertical outliers and high leverage points. The advantages of the proposed robust SIC are demonstrated through a simulation study and an analysis of a real dataset.
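
For orientation, the classical least-squares form of the criterion that the study sets out to robustify can be sketched as follows. For a Gaussian linear model it is commonly written, up to an additive constant, as n·ln(RSS/n) + k·ln(n). The function name, the candidate models, and the residual sums of squares below are hypothetical values chosen for illustration; this is not the robust MM-based criterion proposed by the authors.

```python
import math


def sic_gaussian(rss, n, k):
    """Classical SIC/BIC for a Gaussian linear model, up to a constant."""
    if rss <= 0 or n <= 0:
        raise ValueError("rss and n must be positive")
    return n * math.log(rss / n) + k * math.log(n)


# Hypothetical (RSS, number of parameters) pairs for three candidate models.
n_obs = 50
candidates = {"x1": (120.0, 2), "x1+x2": (80.0, 3), "x1+x2+x3": (79.0, 4)}
scores = {name: sic_gaussian(rss, n_obs, k) for name, (rss, k) in candidates.items()}
best_model = min(scores, key=scores.get)
print(best_model)
```

Because RSS enters directly, a single bad leverage point can change which model attains the minimum, which is the sensitivity the abstract refers to.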

Keywords: influence function, robust variable selection, robust regression, Schwarz information criterion

Procedia PDF Downloads 114
9901 Molecular Survey and Genetic Diversity of Bartonella henselae Strains Infecting Stray Cats from Algeria

Authors: Naouelle Azzag, Nadia Haddad, Benoit Durand, Elisabeth Petit, Ali Ammouche, Bruno Chomel, Henri J. Boulouis

Abstract:

Bartonella henselae is a small, Gram-negative, arthropod-borne bacterium that has been shown to cause multiple clinical manifestations in humans, including cat scratch disease, bacillary angiomatosis, endocarditis, and bacteremia. In this research, we report the results of a cross-sectional study of Bartonella henselae bacteremia in stray cats from Algiers. Whole blood of 227 stray cats from Algiers was tested for the presence of Bartonella species by culture, and the genetic diversity of B. henselae strains was evaluated by multi-locus variable number of tandem repeats assay (MLVA). Bacteremia prevalence was 17%, and only B. henselae was identified. Type I was the predominant type (64%). MLVA typing of 259 strains from 30 bacteremic cats revealed 52 different profiles. Fifty-one of these profiles were specific to Algerian cats/identified for the first time. 20/30 cats (67%) harbored 2 to 7 MLVA profiles simultaneously. The similarity of MLVA profiles obtained from the same cat, neighbor-joining clustering, and structure-neighbor clustering showed that such diversity likely results from two different mechanisms occurring either independently or simultaneously: independent infections and genetic drift from a primary strain.

Keywords: Bartonella, cat, MLVA, genetic

Procedia PDF Downloads 120
9900 A Comparison of Neural Network and DOE-Regression Analysis for Predicting Resource Consumption of Manufacturing Processes

Authors: Frank Kuebler, Rolf Steinhilper

Abstract:

Artificial neural networks (ANN) as well as Design of Experiments (DOE) based regression analysis (RA) are mainly used for modeling complex systems. Both methodologies are commonly applied in process and quality control of manufacturing processes. Because resource efficiency has become a critical concern for manufacturing companies, these models need to be extended to predict the resource consumption of manufacturing processes. This paper describes an approach that uses neural networks as well as DOE-based regression analysis for predicting the resource consumption of manufacturing processes and compares the achievable results based on an industrial case study of a turning process.

Keywords: artificial neural network, design of experiments, regression analysis, resource efficiency, manufacturing process

Procedia PDF Downloads 492
9899 Logistic Regression Model versus Additive Model for Recurrent Event Data

Authors: Entisar A. Elgmati

Abstract:

Recurrent infant diarrhea is studied using daily data collected in Salvador, Brazil over one year and three months. A logistic regression model is fitted instead of Aalen's additive model, using the same covariates that were used in the analysis with the additive model. The model gives results reasonably similar to those obtained using the additive regression model. In addition, the problem of estimated conditional probabilities not being constrained between zero and one in the additive model is solved here. Also, martingale residuals that have been used to judge the goodness of fit of the additive model are shown to be useful for judging the goodness of fit of the logistic model.

Keywords: additive model, cumulative probabilities, infant diarrhoea, recurrent event

Procedia PDF Downloads 605
9898 Required SNR for PPM in Downlink Gamma-Gamma Turbulence Channel

Authors: Selami Şahin

Abstract:

In this paper, the SNR required to achieve a sufficient bit error rate (BER) is investigated as a function of the zenith angle of the satellite relative to the ground station, using pulse position modulation (PPM). To obtain explicit results, all parameters, such as link distance, Rytov variance, scintillation index, wavelength, receiver aperture diameter, Fried's parameter, and zenith angle, have been taken into account. Results indicate that, once some parameters are fixed by the constraints of the system, the SNR values required to achieve the desired BER span a wide range as the zenith angle changes from small to large values. Therefore, to avoid relying on a high link margin, either the SNR should be adjusted according to the zenith angle or the link should be established within predetermined intervals of the zenith angle.

Keywords: Free-space optical communication, optical downlink channel, atmospheric turbulence, wireless optical communication

Procedia PDF Downloads 369
9897 Managing and Sustaining Strategic Relationships with Distributors by Electronic Agencies in Jordan

Authors: Abdallah Q. Bataineh

Abstract:

The electronics market in Jordan is facing extraordinary expectations from consumers, whose opinions are increasingly essential and have real influence on the preparation and execution of the overall marketing strategy by electronics agents. This research aimed to explore the effect of price volatility, follow-up, maintenance, and warranty policy on distributor retention. Focus groups, in-depth interviews, and a self-administered questionnaire were conducted with a total sample of 50 electronics distribution stores that have direct contact with and purchase frequently from electronic agencies. Using descriptive statistics and multiple regression tests, the main finding of this research is that price volatility, follow-up, maintenance, and warranty policy have an impact on distributor retention, and the key predictor variable was price volatility. Thus, the researcher recommended a flat-rate pricing strategy to ensure that all distributors sell the product on the same pricing base, regardless of the margin generated by each of them. Conclusions and future research are also discussed.

Keywords: distributors retention, follow-up, maintenance, price volatile, warranty policy

Procedia PDF Downloads 210
9896 Identifying Factors Contributing to the Spread of Lyme Disease: A Regression Analysis of Virginia’s Data

Authors: Fatemeh Valizadeh Gamchi, Edward L. Boone

Abstract:

This research focuses on Lyme disease, a widespread infectious condition in the United States caused by the bacterium Borrelia burgdorferi sensu stricto. It is critical to identify the environmental and economic elements that contribute to the spread of the disease. This study examined data from Virginia to identify a subset of explanatory variables significant for Lyme disease case numbers. To identify relevant variables and avoid overfitting, Poisson generalized linear regression and regularized regression methods with ridge, lasso, and elastic net penalties were employed. Cross-validation was performed to select the tuning parameters. The proposed methods can automatically identify relevant disease count covariates. The efficacy of the techniques was assessed using four criteria on three simulated datasets. Finally, using the Virginia Department of Health’s Lyme disease data set, the study successfully identified key factors, and the results were consistent with previous studies.
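
The three penalties named above differ only in how they charge for coefficient size. A minimal sketch follows, assuming the widely used parameterization in which a mixing weight `alpha` interpolates between ridge (`alpha = 0`) and lasso (`alpha = 1`); the function name, the coefficient vector, and the value of `lam` are hypothetical and are not taken from the study.

```python
def elastic_net_penalty(coefs, lam, alpha):
    """Return lam * (alpha * L1 norm + (1 - alpha) / 2 * squared L2 norm)."""
    if lam < 0 or not 0 <= alpha <= 1:
        raise ValueError("need lam >= 0 and 0 <= alpha <= 1")
    l1 = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return lam * (alpha * l1 + 0.5 * (1 - alpha) * l2_squared)


# Hypothetical coefficients for three covariates.
beta = [0.5, -2.0, 0.0]
lasso_term = elastic_net_penalty(beta, lam=0.1, alpha=1.0)
ridge_term = elastic_net_penalty(beta, lam=0.1, alpha=0.0)
mixed_term = elastic_net_penalty(beta, lam=0.1, alpha=0.5)
print(lasso_term, ridge_term, mixed_term)
```

In a penalized Poisson fit this term is added to the negative log-likelihood, and cross-validation is the usual way to choose `lam` (and, for the elastic net, `alpha`).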

Keywords: lyme disease, Poisson generalized linear model, ridge regression, lasso regression, elastic net regression

Procedia PDF Downloads 96
9895 An Analysis of the Effect of Sharia Financing and Work Relation Founding towards Non-Performing Financing in Islamic Banks in Indonesia

Authors: Muhammad Bahrul Ilmi

Abstract:

The purpose of this research is to analyze the influence of Islamic financing and work relation founding, simultaneously and partially, on non-performing financing in Islamic banks. This was quantitative field research using regression, conducted at Muammalat Indonesia Bank and Islamic Danamon Bank over 3 months. The population of this research was 15 account officers of Muammalat Indonesia Bank and Islamic Danamon Bank in Surakarta, Indonesia. The data collection techniques were documentation, questionnaire, literature study, and interview. The regression analysis shows that Islamic financing and work relation founding simultaneously have a positive and significant effect on the non-performing financing of the two Islamic banks, with a probability value of 0.003, which is less than 0.05, and an F value of 9.584. The regression of non-performing financing on Islamic financing shows a significant effect, supported by the multiple linear regression analysis with a probability value of 0.001, which is less than 0.05. The regression of non-performing financing on work relation founding shows an insignificant effect, as shown by the multiple linear regression analysis with a probability value of 0.161, which is greater than 0.05.

Keywords: Syariah financing, work relation founding, non-performing financing (NPF), Islamic Bank

Procedia PDF Downloads 404
9894 A Kolmogorov-Smirnov Type Goodness-Of-Fit Test of Multinomial Logistic Regression Model in Case-Control Studies

Authors: Chen Li-Ching

Abstract:

The multinomial logistic regression model is widely used for inferring the relationship between risk factors and a disease with multiple categories. This study proposes a Kolmogorov-Smirnov type test statistic, based on the discrepancy between the nonparametric maximum likelihood estimator and the semiparametric maximum likelihood estimator of the cumulative distribution function, to assess the adequacy of the multinomial logistic regression model for case-control data. A bootstrap procedure is presented to calculate the critical value of the proposed test statistic. Empirical type I error rates and powers of the test are evaluated in simulation studies. Some examples illustrate the implementation of the test.
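
The core of any Kolmogorov-Smirnov type statistic is the largest vertical gap between two estimates of a cumulative distribution function. The sketch below computes that gap for two empirical CDFs as a stand-in; it is an assumption-laden illustration, since the study compares a nonparametric and a semiparametric maximum likelihood estimator and obtains critical values by bootstrap, neither of which is reproduced here. The function names and samples are hypothetical.

```python
def empirical_cdf(sample, x):
    """Fraction of sample values that are less than or equal to x."""
    return sum(1 for v in sample if v <= x) / len(sample)


def ks_distance(sample_a, sample_b):
    """Largest absolute gap between the two empirical CDFs."""
    # Both CDFs are step functions, so the supremum is attained at a data point.
    grid = sorted(set(sample_a) | set(sample_b))
    return max(
        abs(empirical_cdf(sample_a, x) - empirical_cdf(sample_b, x)) for x in grid
    )


# Hypothetical samples for illustration.
gap = ks_distance([1.0, 2.0, 3.0, 4.0], [2.5, 3.5, 4.5, 5.5])
print(gap)
```

A large gap relative to its bootstrap distribution would count as evidence against the fitted model.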

Keywords: case-control studies, goodness-of-fit test, Kolmogorov-Smirnov test, multinomial logistic regression

Procedia PDF Downloads 427
9893 A Study on Inference from Distance Variables in Hedonic Regression

Authors: Yan Wang, Yasushi Asami, Yukio Sadahiro

Abstract:

In urban areas, several landmarks may affect housing prices and rents, so hedonic analysis should employ a distance variable for each landmark. Unfortunately, the estimated effects of distances to landmarks on housing prices are generally not consistent with the true price. These distance variables may cause magnitude errors in regression, pointing to a problem of spatial multicollinearity. In this paper, we provide some approaches for obtaining samples with less bias, and a method for locating the specific sampling area to avoid the multicollinearity problem in the case of two specific landmarks.

Keywords: landmarks, hedonic regression, distance variables, collinearity, multicollinearity

Procedia PDF Downloads 426
9892 Forecasting of Grape Juice Flavor by Using Support Vector Regression

Authors: Ren-Jieh Kuo, Chun-Shou Huang

Abstract:

Research on juice flavor forecasting has become more important in China. Due to the fast economic growth in China, many different kinds of juices have been introduced to the market. If a beverage company understands its customers’ preferences well, the juice can be made more attractive. Thus, this study introduces the basic theory and computing process of grape juice flavor forecasting based on support vector regression (SVR). Applying SVR, BPN, and LR to forecast the flavor of grape juice on real data, the results show that SVR is the most suitable and effective in terms of predictive performance.

Keywords: flavor forecasting, artificial neural networks, Support Vector Regression, China

Procedia PDF Downloads 453
9891 Estimation of a Finite Population Mean under Random Non Response Using Improved Nadaraya and Watson Kernel Weights

Authors: Nelson Bii, Christopher Ouma, John Odhiambo

Abstract:

Non-response is a potential source of errors in sample surveys. It introduces bias and large variance in the estimation of finite population parameters. Regression models have been recognized as one of the techniques for reducing bias and variance due to random non-response using auxiliary data. In this study, it is assumed that random non-response occurs in the survey variable in the second stage of cluster sampling, and that full auxiliary information is available throughout. Auxiliary information is used at the estimation stage via a regression model to address the problem of random non-response. In particular, the auxiliary information is used via an improved Nadaraya-Watson kernel regression technique to compensate for random non-response. The asymptotic bias and mean squared error of the proposed estimator are derived. In addition, a simulation study indicates that the proposed estimator has smaller bias and smaller mean squared error than existing estimators of the finite population mean. The proposed estimator is also shown to have tighter confidence interval lengths at a 95% coverage rate. The results obtained in this study are useful, for instance, in choosing efficient estimators of the finite population mean in demographic sample surveys.
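
The idea of compensating for non-response with kernel-weighted auxiliary data can be illustrated with a minimal sketch. It uses the standard Nadaraya-Watson estimator with a Gaussian kernel, not the improved weights proposed in the paper, and it ignores the two-stage cluster design. The function names, the `None` marker for non-respondents, and the bandwidth are assumptions made for illustration only.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel (an illustrative choice, not the paper's)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted average of ys at the point x0."""
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights are zero; increase the bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total


def imputed_mean(aux, survey, bandwidth):
    """Mean of the survey variable after filling non-respondents.

    aux    : auxiliary values, known for every unit
    survey : survey values, with None marking a non-respondent (hypothetical
             convention for this sketch)
    """
    resp_x = [x for x, y in zip(aux, survey) if y is not None]
    resp_y = [y for y in survey if y is not None]
    filled = [
        y if y is not None else nadaraya_watson(x, resp_x, resp_y, bandwidth)
        for x, y in zip(aux, survey)
    ]
    return sum(filled) / len(filled)
```

For example, `imputed_mean([1.0, 2.0, 3.0], [1.0, None, 3.0], 1.0)` fills the missing middle unit with the kernel-weighted average of its two equally distant neighbours before averaging.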

Keywords: mean squared error, random non-response, two-stage cluster sampling, confidence interval lengths

Procedia PDF Downloads 107
9890 Estimation of Coefficients of Ridge and Principal Components Regressions with Multicollinear Data

Authors: Rajeshwar Singh

Abstract:

Multicollinearity is common when several explanatory variables are handled simultaneously, because they often exhibit a linear relationship among themselves. It creates a serious problem in understanding the impact of the explanatory variables on the dependent variable, and the method of least squares then gives inexact estimates. In this case, it is advisable to detect its presence before proceeding further. Ridge regression reduces the degree of its occurrence, while principal components regression gives good estimates in this situation. This paper discusses the well-known techniques of ridge and principal components regression and applies both to obtain estimates of the coefficients. In addition, this paper discusses the conflicting claims on the discovery of ridge regression based on available documents.
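
As a rough illustration of how ridge regression stabilizes coefficient estimates, the sketch below solves the textbook ridge normal equations (X'X + λI)β = X'y for two predictors in closed form. It assumes centered data with no intercept, and the function name and the penalty argument `lam` are hypothetical; it is not the estimation procedure used in the paper.

```python
def ridge_two_predictors(x1, x2, y, lam):
    """Ridge estimates for y ~ b1*x1 + b2*x2 (centered data, no intercept).

    lam = 0 gives ordinary least squares; a larger lam shrinks the
    coefficients and keeps the 2x2 system well conditioned even when
    x1 and x2 are nearly collinear.
    """
    s11 = sum(a * a for a in x1) + lam
    s22 = sum(b * b for b in x2) + lam
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    if det == 0.0:
        raise ValueError("singular system: predictors are exactly collinear and lam is 0")
    b1 = (s22 * t1 - s12 * t2) / det
    b2 = (s11 * t2 - s12 * t1) / det
    return b1, b2
```

With exactly collinear predictors the least squares system (`lam = 0`) is singular and the function raises an error, whereas any positive `lam` yields a finite solution, which is the behaviour the abstract alludes to.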

Keywords: conflicting claim on credit of discovery of ridge regression, multicollinearity, principal components and ridge regressions, variance inflation factor

Procedia PDF Downloads 375
9889 The Impact of System Cascading Collapse and Transmission Line Outages to the Transfer Capability Assessment

Authors: Nur Ashida Salim, Muhammad Murtadha Othman, Ismail Musirin, Mohd Salleh Serwan

Abstract:

Uncertainty in system operating conditions is one of the causes that may lead to instability of a transmission system. This impairs the ability of the transmission system to transmit electrical power efficiently between areas. For that reason, accurate assessment of the Transmission Reliability Margin (TRM) is essential to ensure effective power transfer between areas when system uncertainties occur. The power transfer is also called the Available Transfer Capability (ATC), which is the information required by utilities and marketers to initiate the selling and buying of electric energy. This paper proposes a computationally effective approach to estimate TRM and ATC by considering the uncertainties of system cascading collapse and transmission line outages, which are identified as the main causes of power system instability. According to the results obtained, the proposed method is essential for transmission providers and could help power marketers and planning sectors in operating and reserving transmission services based on the calculated ATC.

Keywords: system cascading collapse, transmission line outages, transmission reliability margin, available transfer capability

Procedia PDF Downloads 396
9888 Binary Logistic Regression Model in Predicting the Employability of Senior High School Graduates

Authors: Cromwell F. Gopo, Joy L. Picar

Abstract:

This study aimed to predict the employability of senior high school graduates for S.Y. 2018-2019 in the Davao del Norte Division through a quantitative research design using the descriptive status and predictive approaches with the following parameters: gender, school type, academics, academic award recipient, skills, values, and strand. The respondents of the study were the 33 secondary schools offering senior high school programs, identified through simple random sampling. This yielded 1,530 cases of graduates’ secondary data, which were analyzed using frequency, percentage, mean, standard deviation, and binary logistic regression. Results showed that the majority of the senior high school graduates who came from large schools were females. Further, less than half of these graduates received any academic award in any semester. In general, the graduates’ performance in academics, skills, and values was proficient. Moreover, less than half of the graduates were not employed. Those who were employed were contractual, casual, or part-time workers, dominated by GAS graduates. Further, the predictors of employability were gender and the Information and Communications Technology (ICT) strand, while the remaining variables did not add significantly to the model. The null hypothesis was rejected, as the coefficients of the predictors in the binary logistic regression equation did not take the value of 0. After utilizing the model, it was concluded that Technical-Vocational-Livelihood (TVL) graduates except ICT had greater estimates of employability.

Keywords: employability, senior high school graduates, Davao del Norte, Philippines

Procedia PDF Downloads 110
9887 Natural Regeneration Dynamics in Different Microsites within Gaps of Different Sizes

Authors: M. E. Hammond, R. Pokorny

Abstract:

Not much research has gone into the dynamics of natural regeneration of tree species in tropical forest regions. This study investigates the impact of gap sizes and light distribution on forest floors on the regeneration of Celtis mildbraedii (CEM), Nesogordonia papaverifera (NES) and Terminalia superba (TES). These are economically important tree species selected for their different shade tolerance attributes. The spatial distribution patterns and the potential regeneration competition index (RCI) among species were assessed using the height to diameter ratio (HDR). Gaps ranging in size from 287 to 971 m² were selected at the Bia Tano forest reserve, a tropical moist semi-deciduous forest in Ghana. Four (4) transects in the cardinal directions were constructed from the center of each gap. Along each transect, ten 1 m² sampling zones at 2 m spacing were established. Then, three gap microsites (labeled ecozones I, II, III) were delineated within these sampling zones based on the varying temporal light distribution on the forest floor. Data on height (H), root collar diameter (RCD) and regeneration census were gathered from each of the ten sampling zones. CEM and NES seedlings (≤ 50 cm) and saplings (≥ 51 cm) were present in all ecozones of the large gaps. Seedlings of TES were observed in all ecozones of large and small gaps. Regression analysis showed a significant negative linear relationship between the independent growth variables RCD and H and the dependent HDR index in ecozones II and III of both large and small gaps. There was a correlation between RCD and H in both large and small gaps. Strong regeneration competition was observed among species in ecozone II in large (df 2, F=3.6, p=0.035) and small (df 2, F=17.9, p=0.000) gaps. These results contribute to the understanding of the natural regeneration of different species with regard to light regimes on forest floors.

Keywords: Celtis mildbraedii, ecozones, gaps, Nesogordonia papaverifera, regeneration, Terminalia superba

Procedia PDF Downloads 108
9886 Estimate of Maximum Expected Intensity of One-Half-Wave Lines Dancing

Authors: A. Bekbaev, M. Dzhamanbaev, R. Abitaeva, A. Karbozova, G. Nabyeva

Abstract:

In this paper, the regression dependence of dancing intensity on wind speed and span length was established using statistical data from multi-year observations of line wire dancing accumulated by the power systems of Kazakhstan and the Russian Federation. The lower and upper limits of the equation parameters were estimated, as well as the adequacy of the regression model. The constructed model will be used in research on dancing phenomena, for developing methods and means of protection against dancing, and for zoning territories by line wire dancing.

Keywords: power lines, line wire dancing, dancing intensity, regression equation, dancing area intensity

Procedia PDF Downloads 287
9885 The Comparison of the Reliability Margin Measure for the Different Concepts in the Slope Analysis

Authors: Filip Dodigovic, Kreso Ivandic, Damir Stuhec, S. Strelec

Abstract:

A general analysis of the differences between the former and new design concepts in geotechnical engineering is carried out. The application of the new regulations requires real adaptation of the computation principles of limit states, i.e. a uniform way of analyzing engineering tasks. Generally, it is not possible to unambiguously match the limit state verification procedure with those in construction engineering. The reasons are the inability to maintain full consistency of the common probabilistic basis of the analysis, the fundamental effect of material properties on the value of actions, and the influence of actions on resistance. Consequently, it is not possible to apply separate factorization with partial coefficients, as in construction engineering. For slope stability analysis, the problems of design procedures that use limit states are detailed in relation to the concept of allowable stresses. The safety margins in the slope stability analysis are quantified for both approaches. When the stability of a slope is analyzed by strict application of the forms adopted in the new regulations for significant external temporary and/or seismic actions, the equivalent margin of safety is increased. The consequence is the emergence of more conservative solutions.

Keywords: allowable pressure, Eurocode 7, limit states, slope stability

Procedia PDF Downloads 311
9884 Incorporating Anomaly Detection in a Digital Twin Scenario Using Symbolic Regression

Authors: Manuel Alves, Angelica Reis, Armindo Lobo, Valdemar Leiras

Abstract:

In Industry 4.0, it is common to have a lot of sensor data. In this deluge of data, hints of possible problems are difficult to spot. The digital twin concept aims to help answer this problem, but it is mainly used as a monitoring tool to handle the visualisation of data. Failure detection is of paramount importance in any industry, and it consumes a lot of resources. Any improvement in this regard is of tangible value to the organisation. The aim of this paper is to add the ability to forecast test failures, curtailing detection times. To achieve this, several anomaly detection algorithms (Isolation Forest, One-Class SVM and an auto-encoder) were compared with a symbolic regression approach. For the symbolic regression, the PySR library was used. The first results show that this approach is valid and can be added to the tools available in this context as a low-resource anomaly detection method since, after training, the only requirement is the calculation of a polynomial, a useful feature in the digital twin context.

Keywords: anomaly detection, digital twin, industry 4.0, symbolic regression

Procedia PDF Downloads 89
9883 Knowledge, Attitude and Practices of Contraception among the Married Women of Reproductive Age Group in Selected Wards of Dharan Sub-Metropolitan City

Authors: Pratima Thapa

Abstract:

Background: Awareness of family planning and proper utilization of contraceptives is an important indicator for reducing maternal and neonatal mortality and morbidity. It also plays an important role in promoting the reproductive health of women in an underdeveloped country like ours. Objective: To assess knowledge, attitude and practices of contraception among married women of reproductive age group in selected wards of Dharan Sub-Metropolitan City. Materials and methods: A cross-sectional descriptive study was conducted among 209 married women of reproductive age. Simple random sampling was used to select the wards, population proportionate sampling to select the sample numbers from each ward, and purposive sampling to select each sample. A semi-structured questionnaire was used to collect data. Descriptive and inferential statistics were used to interpret the data, considering a p-value of 0.05. Results: The mean ± SD age of the respondents was 30.01 ± 8.12 years. The majority (92.3%) had ever heard of contraception. The most commonly known method was Inj. Depo (92.7%). Mass media (85.8%) was the major source of information. The mean percentage score of knowledge was 45.23%. Less than half (45%) had adequate knowledge. The majority (90.4%) had a positive attitude. Only 64.6% were currently using contraceptives. Misbeliefs and fear of side effects were the main reasons for not using contraceptives. Education, occupation, and total family income were associated with knowledge regarding contraceptives. Binary logistic regression showed that distance to the nearest health facility (OR=7.97, p<0.01), education (OR=0.24, p<0.05) and age group (OR=0.03, p<0.01) were significant correlates of attitude.
Regarding practice, the likelihood of being a current user of contraceptives increased significantly with being literate (OR=5.97, p<0.01), having a nuclear family (OR=4.96, p<0.01), living within a 30-minute walk of the nearest health facility (OR=3.34, p<0.05), women’s participation in decision making regarding household and fertility choices (OR=5.23, p<0.01) and the husband’s support for using contraceptives (OR=9.05, p<0.01). Significant positive correlations between knowledge and attitude, knowledge and practice, and attitude and practice were observed. Conclusion: The results of the study indicate that there is a need to increase awareness programs in order to improve the knowledge and practice of contraception. The positive correlation suggests that better knowledge can lead to a positive attitude and hence good practice. Further, projects aiming to improve counselling about contraceptives, their side effects, and the positive effects that outweigh the negative aspects should be implemented appropriately.
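
To make the reported odds ratios easier to read, the following minimal sketch shows how an odds ratio rescales a baseline probability. The baseline probability is a purely hypothetical input, since the abstract does not report one, and the function name is an assumption for illustration.

```python
def apply_odds_ratio(baseline_prob, odds_ratio):
    """Probability obtained when the baseline odds are multiplied by an odds ratio."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must lie strictly between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    # Convert probability to odds, scale by the odds ratio, convert back.
    odds = baseline_prob / (1.0 - baseline_prob) * odds_ratio
    return odds / (1.0 + odds)
```

For instance, under a hypothetical baseline probability of 0.5, `apply_odds_ratio(0.5, 3.0)` gives 0.75. An odds ratio above 1 raises the probability and one below 1 lowers it, but the size of the change depends on the baseline.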

Keywords: attitude, contraceptives, knowledge, practice

Procedia PDF Downloads 227