Search results for: regression test
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 11840

11450 Working Memory Capacity and Motivation in Japanese English as a Foreign Language Learners' Speaking Skills

Authors: Akiko Kondo

Abstract:

Although the effects of working memory capacity on second/foreign language speaking skills have been researched in depth, few studies have focused on Japanese learners of English as a foreign language (EFL) compared with speakers of other (Indo-European) languages, and the sample sizes of the relevant Japanese studies have been relatively small. Furthermore, comparing the effects of working memory capacity with those of motivation, another frequently researched individual factor, on L2 speaking skills would add to the scholarly literature in second language acquisition research. Therefore, the purposes of this study were to investigate whether working memory capacity and motivation have significant relationships with Japanese EFL learners’ speaking skills and to determine the degree to which each contributes to their English speaking skills. One hundred and ten Japanese EFL students aged 18 to 26 years participated in this study. All of them are native Japanese speakers and have learned English as a foreign language for 6 to 15 years. They completed the Versant English speaking test, which has been widely used to measure non-native speakers’ English speaking skills, two types of working memory tests (the L1-based backward digit span test and the L1-based listening span test), and the language learning motivation survey. The researcher designed the working memory tests and the motivation survey. To investigate the relationship between the variables (English speaking skills, working memory capacity, and language learning motivation), a correlation analysis was conducted, which showed that L2 speaking test scores were significantly related to both working memory capacity and language learning motivation, although the correlations were weak.
Furthermore, a multiple regression analysis was performed, with L2 speaking skills as the dependent variable and working memory capacity and language learning motivation as the independent variables. The results showed that working memory capacity and motivation significantly explained the variance in L2 speaking skills and that L2 motivation had a slightly larger effect on L2 speaking skills than working memory capacity did. Although this study has several limitations, the results could contribute to generalizing the effects of individual differences, such as working memory and motivation, on L2 learning in the literature.

Keywords: individual differences, motivation, speaking skills, working memory

Procedia PDF Downloads 165
11449 Comparative Analysis of Predictive Models for Customer Churn Prediction in the Telecommunication Industry

Authors: Deepika Christopher, Garima Anand

Abstract:

To determine the best model for churn prediction in the telecom industry, this paper compares 11 machine learning algorithms, namely Logistic Regression, Support Vector Machine, Random Forest, Decision Tree, XGBoost, LightGBM, CatBoost, AdaBoost, Extra Trees, Deep Neural Network, and Hybrid Model (MLPClassifier). It also aims to pinpoint the top three factors that lead to customer churn and conducts customer segmentation to identify vulnerable groups. According to the data, the Logistic Regression model performs the best, with an F1 score of 0.6215, 81.76% accuracy, 68.95% precision, and 56.57% recall. The top three attributes that cause churn are found to be tenure, Internet Service Fiber optic, and Internet Service DSL, and the three best-performing models in this article are Logistic Regression, Deep Neural Network, and AdaBoost. The K-means algorithm is applied to establish and analyze four different customer clusters. This study has effectively identified customers who are at risk of churn, and its results may be used to develop and execute strategies that lower customer attrition.
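As a quick consistency check on the figures reported above, the F1 score can be recomputed from the stated precision and recall. This is a minimal sketch, not the authors' code; the function name `f1_score` is chosen here for illustration only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision (68.95%) and recall (56.57%) as reported in the abstract.
recomputed_f1 = f1_score(0.6895, 0.5657)
print(round(recomputed_f1, 4))
```

The recomputed value agrees with the reported F1 score of 0.6215 to four decimal places.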

Keywords: attrition, retention, predictive modeling, customer segmentation, telecommunications

Procedia PDF Downloads 58
11448 An Investigation of Item Bias in Free Boarding and Scholarship Examination in Turkey

Authors: Yeşim Özer Özkan, Fatma Büşra Fincan

Abstract:

Bias is a systematic tendency, arising from observation, the design process, or any of the specifications, to favor one side; in other words, it is a departure from objectivity. Students who come from different social groups but have the same ability are expected to answer test items in the same way. This expectation becomes especially important in student selection and placement examinations. For example, test items should not favor only the male or only the female group. The aim of this research is to investigate whether the 2014 free boarding and scholarship examination contains item bias with respect to gender. Data on 5th, 6th, and 7th grade secondary education students were obtained from the General Directorate of Measurement, Evaluation and Examination Services in Turkey. Twenty percent of the 192090 students were selected at random, and the exam papers of the resulting 38418 students were examined for item bias. The Winsteps 3.8.1 package was used to detect gender bias according to the Rasch model. The mathematics items were examined for gender bias. First, confirmatory factor analysis was applied to the twenty-five mathematics questions. After that, NFI, TLI, CFI, IFI, RFI, GFI, RMSEA, and SRMR were examined to assess validity and goodness of fit. The modification index values of the confirmatory factor analysis were examined, and some items were then omitted because they caused problems of model fit and conceptual coherence. The analysis shows that the 2014 free boarding and scholarship examination does not contain biased items. This indicates that the examination does not work in favor of or against students of either gender.

Keywords: gender, item bias, placement test, Rasch model

Procedia PDF Downloads 231
11447 Comparing Student Performance on Paper-Based versus Computer-Based Formats of Standardized Tests

Authors: Jin Koo

Abstract:

During the coronavirus pandemic, demand for computer-based tests (CBT) increased further, and CBT has now become an important test mode. The main purpose of this study is to investigate the comparability of student scores obtained from computer-based and paper-based formats of a standardized test in the two subject areas of reading and mathematics. This study also investigates whether there is an interaction effect between test mode (CBT or paper-based test, PBT) and gender/ability level in each subject area. The test used in this study is a multiple-choice standardized test for students in grades 8-11. For this study, data were collected during four test administrations: 2015-16, 2017-18, and 2020-21. This research used a one-factor between-subjects ANOVA to compare the PBT and CBT groups’ test means for each subject area (reading and mathematics). In addition, two-factor between-subjects ANOVAs were conducted to investigate examinee characteristics: gender (male and female), ethnicity (African-American, Asian, Hispanic, multi-racial, and White), and ability level (low, average, and high-ability groups). The author found that students’ test scores in the two subject areas varied across CBT and PBT by gender and ability level, meaning that gender, ethnicity, and ability level were related to the score difference. These results will be discussed in light of current testing systems. In addition, this study’s results will inform school teachers and test developers about the possible influence that gender, ethnicity, and ability level have on a student’s score depending on whether the student takes the CBT or PBT.

Keywords: ability level, computer-based, gender, paper-based, test

Procedia PDF Downloads 101
11446 The Effect of Region of Residence on Fertility in Nigeria

Authors: Motlatso Rampedi

Abstract:

Nigeria has the fifth highest Total Fertility Rate in Sub-Saharan Africa at 5.5 children born to a woman. Some demographic research has found that there is an association between region of residence and fertility in Nigeria, with the Northern regions associated with high fertility and the Southern regions associated with low fertility levels. Even so, little attention has been given to understanding the effect of region of residence on fertility. Instead, a significant amount of research has explored the proximate determinants of fertility in Nigeria. The objective of this study was to test whether there is an association between region of residence and fertility in Nigeria. Using a sample of 38 948 women aged 15-49 derived from the 2013 NDHS and the Poisson regression model for analysis, the study found that region of residence has a significant effect on fertility. Moreover, the ANOVA test showed that there is a socioeconomic disparity by region of residence in Nigeria. The Northern regions of Nigeria were shown to have higher levels of fertility than the Southern regions. Therefore, while proximate determinants of fertility and socio-demographic characteristics of women are important, region of residence remains one of the fundamental determinants of fertility. Given these findings, it is recommended that the government should not exhaust its resources or aim its fertility reduction policies and programmes at entire populations but should target the specific regions where fertility is highest.

Keywords: high fertility, region, socioeconomic disparity, socio-demographic characteristics

Procedia PDF Downloads 309
11445 Research on the Spatio-Temporal Evolution Pattern of Traffic Dominance in Shaanxi Province

Authors: Leng Jian-Wei, Wang Lai-Jun, Li Ye

Abstract:

In order to measure and analyze the transportation situation within the counties of Shaanxi province over a certain period of time and to promote the province's future transportation planning and development, this paper proposes a reasonable layout plan and compares model rationality. The study uses the entropy weight method to measure the transportation advantages of 107 counties in Shaanxi province in 2013 and 2021 along three dimensions: road network density, trunk line influence, and location advantage. It applies spatial autocorrelation analysis to the spatial layout and development trend of county-level transportation, and runs an ordinary least squares (OLS) regression on transportation impact factors and other influencing factors. The paper also compares the goodness of fit of the geographically weighted regression (GWR) model with that of the OLS model. The results show that spatially, the transportation advantages of Shaanxi province generally decrease from the Weihe Plain to the surrounding areas and mainly exhibit high-high clustering. Temporally, transportation advantages show an overall upward trend, and the spatial imbalance gradually decreases. People's travel demands have changed to some extent, and the demand for rapid transportation has increased overall. The goodness of fit of the GWR model of transportation advantages is 0.74, which is higher than the OLS model's 0.64. Based on the evolution of transportation advantages, this trend is predicted to continue for some time. Expanding the layout of rapid transportation can effectively enhance the transportation advantages of Shaanxi province. When analyzing spatial heterogeneity, geographic factors should be considered to establish a more reliable model.
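The entropy weight method mentioned above can be sketched as follows. This is a minimal illustration of the standard textbook formulation, not the authors' implementation. It assumes the indicator values are already scaled, strictly non-negative, and that each column has a positive sum. The function name `entropy_weights` and the toy matrix are hypothetical.

```python
import math


def entropy_weights(matrix):
    """Return one weight per column (indicator) of a non-negative matrix.

    Rows are units (for example counties); columns are indicators.
    Indicators whose values vary more across units receive larger weights.
    """
    n = len(matrix)
    m = len(matrix[0])
    divergences = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        # Share of each unit in indicator j; 0 * ln(0) is treated as 0.
        shares = [x / total for x in column]
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    total_divergence = sum(divergences)
    return [d / total_divergence for d in divergences]


# Hypothetical scaled scores for 4 counties on 3 indicators
# (road network density, trunk line influence, location advantage).
toy = [
    [0.8, 0.5, 0.9],
    [0.6, 0.5, 0.1],
    [0.7, 0.5, 0.2],
    [0.9, 0.5, 0.1],
]
weights = entropy_weights(toy)
print(weights)
```

In this toy matrix the second indicator is identical for every county, so it carries no discriminating information and its weight is essentially zero, while the most unevenly distributed indicator receives the largest weight.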

Keywords: traffic dominance, GWR model, spatial autocorrelation analysis, temporal and spatial evolution

Procedia PDF Downloads 89
11444 Local Interpretable Model-agnostic Explanations (LIME) Approach to Email Spam Detection

Authors: Rohini Hariharan, Yazhini R., Blessy Maria Mathew

Abstract:

The task of detecting email spam is a very important one in the era of digital technology that needs effective ways of curbing unwanted messages. This paper presents an approach aimed at making email spam categorization algorithms transparent, reliable and more trustworthy by incorporating Local Interpretable Model-agnostic Explanations (LIME). Our technique assists in providing interpretable explanations for specific classifications of emails to help users understand the decision-making process by the model. In this study, we developed a complete pipeline that incorporates LIME into the spam classification framework and allows creating simplified, interpretable models tailored to individual emails. LIME identifies influential terms, pointing out key elements that drive classification results, thus reducing opacity inherent in conventional machine learning models. Additionally, we suggest a visualization scheme for displaying keywords that will improve understanding of categorization decisions by users. We test our method on a diverse email dataset and compare its performance with various baseline models, such as Gaussian Naive Bayes, Multinomial Naive Bayes, Bernoulli Naive Bayes, Support Vector Classifier, K-Nearest Neighbors, Decision Tree, and Logistic Regression. Our testing results show that our model surpasses all other models, achieving an accuracy of 96.59% and a precision of 99.12%.

Keywords: text classification, LIME (local interpretable model-agnostic explanations), stemming, tokenization, logistic regression.

Procedia PDF Downloads 48
11443 Effect of Genuine Missing Data Imputation on Prediction of Urinary Incontinence

Authors: Suzan Arslanturk, Mohammad-Reza Siadat, Theophilus Ogunyemi, Ananias Diokno

Abstract:

Missing data is a common challenge in statistical analyses of most clinical survey datasets. A variety of methods have been developed to enable analysis of survey data with missing values. Imputation is the most commonly used of these methods. However, in order to minimize the bias introduced by imputation, one must choose the right imputation technique and apply it to the correct type of missing data. In this paper, we have identified different types of missing values: missing data due to skip pattern (SPMD), undetermined missing data (UMD), and genuine missing data (GMD), and applied rough set imputation to only the GMD portion of the missing data. We have used rough set imputation to evaluate the effect of such imputation on prediction by generating several simulation datasets based on an existing epidemiological dataset (MESA). To measure how well each dataset lends itself to the prediction model (logistic regression), we have used p-values from the Wald test. To evaluate the accuracy of the prediction, we have considered the width of the 95% confidence interval for the probability of incontinence. Both imputed and non-imputed simulation datasets were fit to the prediction model, and both turned out to be significant (p-value < 0.05). However, the Wald score shows a better fit for the imputed than for the non-imputed datasets (28.7 vs. 23.4). The average confidence interval width decreased by 10.4% when the imputed dataset was used, meaning higher precision. The results show that using the rough set method for missing data imputation on GMD data improves the predictive capability of the logistic regression. Further studies are required to generalize this conclusion to other clinical survey datasets.

Keywords: rough set, imputation, clinical survey data simulation, genuine missing data, predictive index

Procedia PDF Downloads 169
11442 Manual Dexterity in Patients with Motor Neuron Disease

Authors: Magdalena Barbara Kaziuk, Ilona Hubner, Jacek Hubner, Slawomir Kroczka

Abstract:

Background: Motor neuron disease is a progressive neurodegenerative disease causing malfunction. Irrespective of its form and onset, the disease always leads to a worsening of the quality of life, with patients usually depending on the family. Materials and methods: The study included 20 persons (5 females, 15 males, aged 65.5 ± 20 years) with a clinically certain or probable diagnosis of motor neuron disease. Patients were examined three times over a period of six months. The diagnosis was established based on the El Escorial criteria. Manual dexterity was assessed using Rene Zazzo’s card test and Mira Stambak’s test of shading with lines. Results: All patients achieved unsatisfactory results in Rene Zazzo’s card test, and most of the patients (60%) did so in Mira Stambak’s test of shading with lines. Significantly higher test results were achieved for Rene Zazzo’s test and lower test results for Mira Stambak’s test in consecutive measurements. Conclusions: Impairment of manual dexterity is already present at the moment of diagnosis and grows significantly during the course of the disease. The quality of life of MND patients deteriorates gradually as a result of the malfunction.

Keywords: manual dexterity, motor neuron disease, quality of life, malfunction

Procedia PDF Downloads 342
11441 The Relationship among Perceived Risk, Product Knowledge, Brand Image and the Insurance Purchase Intention of Taiwanese Working Holiday Youths

Authors: Wan-Ling Chang, Hsiu-Ju Huang, Jui-Hsiu Chang

Abstract:

In 2004, the Ministry of Foreign Affairs Taiwan launched ‘An Arrangement on Working Holiday Scheme’ with 15 countries including New Zealand, Japan, Canada, Germany, South Korea, Britain, Australia and others. The aim of the scheme is to allow young people to work and study English or other foreign languages. Each year, 30,000 Taiwanese youths apply to participate in the working holiday schemes. However, frequent accidents could cause huge medical expenses and post-delivery fee, which are usually unaffordable for most families. Therefore, this study explored the relationship among perceived risk toward working holiday, insurance product knowledge, brand image and insurance purchase intention for Taiwanese youths who plan to apply for a working holiday. A survey questionnaire was distributed for data collection. A total of 316 questionnaires were collected for analysis. Data were analyzed and hypotheses tested using descriptive statistics, independent samples t-test, one-way ANOVA, correlation analysis, regression analysis and hierarchical regression. The results of this research indicate that perceived risk has a negative influence on insurance purchase intention. In contrast, product knowledge and brand image have a positive influence on insurance purchase intention. Based on these results, practical implications are discussed for insurance companies developing future marketing plans.

Keywords: insurance product knowledges, insurance purchase intention, perceived risk, working holiday

Procedia PDF Downloads 253
11440 Practical Methods for Automatic MC/DC Test Cases Generation of Boolean Expressions

Authors: Sekou Kangoye, Alexis Todoskoff, Mihaela Barreau

Abstract:

Modified Condition/Decision Coverage (MC/DC) is a structural coverage criterion that aims to prove that all conditions involved in a Boolean expression can influence the result of that expression. In the automotive context, MC/DC is highly recommended and even required for testing most security and safety applications. However, because complex Boolean expressions are often embedded in those applications, generating a set of MC/DC-compliant test cases for any of these expressions is a nontrivial task and can be time-consuming for testers. In this paper we present an approach to automatically generate MC/DC test cases for any Boolean expression. We introduce novel techniques, essentially based on binary trees, to quickly and optimally generate MC/DC test cases for the expressions. Thus, the approach can be used to reduce testers' manual testing effort.
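To make the MC/DC requirement concrete, the sketch below enumerates, for each condition, the pairs of test vectors that differ only in that condition and flip the decision outcome (often called independence pairs, in the unique-cause form of MC/DC). It is a brute-force illustration that grows exponentially with the number of conditions. It is not the binary-tree technique proposed by the authors. The function name and the example decision `a and (b or c)` are hypothetical.

```python
from itertools import product


def mcdc_independence_pairs(decision, n_conditions):
    """Map each condition index to the list of (vector, flipped_vector) pairs
    that differ only in that condition and yield different decision outcomes."""
    pairs = {i: [] for i in range(n_conditions)}
    for vector in product([False, True], repeat=n_conditions):
        for i in range(n_conditions):
            if vector[i]:
                continue  # visit each pair once, starting from its False side
            flipped = vector[:i] + (True,) + vector[i + 1:]
            if decision(*vector) != decision(*flipped):
                pairs[i].append((vector, flipped))
    return pairs


def example_decision(a, b, c):
    # Hypothetical decision used only for illustration.
    return a and (b or c)


pairs = mcdc_independence_pairs(example_decision, 3)
for index, found in pairs.items():
    print(index, found)
```

An MC/DC-compliant test set must contain at least one such pair for every condition. Choosing pairs that share test vectors is what keeps the final set small.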

Keywords: binary trees, MC/DC, test case generation, nontrivial task

Procedia PDF Downloads 451
11439 Effect of Drying on the Concrete Structures

Authors: A. Brahma

Abstract:

The drying of hydraulic materials is unavoidable and leads to significant spontaneous deformations. In this study, we show that it is possible to describe the drying shrinkage of high-performance concrete by a simple expression. A multiple regression model was developed to predict the drying shrinkage of high-performance concrete. The proposed model was assessed with a set of statistical tests. The model takes into consideration the main parameters of mix preparation and storage. There was very good agreement between the drying shrinkage predicted by the multiple regression model and the experimental results. The developed model adjusts easily to all hydraulic concrete types.

Keywords: hydraulic concretes, drying, shrinkage, prediction, modeling

Procedia PDF Downloads 368
11438 A Study on the Method of Accelerated Life Test to Electric Rotating System

Authors: Youn-Hwan Kim, Jae-Won Moon, Hae-Joong Kim

Abstract:

This paper presents a study of accelerated life test methods for electrical rotating systems. In recent years, alongside the efficiency of motors and generators, there has been a growing need for research on their life expectancy. It is considered impossible to calculate the acceleration coefficient by increasing the rotational load or temperature load as the acceleration stress in the motor system because the temperature of the copper exceeds the wire thermal class rating. In this paper, the accelerated life test methods for electrical rotating systems are classified according to the application. This paper describes the development of the test procedure for the highly accelerated life test (HALT) of the 100kW permanent magnet synchronous motor (PMSM) of an electric vehicle. Finally, it explains how to select acceleration loads such as vibration, temperature, and bearing load for the accelerated life test.

Keywords: acceleration coefficient, electric vehicle motor, HALT, life expectancy, vibration

Procedia PDF Downloads 328
11437 An Investigation of the Determinants of Discount Rate Manipulation in Swedish and Finnish Listed Companies

Authors: Fredrik Hartwig, Peter Lindberg

Abstract:

In 2004, the International Accounting Standards Board (IASB) issued new accounting standards for impairment testing of goodwill. IFRS 3 Business Combinations and IAS 36 Impairment of Assets prohibited amortization of acquired goodwill and instead required companies to test goodwill for impairment annually or more often if necessary. The goodwill impairment test is based on management’s judgement and estimations, making the impairment-only-approach subjective and unreliable. Management can use the discretion opportunistically by managing goodwill impairments. The IASB’s remedy to the reliability problem has been to demand transparent financial reports. IAS 36 paragraph 134 requires detailed disclosures regarding the impairment test in order to make potentially unreasonable assumptions and estimations visible. The disclosure requirements should thus (in theory) make it more difficult for management to ‘choose’ assumptions and estimations that suit an agenda. Whether the requirement to disclose detailed disclosures regarding the impairment test leads to less opportunism is however an empirical question. This work analyses whether one of the required disclosures in IAS 36 paragraph 134, the reported discount rate, differs from an independently estimated risk-adjusted discount rate. Estimates of discount rates that are either lower or higher than the independently estimated discount rate are here defined as opportunism. In the former case - i.e. when the reported discount rate is lower - the objective may be to avoid profit reducing impairment charges. In the latter case - i.e. when the reported discount rate is higher - the objective may be to reduce profits or take ‘big baths’. 
This paper differs in one important respect from previous similar studies, the majority of which are based on purely descriptive statistics; we use multivariate regression analysis to analyze what factors affect deviations between disclosed discount rates and independently estimated discount rates. The sample consists of Swedish and Finnish listed companies. Swedish and Finnish listed companies are analysed since the accounting oversight bodies differ between the two countries. The results show that discount rate deviations in Swedish and Finnish listed companies are significantly related to accounting oversight, size and industry but not financial risk, business risk and goodwill intensity.

Keywords: discount rate, manipulation, goodwill impairment test, disclosures

Procedia PDF Downloads 131
11436 Influence of Parameters of Modeling and Data Distribution for Optimal Condition on Locally Weighted Projection Regression Method

Authors: Farhad Asadi, Mohammad Javad Mollakazemi, Aref Ghafouri

Abstract:

Recent research in neural network science and neuroscience on modeling complex time series data and statistical learning has focused mostly on learning from high-dimensional input spaces and signals. Local linear models are a strong choice for modeling local nonlinearity in data series. Locally weighted projection regression is a flexible and powerful algorithm for nonlinear approximation in high-dimensional signal spaces. In this paper, different learning scenarios for one- and two-dimensional data series with different distributions are investigated in simulation, and noise is added to the data to produce different disordered distributions in the time series and to evaluate the algorithm's local prediction of nonlinearity. The performance of the algorithm is then simulated, and its sensitivity to the data distribution, when the data are widely distributed or few in number, is explained together with the influence of the important local validity parameter under different data distributions.

Keywords: local nonlinear estimation, LWPR algorithm, online training method, locally weighted projection regression method

Procedia PDF Downloads 503
11435 Exploration and Evaluation of the Effect of Multiple Countermeasures on Road Safety

Authors: Atheer Al-Nuaimi, Harry Evdorides

Abstract:

Every day many people die or are disabled or injured on roads around the world, which calls for more specific treatments of transportation safety issues. The International Road Assessment Programme (iRAP) model is one of the comprehensive road safety models, accounting for many factors that affect road safety in a cost-effective way in low and middle income countries. In the iRAP model, road safety is divided into five star ratings, from 1 star (the lowest level) to 5 stars (the highest level). These star ratings are based on a star rating score, which is calculated by the iRAP methodology from road attributes, traffic volumes and operating speeds. The outcome of the iRAP methodology is a set of treatments that can be used to improve road safety and reduce the number of fatalities and serious injuries (FSI). These countermeasures can be used separately as a single countermeasure or combined as multiple countermeasures for a location. It is generally agreed that a countermeasure loses some of its effectiveness when it is used in combination with other countermeasures. That is, the crash reduction estimates of individual countermeasures cannot simply be added together. The iRAP methodology uses multiple countermeasure adjustment factors to predict the reduction in the effectiveness of road safety countermeasures when more than one countermeasure is chosen. These multiple countermeasure correction factors are computed for every 100-meter segment and for every accident type. However, the limitations of this methodology include a probable overestimation of the predicted crash reduction. This study aims to adjust this correction factor by developing new models to calculate the effect of using multiple countermeasures on the number of fatalities for a location or an entire road. Regression models have been used to establish relationships between crash frequencies and the factors that affect their rates.
Multiple linear regression, negative binomial regression, and Poisson regression techniques were used to develop models that can address the effectiveness of using multiple countermeasures. Analyses conducted using The R Project for Statistical Computing showed that a model developed with the negative binomial regression technique could give more reliable predictions of the number of fatalities after the implementation of multiple road safety countermeasures than the iRAP model. The results also showed that the negative binomial regression approach gives more precise results than the multiple linear and Poisson regression techniques because of overdispersion and standard error issues.
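The overdispersion issue mentioned above can be illustrated with a simple diagnostic. A Poisson model assumes the variance of the counts equals their mean, whereas a negative binomial model allows variance = mean + mean²/k. The sketch below computes the variance-to-mean ratio and a method-of-moments estimate of k. The crash counts are invented for illustration and are not data from the study, and the function names are hypothetical.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean; about 1 for Poisson-like data."""
    return variance(counts) / mean(counts)


def nb_size_moment_estimate(counts):
    """Method-of-moments estimate of the negative binomial size parameter k,
    from variance = mean + mean**2 / k. Only defined when variance > mean."""
    m = mean(counts)
    v = variance(counts)
    if v <= m:
        raise ValueError("no overdispersion: variance does not exceed the mean")
    return m * m / (v - m)


# Hypothetical fatality counts on ten road segments.
crash_counts = [0, 1, 0, 3, 0, 7, 1, 0, 12, 2]
index = dispersion_index(crash_counts)
size_k = nb_size_moment_estimate(crash_counts)
print(index, size_k)
```

A ratio well above 1, as in this toy sample, is the usual sign that Poisson standard errors would be too small and that a negative binomial specification is worth considering.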

Keywords: international road assessment program, negative binomial, road multiple countermeasures, road safety

Procedia PDF Downloads 241
11434 Identification of Rainfall Trends in Qatar

Authors: Abdullah Al Mamoon, Ataur Rahman

Abstract:

Due to climate change, future rainfall will change at many locations on earth; however, the spatial and temporal patterns of this change are not easy to predict. One approach to predicting such future changes is to examine the trends in the historical rainfall data for a given region and use the identified trends to make future predictions. For this, a statistical trend test is commonly applied to the historical data. This paper examines the trends of daily extreme rainfall events from 30 rain gauges located in the State of Qatar. Rainfall data covering 1962 to 2011 were used in the analysis. A combination of four non-parametric and parametric tests was applied to identify trends at the 10%, 5%, and 1% significance levels. These tests are the Mann-Kendall (MK), Spearman’s Rho (SR), Linear Regression (LR) and CUSUM tests. These tests showed both positive and negative trends throughout the country. Only eight stations showed a positive (upward) trend, and these trends were not statistically significant. In contrast, significant negative (downward) trends were found at the 5% and 10% levels of significance at six stations. The MK, SR and LR tests exhibited very similar results. This finding has important implications for the derivation/upgrade of design rainfall for Qatar, which will affect the design and operation of future urban drainage infrastructure in Qatar.
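For readers unfamiliar with the Mann-Kendall test named above, a minimal sketch of the statistic follows. It uses the standard formulation without the correction for tied values, which is a simplification. The rainfall series is invented for illustration and is not data from the study, and the function names are hypothetical.

```python
import math


def mann_kendall(series):
    """Return (S, Z) for the Mann-Kendall trend test, assuming no tied values."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each later-minus-earlier pair
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z


def two_sided_p_value(z):
    """Two-sided p-value under the standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical annual maximum daily rainfall (mm), strictly decreasing.
rainfall = [52.0, 48.5, 47.1, 45.0, 41.2, 39.9, 36.4, 33.0, 30.8, 27.5]
s_stat, z_stat = mann_kendall(rainfall)
p_value = two_sided_p_value(z_stat)
print(s_stat, z_stat, p_value)
```

A negative S with a small p-value indicates a downward trend. Comparing the p-value with 0.10, 0.05, and 0.01 mirrors the three significance levels used in the abstract.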

Keywords: trends, extreme rainfall, daily rainfall, Mann-Kendall test, climate change, Qatar

Procedia PDF Downloads 564
11433 A Brief Study about Nonparametric Adherence Tests

Authors: Vinicius R. Domingues, Luan C. S. M. Ozelim

Abstract:

Statistical study has become indispensable in various fields of knowledge. Geotechnics is no different: the study of probabilistic and statistical methods has gained ground there because of its use in characterizing the uncertainties inherent in soil properties. One situation that engineers constantly face is the definition of a probability distribution that adequately represents the sampled data. To be able to discard bad distributions, goodness-of-fit tests are necessary. In this paper, three non-parametric goodness-of-fit tests are applied to a computationally generated data set to test its fit to a series of known distributions. It is shown that the use of the normal distribution does not always provide satisfactory results regarding the physical and behavioral representation of the modeled parameters.
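As an illustration of one of the tests listed in the keywords, the sketch below computes the one-sample Kolmogorov-Smirnov statistic, the largest gap between the empirical distribution of a sample and a candidate distribution. It is a minimal example, not the authors' procedure. The samples are generated here with a fixed seed, and the function names are hypothetical.

```python
import math
import random


def normal_cdf(x, mean=0.0, std=1.0):
    """Cumulative distribution function of the normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def ks_statistic(sample, cdf):
    """One-sample Kolmogorov-Smirnov statistic D = sup |F_n(x) - F(x)|."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


random.seed(0)
normal_sample = [random.gauss(0.0, 1.0) for _ in range(200)]
skewed_sample = [random.expovariate(1.0) for _ in range(200)]

# Compare each sample with a normal distribution having the true mean and
# standard deviation of its generator (0 and 1 for the normal sample,
# 1 and 1 for the exponential sample).
d_normal = ks_statistic(normal_sample, normal_cdf)
d_skewed = ks_statistic(skewed_sample, lambda x: normal_cdf(x, 1.0, 1.0))
print(d_normal, d_skewed)
```

The skewed sample gives a large statistic against the normal candidate because a normal distribution with mean 1 and standard deviation 1 places about 16% of its mass below zero, where an exponential sample has no observations. This is the kind of mismatch a goodness-of-fit test is meant to expose.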

Keywords: Kolmogorov-Smirnov test, Anderson-Darling test, Cramer-Von-Mises test, nonparametric adherence tests

Procedia PDF Downloads 446
11432 Best Resource Recommendation for a Stochastic Process

Authors: Likewin Thomas, M. V. Manoj Kumar, B. Annappa

Abstract:

The aim of this study was to develop an Artificial Neural Network's recommendation model for an online process using the complexity of load, performance, and average servicing time of the resources. Here, the proposed model investigates the resource performance using the stochastic gradient descent method for learning a ranking function. A probabilistic cost function is implemented to identify the optimal θ values (load) on each resource. Based on this result, the resource suitable for performing the currently executing task is recommended. The test result for the CoSeLoG project is presented with an accuracy of 72.856%.

Keywords: ADALINE, neural network, gradient descent, process mining, resource behaviour, polynomial regression model

Procedia PDF Downloads 391
11431 Rd-PLS Regression: From the Analysis of Two Blocks of Variables to Path Modeling

Authors: E. Tchandao Mangamana, V. Cariou, E. Vigneau, R. Glele Kakai, E. M. Qannari

Abstract:

A new definition of a latent variable associated with a dataset makes it possible to propose variants of the PLS2 regression and the multi-block PLS (MB-PLS). We shall refer to these variants as Rd-PLS regression and Rd-MB-PLS respectively because they are inspired by both Redundancy analysis and PLS regression. Usually, a latent variable t associated with a dataset Z is defined as a linear combination of the variables of Z with the constraint that the length of the loading weights vector equals 1. Formally, t=Zw with ‖w‖=1. Denoting by Z' the transpose of Z, we define herein, a latent variable by t=ZZ’q with the constraint that the auxiliary variable q has a norm equal to 1. This new definition of a latent variable entails that, as previously, t is a linear combination of the variables in Z and, in addition, the loading vector w=Z’q is constrained to be a linear combination of the rows of Z. More importantly, t could be interpreted as a kind of projection of the auxiliary variable q onto the space generated by the variables in Z, since it is collinear to the first PLS1 component of q onto Z. Consider the situation in which we aim to predict a dataset Y from another dataset X. These two datasets relate to the same individuals and are assumed to be centered. Let us consider a latent variable u=YY’q to which we associate the variable t= XX’YY’q. Rd-PLS consists in seeking q (and therefore u and t) so that the covariance between t and u is maximum. The solution to this problem is straightforward and consists in setting q to the eigenvector of YY’XX’YY’ associated with the largest eigenvalue. For the determination of higher order components, we deflate X and Y with respect to the latent variable t. Extending Rd-PLS to the context of multi-block data is relatively easy. Starting from a latent variable u=YY’q, we consider its ‘projection’ on the space generated by the variables of each block Xk (k=1, ..., K) namely, tk= XkXk'YY’q. 
Thereafter, Rd-MB-PLS seeks q in order to maximize the average of the covariances of u with tk (k=1, ..., K). The solution to this problem is given by q, eigenvector of YY’XX’YY’, where X is the dataset obtained by horizontally merging datasets Xk (k=1, ..., K). For the determination of latent variables of order higher than 1, we use a deflation of Y and Xk with respect to the variable t= XX’YY’q. In the same vein, extending Rd-MB-PLS to the path modeling setting is straightforward. Methods are illustrated on the basis of case studies and performance of Rd-PLS and Rd-MB-PLS in terms of prediction is compared to that of PLS2 and MB-PLS.
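
The step from "maximize the covariance between t and u" to "take the dominant eigenvector" can be spelled out as follows. This is a sketch that follows the definitions in the abstract; here ⊤ denotes the transpose written as ' in the abstract, and the proportionality constant (1/(n−1) or 1/n, depending on the covariance convention) is an assumption that does not affect the maximizer.

```latex
\begin{aligned}
u &= YY^{\top}q, \qquad t = XX^{\top}YY^{\top}q, \qquad \|q\| = 1,\\
\operatorname{cov}(t,u) &\propto t^{\top}u
  = q^{\top}\,YY^{\top}XX^{\top}YY^{\top}\,q
  = q^{\top}Mq, \qquad M = YY^{\top}XX^{\top}YY^{\top},\\
\max_{\|q\|=1} q^{\top}Mq
  &\;\Longrightarrow\; Mq = \lambda_{\max}\,q,\\
\frac{1}{K}\sum_{k=1}^{K}\operatorname{cov}(t_k,u)
  &\propto \frac{1}{K}\,q^{\top}\,YY^{\top}\Big(\sum_{k=1}^{K}X_kX_k^{\top}\Big)YY^{\top}\,q
  = \frac{1}{K}\,q^{\top}\,YY^{\top}XX^{\top}YY^{\top}\,q .
\end{aligned}
```

Because M is symmetric and positive semi-definite, the constrained maximum is attained at the eigenvector associated with its largest eigenvalue. The last line uses the fact that horizontally merging the blocks X_k gives XX⊤ = Σ X_k X_k⊤, which is why the multi-block problem has the same form of solution.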

Keywords: multiblock data analysis, partial least squares regression, path modeling, redundancy analysis

Procedia PDF Downloads 147
11430 Good Governance Complementary to Corruption Abatement: A Cross-Country Analysis

Authors: Kamal Ray, Tapati Bhattacharya

Abstract:

Private use of public office for private gain could be a tentative definition of corruption, and the most distasteful aspect of corruption is not that it exists, nor that it is pervasive, but that it is socially acknowledged in the global economy, especially in the developing nations. We attempted to assess the interrelationship between the Corruption Perception Index (CPI) and the principal components of governance indicators as per the World Bank, namely Control of Corruption (CC), rule of law (RL), regulatory quality (RQ) and government effectiveness (GE). Our empirical investigation concentrates upon the degree of reflection of governance indicators upon the CPI in order to single out the most powerful corruption-generating indicator in the selected countries. We have collected time series data on the above governance indicators (CC, RL, RQ and GE) for the selected eleven countries from 1996 to 2012 from the World Bank data set. The countries are USA, UK, France, Germany, Greece, China, India, Japan, Thailand, Brazil, and South Africa. The Corruption Perception Index (CPI) of the countries mentioned above for the period 1996 to 2012 is also collected. A simple line diagram of the time series data on CPI is used for a quick view of the relative positions of the trend lines of the different nations. The correlation coefficient is enough to make a preliminary assessment of the degree and direction of association between the variables, as we have numerical data on the governance indicators of the selected countries. The Granger Causality Test (1969) is used to investigate causal relationships (cause and effect) between the variables. We do not test for stationarity, as the time series are short. Linear regression is used to quantify the change in the explained variable due to a change in the explanatory variable in respect of governance vis-à-vis corruption.
A bilateral positive causal link between CPI and CC is noticed in the UK: the index value of CC increases by 1.59 units as CPI increases by one unit, and CPI rises by 0.39 units as CC rises by one unit; hence it has a multiplier effect so far as reduction in corruption is concerned in the UK. GE strongly contributes to the reduction of corruption in the UK. In France, RQ is observed to be the most powerful indicator in reducing corruption, whereas in Japan it is the second most powerful indicator after GE. A governance indicator like GE plays an important role in pushing down corruption in Japan. In China and India, GE is a proactive as well as influential indicator in curbing corruption. The inverse relationship between RL and CPI in Thailand indicates that the ongoing machinery related to RL is not complementary to the reduction of corruption. The state machinery of CC in South Africa is highly relevant to reducing the volume of corruption. In Greece, the variations of CPI positively influence the variations of CC, and an indicator like GE is effective in controlling corruption as reflected by CPI. All the governance indicators selected so far have failed to arrest state-level corruption in the USA, Germany and Brazil.

Keywords: corruption perception index, governance indicators, granger causality test, regression

Procedia PDF Downloads 306
11429 Partial Least Square Regression for High-Dimentional and High-Correlated Data

Authors: Mohammed Abdullah Alshahrani

Abstract:

The research focuses on investigating the use of partial least squares (PLS) methodology for addressing challenges associated with high-dimensional correlated data. Recent technological advancements have led to experiments producing data characterized by a large number of variables compared to observations, with substantial inter-variable correlations. Such data patterns are common in chemometrics, where near-infrared (NIR) spectrometer calibrations record chemical absorbance levels across hundreds of wavelengths, and in genomics, where thousands of genomic regions' copy number alterations (CNA) are recorded from cancer patients. PLS serves as a widely used method for analyzing high-dimensional data, functioning as a regression tool in chemometrics and a classification method in genomics. It handles data complexity by creating latent variables (components) from original variables. However, applying PLS can present challenges. The study investigates key areas to address these challenges, including unifying interpretations across three main PLS algorithms and exploring unusual negative shrinkage factors encountered during model fitting. The research presents an alternative approach to addressing the interpretation challenge of predictor weights associated with PLS. Sparse estimation of predictor weights is employed using a penalty function combining a lasso penalty for sparsity and a Cauchy distribution-based penalty to account for variable dependencies. The results demonstrate sparse and grouped weight estimates, aiding interpretation and prediction tasks in genomic data analysis. High-dimensional data scenarios, where predictors outnumber observations, are common in regression analysis applications. Ordinary least squares regression (OLS), the standard method, performs inadequately with high-dimensional and highly correlated data. 
Copy number alterations (CNA) in key genes have been linked to disease phenotypes, highlighting the importance of accurate classification of gene expression data in bioinformatics and biology using regularized methods like PLS for regression and classification.

Keywords: partial least square regression, genetics data, negative filter factors, high dimensional data, high correlated data

Procedia PDF Downloads 51
11428 Applying the Regression Technique for Prediction of the Acute Heart Attack

Authors: Paria Soleimani, Arezoo Neshati

Abstract:

Myocardial infarction is one of the leading causes of death in the world. Some of these deaths occur even before the patient reaches the hospital. Myocardial infarction occurs as a result of impaired blood supply. Because most of these deaths are due to coronary artery disease, awareness of the warning signs of a heart attack is essential. Some heart attacks are sudden and intense, but most of them start slowly, with mild pain or discomfort, so early detection and successful treatment of these symptoms are vital to saving patients. Therefore, the importance and usefulness of designing a system to assist physicians in the early diagnosis of acute heart attacks is obvious. The purpose of this study is to determine how well a predictive model would perform based only on patient-reportable clinical history factors, without using diagnostic tests or physical exams. This type of prediction model might have application outside of the hospital setting to give accurate advice to patients and influence them to seek care in appropriate situations. For this purpose, data were collected on 711 heart patients in hospitals in Iran. Twenty-eight clinical factors that can be reported by patients were studied. Three logistic regression models were built on the basis of the 28 features to predict the risk of heart attacks. The best logistic regression model in terms of performance had a C-index of 0.955 and an accuracy of 94.9%. The variables severe chest pain, back pain, cold sweats, shortness of breath, nausea, and vomiting were selected as the main features.
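
For readers unfamiliar with the two performance figures quoted above, the sketch below shows one common way to compute a C-index (for a binary outcome it equals the area under the ROC curve) and a thresholded accuracy. The predicted risks, the outcomes, the 0.5 threshold, and the function names are all hypothetical; nothing here reproduces the study's model or data.

```python
def c_index(risks, outcomes):
    """Concordance index for binary outcomes (equal to the ROC AUC).

    Fraction of (event, non-event) pairs in which the event case received
    the higher predicted risk; tied risks count as half.
    """
    concordant = 0.0
    pairs = 0
    for r_pos, y_pos in zip(risks, outcomes):
        if y_pos != 1:
            continue
        for r_neg, y_neg in zip(risks, outcomes):
            if y_neg != 0:
                continue
            pairs += 1
            if r_pos > r_neg:
                concordant += 1.0
            elif r_pos == r_neg:
                concordant += 0.5
    if pairs == 0:
        raise ValueError("need at least one event and one non-event")
    return concordant / pairs


def accuracy(risks, outcomes, threshold=0.5):
    """Share of cases classified correctly at the given risk threshold."""
    hits = sum((r >= threshold) == bool(y) for r, y in zip(risks, outcomes))
    return hits / len(outcomes)


# Hypothetical predicted probabilities and observed outcomes (1 = heart attack).
predicted_risk = [0.92, 0.80, 0.65, 0.40, 0.30, 0.55, 0.10, 0.05]
had_attack = [1, 1, 1, 1, 0, 0, 0, 0]

print(f"C-index  = {c_index(predicted_risk, had_attack):.4f}")
print(f"Accuracy = {accuracy(predicted_risk, had_attack):.2f}")
```

The C-index measures how well the model ranks patients by risk regardless of threshold, whereas accuracy depends on the cut-off chosen, which is why the two figures are usually reported together.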

Keywords: Coronary heart disease, Acute heart attacks, Prediction, Logistic regression

Procedia PDF Downloads 450
11427 Fuzzy Logic Classification Approach for Exponential Data Set in Health Care System for Predication of Future Data

Authors: Manish Pandey, Gurinderjit Kaur, Meenu Talwar, Sachin Chauhan, Jagbir Gill

Abstract:

Health-care management systems are of great interest because they provide simple and fast management of all aspects relating to a patient, not necessarily medical. Moreover, there are more and more cases of pathologies in which diagnosis and treatment can only be carried out using medical imaging techniques. With ever-increasing prevalence, medical images are directly acquired in or converted into digital form for their storage as well as subsequent retrieval and processing. Data mining is the process of extracting information from large data sets using algorithms and techniques drawn from the fields of statistics, machine learning and database management systems. Forecasting is a prediction of what will occur in the future, and it is an uncertain process. Owing to this uncertainty, the accuracy of a forecast is as important as the outcome predicted by forecasting the independent variables. A forecast control must be used to establish whether the accuracy of the forecast is within satisfactory limits. Fuzzy regression methods have commonly been used to develop consumer preference models that correlate the engineering characteristics with consumer preferences regarding a new product; the consumer preference models offer a platform whereby product developers can decide the engineering characteristics in order to satisfy consumer preferences before developing the product. Recent research shows that these fuzzy regression methods are commonly used to model customer preferences. We propose testing the strength of an exponential regression model against a regression toward the mean model.

Keywords: health-care management systems, fuzzy regression, data mining, forecasting, fuzzy membership function

Procedia PDF Downloads 280
11426 Glucose Monitoring System Using Machine Learning Algorithms

Authors: Sangeeta Palekar, Neeraj Rangwani, Akash Poddar, Jayu Kalambe

Abstract:

Bio-medical analysis is an indispensable procedure for identifying health-related diseases like diabetes. Monitoring the glucose level in our body regularly helps us identify hyperglycemia and hypoglycemia, which can cause severe medical problems like nerve damage or kidney diseases. This paper presents a method for predicting the glucose concentration in blood samples using image processing and machine learning algorithms. The glucose solution is prepared by the glucose oxidase (GOD) and peroxidase (POD) method. An experimental database is generated based on the colorimetric technique. The image of the glucose solution is captured by the Raspberry Pi camera and analyzed using image processing by extracting the RGB, HSV, and LUX color space values. Regression algorithms like multiple linear regression, decision tree, random forest, and XGBoost were used to predict the unknown glucose concentration. The multiple linear regression algorithm predicts the results with 97% accuracy. The image processing and machine learning-based approach reduces the hardware complexities of existing platforms.
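
The colour-feature extraction step described above could look roughly like the following Python sketch, which averages the RGB values of an image region and converts the average to HSV with the standard-library `colorsys` module. The function name, the pixel values, and the choice to average before converting are assumptions for illustration; the LUX colour space mentioned in the abstract is not covered, and the regression stage is omitted.

```python
import colorsys


def mean_color_features(pixels):
    """Average 8-bit RGB pixels and append the HSV form of that average.

    `pixels` is a list of (R, G, B) tuples with values in 0..255.
    All returned values are scaled to the range 0..1.
    """
    n = len(pixels)
    r = sum(p[0] for p in pixels) / n / 255.0
    g = sum(p[1] for p in pixels) / n / 255.0
    b = sum(p[2] for p in pixels) / n / 255.0
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return {"r": r, "g": g, "b": b, "h": h, "s": s, "v": v}


# Hypothetical pixels sampled from the image of a coloured glucose solution.
region = [(200, 120, 150), (198, 118, 152), (202, 122, 149), (199, 121, 151)]
features = mean_color_features(region)
print({name: round(value, 3) for name, value in features.items()})
```

A feature vector of this kind, computed for samples of known concentration, is the sort of input a multiple linear regression or tree-based regressor would be trained on.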

Keywords: artificial intelligence glucose detection, glucose oxidase, peroxidase, image processing, machine learning

Procedia PDF Downloads 206
11425 Statistical Analysis of the Impact of Maritime Transport Gross Domestic Product (GDP) on Nigeria’s Economy

Authors: Kehinde Peter Oyeduntan, Kayode Oshinubi

Abstract:

Nigeria is referred to as the ‘Giant of Africa’ due to its high population, land mass and large economy. However, it still trails far behind many smaller economies on the continent in terms of maritime operations. Since the maritime industry is the spark plug for national growth, because it houses the most crucial infrastructure that generates wealth for a nation, it is worrisome that a nation with six seaports lags in maritime activities. In this research, we have studied how the Gross Domestic Product (GDP) of maritime transport influences the Nigerian economy. To do this, we applied Simple Linear Regression (SLR), Support Vector Machine (SVM), Polynomial Regression Model (PRM), Generalized Additive Model (GAM) and Generalized Linear Mixed Model (GLMM) to model the relationship between the nation’s Total GDP (TGDP) and the Maritime Transport GDP (MGDP) using time series data covering 20 years. The result showed that the MGDP is statistically significant to the Nigerian economy. Amongst the statistical tools applied, the PRM of order 4 describes the relationship better than the other methods. The recommendations presented in this study will guide policy makers and help improve the economy of Nigeria in terms of its GDP.

Keywords: maritime transport, economy, GDP, regression, port

Procedia PDF Downloads 155
11424 A Study of the Weld Properties of Inconel 625 Based on Nb Content

Authors: JongWon Han, NoHoon Kim, HyoIk Ahn, HaeWoo Lee

Abstract:

In this study, shielded metal arc welding was performed with Nb contents of 2.24 wt%, 3.25 wt%, and 4.26 wt%. The microstructure was observed using scanning electron microscopy/energy dispersive X-ray spectroscopy (SEM/EDS) and showed the development of a columnar dendrite structure in the specimen having the least Nb content. The hardness test confirmed that hardness decreases with decreasing Nb content. From electron backscatter diffraction (EBSD) analysis, the largest grain size was found in the specimen with an Nb content of 2.24 wt%. The potentiodynamic polarization test was carried out to determine the pitting corrosion resistance; there was no significant difference in the pitting corrosion resistance with increasing Nb content. To evaluate the degree of sensitization to intergranular corrosion, the Double Loop Electrochemical Potentiodynamic Reactivation (DL-EPR) test was conducted. A similar degree of sensitization was found in the two specimens with higher Nb content, while a relatively high degree of sensitization was found in the specimen with an Nb content of 2.24 wt%.

Keywords: inconel 625, Nb content, potentiodynamic test, DL-EPR test

Procedia PDF Downloads 311
11423 The Effect of Accounting Conservatism on Cost of Capital: A Quantile Regression Approach for MENA Countries

Authors: Maha Zouaoui Khalifa, Hakim Ben Othman, Hussaney Khaled

Abstract:

Prior empirical studies have investigated the economic consequences of accounting conservatism by examining its impact on the cost of equity capital (COEC). However, findings are not conclusive. We assume that the inconsistent results on this association may be attributed to the regression models used in data analysis. To address this issue, we re-examine the effect of different dimensions of accounting conservatism, unconditional conservatism (U_CONS) and conditional conservatism (C_CONS), on the COEC for a sample of listed firms from Middle East and North Africa (MENA) countries, applying the quantile regression (QR) approach developed by Koenker and Bassett (1978). While the classical ordinary least squares (OLS) method is widely used in empirical accounting research, it may produce inefficient and biased estimates in the case of departures from normality or long-tailed error distributions. The QR method is more powerful than OLS in handling this kind of problem. It allows the coefficients on the independent variables to shift across the distribution of the dependent variable, whereas the OLS method only estimates the conditional mean effects on a response variable. We find, as predicted, that U_CONS has a significant positive effect on the COEC, whereas C_CONS has a negative impact. Findings also suggest that the effect of the two dimensions of accounting conservatism differs considerably across COEC quantiles. Comparing results from the QR method with those of OLS, this study sheds more light on the association between accounting conservatism and COEC.
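
The contrast between OLS and quantile regression drawn above rests on the loss being minimized. The Python sketch below illustrates the check (pinball) loss that underlies quantile regression in its simplest setting, an intercept-only model, where the minimizer is a sample quantile rather than the mean. The data values and function names are hypothetical, and a real quantile regression with covariates would be fitted by linear programming or a statistics library rather than by this brute-force search.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss rho_tau(u) = u * (tau - 1[u < 0])."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def best_constant(values, tau):
    """Observation that minimizes the total pinball loss at quantile level tau.

    Brute-force search over the observed values; fine for a tiny sample.
    """
    return min(values, key=lambda c: sum(pinball_loss(v - c, tau) for v in values))


# Hypothetical, right-skewed cost-of-equity figures (as fractions).
cost_of_equity = [0.06, 0.07, 0.075, 0.08, 0.09, 0.10, 0.12, 0.15, 0.22, 0.30, 0.40]

mean = sum(cost_of_equity) / len(cost_of_equity)
print(f"mean (what an intercept-only OLS fit returns): {mean:.4f}")
for tau in (0.1, 0.5, 0.9):
    print(f"tau={tau}: pinball-loss minimizer = {best_constant(cost_of_equity, tau)}")
```

Changing tau moves the fitted value across the distribution of the outcome, which is the intercept-only analogue of coefficients shifting across COEC quantiles.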

Keywords: unconditional conservatism, conditional conservatism, cost of equity capital, OLS, quantile regression, emerging markets, MENA countries

Procedia PDF Downloads 357
11422 Developing a Staff Education Program on Subglottic Suction Endotracheal Tubes

Authors: Emily Toon

Abstract:

Nurses play a critical role in the prevention of ventilator-associated pneumonia through the maintenance of endotracheal tubes and use of subglottic secretion drainage via subglottic suctioning endotracheal tubes. The purpose of this evidence based practice project is to develop a staff education program on subglottic suctioning endotracheal tubes for critical care nurses at Middlesex Health with the aim of determining and documenting increased knowledge and/or practice change. The setting included registered nurses within Middlesex Health’s critical care unit who were recruited to complete a pre-test (n=14), view a presentation, and complete a post-test (n=10). Average pre-test scores were compared to average post-test scores to determine an increase in knowledge and/or practice change. The overall mean pre-test score was 59.7 percent, compared with the mean post-test score of 88.1 percent. Pre- and post-test scores were unmatched, so statistical significance could not be determined. The hypothesis that a staff education program on subglottic suctioning endotracheal tubes would demonstrate an increase in knowledge was supported, but not statistically. By integrating a pre-test/post-test design into educational presentations to evaluate increased knowledge, data generated may be used to improve methods and practices of delivering education and enhance staff learning.

Keywords: endotracheal tubes, staff education, subglottic secretion drainage, ventilator-associated pneumonia

Procedia PDF Downloads 115
11421 Optimizing the Scanning Time with Radiation Prediction Using a Machine Learning Technique

Authors: Saeed Eskandari, Seyed Rasoul Mehdikhani

Abstract:

Radiation sources have been used in many industries, such as gamma sources in medical imaging. These waves have destructive effects on humans and the environment. It is very important to detect and find the source of these waves because these sources cannot be seen by the eye. A portable robot has been designed and built for revealing radiation sources; it is able to scan a place from 5 to 20 meters away and shows the location of the sources according to the intensity of the waves on a two-dimensional digital image. The robot operates by measuring the pixels separately. Increasing the image measurement resolution gives a more accurate scan of the environment, and more points are detected, but it also makes scanning take much longer. In this paper, to overcome this challenge, we designed a method that can optimize this time. In this method, a small number of important points of the environment are measured, and the remaining pixels are predicted and estimated by machine learning regression algorithms. The research method is based on comparing the estimates with the actual values of all pixels. These steps have been repeated with several other radiation sources. The results of the study show that the values estimated by the regression method are very close to the real values.

Keywords: regression, machine learning, scan radiation, robot

Procedia PDF Downloads 80