Search results for: multivariate regression tree
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4354

3904 A Study of User Awareness and Attitudes Towards Civil-ID Authentication in Oman’s Electronic Services

Authors: Raya Al Khayari, Rasha Al Jassim, Muna Al Balushi, Fatma Al Moqbali, Said El Hajjar

Abstract:

This study utilizes linear regression analysis to investigate the correlation between user account passwords and the probability of civil ID exposure, offering statistical insights into civil ID security. The study employs multiple linear regression (MLR) analysis to further investigate the elements that influence consumers’ views of civil ID security. This aims to increase awareness and improve preventive measures. The results obtained from the MLR analysis provide a thorough comprehension and can guide specific educational and awareness campaigns aimed at promoting improved security procedures. In summary, the study’s results offer significant insights for improving existing security measures and developing more efficient tactics to reduce risks related to civil ID security in Oman. By identifying key factors that impact consumers’ perceptions, organizations can tailor their strategies to address vulnerabilities effectively. Additionally, the findings can inform policymakers on potential regulatory changes to enhance civil ID security in the country.
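As an illustration of the multiple linear regression (MLR) step mentioned in this abstract, the following minimal sketch fits ordinary least squares coefficients through the normal equations. The predictor names and data are hypothetical and are not taken from the study; the helper names `solve` and `fit_mlr` are likewise assumptions made for this example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(rows, y):
    """Return OLS coefficients [intercept, b1, b2, ...] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(x[i] * x[j] for x in design) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * t for x, t in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical predictors: (password_strength, awareness_score)
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5), (6, 2)]
# Hypothetical response generated from an exact linear relation
y = [1.0 + 0.5 * a + 2.0 * b for a, b in rows]
coef = fit_mlr(rows, y)
```

Because the toy response is exactly linear, the fitted coefficients recover the generating intercept and slopes; real survey data would add noise and call for significance tests.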

Keywords: civil-id disclosure, awareness, linear regression, multiple regression

Procedia PDF Downloads 44
3903 A Research on Inference from Multiple Distance Variables in Hedonic Regression Focus on Three Variables

Authors: Yan Wang, Yasushi Asami, Yukio Sadahiro

Abstract:

In an urban context, urban nodes such as amenities or hazards affect house prices, and classic hedonic analysis employs distance variables measured from each urban node. However, the estimated effects of distances to facilities on house prices generally do not represent the true price of the property. Distance variables measured on the same surface suffer from multicollinearity, which usually shows up in regression as inflated variance and unstable coefficient estimates. In this paper, we provide a theoretical framework for identifying and gathering data with less bias, and a specific sampling method for locating the sample region so as to avoid the spatial multicollinearity problem in the case of three distance variables.
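To make the collinearity of distance variables concrete, the sketch below computes the Pearson correlation between distances to two urban nodes for two hypothetical sample regions. It uses two distance variables rather than the three treated in the paper, and the node coordinates and sample grids are assumptions chosen only for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def corr_of_distances(points, node_a, node_b):
    """Correlation between the distances from each sample point to two nodes."""
    da = [math.dist(p, node_a) for p in points]
    db = [math.dist(p, node_b) for p in points]
    return pearson(da, db)


# Hypothetical urban nodes and sample regions (grid coordinates)
node_a, node_b = (0.0, 0.0), (1.0, 0.0)
far_region = [(x, y) for x in range(10, 15) for y in range(10, 15)]
near_region = [(x, y) for x in range(-2, 4) for y in range(-2, 3)]

r_far = corr_of_distances(far_region, node_a, node_b)
r_near = corr_of_distances(near_region, node_a, node_b)
```

In this toy setup, a sample region far from both nodes yields almost perfectly correlated distances, while a region surrounding the nodes yields a weaker correlation, which is the kind of dependence on sample location the abstract describes.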

Keywords: hedonic regression, urban node, distance variables, multicollinearity, collinearity

Procedia PDF Downloads 453
3902 A Combinatorial Representation for the Invariant Measure of Diffusion Processes on Metric Graphs

Authors: Michele Aleandri, Matteo Colangeli, Davide Gabrielli

Abstract:

We study a generalization to a continuous setting of the classical Markov chain tree theorem. In particular, we consider an irreducible diffusion process on a metric graph. The unique invariant measure has an atomic component on the vertices and an absolutely continuous part on the edges. We show that the corresponding density at x can be represented by a normalized superposition of the weights associated to metric arborescences oriented toward the point x. A metric arborescence is a metric tree oriented towards its root. The weight of each oriented metric arborescence is obtained as the product of the exponentials of integrals of the form ∫b/σ² along the oriented edges, where b is the drift and σ² is the diffusion coefficient, times a weight for each node determined by the local orientation of the arborescence around the node, times the inverse of the diffusion coefficient at x. The metric arborescences are obtained by cutting the original metric graph along some edges.
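For orientation, the classical (discrete) Markov chain tree theorem that this abstract generalizes can be sketched as follows: the invariant measure at a state x is proportional to the total weight of spanning arborescences oriented toward x. The code below is a brute-force illustration for a small continuous-time chain with hypothetical jump rates; it does not implement the metric-graph result of the paper.

```python
from itertools import product
from math import prod


def arborescence_weights(q):
    """For each root x, sum the weights of spanning arborescences oriented toward x.

    q[i][j] is the jump rate from state i to state j (the diagonal is ignored).
    """
    n = len(q)
    weights = []
    for root in range(n):
        others = [i for i in range(n) if i != root]
        total = 0.0
        # Each non-root state picks exactly one outgoing edge.
        for choice in product(range(n), repeat=len(others)):
            parent = dict(zip(others, choice))
            if any(i == j or q[i][j] == 0 for i, j in parent.items()):
                continue
            # Keep the choice only if every state reaches the root (no cycles).
            valid = True
            for start in others:
                node, seen = start, set()
                while node != root and node not in seen:
                    seen.add(node)
                    node = parent[node]
                if node != root:
                    valid = False
                    break
            if valid:
                total += prod(q[i][j] for i, j in parent.items())
        weights.append(total)
    return weights


# Hypothetical jump rates for a three-state chain
q = [[0, 2, 1],
     [1, 0, 3],
     [4, 1, 0]]
w = arborescence_weights(q)
pi = [x / sum(w) for x in w]  # normalized invariant measure
```

For these rates the unnormalized weights are 17, 11, and 10, and one can check by hand that they satisfy the balance equations, for example 11·1 + 10·4 = 17·(2 + 1) at the first state.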

Keywords: diffusion processes, metric graphs, invariant measure, reversibility

Procedia PDF Downloads 158
3901 Determining of the Performance of Data Mining Algorithm Determining the Influential Factors and Prediction of Ischemic Stroke: A Comparative Study in the Southeast of Iran

Authors: Y. Mehdipour, S. Ebrahimi, A. Jahanpour, F. Seyedzaei, B. Sabayan, A. Karimi, H. Amirifard

Abstract:

Ischemic stroke is one of the common causes of disability and mortality; it is reported as the fourth leading cause of death in the world, and as the third in some sources. Only 1/3 of patients with ischemic stroke fully recover, 1/3 are left with permanent disability, and 1/3 die. Thus, the use of predictive models to predict stroke has a vital role in reducing the complications and costs related to this disease. The aim of this study was to identify the influential factors and predict ischemic stroke with the help of data mining (DM) methods. The present study was a descriptive-analytic study. The population was 213 cases from among patients referred to Ali ibn Abi Talib (AS) Hospital in Zahedan. The data collection tool was a checklist whose validity and reliability had been confirmed. This study used decision tree DM algorithms for modeling. Data analysis was performed using SPSS-19 and SPSS Modeler 14.2. The comparison of algorithms showed that the CHAID algorithm, with 95.7% accuracy, had the best performance. Moreover, based on the model created, factors such as anemia, diabetes mellitus, hyperlipidemia, transient ischemic attacks, coronary artery disease, and atherosclerosis are the most influential factors in stroke. Decision tree algorithms, especially the CHAID algorithm, have acceptable precision and predictive ability to determine the factors affecting ischemic stroke. Thus, creating predictive models with this algorithm can play a significant role in decreasing the mortality and disability caused by ischemic stroke.
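CHAID chooses splits using chi-square tests of association between each candidate predictor and the outcome. The sketch below computes the Pearson chi-square statistic for two hypothetical binary predictors and picks the one with the larger statistic; the factor names and counts are invented for illustration, and a full CHAID implementation would also merge categories and use adjusted p-values.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical 2x2 tables: rows = factor present/absent, columns = stroke yes/no
candidates = {
    "factor_a": [[30, 10], [20, 40]],
    "factor_b": [[25, 25], [25, 25]],
}
stats = {name: chi_square(t) for name, t in candidates.items()}
# With equal degrees of freedom, the larger statistic is the more significant split.
best = max(stats, key=stats.get)
```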

Keywords: data mining, ischemic stroke, decision tree, Bayesian network

Procedia PDF Downloads 162
3900 Urban Energy Demand Modelling: Spatial Analysis Approach

Authors: Hung-Chu Chen, Han Qi, Bauke de Vries

Abstract:

Energy consumption in the urban environment has attracted considerable research in recent decades. However, studies that apply 3D spatial analysis to urban energy demand modelling are comparatively rare. In order to analyze the spatial correlation between urban morphology and energy demand comprehensively, this paper investigates their relation using spatial regression tools, namely ordinary least squares (OLS) regression and the geographically weighted regression (GWR) model. The Normalized Difference Built-up Index (NDBI), the Normalized Difference Vegetation Index (NDVI), and building volume describe urban morphology and act as the independent variables of the Energy-land use (E-L) model. NDBI and NDVI are used as indices to describe five types of land use: urban area (U), open space (O), artificial green area (G), natural green area (V), and water body (W). Accordingly, annual electricity demand, gas demand, and energy demand are the dependent variables of the E-L model. The analysis of the E-L model reveals that energy demand and urban morphology are closely connected; the possible causes and practical uses are discussed. In addition, the spatial analysis methods of OLS and GWR are compared.
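The difference between OLS and GWR can be sketched with a single predictor: GWR fits a separate weighted regression at each location, with weights that decay with distance, while OLS fits one global line. The example below uses a Gaussian kernel and hypothetical data in which the true slope differs between the western and eastern sites; the function name `local_fit`, the bandwidth, and the data are all assumptions.

```python
import math


def local_fit(sites, x, y, focal, bandwidth):
    """Weighted least-squares line at `focal` with Gaussian distance weights.

    Returns (intercept, slope). A very large bandwidth approximates global OLS.
    """
    w = [math.exp(-0.5 * (math.dist(s, focal) / bandwidth) ** 2) for s in sites]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical sites along a line; the slope is 1 in the west and 3 in the east
sites = [(float(i), 0.0) for i in range(10)]
x = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
y = [xi * (1 if i < 5 else 3) for i, xi in enumerate(x)]

_, slope_west = local_fit(sites, x, y, focal=(0.0, 0.0), bandwidth=1.0)
_, slope_east = local_fit(sites, x, y, focal=(9.0, 0.0), bandwidth=1.0)
_, slope_global = local_fit(sites, x, y, focal=(4.5, 0.0), bandwidth=1e6)
```

The local fits recover the regional slopes, whereas the near-global fit averages them into a single coefficient, which is the contrast the abstract draws between the two methods.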

Keywords: energy demand model, geographically weighted regression, normalized difference built-up index, normalized difference vegetation index, spatial statistics

Procedia PDF Downloads 137
3899 Factors Influencing Family Resilience and Quality of Life in Pediatric Cancer Patients and Their Caregivers: A Cluster Analysis

Authors: Li Wang, Dan Shu, Shiguang Pang, Lixiu Wang, Bing Xiang Yang, Qian Liu

Abstract:

Background: Cancer is one of the most severe diseases in childhood; long-term treatment and its side effects significantly impact the patient's physical, psychological, and social functioning and quality of life, while also placing substantial physical and psychological burdens on caregivers and families. Family resilience is crucial for children with cancer, helping them cope better with the disease and supporting the family in facing challenges together. As a family-level variable, family resilience requires information from multiple family members. However, to the best of our knowledge, no research has investigated family resilience from the perspectives of both pediatric cancer patients and their caregivers. Therefore, this study aims to investigate the family resilience and quality of life of pediatric cancer patients from a patient–caregiver dyadic perspective. Methods: A total of 149 dyads of pediatric cancer patients and their principal caregivers were recruited from the oncology departments of 4 tertiary hospitals in Wuhan and Taiyuan, China. All participants completed questionnaires that recorded their demographic and clinical characteristics and assessed family resilience and quality of life for both patients and caregivers. K-means cluster analysis was used to identify clusters of family resilience based on the reports from patients and caregivers. Multivariate logistic regression and linear regression were used to analyze the factors influencing family resilience and quality of life, as well as the relationship between the two. Results: Three clusters of family resilience were identified: a cluster of high family resilience (HR), a cluster of low family resilience (LR), and a cluster of discrepant family resilience (DR). Most (67.1%) families fell into the cluster with low resilience. Characteristics such as the type of caregiver and the patient's perceived social support differed among the three clusters. Compared to the LR group, families where the mother is the caregiver and where the patient has high social support were more likely to be assigned to the HR cluster. The quality of life of caregivers was consistently highest in the HR cluster and lowest in the LR cluster. The patient's quality of life was not related to family resilience. In the linear regression analysis of the patient's quality of life, first-born patients had higher quality of life, while those living with their parents had lower quality of life. The participants' characteristics were not associated with the quality of life of caregivers. Conclusions: In most families, family resilience was low. Families with maternal caregivers and patients receiving high levels of social support are more likely to have higher levels of family resilience. Family resilience was linked to the quality of life of caregivers of pediatric cancer patients. The clinical implications of these findings suggest that healthcare and social support organizations should prioritize and support the participation of mothers in caregiving responsibilities. Furthermore, they should assist families in accessing social support to enhance family resilience. This study also emphasizes the importance of promoting family resilience for enhancing family health and happiness, as well as improving the quality of life of caregivers.
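The clustering step can be illustrated with a minimal k-means (Lloyd's algorithm) sketch on dyadic scores. The (patient score, caregiver score) pairs and the starting centroids below are hypothetical; they are arranged so that high, low, and discrepant resilience groups are easy to see and do not reproduce the study's data.

```python
import math


def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical (patient_resilience, caregiver_resilience) scores per dyad
dyads = [(9, 9), (8, 9), (9, 8),          # both high
         (2, 2), (1, 2), (2, 1), (1, 1),  # both low
         (8, 2), (9, 1)]                  # discrepant reports
start = [(9, 9), (2, 2), (8, 2)]          # assumed initial centroids
labels, centers = kmeans(dyads, start)
```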

Keywords: pediatric cancer, cluster analysis, family resilience, quality of life

Procedia PDF Downloads 20
3898 Heart Failure Identification and Progression by Classifying Cardiac Patients

Authors: Muhammad Saqlain, Nazar Abbas Saqib, Muazzam A. Khan

Abstract:

Heart failure (HF) has become a major health problem in our society. The prevalence of HF increases with patient age, and it is a major cause of the high mortality rate in adults. Successful identification of HF and of its progression can help reduce the individual and social burden of this syndrome. In this study, we use a real data set of cardiac patients to propose a classification model for the identification and progression of HF. The data set was divided into three age groups, namely young, adult, and old, and each age group was further classified into four classes according to the patient's current physical condition. Contemporary data mining classification algorithms were applied to each individual class of every age group to identify HF. Decision Tree (DT) gives the highest accuracy of 90% and outperforms all other algorithms. Our model accurately diagnoses different stages of HF for each age group and can be very useful for the early prediction of HF.

Keywords: decision tree, heart failure, data mining, classification model

Procedia PDF Downloads 394
3897 Modeling Aeration of Sharp Crested Weirs by Using Support Vector Machines

Authors: Arun Goel

Abstract:

The present paper investigates the prediction of the air entrainment rate and aeration efficiency of free over-fall jets issuing from a triangular sharp crested weir using regression-based modelling. Empirical equations, support vector machine (polynomial and radial basis function) models, and linear regression techniques were applied to triangular sharp crested weirs, relating the air entrainment rate and the aeration efficiency to the input parameters, namely drop height, discharge, and vertex angle. Good agreement was observed between the measured values and the values obtained using the empirical equations, the support vector machine (polynomial and RBF) models, and the linear regression techniques. The test results demonstrated that the SVM-based (polynomial and RBF) models predicted the measured values with reasonable accuracy, alongside the empirical equations and linear regression techniques, in modelling the air entrainment rate and the aeration efficiency of free over-fall jets issuing from a triangular sharp crested weir. A sensitivity analysis was also performed to study the impact of each input parameter on the output in terms of air entrainment rate and aeration efficiency.

Keywords: air entrainment rate, dissolved oxygen, weir, SVM, regression

Procedia PDF Downloads 421
3896 Risk Analysis of Leaks from a Subsea Oil Facility Based on Fuzzy Logic Techniques

Authors: Belén Vinaixa Kinnear, Arturo Hidalgo López, Bernardo Elembo Wilasi, Pablo Fernández Pérez, Cecilia Hernández Fuentealba

Abstract:

The expanded use of risk assessment in legislative and corporate decision-making has increased the role of expert judgement in providing data for safety-related decision-making. Expert judgements are required in most steps of risk assessment: hazard identification, hazard estimation, risk evaluation, and examination of options. This paper presents a fault tree analysis (FTA), that is, a probabilistic failure analysis, applied to the leakage of oil in a subsea production system. In standard FTA, the failure probabilities of the components of a system are treated as exact values when evaluating the failure probability of the top event. In the drilling industry, the data available for estimating component failure probabilities are persistently insufficient. Therefore, fuzzy theory can be used to address this issue. The aim of this paper is to examine the leaks from the Zafiro West subsea oil facility using fuzzy fault tree analysis (FFTA). The research makes theoretical and practical contributions to maritime safety and environmental protection. FTA has also traditionally been an effective strategy for identifying hazards in nuclear installations and the power industry.
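One common way to carry imprecise failure probabilities through a fault tree is to represent each basic event as a triangular fuzzy number (low, most likely, high) and apply the usual gate formulas to each component. The sketch below follows that approximation; the event names and probabilities are hypothetical, and the paper's actual FFTA formulation may differ.

```python
from math import prod


def and_gate(*events):
    """AND gate: product of probabilities, applied to each (low, mode, high) component."""
    return tuple(prod(vals) for vals in zip(*events))


def or_gate(*events):
    """OR gate: 1 - product of complements, applied to each component."""
    return tuple(1.0 - prod(1.0 - v for v in vals) for vals in zip(*events))


def centroid(tfn):
    """Defuzzify a triangular fuzzy number by its centroid."""
    return sum(tfn) / 3.0


# Hypothetical expert estimates as triangular fuzzy numbers (low, mode, high)
seal_failure = (0.1, 0.2, 0.3)
sensor_failure = (0.2, 0.3, 0.4)
valve_leak = (0.01, 0.02, 0.03)

undetected_seal_leak = and_gate(seal_failure, sensor_failure)
any_of_two = or_gate(seal_failure, sensor_failure)
top_event = or_gate(valve_leak, undetected_seal_leak)
crisp_top = centroid(top_event)
```

Both gate formulas are increasing in each input, so the component-wise calculation keeps the low, mode, and high values in order.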

Keywords: expert judgment, probability assessment, fault tree analysis, risk analysis, oil pipelines, subsea production system, drilling, quantitative risk analysis, leakage failure, top event, off-shore industry

Procedia PDF Downloads 181
3895 Discerning Divergent Nodes in Social Networks

Authors: Mehran Asadi, Afrand Agah

Abstract:

In data mining, partitioning is used as a fundamental tool for classification. With the help of partitioning, we study the structure of data, which allows us to envision decision rules that can be applied to classification trees. In this research, we used an online social network dataset and all of its attributes (e.g., node features, labels, etc.) to determine what constitutes an above-average chance of being a divergent node. We used the R statistical computing language to conduct the analyses in this report. The data were found on the UC Irvine Machine Learning Repository. This research introduces the basic concepts of classification in online social networks. In this work, we consider overfitting and describe different approaches for evaluation and performance comparison of different classification methods. In classification, the main objective is to categorize different items and assign them to different groups based on their properties and similarities. In data mining, recursive partitioning is used to probe the structure of a data set, which allows us to envision decision rules and apply them to classify data into several groups. Estimating densities is hard, especially in high dimensions with limited data. Of course, we do not know the densities, but we could estimate them using classical techniques. First, we calculated the correlation matrix of the dataset to see if any predictors are highly correlated with one another. By calculating the correlation coefficients for the predictor variables, we see that density is strongly correlated with transitivity. We initialized a data frame to easily compare the quality of the resulting classification methods and utilized decision trees (with k-fold cross-validation to prune the tree). The method applied to this dataset is decision trees. A decision tree is a non-parametric classification method, which uses a set of rules to predict that each observation belongs to the most commonly occurring class label of the training data. Our method aggregates many decision trees to create an optimized model that is not susceptible to overfitting. When using a decision tree, however, it is important to use cross-validation to prune the tree in order to narrow it down to the most important variables.
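The k-fold cross-validation used for pruning can be sketched independently of any tree library. The example below is written in Python rather than the authors' R, and uses a majority-class baseline on hypothetical labels purely to show the mechanics; in practice `fit` would grow a tree of a given size and the size with the lowest held-out error would be kept.

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists for k folds over range(n)."""
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        yield train, test


def cross_validated_error(n, k, fit, error):
    """Average held-out error; `fit` and `error` are caller-supplied callables."""
    scores = []
    for train, test in k_fold_indices(n, k):
        model = fit(train)
        scores.append(error(model, test))
    return sum(scores) / len(scores)


# Hypothetical class labels (1 = divergent node)
labels = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]


def fit_majority(train):
    """Baseline 'model': the most common label in the training fold."""
    ones = sum(labels[i] for i in train)
    return 1 if ones * 2 > len(train) else 0


def misclassification(model, test):
    """Share of held-out observations whose label differs from the prediction."""
    return sum(labels[i] != model for i in test) / len(test)


cv_error = cross_validated_error(len(labels), 5, fit_majority, misclassification)
```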

Keywords: online social networks, data mining, social cloud computing, interaction and collaboration

Procedia PDF Downloads 139
3894 A Comprehensive Method of Fault Detection and Isolation based on Testability Modeling Data

Authors: Junyou Shi, Weiwei Cui

Abstract:

Testability modeling is a commonly used method in the testability design and analysis of systems. A dependency matrix is obtained from testability modeling, from which fault detection and isolation can be evaluated quantitatively. Based on the dependency matrix, we can obtain the diagnosis tree, which provides the procedures for fault detection and isolation. In practice, however, the dependency matrix usually includes both built-in tests (BIT) and manual tests. BIT runs automatically and is not constrained by the test procedures. The method above therefore cannot give the most efficient diagnosis or exploit the advantages of BIT. A comprehensive method of fault detection and isolation is proposed. This method combines the advantages of BIT and manual tests by splitting the matrix. The results of the case study show that the method is effective.
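A dependency matrix and its split into BIT and manual tests can be sketched as below. Rows are faults, columns are tests, and an entry of 1 means the test responds to the fault. The matrix, the choice of which columns are BIT, and the rate definitions (isolation measured among detected faults) are assumptions for illustration and not the paper's case study.

```python
def detection_and_isolation(matrix, columns=None):
    """Return (fault detection rate, fault isolation rate) for the chosen test columns.

    A fault is detected if any selected test responds to it, and isolated if its
    response signature is unique among the detected faults.
    """
    if columns is None:
        columns = range(len(matrix[0]))
    columns = list(columns)
    signatures = [tuple(row[c] for c in columns) for row in matrix]
    detected = [s for s in signatures if any(s)]
    isolated = [s for s in detected if detected.count(s) == 1]
    fdr = len(detected) / len(signatures)
    fir = len(isolated) / len(detected) if detected else 0.0
    return fdr, fir


# Hypothetical dependency matrix: rows f1..f4, columns t1..t4
dependency = [
    [1, 0, 0, 1],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 1, 0, 0],
]
bit_tests = [0, 1]  # assume t1 and t2 are built-in tests, t3 and t4 manual

full = detection_and_isolation(dependency)
bit_only = detection_and_isolation(dependency, bit_tests)
```

In this toy matrix the BIT columns alone detect three of four faults but cannot tell f1 from f2, so manual tests are needed only for the remaining ambiguity, which is the motivation for splitting the matrix.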

Keywords: fault detection, fault isolation, testability modeling, BIT

Procedia PDF Downloads 320
3893 Use of Regression Analysis in Determining the Length of Plastic Hinge in Reinforced Concrete Columns

Authors: Mehmet Alpaslan Köroğlu, Musa Hakan Arslan, Muslu Kazım Körez

Abstract:

The basic objective of this study is to develop a regression analysis method that can estimate the length of a plastic hinge, which is an important design parameter, by making use of the outcomes (lateral load-lateral displacement hysteretic curves) of experimental studies conducted on square reinforced concrete columns. For this purpose, the results of 170 different square reinforced concrete column tests were collected from the existing literature. The parameters thought to affect the plastic hinge length, such as cross-section properties, material properties, axial loading level, confinement of the column, and longitudinal reinforcement bars in the columns, were obtained from these 170 tests. In determining the length of the plastic hinge from the experimental test results, regression analyses were separately tested and compared with each other. In addition, the outcomes of these methods for determining the plastic hinge length of reinforced concrete columns were compared with other methods available in the literature.

Keywords: columns, plastic hinge length, regression analysis, reinforced concrete

Procedia PDF Downloads 461
3892 Forecasting the Influences of Information and Communication Technology on the Structural Changes of Japanese Industrial Sectors: A Study Using Statistical Analysis

Authors: Ubaidillah Zuhdi, Shunsuke Mori, Kazuhisa Kamegai

Abstract:

The purpose of this study is to forecast the influences of Information and Communication Technology (ICT) on the structural changes of Japanese economies based on Leontief Input-Output (IO) coefficients. This study establishes a statistical analysis to predict the future interrelationships among industries. We employ the Constrained Multivariate Regression (CMR) model to analyze the historical changes of input-output coefficients. Statistical significance of the model is then tested by Likelihood Ratio Test (LRT). In our model, ICT is represented by two explanatory variables, i.e. computers (including main parts and accessories) and telecommunications equipment. A previous study, which analyzed the influences of these variables on the structural changes of Japanese industrial sectors from 1985-2005, concluded that these variables had significant influences on the changes in the business circumstances of Japanese commerce, business services and office supplies, and personal services sectors. The projected future Japanese economic structure based on the above forecast generates the differentiated direct and indirect outcomes of ICT penetration.

Keywords: forecast, ICT, industrial structural changes, statistical analysis

Procedia PDF Downloads 363
3891 Measures of Corporate Governance Efficiency on the Quality Level of Value Relevance Using IFRS and Corporate Governance Acts: Evidence from African Stock Exchanges

Authors: Tchapo Tchaga Sophia, Cai Chun

Abstract:

This study measures the efficiency level of corporate governance in improving the quality level of value relevance, with a view to resolving issues of market value efficiency, transparency, fraud risk, agency problems, investor confidence, and decision-making, using IFRS and Corporate Governance Acts (CGA). The final sample of this study contains 3660 firms from ten countries' stock markets from 2010 to 2020. Based on the efficient market theory and the positive accounting theory, this paper uses multiple econometric methods (the DID method and multivariate and univariate regression methods) and regression models (the Ohlson model and a compliance index model) to assess the effect of corporate governance mechanisms on the level of value relevance under the influence of IFRS and the corporate governance regulatory framework in Africa's stock exchanges for non-financial firms. The results on value relevance show that the corporate governance system, strengthened by the adoption of IFRS and enforcement of new corporate governance regulations, produces better financial statement information when its compliance level is high, information that is both value-relevant and comparable to results in more developed markets. Similar positive and significant results were obtained when predicting future book value per share and earnings per share through the determination of stock price and stock return. The findings of this study have important implications for regulators, academics, investors, and other users regarding the effects of IFRS and the Corporate Governance Act (CGA) on the relationship between corporate governance and accounting information relevance in the African stock market. The contributions of this paper also rest on the uniqueness of the data used, which come from Africa, a setting for which not all existing findings provide evidence, and on the use of the DID method to examine the relationship between corporate governance and value relevance on African stock exchanges.
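The difference-in-differences (DID) method named in this abstract compares the change over time in a treated group with the change in a control group. A minimal two-group, two-period sketch follows; the group labels and outcome values are hypothetical and are not drawn from the study's sample.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences of group means."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical value-relevance scores before and after a regulation takes effect
adopters_pre, adopters_post = [10.0, 12.0], [15.0, 17.0]
others_pre, others_post = [9.0, 11.0], [11.0, 13.0]

effect = did_estimate(adopters_pre, adopters_post, others_pre, others_post)
```

The estimate attributes to the regulation only the part of the adopters' change that exceeds the change seen in the comparison group, which relies on the usual parallel-trends assumption.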

Keywords: corporate governance value, market efficiency value, value relevance, African stock market, stock return-stock price

Procedia PDF Downloads 47
3890 PDDA: Priority-Based, Dynamic Data Aggregation Approach for Sensor-Based Big Data Framework

Authors: Lutful Karim, Mohammed S. Al-kahtani

Abstract:

Sensors are being used in various applications such as agriculture, health monitoring, air and water pollution monitoring, and traffic monitoring and control, and hence play a vital role in the growth of big data. However, sensors collect redundant data. Thus, aggregating and filtering sensor data is important for designing an efficient big data framework. Current research does not focus on aggregating and filtering data at multiple layers of a sensor-based big data framework. Thus, this paper introduces (i) a three-layer data aggregation framework for big data and (ii) a priority-based, dynamic data aggregation scheme (PDDA) for the lowest layer, at the sensors. Simulation results show that PDDA outperforms existing tree-based and cluster-based data aggregation schemes in terms of overall network energy consumption and end-to-end data transmission delay.

Keywords: big data, clustering, tree topology, data aggregation, sensor networks

Procedia PDF Downloads 327
3889 The Use of Multivariate Statistical and GIS for Characterization Groundwater Quality in Laghouat Region, Algeria

Authors: Rouighi Mustapha, Bouzid Laghaa Souad, Rouighi Tahar

Abstract:

Due to rain shortage and population growth in recent years, well excavation and groundwater use for different purposes have increased without any planning. This is a great challenge for our country. Moreover, the scarcity of water resources in this region is unfortunately combined with a rapid deterioration in the quality of fresh water resources, due to salinity and contamination processes. Therefore, it is necessary to conduct studies of groundwater quality in Algeria. This work consists of identifying the factors that influence the water quality parameters in the Laghouat region using the statistical techniques of Principal Component Analysis (PCA) and Hierarchical Cluster Analysis (HCA), together with a geographic information system (GIS), in an attempt to discriminate the sources of variation in water quality. The results of the PCA technique indicate that the variables responsible for water quality composition are mainly related to soluble salts, natural processes, and the nature of the rock, which significantly modifies the water chemistry. Based on the positive correlation between K+ and NO3-, NO3- is believed to be human-induced rather than of natural origin. In this study, multivariate statistical analysis and GIS give the hydrogeologist supplementary tools for the characterization and evaluation of aquifers.

Keywords: cluster analysis, GIS, groundwater, Laghouat, quality

Procedia PDF Downloads 313
3888 Modelling Operational Risk Using Extreme Value Theory and Skew t-Copulas via Bayesian Inference

Authors: Betty Johanna Garzon Rozo, Jonathan Crook, Fernando Moreira

Abstract:

Operational risk losses are heavy tailed and are likely to be asymmetric and extremely dependent among business lines/event types. We propose a new methodology to assess, in a multivariate way, the asymmetry and extreme dependence between severity distributions, and to calculate the capital for operational risk. This methodology simultaneously uses (i) several parametric distributions and an alternative mixed distribution (the Lognormal for the body of losses and the Generalized Pareto Distribution for the tail) via extreme value theory using SAS®, (ii) the multivariate skew t-copula, applied for the first time to operational losses, and (iii) Bayesian theory to estimate new n-dimensional skew t-copula models via Markov chain Monte Carlo (MCMC) simulation. This paper analyses a new operational loss data set, SAS Global Operational Risk Data [SAS OpRisk], to model operational risk at international financial institutions. All the severity models are constructed in SAS® 9.2. We implement the procedures PROC SEVERITY and PROC NLMIXED. This paper focuses on describing this implementation.
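The tail part of the mixed severity model can be illustrated with the closed-form quantile function of the Generalized Pareto Distribution (GPD). The sketch below is in Python rather than the SAS procedures used by the authors; the threshold, scale, shape, and body probability are hypothetical values, and the body distribution itself (for example a lognormal) is assumed to be fitted elsewhere.

```python
import math


def gpd_quantile(p, threshold, scale, shape):
    """Quantile of a GPD for exceedances over `threshold`, with 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if abs(shape) < 1e-12:
        # Limiting exponential case as the shape parameter tends to zero
        return threshold - scale * math.log(1.0 - p)
    return threshold + scale / shape * ((1.0 - p) ** (-shape) - 1.0)


def spliced_tail_quantile(q, body_prob, threshold, scale, shape):
    """Quantile of the full severity distribution for a level q in the tail.

    body_prob is P(loss <= threshold), supplied by the body model.
    """
    if q <= body_prob:
        raise ValueError("q lies in the body; use the body distribution instead")
    p_tail = (q - body_prob) / (1.0 - body_prob)
    return gpd_quantile(p_tail, threshold, scale, shape)


# Hypothetical parameters: 90% of losses fall below a threshold of 10 units
q99 = spliced_tail_quantile(0.99, body_prob=0.9, threshold=10.0, scale=1.0, shape=0.5)
```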

Keywords: operational risk, loss distribution approach, extreme value theory, copulas

Procedia PDF Downloads 585
3887 An Application to Predict the Best Study Path for Information Technology Students in Learning Institutes

Authors: L. S. Chathurika

Abstract:

Early prediction of student performance is an important factor in achieving academic excellence. Whatever their study stream in secondary education, students in Sri Lanka lay the foundation for higher studies during the first year of their degree or diploma program. In the information technology (IT) field, students can select specialization areas that reflect their talents and skills, such as software engineering, network administration, database administration, and multimedia design. After completing the first year, students attempt to select the best path by considering numerous factors. The purpose of this experiment is to predict the best study path using machine learning algorithms. Five classification algorithms are selected and tested: decision tree, support vector machine, artificial neural network, Naïve Bayes, and logistic regression. The support vector machine obtained the highest accuracy, 82.4%. The influential features for selecting the best study path are then identified.

Keywords: algorithm, classification, evaluation, features, testing, training

Procedia PDF Downloads 110
3886 Predictive Factors of Prognosis in Acute Stroke Patients Receiving Traditional Chinese Medicine Therapy: A Retrospective Study

Authors: Shaoyi Lu

Abstract:

Background: Traditional Chinese medicine has been used to treat stroke, which is a major cause of morbidity and mortality. There is, however, no clear agreement about the optimal timing, population, efficacy, and predictive prognosis factors of traditional Chinese medicine supplemental therapy. Method: In this study, we conducted a retrospective analysis using data collected from stroke patients in the Stroke Registry In Chang Gung Healthcare System (SRICHS). Stroke patients who received traditional Chinese medicine consultation in the neurology ward of Keelung Chang Gung Memorial Hospital from Jan 2010 to Dec 2014 were enrolled. Clinical profiles including neurologic deficit, activities of daily living, and other basic characteristics were analyzed. Through propensity score matching, we compared the NIHSS and Barthel index before and after hospitalization, applied subgroup analysis, and adjusted the results with a multivariate regression method. Results: A total of 115 stroke patients were enrolled, 23 in the experimental group and 92 in the control group. The most important factors for prognosis prediction were the National Institutes of Health Stroke Scale and Barthel index scores right before hospitalization. Traditional Chinese medicine intervention had no statistically significant influence on the neurological deficit of acute stroke patients, and a mild negative influence on the daily activity performance of acute hemorrhagic stroke patients. Conclusion: The efficacy of traditional Chinese medicine as a supplemental therapy for acute stroke patients was controversial. The reasons for this phenomenon might be complex and require more research to understand. Key words: traditional Chinese medicine, acupuncture, stroke, NIH stroke scale, Barthel index, predictive factor.

Keywords: traditional Chinese medicine, complementary and alternative medicine, stroke, acupuncture

Procedia PDF Downloads 353
3885 Measurement Errors and Misclassifications in Covariates in Logistic Regression: Bayesian Adjustment of Main and Interaction Effects and the Sample Size Implications

Authors: Shahadut Hossain

Abstract:

Measurement errors in continuous covariates and/or misclassifications in categorical covariates are common in epidemiological studies. Regression analysis ignoring such mismeasurements seriously biases the estimated main and interaction effects of covariates on the outcome of interest. Thus, adjustments for such mismeasurements are necessary. In this research, we propose a Bayesian parametric framework for eliminating deleterious impacts of covariate mismeasurements in logistic regression. The proposed adjustment method is unified and thus can be applied to any generalized linear and non-linear regression models. Furthermore, adjustment for covariate mismeasurements requires validation data usually in the form of either gold standard measurements or replicates of the mismeasured covariates on a subset of the study population. Initial investigation shows that adequacy of such adjustment depends on the sizes of main and validation samples, especially when prevalences of the categorical covariates are low. Thus, we investigate the impact of main and validation sample sizes on the adjusted estimates, and provide a general guideline about these sample sizes based on simulation studies.

Keywords: measurement errors, misclassification, mismeasurement, validation sample, Bayesian adjustment

Procedia PDF Downloads 398
3884 A Predictive Machine Learning Model of the Survival of Female-led and Co-Led Small and Medium Enterprises in the UK

Authors: Mais Khader, Xingjie Wei

Abstract:

This research sheds light on female entrepreneurs by providing new insights on the survival predictions of companies led by females in the UK. This study aims to build a predictive machine learning model of the survival of female-led & co-led small & medium enterprises (SMEs) in the UK over the period 2000-2020. The predictive model built utilised a combination of financial and non-financial features related to both companies and their directors to predict SMEs' survival. These features were studied in terms of their contribution to the resultant predictive model. Five machine learning models are used in the modelling: Decision tree, AdaBoost, Naïve Bayes, Logistic regression and SVM. The AdaBoost model had the highest performance of the five models, with an accuracy of 73% and an AUC of 80%. The results show high feature importance in predicting companies' survival for company size, management experience, financial performance, industry, region, and females' percentage in management.
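The accuracy and AUC figures quoted above can be illustrated with a minimal sketch. The labels and scores below are hypothetical and are not taken from the study; the AUC is computed with the pairwise (Mann-Whitney) formulation, which is one standard way to obtain it and not necessarily the authors' procedure.

```python
def accuracy(labels, predictions):
    """Fraction of cases whose predicted class equals the true class."""
    matches = sum(1 for l, p in zip(labels, predictions) if l == p)
    return matches / len(labels)


def auc(labels, scores):
    """Probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 = company survived, 0 = company did not survive.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.6, 0.4, 0.5, 0.2]  # assumed model scores

# Accuracy needs hard class labels, so the scores are thresholded at 0.5.
predicted = [1 if s >= 0.5 else 0 for s in scores]

model_accuracy = accuracy(labels, predicted)
model_auc = auc(labels, scores)
```

Accuracy depends on the chosen threshold, while AUC summarises ranking quality across all thresholds, which is why a model can report the two values separately.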

Keywords: company survival, entrepreneurship, females, machine learning, SMEs

Procedia PDF Downloads 85
3883 Quantitative Structure-Activity Relationship Study of Some Quinoline Derivatives as Antimalarial Agents

Authors: M. Ouassaf, S. Belaid

Abstract:

A series of quinoline derivatives with antimalarial activity were subjected to two-dimensional quantitative structure-activity relationship (2D-QSAR) studies. Three models were built using multiple linear regression (MLR), partial least squares regression (PLS), and nonlinear regression (MNLR) to see which descriptors are closely related to the biological activity. We relied on principal component analysis (PCA). A comparison of the quality of the MLR, PLS, and MNLR models shows that the MNLR model (R = 0.914, R² = 0.835, RCV = 0.853) has substantially better predictive capability than the MLR (R = 0.835, R² = 0.752, RCV = 0.601) and PLS (R = 0.742, R² = 0.552, RCV = 0.550) models. The MNLR model gave statistically significant results and showed good stability to data variation in leave-one-out cross-validation. The results suggest that the proposed MNLR model may be useful for predicting the biological activity of quinoline derivatives.
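Leave-one-out cross-validation, mentioned above, can be sketched as follows. This is an illustration only: a one-descriptor straight-line fit stands in for the MLR, PLS, and MNLR models, the descriptor and activity values are hypothetical, and the cross-validated coefficient shown (1 − PRESS / total sum of squares) is one common definition that may differ from the RCV statistic reported in the abstract.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_predictions(xs, ys):
    """Predict each point from a model fitted on all the other points."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predictions.append(slope * xs[i] + intercept)
    return predictions


def q_squared(ys, predictions):
    """Cross-validated coefficient: 1 - PRESS / total sum of squares."""
    mean_y = sum(ys) / len(ys)
    press = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Hypothetical descriptor values and activities (not from the study).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1]

held_out = loo_predictions(descriptor, activity)
q2 = q_squared(activity, held_out)
```

A model whose cross-validated coefficient stays close to its fitted R² is less likely to be overfitting the training compounds.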

Keywords: antimalarial, quinoline, QSAR, PCA, MLR, MNLR

Procedia PDF Downloads 141
3882 The Teacher’s Role in Generating and Maintaining the Motivation of Adult Learners of English: A Mixed Methods Study in Hungarian Corporate Contexts

Authors: Csaba Kalman

Abstract:

In spite of the existence of numerous second language (L2) motivation theories, the teacher’s role in motivating learners has remained an under-researched niche to this day. If we narrow our focus to the teacher’s role in motivating adult learners of English in an English as a Foreign Language (EFL) context in corporate environments, empirical research is practically non-existent. This study fills the above research niche by exploring the most motivating aspects of the teacher’s personality, behaviour, and teaching practices that affect adult learners’ L2 motivation in corporate contexts in Hungary. The study was conducted in a wide range of industries in 18 organisations that employ over 250 people in Hungary. In order to triangulate the research, 21 human resources managers, 18 language teachers, and 466 adult learners of English were involved in the investigation by participating in interview studies and quantitative questionnaire studies that measured ten scales related to the teacher’s role, as well as two criterion measure scales of intrinsic and extrinsic motivation. The qualitative data were analysed using a template organising style, while descriptive and inferential statistics, as well as multivariate statistical techniques such as correlation and regression analyses, were used for analysing the quantitative data. The results showed that certain aspects of the teacher’s personality (thoroughness, enthusiasm, credibility, and flexibility), as well as preparedness, incorporating English for Specific Purposes (ESP) in the syllabus, and focusing on the present, proved to be the most salient aspects of the teacher’s motivating influence.
The regression analyses conducted with the criterion measure scales revealed that 22% of the variance in learners’ intrinsic motivation could be explained by the teacher’s preparedness and appearance, and 23% of the variance in learners’ extrinsic motivation could be attributed to the teacher’s personal branding and incorporating ESP in the syllabus. The findings confirm the pivotal role teachers play in motivating L2 learners independent of the context they teach in; and, at the same time, call for further research so that we can better conceptualise the motivating influence of L2 teachers.

Keywords: adult learners, corporate contexts, motivation, teacher’s role

Procedia PDF Downloads 90
3881 Process Optimization of Mechanochemical Synthesis for the Production of 4,4 Bipyridine Based MOFS using Twin Screw Extrusion and Multivariate Analysis

Authors: Ahmed Metawea, Rodrigo Soto, Majeida Kharejesh, Gavin Walker, Ahmad B. Albadarin

Abstract:

In this study, towards a green approach, we investigated the effect of the operating conditions of a solvent-assisted twin-screw extruder (TSE) on the production of a 4,4-bipyridine-based one-dimensional (1D) coordination polymer, using cobalt nitrate as the metal precursor at a molar ratio of 1:1. Different operating parameters, such as solvent percentage, screw speed, and feeding rate, are considered. The resultant product is characterized using offline characterization methods, namely powder X-ray diffraction (PXRD), Raman spectroscopy, and scanning electron microscopy (SEM), in order to investigate the product purity and surface morphology. A lower feeding rate increased the product’s quality, as more residence time was provided for the reaction to take place. The most important influencing factor was the amount of liquid added. The addition of water helped in facilitating the reaction inside the TSE by increasing the surface area of the reaction for particles

Keywords: MOFS, multivariate analysis, process optimization, chemometric

Procedia PDF Downloads 143
3880 A CD40 Variant is Associated with Systemic Bone Loss Among Patients with Rheumatoid Arthritis

Authors: Rim Sghiri, Samia Al Shouli, Hana Benhassine, Nejla Elamri, Zahid Shakoor, Foued Slama, Adel Almogren, Hala Zeglaoui, Elyes Bouajina, Ramzi Zemni

Abstract:

Objectives: Little is known about genes predisposing to systemic bone loss (SBL) in rheumatoid arthritis (RA). Therefore, we examined the association between SBL and a variant of the CD40 gene, which is known to play a critical role in both immune response and bone homeostasis, among patients with RA. Methods: CD40 rs4810485 was genotyped in 176 adult RA patients. Bone mineral density (BMD) was measured using dual-energy X-ray absorptiometry (DXA). Results: Low BMD was observed in 116 (65.9%) patients. Among them, 60 (34.1%) had low femoral neck (FN) Z score, 72 (40.9%) had low total femur (TF) Z score, and 105 (59.6%) had low lumbar spine (LS) Z score. CD40 rs4810485 was found to be associated with reduced TF Z score, with the CD40 rs4810485 T allele protecting against reduced TF Z score (OR = 0.40, 95% CI = 0.23-0.68, p = 0.0005). This association was confirmed in the multivariate logistic regression analysis (OR = 0.31, 95% CI = 0.16-0.59, p = 3.84 × 10⁻⁴). Moreover, median FN BMD was reduced among RA patients with the CD40 rs4810485 GG genotype compared to RA patients harbouring the CD40 rs4810485 TT and GT genotypes (0.788 ± 0.136 versus 0.826 ± 0.146 g/cm², p = 0.001). Conclusion: This study demonstrated, for the first time, an association between a CD40 genetic variant and SBL among patients with RA.

Keywords: rheumatoid arthritis, CD40 gene, bone mineral density, systemic bone loss, rs4810485

Procedia PDF Downloads 446
3879 Agile Software Effort Estimation Using Regression Techniques

Authors: Mikiyas Adugna

Abstract:

Effort estimation is among the activities carried out in software development processes. An accurate estimation model leads to project success. Agile effort estimation is a complex task because of the dynamic nature of software development, and researchers are still conducting studies to enhance its prediction accuracy. For these reasons, we investigated and proposed a model based on LASSO and Elastic Net regression to enhance estimation accuracy. The proposed model has four major components: preprocessing, train-test split, training with default parameters, and cross-validation. During the preprocessing phase, the entire dataset is normalized. After normalization, a train-test split is performed on the dataset, with 80% used for training and 20% for testing. Following the train-test split, the two algorithms (Elastic Net and LASSO regression) are trained in two phases. In the first phase, the two algorithms are trained using their default parameters and evaluated on the testing data. In the second phase, grid search (used to tune and select optimum parameters) and 5-fold cross-validation are used to obtain the final trained model. Finally, the final trained model is evaluated using the testing set. The experimental work is applied to the agile story point dataset of 21 software projects collected from six firms. The results show that both Elastic Net and LASSO regression outperformed the compared methods. Of the two proposed algorithms, LASSO regression achieved better predictive performance, with PRED (8%) and PRED (25%) results of 100.0, MMRE of 0.0491, MMER of 0.0551, MdMRE of 0.0593, MdMER of 0.063, and MSE of 0.0007. The results imply that the trained LASSO regression model is the most acceptable and has higher estimation performance than models existing in the literature.
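The evaluation measures named above (MMRE, MdMRE, MMER, MdMER, PRED, and MSE) can be computed as in the following sketch. It uses the definitions commonly given in the effort-estimation literature, which are assumed here to match the authors' usage; the effort values are hypothetical, and `effort_metrics` is an illustrative helper name, not part of any library.

```python
from statistics import mean, median


def effort_metrics(actuals, predictions, pred_level=0.25):
    """Common effort-estimation accuracy measures.

    MRE = |actual - predicted| / actual      (error relative to the truth)
    MER = |actual - predicted| / predicted   (error relative to the estimate)
    PRED(l) = percentage of estimates whose MRE is at most l
    """
    pairs = list(zip(actuals, predictions))
    mres = [abs(a - p) / a for a, p in pairs]
    mers = [abs(a - p) / p for a, p in pairs]
    squared_errors = [(a - p) ** 2 for a, p in pairs]
    within_level = sum(1 for m in mres if m <= pred_level)
    return {
        "MMRE": mean(mres),
        "MdMRE": median(mres),
        "MMER": mean(mers),
        "MdMER": median(mers),
        "PRED": 100.0 * within_level / len(mres),
        "MSE": mean(squared_errors),
    }


# Hypothetical actual and estimated efforts (not from the study's dataset).
actuals = [100.0, 200.0, 50.0, 80.0]
predictions = [110.0, 180.0, 50.0, 100.0]

metrics_25 = effort_metrics(actuals, predictions, pred_level=0.25)
metrics_08 = effort_metrics(actuals, predictions, pred_level=0.08)
```

Lower MMRE, MMER, and MSE values indicate better estimates, whereas a higher PRED value is better; a PRED (8%) of 100.0 means every estimate fell within 8% of the actual effort.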

Keywords: agile software development, effort estimation, elastic net regression, LASSO

Procedia PDF Downloads 46
3878 Assessment of Association Between Microalbuminuria and Lung Function Test Among the Community of Jimma Town

Authors: Diriba Dereje

Abstract:

Background: Cardiac and renal disease are the most prevalent chronic non-communicable diseases (CNCD), affecting the community significantly. The best and recommended method of halting CNCD is to work on prevention as early as possible. This is only possible if early surrogate markers are identified. As part of this solution, this study identifies an association between microalbuminuria (an early surrogate marker of renal and cardiac disease) and lung function tests among adults in the community. Objective: The main aim of this study was to assess the association between microalbuminuria and lung function tests among adults in the community. Methodology: A community-based cross-sectional study was conducted among 384 adults in Jimma town. A systematic sampling technique was used to select participants for the study. To search for a possible association, binary and multivariate logistic regression and t-tests were conducted. Finally, the association between microalbuminuria and lung function tests was presented in the form of figures and written description. Result and Conclusion: A significant association was found between microalbuminuria and different lung function test parameters.

Keywords: microalbuminuria, lung function, association, test

Procedia PDF Downloads 180
3877 Using Discriminant Analysis to Forecast Crime Rate in Nigeria

Authors: O. P. Popoola, O. A. Alawode, M. O. Olayiwola, A. M. Oladele

Abstract:

This research work uses discriminant analysis to forecast the crime rate in Nigeria between 1996 and 2008. The work is interested in how gender (male and female) relates to offences committed against the government, offences against other properties, disturbance in public places, murder/robbery offences, and other offences. The data used were collected from the National Bureau of Statistics (NBS), and the SPSS statistical package was used to analyse them. Time plots were produced for all 29 offences obtained from the raw data. Eigenvalues, multivariate tests, Wilks’ Lambda, standardized canonical discriminant function coefficients, and the predicted classifications were estimated. The research shows that the distribution of the scores from each function is standardized to have a mean of 0 and a standard deviation of 1. The magnitudes of the coefficients indicate how strongly each discriminating variable affects the score. In the predicted group membership, of the 172 cases predicted to belong to the crime-against-Government group, 66 were correctly predicted and 106 were incorrectly predicted. After going through the predicted classifications, we found that, for most groups, the number of correctly predicted cases was lower than the number of incorrectly predicted cases.

Keywords: discriminant analysis, DA, multivariate analysis of variance, MANOVA, canonical correlation, and Wilks’ Lambda

Procedia PDF Downloads 453
3876 Robustified Asymmetric Logistic Regression Model for Global Fish Stock Assessment

Authors: Osamu Komori, Shinto Eguchi, Hiroshi Okamura, Momoko Ichinokawa

Abstract:

The long time-series data on population assessments are essential for global ecosystem assessment because the temporal change of biomass in such a database reflects the status of global ecosystem properly. However, the available assessment data usually have limited sample sizes and the ratio of populations with low abundance of biomass (collapsed) to those with high abundance (non-collapsed) is highly imbalanced. To allow for the imbalance and uncertainty involved in the ecological data, we propose a binary regression model with mixed effects for inferring ecosystem status through an asymmetric logistic model. In the estimation equation, we observe that the weights for the non-collapsed populations are relatively reduced, which in turn puts more importance on the small number of observations of collapsed populations. Moreover, we extend the asymmetric logistic regression model using propensity score to allow for the sample biases observed in the labeled and unlabeled datasets. It robustified the estimation procedure and improved the model fitting.

Keywords: double robust estimation, ecological binary data, mixed effect logistic regression model, propensity score

Procedia PDF Downloads 255
3875 PM10 Prediction and Forecasting Using CART: A Case Study for Pleven, Bulgaria

Authors: Snezhana G. Gocheva-Ilieva, Maya P. Stoimenova

Abstract:

Ambient air pollution with fine particulate matter (PM10) is a systematic permanent problem in many countries around the world. The accumulation of a large number of measurements of both the PM10 concentrations and the accompanying atmospheric factors allows for their statistical modeling to detect dependencies and forecast future pollution. This study applies the classification and regression trees (CART) method for building and analyzing PM10 models. In the empirical study, average daily air data for the city of Pleven, Bulgaria, for a period of 5 years are used. Predictors in the models are seven meteorological variables, time variables, as well as lagged PM10 variables and some lagged meteorological variables, delayed by 1 or 2 days with respect to the initial time series. The degree of influence of the predictors in the models is determined. The selected best CART models are used to forecast future PM10 concentrations for two days ahead after the last date in the modeling procedure and show very accurate results.
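The lagged predictors described above can be built as in the following minimal sketch. The daily values are hypothetical, and `make_lagged_rows` is an illustrative helper rather than the authors' code; it only shows how a series delayed by 1 or 2 days becomes input columns for a tree model.

```python
def make_lagged_rows(series, lags=(1, 2)):
    """Pair each value with earlier values of the same series.

    Returns (features, targets), where features[i] holds the values
    observed `lag` days before targets[i], one entry per lag in `lags`.
    The first max(lags) days are dropped because their history is missing.
    """
    max_lag = max(lags)
    features, targets = [], []
    for t in range(max_lag, len(series)):
        features.append([series[t - lag] for lag in lags])
        targets.append(series[t])
    return features, targets


# Hypothetical average daily PM10 concentrations.
pm10 = [30.0, 42.0, 55.0, 38.0, 61.0]

lagged_features, pm10_targets = make_lagged_rows(pm10)
```

Each row of `lagged_features` could then be joined with the same day's meteorological and time variables before fitting a regression tree.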

Keywords: cross-validation, decision tree, lagged variables, short-term forecasting

Procedia PDF Downloads 185