Search results for: multivariate regression tree
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4442

3752 Integrating Machine Learning and Rule-Based Decision Models for Enhanced B2B Sales Forecasting and Customer Prioritization

Authors: Wenqi Liu, Reginald Bailey

Abstract:

This study explores an advanced approach to enhancing B2B sales forecasting by integrating machine learning models with a rule-based decision framework. The methodology begins with the development of a machine learning classification model to predict conversion likelihood, aiming to improve accuracy over traditional methods like logistic regression. The classification model's effectiveness is measured using metrics such as accuracy, precision, recall, and F1 score, alongside a feature importance analysis to identify key predictors. Following this, a machine learning regression model is used to forecast sales value, with the objective of reducing mean absolute error (MAE) compared to linear regression techniques. The regression model's performance is assessed using MAE, root mean square error (RMSE), and R-squared metrics, emphasizing feature contribution to the prediction. To bridge the gap between predictive analytics and decision-making, a rule-based decision model is introduced that prioritizes customers based on predefined thresholds for conversion probability and predicted sales value. This approach significantly enhances customer prioritization and improves overall sales performance by increasing conversion rates and optimizing revenue generation. The findings suggest that this combined framework offers a practical, data-driven solution for sales teams, facilitating more strategic decision-making in B2B environments.
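
The rule-based step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the threshold values, the `Lead` type, and the three-tier scheme are assumptions, since the abstract does not state the actual rules.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the abstract does not give the real values.
PROB_THRESHOLD = 0.6
VALUE_THRESHOLD = 10_000.0


@dataclass
class Lead:
    name: str
    conversion_prob: float   # assumed output of the classification model
    predicted_value: float   # assumed output of the regression model


def priority(lead: Lead) -> str:
    """Map the two model outputs to a priority tier using fixed thresholds."""
    likely = lead.conversion_prob >= PROB_THRESHOLD
    valuable = lead.predicted_value >= VALUE_THRESHOLD
    if likely and valuable:
        return "high"
    if likely or valuable:
        return "medium"
    return "low"


leads = [
    Lead("A", 0.82, 25_000.0),
    Lead("B", 0.35, 40_000.0),
    Lead("C", 0.20, 1_500.0),
]
tiers = {lead.name: priority(lead) for lead in leads}
print(tiers)
```

Keeping the rules separate from the models means the thresholds can be tuned by a sales team without retraining either model.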

Keywords: sales forecasting, machine learning, rule-based decision model, customer prioritization, predictive analytics

Procedia PDF Downloads 14
3751 Infodemic Detection on Social Media with a Multi-Dimensional Deep Learning Framework

Authors: Raymond Xu, Cindy Jingru Wang

Abstract:

Social media has become a globally connected and influencing platform. Social media data, such as tweets, can help predict the spread of pandemics and provide individuals and healthcare providers early warnings. Public psychological reactions and opinions can be efficiently monitored by AI models on the progression of dominant topics on Twitter. However, statistics show that as the coronavirus spreads, so does an infodemic of misinformation due to pandemic-related factors such as unemployment and lockdowns. Social media algorithms are often biased toward outrage by promoting content that people have an emotional reaction to and are likely to engage with. This can influence users’ attitudes and cause confusion. Therefore, social media is a double-edged sword. Combating fake news and biased content has become one of the essential tasks. This research analyzes the variety of methods used for fake news detection covering random forest, logistic regression, support vector machines, decision tree, naive Bayes, BoW, TF-IDF, LDA, CNN, RNN, LSTM, DeepFake, and hierarchical attention network. The performance of each method is analyzed. Based on these models’ achievements and limitations, a multi-dimensional AI framework is proposed to achieve higher accuracy in infodemic detection, especially pandemic-related news. The model is trained on contextual content, images, and news metadata.

Keywords: artificial intelligence, fake news detection, infodemic detection, image recognition, sentiment analysis

Procedia PDF Downloads 253
3750 Modeling Pan Evaporation Using Intelligent Methods of ANN, LSSVM and Tree Model M5 (Case Study: Shahroud and Mayamey Stations)

Authors: Hamidreza Ghazvinian, Khosro Ghazvinian, Touba Khodaiean

Abstract:

The importance of evaporation estimation in water resources and agricultural studies is undeniable. Pan evaporation is used around the world as an indicator of evaporation from lakes and reservoirs because its data are easy to interpret. In this research, intelligent models were investigated for estimating daily pan evaporation. Shahroud and Mayamey, two cities in Semnan province, Iran, were selected as the study sites. Both cities have dry climates with high evaporation potential. Eleven years of meteorological data from the synoptic stations of Shahroud and Mayamey were used. The intelligent models used in this study are Artificial Neural Network (ANN), Least Squares Support Vector Machine (LSSVM), and M5 tree models. Minimum and maximum air temperature (Tmin, Tmax), wind speed (WS), sunshine hours (SH), air pressure (PA), and relative humidity (RH) were selected as input data, and pan evaporation (EP) was taken as the output. 70% of the data was used for training and 30% for testing. The models were evaluated with the coefficient of determination (R2), root mean square error (RMSE), and mean absolute error (MAE). The results for the Shahroud and Mayamey stations showed that all three models performed reasonably well.
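
The three evaluation measures and the 70/30 split named in this abstract can be sketched in plain Python. The sample values are made up, R2 is taken here as 1 - SS_res/SS_tot (some studies report the squared correlation instead), and the split is assumed to be in record order, which the abstract does not specify.

```python
import math


def regression_metrics(observed, predicted):
    """Return R2, RMSE and MAE for paired observed/predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / n
    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot
    return {"R2": r2, "RMSE": rmse, "MAE": mae}


def split_records(records, train_fraction=0.7):
    """Split records in order: first part for training, the rest for testing."""
    cut = round(len(records) * train_fraction)
    return records[:cut], records[cut:]


# Hypothetical daily pan evaporation values (mm) and model predictions.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
metrics = regression_metrics(observed, predicted)
train, test = split_records(list(range(10)))
print(metrics, len(train), len(test))
```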

Keywords: pan evaporation, intelligent methods, shahroud, mayamey

Procedia PDF Downloads 73
3749 Effect of Drying on the Concrete Structures

Authors: A. Brahma

Abstract:

The drying of hydraulic materials is unavoidable and leads to significant spontaneous deformations. In this study, we show that the drying shrinkage of high-performance concrete can be described by a simple expression. A multiple regression model was developed to predict the drying shrinkage of high-performance concrete. The proposed model was assessed with a set of statistical tests. The model takes into consideration the main parameters of mix preparation and curing. There was very good agreement between the drying shrinkage predicted by the multiple regression model and the experimental results. The developed model adjusts easily to all hydraulic concrete types.

Keywords: hydraulic concretes, drying, shrinkage, prediction, modeling

Procedia PDF Downloads 366
3748 Adjustment of Parents of Children with Autism: A Multivariate Model

Authors: Ayelet Siman-Tov, Shlomo Kaniel

Abstract:

Objectives: The research validates a multivariate model that predicts parental adjustment to coping successfully with an autistic child. The model comprises four elements: parental stress, parental resources, parental adjustment and the child's autism symptoms. Background and aims: The purpose of the current study is the construction and validation of a model for the adjustment of parents and a child with autism. The suggested model is based on theoretical views on stress and links personal resources, stress, perception, parental mental health and quality of marriage and child adjustment with autism. The family stress approach focuses on the family as a system made up of a dynamic interaction between its members, who constitute interdependent parts of the system, and thus, a change in one family member brings about changes in the processes of the entire family system. From this perspective, a rise of new demands in the family and stress in the role of one family member affects the family system as a whole. Materials and methods: 176 parents of children aged between 6 to 16 diagnosed with ASD answered several questionnaires measuring parental stress, personal resources (sense of coherence, locus of control, social support), adjustment (mental health and marriage quality) and the child's autism symptoms. Results: Path analysis showed that a sense of coherence, internal locus of control, social support and quality of marriage increase the ability to cope with the stress of parenting an autistic child. Directions for further research are suggested.

Keywords: stress, adjustment, resources, Autism, parents, coherence

Procedia PDF Downloads 137
3747 Women, Quality of Life, and Infertility: The Mediating Role of Social Support and Hope

Authors: Saeideh Lotfi Nikoo, Azadeh Ghaheri, Reza Omani Samani

Abstract:

Context: In most cultures around the globe, infertility is recognized as a crisis, and infertile couples are exposed to psychosocial pressure. Indeed, the quality of life (QoL) of infertile women is lower than that of fertile controls. Objective: The purpose of this study was to investigate the impact of social support and hope on QoL in women undergoing infertility treatment. Methods: A cross-sectional study. Patient(s): 350 infertile women were recruited who had been referred to an infertility clinic for the first time and had no history of Assisted Reproductive Techniques (ART) failure. Intervention(s): Questionnaires on the Fertility Quality of Life (FertiQoL), Multi-dimensional Scale of Perceived Social Support (family and friends), and Snyder Hope Scale (pathway and agency) were used to collect data. Data were analyzed by univariate and multivariate analysis. A P value <0.05 was considered statistically significant. Result(s): Multivariate analysis indicated that infertile women with higher scores for social support from family (b = 0.59, 95% CI: 0.03, 1.15, P = 0.040) and friends (b = 0.61, 95% CI: 0.17, 1.04, P = 0.006) and for hope pathway (b = 0.94, 95% CI: 0.29, 1.59, P = 0.005) and agency (b = 1.13, 95% CI: 0.45, 1.82, P = 0.001) have significantly better Core FertiQoL. The results revealed that social support and hope are significantly and positively associated with other subscales of FertiQoL as well. Conclusions: According to the results, lifestyle interventions such as receiving social support, building a sound family with effective communication, and providing appropriate health education are of crucial importance to address psychological distress and improve the fertility QoL of women experiencing fertility problems.

Keywords: infertility, social support, infertile women, hope

Procedia PDF Downloads 91
3746 Influence of Parameters of Modeling and Data Distribution for Optimal Condition on Locally Weighted Projection Regression Method

Authors: Farhad Asadi, Mohammad Javad Mollakazemi, Aref Ghafouri

Abstract:

Recent research in neural networks and neuroscience on modeling complex time series data and on statistical learning has focused mostly on learning from high-dimensional input spaces and signals. Local linear models are a strong choice for modeling local nonlinearity in data series. Locally weighted projection regression is a flexible and powerful algorithm for nonlinear approximation in high-dimensional signal spaces. In this paper, different learning scenarios for one- and two-dimensional data series with different distributions are investigated in simulation, and noise is added to the data to produce differently disordered time series and to evaluate how well the algorithm predicts local nonlinearity. The performance of the algorithm is then simulated, and its sensitivity to the data distribution, both when the data are dense and when few data points are available, is explained together with the influence of the important local-validity parameter under different data distributions.

Keywords: local nonlinear estimation, LWPR algorithm, online training method, locally weighted projection regression method

Procedia PDF Downloads 501
3745 Exploration and Evaluation of the Effect of Multiple Countermeasures on Road Safety

Authors: Atheer Al-Nuaimi, Harry Evdorides

Abstract:

Every day many people die or are disabled or injured on roads around the world, which calls for more specific treatments of transportation safety issues. The International Road Assessment Program (iRAP) model is one of the comprehensive road safety models that accounts for many factors affecting road safety in a cost-effective way in low- and middle-income countries. In the iRAP model, road safety is divided into five star ratings, from 1 star (the lowest level) to 5 stars (the highest level). These star ratings are based on a star rating score, which is calculated by the iRAP methodology from road attributes, traffic volumes, and operating speeds. The outcome of the iRAP methodology is a set of treatments that can be used to improve road safety and reduce the number of fatalities and serious injuries (FSI). These countermeasures can be used separately as single countermeasures or combined as multiple countermeasures for a location. There is general agreement that the effectiveness of a countermeasure is subject to consistent losses when it is used in combination with other countermeasures. That is, the crash reduction estimates of individual countermeasures cannot simply be added together. The iRAP methodology uses multiple countermeasure adjustment factors to predict reductions in the effectiveness of road safety countermeasures when more than one countermeasure is chosen. Multiple countermeasure correction factors are calculated for every 100-meter segment and for every crash type. However, the limitations of this methodology include a likely over-estimation of the predicted crash reduction. This study aims to adjust this correction factor by developing new models to calculate the effect of using multiple countermeasures on the number of fatalities for a location or an entire road. Regression models have been used to establish relationships between crash frequencies and the factors that affect their rates. Multiple linear regression, negative binomial regression, and Poisson regression techniques were used to develop models that can address the effectiveness of using multiple countermeasures. Analyses conducted using The R Project for Statistical Computing showed that a model developed by the negative binomial regression technique could give more reliable predictions of the number of fatalities after the implementation of multiple road safety countermeasures than the iRAP model. The results also showed that the negative binomial regression approach gives more precise results than multiple linear and Poisson regression techniques because of overdispersion and standard error issues.
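
The overdispersion issue mentioned in this abstract can be illustrated with a short sketch. A Poisson model assumes the variance equals the mean, whereas the common NB2 form of the negative binomial allows Var = mu + alpha * mu^2. The crash counts below are hypothetical, and the check ignores covariates; it only looks at raw counts, whereas a real analysis would examine a fitted model.

```python
def sample_mean_and_variance(counts):
    """Return the sample mean and the unbiased sample variance."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return mean, variance


def dispersion_index(counts):
    """Variance-to-mean ratio; values well above 1 suggest overdispersion."""
    mean, variance = sample_mean_and_variance(counts)
    return variance / mean


def nb2_alpha_moments(counts):
    """Method-of-moments estimate of alpha in Var = mu + alpha * mu**2."""
    mean, variance = sample_mean_and_variance(counts)
    return max(0.0, (variance - mean) / mean ** 2)


# Hypothetical fatality counts per 100-meter road segment.
crash_counts = [0, 0, 1, 0, 7, 0, 2, 0, 12, 1]
index = dispersion_index(crash_counts)
alpha = nb2_alpha_moments(crash_counts)
print(index, alpha)
```

An index far above 1, as for these made-up counts, is the situation in which Poisson standard errors tend to be too small and a negative binomial model is usually preferred.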

Keywords: international road assessment program, negative binomial, road multiple countermeasures, road safety

Procedia PDF Downloads 239
3744 Measures of Phylogenetic Support for Phylogenomic and the Whole Genomes of Two Lungfish Restate Lungfish and Origin of Land Vertebrates

Authors: Yunfeng Shan, Xiaoliang Wang, Youjun Zhou

Abstract:

Whole-genome data from two lungfish species, along with other species, present a valuable opportunity to reassess the longstanding debate regarding the evolutionary relationships among tetrapods, lungfishes, and coelacanths. However, the use of bootstrap support has become outdated for large-scale phylogenomic data. Without robust phylogenetic support, the phylogenetic trees become meaningless. Therefore, it is necessary to re-evaluate the phylogenies of tetrapods, lungfishes, and coelacanths using novel measures of phylogenetic support specifically designed for phylogenomic data, as the previous phylogenies were based on 100% bootstrap support. Our findings consistently provide strong evidence favoring lungfish as the closest living relative of tetrapods. This conclusion is based on high gene support confidence with confidence intervals exceeding 95%, high internode certainty, and high gene concordance factor. The evidence stems from two datasets containing recently deciphered whole genomes of two lungfish species, as well as five previous datasets derived from lungfish transcriptomes. These results yield fresh insights into the three hypotheses regarding the phylogenies of tetrapods, lungfishes, and coelacanths. Importantly, these hypotheses are not mere conjectures but are substantiated by a significant number of genes. Analyzing real biological data further demonstrates that the inclusion of additional taxa diminishes the number of orthologues and leads to more diverse tree topologies. Consequently, gene trees and species trees may not be identical even when whole-genome sequencing data is utilized. However, it is worth noting that many gene trees can accurately reflect the species tree if an appropriate number of taxa, typically ranging from six to ten, are sampled. 
Therefore, it is crucial to carefully select the number of taxa and an appropriate outgroup while excluding fast-evolving taxa as outgroups to mitigate the adverse effects of long-branch attraction (LBA) and achieve an accurate reconstruction of the species tree. This is particularly important as more whole-genome sequencing data becomes available.

Keywords: gene support confidence (GSC), origin of land vertebrates, coelacanth, two whole genomes of lungfishes, confidence intervals

Procedia PDF Downloads 85
3743 Rd-PLS Regression: From the Analysis of Two Blocks of Variables to Path Modeling

Authors: E. Tchandao Mangamana, V. Cariou, E. Vigneau, R. Glele Kakai, E. M. Qannari

Abstract:

A new definition of a latent variable associated with a dataset makes it possible to propose variants of the PLS2 regression and the multi-block PLS (MB-PLS). We shall refer to these variants as Rd-PLS regression and Rd-MB-PLS respectively because they are inspired by both Redundancy analysis and PLS regression. Usually, a latent variable t associated with a dataset Z is defined as a linear combination of the variables of Z with the constraint that the length of the loading weights vector equals 1. Formally, t=Zw with ‖w‖=1. Denoting by Z' the transpose of Z, we define herein, a latent variable by t=ZZ’q with the constraint that the auxiliary variable q has a norm equal to 1. This new definition of a latent variable entails that, as previously, t is a linear combination of the variables in Z and, in addition, the loading vector w=Z’q is constrained to be a linear combination of the rows of Z. More importantly, t could be interpreted as a kind of projection of the auxiliary variable q onto the space generated by the variables in Z, since it is collinear to the first PLS1 component of q onto Z. Consider the situation in which we aim to predict a dataset Y from another dataset X. These two datasets relate to the same individuals and are assumed to be centered. Let us consider a latent variable u=YY’q to which we associate the variable t= XX’YY’q. Rd-PLS consists in seeking q (and therefore u and t) so that the covariance between t and u is maximum. The solution to this problem is straightforward and consists in setting q to the eigenvector of YY’XX’YY’ associated with the largest eigenvalue. For the determination of higher order components, we deflate X and Y with respect to the latent variable t. Extending Rd-PLS to the context of multi-block data is relatively easy. Starting from a latent variable u=YY’q, we consider its ‘projection’ on the space generated by the variables of each block Xk (k=1, ..., K) namely, tk= XkXk'YY’q. 
Thereafter, Rd-MB-PLS seeks q in order to maximize the average of the covariances of u with tk (k=1, ..., K). The solution to this problem is given by q, eigenvector of YY’XX’YY’, where X is the dataset obtained by horizontally merging datasets Xk (k=1, ..., K). For the determination of latent variables of order higher than 1, we use a deflation of Y and Xk with respect to the variable t= XX’YY’q. In the same vein, extending Rd-MB-PLS to the path modeling setting is straightforward. Methods are illustrated on the basis of case studies and performance of Rd-PLS and Rd-MB-PLS in terms of prediction is compared to that of PLS2 and MB-PLS.
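
The first Rd-PLS component described in this abstract can be sketched in plain Python. Since t'u = q'YY'XX'YY'q, maximizing the covariance under a unit norm on q amounts to finding the dominant eigenvector of YY'XX'YY'. The sketch below finds it by power iteration, which is a choice made here for self-containment and not something the abstract prescribes. The toy data, the helper names, and the iteration count are assumptions, and deflation for higher-order components is not shown.

```python
import math
import random


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def center(a):
    """Subtract column means, since the datasets are assumed centered."""
    means = [sum(col) / len(a) for col in zip(*a)]
    return [[x - m for x, m in zip(row, means)] for row in a]


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def gram_matrices(X, Y):
    """Return XX' and YY' computed from the centered datasets."""
    Xc, Yc = center(X), center(Y)
    return matmul(Xc, transpose(Xc)), matmul(Yc, transpose(Yc))


def criterion(X, Y, q):
    """t'u, proportional to cov(t, u), with u = YY'q and t = XX'YY'q."""
    XXt, YYt = gram_matrices(X, Y)
    u = matvec(YYt, q)
    t = matvec(XXt, u)
    return sum(a * b for a, b in zip(t, u))


def rd_pls_first_component(X, Y, iterations=500, seed=0):
    """Power iteration for the dominant eigenvector q of YY'XX'YY'."""
    XXt, YYt = gram_matrices(X, Y)
    M = matmul(matmul(YYt, XXt), YYt)
    start = random.Random(seed)
    q = normalize([start.gauss(0.0, 1.0) for _ in M])
    for _ in range(iterations):
        q = normalize(matvec(M, q))
    u = matvec(YYt, q)
    t = matvec(XXt, u)
    return q, u, t


# Hypothetical toy data: 6 individuals, 3 X-variables, 2 Y-variables.
rng = random.Random(42)
X = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(6)]
Y = [[row[0] + 0.1 * rng.gauss(0.0, 1.0), row[1] - row[2]] for row in X]
q, u, t = rd_pls_first_component(X, Y)
print(criterion(X, Y, q))
```

In practice a library eigen-solver would replace the power iteration; the explicit loop is used here only to keep the sketch free of dependencies.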

Keywords: multiblock data analysis, partial least squares regression, path modeling, redundancy analysis

Procedia PDF Downloads 146
3742 Retinal Changes in Patients with Idiopathic Inflammatory Myopathies: A Case-Control Study

Authors: Rachna Agarwal, R. Naveen, Darpan Thakre, Rohit Shahi, Maryam Abbasi, Upendra Rathore, Latika Gupta

Abstract:

Aim: Retinal changes are the window to systemic vasculature. Therefore, we explored retinal changes in patients with idiopathic inflammatory myopathies (IIM) as a surrogate for vascular health. Methods: Adult and juvenile IIM patients visiting a tertiary care centre in 2021 satisfying the International Myositis Classification Criteria were enrolled for detailed ophthalmic examination in comparison with healthy controls (HC). Patients with conditions that precluded thorough posterior chamber examination were excluded. Scale variables are expressed as median (IQR). Multivariate analysis (binary logistic regression-BLR) was conducted, adjusting for age, gender, and comorbidities besides factors significant in univariate analysis. Results: 43 patients with IIM [31 females; age 36 (23-45) years; disease duration 5.5 (2-12) months] were enrolled for participation. DM (44%) was the most common diagnosis. IIM patients exhibited frequent attenuation of retinal vessels (32.6% vs. 4.3%, p <0.001), AV nicking (14% vs. 2.2%, p=0.053), and vascular tortuosity (18.6% vs. 2.2%, p=0.012), besides decreased visual acuity (53.5% vs. 10.9%, p<0.001) and immature cataracts (34.9% vs. 2.2%, p<0.001). Attenuation of vessels [OR 10.9 (1.7-71), p=0.004] emerged as significantly different from HC after adjusting for covariates in BLR. Notably, adults with IIM were more predisposed to retinal abnormalities [21 (57%) vs. 1 (16%), p=0.068], especially attenuation of vessels [14(38%) vs. 0(0), p=0.067] than jIIM. However, no difference was found in retinal features amongst the subtypes of adult IIM, nor did they correlate with MDAAT, MDI, or HAQ-DI. Conclusion: Retinal microvasculopathy and diminution of vision occur in nearly one-third to half of the patients with IIM. Microvasculopathy occurs across subtypes of IIM, and more so in adults, calling for further investigation as a surrogate for damage assessment and potentially even systemic vascular health.

Keywords: idiopathic inflammatory myopathies, vascular health, retinal microvasculopathy, arterial attenuation

Procedia PDF Downloads 89
3741 Incidence, Risk Factors and Impact of Major Adverse Events Following Paediatric Cardiac Surgery

Authors: Sandipika Gupta

Abstract:

Objective: Due to admirably low 30-day mortality rates for paediatric cardiac surgery, it is now pertinent to turn towards more intermediate-length outcomes, such as the morbidities closely associated with these surgeries. One such morbidity, major adverse events (MAE), comprises a group of adverse outcomes associated with paediatric cardiac surgery (e.g. cardiac arrest, major haemorrhage). Methods: This is a retrospective study that analysed the incidence and impact of MAE, the primary outcome, in the UK population. The data were collected in 5 centres between October 2015 and June 2017, amassing 3090 surgical episodes. The incidence of and risk factors for MAE were assessed through descriptive statistical analyses and multivariate logistic regression. The secondary outcomes of life status at 6 months and the length of hospital stay were also evaluated to understand the impact of MAE on patients. Results: Out of 3090 episodes, 134 (4.3%) had a postoperative MAE. The majority of the episodes were in: neonates (47%, P<0.001), high-risk cardiac diagnosis groups (20.1%, P<0.001), episodes with longer times on bypass (72.4%, P<0.001) and urgent surgeries (57.9%, P<0.001). Episodes reporting MAE also reported longer lengths of stay in hospital (29 days vs 9 days, P<0.001). Furthermore, patients experiencing MAE were at a higher risk of mortality at the 6-month life status check (mortality rates: 29.2% vs 2%, P<0.001). Conclusions: Key risk factors were identified. An important negative impact of MAE was found for patients. The identified risk factors could be used to profile and flag at-risk patients. Monitoring of MAE rates and closer investigation into the care pathway before and after individual MAEs in children’s heart units may lead to a reduction in these terrible events.

Keywords:

Procedia PDF Downloads 231
3740 The Impact of the Board of Directors’ Characteristics on Tax Aggressiveness in USA Companies

Authors: Jihen Ayadi Sellami

Abstract:

The rapid evolution of the global financial landscape has led to increased attention to corporate tax policies and the need to understand the factors that influence corporate tax behavior. To mitigate any residual loss for shareholders resulting from tax aggressiveness and to resolve the agency problem, appropriate systems that separate the function of management from that of control are needed. In this context of growing concern to limit aggressive corporate tax practices through governance, this study aims to examine the influence of six key characteristics of the board of directors, viewed as a governance mechanism (board size, diligence, CEO duality, presence of audit committees, gender diversity, and independence of directors), on the tax decisions of non-financial corporations in the United States. Using a sample of 90 non-financial US firms from the S&P 500 over the four-year period from 2014 to 2017, the results of a multivariate linear regression highlight significant associations between these characteristics and corporate tax policy. Notably, a larger board, gender diversity, diligence, and increased director independence appear to play an important role in reducing tax aggressiveness. CEO duality, by contrast, has a positive and significant correlation with tax aggressiveness, which can be explained by the way the manager exploits his specific position within the company. These findings contribute to a deeper understanding of how board characteristics can influence corporate tax management, providing avenues for more effective corporate governance and more responsible tax decision-making.

Keywords: tax aggressiveness, board of directors, board size, CEO duality, audit committees, gender diversity, director independence, diligence, corporate governance, united states

Procedia PDF Downloads 60
3739 Partial Least Square Regression for High-Dimentional and High-Correlated Data

Authors: Mohammed Abdullah Alshahrani

Abstract:

The research focuses on investigating the use of partial least squares (PLS) methodology for addressing challenges associated with high-dimensional correlated data. Recent technological advancements have led to experiments producing data characterized by a large number of variables compared to observations, with substantial inter-variable correlations. Such data patterns are common in chemometrics, where near-infrared (NIR) spectrometer calibrations record chemical absorbance levels across hundreds of wavelengths, and in genomics, where thousands of genomic regions' copy number alterations (CNA) are recorded from cancer patients. PLS serves as a widely used method for analyzing high-dimensional data, functioning as a regression tool in chemometrics and a classification method in genomics. It handles data complexity by creating latent variables (components) from original variables. However, applying PLS can present challenges. The study investigates key areas to address these challenges, including unifying interpretations across three main PLS algorithms and exploring unusual negative shrinkage factors encountered during model fitting. The research presents an alternative approach to addressing the interpretation challenge of predictor weights associated with PLS. Sparse estimation of predictor weights is employed using a penalty function combining a lasso penalty for sparsity and a Cauchy distribution-based penalty to account for variable dependencies. The results demonstrate sparse and grouped weight estimates, aiding interpretation and prediction tasks in genomic data analysis. High-dimensional data scenarios, where predictors outnumber observations, are common in regression analysis applications. Ordinary least squares regression (OLS), the standard method, performs inadequately with high-dimensional and highly correlated data. 
Copy number alterations (CNA) in key genes have been linked to disease phenotypes, highlighting the importance of accurate classification of gene expression data in bioinformatics and biology using regularized methods like PLS for regression and classification.

Keywords: partial least square regression, genetics data, negative filter factors, high dimensional data, high correlated data

Procedia PDF Downloads 49
3738 Applying the Regression Technique for Prediction of the Acute Heart Attack

Authors: Paria Soleimani, Arezoo Neshati

Abstract:

Myocardial infarction is one of the leading causes of death in the world. Some of these deaths occur even before the patient reaches the hospital. Myocardial infarction occurs as a result of impaired blood supply. Because most of these deaths are due to coronary artery disease, awareness of the warning signs of a heart attack is essential. Some heart attacks are sudden and intense, but most start slowly, with mild pain or discomfort, so early detection and successful treatment of these symptoms are vital to save patients. The value of a system designed to assist physicians in the early diagnosis of acute heart attacks is therefore obvious. The purpose of this study is to determine how well a predictive model would perform based only on patient-reportable clinical history factors, without using diagnostic tests or physical exams. This type of prediction model might have application outside the hospital setting, giving patients accurate advice that influences them to seek care in appropriate situations. For this purpose, data were collected on 711 heart patients in hospitals in Iran. 28 clinical attributes that can be reported by patients were studied. Three logistic regression models were built on the basis of the 28 features to predict the risk of heart attack. The best-performing logistic regression model had a C-index of 0.955 and an accuracy of 94.9%. The variables severe chest pain, back pain, cold sweats, shortness of breath, nausea, and vomiting were selected as the main features.
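
The two performance figures reported in this abstract, the C-index and accuracy, can be computed as sketched below. For a binary outcome the C-index is the share of (event, non-event) pairs in which the event case gets the higher predicted risk, which equals the area under the ROC curve. The labels, scores, and the 0.5 cut-off are hypothetical; the abstract does not state which cut-off was used.

```python
def c_index(labels, scores):
    """Share of (event, non-event) pairs ranked correctly; ties count as 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(positives) * len(negatives))


def accuracy(labels, scores, threshold=0.5):
    """Share of cases classified correctly at the given risk threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Hypothetical outcomes (1 = heart attack) and predicted risks.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
c = c_index(labels, scores)
acc = accuracy(labels, scores)
print(c, acc)
```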

Keywords: Coronary heart disease, Acute heart attacks, Prediction, Logistic regression

Procedia PDF Downloads 447
3737 Understanding the Issue of Reproductive Matters among Urban Women: A Study of Four Cities in India from National Family Health Survey-4

Authors: Priyanka Dixit

Abstract:

Reproductive health problems are an important public health issue in most developing countries, including India. It is common in India for women of reproductive age to suffer from reproductive illnesses and not seek care. The existing literature tells us very little about the several dimensions of reproductive morbidity. In addition, the general perception is that metros have better medical infrastructure, so their residents should lead healthier lives. However, some studies reveal a very different picture. Therefore, the present study was conducted with the specific objectives of finding the prevalence of reproductive health problems and the treatment-seeking behavior of currently married women in four metro cities in India, namely Mumbai, Delhi, Chennai, and Kolkata. In addition, this paper examines the effect of socio-economic and demographic factors on self-reported reproductive health problems. Bivariate and multivariate regression analyses were applied to achieve these objectives. The study is based on National Family Health Survey 2015-16 data. The analysis shows that the prevalence of any reproductive health problem among women is highest in Mumbai, followed by Delhi, Chennai, and Kolkata. A large share of women in all four metro cities reported abdominal pain, itching, and a burning sensation while urinating as the major problems. However, in spite of the high prevalence of reproductive health problems, a large proportion of such women in all these cities do not seek any advice or treatment for them. This study also investigates the determinants that affect the prevalence of reproductive health problems, to help policy makers plan proper interventions for improving women’s reproductive health.

Keywords: reproductive health, India, national family health survey-4, city

Procedia PDF Downloads 210
3736 Fuzzy Logic Classification Approach for Exponential Data Set in Health Care System for Predication of Future Data

Authors: Manish Pandey, Gurinderjit Kaur, Meenu Talwar, Sachin Chauhan, Jagbir Gill

Abstract:

Health-care management systems are of great interest because they provide easy and fast management of all aspects relating to a patient, not necessarily medical ones. Moreover, there are more and more cases of pathologies in which diagnosis and treatment can only be carried out by using medical imaging techniques. With ever-increasing prevalence, medical images are directly acquired in or converted into digital form for storage as well as subsequent retrieval and processing. Data mining is the process of extracting information from large data sets using algorithms and techniques drawn from the fields of statistics, machine learning and database management systems. Forecasting is a prediction of what will occur in the future, and it is an uncertain process. Because of this uncertainty, the accuracy of a forecast is as important as the outcome predicted by forecasting the independent variables. A forecast control must be used to establish whether the accuracy of the forecast is within satisfactory limits. Fuzzy regression methods have commonly been used to develop consumer preference models that correlate engineering characteristics with consumer preferences regarding a new product; the consumer preference models provide a platform whereby product developers can decide the engineering characteristics needed to satisfy consumer preferences before developing the product. Recent research shows that these fuzzy regression methods are commonly used to model customer preferences. We propose testing the strength of the Exponential Regression Model over the regression toward the mean Model.

Keywords: health-care management systems, fuzzy regression, data mining, forecasting, fuzzy membership function

Procedia PDF Downloads 277
3735 Tree-Based Inference for Regionalization: A Comparative Study of Global Topological Perturbation Methods

Authors: Orhun Aydin, Mark V. Janikas, Rodrigo Alves, Renato Assuncao

Abstract:

In this paper, a tree-based perturbation methodology for regionalization inference is presented. Regionalization is a constrained optimization problem that aims to create groups with similar attributes while satisfying spatial contiguity constraints. As in any constrained optimization problem, the spatial constraint may hinder convergence to some global minima, resulting in spatially contiguous members of a group with dissimilar attributes. This paper presents a general methodology for rigorously perturbing spatial constraints through the use of random spanning trees. The general framework presented can be used to quantify the effect of the spatial constraints on the overall regionalization result. We compare several types of stochastic spanning trees used in inference problems such as fuzzy regionalization and determining the number of regions. The performance of stochastic spanning trees is juxtaposed against the traditional permutation-based hypothesis testing frequently used in spatial statistics. Inference results for fuzzy regionalization and determining the number of regions are presented on the Local Area Personal Incomes for Texas Counties provided by the Bureau of Economic Analysis.
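As a rough illustration of the random spanning trees mentioned in this abstract, the sketch below draws one spanning tree of a rook-contiguity grid by shuffling the edges and running Kruskal's algorithm, which is equivalent to assigning independent random edge weights. The grid layout, the names `grid_edges` and `random_spanning_tree`, the seed, and the choice of this particular tree type are all assumptions made for illustration; the abstract does not say which stochastic spanning trees the authors compare.

```python
import random

def grid_edges(rows, cols):
    """Rook-contiguity edges of a rows x cols grid; cells are numbered row by row."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            cell = r * cols + c
            if c + 1 < cols:
                edges.append((cell, cell + 1))      # east neighbour
            if r + 1 < rows:
                edges.append((cell, cell + cols))   # south neighbour
    return edges

def random_spanning_tree(n_nodes, edges, rng):
    """Kruskal's algorithm on randomly ordered edges (same as random edge weights)."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    shuffled = edges[:]
    rng.shuffle(shuffled)
    tree = []
    for u, v in shuffled:
        ru, rv = find(u), find(v)
        if ru != rv:            # the edge joins two components, so no cycle is created
            parent[ru] = rv
            tree.append((u, v))
    return tree

rng = random.Random(0)          # hypothetical seed, for reproducibility only
edges = grid_edges(4, 5)
tree = random_spanning_tree(20, edges, rng)
print(len(tree))                # a spanning tree of 20 cells has 19 edges
```

Cutting any single edge of such a tree splits the cells into two groups that are each spatially contiguous, which is why spanning trees are a convenient way to encode and perturb a contiguity constraint.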

Keywords: regionalization, constrained clustering, probabilistic inference, fuzzy clustering

Procedia PDF Downloads 228
3734 Statistical Analysis of the Impact of Maritime Transport Gross Domestic Product (GDP) on Nigeria’s Economy

Authors: Kehinde Peter Oyeduntan, Kayode Oshinubi

Abstract:

Nigeria is referred to as the ‘Giant of Africa’ due to its high population, land mass and large economy. However, it still trails far behind many smaller economies on the continent in terms of maritime operations. Because the maritime industry is the spark plug for national growth, housing the most crucial infrastructure that generates wealth for a nation, it is worrisome that a nation with six seaports lags in maritime activities. In this research, we studied how the Gross Domestic Product (GDP) of maritime transport influences the Nigerian economy. To do this, we applied Simple Linear Regression (SLR), Support Vector Machine (SVM), Polynomial Regression Model (PRM), Generalized Additive Model (GAM) and Generalized Linear Mixed Model (GLMM) to model the relationship between the nation’s Total GDP (TGDP) and the Maritime Transport GDP (MGDP) using time series data covering 20 years. The results showed that the MGDP is statistically significant to the Nigerian economy. Among the statistical tools applied, the PRM of order 4 describes the relationship better than the other methods. The recommendations presented in this study will guide policy makers and help improve the economy of Nigeria in terms of its GDP.

Keywords: maritime transport, economy, GDP, regression, port

Procedia PDF Downloads 151
3733 The Effect of Accounting Conservatism on Cost of Capital: A Quantile Regression Approach for MENA Countries

Authors: Maha Zouaoui Khalifa, Hakim Ben Othman, Hussaney Khaled

Abstract:

Prior empirical studies have investigated the economic consequences of accounting conservatism by examining its impact on the cost of equity capital (COEC). However, the findings are not conclusive. We assume that the inconsistent results on this association may be attributed to the regression models used in data analysis. To address this issue, we re-examine the effect of different dimensions of accounting conservatism, unconditional conservatism (U_CONS) and conditional conservatism (C_CONS), on the COEC for a sample of listed firms from Middle East and North Africa (MENA) countries, applying the quantile regression (QR) approach developed by Koenker and Bassett (1978). Although the classical ordinary least squares (OLS) method is widely used in empirical accounting research, it may produce inefficient and biased estimates in the case of departures from normality or long-tailed error distributions. The QR method is more powerful than OLS in handling this kind of problem. It allows the coefficients on the independent variables to shift across the distribution of the dependent variable, whereas the OLS method only estimates the conditional mean effects on a response variable. We find, as predicted, that U_CONS has a significant positive effect on the COEC, whereas C_CONS has a negative impact. The findings also suggest that the effect of the two dimensions of accounting conservatism differs considerably across COEC quantiles. By comparing results from the QR method with those of OLS, this study sheds more light on the association between accounting conservatism and COEC.
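The contrast between OLS and quantile regression drawn in this abstract can be sketched in the simplest possible setting, a model with an intercept only. There, OLS returns the sample mean, while quantile regression minimises the check (pinball) loss and returns a sample quantile. The data and the helper names `pinball_loss` and `sample_quantile` are hypothetical; a full quantile regression would minimise the same loss over coefficients on covariates, which this sketch does not attempt.

```python
def pinball_loss(q, ys, tau):
    """Total check (pinball) loss of predicting the constant q for observations ys."""
    return sum(tau * (y - q) if y >= q else (1 - tau) * (q - y) for y in ys)

def sample_quantile(ys, tau):
    """Minimise the pinball loss over the observed values (a minimiser lies among them)."""
    return min(sorted(ys), key=lambda q: pinball_loss(q, ys, tau))

ys = [1, 2, 3, 4, 100]            # hypothetical data with one long-tail observation
mean = sum(ys) / len(ys)          # minimiser of squared error, the OLS target
median = sample_quantile(ys, 0.5) # tau = 0.5 gives the median
upper = sample_quantile(ys, 0.9)  # tau = 0.9 tracks the upper tail
print(mean, median, upper)        # prints: 22.0 3 100
```

The single extreme value pulls the mean far from the bulk of the data, while the median stays put and the 0.9 quantile describes the tail separately, which is the sense in which quantile estimates can differ across the distribution.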

Keywords: unconditional conservatism, conditional conservatism, cost of equity capital, OLS, quantile regression, emerging markets, MENA countries

Procedia PDF Downloads 354
3732 Remittances and Water Access: A Cross-Sectional Study of Sub Saharan Africa Countries

Authors: Narges Ebadi, Davod Ahmadi, Hiliary Monteith, Hugo Melgar-Quinonez

Abstract:

Migration cannot necessarily relieve pressure on water resources in origin communities, and male out-migration can increase the water management burden of women. However, inflows of financial remittances seem to offer possibilities of investing in improving drinking-water access. Therefore, remittances may be an important pathway for migrants to support water security. This paper explores the association between water access and the receipt of remittances in households in sub-Saharan Africa. Data from round 6 of the 'Afrobarometer' surveys in 2016 were used (n = 49,137). Descriptive, bivariate and multivariate statistical analyses were carried out in this study. Regardless of country, findings from descriptive analyses showed that approximately 80% of the respondents never received remittances, and 52% had enough clean water. Only one-fifth of the respondents had a piped water supply inside the house (19.9%), and approximately 25% had access to a toilet inside the house. Bivariate analyses revealed that even though receiving remittances was significantly associated with water supply, the strength of the association was very weak. However, other factors such as the area of residence (rural vs. urban), cash income frequencies, electricity access, and asset ownership were strongly associated with water access. Results from unadjusted multinomial logistic regression revealed that the probability of having no access to piped water increased among remittance recipients who received financial support at least once a month (OR=1.324) (p < 0.001). In contrast, those not receiving remittances were more likely to regularly have a water access concern (OR=1.294) (p < 0.001), and not have access to a latrine (OR=1.665) (p < 0.001). In conclusion, receiving remittances is significantly related to water access, although the odds ratios for socio-demographic factors were stronger.
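For readers unfamiliar with the odds ratios (OR) quoted in this abstract, the sketch below computes an unadjusted odds ratio and an approximate 95% Wald confidence interval from a 2x2 table. The counts are invented for illustration and are not taken from the Afrobarometer data; the study itself reports odds ratios from multinomial logistic regression, which this simple table calculation only approximates in spirit. The function names are hypothetical.

```python
import math

def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table [[a, b], [c, d]]: rows = exposure, columns = outcome."""
    return (a * d) / (b * c)

def wald_ci(a, b, c, d, z=1.96):
    """Approximate 95% Wald confidence interval, built on the log-odds scale."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)

# Hypothetical counts, NOT survey data:
# rows = receives remittances (yes / no), columns = no piped water (yes / no)
a, b, c, d = 60, 40, 50, 50
estimate = odds_ratio(a, b, c, d)
low, high = wald_ci(a, b, c, d)
print(estimate, round(low, 2), round(high, 2))
```

An odds ratio above 1 means the outcome is more likely in the first row than in the second; an interval that includes 1 would indicate that the difference is not statistically distinguishable at that level.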

Keywords: remittances, water access, SSA, migration

Procedia PDF Downloads 177
3731 Intrusion Detection in Computer Networks Using a Hybrid Model of Firefly and Differential Evolution Algorithms

Authors: Mohammad Besharatloo

Abstract:

Intrusion detection is an important research topic in network security because of the increasing use of computer network services. Intrusion detection aims to detect unauthorized use or abuse of networks and systems by intruders. Therefore, the intrusion detection system is an efficient tool for controlling user access through predefined regulations. Since the data used in intrusion detection systems are high-dimensional, a proper representation is required to show the basic structure of the data. Therefore, it is necessary to eliminate redundant features to create the best representation subset. In the proposed method, a hybrid model of differential evolution and firefly algorithms was employed to choose the best subset of properties. In addition, a decision tree and a support vector machine (SVM) are adopted to determine the quality of the selected properties. First, the sorted population is divided into two sub-populations. The two optimization algorithms are then applied to these sub-populations, respectively. Then, the sub-populations are merged to create the population for the next iteration. The performance of the proposed method is evaluated on KDD Cup99. The simulation results show that the proposed method performs better than the other methods in this context.

Keywords: intrusion detection system, differential evolution, firefly algorithm, support vector machine, decision tree

Procedia PDF Downloads 91
3730 Impact of International Student Mobility on European and Global Identity: A Case Study of Switzerland

Authors: Karina Oborune

Abstract:

International student mobility involves a unique spatio-temporal context, and exploring the various aspects of mobile students’ experience can lead to new findings within identity studies. Previous studies have mainly focused on student mobility within Europe and its impact on European identity, arguing that students who participate in intra-European mobility already feel European before the exchange. In contrast to previous studies, this paper analyzes student mobility from a different point of view. In order to see whether a true Europeanization of identities is taking place, it is necessary to contrast European identity with an alternative supranational identity that could similarly result from student mobility, in particular a global identity. The paper also explores whether geographical constellation (the continental location of the host country during mobility: Europe vs. outside of Europe) plays a role. Based on a newly developed model of multicultural, social and socio-demographic variables, it is argued that after intra-European mobility only the global identity of students could be increased (H1), while mobility to countries outside of Europe causes changes in European identity (H2). The quantitative study (survey, n=1440, 22 higher education institutions, an experimental group of former and future/potential mobile students and a control group of non-mobile students) was held in Switzerland, where an equally high number of students participate in intra-European mobility and in mobility outside of Europe. The results of multivariate linear regression showed that students who participate in exchange in Europe increase their European identity due to having close friends from Europe and due to the length of the mobility experience, whereas students who participate in exchange outside of Europe increase their global identity due to having close friends from outside of Europe and proficiency in foreign languages.

Keywords: student mobility, European identity, global identity

Procedia PDF Downloads 728
3729 Optimizing the Scanning Time with Radiation Prediction Using a Machine Learning Technique

Authors: Saeed Eskandari, Seyed Rasoul Mehdikhani

Abstract:

Radiation sources are used in many industries, for example gamma sources in medical imaging. This radiation has destructive effects on humans and the environment. It is very important to detect and locate its sources because they cannot be seen by the eye. A portable robot has been designed and built to reveal radiation sources; it can scan a location from 5 to 20 meters away and shows the location of the sources, according to the intensity of the waves, on a two-dimensional digital image. The robot operates by measuring the pixels separately. Increasing the measurement resolution of the image gives a more accurate scan of the environment, and more points are detected, but it also makes scanning take much longer. In this paper, to overcome this challenge, we designed a method that can optimize this time. In this method, a small number of important points of the environment are measured, and the remaining pixels are predicted and estimated by machine-learning regression algorithms. The method is evaluated by comparing the estimates with the actual values of all pixels. These steps have been repeated with several other radiation sources. The results of the study show that the values estimated by the regression method are very close to the real values.
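The idea of measuring a few pixels and estimating the rest by regression can be sketched as follows. The abstract does not name the regression algorithm, so this example substitutes a plain k-nearest-neighbour regressor; the inverse-square intensity field, the source position, the grid size, the sampling step and the function names are all assumptions made for illustration.

```python
import math

def true_intensity(x, y, src=(6.0, 3.0)):
    """Hypothetical inverse-square intensity field around a single source."""
    return 100.0 / (1.0 + (x - src[0]) ** 2 + (y - src[1]) ** 2)

def knn_predict(x, y, samples, k=4):
    """Average the k measured pixels nearest to (x, y); a measured pixel keeps its own value."""
    ranked = sorted(samples, key=lambda s: math.hypot(s[0] - x, s[1] - y))
    if math.hypot(ranked[0][0] - x, ranked[0][1] - y) == 0.0:
        return ranked[0][2]
    nearest = ranked[:k]
    return sum(s[2] for s in nearest) / len(nearest)

width, height, step = 12, 8, 3
# Measure only every third pixel in each direction.
samples = [(x, y, true_intensity(x, y))
           for x in range(0, width, step) for y in range(0, height, step)]
estimate = [[knn_predict(x, y, samples) for x in range(width)] for y in range(height)]

# Compare the estimate with the full "ground truth" scan, as the abstract describes.
errors = [abs(estimate[y][x] - true_intensity(x, y))
          for y in range(height) for x in range(width)]
print(len(samples), "measured of", width * height, "pixels; MAE =", sum(errors) / len(errors))
```

Here 12 of 96 pixels are measured. How closely the estimate tracks the real field depends heavily on where the measured points fall relative to the source, which is presumably why the abstract stresses choosing important points.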

Keywords: regression, machine learning, scan radiation, robot

Procedia PDF Downloads 76
3728 Chemometric Regression Analysis of Radical Scavenging Ability of Kombucha Fermented Kefir-Like Products

Authors: Strahinja Kovacevic, Milica Karadzic Banjac, Jasmina Vitas, Stefan Vukmanovic, Radomir Malbasa, Lidija Jevric, Sanja Podunavac-Kuzmanovic

Abstract:

The present study deals with chemometric regression analysis of quality parameters and the radical scavenging ability of kombucha fermented kefir-like products obtained with winter savory (WS), peppermint (P), stinging nettle (SN) and wild thyme tea (WT) kombucha inoculums. Each analyzed sample was described by milk fat content (MF, %), total unsaturated fatty acids content (TUFA, %), monounsaturated fatty acids content (MUFA, %), polyunsaturated fatty acids content (PUFA, %), the ability of free radicals scavenging (RSA DPPH, % and RSA ·OH, %) and pH values measured after each hour from the start until the end of fermentation. The aim of the regression analysis was to establish chemometric models that can predict the radical scavenging ability (RSA DPPH, % and RSA ·OH, %) of the samples by correlating it with the MF, TUFA, MUFA, PUFA and the pH value at the beginning, in the middle and at the end of the fermentation process, which lasted between 11 and 17 hours, until a pH value of 4.5 was reached. The analysis was carried out by applying univariate linear regression (ULR) and multiple linear regression (MLR) methods to the raw data and to the data standardized by the min-max normalization method. The obtained models were characterized by very limited predictive power (poor cross-validation parameters) and weak statistical characteristics. Based on this analysis, it can be concluded that the resulting radical scavenging ability cannot be precisely predicted on the basis of MF, TUFA, MUFA, PUFA content, and pH values alone; other quality parameters should be considered and included in further modeling. This study is based upon work from project: Kombucha beverages production using alternative substrates from the territory of the Autonomous Province of Vojvodina, 142-451-2400/2019-03, supported by Provincial Secretariat for Higher Education and Scientific Research of AP Vojvodina.
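The workflow named in this abstract (min-max normalization, univariate linear regression, and cross-validation) can be sketched in a few lines. The numbers below are invented, not measurements from the study, and the leave-one-out statistic is written as 1 - PRESS / total sum of squares, one common convention for a cross-validated R² that the authors may or may not have used. All function names are hypothetical.

```python
def min_max(values):
    """Rescale values linearly to the [0, 1] interval."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (univariate linear regression)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def loo_q2(xs, ys):
    """Leave-one-out cross-validated R^2: 1 - PRESS / total sum of squares."""
    press = 0.0
    for i in range(len(xs)):
        # Refit without sample i, then predict the held-out sample.
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    my = sum(ys) / len(ys)
    return 1.0 - press / sum((y - my) ** 2 for y in ys)

# Hypothetical values, not data from the study.
ph_end = [4.50, 4.52, 4.48, 4.55, 4.47, 4.51]
rsa = [31.0, 28.5, 35.2, 30.1, 27.9, 33.4]
x = min_max(ph_end)
intercept, slope = fit_line(x, rsa)
print(round(intercept, 2), round(slope, 2), round(loo_q2(x, rsa), 3))
```

A cross-validated value near or below zero means the model predicts held-out samples no better than the mean does, which is the kind of outcome the abstract summarises as poor cross-validation parameters.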

Keywords: chemometrics, regression analysis, kombucha, quality control

Procedia PDF Downloads 141
3727 An Evaluation of Neuropsychiatric Manifestations in Systemic Lupus Erythematosus Patients in Saudi Arabia and Their Associated Factors

Authors: Yousef M. Alammari, Mahmoud A. Gaddoury, Reem A. Almohaini, Sara A. Alharbi, Lena S. Alsaleem, Lujain H. Allowaihiq, Maha H. Alrashid, Abdullah H. Alghamdi, Abdullah A. Alaryni

Abstract:

Objective: The goal of this study was to establish the prevalence of neuropsychiatric symptoms in systemic lupus erythematosus (NPSLE) patients in Saudi Arabia and the variables that are linked to it. Methods: This cross-sectional study was carried out among SLE patients in Saudi Arabia during June 2021. The Saudi Rheumatism Association used social media platforms to distribute a self-administered online questionnaire to SLE patients. All data analyses were performed using the Statistical Package for the Social Sciences (SPSS) version 26. Results: Two hundred and five SLE patients participated in the study (females 91.3% vs. males 8.7%). In addition, 13.5% of patients had a family history of SLE, and 26% had had SLE for one to three years. Alteration or loss of sensation (53.4%), fear (52.4%), and headache (48.1%) were the most prevalent neuropsychiatric symptoms in systemic lupus erythematosus (NPSLE) patients. The prevalence of patients with NPSLE was 40%. In a multivariate regression model, fear, altered sensations, cerebrovascular illness, sleep disruption, and diminished interest in routine activities were identified as independent risk variables for NPSLE. Conclusion: Nearly half of SLE patients demonstrated NP manifestations, with significant symptoms including fear, alteration of sensation, cerebrovascular disease, sleep disturbance, and reduced interest in normal activities. To detect the pathophysiology of NPSLE, it is necessary to understand the relationship between neuropsychiatric morbidity and other relevant rheumatic disorders in the SLE population.

Keywords: neuropsychiatric, systemic lupus erythematosus, NPSLE, prevalence, SLE patients

Procedia PDF Downloads 75
3726 Predicting the Impact of Scope Changes on Project Cost and Schedule Using Machine Learning Techniques

Authors: Soheila Sadeghi

Abstract:

In the dynamic landscape of project management, scope changes are an inevitable reality that can significantly impact project performance. These changes, whether initiated by stakeholders, external factors, or internal project dynamics, can lead to cost overruns and schedule delays. Accurately predicting the consequences of these changes is crucial for effective project control and informed decision-making. This study aims to develop predictive models to estimate the impact of scope changes on project cost and schedule using machine learning techniques. The research utilizes a comprehensive dataset containing detailed information on project tasks, including the Work Breakdown Structure (WBS), task type, productivity rate, estimated cost, actual cost, duration, task dependencies, scope change magnitude, and scope change timing. Multiple machine learning models are developed and evaluated to predict the impact of scope changes on project cost and schedule. These models include Linear Regression, Decision Tree, Ridge Regression, Random Forest, Gradient Boosting, and XGBoost. The dataset is split into training and testing sets, and the models are trained using the preprocessed data. Cross-validation techniques are employed to assess the robustness and generalization ability of the models. The performance of the models is evaluated using metrics such as Mean Squared Error (MSE) and R-squared. Residual plots are generated to assess the goodness of fit and identify any patterns or outliers. Hyperparameter tuning is performed to optimize the XGBoost model and improve its predictive accuracy. The feature importance analysis reveals the relative significance of different project attributes in predicting the impact on cost and schedule. Key factors such as productivity rate, scope change magnitude, task dependencies, estimated cost, actual cost, duration, and specific WBS elements are identified as influential predictors. 
The study highlights the importance of considering both cost and schedule implications when managing scope changes. The developed predictive models provide project managers with a data-driven tool to proactively assess the potential impact of scope changes on project cost and schedule. By leveraging these insights, project managers can make informed decisions, optimize resource allocation, and develop effective mitigation strategies. The findings of this research contribute to improved project planning, risk management, and overall project success.
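The evaluation steps listed in this abstract (a train/test split, cross-validation, Mean Squared Error and R-squared) can be sketched without any modeling library. The figures below are invented and the fold layout is one simple choice among many; neither is taken from the study, and the function names are hypothetical.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds covering range(n)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

# Hypothetical cost overruns (% of estimate) and model predictions.
actual = [5.0, 12.0, 8.0, 20.0]
predicted = [6.0, 10.0, 9.0, 18.0]
print(mse(actual, predicted), round(r_squared(actual, predicted), 4))
folds = list(k_fold_indices(10, 3))
print([len(test) for _, test in folds])  # fold sizes 4, 3, 3
```

In practice each fold's training indices would be used to fit a model and the held-out indices to score it, and the scores would then be averaged across folds.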

Keywords: cost impact, machine learning, predictive modeling, schedule impact, scope changes

Procedia PDF Downloads 38
3725 Enhancing Spatial Interpolation: A Multi-Layer Inverse Distance Weighting Model for Complex Regression and Classification Tasks in Spatial Data Analysis

Authors: Yakin Hajlaoui, Richard Labib, Jean-François Plante, Michel Gamache

Abstract:

This study introduces the Multi-Layer Inverse Distance Weighting Model (ML-IDW), inspired by the mathematical formulation of both multi-layer neural networks (ML-NNs) and the Inverse Distance Weighting model (IDW). ML-IDW leverages ML-NNs' processing capabilities, characterized by compositions of learnable non-linear functions applied to input features, and incorporates IDW's ability to learn anisotropic spatial dependencies, presenting a promising solution for nonlinear spatial interpolation and learning from complex spatial data. We employ gradient descent and backpropagation to train ML-IDW and compare its performance against conventional spatial interpolation models such as Kriging and standard IDW on regression and classification tasks using simulated spatial datasets of varying complexity. The results highlight the efficacy of ML-IDW, particularly in handling complex spatial datasets, where it exhibits lower mean square error in regression and a higher F1 score in classification.
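For context, the standard IDW baseline that this abstract compares against can be written in a few lines: the prediction at a query location is a weighted average of known values, with weights proportional to 1 / distance^power. This sketch shows only that classical, isotropic form; it is not the ML-IDW model, whose layers and learnable parameters the abstract does not spell out. The sample points and the function name are hypothetical.

```python
import math

def idw_predict(query, points, values, power=2.0):
    """Standard inverse distance weighting: weights are 1 / distance**power."""
    weights = []
    for p, v in zip(points, values):
        d = math.dist(query, p)
        if d == 0.0:
            return v            # the query coincides with a known point
        weights.append(1.0 / d ** power)
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total

points = [(0.0, 0.0), (2.0, 0.0)]   # hypothetical sample locations
values = [10.0, 20.0]
print(idw_predict((1.0, 0.0), points, values))  # equidistant, so the plain average: 15.0
```

Raising `power` makes the nearest samples dominate, while lowering it smooths the surface toward the global mean; in the classical method this exponent is fixed by the user rather than learned.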

Keywords: deep learning, multi-layer neural networks, gradient descent, spatial interpolation, inverse distance weighting

Procedia PDF Downloads 52
3724 Copper Price Prediction Model for Various Economic Situations

Authors: Haidy S. Ghali, Engy Serag, A. Samer Ezeldin

Abstract:

Copper is an essential raw material used in the construction industry. During 2021 and the first half of 2022, the global market suffered from significant fluctuations in copper raw material prices due to the aftermath of both the COVID-19 pandemic and the Russia-Ukraine war, which exposed consumers to unexpected financial risk. Therefore, this paper aims to develop two ANN-LSTM price prediction models, using Python, that can forecast the average monthly copper prices traded on the London Metal Exchange; the first is a multivariate model that forecasts the copper price for the next month, and the second is a univariate model that predicts the copper prices for the upcoming three months. Historical data of average monthly London Metal Exchange copper prices are collected from January 2009 to July 2022, and potential external factors are identified and employed in the multivariate model. These factors lie under three main categories: energy prices and economic indicators of the three major exporting countries of copper, depending on the data availability. Before developing the LSTM models, the collected external parameters are analyzed with respect to the copper prices using correlation and multicollinearity tests in R software; the parameters are then further screened to select those that influence the copper prices. Then, the two LSTM models are developed, and the dataset is divided into training, validation, and testing sets. The results show that the 3-Month prediction model performs better than the 1-Month prediction model, but both models can still act as prediction tools for diverse economic situations.
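Before an LSTM can be trained on a monthly price series, the series is usually reshaped into input windows and future targets and then split in time order. The sketch below illustrates that preparation step for a univariate three-month-ahead setup like the one this abstract describes. The lookback length, the split fractions, the price values and the function names are assumptions; the abstract gives none of them, and the LSTM itself is not shown.

```python
def make_windows(series, lookback, horizon):
    """Turn a series into (input window, future targets) pairs for supervised learning."""
    pairs = []
    for start in range(len(series) - lookback - horizon + 1):
        x = series[start:start + lookback]
        y = series[start + lookback:start + lookback + horizon]
        pairs.append((x, y))
    return pairs

def chronological_split(pairs, train_frac=0.7, val_frac=0.15):
    """Split without shuffling, so validation and test windows start after the training ones."""
    n_train = int(len(pairs) * train_frac)
    n_val = int(len(pairs) * val_frac)
    return pairs[:n_train], pairs[n_train:n_train + n_val], pairs[n_train + n_val:]

prices = list(range(100, 120))   # hypothetical monthly prices, not LME data
pairs = make_windows(prices, lookback=6, horizon=3)
train, val, test = chronological_split(pairs)
print(len(pairs), len(train), len(val), len(test))  # prints: 12 8 1 3
```

Keeping the split chronological matters for forecasting, because a shuffled split would let the model see months that come after the ones it is asked to predict.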

Keywords: copper prices, prediction model, neural network, time series forecasting

Procedia PDF Downloads 111
3723 Indian Premier League (IPL) Score Prediction: Comparative Analysis of Machine Learning Models

Authors: Rohini Hariharan, Yazhini R, Bhamidipati Naga Shrikarti

Abstract:

In the realm of cricket, particularly within the context of the Indian Premier League (IPL), the ability to predict team scores accurately holds significant importance for both cricket enthusiasts and stakeholders alike. This paper presents a comprehensive study on IPL score prediction utilizing various machine learning algorithms, including Support Vector Machines (SVM), XGBoost, Multiple Regression, Linear Regression, K-nearest neighbors (KNN), and Random Forest. Through meticulous data preprocessing, feature engineering, and model selection, we aimed to develop a robust predictive framework capable of forecasting team scores with high precision. Our experimentation involved the analysis of historical IPL match data encompassing diverse match and player statistics. Leveraging this data, we employed state-of-the-art machine learning techniques to train and evaluate the performance of each model. Notably, Multiple Regression emerged as the top-performing algorithm, achieving an impressive accuracy of 77.19% and a precision of 54.05% (within a threshold of +/- 10 runs). This research contributes to the advancement of sports analytics by demonstrating the efficacy of machine learning in predicting IPL team scores. The findings underscore the potential of advanced predictive modeling techniques to provide valuable insights for cricket enthusiasts, team management, and betting agencies. Additionally, this study serves as a benchmark for future research endeavors aimed at enhancing the accuracy and interpretability of IPL score prediction models.
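Because score prediction is a regression task, an "accuracy" figure needs a tolerance. One plausible reading of the +/- 10 runs threshold mentioned in this abstract is the share of predictions that land within 10 runs of the actual total, sketched below. The abstract does not define its accuracy or precision measures exactly, so this definition, the function name and the sample scores are all assumptions.

```python
def within_threshold_accuracy(actual, predicted, threshold=10):
    """Share of predictions whose absolute error is at most `threshold` runs."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= threshold)
    return hits / len(actual)

actual = [165, 180, 142, 201]      # hypothetical innings totals
predicted = [170, 168, 150, 199]   # hypothetical model outputs
print(within_threshold_accuracy(actual, predicted))  # errors 5, 12, 8, 2 -> 0.75
```

Reporting such a tolerance-based score alongside MAE or RMSE makes it easier to compare results across studies that choose different thresholds.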

Keywords: Indian Premier League (IPL), cricket, score prediction, machine learning, support vector machines (SVM), XGBoost, multiple regression, linear regression, k-nearest neighbors (KNN), random forest, sports analytics

Procedia PDF Downloads 51