Search results for: regression model
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 18168

Search results for: regression model

17868 Mapping of Urban Micro-Climate in Lyon (France) by Integrating Complementary Predictors at Different Scales into Multiple Linear Regression Models

Authors: Lucille Alonso, Florent Renard

Abstract:

The characterizations of urban heat island (UHI) and their interactions with climate change and urban climates are the main research and public health issue, due to the increasing urbanization of the population. These solutions require a better knowledge of the UHI and micro-climate in urban areas, by combining measurements and modelling. This study is part of this topic by evaluating microclimatic conditions in dense urban areas in the Lyon Metropolitan Area (France) using a combination of data traditionally used such as topography, but also from LiDAR (Light Detection And Ranging) data, Landsat 8 satellite observation and Sentinel and ground measurements by bike. These bicycle-dependent weather data collections are used to build the database of the variable to be modelled, the air temperature, over Lyon’s hyper-center. This study aims to model the air temperature, measured during 6 mobile campaigns in Lyon in clear weather, using multiple linear regressions based on 33 explanatory variables. They are of various categories such as meteorological parameters from remote sensing, topographic variables, vegetation indices, the presence of water, humidity, bare soil, buildings, radiation, urban morphology or proximity and density to various land uses (water surfaces, vegetation, bare soil, etc.). The acquisition sources are multiple and come from the Landsat 8 and Sentinel satellites, LiDAR points, and cartographic products downloaded from an open data platform in Greater Lyon. Regarding the presence of low, medium, and high vegetation, the presence of buildings and ground, several buffers close to these factors were tested (5, 10, 20, 25, 50, 100, 200 and 500m). The buffers with the best linear correlations with air temperature for ground are 5m around the measurement points, for low and medium vegetation, and for building 50m and for high vegetation is 100m. The explanatory model of the dependent variable is obtained by multiple linear regression of the remaining explanatory variables (Pearson correlation matrix with a |r| < 0.7 and VIF with < 5) by integrating a stepwise sorting algorithm. Moreover, holdout cross-validation is performed, due to its ability to detect over-fitting of multiple regression, although multiple regression provides internal validation and randomization (80% training, 20% testing). Multiple linear regression explained, on average, 72% of the variance for the study days, with an average RMSE of only 0.20°C. The impact on the model of surface temperature in the estimation of air temperature is the most important variable. Other variables are recurrent such as distance to subway stations, distance to water areas, NDVI, digital elevation model, sky view factor, average vegetation density, or building density. Changing urban morphology influences the city's thermal patterns. The thermal atmosphere in dense urban areas can only be analysed on a microscale to be able to consider the local impact of trees, streets, and buildings. There is currently no network of fixed weather stations sufficiently deployed in central Lyon and most major urban areas. Therefore, it is necessary to use mobile measurements, followed by modelling to characterize the city's multiple thermal environments.

Keywords: air temperature, LIDAR, multiple linear regression, surface temperature, urban heat island

Procedia PDF Downloads 107
17867 Removal of Phenol from Aqueous Solution Using Watermelon (Citrullus C. lanatus) Rind

Authors: Fidelis Chigondo

Abstract:

This study focuses on investigating the effectiveness of watermelon rind in phenol removal from aqueous solution. The effects of various parameters (pH, initial phenol concentration, biosorbent dosage and contact time) on phenol adsorption were investigated. The pH of 2, initial phenol concentration of 40 ppm, the biosorbent dosage of 0.6 g and contact time of 6 h also deduced to be the optimum conditions for the adsorption process. The maximum phenol removal under optimized conditions was 85%. The sorption data fitted to the Freundlich isotherm with a regression coefficient of 0.9824. The kinetics was best described by the intraparticle diffusion model and Elovich Equation with regression coefficients of 1 and 0.8461 respectively showing that the reaction is chemisorption on a heterogeneous surface and the intraparticle diffusion rate only is the rate determining step. The study revealed that watermelon rind has a potential of removing phenol from industrial wastewaters.

Keywords: biosorption, phenol, biosorbent, watermelon rind

Procedia PDF Downloads 222
17866 Low-Cost, Portable Optical Sensor with Regression Algorithm Models for Accurate Monitoring of Nitrites in Environments

Authors: David X. Dong, Qingming Zhang, Meng Lu

Abstract:

Nitrites enter waterways as runoff from croplands and are discharged from many industrial sites. Excessive nitrite inputs to water bodies lead to eutrophication. On-site rapid detection of nitrite is of increasing interest for managing fertilizer application and monitoring water source quality. Existing methods for detecting nitrites use spectrophotometry, ion chromatography, electrochemical sensors, ion-selective electrodes, chemiluminescence, and colorimetric methods. However, these methods either suffer from high cost or provide low measurement accuracy due to their poor selectivity to nitrites. Therefore, it is desired to develop an accurate and economical method to monitor nitrites in environments. We report a low-cost optical sensor, in conjunction with a machine learning (ML) approach to enable high-accuracy detection of nitrites in water sources. The sensor works under the principle of measuring molecular absorptions of nitrites at three narrowband wavelengths (295 nm, 310 nm, and 357 nm) in the ultraviolet (UV) region. These wavelengths are chosen because they have relatively high sensitivity to nitrites; low-cost light-emitting devices (LEDs) and photodetectors are also available at these wavelengths. A regression model is built, trained, and utilized to minimize cross-sensitivities of these wavelengths to the same analyte, thus achieving precise and reliable measurements with various interference ions. The measured absorbance data is input to the trained model that can provide nitrite concentration prediction for the sample. The sensor is built with i) a miniature quartz cuvette as the test cell that contains a liquid sample under test, ii) three low-cost UV LEDs placed on one side of the cell as light sources, with each LED providing a narrowband light, and iii) a photodetector with a built-in amplifier and an analog-to-digital converter placed on the other side of the test cell to measure the power of transmitted light. This simple optical design allows measuring the absorbance data of the sample at the three wavelengths. To train the regression model, absorbances of nitrite ions and their combination with various interference ions are first obtained at the three UV wavelengths using a conventional spectrophotometer. Then, the spectrophotometric data are inputs to different regression algorithm models for training and evaluating high-accuracy nitrite concentration prediction. Our experimental results show that the proposed approach enables instantaneous nitrite detection within several seconds. The sensor hardware costs about one hundred dollars, which is much cheaper than a commercial spectrophotometer. The ML algorithm helps to reduce the average relative errors to below 3.5% over a concentration range from 0.1 ppm to 100 ppm of nitrites. The sensor has been validated to measure nitrites at three sites in Ames, Iowa, USA. This work demonstrates an economical and effective approach to the rapid, reagent-free determination of nitrites with high accuracy. The integration of the low-cost optical sensor and ML data processing can find a wide range of applications in environmental monitoring and management.

Keywords: optical sensor, regression model, nitrites, water quality

Procedia PDF Downloads 45
17865 The Impact of Simulation-based Learning on the Clinical Self-efficacy and Adherence to Infection Control Practices of Nursing Students

Authors: Raeed Alanazi

Abstract:

Introduction: Nursing students have a crucial role to play in the inhibition of infectious diseases and, therefore, must be trained in infection control and prevention modules prior to entering clinical settings. Simulations have been found to have a positive impact on infection control skills and the use of standard precautions. Aim: The purpose of this study was to use the four sources of self-efficacy in explaining the level of clinical self-efficacy and adherence to infection control practices in Saudi nursing students during simulation practice. Method: A cross-sectional design with convenience sampling was used. This study was conducted in all Saudi nursing schools, with a total number of 197 students participated in this study. Three scales were used simulation self- efficacy Scale (SSES), the four sources of self-efficacy scale (SSES), and Compliance with Standard Precautions Scale (CSPS). Multiple linear regression was used to test the use of the four sources of self-efficacy (SSES) in explaining level of clinical self-efficacy and adherence to infection control in nursing students. Results: The vicarious experience subscale (p =.044) was statistically significant. The regression model indicated that for every one unit increase in vicarious experience (observation and reflection in simulation), the participants’ adherence to infection control increased by .13 units (β =.22, t = 2.03, p =.044). In addition, the regression model indicated that for every one unit increase in education level, the participants’ adherence to infection control increased by 1.82 units (beta=.34= 3.64, p <.001). Also, the mastery experience subscale (p <.001) and vicarious experience subscale (p = .020) were shared significant associations with clinical self-efficacy. Conclusion: The findings of this research support the idea that simulation-based learning can be a valuable teaching-learning method to help nursing students develop clinical competence, which is essential in providing quality and safe nursing care.

Keywords: simulation-based learning, clinical self-efficacy, infection control, nursing students

Procedia PDF Downloads 43
17864 Quantitative Structure Activity Relationship Model for Predicting the Aromatase Inhibition Activity of 1,2,3-Triazole Derivatives

Authors: M. Ouassaf, S. Belaidi

Abstract:

Aromatase is an estrogen biosynthetic enzyme belonging to the cytochrome P450 family, which catalyzes the limiting step in the conversion of androgens to estrogens. As it is relevant for the promotion of tumor cell growth. A set of thirty 1,2,3-triazole derivatives was used in the quantitative structure activity relationship (QSAR) study using regression multiple linear (MLR), We divided the data into two training and testing groups. The results showed a good predictive ability of the MLR model, the models were statistically robust internally (R² = 0.982) and the predictability of the model was tested by several parameters. including external criteria (R²pred = 0.851, CCC = 0.946). The knowledge gained in this study should provide relevant information that contributes to the origins of aromatase inhibitory activity and, therefore, facilitates our ongoing quest for aromatase inhibitors with robust properties.

Keywords: aromatase inhibitors, QSAR, MLR, 1, 2, 3-triazole

Procedia PDF Downloads 87
17863 Applicability of Cameriere’s Age Estimation Method in a Sample of Turkish Adults

Authors: Hatice Boyacioglu, Nursel Akkaya, Humeyra Ozge Yilanci, Hilmi Kansu, Nihal Avcu

Abstract:

The strong relationship between the reduction in the size of the pulp cavity and increasing age has been reported in the literature. This relationship can be utilized to estimate the age of an individual by measuring the pulp cavity size using dental radiographs as a non-destructive method. The purpose of this study is to develop a population specific regression model for age estimation in a sample of Turkish adults by applying Cameriere’s method on panoramic radiographs. The sample consisted of 100 panoramic radiographs of Turkish patients (40 men, 60 women) aged between 20 and 70 years. Pulp and tooth area ratios (AR) of the maxilla¬¬ry canines were measured by two maxillofacial radiologists and then the results were subjected to regression analysis. There were no statistically significant intra-observer and inter-observer differences. The correlation coefficient between age and the AR of the maxillary canines was -0.71 and the following regression equation was derived: Estimated Age = 77,365 – ( 351,193 × AR ). The mean prediction error was 4 years which is within acceptable errors limits for age estimation. This shows that the pulp/tooth area ratio is a useful variable for assessing age with reasonable accuracy. Based on the results of this research, it was concluded that Cameriere’s method is suitable for dental age estimation and it can be used for forensic procedures in Turkish adults. These instructions give you guidelines for preparing papers for conferences or journals.

Keywords: age estimation by teeth, forensic dentistry, panoramic radiograph, Cameriere's method

Procedia PDF Downloads 422
17862 Interaction between Space Syntax and Agent-Based Approaches for Vehicle Volume Modelling

Authors: Chuan Yang, Jing Bie, Panagiotis Psimoulis, Zhong Wang

Abstract:

Modelling and understanding vehicle volume distribution over the urban network are essential for urban design and transport planning. The space syntax approach was widely applied as the main conceptual and methodological framework for contemporary vehicle volume models with the help of the statistical method of multiple regression analysis (MRA). However, the MRA model with space syntax variables shows a limitation in vehicle volume predicting in accounting for the crossed effect of the urban configurational characters and socio-economic factors. The aim of this paper is to construct models by interacting with the combined impact of the street network structure and socio-economic factors. In this paper, we present a multilevel linear (ML) and an agent-based (AB) vehicle volume model at an urban scale interacting with space syntax theoretical framework. The ML model allowed random effects of urban configurational characteristics in different urban contexts. And the AB model was developed with the incorporation of transformed space syntax components of the MRA models into the agents’ spatial behaviour. Three models were implemented in the same urban environment. The ML model exhibit superiority over the original MRA model in identifying the relative impacts of the configurational characters and macro-scale socio-economic factors that shape vehicle movement distribution over the city. Compared with the ML model, the suggested AB model represented the ability to estimate vehicle volume in the urban network considering the combined effects of configurational characters and land-use patterns at the street segment level.

Keywords: space syntax, vehicle volume modeling, multilevel model, agent-based model

Procedia PDF Downloads 107
17861 An Alternative Approach for Assessing the Impact of Cutting Conditions on Surface Roughness Using Single Decision Tree

Authors: S. Ghorbani, N. I. Polushin

Abstract:

In this study, an approach to identify factors affecting on surface roughness in a machining process is presented. This study is based on 81 data about surface roughness over a wide range of cutting tools (conventional, cutting tool with holes, cutting tool with composite material), workpiece materials (AISI 1045 Steel, AA2024 aluminum alloy, A48-class30 gray cast iron), spindle speed (630-1000 rpm), feed rate (0.05-0.075 mm/rev), depth of cut (0.05-0.15 mm) and tool overhang (41-65 mm). A single decision tree (SDT) analysis was done to identify factors for predicting a model of surface roughness, and the CART algorithm was employed for building and evaluating regression tree. Results show that a single decision tree is better than traditional regression models with higher rate and forecast accuracy and strong value.

Keywords: cutting condition, surface roughness, decision tree, CART algorithm

Procedia PDF Downloads 344
17860 Bivariate Time-to-Event Analysis with Copula-Based Cox Regression

Authors: Duhania O. Mahara, Santi W. Purnami, Aulia N. Fitria, Merissa N. Z. Wirontono, Revina Musfiroh, Shofi Andari, Sagiran Sagiran, Estiana Khoirunnisa, Wahyudi Widada

Abstract:

For assessing interventions in numerous disease areas, the use of multiple time-to-event outcomes is common. An individual might experience two different events called bivariate time-to-event data, the events may be correlated because it come from the same subject and also influenced by individual characteristics. The bivariate time-to-event case can be applied by copula-based bivariate Cox survival model, using the Clayton and Frank copulas to analyze the dependence structure of each event and also the covariates effect. By applying this method to modeling the recurrent event infection of hemodialysis insertion on chronic kidney disease (CKD) patients, from the AIC and BIC values we find that the Clayton copula model was the best model with Kendall’s Tau is (τ=0,02).

Keywords: bivariate cox, bivariate event, copula function, survival copula

Procedia PDF Downloads 46
17859 Monitoring Large-Coverage Forest Canopy Height by Integrating LiDAR and Sentinel-2 Images

Authors: Xiaobo Liu, Rakesh Mishra, Yun Zhang

Abstract:

Continuous monitoring of forest canopy height with large coverage is essential for obtaining forest carbon stocks and emissions, quantifying biomass estimation, analyzing vegetation coverage, and determining biodiversity. LiDAR can be used to collect accurate woody vegetation structure such as canopy height. However, LiDAR’s coverage is usually limited because of its high cost and limited maneuverability, which constrains its use for dynamic and large area forest canopy monitoring. On the other hand, optical satellite images, like Sentinel-2, have the ability to cover large forest areas with a high repeat rate, but they do not have height information. Hence, exploring the solution of integrating LiDAR data and Sentinel-2 images to enlarge the coverage of forest canopy height prediction and increase the prediction repeat rate has been an active research topic in the environmental remote sensing community. In this study, we explore the potential of training a Random Forest Regression (RFR) model and a Convolutional Neural Network (CNN) model, respectively, to develop two predictive models for predicting and validating the forest canopy height of the Acadia Forest in New Brunswick, Canada, with a 10m ground sampling distance (GSD), for the year 2018 and 2021. Two 10m airborne LiDAR-derived canopy height models, one for 2018 and one for 2021, are used as ground truth to train and validate the RFR and CNN predictive models. To evaluate the prediction performance of the trained RFR and CNN models, two new predicted canopy height maps (CHMs), one for 2018 and one for 2021, are generated using the trained RFR and CNN models and 10m Sentinel-2 images of 2018 and 2021, respectively. The two 10m predicted CHMs from Sentinel-2 images are then compared with the two 10m airborne LiDAR-derived canopy height models for accuracy assessment. The validation results show that the mean absolute error (MAE) for year 2018 of the RFR model is 2.93m, CNN model is 1.71m; while the MAE for year 2021 of the RFR model is 3.35m, and the CNN model is 3.78m. These demonstrate the feasibility of using the RFR and CNN models developed in this research for predicting large-coverage forest canopy height at 10m spatial resolution and a high revisit rate.

Keywords: remote sensing, forest canopy height, LiDAR, Sentinel-2, artificial intelligence, random forest regression, convolutional neural network

Procedia PDF Downloads 54
17858 Rural Livelihood under a Changing Climate Pattern in the Zio District of Togo, West Africa

Authors: Martial Amou

Abstract:

This study was carried out to assess the situation of households’ livelihood under a changing climate pattern in the Zio district of Togo, West Africa. The study examined three important aspects: (i) assessment of households’ livelihood situation under a changing climate pattern, (ii) farmers’ perception and understanding of local climate change, (iii) determinants of adaptation strategies undertaken in cropping pattern to climate change. To this end, secondary sources of data, and survey data collected from 235 farmers in four villages in the study area were used. Adapted conceptual framework from Sustainable Livelihood Framework of DFID, two steps Binary Logistic Regression Model and descriptive statistics were used in this study as methodological approaches. Based on Sustainable Livelihood Approach (SLA), various factors revolving around the livelihoods of the rural community were grouped into social, natural, physical, human, and financial capital. Thus, the study came up that households’ livelihood situation represented by the overall livelihood index in the study area (34%) is below the standard average households’ livelihood security index (50%). The natural capital was found as the poorest asset (13%) and this will severely affect the sustainability of livelihood in the long run. The result from descriptive statistics and the first step regression (selection model) indicated that most of the farmers in the study area have clear understanding of climate change even though they do not have any idea about greenhouse gases as the main cause behind the issue. From the second step regression (output model) result, education, farming experience, access to credit, access to extension services, cropland size, membership of a social group, distance to the nearest input market, were found to be the significant determinants of adaptation measures undertaken in cropping pattern by farmers in the study area. Based on the result of this study, recommendations are made to farmers, policy makers, institutions, and development service providers in order to better target interventions which build, promote or facilitate the adoption of adaptation measures with potential to build resilience to climate change and then improve rural livelihood.

Keywords: climate change, rural livelihood, cropping pattern, adaptation, Zio District

Procedia PDF Downloads 300
17857 Study on the Factors Influencing the Built Environment of Residential Areas on the Lifestyle Walking Trips of the Elderly

Authors: Daming Xu, Yuanyuan Wang

Abstract:

Abstract: Under the trend of rapid expansion of urbanization, the motorized urban characteristics become more and more obvious, and the walkability of urban space is seriously affected. The construction of walkability of space, as the main mode of travel for the elderly in their daily lives, has become more and more important in the current social context of serious aging. Settlement is the most basic living unit of residents, and daily shopping, medical care, and other daily trips are closely related to the daily life of the elderly. Therefore, it is of great practical significance to explore the impact of built environment on elderly people's daily walking trips at the settlement level for the construction of pedestrian-friendly settlements for the elderly. The study takes three typical settlements in Harbin Daoli District in three different periods as examples and obtains data on elderly people's walking trips and built environment characteristics through field research, questionnaire distribution, and internet data acquisition. Finally, correlation analysis and multinomial logistic regression model were applied to analyze the influence mechanism of built environment on elderly people's walkability based on the control of personal attribute variables in order to provide reference and guidance for the construction of walkability for elderly people in built environment in the future.

Keywords: built environment, elderly, walkability, multinomial logistic regression model

Procedia PDF Downloads 51
17856 Evaluation of the CRISP-DM Business Understanding Step: An Approach for Assessing the Predictive Power of Regression versus Classification for the Quality Prediction of Hydraulic Test Results

Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter

Abstract:

Digitalisation in production technology is a driver for the application of machine learning methods. Through the application of predictive quality, the great potential for saving necessary quality control can be exploited through the data-based prediction of product quality and states. However, the serial use of machine learning applications is often prevented by various problems. Fluctuations occur in real production data sets, which are reflected in trends and systematic shifts over time. To counteract these problems, data preprocessing includes rule-based data cleaning, the application of dimensionality reduction techniques, and the identification of comparable data subsets to extract stable features. Successful process control of the target variables aims to centre the measured values around a mean and minimise variance. Competitive leaders claim to have mastered their processes. As a result, much of the real data has a relatively low variance. For the training of prediction models, the highest possible generalisability is required, which is at least made more difficult by this data availability. The implementation of a machine learning application can be interpreted as a production process. The CRoss Industry Standard Process for Data Mining (CRISP-DM) is a process model with six phases that describes the life cycle of data science. As in any process, the costs to eliminate errors increase significantly with each advancing process phase. For the quality prediction of hydraulic test steps of directional control valves, the question arises in the initial phase whether a regression or a classification is more suitable. In the context of this work, the initial phase of the CRISP-DM, the business understanding, is critically compared for the use case at Bosch Rexroth with regard to regression and classification. The use of cross-process production data along the value chain of hydraulic valves is a promising approach to predict the quality characteristics of workpieces. Suitable methods for leakage volume flow regression and classification for inspection decision are applied. Impressively, classification is clearly superior to regression and achieves promising accuracies.

Keywords: classification, CRISP-DM, machine learning, predictive quality, regression

Procedia PDF Downloads 114
17855 Using AI for Analysing Political Leaders

Authors: Shuai Zhao, Shalendra D. Sharma, Jin Xu

Abstract:

This research uses advanced machine learning models to learn a number of hypotheses regarding political executives. Specifically, it analyses the impact these powerful leaders have on economic growth by using leaders’ data from the Archigos database from 1835 to the end of 2015. The data is processed by the AutoGluon, which was developed by Amazon. Automated Machine Learning (AutoML) and AutoGluon can automatically extract features from the data and then use multiple classifiers to train the data. Use a linear regression model and classification model to establish the relationship between leaders and economic growth (GDP per capita growth), and to clarify the relationship between their characteristics and economic growth from a machine learning perspective. Our work may show as a model or signal for collaboration between the fields of statistics and artificial intelligence (AI) that can light up the way for political researchers and economists.

Keywords: comparative politics, political executives, leaders’ characteristics, artificial intelligence

Procedia PDF Downloads 52
17854 Naïve Bayes: A Classical Approach for the Epileptic Seizures Recognition

Authors: Bhaveek Maini, Sanjay Dhanka, Surita Maini

Abstract:

Electroencephalography (EEG) is used to classify several epileptic seizures worldwide. It is a very crucial task for the neurologist to identify the epileptic seizure with manual EEG analysis, as it takes lots of effort and time. Human error is always at high risk in EEG, as acquiring signals needs manual intervention. Disease diagnosis using machine learning (ML) has continuously been explored since its inception. Moreover, where a large number of datasets have to be analyzed, ML is acting as a boon for doctors. In this research paper, authors proposed two different ML models, i.e., logistic regression (LR) and Naïve Bayes (NB), to predict epileptic seizures based on general parameters. These two techniques are applied to the epileptic seizures recognition dataset, available on the UCI ML repository. The algorithms are implemented on an 80:20 train test ratio (80% for training and 20% for testing), and the performance of the model was validated by 10-fold cross-validation. The proposed study has claimed accuracy of 81.87% and 95.49% for LR and NB, respectively.

Keywords: epileptic seizure recognition, logistic regression, Naïve Bayes, machine learning

Procedia PDF Downloads 34
17853 New Approach for Load Modeling

Authors: Slim Chokri

Abstract:

Load forecasting is one of the central functions in power systems operations. Electricity cannot be stored, which means that for electric utility, the estimate of the future demand is necessary in managing the production and purchasing in an economically reasonable way. A majority of the recently reported approaches are based on neural network. The attraction of the methods lies in the assumption that neural networks are able to learn properties of the load. However, the development of the methods is not finished, and the lack of comparative results on different model variations is a problem. This paper presents a new approach in order to predict the Tunisia daily peak load. The proposed method employs a computational intelligence scheme based on the Fuzzy neural network (FNN) and support vector regression (SVR). Experimental results obtained indicate that our proposed FNN-SVR technique gives significantly good prediction accuracy compared to some classical techniques.

Keywords: neural network, load forecasting, fuzzy inference, machine learning, fuzzy modeling and rule extraction, support vector regression

Procedia PDF Downloads 404
17852 Survival Analysis of Identifying the Risk Factors of Affecting the First Recurrence Time of Breast Cancer: The Case of Tigray, Ethiopia

Authors: Segen Asayehegn

Abstract:

Introduction: In Tigray, Ethiopia, next to cervical cancer, breast cancer is one of the most common cancer health problems for women. Objectives: This article is proposed to identify the prospective and potential risk factors affecting the time-to-first-recurrence of breast cancer patients in Tigray, Ethiopia. Methods: The data were taken from the patient’s medical record that registered from January 2010 to January 2020. The study considered a sample size of 1842 breast cancer patients. Powerful non-parametric and parametric shared frailty survival regression models (FSRM) were applied, and model comparisons were performed. Results: Out of 1842 breast cancer patients, about 1290 (70.02%) recovered/cured the disease. The median cure time from breast cancer is found at 12.8 months. The model comparison suggested that the lognormal parametric shared a frailty survival regression model predicted that treatment, stage of breast cancer, smoking habit, and marital status significantly affects the first recurrence of breast cancer. Conclusion: Factors like treatment, stages of cancer, and marital status were improved while smoking habits worsened the time to cure breast cancer. Recommendation: Thus, the authors recommend reducing breast cancer health problems, the regional health sector facilities need to be improved. More importantly, concerned bodies and medical doctors should emphasize the identified factors during treatment. Furthermore, general awareness programs should be given to the community on the identified factors.

Keywords: acceleration factor, breast cancer, Ethiopia, shared frailty survival models, Tigray

Procedia PDF Downloads 107
17851 Estimation of a Finite Population Mean under Random Non Response Using Improved Nadaraya and Watson Kernel Weights

Authors: Nelson Bii, Christopher Ouma, John Odhiambo

Abstract:

Non-response is a potential source of errors in sample surveys. It introduces bias and large variance in the estimation of finite population parameters. Regression models have been recognized as one of the techniques of reducing bias and variance due to random non-response using auxiliary data. In this study, it is assumed that random non-response occurs in the survey variable in the second stage of cluster sampling, assuming full auxiliary information is available throughout. Auxiliary information is used at the estimation stage via a regression model to address the problem of random non-response. In particular, the auxiliary information is used via an improved Nadaraya-Watson kernel regression technique to compensate for random non-response. The asymptotic bias and mean squared error of the estimator proposed are derived. Besides, a simulation study conducted indicates that the proposed estimator has smaller values of the bias and smaller mean squared error values compared to existing estimators of finite population mean. The proposed estimator is also shown to have tighter confidence interval lengths at a 95% coverage rate. The results obtained in this study are useful, for instance, in choosing efficient estimators of the finite population mean in demographic sample surveys.

Keywords: mean squared error, random non-response, two-stage cluster sampling, confidence interval lengths

Procedia PDF Downloads 107
17850 Test of Capital Account Monetary Model of Floating Exchange Rate Determination: Further Evidence from Selected African Countries

Authors: Oloyede John Adebayo

Abstract:

This paper tested a variant of the monetary model of exchange rate determination, called Frankel’s Capital Account Monetary Model (CAAM) based on Real Interest Rate Differential, on the floating exchange rate experiences of three developing countries of Africa; viz: Ghana, Nigeria and the Gambia. The study adopted the Auto regressive Instrumental Package (AIV) and Almon Polynomial Lag Procedure of regression analysis based on the assumption that the coefficients follow a third-order Polynomial with zero-end constraint. The results found some support for the CAAM hypothesis that exchange rate responds proportionately to changes in money supply, inversely to income and positively to interest rates and expected inflation differentials. On this basis, the study points the attention of monetary authorities and researchers to the relevance and usefulness of CAAM as appropriate tool and useful benchmark for analyzing the exchange rate behaviour of most developing countries.

Keywords: exchange rate, monetary model, interest differentials, capital account

Procedia PDF Downloads 374
17849 Evaluation of the Effect of IMS on the Social Responsibility in the Oil and Gas Production Companies of National Iranian South Oil Fields Company (NISOC)

Authors: Kamran Taghizadeh

Abstract:

This study was aimed at evaluating the effect of IMS including occupational health system, environmental management system, and safety and health system on the social responsibility (case study of NISOC`s oil and gas production companies). This study`s objectives include evaluating the IMS situation and its effect on social responsibility in addition of providing appropriate solutions based on the study`s hypotheses as a basis for future. Data collection was carried out by library and field studies as well as a questionnaire. The stratified random method was the sampling method and a sample of 285 employees in addition to the collected data (from the questionnaire) were analyzed by inferential statistics methods using SPSS software. Finally, results of regression and fitted model at a significance level of 5% confirmed all hypotheses meaning that IMS and its items have a significant effect on social responsibility.

Keywords: social responsibility, integrated management, oil and gas production companies, regression

Procedia PDF Downloads 227
17848 Statistical Analysis Approach for the e-Glassy Mortar And Radiation Shielding Behaviors Using Anova

Authors: Abadou Yacine, Faid Hayette

Abstract:

Significant investigations were performed on the use and impact on physical properties along with the mechanical strength of the recycled and reused E-glass waste powder. However, it has been modelled how recycled display e-waste glass may affect the characteristics and qualities of dune sand mortar. To be involved in this field, an investigation has been done with the substitution of dune sand for recycled E-glass waste and constant water-cement ratios. The linear relationship between the dune sand mortar and E-glass mortar mix % contributes to the model's reliability. The experimental data was exposed to regression analysis using JMP Statistics software. The regression model with one predictor presented the general form of the equation for the prediction of the five properties' characteristics of dune sand mortar from the substitution ratio of E-waste glass and curing age. The results illustrate that curing a long-term process produced an E-glass waste mortar specimen with the highest compressive strength of 68 MPa in the laboratory environment. Anova analysis indicated that the curing at long-term has the utmost importance on the sorptivity level and ultrasonic pulse velocity loss. Furthermore, the E-glass waste powder percentage has the utmost importance on the compressive strength and improvement in dynamic elasticity modulus. Besides, a significant enhancement of radiation-shielding applications.

Keywords: ANOVA analysis, E-glass waste, durability and sustainability, radiation-shielding

Procedia PDF Downloads 33
17847 Determinants of Poverty: A Logit Regression Analysis of Zakat Applicants

Authors: Zunaidah Ab Hasan, Azhana Othman, Abd Halim Mohd Noor, Nor Shahrina Mohd Rafien

Abstract:

Zakat is a portion of wealth contributed from financially able Muslims to be distributed to predetermine recipients; main among them are the poor and the needy. Distribution of the zakat fund is given with the objective to lift the recipients from poverty. Due to the multidimensional and multifaceted nature of poverty, it is imperative that the causes of poverty are properly identified for assistance given by zakat authorities reached the intended target. Despite, various studies undertaken to identify the poor correctly, there are reports of the poor not receiving the adequate assistance required from zakat. Thus, this study examines the determinants of poverty among applicants for zakat assistance distributed by the State Islamic Religious Council in Malacca (SIRCM). Malacca is a state in Malaysia. The respondents were based on the list of names of new zakat applicants for the month of April and May 2014 provided by SIRCM. A binary logistic regression was estimated based on this data with either zakat applications is rejected or accepted as the dependent variable and set of demographic variables and health as the explanatory variables. Overall, the logistic model successfully predicted factors of acceptance of zakat applications. Three independent variables namely gender, age; size of households and health significantly explain the likelihood of a successful zakat application. Among others, the finding suggests the importance of focusing on providing education opportunity in helping the poor.

Keywords: logistic regression, zakat distribution, status of zakat applications, poverty, education

Procedia PDF Downloads 314
17846 Quality Parameters of Offset Printing Wastewater

Authors: Kiurski S. Jelena, Kecić S. Vesna, Aksentijević M. Snežana

Abstract:

Samples of tap and wastewater were collected in three offset printing facilities in Novi Sad, Serbia. Ten physicochemical parameters were analyzed within all collected samples: pH, conductivity, m - alkalinity, p - alkalinity, acidity, carbonate concentration, hydrogen carbonate concentration, active oxygen content, chloride concentration and total alkali content. All measurements were conducted using the standard analytical and instrumental methods. Comparing the obtained results for tap water and wastewater, a clear quality difference was noticeable, since all physicochemical parameters were significantly higher within wastewater samples. The study also involves the application of simple linear regression analysis on the obtained dataset. By using software package ORIGIN 5 the pH value was mutually correlated with other physicochemical parameters. Based on the obtained values of Pearson coefficient of determination a strong positive correlation between chloride concentration and pH (r = -0.943), as well as between acidity and pH (r = -0.855) was determined. In addition, statistically significant difference was obtained only between acidity and chloride concentration with pH values, since the values of parameter F (247.634 and 182.536) were higher than Fcritical (5.59). In this way, results of statistical analysis highlighted the most influential parameter of water contamination in offset printing, in the form of acidity and chloride concentration. The results showed that variable dependence could be represented by the general regression model: y = a0 + a1x+ k, which further resulted with matching graphic regressions.

Keywords: pollution, printing industry, simple linear regression analysis, wastewater

Procedia PDF Downloads 210
17845 A Study of User Awareness and Attitudes Towards Civil-ID Authentication in Oman’s Electronic Services

Authors: Raya Al Khayari, Rasha Al Jassim, Muna Al Balushi, Fatma Al Moqbali, Said El Hajjar

Abstract:

This study utilizes linear regression analysis to investigate the correlation between user account passwords and the probability of civil ID exposure, offering statistical insights into civil ID security. The study employs multiple linear regression (MLR) analysis to further investigate the elements that influence consumers’ views of civil ID security. This aims to increase awareness and improve preventive measures. The results obtained from the MLR analysis provide a thorough comprehension and can guide specific educational and awareness campaigns aimed at promoting improved security procedures. In summary, the study’s results offer significant insights for improving existing security measures and developing more efficient tactics to reduce risks related to civil ID security in Oman. By identifying key factors that impact consumers’ perceptions, organizations can tailor their strategies to address vulnerabilities effectively. Additionally, the findings can inform policymakers on potential regulatory changes to enhance civil ID security in the country.

Keywords: civil-id disclosure, awareness, linear regression, multiple regression

Procedia PDF Downloads 2
17844 Forecasting Stock Indexes Using Bayesian Additive Regression Tree

Authors: Darren Zou

Abstract:

Forecasting the stock market is a very challenging task. Various economic indicators such as GDP, exchange rates, interest rates, and unemployment have a substantial impact on the stock market. Time series models are the traditional methods used to predict stock market changes. In this paper, a machine learning method, Bayesian Additive Regression Tree (BART) is used in predicting stock market indexes based on multiple economic indicators. BART can be used to model heterogeneous treatment effects, and thereby works well when models are misspecified. It also has the capability to handle non-linear main effects and multi-way interactions without much input from financial analysts. In this research, BART is proposed to provide a reliable prediction on day-to-day stock market activities. By comparing the analysis results from BART and with time series method, BART can perform well and has better prediction capability than the traditional methods.

Keywords: BART, Bayesian, predict, stock

Procedia PDF Downloads 96
17843 A Research on Inference from Multiple Distance Variables in Hedonic Regression Focus on Three Variables

Authors: Yan Wang, Yasushi Asami, Yukio Sadahiro

Abstract:

In urban context, urban nodes such as amenity or hazard will certainly affect house price, while classic hedonic analysis will employ distance variables measured from each urban nodes. However, effects from distances to facilities on house prices generally do not represent the true price of the property. Distance variables measured on the same surface are suffering a problem called multicollinearity, which is usually presented as magnitude variance and mean value in regression, errors caused by instability. In this paper, we provided a theoretical framework to identify and gather the data with less bias, and also provided specific sampling method on locating the sample region to avoid the spatial multicollinerity problem in three distance variable’s case.

Keywords: hedonic regression, urban node, distance variables, multicollinerity, collinearity

Procedia PDF Downloads 440
17842 Predicting Growth of Eucalyptus Marginata in a Mediterranean Climate Using an Individual-Based Modelling Approach

Authors: S.K. Bhandari, E. Veneklaas, L. McCaw, R. Mazanec, K. Whitford, M. Renton

Abstract:

Eucalyptus marginata, E. diversicolor and Corymbia calophylla form widespread forests in south-west Western Australia (SWWA). These forests have economic and ecological importance, and therefore, tree growth and sustainable management are of high priority. This paper aimed to analyse and model the growth of these species at both stand and individual levels, but this presentation will focus on predicting the growth of E. Marginata at the individual tree level. More specifically, the study wanted to investigate how well individual E. marginata tree growth could be predicted by considering the diameter and height of the tree at the start of the growth period, and whether this prediction could be improved by also accounting for the competition from neighbouring trees in different ways. The study also wanted to investigate how many neighbouring trees or what neighbourhood distance needed to be considered when accounting for competition. To achieve this aim, the Pearson correlation coefficient was examined among competition indices (CIs), between CIs and dbh growth, and selected the competition index that can best predict the diameter growth of individual trees of E. marginata forest managed under different thinning regimes at Inglehope in SWWA. Furthermore, individual tree growth models were developed using simple linear regression, multiple linear regression, and linear mixed effect modelling approaches. Individual tree growth models were developed for thinned and unthinned stand separately. The developed models were validated using two approaches. In the first approach, models were validated using a subset of data that was not used in model fitting. In the second approach, the model of the one growth period was validated with the data of another growth period. Tree size (diameter and height) was a significant predictor of growth. This prediction was improved when the competition was included in the model. The fit statistic (coefficient of determination) of the model ranged from 0.31 to 0.68. The model with spatial competition indices validated as being more accurate than with non-spatial indices. The model prediction can be optimized if 10 to 15 competitors (by number) or competitors within ~10 m (by distance) from the base of the subject tree are included in the model, which can reduce the time and cost of collecting the information about the competitors. As competition from neighbours was a significant predictor with a negative effect on growth, it is recommended including neighbourhood competition when predicting growth and considering thinning treatments to minimize the effect of competition on growth. These model approaches are likely to be useful tools for the conservations and sustainable management of forests of E. marginata in SWWA. As a next step in optimizing the number and distance of competitors, further studies in larger size plots and with a larger number of plots than those used in the present study are recommended.

Keywords: competition, growth, model, thinning

Procedia PDF Downloads 99
17841 Identifying Diabetic Retinopathy Complication by Predictive Techniques in Indian Type 2 Diabetes Mellitus Patients

Authors: Faiz N. K. Yusufi, Aquil Ahmed, Jamal Ahmad

Abstract:

Predicting the risk of diabetic retinopathy (DR) in Indian type 2 diabetes patients is immensely necessary. India, being the second largest country after China in terms of a number of diabetic patients, to the best of our knowledge not a single risk score for complications has ever been investigated. Diabetic retinopathy is a serious complication and is the topmost reason for visual impairment across countries. Any type or form of DR has been taken as the event of interest, be it mild, back, grade I, II, III, and IV DR. A sample was determined and randomly collected from the Rajiv Gandhi Centre for Diabetes and Endocrinology, J.N.M.C., A.M.U., Aligarh, India. Collected variables include patients data such as sex, age, height, weight, body mass index (BMI), blood sugar fasting (BSF), post prandial sugar (PP), glycosylated haemoglobin (HbA1c), diastolic blood pressure (DBP), systolic blood pressure (SBP), smoking, alcohol habits, total cholesterol (TC), triglycerides (TG), high density lipoprotein (HDL), low density lipoprotein (LDL), very low density lipoprotein (VLDL), physical activity, duration of diabetes, diet control, history of antihypertensive drug treatment, family history of diabetes, waist circumference, hip circumference, medications, central obesity and history of DR. Cox proportional hazard regression is used to design risk scores for the prediction of retinopathy. Model calibration and discrimination are assessed from Hosmer Lemeshow and area under receiver operating characteristic curve (ROC). Overfitting and underfitting of the model are checked by applying regularization techniques and best method is selected between ridge, lasso and elastic net regression. Optimal cut off point is chosen by Youden’s index. Five-year probability of DR is predicted by both survival function, and Markov chain two state model and the better technique is concluded. The risk scores developed can be applied by doctors and patients themselves for self evaluation. Furthermore, the five-year probabilities can be applied as well to forecast and maintain the condition of patients. This provides immense benefit in real application of DR prediction in T2DM.

Keywords: Cox proportional hazard regression, diabetic retinopathy, ROC curve, type 2 diabetes mellitus

Procedia PDF Downloads 148
17840 Modelling of Factors Affecting Bond Strength of Fibre Reinforced Polymer Externally Bonded to Timber and Concrete

Authors: Abbas Vahedian, Rijun Shrestha, Keith Crews

Abstract:

In recent years, fibre reinforced polymers as applications of strengthening materials have received significant attention by civil engineers and environmentalists because of their excellent characteristics. Currently, these composites have become a mainstream technology for strengthening of infrastructures such as steel, concrete and more recently, timber and masonry structures. However, debonding is identified as the main problem which limit the full utilisation of the FRP material. In this paper, a preliminary analysis of factors affecting bond strength of FRP-to-concrete and timber bonded interface has been conducted. A novel theoretical method through regression analysis has been established to evaluate these factors. Results of proposed model are then assessed with results of pull-out tests and satisfactory comparisons are achieved between measured failure loads (R2 = 0.83, P < 0.0001) and the predicted loads (R2 = 0.78, P < 0.0001).

Keywords: debonding, fibre reinforced polymers (FRP), pull-out test, stepwise regression analysis

Procedia PDF Downloads 207
17839 Efficient Credit Card Fraud Detection Based on Multiple ML Algorithms

Authors: Neha Ahirwar

Abstract:

In the contemporary digital era, the rise of credit card fraud poses a significant threat to both financial institutions and consumers. As fraudulent activities become more sophisticated, there is an escalating demand for robust and effective fraud detection mechanisms. Advanced machine learning algorithms have become crucial tools in addressing this challenge. This paper conducts a thorough examination of the design and evaluation of a credit card fraud detection system, utilizing four prominent machine learning algorithms: random forest, logistic regression, decision tree, and XGBoost. The surge in digital transactions has opened avenues for fraudsters to exploit vulnerabilities within payment systems. Consequently, there is an urgent need for proactive and adaptable fraud detection systems. This study addresses this imperative by exploring the efficacy of machine learning algorithms in identifying fraudulent credit card transactions. The selection of random forest, logistic regression, decision tree, and XGBoost for scrutiny in this study is based on their documented effectiveness in diverse domains, particularly in credit card fraud detection. These algorithms are renowned for their capability to model intricate patterns and provide accurate predictions. Each algorithm is implemented and evaluated for its performance in a controlled environment, utilizing a diverse dataset comprising both genuine and fraudulent credit card transactions.

Keywords: efficient credit card fraud detection, random forest, logistic regression, XGBoost, decision tree

Procedia PDF Downloads 25