Search results for: logistic regression model
18305 Analysis of Attention to the Confucius Institute from Domestic and Foreign Mainstream Media
Authors: Wei Yang, Xiaohui Cui, Weiping Zhu, Liqun Liu
Abstract:
The rapid development of the Confucius Institute is attracting more and more attention from mainstream media around the world. Mainstream media plays a large role in public information dissemination and public opinion. This study analyzes the correlation and functional relationship between domestic and foreign mainstream media by examining the number of reports on the Confucius Institute. Three correlation measures, the Pearson correlation coefficient (PCC), the Spearman correlation coefficient (SCC), and the Kendall rank correlation coefficient (KCC), were applied to analyze the correlations among mainstream media from three regions: mainland China; Hong Kong and Macao (the two special administrative regions of China, denoted as SARs); and overseas countries excluding China, such as the United States, England, and Canada. Further, the paper measures the functional relationships among the regions using a regression model. The experimental analyses found high correlations among mainstream media from the different regions. Additionally, we found a linear relationship between the mainstream media of overseas countries and those of the SARs, based on the number of reports on the Confucius Institute in a data set obtained by crawling the websites of 106 mainstream media outlets for the years 2004 to 2014.
Keywords: mainstream media, Confucius institute, correlation analysis, regression model
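As a rough illustration of the three correlation measures named in this abstract, the sketch below computes Pearson, Spearman, and Kendall coefficients in plain Python. The report counts are hypothetical and are not the study's data; the Kendall variant shown is tau-a, which is an assumption, since the abstract does not say which variant was used.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) / total number of pairs."""
    score = 0
    for i, j in combinations(range(len(x)), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        score += (prod > 0) - (prod < 0)
    n = len(x)
    return score / (n * (n - 1) / 2)


# Hypothetical yearly report counts for two regions (illustrative only).
mainland = [3, 8, 15, 21, 40, 55]
overseas = [1, 5, 9, 20, 18, 35]
print(pearson(mainland, overseas))
print(spearman(mainland, overseas))
print(kendall_tau_a(mainland, overseas))
```

The rank-based measures react only to the ordering of the counts, which is why a single swapped pair lowers them while leaving them insensitive to the size of the jump.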
Procedia PDF Downloads 318
18304 Wealth-Based Inequalities in Child Health: A Micro-Level Analysis of Maharashtra State in India
Abstract:
The study examines the degree and magnitude of wealth-based inequalities in child health and its determinants in India. Despite making strides in economic growth, India has failed to secure a better nutritional status for all the children. The country currently faces the double burden of malnutrition as well as the problems of overweight and obesity. Child malnutrition, obesity, unsafe water, sanitation among others are identified as the risk factors for Non-Communicable Diseases (NCDs). Eliminating malnutrition in all its forms will catalyse improved health and economic outcomes. The assessment of the distributive dimension of child health across various segments of the population is essential for effective policy intervention. The study utilises the fourth round of District Level Health Survey for 2012-13 to analyse the inequalities among children in the age group 0-14 years in Maharashtra, a state in the western region of India with a population of 11.24 crores which constitutes 9.3 percent of the total population of India. The study considers the extent of health inequality by state, districts, sector, age-groups, and gender. The z-scores of four child health outcome variables are computed to assess the nutritional status of pre-school and school children using WHO reference. The descriptive statistics, concentration curves, concentration indices, correlation matrix, logistic regression have been used to analyse the data. The results indicate that magnitude of inequality is higher in Maharashtra and child health inequalities manifest primarily among the weaker sections of society. The concentration curves show that there exists a pro-poor inequality in child malnutrition measured by stunting, wasting, underweight, anaemia and a pro-rich overweight inequality. The inequalities in anaemia are observably lower due to the widespread prevalence. Rural areas exhibit a higher incidence of malnutrition, but greater inequality is observed in the urban areas. 
Overall, the wealth-based inequalities do not vary significantly between age groups. It appears that there is no gender discrimination at the state level. Further, rural-urban differentials in gender show that boys from the rural area and girls living in the urban region experience higher disparities in health. The relative distribution of undernutrition across districts in Maharashtra reveals that malnutrition is rampant and considerable heterogeneity also exists. A negative correlation is established between malnutrition prevalence and human development indicators. The findings of logistic regression analysis reveal that lower economic status of the household is associated with a higher probability of being malnourished. The study recognises household wealth, education of the parent, child gender, and household size as factors significantly related to malnutrition. The results suggest that among the supply-side variables, child-oriented government programmes might be beneficial in tackling nutrition deficit. In order to bridge the health inequality gap, the government needs to target the schemes better and should expand the coverage of services.
Keywords: child health, inequality, malnutrition, obesity
Procedia PDF Downloads 146
18303 A New Nonlinear State-Space Model and Its Application
Authors: Abdullah Eqal Al Mazrooei
Abstract:
In this work, a new nonlinear model is introduced. The model is in state-space form. Its nonlinearity lies in the state equation, where the state vector is multiplied by itself. This makes the model a generalization of many well-known models, such as the Lotka-Volterra model and the Lorenz model, which have many real-life applications. We apply the new model to estimate wind speed by using a new nonlinear estimator that is suited to the model.
Keywords: nonlinear systems, state-space model, Kronecker product, nonlinear estimator
Procedia PDF Downloads 691
18302 Diabetes Mellitus and Blood Glucose Variability Increases the 30-day Readmission Rate after Kidney Transplantation
Authors: Harini Chakkera
Abstract:
Background: Inpatient hyperglycemia is an established independent risk factor among several patient cohorts with hospital readmission. This has not been studied after kidney transplantation. Nearly one-third of patients who have undergone a kidney transplant reportedly experience 30-day readmission. Methods: Data on first-time solitary kidney transplantations were retrieved from September 2015 to December 2018. Information was linked to the electronic health record to determine a diagnosis of diabetes mellitus and extract glucometric and insulin therapy data. Univariate logistic regression analysis and the XGBoost algorithm were used to predict 30-day readmission. We report the average performance of the models on the testing set on five bootstrapped partitions of the data to ensure statistical significance. Results: The cohort included 1036 patients who received kidney transplantation, and 224 (22%) experienced 30-day readmission. The machine learning algorithm was able to predict 30-day readmission with an average AUC of 77.3% (95% CI 75.30-79.3%). We observed statistically significant differences in the presence of pretransplant diabetes, inpatient hyperglycemia, inpatient hypoglycemia, and minimum and maximum glucose values among those with higher 30-day readmission rates. The XGBoost model identified the index admission length of stay, presence of hyper- and hypoglycemia, and recipient and donor BMI values as the most predictive risk factors of 30-day readmission. Additionally, significant variations in the therapeutic management of blood glucose by providers were observed. Conclusions: Suboptimal glucose metrics during hospitalization after kidney transplantation are associated with an increased risk for 30-day hospital readmission. Optimizing the hospital blood glucose management, a modifiable factor, after kidney transplantation may reduce the risk of 30-day readmission.
Keywords: kidney, transplant, diabetes, insulin
Procedia PDF Downloads 90
18301 Mapping of Urban Micro-Climate in Lyon (France) by Integrating Complementary Predictors at Different Scales into Multiple Linear Regression Models
Authors: Lucille Alonso, Florent Renard
Abstract:
The characterizations of urban heat island (UHI) and their interactions with climate change and urban climates are the main research and public health issue, due to the increasing urbanization of the population. These solutions require a better knowledge of the UHI and micro-climate in urban areas, by combining measurements and modelling. This study is part of this topic by evaluating microclimatic conditions in dense urban areas in the Lyon Metropolitan Area (France) using a combination of data traditionally used such as topography, but also from LiDAR (Light Detection And Ranging) data, Landsat 8 satellite observation and Sentinel and ground measurements by bike. These bicycle-dependent weather data collections are used to build the database of the variable to be modelled, the air temperature, over Lyon’s hyper-center. This study aims to model the air temperature, measured during 6 mobile campaigns in Lyon in clear weather, using multiple linear regressions based on 33 explanatory variables. They are of various categories such as meteorological parameters from remote sensing, topographic variables, vegetation indices, the presence of water, humidity, bare soil, buildings, radiation, urban morphology or proximity and density to various land uses (water surfaces, vegetation, bare soil, etc.). The acquisition sources are multiple and come from the Landsat 8 and Sentinel satellites, LiDAR points, and cartographic products downloaded from an open data platform in Greater Lyon. Regarding the presence of low, medium, and high vegetation, the presence of buildings and ground, several buffers close to these factors were tested (5, 10, 20, 25, 50, 100, 200 and 500m). The buffers with the best linear correlations with air temperature for ground are 5m around the measurement points, for low and medium vegetation, and for building 50m and for high vegetation is 100m. 
The explanatory model of the dependent variable is obtained by multiple linear regression of the remaining explanatory variables (those passing a Pearson correlation matrix screen of |r| < 0.7 and a VIF < 5), integrating a stepwise sorting algorithm. Moreover, holdout cross-validation (80% training, 20% testing) is performed, due to its ability to detect over-fitting of multiple regression, although multiple regression provides internal validation and randomization. Multiple linear regression explained, on average, 72% of the variance for the study days, with an average RMSE of only 0.20°C. Surface temperature is the most important variable in the model for estimating air temperature. Other variables are recurrent, such as distance to subway stations, distance to water areas, NDVI, digital elevation model, sky view factor, average vegetation density, or building density. Changing urban morphology influences the city's thermal patterns. The thermal atmosphere in dense urban areas can only be analysed on a microscale to be able to consider the local impact of trees, streets, and buildings. There is currently no network of fixed weather stations sufficiently deployed in central Lyon and most major urban areas. Therefore, it is necessary to use mobile measurements, followed by modelling to characterize the city's multiple thermal environments.
Keywords: air temperature, LIDAR, multiple linear regression, surface temperature, urban heat island
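The pairwise-correlation screen mentioned in this abstract (|r| < 0.7) could be sketched as follows. This is only an assumed greedy procedure, not the authors' implementation: the predictor names and values are hypothetical, the keep-first ordering is an arbitrary choice, and the VIF and stepwise steps are not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def screen_collinear(predictors, threshold=0.7):
    """Greedily keep predictors whose |r| with every kept one is below threshold."""
    kept = []
    for name, values in predictors.items():
        if all(abs(pearson(values, predictors[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical predictor columns (illustrative values, not the Lyon data).
predictors = {
    "surface_temp": [30.1, 31.5, 29.8, 33.0, 32.2, 28.9],
    "surface_temp_smoothed": [30.0, 31.4, 30.0, 32.8, 32.0, 29.1],  # near-duplicate
    "ndvi": [0.42, 0.35, 0.30, 0.12, 0.45, 0.25],
}
print(screen_collinear(predictors))
```

Because the screen is greedy, the result depends on the order in which predictors are visited; a real analysis would normally decide which of two collinear variables to keep on substantive grounds.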
Procedia PDF Downloads 137
18300 Application of Grey Theory in the Forecast of Facility Maintenance Hours for Office Building Tenants and Public Areas
Authors: Yen Chia-Ju, Cheng Ding-Ruei
Abstract:
This study took a case office building as its subject and used grey theory to examine responsive work-order repair requests for facilities and equipment in offices and public areas, with the aim of giving office building owners, executive managers, property management companies, and mechanical and electrical companies a reference for selecting and assessing forecast models. The main conclusions are as follows: 1. Grey Relational Analysis ranks the six categories by number of facility repairs in this order: power systems, building systems, water systems, air conditioning systems, fire systems, and manpower dispatch. Ranked by facility maintenance importance, the order is power systems, building systems, water systems, air conditioning systems, manpower dispatch, and fire systems. 2. GM (1,N) and the regression method took maintenance hours as the dependent variable and repair number, leased area, and number of tenants as independent variables, and produced single-month forecasts based on 12 data points from January to December 2011. In verification, the mean absolute error and average accuracy of GM (1,N) were 6.41% and 93.59%; those of the regression model were 4.66% and 95.34%, indicating that both have highly accurate forecast capability.
Keywords: grey theory, forecast model, Taipei 101, office buildings, property management, facilities, equipment
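The reported error and accuracy pairs each sum to 100% (6.41% + 93.59%, 4.66% + 95.34%), which suggests that accuracy here is 100% minus a mean absolute percentage error. The sketch below shows that calculation under this assumption; the maintenance-hour figures are hypothetical and the function name is illustrative.

```python
def mean_absolute_percentage_error(actual, forecast):
    """Mean of |actual - forecast| / |actual|, expressed in percent."""
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100 * total / len(actual)


# Hypothetical monthly maintenance hours (not the study's data).
actual = [100.0, 120.0, 80.0, 200.0]
forecast = [95.0, 126.0, 84.0, 190.0]

error = mean_absolute_percentage_error(actual, forecast)
accuracy = 100 - error  # assumed definition of "average accuracy"
print(f"error = {error:.2f}%, accuracy = {accuracy:.2f}%")
```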
Procedia PDF Downloads 444
18299 The Relationship between Class Attendance and Performance of Industrial Engineering Students Enrolled for a Statistics Subject at the University of Technology
Authors: Tshaudi Motsima
Abstract:
Class attendance is key at all levels of education. At tertiary level, many students develop a tendency to skip classes without being aware of the repercussions. It is important for all students to attend all classes, as they receive first-hand information and benefit more. The student who attends classes is likely to perform better academically than the student who does not. The aim of this paper is to assess the relationship between class attendance and academic performance of industrial engineering students. The data for this study were collected through the attendance register of students, and the other data were accessed from the Integrated Tertiary Software and the Higher Education Data Analyzer Portal. Data analysis was conducted on a sample of 93 students. The results revealed that students with medium predicate scores (OR = 3.8; p = 0.027) and students with low predicate scores (OR = 21.4; p < 0.001) were significantly more likely to attend less than 80% of the classes than students with high predicate scores. Students with examination performance of less than 50% were more likely to attend less than 80% of classes than students with examination performance of 50% and above, but the difference was not statistically significant (OR = 1.3; p = 0.750).
Keywords: class attendance, examination performance, final outcome, logistic regression
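To make the odds ratios in this abstract concrete, the sketch below computes a crude odds ratio and a Wald 95% confidence interval from a 2×2 table. For a single binary predictor, this crude ratio equals the exponentiated logistic regression coefficient. The counts are hypothetical and are not taken from the study.

```python
from math import exp, log, sqrt


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a, b: exposed group with / without the outcome
    c, d: reference group with / without the outcome
    """
    return (a * d) / (b * c)


def wald_interval(a, b, c, d, z=1.96):
    """Wald confidence interval for the odds ratio, built on the log scale."""
    log_or = log(odds_ratio(a, b, c, d))
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return exp(log_or - z * se), exp(log_or + z * se)


# Hypothetical counts: low attendance (<80%) by predicate-score group.
low_score_low_att, low_score_high_att = 15, 5
high_score_low_att, high_score_high_att = 10, 20

table = (low_score_low_att, low_score_high_att,
         high_score_low_att, high_score_high_att)
print(odds_ratio(*table), wald_interval(*table))
```

An interval that excludes 1 corresponds to a statistically significant association at the matching level, which is how a result such as OR = 1.3 with p = 0.750 can be read as inconclusive.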
Procedia PDF Downloads 133
18298 The Impact of Simulation-based Learning on the Clinical Self-efficacy and Adherence to Infection Control Practices of Nursing Students
Authors: Raeed Alanazi
Abstract:
Introduction: Nursing students have a crucial role to play in the inhibition of infectious diseases and, therefore, must be trained in infection control and prevention modules prior to entering clinical settings. Simulations have been found to have a positive impact on infection control skills and the use of standard precautions. Aim: The purpose of this study was to use the four sources of self-efficacy in explaining the level of clinical self-efficacy and adherence to infection control practices in Saudi nursing students during simulation practice. Method: A cross-sectional design with convenience sampling was used. This study was conducted in all Saudi nursing schools, with a total of 197 students participating. Three scales were used: the Simulation Self-Efficacy Scale (SSES), the four sources of self-efficacy scale (SSES), and the Compliance with Standard Precautions Scale (CSPS). Multiple linear regression was used to test the use of the four sources of self-efficacy (SSES) in explaining the level of clinical self-efficacy and adherence to infection control in nursing students. Results: The vicarious experience subscale (p = .044) was statistically significant. The regression model indicated that for every one-unit increase in vicarious experience (observation and reflection in simulation), the participants’ adherence to infection control increased by .13 units (β = .22, t = 2.03, p = .044). In addition, the regression model indicated that for every one-unit increase in education level, the participants’ adherence to infection control increased by 1.82 units (β = .34, t = 3.64, p < .001). Also, the mastery experience subscale (p < .001) and the vicarious experience subscale (p = .020) shared significant associations with clinical self-efficacy.
Conclusion: The findings of this research support the idea that simulation-based learning can be a valuable teaching-learning method to help nursing students develop clinical competence, which is essential in providing quality and safe nursing care.
Keywords: simulation-based learning, clinical self-efficacy, infection control, nursing students
Procedia PDF Downloads 71
18297 Calibration Model of %Titratable Acidity (Citric Acid) for Intact Tomato by Transmittance SW-NIR Spectroscopy
Authors: K. Petcharaporn, S. Kumchoo
Abstract:
Acidity (citric acid) is one of the chemical constituents that indicates the internal quality and maturity index of tomato. The titratable acidity (%TA) can be predicted non-destructively using transmittance short-wavelength near-infrared (SW-NIR) spectroscopy in the wavelength range 665-955 nm. A set of 167 tomato samples, divided into a training set of 117 and a test set of 50, was used to establish a calibration model to predict %TA by the partial least squares regression (PLSR) technique. The spectra were pretreated with MSC, which gave the optimal calibration result (R = 0.92, RMSEC = 0.03%), and the model achieved high accuracy for %TA prediction on the test set (R = 0.81, RMSEP = 0.05%). The test-set results show that transmittance SW-NIR spectroscopy can be used as a non-destructive method for %TA prediction in tomatoes.
Keywords: tomato, quality, prediction, transmittance, titratable acidity, citric acid
Procedia PDF Downloads 273
18296 Low-Cost, Portable Optical Sensor with Regression Algorithm Models for Accurate Monitoring of Nitrites in Environments
Authors: David X. Dong, Qingming Zhang, Meng Lu
Abstract:
Nitrites enter waterways as runoff from croplands and are discharged from many industrial sites. Excessive nitrite inputs to water bodies lead to eutrophication. On-site rapid detection of nitrite is of increasing interest for managing fertilizer application and monitoring water source quality. Existing methods for detecting nitrites use spectrophotometry, ion chromatography, electrochemical sensors, ion-selective electrodes, chemiluminescence, and colorimetric methods. However, these methods either suffer from high cost or provide low measurement accuracy due to their poor selectivity to nitrites. Therefore, it is desired to develop an accurate and economical method to monitor nitrites in environments. We report a low-cost optical sensor, in conjunction with a machine learning (ML) approach to enable high-accuracy detection of nitrites in water sources. The sensor works under the principle of measuring molecular absorptions of nitrites at three narrowband wavelengths (295 nm, 310 nm, and 357 nm) in the ultraviolet (UV) region. These wavelengths are chosen because they have relatively high sensitivity to nitrites; low-cost light-emitting devices (LEDs) and photodetectors are also available at these wavelengths. A regression model is built, trained, and utilized to minimize cross-sensitivities of these wavelengths to the same analyte, thus achieving precise and reliable measurements with various interference ions. The measured absorbance data is input to the trained model that can provide nitrite concentration prediction for the sample. The sensor is built with i) a miniature quartz cuvette as the test cell that contains a liquid sample under test, ii) three low-cost UV LEDs placed on one side of the cell as light sources, with each LED providing a narrowband light, and iii) a photodetector with a built-in amplifier and an analog-to-digital converter placed on the other side of the test cell to measure the power of transmitted light. 
This simple optical design allows measuring the absorbance data of the sample at the three wavelengths. To train the regression model, absorbances of nitrite ions and their combination with various interference ions are first obtained at the three UV wavelengths using a conventional spectrophotometer. Then, the spectrophotometric data are inputs to different regression algorithm models for training and evaluating high-accuracy nitrite concentration prediction. Our experimental results show that the proposed approach enables instantaneous nitrite detection within several seconds. The sensor hardware costs about one hundred dollars, which is much cheaper than a commercial spectrophotometer. The ML algorithm helps to reduce the average relative errors to below 3.5% over a concentration range from 0.1 ppm to 100 ppm of nitrites. The sensor has been validated to measure nitrites at three sites in Ames, Iowa, USA. This work demonstrates an economical and effective approach to the rapid, reagent-free determination of nitrites with high accuracy. The integration of the low-cost optical sensor and ML data processing can find a wide range of applications in environmental monitoring and management.
Keywords: optical sensor, regression model, nitrites, water quality
Procedia PDF Downloads 72
18295 Regression of Hand Kinematics from Surface Electromyography Data Using an Long Short-Term Memory-Transformer Model
Authors: Anita Sadat Sadati Rostami, Reza Almasi Ghaleh
Abstract:
Surface electromyography (sEMG) offers important insights into muscle activation and has applications in fields including rehabilitation and human-computer interaction. The purpose of this work is to predict the degree of activation of two joints in the index finger using an LSTM-Transformer architecture trained on sEMG data from the Ninapro DB8 dataset. We apply advanced preprocessing techniques, such as multi-band filtering and customizable rectification methods, to enhance the encoding of sEMG data into features that are beneficial for regression tasks. The processed data is converted into spike patterns and simulated using Leaky Integrate-and-Fire (LIF) neuron models, allowing for neuromorphic-inspired processing. Our findings demonstrate that adjusting filtering parameters and neuron dynamics and employing the LSTM-Transformer model improves joint angle prediction performance. This study contributes to the ongoing development of deep learning frameworks for sEMG analysis, which could lead to improvements in motor control systems.
Keywords: surface electromyography, LSTM-transformer, spiking neural networks, hand kinematics, leaky integrate-and-fire neuron, band-pass filtering, muscle activity decoding
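A Leaky Integrate-and-Fire neuron of the kind named in this abstract can be sketched in a few lines. This is a generic forward-Euler formulation with made-up parameters (time constant, threshold, reset value); the abstract does not give the authors' neuron settings or spike-encoding scheme, so none of these values should be read as theirs.

```python
def lif_spikes(inputs, dt=1.0, tau=10.0, v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    """Simulate a leaky integrate-and-fire neuron; return a 0/1 spike train.

    The membrane potential v follows dv/dt = (-(v - v_rest) + input) / tau,
    integrated with forward Euler. Crossing v_thresh emits a spike and
    resets v to v_reset.
    """
    v = v_rest
    spikes = []
    for current in inputs:
        v += dt * (-(v - v_rest) + current) / tau
        if v >= v_thresh:
            spikes.append(1)
            v = v_reset
        else:
            spikes.append(0)
    return spikes


# Hypothetical rectified sEMG envelope: a weak drive, then a strong one.
weak = lif_spikes([0.5] * 50)    # settles below threshold, never fires
strong = lif_spikes([2.0] * 50)  # charges past threshold repeatedly
print(sum(weak), sum(strong))
```

Stronger input drives the potential to threshold faster, so signal amplitude is converted into spike rate, which is the basic idea behind rate-coded spike encoding.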
Procedia PDF Downloads 9
18294 The Role of Urban Development Patterns for Mitigating Extreme Urban Heat: The Case Study of Doha, Qatar
Authors: Yasuyo Makido, Vivek Shandas, David J. Sailor, M. Salim Ferwati
Abstract:
Mitigating extreme urban heat is challenging in a desert climate such as Doha, Qatar, since outdoor daytime temperatures are often too high for the human body to tolerate. Recent studies demonstrate that cities in arid and semiarid areas can exhibit ‘urban cool islands’ - urban areas that are cooler than the surrounding desert. However, the variation of temperatures by time of day and the factors leading to temperature change remain in question. To address these questions, we examined the spatial and temporal variation of air temperature in Doha, Qatar by conducting multiple vehicle-based local temperature observations. We also employed three statistical approaches to model surface temperatures using relevant predictors: (1) Ordinary Least Squares, (2) Regression Tree Analysis and (3) Random Forest for three time periods. Although the most important determinant factors varied by day and time, distance to the coast was the significant determinant at midday. A 70%/30% holdout method was used to create a testing dataset to validate the results through Pearson’s correlation coefficient. The Pearson’s analysis suggests that the Random Forest model predicts the surface temperatures more accurately than the other methods. We conclude with recommendations about the types of development patterns that show the greatest potential for reducing extreme heat in arid climates.
Keywords: desert cities, tree-structure regression model, urban cool Island, vehicle temperature traverse
Procedia PDF Downloads 392
18293 Study of the Influence of Non Genetic Factors Affecting over Nutrition Students in Ayutthaya Province, Thailand
Authors: Thananyada Buapian
Abstract:
Overnutrition is emerging as a morbid disease in developing and Westernized countries. Because of its comorbidities, it is cost-effective to prevent and manage it early. In Thailand, this alarming disease has long been studied, but the prevalence is still higher than in the past. Physicians should recognize it well and have a definite direction to face and combat this dangerous disease. The rapid growth in the number of overnourished students indicates that genetic factors are not the primary determinants, since human genes have remained unchanged for a century. This study aims to assess the prevalence of overnutrition among students and to investigate the non-genetic factors affecting it. A cross-sectional school-based survey was conducted. A two-stage sampling was adopted. Respondents included 1,850 students in grades 4 to 6 in Ayutthaya Province. An anthropometric measurement and questionnaire were developed. Childhood overnutrition was defined as a weight-for-height Z-score above +2SD of NCHS/WHO references. About thirty-three percent of the children in Ayutthaya Province were overnourished. Stepwise multiple logistic regression analysis showed that 8 statistically significant non-genetic factors explain 18 percent of the variation in childhood overnutrition. Sex is the prime factor explaining the variation in childhood overnutrition, followed by duration of light physical activities, duration of moderate physical activities, having been breastfed, the presence of a healthy role model of the caregiver, number of siblings, birth order, and occupation of the caregiver, respectively. Non-genetic factors, especially the subjects’ demographics and physical activities, as well as the caregivers’ background and family environment, should be considered in a viable approach to remedy this health imbalance in children.
Keywords: non genetic factors, non-genetic, over nutrition, over nutrition students
Procedia PDF Downloads 272
18292 Role of P53, KI67 and Cyclin a Immunohistochemical Assay in Predicting Wilms’ Tumor Mortality
Authors: Ahmed Atwa, Ashraf Hafez, Mohamed Abdelhameed, Adel Nabeeh, Mohamed Dawaba, Tamer Helmy
Abstract:
Introduction and Objective: Tumour staging and grading do not usually reflect the future behavior of Wilms' tumor (WT) regarding mortality. Therefore, in this study, P53, Ki67 and cyclin A immunohistochemistry were used in a trial to predict WT cancer-specific survival (CSS). Methods: In this nonconcurrent cohort study, patients' archived data, including age at presentation, gender, history, clinical examination and radiological investigations, were retrieved then the patients were reviewed at the outpatient clinic of a tertiary care center by history-taking, clinical examination and radiological investigations to detect the oncological outcome. Cases that received preoperative chemotherapy or died due to causes other than WT were excluded. Formalin-fixed, paraffin-embedded specimens obtained from the previously preserved blocks at the pathology laboratory were taken on positively charged slides for IHC with p53, Ki67 and cyclin A. All specimens were examined by an experienced histopathologist devoted to the urological practice and blinded to the patient's clinical findings. P53 and cyclin A staining were scored as 0 (no nuclear staining),1 (<10% nuclear staining), 2 (10-50% nuclear staining) and 3 (>50% nuclear staining). Ki67 proliferation index (PI) was graded as low, borderline and high. Results: Of the 75 cases, 40 (53.3%) were males and 35 (46.7%) were females, and the median age was 36 months (2-216). With a mean follow-up of 78.6±31 months, cancer-specific mortality (CSM) occurred in 15 (20%) and 11 (14.7%) patients, respectively. Kaplan-Meier curve was used for survival analysis, and groups were compared using the Log-rank test. Multivariate logistic regression and Cox regression were not used because only one variable (cyclin A) had shown statistical significance (P=.02), whereas the other significant factor (residual tumor) had few cases. Conclusions: Cyclin A IHC should be considered as a marker for the prediction of WT CSS. 
Prospective studies with a larger sample size are needed.
Keywords: wilms’ tumour, nephroblastoma, urology, survival
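The Kaplan-Meier survival analysis mentioned in this abstract rests on a simple product-limit calculation, sketched below in plain Python. The follow-up times and event flags are hypothetical and do not come from the study; the function name is illustrative.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient
    events: 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival) at each distinct death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(e for tt, e in data if tt == t)
        leaving = sum(1 for tt, _ in data if tt == t)
        if deaths:
            # Patients censored at time t still count as at risk at t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
        i += leaving
    return curve


# Hypothetical follow-up in months (1 = cancer-specific death, 0 = censored).
months = [6, 12, 12, 20, 30, 45]
died = [1, 0, 1, 1, 0, 0]
for t, s in kaplan_meier(months, died):
    print(t, round(s, 3))
```

Comparing two such curves, for example by marker status, is what the log-rank test does; that comparison is not shown here.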
Procedia PDF Downloads 67
18291 Applicability of Cameriere’s Age Estimation Method in a Sample of Turkish Adults
Authors: Hatice Boyacioglu, Nursel Akkaya, Humeyra Ozge Yilanci, Hilmi Kansu, Nihal Avcu
Abstract:
The strong relationship between the reduction in the size of the pulp cavity and increasing age has been reported in the literature. This relationship can be utilized to estimate the age of an individual by measuring the pulp cavity size using dental radiographs as a non-destructive method. The purpose of this study is to develop a population-specific regression model for age estimation in a sample of Turkish adults by applying Cameriere’s method on panoramic radiographs. The sample consisted of 100 panoramic radiographs of Turkish patients (40 men, 60 women) aged between 20 and 70 years. Pulp and tooth area ratios (AR) of the maxillary canines were measured by two maxillofacial radiologists, and the results were then subjected to regression analysis. There were no statistically significant intra-observer and inter-observer differences. The correlation coefficient between age and the AR of the maxillary canines was -0.71 and the following regression equation was derived: Estimated Age = 77.365 - (351.193 × AR). The mean prediction error was 4 years, which is within acceptable error limits for age estimation. This shows that the pulp/tooth area ratio is a useful variable for assessing age with reasonable accuracy. Based on the results of this research, it was concluded that Cameriere’s method is suitable for dental age estimation and can be used for forensic procedures in Turkish adults.
Keywords: age estimation by teeth, forensic dentistry, panoramic radiograph, Cameriere's method
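As a worked example of the regression equation in this abstract, the sketch below evaluates it for a few pulp/tooth area ratios. It assumes the coefficients are written with decimal commas in the source (77,365 read as 77.365 and 351,193 as 351.193), which is the only reading consistent with the 20 to 70 year age range. The AR values are hypothetical.

```python
def estimated_age(area_ratio):
    """Estimated age in years from the pulp/tooth area ratio (AR).

    Coefficients are the abstract's, read with decimal commas (an assumption).
    """
    return 77.365 - 351.193 * area_ratio


# Hypothetical AR values for maxillary canines (illustrative only).
for ar in (0.05, 0.10, 0.15):
    print(f"AR = {ar:.2f} -> estimated age = {estimated_age(ar):.1f} years")
```

The negative slope matches the reported correlation of -0.71: a smaller pulp area relative to the tooth corresponds to an older estimated age.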
Procedia PDF Downloads 450
18290 Evaluation of the CRISP-DM Business Understanding Step: An Approach for Assessing the Predictive Power of Regression versus Classification for the Quality Prediction of Hydraulic Test Results
Authors: Christian Neunzig, Simon Fahle, Jürgen Schulz, Matthias Möller, Bernd Kuhlenkötter
Abstract:
Digitalisation in production technology is a driver for the application of machine learning methods. Through the application of predictive quality, the great potential for saving necessary quality control can be exploited through the data-based prediction of product quality and states. However, the serial use of machine learning applications is often prevented by various problems. Fluctuations occur in real production data sets, which are reflected in trends and systematic shifts over time. To counteract these problems, data preprocessing includes rule-based data cleaning, the application of dimensionality reduction techniques, and the identification of comparable data subsets to extract stable features. Successful process control of the target variables aims to centre the measured values around a mean and minimise variance. Competitive leaders claim to have mastered their processes. As a result, much of the real data has a relatively low variance. For the training of prediction models, the highest possible generalisability is required, which is at least made more difficult by this data availability. The implementation of a machine learning application can be interpreted as a production process. The CRoss Industry Standard Process for Data Mining (CRISP-DM) is a process model with six phases that describes the life cycle of data science. As in any process, the costs to eliminate errors increase significantly with each advancing process phase. For the quality prediction of hydraulic test steps of directional control valves, the question arises in the initial phase whether a regression or a classification is more suitable. In the context of this work, the initial phase of the CRISP-DM, the business understanding, is critically compared for the use case at Bosch Rexroth with regard to regression and classification. 
The use of cross-process production data along the value chain of hydraulic valves is a promising approach to predict the quality characteristics of workpieces. Suitable methods for leakage volume flow regression and classification for inspection decision are applied. Impressively, classification is clearly superior to regression and achieves promising accuracies.
Keywords: classification, CRISP-DM, machine learning, predictive quality, regression
Procedia PDF Downloads 144
18289 Study of the Association between Salivary Microbiological Data, Oral Health Indicators, Behavioral Factors, and Social Determinants among Post-COVID Patients Aged 7 to 12 Years in Tbilisi City
Authors: Lia Mania, Ketevan Nanobashvili
Abstract:
Background: Coronavirus disease (COVID-19) has caused a global health crisis during the current pandemic. This study aims to address the paucity of epidemiological studies on the impact of COVID-19 on the oral health of pediatric populations. Methods: An observational, cross-sectional study was conducted in Tbilisi, the capital of Georgia, among 7- to 12-year-old children with PCR- or rapid-test-confirmed past COVID-19 infection in all 10 districts of the city. A total of 332 beneficiaries who had been infected with COVID-19 within one year were included in the study. The population was selected in schools of Tbilisi by cluster sampling, followed by simple random selection within the selected clusters. Following this principle, an equal number of beneficiaries was selected in every district of Tbilisi. By July 1, 2022, according to National Center for Disease Control and Public Health data (NCDC.Ge), the number of test-confirmed cases in the population aged 0-18 in Tbilisi was 115137 children (17.7% of all confirmed cases). The number of patients to be examined was determined by a sample size calculation. Oral screening, microbiological examination of saliva, and administration of oral health questionnaires to guardians were performed. Statistical processing of data was done with SPSS-23. Risk factors were estimated by odds ratios and logistic regression with 95% confidence intervals. Results: Statistically significant differences between the mean oral health indicators of the asymptomatic and symptomatic COVID-infected groups were found for caries intensity (DMF+def; t=4.468, p<0.001), modified gingival index (MGI; t=3.048, p=0.002), and simplified oral hygiene index (S-OHI; t=4.853, p<0.001).
Symptomatic COVID infection has a significant effect on the oral microbiome (Staphylococcus aureus, Candida albicans, Pseudomonas aeruginosa, Streptococcus pneumoniae, Staphylococcus epidermidis) (n=332; 77.3% vs n=332; 58.0%; OR=2.46, 95% CI: 1.318-4.617). Logistic regression showed that the severity of the COVID infection has a significant effect on the frequency of pathogenic and conditionally pathogenic bacteria in the oral cavity (B=0.903, AOR=2.467, 95% CI: 1.318-4.617). Symptomatic COVID infection affects oral health indicators regardless of the presence of other risk factors, such as parental employment status, tooth brushing behaviors, carbohydrate meals, and fruit consumption (p<0.05). Conclusion: Risk factors (parental employment status, tooth brushing behaviors, carbohydrate consumption) were associated with poorer oral health status in a post-COVID population of 7- to 12-year-old children. However, symptomatic COVID infection as a risk factor affected the oral microbiome through the abundant growth of pathogenic and conditionally pathogenic bacteria (Staphylococcus aureus, Candida albicans, Pseudomonas aeruginosa, Streptococcus pneumoniae, Staphylococcus epidermidis) and further worsened oral health indicators. Thus, a close association was established between symptomatic COVID infection and microbiome changes in the post-COVID period, and also between oral health indicators and the symptomatic course of COVID infection.
Keywords: oral microbiome, COVID-19, population based research, oral health indicators
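The relationship between the reported coefficient (B=0.903) and the adjusted odds ratio (AOR=2.467) can be checked with a short calculation, since an odds ratio is the exponential of a logistic regression coefficient. The sketch below is illustrative only: the function name is hypothetical, a Wald-type interval is assumed, and the standard error is not reported in the abstract, so it is backed out from the published interval.

```python
import math

def odds_ratio_with_ci(beta, std_err, z=1.96):
    """Convert a logistic regression coefficient into an odds ratio
    with a Wald-type confidence interval (z=1.96 for roughly 95%)."""
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z * std_err)
    upper = math.exp(beta + z * std_err)
    return odds_ratio, lower, upper

# B = 0.903 is the coefficient reported in the abstract.
beta = 0.903

# ASSUMPTION: the standard error is not given in the abstract. It is
# reconstructed here from the published interval (1.318-4.617) by
# assuming the interval is symmetric on the log-odds scale.
std_err = (math.log(4.617) - math.log(1.318)) / (2 * 1.96)

or_, lo, hi = odds_ratio_with_ci(beta, std_err)
print(f"OR = {or_:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```

Under these assumptions the reported coefficient, odds ratio, and interval are mutually consistent, which is a quick sanity check that can be applied to any logistic regression result reported in this form.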
Procedia PDF Downloads 69
18288 Time Fetching Water and Maternal Childcare Practices: Comparative Study of Women with Children Living in Ethiopia and Malawi
Authors: Davod Ahmadigheidari, Isabel Alvarez, Kate Sinclair, Marnie Davidson, Patrick Cortbaoui, Hugo Melgar-Quiñonez
Abstract:
The burden of collecting water tends to fall disproportionately on women and girls in low-income countries. Specifically, women spend between one and eight hours per day fetching water for domestic use in Sub-Saharan Africa. While there has been research on the global time burden of collecting water, it has mainly focused on water quality parameters, leaving the relationship between water fetching and health outcomes understudied. There is little available evidence regarding the relationship between water fetching and maternal child care practices. The main objective of this study was to help fill this gap in the literature. Data from two surveys in Ethiopia and Malawi conducted by CARE Canada in 2016-2017 were used. Descriptive statistics indicate that women were predominantly responsible for collecting water in both Ethiopia (87%) and Malawi (99%), with the majority spending more than 30 minutes per day on water collection. With regard to child care practices, in both countries breastfeeding was relatively high (77% and 82%, respectively) and treatment for malnutrition was low (15% and 8%, respectively). However, the same consistency was not found for weighing: in Ethiopia only 16% took their children for weighing, in contrast to 94% in Malawi. These three practices were summed to create one variable for regression analyses. Unadjusted logistic regression findings showed that only in Ethiopia was time fetching water significantly associated with child care practices. Once adjusted for covariates, this relationship was no longer found to be significant. Adjusted logistic regressions also showed that the factors that did influence child care practices differed slightly between the two countries. 
In Ethiopia, a lack of access to community water supply (OR=0.668; P=0.010), poor attitudes towards gender equality (OR=0.608; P=0.001), and no access to land (OR=0.603; P<0.001) significantly decreased a woman’s odds of using positive childcare practices. Notably, being a young woman aged 15-24 years (OR=2.308; P=0.017) or 25-29 years (OR=2.065; P=0.028) increased the probability of using positive childcare practices. In Malawi, by contrast, higher maternal age and low decision-making power significantly decreased a woman’s odds of using positive childcare practices. In conclusion, this study found that although the amount of time women spend fetching water makes a difference for childcare practices, it is not significantly related to those practices once covariates are controlled for. Importantly, women’s age contributes to child care practices in both Ethiopia and Malawi.
Keywords: time fetching water, community water supply, women’s child care practices, Ethiopia, Malawi
Procedia PDF Downloads 202
18287 Social Media Marketing Efforts and Hospital Brand Equity: An Empirical Investigation
Authors: Abrar R. Al-Hasan
Abstract:
Despite the widespread use of social media by consumers and marketers, empirical research investigating its economic value in the healthcare industry still lags. This study explores the impact of social media marketing efforts on a hospital's brand equity and, ultimately, consumer response. Social media data from Twitter and Facebook, together with data from an online and offline survey, are analyzed using logistic regression models. A random sample of 728 residents of Kuwait is used. The results show that social media marketing efforts (SMME), in terms of use and validation, lead to higher hospital brand equity and, in turn, to patient loyalty and patient visits. The study highlights the impact of SMME on hospital brand equity and patient response. Healthcare organizations should guide their marketing efforts to better manage this new way of marketing and communicating with patients in order to enhance consumer loyalty and financial performance.
Keywords: brand equity, healthcare marketing, patient visit, social media, SMME
Procedia PDF Downloads 173
18286 Advancing Urban Sustainability through Data-Driven Machine Learning Solutions
Authors: Nasim Eslamirad, Mahdi Rasoulinezhad, Francesco De Luca, Sadok Ben Yahia, Kimmo Sakari Lylykangas, Francesco Pilla
Abstract:
With ongoing urbanization, cities face increasing environmental challenges that affect human well-being. To tackle these issues, data-driven approaches in urban analysis have gained prominence, leveraging urban data to promote sustainability. Integrating Machine Learning techniques enables researchers to analyze and predict complex environmental phenomena such as Urban Heat Island (UHI) occurrences in urban areas. This paper demonstrates the implementation of a data-driven approach that combines interpretable Machine Learning algorithms with interpretability techniques to conduct comprehensive data analyses for sustainable urban design. The developed framework and algorithms are demonstrated for Tallinn, Estonia, to develop sustainable urban strategies for mitigating urban heat waves. Geospatial data, preprocessed and labeled with UHI levels, are used to train various ML models. Logistic Regression emerges as the best-performing model based on evaluation metrics and is used to derive a mathematical equation that distinguishes areas with UHI effects from areas without them, providing insights into UHI occurrences based on buildings and urban features. The derived formula highlights the importance of building volume, height, area, and shape length in creating an urban environment with UHI impact. The data-driven approach and derived equation inform mitigation strategies and sustainable urban development in Tallinn and offer valuable guidance for other locations with varying climates.
Keywords: data-driven approach, machine learning transparent models, interpretable machine learning models, urban heat island effect
Procedia PDF Downloads 37
18285 An Alternative Approach for Assessing the Impact of Cutting Conditions on Surface Roughness Using Single Decision Tree
Authors: S. Ghorbani, N. I. Polushin
Abstract:
In this study, an approach to identifying factors affecting surface roughness in a machining process is presented. The study is based on 81 data points on surface roughness over a wide range of cutting tools (conventional, cutting tool with holes, cutting tool with composite material), workpiece materials (AISI 1045 Steel, AA2024 aluminum alloy, A48-class30 gray cast iron), spindle speed (630-1000 rpm), feed rate (0.05-0.075 mm/rev), depth of cut (0.05-0.15 mm) and tool overhang (41-65 mm). A single decision tree (SDT) analysis was performed to identify factors for a predictive model of surface roughness, and the CART algorithm was employed for building and evaluating the regression tree. Results show that a single decision tree outperforms traditional regression models, with higher rate and forecast accuracy and strong value.
Keywords: cutting condition, surface roughness, decision tree, CART algorithm
Procedia PDF Downloads 375
18284 Antecedents of Spinouts: Technology Relatedness, Intellectual Property Rights, and Venture Capital
Authors: Sepideh Yeganegi, Andre Laplume, Parshotam Dass, Cam-Loi Huynh
Abstract:
This paper empirically examines organizational and institutional antecedents of entrepreneurial entry. We employ multi-level logistic regression modelling methods on a sub-sample of the Global Entrepreneurship Monitor’s 2011 survey covering 30 countries. The results reveal that employees who have experience with activities unrelated to the core technology of their organizations are more likely to spin out entrepreneurial ventures, whereas those with experiences related to the core technology are less likely to do so. In support of the recent theory, we find that the strength of intellectual property rights and the availability of venture capital have negative and positive effects, respectively, on the likelihood that employees turn into entrepreneurs. These institutional factors also moderate the effect of relatedness to core technology such that entrepreneurial entries by employees with experiences related to core technology are curbed more severely by stronger intellectual property rights protection regimes and lack of venture capital.
Keywords: spinouts, intellectual property rights, venture capital, entrepreneurship, organizational experiences, core technology
Procedia PDF Downloads 356
18283 A Study of User Awareness and Attitudes Towards Civil-ID Authentication in Oman’s Electronic Services
Authors: Raya Al Khayari, Rasha Al Jassim, Muna Al Balushi, Fatma Al Moqbali, Said El Hajjar
Abstract:
This study utilizes linear regression analysis to investigate the correlation between user account passwords and the probability of civil ID exposure, offering statistical insights into civil ID security. The study employs multiple linear regression (MLR) analysis to further investigate the elements that influence consumers’ views of civil ID security. This aims to increase awareness and improve preventive measures. The results obtained from the MLR analysis provide a thorough comprehension and can guide specific educational and awareness campaigns aimed at promoting improved security procedures. In summary, the study’s results offer significant insights for improving existing security measures and developing more efficient tactics to reduce risks related to civil ID security in Oman. By identifying key factors that impact consumers’ perceptions, organizations can tailor their strategies to address vulnerabilities effectively. Additionally, the findings can inform policymakers on potential regulatory changes to enhance civil ID security in the country.
Keywords: civil-id disclosure, awareness, linear regression, multiple regression
Procedia PDF Downloads 57
18282 Prediction of Alzheimer's Disease Based on Blood Biomarkers and Machine Learning Algorithms
Authors: Man-Yun Liu, Emily Chia-Yu Su
Abstract:
Alzheimer's disease (AD) is a public health crisis of the 21st century. AD is a degenerative brain disease and the most common cause of dementia, which places a heavy cost on the healthcare system. Unfortunately, the cause of AD is poorly understood; furthermore, the treatments available so far can only alleviate symptoms rather than cure or stop the progress of the disease. Currently, there are several ways to diagnose AD: medical imaging, which can be used to distinguish between AD, other dementias, and early-onset AD, and cerebrospinal fluid (CSF) testing. Compared with other diagnostic tools, a blood (plasma) test has advantages as an approach to population-based disease screening because it is simpler, less invasive, and cost-effective. In our study, we used the blood biomarker dataset of the Alzheimer’s Disease Neuroimaging Initiative (ADNI), which was funded by the National Institutes of Health (NIH), to perform data analysis and develop a prediction model. We used independent analysis of datasets to identify plasma protein biomarkers predicting early-onset AD. First, to compare the basic demographic statistics between the cohorts, we used SAS Enterprise Guide for data preprocessing and statistical analysis. Second, we used logistic regression, neural network, and decision tree models in SAS Enterprise Miner to validate the biomarkers. The data, drawn from ADNI, contained 146 blood biomarkers from 566 participants. Participants included cognitively normal (healthy) subjects, subjects with mild cognitive impairment (MCI), and patients with Alzheimer’s disease (AD). Participants’ samples were separated into two groups: healthy versus MCI and healthy versus AD. We used the two groups to compare important biomarkers of AD and MCI. In preprocessing, we used a t-test to filter 41/47 features between the two groups (healthy and AD, healthy and MCI) before applying machine learning algorithms. We then built models with 4 machine learning methods; the best AUCs for the two groups were 0.991/0.709. 
We want to stress that a simple, less invasive, common blood (plasma) test may also enable early diagnosis of AD. In our opinion, the results provide evidence that blood-based biomarkers might be an alternative diagnostic tool before further examination with CSF and medical imaging. A comprehensive study on the differences in blood-based biomarkers between AD patients and healthy subjects is warranted. Early detection of AD progression will allow physicians the opportunity for early intervention and treatment.
Keywords: Alzheimer's disease, blood-based biomarkers, diagnostics, early detection, machine learning
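The AUC values quoted in this abstract summarize how well a classifier ranks cases above controls. As a minimal sketch of what that metric measures, the AUC can be computed as the fraction of (case, control) pairs in which the case receives the higher score, with ties counted as half. The function name and the toy labels and scores below are hypothetical and have no connection to the ADNI data; the study itself used SAS Enterprise Miner.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0 (control) or 1 (case)
    scores: predicted scores, higher meaning more likely to be a case
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # case ranked above control
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))

# Hypothetical toy data, for illustration only.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives some context for the gap between the two reported values.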
Procedia PDF Downloads 322
18281 A Research on Inference from Multiple Distance Variables in Hedonic Regression Focus on Three Variables
Authors: Yan Wang, Yasushi Asami, Yukio Sadahiro
Abstract:
In an urban context, urban nodes such as amenities or hazards certainly affect house prices, and classic hedonic analysis employs distance variables measured from each urban node. However, the estimated effects of distances to facilities on house prices generally do not represent the true price of the property. Distance variables measured on the same surface suffer from multicollinearity, which typically manifests as inflated variance and unstable coefficient estimates in regression. In this paper, we provide a theoretical framework to identify and gather data with less bias, and a specific sampling method for locating the sample region to avoid the spatial multicollinearity problem in the case of three distance variables.
Keywords: hedonic regression, urban node, distance variables, multicollinearity, collinearity
Procedia PDF Downloads 465
18280 Quantitative Structure Activity Relationship Model for Predicting the Aromatase Inhibition Activity of 1,2,3-Triazole Derivatives
Authors: M. Ouassaf, S. Belaidi
Abstract:
Aromatase is an estrogen biosynthetic enzyme belonging to the cytochrome P450 family that catalyzes the limiting step in the conversion of androgens to estrogens, which makes it relevant for the promotion of tumor cell growth. A set of thirty 1,2,3-triazole derivatives was used in a quantitative structure-activity relationship (QSAR) study using multiple linear regression (MLR). We divided the data into training and testing groups. The results showed good predictive ability of the MLR model: the model was statistically robust internally (R² = 0.982), and its predictability was tested by several parameters, including external criteria (R²pred = 0.851, CCC = 0.946). The knowledge gained in this study should provide relevant information that contributes to the origins of aromatase inhibitory activity and, therefore, facilitates our ongoing quest for aromatase inhibitors with robust properties.
Keywords: aromatase inhibitors, QSAR, MLR, 1,2,3-triazole
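The two kinds of validation statistic quoted here (R² and the concordance correlation coefficient, CCC) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' procedure: the function names and activity values are hypothetical, the R² below is measured against the mean of the supplied observations (QSAR practice often computes the external R²pred against the training-set mean instead), and the CCC follows Lin's definition with population (1/n) variances.

```python
def r_squared(observed, predicted):
    """Coefficient of determination relative to the mean of `observed`."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def concordance_cc(observed, predicted):
    """Lin's concordance correlation coefficient.

    Penalizes both scatter and systematic offset between the
    observed and predicted values.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    return 2.0 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)

# Hypothetical activity values, for illustration only.
obs = [5.1, 6.3, 7.0, 7.8, 8.4]
pred = [5.3, 6.1, 7.2, 7.5, 8.6]
print(round(r_squared(obs, pred), 3), round(concordance_cc(obs, pred), 3))
```

The CCC is a useful complement to R² for external validation because a model whose predictions are perfectly correlated with the observations but shifted by a constant still scores below 1.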
Procedia PDF Downloads 115
18279 New Approach for Load Modeling
Authors: Slim Chokri
Abstract:
Load forecasting is one of the central functions in power systems operations. Electricity cannot be stored, which means that an electric utility needs an estimate of future demand in order to manage production and purchasing in an economically reasonable way. A majority of the recently reported approaches are based on neural networks. The attraction of these methods lies in the assumption that neural networks are able to learn properties of the load. However, the development of the methods is not finished, and the lack of comparative results on different model variations is a problem. This paper presents a new approach to predicting Tunisia's daily peak load. The proposed method employs a computational intelligence scheme based on the fuzzy neural network (FNN) and support vector regression (SVR). Experimental results indicate that the proposed FNN-SVR technique gives significantly better prediction accuracy than some classical techniques.
Keywords: neural network, load forecasting, fuzzy inference, machine learning, fuzzy modeling and rule extraction, support vector regression
Procedia PDF Downloads 435
18278 Estimation of a Finite Population Mean under Random Non Response Using Improved Nadaraya and Watson Kernel Weights
Authors: Nelson Bii, Christopher Ouma, John Odhiambo
Abstract:
Non-response is a potential source of errors in sample surveys. It introduces bias and large variance in the estimation of finite population parameters. Regression models have been recognized as one of the techniques of reducing bias and variance due to random non-response using auxiliary data. In this study, it is assumed that random non-response occurs in the survey variable in the second stage of cluster sampling, assuming full auxiliary information is available throughout. Auxiliary information is used at the estimation stage via a regression model to address the problem of random non-response. In particular, the auxiliary information is used via an improved Nadaraya-Watson kernel regression technique to compensate for random non-response. The asymptotic bias and mean squared error of the estimator proposed are derived. Besides, a simulation study conducted indicates that the proposed estimator has smaller values of the bias and smaller mean squared error values compared to existing estimators of finite population mean. The proposed estimator is also shown to have tighter confidence interval lengths at a 95% coverage rate. The results obtained in this study are useful, for instance, in choosing efficient estimators of the finite population mean in demographic sample surveys.
Keywords: mean squared error, random non-response, two-stage cluster sampling, confidence interval lengths
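To make the role of the kernel weights concrete, the following is a minimal sketch of the standard Nadaraya-Watson estimator used to predict a missing survey value from an auxiliary variable. It is not the improved weighting scheme proposed in the paper, whose details are not given in the abstract. The Gaussian kernel, the bandwidth, the function names, and the data are all assumptions made for illustration.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def nadaraya_watson(x_query, xs, ys, bandwidth):
    """Kernel-weighted average of ys, with weights that decay as the
    auxiliary value xs[i] moves away from x_query."""
    weights = [gaussian_kernel((x_query - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total

# Hypothetical respondents: auxiliary variable x and survey variable y.
resp_x = [1.0, 2.0, 3.0, 4.0, 5.0]
resp_y = [2.1, 3.9, 6.2, 8.1, 9.8]

# Hypothetical non-respondents: x is known, y is missing and is
# replaced by the kernel regression prediction.
nonresp_x = [2.5, 4.5]
imputed = [nadaraya_watson(x, resp_x, resp_y, bandwidth=0.8) for x in nonresp_x]

# Estimated population mean from observed plus imputed values.
estimate = (sum(resp_y) + sum(imputed)) / (len(resp_y) + len(imputed))
print(imputed, estimate)
```

The bandwidth controls the bias-variance trade-off of the imputed values: a small bandwidth follows nearby respondents closely, while a large one pulls every prediction toward the overall respondent mean.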
Procedia PDF Downloads 140
18277 Epidemiological Investigation of Abortion in Ewes in Algeria
Authors: Laatra Zemmouri, Said Boukhechem, Samia Haffaf, Mohamed Lafri
Abstract:
A study was conducted to determine the prevalence and risk factors associated with abortion in ewes in the region of M’sila, located in central-eastern Algeria. A questionnaire was administered to obtain information about the occurrence of abortion, sheep housing conditions, vaccination, feeding and management practices, and whether the farmers kept other livestock. This cross-sectional study was conducted over 36 months (between 2016 and 2019). A total of 71 sheep flocks were visited. Among 8168 ewes, we recorded 734 (8.99%) abortions and 3861 lambings. Risk factor analysis using multivariable logistic regression showed an association between abortion and vaccination against brucellosis (95% CI = 2.76-1.35; p<0.001). Abortion decreased when dogs were owned (95% CI = 0.36-0.84; p=0.006); however, abortion increased with the presence of cats on farms (95% CI = 1.24-2.8; p=0.003). There was a significant association between abortion and keeping goats (95% CI = 1.18-2.40; p=0.004), cattle (95% CI = 0.3-0.68; p<0.001), and poultry (95% CI = 0.39-0.77; p=0.001) on farms. The study also found a strong association between the occurrence of abortion and estrus synchronization, stillbirth occurrence, and feed supplementation (p<0.05). Identification of the causes of abortion is an important task to reduce foetal losses and to improve livestock productivity.
Keywords: abortion, ewes, questionnaire, risk factors
Procedia PDF Downloads 227
18276 Prey-Stage Preference, Functional Response, and Mutual Interference of Amblyseius swirskii Anthias-Henriot on Frankliniella occidentalis Priesner
Authors: Marjan Heidarian Dehkordi, Hossein Allahyari, Bruce Parker, Reza Talaee-Hassanlouei
Abstract:
The Western flower thrips, Frankliniella occidentalis Priesner (Thysanoptera: Thripidae), is a significant pest of many economically important crops. This study evaluated the functional responses, prey-stage preferences and mutual interference of Amblyseius swirskii Anthias-Henriot (Acari: Phytoseiidae) with F. occidentalis as the host under laboratory conditions. The predator species showed no prey stage preference for either prey 1st or 2nd instar. Logistic regression analysis suggested Type II (convex) functional response for the predator species. Consequently, the per capita searching efficiency decreased significantly from 1.2425 to -7.4987 as predator densities increased from 2 to 8. The findings from this study could help select better biological control agents for effective control of F. occidentalis and other pests in vegetable production.
Keywords: biological control, functional responses, mutual interference, prey-stage preferences
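A Type II functional response is commonly described by Holling's disc equation, in which the number of prey eaten rises with prey density but levels off because each capture costs handling time. The sketch below illustrates the signature that a logistic regression of proportion eaten on prey density is typically used to detect: under Type II, the proportion consumed declines steadily as density increases. The attack rate, handling time, and densities are hypothetical and are not taken from the study.

```python
def holling_type2(prey_density, attack_rate, handling_time, total_time=1.0):
    """Holling's disc equation: expected prey eaten per predator."""
    return (attack_rate * prey_density * total_time
            / (1.0 + attack_rate * handling_time * prey_density))

# Hypothetical parameters, for illustration only.
ATTACK_RATE = 1.2      # searching efficiency per unit time
HANDLING_TIME = 0.05   # time needed to handle one prey item
densities = [2, 4, 8, 16, 32, 64]

eaten = [holling_type2(n, ATTACK_RATE, HANDLING_TIME) for n in densities]
proportions = [e / n for e, n in zip(eaten, densities)]

for n, e, p in zip(densities, eaten, proportions):
    print(f"density={n:3d}  eaten={e:6.2f}  proportion={p:.3f}")
```

With these assumed values the number eaten approaches the ceiling total_time / handling_time, while the proportion eaten falls at every step, which is the pattern that distinguishes Type II from Type III responses.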
Procedia PDF Downloads 325