Search results for: penalized logistic regression
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3284

3014 Incorporating Anomaly Detection in a Digital Twin Scenario Using Symbolic Regression

Authors: Manuel Alves, Angelica Reis, Armindo Lobo, Valdemar Leiras

Abstract:

In Industry 4.0, it is common to have a lot of sensor data. In this deluge of data, hints of possible problems are difficult to spot. The digital twin concept aims to help answer this problem, but it is mainly used as a monitoring tool to handle the visualisation of data. Failure detection is of paramount importance in any industry, and it consumes a lot of resources. Any improvement in this regard is of tangible value to the organisation. The aim of this paper is to add the ability to forecast test failures, curtailing detection times. To achieve this, several anomaly detection algorithms (Isolation Forest, One-Class SVM and an auto-encoder) were compared with a symbolic regression approach, for which the PySR library was used. The first results show that this approach is valid and can be added to the tools available in this context as a low-resource anomaly detection method since, after training, the only requirement is the calculation of a polynomial, a useful feature in the digital twin context.

Keywords: anomaly detection, digital twin, industry 4.0, symbolic regression

Procedia PDF Downloads 89
3013 The Effect of Geographical Differentials of Epidemiological Transition on Health-Seeking Behavior in India

Authors: Sumit Kumar Das, Laishram Ladusingh

Abstract:

Aim: The aim of the study is to examine the differential of epidemiological transition across fifteen agro-climatic zones of India and its effect on health-seeking behavior. Data and Methods: Unit-level data on health consumption expenditure in India from three decadal rounds conducted by the National Sample Survey Organization are used for the analysis. These three rounds are the 52nd (1995-96), 60th (2004-05) and 71st (2014-15). The age-adjusted prevalence rates for communicable and non-communicable diseases are estimated for fifteen agro-climatic zones of India for three time periods. Bivariate analysis is used to find the determinants of health-seeking behavior. Multilevel logistic regression is used to examine factors affecting household health-seeking behavior. Result: The prevalence of communicable diseases is increasing in most of the zones of India. All South Indian zones, the Gujarat plains, and the lower Gangetic plain are facing a severe dual burden of diseases. Demand for medical advice has increased in the southern and eastern zones, and reliance on private healthcare facilities is increasing in most zones. Demographic characteristics of the household head have a significant impact on health-seeking behavior. Conclusion: Proper program implementation is required, considering the disease prevalence and the differential in the pattern of health-seeking behavior. Along with the initiation and strengthening of programs for non-communicable diseases, existing programs for communicable diseases need to be monitored and supervised strictly.

Keywords: agro-climatic zone, epidemiological transition, health-seeking behavior, multilevel regression

Procedia PDF Downloads 154
3012 Factors Contributing to Delayed Diagnosis and Treatment of Breast Cancer and Its Outcome in Jamhoriat Hospital Kabul, Afghanistan

Authors: Ahmad Jawad Fardin

Abstract:

Over 60% of patients with breast cancer in Afghanistan present late with advanced stage III and IV disease, a major cause of the poor survival rate. The objectives of this study were to identify the factors contributing to diagnosis and treatment delay and its outcome. This cross-sectional study was conducted on 318 patients with histologically confirmed breast cancer in the oncology department of Jamhoriat hospital, which is the first and only national cancer center in Afghanistan. Data were collected from medical records and interviews conducted with women diagnosed with breast cancer; linear regression and logistic regression were used for analysis. Patient delay was defined as the time from first recognition of symptoms until first medical consultation, and doctor delay as the time from first consultation with a health care provider until histological confirmation of breast cancer. The mean age of patients was 49.2 ± 11.5 years. The average time to the final diagnosis of breast cancer was 8.5 months; most patients had ductal carcinoma 260.7 (82%). Factors associated with delay were low education level (76%), poor socioeconomic and cultural conditions (81%), lack of a cancer center (73%), and lack of screening (19%). The stage distribution was as follows: stage IV, 22%; stage III, 44.4%; stage II, 29.3%; stage I, 4.3%. Complex associated factors were found to delay the diagnosis of breast cancer and consequently increase adverse outcomes. Raising awareness and education among women, establishing cancer centers, providing accessible diagnosis services and screening, and training general practitioners are required to promote early detection, diagnosis and treatment.

Keywords: delayed diagnosis and poor outcome, breast cancer in Afghanistan, poor outcome of delayed breast cancer treatment, breast cancer delayed diagnosis and treatment in Afghanistan

Procedia PDF Downloads 151
3011 Impact of Infrastructural Development on Socio-Economic Growth: An Empirical Investigation in India

Authors: Jonardan Koner

Abstract:

The study attempts to find out the impact of infrastructural investment on state economic growth in India. It further tries to determine the magnitude of the impact of infrastructural investment on an economic indicator, namely per-capita income (PCI), in Indian states. The study uses a panel regression technique, which incorporates both the cross-section and time-series aspects of the dataset. In order to analyze the difference in impact of the explanatory variables on the explained variables across states, the study uses a Fixed Effect Panel Regression Model. We analyze time series data (annual frequency) ranging from 1991 to 2010. The conclusions of the study are that infrastructural investment has a desirable impact on economic development and that the impact differs across states in India. The study reveals that infrastructural investment significantly explains the variation of economic indicators.

Keywords: infrastructural investment, multiple regression, panel regression techniques, economic development, fixed effect dummy variable model

Procedia PDF Downloads 345
3010 The Effect of Slum Neighborhoods on Pregnancy Outcomes in Tanzania: Secondary Analysis of the 2015-2016 Tanzania Demographic and Health Survey Data

Authors: Luisa Windhagen, Atsumi Hirose, Alex Bottle

Abstract:

Global urbanization has resulted in the expansion of slums, leaving over 10 million Tanzanians in urban poverty and at risk of poor health. Whilst rural residence has historically been associated with an increased risk of adverse pregnancy outcomes, recent studies found higher perinatal mortality rates in urban Tanzania. This study aims to understand to what extent slum neighborhoods may account for the spatial disparities seen in Tanzania. We generated a slum indicator based on UN-HABITAT criteria to identify slum clusters within the 2015-2016 Tanzania Demographic and Health Survey. Descriptive statistics, disaggregated by urban slum, urban non-slum, and rural areas, were produced. Simple and multivariable logistic regression examined the association between cluster residence type and neonatal mortality and stillbirth. For neonatal mortality, we additionally built a multilevel logistic regression model, adjusting for confounding and clustering. The neonatal mortality ratio was highest in slums (38.3 deaths per 1000 live births); the stillbirth rate was three times higher in slums (32.4 deaths per 1000 births) than in urban non-slums. Neonatal death was more likely to occur in slums than in urban non-slums (aOR=2.15, 95% CI=1.02-4.56) and rural areas (aOR=1.78, 95% CI=1.15-2.77). Odds of stillbirth were over five times higher among rural than urban non-slum residents (aOR=5.25, 95% CI=1.31-20.96). The results suggest that slums contribute to the urban disadvantage in Tanzanian neonatal health. Higher neonatal mortality in slums may be attributable to lack of education, lower socioeconomic status, poor healthcare access, and environmental factors, including indoor and outdoor air pollution and unsanitary conditions from inadequate housing. However, further research is required to ascertain specific causalities as well as significant associations between residence type and other pregnancy outcomes. 
The high neonatal mortality, stillbirth, and slum formation rates in Tanzania signify that considerable change is necessary to achieve international goals for health and human settlements. Disparities in access to adequate housing, safe water and sanitation, high standard antenatal, intrapartum, and neonatal care, and maternal education need to urgently be addressed. This study highlights the spatial neonatal mortality shift from rural settings to urban informal settlements in Tanzania. Importantly, other low- and middle-income countries experiencing overwhelming urbanization and slum expansion may also be at risk of a reversing trend in residential neonatal health differences.
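
The odds ratios and confidence intervals reported above come from adjusted (multivariable and multilevel) models, which cannot be reproduced from the abstract. As a simplified illustration of how an odds ratio and its 95% Wald interval are read, the sketch below computes the unadjusted version from a 2x2 table. The function name and all counts are hypothetical and are not taken from the survey data.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed with outcome,   b: exposed without outcome
    c: unexposed with outcome, d: unexposed without outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log_or)
    high = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, low, high


# Hypothetical counts: neonatal deaths and survivors in two residence groups.
or_, low, high = odds_ratio_with_ci(a=30, b=770, c=20, d=980)
print(f"OR = {or_:.2f}, 95% CI = {low:.2f}-{high:.2f}")
```

An interval that excludes 1, as with the slum versus urban non-slum comparison in the abstract, indicates an association that is statistically significant at the 5% level.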

Keywords: urban health, slum residence, neonatal mortality, stillbirth, global urbanisation

Procedia PDF Downloads 37
3009 A Quadratic Model to Early Predict the Blastocyst Stage with a Time Lapse Incubator

Authors: Cecile Edel, Sandrine Giscard D'Estaing, Elsa Labrune, Jacqueline Lornage, Mehdi Benchaib

Abstract:

Introduction: The use of incubators equipped with time-lapse technology in Assisted Reproductive Technology (ART) allows continuous surveillance. With morphokinetic parameters, algorithms are available to predict the potential outcome of an embryo. However, the proposed time-lapse algorithms do not take missing data into account, so some embryos cannot be classified. The aim of this work is to construct a predictive model that works even in the case of missing data. Materials and methods: Patients: A retrospective study was performed in the reproductive biology laboratory at the hospital ‘Femme Mère Enfant’ (Lyon, France) between 1 May 2013 and 30 April 2015. Embryos (n=557) obtained from couples (n=108) were cultured in a time-lapse incubator (Embryoscope®, Vitrolife, Goteborg, Sweden). Time-lapse incubator: The morphokinetic parameters obtained during the first three days of embryo life were used to build the predictive model. Predictive model: A quadratic regression was performed between the number of cells and time: N = aT² + bT + c, where N is the number of cells at time T (in hours). The regression coefficients were calculated with Excel software (Microsoft, Redmond, WA, USA); a program in Visual Basic for Applications (VBA) (Microsoft) was written for this purpose. The quadratic equation was used to find a value that predicts blastocyst formation: the synthetize value. The area under the curve (AUC) obtained from the ROC curve was used to assess the performance of the regression coefficients and the synthetize value. A cut-off value was calculated for each regression coefficient and for the synthetize value to obtain two groups between which the difference in blastocyst formation rate was maximal. The data were analyzed with SPSS (IBM, Chicago, IL, USA). Results: Among the 557 embryos, 79.7% had reached the blastocyst stage.
The synthetize value corresponds to the value calculated with the time value equal to 99, for which the highest AUC was obtained. The AUC was 0.648 (p < 0.001) for the regression coefficient ‘a’, 0.363 (p < 0.001) for the regression coefficient ‘b’, 0.633 (p < 0.001) for the regression coefficient ‘c’, and 0.659 (p < 0.001) for the synthetize value. The results are presented as follows: blastocyst formation rate under the cut-off value versus blastocyst formation rate above the cut-off value. For the regression coefficient ‘a’ the optimum cut-off value was -1.14 × 10⁻³ (61.3% versus 84.3%, p < 0.001), 0.26 for the regression coefficient ‘b’ (83.9% versus 63.1%, p < 0.001), -4.4 for the regression coefficient ‘c’ (62.2% versus 83.1%, p < 0.001) and 8.89 for the synthetize value (58.6% versus 85.0%, p < 0.001). Conclusion: This quadratic regression makes it possible to predict the outcome of an embryo even in the case of missing data. Three regression coefficients and a synthetize value could represent the identity card of an embryo. The ‘a’ regression coefficient represents the acceleration of cell division, and the ‘b’ regression coefficient represents the speed of cell division. We could hypothesize that the ‘c’ regression coefficient represents the intrinsic potential of an embryo. This intrinsic potential could depend on the oocyte from which the embryo originated. These hypotheses should be confirmed by studies analyzing the relationship between regression coefficients and ART parameters.
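
The quadratic model described in this abstract can be illustrated with a short least-squares sketch. The authors used Excel and VBA; the Python below is a stand-in and not their implementation. The function names and observation values are hypothetical. Missing observations are given as None and skipped, which is the property the abstract emphasises, and the fitted curve is then evaluated at T = 99 as described in the results.

```python
def fit_quadratic(times, counts):
    """Least-squares fit of N = a*T**2 + b*T + c.

    Observations where the time or the count is None are skipped, so an
    embryo with gaps can still be fitted if it has at least three points
    at distinct times.
    """
    pts = [(t, n) for t, n in zip(times, counts)
           if t is not None and n is not None]
    if len(pts) < 3:
        raise ValueError("need at least three observed points")
    # Normal equations for the design columns [T^2, T, 1].
    s = [sum(t ** k for t, _ in pts) for k in range(5)]      # sums of T^k
    r = [sum(n * t ** k for t, n in pts) for k in range(3)]  # sums of N*T^k
    m = [[s[4], s[3], s[2], r[2]],
         [s[3], s[2], s[1], r[1]],
         [s[2], s[1], s[0], r[0]]]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    for i in range(3):
        p = max(range(i, 3), key=lambda k: abs(m[k][i]))
        m[i], m[p] = m[p], m[i]
        if m[i][i] == 0:
            raise ValueError("times must include three distinct values")
        for k in range(i + 1, 3):
            f = m[k][i] / m[i][i]
            for j in range(i, 4):
                m[k][j] -= f * m[i][j]
    # Back substitution.
    beta = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(m[i][j] * beta[j] for j in range(i + 1, 3))
        beta[i] = (m[i][3] - tail) / m[i][i]
    return tuple(beta)  # (a, b, c)


def evaluate(coeffs, t):
    """Value of the fitted quadratic at time t."""
    a, b, c = coeffs
    return a * t * t + b * t + c


# Synthetic observations (hours, cell counts) with one missing count.
hours = [0, 10, 20, 30, 40]
cells = [1.0, 1.7, None, 4.3, 6.2]
coeffs = fit_quadratic(hours, cells)
value_at_99 = evaluate(coeffs, 99)  # model evaluated at T = 99
```

Solving the 3x3 normal equations directly keeps the sketch dependency-free; for real data a numerical library routine would be the more robust choice.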

Keywords: ART procedure, blastocyst formation, time-lapse incubator, quadratic model

Procedia PDF Downloads 284
3008 Bullying Rates Among Students with Special Needs in the United States

Authors: Kaycee Bills

Abstract:

Past studies have indicated that students who have disabilities are at a higher risk of experiencing bullying victimization in comparison to other student groups. Extracurricular activity participation has been shown to establish better social outcomes for students. These positive social outcomes indirectly decrease the number of times a student is bullied. The following study uses the National Crime Victimization Survey – School Crime Supplement (NCVS/SCS) to analyze the bullying occurrences experienced among students, with disabilities being a focal variable. To explore the relationship between extracurricular involvement and bullying occurrence rates, this study employs a binary logistic regression to determine if athletic and non-athletic extracurricular activities have an impact on the number of times a student with disabilities experiences bullying. Implications for future social welfare practice and research are discussed.

Keywords: disability, bullying, extracurricular activities, athletics

Procedia PDF Downloads 136
3007 Investigating Income Diversification Strategies into Off-Farm Activities Among Rural Households in Ethiopia

Authors: Kibret Berhanu Getinet

Abstract:

Off-farm income diversification by rural farm households has gained the attention of researchers and policymakers because agriculture has failed to meet the needs of people in developing countries like Ethiopia. The objective of this study was to investigate income diversification strategies into off-farm activities among rural households in Hawassa Zuria Woreda, Sidama National Regional State, Ethiopia. The study used primary and secondary data sources; a questionnaire was employed as the primary data collection instrument. A multistage sampling technique was used to collect data from a total of 197 sample households from four kebeles of the study area. Descriptive statistics, as well as econometric methods of data analysis, were employed. The descriptive statistics indicate that the majority of sample rural households (68.53%) engaged in off-farm income diversification activities, while the remaining 31.47% of households did not participate in diversification in the study area. The choice of strategies among participants indicates that 6.60% of respondents participated in off-farm wage employment, 30.46% participated in off-farm self-employment, and about 31.47% participated in both off-farm wage employment and self-employment. The study revealed that the share of off-farm income in total annual earnings of households was about 48.457%, and thus off-farm diversification contributes significantly to rural household income. Moreover, binary and multinomial logistic regression models were employed to identify factors that affect participation in and the choice of off-farm income diversification strategies, respectively.
The binary logit model result indicated that agro-ecological zone, education status of the household, available technical skills of the household, household savings, total livestock owned by the household, access to electricity, road access and the household head being married significantly and positively affected the chance of diversification into off-farm activities, while the on-farm income of households negatively affected the chance of diversification. Similarly, the multinomial logistic regression model estimate revealed that agro-ecological zone, on-farm income, available technical skills, household savings, and access to electricity are positively related to and significantly influence the household’s choice of off-farm wage employment. The off-farm self-employment diversification choice is significantly influenced by on-farm income, available technical skills, household savings, total livestock owned, and access to electricity. Moreover, the result showed that the factors that affect the choice of farm households to engage in both off-farm wage and self-employment are agro-ecological zone, education status, on-farm income, available technical skills, household savings, market access, total livestock owned, access to electricity and road access. Thus, due attention should be given to addressing the demographic, socio-economic, and institutional constraints to strengthen off-farm income diversification strategies and improve the income of rural households.

Keywords: off-farm, income, diversification, logit model

Procedia PDF Downloads 18
3006 Two-Phase Sampling for Estimating a Finite Population Total in Presence of Missing Values

Authors: Daniel Fundi Murithi

Abstract:

Missing data is a real bane in many surveys. To overcome the problems caused by missing data, partial deletion and single imputation methods, among others, have been proposed. However, these methods are associated with problems such as discarding usable data and inaccuracy in reproducing known population parameters and standard errors. Regression and stochastic imputation assume that there is a variable with complete cases to be used as a predictor in estimating missing values in the other variable, and that the relationship between the two variables is linear, which might not be realistic in practice. In this project, we estimate the population total in the presence of missing values in two-phase sampling. Instead of regression or stochastic models, a nonparametric model-based regression is used to impute missing values. An empirical study showed that nonparametric model-based regression imputation is better at reproducing the variance of the population total estimate obtained when there were no missing values than mean, median, regression, and stochastic imputation methods. Although regression and stochastic imputation were better than nonparametric model-based imputation at reproducing the population total estimates obtained when there were no missing values in one of the sample sizes considered, nonparametric model-based imputation may be used when the relationship between outcome and predictor variables is not linear.

Keywords: finite population total, missing data, model-based imputation, two-phase sampling

Procedia PDF Downloads 104
3005 Comparison of Various Classification Techniques Using WEKA for Colon Cancer Detection

Authors: Beema Akbar, Varun P. Gopi, V. Suresh Babu

Abstract:

Colon cancer causes the deaths of about half a million people every year. The common method of detection is histopathological tissue analysis, which is tiring and adds to the pathologist's workload. A novel method is proposed that combines structural and statistical pattern recognition for the detection of colon cancer. This paper presents a comparison among different classifiers, namely Multilayer Perceptron (MLP), Sequential Minimal Optimization (SMO), Bayesian Logistic Regression (BLR) and K-star, using classification accuracy and error rate based on the percentage split method. The results show that the best algorithm in WEKA is the MLP classifier, with an accuracy of 83.333% and a kappa statistic of 0.625. The MLP classifier, which has a lower error rate, is preferred for its stronger classification capability.
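
Accuracy and Cohen's kappa are both derived from the confusion matrix, which the abstract does not report. The sketch below uses one hypothetical 24-sample, two-class matrix chosen only because it is consistent with the reported figures (accuracy 83.333%, kappa 0.625). The actual counts in the study may differ, and the function name is illustrative.

```python
def accuracy_and_kappa(matrix):
    """Accuracy and Cohen's kappa from a square confusion matrix.

    matrix[i][j] is the number of samples of true class i predicted as
    class j.
    """
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical matrix, not taken from the paper.
hypothetical = [[14, 2],
                [2, 6]]
acc, kappa = accuracy_and_kappa(hypothetical)
print(f"accuracy = {acc:.5f}, kappa = {kappa:.3f}")
```

Kappa discounts the agreement expected by chance, which is why it is noticeably lower than the raw accuracy here.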

Keywords: colon cancer, histopathological image, structural and statistical pattern recognition, multilayer perceptron

Procedia PDF Downloads 547
3004 Frailty and Quality of Life among Older Adults: A Study of Six LMICs Using SAGE Data

Authors: Mamta Jat

Abstract:

Background: Increased longevity has raised the percentage of the global population aged 60 years or over. With this “demographic transition” towards ageing, an “epidemiologic transition” is also taking place, characterised by a growing share of non-communicable diseases in the overall disease burden. As a result, many older adults are ageing with chronic disease and high levels of frailty, which often results in lower quality of life. Although frailty may be increasingly common in older adults, preventing or at least delaying the onset of late-life adverse health outcomes and disability is necessary to maintain the health and functional status of the ageing population. This study uses SAGE data to assess levels of frailty, its socio-demographic correlates and its relation with quality of life in the LMICs of India, China, Ghana, Mexico, Russia and South Africa from a comparative perspective. Methods: The data come from the multi-country Study on Global AGEing and Adult Health (SAGE), which consists of nationally representative samples of older adults in six low- and middle-income countries (LMICs): China, Ghana, India, Mexico, the Russian Federation and South Africa. For the purpose of our study, we consider only respondents aged 50 years and over. A logistic regression model has been used to assess the correlates of frailty. Multinomial logistic regression has been used to study the effect of frailty on QOL (quality of life), controlling for the effect of socio-economic and demographic correlates. Results: Among all the countries, India has the highest mean frailty in males (0.22) and females (0.26), and China has the lowest mean frailty in males (0.12) and females (0.14). The odds of being frail increase with age across all the countries. In India, China and Russia, the chances of frailty are higher among rural older adults, whereas in Ghana, South Africa and Mexico rural residence is protective against frailty.
Among all countries, China has the highest percentage (71.46) of frail people in low QOL, whereas Mexico has the lowest percentage (36.13) of frail people in low QOL. The risk of having low and middle QOL is significantly (p<0.001) higher among frail elderly than among non-frail elderly across all countries after controlling for socio-demographic correlates. Conclusion: Women and older age groups have higher frailty levels than men and younger adults in LMICs. The mean frailty scores demonstrated a strong inverse relationship with education and income gradients: lower levels of education and wealth were associated with higher levels of frailty. These patterns are consistent across all LMICs. These data support a significant role of frailty, with all other influences controlled, in having low QOL as measured by the WHOQOL index. Future research needs to build on this evolving concept of frailty in an effort to improve quality of life for the frail elderly population in LMIC settings.

Keywords: ageing, elderly, frailty, quality of life

Procedia PDF Downloads 256
3003 Comparison of GIS-Based Soil Erosion Susceptibility Models Using Support Vector Machine, Binary Logistic Regression and Artificial Neural Network in the Southwest Amazon Region

Authors: Elaine Lima Da Fonseca, Eliomar Pereira Da Silva Filho

Abstract:

The modeling of areas susceptible to soil loss by hydro-erosive processes is a simplified representation of reality whose purpose is to predict future behavior from the observation and interaction of a set of geoenvironmental factors. The models of potential areas for soil loss will be obtained through binary logistic regression, artificial neural networks, and support vector machines. The municipality of Colorado do Oeste, in the south of the western Amazon, was chosen because of soil degradation caused by anthropogenic activities, such as agriculture, road construction, overgrazing, and deforestation, and because of its environmental and socioeconomic configurations. Initially, a soil erosion inventory map will be constructed through various field investigations, including the use of remotely piloted aircraft, orbital imagery, and the PLANAFLORO/RO database. 100 sampling units with the presence of erosion will be selected based on the assumptions indicated in the literature, and, to complement the dichotomous analysis, 100 units with no erosion will be randomly designated. The next step will be the selection of the predictive parameters that exert, jointly, directly, or indirectly, some influence on the mechanism of occurrence of soil erosion events. The chosen predictors are altitude, declivity, aspect or orientation of the slope, curvature of the slope, composite topographic index, flow power index, lineament density, normalized difference vegetation index, drainage density, lithology, soil type, erosivity, and ground surface temperature. After evaluating the relative contribution of each predictor variable, the erosion susceptibility model will be applied to the municipality of Colorado do Oeste - Rondônia using the SPSS Statistics 26 software.
The model will be evaluated using the Cox & Snell R², the Nagelkerke R², the Hosmer and Lemeshow test, the log-likelihood value, and the Wald test, in addition to analysis of the confusion matrix, ROC curve and cumulative gain according to the model specification. The synthesis map of the potential risk of soil erosion resulting from the models will be validated by means of Kappa indices, accuracy, and sensitivity, as well as by field verification of the classes of susceptibility to erosion using drone photogrammetry. Thus, it is expected to obtain a mapping of the following classes of susceptibility to erosion: very low, low, moderate, high, and very high. This mapping may constitute a screening tool to identify areas where more detailed investigations need to be carried out, allowing social resources to be applied more efficiently.

Keywords: modeling, susceptibility to erosion, artificial intelligence, Amazon

Procedia PDF Downloads 38
3002 A Novel Approach towards Test Case Prioritization Technique

Authors: Kamna Solanki, Yudhvir Singh, Sandeep Dalal

Abstract:

Software testing is a time- and cost-intensive process. Scrutiny of the code and rigorous testing are required to identify and rectify putative bugs. The process of bug identification and consequent correction is continuous in nature, and often some of the bugs are removed after the software has been launched in the market. This process of code validation of the altered software during the maintenance phase is termed regression testing. Regression testing is almost always subject to resource constraints; therefore, selecting an appropriate set of test cases from the full set of test cases is a critical issue for regression test planning. This paper presents a novel method for designing a suitable prioritization process to optimize the fault detection rate and performance of regression tests under predefined constraints. The proposed method for test case prioritization, m-ACO, alters the food source selection criteria of natural ants and is basically a modified version of Ant Colony Optimization (ACO). The proposed m-ACO approach has been coded in Perl, and the results are validated using three examples by computing the Average Percentage of Faults Detected (APFD) metric.
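
The APFD metric mentioned above is commonly defined as APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where n is the number of test cases, m the number of faults, and TF_i the position in the ordering of the first test that reveals fault i. The sketch below computes it for a given ordering. It does not implement m-ACO; the test names and the fault matrix are hypothetical, and it assumes every fault is revealed by at least one test in the ordering.

```python
def apfd(order, detects):
    """Average Percentage of Faults Detected for a test ordering.

    order:   list of test identifiers in execution order
    detects: dict mapping each test identifier to the set of faults it
             reveals
    """
    n = len(order)
    faults = set().union(*(detects[t] for t in order))
    m = len(faults)
    if n == 0 or m == 0:
        raise ValueError("need at least one test and one fault")
    # 1-based position of the first test that reveals each fault.
    first_position = {}
    for position, test in enumerate(order, start=1):
        for fault in detects[test]:
            first_position.setdefault(fault, position)
    return 1 - sum(first_position.values()) / (n * m) + 1 / (2 * n)


# Hypothetical fault matrix: three tests, three faults.
detects = {"t1": {"f1"}, "t2": {"f2"}, "t3": {"f1", "f3"}}
prioritized = apfd(["t3", "t1", "t2"], detects)
unprioritized = apfd(["t2", "t1", "t3"], detects)
```

An ordering that reveals faults earlier yields a higher APFD, so in this toy case running t3 first scores better than running it last.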

Keywords: regression testing, software testing, test case prioritization, test suite optimization

Procedia PDF Downloads 304
3001 Detection Efficient Enterprises via Data Envelopment Analysis

Authors: S. Turkan

Abstract:

In this paper, the 2014 data for Turkey’s Top 500 Industrial Enterprises were analyzed by data envelopment analysis. Data envelopment analysis is used to detect efficient decision-making units, such as universities, hospitals and schools, by using inputs and outputs. The decision-making units in this study are enterprises. To detect efficient enterprises, financial ratios related to the productivity of enterprises are used as inputs and outputs. The efficient enterprises with predominantly foreign-owned capital are detected via a super efficiency model. According to the results, Mercedes-Benz is the most efficient enterprise with predominantly foreign-owned capital in Turkey.

Keywords: data envelopment analysis, super efficiency, logistic regression, financial ratios

Procedia PDF Downloads 302
3000 Prediction of the Thermodynamic Properties of Hydrocarbons Using Gaussian Process Regression

Authors: N. Alhazmi

Abstract:

Knowing the thermodynamic properties of hydrocarbons is vital when it comes to analyzing the related chemical reaction outcomes and understanding the reaction process, especially in terms of petrochemical industrial applications, combustion, and catalytic reactions. However, measuring thermodynamic properties experimentally is time-consuming and costly. In this paper, Gaussian process regression (GPR) has been used to directly predict the main thermodynamic properties (standard enthalpy of formation, standard entropy, and heat capacity) for more than 360 cyclic and non-cyclic alkanes, alkenes, and alkynes. A simple workflow has been proposed that can be applied to directly predict the main properties of any hydrocarbon from its descriptors and chemical structure and can be generalized to predict the main properties of any material. The model was evaluated using the coefficient of determination R², which was more than 0.9794 for all the predicted properties.

Keywords: thermodynamic, Gaussian process regression, hydrocarbons, regression, supervised learning, entropy, enthalpy, heat capacity

Procedia PDF Downloads 188
2999 Machine Learning for Aiding Meningitis Diagnosis in Pediatric Patients

Authors: Karina Zaccari, Ernesto Cordeiro Marujo

Abstract:

This paper presents a Machine Learning (ML) approach to support Meningitis diagnosis in patients at a children’s hospital in Sao Paulo, Brazil. The aim is to use ML techniques to reduce the use of invasive procedures, such as cerebrospinal fluid (CSF) collection, as much as possible. In this study, we focus on predicting the probability of Meningitis given the results of blood and urine laboratory tests, together with the analysis of pain or other complaints from the patient. We tested a number of different ML algorithms, including Adaptive Boosting (AdaBoost), Decision Tree, Gradient Boosting, K-Nearest Neighbors (KNN), Logistic Regression, Random Forest and Support Vector Machines (SVM). The Decision Tree algorithm performed best, with 94.56% and 96.18% accuracy for training and testing data, respectively. These results represent a significant aid to doctors in diagnosing Meningitis as early as possible and in preventing expensive and painful procedures on some children.

Keywords: machine learning, medical diagnosis, meningitis detection, pediatric research

Procedia PDF Downloads 121
2998 Imputation of Incomplete Large-Scale Monitoring Count Data via Penalized Estimation

Authors: Mohamed Dakki, Genevieve Robin, Marie Suet, Abdeljebbar Qninba, Mohamed A. El Agbani, Asmâa Ouassou, Rhimou El Hamoumi, Hichem Azafzaf, Sami Rebah, Claudia Feltrup-Azafzaf, Nafouel Hamouda, Wed a.L. Ibrahim, Hosni H. Asran, Amr A. Elhady, Haitham Ibrahim, Khaled Etayeb, Essam Bouras, Almokhtar Saied, Ashrof Glidan, Bakar M. Habib, Mohamed S. Sayoud, Nadjiba Bendjedda, Laura Dami, Clemence Deschamps, Elie Gaget, Jean-Yves Mondain-Monval, Pierre Defos Du Rau

Abstract:

In biodiversity monitoring, large datasets are becoming more and more widely available and are increasingly used globally to estimate species trends and conservation status. These large-scale datasets challenge existing statistical analysis methods, many of which are not adapted to their size, incompleteness and heterogeneity. The development of scalable methods to impute missing data in incomplete large-scale monitoring datasets is crucial to balance sampling in time or space and thus better inform conservation policies. We developed a new method based on penalized Poisson models to impute and analyse incomplete monitoring data in a large-scale framework. The method allows parameterization of (a) space and time factors, (b) the main effects of predictor covariates, as well as (c) space–time interactions. It also benefits from robust statistical and computational capability in large-scale settings. The method was tested extensively on both simulated and real-life waterbird data, with the findings revealing that it outperforms six existing methods in terms of missing data imputation errors. Applying the method to 16 waterbird species, we estimated their long-term trends for the first time at the entire North African scale, a region where monitoring data suffer from many gaps in space and time series. This new approach opens promising perspectives to increase the accuracy of species-abundance trend estimations. We made it freely available in the R package ‘lori’ (https://CRAN.R-project.org/package=lori) and recommend its use for large-scale count data, particularly in citizen science monitoring programmes.

Keywords: biodiversity monitoring, high-dimensional statistics, incomplete count data, missing data imputation, waterbird trends in North-Africa

Procedia PDF Downloads 117
2997 Solving Single Machine Total Weighted Tardiness Problem Using Gaussian Process Regression

Authors: Wanatchapong Kongkaew

Abstract:

This paper proposes an application of a probabilistic technique, namely Gaussian process regression, for estimating an optimal sequence for the single machine total weighted tardiness (SMTWT) scheduling problem. In this work, the Gaussian process regression (GPR) model is used to predict an optimal sequence for the SMTWT problem, and its solution is improved by an iterated local search based on a simulated annealing scheme, called the GPRISA algorithm. The results show that the proposed GPRISA method achieves very good performance and a reasonable trade-off between solution quality and time consumption. Moreover, in the comparison of deviation from the best-known solution, the proposed mechanism noticeably outperforms recent existing approaches.
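For readers unfamiliar with the SMTWT objective mentioned above, the sketch below evaluates the total weighted tardiness of a job sequence, i.e. the sum over jobs of weight × max(0, completion time − due date). It does not implement GPR or the GPRISA algorithm; the brute-force search is only a baseline that works for tiny instances, and the job data are hypothetical.

```python
from itertools import permutations


def total_weighted_tardiness(sequence, processing, due, weight):
    """Sum of weight[j] * max(0, completion[j] - due[j]) over the sequence."""
    clock = 0
    total = 0
    for job in sequence:
        clock += processing[job]  # completion time of this job
        total += weight[job] * max(0, clock - due[job])
    return total


def brute_force_best(processing, due, weight):
    """Exhaustive baseline: only feasible for a handful of jobs."""
    jobs = range(len(processing))
    return min(
        permutations(jobs),
        key=lambda seq: total_weighted_tardiness(seq, processing, due, weight),
    )


# Hypothetical three-job instance for illustration only.
processing = [3, 2, 4]
due = [4, 2, 6]
weight = [1, 3, 2]

best = brute_force_best(processing, due, weight)
print(best, total_weighted_tardiness(best, processing, due, weight))
```

Because the number of sequences grows factorially with the number of jobs, realistic instances need heuristics such as the local search described in the abstract.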

Keywords: Gaussian process regression, iterated local search, simulated annealing, single machine total weighted tardiness

Procedia PDF Downloads 279
2996 The Profit Trend of Cosmetics Products Using Bootstrap Edgeworth Approximation

Authors: Edlira Donefski, Lorenc Ekonomi, Tina Donefski

Abstract:

Edgeworth approximation is an important statistical method that contributes considerably to reducing the sum of the standard deviations of the independent variables’ coefficients in a quantile regression model. This model estimates the conditional median or other quantiles. In this paper, we apply approximating statistical methods to an economic problem. We built a quantile regression model to see how profit is connected with the realized sales of cosmetic products in real data taken from a local business. The linear regression of profit on realized sales was not free of autocorrelation and heteroscedasticity, which is why we used the quantile regression model instead of linear regression. Our aim is to analyze in more detail the relation between the variables under study, profit and realized sales, and how to minimize the standard errors of the independent variable involved in this study, the level of realized sales. The statistical methods applied in our work are the Edgeworth approximation for independent and identically distributed (IID) cases, the bootstrap version of the model, and the Edgeworth approximation for the bootstrap quantile regression model. The graphics and results presented here identify the best approximating model for our study.
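The statement above that quantile regression estimates the conditional median or other quantiles rests on the check (pinball) loss. The sketch below shows that loss and, as a simplified intercept-only case, recovers a sample quantile by minimising it over the observed values. The function names and data are hypothetical; this is not the authors' bootstrap or Edgeworth procedure.

```python
def pinball_loss(residual, tau):
    """Check loss rho_tau(u) = u * (tau - 1[u < 0]) for a quantile level tau."""
    if not 0 < tau < 1:
        raise ValueError("tau must lie strictly between 0 and 1")
    indicator = 1.0 if residual < 0 else 0.0
    return residual * (tau - indicator)


def quantile_by_loss(values, tau):
    """Pick the sample point that minimises the total pinball loss."""
    return min(
        values,
        key=lambda q: sum(pinball_loss(v - q, tau) for v in values),
    )


# Hypothetical sample with one outlier, for illustration only.
sample = [1, 2, 3, 4, 100]
print(quantile_by_loss(sample, 0.5))
```

With tau = 0.5 the loss is half the absolute error, so the minimiser is the median, which is not pulled towards the outlier the way a least-squares mean would be.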

Keywords: bootstrap, Edgeworth approximation, IID, quantile

Procedia PDF Downloads 127
2995 Rural Livelihood under a Changing Climate Pattern in the Zio District of Togo, West Africa

Authors: Martial Amou

Abstract:

This study was carried out to assess the situation of households’ livelihood under a changing climate pattern in the Zio district of Togo, West Africa. The study examined three important aspects: (i) assessment of households’ livelihood situation under a changing climate pattern, (ii) farmers’ perception and understanding of local climate change, (iii) determinants of adaptation strategies undertaken in cropping pattern to climate change. To this end, secondary sources of data and survey data collected from 235 farmers in four villages in the study area were used. A conceptual framework adapted from the Sustainable Livelihood Framework of DFID, a two-step binary logistic regression model, and descriptive statistics were used as methodological approaches. Based on the Sustainable Livelihood Approach (SLA), various factors revolving around the livelihoods of the rural community were grouped into social, natural, physical, human, and financial capital. The study found that the households’ livelihood situation, represented by the overall livelihood index in the study area (34%), is below the standard average households’ livelihood security index (50%). Natural capital was found to be the poorest asset (13%), and this will severely affect the sustainability of livelihood in the long run. The results from descriptive statistics and the first-step regression (selection model) indicated that most of the farmers in the study area have a clear understanding of climate change even though they do not have any idea about greenhouse gases as the main cause behind the issue. From the second-step regression (output model), education, farming experience, access to credit, access to extension services, cropland size, membership of a social group, and distance to the nearest input market were found to be the significant determinants of adaptation measures undertaken in cropping pattern by farmers in the study area.
Based on the results of this study, recommendations are made to farmers, policy makers, institutions, and development service providers in order to better target interventions which build, promote or facilitate the adoption of adaptation measures with potential to build resilience to climate change and thereby improve rural livelihood.

Keywords: climate change, rural livelihood, cropping pattern, adaptation, Zio District

Procedia PDF Downloads 300
2994 Automatic Identification and Classification of Contaminated Biodegradable Plastics using Machine Learning Algorithms and Hyperspectral Imaging Technology

Authors: Nutcha Taneepanichskul, Helen C. Hailes, Mark Miodownik

Abstract:

Plastic waste has emerged as a critical global environmental challenge, primarily driven by the prevalent use of conventional plastics derived from petrochemical refining and manufacturing processes in modern packaging. While these plastics serve vital functions, their persistence in the environment post-disposal poses significant threats to ecosystems. Addressing this issue necessitates multiple approaches, one of which involves the development of biodegradable plastics designed to degrade under controlled conditions, such as industrial composting facilities. It is imperative to note that compostable plastics are engineered for degradation within specific environments and are not suited for uncontrolled settings, including natural landscapes and aquatic ecosystems. The full benefits of compostable packaging are realized when it is subjected to industrial composting, preventing environmental contamination and waste stream pollution. Therefore, effective sorting technologies are essential to enhance composting rates for these materials and diminish the risk of contaminating recycling streams. This study leverages hyperspectral imaging technology (HSI) coupled with advanced machine learning algorithms to accurately identify various types of plastics, encompassing conventional variants like Polyethylene terephthalate (PET), Polypropylene (PP), Low density polyethylene (LDPE), High density polyethylene (HDPE) and biodegradable alternatives such as Polybutylene adipate terephthalate (PBAT), Polylactic acid (PLA), and Polyhydroxyalkanoates (PHA). The dataset is partitioned into three subsets: a training dataset comprising uncontaminated conventional and biodegradable plastics, a validation dataset encompassing contaminated plastics of both types, and a testing dataset featuring real-world packaging items in both pristine and contaminated states.
Five distinct machine learning algorithms, namely Partial Least Squares Discriminant Analysis (PLS-DA), Support Vector Machine (SVM), Convolutional Neural Network (CNN), Logistic Regression, and Decision Tree Algorithm, were developed and evaluated for their classification performance. Remarkably, the Logistic Regression and CNN models exhibited the most promising outcomes, achieving a perfect accuracy rate of 100% for the training and validation datasets. Notably, the testing dataset yielded an accuracy exceeding 80%. The successful implementation of this sorting technology within recycling and composting facilities holds the potential to significantly elevate recycling and composting rates. As a result, the envisioned circular economy for plastics can be established, thereby offering a viable solution to mitigate plastic pollution.

Keywords: biodegradable plastics, sorting technology, hyperspectral imaging technology, machine learning algorithms

Procedia PDF Downloads 40
2993 Mainstreaming Willingness among Black Owned Informal Small Micro Micro Enterprises in South Africa

Authors: Harris Maduku, Irrshad Kaseeram

Abstract:

The objective of this paper is to understand the factors behind the formalisation willingness of South African black owned SMMEs. Cross-sectional data were collected using a questionnaire from 390 informal businesses in Johannesburg and Pretoria using stratified random sampling and clustered sampling. This study employed a multinomial logistic regression to quantitatively understand what encourages informal SMMEs to be willing to mainstream their operations. We find government support, corruption, employment compensation, family labour, success perception, education status, age and financing to be key drivers of the willingness of SMMEs to formalize their operations. The findings of our study point to the need for government departments to invest more in both financial and non-financial strategies, such as capacity building and business education for informal SMMEs, to cultivate their willingness to mainstream.

Keywords: mainstreaming, transition, informal, willingness, multinomial logit

Procedia PDF Downloads 122
2992 Validation of Escherichia coli O157:H7 Inactivation on Apple-Carrot Juice Treated with Manothermosonication by Kinetic Models

Authors: Ozan Kahraman, Hao Feng

Abstract:

Several models such as Weibull, Modified Gompertz, Biphasic linear, and Log-logistic models have been proposed to describe non-linear inactivation kinetics and have been used to fit non-linear inactivation data of several microorganisms for inactivation by heat, high pressure processing or pulsed electric field. Most ultrasonic inactivation studies, however, have employed first-order kinetic parameters (D-values and z-values) to describe the reduction in microbial survival count. This study was conducted to analyze E. coli O157:H7 inactivation data by fitting five microbial survival models (First-order, Weibull, Modified Gompertz, Biphasic linear and Log-logistic) to the inactivation curves. The residual sum of squares and the total sum of squares criteria were used to evaluate the models. The kinetic models were fitted to inactivation data for E. coli O157:H7 treated by manothermosonication (MTS) at three temperatures (40, 50, and 60 °C) and three pressures (100, 200, and 300 kPa). Based on the statistical indices and visual observations, the Weibull and Biphasic models fitted the MTS data best, as shown by high R² values. The remaining models (Modified Gompertz, First-order, and Log-logistic) did not provide a better fit to the MTS data than the Weibull and Biphasic models. It was observed that the data in this study did not follow first-order kinetics. This is possibly because the cells that are sensitive to ultrasound treatment were inactivated first, resulting in a fast inactivation period, while those resistant to ultrasound were killed slowly.
The Weibull and Biphasic models were found to be more flexible for describing the survival curves of E. coli O157:H7 treated by MTS in apple-carrot juice.
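For orientation, commonly used forms of two of the survival models named above, together with the goodness-of-fit index built from the residual and total sums of squares, can be written as follows. The abstract does not state its exact parameterizations, so these forms and symbols (N_0 initial count, D decimal reduction time, δ scale, p shape) are assumptions based on standard usage.

```latex
% First-order (log-linear) model
\log_{10}\frac{N(t)}{N_0} = -\frac{t}{D}

% Weibull model (scale \delta, shape p); p = 1 recovers the first-order model
\log_{10}\frac{N(t)}{N_0} = -\left(\frac{t}{\delta}\right)^{p}

% Goodness of fit from the residual (RSS) and total (TSS) sums of squares
R^{2} = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}}
```

In this form, p < 1 gives a curve that drops quickly at first and then tails off, which matches the fast-then-slow inactivation pattern described in the abstract.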

Keywords: Weibull, Biphasic, MTS, kinetic models, E. coli O157:H7

Procedia PDF Downloads 340
2991 Diversity of Voices: Audio Visual Continuous Speech Recognition with Traditional Approach

Authors: Partha Protim Majumder, Sajeeb Das, Sharun Akter Khushbu

Abstract:

Bengali is widely spoken in the world, but Bengali speech recognition has not received much attention. The task in our study is especially difficult because recognition must be performed in noisy environments. Another challenge we address is collecting and handling speech data from third-gender speakers, and our approach is to recognize the gender in speeches. All of the Bangla speech samples used in this study were short and were taken from real-life situations. We employed the male, female, and third-gender categories of speech. In this study, we derive the features from the spoken word. We used MFCC(1-20), ZCR, rolloff, spec_cen, RMSE, and chroma_stft. We used the algorithms Gboost, Random Forest, K-Nearest Neighbors (KNN), Decision Tree, Naive Bayes, and Logistic Regression (LR) to assess recognition performance, and we obtained the highest performance from Random Forest in recognizing the gender of the speeches.
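Of the features listed above, the zero-crossing rate (ZCR) is the simplest to illustrate. The sketch below computes it for one frame of samples as the fraction of adjacent sample pairs whose signs differ. The function name and the normalisation are assumptions; audio libraries use slightly different conventions, and the abstract does not say which implementation was used.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ.

    Zero is treated as non-negative, so a move from 0.0 to a positive
    value does not count as a crossing.
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


# Hypothetical frames for illustration only.
noisy_frame = [0.4, -0.3, 0.2, -0.1, 0.5, -0.6]
smooth_frame = [0.1, 0.2, 0.3, 0.2, 0.1, 0.05]
print(zero_crossing_rate(noisy_frame), zero_crossing_rate(smooth_frame))
```

Frames dominated by noise or unvoiced sounds tend to have a higher rate than voiced frames, which is why ZCR is often paired with spectral features such as MFCCs.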

Keywords: MFCC, ZCR, Bengali, LR, RMSE, roll-off, Gboost

Procedia PDF Downloads 33
2990 Factors Associated with Self-Rated Health among Persons with Disabilities: A Korean National Survey

Authors: Won-Seok Kim, Hyung-Ik Shin

Abstract:

Self-rated health (SRH) is a subjective assessment of individual health and has been identified as a strong predictor of mortality and morbidity. However, few studies have examined the factors associated with SRH in persons with disabilities (PWD). We used data from the 7th Korean national survey of 5307 PWD in 2008. Multiple logistic regression analysis was performed to identify independent risk factors for poor SRH in PWD. As a result, indicators of physical condition (poor instrumental ADL), socioeconomic disadvantages (poor education, economically inactive, low self-rated social class, medicaid in health insurance, presence of unmet need for hospital use) and social participation and networks (no use of internet service) were selected as independent risk factors for poor SRH in the final model. The findings of the present study would be helpful in designing a program to promote health and narrow the gap in health status among PWD.

Keywords: disabilities, risk factors, self-rated health, socioeconomic disadvantages, social networks

Procedia PDF Downloads 369
2989 Psychosocial Factors in Relation to Musculoskeletal Disorders among Nursing Professionals in Kurdistan Region, Iraq

Authors: Karwan Khudhir

Abstract:

A cross-sectional study was carried out to determine the prevalence of musculoskeletal disorders (MSDs) and the psychosocial factors associated with them among Kurdistan nursing professionals. Simple random sampling was used to select 220 nurses, and data were collected by self-administered questionnaire. The results showed that the overall prevalence of MSDs among Kurdistan nurses was 74% across different body regions; by body region, neck pain was the most frequently reported twelve-month MSD complaint (48.4%). Logistic regression analysis indicated 6 variables that are significantly associated with musculoskeletal disorders: smoking (OR=19.472, 95% CI: 5.396, 70.273), BMI (OR= 5.106, 95% CI: 1.735, 15.025), physical activity (OR=8.639, 95% CI: 3.075, 24.271), psychological demand (OR=6.685, 95% CI: 3.318, 13.468), social support (OR=3.143, 95% CI: 1.202, 4.814) and job satisfaction (OR=2.44, 95% CI: 1.04, 5.63). Prevention strategies and health education that emphasize psychosocial risk factors and how to improve working conditions should be introduced.
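The odds ratios and 95% confidence intervals reported above are the usual way of presenting logistic regression coefficients. As a minimal sketch, assuming a Wald-type interval, an odds ratio is exp(β) and its interval is exp(β ± 1.96·SE). The coefficient and standard error below are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a logit coefficient.

    Assumes a Wald interval: exp(beta - z * se) to exp(beta + z * se).
    """
    if se < 0:
        raise ValueError("standard error must be non-negative")
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


# Hypothetical coefficient and standard error, for illustration only.
odds_ratio, lower, upper = odds_ratio_ci(beta=0.9, se=0.4)
print(f"OR = {odds_ratio:.2f}, 95% CI: {lower:.2f}, {upper:.2f}")
```

An interval that excludes 1 corresponds to a coefficient that differs significantly from zero at the 5% level. Because the interval is symmetric on the log scale, the product of its bounds equals the square of the odds ratio.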

Keywords: Kurdistan Region, Iraq, musculoskeletal disorders, nurses, psycho-social factors

Procedia PDF Downloads 198
2988 A Location Routing Model for the Logistic System in the Mining Collection Centers of the Northern Region of Boyacá-Colombia

Authors: Erika Ruíz, Luis Amaya, Diego Carreño

Abstract:

The main objective of this study is to design a mathematical model for the logistics of mining collection centers in the northern region of the department of Boyacá (Colombia), determining the structure that facilitates the flow of products along the supply chain. To achieve this, it is necessary to define a suitable design of the distribution network, taking into account the products, customers’ characteristics and the availability of information. Other aspects must also be defined, such as the number and capacity of collection centers to establish and the routes to be taken to deliver products to customers. This research will use an operations research problem applied in the design of distribution networks, known as the Location Routing Problem (LRP).

Keywords: location routing problem, logistic, mining collection, model

Procedia PDF Downloads 190
2987 Family Resilience of Children with Cancer: A Latent Profile Analysis

Authors: Bowen Li, Dan Shu, Shiguang Pang, Li Wang, Qian Liu

Abstract:

Background: Every year, approximately 429,000 children and adolescents aged 0-19 are diagnosed with cancer worldwide. The diagnosis brings about substantial psychological pressure and caregiving responsibilities for family members and impacts the families significantly. Family resilience has been found to reduce caregiver distress and can also foster post-traumatic growth in cancer survivors. However, current research on family resilience in childhood cancer mainly focuses on individual caregiver resilience and child adaptation, with less attention given to categorizing family resilience among caregivers of children with cancer. Method: A total of 292 caregivers of children with cancer were recruited from four tertiary hospitals in central China from July 2022 to March 2024. This study was approved by the ethics committee, and participants provided informed consent, with the option to withdraw at any time. The Family Resilience Assessment Scale was used to measure family resilience among caregivers of children with cancer. The Quality of Life scale-family, The Perceived Social Support Scale, and The Connor-Davidson Resilience Scale were used to measure potential influencing factors. This study used latent profile analysis (LPA) to identify latent categories of family resilience among caregivers of children with cancer. Binary logistic regression was used to analyze the influencing factors of family resilience. Results: The results reveal two distinct categories: "high family resilience" and "low family resilience." The "low family resilience" group accounts for 85.96% of the total, while the "high family resilience" group accounts for 14.04%. The "high family resilience" group scores higher across all dimensions than the "low family resilience" group. Within-group comparisons reveal that "family communication and problem-solving" and "empowering the meaning of adversity" receive the highest scores, while "utilizing social and economic resources" scores the lowest.
"Maintaining a positive attitude" scores similarly high to "family communication and problem-solving" in the high family resilience group, whereas it scores similarly low to "utilizing social and economic resources" in the low family resilience group. In single-factor analysis, residence, number of siblings, caregiver's education level, resilience, social support, quality of life, physical well-being and psychological well-being differed significantly between the two categories. In binary logistic regression analysis, households with only one child are more likely to exhibit low family resilience, whereas high personal resilience is associated with a high level of family resilience. Conclusion: Most families of children with cancer require strengthened family resilience. Support for utilizing socio-economic resources is important for families with both high and low family resilience. Single-child families and caregivers with lower resilience require more attention. These findings support the development of targeted interventions to enhance family resilience among families of children with cancer. Future studies could involve children and other family members for a comprehensive understanding of family resilience. Longitudinal studies are necessary to explore the dynamic changes in family resilience throughout the cancer journey.

Keywords: cancer children, caregivers, family resilience, latent profile analysis

Procedia PDF Downloads 12
2986 A Performance Model for Designing Network in Reverse Logistic

Authors: S. Dhib, S. A. Addouche, T. Loukil, A. Elmhamedi

Abstract:

In this paper, a reverse supply chain network is investigated for decision making. The decision is complicated by complex flows of returned products, owing to their increasing quantity, the types of returned products, and the variety of recovery options (reuse, recycling, and refurbishment). The most important problem in the reverse logistics network (RLN) is to direct returned products to the suitable type of recovery option. However, the orientation of returned products from collection sources to recovery disposition has not been well considered in performance models. In this study, we propose a performance model for designing a network configuration in reverse logistics. Conceptual and analytical models are developed that take into account operational, economic, and environmental factors in network design.

Keywords: reverse logistics, network design, performance model, open loop configuration

Procedia PDF Downloads 414
2985 A Novel Approach of NPSO on Flexible Logistic (S-Shaped) Model for Software Reliability Prediction

Authors: Pooja Rani, G. S. Mahapatra, S. K. Pandey

Abstract:

In this paper, we propose a novel approach combining Neural Network and Particle Swarm Optimization methods for software reliability prediction. We first explain how to apply a compound function in a neural network so that we can derive a Flexible Logistic (S-shaped) Growth Curve (FLGC) model. This model mathematically represents software failure as a random process and can be used to evaluate software development status during testing. To avoid becoming trapped in local minima, we applied the Particle Swarm Optimization method to train the proposed model using failure test data sets. We derive our proposed model using computational intelligence based modeling. Thus, the proposed model becomes a Neuro-Particle Swarm Optimization (NPSO) model. We test with different inertia weights to update particle positions and velocities. We obtain results based on the best inertia weight and compare them with Personal based oriented PSO (pPSO), which helps to choose the local best in the network neighborhood. The applicability of the proposed model is demonstrated through a real-time failure test data set. The results obtained from experiments show that the proposed model has a fairly accurate prediction capability in software reliability.
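Two standard building blocks behind the approach described above can be written out for reference. The abstract does not give the exact FLGC form or the update rule used, so the plain logistic growth curve and the textbook inertia-weight PSO update below are assumptions, with symbols chosen here for illustration (m(t) cumulative failures by time t; a, b, c curve parameters; w inertia weight; c_1, c_2 acceleration constants; r_1, r_2 uniform random numbers in [0, 1]; p_i personal best; g neighborhood or global best).

```latex
% Plain logistic (S-shaped) growth curve
m(t) = \frac{a}{1 + b\,e^{-ct}}, \qquad a, b, c > 0

% Inertia-weight PSO update for particle i at iteration k
v_i^{k+1} = w\,v_i^{k} + c_1 r_1 \left(p_i - x_i^{k}\right) + c_2 r_2 \left(g - x_i^{k}\right)

x_i^{k+1} = x_i^{k} + v_i^{k+1}
```

A larger inertia weight w favors exploration of the search space, while a smaller one favors convergence around the current best positions, which is why the choice of w is tested in the abstract.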

Keywords: software reliability, flexible logistic growth curve model, software cumulative failure prediction, neural network, particle swarm optimization

Procedia PDF Downloads 321