Search results for: generalized regression
3621 Enhancing a Recidivism Prediction Tool with Machine Learning: Effectiveness and Algorithmic Fairness
Authors: Marzieh Karimihaghighi, Carlos Castillo
Abstract:
This work studies how Machine Learning (ML) may be used to increase the effectiveness of a criminal recidivism risk assessment tool, RisCanvi. The two key dimensions of this analysis are predictive accuracy and algorithmic fairness. ML-based prediction models obtained in this study are more accurate at predicting criminal recidivism than the manually-created formula used in RisCanvi, achieving an AUC of 0.76 and 0.73 in predicting violent and general recidivism respectively. However, the improvements are small, and it is noticed that algorithmic discrimination can easily be introduced between groups such as national vs foreigner, or young vs old. It is described how effectiveness and algorithmic fairness objectives can be balanced, applying a method in which a single error disparity in terms of generalized false positive rate is minimized, while calibration is maintained across groups. Obtained results show that this bias mitigation procedure can substantially reduce generalized false positive rate disparities across multiple groups. Based on these results, it is proposed that ML-based criminal recidivism risk prediction should not be introduced without applying algorithmic bias mitigation procedures.Keywords: algorithmic fairness, criminal risk assessment, equalized odds, recidivism
Procedia PDF Downloads 1523620 The Impact of Female Education on Fertility: A Natural Experiment from Egypt
Authors: Fatma Romeh, Shiferaw Gurmu
Abstract:
This paper examines the impact of female education on fertility, using the change in length of primary schooling in Egypt in 1988-89 as the source of exogenous variation in schooling. In particular, beginning in 1988, children had to attend primary school for only five years rather than six years. This change was applicable to all individuals born on or after October 1977. Using a nonparametric regression discontinuity approach, we compare education and fertility of women born just before and after October 1977. The results show that female education significantly reduces the number of children born per woman and delays the time until first birth. Applying a robust regression discontinuity approach, however, the impact of education on the number of children is no longer significant. The impact on the timing of first birth remained significant under the robust approach. Each year of female education postponed childbearing by three months, on average.Keywords: Egypt, female education, fertility, robust regression discontinuity
Procedia PDF Downloads 3383619 Breast Cancer Mortality and Comorbidities in Portugal: A Predictive Model Built with Real World Data
Authors: Cecília M. Antão, Paulo Jorge Nogueira
Abstract:
Breast cancer (BC) is the first cause of cancer mortality among Portuguese women. This retrospective observational study aimed at identifying comorbidities associated with BC female patients admitted to Portuguese public hospitals (2010-2018), investigating the effect of comorbidities on BC mortality rate, and building a predictive model using logistic regression. Results showed that the BC mortality in Portugal decreased in this period and reached 4.37% in 2018. Adjusted odds ratio indicated that secondary malignant neoplasms of liver, of bone and bone marrow, congestive heart failure, and diabetes were associated with an increased chance of dying from breast cancer. Although the Lisbon district (the most populated area) accounted for the largest percentage of BC patients, the logistic regression model showed that, besides patient’s age, being resident in Bragança, Castelo Branco, or Porto districts was directly associated with an increase of the mortality rate.Keywords: breast cancer, comorbidities, logistic regression, adjusted odds ratio
Procedia PDF Downloads 873618 Assessing Relationships between Glandularity and Gray Level by Using Breast Phantoms
Authors: Yun-Xuan Tang, Pei-Yuan Liu, Kun-Mu Lu, Min-Tsung Tseng, Liang-Kuang Chen, Yuh-Feng Tsai, Ching-Wen Lee, Jay Wu
Abstract:
Breast cancer is predominant of malignant tumors in females. The increase in the glandular density increases the risk of breast cancer. BI-RADS is a frequently used density indicator in mammography; however, it significantly overestimates the glandularity. Therefore, it is very important to accurately and quantitatively assess the glandularity by mammography. In this study, 20%, 30% and 50% glandularity phantoms were exposed using a mammography machine at 28, 30 and 31 kVp, and 30, 55, 80 and 105 mAs, respectively. The regions of interest (ROIs) were drawn to assess the gray level. The relationship between the glandularity and gray level under various compression thicknesses, kVp, and mAs was established by the multivariable linear regression. A phantom verification was performed with automatic exposure control (AEC). The regression equation was obtained with an R-square value of 0.928. The average gray levels of the verification phantom were 8708, 8660 and 8434 for 0.952, 0.963 and 0.985 g/cm3, respectively. The percent differences of glandularity to the regression equation were 3.24%, 2.75% and 13.7%. We concluded that the proposed method could be clinically applied in mammography to improve the glandularity estimation and further increase the importance of breast cancer screening.Keywords: mammography, glandularity, gray value, BI-RADS
Procedia PDF Downloads 4923617 An Analysis of the Regression Hypothesis from a Shona Broca’s Aphasci Perspective
Authors: Esther Mafunda, Simbarashe Muparangi
Abstract:
The present paper tests the applicability of the Regression Hypothesis on the pathological language dissolution of a Shona male adult with Broca’s aphasia. It particularly assesses the prediction of the Regression Hypothesis, which states that the process according to which language is forgotten will be the reversal of the process according to which it will be acquired. The main aim of the paper is to find out whether mirror symmetries between L1 acquisition and L1 dissolution of tense in Shona and, if so, what might cause these regression patterns. The paper also sought to highlight the practical contributions that Linguistic theory can make to solving language-related problems. Data was collected from a 46-year-old male adult with Broca’s aphasia who was receiving speech therapy at St Giles Rehabilitation Centre in Harare, Zimbabwe. The primary data elicitation method was experimental, using the probe technique. The TART (Test for Assessing Reference Time) Shona version in the form of sequencing pictures was used to access tense by Broca’s aphasic and 3.5-year-old child. Using the SPSS (Statistical Package for Social Studies) and Excel analysis, it was established that the use of the future tense was impaired in Shona Broca’s aphasic whilst the present and past tense was intact. However, though the past tense was intact in the male adult with Broca’s aphasic, a reference to the remote past was made. The use of the future tense was also found to be difficult for the 3,5-year-old speaking child. No difficulties were encountered in using the present and past tenses. This means that mirror symmetries were found between L1 acquisition and L1 dissolution of tense in Shona. On the basis of the results of this research, it can be concluded that the use of tense in a Shona adult with Broca’s aphasia supports the Regression Hypothesis. The findings of this study are important in terms of speech therapy in the context of Zimbabwe. The study also contributes to Bantu linguistics in general and to Shona linguistics in particular. Further studies could also be done focusing on the rest of the Bantu language varieties in terms of aphasia.Keywords: Broca’s Aphasia, regression hypothesis, Shona, language dissolution
Procedia PDF Downloads 963616 Generalized Correlation for the Condensation and Evaporation Heat Transfer Coefficients of Propane (R290), Butane (R600), R134a, and R407c in Porous Horizontal Tubes: Experimental Investigation
Authors: M. Tarawneh
Abstract:
This work is an experimental study on the heat transfer characteristics and pressure drop of different refrigerants during the condensation and evaporation processes in porous media. Four different refrigerants (R134a, R407C, 600a, R290), with different porosities were used to reach a real understanding of the actual heat transfer characteristics and pressure drop when using porous material inside the condenser and evaporator. Steel balls were used as porous media with different porosities (38%, 43%, 48%). The main goal of this project is to enhance the heat transfer coefficient during the condensation and evaporation processes when using different refrigerants and different porosities. Different correlations for the heat transfer coefficient and the pressure drop of the different refrigerants were developed. Also a generalized empirical correlation was developed for the different refrigerants. The experimental and predicted heat transfer coefficients and pressure drops were compared. It was found that, the Absolute standard deviation for the heat transfer coefficient and the pressure drop not exceeded values of 15% and 20%, respectively.Keywords: condensation, evaporation, porous media, horizontal tubes, heat transfer coefficient, propane, butane
Procedia PDF Downloads 5383615 Apricot Insurance Portfolio Risk
Authors: Kasirga Yildirak, Ismail Gur
Abstract:
We propose a model to measure hail risk of an Agricultural Insurance portfolio. Hail is one of the major catastrophic event that causes big amount of loss to an insurer. Moreover, it is very hard to predict due to its strange atmospheric characteristics. We make use of parcel based claims data on apricot damage collected by the Turkish Agricultural Insurance Pool (TARSIM). As our ultimate aim is to compute the loadings assigned to specific parcels, we build a portfolio risk model that makes use of PD and the severity of the exposures. PD is computed by Spherical-Linear and Circular –Linear regression models as the data carries coordinate information and seasonality. Severity is mapped into integer brackets so that Probability Generation Function could be employed. Individual regressions are run on each clusters estimated on different criteria. Loss distribution is constructed by Panjer Recursion technique. We also show that one risk-one crop model can easily be extended to the multi risk–multi crop model by assuming conditional independency.Keywords: hail insurance, spherical regression, circular regression, spherical clustering
Procedia PDF Downloads 2513614 Generalization of Clustering Coefficient on Lattice Networks Applied to Criminal Networks
Authors: Christian H. Sanabria-Montaña, Rodrigo Huerta-Quintanilla
Abstract:
A lattice network is a special type of network in which all nodes have the same number of links, and its boundary conditions are periodic. The most basic lattice network is the ring, a one-dimensional network with periodic border conditions. In contrast, the Cartesian product of d rings forms a d-dimensional lattice network. An analytical expression currently exists for the clustering coefficient in this type of network, but the theoretical value is valid only up to certain connectivity value; in other words, the analytical expression is incomplete. Here we obtain analytically the clustering coefficient expression in d-dimensional lattice networks for any link density. Our analytical results show that the clustering coefficient for a lattice network with density of links that tend to 1, leads to the value of the clustering coefficient of a fully connected network. We developed a model on criminology in which the generalized clustering coefficient expression is applied. The model states that delinquents learn the know-how of crime business by sharing knowledge, directly or indirectly, with their friends of the gang. This generalization shed light on the network properties, which is important to develop new models in different fields where network structure plays an important role in the system dynamic, such as criminology, evolutionary game theory, econophysics, among others.Keywords: clustering coefficient, criminology, generalized, regular network d-dimensional
Procedia PDF Downloads 4113613 Enhancing the Interpretation of Group-Level Diagnostic Results from Cognitive Diagnostic Assessment: Application of Quantile Regression and Cluster Analysis
Authors: Wenbo Du, Xiaomei Ma
Abstract:
With the empowerment of Cognitive Diagnostic Assessment (CDA), various domains of language testing and assessment have been investigated to dig out more diagnostic information. What is noticeable is that most of the extant empirical CDA-based research puts much emphasis on individual-level diagnostic purpose with very few concerned about learners’ group-level performance. Even though the personalized diagnostic feedback is the unique feature that differentiates CDA from other assessment tools, group-level diagnostic information cannot be overlooked in that it might be more practical in classroom setting. Additionally, the group-level diagnostic information obtained via current CDA always results in a “flat pattern”, that is, the mastery/non-mastery of all tested skills accounts for the two highest proportion. In that case, the outcome does not bring too much benefits than the original total score. To address these issues, the present study attempts to apply cluster analysis for group classification and quantile regression analysis to pinpoint learners’ performance at different proficiency levels (beginner, intermediate and advanced) thus to enhance the interpretation of the CDA results extracted from a group of EFL learners’ reading performance on a diagnostic reading test designed by PELDiaG research team from a key university in China. The results show that EM method in cluster analysis yield more appropriate classification results than that of CDA, and quantile regression analysis does picture more insightful characteristics of learners with different reading proficiencies. The findings are helpful and practical for instructors to refine EFL reading curriculum and instructional plan tailored based on the group classification results and quantile regression analysis. Meanwhile, these innovative statistical methods could also make up the deficiencies of CDA and push forward the development of language testing and assessment in the future.Keywords: cognitive diagnostic assessment, diagnostic feedback, EFL reading, quantile regression
Procedia PDF Downloads 1463612 Efficient Principal Components Estimation of Large Factor Models
Authors: Rachida Ouysse
Abstract:
This paper proposes a constrained principal components (CnPC) estimator for efficient estimation of large-dimensional factor models when errors are cross sectionally correlated and the number of cross-sections (N) may be larger than the number of observations (T). Although principal components (PC) method is consistent for any path of the panel dimensions, it is inefficient as the errors are treated to be homoskedastic and uncorrelated. The new CnPC exploits the assumption of bounded cross-sectional dependence, which defines Chamberlain and Rothschild’s (1983) approximate factor structure, as an explicit constraint and solves a constrained PC problem. The CnPC method is computationally equivalent to the PC method applied to a regularized form of the data covariance matrix. Unlike maximum likelihood type methods, the CnPC method does not require inverting a large covariance matrix and thus is valid for panels with N ≥ T. The paper derives a convergence rate and an asymptotic normality result for the CnPC estimators of the common factors. We provide feasible estimators and show in a simulation study that they are more accurate than the PC estimator, especially for panels with N larger than T, and the generalized PC type estimators, especially for panels with N almost as large as T.Keywords: high dimensionality, unknown factors, principal components, cross-sectional correlation, shrinkage regression, regularization, pseudo-out-of-sample forecasting
Procedia PDF Downloads 1503611 The Factors of Supply Chain Collaboration
Authors: Ghada Soltane
Abstract:
The objective of this study was to identify factors impacting supply chain collaboration. a quantitative study was carried out on a sample of 84 Tunisian industrial companies. To verify the research hypotheses and test the direct effect of these factors on supply chain collaboration a multiple regression method was used using SPSS 26 software. The results show that there are four factors direct effects that affect supply chain collaboration in a meaningful and positive way, including: trust, engagement, information sharing and information qualityKeywords: supply chain collaboration, factors of collaboration, principal component analysis, multiple regression
Procedia PDF Downloads 493610 Developing an Advanced Algorithm Capable of Classifying News, Articles and Other Textual Documents Using Text Mining Techniques
Authors: R. B. Knudsen, O. T. Rasmussen, R. A. Alphinas
Abstract:
The reason for conducting this research is to develop an algorithm that is capable of classifying news articles from the automobile industry, according to the competitive actions that they entail, with the use of Text Mining (TM) methods. It is needed to test how to properly preprocess the data for this research by preparing pipelines which fits each algorithm the best. The pipelines are tested along with nine different classification algorithms in the realm of regression, support vector machines, and neural networks. Preliminary testing for identifying the optimal pipelines and algorithms resulted in the selection of two algorithms with two different pipelines. The two algorithms are Logistic Regression (LR) and Artificial Neural Network (ANN). These algorithms are optimized further, where several parameters of each algorithm are tested. The best result is achieved with the ANN. The final model yields an accuracy of 0.79, a precision of 0.80, a recall of 0.78, and an F1 score of 0.76. By removing three of the classes that created noise, the final algorithm is capable of reaching an accuracy of 94%.Keywords: Artificial Neural network, Competitive dynamics, Logistic Regression, Text classification, Text mining
Procedia PDF Downloads 1213609 Study on Optimal Control Strategy of PM2.5 in Wuhan, China
Authors: Qiuling Xie, Shanliang Zhu, Zongdi Sun
Abstract:
In this paper, we analyzed the correlation relationship among PM2.5 from other five Air Quality Indices (AQIs) based on the grey relational degree, and built a multivariate nonlinear regression equation model of PM2.5 and the five monitoring indexes. For the optimal control problem of PM2.5, we took the partial large Cauchy distribution of membership equation as satisfaction function. We established a nonlinear programming model with the goal of maximum performance to price ratio. And the optimal control scheme is given.Keywords: grey relational degree, multiple linear regression, membership function, nonlinear programming
Procedia PDF Downloads 2993608 SVM-Based Modeling of Mass Transfer Potential of Multiple Plunging Jets
Authors: Surinder Deswal, Mahesh Pal
Abstract:
The paper investigates the potential of support vector machines based regression approach to model the mass transfer capacity of multiple plunging jets, both vertical (θ = 90°) and inclined (θ = 60°). The data set used in this study consists of four input parameters with a total of eighty eight cases. For testing, tenfold cross validation was used. Correlation coefficient values of 0.971 and 0.981 (root mean square error values of 0.0025 and 0.0020) were achieved by using polynomial and radial basis kernel functions based support vector regression respectively. Results suggest an improved performance by radial basis function in comparison to polynomial kernel based support vector machines. The estimated overall mass transfer coefficient, by both the kernel functions, is in good agreement with actual experimental values (within a scatter of ±15 %); thereby suggesting the utility of support vector machines based regression approach.Keywords: mass transfer, multiple plunging jets, support vector machines, ecological sciences
Procedia PDF Downloads 4643607 Measures of Reliability and Transportation Quality on an Urban Rail Transit Network in Case of Links’ Capacities Loss
Authors: Jie Liu, Jinqu Cheng, Qiyuan Peng, Yong Yin
Abstract:
Urban rail transit (URT) plays a significant role in dealing with traffic congestion and environmental problems in cities. However, equipment failure and obstruction of links often lead to URT links’ capacities loss in daily operation. It affects the reliability and transport service quality of URT network seriously. In order to measure the influence of links’ capacities loss on reliability and transport service quality of URT network, passengers are divided into three categories in case of links’ capacities loss. Passengers in category 1 are less affected by the loss of links’ capacities. Their travel is reliable since their travel quality is not significantly reduced. Passengers in category 2 are affected by the loss of links’ capacities heavily. Their travel is not reliable since their travel quality is reduced seriously. However, passengers in category 2 still can travel on URT. Passengers in category 3 can not travel on URT because their travel paths’ passenger flow exceeds capacities. Their travel is not reliable. Thus, the proportion of passengers in category 1 whose travel is reliable is defined as reliability indicator of URT network. The transport service quality of URT network is related to passengers’ travel time, passengers’ transfer times and whether seats are available to passengers. The generalized travel cost is a comprehensive reflection of travel time, transfer times and travel comfort. Therefore, passengers’ average generalized travel cost is used as transport service quality indicator of URT network. The impact of links’ capacities loss on transport service quality of URT network is measured with passengers’ relative average generalized travel cost with and without links’ capacities loss. The proportion of the passengers affected by links and betweenness of links are used to determine the important links in URT network. The stochastic user equilibrium distribution model based on the improved logit model is used to determine passengers’ categories and calculate passengers’ generalized travel cost in case of links’ capacities loss, which is solved with method of successive weighted averages algorithm. The reliability and transport service quality indicators of URT network are calculated with the solution result. Taking Wuhan Metro as a case, the reliability and transport service quality of Wuhan metro network is measured with indicators and method proposed in this paper. The result shows that using the proportion of the passengers affected by links can identify important links effectively which have great influence on reliability and transport service quality of URT network; The important links are mostly connected to transfer stations and the passenger flow of important links is high; With the increase of number of failure links and the proportion of capacity loss, the reliability of the network keeps decreasing, the proportion of passengers in category 3 keeps increasing and the proportion of passengers in category 2 increases at first and then decreases; When the number of failure links and the proportion of capacity loss increased to a certain level, the decline of transport service quality is weakened.Keywords: urban rail transit network, reliability, transport service quality, links’ capacities loss, important links
Procedia PDF Downloads 1283606 Hybrid Algorithm for Non-Negative Matrix Factorization Based on Symmetric Kullback-Leibler Divergence for Signal Dependent Noise: A Case Study
Authors: Ana Serafimovic, Karthik Devarajan
Abstract:
Non-negative matrix factorization approximates a high dimensional non-negative matrix V as the product of two non-negative matrices, W and H, and allows only additive linear combinations of data, enabling it to learn parts with representations in reality. It has been successfully applied in the analysis and interpretation of high dimensional data arising in neuroscience, computational biology, and natural language processing, to name a few. The objective of this paper is to assess a hybrid algorithm for non-negative matrix factorization with multiplicative updates. The method aims to minimize the symmetric version of Kullback-Leibler divergence known as intrinsic information and assumes that the noise is signal-dependent and that it originates from an arbitrary distribution from the exponential family. It is a generalization of currently available algorithms for Gaussian, Poisson, gamma and inverse Gaussian noise. We demonstrate the potential usefulness of the new generalized algorithm by comparing its performance to the baseline methods which also aim to minimize symmetric divergence measures.Keywords: non-negative matrix factorization, dimension reduction, clustering, intrinsic information, symmetric information divergence, signal-dependent noise, exponential family, generalized Kullback-Leibler divergence, dual divergence
Procedia PDF Downloads 2463605 Development of Computational Approach for Calculation of Hydrogen Solubility in Hydrocarbons for Treatment of Petroleum
Authors: Abdulrahman Sumayli, Saad M. AlShahrani
Abstract:
For the hydrogenation process, knowing the solubility of hydrogen (H2) in hydrocarbons is critical to improve the efficiency of the process. We investigated the H2 solubility computation in four heavy crude oil feedstocks using machine learning techniques. Temperature, pressure, and feedstock type were considered as the inputs to the models, while the hydrogen solubility was the sole response. Specifically, we employed three different models: Support Vector Regression (SVR), Gaussian process regression (GPR), and Bayesian ridge regression (BRR). To achieve the best performance, the hyper-parameters of these models are optimized using the whale optimization algorithm (WOA). We evaluated the models using a dataset of solubility measurements in various feedstocks, and we compared their performance based on several metrics. Our results show that the WOA-SVR model tuned with WOA achieves the best performance overall, with an RMSE of 1.38 × 10− 2 and an R-squared of 0.991. These findings suggest that machine learning techniques can provide accurate predictions of hydrogen solubility in different feedstocks, which could be useful in the development of hydrogen-related technologies. Besides, the solubility of hydrogen in the four heavy oil fractions is estimated in different ranges of temperatures and pressures of 150 ◦C–350 ◦C and 1.2 MPa–10.8 MPa, respectivelyKeywords: temperature, pressure variations, machine learning, oil treatment
Procedia PDF Downloads 693604 Representativity Based Wasserstein Active Regression
Authors: Benjamin Bobbia, Matthias Picard
Abstract:
In recent years active learning methodologies based on the representativity of the data seems more promising to limit overfitting. The presented query methodology for regression using the Wasserstein distance measuring the representativity of our labelled dataset compared to the global distribution. In this work a crucial use of GroupSort Neural Networks is made therewith to draw a double advantage. The Wasserstein distance can be exactly expressed in terms of such neural networks. Moreover, one can provide explicit bounds for their size and depth together with rates of convergence. However, heterogeneity of the dataset is also considered by weighting the Wasserstein distance with the error of approximation at the previous step of active learning. Such an approach leads to a reduction of overfitting and high prediction performance after few steps of query. After having detailed the methodology and algorithm, an empirical study is presented in order to investigate the range of our hyperparameters. The performances of this method are compared, in terms of numbers of query needed, with other classical and recent query methods on several UCI datasets.Keywords: active learning, Lipschitz regularization, neural networks, optimal transport, regression
Procedia PDF Downloads 803603 A Machine Learning Approach for Earthquake Prediction in Various Zones Based on Solar Activity
Authors: Viacheslav Shkuratskyy, Aminu Bello Usman, Michael O’Dea, Saifur Rahman Sabuj
Abstract:
This paper examines relationships between solar activity and earthquakes; it applied machine learning techniques: K-nearest neighbour, support vector regression, random forest regression, and long short-term memory network. Data from the SILSO World Data Center, the NOAA National Center, the GOES satellite, NASA OMNIWeb, and the United States Geological Survey were used for the experiment. The 23rd and 24th solar cycles, daily sunspot number, solar wind velocity, proton density, and proton temperature were all included in the dataset. The study also examined sunspots, solar wind, and solar flares, which all reflect solar activity and earthquake frequency distribution by magnitude and depth. The findings showed that the long short-term memory network model predicts earthquakes more correctly than the other models applied in the study, and solar activity is more likely to affect earthquakes of lower magnitude and shallow depth than earthquakes of magnitude 5.5 or larger with intermediate depth and deep depth.Keywords: k-nearest neighbour, support vector regression, random forest regression, long short-term memory network, earthquakes, solar activity, sunspot number, solar wind, solar flares
Procedia PDF Downloads 733602 Physiological Action of Anthraquinone-Containing Preparations
Authors: Dmitry Yu. Korulkin, Raissa A. Muzychkina, Evgenii N. Kojaev
Abstract:
In review the generalized data about biological activity of anthraquinone-containing plants and specimens on their basis is presented. Data of traditional medicine, results of bioscreening and clinical researches of specimens are analyzed.Keywords: anthraquinones, physiologically active substances, phytopreparation, Ramon
Procedia PDF Downloads 3763601 A Hybrid Fuzzy Clustering Approach for Fertile and Unfertile Analysis
Authors: Shima Soltanzadeh, Mohammad Hosain Fazel Zarandi, Mojtaba Barzegar Astanjin
Abstract:
Diagnosis of male infertility by the laboratory tests is expensive and, sometimes it is intolerable for patients. Filling out the questionnaire and then using classification method can be the first step in decision-making process, so only in the cases with a high probability of infertility we can use the laboratory tests. In this paper, we evaluated the performance of four classification methods including naive Bayesian, neural network, logistic regression and fuzzy c-means clustering as a classification, in the diagnosis of male infertility due to environmental factors. Since the data are unbalanced, the ROC curves are most suitable method for the comparison. In this paper, we also have selected the more important features using a filtering method and examined the impact of this feature reduction on the performance of each methods; generally, most of the methods had better performance after applying the filter. We have showed that using fuzzy c-means clustering as a classification has a good performance according to the ROC curves and its performance is comparable to other classification methods like logistic regression.Keywords: classification, fuzzy c-means, logistic regression, Naive Bayesian, neural network, ROC curve
Procedia PDF Downloads 3373600 Sensitivity Based Robust Optimization Using 9 Level Orthogonal Array and Stepwise Regression
Authors: K. K. Lee, H. W. Han, H. L. Kang, T. A. Kim, S. H. Han
Abstract:
For the robust optimization of the manufacturing product design, there are design objectives that must be achieved, such as a minimization of the mean and standard deviation in objective functions within the required sensitivity constraints. The authors utilized the sensitivity of objective functions and constraints with respect to the effective design variables to reduce the computational burden associated with the evaluation of the probabilities. The individual mean and sensitivity values could be estimated easily by using the 9 level orthogonal array based response surface models optimized by the stepwise regression. The present study evaluates a proposed procedure from the robust optimization of rubber domes that are commonly used for keyboard switching, by using the 9 level orthogonal array and stepwise regression along with a desirability function. In addition, a new robust optimization process, i.e., the I2GEO (Identify, Integrate, Generate, Explore and Optimize), was proposed on the basis of the robust optimization in rubber domes. The optimized results from the response surface models and the estimated results by using the finite element analysis were consistent within a small margin of error. The standard deviation of objective function is decreasing 54.17% with suggested sensitivity based robust optimization. (Business for Cooperative R&D between Industry, Academy, and Research Institute funded Korea Small and Medium Business Administration in 2017, S2455569)Keywords: objective function, orthogonal array, response surface model, robust optimization, stepwise regression
Procedia PDF Downloads 2883599 Linear Regression Estimation of Tactile Comfort for Denim Fabrics Based on In-Plane Shear Behavior
Authors: Nazli Uren, Ayse Okur
Abstract:
Tactile comfort of a textile product is an essential property and a major concern when it comes to customer perceptions and preferences. The subjective nature of comfort and the difficulties regarding the simulation of human hand sensory feelings make it hard to establish a well-accepted link between tactile comfort and objective evaluations. On the other hand, shear behavior of a fabric is a mechanical parameter which can be measured by various objective test methods. The principal aim of this study is to determine the tactile comfort of commercially available denim fabrics by subjective measurements, create a tactile score database for denim fabrics and investigate the relations between tactile comfort and shear behavior. In-plane shear behaviors of 17 different commercially available denim fabrics with a variety of raw material and weave structure were measured by a custom design shear frame and conventional bias extension method in two corresponding diagonal directions. Tactile comfort of denim fabrics was determined via subjective customer evaluations as well. Aforesaid relations were statistically investigated and introduced as regression equations. The analyses regarding the relations between tactile comfort and shear behavior showed that there are considerably high correlation coefficients. The suggested regression equations were likewise found out to be statistically significant. Accordingly, it was concluded that the tactile comfort of denim fabrics can be estimated with a high precision, based on the results of in-plane shear behavior measurements.Keywords: denim fabrics, in-plane shear behavior, linear regression estimation, tactile comfort
Procedia PDF Downloads 3023598 Molecular Dynamics Simulation for Buckling Analysis at Nanocomposite Beams
Authors: Babak Safaei, A. M. Fattahi
Abstract:
In the present study we have investigated axial buckling characteristics of nanocomposite beams reinforced by single-walled carbon nanotubes (SWCNTs). Various types of beam theories including Euler-Bernoulli beam theory, Timoshenko beam theory and Reddy beam theory were used to analyze the buckling behavior of carbon nanotube-reinforced composite beams. Generalized differential quadrature (GDQ) method was utilized to discretize the governing differential equations along with four commonly used boundary conditions. The material properties of the nanocomposite beams were obtained using molecular dynamic (MD) simulation corresponding to both short-(10,10) SWCNT and long-(10,10) SWCNT composites which were embedded by amorphous polyethylene matrix. Then the results obtained directly from MD simulations were matched with those calculated by the mixture rule to extract appropriate values of carbon nanotube efficiency parameters accounting for the scale-dependent material properties. The selected numerical results were presented to indicate the influences of nanotube volume fractions and end supports on the critical axial buckling loads of nanocomposite beams relevant to long- and short-nanotube composites.Keywords: nanocomposites, molecular dynamics simulation, axial buckling, generalized differential quadrature (GDQ)
Procedia PDF Downloads 3253597 A Statistical Approach to Predict and Classify the Commercial Hatchability of Chickens Using Extrinsic Parameters of Breeders and Eggs
Authors: M. S. Wickramarachchi, L. S. Nawarathna, C. M. B. Dematawewa
Abstract:
Hatchery performance is critical for the profitability of poultry breeder operations. Some extrinsic parameters of eggs and breeders cause to increase or decrease the hatchability. This study aims to identify the affecting extrinsic parameters on the commercial hatchability of local chicken's eggs and determine the most efficient classification model with a hatchability rate greater than 90%. In this study, seven extrinsic parameters were considered: egg weight, moisture loss, breeders age, number of fertilised eggs, shell width, shell length, and shell thickness. Multiple linear regression was performed to determine the most influencing variable on hatchability. First, the correlation between each parameter and hatchability were checked. Then a multiple regression model was developed, and the accuracy of the fitted model was evaluated. Linear Discriminant Analysis (LDA), Classification and Regression Trees (CART), k-Nearest Neighbors (kNN), Support Vector Machines (SVM) with a linear kernel, and Random Forest (RF) algorithms were applied to classify the hatchability. This grouping process was conducted using binary classification techniques. Hatchability was negatively correlated with egg weight, breeders' age, shell width, shell length, and positive correlations were identified with moisture loss, number of fertilised eggs, and shell thickness. Multiple linear regression models were more accurate than single linear models regarding the highest coefficient of determination (R²) with 94% and minimum AIC and BIC values. According to the classification results, RF, CART, and kNN had performed the highest accuracy values 0.99, 0.975, and 0.972, respectively, for the commercial hatchery process. Therefore, the RF is the most appropriate machine learning algorithm for classifying the breeder outcomes, which are economically profitable or not, in a commercial hatchery.Keywords: classification models, egg weight, fertilised eggs, multiple linear regression
Procedia PDF Downloads 873596 Non-Methane Hydrocarbons Emission during the Photocopying Process
Authors: Kiurski S. Jelena, Aksentijević M. Snežana, Kecić S. Vesna, Oros B. Ivana
Abstract:
The prosperity of electronic equipment in photocopying environment not only has improved work efficiency, but also has changed indoor air quality. Considering the number of photocopying employed, indoor air quality might be worse than in general office environments. Determining the contribution from any type of equipment to indoor air pollution is a complex matter. Non-methane hydrocarbons are known to have an important role of air quality due to their high reactivity. The presence of hazardous pollutants in indoor air has been detected in one photocopying shop in Novi Sad, Serbia. Air samples were collected and analyzed for five days, during 8-hr working time in three-time intervals, whereas three different sampling points were determined. Using multiple linear regression model and software package STATISTICA 10 the concentrations of occupational hazards and micro-climates parameters were mutually correlated. Based on the obtained multiple coefficients of determination (0.3751, 0.2389, and 0.1975), a weak positive correlation between the observed variables was determined. Small values of parameter F indicated that there was no statistically significant difference between the concentration levels of non-methane hydrocarbons and micro-climates parameters. The results showed that variable could be presented by the general regression model: y = b0 + b1xi1+ b2xi2. Obtained regression equations allow to measure the quantitative agreement between the variation of variables and thus obtain more accurate knowledge of their mutual relations.Keywords: non-methane hydrocarbons, photocopying process, multiple regression analysis, indoor air quality, pollutant emission
Procedia PDF Downloads 3783595 Exploratory Study of Individual User Characteristics That Predict Attraction to Computer-Mediated Social Support Platforms and Mental Health Apps
Authors: Rachel Cherner
Abstract:
Introduction: The current study investigates several user characteristics that may predict the adoption of digital mental health supports. The extent to which individual characteristics predict preferences for functional elements of computer-mediated social support (CMSS) platforms and mental health (MH) apps is relatively unstudied. Aims: The present study seeks to illuminate the relationship between broad user characteristics and perceived attraction to CMSS platforms and MH apps. Methods: Participants (n=353) were recruited using convenience sampling methods (i.e., digital flyers, email distribution, and online survey forums). The sample was 68% male, and 32% female, with a mean age of 29. Participant racial and ethnic breakdown was 75% White, 7%, 5% Asian, and 5% Black or African American. Participants were asked to complete a 25-minute self-report questionnaire that included empirically validated measures assessing a battery of characteristics (i.e., subjective levels of anxiety/depression via PHQ-9 (Patient Health Questionnaire 9-item) and GAD-7 (Generalized Anxiety Disorder 7-item); attachment style via MAQ (Measure of Attachment Qualities); personality types via TIPI (The 10-Item Personality Inventory); growth mindset and mental health-seeking attitudes via GM (Growth Mindset Scale) and MHSAS (Mental Help Seeking Attitudes Scale)) and subsequent attitudes toward CMSS platforms and MH apps. Results: A stepwise linear regression was used to test if user characteristics significantly predicted attitudes towards key features of CMSS platforms and MH apps. The overall regression was statistically significant (R² =.20, F(1,344)=14.49, p<.000). Conclusion: This original study examines the clinical and sociocultural factors influencing decisions to use CMSS platforms and MH apps. Findings provide valuable insight for increasing adoption and engagement with digital mental health support. Fostering a growth mindset may be a method of increasing participant/patient engagement. In addition, CMSS platforms and MH apps may empower under-resourced and minority groups to gain basic access to mental health support. We do not assume this final model contains the best predictors of use; this is merely a preliminary step toward understanding the psychology and attitudes of CMSS platform/MH app users.Keywords: computer-mediated social support platforms, digital mental health, growth mindset, health-seeking attitudes, mental health apps, user characteristics
Procedia PDF Downloads 923594 Principal Component Regression in Amylose Content on the Malaysian Market Rice Grains Using Near Infrared Reflectance Spectroscopy
Authors: Syahira Ibrahim, Herlina Abdul Rahim
Abstract:
The amylose content is an essential element in determining the texture and taste of rice grains. This paper evaluates the use of VIS-SWNIRS in estimating the amylose content for seven varieties of rice grains available in the Malaysian market. Each type consists of 30 samples and all the samples are scanned using the spectroscopy to obtain a range of values between 680-1000nm. The Savitzky-Golay (SG) smoothing filter is applied to each sample’s data before the Principal Component Regression (PCR) technique is used to examine the data and produce a single value for each sample. This value is then compared with reference values obtained from the standard iodine colorimetric test in terms of its coefficient of determination, R2. Results show that this technique produced low R2 values of less than 0.50. In order to improve the result, the range should include a wavelength range of 1100-2500nm and the number of samples processed should also be increased.Keywords: amylose content, diffuse reflectance, Malaysia rice grain, principal component regression (PCR), Visible and Shortwave near-infrared spectroscopy (VIS-SWNIRS)
Procedia PDF Downloads 3823593 Identification of Switched Reluctance Motor Parameters Using Exponential Swept-Sine Signal
Authors: Abdelmalek Ouannou, Adil Brouri, Laila Kadi, Tarik
Abstract:
Switched reluctance motor (SRM) has a major interest in a large domain as in electric vehicle driving because of its wide range of speed operation, high performances, low cost, and robustness to run under degraded conditions. The purpose of the paper is to develop a new analytical approach for modeling SRM parameters. Then, an identification scheme is proposed to obtain the SRM parameters. Since the SRM is featured by a highly nonlinear behavior, modeling these devices is difficult. Then, it is convenient to develop an accurate model describing the SRM. Furthermore, it is always operated in the magnetically saturated mode to maximize the energy transfer. Accordingly, it is shown that the SRM can be accurately described by a generalized polynomial Hammerstein model, i.e., the parallel connection of several Hammerstein models having polynomial nonlinearity. Presently an analytical identification method is developed using a chirp excitation signal. Afterward, the parameters of the obtained model have been determined using Finite Element Method analysis. Finally, in order to show the effectiveness of the proposed method, a comparison between the true and estimate models has been performed. The obtained results show that the output responses are very close.Keywords: switched reluctance motor, swept-sine signal, generalized Hammerstein model, nonlinear system
Procedia PDF Downloads 2373592 Improved Regression Relations Between Different Magnitude Types and the Moment Magnitude in the Western Balkan Earthquake Catalogue
Authors: Anila Xhahysa, Migena Ceyhan, Neki Kuka, Klajdi Qoshi, Damiano Koxhaj
Abstract:
The seismic event catalog has been updated in the framework of a bilateral project supported by the Central European Investment Fund and with the extensive support of Global Earthquake Model Foundation to update Albania's national seismic hazard model. The earthquake catalogue prepared within this project covers the Western Balkan area limited by 38.0° - 48°N, 12.5° - 24.5°E and includes 41,806 earthquakes that occurred in the region between 510 BC and 2022. Since the moment magnitude characterizes the earthquake size accurately and the selected ground motion prediction equations for the seismic hazard assessment employ this scale, it was chosen as the uniform magnitude scale for the catalogue. Therefore, proxy values of moment magnitude had to be obtained by using new magnitude conversion equations between the local and other magnitude types to this unified scale. The Global Centroid Moment Tensor Catalogue was considered the most authoritative for moderate to large earthquakes for moment magnitude reports; hence it was used as a reference for calibrating other sources. The best fit was observed when compared to some regional agencies, whereas, with reports of moment magnitudes from Italy, Greece and Turkey, differences were observed in all magnitude ranges. For teleseismic magnitudes, to account for the non-linearity of the relationships, we used the exponential model for the derivation of the regression equations. The obtained regressions for the surface wave magnitude and short-period body-wave magnitude show considerable differences with Global Earthquake Model regression curves, especially for low magnitude ranges. Moreover, a conversion relation was obtained between the local magnitude of Albania and the corresponding moment magnitude as reported by the global and regional agencies. As errors were present in both variables, the Deming regression was used.Keywords: regression, seismic catalogue, local magnitude, tele-seismic magnitude, moment magnitude
Procedia PDF Downloads 70