Search results for: logistic regression model
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 18904

Search results for: logistic regression model

18454 SVM-Based Modeling of Mass Transfer Potential of Multiple Plunging Jets

Authors: Surinder Deswal, Mahesh Pal

Abstract:

The paper investigates the potential of support vector machines based regression approach to model the mass transfer capacity of multiple plunging jets, both vertical (θ = 90°) and inclined (θ = 60°). The data set used in this study consists of four input parameters with a total of eighty eight cases. For testing, tenfold cross validation was used. Correlation coefficient values of 0.971 and 0.981 (root mean square error values of 0.0025 and 0.0020) were achieved by using polynomial and radial basis kernel functions based support vector regression respectively. Results suggest an improved performance by radial basis function in comparison to polynomial kernel based support vector machines. The estimated overall mass transfer coefficient, by both the kernel functions, is in good agreement with actual experimental values (within a scatter of ±15 %); thereby suggesting the utility of support vector machines based regression approach.

Keywords: mass transfer, multiple plunging jets, support vector machines, ecological sciences

Procedia PDF Downloads 464
18453 Logistics Model for Improving Quality in Railway Transport

Authors: Eva Nedeliakova, Juraj Camaj, Jaroslav Masek

Abstract:

This contribution is focused on the methodology for identifying levels of quality and improving quality through new logistics model in railway transport. It is oriented on the application of dynamic quality models, which represent an innovative method of evaluation quality services. Through this conception, time factor, expected, and perceived quality in each moment of the transportation process within logistics chain can be taken into account. Various models describe the improvement of the quality which emphases the time factor throughout the whole transportation logistics chain. Quality of services in railway transport can be determined by the existing level of service quality, by detecting the causes of dissatisfaction employees but also customers, to uncover strengths and weaknesses. This new logistics model is able to recognize critical processes in logistic chain. It includes service quality rating that must respect its specific properties, which are unrepeatability, impalpability, their use right at the time they are provided and particularly changeability, which is significant factor in the conditions of rail transport as well. These peculiarities influence the quality of service regarding the constantly increasing requirements and that result in new ways of finding progressive attitudes towards the service quality rating.

Keywords: logistics model, quality, railway transport

Procedia PDF Downloads 568
18452 Normalizing Flow to Augmented Posterior: Conditional Density Estimation with Interpretable Dimension Reduction for High Dimensional Data

Authors: Cheng Zeng, George Michailidis, Hitoshi Iyatomi, Leo L. Duan

Abstract:

The conditional density characterizes the distribution of a response variable y given other predictor x and plays a key role in many statistical tasks, including classification and outlier detection. Although there has been abundant work on the problem of Conditional Density Estimation (CDE) for a low-dimensional response in the presence of a high-dimensional predictor, little work has been done for a high-dimensional response such as images. The promising performance of normalizing flow (NF) neural networks in unconditional density estimation acts as a motivating starting point. In this work, the authors extend NF neural networks when external x is present. Specifically, they use the NF to parameterize a one-to-one transform between a high-dimensional y and a latent z that comprises two components [zₚ, zₙ]. The zₚ component is a low-dimensional subvector obtained from the posterior distribution of an elementary predictive model for x, such as logistic/linear regression. The zₙ component is a high-dimensional independent Gaussian vector, which explains the variations in y not or less related to x. Unlike existing CDE methods, the proposed approach coined Augmented Posterior CDE (AP-CDE) only requires a simple modification of the common normalizing flow framework while significantly improving the interpretation of the latent component since zₚ represents a supervised dimension reduction. In image analytics applications, AP-CDE shows good separation of 𝑥-related variations due to factors such as lighting condition and subject id from the other random variations. Further, the experiments show that an unconditional NF neural network based on an unsupervised model of z, such as a Gaussian mixture, fails to generate interpretable results.

Keywords: conditional density estimation, image generation, normalizing flow, supervised dimension reduction

Procedia PDF Downloads 96
18451 Analysis of the Savings Behaviour of Rice Farmers in Tiaong, Quezon, Philippines

Authors: Angelika Kris D. Dalangin, Cesar B. Quicoy

Abstract:

Rice farming is a major source of livelihood and employment in the Philippines, but it requires a substantial amount of capital. Capital may come from income (farm, non-farm, and off-farm), savings and credit. However, rice farmers suffer from lack of capital due to high costs of inputs and low productivity. Capital insufficiency, coupled with low productivity, hindered them to meet their basic household and production needs. Hence, they resorted to borrowing money, mostly from informal lenders who charge very high interest rates. As another source of capital, savings can help rice farmers meet their basic needs for both the household and the farm. However, information is inadequate whether the farmers save or not, as well as, why they do not depend on savings to augment their lack of capital. Thus, it is worth analyzing how rice farmers saved. The study revealed, using the actual savings which is the difference between the household income and expenditure, that about three-fourths (72%) of the total number of farmers interviewed are savers. However, when they were asked whether they are savers or not, more than half of them considered themselves as non-savers. This gap shows that there are many farmers who think that they do not have savings at all; hence they continue to borrow money and do not depend on savings to augment their lack of capital. The study also identified the forms of savings, saving motives, and savings utilization among rice farmers. Results revealed that, for the past 12 months, most of the farmers saved cash at home for liquidity purposes while others deposited cash in banks and/or saved their money in the form of livestock. Among the most important reasons of farmers for saving are for daily household expenses, for building a house, for emergency purposes, for retirement, and for their next production. Furthermore, the study assessed the factors affecting the rice farmers’ savings behaviour using logistic regression. Results showed that the factors found to be significant were presence of non-farm income, per capita net farm income, and per capita household expense. The presence of non-farm income and per capita net farm income positively affects the farmers’ savings behaviour. On the other hand, per capita household expenses have negative effect. The effect, however, of per capita net farm income and household expenses is very negligible because of the very small chance that the farmer is a saver. Generally, income and expenditure were proved to be significant factors that affect the savings behaviour of the rice farmers. However, most farmers could not save regularly due to low farm income and high household and farm expenditures. Thus, it is highly recommended that government should develop programs or implement policies that will create more jobs for the farmers and their family members. In addition, programs and policies should be implemented to increase farm productivity and income.

Keywords: agricultural economics, agricultural finance, binary logistic regression, logit, Philippines, Quezon, rice farmers, savings, savings behaviour

Procedia PDF Downloads 228
18450 On Improving Breast Cancer Prediction Using GRNN-CP

Authors: Kefaya Qaddoum

Abstract:

The aim of this study is to predict breast cancer and to construct a supportive model that will stimulate a more reliable prediction as a factor that is fundamental for public health. In this study, we utilize general regression neural networks (GRNN) to replace the normal predictions with prediction periods to achieve a reasonable percentage of confidence. The mechanism employed here utilises a machine learning system called conformal prediction (CP), in order to assign consistent confidence measures to predictions, which are combined with GRNN. We apply the resulting algorithm to the problem of breast cancer diagnosis. The results show that the prediction constructed by this method is reasonable and could be useful in practice.

Keywords: neural network, conformal prediction, cancer classification, regression

Procedia PDF Downloads 291
18449 Development of Computational Approach for Calculation of Hydrogen Solubility in Hydrocarbons for Treatment of Petroleum

Authors: Abdulrahman Sumayli, Saad M. AlShahrani

Abstract:

For the hydrogenation process, knowing the solubility of hydrogen (H2) in hydrocarbons is critical to improve the efficiency of the process. We investigated the H2 solubility computation in four heavy crude oil feedstocks using machine learning techniques. Temperature, pressure, and feedstock type were considered as the inputs to the models, while the hydrogen solubility was the sole response. Specifically, we employed three different models: Support Vector Regression (SVR), Gaussian process regression (GPR), and Bayesian ridge regression (BRR). To achieve the best performance, the hyper-parameters of these models are optimized using the whale optimization algorithm (WOA). We evaluated the models using a dataset of solubility measurements in various feedstocks, and we compared their performance based on several metrics. Our results show that the WOA-SVR model tuned with WOA achieves the best performance overall, with an RMSE of 1.38 × 10− 2 and an R-squared of 0.991. These findings suggest that machine learning techniques can provide accurate predictions of hydrogen solubility in different feedstocks, which could be useful in the development of hydrogen-related technologies. Besides, the solubility of hydrogen in the four heavy oil fractions is estimated in different ranges of temperatures and pressures of 150 ◦C–350 ◦C and 1.2 MPa–10.8 MPa, respectively

Keywords: temperature, pressure variations, machine learning, oil treatment

Procedia PDF Downloads 69
18448 Predictive Factors of Nasal Continuous Positive Airway Pressure (NCPAP) Therapy Success in Preterm Neonates with Hyaline Membrane Disease (HMD)

Authors: Novutry Siregar, Afdal, Emilzon Taslim

Abstract:

Hyaline Membrane Disease (HMD) is the main cause of respiratory failure in preterm neonates caused by surfactant deficiency. Nasal Continuous Positive Airway Pressure (NCPAP) is the therapy for HMD. The success of therapy is determined by gestational age, birth weight, HMD grade, time of NCAP administration, and time of breathing frequency recovery. The aim of this research is to identify the predictive factor of NCPAP therapy success in preterm neonates with HMD. This study used a cross-sectional design by using medical records of patients who were treated in the Perinatology of the Pediatric Department of Dr. M. Djamil Padang Central Hospital from January 2015 to December 2017. The samples were eighty-two neonates that were selected by using the total sampling technique. Data analysis was done by using the Chi-Square Test and the Multiple Logistic Regression Prediction Model. The results showed the success rate of NCPAP therapy reached 53.7%. Birth weight (p = 0.048, OR = 3.34 95% CI 1.01-11.07), HMD grade I (p = 0.018, OR = 4.95 CI 95% 1.31-18.68), HMD grade II (p = 0.044, OR = 5.52 95% CI 1.04-29.15), and time of breathing frequency recovery (p = 0,000, OR = 13.50 95% CI 3.58-50, 83) are the predictive factors of NCPAP therapy success in preterm neonates with HMD. The most significant predictive factor is the time of breathing frequency recovery.

Keywords: predictive factors, the success of therapy, NCPAP, preterm neonates, HMD

Procedia PDF Downloads 59
18447 Machine Vision System for Measuring the Quality of Bulk Sun-dried Organic Raisins

Authors: Navab Karimi, Tohid Alizadeh

Abstract:

An intelligent vision-based system was designed to measure the quality and purity of raisins. A machine vision setup was utilized to capture the images of bulk raisins in ranges of 5-50% mixed pure-impure berries. The textural features of bulk raisins were extracted using Grey-level Histograms, Co-occurrence Matrix, and Local Binary Pattern (a total of 108 features). Genetic Algorithm and neural network regression were used for selecting and ranking the best features (21 features). As a result, the GLCM features set was found to have the highest accuracy (92.4%) among the other sets. Followingly, multiple feature combinations of the previous stage were fed into the second regression (linear regression) to increase accuracy, wherein a combination of 16 features was found to be the optimum. Finally, a Support Vector Machine (SVM) classifier was used to differentiate the mixtures, producing the best efficiency and accuracy of 96.2% and 97.35%, respectively.

Keywords: sun-dried organic raisin, genetic algorithm, feature extraction, ann regression, linear regression, support vector machine, south azerbaijan.

Procedia PDF Downloads 73
18446 Bioeconomic Modeling for the Sustainable Exploitation of Three Key Marine Species in Morocco

Authors: I .Ait El Harch, K. Outaaoui, Y. El Foutayeni

Abstract:

This study aims to deepen the understanding and optimize fishing activity in Morocco by holistically integrating biological and economic aspects. We develop a biological equilibrium model in which these competing species present their natural growth by logistic equations, taking into account density and competition between them. The integration of human intervention adds a realistic dimension to our model. A company specifically targets the three species, thus influencing population dynamics according to their fishing activities. The aim of this work is to determine the fishing effort that maximizes the company’s profit, taking into account the constraints associated with conserving ecosystem equilibrium.

Keywords: bioeconomical modeling, optimization techniques, linear complementarity problem LCP, biological equilibrium, maximizing profits

Procedia PDF Downloads 25
18445 The Communication of Audit Report: Key Audit Matters in United Kingdom

Authors: L. Sierra, N. Gambetta, M. A. Garcia-Benau, M. Orta

Abstract:

Financial scandals and financial crisis have led to an international debate on the value of auditing. In recent years there have been significant legislative reforms aiming to increase markets’ confidence in audit services. In particular, there has been a significant debate on the need to improve the communication of auditors with audit reports users as a way to improve its informative value and thus, to improve audit quality. The International Auditing and Assurance Standards Board (IAASB) has proposed changes to the audit report standards. The International Standard on Auditing 701, Communicating Key Audit Matters (KAM) in the Independent Auditor's Report, has introduced new concepts that go beyond the auditor's opinion and requires to disclose the risks that, from the auditor's point of view, are more significant in the audited company information. Focusing on the companies included in the Financial Times Stock Exchange 100 index, this study aims to focus on the analysis of the determinants of the number of KAM disclosed by the auditor in the audit report and moreover, the analysis of the determinants of the different type of KAM reported during the period 2013-2015. To test the hypotheses in the empirical research, two different models have been used. The first one is a linear regression model to identify the client’s characteristics, industry sector and auditor’s characteristics that are related to the number of KAM disclosed in the audit report. Secondly, a logistic regression model is used to identify the determinants of the number of each KAM type disclosed in the audit report; in line with the risk-based approach to auditing financial statements, we categorized the KAM in 2 groups: Entity-level KAM and Accounting-level KAM. Regarding the auditor’s characteristics impact on the KAM disclosure, the results show that PwC tends to report a larger number of KAM while KPMG tends to report less KAM in the audit report. Further, PwC reports a larger number of entity-level risk KAM while KPMG reports less account-level risk KAM. The results also show that companies paying higher fees tend to have more entity-level risk KAM and less account-level risk KAM. The materiality level is positively related to the number of account-level risk KAM. Additionally, these study results show that the relationship between client’s characteristics and number of KAM is more evident in account-level risk KAM than in entity-level risk KAM. A highly leveraged company carries a great deal of risk, but due to this, they are usually subject to strong capital providers monitoring resulting in less account-level risk KAM. The results reveal that the number of account-level risk KAM is strongly related to the industry sector in which the company operates assets. This study helps to understand the UK audit market, provides information to auditors and finally, it opens new research avenues in the academia.

Keywords: FTSE 100, IAS 701, key audit matters, auditor’s characteristics, client’s characteristics

Procedia PDF Downloads 231
18444 Association of Post-Traumatic Stress Disorder with Work Performance amongst Emergency Medical Service Personnel, Karachi, Pakistan

Authors: Salima Kerai, Muhammad Islam, Uzma Khan, Nargis Asad, Junaid Razzak, Omrana Pasha

Abstract:

Background: Pre-hospital care providers are exposed to various kinds of stressors. Their daily exposure to diverse critical and traumatic incidents can lead to stress reactions like Post-Traumatic Stress Disorder (PTSD). Consequences of PTSD in terms of work loss can be catastrophic because of its compound effect on families, which affect them economically, socially and emotionally. Therefore, it is critical to assess the association between PTSD and Work performance in Emergency Medical Service (EMS) if exist any. Methods: This prospective observational study was carried out at AMAN EMS in Karachi, Pakistan. EMS personnel were screened for potential PTSD using impact of event scale-revised (IES-R). Work performance was assessed on basis of five variables; number of late arrivals to work, number of days absent, number of days sick, adherence to protocol and patient satisfaction survey over the period of 3 months. In order to model outcomes like number of late arrivals to work, days absent and days late; negative binomial regression was used whereas logistic regression was applied for adherence to protocol and linear for patient satisfaction scores. Results: Out of 536 EMS personnel, 525 were found to be eligible, of them 518 consented. However data on 507 were included because 7 left the job during study period. The mean score of PTSD was found to be 24.0 ± 12.2. However, weak and insignificant association was found between PTSD and work performance measures: number of late arrivals (RRadj 0.99; 95% CI 0.98-1.00), days absent (RRadj 0.98; 95% CI 0.96-0.99), days sick (Rradj 0.99; 95% CI 0.98 to 1.00), adherence to protocol (ORadj 1.01: 95% CI 0.99 to 1.04) and patient satisfaction (0.001% score; 95% CI -0.03% to 0.03%). Conclusion: No association was found between PTSD and Work performance in the selected EMS population in Karachi Pakistan. Further studies are needed to explore the phenomenon of resiliency in these populations. Moreover, qualitative work is required to explore perceptions and feelings like willingness to go to work, readiness to carry out job responsibilities.

Keywords: trauma, emergency medical service, stress, pakistan

Procedia PDF Downloads 337
18443 Comprehensive Machine Learning-Based Glucose Sensing from Near-Infrared Spectra

Authors: Bitewulign Mekonnen

Abstract:

Context: This scientific paper focuses on the use of near-infrared (NIR) spectroscopy to determine glucose concentration in aqueous solutions accurately and rapidly. The study compares six different machine learning methods for predicting glucose concentration and also explores the development of a deep learning model for classifying NIR spectra. The objective is to optimize the detection model and improve the accuracy of glucose prediction. This research is important because it provides a comprehensive analysis of various machine-learning techniques for estimating aqueous glucose concentrations. Research Aim: The aim of this study is to compare and evaluate different machine-learning methods for predicting glucose concentration from NIR spectra. Additionally, the study aims to develop and assess a deep-learning model for classifying NIR spectra. Methodology: The research methodology involves the use of machine learning and deep learning techniques. Six machine learning regression models, including support vector machine regression, partial least squares regression, extra tree regression, random forest regression, extreme gradient boosting, and principal component analysis-neural network, are employed to predict glucose concentration. The NIR spectra data is randomly divided into train and test sets, and the process is repeated ten times to increase generalization ability. In addition, a convolutional neural network is developed for classifying NIR spectra. Findings: The study reveals that the SVMR, ETR, and PCA-NN models exhibit excellent performance in predicting glucose concentration, with correlation coefficients (R) > 0.99 and determination coefficients (R²)> 0.985. The deep learning model achieves high macro-averaging scores for precision, recall, and F1-measure. These findings demonstrate the effectiveness of machine learning and deep learning methods in optimizing the detection model and improving glucose prediction accuracy. Theoretical Importance: This research contributes to the field by providing a comprehensive analysis of various machine-learning techniques for estimating glucose concentrations from NIR spectra. It also explores the use of deep learning for the classification of indistinguishable NIR spectra. The findings highlight the potential of machine learning and deep learning in enhancing the prediction accuracy of glucose-relevant features. Data Collection and Analysis Procedures: The NIR spectra and corresponding references for glucose concentration are measured in increments of 20 mg/dl. The data is randomly divided into train and test sets, and the models are evaluated using regression analysis and classification metrics. The performance of each model is assessed based on correlation coefficients, determination coefficients, precision, recall, and F1-measure. Question Addressed: The study addresses the question of whether machine learning and deep learning methods can optimize the detection model and improve the accuracy of glucose prediction from NIR spectra. Conclusion: The research demonstrates that machine learning and deep learning methods can effectively predict glucose concentration from NIR spectra. The SVMR, ETR, and PCA-NN models exhibit superior performance, while the deep learning model achieves high classification scores. These findings suggest that machine learning and deep learning techniques can be used to improve the prediction accuracy of glucose-relevant features. Further research is needed to explore their clinical utility in analyzing complex matrices, such as blood glucose levels.

Keywords: machine learning, signal processing, near-infrared spectroscopy, support vector machine, neural network

Procedia PDF Downloads 94
18442 Comparison of Applicability of Time Series Forecasting Models VAR, ARCH and ARMA in Management Science: Study Based on Empirical Analysis of Time Series Techniques

Authors: Muhammad Tariq, Hammad Tahir, Fawwad Mahmood Butt

Abstract:

Purpose: This study attempts to examine the best forecasting methodologies in the time series. The time series forecasting models such as VAR, ARCH and the ARMA are considered for the analysis. Methodology: The Bench Marks or the parameters such as Adjusted R square, F-stats, Durban Watson, and Direction of the roots have been critically and empirically analyzed. The empirical analysis consists of time series data of Consumer Price Index and Closing Stock Price. Findings: The results show that the VAR model performed better in comparison to other models. Both the reliability and significance of VAR model is highly appreciable. In contrary to it, the ARCH model showed very poor results for forecasting. However, the results of ARMA model appeared double standards i.e. the AR roots showed that model is stationary and that of MA roots showed that the model is invertible. Therefore, the forecasting would remain doubtful if it made on the bases of ARMA model. It has been concluded that VAR model provides best forecasting results. Practical Implications: This paper provides empirical evidences for the application of time series forecasting model. This paper therefore provides the base for the application of best time series forecasting model.

Keywords: forecasting, time series, auto regression, ARCH, ARMA

Procedia PDF Downloads 348
18441 Maternal Death Review and Contextualization of Maternal Death in West Bengal

Authors: M. Illias Kanchan

Abstract:

The death of a woman during pregnancy and childbirth is not only a health issue, but also a matter of social injustice. This study makes an attempt to explore the association between maternal death and associated factors in West Bengal using the approaches of facility-based and community-based maternal death review. Bivariate and binary logistic regression analysis have been performed to understand the causes and circumstances of maternal deaths in West Bengal. Delay in seeking care was the major contributor in maternal deaths, near about one-third women died due to this factor. The most common cause of maternal death is found to be hypertensive disorders of pregnancy or eclampsia. We believe that these deaths can be averted by reducing hypertensive disorders of pregnancy or eclampsia.

Keywords: maternal death, facility-based, community-based, review, west Bengal, eclampsia

Procedia PDF Downloads 433
18440 Count Regression Modelling on Number of Migrants in Households

Authors: Tsedeke Lambore Gemecho, Ayele Taye Goshu

Abstract:

The main objective of this study is to identify the determinants of the number of international migrants in a household and to compare regression models for count response. This study is done by collecting data from total of 2288 household heads of 16 randomly sampled districts in Hadiya and Kembata-Tembaro zones of Southern Ethiopia. The Poisson mixed models, as special cases of the generalized linear mixed model, is explored to determine effects of the predictors: age of household head, farm land size, and household size. Two ethnicities Hadiya and Kembata are included in the final model as dummy variables. Stepwise variable selection has indentified four predictors: age of head, farm land size, family size and dummy variable ethnic2 (0=other, 1=Kembata). These predictors are significant at 5% significance level with count response number of migrant. The Poisson mixed model consisting of the four predictors with random effects districts. Area specific random effects are significant with the variance of about 0.5105 and standard deviation of 0.7145. The results show that the number of migrant increases with heads age, family size, and farm land size. In conclusion, there is a significantly high number of international migration per household in the area. Age of household head, family size, and farm land size are determinants that increase the number of international migrant in households. Community-based intervention is needed so as to monitor and regulate the international migration for the benefits of the society.

Keywords: Poisson regression, GLM, number of migrant, Hadiya and Kembata Tembaro zones

Procedia PDF Downloads 283
18439 Predictors of School Drop out among High School Students

Authors: Osman Zorbaz, Selen Demirtas-Zorbaz, Ozlem Ulas

Abstract:

The factors that cause adolescents to drop out school were several. One of the frameworks about school dropout focuses on the contextual factors around the adolescents whereas the other one focuses on individual factors. It can be said that both factors are important equally. In this study, both adolescent’s individual factors (anti-social behaviors, academic success) and contextual factors (parent academic involvement, parent academic support, number of siblings, living with parent) were examined in the term of school dropout. The study sample consisted of 346 high school students in the public schools in Ankara who continued their education in 2015-2016 academic year. One hundred eighty-five the students (53.5%) were girls and 161 (46.5%) were boys. In addition to this 118 of them were in ninth grade, 122 of them in tenth grade and 106 of them were in eleventh grade. Multiple regression and one-way ANOVA statistical methods were used. First, it was examined if the data meet the assumptions and conditions that are required for regression analysis. After controlling the assumptions, regression analysis was conducted. Parent academic involvement, parent academic support, number of siblings, anti-social behaviors, academic success variables were taken into the regression model and it was seen that parent academic involvement (t=-3.023, p < .01), anti-social behaviors (t=7.038, p < .001), and academic success (t=-3.718, p < .001) predicted school dropout whereas parent academic support (t=-1.403, p > .05) and number of siblings (t=-1.908, p > .05) didn’t. The model explained 30% of the variance (R=.557, R2=.300, F5,345=30.626, p < .001). In addition to this the variance, results showed there was no significant difference on high school students school dropout levels according to living with parents or not (F2;345=1.183, p > .05). Results discussed in the light of the literature and suggestion were made. As a result, academic involvement, academic success and anti-social behaviors will be considered as an important factors for preventing school drop-out.

Keywords: adolescents, anti-social behavior, parent academic involvement, parent academic support, school dropout

Procedia PDF Downloads 284
18438 Identifying Protein-Coding and Non-Coding Regions in Transcriptomes

Authors: Angela U. Makolo

Abstract:

Protein-coding and Non-coding regions determine the biology of a sequenced transcriptome. Research advances have shown that Non-coding regions are important in disease progression and clinical diagnosis. Existing bioinformatics tools have been targeted towards Protein-coding regions alone. Therefore, there are challenges associated with gaining biological insights from transcriptome sequence data. These tools are also limited to computationally intensive sequence alignment, which is inadequate and less accurate to identify both Protein-coding and Non-coding regions. Alignment-free techniques can overcome the limitation of identifying both regions. Therefore, this study was designed to develop an efficient sequence alignment-free model for identifying both Protein-coding and Non-coding regions in sequenced transcriptomes. Feature grouping and randomization procedures were applied to the input transcriptomes (37,503 data points). Successive iterations were carried out to compute the gradient vector that converged the developed Protein-coding and Non-coding Region Identifier (PNRI) model to the approximate coefficient vector. The logistic regression algorithm was used with a sigmoid activation function. A parameter vector was estimated for every sample in 37,503 data points in a bid to reduce the generalization error and cost. Maximum Likelihood Estimation (MLE) was used for parameter estimation by taking the log-likelihood of six features and combining them into a summation function. Dynamic thresholding was used to classify the Protein-coding and Non-coding regions, and the Receiver Operating Characteristic (ROC) curve was determined. The generalization performance of PNRI was determined in terms of F1 score, accuracy, sensitivity, and specificity. The average generalization performance of PNRI was determined using a benchmark of multi-species organisms. The generalization error for identifying Protein-coding and Non-coding regions decreased from 0.514 to 0.508 and to 0.378, respectively, after three iterations. The cost (difference between the predicted and the actual outcome) also decreased from 1.446 to 0.842 and to 0.718, respectively, for the first, second and third iterations. The iterations terminated at the 390th epoch, having an error of 0.036 and a cost of 0.316. The computed elements of the parameter vector that maximized the objective function were 0.043, 0.519, 0.715, 0.878, 1.157, and 2.575. The PNRI gave an ROC of 0.97, indicating an improved predictive ability. The PNRI identified both Protein-coding and Non-coding regions with an F1 score of 0.970, accuracy (0.969), sensitivity (0.966), and specificity of 0.973. Using 13 non-human multi-species model organisms, the average generalization performance of the traditional method was 74.4%, while that of the developed model was 85.2%, thereby making the developed model better in the identification of Protein-coding and Non-coding regions in transcriptomes. The developed Protein-coding and Non-coding region identifier model efficiently identified the Protein-coding and Non-coding transcriptomic regions. It could be used in genome annotation and in the analysis of transcriptomes.

Keywords: sequence alignment-free model, dynamic thresholding classification, input randomization, genome annotation

Procedia PDF Downloads 68
18437 Prediction of Marijuana Use among Iranian Early Youth: an Application of Integrative Model of Behavioral Prediction

Authors: Mehdi Mirzaei Alavijeh, Farzad Jalilian

Abstract:

Background: Marijuana is the most widely used illicit drug worldwide, especially among adolescents and young adults, which can cause numerous complications. The aim of this study was to determine the pattern, motivation use, and factors related to marijuana use among Iranian youths based on the integrative model of behavioral prediction Methods: A cross-sectional study was conducted among 174 youths marijuana user in Kermanshah County and Isfahan County, during summer 2014 which was selected with the convenience sampling for participation in this study. A self-reporting questionnaire was applied for collecting data. Data were analyzed by SPSS version 21 using bivariate correlations and linear regression statistical tests. Results: The mean marijuana use of respondents was 4.60 times at during week [95% CI: 4.06, 5.15]. Linear regression statistical showed, the structures of integrative model of behavioral prediction accounted for 36% of the variation in the outcome measure of the marijuana use at during week (R2 = 36% & P < 0.001); and among them attitude, marijuana refuse, and subjective norms were a stronger predictors. Conclusion: Comprehensive health education and prevention programs need to emphasize on cognitive factors that predict youth’s health-related behaviors. Based on our findings it seems, designing educational and behavioral intervention for reducing positive belief about marijuana, marijuana self-efficacy refuse promotion and reduce subjective norms encourage marijuana use has an effective potential to protect youths marijuana use.

Keywords: marijuana, youth, integrative model of behavioral prediction, Iran

Procedia PDF Downloads 554
18436 Machine Learning Approach for Predicting Students’ Academic Performance and Study Strategies Based on Their Motivation

Authors: Fidelia A. Orji, Julita Vassileva

Abstract:

This research aims to develop machine learning models for students' academic performance and study strategy prediction, which could be generalized to all courses in higher education. Key learning attributes (intrinsic, extrinsic, autonomy, relatedness, competence, and self-esteem) used in building the models are chosen based on prior studies, which revealed that the attributes are essential in students’ learning process. Previous studies revealed the individual effects of each of these attributes on students’ learning progress. However, few studies have investigated the combined effect of the attributes in predicting student study strategy and academic performance to reduce the dropout rate. To bridge this gap, we used Scikit-learn in python to build five machine learning models (Decision Tree, K-Nearest Neighbour, Random Forest, Linear/Logistic Regression, and Support Vector Machine) for both regression and classification tasks to perform our analysis. The models were trained, evaluated, and tested for accuracy using 924 university dentistry students' data collected by Chilean authors through quantitative research design. A comparative analysis of the models revealed that the tree-based models such as the random forest (with prediction accuracy of 94.9%) and decision tree show the best results compared to the linear, support vector, and k-nearest neighbours. The models built in this research can be used in predicting student performance and study strategy so that appropriate interventions could be implemented to improve student learning progress. Thus, incorporating strategies that could improve diverse student learning attributes in the design of online educational systems may increase the likelihood of students continuing with their learning tasks as required. Moreover, the results show that the attributes could be modelled together and used to adapt/personalize the learning process.

Keywords: classification models, learning strategy, predictive modeling, regression models, student academic performance, student motivation, supervised machine learning

Procedia PDF Downloads 128
18435 A Study on Characteristics of Hedonic Price Models in Korea Based on Meta-Regression Analysis

Authors: Minseo Jo

Abstract:

The purpose of this paper is to examine the factors in the hedonic price models, that has significance impact in determining the price of apartments. There are many variables employed in the hedonic price models and their effectiveness vary differently according to the researchers and the regions they are analysing. In order to consider various conditions, the meta-regression analysis has been selected for the study. In this paper, four meta-independent variables, from the 65 hedonic price models to analysis. The factors that influence the prices of apartments, as well as including factors that influence the prices of apartments, regions, which are divided into two of the research performed, years of research performed, the coefficients of the functions employed. The covariance between the four meta-variables and p-value of the coefficients and the four meta-variables and number of data used in the 65 hedonic price models have been analyzed in this study. The six factors that are most important in deciding the prices of apartments are positioning of apartments, the noise of the apartments, points of the compass and views from the apartments, proximity to the public transportations, companies that have constructed the apartments, social environments (such as schools etc.).

Keywords: hedonic price model, housing price, meta-regression analysis, characteristics

Procedia PDF Downloads 402
18434 Forecast of the Small Wind Turbines Sales with Replacement Purchases and with or without Account of Price Changes

Authors: V. Churkin, M. Lopatin

Abstract:

The purpose of the paper is to estimate the US small wind turbines market potential and forecast the small wind turbines sales in the US. The forecasting method is based on the application of the Bass model and the generalized Bass model of innovations diffusion under replacement purchases. In the work an exponential distribution is used for modeling of replacement purchases. Only one parameter of such distribution is determined by average lifetime of small wind turbines. The identification of the model parameters is based on nonlinear regression analysis on the basis of the annual sales statistics which has been published by the American Wind Energy Association (AWEA) since 2001 up to 2012. The estimation of the US average market potential of small wind turbines (for adoption purchases) without account of price changes is 57080 (confidence interval from 49294 to 64866 at P = 0.95) under average lifetime of wind turbines 15 years, and 62402 (confidence interval from 54154 to 70648 at P = 0.95) under average lifetime of wind turbines 20 years. In the first case the explained variance is 90,7%, while in the second - 91,8%. The effect of the wind turbines price changes on their sales was estimated using generalized Bass model. This required a price forecast. To do this, the polynomial regression function, which is based on the Berkeley Lab statistics, was used. The estimation of the US average market potential of small wind turbines (for adoption purchases) in that case is 42542 (confidence interval from 32863 to 52221 at P = 0.95) under average lifetime of wind turbines 15 years, and 47426 (confidence interval from 36092 to 58760 at P = 0.95) under average lifetime of wind turbines 20 years. In the first case the explained variance is 95,3%, while in the second –95,3%.

Keywords: bass model, generalized bass model, replacement purchases, sales forecasting of innovations, statistics of sales of small wind turbines in the United States

Procedia PDF Downloads 348
18433 Application of the Least Squares Method in the Adjustment of Chlorodifluoromethane (HCFC-142b) Regression Models

Authors: L. J. de Bessa Neto, V. S. Filho, J. V. Ferreira Nunes, G. C. Bergamo

Abstract:

There are many situations in which human activities have significant effects on the environment. Damage to the ozone layer is one of them. The objective of this work is to use the Least Squares Method, considering the linear, exponential, logarithmic, power and polynomial models of the second degree, to analyze through the coefficient of determination (R²), which model best fits the behavior of the chlorodifluoromethane (HCFC-142b) in parts per trillion between 1992 and 2018, as well as estimates of future concentrations between 5 and 10 periods, i.e. the concentration of this pollutant in the years 2023 and 2028 in each of the adjustments. A total of 809 observations of the concentration of HCFC-142b in one of the monitoring stations of gases precursors of the deterioration of the ozone layer during the period of time studied were selected and, using these data, the statistical software Excel was used for make the scatter plots of each of the adjustment models. With the development of the present study, it was observed that the logarithmic fit was the model that best fit the data set, since besides having a significant R² its adjusted curve was compatible with the natural trend curve of the phenomenon.

Keywords: chlorodifluoromethane (HCFC-142b), ozone, least squares method, regression models

Procedia PDF Downloads 124
18432 Support Vector Regression with Weighted Least Absolute Deviations

Authors: Kang-Mo Jung

Abstract:

Least squares support vector machine (LS-SVM) is a penalized regression which considers both fitting and generalization ability of a model. However, the squared loss function is very sensitive to even single outlier. We proposed a weighted absolute deviation loss function for the robustness of the estimates in least absolute deviation support vector machine. The proposed estimates can be obtained by a quadratic programming algorithm. Numerical experiments on simulated datasets show that the proposed algorithm is competitive in view of robustness to outliers.

Keywords: least absolute deviation, quadratic programming, robustness, support vector machine, weight

Procedia PDF Downloads 527
18431 Switched System Diagnosis Based on Intelligent State Filtering with Unknown Models

Authors: Nada Slimane, Foued Theljani, Faouzi Bouani

Abstract:

The paper addresses the problem of fault diagnosis for systems operating in several modes (normal or faulty) based on states assessment. We use, for this purpose, a methodology consisting of three main processes: 1) sequential data clustering, 2) linear model regression and 3) state filtering. Typically, Kalman Filter (KF) is an algorithm that provides estimation of unknown states using a sequence of I/O measurements. Inevitably, although it is an efficient technique for state estimation, it presents two main weaknesses. First, it merely predicts states without being able to isolate/classify them according to their different operating modes, whether normal or faulty modes. To deal with this dilemma, the KF is endowed with an extra clustering step based fully on sequential version of the k-means algorithm. Second, to provide state estimation, KF requires state space models, which can be unknown. A linear regularized regression is used to identify the required models. To prove its effectiveness, the proposed approach is assessed on a simulated benchmark.

Keywords: clustering, diagnosis, Kalman Filtering, k-means, regularized regression

Procedia PDF Downloads 182
18430 Association between Occupational Characteristics and Well-Being: An Exploratory Study of Married Working Women in New Delhi, India

Authors: Kanchan Negi

Abstract:

Background: Modern and urban occupational culture have driven demands for people to work long hours and weekends and take work to home at times. Research on the health effects of these exhaustive temporal work patterns is scant or contradictory. This study examines the relationship between work patterns and wellbeing in a sample of women living in the metropolitan hub of Delhi. Method: This study is based on the data collected from 360 currently married women between age 29 and 49 years, working in the urban capital hub of India, i.e., Delhi. The women interviewed were professionals from the education, health, banking and information and technology (IT) sector. Bivariate analysis was done to study the characteristics of the sample. Logistic regression analysis was used to estimate the physical and psychological wellbeing across occupational characteristics. Results: Most of the working women were below age 35 years; around 30% of women worked in the education sector, 23% in health, 21% in banking and 26% in the IT sector. Over 55% of women were employed in the private sector and only 36% were permanent employees. Nearly 30% of women worked for more than the standard 8 hours a day. The findings from logistic regression showed that compared to women working in the education sector, those who worked in the banking and IT sector more likely to have physical and psychological health issues (OR 2.07-4.37, CI 1.17-4.37); women who bear dual burden of responsibilities had higher odds of physical and psychological health issues than women who did not (OR 1.19-1.85 CI 0.96-2.92). Women who worked for more than 8 hours a day (OR 1.15, CI 1.01-1.30) and those who worked for more than five days a week (OR 1.25, CI 1.05-1.35) were more likely to have physical health issues than women who worked for 6-8 hours a day and five days e week, respectively. Also, not having flexible work timings and compensatory holidays increased the odds of having physical and psychological health issues among working women (OR 1.17-1.29, CI 1.01-1.47). Women who worked in the private sector, those employed temporarily and who worked in the non-conducive environments were more likely to have psychological health issues as compared to women in the public sector, permanent employees and those who worked in a conducive environment, respectively (OR 1.33-1.67, CI 1.09-2.91). Women who did not have poor work-life balance had reduced the odds of psychological health issues than women with poor work-life balance (OR 0.46, CI 0.25-0.84). Conclusion: Poor wellbeing significantly linked to strenuous and rigid work patterns, suggesting that modern and urban work culture may contribute to the poor wellbeing of working women. Noticing the recent decline in female workforce participation in Delhi, schemes like Flexi-timings, compensatory holidays, work-from-home and daycare facilities for young ones must be welcomed; these policies already exist in some private sector firms, and the public sectors companies should also adopt such changes to ease the dual burden as homemaker and career maker. This could encourage women in the urban areas to readily take up the jobs with less juggle to manage home and work.

Keywords: occupational characteristics, urban India, well-being, working women

Procedia PDF Downloads 205
18429 Youthful Population Sexual Activity in Malawi: A Health Scenario

Authors: A. Sathiya Susuman, N. Wilson

Abstract:

Background: The sexual behaviour of youths is believed to play an important role in the spread of sexually transmitted infections (STIs). Method: The data from the Malawi Demographic and Health Survey 2010 and a sample of 16,217 youth’s age 15 to 24 years (with each household 27.2% female and 72.8% male) was the basis for analysis. Bivariate and logistic regression analysis was performed. Results: The result shows married youth were not interested in condom use (94.2%, p<0.05). Those who were living together were 69 times (OR=1.69, 95% CI, 1.26–2.26) more likely to be involved in early sexual activity compared to those who were not living together. Conclusion: This scientific paper will help other researchers, policy makers, and planners to create strategies to encourage these youths to make use of contraception.

Keywords: sexually transmitted infections (STIs), reproductive tract infections (RTIs), condom use, sexual partners, early sexual debut, youths

Procedia PDF Downloads 437
18428 Risk Factors for Maternal and Neonatal Morbidities Associated with Operative Vaginal Deliveries

Authors: Maria Reichenber Arcilla

Abstract:

Objective: To determine the risk factors for maternal and neonatal complications associated with operative vaginal deliveries. Methods: A retrospective chart review of 435 patients who underwent operative vaginal deliveries was done. Patient profiles – age, parity, AOG, duration of labor – and outcomes – birthweight, maternal and neonatal complications - were tabulated and multivariable analysis and logistic regression were performed using SPSS® Statistics Base. Results and Conclusion: There was no significant difference in the incidence of maternal and neonatal complications between those that underwent vacuum and forceps extraction. Among the variables analysed, parity and duration of labor reached statistical significance. The odds of maternal complications were 3 times higher among nulliparous patients. Neonatal complications were seen in those whose labor lasted more than 9 hours.

Keywords: operative vaginal deliveries, maternal, neonatal, morbidity

Procedia PDF Downloads 406
18427 Development of a Regression Based Model to Predict Subjective Perception of Squeak and Rattle Noise

Authors: Ramkumar R., Gaurav Shinde, Pratik Shroff, Sachin Kumar Jain, Nagesh Walke

Abstract:

Advancements in electric vehicles have significantly reduced the powertrain noise and moving components of vehicles. As a result, in-cab noises have become more noticeable to passengers inside the car. To ensure a comfortable ride for drivers and other passengers, it has become crucial to eliminate undesirable component noises during the development phase. Standard practices are followed to identify the severity of noises based on subjective ratings, but it can be a tedious process to identify the severity of each development sample and make changes to reduce it. Additionally, the severity rating can vary from jury to jury, making it challenging to arrive at a definitive conclusion. To address this, an automotive component was identified to evaluate squeak and rattle noise issue. Physical tests were carried out for random and sine excitation profiles. Aim was to subjectively assess the noise using jury rating method and objectively evaluate the same by measuring the noise. Suitable jury evaluation method was selected for the said activity, and recorded sounds were replayed for jury rating. Objective data sound quality metrics viz., loudness, sharpness, roughness, fluctuation strength and overall Sound Pressure Level (SPL) were measured. Based on this, correlation co-efficients was established to identify the most relevant sound quality metrics that are contributing to particular identified noise issue. Regression analysis was then performed to establish the correlation between subjective and objective data. Mathematical model was prepared using artificial intelligence and machine learning algorithm. The developed model was able to predict the subjective rating with good accuracy.

Keywords: BSR, noise, correlation, regression

Procedia PDF Downloads 79
18426 Robust Variable Selection Based on Schwarz Information Criterion for Linear Regression Models

Authors: Shokrya Saleh A. Alshqaq, Abdullah Ali H. Ahmadini

Abstract:

The Schwarz information criterion (SIC) is a popular tool for selecting the best variables in regression datasets. However, SIC is defined using an unbounded estimator, namely, the least-squares (LS), which is highly sensitive to outlying observations, especially bad leverage points. A method for robust variable selection based on SIC for linear regression models is thus needed. This study investigates the robustness properties of SIC by deriving its influence function and proposes a robust SIC based on the MM-estimation scale. The aim of this study is to produce a criterion that can effectively select accurate models in the presence of vertical outliers and high leverage points. The advantages of the proposed robust SIC is demonstrated through a simulation study and an analysis of a real dataset.

Keywords: influence function, robust variable selection, robust regression, Schwarz information criterion

Procedia PDF Downloads 140
18425 Multilayer Perceptron Neural Network for Rainfall-Water Level Modeling

Authors: Thohidul Islam, Md. Hamidul Haque, Robin Kumar Biswas

Abstract:

Floods are one of the deadliest natural disasters which are very complex to model; however, machine learning is opening the door for more reliable and accurate flood prediction. In this research, a multilayer perceptron neural network (MLP) is developed to model the rainfall-water level relation, in a subtropical monsoon climatic region of the Bangladesh-India border. Our experiments show promising empirical results to forecast the water level for 1 day lead time. Our best performing MLP model achieves 98.7% coefficient of determination with lower model complexity which surpasses previously reported results on similar forecasting problems.

Keywords: flood forecasting, machine learning, multilayer perceptron network, regression

Procedia PDF Downloads 172