Search results for: multivariate regression tree
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4293

4143 Amharic Text News Classification Using Supervised Learning

Authors: Misrak Assefa

Abstract:

The Amharic language is the second most widely spoken Semitic language in the world. There is an overload of news on the web. Searching the web for useful documents on a specific topic written in Amharic is a challenging task. Hence, document categorization is required for managing and filtering important information. In the classification of Amharic text news, there is still a gap in the domain of information that needs to be addressed. This study attempts to design an automatic Amharic news classifier using a supervised learning mechanism on four untouched classes. For this research, 4,182 news articles were used. Naive Bayes (NB) and decision tree (J48) algorithms were used to classify the given Amharic dataset. k-fold cross-validation is used to estimate the accuracy of the classifiers. The results show that these algorithms are applicable to Amharic news categorization. The best average accuracies, achieved by the J48 decision tree and naïve Bayes, are 95.2345% and 94.6245%, respectively, using three categories. This research indicates that a typical decision tree algorithm is more applicable to Amharic news categorization.
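The k-fold cross-validation mentioned in the abstract can be illustrated with a minimal sketch. The function names `kfold_indices` and `cross_validate`, and the `fit_and_score` callback, are hypothetical and chosen for illustration only; the abstract does not state the value of k or describe the study's tooling.

```python
def kfold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    if not 1 < k <= n_samples:
        raise ValueError("k must satisfy 1 < k <= n_samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds get one additional sample.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(n_samples, k, fit_and_score):
    """Average fit_and_score(train_idx, test_idx) over the k folds.

    fit_and_score is a caller-supplied (hypothetical) callback that trains a
    classifier on train_idx and returns its accuracy on test_idx.
    """
    folds = kfold_indices(n_samples, k)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th forms the training set.
        train_idx = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        scores.append(fit_and_score(train_idx, test_idx))
    return sum(scores) / k
```

In practice the sample order is usually shuffled (or stratified by class) before splitting; that step is omitted here for brevity.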

Keywords: text categorization, supervised machine learning, naive Bayes, decision tree

Procedia PDF Downloads 159
4142 Detection and Classification of Rubber Tree Leaf Diseases Using Machine Learning

Authors: Kavyadevi N., Kaviya G., Gowsalya P., Janani M., Mohanraj S.

Abstract:

Hevea brasiliensis, also known as the rubber tree, is one of the foremost crops in the world. One of the most significant advantages of the rubber plant in terms of air oxygenation is its capacity to reduce the likelihood of an individual developing respiratory allergies like asthma. To construct a system that can properly identify crop diseases and pests and then create a database of insecticides for each pest and disease, we must first give treatment for the illness that has been detected. In this article, we primarily examine three major leaf diseases because of their economic impact: bird's eye spot, algal spot, and powdery mildew. The recommended work focuses on disease identification on rubber tree leaves. It will be accomplished by employing one of the superior algorithms. The processing pipeline follows the stages of input, preprocessing, image segmentation, feature extraction, and classification. We will use time-consuming procedures that they use to detect the sickness. As a consequence, the main ailments, underlying causes, and signs and symptoms of diseases that harm the rubber tree are covered in this study.

Keywords: image processing, python, convolution neural network (CNN), machine learning

Procedia PDF Downloads 49
4141 Neutral Heavy Scalar Searches via Standard Model Gauge Boson Decays at the Large Hadron Electron Collider with Multivariate Techniques

Authors: Luigi Delle Rose, Oliver Fischer, Ahmed Hammad

Abstract:

In this article, we study the prospects of the proposed Large Hadron electron Collider (LHeC) in the search for heavy neutral scalar particles. We consider a minimal model with one additional complex scalar singlet that interacts with the Standard Model (SM) via mixing with the Higgs doublet, giving rise to an SM-like Higgs boson and a heavy scalar particle. Both scalar particles are produced via vector boson fusion and can be tested via their decays into pairs of SM particles, analogously to the SM Higgs boson. Using multivariate techniques, we show that the LHeC is sensitive to heavy scalars with masses between 200 and 800 GeV down to scalar mixing of order 0.01.

Keywords: beyond the standard model, large hadron electron collider, multivariate analysis, scalar singlet

Procedia PDF Downloads 106
4140 The Theory behind Logistic Regression

Authors: Jan Henrik Wosnitza

Abstract:

The logistic regression has developed into a standard approach for estimating conditional probabilities in a wide range of applications including credit risk prediction. The article at hand contributes to the current literature on logistic regression fourfold: First, it is demonstrated that the binary logistic regression automatically meets its model assumptions under very general conditions. This result explains, at least in part, the logistic regression's popularity. Second, the requirement of homoscedasticity in the context of binary logistic regression is theoretically substantiated. The variances among the groups of defaulted and non-defaulted obligors have to be the same across the level of the aggregated default indicators in order to achieve linear logits. Third, this article sheds some light on the question why nonlinear logits might be superior to linear logits in case of a small amount of data. Fourth, an innovative methodology for estimating correlations between obligor-specific log-odds is proposed. In order to crystallize the key ideas, this paper focuses on the example of credit risk prediction. However, the results presented in this paper can easily be transferred to any other field of application.
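The link between equal group variances and linear logits can be seen in a short derivation. This is a sketch under an assumption that is not taken from the article: a single score x that is normally distributed within each group, with mean μ_k, variance σ_k², and prior probability π_k for group k ∈ {0, 1}. The symbols are illustrative, and the article's own setting (aggregated default indicators) may differ.

```latex
\begin{align*}
\log\frac{P(y=1\mid x)}{P(y=0\mid x)}
  &= \log\frac{\pi_1}{\pi_0} + \log\frac{\sigma_0}{\sigma_1}
     - \frac{(x-\mu_1)^2}{2\sigma_1^2} + \frac{(x-\mu_0)^2}{2\sigma_0^2} \\
  &= \left(\frac{1}{2\sigma_0^2} - \frac{1}{2\sigma_1^2}\right) x^2
     + \left(\frac{\mu_1}{\sigma_1^2} - \frac{\mu_0}{\sigma_0^2}\right) x
     + \text{const.}
\end{align*}
% With equal variances, \sigma_0 = \sigma_1 = \sigma, the quadratic term vanishes:
\begin{equation*}
\log\frac{P(y=1\mid x)}{P(y=0\mid x)}
  = \log\frac{\pi_1}{\pi_0} - \frac{\mu_1^2 - \mu_0^2}{2\sigma^2}
    + \frac{\mu_1 - \mu_0}{\sigma^2}\, x .
\end{equation*}
```

Under this assumption the logit is linear in x exactly when the two variances coincide; unequal variances leave a quadratic term, which is one way nonlinear logits can arise.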

Keywords: correlation, credit risk estimation, default correlation, homoscedasticity, logistic regression, nonlinear logistic regression

Procedia PDF Downloads 394
4139 Real-Time Classification of Marbles with Decision-Tree Method

Authors: K. S. Parlak, E. Turan

Abstract:

The separation of marbles according to pattern quality is a process that relies on expert decision. The classification phase is the most critical part in terms of economic value. In this study, a self-learning system is proposed that classifies marbles quickly and with a high success rate. The system extracts ten features by taking ten marble images from the camera. The marbles are classified by the decision tree method using the extracted features. The user forms the training set by training the system at the marble classification stage. The system improves itself with every marble image it classifies. The aim of the proposed system is to minimize the error caused by the person performing the classification and to perform the classification quickly.

Keywords: decision tree, feature extraction, k-means clustering, marble classification

Procedia PDF Downloads 355
4138 Predicting Football Player Performance: Integrating Data Visualization and Machine Learning

Authors: Saahith M. S., Sivakami R.

Abstract:

In the realm of football analytics, particularly focusing on predicting football player performance, the ability to forecast player success accurately is of paramount importance for teams, managers, and fans. This study introduces an elaborate examination of predicting football player performance through the integration of data visualization methods and machine learning algorithms. The research entails the compilation of an extensive dataset comprising player attributes, conducting data preprocessing, feature selection, model selection, and model training to construct predictive models. The analysis within this study will involve delving into feature significance using methodologies like Select Best and Recursive Feature Elimination (RFE) to pinpoint pertinent attributes for predicting player performance. Various machine learning algorithms, including Random Forest, Decision Tree, Linear Regression, Support Vector Regression (SVR), and Artificial Neural Networks (ANN), will be explored to develop predictive models. The evaluation of each model's performance utilizing metrics such as Mean Squared Error (MSE) and R-squared will be executed to gauge their efficacy in predicting player performance. Furthermore, this investigation will encompass a top player analysis to recognize the top-performing players based on the anticipated overall performance scores. Nationality analysis will entail scrutinizing the player distribution based on nationality and investigating potential correlations between nationality and player performance. Positional analysis will concentrate on examining the player distribution across various positions and assessing the average performance of players in each position. Age analysis will evaluate the influence of age on player performance and identify any discernible trends or patterns associated with player age groups. 
The primary objective is to predict a football player's overall performance accurately based on their individual attributes, leveraging data-driven insights to enrich the comprehension of player success on the field. By amalgamating data visualization and machine learning methodologies, the aim is to furnish valuable tools for teams, managers, and fans to effectively analyze and forecast player performance. This research contributes to the progression of sports analytics by showcasing the potential of machine learning in predicting football player performance and offering actionable insights for diverse stakeholders in the football industry.
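The two evaluation metrics named in the abstract, Mean Squared Error (MSE) and R-squared, can be written out directly. The following is a minimal sketch using their standard definitions; the function names are illustrative and the abstract does not describe the study's actual tooling.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1 - ss_res / ss_tot
```

A model that always predicts the mean of the observed values scores an R-squared of 0, and a perfect model scores 1, which gives a baseline for comparing the regressors listed above.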

Keywords: football analytics, player performance prediction, data visualization, machine learning algorithms, random forest, decision tree, linear regression, support vector regression, artificial neural networks, model evaluation, top player analysis, nationality analysis, positional analysis

Procedia PDF Downloads 14
4137 Multivariate Statistical Process Monitoring of Base Metal Flotation Plant Using Dissimilarity Scale-Based Singular Spectrum Analysis

Authors: Syamala Krishnannair

Abstract:

A multivariate statistical process monitoring methodology using dissimilarity scale-based singular spectrum analysis (SSA) is proposed for the detection and diagnosis of process faults in the base metal flotation plant. Process faults are detected based on the multi-level decomposition of process signals by SSA using the dissimilarity structure of the process data and the subsequent monitoring of the multiscale signals using the unified monitoring index which combines T² with SPE. Contribution plots are used to identify the root causes of the process faults. The overall results indicated that the proposed technique outperformed the conventional multivariate techniques in the detection and diagnosis of the process faults in the flotation plant.

Keywords: fault detection, fault diagnosis, process monitoring, dissimilarity scale

Procedia PDF Downloads 175
4136 The Effect of MgO and Rubber Nanofillers on Electrical Treeing Characteristic of XLPE Based Nanocomposites

Authors: Nur Amira nor Arifin, Tashia Marie Anthony, Mohd Ruzlin Mokhtar, Huzainie Shafi Abd Halim

Abstract:

Cross-linked polyethylene (XLPE) has been used as cable insulation for the past decades due to its higher working temperature of 90 ˚C and some other advantages. However, XLPE used as an insulating material for underground distribution cables may be subjected to unforeseeable weather and uncontrollable environmental conditions. These unfavorable conditions, when combined with a high electric field, may lead to the initiation and growth of water trees in XLPE insulation. There are several studies on numerous nanofillers incorporated into a polymer matrix to hinder tree propagation. Hence, this study aims to investigate the effect of MgO and rubber nanofillers at different concentrations on the electrical tree of XLPE. The nanofillers and XLPE were mixed and later extruded. After extrusion, the material was fabricated into the desired shape for experimental purposes. The results show that electrical tree propagation in XLPE filled with an optimized concentration of nanofillers was much slower than in pure XLPE. In this paper, the effect of nanofillers on the electrical treeing characteristic is discussed.

Keywords: electrical trees, nanofillers, polymer nanocomposites, XLPE

Procedia PDF Downloads 111
4135 Count of Trees in East Africa with Deep Learning

Authors: Nubwimana Rachel, Mugabowindekwe Maurice

Abstract:

Trees play a crucial role in maintaining biodiversity and providing various ecological services. Traditional methods of counting trees are time-consuming, and there is a need for more efficient techniques. Deep learning makes it feasible to identify the multi-scale elements hidden in aerial imagery. This research applies deep learning techniques to automated tree detection and counting in both forest and non-forest areas using satellite imagery. The objective is to identify the most effective model for automated tree counting. We used different deep learning models such as YOLOV7, SSD, and UNET, along with Generative Adversarial Networks to generate synthetic samples for training and other augmentation techniques, including Random Resized Crop, AutoAugment, and Linear Contrast Enhancement. These models were trained and fine-tuned using satellite imagery to identify and count trees. The performance of the models was assessed through multiple trials; after training and fine-tuning the models, UNET demonstrated the best performance with a validation loss of 0.1211, validation accuracy of 0.9509, and validation precision of 0.9799. This research showcases the success of deep learning in accurate tree counting through remote sensing, particularly with the UNET model. It represents a significant contribution to the field by offering an efficient and precise alternative to conventional tree-counting methods.

Keywords: remote sensing, deep learning, tree counting, image segmentation, object detection, visualization

Procedia PDF Downloads 27
4134 Understanding Farmers’ Perceptions Towards Agrivoltaics Using Decision Tree Algorithms

Authors: Mayuri Roy Choudhury

Abstract:

In recent times, the concept of agrivoltaics has gained popularity due to the dual use of land and the added value provided by photovoltaics in terms of renewable energy and crop production on farms. However, the transition towards agrivoltaics has been slow, and our research investigates the obstacles behind this slow progress. We applied decision tree algorithms to quantify the qualitative perceptions of farmers in the United States regarding agrivoltaics. To date, there has not been much research on farmers' perceptions, as most of the research focuses on the benefits of agrivoltaics. Our study adds value by putting forward the voices of farmers, who play a crucial role in the future transition to agrivoltaics. Our results show a mixture of responses in favor of agrivoltaics. Furthermore, they also reveal significant concerns of farmers, which are useful for decision-makers when formulating policies for agrivoltaics.

Keywords: agrivoltaics, decision-tree algorithms, farmers perception, transition

Procedia PDF Downloads 157
4133 Tree Resistance to Wind Storm: The Effects of Soil Saturation on Tree Anchorage of Young Pinus pinaster

Authors: P. Defossez, J. M. Bonnefond, D. Garrigou, P. Trichet, F. Danjon

Abstract:

Windstorm damage to European forests has ecological, social and economic consequences of major importance. Most trees damaged during storms are uprooted. While a large amount of work has been done over the last decade on understanding the aerial tree response to turbulent wind flow, much less is known about the root-soil interface and the impact of soil moisture and root-soil system fatiguing on tree uprooting. Anchorage strength is expected to be reduced by water-logging and heavy rain during storms because soil strength decreases with soil water content. Our paper focuses on the maritime pine cultivated on sandy soil, as a representative species of the Forêt des Landes, the largest cultivated forest in Europe. This study aims at providing knowledge on the effects of soil saturation on root anchorage. Pulling experiments on trees were performed to characterize the resistance to wind by measuring the critical bending moment (Mc). Pulling tests were performed on 12 thirteen-year-old maritime pines for two unsaturated soil conditions that represent the soil conditions expected in winter, when wind storms occur in France (w = 11.46 to 23.34 % g g⁻¹). A magnetic field digitizing technique was used to characterize the three-dimensional architecture of root systems. The soil mechanical properties were characterized as a function of soil water content and soil porosity by laboratory direct shear tests on remolded samples at low confining pressure (< 15 kPa). Remarkably, Mc did not depend on w but mainly on the root system morphology. We suggest that the importance of soil water conditions for tree anchorage depends on the tree size. This study gives new insight into young tree anchorage: roots may sustain anchorage by themselves, whereas adhesion between roots and surrounding soil may be negligible in sandy soil.

Keywords: roots, sandy soil, shear strength, tree anchorage, unsaturated soil

Procedia PDF Downloads 260
4132 The Pressure Losses in the Model of Human Lungs

Authors: Michaela Chovancova, Pavel Niedoba

Abstract:

For the treatment of acute and chronic lung diseases, it is preferred to deliver medicaments by inhalation. The drug is delivered directly to the tracheobronchial tree. This route lets the medicament reach the site of action directly, giving rapid onset of action and maximum efficiency. The transport of aerosol particles in a particular part of the lung is influenced by their size, the anatomy of the lungs, the breathing pattern and airway resistance. This article deals with the calculation of airway resistance in the lung model of Horsfield. It solves the problem of determining the pressure losses in a bifurcation and thus defines the pressure drop at a given location in the bronchial tree. The obtained data will be used as boundary conditions for the transport of aerosol particles in a central part of the bronchial tree, simulated with a Computational Fluid Dynamics (CFD) approach. The results obtained from the CFD simulation will allow us to provide information on the required particle size and the optimal inhalation technique for particle transport into a particular part of the lung.

Keywords: human lungs, bronchial tree, pressure losses, airways resistance, flow, breathing

Procedia PDF Downloads 330
4131 Effect of Pregnancy Intention, Postnatal Depressive Symptoms and Social Support on Early Childhood Stunting: Findings from India

Authors: Swati Srivastava, Ashish Kumar Upadhyay

Abstract:

Background: According to the United Nations Children's Fund, about 165 million children worldwide were stunted in 2012, and India alone accounts for 38% of the global burden of stunting. In absolute terms, India is home to more than 60 million stunted children. Our study aims to examine the effect of pregnancy intention and maternal postnatal depressive symptoms on early childhood stunting in India. We hypothesized that the effects of pregnancy intention and postnatal maternal depressive symptoms were mediated by social support. Methods: We used data from the first wave of the Young Lives Study India. Out of 2011 children recruited in the original cohort, 1833 children had complete information on pregnancy intention, maternal depression and other variables. A series of multivariate logistic regression models was used to examine the effect of pregnancy intention and postnatal depressive symptoms on early childhood stunting. Results: Bivariate results indicate that a higher percentage of children born after unintended pregnancy (40%) were stunted than children of intended pregnancy (26%). Likewise, the proportion of stunted children was also higher among women with high postnatal depressive symptoms (35%) than among those with a low level of depression (24%). Results of the multivariate logistic regression model indicate that children born after unintended pregnancy were significantly more likely to be stunted than children born after intended pregnancy (Coefficient: 1.70, CI: 1.17, 2.48). Likewise, early childhood stunting was also associated with maternal postnatal depressive symptoms (Coefficient: 1.48, CI: 1.16, 1.88). The effect of pregnancy intention and postnatal depressive symptoms on early childhood stunting remains unchanged after controlling for social support and other variables.
Conclusions: The findings of this study provide conclusive evidence regarding the consequences of pregnancy intention and postnatal depressive symptoms on early childhood stunting in India. Therefore, there is a need to identify women with unintended pregnancy and to incorporate the promotion of mental health into the national reproductive and child health programme.

Keywords: pregnancy intention, postnatal depressive symptoms, social support, childhood stunting, young lives study, India

Procedia PDF Downloads 270
4130 Confidence Envelopes for Parametric Model Selection Inference and Post-Model Selection Inference

Authors: I. M. L. Nadeesha Jayaweera, Adao Alex Trindade

Abstract:

In choosing a candidate model in likelihood-based modeling via an information criterion, the practitioner is often faced with the difficult task of deciding just how far up the ranked list to look. Motivated by this pragmatic necessity, we construct an uncertainty band for a generalized (model selection) information criterion (GIC), defined as a criterion for which the limit in probability is identical to that of the normalized log-likelihood. This includes common special cases such as AIC and BIC. The method starts from the asymptotic normality of the GIC for the joint distribution of the candidate models in an independent and identically distributed (IID) data framework and proceeds by deriving the (asymptotically) exact distribution of the minimum. The calculation of an upper quantile for its distribution then involves the computation of multivariate Gaussian integrals, which is amenable to efficient implementation via the R package "mvtnorm". The performance of the methodology is tested on simulated data by checking the coverage probability of nominal upper quantiles and compared to the bootstrap. Both methods give coverages close to nominal for large samples, but the bootstrap is two orders of magnitude slower. The methodology is subsequently extended to two other commonly used model structures: regression and time series. In the regression case, we derive the corresponding asymptotically exact distribution of the minimum GIC invoking Lindeberg-Feller type conditions for triangular arrays and are thus able to similarly calculate upper quantiles for its distribution via multivariate Gaussian integration. The bootstrap once again provides a default competing procedure, and we find that similar comparison performance metrics hold as for the IID case. The time series case is complicated by a far more intricate asymptotic regime for the joint distribution of the model GIC statistics.
Under a Gaussian likelihood, the default in most packages, one needs to derive the limiting distribution of a normalized quadratic form for a realization from a stationary series. Under conditions on the process satisfied by ARMA models, a multivariate normal limit is once again achieved. The bootstrap can, however, be employed for its computation, whence we are once again in the multivariate Gaussian integration paradigm for upper quantile evaluation. Comparisons of this bootstrap-aided semi-exact method with the full-blown bootstrap once again reveal a similar performance but faster computation speeds. One of the most difficult problems in contemporary statistical methodological research is to be able to account for the extra variability introduced by model selection uncertainty, the so-called post-model selection inference (PMSI). We explore ways in which the GIC uncertainty band can be inverted to make inferences on the parameters. This is being attempted in the IID case by pivoting the CDF of the asymptotically exact distribution of the minimum GIC. For inference one parameter at a time and a small number of candidate models, this works well, whence the attained PMSI confidence intervals are wider than the MLE-based Wald, as expected.
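The ranked list of candidate models that the abstract refers to can be sketched with the two special cases it names, AIC and BIC, in their conventional unnormalized forms. The helper names and the dictionary layout of `candidates` are assumptions made for this illustration; the sketch only produces the ranking and does not implement the uncertainty band proposed in the paper.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * log_lik


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: k log n - 2 log L."""
    return n_params * math.log(n_obs) - 2 * log_lik


def rank_models(candidates, n_obs, criterion="aic"):
    """Return (name, criterion value) pairs sorted from best (smallest) to worst.

    candidates maps a model name to (maximized log-likelihood, parameter count);
    this layout is a hypothetical convention for the example.
    """
    scored = []
    for name, (log_lik, n_params) in candidates.items():
        if criterion == "aic":
            value = aic(log_lik, n_params)
        elif criterion == "bic":
            value = bic(log_lik, n_params, n_obs)
        else:
            raise ValueError("criterion must be 'aic' or 'bic'")
        scored.append((name, value))
    return sorted(scored, key=lambda pair: pair[1])
```

The gaps between consecutive values in the returned list are what a practitioner would have to judge by eye when deciding how far up the list to look.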

Keywords: model selection inference, generalized information criteria, post model selection, Asymptotic Theory

Procedia PDF Downloads 61
4129 Effects of Nut Quality and Yield by Raising Poultry in Chestnut Tree Plantation

Authors: Yunmi Park, Mahn-Jo Kim

Abstract:

The purpose of this research is to determine the effect of raising poultry in an environment-friendly production area on the fruit quality and crop yield of chestnut trees. This study was conducted on chestnut tree cultivation sites where poultry was raised at intervals of five to ten days for three years, in a mountainous area located in the middle of Chungcheongbuk-do province, Korea. The quality of chestnut fruit and the control effects on harmful insects were compared between the sites raising poultry and control sites for three years. As a result, the harvest yield was two to five kilograms higher at the chestnut tree cultivation sites raising poultry than at the control site without poultry. Also, the ratio of the largest fruit, which determines the selling price, was 3% to 14% higher at the chestnut tree cultivation sites raising poultry. Regarding the effects of pest control through raising poultry, the ratio of harmful insect species at treatment sites was relatively low compared with the control site. A notable result is that the control effect on larvae of the chestnut leaf-cut weevil was higher at sites raising poultry aged 4 to 5 weeks than at sites raising poultry aged 12 weeks. This study found that raising poultry in chestnut tree cultivation increased fruit quality by improving fruit size and reducing the presence of a harmful insect, the chestnut leaf-cut weevil. Also, the eco-friendly chicken produced in these mountainous regions is expected to contribute to enhancing farmers' incomes by differentiating it from existing products.

Keywords: chestnut tree, environment-friendly, fruit quality, raising poultry

Procedia PDF Downloads 266
4128 Discrimination Between Bacillus and Alicyclobacillus Isolates in Apple Juice by Fourier Transform Infrared Spectroscopy and Multivariate Analysis

Authors: Murada Alholy, Mengshi Lin, Omar Alhaj, Mahmoud Abugoush

Abstract:

Alicyclobacillus is a causative agent of spoilage in pasteurized and heat-treated apple juice products. Differentiating between this genus and the closely related Bacillus is crucially important. In this study, Fourier transform infrared spectroscopy (FT-IR) was used to identify and discriminate between four Alicyclobacillus strains and four Bacillus isolates inoculated individually into apple juice. Loading plots over the range of 1350 to 1700 cm⁻¹ reflected the most distinctive biochemical features of Bacillus and Alicyclobacillus. Multivariate statistical methods (e.g. principal component analysis (PCA) and soft independent modeling of class analogy (SIMCA)) were used to analyze the spectral data. Distinctive separation of spectral samples was observed. This study demonstrates that FT-IR spectroscopy in combination with multivariate analysis could serve as a rapid and effective tool for the fruit juice industry to differentiate between Bacillus and Alicyclobacillus and to distinguish between species belonging to these two genera.

Keywords: alicyclobacillus, bacillus, FT-IR, spectroscopy, PCA

Procedia PDF Downloads 452
4127 Logistic Regression Based Model for Predicting Students’ Academic Performance in Higher Institutions

Authors: Emmanuel Osaze Oshoiribhor, Adetokunbo MacGregor John-Otumu

Abstract:

In recent years, there has been a desire to forecast student academic achievement prior to graduation. This is to help students improve their grades, particularly those with poor performance. The goal of this study is to employ supervised learning techniques to construct a predictive model for student academic achievement. Many academics have already constructed models that predict student academic achievement based on factors such as smoking, demography, culture, social media, parent educational background, parent finances, and family background, to name a few. These features and the models employed may not have correctly classified the students in terms of their academic performance. The model in this study is built using a logistic regression classifier with basic features such as the previous semester's course score, class attendance, class participation, and the total number of course materials or resources the student is able to cover per semester, as prerequisites to predict whether the student will perform well in future related courses. The model outperformed other classifiers such as Naive Bayes, Support Vector Machine (SVM), Decision Tree, Random Forest, and AdaBoost, returning a 96.7% accuracy. This model is available as a desktop application, allowing both instructors and students to benefit from user-friendly interfaces for predicting student academic achievement. As a result, it is recommended that both students and professors use this tool to better forecast outcomes.

Keywords: artificial intelligence, ML, logistic regression, performance, prediction

Procedia PDF Downloads 64
4126 Predication Model for Leukemia Diseases Based on Data Mining Classification Algorithms with Best Accuracy

Authors: Fahd Sabry Esmail, M. Badr Senousy, Mohamed Ragaie

Abstract:

In recent years, there has been an explosion in the use of technologies that help discover diseases. For example, DNA microarrays allow us, for the first time, to obtain a "global" view of the cell. They have great potential to provide accurate medical diagnosis and to help in finding the right treatment and cure for many diseases. Various classification algorithms can be applied to such microarray datasets to devise methods that can predict the occurrence of leukemia. In this study, we compared the classification accuracy and response time of eleven decision tree methods and six rule classifier methods using five performance criteria. The experimental results show that Random Tree produces the better result among the tree classifiers and takes the least time to build a model. Among the classification rule algorithms, the nearest-neighbor-like algorithm (NNge) is the best, owing to its high accuracy and the least time needed to build a model.

Keywords: data mining, classification techniques, decision tree, classification rule, leukemia diseases, microarray data

Procedia PDF Downloads 292
4125 Model Averaging for Poisson Regression

Authors: Zhou Jianhong

Abstract:

Model averaging is a desirable approach to deal with model uncertainty, which, however, has rarely been explored for Poisson regression. In this paper, we propose a model averaging procedure based on an unbiased estimator of the expected Kullback-Leibler distance for Poisson regression. A simulation study shows that the proposed model average estimator outperforms some other commonly used model selection and model average estimators in some situations. The proposed method is further applied to a real data example, where its advantage is demonstrated again.
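For orientation, the Kullback-Leibler distance between two Poisson distributions has a simple closed form, shown in the minimal sketch below. This is only the textbook divergence between two Poisson means, offered as background; it is not the unbiased estimator of the expected distance that the paper proposes, and the function name is illustrative.

```python
import math


def poisson_kl(lam_p, lam_q):
    """KL(Poisson(lam_p) || Poisson(lam_q)) = lam_p*log(lam_p/lam_q) - lam_p + lam_q."""
    if lam_p <= 0 or lam_q <= 0:
        raise ValueError("both Poisson means must be positive")
    return lam_p * math.log(lam_p / lam_q) - lam_p + lam_q
```

The divergence is zero when the two means coincide and grows as a candidate model's fitted mean moves away from the true mean, which is why it can serve as a loss for weighting candidate models.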

Keywords: model averaging, Poisson regression, Kullback-Leibler distance, statistics

Procedia PDF Downloads 485
4124 Establishment of the Regression Uncertainty of the Critical Heat Flux Power Correlation for an Advanced Fuel Bundle

Authors: L. Q. Yuan, J. Yang, A. Siddiqui

Abstract:

A new regression uncertainty analysis methodology was applied to determine the uncertainties of the critical heat flux (CHF) power correlation for an advanced 43-element bundle design, which was developed by Canadian Nuclear Laboratories (CNL) to achieve improved economics, resource utilization and energy sustainability. The new methodology is considered more appropriate than the traditional methodology in the assessment of the experimental uncertainty associated with regressions. The methodology was first assessed using both the Monte Carlo Method (MCM) and the Taylor Series Method (TSM) for a simple linear regression model, and then extended successfully to a non-linear CHF power regression model (CHF power as a function of inlet temperature, outlet pressure and mass flow rate). The regression uncertainty assessed by MCM agrees well with that by TSM. An equation to evaluate the CHF power regression uncertainty was developed and expressed as a function of independent variables that determine the CHF power.
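The comparison between the Monte Carlo Method and the Taylor Series Method for a simple linear regression can be illustrated with a minimal sketch. It assumes only independent Gaussian noise of known standard deviation on the y values; the data, the function names, and the evaluation point are made up for illustration and are not taken from the CHF experiments or from the authors' methodology.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return slope, y_bar - slope * x_bar


def tsm_uncertainty(x, x0, sigma_y):
    """Taylor-series (first-order) uncertainty of the fitted value at x0.

    The fitted value is a linear combination sum(w_i * y_i), so the
    propagated standard uncertainty is sigma_y * sqrt(sum(w_i ** 2)).
    """
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    weights = [1.0 / n + (x0 - x_bar) * (xi - x_bar) / sxx for xi in x]
    return sigma_y * math.sqrt(sum(w * w for w in weights))


def mcm_uncertainty(x, y, x0, sigma_y, trials=20000, seed=0):
    """Monte Carlo uncertainty: perturb y, refit, and predict at x0."""
    rng = random.Random(seed)
    preds = []
    for _ in range(trials):
        y_pert = [yi + rng.gauss(0.0, sigma_y) for yi in y]
        slope, intercept = fit_line(x, y_pert)
        preds.append(intercept + slope * x0)
    mean = sum(preds) / trials
    return math.sqrt(sum((p - mean) ** 2 for p in preds) / (trials - 1))


# Made-up illustrative data (assumption, not experimental values).
x_data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y_data = [2.1, 3.9, 6.2, 8.1, 9.8, 12.1, 14.0, 15.9]
u_tsm = tsm_uncertainty(x_data, 5.0, 0.5)
u_mcm = mcm_uncertainty(x_data, y_data, 5.0, 0.5)
print(f"TSM: {u_tsm:.4f}  MCM: {u_mcm:.4f}")
```

Because the fitted value is linear in the y data, the two estimates should agree up to Monte Carlo sampling error in this simplified setting; a non-linear model or uncertainties in x would need a fuller treatment.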

Keywords: CHF experiment, CHF correlation, regression uncertainty, Monte Carlo Method, Taylor Series Method

Procedia PDF Downloads 388
4123 Fast Bayesian Inference of Multivariate Block-Nearest Neighbor Gaussian Process (NNGP) Models for Large Data

Authors: Carlos Gonzales, Zaida Quiroz, Marcos Prates

Abstract:

Several spatial variables collected at the same location that share a common spatial distribution can be modeled simultaneously through a multivariate geostatistical model that takes into account the correlation between these variables and the spatial autocorrelation. The main goal of this model is to perform spatial prediction of these variables in the region of study. Here we focus on a geostatistical multivariate formulation that relies on sharing common spatial random effect terms. In particular, the first response variable can be modeled by a mean that incorporates a shared random spatial effect, while the other response variables depend on this shared spatial term, in addition to specific random spatial effects. Each spatial random effect is defined through a Gaussian process with a valid covariance function, but in order to improve the computational efficiency when the data are large, each Gaussian process is approximated by a Gaussian random Markov field (GRMF), specifically by the block nearest neighbor Gaussian process (Block-NNGP). This approach involves dividing the spatial domain into several dependent blocks under certain constraints, where the cross blocks capture the spatial dependence on a large scale, while each individual block captures the spatial dependence on a smaller scale. The multivariate geostatistical model belongs to the class of Latent Gaussian Models; thus, to achieve fast Bayesian inference, the integrated nested Laplace approximation (INLA) method is used. The good performance of the proposed model is shown through simulations and applications to massive data.

Keywords: Block-NNGP, geostatistics, Gaussian process, GRMF, INLA, multivariate models

Procedia PDF Downloads 63
4122 Decision Tree Analysis of Risk Factors for Intravenous Infiltration among Hospitalized Children: A Retrospective Study

Authors: Soon-Mi Park, Ihn Sook Jeong

Abstract:

This retrospective study aimed to identify risk factors for intravenous (IV) infiltration in hospitalized children. The participants were 1,174 children in the test sample and 424 children in the validation sample, all of whom were admitted to a general hospital, received peripheral intravenous injection therapy at least once, and had complete records. Frequencies and percentages or means and standard deviations were calculated, and decision tree analysis was used to screen for the most important risk factors for IV infiltration in hospitalized children. The decision tree analysis showed that the most important traditional risk factors for IV infiltration were the use of ampicillin/sulbactam, IV insertion site (lower extremities), and medical department (internal medicine) in both the test sample and the validation sample. The correct classification rate was 92.2% in the test sample and 90.1% in the validation sample. More careful attention should be paid to patients who are administered ampicillin/sulbactam, have an IV site in the lower extremities, and have internal medical problems, in order to prevent or detect infiltration.

Keywords: decision tree analysis, intravenous infiltration, child, validation

Procedia PDF Downloads 144
4121 Socioeconomic Status and Mortality in Older People with Angina: A Population-Based Cohort Study in China

Authors: Weiju Zhou, Alex Hopkins, Ruoling Chen

Abstract:

Background: The income gap between richer and poorer people in China has widened over the past 40 years, and the number of deaths among people with angina has been rising. It is unclear whether socioeconomic status (SES) is associated with increased mortality in older people with angina. Methods: Data were examined from a cohort study of 2,380 participants aged ≥ 65 years, who were randomly recruited from urban communities in five provinces of China. The cohort members were interviewed at baseline to record socio-demographic and risk factors and to document doctor-diagnosed angina, and were followed up over 3-10 years, including monitoring of vital status. Multivariate Cox regression models were employed to examine all-cause mortality in relation to low SES. Results: The cohort follow-up identified 373 deaths, 41 of them among the 208 angina patients. Compared to participants without angina (n=2,172), patients with angina had increased mortality (multivariate adjusted hazard ratio (HR) 1.41, 95% CI 1.01-1.97). Within angina patients, the risk of mortality increased with low satisfactory income (2.51, 1.08-5.85) and having financial problems (4.00, 1.07-15.00), but not significantly with levels of education and occupation. In non-angina participants, none of these four SES indicators was associated with mortality. There was a significant interaction effect between angina and low satisfactory income on mortality. Conclusions: In China, having a low income and financial problems increases mortality in older people with angina. Strategies to improve the economic circumstances of older people could help reduce inequality in angina survival.

Keywords: angina, mortality, older people, socio-economic status

Procedia PDF Downloads 86
4120 Non-Parametric Regression over Its Parametric Counterparts with Large Sample Size

Authors: Jude Opara, Esemokumo Perewarebo Akpos

Abstract:

This paper compares non-parametric linear regression with its parametric counterpart for a large sample size. A data set of anthropometric measurements of primary school pupils was used for the analysis, based on 50 randomly selected pupils. The data were subjected to a normality test using the Anderson-Darling technique, which showed that the residuals from the commonly used least squares regression method for fitting an equation to a set of (x,y) data points are not normally distributed (i.e., they do not follow a Gaussian distribution). The algorithms for the nonparametric Theil’s regression and its parametric OLS counterpart are stated in this paper. The analysis was carried out in the programming language software known as “R Development”. The results showed that there is a significant relationship between the response and the explanatory variable for both the parametric and the non-parametric regression. To assess the efficiency of one method over the other, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) were used, and the nonparametric regression was found to perform better than its parametric counterpart, as indicated by its lower AIC and BIC values. The study recommends that future researchers carry out similar work that examines the data set for outliers, removes any that are detected, and re-analyzes the data to compare results.
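As a rough illustration of Theil's regression, the slope can be taken as the median of all pairwise slopes. The sketch below is written in Python rather than R, uses made-up data, and picks one common choice for the intercept (the median of y - slope * x); it is not the authors' implementation.

```python
from itertools import combinations
from statistics import median


def theil_regression(x, y):
    """Return (slope, intercept) of Theil's median-of-pairwise-slopes line."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]  # pairs with tied x values have no defined slope
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    slope = median(slopes)
    # Assumed intercept convention: median of the residual offsets.
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


# Made-up data: y = 2x + 1 with one gross outlier in the last point.
x = [1, 2, 3, 4, 5, 6]
y = [3, 5, 7, 9, 11, 40]
slope, intercept = theil_regression(x, y)
print(slope, intercept)
```

The single outlier barely moves the median of the slopes, which is the property that makes the estimator attractive when residuals are not normally distributed.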

Keywords: Theil’s regression, Bayesian information criterion, Akaike information criterion, OLS

Procedia PDF Downloads 276
4119 An Evaluation Model for Automatic Map Generalization

Authors: Quynhan Tran, Hong Fan, Quockhanh Pham

Abstract:

Automatic map generalization is a well-known problem in cartography. The development of map generalization research has accompanied the development of cartography. Traditionally, maps are plotted manually by cartographic experts. This paper studies none-scale automatic generalization of resident polygons and house marker symbols and proposes a methodology to evaluate the resulting maps based on the minimal spanning tree. The minimal spanning trees before and after map generalization are compared to evaluate whether the generalization result maintains the geographical distribution of features. The minimal spanning tree in vector format is first converted into raster format with a grid size of 2 mm (distance on the map). The number of matching grid cells before and after map generalization and the ratio of overlapping grid cells to the total number of grid cells are then calculated. Evaluation experiments are conducted to verify the results. The experiments show that this methodology can give an objective evaluation of the feature distribution and give specialists a hand when they visually evaluate the result maps of none-scale automatic generalization.

Keywords: automatic cartography generalization, evaluation model, geographic feature distribution, minimal spanning tree

Procedia PDF Downloads 607
4118 Longan Tree Flowering and Bearing Induction Based on Chemicals and Growing Degree-Days Models

Authors: Hong Li, Tingxian Li, Xudong Wang, Fengliang Zhao

Abstract:

Unreliable flowering of chilling-requiring longan (Dimocarpus longan) due to increased air temperatures has been a common concern in tropical areas. Our objective was to assess the efficiency of chemicals in inducing longan tree flowering and bearing using Growing Degree Days (GDD). The 2-year study was conducted on the tropical Hainan Island during 2012-2013. GDD accumulation was started at pruning (August). The KClO3 treatments were applied to the root zones under the canopies at GDD 1300ºC, while KH2PO4 rates were applied to the leaves at fruit setting at GDD 3000ºC and GDD 4000ºC. The results showed that the total cumulative GDD for longan was 6050ºC. The GDD-guided KClO3 applications induced significant tree budding and flowering. The GDD-guided KH2PO4 applications stimulated higher leaf photosynthesis, carboxylation efficiency, marketable fruit yield and quality (K+ and sugar) (P<0.05). It was concluded that the GDD-based model could efficiently support reliable longan flowering and bearing.
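The GDD-guided timing can be sketched as a running sum of daily degree-days. The abstract does not state the base temperature or the daily formula, so the averaging formula, the base temperature of 10 ºC, and the weather series below are all assumptions chosen only for illustration.

```python
def daily_gdd(t_max, t_min, t_base):
    """Degree-days for one day using the simple averaging method (assumed)."""
    return max(0.0, (t_max + t_min) / 2.0 - t_base)


def first_day_reaching(temps, threshold, t_base):
    """Return the 1-based day on which cumulative GDD first reaches
    the threshold, or None if it is never reached."""
    total = 0.0
    for day, (t_max, t_min) in enumerate(temps, start=1):
        total += daily_gdd(t_max, t_min, t_base)
        if total >= threshold:
            return day
    return None


T_BASE = 10.0  # hypothetical base temperature, not given in the abstract
# Made-up constant weather after pruning: (daily max, daily min) in ºC.
temps = [(32.0, 24.0)] * 100
kclo3_day = first_day_reaching(temps, 1300.0, T_BASE)
print(kclo3_day)
```

With real station data, the same function could be called with the 3000 and 4000 thresholds to schedule the later foliar applications.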

Keywords: canopy nutrition, flowering induction, growing degree days, longan, oxidant KClO3, tree physiology

Procedia PDF Downloads 278
4117 Quantified Metabolomics for the Determination of Phenotypes and Biomarkers across Species in Health and Disease

Authors: Miroslava Cuperlovic-Culf, Lipu Wang, Ketty Boyle, Nadine Makley, Ian Burton, Anissa Belkaid, Mohamed Touaibia, Marc E. Surrette

Abstract:

Metabolic changes are one of the major factors in the development of a variety of diseases in various species. Metabolism of agricultural plants is altered following infection with pathogens, sometimes contributing to resistance. At the same time, pathogens use metabolites for infection and progression. In humans, for example, metabolism is a hallmark of cancer development. Quantified metabolomics data combined with other omics or clinical data and analyzed using various unsupervised and supervised methods can lead to better diagnosis and prognosis. It can also provide information about resistance as well as contribute knowledge of compounds significant for disease progression or prevention. In this work, different methods for metabolomics quantification and analysis from Nuclear Magnetic Resonance (NMR) measurements that are used for investigation of disease development in wheat and human cells will be presented. One-dimensional 1H NMR spectra are used extensively for metabolic profiling due to their high reliability, wide range of applicability, speed, trivial sample preparation and low cost. This presentation will describe a new method for metabolite quantification from NMR data that combines alignment of spectra of standards to sample spectra followed by multivariate linear regression optimization of spectra of assigned metabolites to samples’ spectra. Several different alignment methods were tested, and the multivariate linear regression results were compared with other quantification methods. Quantified metabolomics data can be analyzed in a variety of ways, and we will present different clustering methods used for phenotype determination, network analysis providing knowledge about the relationships between metabolites through the metabolic network, as well as biomarker selection providing novel markers.
These analysis methods have been utilized for the investigation of Fusarium head blight resistance in wheat cultivars as well as analysis of the effect of estrogen receptor and carbonic anhydrase activation and inhibition on breast cancer cell metabolism. Metabolic changes in spikelets of wheat cultivars FL62R1, Stettler, MuchMore and Sumai3 following Fusarium graminearum infection were explored. Extensive 1D 1H and 2D NMR measurements provided information for detailed metabolite assignment and quantification leading to possible metabolic markers discriminating resistance level in wheat subtypes. Quantification data is compared to results obtained using other published methods. Fusarium infection-induced metabolic changes in different wheat varieties are discussed in the context of metabolic network and resistance. Quantitative metabolomics has been used for the investigation of the effect of targeted enzyme inhibition in cancer. In this work, the effect of 17β-estradiol and ferulic acid on metabolism of ER+ breast cancer cells has been compared to their effect on ER- control cells. The effect of the inhibitors of carbonic anhydrase on the observed metabolic changes resulting from ER activation has also been determined. Metabolic profiles were studied using 1D and 2D metabolomic NMR experiments, combined with the identification and quantification of metabolites, and the annotation of the results is provided in the context of biochemical pathways.

Keywords: metabolic biomarkers, metabolic network, metabolomics, multivariate linear regression, NMR quantification, quantified metabolomics, spectral alignment

Procedia PDF Downloads 316
4116 Development, Optimization, and Validation of a Synchronous Fluorescence Spectroscopic Method with Multivariate Calibration for the Determination of Amlodipine and Olmesartan Implementing: Experimental Design

Authors: Noha Ibrahim, Eman S. Elzanfaly, Said A. Hassan, Ahmed E. El Gendy

Abstract:

Objectives: The purpose of the study is to develop a sensitive synchronous spectrofluorimetric method with multivariate calibration after studying and optimizing the different variables affecting the native fluorescence intensity of amlodipine and olmesartan, using an experimental design approach. Method: In the first step, a fractional factorial design was used to screen the independent factors affecting the fluorescence intensity of both drugs. The objective of the second step was to optimize the method performance using a Central Composite Face-centred (CCF) design. The optimal experimental conditions obtained from this study were a temperature of 15 ± 0.5°C, a solvent of 0.05N HCl and methanol in a ratio of 90:10 (v/v), a Δλ of 42, and the addition of 1.48% surfactant, which provided a sensitive measurement of amlodipine and olmesartan. The resolution of the binary mixture with a multivariate calibration method was accomplished mainly by using a partial least squares (PLS) model. Results: The recovery percentages for amlodipine besylate and atorvastatin calcium in tablet dosage form were found to be 102 ± 0.24 and 99.56 ± 0.10 for amlodipine and olmesartan, respectively. Conclusion: The method is valid according to some International Conference on Harmonization (ICH) guidelines, proving to be linear over ranges of 200-300 and 500-1500 ng mL⁻¹ for amlodipine and olmesartan, respectively. The method successfully estimated amlodipine besylate and olmesartan in bulk powder and pharmaceutical preparations.

Keywords: amlodipine, central composite face-centred design, experimental design, fractional factorial design, multivariate calibration, olmesartan

Procedia PDF Downloads 119
4115 A Numerical Model for Simulation of Blood Flow in Vascular Networks

Authors: Houman Tamaddon, Mehrdad Behnia, Masud Behnia

Abstract:

An accurate study of blood flow is associated with an accurate vascular pattern and geometrical properties of the organ of interest. Due to the complexity of vascular networks and poor accessibility in vivo, it is challenging to reconstruct the entire vasculature of any organ experimentally. The objective of this study is to introduce an innovative approach for the reconstruction of a full vascular tree from available morphometric data. Our method consists of implementing morphometric data on those parts of the vascular tree that are smaller than the resolution of medical imaging methods. This technique reconstructs the entire arterial tree down to the capillaries. Vessels greater than 2 mm are obtained from direct volume and surface analysis using contrast-enhanced computed tomography (CT). Vessels smaller than 2 mm are reconstructed from available morphometric and distensibility data and rearranged by applying Murray’s Laws. Implementing morphometric data to reconstruct the branching pattern and applying Murray’s Laws to every vessel bifurcation simultaneously leads to an accurate vascular tree reconstruction. The reconstruction algorithm generates the full arterial tree topography down to the first capillary bifurcation. The geometry of each order of the vascular tree is generated separately to minimize the construction and simulation time. The node-to-node connectivity, along with the diameter and length of every vessel segment, is established, and order numbers are assigned according to the diameter-defined Strahler system. During the simulation, we used the averaged flow rate for each order to predict the pressure drop; once the pressure drop is predicted, the flow rate is corrected to match the computed pressure drop for each vessel. The final results for 3 cardiac cycles are presented and compared to the clinical data.
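Murray's law at a single bifurcation states that the cube of the parent diameter equals the sum of the cubes of the daughter diameters. The sketch below applies that relation only; the asymmetry ratio, the 0.008 mm stopping diameter, and the assumption of perfectly symmetric branching are hypothetical illustration values, not parameters from the study.

```python
def murray_daughters(parent_d, ratio=1.0):
    """Split a parent diameter into two daughter diameters (d1, d2)
    with d2 / d1 == ratio, satisfying parent_d**3 == d1**3 + d2**3."""
    d1 = parent_d / (1.0 + ratio ** 3) ** (1.0 / 3.0)
    return d1, ratio * d1


def generations_until(start_d, stop_d):
    """Count symmetric bifurcations needed for the diameter to fall
    to stop_d or below, starting from start_d."""
    count, d = 0, start_d
    while d > stop_d:
        d, _ = murray_daughters(d)
        count += 1
    return count


# Start at the 2 mm imaging limit mentioned in the abstract.
d1, d2 = murray_daughters(2.0, ratio=0.8)
print(d1, d2, d1 ** 3 + d2 ** 3)

# Hypothetical capillary-scale stopping diameter of 0.008 mm.
print(generations_until(2.0, 0.008))
```

Real arterial trees branch asymmetrically and follow measured morphometric statistics, so the generation count here only indicates the order of magnitude of the tree depth.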

Keywords: blood flow, morphometric data, vascular tree, Strahler ordering system

Procedia PDF Downloads 243
4114 Determination of Physical Properties of Crude Oil Distillates by Near-Infrared Spectroscopy and Multivariate Calibration

Authors: Ayten Ekin Meşe, Selahattin Şentürk, Melike Duvanoğlu

Abstract:

Petroleum refineries are a highly complex process industry with continuous production and high operating costs. Physical separation of crude oil starts with the crude oil distillation unit, continues with various conversion and purification units, and passes through many stages until the final product is obtained. To meet the desired product specification, process parameters are strictly followed. To ensure the quality of distillates, routine analyses are performed in quality control laboratories based on appropriate international standards such as American Society for Testing and Materials (ASTM) standard methods and European Standard (EN) methods. The cut point of distillates in the crude distillation unit is crucial for the efficiency of the downstream processes. In order to maximize the process efficiency, the determination of the quality of distillates should be as fast as possible, reliable, and cost-effective. In this sense, an alternative study was carried out on the crude oil distillation unit that serves the entire refinery process. In this work, studies were conducted with three different crude oil distillates: Light Straight Run Naphtha (LSRN), Heavy Straight Run Naphtha (HSRN), and Kerosene. After separation, these products are named according to the number of carbon atoms they contain. LSRN consists of hydrocarbons containing five to six carbon atoms, HSRN of six to ten, and kerosene of sixteen to twenty-two. Physical properties of the three crude distillation unit products (LSRN, HSRN, and Kerosene) were determined using Near-Infrared Spectroscopy with multivariate calibration. The absorbance spectra of the petroleum samples were obtained in the range from 10000 cm⁻¹ to 4000 cm⁻¹, employing a quartz transmittance flow-through cell with a 2 mm light path and a resolution of 2 cm⁻¹. A total of 400 samples of each petroleum product were collected over almost four years.
Several different crude oil grades were processed during the sample collection period. Extended Multiplicative Signal Correction (EMSC) and Savitzky-Golay (SG) preprocessing techniques were applied to the FT-NIR spectra of the samples to eliminate baseline shifts and suppress unwanted variation. Two different multivariate calibration approaches (Partial Least Squares Regression, PLS, and Genetic Inverse Least Squares, GILS) and an ensemble model were applied to the preprocessed FT-NIR spectra. The predictive performance of each multivariate calibration technique and preprocessing technique was compared, and the best models were chosen according to the reproducibility of the ASTM reference methods. This work demonstrates that the developed models can be used for routine analysis instead of conventional analytical methods with over 90% accuracy.
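Savitzky-Golay preprocessing, mentioned above, fits a low-order polynomial in a sliding window. A minimal sketch of the smoothing variant follows, assuming a 5-point window and a quadratic fit, for which the standard convolution weights are (-3, 12, 17, 12, -3) / 35; the window size, the treatment of end points, and the sample values are assumptions, since the abstract does not report the settings used.

```python
SG5_WEIGHTS = (-3.0, 12.0, 17.0, 12.0, -3.0)  # 5-point quadratic smoothing
SG5_NORM = 35.0


def savgol5(values):
    """Smooth a sequence with a 5-point quadratic Savitzky-Golay filter.

    The two points at each end are copied unchanged (an assumed,
    simple edge treatment).
    """
    if len(values) < 5:
        raise ValueError("need at least 5 points")
    out = [float(v) for v in values]
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        out[i] = sum(w * v for w, v in zip(SG5_WEIGHTS, window)) / SG5_NORM
    return out


# Made-up absorbance-like values with one noisy spike at index 4.
spectrum = [0.10, 0.12, 0.15, 0.19, 0.40, 0.27, 0.31, 0.34, 0.36]
print(savgol5(spectrum))
```

A quadratic trend passes through this filter unchanged at interior points, which is why it suppresses noise while preserving smooth band shapes better than a plain moving average.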

Keywords: crude distillation unit, multivariate calibration, near infrared spectroscopy, data preprocessing, refinery

Procedia PDF Downloads 88