Search results for: cross validation

5086 Efficient Model Selection in Linear and Non-Linear Quantile Regression by Cross-Validation

Authors: Yoonsuh Jung, Steven N. MacEachern

Abstract:

Check loss function is used to define quantile regression. In the prospect of cross validation, it is also employed as a validation function when underlying truth is unknown. However, our empirical study indicates that the validation with check loss often leads to choosing an over estimated fits. In this work, we suggest a modified or L2-adjusted check loss which rounds the sharp corner in the middle of check loss. It has a large effect of guarding against over fitted model in some extent. Through various simulation settings of linear and non-linear regressions, the improvement of check loss by L2 adjustment is empirically examined. This adjustment is devised to shrink to zero as sample size grows.

Keywords: cross-validation, model selection, quantile regression, tuning parameter selection

Procedia PDF Downloads 430

5085 Efficient Tuning Parameter Selection by Cross-Validated Score in High Dimensional Models

Authors: Yoonsuh Jung

Abstract:

As DNA microarray data contain relatively small sample size compared to the number of genes, high dimensional models are often employed. In high dimensional models, the selection of tuning parameter (or, penalty parameter) is often one of the crucial parts of the modeling. Cross-validation is one of the most common methods for the tuning parameter selection, which selects a parameter value with the smallest cross-validated score. However, selecting a single value as an "optimal" value for the parameter can be very unstable due to the sampling variation since the sample sizes of microarray data are often small. Our approach is to choose multiple candidates of tuning parameter first, then average the candidates with different weights depending on their performance. The additional step of estimating the weights and averaging the candidates rarely increase the computational cost, while it can considerably improve the traditional cross-validation. We show that the selected value from the suggested methods often lead to stable parameter selection as well as improved detection of significant genetic variables compared to the tradition cross-validation via real data and simulated data sets.

Keywords: cross validation, parameter averaging, parameter selection, regularization parameter search

Procedia PDF Downloads 412

5084 Selection of Variogram Model for Environmental Variables

Authors: Sheikh Samsuzzhan Alam

Abstract:

The present study investigates the selection of variogram model in analyzing spatial variations of environmental variables with the trend. Sometimes, the autofitted theoretical variogram does not really capture the true nature of the empirical semivariogram. So proper exploration and analysis are needed to select the best variogram model. For this study, an open source data collected from California Soil Resource Lab1 is used to explain the problems when fitting a theoretical variogram. Five most commonly used variogram models: Linear, Gaussian, Exponential, Matern, and Spherical were fitted to the experimental semivariogram. Ordinary kriging methods were considered to evaluate the accuracy of the selected variograms through cross-validation. This study is beneficial for selecting an appropriate theoretical variogram model for environmental variables.

Keywords: anisotropy, cross-validation, environmental variables, kriging, variogram models

Procedia PDF Downloads 329

5083 A Validation Technique for Integrated Ontologies

Authors: Neli P. Zlatareva

Abstract:

Ontology validation is an important part of web applications’ development, where knowledge integration and ontological reasoning play a fundamental role. It aims to ensure the consistency and correctness of ontological knowledge and to guarantee that ontological reasoning is carried out in a meaningful way. Existing approaches to ontology validation address more or less specific validation issues, but the overall process of validating web ontologies has not been formally established yet. As the size and the number of web ontologies continue to grow, the necessity to validate and ensure their consistency and interoperability is becoming increasingly important. This paper presents a validation technique intended to test the consistency of independent ontologies utilized by a common application.

Keywords: knowledge engineering, ontological reasoning, ontology validation, semantic web

Procedia PDF Downloads 314

5082 Surface-Enhanced Raman Spectroscopy on Gold Nanoparticles in the Kidney Disease

Authors: Leonardo C. Pacheco-Londoño, Nataly J Galan-Freyle, Lisandro Pacheco-Lugo, Antonio Acosta-Hoyos, Elkin Navarro, Gustavo Aroca-Martinez, Karin Rondón-Payares, Alberto C. Espinosa-Garavito, Samuel P. Hernández-Rivera

Abstract:

At the Life Science Research Center at Simon Bolivar University, a primary focus is the diagnosis of various diseases, and the use of gold nanoparticles (Au-NPs) in diverse biomedical applications is continually expanding. In the present study, Au-NPs were employed as substrates for Surface-Enhanced Raman Spectroscopy (SERS) aimed at diagnosing kidney diseases arising from Lupus Nephritis (LN), preeclampsia (PC), and Hypertension (H). Discrimination models were developed for distinguishing patients with and without kidney diseases based on the SERS signals from urine samples by partial least squares-discriminant analysis (PLS-DA). A comparative study of the Raman signals across the three conditions was conducted, leading to the identification of potential metabolite signals. Model performance was assessed through cross-validation and external validation, determining parameters like sensitivity and specificity. Additionally, a secondary analysis was performed using machine learning (ML) models, wherein different ML algorithms were evaluated for their efficiency. Models’ validation was carried out using cross-validation and external validation, and other parameters were determined, such as sensitivity and specificity; the models showed average values of 0.9 for both parameters. Additionally, it is not possible to highlight this collaborative effort involved two university research centers and two healthcare institutions, ensuring ethical treatment and informed consent of patient samples.

Keywords: SERS, Raman, PLS-DA, kidney diseases

Procedia PDF Downloads 39

5081 Characterization and Geographical Differentiation of Yellow Prickly Pear Produced in Different Mediterranean Countries

Authors: Artemis Louppis, Michalis Constantinou, Ioanna Kosma, Federica Blando, Michael Kontominas, Anastasia Badeka

Abstract:

The aim of the present study was to differentiate yellow prickly pear according to geographical origin based on the combination of mineral content, physicochemical parameters, vitamins and antioxidants. A total of 240 yellow prickly pear samples from Cyprus, Spain, Italy and Greece were analyzed for pH, titratable acidity, electrical conductivity, protein, moisture, ash, fat, antioxidant activity, individual antioxidants, sugars and vitamins by UPLC-MS/MS as well as minerals by ICP-MS. Statistical treatment of the data included multivariate analysis of variance followed by linear discriminant analysis. Based on results, a correct classification of 66.7% was achieved using the cross validation by mineral content while 86.1% was achieved using the cross validation method by combination of all analytical parameters.

Keywords: geographical differentiation, prickly pear, chemometrics, analytical techniques

Procedia PDF Downloads 136

5080 Invention of Novel Technique of Process Scale Up by Using Solid Dosage Form

Authors: Shashank Tiwari, S. P. Mahapatra

Abstract:

The aim of this technique is to reduce the steps of process scales up, save time & cost of the industries. This technique will minimise the steps of process scale up. The new steps are, Novel Lab Scale, Novel Lab Scale Trials, Novel Trial Batches, Novel Exhibit Batches, Novel Validation Batches. In these steps, it is not divided to validation batches in three parts but the data of trials batches, Exhibit Batches and Validation batches are use and compile for production and used for validation. It also increases the batch size of the trial, exhibit batches. The new size of trials batches is not less than fifty Thousand, the exhibit batches increase up to two lack and the validation batches up to five lack. After preparing the batches all their data & drugs use for stability & maintain the validation record and compile data for the technology transfer in production department for preparing the marketed size batches.

Keywords: batches, technique, preparation, scale up, validation

Procedia PDF Downloads 350

5079 A Discrete Logit Survival Model with a Smooth Baseline Hazard for Age at First Alcohol Intake among Students at Tertiary Institutions in Thohoyandou, South Africa

Authors: A. Bere, H. G. Sithuba, K. Kyei, C. Sigauke

Abstract:

We employ a discrete logit survival model to investigate the risk factors for early alcohol intake among students at two tertiary institutions in Thohoyandou, South Africa. Data were collected from a sample of 744 students using a self-administered questionnaire. Significant covariates were arrived at through a regularization algorithm implemented using the glmmLasso package. The tuning parameter was determined using a five-fold cross-validation algorithm. The baseline hazard was modelled as a smooth function of time through the use of spline functions. The results show that the hazard of initial alcohol intake peaks at the age of about 16 years and that at any given time, being of a male gender, prior use of other drugs, having drinking peers, having experienced negative life events and physical abuse are associated with a higher risk of alcohol intake debut.

Keywords: cross-validation, discrete hazard model, LASSO, smooth baseline hazard

Procedia PDF Downloads 182

5078 Development of a Decision-Making Method by Using Machine Learning Algorithms in the Early Stage of School Building Design

Authors: Pegah Eshraghi, Zahra Sadat Zomorodian, Mohammad Tahsildoost

Abstract:

Over the past decade, energy consumption in educational buildings has steadily increased. The purpose of this research is to provide a method to quickly predict the energy consumption of buildings using separate evaluation of zones and decomposing the building to eliminate the complexity of geometry at the early design stage. To produce this framework, machine learning algorithms such as Support vector regression (SVR) and Artificial neural network (ANN) are used to predict energy consumption and thermal comfort metrics in a school as a case. The database consists of more than 55000 samples in three climates of Iran. Cross-validation evaluation and unseen data have been used for validation. In a specific label, cooling energy, it can be said the accuracy of prediction is at least 84% and 89% in SVR and ANN, respectively. The results show that the SVR performed much better than the ANN.

Keywords: early stage of design, energy, thermal comfort, validation, machine learning

Procedia PDF Downloads 83

5077 Specific Emitter Identification Based on Refined Composite Multiscale Dispersion Entropy

Authors: Shaoying Guo, Yanyun Xu, Meng Zhang, Weiqing Huang

Abstract:

The wireless communication network is developing rapidly, thus the wireless security becomes more and more important. Specific emitter identification (SEI) is an vital part of wireless communication security as a technique to identify the unique transmitters. In this paper, a SEI method based on multiscale dispersion entropy (MDE) and refined composite multiscale dispersion entropy (RCMDE) is proposed. The algorithms of MDE and RCMDE are used to extract features for identification of five wireless devices and cross-validation support vector machine (CV-SVM) is used as the classifier. The experimental results show that the total identification accuracy is 99.3%, even at low signal-to-noise ratio(SNR) of 5dB, which proves that MDE and RCMDE can describe the communication signal series well. In addition, compared with other methods, the proposed method is effective and provides better accuracy and stability for SEI.

Keywords: cross-validation support vector machine, refined com- posite multiscale dispersion entropy, specific emitter identification, transient signal, wireless communication device

Procedia PDF Downloads 128

5076 Creation and Validation of a Measurement Scale of E-Management: An Exploratory and Confirmatory Study

Authors: Hamadi Khlif

Abstract:

This paper deals with the understanding of the concept of e-management and the development of a measuring instrument adapted to the new problems encountered during the application of this new practice within the modern enterprise. Two principal e-management factors have been isolated in an exploratory study carried out among 260 participants. A confirmatory study applied to a second sample of 270 participants has been established in a cross-validation of the scale of measurement. The study presents the literature review specifically dedicated to e-management and the results of the exploratory and confirmatory phase of the development of this scale, which demonstrates satisfactory psychometric qualities. The e-management has two dimensions: a managerial dimension and a technological dimension.

Keywords: e-management, management, ICT deployment, mode of management

Procedia PDF Downloads 318

5075 Development of a Decision-Making Method by Using Machine Learning Algorithms in the Early Stage of School Building Design

Authors: Rajaian Hoonejani Mohammad, Eshraghi Pegah, Zomorodian Zahra Sadat, Tahsildoost Mohammad

Abstract:

Over the past decade, energy consumption in educational buildings has steadily increased. The purpose of this research is to provide a method to quickly predict the energy consumption of buildings using separate evaluation of zones and decomposing the building to eliminate the complexity of geometry at the early design stage. To produce this framework, machine learning algorithms such as Support vector regression (SVR) and Artificial neural network (ANN) are used to predict energy consumption and thermal comfort metrics in a school as a case. The database consists of more than 55000 samples in three climates of Iran. Cross-validation evaluation and unseen data have been used for validation. In a specific label, cooling energy, it can be said the accuracy of prediction is at least 84% and 89% in SVR and ANN, respectively. The results show that the SVR performed much better than the ANN.

Keywords: early stage of design, energy, thermal comfort, validation, machine learning

Procedia PDF Downloads 63

5074 Modeling and Validation of Microspheres Generation in the Modified T-Junction Device

Authors: Lei Lei, Hongbo Zhang, Donald J. Bergstrom, Bing Zhang, K. Y. Song, W. J. Zhang

Abstract:

This paper presents a model for a modified T-junction device for microspheres generation. The numerical model is developed using a commercial software package: COMSOL Multiphysics. In order to test the accuracy of the numerical model, multiple variables, such as the flow rate of cross-flow, fluid properties, structure, and geometry of the microdevice are applied. The results from the model are compared with the experimental results in the diameter of the microsphere generated. The comparison shows a good agreement. Therefore the model is useful in further optimization of the device and feedback control of microsphere generation if any.

Keywords: CFD modeling, validation, microsphere generation, modified T-junction

Procedia PDF Downloads 697

5073 Educational Data Mining: The Case of the Department of Mathematics and Computing in the Period 2009-2018

Authors: Mário Ernesto Sitoe, Orlando Zacarias

Abstract:

University education is influenced by several factors that range from the adoption of strategies to strengthen the whole process to the academic performance improvement of the students themselves. This work uses data mining techniques to develop a predictive model to identify students with a tendency to evasion and retention. To this end, a database of real students’ data from the Department of University Admission (DAU) and the Department of Mathematics and Informatics (DMI) was used. The data comprised 388 undergraduate students admitted in the years 2009 to 2014. The Weka tool was used for model building, using three different techniques, namely: K-nearest neighbor, random forest, and logistic regression. To allow for training on multiple train-test splits, a cross-validation approach was employed with a varying number of folds. To reduce bias variance and improve the performance of the models, ensemble methods of Bagging and Stacking were used. After comparing the results obtained by the three classifiers, Logistic Regression using Bagging with seven folds obtained the best performance, showing results above 90% in all evaluated metrics: accuracy, rate of true positives, and precision. Retention is the most common tendency.

Keywords: evasion and retention, cross-validation, bagging, stacking

Procedia PDF Downloads 77

5072 Cross-Validation of the Data Obtained for ω-6 Linoleic and ω-3 α-Linolenic Acids Concentration of Hemp Oil Using Jackknife and Bootstrap Resampling

Authors: Vibha Devi, Shabina Khanam

Abstract:

Hemp (Cannabis sativa) possesses a rich content of ω-6 linoleic and ω-3 linolenic essential fatty acid in the ratio of 3:1, which is a rare and most desired ratio that enhances the quality of hemp oil. These components are beneficial for the development of cell and body growth, strengthen the immune system, possess anti-inflammatory action, lowering the risk of heart problem owing to its anti-clotting property and a remedy for arthritis and various disorders. The present study employs supercritical fluid extraction (SFE) approach on hemp seed at various conditions of parameters; temperature (40 - 80) °C, pressure (200 - 350) bar, flow rate (5 - 15) g/min, particle size (0.430 - 1.015) mm and amount of co-solvent (0 - 10) % of solvent flow rate through central composite design (CCD). CCD suggested 32 sets of experiments, which was carried out. As SFE process includes large number of variables, the present study recommends the application of resampling techniques for cross-validation of the obtained data. Cross-validation refits the model on each data to achieve the information regarding the error, variability, deviation etc. Bootstrap and jackknife are the most popular resampling techniques, which create a large number of data through resampling from the original dataset and analyze these data to check the validity of the obtained data. Jackknife resampling is based on the eliminating one observation from the original sample of size N without replacement. For jackknife resampling, the sample size is 31 (eliminating one observation), which is repeated by 32 times. Bootstrap is the frequently used statistical approach for estimating the sampling distribution of an estimator by resampling with replacement from the original sample. For bootstrap resampling, the sample size is 32, which was repeated by 100 times. Estimands for these resampling techniques are considered as mean, standard deviation, variation coefficient and standard error of the mean. For ω-6 linoleic acid concentration, mean value was approx. 58.5 for both resampling methods, which is the average (central value) of the sample mean of all data points. Similarly, for ω-3 linoleic acid concentration, mean was observed as 22.5 through both resampling. Variance exhibits the spread out of the data from its mean. Greater value of variance exhibits the large range of output data, which is 18 for ω-6 linoleic acid (ranging from 48.85 to 63.66 %) and 6 for ω-3 linoleic acid (ranging from 16.71 to 26.2 %). Further, low value of standard deviation (approx. 1 %), low standard error of the mean (< 0.8) and low variance coefficient (< 0.2) reflect the accuracy of the sample for prediction. All the estimator value of variance coefficients, standard deviation and standard error of the mean are found within the 95 % of confidence interval.

Keywords: resampling, supercritical fluid extraction, hemp oil, cross-validation

Procedia PDF Downloads 136

5071 Predictive Analytics of Student Performance Determinants

Authors: Mahtab Davari, Charles Edward Okon, Somayeh Aghanavesi

Abstract:

Every institute of learning is usually interested in the performance of enrolled students. The level of these performances determines the approach an institute of study may adopt in rendering academic services. The focus of this paper is to evaluate students' academic performance in given courses of study using machine learning methods. This study evaluated various supervised machine learning classification algorithms such as Logistic Regression (LR), Support Vector Machine, Random Forest, Decision Tree, K-Nearest Neighbors, Linear Discriminant Analysis, and Quadratic Discriminant Analysis, using selected features to predict study performance. The accuracy, precision, recall, and F1 score obtained from a 5-Fold Cross-Validation were used to determine the best classification algorithm to predict students’ performances. SVM (using a linear kernel), LDA, and LR were identified as the best-performing machine learning methods. Also, using the LR model, this study identified students' educational habits such as reading and paying attention in class as strong determinants for a student to have an above-average performance. Other important features include the academic history of the student and work. Demographic factors such as age, gender, high school graduation, etc., had no significant effect on a student's performance.

Keywords: student performance, supervised machine learning, classification, cross-validation, prediction

Procedia PDF Downloads 119

5070 Comparison of Existing Predictor and Development of Computational Method for S- Palmitoylation Site Identification in Arabidopsis Thaliana

Authors: Ayesha Sanjana Kawser Parsha

Abstract:

S-acylation is an irreversible bond in which cysteine residues are linked to fatty acids palmitate (74%) or stearate (22%), either at the COOH or NH2 terminal, via a thioester linkage. There are several experimental methods that can be used to identify the S-palmitoylation site; however, since they require a lot of time, computational methods are becoming increasingly necessary. There aren't many predictors, however, that can locate S- palmitoylation sites in Arabidopsis Thaliana with sufficient accuracy. This research is based on the importance of building a better prediction tool. To identify the type of machine learning algorithm that predicts this site more accurately for the experimental dataset, several prediction tools were examined in this research, including the GPS PALM 6.0, pCysMod, GPS LIPID 1.0, CSS PALM 4.0, and NBA PALM. These analyses were conducted by constructing the receiver operating characteristics plot and the area under the curve score. An AI-driven deep learning-based prediction tool has been developed utilizing the analysis and three sequence-based input data, such as the amino acid composition, binary encoding profile, and autocorrelation features. The model was developed using five layers, two activation functions, associated parameters, and hyperparameters. The model was built using various combinations of features, and after training and validation, it performed better when all the features were present while using the experimental dataset for 8 and 10-fold cross-validations. While testing the model with unseen and new data, such as the GPS PALM 6.0 plant and pCysMod mouse, the model performed better, and the area under the curve score was near 1. It can be demonstrated that this model outperforms the prior tools in predicting the S- palmitoylation site in the experimental data set by comparing the area under curve score of 10-fold cross-validation of the new model with the established tools' area under curve score with their respective training sets. The objective of this study is to develop a prediction tool for Arabidopsis Thaliana that is more accurate than current tools, as measured by the area under the curve score. Plant food production and immunological treatment targets can both be managed by utilizing this method to forecast S- palmitoylation sites.

Keywords: S- palmitoylation, ROC PLOT, area under the curve, cross- validation score

Procedia PDF Downloads 68

5069 Radar Cross Section Modelling of Lossy Dielectrics

Authors: Ciara Pienaar, J. W. Odendaal, J. Joubert, J. C. Smit

Abstract:

Radar cross section (RCS) of dielectric objects play an important role in many applications, such as low observability technology development, drone detection, and monitoring as well as coastal surveillance. Various materials are used to construct the targets of interest such as metal, wood, composite materials, radar absorbent materials, and other dielectrics. Since simulated datasets are increasingly being used to supplement infield measurements, as it is more cost effective and a larger variety of targets can be simulated, it is important to have a high level of confidence in the predicted results. Confidence can be attained through validation. Various computational electromagnetic (CEM) methods are capable of predicting the RCS of dielectric targets. This study will extend previous studies by validating full-wave and asymptotic RCS simulations of dielectric targets with measured data. The paper will provide measured RCS data of a number of canonical dielectric targets exhibiting different material properties. As stated previously, these measurements are used to validate numerous CEM methods. The dielectric properties are accurately characterized to reduce the uncertainties in the simulations. Finally, an analysis of the sensitivity of oblique and normal incidence scattering predictions to material characteristics is also presented. In this paper, the ability of several CEM methods, including method of moments (MoM), and physical optics (PO), to calculate the RCS of dielectrics were validated with measured data. A few dielectrics, exhibiting different material properties, were selected and several canonical targets, such as flat plates and cylinders, were manufactured. The RCS of these dielectric targets were measured in a compact range at the University of Pretoria, South Africa, over a frequency range of 2 to 18 GHz and a 360° azimuth angle sweep. This study also investigated the effect of slight variations in the material properties on the calculated RCS results, by varying the material properties within a realistic tolerance range and comparing the calculated RCS results. Interesting measured and simulated results have been obtained. Large discrepancies were observed between the different methods as well as the measured data. It was also observed that the accuracy of the RCS data of the dielectrics can be frequency and angle dependent. The simulated RCS for some of these materials also exhibit high sensitivity to variations in the material properties. Comparison graphs between the measured and simulation RCS datasets will be presented and the validation thereof will be discussed. Finally, the effect that small tolerances in the material properties have on the calculated RCS results will be shown. Thus the importance of accurate dielectric material properties for validation purposes will be discussed.

Keywords: asymptotic, CEM, dielectric scattering, full-wave, measurements, radar cross section, validation

Procedia PDF Downloads 230

5068 Multi-Level Air Quality Classification in China Using Information Gain and Support Vector Machine

Authors: Bingchun Liu, Pei-Chann Chang, Natasha Huang, Dun Li

Abstract:

Machine Learning and Data Mining are the two important tools for extracting useful information and knowledge from large datasets. In machine learning, classification is a wildly used technique to predict qualitative variables and is generally preferred over regression from an operational point of view. Due to the enormous increase in air pollution in various countries especially China, Air Quality Classification has become one of the most important topics in air quality research and modelling. This study aims at introducing a hybrid classification model based on information theory and Support Vector Machine (SVM) using the air quality data of four cities in China namely Beijing, Guangzhou, Shanghai and Tianjin from Jan 1, 2014 to April 30, 2016. China's Ministry of Environmental Protection has classified the daily air quality into 6 levels namely Serious Pollution, Severe Pollution, Moderate Pollution, Light Pollution, Good and Excellent based on their respective Air Quality Index (AQI) values. Using the information theory, information gain (IG) is calculated and feature selection is done for both categorical features and continuous numeric features. Then SVM Machine Learning algorithm is implemented on the selected features with cross-validation. The final evaluation reveals that the IG and SVM hybrid model performs better than SVM (alone), Artificial Neural Network (ANN) and K-Nearest Neighbours (KNN) models in terms of accuracy as well as complexity.

Keywords: machine learning, air quality classification, air quality index, information gain, support vector machine, cross-validation

Procedia PDF Downloads 231

5067 Assessment of Pre-Processing Influence on Near-Infrared Spectra for Predicting the Mechanical Properties of Wood

Authors: Aasheesh Raturi, Vimal Kothiyal, P. D. Semalty

Abstract:

We studied mechanical properties of Eucalyptus tereticornis using FT-NIR spectroscopy. Firstly, spectra were pre-processed to eliminate useless information. Then, prediction model was constructed by partial least squares regression. To study the influence of pre-processing on prediction of mechanical properties for NIR analysis of wood samples, we applied various pretreatment methods like straight line subtraction, constant offset elimination, vector-normalization, min-max normalization, multiple scattering. Correction, first derivative, second derivatives and their combination with other treatment such as First derivative + straight line subtraction, First derivative+ vector normalization and First derivative+ multiplicative scattering correction. The data processing methods in combination of preprocessing with different NIR regions, RMSECV, RMSEP and optimum factors/rank were obtained by optimization process of model development. More than 350 combinations were obtained during optimization process. More than one pre-processing method gave good calibration/cross-validation and prediction/test models, but only the best calibration/cross-validation and prediction/test models are reported here. The results show that one can safely use NIR region between 4000 to 7500 cm-1 with straight line subtraction, constant offset elimination, first derivative and second derivative preprocessing method which were found to be most appropriate for models development.

Keywords: FT-NIR, mechanical properties, pre-processing, PLS

Procedia PDF Downloads 347

5066 Preliminary Study of Standardization and Validation of Micronuclei Technique to Assess the DNA Damages Cause for the X-Rays

Authors: L. J. Díaz, M. A. Hernández, A. K. Molina, A. Bermúdez, C. Crane, V. M. Pabón

Abstract:

One of the most important biological indicators that show the exposure to the radiation is the micronuclei (MN). This technique is using to determinate the radiation effects in blood cultures as a biological control and a complement to the physics dosimetry. In Colombia the necessity to apply this analysis has emerged due to the current biological indicator most used is the chromosomal aberrations (CA), that is why it is essential the MN technique’s standardization and validation to have enough tools to improve the radioprotection topic in the country. Besides, this technique will be applied on the construction of a dose-response curve, that allow measure an approximately dose to irradiated people according to MN frequency found. Inside the steps that carried out to accomplish the standardization and validation is the statistic analysis from the lectures of “in vitro” peripheral blood cultures with different analysts, also it was determinate the best culture medium and conditions for the MN can be detected easily.

Keywords: micronuclei, radioprotection, standardization, validation

Procedia PDF Downloads 489

5065 Nanofluid Flow Heat Transfer Through Ducts with Different Cross-Sections

Authors: Amir Dehshiri, Mohammad Reza Salimpour

Abstract:

In the present article, we investigate experimental laminar forced convective heat transfer specifications of TiO2/water nanofluids through conduits with different cross sections. We check the effects of different parameters such as cross-sectional shape, Reynolds number and concentration of nanoparticles in stable suspension on increasing convective heat transfer by designing and assembling of an experimental apparatus. The results demonstrate adding a little amount of nanoparticles to the base fluid, improves heat transfer behavior in conduits. Moreover, conduit with circular cross-section has better performance compared to the square and triangular cross sections. However, conduits with square and triangular cross sections have more relative heat transfer enhancement than conduit with circular cross section.

Keywords: nanofluid, cross-sectional shape, TiO2, convection

Procedia PDF Downloads 443

5064 Estimation of the Acute Toxicity of Halogenated Phenols Using Quantum Chemistry Descriptors

Authors: Khadidja Bellifa, Sidi Mohamed Mekelleche

Abstract:

Phenols and especially halogenated phenols represent a substantial part of the chemicals produced worldwide and are known as aquatic pollutants. Quantitative structure–toxicity relationship (QSTR) models are useful for understanding how chemical structure relates to the toxicity of chemicals. In the present study, the acute toxicities of 45 halogenated phenols to Tetrahymena Pyriformis are estimated using no cost semi-empirical quantum chemistry methods. QSTR models were established using the multiple linear regression technique and the predictive ability of the models was evaluated by the internal cross-validation, the Y-randomization and the external validation. Their structural chemical domain has been defined by the leverage approach. The results show that the best model is obtained with the AM1 method (R²= 0.91, R²CV= 0.90, SD= 0.20 for the training set and R²= 0.96, SD= 0.11 for the test set). Moreover, all the Tropsha’ criteria for a predictive QSTR model are verified.

Keywords: halogenated phenols, toxicity mechanism, hydrophobicity, electrophilicity index, quantitative stucture-toxicity relationships

Procedia PDF Downloads 293

5063 New Effect of Duct Cross Sectional Shape on the Nanofluid Flow Heat Transfer

Authors: Mohammad R. Salimpour, Amir Dehshiri

Abstract:

In the present article, we investigate experimental laminar forced convective heat transfer specifications of TiO2/water nanofluids through conduits with different cross sections. we check the effects of different parameters such as cross sectional shape, Reynolds number and concentration of nanoparticles in stable suspension on increasing convective heat transfer by designing and assembling of an experimental apparatus. The results demonstrate adding a little amount of nanoparticles to the base fluid, improves heat transfer behavior in conduits. Moreover, conduit with circular cross-section has better performance compared to the square and triangular cross sections. However, conduits with square and triangular cross sections have more relative heat transfer enchantment than conduit with circular cross section.

Keywords: nano fluid, cross-sectional shape, TiO2, convection

Procedia PDF Downloads 516

5062 Analysis of Expression Data Using Unsupervised Techniques

Authors: M. A. I Perera, C. R. Wijesinghe, A. R. Weerasinghe

Abstract:

his study was conducted to review and identify the unsupervised techniques that can be employed to analyze gene expression data in order to identify better subtypes of tumors. Identifying subtypes of cancer help in improving the efficacy and reducing the toxicity of the treatments by identifying clues to find target therapeutics. Process of gene expression data analysis described under three steps as preprocessing, clustering, and cluster validation. Feature selection is important since the genomic data are high dimensional with a large number of features compared to samples. Hierarchical clustering and K Means are often used in the analysis of gene expression data. There are several cluster validation techniques used in validating the clusters. Heatmaps are an effective external validation method that allows comparing the identified classes with clinical variables and visual analysis of the classes.

Keywords: cancer subtypes, gene expression data analysis, clustering, cluster validation

Procedia PDF Downloads 145

5061 Effect of Channel Cross Section Shape on Convective Heat Transfer Coefficient of Nanofluid Flow

Authors: Mohammad Reza Salimpour, Amir Dehshiri

Abstract:

In the present article, we investigate experimental laminar forced convective heat transfer specifications of TiO2/water nanofluids through conduits with different cross sections. We check the effects of different parameters such as cross sectional shape, Reynolds number and concentration of nanoparticles in stable suspension on increasing convective heat transfer by designing and assembling of an experimental apparatus. The results demonstrate adding a little amount of nanoparticles to the base fluid improves heat transfer behavior in conduits. Moreover, conduit with circular cross-section has better performance compared to the square and triangular cross sections. However, conduits with square and triangular cross sections have more relative heat transfer enhancement than conduit with circular cross section.

Keywords: nanofluid, cross-sectional shape, TiO2, convection

Procedia PDF Downloads 447

5060 Performance and Limitations of Likelihood Based Information Criteria and Leave-One-Out Cross-Validation Approximation Methods

Authors: M. A. C. S. Sampath Fernando, James M. Curran, Renate Meyer

Abstract:

Model assessment, in the Bayesian context, involves evaluation of the goodness-of-fit and the comparison of several alternative candidate models for predictive accuracy and improvements. In posterior predictive checks, the data simulated under the fitted model is compared with the actual data. Predictive model accuracy is estimated using information criteria such as the Akaike information criterion (AIC), the Bayesian information criterion (BIC), the Deviance information criterion (DIC), and the Watanabe-Akaike information criterion (WAIC). The goal of an information criterion is to obtain an unbiased measure of out-of-sample prediction error. Since posterior checks use the data twice; once for model estimation and once for testing, a bias correction which penalises the model complexity is incorporated in these criteria. Cross-validation (CV) is another method used for examining out-of-sample prediction accuracy. Leave-one-out cross-validation (LOO-CV) is the most computationally expensive variant among the other CV methods, as it fits as many models as the number of observations. Importance sampling (IS), truncated importance sampling (TIS) and Pareto-smoothed importance sampling (PSIS) are generally used as approximations to the exact LOO-CV and utilise the existing MCMC results avoiding expensive computational issues. The reciprocals of the predictive densities calculated over posterior draws for each observation are treated as the raw importance weights. These are in turn used to calculate the approximate LOO-CV of the observation as a weighted average of posterior densities. In IS-LOO, the raw weights are directly used. In contrast, the larger weights are replaced by their modified truncated weights in calculating TIS-LOO and PSIS-LOO. Although, information criteria and LOO-CV are unable to reflect the goodness-of-fit in absolute sense, the differences can be used to measure the relative performance of the models of interest. However, the use of these measures is only valid under specific circumstances. This study has developed 11 models using normal, log-normal, gamma, and student’s t distributions to improve the PCR stutter prediction with forensic data. These models are comprised of four with profile-wide variances, four with locus specific variances, and three which are two-component mixture models. The mean stutter ratio in each model is modeled as a locus specific simple linear regression against a feature of the alleles under study known as the longest uninterrupted sequence (LUS). The use of AIC, BIC, DIC, and WAIC in model comparison has some practical limitations. Even though, IS-LOO, TIS-LOO, and PSIS-LOO are considered to be approximations of the exact LOO-CV, the study observed some drastic deviations in the results. However, there are some interesting relationships among the logarithms of pointwise predictive densities (lppd) calculated under WAIC and the LOO approximation methods. The estimated overall lppd is a relative measure that reflects the overall goodness-of-fit of the model. Parallel log-likelihood profiles for the models conditional on equal posterior variances in lppds were observed. This study illustrates the limitations of the information criteria in practical model comparison problems. In addition, the relationships among LOO-CV approximation methods and WAIC with their limitations are discussed. Finally, useful recommendations that may help in practical model comparisons with these methods are provided.

Keywords: cross-validation, importance sampling, information criteria, predictive accuracy

Procedia PDF Downloads 386

5059 Thermal Performance Investigation on Cross V-Shape Solar Air Collectors

Authors: Xi Luo, Xu Ji, Yunfeng Wang, Guoliang Li, Chongqiang Yan, Ming Li

Abstract:

Two different kinds of cross V-shape solar air collectors are designed and constructed. In the transverse cross V-shape collector, the V-shape bottom plate is along the air flow direction and the absorbing plate is perpendicular to the air flow direction. In the lengthway cross V-shape collector, the V-shape absorbing plate is along the air flow direction and the bottom plate is perpendicular to the air flow direction. Based on heat balance, the mathematical model is built to evaluate their performances. These thermal performances of the two cross V-shape solar air collectors and an extra traditional flat-plate solar air collector are characterized under various operating conditions by experiments. The experimental results agree well with the calculation values. The experimental results prove that the thermal efficiency of transverse cross V-shape collector precedes that of others. The air temperature at any point along the flow direction of the transverse cross V-shape collector is higher than that of the lengthway cross V-shape collector. For the transverse cross V-shape collector, the most effective length of flow channel is 0.9m. For the lengthway cross V-shape collector, a longer flow channel is necessary to achieve a good thermal performance.

Keywords: cross v-shape, performance, solar air collector, thermal efficiency

Procedia PDF Downloads 305

5058 Cross Boader Marriages in 3rd World Countries (Economical Perspective)

Authors: Shagufta Jahangir, Raisa Jahangir

Abstract:

According to researches the 3rd world youth crave to go to developed countries just merely to get sustainable economic situation. To accomplish their wish they use each and every thing like cross boarder marriages is one of them. The basic and main point of cross boarder marriages is financial sustainability neither cross boarder culture nor cross boarder religion or others. The consequences of this research are that 60% to 70% men of 3rd world do cross boarder marriages just for only economic firmness. Due to this thoughts these men flipside to their native areas with only economic firmness rather social attitudes, moral attitudes behaviors, norms, myths and religion.40% to 50 % men do cross boarder marriages to get firmness even they have families in their native areas.2nd family formation is the easy way to get their desired, according to their eyes. After satisfying their needs they back unaccompanied to their native areas even they leave their offspring. They give precedence to their inhabitant families. This study has been design to find out that economic perspective is the basic phenomena of cross boarder marriages in the 3rd world countries men.

Keywords: cross boarder marriages, moral attitudes, native areas, flipside, norms

Procedia PDF Downloads 297

5057 Nonparametric Truncated Spline Regression Model on the Data of Human Development Index in Indonesia

Authors: Kornelius Ronald Demu, Dewi Retno Sari Saputro, Purnami Widyaningsih

Abstract:

Human Development Index (HDI) is a standard measurement for a country's human development. Several factors may have influenced it, such as life expectancy, gross domestic product (GDP) based on the province's annual expenditure, the number of poor people, and the percentage of an illiterate people. The scatter plot between HDI and the influenced factors show that the plot does not follow a specific pattern or form. Therefore, the HDI's data in Indonesia can be applied with a nonparametric regression model. The estimation of the regression curve in the nonparametric regression model is flexible because it follows the shape of the data pattern. One of the nonparametric regression's method is a truncated spline. Truncated spline regression is one of the nonparametric approach, which is a modification of the segmented polynomial functions. The estimator of a truncated spline regression model was affected by the selection of the optimal knots point. Knot points is a focus point of spline truncated functions. The optimal knots point was determined by the minimum value of generalized cross validation (GCV). In this article were applied the data of Human Development Index with a truncated spline nonparametric regression model. The results of this research were obtained the best-truncated spline regression model to the HDI's data in Indonesia with the combination of optimal knots point 5-5-5-4. Life expectancy and the percentage of an illiterate people were the significant factors depend to the HDI in Indonesia. The coefficient of determination is 94.54%. This means the regression model is good enough to applied on the data of HDI in Indonesia.

Keywords: generalized cross validation (GCV), Human Development Index (HDI), knots point, nonparametric regression, truncated spline

Procedia PDF Downloads 329