Search results for: genomic prediction
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2487

Search results for: genomic prediction

2277 Towards the Prediction of Aesthetic Requirements for Women’s Apparel Product

Authors: Yu Zhao, Min Zhang, Yuanqian Wang, Qiuyu Yu

Abstract:

The prediction of aesthetics of apparel is helpful for the development of a new type of apparel. This study is to build the quantitative relationship between the aesthetics and its design parameters. In particular, women’s pants have been preliminarily studied. This aforementioned relationship has been carried out by statistical analysis. The contributions of this study include the development of a more personalized apparel design mechanism and the provision of some empirical knowledge for the development of other products in the aspect of aesthetics.

Keywords: aesthetics, crease line, cropped straight leg pants, knee width

Procedia PDF Downloads 186
2276 Enhancing Patch Time Series Transformer with Wavelet Transform for Improved Stock Prediction

Authors: Cheng-yu Hsieh, Bo Zhang, Ahmed Hambaba

Abstract:

Stock market prediction has long been an area of interest for both expert analysts and investors, driven by its complexity and the noisy, volatile conditions it operates under. This research examines the efficacy of combining the Patch Time Series Transformer (PatchTST) with wavelet transforms, specifically focusing on Haar and Daubechies wavelets, in forecasting the adjusted closing price of the S&P 500 index for the following day. By comparing the performance of the augmented PatchTST models with traditional predictive models such as Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, and Transformers, this study highlights significant enhancements in prediction accuracy. The integration of the Daubechies wavelet with PatchTST notably excels, surpassing other configurations and conventional models in terms of Mean Absolute Error (MAE) and Mean Squared Error (MSE). The success of the PatchTST model paired with Daubechies wavelet is attributed to its superior capability in extracting detailed signal information and eliminating irrelevant noise, thus proving to be an effective approach for financial time series forecasting.

Keywords: deep learning, financial forecasting, stock market prediction, patch time series transformer, wavelet transform

Procedia PDF Downloads 50
2275 Network Analysis and Sex Prediction based on a full Human Brain Connectome

Authors: Oleg Vlasovets, Fabian Schaipp, Christian L. Mueller

Abstract:

we conduct a network analysis and predict the sex of 1000 participants based on ”connectome” - pairwise Pearson’s correlation across 436 brain parcels. We solve the non-smooth convex optimization problem, known under the name of Graphical Lasso, where the solution includes a low-rank component. With this solution and machine learning model for a sex prediction, we explain the brain parcels-sex connectivity patterns.

Keywords: network analysis, neuroscience, machine learning, optimization

Procedia PDF Downloads 147
2274 Stacking Ensemble Approach for Combining Different Methods in Real Estate Prediction

Authors: Sol Girouard, Zona Kostic

Abstract:

A home is often the largest and most expensive purchase a person makes. Whether the decision leads to a successful outcome will be determined by a combination of critical factors. In this paper, we propose a method that efficiently handles all the factors in residential real estate and performs predictions given a feature space with high dimensionality while controlling for overfitting. The proposed method was built on gradient descent and boosting algorithms and uses a mixed optimizing technique to improve the prediction power. Usually, a single model cannot handle all the cases thus our approach builds multiple models based on different subsets of the predictors. The algorithm was tested on 3 million homes across the U.S., and the experimental results demonstrate the efficiency of this approach by outperforming techniques currently used in forecasting prices. With everyday changes on the real estate market, our proposed algorithm capitalizes from new events allowing more efficient predictions.

Keywords: real estate prediction, gradient descent, boosting, ensemble methods, active learning, training

Procedia PDF Downloads 277
2273 An Improved Heat Transfer Prediction Model for Film Condensation inside a Tube with Interphacial Shear Effect

Authors: V. G. Rifert, V. V. Gorin, V. V. Sereda, V. V. Treputnev

Abstract:

The analysis of heat transfer design methods in condensing inside plain tubes under existing influence of shear stress is presented in this paper. The existing discrepancy in more than 30-50% between rating heat transfer coefficients and experimental data has been noted. The analysis of existing theoretical and semi-empirical methods of heat transfer prediction is given. The influence of a precise definition concerning boundaries of phase flow (it is especially important in condensing inside horizontal tubes), shear stress (friction coefficient) and heat flux on design of heat transfer is shown. The substantiation of boundary conditions of the values of parameters, influencing accuracy of rated relationships, is given. More correct relationships for heat transfer prediction, which showed good convergence with experiments made by different authors, are substantiated in this work.

Keywords: film condensation, heat transfer, plain tube, shear stress

Procedia PDF Downloads 245
2272 A Hybrid Model Tree and Logistic Regression Model for Prediction of Soil Shear Strength in Clay

Authors: Ehsan Mehryaar, Seyed Armin Motahari Tabari

Abstract:

Without a doubt, soil shear strength is the most important property of the soil. The majority of fatal and catastrophic geological accidents are related to shear strength failure of the soil. Therefore, its prediction is a matter of high importance. However, acquiring the shear strength is usually a cumbersome task that might need complicated laboratory testing. Therefore, prediction of it based on common and easy to get soil properties can simplify the projects substantially. In this paper, A hybrid model based on the classification and regression tree algorithm and logistic regression is proposed where each leaf of the tree is an independent regression model. A database of 189 points for clay soil, including Moisture content, liquid limit, plastic limit, clay content, and shear strength, is collected. The performance of the developed model compared to the existing models and equations using root mean squared error and coefficient of correlation.

Keywords: model tree, CART, logistic regression, soil shear strength

Procedia PDF Downloads 197
2271 Ultimate Strength Prediction of Shear Walls with an Aspect Ratio between One and Two

Authors: Said Boukais, Ali Kezmane, Kahil Amar, Mohand Hamizi, Hannachi Neceur Eddine

Abstract:

This paper presents an analytical study on the behavior of rectangular reinforced concrete walls with an aspect ratio between one and tow. Several experiments on such walls have been selected to be studied. Database from various experiments were collected and nominal wall strengths have been calculated using formulas, such as those of the ACI (American), NZS (New Zealand), Mexican (NTCC), and Wood equation for shear and strain compatibility analysis for flexure. Subsequently, nominal ultimate wall strengths from the formulas were compared with the ultimate wall strengths from the database. These formulas vary substantially in functional form and do not account for all variables that affect the response of walls. There is substantial scatter in the predicted values of ultimate strength. New semi empirical equation are developed using data from tests of 46 walls with the objective of improving the prediction of ultimate strength of walls with the most possible accuracy and for all failure modes.

Keywords: prediction, ultimate strength, reinforced concrete walls, walls, rectangular walls

Procedia PDF Downloads 337
2270 Genomic Surveillance of Bacillus Anthracis in South Africa Revealed a Unique Genetic Cluster of B- Clade Strains

Authors: Kgaugelo Lekota, Ayesha Hassim, Henriette Van Heerden

Abstract:

Bacillus anthracis is the causative agent of anthrax that is composed of three genetic groups, namely A, B, and C. Clade-A is distributed world-wide, while sub-clades B has been identified in Kruger National Park (KNP), South Africa. KNP is one of the endemic anthrax regions in South Africa with distinctive genetic diversity. Genomic surveillance of KNP B. anthracis strains was employed on the historical culture collection isolates (n=67) dated from the 1990’s to 2015 using a whole genome sequencing approach. Whole genome single nucleotide polymorphism (SNPs) and pan-genomics analysis were used to define the B. anthracis genetic population structure. This study showed that KNP has heterologous B. anthracis strains grouping in the A-clade with more prominent ABr.005/006 (Ancient A) SNP lineage. The 2012 and 2015 anthrax isolates are dispersed amongst minor sub-clades that prevail in non-stabilized genetic evolution strains. This was augmented with non-parsimony informative SNPs of the B. anthracis strains across minor sub-clades of the Ancient A clade. Pan-genomics of B. anthracis showed a clear distinction between A and B-clade genomes with 11 374 predicted clusters of protein coding genes. Unique accessory genes of B-clade genomes that included biosynthetic cell wall genes and multidrug resistant of Fosfomycin. South Africa consists of diverse B. anthracis strains with unique defined SNPs. The sequenced B. anthracis strains in this study will serve as a means to further trace the dissemination of B. anthracis outbreaks globally and especially in South Africa.

Keywords: bacillus anthracis, whole genome single nucleotide polymorphisms, pangenomics, kruger national park

Procedia PDF Downloads 150
2269 Epilepsy Seizure Prediction by Effective Connectivity Estimation Using Granger Causality and Directed Transfer Function Analysis of Multi-Channel Electroencephalogram

Authors: Mona Hejazi, Ali Motie Nasrabadi

Abstract:

Epilepsy is a persistent neurological disorder that affects more than 50 million people worldwide. Hence, there is a necessity to introduce an efficient prediction model for making a correct diagnosis of the epileptic seizure and accurate prediction of its type. In this study we consider how the Effective Connectivity (EC) patterns obtained from intracranial Electroencephalographic (EEG) recordings reveal information about the dynamics of the epileptic brain and can be used to predict imminent seizures, as this will enable the patients (and caregivers) to take appropriate precautions. We use this definition because we believe that effective connectivity near seizures begin to change, so we can predict seizures according to this feature. Results are reported on the standard Freiburg EEG dataset which contains data from 21 patients suffering from medically intractable focal epilepsy. Six channels of EEG from each patients are considered and effective connectivity using Directed Transfer Function (DTF) and Granger Causality (GC) methods is estimated. We concentrate on effective connectivity standard deviation over time and feature changes in five brain frequency sub-bands (Alpha, Beta, Theta, Delta, and Gamma) are compared. The performance obtained for the proposed scheme in predicting seizures is: average prediction time is 50 minutes before seizure onset, the maximum sensitivity is approximate ~80% and the false positive rate is 0.33 FP/h. DTF method is more acceptable to predict epileptic seizures and generally we can observe that the greater results are in gamma and beta sub-bands. The research of this paper is significantly helpful for clinical applications, especially for the exploitation of online portable devices.

Keywords: effective connectivity, Granger causality, directed transfer function, epilepsy seizure prediction, EEG

Procedia PDF Downloads 469
2268 Cytogenetic Characterization of the VERO Cell Line Based on Comparisons with the Subline; Implication for Authorization and Quality Control of Animal Cell Lines

Authors: Fumio Kasai, Noriko Hirayama, Jorge Pereira, Azusa Ohtani, Masashi Iemura, Malcolm A. Ferguson Smith, Arihiro Kohara

Abstract:

The VERO cell line was established in 1962 from normal tissue of an African green monkey, Chlorocebus aethiops (2n=60), and has been commonly used worldwide for screening for toxins or as a cell substrate for the production of viral vaccines. The VERO genome was sequenced in 2014; however, its cytogenetic features have not been fully characterized as it contains several chromosome abnormalities and different karyotypes coexist in the cell line. In this study, the VERO cell line (JCRB0111) was compared with one of the sublines. In contrast to 59 chromosomes as the modal chromosome number in the VERO cell line, the subline had two peaks of 56 and 58 chromosomes. M-FISH analysis using human probes revealed that the VERO cell line was characterized by a translocation t(2;25) found in all metaphases, which was absent in the subline. Different abnormalities detected only in the subline show that the cell line is heterogeneous, indicating that the subline has the potential to change its genomic characteristics during cell culture. The various alterations in the two independent lineages suggest that genomic changes in both VERO cells can be accounted for by progressive rearrangements during their evolution in culture. Both t(5;X) and t(8;14) observed in all metaphases of the two cell lines might have a key role in VERO cells and could be used as genetic markers to identify VERO cells. The flow karyotype shows distinct differences from normal. Further analysis of sorted abnormal chromosomes may uncover other characteristics of VERO cells. Because of the absence of STR data, cytogenetic data are important in characterizing animal cell lines and can be an indicator of their quality control.

Keywords: VERO, cell culture passage, chromosome rearrangement, heterogeneous cells

Procedia PDF Downloads 416
2267 Multi-Agent Searching Adaptation Using Levy Flight and Inferential Reasoning

Authors: Sagir M. Yusuf, Chris Baber

Abstract:

In this paper, we describe how to achieve knowledge understanding and prediction (Situation Awareness (SA)) for multiple-agents conducting searching activity using Bayesian inferential reasoning and learning. Bayesian Belief Network was used to monitor agents' knowledge about their environment, and cases are recorded for the network training using expectation-maximisation or gradient descent algorithm. The well trained network will be used for decision making and environmental situation prediction. Forest fire searching by multiple UAVs was the use case. UAVs are tasked to explore a forest and find a fire for urgent actions by the fire wardens. The paper focused on two problems: (i) effective agents’ path planning strategy and (ii) knowledge understanding and prediction (SA). The path planning problem by inspiring animal mode of foraging using Lévy distribution augmented with Bayesian reasoning was fully described in this paper. Results proof that the Lévy flight strategy performs better than the previous fixed-pattern (e.g., parallel sweeps) approaches in terms of energy and time utilisation. We also introduced a waypoint assessment strategy called k-previous waypoints assessment. It improves the performance of the ordinary levy flight by saving agent’s resources and mission time through redundant search avoidance. The agents (UAVs) are to report their mission knowledge at the central server for interpretation and prediction purposes. Bayesian reasoning and learning were used for the SA and results proof effectiveness in different environments scenario in terms of prediction and effective knowledge representation. The prediction accuracy was measured using learning error rate, logarithm loss, and Brier score and the result proves that little agents mission that can be used for prediction within the same or different environment. Finally, we described a situation-based knowledge visualization and prediction technique for heterogeneous multi-UAV mission. While this paper proves linkage of Bayesian reasoning and learning with SA and effective searching strategy, future works is focusing on simplifying the architecture.

Keywords: Levy flight, distributed constraint optimization problem, multi-agent system, multi-robot coordination, autonomous system, swarm intelligence

Procedia PDF Downloads 144
2266 A Comparison between Artificial Neural Network Prediction Models for Coronal Hole Related High Speed Streams

Authors: Rehab Abdulmajed, Amr Hamada, Ahmed Elsaid, Hisashi Hayakawa, Ayman Mahrous

Abstract:

Solar emissions have a high impact on the Earth’s magnetic field, and the prediction of solar events is of high interest. Various techniques have been used in the prediction of solar wind using mathematical models, MHD models, and neural network (NN) models. This study investigates the coronal hole (CH) derived high-speed streams (HSSs) and their correlation to the CH area and create a neural network model to predict the HSSs. Two different algorithms were used to compare different models to find a model that best simulates the HSSs. A dataset of CH synoptic maps through Carrington rotations 1601 to 2185 along with Omni-data set solar wind speed averaged over the Carrington rotations is used, which covers Solar cycles (sc) 21, 22, 23, and most of 24.

Keywords: artificial neural network, coronal hole area, feed-forward neural network models, solar high speed streams

Procedia PDF Downloads 88
2265 The Combination of the Mel Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP), JITTER and SHIMMER Coefficients for the Improvement of Automatic Recognition System for Dysarthric Speech

Authors: Brahim-Fares Zaidi, Malika Boudraa, Sid-Ahmed Selouani

Abstract:

Our work aims to improve our Automatic Recognition System for Dysarthria Speech (ARSDS) based on the Hidden Models of Markov (HMM) and the Hidden Markov Model Toolkit (HTK) to help people who are sick. With pronunciation problems, we applied two techniques of speech parameterization based on Mel Frequency Cepstral Coefficients (MFCC's) and Perceptual Linear Prediction (PLP's) and concatenated them with JITTER and SHIMMER coefficients in order to increase the recognition rate of a dysarthria speech. For our tests, we used the NEMOURS database that represents speakers with dysarthria and normal speakers.

Keywords: hidden Markov model toolkit (HTK), hidden models of Markov (HMM), Mel-frequency cepstral coefficients (MFCC), perceptual linear prediction (PLP’s)

Procedia PDF Downloads 161
2264 SPARK: An Open-Source Knowledge Discovery Platform That Leverages Non-Relational Databases and Massively Parallel Computational Power for Heterogeneous Genomic Datasets

Authors: Thilina Ranaweera, Enes Makalic, John L. Hopper, Adrian Bickerstaffe

Abstract:

Data are the primary asset of biomedical researchers, and the engine for both discovery and research translation. As the volume and complexity of research datasets increase, especially with new technologies such as large single nucleotide polymorphism (SNP) chips, so too does the requirement for software to manage, process and analyze the data. Researchers often need to execute complicated queries and conduct complex analyzes of large-scale datasets. Existing tools to analyze such data, and other types of high-dimensional data, unfortunately suffer from one or more major problems. They typically require a high level of computing expertise, are too simplistic (i.e., do not fit realistic models that allow for complex interactions), are limited by computing power, do not exploit the computing power of large-scale parallel architectures (e.g. supercomputers, GPU clusters etc.), or are limited in the types of analysis available, compounded by the fact that integrating new analysis methods is not straightforward. Solutions to these problems, such as those developed and implemented on parallel architectures, are currently available to only a relatively small portion of medical researchers with access and know-how. The past decade has seen a rapid expansion of data management systems for the medical domain. Much attention has been given to systems that manage phenotype datasets generated by medical studies. The introduction of heterogeneous genomic data for research subjects that reside in these systems has highlighted the need for substantial improvements in software architecture. To address this problem, we have developed SPARK, an enabling and translational system for medical research, leveraging existing high performance computing resources, and analysis techniques currently available or being developed. It builds these into The Ark, an open-source web-based system designed to manage medical data. SPARK provides a next-generation biomedical data management solution that is based upon a novel Micro-Service architecture and Big Data technologies. The system serves to demonstrate the applicability of Micro-Service architectures for the development of high performance computing applications. When applied to high-dimensional medical datasets such as genomic data, relational data management approaches with normalized data structures suffer from unfeasibly high execution times for basic operations such as insert (i.e. importing a GWAS dataset) and the queries that are typical of the genomics research domain. SPARK resolves these problems by incorporating non-relational NoSQL databases that have been driven by the emergence of Big Data. SPARK provides researchers across the world with user-friendly access to state-of-the-art data management and analysis tools while eliminating the need for high-level informatics and programming skills. The system will benefit health and medical research by eliminating the burden of large-scale data management, querying, cleaning, and analysis. SPARK represents a major advancement in genome research technologies, vastly reducing the burden of working with genomic datasets, and enabling cutting edge analysis approaches that have previously been out of reach for many medical researchers.

Keywords: biomedical research, genomics, information systems, software

Procedia PDF Downloads 270
2263 On-Line Data-Driven Multivariate Statistical Prediction Approach to Production Monitoring

Authors: Hyun-Woo Cho

Abstract:

Detection of incipient abnormal events in production processes is important to improve safety and reliability of manufacturing operations and reduce losses caused by failures. The construction of calibration models for predicting faulty conditions is quite essential in making decisions on when to perform preventive maintenance. This paper presents a multivariate calibration monitoring approach based on the statistical analysis of process measurement data. The calibration model is used to predict faulty conditions from historical reference data. This approach utilizes variable selection techniques, and the predictive performance of several prediction methods are evaluated using real data. The results shows that the calibration model based on supervised probabilistic model yielded best performance in this work. By adopting a proper variable selection scheme in calibration models, the prediction performance can be improved by excluding non-informative variables from their model building steps.

Keywords: calibration model, monitoring, quality improvement, feature selection

Procedia PDF Downloads 356
2262 Model-Driven and Data-Driven Approaches for Crop Yield Prediction: Analysis and Comparison

Authors: Xiangtuo Chen, Paul-Henry Cournéde

Abstract:

Crop yield prediction is a paramount issue in agriculture. The main idea of this paper is to find out efficient way to predict the yield of corn based meteorological records. The prediction models used in this paper can be classified into model-driven approaches and data-driven approaches, according to the different modeling methodologies. The model-driven approaches are based on crop mechanistic modeling. They describe crop growth in interaction with their environment as dynamical systems. But the calibration process of the dynamic system comes up with much difficulty, because it turns out to be a multidimensional non-convex optimization problem. An original contribution of this paper is to propose a statistical methodology, Multi-Scenarios Parameters Estimation (MSPE), for the parametrization of potentially complex mechanistic models from a new type of datasets (climatic data, final yield in many situations). It is tested with CORNFLO, a crop model for maize growth. On the other hand, the data-driven approach for yield prediction is free of the complex biophysical process. But it has some strict requirements about the dataset. A second contribution of the paper is the comparison of these model-driven methods with classical data-driven methods. For this purpose, we consider two classes of regression methods, methods derived from linear regression (Ridge and Lasso Regression, Principal Components Regression or Partial Least Squares Regression) and machine learning methods (Random Forest, k-Nearest Neighbor, Artificial Neural Network and SVM regression). The dataset consists of 720 records of corn yield at county scale provided by the United States Department of Agriculture (USDA) and the associated climatic data. A 5-folds cross-validation process and two accuracy metrics: root mean square error of prediction(RMSEP), mean absolute error of prediction(MAEP) were used to evaluate the crop prediction capacity. The results show that among the data-driven approaches, Random Forest is the most robust and generally achieves the best prediction error (MAEP 4.27%). It also outperforms our model-driven approach (MAEP 6.11%). However, the method to calibrate the mechanistic model from dataset easy to access offers several side-perspectives. The mechanistic model can potentially help to underline the stresses suffered by the crop or to identify the biological parameters of interest for breeding purposes. For this reason, an interesting perspective is to combine these two types of approaches.

Keywords: crop yield prediction, crop model, sensitivity analysis, paramater estimation, particle swarm optimization, random forest

Procedia PDF Downloads 231
2261 Leveraging the Power of Dual Spatial-Temporal Data Scheme for Traffic Prediction

Authors: Yang Zhou, Heli Sun, Jianbin Huang, Jizhong Zhao, Shaojie Qiao

Abstract:

Traffic prediction is a fundamental problem in urban environment, facilitating the smart management of various businesses, such as taxi dispatching, bike relocation, and stampede alert. Most earlier methods rely on identifying the intrinsic spatial-temporal correlation to forecast. However, the complex nature of this problem entails a more sophisticated solution that can simultaneously capture the mutual influence of both adjacent and far-flung areas, with the information of time-dimension also incorporated seamlessly. To tackle this difficulty, we propose a new multi-phase architecture, DSTDS (Dual Spatial-Temporal Data Scheme for traffic prediction), that aims to reveal the underlying relationship that determines future traffic trend. First, a graph-based neural network with an attention mechanism is devised to obtain the static features of the road network. Then, a multi-granularity recurrent neural network is built in conjunction with the knowledge from a grid-based model. Subsequently, the preceding output is fed into a spatial-temporal super-resolution module. With this 3-phase structure, we carry out extensive experiments on several real-world datasets to demonstrate the effectiveness of our approach, which surpasses several state-of-the-art methods.

Keywords: traffic prediction, spatial-temporal, recurrent neural network, dual data scheme

Procedia PDF Downloads 117
2260 Fracture and Fatigue Crack Growth Analysis and Modeling

Authors: Volkmar Nolting

Abstract:

Fatigue crack growth prediction has become an important topic in both engineering and non-destructive evaluation. Crack propagation is influenced by the mechanical properties of the material and is conveniently modelled by the Paris-Erdogan equation. The critical crack size and the total number of load cycles are calculated. From a Larson-Miller plot the maximum operational temperature can for a given stress level be determined so that failure does not occur within a given time interval t. The study is used to determine a reasonable inspection cycle and thus enhances operational safety and reduces costs.

Keywords: fracturemechanics, crack growth prediction, lifetime of a component, structural health monitoring

Procedia PDF Downloads 49
2259 Prediction of Wind Speed by Artificial Neural Networks for Energy Application

Authors: S. Adjiri-Bailiche, S. M. Boudia, H. Daaou, S. Hadouche, A. Benzaoui

Abstract:

In this work the study of changes in the wind speed depending on the altitude is calculated and described by the model of the neural networks, the use of measured data, the speed and direction of wind, temperature and the humidity at 10 m are used as input data and as data targets at 50m above sea level. Comparing predict wind speeds and extrapolated at 50 m above sea level is performed. The results show that the prediction by the method of artificial neural networks is very accurate.

Keywords: MATLAB, neural network, power low, vertical extrapolation, wind energy, wind speed

Procedia PDF Downloads 692
2258 A High Content Screening Platform for the Accurate Prediction of Nephrotoxicity

Authors: Sijing Xiong, Ran Su, Lit-Hsin Loo, Daniele Zink

Abstract:

The kidney is a major target for toxic effects of drugs, industrial and environmental chemicals and other compounds. Typically, nephrotoxicity is detected late during drug development, and regulatory animal models could not solve this problem. Validated or accepted in silico or in vitro methods for the prediction of nephrotoxicity are not available. We have established the first and currently only pre-validated in vitro models for the accurate prediction of nephrotoxicity in humans and the first predictive platforms based on renal cells derived from human pluripotent stem cells. In order to further improve the efficiency of our predictive models, we recently developed a high content screening (HCS) platform. This platform employed automated imaging in combination with automated quantitative phenotypic profiling and machine learning methods. 129 image-based phenotypic features were analyzed with respect to their predictive performance in combination with 44 compounds with different chemical structures that included drugs, environmental and industrial chemicals and herbal and fungal compounds. The nephrotoxicity of these compounds in humans is well characterized. A combination of chromatin and cytoskeletal features resulted in high predictivity with respect to nephrotoxicity in humans. Test balanced accuracies of 82% or 89% were obtained with human primary or immortalized renal proximal tubular cells, respectively. Furthermore, our results revealed that a DNA damage response is commonly induced by different PTC-toxicants with diverse chemical structures and injury mechanisms. Together, the results show that the automated HCS platform allows efficient and accurate nephrotoxicity prediction for compounds with diverse chemical structures.

Keywords: high content screening, in vitro models, nephrotoxicity, toxicity prediction

Procedia PDF Downloads 313
2257 Hard Disk Failure Predictions in Supercomputing System Based on CNN-LSTM and Oversampling Technique

Authors: Yingkun Huang, Li Guo, Zekang Lan, Kai Tian

Abstract:

Hard disk drives (HDD) failure of the exascale supercomputing system may lead to service interruption and invalidate previous calculations, and it will cause permanent data loss. Therefore, initiating corrective actions before hard drive failures materialize is critical to the continued operation of jobs. In this paper, a highly accurate analysis model based on CNN-LSTM and oversampling technique was proposed, which can correctly predict the necessity of a disk replacement even ten days in advance. Generally, the learning-based method performs poorly on a training dataset with long-tail distribution, especially fault prediction is a very classic situation as the scarcity of failure data. To overcome the puzzle, a new oversampling was employed to augment the data, and then, an improved CNN-LSTM with the shortcut was built to learn more effective features. The shortcut transmits the results of the previous layer of CNN and is used as the input of the LSTM model after weighted fusion with the output of the next layer. Finally, a detailed, empirical comparison of 6 prediction methods is presented and discussed on a public dataset for evaluation. The experiments indicate that the proposed method predicts disk failure with 0.91 Precision, 0.91 Recall, 0.91 F-measure, and 0.90 MCC for 10 days prediction horizon. Thus, the proposed algorithm is an efficient algorithm for predicting HDD failure in supercomputing.

Keywords: HDD replacement, failure, CNN-LSTM, oversampling, prediction

Procedia PDF Downloads 79
2256 Unsupervised Text Mining Approach to Early Warning System

Authors: Ichihan Tai, Bill Olson, Paul Blessner

Abstract:

Traditional early warning systems that alarm against crisis are generally based on structured or numerical data; therefore, a system that can make predictions based on unstructured textual data, an uncorrelated data source, is a great complement to the traditional early warning systems. The Chicago Board Options Exchange (CBOE) Volatility Index (VIX), commonly referred to as the fear index, measures the cost of insurance against market crash, and spikes in the event of crisis. In this study, news data is consumed for prediction of whether there will be a market-wide crisis by predicting the movement of the fear index, and the historical references to similar events are presented in an unsupervised manner. Topic modeling-based prediction and representation are made based on daily news data between 1990 and 2015 from The Wall Street Journal against VIX index data from CBOE.

Keywords: early warning system, knowledge management, market prediction, topic modeling.

Procedia PDF Downloads 338
2255 Neural Networks and Genetic Algorithms Approach for Word Correction and Prediction

Authors: Rodrigo S. Fonseca, Antônio C. P. Veiga

Abstract:

Aiming at helping people with some movement limitation that makes typing and communication difficult, there is a need to customize an assistive tool with a learning environment that helps the user in order to optimize text input, identifying the error and providing the correction and possibilities of choice in the Portuguese language. The work presents an Orthographic and Grammatical System that can be incorporated into writing environments, improving and facilitating the use of an alphanumeric keyboard, using a prototype built using a genetic algorithm in addition to carrying out the prediction, which can occur based on the quantity and position of the inserted letters and even placement in the sentence, ensuring the sequence of ideas using a Long Short Term Memory (LSTM) neural network. The prototype optimizes data entry, being a component of assistive technology for the textual formulation, detecting errors, seeking solutions and informing the user of accurate predictions quickly and effectively through machine learning.

Keywords: genetic algorithm, neural networks, word prediction, machine learning

Procedia PDF Downloads 194
2254 Prediction of Fillet Weight and Fillet Yield from Body Measurements and Genetic Parameters in a Complete Diallel Cross of Three Nile Tilapia (Oreochromis niloticus) Strains

Authors: Kassaye Balkew Workagegn, Gunnar Klemetsdal, Hans Magnus Gjøen

Abstract:

In this study, the first objective was to investigate whether non-lethal or non-invasive methods, utilizing body measurements, could be used to efficiently predict fillet weight and fillet yield for a complete diallel cross of three Nile tilapia (Oreochromis niloticus) strains collected from three Ethiopian Rift Valley lakes, Lakes Ziway, Koka and Chamo. The second objective was to estimate heritability of body weight, actual and predicted fillet traits, as well as genetic correlations between these traits. A third goal was to estimate additive, reciprocal, and heterosis effects for body weight and the various fillet traits. As in females, early sexual maturation was widespread, only 958 male fish from 81 full-sib families were used, both for the prediction of fillet traits and in genetic analysis. The prediction equations from body measurements were established by forward regression analysis, choosing models with the least predicted residual error sums of squares (PRESS). The results revealed that body measurements on live Nile tilapia is well suited to predict fillet weight but not fillet yield (R²= 0.945 and 0.209, respectively), but both models were seemingly unbiased. The genetic analyses were carried out with bivariate, multibreed models. Body weight, fillet weight, and predicted fillet weight were all estimated with a heritability ranged from 0.23 to 0.28, and with genetic correlations close to one. Contrary, fillet yield was only to a minor degree heritable (0.05), while predicted fillet yield obtained a heritability of 0.19, being a resultant of two body weight variables known to have high heritability. The latter trait was estimated with genetic correlations to body weight and fillet weight traits larger than 0.82. No significant differences among strains were found for their additive genetic, reciprocal, or heterosis effects, while total heterosis effects were estimated as positive and significant (P < 0.05). As a conclusion, prediction of prediction of fillet weight based on body measurements is possible, but not for fillet yield.

Keywords: additive, fillet traits, genetic correlation, heritability, heterosis, prediction, reciprocal

Procedia PDF Downloads 187
2253 Application of Artificial Neural Network for Prediction of Retention Times of Some Secoestrane Derivatives

Authors: Nataša Kalajdžija, Strahinja Kovačević, Davor Lončar, Sanja Podunavac Kuzmanović, Lidija Jevrić

Abstract:

In order to investigate the relationship between retention and structure, a quantitative Structure Retention Relationships (QSRRs) study was applied for the prediction of retention times of a set of 23 secoestrane derivatives in a reversed-phase thin-layer chromatography. After the calculation of molecular descriptors, a suitable set of molecular descriptors was selected by using step-wise multiple linear regressions. Artificial Neural Network (ANN) method was employed to model the nonlinear structure-activity relationships. The ANN technique resulted in 5-6-1 ANN model with the correlation coefficient of 0.98. We found that the following descriptors: Critical pressure, total energy, protease inhibition, distribution coefficient (LogD) and parameter of lipophilicity (miLogP) have a significant effect on the retention times. The prediction results are in very good agreement with the experimental ones. This approach provided a new and effective method for predicting the chromatographic retention index for the secoestrane derivatives investigated.

Keywords: lipophilicity, QSRR, RP TLC retention, secoestranes

Procedia PDF Downloads 455
2252 Multidirectional Product Support System for Decision Making in Textile Industry Using Collaborative Filtering Methods

Authors: A. Senthil Kumar, V. Murali Bhaskaran

Abstract:

In the information technology ground, people are using various tools and software for their official use and personal reasons. Nowadays, people are worrying to choose data accessing and extraction tools at the time of buying and selling their products. In addition, worry about various quality factors such as price, durability, color, size, and availability of the product. The main purpose of the research study is to find solutions to these unsolved existing problems. The proposed algorithm is a Multidirectional Rank Prediction (MDRP) decision making algorithm in order to take an effective strategic decision at all the levels of data extraction, uses a real time textile dataset and analyzes the results. Finally, the results are obtained and compared with the existing measurement methods such as PCC, SLCF, and VSS. The result accuracy is higher than the existing rank prediction methods.

Keywords: Knowledge Discovery in Database (KDD), Multidirectional Rank Prediction (MDRP), Pearson’s Correlation Coefficient (PCC), VSS (Vector Space Similarity)

Procedia PDF Downloads 286
2251 Estimation of Relative Subsidence of Collapsible Soils Using Electromagnetic Measurements

Authors: Henok Hailemariam, Frank Wuttke

Abstract:

Collapsible soils are weak soils that appear to be stable in their natural state, normally dry condition, but rapidly deform under saturation (wetting), thus generating large and unexpected settlements which often yield disastrous consequences for structures unwittingly built on such deposits. In this study, a prediction model for the relative subsidence of stressed collapsible soils based on dielectric permittivity measurement is presented. Unlike most existing methods for soil subsidence prediction, this model does not require moisture content as an input parameter, thus providing the opportunity to obtain accurate estimation of the relative subsidence of collapsible soils using dielectric measurement only. The prediction model is developed based on an existing relative subsidence prediction model (which is dependent on soil moisture condition) and an advanced theoretical frequency and temperature-dependent electromagnetic mixing equation (which effectively removes the moisture content dependence of the original relative subsidence prediction model). For large scale sub-surface soil exploration purposes, the spatial sub-surface soil dielectric data over wide areas and high depths of weak (collapsible) soil deposits can be obtained using non-destructive high frequency electromagnetic (HF-EM) measurement techniques such as ground penetrating radar (GPR). For laboratory or small scale in-situ measurements, techniques such as an open-ended coaxial line with widely applicable time domain reflectometry (TDR) or vector network analysers (VNAs) are usually employed to obtain the soil dielectric data. By using soil dielectric data obtained from small or large scale non-destructive HF-EM investigations, the new model can effectively predict the relative subsidence of weak soils without the need to extract samples for moisture content measurement. Some of the resulting benefits are the preservation of the undisturbed nature of the soil as well as a reduction in the investigation costs and analysis time in the identification of weak (problematic) soils. The accuracy of prediction of the presented model is assessed by conducting relative subsidence tests on a collapsible soil at various initial soil conditions and a good match between the model prediction and experimental results is obtained.

Keywords: collapsible soil, dielectric permittivity, moisture content, relative subsidence

Procedia PDF Downloads 363
2250 Prediction Model of Body Mass Index of Young Adult Students of Public Health Faculty of University of Indonesia

Authors: Yuwaratu Syafira, Wahyu K. Y. Putra, Kusharisupeni Djokosujono

Abstract:

Background/Objective: Body Mass Index (BMI) serves various purposes, including measuring the prevalence of obesity in a population, and also in formulating a patient’s diet at a hospital, and can be calculated with the equation = body weight (kg)/body height (m)². However, the BMI of an individual with difficulties in carrying their weight or standing up straight can not necessarily be measured. The aim of this study was to form a prediction model for the BMI of young adult students of Public Health Faculty of University of Indonesia. Subject/Method: This study used a cross sectional design, with a total sample of 132 respondents, consisted of 58 males and 74 females aged 21- 30. The dependent variable of this study was BMI, and the independent variables consisted of sex and anthropometric measurements, which included ulna length, arm length, tibia length, knee height, mid-upper arm circumference, and calf circumference. Anthropometric information was measured and recorded in a single sitting. Simple and multiple linear regression analysis were used to create the prediction equation for BMI. Results: The male respondents had an average BMI of 24.63 kg/m² and the female respondents had an average of 22.52 kg/m². A total of 17 variables were analysed for its correlation with BMI. Bivariate analysis showed the variable with the strongest correlation with BMI was Mid-Upper Arm Circumference/√Ulna Length (MUAC/√UL) (r = 0.926 for males and r = 0.886 for females). Furthermore, MUAC alone also has a very strong correlation with BMI (r = 0,913 for males and r = 0,877 for females). Prediction models formed from either MUAC/√UL or MUAC alone both produce highly accurate predictions of BMI. However, measuring MUAC/√UL is considered inconvenient, which may cause difficulties when applied on the field. Conclusion: The prediction model considered most ideal to estimate BMI is: Male BMI (kg/m²) = 1.109(MUAC (cm)) – 9.202 and Female BMI (kg/m²) = 0.236 + 0.825(MUAC (cm)), based on its high accuracy levels and the convenience of measuring MUAC on the field.

Keywords: body mass index, mid-upper arm circumference, prediction model, ulna length

Procedia PDF Downloads 214
2249 Flame Volume Prediction and Validation for Lean Blowout of Gas Turbine Combustor

Authors: Ejaz Ahmed, Huang Yong

Abstract:

The operation of aero engines has a critical importance in the vicinity of lean blowout (LBO) limits. Lefebvre’s model of LBO based on empirical correlation has been extended to flame volume concept by the authors. The flame volume takes into account the effects of geometric configuration, the complex spatial interaction of mixing, turbulence, heat transfer and combustion processes inside the gas turbine combustion chamber. For these reasons, flame volume based LBO predictions are more accurate. Although LBO prediction accuracy has improved, it poses a challenge associated with Vf estimation in real gas turbine combustors. This work extends the approach of flame volume prediction previously based on fuel iterative approximation with cold flow simulations to reactive flow simulations. Flame volume for 11 combustor configurations has been simulated and validated against experimental data. To make prediction methodology robust as required in the preliminary design stage, reactive flow simulations were carried out with the combination of probability density function (PDF) and discrete phase model (DPM) in FLUENT 15.0. The criterion for flame identification was defined. Two important parameters i.e. critical injection diameter (Dp,crit) and critical temperature (Tcrit) were identified, and their influence on reactive flow simulation was studied for Vf estimation. Obtained results exhibit ±15% error in Vf estimation with experimental data.

Keywords: CFD, combustion, gas turbine combustor, lean blowout

Procedia PDF Downloads 267
2248 Assessment of Pre-Processing Influence on Near-Infrared Spectra for Predicting the Mechanical Properties of Wood

Authors: Aasheesh Raturi, Vimal Kothiyal, P. D. Semalty

Abstract:

We studied mechanical properties of Eucalyptus tereticornis using FT-NIR spectroscopy. Firstly, spectra were pre-processed to eliminate useless information. Then, prediction model was constructed by partial least squares regression. To study the influence of pre-processing on prediction of mechanical properties for NIR analysis of wood samples, we applied various pretreatment methods like straight line subtraction, constant offset elimination, vector-normalization, min-max normalization, multiple scattering. Correction, first derivative, second derivatives and their combination with other treatment such as First derivative + straight line subtraction, First derivative+ vector normalization and First derivative+ multiplicative scattering correction. The data processing methods in combination of preprocessing with different NIR regions, RMSECV, RMSEP and optimum factors/rank were obtained by optimization process of model development. More than 350 combinations were obtained during optimization process. More than one pre-processing method gave good calibration/cross-validation and prediction/test models, but only the best calibration/cross-validation and prediction/test models are reported here. The results show that one can safely use NIR region between 4000 to 7500 cm-1 with straight line subtraction, constant offset elimination, first derivative and second derivative preprocessing method which were found to be most appropriate for models development.

Keywords: FT-NIR, mechanical properties, pre-processing, PLS

Procedia PDF Downloads 361