Search results for: genomic prediction
2007 Contribution of PALB2 and BLM Mutations to Familial Breast Cancer Risk in BRCA1/2 Negative South African Breast Cancer Patients Detected Using High-Resolution Melting Analysis
Authors: N. C. van der Merwe, J. Oosthuizen, M. F. Makhetha, J. Adams, B. K. Dajee, S-R. Schneider
Abstract:
Women representing high-risk breast cancer families, who tested negative for pathogenic mutations in BRCA1 and BRCA2, are four times more likely to develop breast cancer compared to women in the general population. Sequencing of genes involved in genomic stability and DNA repair led to the identification of novel contributors to familial breast cancer risk. These include BLM and PALB2. Bloom's syndrome is a rare homozygous autosomal recessive chromosomal instability disorder with a high incidence of various types of neoplasia and is associated with breast cancer when in a heterozygous state. PALB2, on the other hand, binds to BRCA2 and together, they partake actively in DNA damage repair. Archived DNA samples of 66 BRCA1/2 negative high-risk breast cancer patients were retrospectively selected based on the presence of an extensive family history of the disease ( > 3 affecteds per family). All coding regions and splice-site boundaries of both genes were screened using High-Resolution Melting Analysis. Samples exhibiting variation were bi-directionally automated Sanger sequenced. The clinical significance of each variant was assessed using various in silico and splice site prediction algorithms. Comprehensive screening identified a total of 11 BLM and 26 PALB2 variants. The variants detected ranged from global to rare and included three novel mutations. Three BLM and two PALB2 likely pathogenic mutations were identified that could account for the disease in these extensive breast cancer families in the absence of BRCA mutations (BLM c.11T > A, p.V4D; BLM c.2603C > T, p.P868L; BLM c.3961G > A, p.V1321I; PALB2 c.421C > T, p.Gln141Ter; PALB2 c.508A > T, p.Arg170Ter). Conclusion: The study confirmed the contribution of pathogenic mutations in BLM and PALB2 to the familial breast cancer burden in South Africa. It explained the presence of the disease in 7.5% of the BRCA1/2 negative families with an extensive family history of breast cancer. Segregation analysis will be performed to confirm the clinical impact of these mutations for each of these families. These results justify the inclusion of both these genes in a comprehensive breast and ovarian next generation sequencing cancer panel and should be screened simultaneously with BRCA1 and BRCA2 as it might explain a significant percentage of familial breast and ovarian cancer in South Africa.Keywords: Bloom Syndrome, familial breast cancer, PALB2, South Africa
Procedia PDF Downloads 2362006 Customer Churn Prediction by Using Four Machine Learning Algorithms Integrating Features Selection and Normalization in the Telecom Sector
Authors: Alanoud Moraya Aldalan, Abdulaziz Almaleh
Abstract:
A crucial component of maintaining a customer-oriented business as in the telecom industry is understanding the reasons and factors that lead to customer churn. Competition between telecom companies has greatly increased in recent years. It has become more important to understand customers’ needs in this strong market of telecom industries, especially for those who are looking to turn over their service providers. So, predictive churn is now a mandatory requirement for retaining those customers. Machine learning can be utilized to accomplish this. Churn Prediction has become a very important topic in terms of machine learning classification in the telecommunications industry. Understanding the factors of customer churn and how they behave is very important to building an effective churn prediction model. This paper aims to predict churn and identify factors of customers’ churn based on their past service usage history. Aiming at this objective, the study makes use of feature selection, normalization, and feature engineering. Then, this study compared the performance of four different machine learning algorithms on the Orange dataset: Logistic Regression, Random Forest, Decision Tree, and Gradient Boosting. Evaluation of the performance was conducted by using the F1 score and ROC-AUC. Comparing the results of this study with existing models has proven to produce better results. The results showed the Gradients Boosting with feature selection technique outperformed in this study by achieving a 99% F1-score and 99% AUC, and all other experiments achieved good results as well.Keywords: machine learning, gradient boosting, logistic regression, churn, random forest, decision tree, ROC, AUC, F1-score
Procedia PDF Downloads 1342005 Permeability Prediction Based on Hydraulic Flow Unit Identification and Artificial Neural Networks
Authors: Emad A. Mohammed
Abstract:
The concept of hydraulic flow units (HFU) has been used for decades in the petroleum industry to improve the prediction of permeability. This concept is strongly related to the flow zone indicator (FZI) which is a function of the reservoir rock quality index (RQI). Both indices are based on reservoir porosity and permeability of core samples. It is assumed that core samples with similar FZI values belong to the same HFU. Thus, after dividing the porosity-permeability data based on the HFU, transformations can be done in order to estimate the permeability from the porosity. The conventional practice is to use the power law transformation using conventional HFU where percentage of error is considerably high. In this paper, neural network technique is employed as a soft computing transformation method to predict permeability instead of power law method to avoid higher percentage of error. This technique is based on HFU identification where Amaefule et al. (1993) method is utilized. In this regard, Kozeny and Carman (K–C) model, and modified K–C model by Hasan and Hossain (2011) are employed. A comparison is made between the two transformation techniques for the two porosity-permeability models. Results show that the modified K-C model helps in getting better results with lower percentage of error in predicting permeability. The results also show that the use of artificial intelligence techniques give more accurate prediction than power law method. This study was conducted on a heterogeneous complex carbonate reservoir in Oman. Data were collected from seven wells to obtain the permeability correlations for the whole field. The findings of this study will help in getting better estimation of permeability of a complex reservoir.Keywords: permeability, hydraulic flow units, artificial intelligence, correlation
Procedia PDF Downloads 1362004 Consumer Experience of 3D Body Scanning Technology and Acceptance of Related E-Commerce Market Applications in Saudi Arabia
Authors: Moudi Almousa
Abstract:
This research paper explores Saudi Arabian female consumers’ experiences using 3D body scanning technology and their level of acceptance of possible market applications of this technology to adopt for apparel online shopping. Data was collected for 82 women after being scanned then viewed a short video explaining three possible scenarios of 3D body scanning applications, which include size prediction, customization, and virtual try-on, before completing the survey questionnaire. Although respondents have strong positive responses towards the scanning experience, the majority were concerned about their privacy during the scanning process. The results indicated that size prediction and virtual try on had greater market application potential and a higher chance of crossing the gap based on consumer interest. The results of the study also indicated a strong positive correlation between respondents’ concern with inability to try on apparel products in online environments and their willingness to use the 3D possible market applications.Keywords: 3D body scanning, market applications, online, apparel fit
Procedia PDF Downloads 1452003 Clinical Prediction Score for Ruptured Appendicitis In ED
Authors: Thidathit Prachanukool, Chaiyaporn Yuksen, Welawat Tienpratarn, Sorravit Savatmongkorngul, Panvilai Tangkulpanich, Chetsadakon Jenpanitpong, Yuranan Phootothum, Malivan Phontabtim, Promphet Nuanprom
Abstract:
Background: Ruptured appendicitis has a high morbidity and mortality and requires immediate surgery. The Alvarado Score is used as a tool to predict the risk of acute appendicitis, but there is no such score for predicting rupture. This study aimed to developed the prediction score to determine the likelihood of ruptured appendicitis in an Asian population. Methods: This study was diagnostic, retrospectively cross-sectional and exploratory model at the Emergency Medicine Department in Ramathibodi Hospital between March 2016 and March 2018. The inclusion criteria were age >15 years and an available pathology report after appendectomy. Clinical factors included gender, age>60 years, right lower quadrant pain, migratory pain, nausea and/or vomiting, diarrhea, anorexia, fever>37.3°C, rebound tenderness, guarding, white blood cell count, polymorphonuclear white blood cells (PMN)>75%, and the pain duration before presentation. The predictive model and prediction score for ruptured appendicitis was developed by multivariable logistic regression analysis. Result: During the study period, 480 patients met the inclusion criteria; of these, 77 (16%) had ruptured appendicitis. Five independent factors were predictive of rupture, age>60 years, fever>37.3°C, guarding, PMN>75%, and duration of pain>24 hours to presentation. A score > 6 increased the likelihood ratio of ruptured appendicitis by 3.88 times. Conclusion: Using the Ramathibodi Welawat Ruptured Appendicitis Score. (RAMA WeRA Score) developed in this study, a score of > 6 was associated with ruptured appendicitis.Keywords: predictive model, risk score, ruptured appendicitis, emergency room
Procedia PDF Downloads 1652002 Prediction of Mechanical Strength of Multiscale Hybrid Reinforced Cementitious Composite
Authors: Salam Alrekabi, A. B. Cundy, Mohammed Haloob Al-Majidi
Abstract:
Novel multiscale hybrid reinforced cementitious composites based on carbon nanotubes (MHRCC-CNT), and carbon nanofibers (MHRCC-CNF) are new types of cement-based material fabricated with micro steel fibers and nanofilaments, featuring superior strain hardening, ductility, and energy absorption. This study focused on established models to predict the compressive strength, and direct and splitting tensile strengths of the produced cementitious composites. The analysis was carried out based on the experimental data presented by the previous author’s study, regression analysis, and the established models that available in the literature. The obtained models showed small differences in the predictions and target values with experimental verification indicated that the estimation of the mechanical properties could be achieved with good accuracy.Keywords: multiscale hybrid reinforced cementitious composites, carbon nanotubes, carbon nanofibers, mechanical strength prediction
Procedia PDF Downloads 1612001 Comparison of Existing Predictor and Development of Computational Method for S- Palmitoylation Site Identification in Arabidopsis Thaliana
Authors: Ayesha Sanjana Kawser Parsha
Abstract:
S-acylation is an irreversible bond in which cysteine residues are linked to fatty acids palmitate (74%) or stearate (22%), either at the COOH or NH2 terminal, via a thioester linkage. There are several experimental methods that can be used to identify the S-palmitoylation site; however, since they require a lot of time, computational methods are becoming increasingly necessary. There aren't many predictors, however, that can locate S- palmitoylation sites in Arabidopsis Thaliana with sufficient accuracy. This research is based on the importance of building a better prediction tool. To identify the type of machine learning algorithm that predicts this site more accurately for the experimental dataset, several prediction tools were examined in this research, including the GPS PALM 6.0, pCysMod, GPS LIPID 1.0, CSS PALM 4.0, and NBA PALM. These analyses were conducted by constructing the receiver operating characteristics plot and the area under the curve score. An AI-driven deep learning-based prediction tool has been developed utilizing the analysis and three sequence-based input data, such as the amino acid composition, binary encoding profile, and autocorrelation features. The model was developed using five layers, two activation functions, associated parameters, and hyperparameters. The model was built using various combinations of features, and after training and validation, it performed better when all the features were present while using the experimental dataset for 8 and 10-fold cross-validations. While testing the model with unseen and new data, such as the GPS PALM 6.0 plant and pCysMod mouse, the model performed better, and the area under the curve score was near 1. It can be demonstrated that this model outperforms the prior tools in predicting the S- palmitoylation site in the experimental data set by comparing the area under curve score of 10-fold cross-validation of the new model with the established tools' area under curve score with their respective training sets. The objective of this study is to develop a prediction tool for Arabidopsis Thaliana that is more accurate than current tools, as measured by the area under the curve score. Plant food production and immunological treatment targets can both be managed by utilizing this method to forecast S- palmitoylation sites.Keywords: S- palmitoylation, ROC PLOT, area under the curve, cross- validation score
Procedia PDF Downloads 762000 Exploring the Impact of Input Sequence Lengths on Long Short-Term Memory-Based Streamflow Prediction in Flashy Catchments
Authors: Farzad Hosseini Hossein Abadi, Cristina Prieto Sierra, Cesar Álvarez Díaz
Abstract:
Predicting streamflow accurately in flashy catchments prone to floods is a major research and operational challenge in hydrological modeling. Recent advancements in deep learning, particularly Long Short-Term Memory (LSTM) networks, have shown to be promising in achieving accurate hydrological predictions at daily and hourly time scales. In this work, a multi-timescale LSTM (MTS-LSTM) network was applied to the context of regional hydrological predictions at an hourly time scale in flashy catchments. The case study includes 40 catchments allocated in the Basque Country, north of Spain. We explore the impact of hyperparameters on the performance of streamflow predictions given by regional deep learning models through systematic hyperparameter tuning - where optimal regional values for different catchments are identified. The results show that predictions are highly accurate, with Nash-Sutcliffe (NSE) and Kling-Gupta (KGE) metrics values as high as 0.98 and 0.97, respectively. A principal component analysis reveals that a hyperparameter related to the length of the input sequence contributes most significantly to the prediction performance. The findings suggest that input sequence lengths have a crucial impact on the model prediction performance. Moreover, employing catchment-scale analysis reveals distinct sequence lengths for individual basins, highlighting the necessity of customizing this hyperparameter based on each catchment’s characteristics. This aligns with well known “uniqueness of the place” paradigm. In prior research, tuning the length of the input sequence of LSTMs has received limited focus in the field of streamflow prediction. Initially it was set to 365 days to capture a full annual water cycle. Later, performing limited systematic hyper-tuning using grid search, revealed a modification to 270 days. However, despite the significance of this hyperparameter in hydrological predictions, usually studies have overlooked its tuning and fixed it to 365 days. This study, employing a simultaneous systematic hyperparameter tuning approach, emphasizes the critical role of input sequence length as an influential hyperparameter in configuring LSTMs for regional streamflow prediction. Proper tuning of this hyperparameter is essential for achieving accurate hourly predictions using deep learning models.Keywords: LSTMs, streamflow, hyperparameters, hydrology
Procedia PDF Downloads 691999 Comparison of Different Machine Learning Algorithms for Solubility Prediction
Authors: Muhammet Baldan, Emel Timuçin
Abstract:
Molecular solubility prediction plays a crucial role in various fields, such as drug discovery, environmental science, and material science. In this study, we compare the performance of five machine learning algorithms—linear regression, support vector machines (SVM), random forests, gradient boosting machines (GBM), and neural networks—for predicting molecular solubility using the AqSolDB dataset. The dataset consists of 9981 data points with their corresponding solubility values. MACCS keys (166 bits), RDKit properties (20 properties), and structural properties(3) features are extracted for every smile representation in the dataset. A total of 189 features were used for training and testing for every molecule. Each algorithm is trained on a subset of the dataset and evaluated using metrics accuracy scores. Additionally, computational time for training and testing is recorded to assess the efficiency of each algorithm. Our results demonstrate that random forest model outperformed other algorithms in terms of predictive accuracy, achieving an 0.93 accuracy score. Gradient boosting machines and neural networks also exhibit strong performance, closely followed by support vector machines. Linear regression, while simpler in nature, demonstrates competitive performance but with slightly higher errors compared to ensemble methods. Overall, this study provides valuable insights into the performance of machine learning algorithms for molecular solubility prediction, highlighting the importance of algorithm selection in achieving accurate and efficient predictions in practical applications.Keywords: random forest, machine learning, comparison, feature extraction
Procedia PDF Downloads 401998 StockTwits Sentiment Analysis on Stock Price Prediction
Authors: Min Chen, Rubi Gupta
Abstract:
Understanding and predicting stock market movements is a challenging problem. It is believed stock markets are partially driven by public sentiments, which leads to numerous research efforts to predict stock market trend using public sentiments expressed on social media such as Twitter but with limited success. Recently a microblogging website StockTwits is becoming increasingly popular for users to share their discussions and sentiments about stocks and financial market. In this project, we analyze the text content of StockTwits tweets and extract financial sentiment using text featurization and machine learning algorithms. StockTwits tweets are first pre-processed using techniques including stopword removal, special character removal, and case normalization to remove noise. Features are extracted from these preprocessed tweets through text featurization process using bags of words, N-gram models, TF-IDF (term frequency-inverse document frequency), and latent semantic analysis. Machine learning models are then trained to classify the tweets' sentiment as positive (bullish) or negative (bearish). The correlation between the aggregated daily sentiment and daily stock price movement is then investigated using Pearson’s correlation coefficient. Finally, the sentiment information is applied together with time series stock data to predict stock price movement. The experiments on five companies (Apple, Amazon, General Electric, Microsoft, and Target) in a duration of nine months demonstrate the effectiveness of our study in improving the prediction accuracy.Keywords: machine learning, sentiment analysis, stock price prediction, tweet processing
Procedia PDF Downloads 1561997 Investigation on Remote Sense Surface Latent Heat Temperature Associated with Pre-Seismic Activities in Indian Region
Authors: Vijay S. Katta, Vinod Kushwah, Rudraksh Tiwari, Mulayam Singh Gaur, Priti Dimri, Ashok Kumar Sharma
Abstract:
The formation process of seismic activities because of abrupt slip on faults, tectonic plate moments due to accumulated stress in the Earth’s crust. The prediction of seismic activity is a very challenging task. We have studied the changes in surface latent heat temperatures which are observed prior to significant earthquakes have been investigated and could be considered for short term earthquake prediction. We analyzed the surface latent heat temperature (SLHT) variation for inland earthquakes occurred in Chamba, Himachal Pradesh (32.5 N, 76.1E, M-4.5, depth-5km) nearby the main boundary fault region, the data of SLHT have been taken from National Center for Environmental Prediction (NCEP). In this analysis, we have calculated daily variations with surface latent heat temperature (0C) in the range area 1⁰x1⁰ (~120/KM²) with the pixel covering epicenter of earthquake at the center for a three months period prior to and after the seismic activities. The mean value during that period has been considered in order to take account of the seasonal effect. The monthly mean has been subtracted from daily value to study anomalous behavior (∆SLHT) of SLHT during the earthquakes. The results found that the SLHTs adjacent the epicenters all are anomalous high value 3-5 days before the seismic activities. The abundant surface water and groundwater in the epicenter and its adjacent region can provide the necessary condition for the change of SLHT. To further confirm the reliability of SLHT anomaly, it is necessary to explore its physical mechanism in depth by more earthquakes cases.Keywords: surface latent heat temperature, satellite data, earthquake, magnetic storm
Procedia PDF Downloads 1341996 Prediction of Rolling Forces and Real Exit Thickness of Strips in the Cold Rolling by Using Artificial Neural Networks
Authors: M. Heydari Vini
Abstract:
There is a complicated relation between effective input parameters of cold rolling and output rolling force and exit thickness of strips.in many mathematical models, the effect of some rolling parameters have been ignored and the outputs have not a desirable accuracy. In the other hand, there is a special relation among input thickness of strips,the width of the strips,rolling speeds,mandrill tensions and the required exit thickness of strips with rolling force and the real exit thickness of the rolled strip. First of all, in this paper the effective parameters of cold rolling process modeled using an artificial neural network according to the optimum network achieved by using a written program in MATLAB,it has been shown that the prediction of rolling stand parameters with different properties and new dimensions attained from prior rolled strips by an artificial neural network is applicable.Keywords: cold rolling, artificial neural networks, rolling force, real rolled thickness of strips
Procedia PDF Downloads 5051995 A Novel Chicken W Chromosome Specific Tandem Repeat
Authors: Alsu F. Saifitdinova, Alexey S. Komissarov, Svetlana A. Galkina, Elena I. Koshel, Maria M. Kulak, Stephen J. O'Brien, Elena R. Gaginskaya
Abstract:
The mystery of sex determination is one of the most ancient and still not solved until the end so far. In many species, sex determination is genetic and often accompanied by the presence of dimorphic sex chromosomes in the karyotype. Genomic sequencing gave the information about the gene content of sex chromosomes which allowed to reveal their origin from ordinary autosomes and to trace their evolutionary history. Female-specific W chromosome in birds as well as mammalian male-specific Y chromosome is characterized by the degeneration of gene content and the accumulation of repetitive DNA. Tandem repeats complicate the analysis of genomic data. Despite the best efforts chicken W chromosome assembly includes only 1.2 Mb from expected 55 Mb. Supplementing the information on the sex chromosome composition not only helps to complete the assembly of genomes but also moves us in the direction of understanding of the sex-determination systems evolution. A whole-genome survey to the assembly Gallus_gallus WASHUC 2.60 was applied for repeats search in assembled genome and performed search and assembly of high copy number repeats in unassembled reads of SRR867748 short reads datasets. For cytogenetic analysis conventional methods of fluorescent in situ hybridization was used for previously cloned W specific satellites and specifically designed directly labeled synthetic oligonucleotide DNA probe was used for bioinformatically identified repetitive sequence. Hybridization was performed with mitotic chicken chromosomes and manually isolated giant meiotic lampbrush chromosomes from growing oocytes. A novel chicken W specific satellite (GGAAA)n which is not co-localizes with any previously described classes of W specific repeats was identified and mapped with high resolution. In the composition of autosomes this repeat units was found as a part of upstream regions of gonad specific protein coding sequences. These findings may contribute to the understanding of the role of tandem repeats in sex specific differentiation regulation in birds and sex chromosome evolution. This work was supported by the postdoctoral fellowships from St. Petersburg State University (#1.50.1623.2013 and #1.50.1043.2014), the grant for Leading Scientific Schools (#3553.2014.4) and the grant from Russian foundation for basic researches (#15-04-05684). The equipment and software of Research Resource Center “Chromas” and Theodosius Dobzhansky Center for Genome Bioinformatics of Saint Petersburg State University were used.Keywords: birds, lampbrush chromosomes, sex chromosomes, tandem repeats
Procedia PDF Downloads 3891994 Prediction of California Bearing Ratio of a Black Cotton Soil Stabilized with Waste Glass and Eggshell Powder using Artificial Neural Network
Authors: Biruhi Tesfaye, Avinash M. Potdar
Abstract:
The laboratory test process to determine the California bearing ratio (CBR) of black cotton soils is not only overpriced but also time-consuming as well. Hence advanced prediction of CBR plays a significant role as it is applicable In pavement design. The prediction of CBR of treated soil was executed by Artificial Neural Networks (ANNs) which is a Computational tool based on the properties of the biological neural system. To observe CBR values, combined eggshell and waste glass was added to soil as 4, 8, 12, and 16 % of the weights of the soil samples. Accordingly, the laboratory related tests were conducted to get the required best model. The maximum CBR value found at 5.8 at 8 % of eggshell waste glass powder addition. The model was developed using CBR as an output layer variable. CBR was considered as a function of the joint effect of liquid limit, plastic limit, and plastic index, optimum moisture content and maximum dry density. The best model that has been found was ANN with 5, 6 and 1 neurons in the input, hidden and output layer correspondingly. The performance of selected ANN has been 0.99996, 4.44E-05, 0.00353 and 0.0067 which are correlation coefficient (R), mean square error (MSE), mean absolute error (MAE) and root mean square error (RMSE) respectively. The research presented or summarized above throws light on future scope on stabilization with waste glass combined with different percentages of eggshell that leads to the economical design of CBR acceptable to pavement sub-base or base, as desired.Keywords: CBR, artificial neural network, liquid limit, plastic limit, maximum dry density, OMC
Procedia PDF Downloads 1901993 Application of Post-Stack and Pre-Stack Seismic Inversion for Prediction of Hydrocarbon Reservoirs in a Persian Gulf Gas Field
Authors: Nastaran Moosavi, Mohammad Mokhtari
Abstract:
Seismic inversion is a technique which has been in use for years and its main goal is to estimate and to model physical characteristics of rocks and fluids. Generally, it is a combination of seismic and well-log data. Seismic inversion can be carried out through different methods; we have conducted and compared post-stack and pre- stack seismic inversion methods on real data in one of the fields in the Persian Gulf. Pre-stack seismic inversion can transform seismic data to rock physics such as P-impedance, S-impedance and density. While post- stack seismic inversion can just estimate P-impedance. Then these parameters can be used in reservoir identification. Based on the results of inverting seismic data, a gas reservoir was detected in one of Hydrocarbon oil fields in south of Iran (Persian Gulf). By comparing post stack and pre-stack seismic inversion it can be concluded that the pre-stack seismic inversion provides a more reliable and detailed information for identification and prediction of hydrocarbon reservoirs.Keywords: density, p-impedance, s-impedance, post-stack seismic inversion, pre-stack seismic inversion
Procedia PDF Downloads 3231992 Reburning Characteristics of Biomass Syngas in a Pilot Scale Heavy Oil Furnace
Authors: Sang Heon Han, Daejun Chang, Won Yang
Abstract:
NOx reduction characteristics of syngas fuel were numerically investigated for the 2MW pilot scale heavy oil furnace of KITECH (Korea Institute of Industrial Technology). The secondary fuel and syngas was fed into the furnace with two purposes- partial replacement of main fuel and reburning of NOx. Some portion of syngas was fed into the flame zone to partially replace the heavy oil, while the other portion was fed into the furnace downstream to reduce NOx generation. The numerical prediction was verified by comparing it with the experimental results. Syngas of KITECH’s experiment, assumed to be produced from biomass, had very low calorific value and contained 3% hydrocarbon. This study investigated the precise behavior of NOx generation and NOx reduction as well as thermo-fluidic characteristics inside the furnace, which was unavailable with experiment. In addition to 3% hydrocarbon syngas, 5%, and 7% hydrocarbon syngas were numerically tested as reburning fuels to analyze the effect of hydrocarbon proportion to NOx reduction. The prediction showed that the 3% hydrocarbon syngas is as much effective as 7% hydrocarbon syngas in reducing NOx.Keywords: syngas, reburning, heavy oil, furnace
Procedia PDF Downloads 4441991 Current Methods for Drug Property Prediction in the Real World
Authors: Jacob Green, Cecilia Cabrera, Maximilian Jakobs, Andrea Dimitracopoulos, Mark van der Wilk, Ryan Greenhalgh
Abstract:
Predicting drug properties is key in drug discovery to enable de-risking of assets before expensive clinical trials and to find highly active compounds faster. Interest from the machine learning community has led to the release of a variety of benchmark datasets and proposed methods. However, it remains unclear for practitioners which method or approach is most suitable, as different papers benchmark on different datasets and methods, leading to varying conclusions that are not easily compared. Our large-scale empirical study links together numerous earlier works on different datasets and methods, thus offering a comprehensive overview of the existing property classes, datasets, and their interactions with different methods. We emphasise the importance of uncertainty quantification and the time and, therefore, cost of applying these methods in the drug development decision-making cycle. To the best of the author's knowledge, it has been observed that the optimal approach varies depending on the dataset and that engineered features with classical machine learning methods often outperform deep learning. Specifically, QSAR datasets are typically best analysed with classical methods such as Gaussian Processes, while ADMET datasets are sometimes better described by Trees or deep learning methods such as Graph Neural Networks or language models. Our work highlights that practitioners do not yet have a straightforward, black-box procedure to rely on and sets a precedent for creating practitioner-relevant benchmarks. Deep learning approaches must be proven on these benchmarks to become the practical method of choice in drug property prediction.Keywords: activity (QSAR), ADMET, classical methods, drug property prediction, empirical study, machine learning
Procedia PDF Downloads 811990 Regression Model Evaluation on Depth Camera Data for Gaze Estimation
Authors: James Purnama, Riri Fitri Sari
Abstract:
We investigate the machine learning algorithm selection problem in the term of a depth image based eye gaze estimation, with respect to its essential difficulty in reducing the number of required training samples and duration time of training. Statistics based prediction accuracy are increasingly used to assess and evaluate prediction or estimation in gaze estimation. This article evaluates Root Mean Squared Error (RMSE) and R-Squared statistical analysis to assess machine learning methods on depth camera data for gaze estimation. There are 4 machines learning methods have been evaluated: Random Forest Regression, Regression Tree, Support Vector Machine (SVM), and Linear Regression. The experiment results show that the Random Forest Regression has the lowest RMSE and the highest R-Squared, which means that it is the best among other methods.Keywords: gaze estimation, gaze tracking, eye tracking, kinect, regression model, orange python
Procedia PDF Downloads 5381989 Rail Degradation Modelling Using ARMAX: A Case Study Applied to Melbourne Tram System
Authors: M. Karimpour, N. Elkhoury, L. Hitihamillage, S. Moridpour, R. Hesami
Abstract:
There is a necessity among rail transportation authorities for a superior understanding of the rail track degradation overtime and the factors influencing rail degradation. They need an accurate technique to identify the time when rail tracks fail or need maintenance. In turn, this will help to increase the level of safety and comfort of the passengers and the vehicles as well as improve the cost effectiveness of maintenance activities. An accurate model can play a key role in prediction of the long-term behaviour of railroad tracks. An accurate model can decrease the cost of maintenance. In this research, the rail track degradation is predicted using an autoregressive moving average with exogenous input (ARMAX). An ARMAX has been implemented on Melbourne tram data to estimate the values for the tram track degradation. Gauge values and rail usage in Million Gross Tone (MGT) are the main parameters used in the model. The developed model can accurately predict the future status of the tram tracks.Keywords: ARMAX, dynamic systems, MGT, prediction, rail degradation
Procedia PDF Downloads 2481988 Evaluation of Short-Term Load Forecasting Techniques Applied for Smart Micro-Grids
Authors: Xiaolei Hu, Enrico Ferrera, Riccardo Tomasi, Claudio Pastrone
Abstract:
Load Forecasting plays a key role in making today's and future's Smart Energy Grids sustainable and reliable. Accurate power consumption prediction allows utilities to organize in advance their resources or to execute Demand Response strategies more effectively, which enables several features such as higher sustainability, better quality of service, and affordable electricity tariffs. It is easy yet effective to apply Load Forecasting at larger geographic scale, i.e. Smart Micro Grids, wherein the lower available grid flexibility makes accurate prediction more critical in Demand Response applications. This paper analyses the application of short-term load forecasting in a concrete scenario, proposed within the EU-funded GreenCom project, which collect load data from single loads and households belonging to a Smart Micro Grid. Three short-term load forecasting techniques, i.e. linear regression, artificial neural networks, and radial basis function network, are considered, compared, and evaluated through absolute forecast errors and training time. The influence of weather conditions in Load Forecasting is also evaluated. A new definition of Gain is introduced in this paper, which innovatively serves as an indicator of short-term prediction capabilities of time spam consistency. Two models, 24- and 1-hour-ahead forecasting, are built to comprehensively compare these three techniques.Keywords: short-term load forecasting, smart micro grid, linear regression, artificial neural networks, radial basis function network, gain
Procedia PDF Downloads 4681987 Water Leakage Detection System of Pipe Line using Radial Basis Function Neural Network
Authors: A. Ejah Umraeni Salam, M. Tola, M. Selintung, F. Maricar
Abstract:
Clean water is an essential and fundamental human need. Therefore, its supply must be assured by maintaining the quality, quantity and water pressure. However the fact is, on its distribution system, leakage happens and becomes a common world issue. One of the technical causes of the leakage is a leaking pipe. The purpose of the research is how to use the Radial Basis Function Neural (RBFNN) model to detect the location and the magnitude of the pipeline leakage rapidly and efficiently. In this study the RBFNN are trained and tested on data from EPANET hydraulic modeling system. Method of Radial Basis Function Neural Network is proved capable to detect location and magnitude of pipeline leakage with of the accuracy of the prediction results based on the value of RMSE (Root Meant Square Error), comparison prediction and actual measurement approaches 0.000049 for the whole pipeline system.Keywords: radial basis function neural network, leakage pipeline, EPANET, RMSE
Procedia PDF Downloads 3581986 Probabilistic Crash Prediction and Prevention of Vehicle Crash
Authors: Lavanya Annadi, Fahimeh Jafari
Abstract:
Transportation brings immense benefits to society, but it also has its costs. Costs include such as the cost of infrastructure, personnel and equipment, but also the loss of life and property in traffic accidents on the road, delays in travel due to traffic congestion and various indirect costs in terms of air transport. More research has been done to identify the various factors that affect road accidents, such as road infrastructure, traffic, sociodemographic characteristics, land use, and the environment. The aim of this research is to predict the probabilistic crash prediction of vehicles using machine learning due to natural and structural reasons by excluding spontaneous reasons like overspeeding etc., in the United States. These factors range from weather factors, like weather conditions, precipitation, visibility, wind speed, wind direction, temperature, pressure, and humidity to human made structures like road structure factors like bump, roundabout, no exit, turning loop, give away, etc. Probabilities are dissected into ten different classes. All the predictions are based on multiclass classification techniques, which are supervised learning. This study considers all crashes that happened in all states collected by the US government. To calculate the probability, multinomial expected value was used and assigned a classification label as the crash probability. We applied three different classification models, including multiclass Logistic Regression, Random Forest and XGBoost. The numerical results show that XGBoost achieved a 75.2% accuracy rate which indicates the part that is being played by natural and structural reasons for the crash. The paper has provided in-deep insights through exploratory data analysis.Keywords: road safety, crash prediction, exploratory analysis, machine learning
Procedia PDF Downloads 1111985 Solid State Drive End to End Reliability Prediction, Characterization and Control
Authors: Mohd Azman Abdul Latif, Erwan Basiron
Abstract:
A flaw or drift from expected operational performance in one component (NAND, PMIC, controller, DRAM, etc.) may affect the reliability of the entire Solid State Drive (SSD) system. Therefore, it is important to ensure the required quality of each individual component through qualification testing specified using standards or user requirements. Qualification testing is time-consuming and comes at a substantial cost for product manufacturers. A highly technical team, from all the eminent stakeholders is embarking on reliability prediction from beginning of new product development, identify critical to reliability parameters, perform full-blown characterization to embed margin into product reliability and establish control to ensure the product reliability is sustainable in the mass production. The paper will discuss a comprehensive development framework, comprehending SSD end to end from design to assembly, in-line inspection, in-line testing and will be able to predict and to validate the product reliability at the early stage of new product development. During the design stage, the SSD will go through intense reliability margin investigation with focus on assembly process attributes, process equipment control, in-process metrology and also comprehending forward looking product roadmap. Once these pillars are completed, the next step is to perform process characterization and build up reliability prediction modeling. Next, for the design validation process, the reliability prediction specifically solder joint simulator will be established. The SSD will be stratified into Non-Operating and Operating tests with focus on solder joint reliability and connectivity/component latent failures by prevention through design intervention and containment through Temperature Cycle Test (TCT). Some of the SSDs will be subjected to the physical solder joint analysis called Dye and Pry (DP) and Cross Section analysis. The result will be feedbacked to the simulation team for any corrective actions required to further improve the design. Once the SSD is validated and is proven working, it will be subjected to implementation of the monitor phase whereby Design for Assembly (DFA) rules will be updated. At this stage, the design change, process and equipment parameters are in control. Predictable product reliability at early product development will enable on-time sample qualification delivery to customer and will optimize product development validation, effective development resource and will avoid forced late investment to bandage the end-of-life product failures. Understanding the critical to reliability parameters earlier will allow focus on increasing the product margin that will increase customer confidence to product reliability.Keywords: e2e reliability prediction, SSD, TCT, solder joint reliability, NUDD, connectivity issues, qualifications, characterization and control
Procedia PDF Downloads 1741984 Application of Artificial Neural Network for Prediction of High Tensile Steel Strands in Post-Tensioned Slabs
Authors: Gaurav Sancheti
Abstract:
This study presents an impacting approach of Artificial Neural Networks (ANNs) in determining the quantity of High Tensile Steel (HTS) strands required in post-tensioned (PT) slabs. Various PT slab configurations were generated by varying the span and depth of the slab. For each of these slab configurations, quantity of required HTS strands were recorded. ANNs with backpropagation algorithm and varying architectures were developed and their performance was evaluated in terms of Mean Square Error (MSE). The recorded data for the quantity of HTS strands was used as a feeder database for training the developed ANNs. The networks were validated using various validation techniques. The results show that the proposed ANNs have a great potential with good prediction and generalization capability.Keywords: artificial neural networks, back propagation, conceptual design, high tensile steel strands, post tensioned slabs, validation techniques
Procedia PDF Downloads 2211983 Predicting Bridge Pier Scour Depth with SVM
Authors: Arun Goel
Abstract:
Prediction of maximum local scour is necessary for the safety and economical design of the bridges. A number of equations have been developed over the years to predict local scour depth using laboratory data and a few pier equations have also been proposed using field data. Most of these equations are empirical in nature as indicated by the past publications. In this paper, attempts have been made to compute local depth of scour around bridge pier in dimensional and non-dimensional form by using linear regression, simple regression and SVM (Poly and Rbf) techniques along with few conventional empirical equations. The outcome of this study suggests that the SVM (Poly and Rbf) based modeling can be employed as an alternate to linear regression, simple regression and the conventional empirical equations in predicting scour depth of bridge piers. The results of present study on the basis of non-dimensional form of bridge pier scour indicates the improvement in the performance of SVM (Poly and Rbf) in comparison to dimensional form of scour.Keywords: modeling, pier scour, regression, prediction, SVM (Poly and Rbf kernels)
Procedia PDF Downloads 4511982 Predicting Global Solar Radiation Using Recurrent Neural Networks and Climatological Parameters
Authors: Rami El-Hajj Mohamad, Mahmoud Skafi, Ali Massoud Haidar
Abstract:
Several meteorological parameters were used for the prediction of monthly average daily global solar radiation on horizontal using recurrent neural networks (RNNs). Climatological data and measures, mainly air temperature, humidity, sunshine duration, and wind speed between 1995 and 2007 were used to design and validate a feed forward and recurrent neural network based prediction systems. In this paper we present our reference system based on a feed-forward multilayer perceptron (MLP) as well as the proposed approach based on an RNN model. The obtained results were promising and comparable to those obtained by other existing empirical and neural models. The experimental results showed the advantage of RNNs over simple MLPs when we deal with time series solar radiation predictions based on daily climatological data.Keywords: recurrent neural networks, global solar radiation, multi-layer perceptron, gradient, root mean square error
Procedia PDF Downloads 4441981 A Study on Performance Prediction in Early Design Stage of Apartment Housing Using Machine Learning
Authors: Seongjun Kim, Sanghoon Shim, Jinwooung Kim, Jaehwan Jung, Sung-Ah Kim
Abstract:
As the development of information and communication technology, the convergence of machine learning of the ICT area and design is attempted. In this way, it is possible to grasp the correlation between various design elements, which was difficult to grasp, and to reflect this in the design result. In architecture, there is an attempt to predict the performance, which is difficult to grasp in the past, by finding the correlation among multiple factors mainly through machine learning. In architectural design area, some attempts to predict the performance affected by various factors have been tried. With machine learning, it is possible to quickly predict performance. The aim of this study is to propose a model that predicts performance according to the block arrangement of apartment housing through machine learning and the design alternative which satisfies the performance such as the daylight hours in the most similar form to the alternative proposed by the designer. Through this study, a designer can proceed with the design considering various design alternatives and accurate performances quickly from the early design stage.Keywords: apartment housing, machine learning, multi-objective optimization, performance prediction
Procedia PDF Downloads 4811980 Prediction of Heavy-Weight Impact Noise and Vibration of Floating Floor Using Modified Impact Spectrum
Authors: Ju-Hyung Kim, Dae-Ho Mun, Hong-Gun Park
Abstract:
When an impact is applied to a floating floor, noise and vibration response of high-frequency range is reduced effectively, while amplifies the response at low-frequency range. This means floating floor can make worse noise condition when heavy-weight impact is applied. The amplified response is the result of interaction between finishing layer (mortar plate) and concrete slab. Because an impact force is not directly delivered to concrete slab, the impact force waveform or spectrum can be changed. In this paper, the changed impact spectrum was derived from several floating floor vibration tests. Based on the measured data, numerical modeling can describe the floating floor response, especially at low-frequency range. As a result, heavy-weight impact noise can be predicted using modified impact spectrum.Keywords: floating floor, heavy-weight impact, prediction, vibration
Procedia PDF Downloads 3721979 Predicting and Obtaining New Solvates of Curcumin, Demethoxycurcumin and Bisdemethoxycurcumin Based on the Ccdc Statistical Tools and Hansen Solubility Parameters
Authors: J. Ticona Chambi, E. A. De Almeida, C. A. Andrade Raymundo Gaiotto, A. M. Do Espírito Santo, L. Infantes, S. L. Cuffini
Abstract:
The solubility of active pharmaceutical ingredients (APIs) is challenging for the pharmaceutical industry. The new multicomponent crystalline forms as cocrystal and solvates present an opportunity to improve the solubility of APIs. Commonly, the procedure to obtain multicomponent crystalline forms of a drug starts by screening the drug molecule with the different coformers/solvents. However, it is necessary to develop methods to obtain multicomponent forms in an efficient way and with the least possible environmental impact. The Hansen Solubility Parameters (HSPs) is considered a tool to obtain theoretical knowledge of the solubility of the target compound in the chosen solvent. H-Bond Propensity (HBP), Molecular Complementarity (MC), Coordination Values (CV) are tools used for statistical prediction of cocrystals developed by the Cambridge Crystallographic Data Center (CCDC). The HSPs and the CCDC tools are based on inter- and intra-molecular interactions. The curcumin (Cur), target molecule, is commonly used as an anti‐inflammatory. The demethoxycurcumin (Demcur) and bisdemethoxycurcumin (Bisdcur) are natural analogues of Cur from turmeric. Those target molecules have differences in their solubilities. In this way, the work aimed to analyze and compare different tools for multicomponent forms prediction (solvates) of Cur, Demcur and Biscur. The HSP values were calculated for Cur, Demcur, and Biscur using the chemical group contribution methods and the statistical optimization from experimental data. The HSPmol software was used. From the HSPs of the target molecules and fifty solvents (listed in the HSP books), the relative energy difference (RED) was determined. The probability of the target molecules would be interacting with the solvent molecule was determined using the CCDC tools. A dataset of fifty molecules of different organic solvents was ranked for each prediction method and by a consensus ranking of different combinations: HSP, CV, HBP and MC values. Based on the prediction, 15 solvents were selected as Dimethyl Sulfoxide (DMSO), Tetrahydrofuran (THF), Acetonitrile (ACN), 1,4-Dioxane (DOX) and others. In a starting analysis, the slow evaporation technique from 50°C at room temperature and 4°C was used to obtain solvates. The single crystals were collected by using a Bruker D8 Venture diffractometer, detector Photon100. The data processing and crystal structure determination were performed using APEX3 and Olex2-1.5 software. According to the results, the HSPs (theoretical and optimized) and the Hansen solubility sphere for Cur, Demcur and Biscur were obtained. With respect to prediction analyses, a way to evaluate the predicting method was through the ranking and the consensus ranking position of solvates already reported in the literature. It was observed that the combination of HSP-CV obtained the best results when compared to the other methods. Furthermore, as a result of solvent selected, six new solvates, Cur-DOX, Cur-DMSO, Bicur-DOX, Bircur-THF, Demcur-DOX, Demcur-ACN and a new Biscur hydrate, were obtained. Crystal structures were determined for Cur-DOX, Biscur-DOX, Demcur-DOX and Bicur-Water. Moreover, the unit-cell parameter information for Cur-DMSO, Biscur-THF and Demcur-ACN were obtained. The preliminary results showed that the prediction method is showing a promising strategy to evaluate the possibility of forming multicomponent. It is currently working on obtaining multicomponent single crystals.Keywords: curcumin, HSPs, prediction, solvates, solubility
Procedia PDF Downloads 631978 Prediction of in situ Permeability for Limestone Rock Using Rock Quality Designation Index
Authors: Ahmed T. Farid, Muhammed Rizwan
Abstract:
Geotechnical study for evaluating soil or rock permeability is a highly important parameter. Permeability values for rock formations are more difficult for determination than soil formation as it is an effect of the rock quality and its fracture values. In this research, the prediction of in situ permeability of limestone rock formations was predicted. The limestone rock permeability was evaluated using Lugeon tests (in-situ packer permeability). Different sites which spread all over the Riyadh region of Saudi Arabia were chosen to conduct our study of predicting the in-situ permeability of limestone rock. Correlations were deducted between the values of in-situ permeability of the limestone rock with the value of the rock quality designation (RQD) calculated during the execution of the boreholes of the study areas. The study was performed for different ranges of RQD values measured during drilling of the sites boreholes. The developed correlations are recommended for the onsite determination of the in-situ permeability of limestone rock only. For the other sedimentary formations of rock, more studies are needed for predicting the actual correlations related to each type.Keywords: In situ, packer, permeability, rock, quality
Procedia PDF Downloads 372