Search results for: Grey prediction model
17832 Reliability Prediction of Tires Using Linear Mixed-Effects Model
Authors: Myung Hwan Na, Ho-Chun Song, EunHee Hong
Abstract:
The normal linear mixed-effects model is widely used to analyze repeated-measurement data. When heteroscedasticity and non-normality of the population distribution are both present, the normal linear mixed-effects model can give improper results. To achieve more robust estimation, we use a heavy-tailed linear mixed-effects model, which gives more exact and reliable conclusions than the standard normal linear mixed-effects model.
Keywords: reliability, tires, field data, linear mixed-effects model
Procedia PDF Downloads 564
17831 Effect of Genuine Missing Data Imputation on Prediction of Urinary Incontinence
Authors: Suzan Arslanturk, Mohammad-Reza Siadat, Theophilus Ogunyemi, Ananias Diokno
Abstract:
Missing data is a common challenge in statistical analyses of most clinical survey datasets. A variety of methods have been developed to enable analysis of survey data with missing values, and imputation is the most commonly used among them. However, to minimize the bias introduced by imputation, one must choose the right imputation technique and apply it to the correct type of missing data. In this paper, we have identified different types of missing values: missing data due to skip pattern (SPMD), undetermined missing data (UMD), and genuine missing data (GMD), and applied rough set imputation to only the GMD portion of the missing data. We have used rough set imputation to evaluate the effect of such imputation on prediction by generating several simulation datasets based on an existing epidemiological dataset (MESA). To measure how well each dataset lends itself to the prediction model (logistic regression), we have used p-values from the Wald test. To evaluate the accuracy of the prediction, we have considered the width of the 95% confidence interval for the probability of incontinence. Both imputed and non-imputed simulation datasets were fit to the prediction model, and both turned out to be significant (p-value < 0.05). However, the Wald score shows a better fit for the imputed than for the non-imputed datasets (28.7 vs. 23.4). The average confidence interval width decreased by 10.4% when the imputed dataset was used, meaning higher precision. The results show that using the rough set method for missing data imputation on GMD improves the predictive capability of the logistic regression. Further studies are required to generalize this conclusion to other clinical survey datasets.
Keywords: rough set, imputation, clinical survey data simulation, genuine missing data, predictive index
Procedia PDF Downloads 169
17830 Rainfall-Runoff Forecasting Utilizing Genetic Programming Technique
Authors: Ahmed Najah Ahmed Al-Mahfoodh, Ali Najah Ahmed Al-Mahfoodh, Ahmed Al-Shafie
Abstract:
In this study, the genetic programming (GP) technique is investigated for predicting a set of rainfall-runoff data. Sensitivity analysis was adopted to assess the effect of the input parameters on the model. To evaluate the performance of the proposed model, three statistical indexes were used, namely Correlation Coefficient (CC), Mean Square Error (MSE) and Correlation of Efficiency (CE). The principal aim of this study is to develop a computationally efficient and robust approach for predicting rainfall-runoff, which could reduce the cost and labour of measuring these parameters. This research concentrates on the Johor River in Johor State, Malaysia.
Keywords: genetic programming, prediction, rainfall-runoff, Malaysia
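The three indexes named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes CC is the Pearson correlation and that CE is the Nash-Sutcliffe style coefficient of efficiency commonly used in hydrology (the abstract does not give its formula). The function names and the sample values are hypothetical.

```python
import math

def mean_square_error(observed, simulated):
    """Average squared difference between observed and simulated runoff."""
    return sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed)

def correlation_coefficient(observed, simulated):
    """Pearson correlation (assumed definition of CC)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov / math.sqrt(var_o * var_s)

def coefficient_of_efficiency(observed, simulated):
    """Nash-Sutcliffe style efficiency (assumed definition of CE): 1 is a perfect fit."""
    mean_o = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - residual / spread

# Hypothetical runoff values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 3.0, 4.0, 5.0]
print(mean_square_error(observed, simulated))
print(correlation_coefficient(observed, simulated))
print(coefficient_of_efficiency(observed, simulated))
```

A constant offset like the one in the sample leaves CC at 1 while lowering CE, which is one reason to report more than one index.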
Procedia PDF Downloads 483
17829 Epileptic Seizure Prediction by Exploiting Signal Transitions Phenomena
Authors: Mohammad Zavid Parvez, Manoranjan Paul
Abstract:
A seizure prediction method is proposed that extracts global features, using phase correlation between adjacent epochs to detect relative changes, and local features, using fluctuation/deviation within an epoch to determine fine changes in different EEG signals. A classifier and a regularization technique are applied to reduce false alarms and improve the overall prediction accuracy. The experiments show that the proposed method outperforms the state-of-the-art methods and provides high prediction accuracy (i.e., 97.70%) with a low false alarm rate using EEG signals from different brain locations in a benchmark data set.
Keywords: epilepsy, seizure, phase correlation, fluctuation, deviation
Procedia PDF Downloads 467
17828 Protein Tertiary Structure Prediction by a Multiobjective Optimization and Neural Network Approach
Authors: Alexandre Barbosa de Almeida, Telma Woerle de Lima Soares
Abstract:
Protein structure prediction is a challenging task in the bioinformatics field. The biological function of all proteins relies largely on the shape of their three-dimensional conformational structure, but less than 1% of all known proteins in the world have their structure solved. This work proposes a deep learning model to address this problem, attempting to predict some aspects of the protein conformations. Through a process of multiobjective dominance, a recurrent neural network was trained to abstract the particular bias of each individual multiobjective algorithm, generating a heuristic that could be useful for predicting some of the relevant aspects of the formation of the three-dimensional conformation, known as protein folding.
Keywords: ab initio heuristic modeling, multiobjective optimization, protein structure prediction, recurrent neural network
Procedia PDF Downloads 206
17827 Review: Wavelet New Tool for Path Loss Prediction
Authors: Danladi Ali, Abdullahi Mukaila
Abstract:
In this work, GSM signal strength (power) was monitored in an indoor environment. Samples of the GSM signal strength were measured on mobile equipment (ME). A one-dimensional multilevel wavelet is used to predict the fading phenomenon of the measured GSM signal, and neural network clustering is used to determine the average power received in the study area. The wavelet prediction revealed that the GSM signal is attenuated due to the fast fading phenomenon, which fades about 7 times faster than the radio wavelength, while the neural network clustering determined that -75 dBm appeared most frequently, followed by -85 dBm. The work revealed that a significant part of the measured signal is dominated by weak signal and that the signal followed a Rayleigh distribution more closely than a Gaussian one. This confirmed the wavelet prediction.
Keywords: decomposition, clustering, propagation, model, wavelet, signal strength and spectral efficiency
Procedia PDF Downloads 449
17826 Morphometry of Cervical Spinal Cord in Rabbit Using Design-Based Stereology
Authors: Hamed Chavoshi Pour, Javad Sadeghinejad
Abstract:
The spinal cord is a long structure that starts at the end of the medulla oblongata and is located within the vertebral canal. Physiologically, the spinal cord connects the brain with the peripheral nervous system for sensory and motor activities. The cervical spinal cord is an area of particular interest in medicine and veterinary medicine due to the high prevalence of diseases in this region. This study describes the morphometric features of the cervical spinal cord in rabbits using design-unbiased stereology. The cervical spinal cords of five male rabbits were dissected, and slabs were taken according to systematic uniform random sampling. Each slab was embedded in paraffin, cut into a 6-µm-thick section, and stained with 0.1% cresyl violet for stereological estimations. The total spinal cord volume and the volume fractions of grey and white matter and of the dorsal and ventral horns were estimated using point counting and Cavalieri's estimator. The total cervical spinal cord volume was 0.98 ± 0.07 cm³. The relative volumes of white matter and grey matter were 70.6 ± 1.7% and 29.31 ± 1.67%, respectively. The dorsal horn and ventral horn volumes were 13.86 ± 1.36% and 14.9 ± 0.62% of the whole cervical spinal cord, respectively. These findings on the rabbit spinal cord may serve as a foundation for a translational model in experimental spinal cord research and provide basic data for the diagnosis and treatment of spinal cord disorders.
Keywords: stereology, spinal cord, rabbit, cervical
Procedia PDF Downloads 77
17825 Digital Structural Monitoring Tools @ADaPT for Cracks Initiation and Growth due to Mechanical Damage Mechanism
Authors: Faizul Azly Abd Dzubir, Muhammad F. Othman
Abstract:
The conventional structural health monitoring approach for mechanical equipment uses inspection data from Non-Destructive Testing (NDT) during plant shutdown windows, together with fitness-for-service evaluation, to estimate the integrity of equipment that is prone to crack damage. Yet this forecast is fraught with uncertainty because it is often based on assumptions about future operational parameters, and the prediction is not continuous or online. Advanced Diagnostic and Prognostic Technology (ADaPT) uses Acoustic Emission (AE) technology and a stochastic prognostic model to provide real-time monitoring and prediction of mechanical defects or cracks. The forecast can help the plant authority handle cracked equipment before it ruptures and causes an unscheduled shutdown of the facility. ADaPT employs process historical data trending, finite element analysis, fitness for service, and probabilistic statistical analysis to develop a prediction model for crack initiation and growth due to mechanical damage. The prediction model is combined with live equipment operating data for real-time prediction of the remaining life span owing to fracture. ADaPT was devised at a hot combined feed exchanger (HCFE) that had suffered creep crack damage. The ADaPT tool predicted the initiation of a crack at the top weldment area by April 2019. During the shutdown window in April 2019, a crack was discovered and repaired. Furthermore, ADaPT successfully advised the plant owner to run at full capacity and improve output by up to 7% by April 2019. ADaPT was also used on a coke drum that had extensive fatigue cracking. The initial cracks were declared safe with ADaPT, with remaining crack lifetimes extended by another five (5) months, just in time for another planned facility downtime in which to carry out the repair.
The prediction model, when combined with plant information data, allows plant operators to continuously monitor crack propagation caused by mechanical damage for improved maintenance planning and to avoid costly shutdowns for immediate repair.
Keywords: mechanical damage, cracks, continuous monitoring tool, remaining life, acoustic emission, prognostic model
Procedia PDF Downloads 77
17824 Prediction of Cutting Tool Life in Drilling of Reinforced Aluminum Alloy Composite Using a Fuzzy Method
Authors: Mohammed T. Hayajneh
Abstract:
Machining of Metal Matrix Composites (MMCs) is a very significant process and has been a major problem that has drawn many researchers to investigate the characteristics of MMCs during different machining processes. The poor machining properties of hard-particle-reinforced MMCs make the drilling process a rather interesting task. Unlike in the drilling of conventional materials, serious problems such as tool wear and cutting forces can be encountered during the drilling of MMCs. Cutting tool wear is a very significant concern in industry: it not only influences the quality of the drilled hole but also affects the cutting tool life. Predicting the cutting tool life during drilling is essential for optimizing the cutting conditions. However, the relationship between tool life and cutting conditions, tool geometrical factors and workpiece material properties has not yet been established by any machining theory. In this research work, a fuzzy subtractive clustering system has been used to model the cutting tool life in drilling of Al2O3 particle reinforced aluminum alloy composite in order to investigate the effect of cutting conditions on cutting tool life. This investigation can help in controlling and optimizing the cutting conditions when the process parameters are adjusted. The model built for predicting tool life uses drill diameter, cutting speed, and cutting feed rate as input data. The validity of the model was confirmed by experiments under various cutting conditions. Experimental results have shown that the model predicts cutting tool life efficiently.
Keywords: composite, fuzzy, tool life, wear
Procedia PDF Downloads 297
17823 Wildland Fire in Terai Arc Landscape of Lesser Himalayas Threatening the Tiger Habitat
Authors: Amit Kumar Verma
Abstract:
The present study deals with a fire prediction model for the Terai Arc Landscape (TAL), one of the most dramatic ecosystems in Asia, where large, wide-ranging species such as tigers, rhinos, and elephants can thrive while bringing economic benefits to the local people. Forest fires cause huge economic and ecological losses and release considerable quantities of carbon into the air, making them an important factor inflating the global burden of carbon emissions. Forest fire is also an important factor in the behavioral and ecological habits of the tiger in the wild. Post-fire changes in the micro and macro habitat directly affect the tiger habitat. Vulnerability to fire shows in changes to the microhabitat (humus, soil profile, litter, vegetation, grassland ecosystem). Microorganisms such as spiders, annelids, arthropods and other favorable microorganisms are directly affected by forest fire, and indirectly all of these microorganisms are responsible for the development of the tiger (Panthera tigris) habitat. On the other hand, fire depletes prey species and drives tigers from the wild into human-dominated areas, which may lead to conflict that is dangerous for both tigers and human beings. Early forest fire prediction through mapping of the risk zones can help minimize fire frequency and manage forest fires, thereby minimizing losses. Satellite data plays a vital role in identifying and mapping forest fires and in recording the frequency with which different vegetation types are affected. Thematic hazard maps have been generated using the IDW technique. A prediction model for fire occurrence is developed for TAL. The fire occurrence records for 2000 to 2014 were collected from the state forest department. Discriminant function models were used to develop the prediction model for forest fires in TAL, and random points for non-occurrence of fire were generated. Based on the attributes of the points of occurrence and non-occurrence, the model predicts fire occurrence.
The map of predicted probabilities classified the study area into five classes based on the intensity of hazard: very high (12.94%), high (23.63%), moderate (25.87%), low (27.46%) and no fire (10.1%). The model is able to classify 78.73 percent of points correctly and hence can be used for the purpose with confidence. Overall, the model also works correctly for almost 69% of points. This study exemplifies the usefulness of a forest fire prediction model and offers a more effective way to manage forest fire. Overall, this study presents a model for the conservation of the tiger's natural habitat and of forests, which will benefit both wildlife and human beings in the future.
Keywords: fire prediction model, forest fire hazard, GIS, landsat, MODIS, TAL
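The abstract mentions that the hazard maps were generated with the IDW (inverse distance weighting) technique but gives no details. The sketch below shows the basic IDW estimate at a single location, assuming a power parameter of 2 and planar coordinates; the function name, the sample points and the values are hypothetical and are not taken from the study.

```python
import math

def idw_estimate(samples, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    samples is a list of (sx, sy, value) tuples. Each sample is weighted by
    1 / distance**power, so nearby samples dominate the estimate.
    """
    if not samples:
        raise ValueError("at least one sample point is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            # The query point coincides with a sample: return its value directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical fire-frequency observations at two locations.
points = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
print(idw_estimate(points, 1.0, 0.0))  # midway between the two samples
```

Evaluating this function over a regular grid of locations would produce a raster surface of the kind used for thematic mapping.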
Procedia PDF Downloads 352
17822 Hard Disk Failure Predictions in Supercomputing System Based on CNN-LSTM and Oversampling Technique
Authors: Yingkun Huang, Li Guo, Zekang Lan, Kai Tian
Abstract:
Hard disk drive (HDD) failure in an exascale supercomputing system may lead to service interruption, invalidate previous calculations, and cause permanent data loss. Therefore, initiating corrective actions before hard drive failures materialize is critical to the continued operation of jobs. In this paper, a highly accurate analysis model based on CNN-LSTM and an oversampling technique is proposed, which can correctly predict the need for a disk replacement as much as ten days in advance. Generally, learning-based methods perform poorly on a training dataset with a long-tail distribution, and fault prediction is a classic case of this because failure data are scarce. To overcome this problem, a new oversampling method was employed to augment the data, and then an improved CNN-LSTM with a shortcut was built to learn more effective features. The shortcut transmits the results of the previous CNN layer, which are used as the input of the LSTM model after weighted fusion with the output of the next layer. Finally, a detailed empirical comparison of 6 prediction methods on a public dataset is presented and discussed. The experiments indicate that the proposed method predicts disk failure with 0.91 Precision, 0.91 Recall, 0.91 F-measure, and 0.90 MCC for a 10-day prediction horizon. Thus, the proposed algorithm is efficient for predicting HDD failure in supercomputing.
Keywords: HDD replacement, failure, CNN-LSTM, oversampling, prediction
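The four scores reported in this abstract are all derived from a binary confusion matrix. The following sketch shows the standard definitions; the confusion counts are made up for illustration and do not come from the paper, and the function name is hypothetical.

```python
import math

def classification_scores(tp, fp, fn, tn):
    """Precision, recall, F-measure and Matthews correlation coefficient (MCC)
    from confusion counts: true/false positives and false/true negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator
    return precision, recall, f_measure, mcc

# Hypothetical counts for an imbalanced disk-failure test set.
precision, recall, f_measure, mcc = classification_scores(tp=90, fp=10, fn=10, tn=890)
print(precision, recall, f_measure, mcc)
```

MCC uses all four cells of the confusion matrix, which is why it is often reported alongside precision and recall when failures are rare.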
Procedia PDF Downloads 81
17821 A Predictive Model for Turbulence Evolution and Mixing Using Machine Learning
Authors: Yuhang Wang, Jorg Schluter, Sergiy Shelyag
Abstract:
The high cost associated with high-resolution computational fluid dynamics (CFD) is one of the main challenges that inhibit the design, development, and optimisation of new combustion systems adapted for renewable fuels. In this study, we propose a physics-guided CNN-based model to predict turbulence evolution and mixing without requiring a traditional CFD solver. The model architecture is built upon U-Net and the inception module, while a physics-guided loss function is designed by introducing two additional physical constraints to allow for the conservation of both mass and pressure over the entire predicted flow fields. Then, the model is trained on the Large Eddy Simulation (LES) results of a natural turbulent mixing layer with two different Reynolds number cases (Re = 3000 and 30000). As a result, the model prediction shows an excellent agreement with the corresponding CFD solutions in terms of both spatial distributions and temporal evolution of turbulent mixing. Such promising model prediction performance opens up the possibilities of doing accurate high-resolution manifold-based combustion simulations at a low computational cost for accelerating the iterative design process of new combustion systems.
Keywords: computational fluid dynamics, turbulence, machine learning, combustion modelling
Procedia PDF Downloads 92
17820 Stock Price Prediction with 'Earnings' Conference Call Sentiment
Authors: Sungzoon Cho, Hye Jin Lee, Sungwhan Jeon, Dongyoung Min, Sungwon Lyu
Abstract:
Major public corporations worldwide use conference calls to report their quarterly earnings. These 'earnings' conference calls allow for questions from stock analysts. We investigated whether it is possible to identify sentiment from the call script and use it to predict stock price movement. We analyzed call scripts from six companies, two each from Korea, China and Indonesia, over six years, 2011Q1 – 2017Q2. A random forest with frequency-based sentiment scores using the Loughran-McDonald dictionary did better than a control model with only financial indicators. When stock prices went up 20 days after the earnings release, our model predicted correctly 77% of the time. When the model predicted 'up,' actual stock prices went up 65% of the time. This preliminary result encourages us to investigate advanced sentiment scoring methodologies such as topic modeling, auto-encoder, and word2vec variants.
Keywords: earnings call script, random forest, sentiment analysis, stock price prediction
Procedia PDF Downloads 294
17819 Reconstructability Analysis for Landslide Prediction
Authors: David Percy
Abstract:
Landslides are a geologic phenomenon that affects a large number of inhabited places, and they are constantly monitored and studied for the prediction of future occurrences. Reconstructability analysis (RA) is a methodology for extracting informative models from large volumes of data, and it works exclusively with discrete data. While RA has been used extensively in medical applications and social science, we are introducing it to the spatial sciences through applications like landslide prediction. Since RA works exclusively with discrete data, such as soil classification or bedrock type, working with continuous data, such as porosity, requires that these data be binned for inclusion in the model. RA constructs models of the data that pick out the most informative elements, the independent variables (IVs), from each layer that predict the dependent variable (DV), landslide occurrence. Each layer included in the model retains its classification data as a primary encoding of the data. Unlike other machine learning algorithms that force the data into one-hot encoding schemes, RA works directly with the data as it is encoded, with the exception of continuous data, which must be binned. The usual physical and derived layers are included in the model, and testing our results against other published methodologies, such as neural networks, yields accuracy that is similar but with the advantage of a completely transparent model. The result of an RA session with a data set is a report on every combination of variables and the associated probability of landslide events occurring. In this way, every combination of informative states can be examined.
Keywords: reconstructability analysis, machine learning, landslides, raster analysis
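The binning step that the abstract requires for continuous layers can be illustrated with a short sketch. It assumes simple equal-width bins; the abstract does not say which binning scheme was used, and the function name and the porosity values below are hypothetical.

```python
def bin_equal_width(values, n_bins):
    """Map continuous values to integer bin labels 0..n_bins-1 using equal-width bins."""
    lowest = min(values)
    highest = max(values)
    width = (highest - lowest) / n_bins
    if width == 0:
        # All values are identical, so everything falls in the first bin.
        return [0 for _ in values]
    labels = []
    for value in values:
        index = int((value - lowest) / width)
        # The maximum value would land in bin n_bins, so clamp it into the last bin.
        labels.append(min(index, n_bins - 1))
    return labels

# Hypothetical porosity fractions for four raster cells.
porosity = [0.05, 0.12, 0.30, 0.45]
print(bin_equal_width(porosity, 3))  # -> [0, 0, 1, 2]
```

Categorical layers such as soil class or bedrock type would be passed to the model unchanged; only continuous layers need a step like this.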
Procedia PDF Downloads 68
17818 Studies on the Applicability of Artificial Neural Network (ANN) in Prediction of Thermodynamic Behavior of Sodium Chloride Aqueous System Containing a Non-Electrolyte
Authors: Dariush Jafari, S. Mostafa Nowee
Abstract:
In this study, a ternary system containing sodium chloride as the solute, water as the primary solvent and ethanol as the antisolvent was considered in order to investigate the application of an artificial neural network (ANN) to predicting sodium solubility in the water-ethanol mixture. The system was previously studied by the authors using the Extended UNIQUAC model. The comparison between the results of the two models shows excellent agreement between them (R² = 0.99) and confirms the capability of ANN to predict the thermodynamic behavior of ternary electrolyte systems, which are difficult to model.
Keywords: thermodynamic modeling, ANN, solubility, ternary electrolyte system
Procedia PDF Downloads 385
17817 The Optimization of an Industrial Recycling Line: Improving the Durability of Recycled Polyethylene Blends
Authors: Alae Lamtai, Said Elkoun, Hniya Kharmoudi, Mathieu Robert, Carl Diez
Abstract:
This study applies Taguchi's design of experiments methodology and grey relational analysis (GRA) to the multi-objective optimization of an industrial recycling line. The line consists mainly of a mono-screw extruder, a twin-screw extruder and a filtration system. Experiments were performed according to an L₁₆ standard orthogonal array based on five process parameters, namely: mono-screw design, screw speed of the mono- and twin-screw extruders, melt pump pressure, and filter mesh size. The objective of this optimization is to improve the durability of the polyethylene (PE) blend by decreasing the loss of stress crack resistance (SCR), measured using the Notched Crack Ligament Stress (NCLS) and Unnotched Crack Ligament Stress (UCLS) tests, while increasing the gain in Izod impact strength of the PE blend before and after recycling. Based on GRA, the optimal setting of the process parameters was identified, and the results indicated that the mono-screw design and the screw speed of both the mono- and twin-screw extruders significantly affect the mechanical properties of the recycled PE blend.
Keywords: Taguchi, recycling line, polyethylene, stress crack resistance, Izod impact strength, grey relational analysis
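The grey relational analysis step can be sketched with the usual textbook formulation: normalize each response, measure its deviation from the ideal value, convert deviations to grey relational coefficients, and average them into one grade per experimental run. The distinguishing coefficient of 0.5, equal weighting of the responses, the function name and the sample data are all assumptions for illustration; the abstract does not state the settings the authors used.

```python
def grey_relational_grades(responses, larger_is_better, zeta=0.5):
    """Return one grey relational grade per run.

    responses: list of runs, each a list of measured responses.
    larger_is_better: one flag per response (False means smaller-the-better).
    zeta: distinguishing coefficient, commonly set to 0.5 (assumed here).
    """
    n_runs = len(responses)
    n_responses = len(responses[0])

    # Step 1: normalize each response to [0, 1] and take its deviation from the ideal (1).
    deviations = []
    for j in range(n_responses):
        column = [run[j] for run in responses]
        low, high = min(column), max(column)
        span = high - low
        if span == 0:
            normalized = [1.0] * n_runs
        elif larger_is_better[j]:
            normalized = [(x - low) / span for x in column]
        else:
            normalized = [(high - x) / span for x in column]
        deviations.append([1.0 - v for v in normalized])

    # Step 2: global minimum and maximum deviation over all runs and responses.
    d_min = min(min(col) for col in deviations)
    d_max = max(max(col) for col in deviations)
    if d_max == 0:
        return [1.0] * n_runs

    # Step 3: grey relational coefficient per response, averaged into a grade per run.
    grades = []
    for i in range(n_runs):
        coefficients = [
            (d_min + zeta * d_max) / (deviations[j][i] + zeta * d_max)
            for j in range(n_responses)
        ]
        grades.append(sum(coefficients) / n_responses)
    return grades

# Hypothetical runs: [gain in impact strength, loss of stress crack resistance].
runs = [[10.0, 5.0], [20.0, 3.0], [15.0, 4.0]]
grades = grey_relational_grades(runs, larger_is_better=[True, False])
best_run = grades.index(max(grades))
print(grades, best_run)
```

The run with the highest grade is taken as the best compromise across the responses, which is how GRA turns a multi-objective problem into a single ranking.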
Procedia PDF Downloads 84
17816 Homeless Population Modeling and Trend Prediction Through Identifying Key Factors and Machine Learning
Authors: Shayla He
Abstract:
Background and Purpose: According to Chamie (2017), it is estimated that no less than 150 million people, or about 2 percent of the world's population, are homeless. The homeless population in the United States has grown rapidly in the past four decades. In New York City, the sheltered homeless population has increased from 12,830 in 1983 to 62,679 in 2020. Knowing the trend in the homeless population is crucial for helping states and cities make affordable housing plans and other community service plans ahead of time to better prepare for the situation. This study utilized data from New York City, examined the key factors associated with homelessness, and developed systematic modeling to predict future homeless populations. Using the best model developed, named HP-RNN, an analysis was conducted of the change in the homeless population during the months of 2020 and 2021 that were impacted by the COVID-19 pandemic. Moreover, HP-RNN was tested on data from Seattle. Methods: The methodology involves four phases in developing robust prediction methods. Phase 1 gathered and analyzed raw data on homeless populations and demographic conditions from five urban centers. Phase 2 identified the key factors that contribute to the rate of homelessness. In Phase 3, three models were built using Linear Regression, Random Forest, and Recurrent Neural Network (RNN), respectively, to predict the future trend of society's homeless population. Each model was trained and tuned on the dataset from New York City, with accuracy measured by Mean Squared Error (MSE). In Phase 4, the final phase, the best model from Phase 3 was evaluated using the data from Seattle, which was not part of the model training and tuning process in Phase 3. Results: Compared to the Linear Regression based model used by HUD et al (2019), HP-RNN significantly improved the prediction metrics, raising the Coefficient of Determination (R²) from -11.73 to 0.88 and improving MSE by 99%.
HP-RNN was then validated on the data from Seattle, WA, which showed a peak % error of 14.5% between the actual and the predicted count. Finally, the modeling results were collected to predict the trend during the COVID-19 pandemic. They show a good correlation between the actual and the predicted homeless population, with the peak % error less than 8.6%. Conclusions and Implications: This work is the first to apply RNN to model time series of homelessness-related data. The model shows a close correlation between the actual and the predicted homeless population. There are two major implications of this result. First, the model can be used to predict the homeless population for the next several years, and the prediction can help states and cities plan ahead on affordable housing allocation and other community services to better prepare for the future. Moreover, this prediction can serve as a reference for policy makers and legislators as they seek to make changes that may impact the factors closely associated with the future homeless population trend.
Keywords: homeless, prediction, model, RNN
Procedia PDF Downloads 121
17815 A Comparative Soft Computing Approach to Supplier Performance Prediction Using GEP and ANN Models: An Automotive Case Study
Authors: Seyed Esmail Seyedi Bariran, Khairul Salleh Mohamed Sahari
Abstract:
In multi-echelon supply chain networks, optimal supplier selection depends significantly on the accuracy of suppliers' performance prediction. Different methods of multi-criteria decision making, such as ANN, GA, Fuzzy, AHP, etc., have previously been used to predict supplier performance, but the "black-box" characteristic of these methods remains a major concern to be resolved. Therefore, the primary objective of this paper is to implement an artificial intelligence-based gene expression programming (GEP) model and compare its prediction accuracy with that of ANN. A full factorial design with a 95% confidence interval is initially applied to determine the appropriate set of criteria for supplier performance evaluation. A test-train approach is then applied to the ANN and GEP separately. The training results are used to find the optimal network architecture, and the testing data determine the prediction accuracy of each method based on root mean square error (RMSE) and correlation coefficient (R²). The results of a case study conducted at Supplying Automotive Parts Co. (SAPCO), with more than 100 local and foreign supply chain members, revealed that, in comparison with ANN, gene expression programming is significantly better at predicting supplier performance, as shown by the respective RMSE and R-squared values. Moreover, using GEP, a mathematical function was also derived to address the black-box structure of ANN in modeling the performance prediction.
Keywords: Supplier Performance Prediction, ANN, GEP, Automotive, SAPCO
Procedia PDF Downloads 421
17814 Using Probe Person Data for Travel Mode Detection
Authors: Muhammad Awais Shafique, Eiji Hato, Hideki Yaginuma
Abstract:
Recently, GPS data has been used in many studies to automatically reconstruct travel patterns for trip surveys. The aim is to minimize the use of questionnaire surveys and travel diaries so as to reduce their negative effects. In this paper, data acquired from the GPS and accelerometer embedded in smartphones is utilized to predict the mode of transportation used by the phone carrier. For prediction, Support Vector Machine (SVM) and Adaptive Boosting (AdaBoost) are employed. Moreover, a unique method to improve the prediction results from these algorithms is also proposed. Results suggest that the prediction accuracy of AdaBoost after improvement is relatively better than the rest.
Keywords: accelerometer, AdaBoost, GPS, mode prediction, support vector machine
Procedia PDF Downloads 362
17813 Prediction of Bariatric Surgery Publications by Using Different Machine Learning Algorithms
Authors: Senol Dogan, Gunay Karli
Abstract:
Identification of relevant publications based on a Medline query is time-consuming and error-prone. An all based process has the potential to solve this problem without any manual work. To the best of our knowledge, our study is the first to investigate the ability of machine learning to identify relevant articles accurately. Five different machine learning algorithms were tested using 23 predictors based on several metadata fields attached to publications. We find that the Boosted model is the best-performing algorithm, with an overall accuracy of 96%. In addition, the specificity and sensitivity of the algorithm are 97% and 93%, respectively. As a result of this work, we concluded that the same procedure can be applied to understand cancer gene expression big data.
Keywords: prediction of publications, machine learning, algorithms, bariatric surgery, comparison of algorithms, boosted, tree, logistic regression, ANN model
Procedia PDF Downloads 210
17812 The Role of HPV Status in Patients with Overlapping Grey Zone Cancer in Oral Cavity and Oropharynx
Authors: Yao Song
Abstract:
Objectives: We aimed to explore the clinicodemographic characteristics and prognosis of grey zone squamous cell cancer (GZSCC) located in the overlapping or ambiguous area of the oral cavity and oropharynx and to identify valuable factors that would improve its differential diagnosis and prognosis. Methods: Information on GZSCC patients in the Surveillance, Epidemiology, and End Results (SEER) database was compared with that of patients with oral cavity (OCSCC) and oropharyngeal (OPSCC) squamous cell carcinomas with corresponding HPV status, respectively. The Kaplan-Meier method with log-rank test and multivariate Cox regression analysis were applied to assess associations between clinical characteristics and overall survival (OS). A predictive model integrating age, gender, marital status, HPV status, and staging variables was constructed to classify GZSCC patients into three risk groups and was verified internally by 10-fold cross validation. Results: A total of 3318 GZSCC, 10792 OPSCC, and 6656 OCSCC patients were identified. HPV-positive GZSCC patients had the best 5-year OS, similar to that of HPV-positive OPSCC patients (81% vs. 82%). However, the 5-year OS of HPV-negative/unknown GZSCC (43%/42%) was the worst among all groups, indicating that HPV status and the overlapping nature of tumors were valuable prognostic predictors in GZSCC patients. Compared with the strategy of dividing GZSCC into two groups by HPV status, the predictive model integrating more variables could additionally identify a unique high-risk GZSCC group with the lowest OS rate. Conclusions: GZSCC patients had distinct clinical characteristics and prognoses compared with OPSCC and OCSCC patients; integrating HPV status and other clinical factors could help distinguish GZSCC and predict its prognosis.
Keywords: GZSCC, OCSCC, OPSCC, HPV
Procedia PDF Downloads 75
17811 Stock Price Prediction Using Time Series Algorithms
Authors: Sumit Sen, Sohan Khedekar, Umang Shinde, Shivam Bhargava
Abstract:
This study has been undertaken to investigate whether deep learning models are able to predict future stock prices when trained on historical stock price data. Since this work requires time series analysis, it draws on the various models available today for that purpose, such as the Recurrent Neural Network LSTM, ARIMA, and Facebook Prophet. Applying these models, the movement of stock prices is predicted, and a future prediction of the price of a given stock is also provided. The final product will be a stock price prediction web application developed to give the user ease of analysis of stocks; it will also provide the predicted stock price for the next seven days.
Keywords: Autoregressive Integrated Moving Average, Deep Learning, Long Short Term Memory, Time-series
Procedia PDF Downloads 143
17810 Flame Volume Prediction and Validation for Lean Blowout of Gas Turbine Combustor
Authors: Ejaz Ahmed, Huang Yong
Abstract:
The operation of aero engines in the vicinity of lean blowout (LBO) limits is of critical importance. Lefebvre’s model of LBO, based on empirical correlation, has been extended to the flame volume concept by the authors. The flame volume (Vf) takes into account the effects of geometric configuration and the complex spatial interaction of mixing, turbulence, heat transfer, and combustion processes inside the gas turbine combustion chamber. For these reasons, flame volume based LBO predictions are more accurate. Although LBO prediction accuracy has improved, it poses a challenge associated with Vf estimation in real gas turbine combustors. This work extends the approach of flame volume prediction, previously based on fuel iterative approximation with cold flow simulations, to reactive flow simulations. Flame volume for 11 combustor configurations has been simulated and validated against experimental data. To make the prediction methodology robust, as required in the preliminary design stage, reactive flow simulations were carried out with a combination of the probability density function (PDF) and discrete phase model (DPM) in FLUENT 15.0. The criterion for flame identification was defined. Two important parameters, i.e., critical injection diameter (Dp,crit) and critical temperature (Tcrit), were identified, and their influence on reactive flow simulation was studied for Vf estimation. The results obtained exhibit ±15% error in Vf estimation relative to experimental data.
Keywords: CFD, combustion, gas turbine combustor, lean blowout
Procedia PDF Downloads 268
17809 Surface Roughness Analysis, Modelling and Prediction in Fused Deposition Modelling Additive Manufacturing Technology
Authors: Yusuf S. Dambatta, Ahmed A. D. Sarhan
Abstract:
Fused deposition modelling (FDM) is one of the most prominent rapid prototyping (RP) technologies and is used to efficiently fabricate CAD 3D geometric models. However, the process has many drawbacks, among them the surface quality of the manufactured RP parts. Hence, improving the surface roughness has been a key issue in the field of RP research. In this work, a technique for modelling the surface roughness in FDM is presented. Using the experimentally measured surface roughness response of the FDM parts, an ANFIS prediction model was developed to obtain the surface roughness of the FDM parts from the main critical process parameters that affect the surface quality. The ANFIS model was validated and compared with experimental test results.
Keywords: surface roughness, fused deposition modelling (FDM), adaptive neuro fuzzy inference system (ANFIS), orientation
Procedia PDF Downloads 462
17808 Assessment of Pre-Processing Influence on Near-Infrared Spectra for Predicting the Mechanical Properties of Wood
Authors: Aasheesh Raturi, Vimal Kothiyal, P. D. Semalty
Abstract:
We studied the mechanical properties of Eucalyptus tereticornis using FT-NIR spectroscopy. First, spectra were pre-processed to eliminate useless information. Then, a prediction model was constructed by partial least squares regression. To study the influence of pre-processing on the prediction of mechanical properties in NIR analysis of wood samples, we applied various pretreatment methods: straight line subtraction, constant offset elimination, vector normalization, min-max normalization, multiplicative scattering correction, first derivative, second derivative, and combinations such as first derivative + straight line subtraction, first derivative + vector normalization, and first derivative + multiplicative scattering correction. For each combination of pre-processing method and NIR region, RMSECV, RMSEP, and the optimum factors/rank were obtained through the optimization process of model development. More than 350 combinations were obtained during the optimization process. More than one pre-processing method gave good calibration/cross-validation and prediction/test models, but only the best calibration/cross-validation and prediction/test models are reported here. The results show that one can safely use the NIR region between 4000 and 7500 cm⁻¹ with the straight line subtraction, constant offset elimination, first derivative, and second derivative pre-processing methods, which were found to be the most appropriate for model development.
Keywords: FT-NIR, mechanical properties, pre-processing, PLS
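Three of the pretreatments listed above can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, vector normalization is assumed here to mean mean-centering followed by scaling to unit Euclidean norm, and the derivative is a plain central difference rather than the smoothed derivative that spectroscopy software typically applies.

```python
import math


def min_max_normalize(spectrum):
    """Rescale absorbances linearly so the minimum is 0 and the maximum is 1."""
    lo, hi = min(spectrum), max(spectrum)
    return [(a - lo) / (hi - lo) for a in spectrum]


def vector_normalize(spectrum):
    """Mean-center the spectrum, then scale it to unit Euclidean norm (assumed definition)."""
    mean = sum(spectrum) / len(spectrum)
    centered = [a - mean for a in spectrum]
    norm = math.sqrt(sum(c * c for c in centered))
    return [c / norm for c in centered]


def first_derivative(spectrum, step=1.0):
    """Central finite difference; the two end points are dropped."""
    return [
        (spectrum[i + 1] - spectrum[i - 1]) / (2 * step)
        for i in range(1, len(spectrum) - 1)
    ]


# Hypothetical absorbance values at evenly spaced wavenumbers.
raw = [0.42, 0.45, 0.51, 0.60, 0.58, 0.49]
combined = vector_normalize(first_derivative(raw))  # first derivative + vector normalization
print(combined)
```

Chaining the functions, as in the last lines, mirrors the combined treatments the abstract describes, such as first derivative followed by vector normalization.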
Procedia PDF Downloads 362
17807 Reasons for Non-Applicability of Software Entropy Metrics for Bug Prediction in Android
Authors: Arvinder Kaur, Deepti Chopra
Abstract:
Software Entropy Metrics for bug prediction have been validated on various software systems by different researchers. In our previous research, we validated that Software Entropy Metrics calculated for Mozilla subsystems predict future bugs reasonably well. In this study, the Software Entropy Metrics are calculated for a subsystem of Android, and it is observed that these metrics are not suitable for bug prediction. The results are compared with those for a subsystem of Mozilla, and the two software systems are compared to determine why Software Entropy Metrics are not applicable to Android.
Keywords: android, bug prediction, mining software repositories, software entropy
Procedia PDF Downloads 579
17806 Useful Lifetime Prediction of Chevron Rubber Spring for Railway Vehicle
Authors: Chang Su Woo, Hyun Sung Park
Abstract:
Useful lifetime evaluation of the chevron rubber spring is very important in the design procedure to assure safety and reliability. It is, therefore, necessary to establish a suitable criterion for the replacement period of the chevron rubber spring. In this study, we performed characteristic analysis and useful lifetime prediction of the chevron rubber spring. The rubber material coefficient was obtained by curve fitting of uni-axial tension, equi-biaxial tension, and pure shear tests. Computer simulation was executed to predict and evaluate the load capacity and stiffness of the chevron rubber spring. To predict the useful lifetime of the rubber material, we carried out compression set tests with heat aging in an oven at temperatures ranging from 50°C to 100°C over a period of 180 days. Using the Arrhenius plot, several useful lifetime prediction equations for the rubber material were proposed.
Keywords: chevron rubber spring, material coefficient, finite element analysis, useful lifetime prediction
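An Arrhenius plot fits a straight line to ln(lifetime) against 1/T (in kelvin) from accelerated aging tests and extrapolates it to the service temperature. The sketch below assumes the standard form ln(t) = ln(A) + Ea/(R·T); the function names and the aging data are hypothetical and are not the authors' equations or measurements.

```python
import math

GAS_CONSTANT = 8.314  # J/(mol*K)


def fit_arrhenius(temps_c, lifetimes):
    """Least-squares fit of ln(lifetime) = intercept + slope / T, with T in kelvin."""
    xs = [1.0 / (t + 273.15) for t in temps_c]
    ys = [math.log(life) for life in lifetimes]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    return slope, intercept


def predict_lifetime(temp_c, slope, intercept):
    """Extrapolate the fitted line to another temperature (degrees Celsius)."""
    return math.exp(intercept + slope / (temp_c + 273.15))


# Hypothetical data: days to reach a chosen compression-set limit at each oven temperature.
oven_temps_c = [70, 85, 100]
days_to_limit = [120.0, 45.0, 18.0]

slope, intercept = fit_arrhenius(oven_temps_c, days_to_limit)
activation_energy = slope * GAS_CONSTANT  # J/mol, since slope = Ea / R
life_at_25c = predict_lifetime(25, slope, intercept)
print(f"Ea = {activation_energy / 1000:.1f} kJ/mol, lifetime at 25 C = {life_at_25c:.0f} days")
```

The extrapolation is only as good as the assumption that a single degradation mechanism governs the whole temperature range, which is why aging tests are usually run at several temperatures.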
Procedia PDF Downloads 568
17805 Variable Refrigerant Flow (VRF) Zonal Load Prediction Using a Transfer Learning-Based Framework
Authors: Junyu Chen, Peng Xu
Abstract:
In the context of global efforts to enhance building energy efficiency, accurate thermal load forecasting is crucial for both device sizing and predictive control. Variable Refrigerant Flow (VRF) systems are widely used in buildings around the world, yet VRF zonal load prediction has received limited attention. Because building-level prediction methods overlook the differences between VRF zones, zone-level load forecasting could significantly enhance accuracy. Given that modern VRF systems generate high-quality data, this paper introduces transfer learning to leverage this data and further improve prediction performance. This framework also addresses the challenge of predicting load for building zones with no historical data, offering greater accuracy and usability compared to pure white-box models. The study first establishes an initial variable set of VRF zonal building loads and generates a foundational white-box database using EnergyPlus. Key variables for VRF zonal loads are identified using methods including SRRC, PRCC, and Random Forest. XGBoost and LSTM are employed to generate pre-trained black-box models based on the white-box database. Finally, real-world data is incorporated into the pre-trained model using transfer learning to enhance its performance in operational buildings. In this paper, zone-level load prediction was integrated with transfer learning, and a framework was proposed to improve the accuracy and applicability of VRF zonal load prediction.
Keywords: zonal load prediction, variable refrigerant flow (VRF) system, transfer learning, EnergyPlus
Procedia PDF Downloads 30
17804 Comparison of Multivariate Adaptive Regression Splines and Random Forest Regression in Predicting Forced Expiratory Volume in One Second
Authors: P. V. Pramila, V. Mahesh
Abstract:
Pulmonary function tests are important non-invasive diagnostic tests for assessing respiratory impairments and provide quantifiable measures of lung function. Spirometry is the most frequently used measure of lung function and plays an essential role in the diagnosis and management of pulmonary diseases. However, the test requires considerable patient effort and cooperation, which is markedly related to the age of patients, resulting in incomplete data sets. This paper presents nonlinear models built using multivariate adaptive regression splines (MARS) and random forest (RF) regression to predict the missing spirometric features. Random forest based feature selection is used to enhance both the generalization capability and the model interpretability. In the present study, flow-volume data are recorded for N = 198 subjects. The ranked order of the feature importance index calculated by the random forest model shows that the spirometric features FVC, FEF 25, PEF, FEF 25-75, and FEF 50 and the demographic parameter height are the important descriptors. A comparison of the performance of both models shows that the prediction ability of MARS with the top two ranked features, namely FVC and FEF 25, is higher, yielding a model fit of R² = 0.96 and R² = 0.99 for normal and abnormal subjects, respectively. The root mean square error analysis of the RF model and the MARS model also shows that the latter is capable of predicting the missing values of FEV1 with a notably lower error value of 0.0191 (normal subjects) and 0.0106 (abnormal subjects). It is concluded that combining feature selection with a prediction model provides a minimum subset of predominant features to train the model, yielding better prediction performance. This analysis can assist clinicians with an intelligence support system in medical diagnosis and in the improvement of clinical care.
Keywords: FEV, multivariate adaptive regression splines, pulmonary function test, random forest
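The two figures of merit quoted above, R² and root mean square error, can be computed directly from measured and predicted values. A minimal sketch follows; the function names and the FEV1 values are hypothetical and are not the study's data.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between measured and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical FEV1 values in litres: measured vs. model-predicted.
measured = [2.10, 2.85, 3.40, 1.95, 3.05]
predicted = [2.05, 2.90, 3.35, 2.00, 3.10]
print(f"RMSE = {rmse(measured, predicted):.4f}, R2 = {r_squared(measured, predicted):.4f}")
```

RMSE is expressed in the units of the predicted quantity, so it complements the unitless R² when two models, such as MARS and RF, are compared on the same subjects.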
Procedia PDF Downloads 311
17803 Fast Prediction Unit Partition Decision and Accelerating the Algorithm Using CUDA for Intra and Inter Prediction of HEVC
Authors: Qiang Zhang, Chun Yuan
Abstract:
Since the PU (Prediction Unit) decision process is the most time-consuming part of the emerging HEVC (High Efficiency Video Coding) standard in intra and inter frame coding, this paper proposes a fast PU decision algorithm and speeds it up using CUDA (Compute Unified Device Architecture). In intra frame coding, the fast PU decision algorithm uses texture features to skip intra-frame prediction or to terminate intra-frame prediction for smaller PU sizes. In inter frame coding of HEVC, the fast PU decision algorithm makes use of the similarity of the motion vectors of its own two Nx2N PUs and the hierarchical structure of the CU (Coding Unit) partition to skip some modes of PU partition, so as to reduce the motion estimation times. The CUDA-accelerated algorithm is based on the fast PU decision algorithm and uses the GPU so that the motion search and the gradient computation can be computed in parallel. The proposed algorithm achieves up to 57% time saving compared to HM 10.0 with little rate-distortion loss (0.043 dB drop and 1.82% bitrate increase on average).
Keywords: HEVC, PU decision, inter prediction, intra prediction, CUDA, parallel
Procedia PDF Downloads 399