Search results for: compressive strength prediction
3902 Easymodel: Web-based Bioinformatics Software for Protein Modeling Based on Modeller
Authors: Alireza Dantism
Abstract:
Presently, describing the function of a protein sequence is one of the most common problems in biology. Usually, this problem can be facilitated by studying the three-dimensional structure of proteins. In the absence of a protein structure, comparative modeling often provides a useful three-dimensional model of the protein that is dependent on at least one known protein structure. Comparative modeling predicts the three-dimensional structure of a given protein sequence (target) mainly based on its alignment with one or more proteins of known structure (templates). Comparative modeling consists of the following main steps: 1. Identifying similarity between the target sequence and at least one known template structure. 2. Aligning the target sequence and template(s). 3. Building a model based on the alignment with the selected template(s). 4. Predicting model errors. 5. Optimizing the built model. There are many computer programs and web servers that automate the comparative modeling process. One of the most important advantages of these servers is that they make comparative modeling available to both experts and non-experts, who can easily do their own modeling without programming knowledge. Some experts nevertheless prefer to use their programming knowledge and do their modeling manually, because doing so lets them maximize the accuracy of their models. In this study, a web-based tool has been designed to predict the tertiary structure of proteins using the PHP and Python programming languages. This tool is called EasyModel. Based on the user's inputs, EasyModel receives the unknown sequence (the target in this study), the protein sequence file (template) that shares a percentage of similarity with the target sequence, and other settings. It then predicts the tertiary structure of the unknown sequence and presents the results in the form of graphs and constructed protein files.
Keywords: structural bioinformatics, protein tertiary structure prediction, modeling, comparative modeling, modeller
Procedia PDF Downloads 97
3901 Use of Front-Face Fluorescence Spectroscopy and Multiway Analysis for the Prediction of Olive Oil Quality Features
Authors: Omar Dib, Rita Yaacoub, Luc Eveleigh, Nathalie Locquet, Hussein Dib, Ali Bassal, Christophe B. Y. Cordella
Abstract:
The potential of front-face fluorescence coupled with chemometric techniques, namely parallel factor analysis (PARAFAC) and multiple linear regression (MLR), as a rapid analysis tool to characterize Lebanese virgin olive oils was investigated. Fluorescence fingerprints were acquired directly on 102 Lebanese virgin olive oil samples in the range of 280-540 nm in excitation and 280-700 nm in emission. A PARAFAC model with seven components was considered optimal with a residual of 99.64% and core consistency value of 78.65. The model revealed seven main fluorescence profiles in olive oil, which were mainly associated with tocopherols, polyphenols, chlorophyllic compounds and oxidation/hydrolysis products. Twenty-three MLR models based on PARAFAC scores were generated, the majority of which showed a good correlation coefficient (R > 0.7 for 12 predicted variables) and thus satisfactory prediction performance. The models for acid value, peroxide value, and Delta K gave the best predictions, with R values of 0.89, 0.84 and 0.81, respectively. Among fatty acids, linoleic and oleic acids were also well predicted, with R values of 0.8 and 0.76, respectively. Factors contributing to the model's construction were related to common fluorophores found in olive oil, mainly chlorophyll, polyphenols, and oxidation products. This study demonstrates the interest of front-face fluorescence as a promising tool for quality control of Lebanese virgin olive oils.
Keywords: front-face fluorescence, Lebanese virgin olive oils, multiple linear regressions, PARAFAC analysis
Procedia PDF Downloads 453
3900 Fuzzy Logic Classification Approach for Exponential Data Set in Health Care System for Predication of Future Data
Authors: Manish Pandey, Gurinderjit Kaur, Meenu Talwar, Sachin Chauhan, Jagbir Gill
Abstract:
Health-care management systems are of great interest because they provide simple and fast management of all aspects relating to a patient, not necessarily medical ones. Moreover, there are more and more cases of pathologies in which diagnosis and treatment can only be carried out by using medical imaging techniques. With an ever-increasing prevalence, medical images are directly acquired in or converted into digital form, for their storage as well as subsequent retrieval and processing. Data mining is the process of extracting information from large data sets using algorithms and techniques drawn from the fields of statistics, machine learning and database management systems. Forecasting is a prediction of what will occur in the future, and it is an uncertain process. Owing to this uncertainty, the accuracy of a forecast is as important as the outcome predicted by forecasting the independent variables. A forecast control must be used to establish whether the accuracy of the forecast is within satisfactory limits. Fuzzy regression methods have commonly been used to develop consumer preference models that correlate engineering characteristics with consumer preferences regarding a new product; these models offer a platform whereby product developers can decide on the engineering characteristics needed to satisfy consumer preferences before developing the product. Recent research shows that these fuzzy regression methods are commonly used to model customer preferences. We propose testing the strength of an exponential regression model against a regression toward the mean model.
Keywords: health-care management systems, fuzzy regression, data mining, forecasting, fuzzy membership function
Procedia PDF Downloads 279
3899 Band Structure Computation of GaMnAs Using the Multiband k.p Theory
Authors: Khadijah B. Alziyadi, Khawlh A. Alzubaidi, Amor M. Alsayari
Abstract:
Recently, GaMnAs diluted magnetic semiconductors (DMSs) have received considerable attention because they combine semiconductor and magnetic properties. GaMnAs has been used as a model DMS and as a test bed for many concepts and functionalities of spintronic devices. In this paper, a theoretical study on the band structure of GaMnAs will be presented. The model that we used in this study is the 8-band k.p method, where spin-orbit interaction, spin splitting, and strain are considered. The band structure of GaMnAs will be calculated in different directions in the reciprocal space. The effect of manganese content on the GaMnAs band structure will be discussed. Also, the influence of strain, which varied continuously from tensile to compressive, on the different bands will be studied.
Keywords: band structure, diluted magnetic semiconductor, k.p method, strain
Procedia PDF Downloads 152
3898 Life Time Improvement of Clamp Structural by Using Fatigue Analysis
Authors: Pisut Boonkaew, Jatuporn Thongsri
Abstract:
In the hard disk drive manufacturing industry, eliminating unnecessary parts and qualifying parts before assembly is important. A clamp was therefore designed and fabricated as a holding fixture for the testing process. Improving the clamp by trial-and-error testing takes a long time, so simulation was introduced to improve the part and reduce the time taken. The problem is that the present clamp has a low life expectancy because of the critical stress that occurs in it. Hence, simulation was used to study the behavior of stress and compressive force in order to improve the clamp's life expectancy across all possible designs, of which there are up to 27, excluding repeated designs. The set of possible designs was determined by following the full factorial rules of the six sigma methodology. The six sigma methodology is a well-structured method for improving quality level by detecting and reducing the variability of the process, so that defects decrease while process capability increases. This research focuses on a methodology for reducing stress and fatigue while the compressive force remains in the acceptable range set by the company. In the simulation, ANSYS simulates the 3D CAD model under the same conditions as the experiment, and the force at each distance from 0.01 to 0.1 mm is recorded. The ANSYS setup was verified by a mesh convergence study and by comparing the percentage error with the experimental result; the error must not exceed the acceptable range. The improvement process then focuses on the degree, radius, and length that will reduce stress while keeping the force in the acceptable range. Fatigue analysis is carried out as the next step, using the ANSYS simulation program, to confirm that the lifetime is extended. The setup is confirmed not only by simulation but also by comparison with the actual clamp, in order to observe the difference in fatigue between the two designs. This brings a lifetime improvement of up to 57% compared with the actual clamp used in manufacturing. This study provides a setup that is precise and trustworthy enough to serve as a reference methodology for future designs. Because the combination and adaptation of the six sigma method, finite element analysis, fatigue analysis and linear regressive analysis lead to accurate calculation, this project will be able to save up to 60 million dollars annually.
Keywords: clamp, finite element analysis, structural, six sigma, linear regressive analysis, fatigue analysis, probability
Procedia PDF Downloads 235
3897 Deep Learning Framework for Predicting Bus Travel Times with Multiple Bus Routes: A Single-Step Multi-Station Forecasting Approach
Authors: Muhammad Ahnaf Zahin, Yaw Adu-Gyamfi
Abstract:
Bus transit is a crucial component of transportation networks, especially in urban areas. Any intelligent transportation system must have accurate real-time information on bus travel times since it minimizes waiting times for passengers at different stations along a route, improves service reliability, and significantly optimizes travel patterns. Bus agencies must enhance the quality of their information service to serve their passengers better and draw in more travelers since people waiting at bus stops are frequently anxious about when the bus will arrive at their starting point and when it will reach their destination. For solving this issue, different models have been developed for predicting bus travel times recently, but most of them are focused on smaller road networks due to their relatively subpar performance in high-density urban areas on a vast network. This paper develops a deep learning-based architecture using a single-step multi-station forecasting approach to predict average bus travel times for numerous routes, stops, and trips on a large-scale network using heterogeneous bus transit data collected from the GTFS database. Over one week, data was gathered from multiple bus routes in Saint Louis, Missouri. In this study, Gated Recurrent Unit (GRU) neural network was followed to predict the mean vehicle travel times for different hours of the day for multiple stations along multiple routes. Historical time steps and prediction horizon were set up to 5 and 1, respectively, which means that five hours of historical average travel time data were used to predict average travel time for the following hour. The spatial and temporal information and the historical average travel times were captured from the dataset for model input parameters. As adjacency matrices for the spatial input parameters, the station distances and sequence numbers were used, and the time of day (hour) was considered for the temporal inputs. 
Other inputs, including volatility information such as the standard deviation and variance of journey durations, were also included in the model to make it more robust. The model's performance was evaluated using the mean absolute percentage error (MAPE). The observed prediction errors for various routes, trips, and stations remained consistent throughout the day. The results showed that the developed model could predict travel times more accurately during peak traffic hours, with a MAPE of around 14%, and performed less accurately during the latter part of the day. In the context of a complicated transportation network in high-density urban areas, the model showed its applicability for real-time travel time prediction of public transportation and ensured the high quality of the predictions generated by the model.
Keywords: gated recurrent unit, mean absolute percentage error, single-step forecasting, travel time prediction.
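The windowing and the error metric described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `make_windows` and `mape` and the hourly values are hypothetical, and the GRU itself is not reproduced here. The sketch shows how five historical hourly averages are paired with the following hour's value, and how MAPE is computed.

```python
from typing import List, Sequence, Tuple


def make_windows(series: Sequence[float], history: int = 5, horizon: int = 1
                 ) -> List[Tuple[List[float], float]]:
    """Pair each run of `history` hourly values with the value `horizon` hours later."""
    samples = []
    for start in range(len(series) - history - horizon + 1):
        inputs = list(series[start:start + history])
        target = series[start + history + horizon - 1]
        samples.append((inputs, target))
    return samples


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equally long")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical hourly mean travel times (minutes) for one station.
hourly_means = [12.0, 13.5, 15.0, 14.0, 12.5, 11.0, 10.5, 12.0]
windows = make_windows(hourly_means)

# Stand-in predictor (NOT the GRU from the abstract): the mean of each window.
targets = [target for _, target in windows]
baseline = [sum(inputs) / len(inputs) for inputs, _ in windows]
print(f"{len(windows)} windows, baseline MAPE = {mape(targets, baseline):.1f}%")
```

Any trained model could replace the window-mean baseline; the windowing and the metric stay the same.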
Procedia PDF Downloads 72
3896 Simulation of Glass Breakage Using Voronoi Random Field Tessellations
Authors: Michael A. Kraus, Navid Pourmoghaddam, Martin Botz, Jens Schneider, Geralt Siebert
Abstract:
Fragmentation analysis of tempered glass gives insight into the quality of the tempering process and defines a certain degree of safety as well. Different standards, such as the European EN 12150-1 or the American ASTM C 1048/CPSC 16 CFR 1201, define a minimum number of fragments required for soda-lime safety glass on the basis of fragmentation test results for classification. This work presents an approach for glass breakage pattern prediction using a Voronoi Tessellation over Random Fields. The random Voronoi tessellation is trained with and validated against data from several breakage patterns. The fragments in observation areas of 50 mm x 50 mm were used for training and validation. All glass specimens used in this study were commercially available soda-lime glasses at three different thickness levels of 4 mm, 8 mm and 12 mm. The results of this work form a Bayesian framework for the training and prediction of breakage patterns of tempered soda-lime glass using a Voronoi Random Field Tessellation. Uncertainties occurring in this process can be well quantified, and several statistical measures of the pattern can be preserved with this method. Within this work it was found that different Random Fields as the basis for the Voronoi Tessellation lead to differently well fitted statistical properties of the glass breakage patterns. As the methodology is derived and kept general, the framework could also be applied to other random tessellations and crack pattern modelling purposes.
Keywords: glass breakage prediction, Voronoi Random Field Tessellation, fragmentation analysis, Bayesian parameter identification
Procedia PDF Downloads 160
3895 Investigation of Scaling Laws for Stiffness and strength in Bioinspired Glass Sponge Structures Produced by Fused Filament Fabrication
Authors: Hassan Beigi Rizi, Harold Auradou, Lamine Hattali
Abstract:
Various industries, including civil engineering, automotive, aerospace, and biomedical fields, are currently seeking novel and innovative high-performance lightweight materials to reduce energy consumption. Inspired by the structure of Euplectella Aspergillum Glass Sponges (EA-sponge), 2D unit cells were created and fabricated using a Fused Filament Fabrication (FFF) process with Polylactic acid (PLA) filaments. The stiffness and strength of bio-inspired EA-sponge lattices were investigated both experimentally and numerically under uniaxial tensile loading and compared to three standard square lattices reinforced with diagonal struts (Designs B and C) and non-diagonal struts (Design D). The aim is to establish predictive scaling-law models and examine the deformation mechanisms involved. The results indicated that for the EA-sponge structure, the relative moduli and yield strength scaled linearly with relative density, suggesting that the deformation mechanism is stretching-dominated. The finite element analysis (FEA), with periodic boundary conditions for volumetric homogenization, confirms these trends and goes beyond the experimental limits imposed by the FFF printing process. Therefore, the stretching-dominated behavior, investigated from 0.1 to 0.5 relative density, demonstrates that the study of the EA-sponge structure can be exploited for the realization of square lattice topologies that are stiff and strong and have attractive potential for lightweight structural applications. However, the FFF process introduces an accuracy limitation, with approximately 10% error, making it challenging to print structures with a relative density below 0.2. Future work could focus on exploring the impact of different printing materials on the performance of EA-sponge structures.
Keywords: bio-inspiration, lattice structures, fused filament fabrication, scaling laws
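A scaling law of the form relative property = C × (relative density)^n is commonly identified by a straight-line fit on log-log axes, where the slope is the exponent n. The sketch below assumes that form and uses made-up, perfectly linear data over the 0.1 to 0.5 density range mentioned in the abstract. The function name `fit_power_law` and all numbers are hypothetical, not the authors' procedure or measurements.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(rel_density: Sequence[float],
                  rel_property: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of rel_property = C * rel_density**n on log-log axes.

    Returns (C, n).
    """
    xs = [math.log(x) for x in rel_density]
    ys = [math.log(y) for y in rel_property]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx
    prefactor = math.exp(mean_y - exponent * mean_x)
    return prefactor, exponent


# Hypothetical relative moduli that scale linearly with relative density.
densities = [0.1, 0.2, 0.3, 0.4, 0.5]
moduli = [0.3 * d for d in densities]
prefactor, exponent = fit_power_law(densities, moduli)
print(f"C = {prefactor:.3f}, n = {exponent:.3f}")
```

An exponent close to 1 corresponds to the linear scaling that the abstract interprets as stretching-dominated behavior; exponents well above 1 are usually read as a sign of bending-dominated deformation.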
Procedia PDF Downloads 7
3894 Artificial Neural Network in Ultra-High Precision Grinding of Borosilicate-Crown Glass
Authors: Goodness Onwuka, Khaled Abou-El-Hossein
Abstract:
Borosilicate-crown (BK7) glass has found broad application in the optic and automotive industries, and the growing demand for nanometric surface finishes is becoming a necessity in such applications. Thus, it has become paramount to optimize the parameters influencing the surface roughness of this precision lens. The research was carried out on a 4-axis Nanoform 250 precision lathe machine with an ultra-high precision grinding spindle. The experiment varied the machining parameters of feed rate, wheel speed and depth of cut at three levels for different combinations using a Box-Behnken design of experiment, and the resulting surface roughness values were measured using a Taylor Hobson Dimension XL optical profiler. An acoustic emission monitoring technique was applied at a high sampling rate to monitor the machining process, while further signal processing and feature extraction methods were implemented to generate the input to a neural network algorithm. This paper highlights the training and development of a back propagation neural network prediction algorithm through careful selection of parameters, and the results show a better classification accuracy when compared to a previously developed response surface model with very similar machining parameters. Hence, artificial neural network algorithms provide better surface roughness prediction accuracy in the ultra-high precision grinding of BK7 glass.
Keywords: acoustic emission technique, artificial neural network, surface roughness, ultra-high precision grinding
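For three factors at three levels, a Box-Behnken design combines the four ±1 corner settings of every pair of factors with the third factor held at its middle level, plus repeated center runs. The sketch below generates that coded design. The function name, the number of center runs, and the physical level values are all hypothetical, since the abstract does not report them.

```python
from itertools import combinations, product
from typing import Dict, List, Tuple


def box_behnken_three_factors(center_points: int = 3) -> List[Tuple[int, int, int]]:
    """Coded (-1, 0, +1) Box-Behnken runs for exactly three factors."""
    runs: List[Tuple[int, int, int]] = []
    for i, j in combinations(range(3), 2):
        # Vary one pair of factors over its corners; the third stays at 0.
        for a, b in product((-1, 1), repeat=2):
            run = [0, 0, 0]
            run[i], run[j] = a, b
            runs.append((run[0], run[1], run[2]))
    runs.extend([(0, 0, 0)] * center_points)
    return runs


# Hypothetical low / middle / high settings; not values from the study.
levels: Dict[str, Tuple[float, float, float]] = {
    "feed_rate": (1.0, 3.0, 5.0),
    "wheel_speed": (10000.0, 20000.0, 30000.0),
    "depth_of_cut": (1.0, 2.0, 3.0),
}

design = box_behnken_three_factors()
for coded in design:
    # Map coded levels -1, 0, +1 to indices 0, 1, 2 of each factor's settings.
    actual = {name: levels[name][c + 1] for name, c in zip(levels, coded)}
    print(coded, actual)
```

With three center runs this gives 15 experiments, compared with 27 for a full three-level factorial over the same three factors.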
Procedia PDF Downloads 305
3893 Experimental Investigation on Mechanical Properties of Rice Husk Filled Jute Reinforced Composites
Authors: Priyankar P. Deka, Sutanu Samanta
Abstract:
This paper describes the development of a new class of epoxy-based hybrid composites reinforced with jute and filled with rice husk flour. Rice husk flour is added at 0%, 1%, 3% and 5% by weight. Epoxy resin and triethylene tetramine (T.E.T.A) are used as matrix and hardener, respectively. The paper investigates the mechanical properties of the composites and compares the monolithic jute composite with the filled ones. The specimens are prepared according to the ASTM standards, and experimentation is carried out using an INSTRON 8801. The results show that as the filler percentage increases, the tensile properties increase but the compressive and flexural properties decrease.
Keywords: jute, mechanical characterization, natural fiber, rice husk
Procedia PDF Downloads 285
3892 A Machine Learning Approach for Performance Prediction Based on User Behavioral Factors in E-Learning Environments
Authors: Naduni Ranasinghe
Abstract:
E-learning environments have become more popular than any other learning environment due to the impact of COVID-19. Even though e-learning is one of the best solutions for the teaching-learning process in academia, it is not without major challenges. Nowadays, machine learning approaches are utilized to analyze how behavioral factors lead to better adoption and how they relate to better performance of students in e-learning environments. During the pandemic, we realized that the academic process in the e-learning approach had a major issue, especially regarding the performance of the students. Therefore, an approach that investigates student behaviors in e-learning environments using a data-intensive machine learning approach is valuable. A hybrid approach was used to understand how each of the previously mentioned variables is related to the others. A more quantitative approach, drawing on the literature, was used to understand the weight of each factor for adoption and for performance. The data set was collected from previous research to support the training and testing process in ML. Special attention was paid to incorporating different dimensionalities of the data to understand the dependency levels of each. Five independent variables out of twelve were chosen based on their impact on the dependent variable and on the descriptive statistics. Out of the three models developed (Random Forest classifier, SVM, and Decision Tree classifier), the Random Forest classifier gave the highest accuracy (0.8542). Overall, this work met its goals of improving student performance by identifying students who are at risk and likely to drop out, emphasizing the necessity of using both static and dynamic data.
Keywords: academic performance prediction, e learning, learning analytics, machine learning, predictive model
Procedia PDF Downloads 157
3891 Effect of Mach Number for Gust-Airfoil Interatcion Noise
Authors: ShuJiang Jiang
Abstract:
The interaction of turbulence with an airfoil is an important noise source in many engineering fields, including helicopters, turbofan, and contra-rotating open rotor engines, where turbulence generated in the wake of upstream blades interacts with the leading edge of downstream blades and produces aerodynamic noise. One approach to studying turbulence-airfoil interaction noise is to model the oncoming turbulence as harmonic gusts. A compact noise source produces a dipole-like sound directivity pattern. However, when the acoustic wavelength is much smaller than the airfoil chord length, the airfoil needs to be treated as a non-compact source, and the gust-airfoil interaction becomes more complicated and results in multiple lobes in the radiated sound directivity. Capturing the short acoustic wavelength is a challenge for numerical simulations. In this work, simulations are performed for gust-airfoil interaction at different Mach numbers, using a high-fidelity direct Computational AeroAcoustic (CAA) approach based on a spectral/hp element method, verified by a CAA benchmark case. It is found that the squared sound pressure varies approximately as the 5th power of Mach number, with an exponent that changes slightly with the observer location. This scaling law can give a better sound prediction than the flat-plate theory for thicker airfoils. In addition, another prediction method, based on the flat-plate theory and CAA simulation, has been proposed to give better predictions than the scaling law for thicker airfoils.
Keywords: aeroacoustics, gust-airfoil interaction, CFD, CAA
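As a worked illustration of the reported scaling, and assuming the exponent is exactly 5 with everything else held fixed (the abstract says only "approximately" and notes a dependence on observer location), the change in sound pressure level between two Mach numbers M_1 and M_2 follows directly. The symbols M_1, M_2 and ΔSPL are introduced here for illustration only.

```latex
\overline{p'^2} \propto M^{5}
\quad\Longrightarrow\quad
\Delta \mathrm{SPL}
  = 10 \log_{10}\!\left[\left(\frac{M_2}{M_1}\right)^{5}\right]
  = 50 \log_{10}\!\left(\frac{M_2}{M_1}\right)\ \text{dB}
```

Under that assumption, doubling the Mach number would raise the level by 50 log10(2) ≈ 15 dB.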
Procedia PDF Downloads 78
3890 Evaluation of Coal Quality and Geomechanical Moduli Using Core and Geophysical Logs: Study from Middle Permian Barakar Formation of Gondwana Coalfield
Authors: Joyjit Dey, Souvik Sen
Abstract:
Middle Permian Barakar formation is the major economic coal bearing unit of vast east-west trending Damodar Valley basin of Gondwana coalfield. Primary sedimentary structures were studied from the core holes, which represent majorly four facies groups: sandstone dominated facies, sandstone-shale heterolith facies, shale facies and coal facies. Total eight major coal seams have been identified with the bottom most seam being the thickest. Laterally, continuous coal seams were deposited in the calm and quiet environment of extensive floodplain swamps. Channel sinuosity and lateral channel migration/avulsion results in lateral facies heterogeneity and coal splitting. Geophysical well logs (Gamma-Resistivity-Density logs) have been used to establish the vertical and lateral correlation of various litho units field-wide, which reveals the predominance of repetitive fining upwards cycles. Well log data being a permanent record, offers a strong foundation for generating log based property evaluation and helps in characterization of depositional units in terms of lateral and vertical heterogeneity. Low gamma, high resistivity, low density is the typical coal seam signatures in geophysical logs. Here, we have used a density cutoff of 1.6 g/cc as a primary discriminator of coal and the same has been employed to compute various coal assay parameters, which are ash, fixed carbon, moisture, volatile content, cleat porosity, vitrinite reflectance (VRo%), which were calibrated with the laboratory based measurements. The study shows ash content and VRo% increase from west to east (towards basin margin), while fixed carbon, moisture and volatile content increase towards west, depicting increased coal quality westwards. Seam wise cleat porosity decreases from east to west, this would be an effect of overburden, as overburden pressure increases westward with the deepening of basin causing more sediment packet deposited on the western side of the study area. 
Coal is a porous, viscoelastic material in which velocity and strain both change nonlinearly with stress, especially for stress applied perpendicular to the bedding plane. Usually, the coal seam has a high velocity contrast relative to its neighboring layers. Despite extensive discussion of the maceral and chemical properties of coal, its elastic characteristics have received comparatively little attention. The measurement of the elastic constants of coal presents many difficulties: sample-to-sample inhomogeneity, fragility, and velocity dependence on stress, orientation, humidity, and chemical content. In this study, the empirical equation VS = 0.80VP - 0.86 has been used to model shear velocity from compressional velocity. The same relation has also been used to compute various geomechanical moduli. Geomechanical analyses yield a Poisson's ratio of 0.348 for the coals. The average bulk modulus is 3.97 GPa, while the average shear modulus and Young's modulus are 1.34 and 3.59 GPa, respectively. These middle Permian Barakar coals show an average uniaxial compressive strength (UCS) of 23.84 MPa, with a cohesive strength of 4.97 MPa and a friction coefficient of 0.46. The output values of log-based proximate parameters and geomechanical moduli suggest a medium volatile bituminous grade for the studied coal seams, which is also found in the laboratory-based core study.
Keywords: core analysis, coal characterization, geophysical log, geo-mechanical moduli
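The step from the empirical shear-velocity relation to the moduli can be sketched with the standard isotropic elastic relations. The sketch assumes velocities in km/s and bulk density in g/cc, in which case density times velocity squared comes out directly in GPa. The function names and the sample log values are hypothetical, and the abstract does not state which units or moduli formulas the authors used.

```python
from typing import Dict


def shear_velocity_from_vp(vp_km_s: float) -> float:
    """Empirical relation quoted in the abstract: VS = 0.80 * VP - 0.86 (km/s assumed)."""
    return 0.80 * vp_km_s - 0.86


def elastic_moduli(vp_km_s: float, density_g_cc: float) -> Dict[str, float]:
    """Dynamic moduli for an isotropic medium from VP and bulk density."""
    vs = shear_velocity_from_vp(vp_km_s)
    shear = density_g_cc * vs ** 2                                  # G = rho * VS^2
    bulk = density_g_cc * (vp_km_s ** 2 - (4.0 / 3.0) * vs ** 2)    # K = rho * (VP^2 - 4/3 VS^2)
    poisson = (vp_km_s ** 2 - 2.0 * vs ** 2) / (2.0 * (vp_km_s ** 2 - vs ** 2))
    young = 2.0 * shear * (1.0 + poisson)                           # E = 2G(1 + nu)
    return {
        "vs_km_s": vs,
        "shear_gpa": shear,
        "bulk_gpa": bulk,
        "poisson": poisson,
        "young_gpa": young,
    }


# Hypothetical log readings for a coal interval (not values from the study).
result = elastic_moduli(vp_km_s=2.4, density_g_cc=1.4)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The averages reported in the abstract are mutually consistent with these relations: 2 × 1.34 × (1 + 0.348) ≈ 3.61 GPa, close to the quoted Young's modulus of 3.59 GPa.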
Procedia PDF Downloads 226
3889 A Prediction Method of Pollutants Distribution Pattern: Flare Motion Using Computational Fluid Dynamics (CFD) Fluent Model with Weather Research Forecast Input Model during Transition Season
Authors: Benedictus Asriparusa, Lathifah Al Hakimi, Aulia Husada
Abstract:
A large amount of energy is being wasted by the release of natural gas associated with the oil industry. This release disturbs the environment, particularly the condition of the atmosphere globally, and contributes to global warming. This research presents an overview of the methods employed by researchers at PT. Chevron Pacific Indonesia in the Minas area to determine a new prediction method for measuring and reducing gas flaring and its emissions. The method emphasizes advanced research involving analytical studies, numerical studies, modeling, and computer simulations, amongst other techniques. Flaring is the controlled burning of natural gas in the course of routine oil and gas production operations. This burning occurs at the end of a flare stack or boom. The combustion process releases emissions of greenhouse gases such as NO2, CO2, SO2, etc. These emissions affect the chemical composition of the air and the environment around the boundary layer, mainly during the transition season. The transition season in Indonesia is very difficult to predict because its pattern is driven by the difference between two air mass conditions. This research focused on the transition season in 2013. A simulation is needed to establish the new pattern of the pollutant distribution. This paper outlines trends in gas flaring modeling and current developments in predicting the dominant variables in the pollutant distribution. A Fluent model is used to simulate the distribution of pollutant gas coming out of the stack, whereas WRF model output is used to overcome the limitations of the analysis of meteorological data and atmospheric conditions in the study area. Based on the model runs, the most influential factor was wind speed. The goal of the simulation is to predict the new pattern of pollutant distribution at the times when the fastest and slowest winds occur. The simulation results show that the fastest wind (end of March) moves pollutants horizontally, while the slowest wind (middle of May) moves pollutants vertically. In addition, with a flare stack designed in compliance with the EPA Oil and Gas Facility Stack Parameters, pollutant concentrations likely remain below the NAAQS (National Ambient Air Quality Standards) thresholds.
Keywords: flare motion, new prediction, pollutants distribution, transition season, WRF model
Procedia PDF Downloads 556
3888 A Preliminary Study on the Effects of Equestrian and Basketball Exercises in Children with Autism
Authors: Li Shuping, Shu Huaping, Yi Chaofan, Tao Jiang
Abstract:
Equestrian practice is often considered to have a unique effect on improving symptoms in children with autism. This study evaluated and measured the changes in daily behavior, morphological, physical function, and fitness indexes of two groups of children with autism over 12 weeks of equestrian and basketball exercises. Nineteen clinically diagnosed children with moderate/mild autism were randomly divided into an equestrian group (9 children, age=10.11±1.90y) and a basketball group (10 children, age=10.70±2.16y). Both groups practiced twice a week for 45 to 60 minutes each time. Three scales, the Autism Behavior Checklist (ABC), the Childhood Autism Rating Scale (CARS) and the Clancy Autism Behavior Scale (CABS), were used to assess their behavior and psychology. Four morphological indicators and seven physical function and fitness indicators were measured to evaluate the effects of the two exercises on the children's bodies. The evaluations were taken every four weeks (pre-exercise, the 4th week, the 8th week and the 12th week (post-exercise)). The results showed that the total scores of ABC, CARS and CABS and the ABC dimension scores for somatic motor, language and life self-care obtained after the 12 weeks of exercise were significantly lower than those obtained before in both groups. The ABC feeling dimension score of the equestrian group and the ABC communication dimension score of the basketball group were significantly lower, and the upper arm circumference, sitting forward flexion, 40-second sit-up, 15 s lateral jump, vital capacity, and single foot standing of both groups were significantly higher than before exercise. The BMI of the equestrian group was significantly reduced. The handgrip strength of the basketball group was significantly increased. In conclusion, both types of exercise could improve the daily behavior, morphological, physical function, and fitness indexes of the children with autism. However, the indicators that improved and the timing of the improvement differed between the two interventions. In the equestrian group, flexibility improved at week 4; sensory perception and the control and use of their own body improved at week 8, when core strength endurance, coordination and cardiopulmonary function also began to develop; and core strength endurance, coordination and cardiopulmonary function improved at week 12. In the basketball group, hand strength, balance, flexibility and cardiopulmonary function improved at week 4; self-care ability, language expression ability, core strength endurance and coordination improved at week 8; and the control and use of their own body and social interaction ability improved at week 12. Comparing the exercise effects, the improvement in physical control and application ability appeared earlier in the equestrian group than in the basketball group, whereas the improvement in language expression ability, self-care ability, balance ability and cardiopulmonary function appeared earlier in the basketball group than in the equestrian group.
Keywords: intervention, children with autism, equestrain, basketball
Procedia PDF Downloads 68
3887 A Computational Framework for Load Mediated Patellar Ligaments Damage at the Tropocollagen Level
Authors: Fadi Al Khatib, Raouf Mbarki, Malek Adouni
Abstract:
In various sport and recreational activities, the patellofemoral joint undergoes large forces and moments while accommodating significant knee joint movement. In doing so, this joint is commonly the source of anterior knee pain related to instability in normal patellar tracking and excessive pressure syndrome. One well-observed explanation of the instability of normal patellar tracking is damage to the patellofemoral ligaments and patellar tendon. Improved knowledge of the damage mechanism mediating ligament and tendon injuries can be a great help not only in rehabilitation and prevention procedures but also in the design of better reconstruction systems for the management of knee joint disorders. This damage mechanism, specifically due to excessive mechanical loading, has been linked to the micro level of the fibred structure, precisely to the tropocollagen molecules and their connection density. We argue that defining a clear framework from the bottom (micro level) up (macro level) in the hierarchies of the soft tissue may elucidate the essential underpinnings of the state of ligament damage. To do so, in this study a multiscale fibril-reinforced hyper-elastoplastic finite element model that accounts for the synergy between molecular and continuum syntheses was developed to determine the short-term stress/strain response of the patellofemoral ligaments and tendon. The plasticity of the proposed model is associated only with the uniaxial deformation of the collagen fibril. The yield strength of the fibril is a function of the cross-link density between tropocollagen molecules, defined here by a density function. This function was obtained through a coarse-graining procedure linking nanoscale collagen features and tissue-level material properties using molecular dynamics simulations. The hierarchies of the soft tissues were implemented using the rule of mixtures. Thereafter, the model was calibrated using a statistical calibration procedure.
The model was then implemented in a real structure of patellofemoral ligaments and patellar tendon (OpenKnee) and simulated under realistic loading conditions. With the calibrated material parameters, the calculated axial stress agrees well with the experimental measurements, with a coefficient of determination (R²) equal to 0.91 and 0.92 for the patellofemoral ligaments and the patellar tendon, respectively. The ‘best’ prediction of the yield strength and strain, as compared with the reported experimental data, was obtained when the cross-link density between the tropocollagen molecules of the fibril was equal to 5.5 ± 0.5 (patellofemoral ligaments) and 12 (patellar tendon). Damage initiation of the patellofemoral ligaments was located at the femoral insertions, while the damage of the patellar tendon happened in the middle of the structure. These predictions showed a meaningful correlation between the cross-link density of the tropocollagen molecules and the stiffness of the connective tissues of the extensor mechanism. Also, damage initiation and propagation were documented with this model and were in satisfactory agreement with earlier observations. To the best of our knowledge, this is the first attempt to model ligaments from the bottom up, with predictions depending on the tropocollagen cross-link density. This approach appears more meaningful for a realistic simulation of a damaging process or repair attempt than certain published studies.
Keywords: tropocollagen, multiscale model, fibrils, knee ligaments
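The agreement between calculated and measured axial stress in this abstract is summarized by a coefficient of determination. As a minimal sketch of how such a value is typically computed, assuming paired measured and model-predicted stresses are available, the function below uses the standard definition R² = 1 − SS_res / SS_tot. The function name and the sample numbers are illustrative placeholders, not data or code from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared model error at each point
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical stress values in MPa, for illustration only
    measured = [2.0, 5.0, 9.0, 14.0, 20.0]
    simulated = [2.5, 4.6, 9.8, 13.1, 20.9]
    print(f"R^2 = {r_squared(measured, simulated):.3f}")
```

A value near 1 means the model explains most of the variance in the measurements; a value of 0 means it does no better than predicting the mean.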
Procedia PDF Downloads 128
3886 Improved Soil and Snow Treatment with the Rapid Update Cycle Land-Surface Model for Regional and Global Weather Predictions
Authors: Tatiana G. Smirnova, Stan G. Benjamin
Abstract:
Rapid Update Cycle (RUC) land surface model (LSM) was a land-surface component in several generations of operational weather prediction models at the National Center for Environment Prediction (NCEP) at the National Oceanic and Atmospheric Administration (NOAA). It was designed for short-range weather predictions with an emphasis on severe weather and originally was intentionally simple to avoid uncertainties from poorly known parameters. Nevertheless, the RUC LSM, when coupled with the hourly-assimilating atmospheric model, can produce a realistic evolution of time-varying soil moisture and temperature, as well as the evolution of snow cover on the ground surface. This result is possible only if the soil/vegetation/snow component of the coupled weather prediction model has sufficient skill to avoid long-term drift. RUC LSM was first implemented in the operational NCEP Rapid Update Cycle (RUC) weather model in 1998 and later in the Weather Research Forecasting Model (WRF)-based Rapid Refresh (RAP) and High-resolution Rapid Refresh (HRRR). Being available to the international WRF community, it was implemented in operational weather models in Austria, New Zealand, and Switzerland. Based on the feedback from the US weather service offices and the international WRF community and also based on our own validation, RUC LSM has matured over the years. Also, a sea-ice module was added to RUC LSM for surface predictions over the Arctic sea-ice. Other modifications include refinements to the snow model and a more accurate specification of albedo, roughness length, and other surface properties. At present, RUC LSM is being tested in the regional application of the Unified Forecast System (UFS). The next generation UFS-based regional Rapid Refresh FV3 Standalone (RRFS) model will replace operational RAP and HRRR at NCEP. Over time, RUC LSM participated in several international model intercomparison projects to verify its skill using observed atmospheric forcing. 
The ESM-SnowMIP was the last of these experiments, focused on the verification of snow models for open and forested regions. The simulations were performed for ten sites located in different climatic zones of the world and were forced with observed atmospheric conditions. While most of the 26 participating models have more sophisticated snow parameterizations than RUC, RUC LSM ranked highly in simulations of both snow water equivalent and surface temperature. However, the ESM-SnowMIP experiment also revealed some issues in the RUC snow model, which will be addressed in this paper. One of them is the treatment of grid cells partially covered with snow. The RUC snow module computes energy and moisture budgets of snow-covered and snow-free areas separately and aggregates the solutions at the end of each time step. Such treatment elevates the importance of computing the snow cover fraction in the model. Improvements to the original simplistic threshold-based approach have been implemented and tested both offline and in the coupled weather model. The changes to the snow cover fraction and other modifications to RUC soil and snow parameterizations will be described in detail in this paper.
Keywords: land-surface models, weather prediction, hydrology, boundary-layer processes
Procedia PDF Downloads 88
3885 Big Data in Telecom Industry: Effective Predictive Techniques on Call Detail Records
Authors: Sara ElElimy, Samir Moustafa
Abstract:
Mobile network operators face many challenges in the digital era, especially with high demands from customers. Mobile network operators are considered a source of big data, and traditional techniques are not effective in the new era of big data, the Internet of Things (IoT) and 5G; as a result, handling different big datasets effectively becomes a vital task for operators with the continuous growth of data and the move from Long Term Evolution (LTE) to 5G. So, there is an urgent need for effective big data analytics to predict future demands, traffic, and network performance to fulfill the requirements of the fifth generation of mobile network technology. In this paper, we introduce data science techniques using machine learning and deep learning algorithms: the autoregressive integrated moving average (ARIMA), Bayesian-based curve fitting, and recurrent neural network (RNN) are employed for a data-driven application to mobile network operators. The main framework of the models comprises parameter identification for each model, estimation, prediction, and the final data-driven application of this prediction to business and network performance. These models are applied to the Telecom Italia Big Data Challenge call detail records (CDRs) datasets. Evaluation with well-known criteria shows that ARIMA (a machine learning-based model) is more accurate as a predictive model on such a dataset than the RNN (a deep learning model).
Keywords: big data analytics, machine learning, CDRs, 5G
Procedia PDF Downloads 139
3884 Predicting Costs in Construction Projects with Machine Learning: A Detailed Study Based on Activity-Level Data
Authors: Soheila Sadeghi
Abstract:
Construction projects are complex and often subject to significant cost overruns due to the multifaceted nature of the activities involved. Accurate cost estimation is crucial for effective budget planning and resource allocation. Traditional methods for predicting overruns often rely on expert judgment or analysis of historical data, which can be time-consuming, subjective, and may fail to consider important factors. However, with the increasing availability of data from construction projects, machine learning techniques can be leveraged to improve the accuracy of overrun predictions. This study applied machine learning algorithms to enhance the prediction of cost overruns in a case study of a construction project. The methodology involved the development and evaluation of two machine learning models: Random Forest and Neural Networks. Random Forest can handle high-dimensional data, capture complex relationships, and provide feature importance estimates. Neural Networks, particularly Deep Neural Networks (DNNs), are capable of automatically learning and modeling complex, non-linear relationships between input features and the target variable. These models can adapt to new data, reduce human bias, and uncover hidden patterns in the dataset. The findings of this study demonstrate that both Random Forest and Neural Networks can significantly improve the accuracy of cost overrun predictions compared to traditional methods. The Random Forest model also identified key cost drivers and risk factors, such as changes in the scope of work and delays in material delivery, which can inform better project risk management. However, the study acknowledges several limitations. First, the findings are based on a single construction project, which may limit the generalizability of the results to other projects or contexts. 
Second, the dataset, although comprehensive, may not capture all relevant factors influencing cost overruns, such as external economic conditions or political factors. Third, the study focuses primarily on cost overruns, while schedule overruns are not explicitly addressed. Future research should explore the application of machine learning techniques to a broader range of projects, incorporate additional data sources, and investigate the prediction of both cost and schedule overruns simultaneously.
Keywords: cost prediction, machine learning, project management, random forest, neural networks
Procedia PDF Downloads 57
3883 Macroscopic Evidence of the Liquidlike Nature of Nanoscale Polydimethylsiloxane Brushes
Authors: Xiaoxiao Zhao
Abstract:
We report macroscopic evidence of the liquidlike nature of surface-tethered poly(dimethylsiloxane) (PDMS) brushes by studying their adhesion to ice. Whereas ice permanently detaches from solid surfaces when subjected to sufficient shear, commonly referred to as the material’s ice adhesion strength, adhered ice instead slides over PDMS brushes indefinitely. When the brushes are additionally methylated, we observe a Couette-like flow of the PDMS brushes between the ice and silicon surface. PDMS brush ice adhesion displays shear-rate-dependent shear stress and rheological behavior reminiscent of liquids and is affected by ice velocity, temperature, and brush thickness, following scaling laws akin to liquid PDMS films. This liquidlike nature allows ice to detach solely under its own weight, yielding an ice adhesion strength of 0.3 kPa, 1000 times less than that of a low-surface-energy perfluorinated monolayer. The methylated PDMS brushes also display omniphobicity, repelling essentially all liquids with vanishingly small contact angle hysteresis. Methylation results in significantly higher contact angles than previously reported for nonmethylated brushes, especially for polar liquids of both high and low surface tension.
Keywords: omniphobic, surface science, polymer brush, icephobic surface
Procedia PDF Downloads 67
3882 Effective Stacking of Deep Neural Models for Automated Object Recognition in Retail Stores
Authors: Ankit Sinha, Soham Banerjee, Pratik Chattopadhyay
Abstract:
Automated product recognition in retail stores is an important real-world application in the domain of Computer Vision and Pattern Recognition. In this paper, we consider the problem of automatically identifying the classes of the products placed on racks in retail stores from an image of the rack and information about the query/product images. We improve upon the existing approaches in terms of effectiveness and memory requirement by developing a two-stage object detection and recognition pipeline comprising a Faster-RCNN-based object localizer that detects the object regions in the rack image and a ResNet-18-based image encoder that classifies the detected regions into the appropriate classes. Each of the models is fine-tuned using appropriate data sets for better prediction, and data augmentation is performed on each query image to prepare an extensive gallery set for fine-tuning the ResNet-18-based product recognition model. This encoder is trained using a triplet loss function following the strategy of online hard negative mining for improved prediction. The proposed models are lightweight and can be connected in an end-to-end manner during deployment to automatically identify each product object placed in a rack image. Extensive experiments using the Grozi-32k and GP-180 data sets verify the effectiveness of the proposed model.
Keywords: retail stores, faster-RCNN, object localization, ResNet-18, triplet loss, data augmentation, product recognition
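The abstract names a triplet loss with online hard negative mining but gives no formula. As a rough illustration only, the sketch below uses one common formulation: the loss is max(0, d(anchor, positive) − d(anchor, negative) + margin), and the "hard" negative is the candidate embedding closest to the anchor. The Euclidean distance, the margin of 0.2, the function names and the toy 2-D embeddings are all assumptions for the example, not details taken from the paper.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equally long embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def hardest_negative(anchor, negatives):
    """Pick the negative embedding closest to the anchor (the hardest one)."""
    return min(negatives, key=lambda n: euclidean(anchor, n))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: zero once the negative is farther than
    the positive by at least `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


if __name__ == "__main__":
    # Toy 2-D embeddings (hypothetical values)
    anchor = [0.0, 0.0]
    positive = [0.1, 0.0]                   # same product class as the anchor
    negatives = [[1.0, 0.0], [0.2, 0.0]]    # other product classes

    hard = hardest_negative(anchor, negatives)
    print("hardest negative:", hard)
    print("loss with hardest negative:", triplet_loss(anchor, positive, hard))
    print("loss with easy negative:", triplet_loss(anchor, positive, negatives[0]))
```

Mining the hardest negative keeps the loss non-zero for longer than a randomly chosen negative would, which is the usual motivation for the strategy; in practice the selection is done per mini-batch on learned embeddings.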
Procedia PDF Downloads 157
3881 Feature Analysis of Predictive Maintenance Models
Authors: Zhaoan Wang
Abstract:
Research in predictive maintenance modeling has advanced in recent years, predicting failures and needed maintenance with high accuracy, saving cost and improving manufacturing efficiency. However, classic prediction models provide little insight into the most important features contributing to failure. By analyzing and quantifying feature importance in predictive maintenance models, cost savings can be optimized based on business goals. First, multiple classifiers are evaluated with cross-validation to predict multiple classes of failures. Second, predictive performance with features provided by different feature selection algorithms is further analyzed. Third, features selected by different algorithms are ranked and combined based on their predictive power. Finally, the linear SHAP (SHapley Additive exPlanations) explainer is applied to interpret classifier behavior and provide further insight into the specific roles of features in both local predictions and global model behavior. The results of the experiments suggest that certain features play dominant roles in predictive models while others have significantly less impact on the overall performance. Moreover, for multi-class prediction of machine failures, the most important features vary with the type of machine failure. The results may lead to improved productivity and cost savings by prioritizing sensor deployment, data collection, and data processing for more important features over less important ones.
Keywords: automated supply chain, intelligent manufacturing, predictive maintenance machine learning, feature engineering, model interpretation
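To make the local versus global distinction concrete, here is a small sketch of SHAP values for a linear model. Under the usual assumption of independent features, the SHAP value of feature j for a sample x is w_j · (x_j − mean_j), where the mean is taken over a background data set; averaging absolute values over samples gives a global importance. The weights, feature names and sensor readings below are invented for illustration, and the study's actual classifiers and features are not given in the abstract.

```python
def feature_means(background):
    """Column means of a background data set (list of rows)."""
    n = len(background)
    return [sum(row[j] for row in background) / n for j in range(len(background[0]))]


def linear_shap_values(weights, background, x):
    """Local SHAP values for a linear model f(x) = b + sum_j w_j * x_j,
    assuming independent features: phi_j = w_j * (x_j - mean_j)."""
    means = feature_means(background)
    return [w * (xj - m) for w, xj, m in zip(weights, x, means)]


def global_importance(weights, background):
    """Mean absolute SHAP value per feature over the background set."""
    per_sample = [linear_shap_values(weights, background, row) for row in background]
    n = len(per_sample)
    return [sum(abs(phi[j]) for phi in per_sample) / n for j in range(len(weights))]


if __name__ == "__main__":
    # Hypothetical model with two sensor features: temperature and vibration
    weights = [2.0, -1.0]
    background = [[0.0, 0.0], [2.0, 4.0]]
    sample = [3.0, 2.0]

    print("local SHAP values:", linear_shap_values(weights, background, sample))
    print("global importance:", global_importance(weights, background))
```

The local values sum to the difference between the model output for the sample and the average model output over the background set, which is the additivity property that gives SHAP its name.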
Procedia PDF Downloads 133
3880 Non-Linear Assessment of Chromatographic Lipophilicity and Model Ranking of Newly Synthesized Steroid Derivatives
Authors: Milica Karadzic, Lidija Jevric, Sanja Podunavac-Kuzmanovic, Strahinja Kovacevic, Anamarija Mandic, Katarina Penov Gasi, Marija Sakac, Aleksandar Okljesa, Andrea Nikolic
Abstract:
The present paper deals with chromatographic lipophilicity prediction of newly synthesized steroid derivatives. The prediction was achieved using in silico generated molecular descriptors and quantitative structure-retention relationship (QSRR) methodology with the artificial neural network (ANN) approach. Chromatographic lipophilicity of the investigated compounds was expressed as the retention factor value log k. For QSRR modeling, a feedforward back-propagation ANN with a gradient descent learning algorithm was applied. The generated ANN models were ranked using the novel sum of ranking differences (SRD) method. The aim was to identify the most consistent QSRR model and any similarity or dissimilarity between the models. In this study, SRD was performed with average values of log k as reference values. An excellent correlation between the experimentally observed log k and the values predicted by the ANN was obtained, with a correlation coefficient higher than 0.9890. Statistical results show that the established ANN models can be applied for the required purpose. This article is based upon work from COST Action (TD1305), supported by COST (European Cooperation in Science and Technology).
Keywords: artificial neural networks, liquid chromatography, molecular descriptors, steroids, sum of ranking differences
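The ranking step can be sketched as follows. In the basic form of the sum of ranking differences, the compounds are ranked once by each model's predictions and once by the reference values, and the SRD of a model is the sum of absolute differences between the two ranks; a smaller SRD means the model orders the compounds more like the reference. The sketch assumes no tied values, uses the row average across models as the reference (as the abstract describes), and works on made-up log k predictions. It omits the validation steps of the full method.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def srd(model_values, reference_values):
    """Sum of absolute rank differences between a model and the reference."""
    model_ranks = ranks(model_values)
    reference_ranks = ranks(reference_values)
    return sum(abs(m - r) for m, r in zip(model_ranks, reference_ranks))


def row_average(models):
    """Reference column: average prediction per compound across all models."""
    n_models = len(models)
    return [sum(m[i] for m in models) / n_models for i in range(len(models[0]))]


if __name__ == "__main__":
    # Hypothetical log k predictions of three ANN models for four compounds
    models = {
        "ANN-1": [0.10, 0.45, 0.80, 1.20],
        "ANN-2": [0.15, 0.40, 0.85, 1.10],
        "ANN-3": [0.90, 0.50, 0.30, 1.00],
    }
    reference = row_average(list(models.values()))
    for name, predictions in sorted(models.items(), key=lambda kv: srd(kv[1], reference)):
        print(name, "SRD =", srd(predictions, reference))
```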
Procedia PDF Downloads 319
3879 Tribological Performance of Polymer Syntactic Foams in Low-Speed Conditions
Authors: R. Narasimha Rao, Ch. Sri Chaitanya
Abstract:
Syntactic foams are closed-cell foams with high specific strength and high compression strength. At low speeds, the wear rate is sensitive to the sliding speed and other tribological parameters such as the applied load and the sliding distance. In the present study, the tribological performance of polymer-based syntactic foams is reported based on experiments conducted on a pin-on-disc tribometer. The syntactic foams were manufactured with epoxy as the matrix and cenospheres obtained from thermal power plants as the reinforcement. The experiments were conducted at a sliding speed of 1 m/s. The applied load was varied from 1 kg to 5 kg up to a sliding distance of 3000 m. The wear rate increased with the sliding distance at lower loads. The trend was reversed at the higher load of 5 kg. This may be due to the high plastic deformation at the initial stages when higher loads were applied, which was evident from the higher friction constants at higher loads. Adhesive wear was found to be predominant at lower loads, while abrasive wear tracks can be seen in micrographs of samples tested under higher loads.
Keywords: sliding speed, syntactic foams, tribological performance, wear rate
Procedia PDF Downloads 78
3878 Investigation of Distortion and Impact Strength of 304L Butt Joint Using Different Weld Groove
Authors: A. Sharma, S. S. Sandhu, A. Shahi, A. Kumar
Abstract:
The aim of the present investigation was to carry out finite element modeling of distortion in a butt weld. AISI 304L plates 12 mm thick were butt welded using three different groove designs, namely Double U, Double V and Composite. A full simulation of shielded metal arc welding (SMAW) with nonlinear heat transfer was carried out. Aspects such as the temperature-dependent thermal properties of AISI stainless steel above the liquid phase and the effect of thermal boundary conditions were included in the model. Since the welding heat dissipation characteristics changed with the groove design, significant changes in the microhardness, tensile strength and impact toughness of the joints were observed. The cumulative distortion was found to be least in the Double V joint, followed by the Composite and Double U joints. All the joints had a joint efficiency of more than 100%. The CVN value of the Double V-groove weld metal was the highest. The experimental and FEM results were compared and reveal a very good correlation for distortion and weld groove design for a multipass joint, with a standard analogy of 83%.
Keywords: AISI 304 L, Butt joint, distortion, FEM, groove design, SMAW
Procedia PDF Downloads 408
3877 Prediction for DC-AC PWM Inverters DC Pulsed Current Sharing from Passive Parallel Battery-Supercapacitor Energy Storage Systems
Authors: Andreas Helwig, John Bell, Wangmo
Abstract:
Hybrid energy storage systems (HESS) are gaining popularity for grid energy storage systems (ESS), driven by the increasingly dynamic nature of energy demands, which require both high energy and high power density. Given the need for energy storage systems to respond via inverters to increasing fluctuation in energy demands, the combination of a lithium iron phosphate (LFP) battery and a supercapacitor (SC) is a particular example of complex electrochemical devices that may benefit each other in pulse-width-modulated DC-to-AC inverter applications. This is due to the SC’s ability to respond to instantaneous, high-current demands and the battery’s long-term energy delivery. However, there is a knowledge gap regarding the current sharing mechanism within a HESS supplying a load powered by high-frequency pulse-width modulation (PWM) switching, which is needed to understand the mechanism of aging in such a HESS. This paper investigates the prediction of current sharing between battery and SC, using various equivalent circuits for the SC in the MATLAB/Simulink simulation environment. The findings predict a significant reduction of battery current when the battery is used in a hybrid combination with a supercapacitor as compared to a battery-only model. The impact of PWM inverter carrier switching frequency on current requirements was analyzed between 500 Hz and 31 kHz. While no clear trend emerged, the models predicted optimal frequencies for minimized current needs.
Keywords: hybrid energy storage, carrier frequency, PWM switching, equivalent circuit models
Procedia PDF Downloads 26
3876 Agreement between Basal Metabolic Rate Measured by Bioelectrical Impedance Analysis and Estimated by Prediction Equations in Obese Groups
Authors: Orkide Donma, Mustafa M. Donma
Abstract:
Basal metabolic rate (BMR) is widely used and an accepted measure of energy expenditure. Its principal determinant is body mass. However, this parameter is also correlated with a variety of other factors. The objective of this study is to measure BMR and compare it with the values obtained from predictive equations in adults classified according to their body mass index (BMI) values. 276 adults were included into the scope of this study. Their age, height and weight values were recorded. Five groups were designed based on their BMI values. First group (n = 85) was composed of individuals with BMI values varying between 18.5 and 24.9 kg/m2. Those with BMI values varying from 25.0 to 29.9 kg/m2 constituted Group 2 (n = 90). Individuals with 30.0-34.9 kg/m2, 35.0-39.9 kg/m2, > 40.0 kg/m2 were included in Group 3 (n = 53), 4 (n = 28) and 5 (n = 20), respectively. The most commonly used equations to be compared with the measured BMR values were selected. For this purpose, the values were calculated by the use of four equations to predict BMR values, by name, introduced by Food and Agriculture Organization (FAO)/World Health Organization (WHO)/United Nations University (UNU), Harris and Benedict, Owen and Mifflin. Descriptive statistics, ANOVA, post-Hoc Tukey and Pearson’s correlation tests were performed by a statistical program designed for Windows (SPSS, version 16.0). p values smaller than 0.05 were accepted as statistically significant. Mean ± SD of groups 1, 2, 3, 4 and 5 for measured BMR in kcal were 1440.3 ± 210.0, 1618.8 ± 268.6, 1741.1 ± 345.2, 1853.1 ± 351.2 and 2028.0 ± 412.1, respectively. Upon evaluation of the comparison of means among groups, differences were highly significant between Group 1 and each of the remaining four groups. The values were increasing from Group 2 to Group 5. However, differences between Group 2 and Group 3, Group 3 and Group 4, Group 4 and Group 5 were not statistically significant. 
These non-significant differences became significant with the predictive equations proposed by Harris and Benedict, FAO/WHO/UNU and Owen. For Mifflin, only the difference between Group 4 and Group 5 remained non-significant. Upon evaluation of the correlations between measured BMR and the values estimated from prediction equations, the lowest correlations were observed among individuals within the normal BMI range. The highest correlations were detected in individuals with BMI values between 30.0 and 34.9 kg/m2. Correlations between measured BMR values and BMR values calculated by FAO/WHO/UNU as well as Owen were the same and the highest. In all groups, the highest correlations were observed between BMR values calculated from the Mifflin and the Harris and Benedict equations, which use age as an additional parameter. In conclusion, the unique resemblance of the FAO/WHO/UNU and Owen equations was pointed out. However, mean values obtained from FAO/WHO/UNU were much closer to the measured BMR values. Besides, the highest correlations were found between BMR calculated from FAO/WHO/UNU and measured BMR. These findings suggest that FAO/WHO/UNU is the most reliable equation, which may be used when measured BMR values are not available.
Keywords: adult, basal metabolic rate, fao/who/unu, obesity, prediction equations
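For readers unfamiliar with the prediction equations being compared, the sketch below implements three of them in their commonly cited forms (Mifflin-St Jeor, the original Harris and Benedict coefficients, and Owen), together with the BMI grouping described in the abstract. The abstract does not state which variants or coefficients the authors used, so the coefficients here are an assumption; the FAO/WHO/UNU equation is left out because its coefficients depend on age band. Function names and the example subject are illustrative only.

```python
def bmi(weight_kg, height_cm):
    """Body mass index in kg/m^2."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


def bmi_group(bmi_value):
    """Study group 1 to 5 from BMI; None below 18.5 (outside the study groups).
    A BMI of exactly 40.0 is assigned to group 5 here, which is an assumption."""
    if bmi_value < 18.5:
        return None
    for group, upper in enumerate((25.0, 30.0, 35.0, 40.0), start=1):
        if bmi_value < upper:
            return group
    return 5


def mifflin_st_jeor(weight_kg, height_cm, age_y, male):
    """Mifflin-St Jeor estimate in kcal/day (uses weight, height and age)."""
    base = 10.0 * weight_kg + 6.25 * height_cm - 5.0 * age_y
    return base + 5.0 if male else base - 161.0


def harris_benedict(weight_kg, height_cm, age_y, male):
    """Original Harris and Benedict estimate in kcal/day."""
    if male:
        return 66.4730 + 13.7516 * weight_kg + 5.0033 * height_cm - 6.7550 * age_y
    return 655.0955 + 9.5634 * weight_kg + 1.8496 * height_cm - 4.6756 * age_y


def owen(weight_kg, male):
    """Owen estimate in kcal/day (uses weight only)."""
    return 879.0 + 10.2 * weight_kg if male else 795.0 + 7.18 * weight_kg


if __name__ == "__main__":
    # Hypothetical subject: 70 kg, 175 cm, 30-year-old man
    w, h, a = 70.0, 175.0, 30.0
    print("BMI group:", bmi_group(bmi(w, h)))
    print("Mifflin-St Jeor:", round(mifflin_st_jeor(w, h, a, male=True), 1))
    print("Harris-Benedict:", round(harris_benedict(w, h, a, male=True), 1))
    print("Owen:", round(owen(w, male=True), 1))
```

Comparing such estimates against a measured value, group by group, is the kind of agreement analysis the abstract reports.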
Procedia PDF Downloads 133
3875 Getting to Know the Types of Concrete and its Production Methods
Authors: Mokhtar Nikgoo
Abstract:
Definition of Concrete and Concreting: Concrete (in French: Béton) in a broad sense is any substance or combination that contains a sticky substance with the property of cementation. In general, concrete refers to concrete made with Portland cement, which is produced by mixing fine and coarse aggregates, Portland cement and water. After enough time, this mixture turns into a stone-like substance. During the hardening or curing of the concrete, the cement combines chemically with water to form strong crystals that bind the aggregates together, a process called hydration. During this process, significant heat is released, called the heat of hydration. Additionally, concrete shrinks slightly, especially as excess water evaporates, a phenomenon known as drying shrinkage. The process of hardening, and the gradual increase in concrete strength that occurs with it, does not end suddenly unless it is artificially interrupted. Instead, it slows down over long periods of time, although in practical applications concrete is usually considered set, and at full design strength, after 28 days. Concrete may be made from different types of cement as well as pozzolans, furnace slag, additives, polymers, fibers, etc. Heating, water vapor, autoclaving, vacuum, hydraulic pressure and various condensers may also be used in the way it is made.
Keywords: concrete, RCC, batching, cement, Pozzolan, mixing plan
Procedia PDF Downloads 98
3874 Study of the Persian Gulf’s and Oman Sea’s Numerical Tidal Currents
Authors: Fatemeh Sadat Sharifi
Abstract:
In this research, a barotropic model was employed for tidal studies in the Persian Gulf and Oman Sea, in which the tidal force was the only forcing. To do that, a finite-difference, free-surface model, the Regional Ocean Modeling System (ROMS), was applied to data over the Persian Gulf and Oman Sea. To analyze the flow patterns of the region, the results of a limited-size Finite Volume Community Ocean Model (FVCOM) were used. The two water bodies were chosen because both are among the most critical in terms of the economy, biology, fishery, shipping, navigation, and petroleum extraction. The modeled results were validated against OSU Tidal Prediction Software (OTPS) tides and observation data. Next, tidal elevation and speed and the tidal analysis were interpreted. Preliminary results show good accuracy in tidal height compared with observation and OTPS data, and indicate that tidal currents are highest in the Strait of Hormuz and in the narrow and shallow region between the Iranian coast and the islands. Furthermore, the tidal analysis shows that the M2 component has the largest value. Finally, the Persian Gulf tidal currents are divided into two branches: the first branch turns from the south toward Qatar and rotates via the United Arab Emirates to the Strait of Hormuz. The second branch, in the north and west, extends up to the highest point in the Persian Gulf and turns counterclockwise at the head of the Gulf.
Keywords: numerical model, barotropic tide, tidal currents, OSU tidal prediction software, OTPS
Procedia PDF Downloads 131
3873 Profiling Risky Code Using Machine Learning
Authors: Zunaira Zaman, David Bohannon
Abstract:
This study explores the application of machine learning (ML) for detecting security vulnerabilities in source code. The research aims to assist organizations with large application portfolios and limited security testing capabilities in prioritizing security activities. ML-based approaches offer benefits such as increased confidence scores, false positives and negatives tuning, and automated feedback. The initial approach using natural language processing techniques to extract features achieved 86% accuracy during the training phase but suffered from overfitting and performed poorly on unseen datasets during testing. To address these issues, the study proposes using the abstract syntax tree (AST) for Java and C++ codebases to capture code semantics and structure and generate path-context representations for each function. The Code2Vec model architecture is used to learn distributed representations of source code snippets for training a machine-learning classifier for vulnerability prediction. The study evaluates the performance of the proposed methodology using two datasets and compares the results with existing approaches. The Devign dataset yielded 60% accuracy in predicting vulnerable code snippets and helped resist overfitting, while the Juliet Test Suite predicted specific vulnerabilities such as OS-Command Injection, Cryptographic, and Cross-Site Scripting vulnerabilities. The Code2Vec model achieved 75% accuracy and a 98% recall rate in predicting OS-Command Injection vulnerabilities. The study concludes that even partial AST representations of source code can be useful for vulnerability prediction. The approach has the potential for automated intelligent analysis of source code, including vulnerability prediction on unseen source code. State-of-the-art models using natural language processing techniques and CNN models with ensemble modelling techniques did not generalize well on unseen data and faced overfitting issues. 
However, predicting vulnerabilities in source code using machine learning poses challenges such as high dimensionality and complexity of source code, imbalanced datasets, and identifying specific types of vulnerabilities. Future work will address these challenges and expand the scope of the research.
Keywords: code embeddings, neural networks, natural language processing, OS command injection, software security, code properties
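The abstract reports 75% accuracy together with a 98% recall for OS-Command Injection, and notes imbalanced datasets as a challenge. The sketch below shows how accuracy, recall and precision follow from a confusion matrix and why the first two can differ so much. The counts are invented so that they reproduce that pair of figures on 100 samples; they are not taken from the study, and the precision they imply is therefore purely illustrative.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, recall and precision from confusion-matrix counts.
    tp/fn: vulnerable samples flagged / missed;
    fp/tn: safe samples flagged / correctly passed."""
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (tp + fp) == 0:
        raise ValueError("counts must include positives and flagged samples")
    return {
        "accuracy": (tp + tn) / total,   # share of all samples classified correctly
        "recall": tp / (tp + fn),        # share of vulnerable samples that were found
        "precision": tp / (tp + fp),     # share of flagged samples that were vulnerable
    }


if __name__ == "__main__":
    # Hypothetical counts: 50 vulnerable and 50 safe snippets
    metrics = classification_metrics(tp=49, fp=24, tn=26, fn=1)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

With these made-up counts almost every vulnerable snippet is caught, but nearly half of the safe snippets are flagged as well, which is how a high recall can coexist with a modest accuracy.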
Procedia PDF Downloads 107