Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 7686

Search results for: dimensional affect prediction

6456 Time and Cost Prediction Models for Language Classification Over a Large Corpus on Spark

Authors: Jairson Barbosa Rodrigues, Paulo Romero Martins Maciel, Germano Crispim Vasconcelos

Abstract:

This paper investigates how variation in five factors (input data size, node number, cores, memory, and disks) affects performance when applying a distributed implementation of Naïve Bayes to text classification of a large corpus on the Spark big data processing framework. Problem: The algorithm's performance depends on multiple factors, and knowing the effect of each factor beforehand becomes especially critical because hardware is priced by time slice in cloud environments. Objectives: To explain the functional relationship between factors and performance and to develop linear predictor models for time and cost. Methods: The study applies the statistical principles of Design of Experiments (DoE), particularly the randomized two-level fractional factorial design with replications. This research involved 48 real clusters with different hardware arrangements. The metrics were analyzed using linear models for screening, ranking, and measurement of each factor's impact. Results: Our findings include prediction models and show some non-intuitive results: cores have a small influence on total execution time, memory and disks are neutral with respect to it, and input data scale has a non-significant impact on costs, although it notably affects execution time.
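
To illustrate the two-level fractional factorial design named in the abstract, the sketch below builds a half-fraction (2^(5-1)) plan for the five factors in coded -1/+1 units. The generator E = ABCD, the three replications, and the factor names are assumptions, chosen only because 16 runs × 3 replicates gives 48; the abstract does not state the actual generator or number of replicates.

```python
import itertools
import random

# Factor names are illustrative labels, not identifiers from the paper.
FACTORS = ["data_size", "nodes", "cores", "memory", "disks"]


def half_fraction_design():
    """Return the 16 runs of a 2^(5-1) design in coded units (-1 / +1).

    The first four factors form a full 2^4 factorial; the fifth is set by
    the assumed generator E = A*B*C*D.
    """
    runs = []
    for a, b, c, d in itertools.product((-1, 1), repeat=4):
        e = a * b * c * d
        runs.append(dict(zip(FACTORS, (a, b, c, d, e))))
    return runs


def randomized_plan(replications=3, seed=0):
    """Replicate every run and shuffle the execution order."""
    plan = [run for run in half_fraction_design() for _ in range(replications)]
    random.Random(seed).shuffle(plan)
    return plan


def main_effect(plan, responses, factor):
    """Mean response at the high level minus mean response at the low level."""
    high = [y for run, y in zip(plan, responses) if run[factor] == 1]
    low = [y for run, y in zip(plan, responses) if run[factor] == -1]
    return sum(high) / len(high) - sum(low) / len(low)
```

With measured execution times in `responses` (one value per entry of the plan), `main_effect(plan, responses, "cores")` gives the kind of screening estimate that a linear model would then rank and test for significance.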

Keywords: big data, design of experiments, distributed machine learning, natural language processing, spark

Procedia PDF Downloads 103

6455 Use of Front-Face Fluorescence Spectroscopy and Multiway Analysis for the Prediction of Olive Oil Quality Features

Authors: Omar Dib, Rita Yaacoub, Luc Eveleigh, Nathalie Locquet, Hussein Dib, Ali Bassal, Christophe B. Y. Cordella

Abstract:

The potential of front-face fluorescence coupled with chemometric techniques, namely parallel factor analysis (PARAFAC) and multiple linear regression (MLR), as a rapid analysis tool to characterize Lebanese virgin olive oils was investigated. Fluorescence fingerprints were acquired directly on 102 Lebanese virgin olive oil samples in the range of 280-540 nm in excitation and 280-700 nm in emission. A PARAFAC model with seven components was considered optimal with a residual of 99.64% and core consistency value of 78.65. The model revealed seven main fluorescence profiles in olive oil, which were mainly associated with tocopherols, polyphenols, chlorophyllic compounds and oxidation/hydrolysis products. Twenty-three MLR models based on PARAFAC scores were generated, the majority of which showed a good correlation coefficient (R > 0.7 for 12 predicted variables) and thus satisfactory prediction performance. Acid values, peroxide values, and Delta K had the best-performing models, with R values of 0.89, 0.84 and 0.81, respectively. Among fatty acids, linoleic and oleic acids were also well predicted, with R values of 0.8 and 0.76, respectively. Factors contributing to the model's construction were related to common fluorophores found in olive oil, mainly chlorophyll, polyphenols, and oxidation products. This study demonstrates the promise of front-face fluorescence as a tool for quality control of Lebanese virgin olive oils.

Keywords: front-face fluorescence, Lebanese virgin olive oils, multiple linear regression, PARAFAC analysis

Procedia PDF Downloads 443

6454 Impact of Culture and Religion on Disability and the Health Care Seeking Practices of the Shona People

Authors: Mafunda Esther

Abstract:

The paper seeks to document the impact of culture and religion on disability, specifically language impairment, and on the health care seeking practices of the Shona people. Its main objective is to explore the cultural and religious beliefs that affect the utilization of rehabilitation services in a rural community in Zimbabwe. A further objective is to describe how language impairment is presented and understood by people living in a Zimbabwean rural area. The research is a qualitative interpretive phenomenological study that uses a case study approach with semi-structured interviews and focus group discussions. The results established that religious and cultural beliefs determine how the Shona people view disability, and this guides their health care seeking practices. The research is important because communication disorders occur in populations worldwide, though they are not always recognized as such. The lack of recognition of and the attitudes toward speech and language disorders, as well as the beliefs about the causes of such disorders, affect people's attitudes toward the treatment of the disorders.

Keywords: culture, religion, disability, language impairment

Procedia PDF Downloads 82

6453 Deep Learning Framework for Predicting Bus Travel Times with Multiple Bus Routes: A Single-Step Multi-Station Forecasting Approach

Authors: Muhammad Ahnaf Zahin, Yaw Adu-Gyamfi

Abstract:

Bus transit is a crucial component of transportation networks, especially in urban areas. Any intelligent transportation system must have accurate real-time information on bus travel times since it minimizes waiting times for passengers at different stations along a route, improves service reliability, and significantly optimizes travel patterns. Bus agencies must enhance the quality of their information service to serve their passengers better and draw in more travelers, since people waiting at bus stops are frequently anxious about when the bus will arrive at their starting point and when it will reach their destination. To address this issue, different models have recently been developed for predicting bus travel times, but most of them focus on smaller road networks because of their relatively subpar performance in high-density urban areas on a vast network. This paper develops a deep learning-based architecture using a single-step multi-station forecasting approach to predict average bus travel times for numerous routes, stops, and trips on a large-scale network using heterogeneous bus transit data collected from the GTFS database. Over one week, data was gathered from multiple bus routes in Saint Louis, Missouri. In this study, a Gated Recurrent Unit (GRU) neural network was used to predict the mean vehicle travel times for different hours of the day for multiple stations along multiple routes. The number of historical time steps and the prediction horizon were set to 5 and 1, respectively, which means that five hours of historical average travel time data were used to predict the average travel time for the following hour. The spatial and temporal information and the historical average travel times were extracted from the dataset as model input parameters. The station distances and sequence numbers were used as adjacency matrices for the spatial input parameters, and the time of day (hour) was used for the temporal inputs. Other inputs, including volatility information such as the standard deviation and variance of journey durations, were also included in the model to make it more robust. The model's performance was evaluated using the mean absolute percentage error (MAPE). The observed prediction errors for various routes, trips, and stations remained consistent throughout the day. The results showed that the developed model could predict travel times more accurately during peak traffic hours, with a MAPE of around 14%, and performed less accurately during the latter part of the day. In the context of a complicated transportation network in high-density urban areas, the model showed its applicability for real-time travel time prediction of public transportation and produced high-quality predictions.
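
The windowing and the error metric described in the abstract can be sketched as follows. This is a minimal illustration only: it shows how five historical hourly values map to a one-hour-ahead target and how MAPE is computed, and it does not reproduce the GRU model or its spatial inputs. The function names and the sample numbers in the usage note are hypothetical.

```python
def make_windows(series, history=5, horizon=1):
    """Split an hourly series into (input window, target) pairs.

    Each window holds `history` consecutive values; the target is the value
    `horizon` steps after the end of the window.
    """
    pairs = []
    for start in range(len(series) - history - horizon + 1):
        window = series[start:start + history]
        target = series[start + history + horizon - 1]
        pairs.append((window, target))
    return pairs


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)
```

For example, `make_windows([10, 12, 11, 13, 12, 14, 15])` yields two training pairs, the first using hours 1 to 5 to predict hour 6. Any forecaster's hourly predictions can then be scored with `mape(actual, predicted)`.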

Keywords: gated recurrent unit, mean absolute percentage error, single-step forecasting, travel time prediction

Procedia PDF Downloads 62

6452 Probability Model Accidents of Motorcyclist Based on Driver's Personality

Authors: Margareth E. Bolla, Ludfi Djakfar, Achmad Wicaksono

Abstract:

The increase in the number of motorcycle users in Indonesia is in line with the increase in accidents involving motorcycles. Several previous studies have shown that human factors are the biggest cause of accidents, and that a driver's personality affects their behavior on the road. This study was conducted to see how a person's personality traits affect the probability of having an accident while driving. The Big Five Inventory (BFI) questionnaire and the Honda Riding Trainer (HRT) simulator were used as measuring tools, and the data were analyzed using logistic regression. The descriptive analysis of the respondents' personalities based on the BFI shows that the largest group of drivers has neuroticism as the dominant trait (34%), while the smallest group has openness as the dominant trait (6%). The percentage of motorists who were not involved in an accident was 54%. The logistic regression analysis yields the following mathematical model: Y = -3.852 - 0.288 X1 + 0.596 X2 + 0.429 X3 - 0.386 X4 - 0.094 X5 + 0.436 X6 + 0.162 X7. Hypothesis testing indicates that the variables openness, conscientiousness, extraversion, agreeableness, neuroticism, history of traffic accidents and age at starting driving did not have a significant effect on the probability of a motorcyclist being involved in an accident.
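
As a worked illustration of how such a model is read, the sketch below converts the linear predictor Y into a probability. It assumes that Y is the log-odds of accident involvement, which is the usual logistic regression form, and that X1 to X7 follow the order in which the variables are listed; the abstract gives neither the coding nor the scale of the inputs, so any input values are hypothetical.

```python
import math

# Coefficients copied from the model in the abstract (intercept, then X1..X7).
INTERCEPT = -3.852
COEFFICIENTS = [-0.288, 0.596, 0.429, -0.386, -0.094, 0.436, 0.162]


def linear_predictor(x):
    """Return Y = b0 + sum(b_i * x_i) for a list of seven predictor values."""
    if len(x) != len(COEFFICIENTS):
        raise ValueError("expected seven predictor values")
    return INTERCEPT + sum(b * v for b, v in zip(COEFFICIENTS, x))


def accident_probability(x):
    """Map the linear predictor to a probability with the logistic function.

    Assumes Y is the log-odds of being involved in an accident.
    """
    return 1.0 / (1.0 + math.exp(-linear_predictor(x)))
```

Under that assumption, a positive coefficient raises the estimated probability as its variable increases and a negative one lowers it, although the abstract reports that none of the effects was statistically significant.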

Keywords: accidents, BFI, probability, simulator

Procedia PDF Downloads 138

6451 Simulation of Glass Breakage Using Voronoi Random Field Tessellations

Authors: Michael A. Kraus, Navid Pourmoghaddam, Martin Botz, Jens Schneider, Geralt Siebert

Abstract:

Fragmentation analysis of tempered glass gives insight into the quality of the tempering process and defines a certain degree of safety as well. Different standards, such as the European EN 12150-1 or the American ASTM C 1048/CPSC 16 CFR 1201, define a minimum number of fragments required for soda-lime safety glass on the basis of fragmentation test results for classification. This work presents an approach for glass breakage pattern prediction using a Voronoi tessellation over random fields. The random Voronoi tessellation is trained with and validated against data from several breakage patterns. The fragments in observation areas of 50 mm x 50 mm were used for training and validation. All glass specimens used in this study were commercially available soda-lime glasses at three different thickness levels of 4 mm, 8 mm and 12 mm. The results of this work form a Bayesian framework for the training and prediction of breakage patterns of tempered soda-lime glass using a Voronoi Random Field Tessellation. Uncertainties occurring in this process can be well quantified, and several statistical measures of the pattern can be preserved with this method. Within this work it was found that different random fields as the basis for the Voronoi tessellation lead to differently well-fitted statistical properties of the glass breakage patterns. As the methodology is derived and kept general, the framework could also be applied to other random tessellations and crack pattern modelling purposes.

Keywords: glass breakage prediction, Voronoi Random Field Tessellation, fragmentation analysis, Bayesian parameter identification

Procedia PDF Downloads 150

6450 Creativity and Innovation in a Military Unit of South America: Decision Making Process, Socio-Emotional Climate, Shared Flow and Leadership

Authors: S. da Costa, D. Páez, E. Martínez, A. Torres, M. Beramendi, D. Hermosilla, M. Muratori

Abstract:

This study examined the association between creative performance, organizational climate and leadership, affectivity, shared flow, and group decision making. The sample consisted of 315 cadets of a military academic unit of South America. Satisfaction with the decision-making process during a creative task was associated with the usefulness and effectiveness of the ideas generated by the teams, with a weighted average correlation of r = .18. Organizational emotional climate and positive and innovation leadership were associated with this group decision-making process, r = .25, with shared flow, r = .29, and with positive affect felt during the performance of the creative task, r = .12. In a sequential mediational analysis, positive organizational leadership styles were significantly associated with the decision-making process and, through cohesion, with the utility and efficacy of the solution of a creative task. Satisfactory decision-making was related to shared flow during the creative task at the collective or group level, and positive affect was related to flow at the individual level.

Keywords: creativity, innovation, military, organization, teams

Procedia PDF Downloads 115

6449 Bioinformatics Approach to Identify Physicochemical and Structural Properties Associated with Successful Cell-free Protein Synthesis

Authors: Alexander A. Tokmakov

Abstract:

Cell-free protein synthesis is widely used to synthesize recombinant proteins. It allows genome-scale expression of various polypeptides under strictly controlled uniform conditions. However, only a minor fraction of all proteins can be successfully expressed in the protein synthesis systems that are currently used. The factors determining expression success are poorly understood. At present, a vast volume of data has accumulated in cell-free expression databases. This makes possible a comprehensive bioinformatics analysis and the identification of multiple features associated with successful cell-free expression. Here, we describe an approach aimed at identifying multiple physicochemical and structural properties of amino acid sequences associated with protein solubility and aggregation, and we highlight major correlations obtained using this approach. The developed method includes categorical assessment of the protein expression data, calculation and prediction of multiple properties of expressed amino acid sequences, correlation of the individual properties with the expression scores, and evaluation of the statistical significance of the observed correlations. Using this approach, we revealed a number of statistically significant correlations between calculated and predicted features of protein sequences and their amenability to cell-free expression. It was found that some of the features, such as protein pI, hydrophobicity, presence of signal sequences, etc., are mostly related to protein solubility, whereas others, such as protein length, number of disulfide bonds, content of secondary structure, etc., mainly affect the expression propensity. We also demonstrated that the amenability of polypeptide sequences to cell-free expression correlates with the presence of multiple sites of post-translational modifications. The correlations revealed in this study provide a plethora of important insights into protein folding and rationalization of protein production. The developed bioinformatics approach can be of practical use for predicting expression success and optimizing cell-free protein synthesis.

Keywords: bioinformatics analysis, cell-free protein synthesis, expression success, optimization, recombinant proteins

Procedia PDF Downloads 405

6448 Artificial Neural Network in Ultra-High Precision Grinding of Borosilicate-Crown Glass

Authors: Goodness Onwuka, Khaled Abou-El-Hossein

Abstract:

Borosilicate-crown (BK7) glass has found broad application in the optics and automotive industries, and meeting the growing demand for nanometric surface finishes is becoming a necessity in such applications. Thus, it has become paramount to optimize the parameters influencing the surface roughness of this precision lens. The research was carried out on a 4-axis Nanoform 250 precision lathe with an ultra-high precision grinding spindle. The experiment varied the machining parameters of feed rate, wheel speed and depth of cut at three levels in different combinations using a Box-Behnken design of experiments, and the resulting surface roughness values were measured using a Taylor Hobson Dimension XL optical profiler. An acoustic emission monitoring technique was applied at a high sampling rate to monitor the machining process, while further signal processing and feature extraction methods were implemented to generate the input to a neural network algorithm. This paper highlights the training and development of a back-propagation neural network prediction algorithm through careful selection of parameters, and the results show better classification accuracy than a previously developed response surface model with very similar machining parameters. Hence, artificial neural network algorithms provide better surface roughness prediction accuracy in the ultra-high precision grinding of BK7 glass.
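
For readers unfamiliar with the Box-Behnken design mentioned above, the sketch below generates the standard three-factor layout in coded units: every pair of factors is run at its four ±1 combinations while the third factor stays at its mid level, plus repeated center runs. The number of center points and the physical level values passed to `decode` are assumptions; the abstract does not report them.

```python
import itertools


def box_behnken_3(center_points=3):
    """Return coded runs (-1, 0, +1) of a three-factor Box-Behnken design.

    Twelve edge-midpoint runs come from the three factor pairs; the number
    of center runs is an assumed default.
    """
    runs = []
    for i, j in itertools.combinations(range(3), 2):
        for a, b in itertools.product((-1, 1), repeat=2):
            run = [0, 0, 0]
            run[i], run[j] = a, b
            runs.append(tuple(run))
    runs.extend([(0, 0, 0)] * center_points)
    return runs


def decode(run, levels):
    """Map coded levels to physical settings.

    `levels` holds one (low, mid, high) tuple per factor, for example feed
    rate, wheel speed and depth of cut; the values are supplied by the user.
    """
    return tuple(levels[k][c + 1] for k, c in enumerate(run))
```

The design never sets all three factors to an extreme level in the same run, which is one reason it is often chosen when corner combinations are costly or risky to machine.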

Keywords: acoustic emission technique, artificial neural network, surface roughness, ultra-high precision grinding

Procedia PDF Downloads 298

6447 Manufacturing and Calibration of Material Standards for Optical Microscopy in Industrial Environments

Authors: Alberto Mínguez-Martínez, Jesús De Vicente Y Oliva

Abstract:

The trend in industrial environments is toward the miniaturization of systems and materials and the fabrication of parts at the micro- and nano-scale. A problem arises when manufacturers want to study the quality of their production. Quality is becoming crucial due to the evolution of the industry and the development of Industry 4.0. As Industry 4.0 is based on digital models of production and processes, having accurate measurements becomes essential. At this point, the metrology field plays an important role, as it is a powerful tool to ensure more stable production and to reduce scrap and the cost of non-conformities. The most widespread measuring instruments that allow accurate measurements at these scales are optical microscopes, whether they are traditional, confocal, or focus variation microscopes, profile projectors, or any other similar measurement system. However, the accuracy of measurements depends on their traceability to the SI unit of length (the meter). Providing adequate traceability for 2D and 3D dimensional measurements at the micro- and nano-scale in industrial environments is a problem that is still being studied, and it does not have a unique answer. In addition, commercial material standards for the micro- and nano-scale present two main problems. On the one hand, those material standards that could be considered complete and very interesting do not give traceability of dimensional measurements and, on the other hand, their calibration is very expensive. This situation implies that these kinds of standards will not succeed in industrial environments and, as a result, manufacturers will work in the absence of traceability. To solve this problem in industrial environments, it becomes necessary to have material standards that are easy to use, agile, adaptable to different forms, cheap to manufacture and, of course, traceable to the definition of the meter with simple methods. By using these ‘customized standards’, it would be possible to adapt and design measuring procedures for each application, and manufacturers would work with some traceability. It is important to note that, although this traceability is clearly incomplete, this situation is preferable to working in the absence of it. Recently, the versatility and utility of using laser technology and other AM technologies to manufacture customized material standards have been demonstrated. In this paper, the authors propose the manufacture of a customized material standard using an ultraviolet laser system, together with a method to calibrate it. To conclude, the results of the calibration carried out in an accredited dimensional metrology laboratory are presented.

Keywords: industrial environment, material standards, optical measuring instrument, traceability

Procedia PDF Downloads 110

6446 Cut-Out Animation as a Technic and Development inside History Process

Authors: Armagan Gokcearslan

Abstract:

The art of animation has developed very rapidly since its first years in terms of script, sound and music, motion, character design, the techniques used, and the technological tools developed. Technical variety attracts particular attention in the art of animation. Perceived as a kind of illusion in the beginning, animations commonly used the Flash Sketch technique. Animation artists using the Flash Sketch technique created scenes by drawing them on a blackboard with chalk. The Flash Sketch technique was used by early animation artists like Emile Cohl, Winsor McCay and Blackton. Tools like the Magic Lantern, Thaumatrope, Phenakisticope, and Zoetrope were then developed and used intensively in the first years of the art of animation. Today, on the other hand, the art of animation is affected by developments in computer technology. It is possible to create three-dimensional and two-dimensional animations with the help of various computer software. The cut-out technique is among the important techniques used in the art of animation. It is based on the art of paper cutting, and an examination of cut-out animations shows that they technically resemble the art of paper cutting. The art of paper cutting has a long history. The oldest examples of paper cutting can be seen in the People’s Republic of China in the period after the 2nd century B.C., when the Chinese invented paper. The most popular artist using the cut-out animation technique is the German artist Lotte Reiniger. This study, titled “Cut-out Animation as a Technic and Development Inside History Process”, will cover the art of paper cutting, the relationship between the art of paper cutting and cut-out animation, its development within the historical process, animation artists producing artworks in this field, important cut-out animations, and their technical properties.

Keywords: cut-out, paper art, animation, technic

Procedia PDF Downloads 258

6445 A Machine Learning Approach for Performance Prediction Based on User Behavioral Factors in E-Learning Environments

Authors: Naduni Ranasinghe

Abstract:

E-learning environments have become more popular than ever due to the impact of COVID-19. Even though e-learning is one of the best solutions for the teaching-learning process in academia, it is not without major challenges. Nowadays, machine learning approaches are used to analyze how behavioral factors lead to better adoption and how they relate to better student performance in e-learning environments. During the pandemic, we realized that the academic process in the e-learning approach had a major issue, especially regarding student performance. Therefore, an approach that investigates student behaviors in e-learning environments using data-intensive machine learning is valuable. A hybrid approach was used to understand how the previously mentioned variables are related to one another. A more quantitative approach, drawing on the literature, was used to understand the weight of each factor for adoption and for performance. The data set was collected from previous research to support the training and testing process in ML. Special attention was paid to incorporating different dimensionalities of the data to understand the dependency levels of each. Five independent variables out of twelve were chosen based on their impact on the dependent variable and on the descriptive statistics. Of the three models developed (Random Forest classifier, SVM, and Decision Tree classifier), the Random Forest classifier gave the highest accuracy (0.8542). Overall, this work met its goals of improving student performance by identifying at-risk students and dropouts, emphasizing the necessity of using both static and dynamic data.

Keywords: academic performance prediction, e-learning, learning analytics, machine learning, predictive model

Procedia PDF Downloads 141

6444 Effect of Mach Number for Gust-Airfoil Interaction Noise

Authors: ShuJiang Jiang

Abstract:

The interaction of turbulence with an airfoil is an important noise source in many engineering fields, including helicopters, turbofans, and contra-rotating open rotor engines, where turbulence generated in the wake of upstream blades interacts with the leading edge of downstream blades and produces aerodynamic noise. One approach to studying turbulence-airfoil interaction noise is to model the oncoming turbulence as harmonic gusts. A compact noise source produces a dipole-like sound directivity pattern. However, when the acoustic wavelength is much smaller than the airfoil chord length, the airfoil needs to be treated as a non-compact source, and the gust-airfoil interaction becomes more complicated and results in multiple lobes in the radiated sound directivity. Capturing the short acoustic wavelength is a challenge for numerical simulations. In this work, simulations are performed for gust-airfoil interaction at different Mach numbers, using a high-fidelity direct Computational AeroAcoustic (CAA) approach based on a spectral/hp element method, verified by a CAA benchmark case. It is found that the squared sound pressure varies approximately as the 5th power of the Mach number, with an exponent that changes slightly with the observer location. This scaling law can give a better sound prediction than the flat-plate theory for thicker airfoils. In addition, another prediction method, based on the flat-plate theory and CAA simulation, has been proposed to give better predictions than the scaling law for thicker airfoils.
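
The reported scaling can be turned into a quick estimate of how the sound level shifts with Mach number. If the squared sound pressure scales as M^n, the level change between two Mach numbers is 10·n·log10(M2/M1) dB. The sketch below uses n = 5 as the default, following the approximate exponent quoted above; the function names are hypothetical, and the exponent should be treated as adjustable since the abstract notes it varies slightly with observer location.

```python
import math


def spl_change_db(mach_ref, mach_new, exponent=5.0):
    """Change in sound pressure level (dB) when p'^2 scales as Mach**exponent."""
    if mach_ref <= 0 or mach_new <= 0:
        raise ValueError("Mach numbers must be positive")
    return 10.0 * exponent * math.log10(mach_new / mach_ref)


def scale_squared_pressure(p2_ref, mach_ref, mach_new, exponent=5.0):
    """Scale a reference squared sound pressure to a new Mach number."""
    if mach_ref <= 0 or mach_new <= 0:
        raise ValueError("Mach numbers must be positive")
    return p2_ref * (mach_new / mach_ref) ** exponent
```

Under this law, doubling the Mach number multiplies the squared pressure by 2^5 = 32, which corresponds to a rise of about 15 dB.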

Keywords: aeroacoustics, gust-airfoil interaction, CFD, CAA

Procedia PDF Downloads 62

6443 The Evaluation of the Safety Coefficient of Soil Slope Stability by Group Pile

Authors: Seyed Abolhassan Naeini, Hamed Yekehdehghan

Abstract:

Stability is one of the factors that affect constructions adjacent to a slope. There are various methods for stabilizing slopes, one of which is the use of concrete group piles. This study uses FLAC3D software to investigate the changes in the safety coefficient due to the use of concrete group piles. The optimal position of the piles has also been investigated, and the results show that the group pile does not affect the toe of the slope. In addition, the effect of the piles' burial depth on the slope has been studied. The results show that increasing the piles' burial depth increases the level of stability and, as a result, the safety coefficient. When the distance between the piles was reduced and the depth of the groundwater was increased, the obtained safety coefficient increased. Finally, the effect of the resistance of the lower stabilizing layer of the slope on stabilization by the pile group was investigated. The results showed that, because the pile behaves as a deep foundation, the stronger the soil layers in the stable part of the slope are (in terms of resistance parameters), the more influential the piles are in enhancing the coefficient of safety.

Keywords: safety coefficient, group pile, slope, stability, FLAC3D software

Procedia PDF Downloads 84

6442 Influence of Single and Multiple Skin-Core Debonding on Free Vibration Characteristics of Innovative GFRP Sandwich Panels

Authors: Indunil Jayatilake, Warna Karunasena, Weena Lokuge

Abstract:

An Australian manufacturer has fabricated an innovative GFRP sandwich panel made from E-glass fiber skin and a modified phenolic core for structural applications. Debonding, which refers to the separation of the skin from the core material in composite sandwiches, is one of the most common types of damage in composites. The presence of debonding is of great concern because it not only severely affects the stiffness but also modifies the dynamic behaviour of the structure. Most research to date has been concerned with the delamination of laminated structures, whereas skin-core debonding has received relatively little attention. Furthermore, research on composite slabs with multiple skin-core debonding is very limited. To address this gap, a comprehensive study investigating the dynamic behaviour of composite panels with single and multiple debonding is presented. The study uses finite element modelling and analysis to investigate the influence of debonding on the free vibration behaviour of single and multilayer composite sandwich panels. A broad parametric investigation has been carried out by varying the debonding locations, debonding sizes and support conditions of the panels for both single and multiple debonding. Numerical models were developed with the Strand7 finite element package, selecting suitable elements to represent the actual behaviour of the panels. Three-dimensional finite element models were employed to simulate the physical situation as closely as possible, with the use of an experimentally and numerically validated finite element model. Comparative results and conclusions based on the analyses are presented. For similar extents and locations of debonding, the effect of debonding on natural frequencies appears greatly dependent on the end conditions of the panel, with a greater decrease in natural frequency when the panels are more restrained. Some modes are more sensitive to debonding, and this sensitivity seems to be related to their vibration mode shapes. The fundamental mode generally seems to be the mode least sensitive to debonding with respect to the variation in free vibration characteristics. The results indicate the effectiveness of the developed three-dimensional finite element models in assessing debonding damage in composite sandwich panels.

Keywords: debonding, free vibration behaviour, GFRP sandwich panels, three dimensional finite element modelling

Procedia PDF Downloads 303

6441 Future Projection of Glacial Lake Outburst Floods Hazard: A Hydrodynamic Study of the Highest Lake in the Dhauliganga Basin, Uttarakhand

Authors: Ashim Sattar, Ajanta Goswami, Anil V. Kulkarni

Abstract:

Glacial lake outburst floods (GLOFs) contribute significantly to mountain hazards in the Himalaya. Over the past decade, high-altitude lakes in the Himalaya have shown notable growth in size and number. The key reason is the rapid retreat of glacier fronts. Hydrodynamic modeling of a GLOF using the shallow water equations (SWE) helps in understanding its impact in the downstream region. The present study incorporates remote sensing based ice thickness modeling to determine the future extent of the Dhauliganga Lake by mapping the overdeepening extent around the highest lake in the Dhauliganga basin. The maximum future volume of the lake, calculated using area-volume scaling, is used to model a GLOF event. The GLOF hydrograph is routed along the channel using one-dimensional and two-dimensional models to understand the flood wave propagation until it reaches the first hydropower station, located 72 km downstream of the lake. The present extent of the lake, calculated using Sentinel-2 images, is 0.13 km². The maximum future extent of the lake, mapped by investigating the glacier bed, has a calculated scaled volume of 3.48 × 10⁶ m³. The GLOF modeling releasing the future volume of the lake resulted in a breach hydrograph with a peak flood of 4995 m³/s just downstream of the lake. Hydraulic routing

Keywords: GLOF, glacial lake outburst floods, mountain hazard, Central Himalaya, future projection

Procedia PDF Downloads 151

6440 Improved Soil and Snow Treatment with the Rapid Update Cycle Land-Surface Model for Regional and Global Weather Predictions

Authors: Tatiana G. Smirnova, Stan G. Benjamin

Abstract:

The Rapid Update Cycle (RUC) land surface model (LSM) was the land-surface component in several generations of operational weather prediction models at the National Centers for Environmental Prediction (NCEP) at the National Oceanic and Atmospheric Administration (NOAA). It was designed for short-range weather predictions with an emphasis on severe weather and was originally kept intentionally simple to avoid uncertainties from poorly known parameters. Nevertheless, the RUC LSM, when coupled with the hourly-assimilating atmospheric model, can produce a realistic evolution of time-varying soil moisture and temperature, as well as the evolution of snow cover on the ground surface. This result is possible only if the soil/vegetation/snow component of the coupled weather prediction model has sufficient skill to avoid long-term drift. RUC LSM was first implemented in the operational NCEP Rapid Update Cycle (RUC) weather model in 1998 and later in the Weather Research and Forecasting (WRF) model-based Rapid Refresh (RAP) and High-Resolution Rapid Refresh (HRRR). Being available to the international WRF community, it was implemented in operational weather models in Austria, New Zealand, and Switzerland. Based on feedback from the US weather service offices and the international WRF community, and on our own validation, RUC LSM has matured over the years. A sea-ice module was also added to RUC LSM for surface predictions over Arctic sea ice. Other modifications include refinements to the snow model and a more accurate specification of albedo, roughness length, and other surface properties. At present, RUC LSM is being tested in the regional application of the Unified Forecast System (UFS). The next-generation UFS-based regional Rapid Refresh FV3 Standalone (RRFS) model will replace the operational RAP and HRRR at NCEP. Over time, RUC LSM has participated in several international model intercomparison projects to verify its skill using observed atmospheric forcing.
ESM-SnowMIP was the most recent of these experiments and focused on the verification of snow models for open and forested regions. The simulations were performed for ten sites located in different climatic zones of the world and forced with observed atmospheric conditions. While most of the 26 participating models have more sophisticated snow parameterizations than RUC, RUC LSM ranked highly in simulations of both snow water equivalent and surface temperature. However, the ESM-SnowMIP experiment also revealed some issues in the RUC snow model, which are addressed in this paper. One of them is the treatment of grid cells partially covered with snow. The RUC snow module computes energy and moisture budgets of snow-covered and snow-free areas separately, aggregating the solutions at the end of each time step. Such treatment elevates the importance of computing the snow cover fraction in the model. Improvements to the original simplistic threshold-based approach have been implemented and tested both offline and in the coupled weather model. The changes to the snow cover fraction and other modifications to the RUC soil and snow parameterizations are described in detail in this paper.

Keywords: land-surface models, weather prediction, hydrology, boundary-layer processes

Procedia PDF Downloads 77

6439 Modeling of Turbulent Flow for Two-Dimensional Backward-Facing Step Flow

Authors: Alex Fedoseyev

Abstract:

This study investigates a simplified model based on the generalized hydrodynamic equations (GHE) for the simulation of turbulent flow over a two-dimensional backward-facing step (BFS) at Reynolds number Re=132000. The GHE were derived from the generalized Boltzmann equation (GBE). The GBE was obtained from first principles from the chain of Bogolubov kinetic equations and considers particles of finite dimensions. The GHE have additional terms, temporal and spatial fluctuations, compared to the Navier-Stokes equations (NSE). These terms have a timescale multiplier τ, and the GHE become the NSE when τ is zero. The nondimensional τ is a product of the Reynolds number and the squared length scale ratio, τ=Re*(l/L)², where l is the apparent Kolmogorov length scale, and L is a hydrodynamic length scale. The BFS flow modeling results obtained by 2D calculations cannot match the experimental data for Re>450. One or two additional equations are required for the turbulence model to be added to the NSE, which typically has two to five parameters to be tuned for specific problems. It is shown that the GHE do not require an additional turbulence model, while the turbulent velocity results are in good agreement with the experimental results. A review of several studies on the simulation of flow over the BFS from 1980 to 2023 is provided. Most of these studies used different turbulence models when Re>1000. In this study, the 2D turbulent flow over a BFS with height H=L/3 (where L is the channel height) at Reynolds number Re=132000 was investigated using numerical solutions of the GHE (by a finite-element method) and compared to the solutions from the Navier-Stokes equations, the k–ε turbulence model, and experimental results. The comparison included the velocity profiles at X/L=5.33 (near the end of the recirculation zone, available from the experiment), the recirculation zone length, and the velocity flow field.
The mean velocity of the NSE was obtained by averaging the solution over the number of time steps. The solution with a standard k–ε model shows a velocity profile at X/L=5.33 that has no backward flow. A standard k–ε model underpredicts the experimental recirculation zone length X/L=7.0±0.5 by a substantial 20-25%, and a more sophisticated turbulence model is needed for this problem. The obtained data confirm that the GHE results are in good agreement with the experimental results for turbulent flow over a two-dimensional BFS. A turbulence model was not required in this case. The computations were stable. The solution time for the GHE is the same as or less than that for the NSE and significantly less than that for the NSE with the turbulence model. The proposed approach was limited to 2D and only one Reynolds number. Further work will extend this approach to 3D flow and a higher Re.
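The timescale definition quoted in the abstract, τ=Re*(l/L)², can be illustrated with a short sketch. The function name and the length-scale ratios below are hypothetical and chosen only for illustration; the abstract does not report the value of l/L used in the study.

```python
def ghe_timescale(reynolds: float, length_ratio: float) -> float:
    """Nondimensional timescale tau = Re * (l/L)**2 as defined in the abstract.

    length_ratio is l/L: apparent Kolmogorov length scale over the
    hydrodynamic length scale.
    """
    if reynolds < 0 or length_ratio < 0:
        raise ValueError("Re and l/L must be non-negative")
    return reynolds * length_ratio ** 2


RE = 132000.0  # Reynolds number quoted in the abstract

# Hypothetical l/L values (assumptions, not taken from the paper).
for ratio in (0.0, 1e-4, 1e-3):
    tau = ghe_timescale(RE, ratio)
    print(f"l/L = {ratio:g}  ->  tau = {tau:g}")
```

With l/L = 0 the multiplier vanishes, which corresponds to the limit in which the GHE reduce to the NSE.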

Keywords: backward-facing step, comparison with experimental data, generalized hydrodynamic equations, separation, reattachment, turbulent flow

Procedia PDF Downloads 49

6438 Big Data in Telecom Industry: Effective Predictive Techniques on Call Detail Records

Authors: Sara ElElimy, Samir Moustafa

Abstract:

Mobile network operators face many challenges in the digital era, especially with high demands from customers. Mobile network operators are a source of big data, and traditional techniques are not effective in the new era of big data, the Internet of Things (IoT), and 5G; as a result, handling different big datasets effectively becomes a vital task for operators given the continuous growth of data and the move from long term evolution (LTE) to 5G. There is therefore an urgent need for effective big data analytics to predict future demands, traffic, and network performance in order to fulfill the requirements of the fifth generation of mobile network technology. In this paper, we introduce data science techniques using machine learning and deep learning algorithms: the autoregressive integrated moving average (ARIMA), Bayesian-based curve fitting, and a recurrent neural network (RNN) are employed for a data-driven application to mobile network operators. The main framework comprises parameter identification for each model, estimation, prediction, and a final data-driven application of the prediction to business and network performance. These models are applied to the Telecom Italia Big Data Challenge call detail records (CDRs) datasets. Evaluating these models with well-known evaluation criteria shows that ARIMA (a machine learning-based model) is a more accurate predictive model on this dataset than the RNN (a deep learning model).

Keywords: big data analytics, machine learning, CDRs, 5G

Procedia PDF Downloads 128

6437 Exploring the Relationship between Employer Brand and Organizational Attractiveness: The Mediating Role of Employer Image and the Moderating Role of Value Congruence

Authors: Yi Shan Wu, Ting Hsuan Wu, Li Wei Cheng, Pei Yu Guo

Abstract:

Given the fiercely competitive environment, human capital is one of the most valuable assets in a commercial enterprise. Therefore, developing strategies to acquire more talent is crucial. Talent is mainly attracted by both internal and external employer brands as well as by the messages conveyed by the employer image. This not only demonstrates the importance of an organization's brand and image but also shows that people might be affected by their personal values when assessing an organization as an employer. The goal of the present study is to examine the association between employer brand, employer image, and the likelihood of increasing organizational attractiveness. In addition, we draw on social identity theory to propose that value congruence may affect the relationship between employer brand and employer image. Data were collected via an online survey from people who had worked in the industry for less than a year (N=209). The results show that employer image partly mediates the effect of employer brand on organizational attractiveness. The results also suggest that value congruence does not moderate the relationship between employer brand and employer image. These findings explain why building a good employer brand could enhance organizational attractiveness and indicate that there should be other factors that affect employer image building, offering directions for future research.

Keywords: organizational attractiveness, employer brand, employer image, value congruence

Procedia PDF Downloads 123

6436 Analysis of Ionospheric Variations over Japan during 23rd Solar Cycle Using Wavelet Techniques

Authors: C. S. Seema, P. R. Prince

Abstract:

The characterization of spatio-temporal inhomogeneities occurring in the ionospheric F₂ layer is of considerable interest, since these variations are direct consequences of electrodynamical coupling between the magnetosphere and solar events. The temporal and spatial variations of the F₂ layer, which occur with periods of several days or even years, are mainly due to geomagnetic and meteorological activities. The hourly F₂ layer critical frequency (foF2) over the 23rd solar cycle (1996-2008) at three ionosonde stations in the northern hemisphere (Wakkanai, Kokubunji, and Okinawa), which fall within the same longitudinal span, is analyzed using continuous wavelet techniques. The Morlet wavelet is used to transform the continuous time series of foF2 into a two-dimensional time-frequency space, quantifying the time evolution of the oscillatory modes. The presence of significant time patterns (periodicities) at a particular time period and the time location of each periodicity are detected from the two-dimensional representation of the wavelet power, in the plane of scale and period of the time series. The mean strength of each periodicity over the entire period of analysis is studied using the global wavelet spectrum. The quasi-biennial, annual, semiannual, 27-day, diurnal, and 12-hour variations of foF2 are clearly evident in the wavelet power spectra at all three stations. Critical frequency oscillations with multi-day periods (2-3 days and 9 days at the low-latitude station, 6-7 days at all stations, and 15 days at the mid-high-latitude station) are also superimposed on the longer-time-scale variations.

Keywords: continuous wavelet analysis, critical frequency, ionosphere, solar cycle

Procedia PDF Downloads 204

6435 Effective Stacking of Deep Neural Models for Automated Object Recognition in Retail Stores

Authors: Ankit Sinha, Soham Banerjee, Pratik Chattopadhyay

Abstract:

Automated product recognition in retail stores is an important real-world application in the domain of Computer Vision and Pattern Recognition. In this paper, we consider the problem of automatically identifying the classes of the products placed on racks in retail stores from an image of the rack and information about the query/product images. We improve upon the existing approaches in terms of effectiveness and memory requirement by developing a two-stage object detection and recognition pipeline comprising a Faster-RCNN-based object localizer that detects the object regions in the rack image and a ResNet-18-based image encoder that classifies the detected regions into the appropriate classes. Each of the models is fine-tuned using appropriate data sets for better prediction, and data augmentation is performed on each query image to prepare an extensive gallery set for fine-tuning the ResNet-18-based product recognition model. This encoder is trained using a triplet loss function following the strategy of online hard negative mining for improved prediction. The proposed models are lightweight and can be connected in an end-to-end manner during deployment to automatically identify each product object placed in a rack image. Extensive experiments using the Grozi-32k and GP-180 data sets verify the effectiveness of the proposed model.
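The triplet loss with online hard negative mining mentioned above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes Euclidean distance, a hypothetical margin of 0.2, and mining within a single batch, where each anchor-positive pair is matched with the closest embedding from a different class.

```python
import math


def triplet_loss_hard_negative(embeddings, labels, margin=0.2):
    """Mean triplet loss over all anchor-positive pairs in a batch.

    For each anchor, the hardest negative is the nearest embedding that
    carries a different label (mined online, inside the batch).
    The margin value is an assumption for illustration.
    """
    n = len(embeddings)
    losses = []
    for a in range(n):
        negative_dists = [
            math.dist(embeddings[a], embeddings[k])
            for k in range(n)
            if labels[k] != labels[a]
        ]
        if not negative_dists:
            continue  # no negative available for this anchor
        hardest_negative = min(negative_dists)
        for p in range(n):
            if p != a and labels[p] == labels[a]:
                d_ap = math.dist(embeddings[a], embeddings[p])
                losses.append(max(0.0, d_ap - hardest_negative + margin))
    return sum(losses) / len(losses) if losses else 0.0


# Toy batch: two well-separated classes, so every triplet already
# satisfies the margin and contributes zero loss.
batch = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
classes = [0, 0, 1, 1]
print(triplet_loss_hard_negative(batch, classes))
```

In a real training loop the same computation would be expressed with differentiable tensor operations so that gradients flow back into the encoder.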

Keywords: retail stores, faster-RCNN, object localization, ResNet-18, triplet loss, data augmentation, product recognition

Procedia PDF Downloads 139

6434 Feature Analysis of Predictive Maintenance Models

Authors: Zhaoan Wang

Abstract:

Research in predictive maintenance modeling has improved in recent years and can now predict failures and needed maintenance with high accuracy, saving cost and improving manufacturing efficiency. However, classic prediction models provide little insight into the most important features contributing to a failure. By analyzing and quantifying feature importance in predictive maintenance models, cost saving can be optimized based on business goals. First, multiple classifiers are evaluated with cross-validation to predict multiple failure classes. Second, predictive performance with features provided by different feature selection algorithms is further analyzed. Third, features selected by different algorithms are ranked and combined based on their predictive power. Finally, the linear explainer of SHAP (SHapley Additive exPlanations) is applied to interpret classifier behavior and provide further insight into the specific roles of features in both local predictions and global model behavior. The results of the experiments suggest that certain features play dominant roles in predictive models while others have significantly less impact on the overall performance. Moreover, for multi-class prediction of machine failures, the most important features vary with the type of machine failure. The results may lead to improved productivity and cost saving by prioritizing sensor deployment, data collection, and data processing for more important features over less important ones.
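For a linear model with features treated as independent, SHAP attributions have a closed form: each feature's contribution is its weight times the feature's deviation from the background mean. The sketch below illustrates that idea with plain Python rather than the SHAP library itself; the weights, bias, and data are hypothetical and are not taken from the paper.

```python
def linear_shap_values(weights, background, x):
    """Closed-form SHAP values for a linear model under feature independence.

    phi_j = w_j * (x_j - mean_j), where mean_j is the mean of feature j
    over the background data set.
    """
    n_features = len(weights)
    means = [
        sum(row[j] for row in background) / len(background)
        for j in range(n_features)
    ]
    return [weights[j] * (x[j] - means[j]) for j in range(n_features)]


def predict(weights, bias, x):
    """Linear model output f(x) = bias + sum_j w_j * x_j."""
    return bias + sum(w * v for w, v in zip(weights, x))


# Hypothetical three-feature model and background sample.
weights, bias = [2.0, -1.0, 0.0], 0.5
background = [[0.0, 0.0, 0.0], [2.0, 4.0, 6.0]]
sample = [3.0, 1.0, 10.0]

phi = linear_shap_values(weights, background, sample)
print(phi)  # per-feature contributions for this sample
```

The contributions sum to the difference between the sample's prediction and the mean background prediction, which is the local-accuracy property that makes the values usable for both local and global interpretation. The third feature has zero weight and therefore zero attribution.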

Keywords: automated supply chain, intelligent manufacturing, predictive maintenance machine learning, feature engineering, model interpretation

Procedia PDF Downloads 118

6433 Non-Linear Assessment of Chromatographic Lipophilicity and Model Ranking of Newly Synthesized Steroid Derivatives

Authors: Milica Karadzic, Lidija Jevric, Sanja Podunavac-Kuzmanovic, Strahinja Kovacevic, Anamarija Mandic, Katarina Penov Gasi, Marija Sakac, Aleksandar Okljesa, Andrea Nikolic

Abstract:

The present paper deals with the prediction of the chromatographic lipophilicity of newly synthesized steroid derivatives. The prediction was achieved using in silico generated molecular descriptors and quantitative structure-retention relationship (QSRR) methodology with the artificial neural network (ANN) approach. Chromatographic lipophilicity of the investigated compounds was expressed as the retention factor value log k. For QSRR modeling, a feedforward back-propagation ANN with a gradient descent learning algorithm was applied. The generated ANN models were ranked using the novel sum of ranking differences (SRD) method. The aim was to identify the most consistent QSRR model and to reveal similarities or dissimilarities between the models. In this study, SRD was performed with average values of log k as reference values. An excellent correlation between the experimentally observed log k values and the values predicted by the ANN was obtained, with a correlation coefficient higher than 0.9890. Statistical results show that the established ANN models can be applied for the required purpose. This article is based upon work from COST Action (TD1305), supported by COST (European Cooperation in Science and Technology).

Keywords: artificial neural networks, liquid chromatography, molecular descriptors, steroids, sum of ranking differences

Procedia PDF Downloads 305

6432 Infernal Affairs (Hong Kong) versus Double Face (Japan): Remaking and Context

Authors: Roman Kusaiko

Abstract:

For decades, remaking has been one of the film industry's main practices, and it has become especially visible in recent years. The latest geopolitical developments, though, pose a new challenge for filmmakers regarding cultural landscapes and contextual differences. Deglobalization may also affect transnational remaking practices. These upcoming challenges can be addressed through the analysis of contemporary academic thought, primarily from adaptation and film studies, and their understanding of transmediality and how it affects film remaking. However, the analysis would be insufficient without conducting case studies. This paper is part of broader research on transnational remaking practices and their cultural and contextual specifics. It aims to understand whether a shift in medium affects remaking as a critical category and presents case studies of the popular Hong Kong motion picture Infernal Affairs and its transition into the Japanese remake Double Face. The analysis of their contextual distinctions will lead to the correct categorization of transnational remakes, allowing scholars and filmmakers to better understand existing remaking practices and whether they affect the final result.

Keywords: cinema, context, culture, films, remaking, transmediality

Procedia PDF Downloads 84

6431 Agreement between Basal Metabolic Rate Measured by Bioelectrical Impedance Analysis and Estimated by Prediction Equations in Obese Groups

Authors: Orkide Donma, Mustafa M. Donma

Abstract:

Basal metabolic rate (BMR) is a widely used and accepted measure of energy expenditure. Its principal determinant is body mass. However, this parameter is also correlated with a variety of other factors. The objective of this study is to measure BMR and compare it with the values obtained from predictive equations in adults classified according to their body mass index (BMI) values. A total of 276 adults were included in this study. Their age, height, and weight values were recorded. Five groups were formed based on BMI values. The first group (n = 85) was composed of individuals with BMI values between 18.5 and 24.9 kg/m². Those with BMI values from 25.0 to 29.9 kg/m² constituted Group 2 (n = 90). Individuals with 30.0-34.9 kg/m², 35.0-39.9 kg/m², and > 40.0 kg/m² were included in Groups 3 (n = 53), 4 (n = 28), and 5 (n = 20), respectively. The equations most commonly used to predict BMR were selected for comparison with the measured values: those introduced by the Food and Agriculture Organization (FAO)/World Health Organization (WHO)/United Nations University (UNU), Harris and Benedict, Owen, and Mifflin. Descriptive statistics, ANOVA, post hoc Tukey, and Pearson's correlation tests were performed using a statistical program designed for Windows (SPSS, version 16.0). p values smaller than 0.05 were accepted as statistically significant. Mean ± SD values of measured BMR in kcal for Groups 1, 2, 3, 4, and 5 were 1440.3 ± 210.0, 1618.8 ± 268.6, 1741.1 ± 345.2, 1853.1 ± 351.2, and 2028.0 ± 412.1, respectively. When means were compared among groups, differences were highly significant between Group 1 and each of the remaining four groups. The values increased from Group 2 to Group 5. However, differences between Group 2 and Group 3, Group 3 and Group 4, and Group 4 and Group 5 were not statistically significant.
This lack of significance disappeared in the predictive equations proposed by Harris and Benedict, FAO/WHO/UNU, and Owen. For Mifflin, the lack of significance was limited to Group 4 and Group 5. When correlations between measured BMR and the values estimated from prediction equations were evaluated, the lowest correlations were observed among individuals within the normal BMI range. The highest correlations were detected in individuals with BMI values between 30.0 and 34.9 kg/m². Correlations between measured BMR values and BMR values calculated by FAO/WHO/UNU as well as Owen were the same and the highest. In all groups, the highest correlations were observed between BMR values calculated from the Mifflin and the Harris and Benedict equations, which use age as an additional parameter. In conclusion, the unique resemblance of the FAO/WHO/UNU and Owen equations was pointed out. However, mean values obtained from FAO/WHO/UNU were much closer to the measured BMR values. In addition, the highest correlations were found between BMR calculated from FAO/WHO/UNU and measured BMR. These findings suggest that FAO/WHO/UNU is the most reliable equation, which may be used when measured BMR values are not available.
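The grouping scheme described in the abstract can be sketched as a small classifier. The abstract reports the BMI ranges to one decimal place (for example 24.9 and 25.0), so the half-open intervals used below, and the treatment of exactly 40.0 kg/m² as Group 5, are assumptions made to avoid gaps between groups; the function names are hypothetical.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_group(bmi_value: float):
    """Map a BMI value to study Groups 1-5.

    Boundaries follow the ranges in the abstract; half-open intervals
    are an assumption. Returns None below 18.5 kg/m^2, a range that is
    not part of the study groups.
    """
    lower_bounds = [(40.0, 5), (35.0, 4), (30.0, 3), (25.0, 2), (18.5, 1)]
    for lower, group in lower_bounds:
        if bmi_value >= lower:
            return group
    return None


# Hypothetical participants: (weight in kg, height in m).
for weight, height in [(70.0, 1.75), (95.0, 1.70), (130.0, 1.70)]:
    value = bmi(weight, height)
    print(f"BMI {value:.1f} kg/m^2 -> Group {bmi_group(value)}")
```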

Keywords: adult, basal metabolic rate, FAO/WHO/UNU, obesity, prediction equations

Procedia PDF Downloads 122

6430 The Impact of Direct and Indirect Pressure Measuring Systems on the Pressure Mapping for the Medical Compression Garments

Authors: Arash M. Shahidi, Tilak Dias, Gayani K. Nandasiri

Abstract:

Graduated compression is the foundation of the treatment and management of many medical complications, such as leg ulcers, varicose veins, and lymphedema, yet interface pressure has been monitored using different sensors that operate on diverse principles. The variations among pressure readings collected using different interface pressure measurement systems make decisions regarding compression therapy difficult. It is crucial to acknowledge the differences between direct and indirect pressure measurement systems when considering the commercially available systems: AMI, PicoPress, and OPM are direct measurement systems, whereas HATRA (BSI), HOSY (RAL-GZ), and FlexiForce are indirect measurement systems. Furthermore, piezo-resistive sensors (FlexiForce) measure the change in resistance corresponding to the force applied to the sensing area. Direct pressure measuring systems can measure interface pressure in the three-dimensional state, whereas indirect pressure measuring systems stretch the fabric in two dimensions and extrapolate pressure from the surface tension measured on the device, neglecting a vital factor: the radius of curvature. In this study, a leg mannequin of known dimensions is fitted with a knitted class 3 compression stocking. The data collected from the different available systems (AMI, PicoPress, FlexiForce, and HATRA) are evaluated and the results compared. The results showed discrepancies between the HATRA, AMI, PicoPress, and FlexiForce readings and the pressure standard used to produce the class 3 compression stocking. As predicted, higher pressure values were recorded with the direct interface measuring systems than with HATRA, due to the effect of the radius of curvature.

Keywords: AMI, FlexiForce, graduated compression, HATRA, interface pressure, PicoPress

Procedia PDF Downloads 334

6429 Hansen Solubility Parameter from Surface Measurements

Authors: Neveen AlQasas, Daniel Johnson

Abstract:

Membranes for water treatment are an established technology that attracts great attention due to its simplicity and cost effectiveness. However, membranes in operation suffer from the adverse effect of membrane fouling. Bio-fouling is a phenomenon that occurs at the water-membrane interface and is a dynamic process initiated by the adsorption of dissolved organic material, including biomacromolecules, on the membrane surface. After initiation, attachment of microorganisms occurs, followed by biofilm growth. The biofilm blocks the pores of the membrane and consequently reduces the water flux. Moreover, the presence of a fouling layer can have a substantial impact on the membrane separation properties. Understanding the mechanism of the initiation phase of biofouling is key to eliminating biofouling on membrane surfaces. The adhesion and attachment of different fouling materials are affected by the surface properties of the membrane materials. Therefore, the surface properties of different polymeric materials were studied in terms of their surface energies and Hansen solubility parameters (HSP). The difference between the combined HSP parameters (HSP distance) allows prediction of the affinity of two materials for each other. The possibility of measuring the HSP of different polymer films via surface measurements, such as contact angle, has been thoroughly investigated. Knowing the HSP of a membrane material and the HSP of a specific foulant facilitates the estimation of the HSP distance between the two, and therefore the strength of attachment to the surface. Contact angle measurements using fourteen different solvents on five different polymeric films were carried out using the sessile drop method. Solvents were ranked as good or bad solvents using different ranking methods, and the ranking was used to calculate the HSP of each polymeric film.
Results clearly indicate the absence of a direct relation between the contact angle values of each film and the HSP distance between each polymer film and the solvents used. Therefore, estimating HSP from contact angle alone is not sufficient. However, it was found that if the surface tensions and viscosities of the solvents used are taken into account in the analysis of the contact angle values, a prediction of the HSP from contact angle measurements is possible. This was carried out by training a neural network model. The trained neural network model has three inputs: contact angle value, surface tension, and viscosity of the solvent used. The model is able to predict the HSP distance between the solvent used and the tested polymer (material). The HSP distance prediction is further used to estimate the total and individual HSP parameters of each tested material. The results showed an accuracy of about 90% for all five studied films.
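The HSP distance referred to above is not written out in the abstract. The sketch below assumes the commonly used Hansen form, in which the dispersion term is weighted by a factor of 4: Ra² = 4(δD₁ − δD₂)² + (δP₁ − δP₂)² + (δH₁ − δH₂)². The parameter triples in the example are hypothetical placeholders, not measured values from the study.

```python
import math


def hsp_distance(material_a, material_b) -> float:
    """Hansen solubility parameter distance Ra between two materials.

    Each material is a (delta_D, delta_P, delta_H) triple in MPa^0.5:
    dispersion, polar, and hydrogen-bonding components. The factor of 4
    on the dispersion term follows the conventional Hansen definition,
    which is assumed here.
    """
    d_a, p_a, h_a = material_a
    d_b, p_b, h_b = material_b
    return math.sqrt(
        4.0 * (d_a - d_b) ** 2 + (p_a - p_b) ** 2 + (h_a - h_b) ** 2
    )


# Hypothetical polymer film and solvent (illustrative values only).
polymer = (18.0, 5.0, 5.0)
solvent = (16.0, 5.0, 5.0)
print(hsp_distance(polymer, solvent))
```

A smaller distance indicates a higher expected affinity between the two materials, which is how the abstract relates HSP distance to the strength of foulant attachment.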

Keywords: surface characterization, hansen solubility parameter estimation, contact angle measurements, artificial neural network model, surface measurements

Procedia PDF Downloads 81

6428 Study of the Persian Gulf’s and Oman Sea’s Numerical Tidal Currents

Authors: Fatemeh Sadat Sharifi

Abstract:

In this research, a barotropic model was employed to study tides in the Persian Gulf and Oman Sea, where the only forcing considered was the tidal force. To do so, a finite-difference, free-surface model, the Regional Ocean Modeling System (ROMS), was applied to data over the Persian Gulf and Oman Sea. To analyze flow patterns in the region, the results of a limited-size Finite Volume Community Ocean Model (FVCOM) configuration were used. The two regions were chosen because both are among the most critical water bodies in terms of economy, biology, fishery, shipping, navigation, and petroleum extraction. The modeled results were validated against OSU Tidal Prediction Software (OTPS) tides and observation data. Next, tidal elevation and speed were examined and a tidal analysis was carried out. Preliminary results show good accuracy in tidal height compared with observation and OTPS data and indicate that tidal currents are highest in the Hormuz Strait and in the narrow, shallow region between the Iranian coast and the islands. Furthermore, tidal analysis shows that the M₂ component has the largest value. Finally, the Persian Gulf tidal currents are divided into two branches: the first branch runs from the south toward Qatar and, via the United Arab Emirates, rotates toward the Hormuz Strait. The second branch, in the north and west, extends up to the highest point in the Persian Gulf and turns counterclockwise at the head of the Gulf.

Keywords: numerical model, barotropic tide, tidal currents, OSU tidal prediction software, OTPS

Procedia PDF Downloads 120

6427 Profiling Risky Code Using Machine Learning

Authors: Zunaira Zaman, David Bohannon

Abstract:

This study explores the application of machine learning (ML) for detecting security vulnerabilities in source code. The research aims to assist organizations with large application portfolios and limited security testing capabilities in prioritizing security activities. ML-based approaches offer benefits such as increased confidence scores, false positives and negatives tuning, and automated feedback. The initial approach using natural language processing techniques to extract features achieved 86% accuracy during the training phase but suffered from overfitting and performed poorly on unseen datasets during testing. To address these issues, the study proposes using the abstract syntax tree (AST) for Java and C++ codebases to capture code semantics and structure and generate path-context representations for each function. The Code2Vec model architecture is used to learn distributed representations of source code snippets for training a machine-learning classifier for vulnerability prediction. The study evaluates the performance of the proposed methodology using two datasets and compares the results with existing approaches. The Devign dataset yielded 60% accuracy in predicting vulnerable code snippets and helped resist overfitting, while the Juliet Test Suite predicted specific vulnerabilities such as OS-Command Injection, Cryptographic, and Cross-Site Scripting vulnerabilities. The Code2Vec model achieved 75% accuracy and a 98% recall rate in predicting OS-Command Injection vulnerabilities. The study concludes that even partial AST representations of source code can be useful for vulnerability prediction. The approach has the potential for automated intelligent analysis of source code, including vulnerability prediction on unseen source code. State-of-the-art models using natural language processing techniques and CNN models with ensemble modelling techniques did not generalize well on unseen data and faced overfitting issues. 
However, predicting vulnerabilities in source code using machine learning poses challenges such as high dimensionality and complexity of source code, imbalanced datasets, and identifying specific types of vulnerabilities. Future work will address these challenges and expand the scope of the research.

Keywords: code embeddings, neural networks, natural language processing, OS command injection, software security, code properties

Procedia PDF Downloads 95