Search results for: time prediction algorithms
19494 Data Recording for Remote Monitoring of Autonomous Vehicles
Authors: Rong-Terng Juang
Abstract:
Autonomous vehicles offer the possibility of significant benefits to social welfare. However, fully automated cars are unlikely to arrive in the near future. To speed the adoption of self-driving technologies, many governments worldwide are passing laws requiring data recorders for the testing of autonomous vehicles. Currently, a self-driving vehicle (e.g., a shuttle bus) has to be monitored from a remote control center. When an autonomous vehicle encounters an unexpected driving environment, such as road construction or an obstruction, it should request assistance from a remote operator. However, this requires large amounts of data, including images, radar and lidar data, to be transmitted from the vehicle to the remote center. Therefore, this paper proposes a data compression method for in-vehicle networks for remote monitoring of autonomous vehicles. Firstly, the time-series data are rearranged into a multi-dimensional signal space. For controller area networks (CAN), newly arrived data are mapped onto a two-dimensional time-data space associated with the specific CAN identity. Secondly, the data are sampled based on differential sampling. Finally, the whole set of data is encoded using existing algorithms such as Huffman, arithmetic and codebook encoding methods. To evaluate system performance, the proposed method was deployed on an autonomous vehicle built in-house. The test results show that the amount of data can be reduced as much as 1/7 compared to the raw data.
Keywords: autonomous vehicle, data compression, remote monitoring, controller area networks (CAN), Lidar
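The abstract does not spell out its differential sampling rule, so the following is only a minimal sketch of one plausible reading: a frame is kept only when its value differs from the last kept value for the same CAN identity. The function name `differential_sample`, the `(timestamp, can_id, value)` frame layout, and the `threshold` parameter are all assumptions made for illustration, not the authors' implementation.

```python
def differential_sample(frames, threshold=0):
    """Keep a frame only if its value changed by more than `threshold`
    since the last frame kept for the same CAN identity.

    frames: iterable of (timestamp, can_id, value) tuples (hypothetical layout).
    """
    last_kept = {}   # can_id -> last value that was kept
    kept = []
    for timestamp, can_id, value in frames:
        if can_id not in last_kept or abs(value - last_kept[can_id]) > threshold:
            kept.append((timestamp, can_id, value))
            last_kept[can_id] = value
    return kept


# Toy trace: two CAN identities, mostly repeating values.
frames = [
    (0, 0x100, 5),
    (1, 0x100, 5),   # unchanged, dropped
    (2, 0x200, 7),
    (3, 0x100, 6),   # changed, kept
    (4, 0x200, 7),   # unchanged, dropped
]
sampled = differential_sample(frames)
```

The sampled frames would then be passed to an entropy coder such as the Huffman or arithmetic coders named in the abstract; that stage is omitted here.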
Procedia PDF Downloads 163
19493 DOA Estimation Using Golden Section Search
Authors: Niharika Verma, Sandeep Santosh
Abstract:
Direction of arrival (DOA) estimation is a localization technique used in the communication field. Various algorithms have been developed for DOA estimation, such as MUSIC and ROOT MUSIC. These algorithms depend on parameters such as the number of antenna array elements and the number of snapshots. Basically, the MUSIC spectrum is evaluated and the peaks obtained are taken as the angles of arrival. The angles evaluated using this process depend on the scanning interval chosen, and the accuracy of the results depends on the coarseness of that interval. In this paper, golden section search is applied to the MUSIC algorithm, and more accurate results are therefore achieved. Initially, coarse DOA estimation is done using the MUSIC algorithm over the range -90 to 90 degrees at an interval of 10 degrees. After the peaks are obtained, fine DOA estimation is done using golden section search. Also, the partitioning method is applied to estimate the number of signals incident on the antenna array. The dependency of the algorithm on the number of snapshots is also explained. Hence, accurate results are obtained using this algorithm.
Keywords: Direction of Arrival (DOA), golden section search, MUSIC, number of snapshots
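The coarse-then-fine procedure described above can be sketched as follows. This is an illustrative sketch only: `toy_spectrum` is a hypothetical single-peak stand-in for a real MUSIC pseudo-spectrum, and the choice to bracket the peak one grid step on either side of the coarse maximum is an assumption, not a detail taken from the paper.

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2  # reciprocal of the golden ratio, about 0.618


def golden_section_max(f, lo, hi, tol=1e-6):
    """Locate the maximiser of a unimodal function f on [lo, hi]."""
    c = hi - INV_PHI * (hi - lo)
    d = lo + INV_PHI * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc > fd:
            # Peak lies in [lo, d]; the old c becomes the new d.
            hi, d, fd = d, c, fc
            c = hi - INV_PHI * (hi - lo)
            fc = f(c)
        else:
            # Peak lies in [c, hi]; the old d becomes the new c.
            lo, c, fc = c, d, fd
            d = lo + INV_PHI * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2


def toy_spectrum(theta_deg):
    """Hypothetical pseudo-spectrum with one peak at 23.7 degrees."""
    return 1.0 / (1e-3 + (theta_deg - 23.7) ** 2)


def coarse_then_fine(f, step=10):
    """Coarse scan from -90 to 90 degrees, then golden section refinement."""
    coarse_peak = max(range(-90, 91, step), key=f)
    lo = max(coarse_peak - step, -90)
    hi = min(coarse_peak + step, 90)
    return golden_section_max(f, lo, hi)


estimate = coarse_then_fine(toy_spectrum)
```

Each refinement iteration costs one new spectrum evaluation and shrinks the bracket by a factor of about 0.618, so the fine stage reaches a small tolerance with far fewer evaluations than a uniformly dense scan.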
Procedia PDF Downloads 446
19492 A Comparative Analysis of Classification Models with Wrapper-Based Feature Selection for Predicting Student Academic Performance
Authors: Abdullah Al Farwan, Ya Zhang
Abstract:
In today’s educational arena, it is critical to understand educational data and be able to evaluate important aspects, particularly data on student achievement. Educational Data Mining (EDM) is a research area that focuses on uncovering patterns and information in data from educational institutions. Teachers who are able to predict their students' class performance can use this information to improve their teaching. This information has evolved into valuable knowledge that can be used for a wide range of objectives; for example, it can inform a strategic plan for delivering high-quality education. Based on previous data, this paper recommends employing data mining techniques to forecast students' final grades. In this study, five data mining methods (Decision Tree, JRip, Naive Bayes, Multi-layer Perceptron, and Random Forest) with wrapper feature selection were used on two datasets relating to Portuguese language and mathematics classes. The results showed the effectiveness of data mining methods in predicting student academic success. The classification accuracy achieved with the selected algorithms lies in the range of 80-94%. Among all the selected classification algorithms, the lowest accuracy is achieved by the Multi-layer Perceptron algorithm, which is close to 70.45%, and the highest accuracy is achieved by the Random Forest algorithm, which is close to 94.10%. This proposed work can help educational administrators identify poorly performing students at an early stage and perhaps implement motivational interventions to improve their academic success and prevent dropout.
Keywords: classification algorithms, decision tree, feature selection, multi-layer perceptron, Naïve Bayes, random forest, students’ academic performance
Procedia PDF Downloads 166
19491 Prediction Compressive Strength of Self-Compacting Concrete Containing Fly Ash Using Fuzzy Logic Inference System
Authors: Belalia Douma Omar, Bakhta Boukhatem, Mohamed Ghrici
Abstract:
Self-compacting concrete (SCC), developed in Japan in the late 1980s, has enabled the construction industry to reduce demand on resources, improve working conditions, and reduce environmental impact by eliminating the need for compaction. Fuzzy logic (FL) approaches have recently been used to model some human activities in many areas of civil engineering. These systems have produced very good results, especially in modeling experimental studies. In the present study, a model for predicting the compressive strength of SCC containing various proportions of fly ash as partial replacement of cement has been developed using an Adaptive Neuro-Fuzzy Inference System (ANFIS). To build this model, a database of experimental data was gathered from the literature and used for training and testing the model. The inputs to the fuzzy logic model are arranged as five parameters: total binder content, fly ash replacement percentage, water content, superplasticizer, and age of specimens. The training and testing results of the fuzzy logic model show a strong potential for predicting the compressive strength of SCC containing fly ash in the considered range.
Keywords: self-compacting concrete, fly ash, strength prediction, fuzzy logic
Procedia PDF Downloads 335
19490 Transforming Data Science Curriculum Through Design Thinking
Authors: Samar Swaid
Abstract:
Today, corporations are moving toward the adoption of Design-Thinking techniques to develop products and services, putting their consumer at the heart of the development process. One of the leading companies in Design-Thinking, IDEO (Innovation, Design, Engineering Organization), defines Design-Thinking as an approach to problem-solving that relies on a set of multi-layered skills, processes, and mindsets that help people generate novel solutions to problems. Design thinking may result in new ideas, narratives, objects or systems. It is about redesigning systems, organizations, infrastructures, processes, and solutions in an innovative fashion based on the users' feedback. Tim Brown, president and CEO of IDEO, sees design thinking as a human-centered approach that draws from the designer's toolkit to integrate people's needs, innovative technologies, and business requirements. Design thinking has been applied to develop innovative applications, interactive systems, scientific software, and healthcare applications, and even to re-think business operations, as in the case of Airbnb. Recently, there has been a movement to apply design thinking to machine learning and artificial intelligence to ensure creating the "wow" effect on consumers. The Association for Computing Machinery task force on Data Science programs states that "Data scientists should be able to implement and understand algorithms for data collection and analysis. They should understand the time and space considerations of algorithms. They should follow good design principles developing software, understanding the importance of those principles for testability and maintainability." However, this definition hides the user behind the machine who works on data preparation, algorithm selection and model interpretation.
Thus, the Data Science program includes design thinking to ensure meeting user demands, generating more usable machine learning tools, and developing ways of framing computational thinking. Here, we describe the fundamentals of Design-Thinking and teaching modules for data science programs.
Keywords: data science, design thinking, AI, curriculum, transformation
Procedia PDF Downloads 81
19489 A Comparison between the Results of Hormuz Strait Wave Simulations Using WAVEWATCH-III and MIKE21-SW and Satellite Altimetry Observations
Authors: Fatemeh Sadat Sharifi
Abstract:
In the present study, the capabilities of WAVEWATCH-III and MIKE21-SW for predicting the characteristics of wind waves in the Strait of Hormuz are evaluated. Wind data were derived from the Global Forecast System (GFS). The gridded bathymetry, at 2 arc-minute resolution, was extracted from ETOPO1. The WAVEWATCH-III results show more valid prediction of wave features than MIKE21-SW in deep water. In shallow areas, however, MIKE21-SW is more consistent with the altimetry measurements. This may be due to the merits of the unstructured grid used in MIKE21-SW, which leads to a better representation of the coastal area. The findings on the direction of wind-generated waves in the modeling area indicate that in some regions, despite the increase in wind speed, significant wave height stays nearly unchanged. This is fundamentally because of swift changes in wind track over the Strait of Hormuz. After discussing wind-induced waves in the region, the impact of instability of the surface layer on wave growth is considered. For this purpose, the monthly mean air temperature has been used. The results for cold months, when the surface layer is unstable, indicate an acceptable increase in the accuracy of prediction of the indicator wave height.
Keywords: numerical modeling, WAVEWATCH-III, Strait of Hormuz, MIKE21-SW
Procedia PDF Downloads 207
19488 Assessing Level of Pregnancy Rate and Milk Yield in Indian Murrah Buffaloes
Authors: V. Jamuna, A. K. Chakravarty, C. S. Patil, Vijay Kumar, M. A. Mir, Rakesh Kumar
Abstract:
Intense selection of buffaloes for milk production at organized herds of the country, without due attention to fertility traits such as pregnancy rate, has led to deterioration in their performance. The aim of this study is to develop an optimum model for predicting pregnancy rate and to assess the level of pregnancy rate with respect to milk production in Murrah buffaloes. Data pertaining to 1224 lactation records of Murrah buffaloes spread over a period of 21 years were analyzed, and it was observed that pregnancy rate had a negative phenotypic association with lactation milk yield (-0.08 ± 0.04). To develop an optimum model for pregnancy rate in Murrah buffaloes, seven simple and multiple regression models were developed. Among the seven models, model II, having only service period as an independent reproduction variable, was found to be the best prediction model based on four statistical criteria: high coefficient of determination (R²), low mean sum of squares due to error (MSSe), conceptual predictive (CP) value, and Bayesian information criterion (BIC). To standardize the level of fertility with milk production, pregnancy rate was classified into seven classes with an increment of 10% in all parities, life time and their corresponding average pregnancy rate in relation to the average lactation milk yield (MY). It was observed that to achieve around 2000 kg MY, which can be considered optimum for Indian Murrah buffaloes, the level of pregnancy rate should be between 30% and 50%.
Keywords: life time, pregnancy rate, production, service period, standardization
Procedia PDF Downloads 635
19487 Design of a Graphical User Interface for Data Preprocessing and Image Segmentation Process in 2D MRI Images
Authors: Enver Kucukkulahli, Pakize Erdogmus, Kemal Polat
Abstract:
2D image segmentation is a significant process for finding a suitable region in medical images such as MRI, PET, and CT. In this study, we focus on the image segmentation of 2D MRI images. We have designed a GUI (graphical user interface) written in MATLAB™ for 2D MRI images. The program has two interfaces: data pre-processing and image clustering or segmentation. The data pre-processing section offers a median filter, average filter, unsharp mask filter, Wiener filter, and custom filter (a filter designed by the user in MATLAB). For image clustering, there are seven different image segmentation algorithms for 2D MR images: PSO (particle swarm optimization), GA (genetic algorithm), Lloyd's algorithm, k-means, the combination of Lloyd's and k-means, mean shift clustering, and BBO (Biogeography Based Optimization). To find a suitable cluster number for 2D MRI, we have designed a histogram-based cluster estimation method and then applied these numbers to the image segmentation algorithms to cluster an image automatically. This GUI software also allowed us to select the best hybrid method for each 2D MR image.
Keywords: image segmentation, clustering, GUI, 2D MRI
Procedia PDF Downloads 377
19486 Comparison of Multivariate Adaptive Regression Splines and Random Forest Regression in Predicting Forced Expiratory Volume in One Second
Authors: P. V. Pramila, V. Mahesh
Abstract:
Pulmonary Function Tests are important non-invasive diagnostic tests that assess respiratory impairments and provide quantifiable measures of lung function. Spirometry is the most frequently used measure of lung function and plays an essential role in the diagnosis and management of pulmonary diseases. However, the test requires considerable patient effort and cooperation, which are markedly related to the age of patients, resulting in incomplete data sets. This paper presents a nonlinear model built using Multivariate Adaptive Regression Splines (MARS) and a Random Forest (RF) regression model to predict the missing spirometric features. Random forest based feature selection is used to enhance both the generalization capability and the model interpretability. In the present study, flow-volume data are recorded for N = 198 subjects. The ranked order of the feature importance index calculated by the random forest model shows that the spirometric features FVC, FEF 25, PEF, FEF 25-75, and FEF 50, and the demographic parameter height, are the important descriptors. A comparison of the performance of both models shows that the prediction ability of MARS with the top two ranked features, namely FVC and FEF 25, is higher, yielding a model fit of R² = 0.96 and R² = 0.99 for normal and abnormal subjects. The Root Mean Square Error analysis of the RF model and the MARS model also shows that the latter is capable of predicting the missing values of FEV1 with a notably lower error value of 0.0191 (normal subjects) and 0.0106 (abnormal subjects). It is concluded that combining feature selection with a prediction model provides a minimum subset of predominant features to train the model, yielding better prediction performance. This analysis can assist clinicians with an intelligent support system for medical diagnosis and the improvement of clinical care.
Keywords: FEV, multivariate adaptive regression splines, pulmonary function test, random forest
Procedia PDF Downloads 310
19485 An Automated Optimal Robotic Assembly Sequence Planning Using Artificial Bee Colony Algorithm
Authors: Balamurali Gunji, B. B. V. L. Deepak, B. B. Biswal, Amrutha Rout, Golak Bihari Mohanta
Abstract:
Robots play an important role in operations such as pick and place, assembly, and spot welding in manufacturing industries. Of these, assembly is a very important process in manufacturing, accounting for 20% of manufacturing cost. To perform the assembly task effectively, Assembly Sequence Planning (ASP) is required. ASP is a multi-objective non-deterministic optimization problem; finding the optimal assembly sequence involves a huge search space and is highly complex. Many researchers have used different algorithms to solve the ASP problem, but these have several limitations, such as convergence to locally optimal solutions, a huge search space, long execution time, and complexity in applying the algorithm. Keeping these limitations in mind, this paper proposes a new automated optimal robotic assembly sequence planning method using the Artificial Bee Colony (ABC) algorithm. In this method, assembly predicates are extracted automatically through a Computer Aided Design (CAD) interface instead of manually. This reduces the time needed to extract the assembly predicates and obtain a feasible assembly sequence. The fitness of the feasible sequences obtained is evaluated using the ABC algorithm to generate the optimal assembly sequence. The proposed methodology is applied to different industrial products, and the results are compared with past literature.
Keywords: assembly sequence planning, CAD, artificial Bee colony algorithm, assembly predicates
Procedia PDF Downloads 237
19484 MPPT Control with (P&O) and (FLC) Algorithms of Solar Electric Generator
Authors: Dib Djalel, Mordjaoui Mourad
Abstract:
The current trend towards the exploitation of various renewable energy resources has become indispensable, so it is important to improve the efficiency and reliability of GPV photovoltaic systems. Maximum Power Point Tracking (MPPT) plays an important role in photovoltaic power systems because it maximizes the power output from a PV system for a given set of conditions. This paper presents a new fuzzy logic control based MPPT algorithm for a solar panel. The solar panel is modeled and analyzed in Matlab/Simulink. The solar panel produces maximum power at a particular operating point called the Maximum Power Point (MPP). To produce maximum power and achieve maximum efficiency, the entire photovoltaic panel must operate at this point. The maximum power point of a PV panel keeps changing with environmental conditions such as solar irradiance and cell temperature. Thus, to extract the maximum available power from a PV module, MPPT algorithms are implemented: Perturb and Observe (P&O) MPPT and fuzzy logic control (FLC) MPPT are developed and compared. Simulation results show the effectiveness of the fuzzy control technique in producing a more stable power output.
Keywords: MPPT, photovoltaic panel, fuzzy logic control, modeling, solar power
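For readers unfamiliar with Perturb and Observe, the basic loop can be sketched as below. This is a generic textbook form of P&O, not the authors' Simulink model: `pv_power` is a hypothetical toy power-voltage curve, and the starting voltage, step size, and iteration count are arbitrary illustrative values.

```python
import math


def pv_power(v):
    """Hypothetical PV panel power (W) at terminal voltage v (V).

    A toy curve with a single maximum; not a calibrated panel model.
    """
    current = 5.0 * (1.0 - math.exp((v - 21.0) / 1.5))
    return v * current


def perturb_and_observe(power, v0=10.0, step=0.1, iterations=300):
    """Fixed-step P&O: keep perturbing in the same direction while power
    rises, and reverse direction as soon as power falls."""
    v = v0
    p_prev = power(v)
    direction = 1
    for _ in range(iterations):
        v += direction * step
        p = power(v)
        if p < p_prev:
            direction = -direction
        p_prev = p
    return v


v_mpp = perturb_and_observe(pv_power)
```

With a fixed step, the operating point never settles exactly; it oscillates within a couple of steps of the maximum power point. That steady-state ripple is the kind of behaviour against which the abstract's "more stable power" claim for fuzzy control is made.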
Procedia PDF Downloads 483
19483 Coding Considerations for Standalone Molecular Dynamics Simulations of Atomistic Structures
Authors: R. O. Ocaya, J. J. Terblans
Abstract:
The laws of Newtonian mechanics allow ab-initio molecular dynamics to model and simulate particle trajectories in materials science by defining a differentiable potential function. This paper discusses some considerations for the coding of ab-initio programs for simulation on a standalone computer and illustrates the approach with C language code in the context of embedded metallic atoms in the face-centred cubic structure. The algorithms use velocity-time integration to determine particle parameter evolution for up to several thousand particles in a thermodynamic ensemble. Such functions are reusable and can be placed in a redistributable header library file. While both commercial and free packages are available, their heuristic nature prevents dissection. In addition, developing one's own code has the obvious advantage of teaching techniques applicable to new problems.
Keywords: C language, molecular dynamics, simulation, embedded atom method
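The abstract does not name its integration scheme. As an illustration of time-stepping a particle under a differentiable potential, the sketch below uses the velocity Verlet scheme, which is an assumption here, and a one-dimensional harmonic force as a hypothetical stand-in for the embedded atom potential. It is written in Python rather than the paper's C for brevity.

```python
def velocity_verlet(x, v, force, mass, dt, steps):
    """Advance position x and velocity v by `steps` time steps of size dt
    using the velocity Verlet scheme."""
    a = force(x) / mass
    for _ in range(steps):
        x += v * dt + 0.5 * a * dt * dt      # position update
        a_new = force(x) / mass              # force at the new position
        v += 0.5 * (a + a_new) * dt          # velocity from averaged acceleration
        a = a_new
    return x, v


def harmonic_force(x, k=1.0):
    """Hypothetical stand-in potential: U(x) = k x^2 / 2, so F = -k x."""
    return -k * x


def total_energy(x, v, mass=1.0, k=1.0):
    return 0.5 * mass * v * v + 0.5 * k * x * x


x_end, v_end = velocity_verlet(1.0, 0.0, harmonic_force, mass=1.0, dt=0.01, steps=1000)
energy_drift = abs(total_energy(x_end, v_end) - total_energy(1.0, 0.0))
```

Checking that the total energy stays close to its initial value, as `energy_drift` does here, is a common sanity test for this kind of integrator before moving to many-particle potentials.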
Procedia PDF Downloads 305
19482 Principal Component Analysis Combined Machine Learning Techniques on Pharmaceutical Samples by Laser Induced Breakdown Spectroscopy
Authors: Kemal Efe Eseller, Göktuğ Yazici
Abstract:
Laser-induced breakdown spectroscopy (LIBS) is a rapid optical atomic emission spectroscopy technique used for material identification and analysis, with the advantages of in-situ analysis, elimination of intensive sample preparation, and micro-destructive properties for the material to be tested. LIBS delivers short laser pulses onto the material in order to create plasma by exciting the material to a certain threshold. The plasma characteristics, which consist of wavelength value and intensity amplitude, depend on the material and the experiment’s environment. In the present work, the spectrum profiles of medicine samples were obtained via LIBS. The datasets include two different concentrations for each of two paracetamol-based medicines, namely Aferin and Parafon. The spectrum data of the samples were preprocessed by filling outliers based on quartiles, smoothing spectra to eliminate noise, and normalizing both the wavelength and intensity axes. Statistical information was obtained, and principal component analysis (PCA) was applied to both the preprocessed and raw datasets. The machine learning models were built on two different train-test splits: 70% training / 30% test and 80% training / 20% test. Cross-validation was used to protect the models against overfitting, since the sample size is small. The machine learning results for the preprocessed and raw datasets were compared for both splits.
This is the first time that all supervised machine learning classification algorithms, consisting of Decision Trees, Discriminant, naïve Bayes, Support Vector Machines (SVM), k-NN (k-Nearest Neighbor), Ensemble Learning and Neural Network algorithms, have been applied to LIBS data of paracetamol-based pharmaceutical samples and their different concentrations, on both preprocessed and raw datasets, in order to observe the effect of preprocessing.
Keywords: machine learning, laser-induced breakdown spectroscopy, medicines, principal component analysis, preprocessing
Procedia PDF Downloads 87
19481 Prediction of Formation Pressure Using Artificial Intelligence Techniques
Authors: Abdulmalek Ahmed
Abstract:
Formation pressure is the main factor that affects the economics and efficiency of drilling operations. Knowing the pore pressure and the parameters that affect it helps to reduce the cost of the drilling process. Many empirical models reported in the literature have been used to calculate the formation pressure based on different parameters. Some of these models used only drilling parameters to estimate pore pressure. Other models predicted the formation pressure based on log data. All of these models required different trends, such as normal or abnormal, to predict the pore pressure. A few researchers have applied artificial intelligence (AI) techniques to predict the formation pressure, using only one or at most two AI methods. The objective of this research is to predict the pore pressure based on both drilling parameters and log data, namely weight on bit, rotary speed, rate of penetration, mud weight, bulk density, porosity and delta sonic time. Real field data are used to predict the formation pressure using five different AI methods: artificial neural networks (ANN), radial basis function (RBF), fuzzy logic (FL), support vector machine (SVM) and functional networks (FN). All AI tools were compared with different empirical models. The AI methods estimated the formation pressure with high accuracy (high correlation coefficient and low average absolute percentage error) and outperformed all previous models. The advantage of the new technique is its simplicity: it estimates pore pressure without the need for different trends, whereas other models require two different trends (normal or abnormal pressure). Moreover, comparing the AI tools with each other indicates that SVM has the advantage in pore pressure prediction, owing to its fast processing speed and high performance (a high correlation coefficient of 0.997 and a low average absolute percentage error of 0.14%).
In the end, a new empirical correlation for formation pressure was developed using the ANN method that can estimate pore pressure with high precision (correlation coefficient of 0.998 and average absolute percentage error of 0.17%).
Keywords: Artificial Intelligence (AI), Formation pressure, Artificial Neural Networks (ANN), Fuzzy Logic (FL), Support Vector Machine (SVM), Functional Networks (FN), Radial Basis Function (RBF)
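The two accuracy measures quoted in this abstract, correlation coefficient and average absolute percentage error, can be computed as in the sketch below. The Pearson form of the correlation coefficient is assumed, since the abstract does not specify which form is meant, and the sample values are made-up illustrative numbers, not data from the study.

```python
import math


def correlation_coefficient(actual, predicted):
    """Pearson correlation between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def average_absolute_percentage_error(actual, predicted):
    """Mean of |actual - predicted| / |actual|, expressed in percent."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


# Made-up illustrative values, not measurements from the paper.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
r = correlation_coefficient(actual, predicted)
aape = average_absolute_percentage_error(actual, predicted)
```

The two measures are complementary: a prediction that is consistently scaled or offset can still have a correlation near 1, so the percentage error is needed to show how far the values actually are from the measurements.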
Procedia PDF Downloads 149
19480 A Contemporary Advertising Strategy on Social Networking Sites
Authors: M. S. Aparna, Pushparaj Shetty D.
Abstract:
Nowadays, social networking sites have become so popular that producers and sellers regard them as one of the best options for targeting the right audience to market their products. Several tools are available to monitor or analyze social networks. Our task is to identify the right community web pages, analyze the behavior of their members using these tools, and formulate an appropriate strategy to market products or services to achieve the set goals. Advertising becomes more effective when information about the product or service comes from a known source. The strategy explores the strong buying influence of referral marketing on the audience. Our methodology proceeds with critical budget analysis and promotes viral influence propagation. In this context, we cover the vital elements of budget evaluation, such as the optimal number of seed nodes (primary influential users activated at the outset), an estimated coverage spread of nodes, and the maximum influence propagation distance from an initial seed to an end node. Our proposal for a Buyer Prediction mathematical model arises from the need to perform complex analysis when the probability density estimates of reliable factors are unknown or difficult to calculate. Order Statistics and the Buyer Prediction mapping function guarantee the selection of optimal influential users at each level. We use community pages and user behavior as an efficient tactic to identify product enthusiasts on social networks. Our approach is promising and should be an elementary choice when there is little or no prior knowledge of the distribution of potential buyers on social networks. In this strategy, product news propagates to influential users on or surrounding networks.
By applying the same technique, a user who is interested in a product can search for friends who are able to give better advice or referrals.
Keywords: viral marketing, social network analysis, community web pages, buyer prediction, influence propagation, budget constraints
Procedia PDF Downloads 262
19479 The Psychosis Prodrome: Biomarkers of the Glutamatergic System and Their Potential Role in Prediction and Treatment
Authors: Peter David Reiss
Abstract:
The concept of the psychosis prodrome has allowed for the identification of adolescent and young adult patients who have a significantly elevated risk of developing schizophrenia spectrum disorders. A number of different interventions have been tested in order to prevent or delay progression of symptoms. To date, there has been no consistent meta-analytical evidence to support the efficacy of antipsychotic treatment for patients in the prodromal state, and the case for their use therefore remains inconclusive. Although antipsychotics may manage symptoms transiently, they have not been found to prevent or delay the onset of psychotic disorders. Furthermore, pharmacological intervention in high-risk individuals remains controversial because of the antipsychotic side effect profile in a population in which only about 20 to 35 percent will eventually convert to psychosis over a two-year period, with conversion rates not exceeding 30 to 40 percent even after two years. This general estimate is additionally problematic in that it ignores the significant variation in individual risk among clinical high-risk cases. The current lack of reliable tests for at-risk patients makes it difficult to justify individual treatment decisions. Preventive treatment should ideally be dictated by an individual’s risk while minimizing potentially harmful medication exposure. This requires more accurate predictive assessments using valid and accessible prognostic markers. The following will compare the prediction and risk modification potential of behavioral biomarkers such as disturbances of basic sense of self and emotion awareness, neurocognitive biomarkers such as attention, working and declarative memory, and neurophysiological biomarkers such as glutamatergic abnormalities and NMDA receptor dysfunction.
Identification of robust biomarkers could therefore not only provide more reliable means of psychosis prediction, but also help test and develop new clinical interventions targeted at the prodromal state.
Keywords: at-risk mental state, biomarkers, glutamatergic system, NMDA receptor, psychosis prodrome, schizophrenia
Procedia PDF Downloads 195
19478 Prediction of the Crustal Deformation of Volcán - Nevado Del Ruíz in the Year 2020 Using Tropomi Tropospheric Information, Dinsar Technique, and Neural Networks
Authors: Juan Sebastián Hernández
Abstract:
The Nevado del Ruíz volcano, located on the border between the Departments of Caldas and Tolima in Colombia, showed unstable behaviour during 2020. This volcanic activity led to secondary effects on the crust, which makes the prediction of deformations a task for geoscientists. In this article, tropospheric variables such as evapotranspiration, UV aerosol index, carbon monoxide, nitrogen dioxide, methane, and surface temperature, among others, are used to train a set of neural networks that can predict the behaviour of the resulting phase of an unwrapped interferogram obtained with the DInSAR technique. The main objective is to identify and characterise the behaviour of the crust based on the environmental conditions. For this purpose, variables were collected, a generalised linear model was created, and a set of neural networks was built. After the network was trained, validation was carried out with the test data, giving an MSE of 0.17598 and an associated r-squared of approximately 0.88454. The resulting model provided a dataset with good thematic accuracy, reflecting the behaviour of the volcano in 2020, given a set of environmental characteristics.
Keywords: crustal deformation, Tropomi, neural networks (ANN), volcanic activity, DInSAR
Procedia PDF Downloads 103
19477 Harnessing Artificial Intelligence and Machine Learning for Advanced Fraud Detection and Prevention
Authors: Avinash Malladhi
Abstract:
Forensic accounting is a specialized field that involves the application of accounting principles, investigative skills, and legal knowledge to detect and prevent fraud. With the rise of big data and technological advancements, artificial intelligence (AI) and machine learning (ML) algorithms have emerged as powerful tools for forensic accountants to enhance their fraud detection capabilities. In this paper, we review and analyze various AI/ML algorithms that are commonly used in forensic accounting, including supervised and unsupervised learning, deep learning, natural language processing, Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Support Vector Machines (SVMs), Decision Trees, and Random Forests. We discuss their underlying principles, strengths, and limitations and provide empirical evidence from existing research studies demonstrating their effectiveness in detecting financial fraud. We also highlight potential ethical considerations and challenges associated with using AI/ML in forensic accounting. Furthermore, we highlight the benefits of these technologies in improving fraud detection and prevention in forensic accounting.
Keywords: AI, machine learning, forensic accounting & fraud detection, anti money laundering, Benford's law, fraud triangle theory
Procedia PDF Downloads 93
19476 Strategies in Customer Relationship Management and Customers’ Behavior in Making Decision on Buying Car Insurance of Southeast Insurance Co. Ltd. in Bangkok
Authors: Nattapong Techarattanased, Paweena Sribunrueng
Abstract:
The objective of this study is to investigate customer relationship management strategies and customers’ behavior when deciding to buy car insurance from Southeast Insurance Co. Ltd. in Bangkok. The subjects were 400 customers over 20 years old, who completed questionnaires. The data were analyzed using the arithmetic mean and multiple regression. The results revealed that customers rated the customer relationship management strategies, i.e. customer relationship, customer feedback, customer follow-up, useful service suggestions, customer communication, and service channels, at a moderate level, but rated customer retention at a high level. Moreover, two strategies, customer relationship and customer feedback, influenced customers’ decision to buy car insurance. These two factors can be used for prediction at a rate of 34%. In addition, three strategies, customer retention, customer feedback, and useful service suggestions, influenced customers’ decisions on how long to remain customers. These three factors could be used for prediction at a rate of 45%.
Keywords: strategies, customer relationship management, behavior in buying decision, car insurance
Procedia PDF Downloads 405
19475 Using Simulation Modeling Approach to Predict USMLE Steps 1 and 2 Performances
Authors: Chau-Kuang Chen, John Hughes, Jr., A. Dexter Samuels
Abstract:
The prediction models for the United States Medical Licensing Examination (USMLE) Steps 1 and 2 performances were constructed by the Monte Carlo simulation modeling approach via linear regression. The purpose of this study was to build robust simulation models to accurately identify the most important predictors and yield valid range estimations of the Steps 1 and 2 scores. The application of the simulation modeling approach was deemed an effective way of predicting student performance on licensure examinations. Also, sensitivity analysis (also known as what-if analysis) in the simulation models was used to predict the magnitude of the change in Steps 1 and 2 scores resulting from changes in the National Board of Medical Examiners (NBME) Basic Science Subject Board scores. In addition, the study results indicated that the Medical College Admission Test (MCAT) Verbal Reasoning score and the Step 1 score were significant predictors of Step 2 performance. Hence, institutions could screen qualified student applicants for interviews and document the effectiveness of their basic science education program based on the simulation results.
Keywords: prediction model, sensitivity analysis, simulation method, USMLE
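The combination of linear regression, Monte Carlo simulation, and what-if analysis described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: a single predictor, normally distributed inputs and residuals, and made-up scores. The function names and all numbers are hypothetical and do not come from the study.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # n - 2 degrees of freedom for a two-parameter model
    sd = (sum(r * r for r in residuals) / (len(x) - 2)) ** 0.5
    return a, b, sd


def simulate_scores(a, b, sd, x_mean, x_sd, n_draws=10_000, seed=0):
    """Monte Carlo range estimate: draw the predictor and the residual,
    then report the 2.5th percentile, mean, and 97.5th percentile."""
    rng = random.Random(seed)
    draws = sorted(
        a + b * rng.gauss(x_mean, x_sd) + rng.gauss(0.0, sd)
        for _ in range(n_draws)
    )
    low = draws[int(0.025 * n_draws)]
    high = draws[int(0.975 * n_draws)]
    return low, statistics.fmean(draws), high


def what_if(b, delta_x):
    """Sensitivity of the predicted score to a change in the predictor."""
    return b * delta_x


# Hypothetical subject-board and Step 1 scores, for illustration only.
board = [60, 65, 70, 75, 80, 85]
step1 = [195, 205, 212, 224, 230, 241]
a, b, sd = fit_line(board, step1)
print(simulate_scores(a, b, sd, x_mean=72.0, x_sd=8.0))
print(what_if(b, delta_x=5.0))
```

In a linear model the what-if effect is simply the slope times the change in the predictor; the simulation adds a plausible range around the point prediction.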
Procedia PDF Downloads 339
19474 Exploring Syntactic and Semantic Features for Text-Based Authorship Attribution
Authors: Haiyan Wu, Ying Liu, Shaoyun Shi
Abstract:
Authorship attribution extracts features to identify the authors of anonymous documents. Many previous works on authorship attribution focus on statistical style features (e.g., sentence/word length) and content features (e.g., frequent words, n-grams). Modeling these features by regression or other transparent machine learning methods gives a portrait of the authors' writing style, but these methods do not capture syntactic (e.g., dependency relationship) or semantic (e.g., topics) information. In recent years, some researchers have modeled syntactic trees or latent semantic information with neural networks. However, few works consider them together. Besides, predictions by neural networks are difficult to explain, although explainability is vital in authorship attribution tasks. In this paper, we not only utilize the statistical style and content features but also take advantage of both syntactic and semantic features. Unlike an end-to-end neural model, our method separates feature selection and prediction into two steps. An attentive n-gram network is utilized to select useful features, and logistic regression is applied to give a prediction and an understandable representation of writing style. Experiments show that our extracted features can improve the state-of-the-art methods on three benchmark datasets.
Keywords: authorship attribution, attention mechanism, syntactic feature, feature extraction
Procedia PDF Downloads 136
19473 Implementation of Deep Neural Networks for Pavement Condition Index Prediction
Authors: M. Sirhan, S. Bekhor, A. Sidess
Abstract:
In-service pavements deteriorate with time due to traffic wheel loads, environment, and climate conditions. Pavement deterioration leads to a reduction in their serviceability and structural behavior. Consequently, proper maintenance and rehabilitation (M&R) are necessary actions to keep the in-service pavement network at the desired level of serviceability. Due to resource and financial constraints, the pavement management system (PMS) prioritizes roads most in need of maintenance and rehabilitation action. It recommends a suitable action for each pavement based on the performance and surface condition of each road in the network. The pavement performance and condition are usually quantified and evaluated by different types of roughness-based and stress-based indices. Examples of such indices are Pavement Serviceability Index (PSI), Pavement Serviceability Ratio (PSR), Mean Panel Rating (MPR), Pavement Condition Rating (PCR), Ride Number (RN), Profile Index (PI), International Roughness Index (IRI), and Pavement Condition Index (PCI). PCI is commonly used in PMS as an indicator of the extent of the distresses on the pavement surface. PCI values range between 0 and 100, where 0 represents a highly deteriorated pavement and 100 a newly constructed pavement. The PCI value is a function of distress type, severity, and density (measured as a percentage of the total pavement area). PCI is usually calculated iteratively using the 'Paver' program developed by the US Army Corps of Engineers. The use of soft computing techniques, especially Artificial Neural Networks (ANN), has become increasingly popular in the modeling of engineering problems. ANN techniques have successfully modeled the performance of in-service pavements, due to their efficiency in predicting and solving non-linear relationships and in dealing with large amounts of uncertain data.
Typical regression models, which require a pre-defined relationship, can be replaced by ANN, which was found to be an appropriate tool for predicting the different pavement performance indices from different factors. Accordingly, the objective of the presented study is to develop and train an ANN model that predicts the PCI values. The model’s input consists of the percentage areas of 11 different damage types: alligator cracking, swelling, rutting, block cracking, longitudinal/transverse cracking, edge cracking, shoving, raveling, potholes, patching, and lane drop off, each at three severity levels (low, medium, high). The developed model was trained using 536,000 samples and tested on 134,000 samples. The samples were collected and prepared by The National Transport Infrastructure Company. The predicted results agreed satisfactorily with field measurements. The proposed model predicted PCI values with relatively low standard deviations, suggesting that it could be incorporated into the PMS for PCI determination. It is worth mentioning that the most influential variables for PCI prediction are damages related to alligator cracking, swelling, rutting, and potholes.
Keywords: artificial neural networks, computer programming, pavement condition index, pavement management, performance prediction
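The input layout described in this abstract, 11 damage types at three severity levels each, implies a 33-element feature vector per pavement section. The sketch below shows one way such a vector could be assembled. The ordering, the dictionary format, and the function name are assumptions made for illustration, not the authors' data format.

```python
DAMAGE_TYPES = [
    "alligator cracking", "swelling", "rutting", "block cracking",
    "longitudinal/transverse cracking", "edge cracking", "shoving",
    "raveling", "potholes", "patching", "lane drop off",
]
SEVERITIES = ["low", "medium", "high"]


def encode_survey(survey):
    """Turn {(damage_type, severity): percent_area} into a flat list of
    11 * 3 = 33 features, ordered by damage type and then by severity.
    Combinations missing from the survey are encoded as 0.0."""
    features = []
    for damage in DAMAGE_TYPES:
        for severity in SEVERITIES:
            pct = float(survey.get((damage, severity), 0.0))
            if not 0.0 <= pct <= 100.0:
                raise ValueError(f"percent area out of range: {pct}")
            features.append(pct)
    return features


# Hypothetical survey of one pavement section, for illustration only.
section = {("rutting", "high"): 12.5, ("potholes", "low"): 3.0}
vector = encode_survey(section)
print(len(vector), vector[8])
```

The reported sample counts, 536,000 for training and 134,000 for testing, correspond to an 80/20 split of 670,000 samples.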
Procedia PDF Downloads 137
19472 Application of Multilinear Regression Analysis for Prediction of Synthetic Shear Wave Velocity Logs in Upper Assam Basin
Authors: Triveni Gogoi, Rima Chatterjee
Abstract:
Shear wave velocity (Vs) estimation is an important step in the seismic exploration and characterization of a hydrocarbon reservoir. Various methods exist for predicting S-wave velocity when a recorded S-wave log is not available, but all of them are empirical mathematical models. The most common approach estimates shear wave velocity from P-wave velocity by applying Castagna’s equation. The constants used in Castagna’s equation vary for different lithologies and geological set-ups. In this study, multiple regression analysis has been used for estimation of S-wave velocity. The EMERGE module from the Hampson-Russell software has been used here for generation of the S-wave log. Both single-attribute and multi-attribute analyses have been carried out for generation of a synthetic S-wave log in the Upper Assam basin. The Upper Assam basin, situated in North Eastern India, is one of the most important petroleum provinces of India. The present study was carried out using four wells of the study area. Out of these wells, S-wave velocity was available for three wells. The main objective of the present study is the prediction of shear wave velocities for wells where S-wave velocity information is not available. The three wells having S-wave velocity were first used to test the reliability of the method, and the generated S-wave log was compared with the actual S-wave log. Single-attribute analysis has been carried out for these three wells within the depth range 1700-2100 m, which corresponds to the Barail Group of Oligocene age. The Barail Group, the primary producing reservoir of the basin, is the main target zone in this study. A system-generated list of attributes with varying degrees of correlation appeared, and the attribute with the highest correlation was considered for the single-attribute analysis. A crossplot between the attributes shows the scatter of points about the line of best fit.
The final result of the analysis was compared with the available S-wave log, which shows a good visual fit with a correlation of 72%. Next, multi-attribute analysis has been carried out for the same data using all the wells within the same analysis window. A high correlation of 85% has been observed between the output log from the analysis and the recorded S-wave log. The close fit between the synthetic S-wave log and the recorded S-wave log supports the reliability of the method. For further validation, the generated S-wave data from the wells have been tied to the seismic data and correlated with them. A synthetic shear wave log has been generated for the well M2, where an S-wave log is not available, and it shows a good correlation with the seismic data. Neutron porosity, density, AI and P-wave velocity proved to be the most significant variables in this statistical method for S-wave generation. The multilinear regression method can thus be considered a reliable technique for generation of shear wave velocity logs in this study.
Keywords: Castagna's equation, multi linear regression, multi attribute analysis, shear wave logs
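Two ingredients mentioned in this abstract are easy to illustrate: Castagna's linear Vp-Vs relation and the correlation used to compare a synthetic log with a recorded one. The sketch below uses the widely quoted "mudrock line" constants (Vs = 0.8621 Vp - 1.1724, velocities in km/s) as defaults. These are not constants from this study, which notes that they vary with lithology, and the sample log values are hypothetical.

```python
def castagna_vs(vp_km_s, slope=0.8621, intercept=-1.1724):
    """Estimate Vs (km/s) from Vp (km/s) with a linear Castagna-type relation.
    Defaults are the commonly quoted mudrock-line constants; they should be
    recalibrated for the lithology at hand."""
    return slope * vp_km_s + intercept


def pearson(x, y):
    """Pearson correlation coefficient between two equally sampled logs."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical log samples (km/s), for illustration only.
vp_log = [3.0, 3.2, 3.5, 3.1, 3.8]
vs_recorded = [1.45, 1.60, 1.80, 1.48, 2.10]
vs_synthetic = [castagna_vs(vp) for vp in vp_log]
print(pearson(vs_synthetic, vs_recorded))
```

A multi-attribute regression of the kind used in the study extends the single linear term to several predictors (for example density and neutron porosity), but the comparison step against the recorded log is the same.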
Procedia PDF Downloads 229
19471 Crack Width Analysis of Reinforced Concrete Members under Shrinkage Effect by Pseudo-Discrete Crack Model
Authors: F. J. Ma, A. K. H. Kwan
Abstract:
Cracking caused by shrinkage movement of concrete is a serious problem, especially when restraint is provided. It may cause severe serviceability and durability problems. The existing prediction methods for the crack width of concrete due to shrinkage movement are mainly numerical methods under simplified circumstances, which do not agree with each other. To get a more unified prediction method applicable to more sophisticated circumstances, finite element crack width analysis for the shrinkage effect should be developed. However, no existing finite element analysis can predict the crack width of concrete due to shrinkage movement, because of unresolved limitations of conventional finite element analysis. In this paper, crack width analysis implemented by finite element analysis is presented with a pseudo-discrete crack model, which combines the traditional smeared crack model and a newly proposed crack queuing algorithm. The proposed pseudo-discrete crack model is capable of simulating separate, single cracks without adopting discrete crack elements. The improved finite element analysis can also successfully simulate the stress redistribution when concrete is cracked, which is crucial for predicting crack width, crack spacing and crack number.
Keywords: crack queuing algorithm, crack width analysis, finite element analysis, shrinkage effect
Procedia PDF Downloads 419
19470 Development and Investigation of Efficient Substrate Feeding and Dissolved Oxygen Control Algorithms for Scale-Up of Recombinant E. coli Cultivation Process
Authors: Vytautas Galvanauskas, Rimvydas Simutis, Donatas Levisauskas, Vykantas Grincas, Renaldas Urniezius
Abstract:
The paper deals with model-based development and implementation of efficient control strategies for recombinant protein synthesis in fed-batch E. coli cultivation processes. Based on experimental data, a kinetic dynamic model of the cultivation process was developed. This model was used to determine substrate feeding strategies during the cultivation. The proposed feeding strategy consists of two phases: a biomass growth phase and a recombinant protein production phase. In the first phase, a substrate-limited process is recommended, in which the specific growth rate of biomass is about 90-95% of its maximum value. This ensures reduction of glucose concentration in the medium, improves process repeatability, and reduces the development of secondary metabolites and other unwanted by-products. The substrate limitation can be enhanced to satisfy the restriction on the maximum oxygen transfer rate in the bioreactor and to guarantee the necessary dissolved carbon dioxide concentration in the culture media. In the recombinant protein production phase, the level of substrate limitation and the specific growth rate are selected within a range that enables an optimal target protein synthesis rate. To account for complex process dynamics, to efficiently exploit the oxygen transfer capability of the bioreactor, and to maintain the required dissolved oxygen concentration, adaptive control algorithms for dissolved oxygen control have been proposed. The developed model-based control strategies are useful in the scale-up of cultivation processes and accelerate the implementation of innovative biotechnological processes for industrial applications.
Keywords: adaptive algorithms, model-based control, recombinant E. coli, scale-up of bioprocesses
Procedia PDF Downloads 257
19469 Plant Disease Detection Using Image Processing and Machine Learning
Authors: Sanskar, Abhinav Pal, Aryush Gupta, Sushil Kumar Mishra
Abstract:
One of the critical and tedious tasks in agricultural practice is the detection of diseases on vegetation. Agricultural production is very important in today’s economy, and because plant diseases are common, their early detection is important in agriculture. Automatic detection of diseases at an early stage is useful because it reduces control efforts on large productive farms. This paper presents a method for plant disease detection using digital image processing and machine learning algorithms. The disease is detected on different leaves of the plant. The proposed system for plant disease detection is simple and computationally efficient, requiring less time than learning-based approaches. The detection accuracy for various plant and foliar diseases is calculated and presented in this paper.
Keywords: plant diseases, machine learning, image processing, deep learning
Procedia PDF Downloads 10
19468 A Comparison of Methods for Neural Network Aggregation
Authors: John Pomerat, Aviv Segev
Abstract:
Recently, deep learning has had many theoretical breakthroughs. For deep learning to be successful in industry, however, there need to be practical algorithms capable of handling the many real-world hiccups that prevent the immediate application of a learning algorithm. Although AI promises to revolutionize the healthcare industry, getting access to patient data in order to train learning algorithms has not been easy. One proposed solution to this is data-sharing. In this paper, we propose an alternative protocol, based on multi-party computation, to train deep learning models while maintaining both the privacy and security of training data. We examine three methods of training neural networks in this way: transfer learning, average ensemble learning, and series network learning. We compare these methods to the equivalent model obtained through data-sharing across two different experiments. Additionally, we address the security concerns of this protocol. While the motivating example is healthcare, our findings regarding multi-party computation of neural network training are purely theoretical and have use-cases outside the domain of healthcare.
Keywords: neural network aggregation, multi-party computation, transfer learning, average ensemble learning
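Of the three aggregation methods named in this abstract, average ensembling is the simplest to illustrate: each party trains a model on its own private data, and only the models' outputs are combined. The sketch below averages class-probability vectors element-wise. The abstract does not describe the paper's exact protocol, so the stand-in models and the function name are hypothetical.

```python
def average_ensemble(models, x):
    """Average the class-probability vectors that each model returns for x.
    `models` is a list of callables mapping an input to a list of
    probabilities; all models must return vectors of the same length."""
    outputs = [model(x) for model in models]
    n = len(outputs)
    return [sum(column) / n for column in zip(*outputs)]


# Stand-in "models", each imagined as trained by a different party on its
# own private data. The fixed outputs are for illustration only.
def model_site_a(x):
    return [0.8, 0.2]


def model_site_b(x):
    return [0.6, 0.4]


def model_site_c(x):
    return [0.7, 0.3]


probs = average_ensemble([model_site_a, model_site_b, model_site_c], x=None)
predicted_class = max(range(len(probs)), key=probs.__getitem__)
print(probs, predicted_class)
```

Because only predictions are shared, no party needs to see another party's training records, which is the property the abstract emphasizes.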
Procedia PDF Downloads 162
19467 A Machine Learning Approach for Intelligent Transportation System Management on Urban Roads
Authors: Ashish Dhamaniya, Vineet Jain, Rajesh Chouhan
Abstract:
Traffic management is a major issue on most urban roads in almost all metropolitan cities in India. Speed is one of the critical traffic parameters for effective Intelligent Transportation System (ITS) implementation, as it determines the arrival rate of vehicles at intersections, which are the main points of congestion. The study aimed to leverage Machine Learning (ML) models to produce precise predictions of speed on urban roadway links. The research objective was to assess how categorized traffic volume and road width, serving as variables, influence speed prediction. Four tree-based regression models, namely Decision Tree (DT), Random Forest (RF), Extra Tree (ET), and Extreme Gradient Boost (XGB), are employed for this purpose. The models' performances were validated using test data, and the results demonstrate that Random Forest surpasses the other machine learning techniques and a conventional utility theory-based model in speed prediction. The study is useful for managing urban roadway network performance under mixed traffic conditions and for effective implementation of ITS.
Keywords: stream speed, urban roads, machine learning, traffic flow
Procedia PDF Downloads 70
19466 Assessment of DNA Sequence Encoding Techniques for Machine Learning Algorithms Using a Universal Bacterial Marker
Authors: Diego Santibañez Oyarce, Fernanda Bravo Cornejo, Camilo Cerda Sarabia, Belén Díaz Díaz, Esteban Gómez Terán, Hugo Osses Prado, Raúl Caulier-Cisterna, Jorge Vergara-Quezada, Ana Moya-Beltrán
Abstract:
The advent of high-throughput sequencing technologies has revolutionized genomics, generating vast amounts of genetic data that challenge traditional bioinformatics methods. Machine learning addresses these challenges by leveraging computational power to identify patterns and extract information from large datasets. However, biological sequence data, being symbolic and non-numeric, must be converted into numerical formats for machine learning algorithms to process effectively. So far, some encoding methods, such as one-hot encoding or k-mers, have been explored. This work proposes additional approaches for encoding DNA sequences in order to compare them with existing techniques and determine if they can provide improvements or if current methods offer superior results. Data from the 16S rRNA gene, a universal marker, was used to analyze eight bacterial groups that are significant in the pulmonary environment and have clinical implications. The bacterial genera included in this analysis are Prevotella, Abiotrophia, Acidovorax, Streptococcus, Neisseria, Veillonella, Mycobacterium, and Megasphaera. These data were downloaded from the NCBI database in GenBank file format, followed by a syntactic analysis to selectively extract relevant information from each file. For data encoding, a sequence normalization process was carried out as the first step. From approximately 22,000 initial data points, a subset was generated for testing purposes. Specifically, 55 sequences from each bacterial group met the length criteria, resulting in an initial sample of approximately 440 sequences. The sequences were encoded using different methods, including one-hot encoding, k-mers, Fourier transform, and Wavelet transform. Various machine learning algorithms, such as support vector machines, random forests, and neural networks, were trained to evaluate these encoding methods.
The performance of these models was assessed using multiple metrics, including the confusion matrix, ROC curve, and F1 Score, providing a comprehensive evaluation of their classification capabilities. The results show that accuracies between encoding methods vary by up to approximately 15%, with the Fourier transform obtaining the best results for the evaluated machine learning algorithms. These findings, supported by the detailed analysis using the confusion matrix, ROC curve, and F1 Score, provide valuable insights into the effectiveness of different encoding methods and machine learning algorithms for genomic data analysis, potentially improving the accuracy and efficiency of bacterial classification and related genomic studies.
Keywords: DNA encoding, machine learning, Fourier transform, Fourier transformation
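Two of the encodings named in this abstract, one-hot and k-mer counts, can be sketched in a few lines. This is a generic illustration of the techniques, not the authors' pipeline: the base ordering, the handling of ambiguous bases, and the function names are assumptions.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def one_hot(seq):
    """Encode each base as a 4-element indicator vector in A, C, G, T order.
    Bases outside ACGT (e.g. N) are encoded as all zeros."""
    table = {
        base: [1 if i == j else 0 for j in range(4)]
        for i, base in enumerate(BASES)
    }
    return [table.get(base, [0, 0, 0, 0]) for base in seq.upper()]


def kmer_counts(seq, k=3):
    """Count overlapping k-mers and return a fixed-length vector with one
    entry per possible k-mer (4**k entries, in lexicographic order).
    Windows containing bases outside ACGT are ignored."""
    seq = seq.upper()
    vocabulary = ["".join(p) for p in product(BASES, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts.get(kmer, 0) for kmer in vocabulary]


# Short made-up fragment, for illustration only.
fragment = "ACGTACGN"
print(one_hot(fragment)[:2])
print(sum(kmer_counts(fragment, k=2)))
```

One-hot encoding preserves position but produces a matrix whose size depends on sequence length, whereas k-mer counts give a fixed-length vector regardless of length, which is one reason the two can behave differently in downstream classifiers.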
Procedia PDF Downloads 23
19465 Improvement of Cross Range Resolution in Through Wall Radar Imaging Using Bilateral Backprojection
Authors: Rashmi Yadawad, Disha Narayanan, Ravi Gautam
Abstract:
Through Wall Radar Imaging is gaining increasing importance nowadays in the field of defense, and one of the most important criteria determining the quality of the resulting image is its cross-range resolution. In this research paper, the bilateral back projection algorithm has been implemented for Through Wall Radar Imaging. The sole purpose is to enhance the resolution of the back projection image in the cross-range direction. Synthetic data are generated for two targets placed at various locations in a room of dimensions 8 m by 6 m. Two algorithms, namely simple back projection and bilateral back projection, have been implemented, and the resulting images are compared. Numerical simulations have been coded in MATLAB, and the results of the two algorithms are shown. The comparison between the two images clearly shows that the ringing effect and chessboard effect are heavily reduced in the bilaterally back projected image, giving a relatively sharper image with relatively well-defined edges.
Keywords: through wall radar imaging, bilateral back projection, cross range resolution, synthetic data
Procedia PDF Downloads 347