Search results for: statistical estimation problem
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 12424

9454 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Darlington Mapiye, Mpho Mokoatle, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in the availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotation, expression studies, personalized treatment, and precision medicine. However, this rapid growth in sequence data poses a great challenge, which calls for novel data processing and analytic methods as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on a k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isolates. Cluster analysis showed that k-mers may be used to discriminate phenotypes, and the discrimination becomes more concise as the size of k-mers increases. The best performing classification model had a k-mer size of 10 (the longest k-mer) and an accuracy, recall, precision, specificity, and Matthews correlation coefficient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4, respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction. The analysis also highlights the importance of increasing the k-mer size to produce more biologically explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources, and explainability of classification results. Overall, the analysis provides a new way to elucidate genetic information from genomic data and identify phenotype relationships, which are important especially in explaining complex biological mechanisms.
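The k-mer feature construction described above can be illustrated with a short sketch; this is a minimal illustration and not the authors' pipeline, and the function name and toy sequence are assumptions (enumerating all 4^k k-mers is only tractable for small k; for k = 10 one would index only the observed k-mers):

```python
from collections import Counter
from itertools import product

def kmer_features(sequence, k):
    """Count k-mer occurrences and return them in a fixed feature order."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    alphabet = ("".join(p) for p in product("ACGT", repeat=k))  # all 4^k k-mers
    return [counts.get(kmer, 0) for kmer in alphabet]

# Toy example: 3-mer count vector for a short sequence
features = kmer_features("ATGCGATATGCG", k=3)
```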

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 164
9453 The Effect of Molecular Weight on the Cross-Linking of Two Different Molecular Weight LLDPE Samples

Authors: Ashkan Forootan, Reza Rashedi

Abstract:

Polyethylene has wide usage areas such as blow molding, pipe, film, and cable insulation. However, regardless of its growing applications, it has some constraints, such as a limited operating temperature of 70 °C. A polyethylene thermosetting procedure, in which the molecules are knotted into a 3D molecular network, has been developed to overcome this problem and raise the applicable temperature of the polymer. This paper reports the cross-linking of two different molecular weight grades of LLDPE by adding 0.5, 1, and 2% of DCP (dicumyl peroxide). DCP was chosen for its prevalence among various cross-linking agents. Structural parameters such as molecular weight, melt flow index, comonomer content, number of branches, etc. were obtained through relevant tests such as gel permeation chromatography and Fourier transform infrared spectrometry. After calculating the percentage of gel content, the properties of the pure and cross-linked samples were compared by thermal and mechanical analysis with DMTA and FTIR, and the effects of cross-linking, such as viscous and elastic modulus, were discussed using various structural parameters such as MFI, molecular weight, short chain branches, etc. Studies showed that the cross-linked polymer, unlike the pure one, remained in a solid state with thermal mechanical properties in the range of 110 to 120 °C, and this helped overcome the problem of using polyethylene at temperatures near the melting point.

Keywords: LLDPE, cross-link, structural parameters, DCP, DMTA, GPC

Procedia PDF Downloads 308
9452 Innovative Strategies for Improving Writing Skills of Secondary Level Students

Authors: Ihsan Ullah Khan, Asim Kareem, Naveed Saif

Abstract:

This research study examined the application of innovative strategies for improving the writing skills of secondary level students. It also examined the steps taken by secondary level teachers for the improvement of the writing skills of their students. Effective written communication is a problem faced by all ESL students at the secondary level. The objective of the study was to help secondary level students overcome this problem. More specifically, this research study aimed to guide teachers teaching at the secondary level to bring innovation to their teaching by showing the results of innovative strategies. In order to learn about the practices of the teachers inside the classroom, data were collected through a rating-scale questionnaire. After that, an experimental study was carried out, for which a 10th-grade class was selected. Results were drawn by analyzing the pre- and post-tests of the students with the help of an independent-samples t-test. The results showed that a significant change occurred in the writing skills of the students belonging to the treatment group. No improvement was observed in the writing skills of the students belonging to the control group. Thus, this research study proved to be a valuable contribution by guiding teachers to bring about a significant change in the writing skills of their students.
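The independent-samples t-test used to compare the two groups can be reproduced with standard tools; a minimal sketch, with made-up scores rather than the study's data:

```python
from scipy import stats

# Hypothetical post-test writing scores for the two groups
treatment = [68, 72, 75, 70, 74, 69, 77, 73]
control = [61, 63, 60, 65, 62, 64, 59, 66]

t_stat, p_value = stats.ttest_ind(treatment, control)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")  # p < 0.05 indicates a significant difference
```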

Keywords: writing skills, innovative strategies, teachers, students, treatment group, control group

Procedia PDF Downloads 446
9451 The Effect of Impinging WC-12Co Particle Temperature on the Thickness of HVOF Thermally Sprayed Coatings

Authors: M. Jalali Azizpour

Abstract:

In this paper, the effect of WC-12Co particle temperature in the HVOF thermal spraying process on the coating thickness has been studied. The statistical results show that the spray distance and oxygen-to-fuel ratio are the most influential factors on particle characteristics and the thickness of HVOF thermally sprayed coatings. A SprayWatch diagnostic system, scanning electron microscopy (SEM), X-ray diffraction, and a thickness measuring system were used for this purpose.

Keywords: HVOF, temperature, thickness, velocity, WC-12Co

Procedia PDF Downloads 244
9450 Numerical Computation of Specific Absorption Rate and Induced Current for Workers Exposed to Static Magnetic Fields of MRI Scanners

Authors: Sherine Farrag

Abstract:

Currently used MRI scanners in Cairo City possess static magnetic fields (SMFs) varying from 0.25 up to 3 T. More than half of them possess an SMF of 1.5 T. The SMF of the magnet determines the diagnostic power of a scanner, but not the worker's exposure profile. This research paper presents an approach for the numerical computation of induced electric fields and SAR values by estimation of fringe static magnetic fields. The iso-gauss lines of the MR scanner were mapped, and a polynomial function of the 7th degree was generated and tested. The induced current field due to worker motion in the SMF and the SAR values for organs and tissues were calculated. Results illustrate that the computation tool used permits quick, accurate MRI iso-gauss mapping and calculation of SAR values, which can then be used for assessment of the occupational exposure profile of MRI operators.
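The 7th-degree polynomial fit of the fringe field mentioned in the abstract can be sketched with NumPy; the distance and field values below are placeholders, not the measured iso-gauss data:

```python
import numpy as np

# Hypothetical fringe-field measurements: distance from the magnet (m) vs. field (mT)
distance = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0])
field = np.array([900.0, 400.0, 180.0, 90.0, 45.0, 25.0, 14.0, 8.0, 5.0, 3.0])

coeffs = np.polyfit(distance, field, deg=7)  # 7th-degree polynomial fit
fringe = np.poly1d(coeffs)                   # callable approximation B(d)
print(fringe(1.2))                           # estimated field at 1.2 m
```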

Keywords: MRI occupational exposure, MRI safety, induced current density, specific absorption rate, static magnetic fields

Procedia PDF Downloads 432
9449 Systematic Review of Quantitative Risk Assessment Tools and Their Effect on Racial Disproportionality in Child Welfare Systems

Authors: Bronwen Wade

Abstract:

Over the last half-century, child welfare systems have increasingly relied on quantitative risk assessment tools, such as actuarial or predictive risk tools. These tools are developed by performing statistical analysis of how attributes captured in administrative data are related to future child maltreatment. Some scholars argue that attributes in administrative data can serve as proxies for race and that quantitative risk assessment tools reify racial bias in decision-making. Others argue that these tools provide more “objective” and “scientific” guides for decision-making instead of subjective social worker judgment. This study performs a systematic review of the literature on the impact of quantitative risk assessment tools on racial disproportionality; it examines methodological biases in work on this topic, summarizes key findings, and provides suggestions for further work. A search of CINAHL, PsychInfo, Proquest Social Science Premium Collection, and the ProQuest Dissertations and Theses Collection was performed. Academic and grey literature were included. The review includes studies that use quasi-experimental methods and development, validation, or re-validation studies of quantitative risk assessment tools. PROBAST (Prediction model Risk of Bias Assessment Tool) and CHARMS (CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies) were used to assess the risk of bias and guide data extraction for risk development, validation, or re-validation studies. ROBINS-I (Risk of Bias in Non-Randomized Studies of Interventions) was used to assess for bias and guide data extraction for the quasi-experimental studies identified. Due to heterogeneity among papers, a meta-analysis was not feasible, and a narrative synthesis was conducted. 11 papers met the eligibility criteria, and each has an overall high risk of bias based on the PROBAST and ROBINS-I assessments. This is deeply concerning, as major policy decisions have been made based on a limited number of studies with a high risk of bias. The findings on racial disproportionality have been mixed and depend on the tool and approach used. Authors use various definitions for racial equity, fairness, or disproportionality. These concepts of statistical fairness are connected to theories about the reason for racial disproportionality in child welfare or social definitions of fairness that are usually not stated explicitly. Most findings from these studies are unreliable, given the high degree of bias. However, some of the less biased measures within studies suggest that quantitative risk assessment tools may worsen racial disproportionality, depending on how disproportionality is mathematically defined. Authors vary widely in their approach to defining and addressing racial disproportionality within studies, making it difficult to generalize findings or approaches across studies. This review demonstrates the power of authors to shape policy or discourse around racial justice based on their choice of statistical methods; it also demonstrates the need for improved rigor and transparency in studies of quantitative risk assessment tools. Finally, this review raises concerns about the impact that these tools have on child welfare systems and racial disproportionality.

Keywords: actuarial risk, child welfare, predictive risk, racial disproportionality

Procedia PDF Downloads 59
9448 Regionalization of IDF Curves by Interpolating Intensity and Adjustment Parameters - Application to Boyacá, Colombia

Authors: Pedro Mauricio Acosta, Carlos Andrés Caro

Abstract:

This research presents the regionalization of IDF curves for the department of Boyacá, Colombia, which comprises 16 towns, including the provincial capital, Tunja. For the regionalization, the adjustment parameters (u and alpha) of the IDF curves from the stations in the studied area were used. A similar regionalization was carried out by interpolating intensities. In the case of regionalization by parameters, the parameters were found by constructing the intensity-duration-frequency curves with estimation methods based on ordinary moments and maximum likelihood. Regionalization and interpolation of the data were performed with the assistance of ArcGIS software. Within the development of the project, the option providing the best level of reliability was sought in order to determine which of the ways to regionalize is best. In the case of intensity regionalization, isoline maps were produced; each map is associated with a different return period and duration in order to build IDF curves in the studied area. In the case of regionalization by parameters, the maps associated with each parameter were produced last.
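The adjustment parameters u and alpha mentioned above are those of the Gumbel distribution commonly used in IDF analysis; a minimal method-of-moments sketch, with an illustrative intensity series rather than the Boyacá records:

```python
import numpy as np

# Hypothetical annual maximum rainfall intensities (mm/h) at one station
intensities = np.array([42.0, 55.3, 38.1, 61.7, 47.9, 52.4, 44.6, 58.2])

# Method of (ordinary) moments for the Gumbel distribution
alpha = np.sqrt(6.0) * intensities.std(ddof=1) / np.pi  # scale parameter
u = intensities.mean() - 0.5772 * alpha                 # location parameter

# Intensity for return period T (years): x_T = u - alpha * ln(-ln(1 - 1/T))
T = 25
x_T = u - alpha * np.log(-np.log(1.0 - 1.0 / T))
```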

Keywords: intensity-duration-frequency curves, regionalization, hydrology

Procedia PDF Downloads 330
9447 Spatial Econometric Approaches for Count Data: An Overview and New Directions

Authors: Paula Simões, Isabel Natário

Abstract:

This paper reviews a number of theoretical aspects of implementing an explicit spatial perspective in econometrics for modelling non-continuous data in general, and count data in particular. It provides an overview of the several spatial econometric approaches that are available to model data collected with reference to location in space, from the classical spatial econometrics approaches to the recent developments in spatial econometrics to model count data in a Bayesian hierarchical setting. Considerable attention is paid to the inferential framework necessary for structurally consistent spatial econometric count models incorporating spatial lag autocorrelation, to the corresponding estimation and testing procedures under different assumptions, and to the constraints and implications embedded in the various specifications in the literature. This review combines insights from the classical spatial econometrics literature as well as from hierarchical modeling and analysis of spatial data, in order to look for possible new directions in the processing of count data in a spatial hierarchical Bayesian econometric context.
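One of the simpler count specifications such reviews cover, a Poisson regression with spatially lagged covariates (SLX), can be sketched with statsmodels; the weights matrix, covariates, and counts below are synthetic, and this is not the Bayesian hierarchical estimator the authors discuss:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 50
W = rng.random((n, n)); np.fill_diagonal(W, 0)
W = W / W.sum(axis=1, keepdims=True)           # row-standardized spatial weights
X = rng.normal(size=(n, 2))                    # covariates at each location
y = rng.poisson(lam=np.exp(0.3 * X[:, 0]))     # synthetic counts

design = sm.add_constant(np.hstack([X, W @ X]))  # own plus spatially lagged covariates
model = sm.GLM(y, design, family=sm.families.Poisson()).fit()
print(model.params)
```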

Keywords: spatial data analysis, spatial econometrics, Bayesian hierarchical models, count data

Procedia PDF Downloads 598
9446 The Incidence of Concussion across Popular American Youth Sports: A Retrospective Review

Authors: Rami Hashish, Manon Limousis-Gayda, Caitlin H. McCleery

Abstract:

Introduction: A leading cause of emergency room visits among youth in the United States is sports-related traumatic brain injury. Mild traumatic brain injuries (mTBIs), also called concussions, are caused by linear and/or angular acceleration experienced at the head and represent an increasing societal burden. Due to the developing nature of the brain in youth, there is a great risk of long-term neuropsychological deficiencies following a concussion. Accordingly, the purpose of this paper is to investigate incidence rates of concussion across gender for the five most common youth sports in the United States. These include basketball, track and field, soccer, baseball (boys), softball (girls), football (boys), and volleyball (girls). Methods: A PubMed search was performed for four search themes combined. The first theme identified the outcomes (concussion, brain injuries, mild traumatic brain injury, etc.). The second theme identified the sport (American football, soccer, basketball, softball, volleyball, track and field, etc.). The third theme identified the population (adolescence, children, youth, boys, girls). The last theme identified the study design (prevalence, frequency, incidence, prospective). Ultimately, 473 studies were surveyed, with 15 fulfilling the criteria: prospective studies presenting original data and the incidence of concussion in the relevant youth sport. The following data were extracted from the selected studies: population age, total study population, total athletic exposures (AE), and incidence rate per 1000 athletic exposures (IR/1000). Two one-way ANOVAs and a Tukey's post hoc test were conducted using SPSS. Results: From the 15 selected studies, statistical analysis revealed that the incidence of concussion per 1000 AEs across the considered sports ranged from 0.014 (girls' track and field) to 0.780 (boys' football). The average IR/1000 across all sports was 0.483 and 0.268 for boys and girls, respectively; this difference in IR was found to be statistically significant (p=0.013). Tukey's post hoc test showed that football had a significantly higher IR/1000 than boys' basketball (p=0.022), soccer (p=0.033), and track and field (p=0.026). No statistical difference was found for concussion incidence between girls' sports. Removal of football was found to lower the IR/1000 for boys without a statistical difference (p=0.101) compared to girls. Discussion: Football was the only sport showing a statistically significant difference in concussion incidence rate relative to other sports (within gender). Males were overall more likely to be concussed than females when football was included (1.8x), whereas concussion was more likely for females when football was excluded. While the significantly higher rate of concussion in football is not surprising because of the nature and rules of the sport, it is concerning that research has shown a higher incidence of concussion in practices than in games. Interestingly, the findings indicate that girls' sports are more concussive overall when football is removed. This appears to counter the common notion that boys' sports are more physically taxing and dangerous. Future research should focus on understanding the concussive mechanisms of injury in each sport to enable effective rule changes.
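The one-way ANOVA and Tukey post hoc comparison described in the methods follow a standard pattern; a minimal sketch with invented incidence rates, not the extracted study data:

```python
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical IR/1000 AE values per study, grouped by sport
football = [0.70, 0.82, 0.78, 0.81]
basketball = [0.20, 0.25, 0.22, 0.18]
soccer = [0.30, 0.28, 0.33, 0.26]

f_stat, p_value = stats.f_oneway(football, basketball, soccer)

rates = football + basketball + soccer
groups = ["football"] * 4 + ["basketball"] * 4 + ["soccer"] * 4
print(pairwise_tukeyhsd(rates, groups))  # pairwise comparisons with adjusted p-values
```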

Keywords: gender, football, soccer, traumatic brain injury

Procedia PDF Downloads 145
9445 The Effectiveness of Psychodrama on Anxiety Enhancement in Adolescent Boys

Authors: Saeed Dehnavi, Marjan Pooee

Abstract:

Background - Psychodrama, as a form of art therapy, helps people to enact and use role-plays for a specific problem, rather than just talking about it, in an effort to review the problem, gain feedback from group members, find appropriate solutions, and practice them in their lives. This paper evaluated the effectiveness of psychodrama in addressing the anxiety of young adolescent boys. Methodology - This is a quasi-experimental research study using a pre-test/post-test design with a control group. From four secondary schools in Kermanshah, Iran, 210 adolescent boys (aged 13 and 14 years) were asked to complete the Coopersmith Self-Esteem Inventory. Given low self-esteem scores (less than the cut-off of 23), 20 individuals were selected and randomly placed into a control and an experimental group. The experimental group participated in a twelve-session psychodrama therapy plan over 6 weeks, while the control group received no intervention. Data analysis was carried out by analysis of covariance (ANCOVA). Results - The ANCOVA showed an increase in the post-test scores for anxiety, and this increase was statistically significant. Conclusion - The findings indicated the effectiveness of psychodrama on the anxiety of young boys. During the psychodrama sessions, the adolescents learned to take the initiative, communicate with others in an excited state, and improve their anxiety with positive and constructive experiences.
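The ANCOVA reported here is commonly run as an OLS model with the pre-test score as a covariate; a minimal sketch with a hypothetical data frame, not the study's data:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical pre/post anxiety scores for the experimental and control groups
df = pd.DataFrame({
    "pre":   [40, 42, 38, 45, 41, 39, 44, 43],
    "post":  [30, 33, 29, 34, 40, 41, 42, 43],
    "group": ["exp"] * 4 + ["ctrl"] * 4,
})

# ANCOVA: post-test score by group, adjusting for the pre-test score
model = smf.ols("post ~ C(group) + pre", data=df).fit()
print(model.summary())
```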

Keywords: anxiety, art therapy, psychodrama, young adolescents

Procedia PDF Downloads 546
9444 Reflection of Gender Differences on the Associations among BMI, Body Fat Percentages, Body Circumference Values, and Cardiometabolic Parameters

Authors: Mustafa Metin Donma

Abstract:

The associations among anthropometric measurements, body fat, and cardiometabolic parameters have attracted attention for years due to the importance and topicality of the subject. Gender is also an important factor to be considered during the evaluation of study findings. Gender is particularly important in the field of pediatrics and during the interpretation of complete blood cell count parameters. These parameters are important because their ratios are being used as valuable and informative markers demonstrating cardiometabolic risk. The aim of this study was to introduce potential differences between male and female children in terms of the associations between these ratios and parameters closely related to obesity development. The study population was composed of six hundred and twenty-seven children. Fifty-eight percent of the children were females, and forty-two percent were males. The body mass index distribution of the groups was almost the same. The study was evaluated and approved by the Institutional Ethics Committee. Parents gave informed consent forms on behalf of their children to participate in the study. Anthropometric measurements were taken. Complete blood cell count, biochemical, and body fat percentage analyses were performed. Body mass index, the neutrophil-to-lymphocyte (N/L) ratio, and the aspartate aminotransferase-to-platelet count ratio were calculated. Statistical evaluations were performed with a statistical package program. Body mass index and the circumferences of the waist, hip, head, and neck were higher, and the body fat percentages of the trunk and extremities were lower, in males than in females (p>0.05). Increased eosinophil percentage and alanine aminotransferase-to-aspartate aminotransferase ratio, and decreased N/L ratio values, were observed in males. Positive correlations of total body fat percentage with the N/L ratio in female children and with platelet count in male children were noteworthy. In conclusion, the associations of total body fat percentage with the N/L ratio and platelet count in female and male children, respectively, point out the importance of the N/L ratio and platelet count in different genders for the future development of cardiovascular diseases.

Keywords: body fat percentages, children, gender, neutrophil-to-lymphocyte ratio, platelet count

Procedia PDF Downloads 11
9443 Is Okun's Law Valid in Tunisia?

Authors: El Andari Chifaa, Bouaziz Rached

Abstract:

The central focus of this paper was to check whether Okun's law is valid in Tunisia. For this purpose, we used quarterly time series data for the period 1990Q1-2014Q1. Firstly, we applied an error correction model instead of the difference version of Okun's law; the Engle-Granger and Johansen tests were employed to find the long-run association between unemployment and production, and an error correction mechanism (ECM) was used for the short-run dynamics. Secondly, we used the gap version of Okun's law, where the estimation is done with three band-pass filters, which are mathematical tools used in macroeconomics and especially in business cycle theory. The findings of the study indicate that the inverse relationship between unemployment and output is verified in the short and long term and that Okun's law holds for the Tunisian economy, but with an Okun's coefficient lower than required. Therefore, our empirical results have important implications for structural and cyclical policymakers in Tunisia seeking to promote economic growth in a context of lower unemployment growth.
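The gap version of Okun's law regresses the unemployment gap on the output gap after detrending; a minimal sketch using the Hodrick-Prescott filter (one of several band-pass choices; the series are synthetic placeholders, not the Tunisian data):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
t = np.arange(96)                                # ~24 years of quarterly data
gdp = 100 + 0.5 * t + rng.normal(0, 1.5, 96)     # synthetic output series
unemp = 15 - 0.05 * t + rng.normal(0, 0.5, 96)   # synthetic unemployment rate

gdp_cycle, _ = sm.tsa.filters.hpfilter(gdp, lamb=1600)    # output gap
u_cycle, _ = sm.tsa.filters.hpfilter(unemp, lamb=1600)    # unemployment gap

# Okun's coefficient: slope of the unemployment gap on the output gap
okun = sm.OLS(u_cycle, sm.add_constant(gdp_cycle)).fit()
print(okun.params)
```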

Keywords: Okun’s law, validity, unit root, cointegration, error correction model, bandpass filters

Procedia PDF Downloads 320
9442 DEEPMOTILE: Motility Analysis of Human Spermatozoa Using Deep Learning in Sri Lankan Population

Authors: Chamika Chiran Perera, Dananjaya Perera, Chirath Dasanayake, Banuka Athuraliya

Abstract:

Male infertility is a major problem in the world, and it is a neglected and sensitive health issue in Sri Lanka. It can be assessed by analyzing human semen samples. Sperm motility is one of many factors that can evaluate a male's fertility potential. In Sri Lanka, this analysis is performed manually. Manual methods are time-consuming and person-dependent, but they are reliable in the hands of an expert. Machine learning and deep learning technologies are currently being investigated to automate spermatozoa motility analysis, but existing automatic methods are unreliable: they tend to produce false positives and false detections. Current automatic methods support different techniques, and some of them are very expensive. Due to the geographical variance in spermatozoa characteristics, current automatic methods are not reliable for motility analysis in Sri Lanka. The suggested system, DeepMotile, explores a method to analyze the motility of human spermatozoa automatically and presents it to andrology laboratories to overcome the current issues. DeepMotile is a novel deep learning method for analyzing spermatozoa motility parameters in the Sri Lankan population. To implement the current approach, Sri Lankan patient data were collected anonymously as a dataset, and glass slides were used as a low-cost technique to analyze the semen samples. The problem was framed as microscopic object detection and tracking. YOLOv5 was customized and used as the object detector, achieving 94% mAP (mean average precision), 86% precision, and 90% recall on the gathered dataset. StrongSORT was used as the object tracker and was validated with andrology experts due to the unavailability of annotated ground truth data. Furthermore, this research has identified many potential directions for further investigation, and andrology experts can use this system to analyze motility parameters with realistic accuracy.
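A per-frame detection loop of the kind described can be sketched with the public YOLOv5 hub interface; the weights path and video file are assumptions, and the StrongSORT tracking step is only indicated, not implemented:

```python
import cv2
import torch

# Load custom YOLOv5 weights (the path is hypothetical)
model = torch.hub.load("ultralytics/yolov5", "custom", path="sperm_best.pt")

cap = cv2.VideoCapture("semen_sample.mp4")
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame)                      # detect spermatozoa in the frame
    detections = results.xyxy[0].cpu().numpy()  # [x1, y1, x2, y2, conf, class]
    # detections would be handed to the StrongSORT tracker here
cap.release()
```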

Keywords: computer vision, deep learning, convolutional neural networks, multi-target tracking, microscopic object detection and tracking, male infertility detection, motility analysis of human spermatozoa

Procedia PDF Downloads 111
9441 Best Practice for Post-Operative Surgical Site Infection Prevention

Authors: Scott Cavinder

Abstract:

Surgical site infections (SSIs) are a known complication of any surgical procedure and are one of the most common nosocomial infections. Globally, an estimated 300 million surgical procedures take place annually, with roughly 11 of 100 surgical patients developing an infection within 30 days after surgery. The specific purpose of the project is to address the PICOT (Problem, Intervention, Comparison, Outcome, Time) question: In patients who have undergone cardiothoracic or vascular surgery (P), does implementation of a post-operative care bundle based on current EBP (I), as compared to current clinical agency practice standards (C), result in a decrease of SSIs (O) over a 12-week period (T)? Synthesis of Supporting Evidence: A literature search of five databases, including citation chasing, was performed, which yielded fourteen pieces of evidence ranging from high to good quality. Four common themes were identified for the prevention of SSIs: use and removal of surgical dressings; use of topical antibiotics and antiseptics; implementation of evidence-based care bundles; and implementation of surveillance through auditing and feedback. The Iowa Model was selected as the framework to guide this project, as it is a multiphase change process which encourages clinicians to recognize opportunities for improvement in healthcare practice. Practice/Implementation: The process for this project will include recruiting post-surgical participants who have undergone cardiovascular or thoracic surgery prior to discharge at a Northwest Indiana hospital. The patients will receive education, verbal instruction, and return demonstration. The patients will be followed for 12 weeks, and wounds will be assessed utilizing the National Healthcare Safety Network/Centers for Disease Control (NHSN/CDC) assessment tool and compared to the SSI rate of 2021. Key stakeholders will include two cardiovascular surgeons, four physician assistants, two advanced practice nurses, medical assistants, and patients. Method of Evaluation: Chi-square analysis will be utilized to establish statistical significance and similarities between the two groups. Main Results/Outcomes: The proposed outcome is the prevention of SSIs in post-op cardiothoracic and vascular patients. Implication/Recommendation(s): Implementation of standardized post-operative care bundles for the prevention of SSIs in cardiovascular and thoracic surgical patients.
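The planned chi-square comparison of SSI rates between the bundle group and the 2021 baseline can be sketched as a 2x2 contingency test; the counts below are hypothetical:

```python
from scipy.stats import chi2_contingency

# Rows: bundle group vs. 2021 baseline; columns: SSI vs. no SSI (hypothetical counts)
table = [[3, 97],
         [11, 89]]

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```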

Keywords: cardiovascular, evidence based practice, infection, post-operative, prevention, thoracic, surgery

Procedia PDF Downloads 87
9440 Case-Based Reasoning for Modelling Random Variables in the Reliability Assessment of Existing Structures

Authors: Francesca Marsili

Abstract:

The reliability assessment of existing structures with probabilistic methods is becoming an increasingly important and frequent engineering task. However, probabilistic reliability methods are based on an exhaustive knowledge of the stochastic modeling of the variables involved in the assessment; at the moment, standards for the modeling of variables are absent, representing an obstacle to the dissemination of probabilistic methods. The framework according to which probability distribution functions (PDFs) are established is Bayesian statistics, which uses Bayes' theorem: a prior PDF for the considered parameter is established based on information derived from the design stage and on qualitative judgments based on the engineer's past experience; then, the prior model is updated with the results of investigations carried out on the considered structure, such as material testing and the determination of action and structural properties. The application of Bayesian statistics raises two different kinds of problems: 1. The results of the updating depend on the engineer's previous experience; 2. The updating of the prior PDF can be performed only if the structure has been tested and quantitative data that can be statistically manipulated have been collected; performing tests is always an expensive and time-consuming operation; furthermore, if the considered structure is an ancient building, destructive tests could compromise its cultural value and therefore should be avoided. In order to solve those problems, an interesting research path is to investigate Artificial Intelligence (AI) techniques that can be useful for the automation of the modeling of variables and for the updating of material parameters without performing destructive tests. Among the others, one that raises particular attention in relation to the object of this study is Case-Based Reasoning (CBR). In this application, cases are represented by existing buildings where material tests have already been carried out and updated PDFs for the material mechanical parameters have been computed through a Bayesian analysis. Each case is then composed of a qualitative description of the material under assessment and the posterior PDFs that describe its material properties. The problem to be solved is the definition of PDFs for the material parameters involved in the reliability assessment of the considered structure. A CBR system is a good candidate for automating the modeling of variables because: 1. Engineers already draw an estimation of material properties based on the experience collected during the assessment of similar structures, or based on similar cases collected in the literature or in databases; 2. Material tests carried out on structures can be easily collected from laboratory databases or from the literature; 3. The system will provide the user with a reliable probabilistic description of the variables involved in the assessment, which will also serve as a tool in support of the engineer's qualitative judgments. Automated modeling of variables can help in spreading the probabilistic reliability assessment of existing buildings in common engineering practice, and in targeting the best intervention and further tests on the structure; CBR represents a technique which may help to achieve this.
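For a normal prior and normally distributed test results with known measurement variance, the Bayesian updating step described above has a closed-form posterior; a minimal sketch with assumed concrete-strength values:

```python
import numpy as np

def normal_posterior(mu0, var0, data, var_obs):
    """Conjugate normal-normal update with known observation variance."""
    n = len(data)
    var_post = 1.0 / (1.0 / var0 + n / var_obs)
    mu_post = var_post * (mu0 / var0 + np.sum(data) / var_obs)
    return mu_post, var_post

# Prior from the design stage: mean strength 30 MPa, sd 4 MPa (hypothetical)
# In-situ test results (hypothetical), measurement sd 2 MPa
mu, var = normal_posterior(30.0, 4.0**2, np.array([27.5, 28.9, 26.8]), 2.0**2)
```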

Keywords: reliability assessment of existing buildings, Bayesian analysis, case-based reasoning, historical structures

Procedia PDF Downloads 341
9439 Improved Computational Efficiency of Machine Learning Algorithm Based on Evaluation Metrics to Control the Spread of Coronavirus in the UK

Authors: Swathi Ganesan, Nalinda Somasiri, Rebecca Jeyavadhanam, Gayathri Karthick

Abstract:

The COVID-19 crisis presents a substantial and critical hazard to worldwide health. Since the occurrence of the disease in the UK in late January 2020, the number of people confirmed to have acquired the illness has increased tremendously across the country, and the number of individuals affected is undoubtedly considerably high. The purpose of this research is to develop a predictive machine learning archetype that could forecast COVID-19 cases within the UK. This study concentrates on the statistical data collected from 31st January 2020 to 31st March 2021 in the United Kingdom. Information on total COVID cases registered, new cases encountered on a daily basis, total deaths registered, and deaths per day due to Coronavirus was collected from the World Health Organisation (WHO). Data preprocessing was carried out to identify any missing values, outliers, or anomalies in the dataset. The data were split in an 8:2 ratio for training and testing purposes to forecast future new COVID cases. Support Vector Machines (SVM), Random Forests, and linear regression algorithms were chosen to study model performance in the prediction of new COVID-19 cases. The statistical performance of the models in predicting new COVID cases was evaluated with metrics such as the r-squared value and the mean squared error. Random Forest outperformed the other two machine learning algorithms with a training accuracy of 99.47% and a testing accuracy of 98.26% when n=30. The mean squared error obtained for Random Forest was 4.05e11, which is lower than that of the other predictive models used in this study. From the experimental analysis, the Random Forest algorithm performs more effectively and efficiently in predicting new COVID cases, which could help the health sector take relevant control measures against the spread of the virus.
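The 8:2 split and Random Forest regression described can be sketched with scikit-learn; the feature matrix and targets are placeholders for the WHO case series, and n=30 is read here, as an assumption, as the number of trees:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
X = rng.random((425, 4))      # e.g. lagged cases, deaths, day index (synthetic)
y = rng.random(425) * 1e4     # daily new cases (synthetic)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
rf = RandomForestRegressor(n_estimators=30, random_state=0).fit(X_tr, y_tr)

pred = rf.predict(X_te)
print(r2_score(y_te, pred), mean_squared_error(y_te, pred))
```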

Keywords: COVID-19, machine learning, supervised learning, unsupervised learning, linear regression, support vector machine, random forest

Procedia PDF Downloads 123
9438 Redefining Solar Generation Estimation: A Comprehensive Analysis of Real Utility Advanced Metering Infrastructure (AMI) Data from Various Projects in New York

Authors: Haowei Lu, Anaya Aaron

Abstract:

Understanding historical solar generation and forecasting future solar generation from interconnected Distributed Energy Resources (DER) is crucial for utility planning and interconnection studies. The existing methodology, which relies on solar radiation, weather data, and common inverter models, is becoming less accurate. Rapid advancements in DER technologies have resulted in more diverse project sites that deviate from common patterns due to factors such as the DC/AC ratio, solar panel performance, tilt angle, and the presence of DC-coupled battery energy storage systems. In this paper, the authors review 10,000 DER projects within the system and analyze the Advanced Metering Infrastructure (AMI) data for various project types to demonstrate the impact of different parameters. An updated methodology is proposed for redefining historical and future solar generation in distribution feeders.

Keywords: photovoltaic system, solar energy, fluctuations, energy storage, uncertainty

Procedia PDF Downloads 41
9437 The Methods of Customer Satisfaction Measurement and Its Statistical Analysis towards Sales and Logistic Activities in Food Sector

Authors: Seher Arslankaya, Bahar Uludağ

Abstract:

Meeting the needs and demands of customers and pleasing them are important requirements for companies in the food sector, where the growth of competition is significantly unpredictable. Customer satisfaction is also one of the key concepts, mainly driven by a wide range of customer preferences and expectations about the products and services introduced and delivered to them. In order to meet customer demands, companies in the food sector are expected to have well-managed Total Quality Management (TQM), which sets out to improve the quality of products and services, to reduce costs, and to increase customer satisfaction by restructuring traditional management practices. It aims to increase customer satisfaction by meeting customer expectations and requirements. The achievement would be determined with the help of customer satisfaction surveys, which are done to obtain immediate feedback and to provide quick responses. In addition, the surveys would also assist in strategic planning, which helps to anticipate customers' future needs and expectations. Meanwhile, periodic measurement of customer satisfaction is a must, because with a better understanding of customers' perceptions from the surveys (done by questionnaires), companies would have a clear idea of their own strengths and weaknesses, which helps them keep their loyal customers, stand in comparison with their competitors, and map out their future progress and improvement. In this study, we propose a survey-based customer satisfaction measurement method and its statistical analysis for the sales and logistics activities of food firms. Customer satisfaction is discussed in detail. Furthermore, after analysing the data derived from the questionnaire applied to customers using the SPSS software, the various results obtained from the application are presented. By also applying an ANOVA test, the study analyses the existence of meaningful differences between customer demographic proportions and their perceptions. The purpose of this study is also to find out requirements which help to remove the effects that decrease customer satisfaction and to produce loyal customers in the food industry. For this purpose, customer complaints were collected. Additionally, comments and suggestions are made according to the obtained survey results, which would be useful for strategic planning in the food industry.

Keywords: customer satisfaction measurement and analysis, food industry, SPSS, TQM

Procedia PDF Downloads 256
9436 Noise Source Identification on Urban Construction Sites Using Signal Time Delay Analysis

Authors: Balgaisha G. Mukanova, Yelbek B. Utepov, Aida G. Nazarova, Alisher Z. Imanov

Abstract:

The problem of identifying local noise sources on a construction site using a sensor system is considered. Mathematical modeling of the signals detected at the sensors was carried out, considering signal decay and the signal delay time between the source and the detector. Recordings of noises produced by construction tools were used as the time dependence of the noise. Synthetic sensor data were constructed based on these data, and a model of the propagation of acoustic waves from a point source in three-dimensional space was applied. All sensors and sources are assumed to be located in the same plane. A source localization method is checked based on the signal time delay between two adjacent detectors, from which the direction to the source is plotted. The noise source's position is determined from the intersection of the two direction lines. Cases of one dominant source and of two sources in the presence of several other sources of lower intensity are considered. The number of detectors varies from three to eight. The intensity of the noise field in the assessed area is plotted. A signal of two-second duration is considered. The source is localized for subsequent parts of the signal with a duration above 0.04 s; the final result is obtained by computing the average value.
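The delay between two adjacent detectors can be estimated by cross-correlating their signals; a minimal sketch with a synthetic shifted signal (the sampling rate and shift are assumptions):

```python
import numpy as np

fs = 8000                               # sampling rate, Hz (assumed)
rng = np.random.default_rng(3)
source = rng.normal(size=fs * 2)        # two-second noise-like signal

true_delay = 37                         # delay in samples
sensor_a = source
sensor_b = np.roll(source, true_delay)  # delayed copy at the second sensor

corr = np.correlate(sensor_b, sensor_a, mode="full")
lag = np.argmax(corr) - (len(sensor_a) - 1)  # estimated delay in samples
print(lag / fs)                              # delay in seconds
```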

Keywords: acoustic model, direction of arrival, inverse source problem, sound localization, urban noises

Procedia PDF Downloads 65
9435 Effect of Malnutrition at Admission on Length of Hospital Stay among Adult Surgical Patients in Wolaita Sodo University Comprehensive Specialized Hospital, South Ethiopia: Prospective Cohort Study, 2022

Authors: Yoseph Halala Handiso, Zewdi Gebregziabher

Abstract:

Background: Malnutrition in hospitalized patients remains a major public health problem in both developed and developing countries. Despite the fact that malnourished patients are more prone to stay longer in hospital, there are limited data regarding the magnitude of malnutrition and its effect on length of stay among surgical patients in Ethiopia, while nutritional assessment is also often a neglected component of health service practice. Objective: This study aimed to assess the prevalence of malnutrition at admission and its effect on the length of hospital stay among adult surgical patients in Wolaita Sodo University Comprehensive Specialized Hospital, South Ethiopia, 2022. Methods: A facility-based prospective cohort study was conducted among 398 adult surgical patients admitted to the hospital. Participants were chosen using a convenience sampling technique. Subjective Global Assessment (SGA) was used to determine, within 48 hours of admission, the nutritional status of patients with a minimum stay of 24 hours. Data were collected using the Open Data Kit (ODK) version 2022.3.3 software, while Stata version 14.1 software was employed for the statistical analysis. A Cox regression model was used to determine the effect of malnutrition on the length of hospital stay (LOS) after adjusting for several potential confounders recorded at admission. The adjusted hazard ratio (AHR) with a 95% confidence interval was used to show the effect of malnutrition. Results: The prevalence of hospital malnutrition at admission was 64.32% (95% CI: 59%-69%) according to the SGA classification. Adult surgical patients who were malnourished at admission had a higher median LOS (12 days; 95% CI: 11-13) as compared to well-nourished patients (8 days; 95% CI: 8-9), meaning that patients who were malnourished at admission were at higher risk of a reduced chance of discharge with improvement (prolonged LOS) (AHR: 0.37, 95% CI: 0.29-0.47) as compared to well-nourished patients. The presence of comorbidity (AHR: 0.68, 95% CI: 0.50-0.90), polymedication (AHR: 0.69, 95% CI: 0.55-0.86), and a history of admission within the previous five years (AHR: 0.70, 95% CI: 0.55-0.87) were found to be significant covariates of the length of hospital stay. Conclusion: The magnitude of hospital malnutrition at admission was found to be high. Malnourished patients at admission had a higher risk of prolonged length of hospital stay as compared to well-nourished patients. The presence of comorbidity, polymedication, and a history of admission were found to be significant covariates of LOS. All stakeholders should give attention to reducing the magnitude of malnutrition and its covariates to lessen the burden of prolonged LOS.
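The Cox regression used to adjust the effect of malnutrition on LOS can be sketched with the lifelines package; the column names and values are illustrative, not the study data:

```python
import pandas as pd
from lifelines import CoxPHFitter

# Hypothetical rows: LOS in days, discharge event (1 = discharged improved),
# and covariates recorded at admission
df = pd.DataFrame({
    "los_days":     [12, 8, 15, 7, 11, 9],
    "discharged":   [1, 1, 1, 1, 0, 1],
    "malnourished": [1, 0, 1, 0, 1, 0],
    "comorbidity":  [1, 0, 1, 0, 0, 1],
})

cph = CoxPHFitter()
cph.fit(df, duration_col="los_days", event_col="discharged")
cph.print_summary()  # HR < 1 for malnourished corresponds to a prolonged stay
```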

Keywords: effect of malnutrition, length of hospital stay, surgical patients, Ethiopia

Procedia PDF Downloads 71
9434 Enhancing the Oxidation Resistance of Copper at High Temperature by Surface Fluorination

Authors: Jae-Ho Kim, Ryosuke Yokochi, Miho Fuzihashi, Susumu Yonezawa

Abstract:

The use of silver nanoparticles in conductive inks and their printing by inkjet technology has been known for years. However, the very high cost of silver limits wide industrial application. Since copper is much cheaper but possesses a very high conductivity (only 6% less than that of Ag), Cu nanoparticles can be considered a replacement for silver nanoparticles. However, a major problem in utilizing copper nanoparticles is their inherent tendency to oxidize in ambient conditions. In conductive printing applications, the presence of copper oxide on the surface of the nanoparticles has two negative consequences: it increases the required sintering temperature and reduces the electrical conductivity. Only a limited number of reports have attempted to address the oxidation problem, in general by minimizing the exposure of the copper nanoparticles to oxygen with a protective layer composed of a second material at the surface of the particles. Carbon-based materials, surfactants, metals, and so on have been used to form such protective layers. In this study, we tried to modify the oxide on Cu particles using fluorine gas, and the effects of creating oxyfluorides or fluorides on the oxidation resistance of the Cu particles were investigated. The TG-DTA results show that, compared with the untreated sample, the fluorinated samples restrain the weight increase even at 200 °C. It may be considered that the oxyfluorides on the surface play a role in protecting the metal from oxidation.

Keywords: copper metal, electrical conductivity, oxidation resistance, surface fluorination

Procedia PDF Downloads 114
9433 Clustering-Based Computational Workload Minimization in Ontology Matching

Authors: Mansir Abubakar, Hazlina Hamdan, Norwati Mustapha, Teh Noranis Mohd Aris

Abstract:

In order to build a matching pattern for each class correspondence of an ontology, it is required to specify a set of attribute correspondences across two corresponding classes by clustering. Clustering reduces the size of the potential attribute correspondences considered in the matching activity, which significantly reduces the computational workload; otherwise, all attributes of a class would have to be compared with all attributes of the corresponding class. Most existing ontology matching approaches lack scalable attribute discovery methods, such as cluster-based attribute searching. This problem makes the ontology matching activity computationally expensive. It is therefore vital in ontology matching to design a scalable element or attribute correspondence discovery method that reduces the size of potential element correspondences during mapping, thereby reducing the computational workload of the matching process as a whole. The objectives of this work are 1) to design a clustering method for discovering similar attribute correspondences and relationships between ontologies, and 2) to discover element correspondences by classifying the elements of each class based on the elements' value features using the K-medoids clustering technique. Discovering attribute correspondences is highly required for comparing instances when matching two ontologies. During the matching process, any two instances across two different data sets should be compared on their attribute values, so that they can be judged to be the same or not. Intuitively, any two instances that come from classes across which there is a class correspondence are likely to be identical to each other. Besides, any two instances that hold more similar attribute values are more likely to be matched than ones with less similar attribute values. Most of the time, similar attribute values exist in the two instances across which there is an attribute correspondence. This work will present how to classify the attributes of each class with K-medoids clustering and then map the clustered groups by their statistical value features. We will also show how to map the attributes of a clustered group to the attributes of the mapped clustered group, generating a set of potential attribute correspondences to be applied in generating a matching pattern. The K-medoids clustering phase would largely reduce the number of non-corresponding attribute pairs considered when comparing instances, as only attribute pairs whose coverage probability reaches 100% and attributes above the specified threshold are considered as potential attributes for a matching. Using clustering will reduce the size of the potential element correspondences to be considered during the mapping activity, which will in turn reduce the computational workload significantly. Otherwise, all elements of each class in the source ontology would have to be compared with all elements of the corresponding classes in the target ontology. K-medoids can ably cluster the attributes of each class, so that a proportion of non-corresponding attribute pairs are not considered when constructing the matching pattern.
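A K-medoids clustering over a precomputed attribute-distance matrix, as used here for grouping similar attributes, can be sketched directly in NumPy; this is a minimal alternating (PAM-style) implementation, not the authors' code:

```python
import numpy as np

def k_medoids(D, k, n_iter=100, seed=0):
    """Simple alternating k-medoids on a precomputed distance matrix D (n x n)."""
    rng = np.random.default_rng(seed)
    medoids = rng.choice(D.shape[0], size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)  # assign to nearest medoid
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.where(labels == j)[0]
            if len(members):
                costs = D[np.ix_(members, members)].sum(axis=1)
                new_medoids[j] = members[np.argmin(costs)]  # most central member
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return medoids, labels
```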

Keywords: attribute correspondence, clustering, computational workload, k-medoids clustering, ontology matching

Procedia PDF Downloads 250
9432 Cascaded Neural Network for Internal Temperature Forecasting in Induction Motor

Authors: Hidir S. Nogay

Abstract:

In this study, two systems were created to predict the internal temperature of an induction motor. One of them consisted of a simple ANN model with two layers, ten input parameters, and one output parameter. The other consisted of eight ANN models connected to each other in a cascade. The cascaded ANN system has 17 inputs. The main reason the cascaded system was used in this study is to accomplish a more accurate estimation by increasing the number of inputs to the ANN system. The cascaded ANN system is compared with the simple conventional ANN model to prove the mentioned advantages. The dataset was obtained from experimental applications. A small part of the dataset was used to obtain more understandable graphs. The number of data points is 329. 30% of the data was used for testing and validation. Test and validation data were determined for each ANN model separately, and the reliability of each model was tested. As a result of this study, it was found that the cascaded ANN system produced more accurate estimates than the conventional ANN model.
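The cascading idea, feeding one network's estimate to the next as an extra input, can be sketched with scikit-learn regressors; the layer sizes and data are placeholders, and only two of the eight cascaded stages are shown:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(4)
X = rng.random((329, 10))                    # ten motor input parameters (synthetic)
y = X.sum(axis=1) + rng.normal(0, 0.1, 329)  # synthetic internal temperature

# Stage 1: plain ANN on the original inputs
stage1 = MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000).fit(X, y)

# Stage 2: cascade - the stage-1 prediction is appended as an extra input feature
X2 = np.hstack([X, stage1.predict(X).reshape(-1, 1)])
stage2 = MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000).fit(X2, y)
```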

Keywords: cascaded neural network, internal temperature, inverter, three-phase induction motor

Procedia PDF Downloads 347
9431 Resource Leveling Optimization in Construction Projects of High Voltage Substations Using Nature-Inspired Intelligent Evolutionary Algorithms

Authors: Dimitrios Ntardas, Alexandros Tzanetos, Georgios Dounias

Abstract:

High Voltage Substations (HVS) are the intermediate step between the production of power and successfully transmitting it to clients, making them one of the most important checkpoints in power grids. Nowadays, as renewable resources and consequently distributed generation are growing fast, the construction of HVS is of high importance both in terms of quality and completion time, so that new energy producers can quickly and safely integrate into power grids. The resources needed, such as machines and workers, should be carefully allocated so that the construction of an HVS is completed on time, at the lowest possible cost (e.g., not incurring additional costs, caused by project delays, that were not taken into consideration), and at the highest quality. In addition, there are milestones and several checkpoints to be precisely achieved during construction to ensure cost and timeline control and to ensure that the percentage of governmental funding will be granted. The management of such a demanding project is an NP-hard problem that consists of prerequisite constraints and resource limits for each task of the project. In this work, a hybrid meta-heuristic method is implemented to solve this problem. Meta-heuristics have been proven to be quite useful when dealing with high-dimensional constrained optimization problems, and hybridizing them boosts their performance.

Keywords: hybrid meta-heuristic methods, substation construction, resource allocation, time-cost efficiency

Procedia PDF Downloads 157
9430 Ranking Theory-The Paradigm Shift in Statistical Approach to the Issue of Ranking in a Sports League

Authors: E. Gouya Bozorg

Abstract:

The ranking of sports teams, in particular soccer teams, is of primary importance in professional sports. However, it is still based on classical statistics and models outside the area of mathematics, something that requires serious examination. The purpose of this study is to change the approach so as to get closer to mathematics proper for use in ranking. We recommend using theoretical mathematics as a good option because it can hermeneutically obtain the theoretical concepts and criteria needed for ranking from the everyday language of a league. We have proposed a framework that puts the issue of ranking into a new space, which we have applied to soccer as a case study. This is an experimental and theoretical study of ranking in a professional soccer league based on theoretical mathematics, followed by theoretical statistics. First, we gave a theoretical definition of the constant Є = 1.33, the 'golden number' of a soccer league. Then, we defined the 'efficiency of a team' by this number and the formula μ = Pts / (k·Є) − 1, in which Pts is the number of points obtained by a team in k games played. Moreover, the k·Є index was used to show the theoretical median line in the league table and to compare top teams and bottom teams. A theoretical coefficient σ = 1 / (1 + Ptx / Ptxn) was also defined for every match between teams x and xn with points Ptx and Ptxn, respectively; it gives a performance point resulting in a special ranking for the league, and it has been particularly useful in evaluating the performance of weaker teams. The theory was examined against the statistical data of 4 major European leagues during the period 1998-2014. The results of this study showed that ranking depends on appropriate theoretical indicators of a league. These indicators allowed us to find different forms of ranking of the teams in a league, including the 'special table' of the league. Furthermore, on this basis, the issue of a team's record was revised and amended. In addition, the theory of ranking can be used to compare and classify different leagues and tournaments. Experimental results obtained from the archival statistics of major professional leagues in the world over the past two decades have confirmed the theory. This topic introduces a new theory for the ranking of a soccer league, which can also be used to compare different leagues and tournaments.
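As a worked example of the efficiency formula (using the paper's constant Є = 1.33; the points and games are hypothetical):

```python
EPSILON = 1.33  # the paper's 'golden number' of a soccer league

def efficiency(points, games):
    """Team efficiency: mu = Pts / (k * epsilon) - 1."""
    return points / (games * EPSILON) - 1

print(efficiency(60, 30))  # a team with 60 points after 30 games -> mu of about 0.50
```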

Keywords: efficiency of a team, ranking, special table, theoretical mathematic

Procedia PDF Downloads 421
9429 Arsenic Contamination in Drinking Water Is Associated with Dyslipidemia in Pregnancy

Authors: Begum Rokeya, Rahelee Zinnat, Fatema Jebunnesa, Israt Ara Hossain, A. Rahman

Abstract:

Background and Aims: Arsenic in drinking water is a global environmental health problem, and the exposure may increase mortality from dyslipidemia and cerebrovascular diseases, most likely by causing atherosclerosis. However, the relationship among lipid metabolism, atherosclerosis formation, and arsenic exposure, and its impact in pregnancy, is still unclear. Recent epidemiological evidence indicates a close association between inorganic arsenic exposure via drinking water and dyslipidemia. However, the exact mechanism of this arsenic-mediated increase in atherosclerosis risk factors remains enigmatic. We explored the effect of arsenic on the serum lipid profile in pregnant subjects. Methods: A total of 200 pregnant mothers from an arsenic-exposed area were screened in this cross-sectional study. Our study group included 100 exposed subjects as cases and 100 non-exposed healthy pregnant women as controls. Clinical and anthropometric measurements were done by standard techniques. Lipidemic status was assessed by the enzymatic endpoint method. Urinary arsenic was measured by inductively coupled plasma-mass spectrometry and adjusted for specific gravity; a urinary arsenic level > 100 μg/L was categorized as arsenic-exposed and < 100 μg/L as non-exposed. Multivariate logistic regression and Student's t-test were used for the statistical analysis. Results: Systolic and diastolic blood pressure were both significantly higher in the arsenic-exposed pregnant subjects compared to the non-exposed group (p<0.001). Arsenic-exposed subjects had a two-fold higher chance of developing a hypertensive pregnancy (odds ratio 2.2). In parallel with these findings, arsenic-exposed subjects showed significantly higher triglyceride, total cholesterol, and low-density lipoprotein levels when compared to non-exposed pregnant subjects. Significant correlations of the urinary arsenic level were also found with SBP, DBP, triglycerides, total cholesterol, and serum LDL-cholesterol. Multivariate logistic regression showed that urinary arsenic had a positive association with DBP, SBP, triglycerides, and LDL-c. Conclusion: Arsenic exposure may induce dyslipidemia and atherosclerosis through modifying reverse cholesterol transport in cholesterol metabolism. To decrease atherosclerosis-related mortality associated with arsenic, preventing exposure from environmental sources in early life is an important element.

Keywords: arsenic exposure, dyslipidemia, gestational diabetes mellitus, serum lipid profile

Procedia PDF Downloads 130
9428 Conceptual and Preliminary Design of Landmine Searching UAS at Extreme Environmental Condition

Authors: Gopalasingam Daisan

Abstract:

Landmines and unexploded ammunition pose a significant threat to people and animals: after a war, landmines remain in the ground and directly endanger civilian safety. Children are at the highest risk because they are curious; an unexploded bomb can look like a tempting toy to an inquisitive child. The initial step in designing a UAS (Unmanned Aircraft System) for landmine detection is to choose an appropriate and effective sensor for locating landmines and other unexploded ammunition. The weight of the sensor and of its supporting components is taken as the payload weight. The mission requirement is to find the landmines in a particular area by flying a path that covers the entire desired area. In the first phase of the design, the weight of the UAV (Unmanned Aerial Vehicle) can be estimated with good accuracy using established techniques. The next crucial part of the design is to calculate the power requirement and the wing loading. The matching plot technique is used to determine the thrust-to-weight ratio, which makes this step both simple and precise, and the wing loading can be calculated easily from the stall equation. The wing area is then determined from the wing loading, and the required power is calculated from the thrust-to-weight ratio; an appropriate engine can then be selected from those available on the market, and the wing geometric parameters are chosen based on the conceptual sketch. An important step in the wing design is choosing a proper aerofoil that will ensure a lift coefficient sufficient to satisfy the requirements. The next component is the tail: the tail area and related parameters are estimated to counteract the wing pitching moment, and, since the vertical tail design depends on many parameters, only initial sizing can be done in this phase. The fuselage is another major component; it is sized by the slenderness ratio, and its shape is determined by the sensor size so that the sensor fits underneath. The landing gear is selected based on the controllability and stability requirements, with the minimum and maximum wheel track and wheelbase determined by the crosswind and overturn angle requirements; the minor components of the landing gear design are not the focus of this project. Another important task is to estimate the weight of the major components using empirical relations and assign a mass to each component, for which the CG and moment of inertia are determined separately. The sensitivity of the weight calculation is taken into consideration to avoid extra material requirements and to reduce the cost of the design. Finally, the aircraft performance is evaluated, in particular the V-n (velocity versus load factor) diagram for different flight conditions, both undisturbed and with gust velocity.
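
As a minimal numerical sketch of the sizing chain described above (stall-speed wing loading, then wing area, then thrust and power from the thrust-to-weight ratio), the following snippet uses placeholder values that are assumptions, not figures from the study:

    # Illustrative sizing chain; every number here is a placeholder
    # assumption, not a value from the study.
    RHO = 1.225      # sea-level air density, kg/m^3
    G = 9.81         # gravitational acceleration, m/s^2

    mtow_kg = 12.0   # assumed maximum take-off weight incl. sensor payload
    v_stall = 12.0   # assumed stall speed, m/s
    cl_max = 1.4     # assumed maximum lift coefficient of the aerofoil
    t_over_w = 0.45  # assumed thrust-to-weight ratio from the matching plot
    v_cruise = 20.0  # assumed cruise speed, m/s

    # Wing loading from the stall condition: W/S = 0.5 * rho * V_stall^2 * CL_max
    wing_loading = 0.5 * RHO * v_stall**2 * cl_max  # N/m^2
    weight_n = mtow_kg * G                          # N
    wing_area = weight_n / wing_loading             # m^2

    # Required thrust, and an ideal power estimate P = T * V at cruise:
    thrust = t_over_w * weight_n                    # N
    power = thrust * v_cruise                       # W

    print(f"wing loading {wing_loading:.1f} N/m^2, wing area {wing_area:.2f} m^2")
    print(f"thrust {thrust:.1f} N, required power {power:.0f} W")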

Keywords: landmine, UAS, matching plot, optimization

Procedia PDF Downloads 172
9427 Analyzing the Performance of Machine Learning Models to Predict Alzheimer's Disease and Its Stages While Addressing the Missing Value Problem

Authors: Carlos Theran, Yohn Parra Bautista, Victor Adankai, Richard Alo, Jimwi Liu, Clement G. Yedjou

Abstract:

Alzheimer's disease (AD) is a neurodegenerative disorder primarily characterized by deteriorating cognitive functions, and it has gained considerable attention in the last decade. An estimated 24 million people worldwide suffered from this disease by 2011, an estimated 40 million were diagnosed with AD in 2016, and by 2050 the number affected is expected to reach 131 million. Therefore, detecting and confirming AD at its different stages is a priority for medical practice in order to provide adequate and accurate treatments. Recently, Machine Learning (ML) models have been used to study AD stages in a multiclass setting while handling missing values, focusing on the delineation of Early Mild Cognitive Impairment (EMCI), Late Mild Cognitive Impairment (LMCI), and cognitively normal (CN) subjects. To the best of our knowledge, however, robust performance information for these models, together with an analysis of the missing data, has not been presented in the literature. In this paper, we study the performance of five different machine learning models for multiclass prediction of AD stages in terms of accuracy, precision, and F1-score, and we analyze three imputation methods for handling the missing value problem. A framework that integrates the ML models for multiclass prediction of AD stages is proposed, achieving an average accuracy of 84%.
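
The abstract does not name the three imputation methods; purely as an illustration, the sketch below compares three common imputers ahead of a multiclass classifier on synthetic data, so the imputers, classifier, and data are all assumptions rather than the paper's setup:

    # Illustrative comparison of imputation methods before multiclass
    # prediction; the imputers, classifier, and synthetic data are
    # assumptions, not necessarily those used in the paper.
    import numpy as np
    from sklearn.experimental import enable_iterative_imputer  # noqa: F401
    from sklearn.impute import SimpleImputer, KNNImputer, IterativeImputer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline

    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 10))
    X[rng.random(X.shape) < 0.1] = np.nan  # 10% of entries missing at random
    y = rng.integers(0, 3, size=300)       # three stages: CN, EMCI, LMCI

    imputers = {
        "mean": SimpleImputer(strategy="mean"),
        "knn": KNNImputer(n_neighbors=5),
        "iterative": IterativeImputer(random_state=0),
    }
    for name, imputer in imputers.items():
        pipe = make_pipeline(imputer, RandomForestClassifier(random_state=0))
        acc = cross_val_score(pipe, X, y, cv=5, scoring="accuracy").mean()
        print(f"{name}: mean CV accuracy {acc:.3f}")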

Keywords: alzheimer's disease, missing value, machine learning, performance evaluation

Procedia PDF Downloads 258
9426 Management of Quality Assessment of the Teaching and Methodological Activities of a Teacher of a Military Special Educational Institution

Authors: Maxutova I. O., Bulatbayeva A. A.

Abstract:

In modern conditions, the competitiveness of a military special educational institution in the educational market is determined by the quality of the educational services it provides and by the economic efficiency of its activities. Improving the quality of these educational services is therefore an urgent, socially and economically significant problem. The article presents a possible system for building the competitiveness of a military special educational institution through an assessment of the quality of the educational process, and raises the problem of transitioning such institutions to digital support for indicative monitoring of the quality of the services provided. Quality monitoring is presented in the form of a program or information system whose work is carried out in a military special educational institution through highlighted interrelated elements. A result-oriented model for managing and assessing the quality of the work of a military special educational institution is proposed, and indicative indicators for assessing the quality of a teacher's teaching and methodological activities are considered and described. The publication was prepared as part of an applied grant study for 2020-2022 commissioned by the Ministry of Education and Science of the Republic of Kazakhstan on the topic "Development of a comprehensive methodology for assessing the quality of education of graduates of military special educational institutions", IRN 00029/GF-20.

Keywords: quality assessment, indicative indicators, monitoring program, educational and methodological activities, professional activities, result

Procedia PDF Downloads 155
9425 Fraud Detection in Credit Cards with Machine Learning

Authors: Anjali Chouksey, Riya Nimje, Jahanvi Saraf

Abstract:

Online transactions have increased dramatically in this new ‘social-distancing’ era, and with them, fraud in online payments has also increased significantly. Fraud is a significant problem in various industries such as insurance and banking, and includes the leaking of sensitive credit card information that can easily be misused. With governments also pushing online transactions, e-commerce is booming, but rising payment fraud is costing e-commerce companies a great deal of their customers' trust, and these companies are finding credit card fraud to be a serious problem. As more people use online payment options, they become easy targets of credit card fraud. In this research paper, we discuss machine learning algorithms for fraud detection. We apply a decision tree, XGBoost, k-nearest neighbours, logistic regression, random forest, and an SVM to a dataset of online credit card transactions, and we evaluate each model using the confusion matrix, F1 score, and accuracy score to identify which algorithm is best suited to detecting fraud.
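
A minimal sketch of this evaluation loop is given below; the dataset file and its "is_fraud" label column are placeholder assumptions, not the paper's data:

    # Illustrative model comparison for fraud detection; the CSV file and
    # its "is_fraud" label column are assumptions, not the paper's dataset.
    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score, confusion_matrix, f1_score
    from sklearn.linear_model import LogisticRegression
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC
    from xgboost import XGBClassifier

    df = pd.read_csv("transactions.csv")  # hypothetical dataset
    X, y = df.drop(columns=["is_fraud"]), df["is_fraud"]
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

    models = {
        "decision tree": DecisionTreeClassifier(random_state=0),
        "xgboost": XGBClassifier(eval_metric="logloss"),
        "knn": KNeighborsClassifier(),
        "logistic regression": LogisticRegression(max_iter=1000),
        "random forest": RandomForestClassifier(random_state=0),
        "svm": SVC(),
    }
    for name, model in models.items():
        pred = model.fit(X_tr, y_tr).predict(X_te)
        tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
        print(f"{name}: TP={tp} FP={fp} FN={fn} TN={tn} "
              f"F1={f1_score(y_te, pred):.3f} acc={accuracy_score(y_te, pred):.3f}")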

Keywords: machine learning, fraud detection, artificial intelligence, decision tree, k-nearest neighbour, random forest, XGBoost, logistic regression, support vector machine

Procedia PDF Downloads 153