Search results for: CART
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 44

44 Shopping Cart System: Load Balancing and Fault Tolerance in the OSGi Service Platform

Authors: Irina Astrova, Arne Koschel, Thole Schneider, Johannes Westhuis, Jürgen Westerkamp

Abstract:

The main purpose of this paper was to find a simple solution for load balancing and fault tolerance in OSGi. The challenge was to implement a highly available web application such as a shopping cart system with load balancing and fault tolerance, without having to change the core of OSGi.

Keywords: fault tolerance, load balancing, OSGi, shopping cart system

Procedia PDF Downloads 421
43 Kinematic Analysis of Human Gait for Typical Postures of Walking, Running and Cart Pulling

Authors: Nupur Karmaker, Hasin Aupama Azhari, Abdul Al Mortuza, Abhijit Chanda, Golam Abu Zakaria

Abstract:

Purpose: The purpose of gait analysis is to determine the biomechanics of the joints, the phases of the gait cycle, the degree of joint rotation (graphically and analytically), the electrical activity of muscles, and the force exerted on the hip joint during walking, running and cart pulling. Methods and Materials: Visual gait analysis and electromyography were used to determine the degree of rotation of joints and the electrical activity of muscles. In the cinematography method, an object is filmed from different sides. The cart-pulling recording was divided into frames with respect to time using video splitter software. The phases of the gait cycle, degree of rotation of joints, EMG profile and force analysis for walking and running were taken from published papers. The gait cycle and degree of rotation of joints during cart pulling were obtained using a video camera, stopwatch, video splitter software and Microsoft Excel. Results and Discussion: During cart pulling, the force exerted on the hip is the resultant of several forces: it is the vector sum of the force Fg = mg, due to the body weight of the person, and Fa = ma, due to the velocity. The stance phase is at its maximum during cart pulling and at its minimum during running. The maximum degree of rotation occurs during cart pulling for the hip joint, during running for the knee, and during cart pulling for the ankle. The minimum degree of rotation was observed during walking for the hip and during running for the ankle. During cart pulling, the dynamic force depends on the walking velocity, body weight and load weight. Conclusions: 80% of people suffer from gait-related disease as they age. Proper care should be taken during cart pulling. It would be better to establish a gait laboratory to diagnose gait-related diseases.
If the way of cart pulling is changed, i.e. if the design of the cart-pulling machine and the load-bearing system is changed, then it would be possible to reduce the risk of limb loss, flat foot syndrome and varicose veins in the lower limb.

Keywords: kinematic, gait, gait lab, phase, force analysis

Procedia PDF Downloads 576
42 Perinatal Ethanol Exposure Modifies CART System in Rat Brain Anticipated for Development of Anxiety, Depression and Memory Deficits

Authors: M. P. Dandekar, A. P. Bharne, P. T. Borkar, D. M. Kokare, N. K. Subhedar

Abstract:

Ethanol ingestion by the mother entails adverse consequences for her offspring. Herein, we examine the behavioral phenotype and neural substrate of the offspring of the mother on ethanol. Female rats were fed an ethanol-containing liquid diet from 8 days prior to conception until 25 days post-parturition, to coincide with weaning. Behavioral changes associated with anxiety, depression, and learning and memory were assessed in the offspring after they attained adulthood (day 85), using the elevated plus maze (EPM), forced swim test (FST) and novel object recognition test (NORT), respectively. The offspring of the alcoholic mother, compared to those of the pair-fed mother, spent significantly more time in the closed arms of the EPM and showed more immobility time in the FST. Offspring at the age of 25 and 85 days failed to discriminate between novel and familiar objects in the NORT, thus reflecting anxiogenic, depressive and amnesic phenotypes. The neuropeptide cocaine- and amphetamine-regulated transcript peptide (CART) is known to be involved in the central effects of ethanol and was hence selected for the current study. Twenty-five-day-old pups of the alcoholic mother showed significant augmentation in CART-immunoreactivity in the cells of the Edinger-Westphal (EW) nucleus and lateral hypothalamus. However, a significant decrease in CART-immunoreactivity was seen in the nucleus accumbens shell (AcbSh), lateral part of the bed nucleus of the stria terminalis (BNSTl), locus coeruleus (LC), hippocampus (CA1, CA2 and CA3), and arcuate nucleus (ARC) of the pups and/or adult offspring. While no change in the CART-immunoreactive fibers of AcbSh and BNSTl, CA2 and CA3 was noticed in the 25-day-old pups, the CART-immunoreactive cells in EW and paraventricular nucleus (PVN), and fibers in the central nucleus of the amygdala, of 85-day-old offspring remained unaffected.
We suggest that the endogenous CART system in these discrete areas, among other factors, may be causal to the abnormalities in the next generation of an alcoholic mother.

Keywords: anxiety, depression, CART, ethanol, immunocytochemistry

Procedia PDF Downloads 395
41 PM10 Prediction and Forecasting Using CART: A Case Study for Pleven, Bulgaria

Authors: Snezhana G. Gocheva-Ilieva, Maya P. Stoimenova

Abstract:

Ambient air pollution with fine particulate matter (PM10) is a systematic, permanent problem in many countries around the world. The accumulation of a large number of measurements of both the PM10 concentrations and the accompanying atmospheric factors allows for their statistical modeling to detect dependencies and forecast future pollution. This study applies the classification and regression trees (CART) method for building and analyzing PM10 models. In the empirical study, average daily air data for the city of Pleven, Bulgaria for a period of 5 years are used. Predictors in the models are seven meteorological variables, time variables, as well as lagged PM10 variables and some lagged meteorological variables, delayed by 1 or 2 days with respect to the initial time series. The degree of influence of the predictors in the models is determined. The selected best CART models are used to forecast PM10 concentrations for two days ahead of the last date in the modeling procedure and show very accurate results.
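
The lagged predictors described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `add_lags`, the dictionary layout and the PM10 values are all hypothetical, chosen only to show how a series is turned into rows with 1-day and 2-day lags before a tree model is fitted.

```python
def add_lags(series, lags=(1, 2)):
    """Build one row per day holding the target value and its lagged values.

    Days that do not have every requested lag available are dropped.
    """
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, len(series)):
        row = {"target": series[t]}
        for k in lags:
            row[f"lag{k}"] = series[t - k]  # value observed k days earlier
        rows.append(row)
    return rows


# Made-up daily PM10 averages, for illustration only.
pm10 = [41.0, 38.5, 52.3, 60.1, 47.8]
rows = add_lags(pm10)
```

Each resulting row could then be passed to any regression tree implementation as one training example, with `lag1` and `lag2` as predictors alongside the meteorological variables.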

Keywords: cross-validation, decision tree, lagged variables, short-term forecasting

Procedia PDF Downloads 196
40 Retrospective Analysis of Injuries to Flight Attendants in a Commercial Airliner

Authors: B. K. Umesh Kumar, Waleed Al Shukaili

Abstract:

Air travel is one of the safest modes of travel. Inflight injuries occur due to various factors such as air turbulence, spillage of hot liquids, and falls of improperly stowed overhead baggage. Injuries occur not only to passengers but also to the flight attendants who attend to the passengers throughout the flight. In this retrospective study, all crew safety reports filed by the captain of the aircraft for all flights from 01 Mar 2015 to 31 Mar 2019 at the national carrier of a Middle Eastern country were analyzed. There was one injury to a flight attendant every 1200 flights. The commonest aircraft involved was Boeing. The inflight phase accounted for 82% of all injuries. 63% of accidents involved female attendants. The commonest age group involved was 25-30 years. Cart and container injuries were the commonest and accounted for nearly 62% of the total injuries, followed by turbulence. Back injuries were the commonest injuries, followed by ankle, shoulder, and burn injuries. Mean absence from work was longest for shoulder injuries (40 days), followed by back injuries (38 days). A reduction in injuries to flight attendants can be brought about by proper selection of crew and a reduction in cart load. Proper maintenance of carts and containers plays a major role in the prevention of occupational accidents.

Keywords: flight attendants, in-flight injuries, types of injuries, work related injury prevention

Procedia PDF Downloads 126
39 Simulation and Analysis of Inverted Pendulum Controllers

Authors: Sheren H. Salah

Abstract:

The inverted pendulum is a highly nonlinear and open-loop unstable system. An inverted pendulum (IP) is a pendulum which has its mass above its pivot point. It is often implemented with the pivot point mounted on a cart that can move horizontally and may be called a cart and pole. The characteristics of the inverted pendulum make identification and control more challenging. This paper presents a simulation study of several control strategies for an inverted pendulum system. The goal is to determine which control strategy delivers better performance with respect to the pendulum’s angle. The inverted pendulum represents a challenging control problem, as it continually moves toward an uncontrolled state. The simulation study shows that, for controlling the inverted pendulum, sliding mode control (SMC) produced a better response compared to Genetic Algorithm Control (GAs) and proportional-integral-derivative (PID) control.
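
To make the control problem concrete, here is a minimal sketch of a PID loop on a simplified pendulum. It is not the paper's simulation. It assumes a linearized model, theta'' = (g/l)·theta + u, in which the control input acts directly as an angular acceleration and the cart dynamics are ignored. The function name, gains and time step are hypothetical.

```python
def simulate_pid(theta0=0.1, kp=50.0, ki=1.0, kd=10.0,
                 g_over_l=9.81, dt=0.001, steps=5000):
    """Simulate a linearized inverted pendulum under PID control.

    Assumed model (illustrative only): theta'' = g_over_l * theta + u.
    Returns the pendulum angle in radians after steps * dt seconds.
    """
    theta, omega, integral = theta0, 0.0, 0.0
    for _ in range(steps):
        error = -theta                  # setpoint is the upright position, theta = 0
        integral += error * dt          # accumulated error for the I term
        derivative = -omega             # d(error)/dt, since the setpoint is constant
        u = kp * error + ki * integral + kd * derivative
        alpha = g_over_l * theta + u    # angular acceleration of the assumed model
        omega += alpha * dt             # semi-implicit Euler integration
        theta += omega * dt
    return theta


controlled = simulate_pid()
uncontrolled = simulate_pid(kp=0.0, ki=0.0, kd=0.0)
```

With all gains set to zero the angle grows without bound, which reflects the open-loop instability the abstract mentions. With the assumed gains the angle settles close to zero.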

Keywords: Inverted Pendulum (IP), Proportional-Integral-Derivative (PID), Genetic Algorithm Control (GAs), Sliding Mode Control (SMC)

Procedia PDF Downloads 555
38 An Alternative Approach for Assessing the Impact of Cutting Conditions on Surface Roughness Using Single Decision Tree

Authors: S. Ghorbani, N. I. Polushin

Abstract:

In this study, an approach to identify factors affecting surface roughness in a machining process is presented. This study is based on 81 observations of surface roughness over a wide range of cutting tools (conventional, cutting tool with holes, cutting tool with composite material), workpiece materials (AISI 1045 Steel, AA2024 aluminum alloy, A48-class30 gray cast iron), spindle speed (630-1000 rpm), feed rate (0.05-0.075 mm/rev), depth of cut (0.05-0.15 mm) and tool overhang (41-65 mm). A single decision tree (SDT) analysis was done to identify factors for predicting a model of surface roughness, and the CART algorithm was employed for building and evaluating the regression tree. Results show that a single decision tree is better than traditional regression models, with higher rate and forecast accuracy and strong value.

Keywords: cutting condition, surface roughness, decision tree, CART algorithm

Procedia PDF Downloads 376
37 Determination of the Bank's Customer Risk Profile: Data Mining Applications

Authors: Taner Ersoz, Filiz Ersoz, Seyma Ozbilge

Abstract:

In this study, the clients who applied to a bank branch for a loan were analyzed through data mining. The data comprised information such as the amounts of loans received by personal and SME clients working with the bank branch, installment numbers, number of delays in loan installments, payments available in other banks and the number of banks to which they were in debt between 2010 and 2013. The client risk profile was examined through Classification and Regression Tree (CART) analysis, one of the decision tree classification methods. At the end of the study, 5 different types of customers were determined on the decision tree. The classification of these types of customers was created by rating those posing a risk for the bank branch, and the customers were classified according to the risk ratings.

Keywords: client classification, loan suitability, risk rating, CART analysis

Procedia PDF Downloads 338
36 Predictive Analysis for Big Data: Extension of Classification and Regression Trees Algorithm

Authors: Ameur Abdelkader, Abed Bouarfa Hafida

Abstract:

Since its inception, predictive analysis has revolutionized the IT industry through its robustness and decision-making facilities. It involves the application of a set of data processing techniques and algorithms in order to create predictive models. Its principle is based on finding relationships between explanatory variables and the predicted variables. Past occurrences are exploited to predict and to derive the unknown outcome. With the advent of big data, many studies have suggested the use of predictive analytics in order to process and analyze big data. Nevertheless, they have been curbed by the limits of classical methods of predictive analysis in case of a large amount of data. In fact, because of their volumes, their nature (semi or unstructured) and their variety, it is impossible to analyze efficiently big data via classical methods of predictive analysis. The authors attribute this weakness to the fact that predictive analysis algorithms do not allow the parallelization and distribution of calculation. In this paper, we propose to extend the predictive analysis algorithm, Classification And Regression Trees (CART), in order to adapt it for big data analysis. The major changes of this algorithm are presented and then a version of the extended algorithm is defined in order to make it applicable for a huge quantity of data.

Keywords: predictive analysis, big data, predictive analysis algorithms, CART algorithm

Procedia PDF Downloads 142
35 Predicting Resistance of Commonly Used Antimicrobials in Urinary Tract Infections: A Decision Tree Analysis

Authors: Meera Tandan, Mohan Timilsina, Martin Cormican, Akke Vellinga

Abstract:

Background: In general practice, many infections are treated empirically without microbiological confirmation. Understanding the susceptibility of antimicrobials during empirical prescribing can help to reduce inappropriate prescribing. This study aims to apply a prediction model using a decision tree approach to predict the antimicrobial resistance (AMR) of urinary tract infections (UTI) based on non-clinical features of patients over 65 years. Decision tree models are a novel idea to predict the outcome of AMR at an initial stage. Method: Data was extracted from the database of the microbiological laboratory of the University Hospitals Galway on all antimicrobial susceptibility testing (AST) of urine specimens from patients over the age of 65 from January 2011 to December 2014. The primary endpoint was resistance to common antimicrobials (nitrofurantoin, trimethoprim, ciprofloxacin, co-amoxiclav and amoxicillin) used to treat UTI. A classification and regression tree (CART) model was generated with the outcome ‘resistant infection’. The importance of each predictor (the number of previous samples, age, gender, location (nursing home, hospital, community) and causative agent) on antimicrobial resistance was estimated. Sensitivity, specificity, negative predictive (NPV) and positive predictive (PPV) values were used to evaluate the performance of the model. Seventy-five percent (75%) of the data were used as a training set and validation of the model was performed with the remaining 25% of the dataset. Results: A total of 9805 UTI patients over 65 years had their urine sample submitted for AST at least once over the four years. E. coli, Klebsiella and Proteus species were the most commonly identified pathogens among the UTI patients without a catheter, whereas Serratia, Staphylococcus aureus and Enterobacter were common in those with a catheter.
The validated CART model shows slight differences in the sensitivity, specificity, PPV and NPV between the models with and without the causative organisms. The sensitivity, specificity, PPV and NPV for the model with non-clinical predictors were between 74% and 88% depending on the antimicrobial. Conclusion: The CART models developed using non-clinical predictors have good performance when predicting antimicrobial resistance. These models predict which antimicrobial may be the most appropriate based on non-clinical factors. Other CART models, prospective data collection and validation and an increasing number of non-clinical factors will improve model performance. The presented model provides an alternative approach to decision making on antimicrobial prescribing for UTIs in older patients.
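
The four evaluation measures named in this abstract follow directly from a confusion matrix. The sketch below shows the standard definitions. The function name and the counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # resistant cases correctly flagged
        "specificity": tn / (tn + fp),  # susceptible cases correctly cleared
        "ppv": tp / (tp + fp),          # predicted-resistant that are truly resistant
        "npv": tn / (tn + fn),          # predicted-susceptible that are truly susceptible
    }


# Made-up counts for a held-out validation set, for illustration only.
metrics = diagnostic_metrics(tp=80, fp=10, tn=90, fn=20)
```

With these illustrative counts, sensitivity is 0.80 and specificity is 0.90, while PPV and NPV also depend on how common resistance is in the validation set.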

Keywords: antimicrobial resistance, urinary tract infection, prediction, decision tree

Procedia PDF Downloads 256
34 Using Statistical Significance and Prediction to Test Long/Short Term Public Services and Patients' Cohorts: A Case Study in Scotland

Authors: Raptis Sotirios

Abstract:

Health and social care (HSc) services planning and scheduling are facing unprecedented challenges due to the pandemic pressure and also suffer from unplanned spending that is negatively impacted by the global financial crisis. Data-driven approaches can help to improve policies and to plan and design service provision schedules, using algorithms that assist healthcare managers in facing unexpected demands with fewer resources. The paper discusses service packing using statistical significance tests and machine learning (ML) to evaluate demand similarity and coupling. This is achieved by predicting the range of the demand (class) using ML methods such as CART, random forests (RF), and logistic regression (LGR). The Chi-Squared and Student significance tests are applied to data spanning 39 years for which HSc services data exist for services delivered in Scotland. The demands are probabilistically associated through statistical hypotheses that assume, as a NULL hypothesis, that the target service’s demands are statistically dependent on other demands. This linkage can be confirmed or not by the data. Complementarily, ML methods are used to linearly predict the above target demands from the statistically found associations and extend the linear dependence of the target’s demand to independent demands, thus forming groups of services. Statistical tests confirm the ML couplings, making the prediction also statistically meaningful, and prove that a target service can be matched reliably to other services, and ML shows these indicated relationships can also be linear ones. Zero padding was used for records of missing years and better illustrated such relationships, both for limited years and over the entire span, offering long-term data visualizations, while limited-year groups explained how well patient numbers can be related in short periods or can change over time, as opposed to behaviors across more years.
The prediction performance of the associations is measured using Receiver Operating Characteristic (ROC) AUC and ACC metrics as well as the statistical tests, Chi-Squared and Student. Co-plots and comparison tables for RF, CART, and LGR, as well as p-values and Information Exchange (IE), are provided, showing the specific behavior of the ML methods and of the statistical tests and the behavior using different learning ratios. The impact of k-NN, cross-correlation and C-Means first groupings is also studied over limited years and the entire span. It was found that CART was generally behind RF and LGR, but in some interesting cases, LGR reached an AUC=0, falling below CART, while the ACC was as high as 0.912, showing that ML methods can be confused by padding, data irregularities or outliers. On average, 3 linear predictors were sufficient, LGR was found to compete well with RF, and CART followed with the same performance at higher learning ratios. Services were packed only when the significance level (p-value) of their association coefficient was more than 0.05. Social-factor relationships were observed between home care services and treatment of old people, birth weights, alcoholism, drug abuse, and emergency admissions. The work found that different HSc services can be well packed as plans of limited years, across various services sectors and learning configurations, as confirmed using statistical hypotheses.

Keywords: class, cohorts, data frames, grouping, prediction, probability, services

Procedia PDF Downloads 236
33 An Improved Parallel Algorithm of Decision Tree

Authors: Jiameng Wang, Yunfei Yin, Xiyu Deng

Abstract:

Parallel optimization is one of the important research topics of data mining at this stage. Taking Classification and Regression Tree (CART) parallelization as an example, this paper proposes a parallel data mining algorithm based on SSP-OGini-PCCP. Aiming at the problem of choosing the best CART segmentation point, this paper designs an S-SP model without data association; and in order to calculate the Gini index efficiently, a parallel OGini calculation method is designed. In addition, in order to improve the efficiency of the pruning algorithm, a synchronous PCCP pruning strategy is proposed in this paper. In this paper, the optimal segmentation calculation, Gini index calculation, and pruning algorithm are studied in depth. These are important components of parallel data mining. By constructing a distributed cluster simulation system based on SPARK, data mining methods based on SSP-OGini-PCCP are tested. Experimental results show that this method can increase the search efficiency of the best segmentation point by an average of 89%, increase the search efficiency of the Gini segmentation index by 3853%, and increase the pruning efficiency by 146% on average; and as the size of the data set increases, the performance of the algorithm remains stable, which meets the requirements of contemporary massive data processing.
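
For readers unfamiliar with the quantities being parallelized, the sketch below shows the ordinary serial computation of the Gini index and an exhaustive search for the best split point on one numeric feature. It is a baseline illustration only, not the SSP-OGini-PCCP method proposed in the paper. The function names and sample data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best binary split on one feature.

    Candidate thresholds are midpoints between consecutive distinct sorted values.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Made-up, perfectly separable data for illustration only.
threshold, score = best_split([1, 2, 3, 10, 11, 12], ["a", "a", "a", "b", "b", "b"])
```

Because every candidate threshold is scored independently of the others, this search is a natural target for the kind of parallelization the abstract describes.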

Keywords: classification, Gini index, parallel data mining, pruning ahead

Procedia PDF Downloads 124
32 Interpretation of the Russia-Ukraine 2022 War via N-Gram Analysis

Authors: Elcin Timur Cakmak, Ayse Oguzlar

Abstract:

This study presents the results of analysing tweets sent by Twitter users about the Russia-Ukraine war using bigram and trigram methods. On February 24, 2022, Russian President Vladimir Putin declared a military operation against Ukraine, and all eyes were turned to this war. Many people living in Russia and Ukraine reacted to this war and protested and also expressed their deep concern about this war as they felt the safety of their families and their futures were at stake. Most people, especially those living in Russia and Ukraine, express their views on the war in different ways. The most popular way to do this is through social media. Many people prefer to convey their feelings using Twitter, one of the most frequently used social media tools. Since the beginning of the war, there have been thousands of tweets about the war from many countries of the world on Twitter. These tweets accumulated in data sources are extracted using various codes for analysis through the Twitter API and analysed with the Python programming language. The aim of the study is to find the word sequences in these tweets by the n-gram method, which is known for its widespread use in computational linguistics and natural language processing. The tweet language used in the study is English. The data set consists of the data obtained from Twitter between February 24, 2022, and April 24, 2022. The tweets obtained from Twitter using the #ukraine, #russia, #war, #putin, #zelensky hashtags together were captured as raw data, and the remaining tweets were included in the analysis stage after they were cleaned through the preprocessing stage. In the data analysis part, the sentiments are found to present what people send as a message about the war on Twitter. Regarding this, negative messages make up the majority of all the tweets, at a ratio of 63.6%. Furthermore, the most frequently used bigram and trigram word groups are found.
Regarding the results, the most frequently used word groups are “he, is”, “I, do”, “I, am” for bigrams. Also, the most frequently used word groups are “I, do, not”, “I, am, not”, “I, can, not” for trigrams. In the machine learning phase, the accuracy of classifications is measured by Classification and Regression Trees (CART) and Naïve Bayes (NB) algorithms. The algorithms are used separately for bigrams and trigrams. We gained the highest accuracy and F-measure values by the NB algorithm and the highest precision and recall values by the CART algorithm for bigrams. On the other hand, the highest values for accuracy, precision, and F-measure values are achieved by the CART algorithm, and the highest value for the recall is gained by NB for trigrams.
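
The bigram and trigram counts reported here rest on a simple sliding-window extraction. The sketch below illustrates that step on a made-up sentence. It is not the authors' pipeline, and the helper name `ngrams` and the whitespace tokenization are assumptions.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token windows of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Made-up, already-cleaned text standing in for preprocessed tweets.
tokens = "i do not know and i do not care".split()
bigram_counts = Counter(ngrams(tokens, 2))
trigram_counts = Counter(ngrams(tokens, 3))
```

Calling `bigram_counts.most_common()` would then rank the word groups by frequency, and the resulting counts can serve as features for classifiers such as CART or Naïve Bayes.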

Keywords: classification algorithms, machine learning, sentiment analysis, Twitter

Procedia PDF Downloads 75
31 Short-Term Forecast of Wind Turbine Production with Machine Learning Methods: Direct Approach and Indirect Approach

Authors: Mamadou Dione, Eric Matzner-lober, Philippe Alexandre

Abstract:

The Energy Transition Act defined by the French State has precise implications for Renewable Energies, in particular for their remuneration mechanism. Until now, a purchase obligation contract permitted the sale of wind-generated electricity at a fixed rate. In the future, it will be necessary to sell this electricity on the Market (at variable rates) before obtaining additional compensation intended to reduce the risk. This sale on the market requires announcing in advance (about 48 hours before) the production that will be delivered to the network, and therefore being able to predict this production in the short term. The fundamental problem remains the variability of the wind, accentuated by the geographical situation. The objective of the project is to provide, every day, short-term forecasts (48-hour horizon) of wind production using weather data. The predictions of the GFS model and those of the ECMWF model are used as explanatory variables. The variable to be predicted is the production of a wind farm. We use two approaches: a direct approach that predicts wind generation directly from weather data, and an integrated approach that estimates wind from weather data and converts it into wind power by power curves. We used machine learning techniques to predict this production. The models tested are random forests, CART + Bagging, CART + Boosting, and SVM (Support Vector Machine). The application is made on a wind farm of 22MW (11 wind turbines) of the Compagnie du Vent (which became Engie Green France). Our results are very conclusive compared to the literature.

Keywords: forecast aggregation, machine learning, spatio-temporal dynamics modeling, wind power forecast

Procedia PDF Downloads 218
30 A Statistical Approach to Predict and Classify the Commercial Hatchability of Chickens Using Extrinsic Parameters of Breeders and Eggs

Authors: M. S. Wickramarachchi, L. S. Nawarathna, C. M. B. Dematawewa

Abstract:

Hatchery performance is critical for the profitability of poultry breeder operations. Some extrinsic parameters of eggs and breeders increase or decrease hatchability. This study aims to identify the extrinsic parameters affecting the commercial hatchability of local chickens' eggs and determine the most efficient classification model with a hatchability rate greater than 90%. In this study, seven extrinsic parameters were considered: egg weight, moisture loss, breeders' age, number of fertilised eggs, shell width, shell length, and shell thickness. Multiple linear regression was performed to determine the most influential variable on hatchability. First, the correlation between each parameter and hatchability was checked. Then a multiple regression model was developed, and the accuracy of the fitted model was evaluated. Linear Discriminant Analysis (LDA), Classification and Regression Trees (CART), k-Nearest Neighbors (kNN), Support Vector Machines (SVM) with a linear kernel, and Random Forest (RF) algorithms were applied to classify the hatchability. This grouping process was conducted using binary classification techniques. Hatchability was negatively correlated with egg weight, breeders' age, shell width, and shell length, and positive correlations were identified with moisture loss, number of fertilised eggs, and shell thickness. Multiple linear regression models were more accurate than single linear models, with the highest coefficient of determination (R²) of 94% and minimum AIC and BIC values. According to the classification results, RF, CART, and kNN achieved the highest accuracy values of 0.99, 0.975, and 0.972, respectively, for the commercial hatchery process. Therefore, RF is the most appropriate machine learning algorithm for classifying the breeder outcomes, which are economically profitable or not, in a commercial hatchery.

Keywords: classification models, egg weight, fertilised eggs, multiple linear regression

Procedia PDF Downloads 88
29 The Exploitation of Balancing an Inverted Pendulum System Using Sliding Mode Control

Authors: Sheren H. Salah, Ahmed Y. Ben Sasi

Abstract:

The inverted pendulum system is a classic control problem that is used in universities around the world. It is a suitable process to test prototype controllers due to its high non-linearities and lack of stability. The inverted pendulum represents a challenging control problem, as it continually moves toward an uncontrolled state. This paper presents the possibility of balancing an inverted pendulum system using sliding mode control (SMC). The goal is to determine which control strategy delivers better performance with respect to the pendulum’s angle and the cart's position. Therefore, proportional-integral-derivative (PID) control is used for comparison. Results show that SMC produced a better response than PID control in both normal and noisy systems.

Keywords: inverted pendulum (IP), proportional-integral derivative (PID), sliding mode control (SMC), systems and control engineering

Procedia PDF Downloads 588
28 Performance Evaluation of Contemporary Classifiers for Automatic Detection of Epileptic EEG

Authors: K. E. Ch. Vidyasagar, M. Moghavvemi, T. S. S. T. Prabhat

Abstract:

Epilepsy is a global problem, and with seizures eluding even the smartest of diagnoses, automatic detection of seizures using the electroencephalogram (EEG) would have a huge impact on diagnosis of the disorder. Among a multitude of methods for automatic epilepsy detection, one should find the best method, based on accuracy, for classification. This paper reasons out, and rationalizes, the best methods for classification. Accuracy is based on the classifier, and thus this paper discusses classifiers like quadratic discriminant analysis (QDA), classification and regression tree (CART), support vector machine (SVM), naive Bayes classifier (NBC), linear discriminant analysis (LDA), K-nearest neighbor (KNN) and artificial neural networks (ANN). Results show that ANN is the most accurate of all the above stated classifiers, with 97.7% accuracy, 97.25% specificity and 98.28% sensitivity. This is followed closely by SVM with 1% variation in result. These results would certainly help researchers choose the best classifier for detection of epilepsy.

Keywords: classification, seizure, KNN, SVM, LDA, ANN, epilepsy

Procedia PDF Downloads 524
27 The Effect of Feature Selection on Pattern Classification

Authors: Chih-Fong Tsai, Ya-Han Hu

Abstract:

The aim of feature selection (or dimensionality reduction) is to filter out unrepresentative features (or variables) making the classifier perform better than the one without feature selection. Since there are many well-known feature selection algorithms, and different classifiers based on different selection results may perform differently, very few studies consider examining the effect of performing different feature selection algorithms on the classification performances by different classifiers over different types of datasets. In this paper, two widely used algorithms, which are the genetic algorithm (GA) and information gain (IG), are used to perform feature selection. On the other hand, three well-known classifiers are constructed, which are the CART decision tree (DT), multi-layer perceptron (MLP) neural network, and support vector machine (SVM). Based on 14 different types of datasets, the experimental results show that in most cases IG is a better feature selection algorithm than GA. In addition, the combinations of IG with DT and IG with SVM perform best and second best for small and large scale datasets.

Keywords: data mining, feature selection, pattern classification, dimensionality reduction

Procedia PDF Downloads 669
26 Using Single Decision Tree to Assess the Impact of Cutting Conditions on Vibration

Authors: S. Ghorbani, N. I. Polushin

Abstract:

Vibration during the machining process is a critical concern, since it affects the cutting tool, machine, and workpiece, leading to tool wear, tool breakage, and unacceptable surface roughness. This paper applies a nonparametric statistical method, the single decision tree (SDT), to identify the factors affecting vibration in the machining process. Workpiece material (AISI 1045 Steel, AA2024 Aluminum alloy, A48-class30 Gray Cast Iron), cutting tool (conventional, cutting tool with holes in toolholder, cutting tool filled up with epoxy-granite), tool overhang (41-65 mm), spindle speed (630-1000 rpm), feed rate (0.05-0.075 mm/rev) and depth of cut (0.05-0.15 mm) were used as input variables, while vibration was the output parameter. It is concluded that workpiece material is the most important parameter for natural frequency, followed by cutting tool and overhang.

Keywords: cutting condition, vibration, natural frequency, decision tree, CART algorithm

Procedia PDF Downloads 337
25 Establishment of Precision System for Underground Facilities Based on 3D Absolute Positioning Technology

Authors: Yonggu Jang, Jisong Ryu, Woosik Lee

Abstract:

The study aims to address the limitations of existing underground facility exploration equipment in terms of exploration depth range, relative depth measurement, data processing time, and human-centered ground penetrating radar (GPR) image interpretation. The study proposed the use of 3D absolute positioning technology to develop a precision underground facility exploration system. The aim of this study is to establish a precise exploration system for underground facilities based on 3D absolute positioning technology, which can accurately survey up to a depth of 5 m and measure the precise 3D absolute location of underground facilities. The study developed software and hardware technologies to build the precision exploration system. The software technologies developed include absolute positioning technology, ground surface location synchronization technology of GPR exploration equipment, GPR exploration image AI interpretation technology, and integrated underground space map-based composite data processing technology. The hardware systems developed include a vehicle-type exploration system and a cart-type exploration system. The data were collected using the developed exploration system, which employs 3D absolute positioning technology. The GPR exploration images were analyzed using AI technology, and the three-dimensional location information of the explored underground facilities was compared to the integrated underground space map. The study successfully developed a precision underground facility exploration system based on 3D absolute positioning technology. The developed exploration system can accurately survey up to a depth of 5 m and measure the precise 3D absolute location of underground facilities. 
The system comprises software technologies that build a precise 3D DEM, synchronize the GPR sensor's ground surface 3D location coordinates, automatically analyze and detect underground facility information in GPR exploration images, and improve accuracy through comparative analysis of the three-dimensional location information. It also comprises hardware systems, including a vehicle-type exploration system and a cart-type exploration system. The study's findings and technological advancements are essential for underground safety management in Korea. The proposed precision exploration system contributes significantly to establishing precise location information for underground facilities, which is crucial for underground safety management, and improves the accuracy and efficiency of exploration. The study addressed the limitations of existing equipment in exploring underground facilities, proposed a precision exploration system based on 3D absolute positioning technology, developed software and hardware systems for the exploration system, and contributed to underground safety management by providing precise location information. The developed precision underground facility exploration system based on 3D absolute positioning technology has the potential to provide accurate and efficient exploration of underground facilities up to a depth of 5 m. The system's technological advancements contribute to the establishment of precise location information for underground facilities, which is essential for underground safety management in Korea.

Keywords: 3D absolute positioning, AI interpretation of GPR exploration images, complex data processing, integrated underground space maps, precision exploration system for underground facilities

Procedia PDF Downloads 62
24 Patient-Specific Modeling Algorithm for Medical Data Based on AUC

Authors: Guilherme Ribeiro, Alexandre Oliveira, Antonio Ferreira, Shyam Visweswaran, Gregory Cooper

Abstract:

Patient-specific models are instance-based learning algorithms that take advantage of the particular features of the patient case at hand to predict an outcome. We introduce two patient-specific algorithms based on the decision tree paradigm that use AUC as the metric for selecting an attribute. We apply the patient-specific algorithms to predict outcomes in several datasets, including medical datasets. Compared to the entropy-based patient-specific decision path (PSDP) and CART methods, the AUC-based patient-specific decision path models performed equivalently on area under the ROC curve (AUC). Our results provide support for patient-specific methods being a promising approach for making clinical predictions.
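For readers unfamiliar with the metric, AUC can be computed directly as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below illustrates only that definition with made-up scores; it is not the authors' attribute-selection procedure, and the function name `auc` is an assumption.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5).

    labels are 1 for positive cases and 0 for negative cases.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical predicted scores and true outcomes.
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

The pairwise form is quadratic in the number of cases, which is acceptable for a sketch; a rank-based computation is the usual choice for large datasets.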

Keywords: instance-based approach, area under the ROC curve, patient-specific decision path, clinical predictions

Procedia PDF Downloads 479
23 Knowledge Discovery and Data Mining Techniques in Textile Industry

Authors: Filiz Ersoz, Taner Ersoz, Erkin Guler

Abstract:

This paper addresses the application of data mining techniques to problems in the textile industry. Data mining was applied to data on the stitching of garment products obtained from a textile company. The techniques applied to the data were the CHAID algorithm, the CART algorithm, regression analysis, and artificial neural networks. Classification-based analyses were used for the data mining, and a decision model for production per person, together with the variables affecting production, was obtained by this method. The results show that as the daily working time increases, the production per person decreases. In addition, total daily working time shows the strongest relationship with production per person, and that relationship is negative.

Keywords: data mining, textile production, decision trees, classification

Procedia PDF Downloads 352
22 A Hybrid Model Tree and Logistic Regression Model for Prediction of Soil Shear Strength in Clay

Authors: Ehsan Mehryaar, Seyed Armin Motahari Tabari

Abstract:

Without a doubt, soil shear strength is the most important property of the soil. The majority of fatal and catastrophic geological accidents are related to shear strength failure of the soil. Therefore, its prediction is a matter of high importance. However, acquiring the shear strength is usually a cumbersome task that may require complicated laboratory testing. Predicting it from common, easily obtained soil properties can therefore simplify projects substantially. In this paper, a hybrid model based on the classification and regression tree algorithm and logistic regression is proposed, in which each leaf of the tree is an independent regression model. A database of 189 points for clay soil, including moisture content, liquid limit, plastic limit, clay content, and shear strength, is collected. The performance of the developed model is compared to that of existing models and equations using root mean squared error and coefficient of correlation.
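The idea of a tree whose leaves each hold their own regression model can be sketched with a single split. The example below is an illustration under stated assumptions, not the authors' model: it uses one input feature, one split, and ordinary least-squares lines in the two leaves (swapped in for the paper's leaf models, which are not specified here). All names and the toy numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; falls back to the mean if x is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def sse(xs, ys, model):
    """Sum of squared errors of a fitted line on the given points."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def fit_model_stump(xs, ys, min_leaf=2):
    """Choose the threshold that minimizes the combined SSE of the two leaf lines."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical feature values
        lx, ly = zip(*pairs[:i])
        rx, ry = zip(*pairs[i:])
        left_model, right_model = fit_line(lx, ly), fit_line(rx, ry)
        err = sse(lx, ly, left_model) + sse(rx, ry, right_model)
        if best is None or err < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (err, threshold, left_model, right_model)
    if best is None:
        raise ValueError("not enough distinct points to split")
    return best[1:]  # (threshold, left_model, right_model)


def predict(stump, x):
    """Route x to a leaf, then evaluate that leaf's regression line."""
    threshold, left_model, right_model = stump
    a, b = left_model if x <= threshold else right_model
    return a + b * x


# Made-up data with a different linear trend on each side of x = 3.5.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 16, 15, 14]
stump = fit_model_stump(xs, ys)
```

A full model tree would apply the same split search recursively and over several features, such as the moisture content and liquid limit listed in the abstract.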

Keywords: model tree, CART, logistic regression, soil shear strength

Procedia PDF Downloads 197
21 Spatial Data Mining by Decision Trees

Authors: Sihem Oujdi, Hafida Belbachir

Abstract:

Existing data mining methods cannot be applied directly to spatial data, because such data require consideration of spatial specificities such as spatial relationships. This paper focuses on classification with decision trees, which are one of the data mining techniques. We propose an extension of the C4.5 algorithm for spatial data, based on two different approaches: join materialization and querying the different tables on the fly. Similar work has been done on these two main approaches. The first, join materialization, favors processing time at the expense of memory space, whereas the second, querying the different tables on the fly, favors memory space at the expense of processing time. The modified C4.5 algorithm requires three input tables: a target table, a neighbor table, and a spatial join index that contains the possible spatial relationships among the objects in the target table and those in the neighbor table. The proposed algorithms are then applied to a spatial data pattern in the accidentology domain. A comparative study of our approach with other works on classification by spatial decision trees will be detailed.

Keywords: C4.5 algorithm, decision trees, S-CART, spatial data mining

Procedia PDF Downloads 615
20 Risk Factors of Becoming NEET Youth in Iran: A Machine Learning Approach

Authors: Hamed Rahmani, Wim Groot

Abstract:

The term "youth not in employment, education or training (NEET)" refers to a combination of youth unemployment and school dropout. This study investigates the variables that increase the risk of becoming NEET in Iran. A selection bias-adjusted Probit model was employed using machine learning to identify these risk factors. We used cross-sectional data obtained from the Statistical Centre of Iran and the Ministry of Cooperatives, Labour and Social Welfare, taken from the labour force survey conducted in the spring of 2021. We look at years of education, work experience, housework, the number of children under the age of six in the home, family education, birthplace, and the amount of land owned by households. Results show that hours spent performing domestic chores increase the likelihood of youth becoming NEET, while years of education and years of potential work experience decrease the chance of being NEET. The findings also show that female youth born in cities were less likely than those born in rural regions to become NEET.

Keywords: NEET youth, probit, CART, machine learning, unemployment

Procedia PDF Downloads 108
19 Digital Platform of Crops for Smart Agriculture

Authors: Pascal François Faye, Baye Mor Sall, Bineta Dembele, Jeanne Ana Awa Faye

Abstract:

In agriculture, estimating crop yields is key to improving productivity and decision-making processes such as financial market forecasting and addressing food security issues. The main objective of this paper is to provide tools to predict crop yields and improve the accuracy of yield forecasts using machine learning (ML) algorithms such as CART, KNN and SVM. We developed a mobile app and a web app that use these algorithms for practical use by farmers. The tests show that our system (collection and deployment architecture, web application and mobile application) is operational and validates empirical knowledge on agro-climatic parameters, in addition to providing proactive decision-making support. On the agricultural data, the performance of the ML algorithms is compared using cross-validation in order to identify the most effective ones for these data. The applications demonstrate that the proposed approach is effective in predicting crop yields and provides timely and accurate responses to farmers for decision support.
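Comparing algorithms by cross-validation, as this abstract describes, means scoring each model on held-out folds and averaging. The sketch below is a generic illustration with toy data, not the authors' pipeline; the helper names, the mean-value baseline, and the one-nearest-neighbor stand-in for KNN are all assumptions.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


def cross_val_mse(fit, predict, xs, ys, k=3):
    """Average held-out mean squared error over k folds."""
    fold_scores = []
    for train, test in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors = [(predict(model, xs[i]) - ys[i]) ** 2 for i in test]
        fold_scores.append(sum(errors) / len(errors))
    return sum(fold_scores) / len(fold_scores)


# Two illustrative models: a mean-value baseline and a 1-nearest-neighbor regressor.
def fit_mean(xs, ys):
    return sum(ys) / len(ys)

def predict_mean(model, x):
    return model

def fit_1nn(xs, ys):
    return list(zip(xs, ys))

def predict_1nn(model, x):
    return min(model, key=lambda point: abs(point[0] - x))[1]


# Toy data (hypothetical): yield grows linearly with a single input.
xs = list(range(10))
ys = [2 * x for x in xs]
mse_mean = cross_val_mse(fit_mean, predict_mean, xs, ys)
mse_1nn = cross_val_mse(fit_1nn, predict_1nn, xs, ys)
```

The folds here are contiguous and unshuffled to keep the sketch deterministic; with real field data, shuffling or grouping by season before splitting is usually the safer choice.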

Keywords: prediction, machine learning, artificial intelligence, digital agriculture

Procedia PDF Downloads 80
18 Using Data-Driven Model on Online Customer Journey

Authors: Ing-Jen Hung, Tzu-Chien Wang

Abstract:

Nowadays, customers can easily interact with firms through miscellaneous online ads on different channels. In other words, customers now have innumerable options and limitless time to accomplish their commercial activities with firms, individualizing their own online customer journey. This convenience emphasizes the importance of allocating online advertisements across different channels. A profound understanding of customer behavior can therefore yield considerable benefit by optimizing fund allocation across diverse ad channels. To achieve this objective, many firms use numerical methods to create data-driven advertisement policies. In our research, we aim to exploit online customer click data to discover the correlations between channels and their sequential relations. We use LSTM to handle the sequential property of our data and compare its accuracy with that of non-sequential methods, such as the CART decision tree and logistic regression. We also classify our customers into several groups by their behavioral characteristics, treating the differences between groups as customer portraits. As a result, we discover a distinct customer journey under each customer portrait. Our article provides some insights into marketing research and can help firms formulate online advertising criteria.

Keywords: LSTM, customer journey, marketing, channel ads

Procedia PDF Downloads 121
17 Cytotoxic Effect of Neem Seed Extract (Azadirachta indica) in Comparison with Artificial Insecticide Novastar on Haemocytes (THC and DHC) of Musca domestica

Authors: Muhammad Zaheer Awan, Adnan Qadir, Zeeshan Anjum

Abstract:

The housefly, Musca domestica Linnaeus, is ubiquitous and hazardous to Homo sapiens and livestock in various ways. Musca domestica carries 100 different pathogens, such as those of typhoid, salmonella, bacillary dysentery, tuberculosis, anthrax and parasitic worms. Flies in rural areas usually carry more pathogens. Houseflies feed on liquid or semi-liquid substances, as well as solid materials that are softened by saliva. Neem, botanically known as Azadirachta indica, belongs to the family Meliaceae and is a tree indigenous to Pakistan. The neem tree has also been revered by Pakistanis and Kashmiris for its medicinal properties. The present study showed that neem seed extract has a potentially toxic effect on Total Haemocyte Count (THC) and Differential Haemocyte Count (DHC) in the blood cells of the housefly. A significant variation in haemolymph density, in terms of THC and DHC, was observed just after application and at 30 minutes and 60 minutes post-treatment, in comparison with novastar. The study strongly recommends the use of neem seed extract as an insecticide in preference to artificial insecticides.

Keywords: neem, Azadirachta indica, Musca domestica, differential haemocyte count (DHC), total haemocyte count (THC), novastar

Procedia PDF Downloads 205
16 E-Payments, COVID-19 Restrictions, and Currency in Circulation: Thailand and Turkey

Authors: Zeliha Sayar

Abstract:

Central banks all over the world appear to be focusing first and foremost on retail central bank digital currency (CBDC), i.e., digital cash/money. This approach is predicated on the belief that the use of cash has decreased, owing primarily to technological advancements and pandemic restrictions, and that a suitable foundation for the transition to a cashless society has emerged. This study aims to contribute to the debate over whether digital money/CBDC can be a substitute for or a supplement to physical cash by examining the potential effects on cash demand. To that end, this paper compares two emerging countries, Turkey and Thailand, to clarify the impact of e-payment and COVID-19 restrictions on cash demand by employing fully modified ordinary least squares (FMOLS), dynamic ordinary least squares (DOLS), and canonical cointegrating regression (CCR). The currency in circulation in the two countries was examined in order to estimate the elasticity of different types of retail payments. The results demonstrate that real internet and mobile, card, contactless payment, and e-money are long-term determinants of real cash demand in these two developing countries. Furthermore, with the exception of contactless payments in Turkey, there is a positive relationship between the currency in circulation and the various types of retail payments. According to the findings, COVID-19 restrictions encourage the demand for cash, resulting in cash hoarding.

Keywords: CCR, DOLS, e-money, FMOLS, real cash

Procedia PDF Downloads 107
15 Performance Study of Classification Algorithms for Consumer Online Shopping Attitudes and Behavior Using Data Mining

Authors: Rana Alaa El-Deen Ahmed, M. Elemam Shehab, Shereen Morsy, Nermeen Mekawie

Abstract:

With the growing popularity and acceptance of e-commerce platforms, users face an ever-increasing burden in choosing the right product from the large number of online offers. Users therefore need techniques for personalization and shopping guidance. For a pleasant and successful shopping experience, users need to know easily, and with high confidence, which products to buy. Because the popularity of online stores has made it easier to sell a wide variety of products, online retailers are able to sell more products than a physical store. The disadvantage is that customers might not find the products they need. Recommender systems, which are used in some e-commerce web sites, help the customer find the products he is searching for. A recommender system learns from information about customers and products and provides appropriate personalized recommendations that help customers find the needed product. In this paper, eleven classification algorithms are comparatively tested to find the best classifier for consumer online shopping attitudes and behavior in the experimental dataset. The WEKA knowledge analysis tool, an open source data mining workbench, was used to compare conventional classifiers and identify the best one. With the classifiers tested in WEKA, the results show that the decision table and the filtered classifier give the highest accuracy, while classification via clustering and simple CART give the lowest.

Keywords: classification, data mining, machine learning, online shopping, WEKA

Procedia PDF Downloads 352