Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 217

Search results for: nearest neighbour

217 Human Gait Recognition Using Moment with Fuzzy

Authors: Jyoti Bharti, Navneet Manjhi, M. K.Gupta, Bimi Jain

Abstract:

A reliable gait features are required to extract the gait sequences from an images. In this paper suggested a simple method for gait identification which is based on moments. Moment values are extracted on different number of frames of gray scale and silhouette images of CASIA database. These moment values are considered as feature values. Fuzzy logic and nearest neighbour classifier are used for classification. Both achieved higher recognition.

Keywords: gait, fuzzy logic, nearest neighbour, recognition rate, moments

Procedia PDF Downloads 519
216 An Analysis of Classification of Imbalanced Datasets by Using Synthetic Minority Over-Sampling Technique

Authors: Ghada A. Alfattni

Abstract:

Analysing unbalanced datasets is one of the challenges that practitioners in machine learning field face. However, many researches have been carried out to determine the effectiveness of the use of the synthetic minority over-sampling technique (SMOTE) to address this issue. The aim of this study was therefore to compare the effectiveness of the SMOTE over different models on unbalanced datasets. Three classification models (Logistic Regression, Support Vector Machine and Nearest Neighbour) were tested with multiple datasets, then the same datasets were oversampled by using SMOTE and applied again to the three models to compare the differences in the performances. Results of experiments show that the highest number of nearest neighbours gives lower values of error rates. 

Keywords: imbalanced datasets, SMOTE, machine learning, logistic regression, support vector machine, nearest neighbour

Procedia PDF Downloads 251
215 Urban Land Cover from GF-2 Satellite Images Using Object Based and Neural Network Classifications

Authors: Lamyaa Gamal El-Deen Taha, Ashraf Sharawi

Abstract:

China launched satellite GF-2 in 2014. This study deals with comparing nearest neighbor object-based classification and neural network classification methods for classification of the fused GF-2 image. Firstly, rectification of GF-2 image was performed. Secondly, a comparison between nearest neighbor object-based classification and neural network classification for classification of fused GF-2 was performed. Thirdly, the overall accuracy of classification and kappa index were calculated. Results indicate that nearest neighbor object-based classification is better than neural network classification for urban mapping.

Keywords: GF-2 images, feature extraction-rectification, nearest neighbour object based classification, segmentation algorithms, neural network classification, multilayer perceptron

Procedia PDF Downloads 240
214 Fraud Detection in Credit Cards with Machine Learning

Authors: Anjali Chouksey, Riya Nimje, Jahanvi Saraf

Abstract:

Online transactions have increased dramatically in this new ‘social-distancing’ era. With online transactions, Fraud in online payments has also increased significantly. Frauds are a significant problem in various industries like insurance companies, baking, etc. These frauds include leaking sensitive information related to the credit card, which can be easily misused. Due to the government also pushing online transactions, E-commerce is on a boom. But due to increasing frauds in online payments, these E-commerce industries are suffering a great loss of trust from their customers. These companies are finding credit card fraud to be a big problem. People have started using online payment options and thus are becoming easy targets of credit card fraud. In this research paper, we will be discussing machine learning algorithms. We have used a decision tree, XGBOOST, k-nearest neighbour, logistic-regression, random forest, and SVM on a dataset in which there are transactions done online mode using credit cards. We will test all these algorithms for detecting fraud cases using the confusion matrix, F1 score, and calculating the accuracy score for each model to identify which algorithm can be used in detecting frauds.

Keywords: machine learning, fraud detection, artificial intelligence, decision tree, k nearest neighbour, random forest, XGBOOST, logistic regression, support vector machine

Procedia PDF Downloads 53
213 Detecting of Crime Hot Spots for Crime Mapping

Authors: Somayeh Nezami

Abstract:

The management of financial and human resources of police in metropolitans requires many information and exact plans to reduce a rate of crime and increase the safety of the society. Geographical Information Systems have an important role in providing crime maps and their analysis. By using them and identification of crime hot spots along with spatial presentation of the results, it is possible to allocate optimum resources while presenting effective methods for decision making and preventive solutions. In this paper, we try to explain and compare between some of the methods of hot spots analysis such as Mode, Fuzzy Mode and Nearest Neighbour Hierarchical spatial clustering (NNH). Then the spots with the highest crime rates of drug smuggling for one province in Iran with borderline with Afghanistan are obtained. We will show that among these three methods NNH leads to the best result.

Keywords: GIS, Hot spots, nearest neighbor hierarchical spatial clustering, NNH, spatial analysis of crime

Procedia PDF Downloads 243
212 SC-LSH: An Efficient Indexing Method for Approximate Similarity Search in High Dimensional Space

Authors: Sanaa Chafik, Imane Daoudi, Mounim A. El Yacoubi, Hamid El Ouardi

Abstract:

Locality Sensitive Hashing (LSH) is one of the most promising techniques for solving nearest neighbour search problem in high dimensional space. Euclidean LSH is the most popular variation of LSH that has been successfully applied in many multimedia applications. However, the Euclidean LSH presents limitations that affect structure and query performances. The main limitation of the Euclidean LSH is the large memory consumption. In order to achieve a good accuracy, a large number of hash tables is required. In this paper, we propose a new hashing algorithm to overcome the storage space problem and improve query time, while keeping a good accuracy as similar to that achieved by the original Euclidean LSH. The Experimental results on a real large-scale dataset show that the proposed approach achieves good performances and consumes less memory than the Euclidean LSH.

Keywords: approximate nearest neighbor search, content based image retrieval (CBIR), curse of dimensionality, locality sensitive hashing, multidimensional indexing, scalability

Procedia PDF Downloads 251
211 Fuzzy-Machine Learning Models for the Prediction of Fire Outbreak: A Comparative Analysis

Authors: Uduak Umoh, Imo Eyoh, Emmauel Nyoho

Abstract:

This paper compares fuzzy-machine learning algorithms such as Support Vector Machine (SVM), and K-Nearest Neighbor (KNN) for the predicting cases of fire outbreak. The paper uses the fire outbreak dataset with three features (Temperature, Smoke, and Flame). The data is pre-processed using Interval Type-2 Fuzzy Logic (IT2FL) algorithm. Min-Max Normalization and Principal Component Analysis (PCA) are used to predict feature labels in the dataset, normalize the dataset, and select relevant features respectively. The output of the pre-processing is a dataset with two principal components (PC1 and PC2). The pre-processed dataset is then used in the training of the aforementioned machine learning models. K-fold (with K=10) cross-validation method is used to evaluate the performance of the models using the matrices – ROC (Receiver Operating Curve), Specificity, and Sensitivity. The model is also tested with 20% of the dataset. The validation result shows KNN is the better model for fire outbreak detection with an ROC value of 0.99878, followed by SVM with an ROC value of 0.99753.

Keywords: Machine Learning Algorithms , Interval Type-2 Fuzzy Logic, Fire Outbreak, Support Vector Machine, K-Nearest Neighbour, Principal Component Analysis

Procedia PDF Downloads 60
210 Real Time Lidar and Radar High-Level Fusion for Obstacle Detection and Tracking with Evaluation on a Ground Truth

Authors: Hatem Hajri, Mohamed-Cherif Rahal

Abstract:

Both Lidars and Radars are sensors for obstacle detection. While Lidars are very accurate on obstacles positions and less accurate on their velocities, Radars are more precise on obstacles velocities and less precise on their positions. Sensor fusion between Lidar and Radar aims at improving obstacle detection using advantages of the two sensors. The present paper proposes a real-time Lidar/Radar data fusion algorithm for obstacle detection and tracking based on the global nearest neighbour standard filter (GNN). This algorithm is implemented and embedded in an automative vehicle as a component generated by a real-time multisensor software. The benefits of data fusion comparing with the use of a single sensor are illustrated through several tracking scenarios (on a highway and on a bend) and using real-time kinematic sensors mounted on the ego and tracked vehicles as a ground truth.

Keywords: ground truth, Hungarian algorithm, lidar Radar data fusion, global nearest neighbor filter

Procedia PDF Downloads 77
209 Neighbour Cell List Reduction in Multi-Tier Heterogeneous Networks

Authors: Mohanad Alhabo, Naveed Nawaz

Abstract:

The ongoing call or data session must be maintained to ensure a good quality of service. This can be accomplished by performing the handover procedure while the user is on the move. However, the dense deployment of small cells in 5G networks is a challenging issue due to the extensive number of handovers. In this paper, a neighbour cell list method is proposed to reduce the number of target small cells and hence minimizing the number of handovers. The neighbour cell list is built by omitting cells that could cause an unnecessary handover and handover failure because of short time of stay of the user in these cells. A multi-attribute decision making technique, simple additive weighting, is then applied to the optimized neighbour cell list. Multi-tier small cells network is considered in this work. The performance of the proposed method is analysed and compared with that of the existing methods. Results disclose that our method has decreased the candidate small cell list, unnecessary handovers, handover failure, and short time of stay cells compared to the competitive method.

Keywords: handover, HetNets, multi-attribute decision making, small cells

Procedia PDF Downloads 48
208 Application of Fuzzy Approach to the Vibration Fault Diagnosis

Authors: Jalel Khelil

Abstract:

In order to improve reliability of Gas Turbine machine especially its generator equipment, a fault diagnosis system based on fuzzy approach is proposed. Three various methods namely K-NN (K-nearest neighbors), F-KNN (Fuzzy K-nearest neighbors) and FNM (Fuzzy nearest mean) are adopted to provide the measurement of relative strength of vibration defaults. Both applications consist of two major steps: Feature extraction and default classification. 09 statistical features are extracted from vibration signals. 03 different classes are used in this study which describes vibrations condition: Normal, unbalance defect, and misalignment defect. The use of the fuzzy approaches and the classification results are discussed. Results show that these approaches yield high successful rates of vibration default classification.

Keywords: fault diagnosis, fuzzy classification k-nearest neighbor, vibration

Procedia PDF Downloads 387
207 Hybrid Fuzzy Weighted K-Nearest Neighbor to Predict Hospital Readmission for Diabetic Patients

Authors: Soha A. Bahanshal, Byung G. Kim

Abstract:

Identification of patients at high risk for hospital readmission is of crucial importance for quality health care and cost reduction. Predicting hospital readmissions among diabetic patients has been of great interest to many researchers and health decision makers. We build a prediction model to predict hospital readmission for diabetic patients within 30 days of discharge. The core of the prediction model is a modified k Nearest Neighbor called Hybrid Fuzzy Weighted k Nearest Neighbor algorithm. The prediction is performed on a patient dataset which consists of more than 70,000 patients with 50 attributes. We applied data preprocessing using different techniques in order to handle data imbalance and to fuzzify the data to suit the prediction algorithm. The model so far achieved classification accuracy of 80% compared to other models that only use k Nearest Neighbor.

Keywords: machine learning, prediction, classification, hybrid fuzzy weighted k-nearest neighbor, diabetic hospital readmission

Procedia PDF Downloads 84
206 Nearest Neighbor Investigate Using R+ Tree

Authors: Rutuja Desai

Abstract:

Search engine is fundamentally a framework used to search the data which is pertinent to the client via WWW. Looking close-by spot identified with the keywords is an imperative concept in developing web advances. For such kind of searching, extent pursuit or closest neighbor is utilized. In range search the forecast is made whether the objects meet to query object. Nearest neighbor is the forecast of the focuses close to the query set by the client. Here, the nearest neighbor methodology is utilized where Data recovery R+ tree is utilized rather than IR2 tree. The disadvantages of IR2 tree is: The false hit number can surpass the limit and the mark in Information Retrieval R-tree must have Voice over IP bit for each one of a kind word in W set is recouped by Data recovery R+ tree. The inquiry is fundamentally subordinate upon the key words and the geometric directions.

Keywords: information retrieval, nearest neighbor search, keyword search, R+ tree

Procedia PDF Downloads 207
205 Contourlet Transform and Local Binary Pattern Based Feature Extraction for Bleeding Detection in Endoscopic Images

Authors: Mekha Mathew, Varun P Gopi

Abstract:

Wireless Capsule Endoscopy (WCE) has become a great device in Gastrointestinal (GI) tract diagnosis, which can examine the entire GI tract, especially the small intestine without invasiveness and sedation. Bleeding in the digestive tract is a symptom of a disease rather than a disease itself. Hence the detection of bleeding is important in diagnosing many diseases. In this paper we proposes a novel method for distinguishing bleeding regions from normal regions based on Contourlet transform and Local Binary Pattern (LBP). Experiments show that this method provides a high accuracy rate of 96.38% in CIE XYZ colour space for k-Nearest Neighbour (k-NN) classifier.

Keywords: Wireless Capsule Endoscopy, local binary pattern, k-NN classifier, contourlet transform

Procedia PDF Downloads 397
204 Kernel-Based Double Nearest Proportion Feature Extraction for Hyperspectral Image Classification

Authors: Hung-Sheng Lin, Cheng-Hsuan Li

Abstract:

Over the past few years, kernel-based algorithms have been widely used to extend some linear feature extraction methods such as principal component analysis (PCA), linear discriminate analysis (LDA), and nonparametric weighted feature extraction (NWFE) to their nonlinear versions, kernel principal component analysis (KPCA), generalized discriminate analysis (GDA), and kernel nonparametric weighted feature extraction (KNWFE), respectively. These nonlinear feature extraction methods can detect nonlinear directions with the largest nonlinear variance or the largest class separability based on the given kernel function. Moreover, they have been applied to improve the target detection or the image classification of hyperspectral images. The double nearest proportion feature extraction (DNP) can effectively reduce the overlap effect and have good performance in hyperspectral image classification. The DNP structure is an extension of the k-nearest neighbor technique. For each sample, there are two corresponding nearest proportions of samples, the self-class nearest proportion and the other-class nearest proportion. The term “nearest proportion” used here consider both the local information and other more global information. With these settings, the effect of the overlap between the sample distributions can be reduced. Usually, the maximum likelihood estimator and the related unbiased estimator are not ideal estimators in high dimensional inference problems, particularly in small data-size situation. Hence, an improved estimator by shrinkage estimation (regularization) is proposed. Based on the DNP structure, LDA is included as a special case. In this paper, the kernel method is applied to extend DNP to kernel-based DNP (KDNP). In addition to the advantages of DNP, KDNP surpasses DNP in the experimental results. According to the experiments on the real hyperspectral image data sets, the classification performance of KDNP is better than that of PCA, LDA, NWFE, and their kernel versions, KPCA, GDA, and KNWFE.

Keywords: feature extraction, kernel method, double nearest proportion feature extraction, kernel double nearest feature extraction

Procedia PDF Downloads 226
203 Identity Verification Using k-NN Classifiers and Autistic Genetic Data

Authors: Fuad M. Alkoot

Abstract:

DNA data have been used in forensics for decades. However, current research looks at using the DNA as a biometric identity verification modality. The goal is to improve the speed of identification. We aim at using gene data that was initially used for autism detection to find if and how accurate is this data for identification applications. Mainly our goal is to find if our data preprocessing technique yields data useful as a biometric identification tool. We experiment with using the nearest neighbor classifier to identify subjects. Results show that optimal classification rate is achieved when the test set is corrupted by normally distributed noise with zero mean and standard deviation of 1. The classification rate is close to optimal at higher noise standard deviation reaching 3. This shows that the data can be used for identity verification with high accuracy using a simple classifier such as the k-nearest neighbor (k-NN). 

Keywords: biometrics, genetic data, identity verification, k nearest neighbor

Procedia PDF Downloads 172
202 A Reliable Multi-Type Vehicle Classification System

Authors: Ghada S. Moussa

Abstract:

Vehicle classification is an important task in traffic surveillance and intelligent transportation systems. Classification of vehicle images is facing several problems such as: high intra-class vehicle variations, occlusion, shadow, illumination. These problems and others must be considered to develop a reliable vehicle classification system. In this study, a reliable multi-type vehicle classification system based on Bag-of-Words (BoW) paradigm is developed. Our proposed system used and compared four well-known classifiers; Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), k-Nearest Neighbour (KNN), and Decision Tree to classify vehicles into four categories: motorcycles, small, medium and large. Experiments on a large dataset show that our approach is efficient and reliable in classifying vehicles with accuracy of 95.7%. The SVM outperforms other classification algorithms in terms of both accuracy and robustness alongside considerable reduction in execution time. The innovativeness of developed system is it can serve as a framework for many vehicle classification systems.

Keywords: vehicle classification, bag-of-words technique, SVM classifier, LDA classifier, KNN classifier, decision tree classifier, SIFT algorithm

Procedia PDF Downloads 260
201 Machine Learning Approach for Predicting Students’ Academic Performance and Study Strategies Based on Their Motivation

Authors: Fidelia A. Orji, Julita Vassileva

Abstract:

This research aims to develop machine learning models for students' academic performance and study strategy prediction, which could be generalized to all courses in higher education. Key learning attributes (intrinsic, extrinsic, autonomy, relatedness, competence, and self-esteem) used in building the models are chosen based on prior studies, which revealed that the attributes are essential in students’ learning process. Previous studies revealed the individual effects of each of these attributes on students’ learning progress. However, few studies have investigated the combined effect of the attributes in predicting student study strategy and academic performance to reduce the dropout rate. To bridge this gap, we used Scikit-learn in python to build five machine learning models (Decision Tree, K-Nearest Neighbour, Random Forest, Linear/Logistic Regression, and Support Vector Machine) for both regression and classification tasks to perform our analysis. The models were trained, evaluated, and tested for accuracy using 924 university dentistry students' data collected by Chilean authors through quantitative research design. A comparative analysis of the models revealed that the tree-based models such as the random forest (with prediction accuracy of 94.9%) and decision tree show the best results compared to the linear, support vector, and k-nearest neighbours. The models built in this research can be used in predicting student performance and study strategy so that appropriate interventions could be implemented to improve student learning progress. Thus, incorporating strategies that could improve diverse student learning attributes in the design of online educational systems may increase the likelihood of students continuing with their learning tasks as required. Moreover, the results show that the attributes could be modelled together and used to adapt/personalize the learning process.

Keywords: classification models, learning strategy, predictive modeling, regression models, student academic performance, student motivation, supervised machine learning

Procedia PDF Downloads 5
200 A Comparative Study of k-NN and MLP-NN Classifiers Using GA-kNN Based Feature Selection Method for Wood Recognition System

Authors: Uswah Khairuddin, Rubiyah Yusof, Nenny Ruthfalydia Rosli

Abstract:

This paper presents a comparative study between k-Nearest Neighbour (k-NN) and Multi-Layer Perceptron Neural Network (MLP-NN) classifier using Genetic Algorithm (GA) as feature selector for wood recognition system. The features have been extracted from the images using Grey Level Co-Occurrence Matrix (GLCM). The use of GA based feature selection is mainly to ensure that the database used for training the features for the wood species pattern classifier consists of only optimized features. The feature selection process is aimed at selecting only the most discriminating features of the wood species to reduce the confusion for the pattern classifier. This feature selection approach maintains the ‘good’ features that minimizes the inter-class distance and maximizes the intra-class distance. Wrapper GA is used with k-NN classifier as fitness evaluator (GA-kNN). The results shows that k-NN is the best choice of classifier because it uses a very simple distance calculation algorithm and classification tasks can be done in a short time with good classification accuracy.

Keywords: feature selection, genetic algorithm, optimization, wood recognition system

Procedia PDF Downloads 458
199 Comparison of Heuristic Methods for Solving Traveling Salesman Problem

Authors: Regita P. Permata, Ulfa S. Nuraini

Abstract:

Traveling Salesman Problem (TSP) is the most studied problem in combinatorial optimization. In simple language, TSP can be described as a problem of finding a minimum distance tour to a city, starting and ending in the same city, and exactly visiting another city. In product distribution, companies often get problems in determining the minimum distance that affects the time allocation. In this research, we aim to apply TSP heuristic methods to simulate nodes as city coordinates in product distribution. The heuristics used are sub tour reversal, nearest neighbor, farthest insertion, cheapest insertion, nearest insertion, and arbitrary insertion. We have done simulation nodes using Euclidean distances to compare the number of cities and processing time, thus we get optimum heuristic method. The results show that the optimum heuristic methods are farthest insertion and nearest insertion. These two methods can be recommended to solve product distribution problems in certain companies.

Keywords: Euclidean, heuristics, simulation, TSP

Procedia PDF Downloads 48
198 Effective Charge Coupling in Low Dimensional Doped Quantum Antiferromagnets

Authors: Suraka Bhattacharjee, Ranjan Chaudhury

Abstract:

The interaction between the charge degrees of freedom for itinerant antiferromagnets is investigated in terms of generalized charge stiffness constant corresponding to nearest neighbour t-J model and t1-t2-t3-J model. The low dimensional hole doped antiferromagnets are the well known systems that can be described by the t-J-like models. Accordingly, we have used these models to investigate the fermionic pairing possibilities and the coupling between the itinerant charge degrees of freedom. A detailed comparison between spin and charge couplings highlights that the charge and spin couplings show very similar behaviour in the over-doped region, whereas, they show completely different trends in the lower doping regimes. Moreover, a qualitative equivalence between generalized charge stiffness and effective Coulomb interaction is also established based on the comparisons with other theoretical and experimental results. Thus it is obvious that the enhanced possibility of fermionic pairing is inherent in the reduction of Coulomb repulsion with increase in doping concentration. However, the increased possibility can not give rise to pairing without the presence of any other pair producing mechanism outside the t-J model. Therefore, one can conclude that the t-J-like models themselves solely are not capable of producing conventional momentum-based superconducting pairing on their own.

Keywords: generalized charge stiffness constant, charge coupling, effective Coulomb interaction, t-J-like models, momentum-space pairing

Procedia PDF Downloads 55
197 Performance Comparison of Different Regression Methods for a Polymerization Process with Adaptive Sampling

Authors: Florin Leon, Silvia Curteanu

Abstract:

Developing complete mechanistic models for polymerization reactors is not easy, because complex reactions occur simultaneously; there is a large number of kinetic parameters involved and sometimes the chemical and physical phenomena for mixtures involving polymers are poorly understood. To overcome these difficulties, empirical models based on sampled data can be used instead, namely regression methods typical of machine learning field. They have the ability to learn the trends of a process without any knowledge about its particular physical and chemical laws. Therefore, they are useful for modeling complex processes, such as the free radical polymerization of methyl methacrylate achieved in a batch bulk process. The goal is to generate accurate predictions of monomer conversion, numerical average molecular weight and gravimetrical average molecular weight. This process is associated with non-linear gel and glass effects. For this purpose, an adaptive sampling technique is presented, which can select more samples around the regions where the values have a higher variation. Several machine learning methods are used for the modeling and their performance is compared: support vector machines, k-nearest neighbor, k-nearest neighbor and random forest, as well as an original algorithm, large margin nearest neighbor regression. The suggested method provides very good results compared to the other well-known regression algorithms.

Keywords: batch bulk methyl methacrylate polymerization, adaptive sampling, machine learning, large margin nearest neighbor regression

Procedia PDF Downloads 209
196 Diabetes Diagnosis Model Using Rough Set and K- Nearest Neighbor Classifier

Authors: Usiobaifo Agharese Rosemary, Osaseri Roseline Oghogho

Abstract:

Diabetes is a complex group of disease with a variety of causes; it is a disorder of the body metabolism in the digestion of carbohydrates food. The application of machine learning in the field of medical diagnosis has been the focus of many researchers and the use of recognition and classification model as a decision support tools has help the medical expert in diagnosis of diseases. Considering the large volume of medical data which require special techniques, experience, and high diagnostic skill in the diagnosis of diseases, the application of an artificial intelligent system to assist medical personnel in order to enhance their efficiency and accuracy in diagnosis will be an invaluable tool. In this study will propose a diabetes diagnosis model using rough set and K-nearest Neighbor classifier algorithm. The system consists of two modules: the feature extraction module and predictor module, rough data set is used to preprocess the attributes while K-nearest neighbor classifier is used to classify the given data. The dataset used for this model was taken for University of Benin Teaching Hospital (UBTH) database. Half of the data was used in the training while the other half was used in testing the system. The proposed model was able to achieve over 80% accuracy.

Keywords: classifier algorithm, diabetes, diagnostic model, machine learning

Procedia PDF Downloads 253
195 A Psychophysiological Evaluation of an Effective Recognition Technique Using Interactive Dynamic Virtual Environments

Authors: Mohammadhossein Moghimi, Robert Stone, Pia Rotshtein

Abstract:

Recording psychological and physiological correlates of human performance within virtual environments and interpreting their impacts on human engagement, ‘immersion’ and related emotional or ‘effective’ states is both academically and technologically challenging. By exposing participants to an effective, real-time (game-like) virtual environment, designed and evaluated in an earlier study, a psychophysiological database containing the EEG, GSR and Heart Rate of 30 male and female gamers, exposed to 10 games, was constructed. Some 174 features were subsequently identified and extracted from a number of windows, with 28 different timing lengths (e.g. 2, 3, 5, etc. seconds). After reducing the number of features to 30, using a feature selection technique, K-Nearest Neighbour (KNN) and Support Vector Machine (SVM) methods were subsequently employed for the classification process. The classifiers categorised the psychophysiological database into four effective clusters (defined based on a 3-dimensional space – valence, arousal and dominance) and eight emotion labels (relaxed, content, happy, excited, angry, afraid, sad, and bored). The KNN and SVM classifiers achieved average cross-validation accuracies of 97.01% (±1.3%) and 92.84% (±3.67%), respectively. However, no significant differences were found in the classification process based on effective clusters or emotion labels.

Keywords: virtual reality, effective computing, effective VR, emotion-based effective physiological database

Procedia PDF Downloads 159
194 A Selection Approach: Discriminative Model for Nominal Attributes-Based Distance Measures

Authors: Fang Gong

Abstract:

Distance measures are an indispensable part of many instance-based learning (IBL) and machine learning (ML) algorithms. The value difference metrics (VDM) and inverted specific-class distance measure (ISCDM) are among the top-performing distance measures that address nominal attributes. VDM performs well in some domains owing to its simplicity and poorly in others that exist missing value and non-class attribute noise. ISCDM, however, typically works better than VDM on such domains. To maximize their advantages and avoid disadvantages, in this paper, a selection approach: a discriminative model for nominal attributes-based distance measures is proposed. More concretely, VDM and ISCDM are built independently on a training dataset at the training stage, and the most credible one is recorded for each training instance. At the test stage, its nearest neighbor for each test instance is primarily found by any of VDM and ISCDM and then chooses the most reliable model of its nearest neighbor to predict its class label. It is simply denoted as a discriminative distance measure (DDM). Experiments are conducted on the 34 University of California at Irvine (UCI) machine learning repository datasets, and it shows DDM retains the interpretability and simplicity of VDM and ISCDM but significantly outperforms the original VDM and ISCDM and other state-of-the-art competitors in terms of accuracy.

Keywords: distance measure, discriminative model, nominal attributes, nearest neighbor

Procedia PDF Downloads 46
193 Using Predictive Analytics to Identify First-Year Engineering Students at Risk of Failing

Authors: Beng Yew Low, Cher Liang Cha, Cheng Yong Teoh

Abstract:

Due to a lack of continual assessment or grade related data, identifying first-year engineering students in a polytechnic education at risk of failing is challenging. Our experience over the years tells us that there is no strong correlation between having good entry grades in Mathematics and the Sciences and excelling in hardcore engineering subjects. Hence, identifying students at risk of failure cannot be on the basis of entry grades in Mathematics and the Sciences alone. These factors compound the difficulty of early identification and intervention. This paper describes the development of a predictive analytics model in the early detection of students at risk of failing and evaluates its effectiveness. Data from continual assessments conducted in term one, supplemented by data of student psychological profiles such as interests and study habits, were used. Three classification techniques, namely Logistic Regression, K Nearest Neighbour, and Random Forest, were used in our predictive model. Based on our findings, Random Forest was determined to be the strongest predictor with an Area Under the Curve (AUC) value of 0.994. Correspondingly, the Accuracy, Precision, Recall, and F-Score were also highest among these three classifiers. Using this Random Forest Classification technique, students at risk of failure could be identified at the end of term one. They could then be assigned to a Learning Support Programme at the beginning of term two. This paper gathers the results of our findings. It also proposes further improvements that can be made to the model.

Keywords: continual assessment, predictive analytics, random forest, student psychological profile

Procedia PDF Downloads 43
192 The Classification Performance in Parametric and Nonparametric Discriminant Analysis for a Class- Unbalanced Data of Diabetes Risk Groups

Authors: Lily Ingsrisawang, Tasanee Nacharoen

Abstract:

Introduction: The problems of unbalanced data sets generally appear in real world applications. Due to unequal class distribution, many research papers found that the performance of existing classifier tends to be biased towards the majority class. The k -nearest neighbors’ nonparametric discriminant analysis is one method that was proposed for classifying unbalanced classes with good performance. Hence, the methods of discriminant analysis are of interest to us in investigating misclassification error rates for class-imbalanced data of three diabetes risk groups. Objective: The purpose of this study was to compare the classification performance between parametric discriminant analysis and nonparametric discriminant analysis in a three-class classification application of class-imbalanced data of diabetes risk groups. Methods: Data from a healthy project for 599 staffs in a government hospital in Bangkok were obtained for the classification problem. The staffs were diagnosed into one of three diabetes risk groups: non-risk (90%), risk (5%), and diabetic (5%). The original data along with the variables; diabetes risk group, age, gender, cholesterol, and BMI was analyzed and bootstrapped up to 50 and 100 samples, 599 observations per sample, for additional estimation of misclassification error rate. Each data set was explored for the departure of multivariate normality and the equality of covariance matrices of the three risk groups. Both the original data and the bootstrap samples show non-normality and unequal covariance matrices. The parametric linear discriminant function, quadratic discriminant function, and the nonparametric k-nearest neighbors’ discriminant function were performed over 50 and 100 bootstrap samples and applied to the original data. In finding the optimal classification rule, the choices of prior probabilities were set up for both equal proportions (0.33: 0.33: 0.33) and unequal proportions with three choices of (0.90:0.05:0.05), (0.80: 0.10: 0.10) or (0.70, 0.15, 0.15). Results: The results from 50 and 100 bootstrap samples indicated that the k-nearest neighbors approach when k = 3 or k = 4 and the prior probabilities of {non-risk:risk:diabetic} as {0.90:0.05:0.05} or {0.80:0.10:0.10} gave the smallest error rate of misclassification. Conclusion: The k-nearest neighbors approach would be suggested for classifying a three-class-imbalanced data of diabetes risk groups.

Keywords: error rate, bootstrap, diabetes risk groups, k-nearest neighbors

Procedia PDF Downloads 346
191 Searching k-Nearest Neighbors to be Appropriate under Gaming Environments

Authors: Jae Moon Lee

Abstract:

In general, algorithms to find continuous k-nearest neighbors have been researched on the location based services, monitoring periodically the moving objects such as vehicles and mobile phone. Those researches assume the environment that the number of query points is much less than that of moving objects and the query points are not moved but fixed. In gaming environments, this problem is when computing the next movement considering the neighbors such as flocking, crowd and robot simulations. In this case, every moving object becomes a query point so that the number of query point is same to that of moving objects and the query points are also moving. In this paper, we analyze the performance of the existing algorithms focused on location based services how they operate under gaming environments.

Keywords: flocking behavior, heterogeneous agents, similarity, simulation

Procedia PDF Downloads 222
190 The Selection of the Nearest Anchor Using Received Signal Strength Indication (RSSI)

Authors: Hichem Sassi, Tawfik Najeh, Noureddine Liouane

Abstract:

The localization information is crucial for the operation of WSN. There are principally two types of localization algorithms. The Range-based localization algorithm has strict requirements on hardware; thus, it is expensive to be implemented in practice. The Range-free localization algorithm reduces the hardware cost. However, it can only achieve high accuracy in ideal scenarios. In this paper, we locate unknown nodes by incorporating the advantages of these two types of methods. The proposed algorithm makes the unknown nodes select the nearest anchor using the Received Signal Strength Indicator (RSSI) and choose two other anchors which are the most accurate to achieve the estimated location. Our algorithm improves the localization accuracy compared with previous algorithms, which has been demonstrated by the simulating results.

Keywords: WSN, localization, DV-Hop, RSSI

Procedia PDF Downloads 269
189 Measuring Multi-Class Linear Classifier for Image Classification

Authors: Fatma Susilawati Mohamad, Azizah Abdul Manaf, Fadhillah Ahmad, Zarina Mohamad, Wan Suryani Wan Awang

Abstract:

A simple and robust multi-class linear classifier is proposed and implemented. For a pair of classes of the linear boundary, a collection of segments of hyper planes created as perpendicular bisectors of line segments linking centroids of the classes or part of classes. Nearest Neighbor and Linear Discriminant Analysis are compared in the experiments to see the performances of each classifier in discriminating ripeness of oil palm. This paper proposes a multi-class linear classifier using Linear Discriminant Analysis (LDA) for image identification. Result proves that LDA is well capable in separating multi-class features for ripeness identification.

Keywords: multi-class, linear classifier, nearest neighbor, linear discriminant analysis

Procedia PDF Downloads 425
188 Hybrid Approach for Face Recognition Combining Gabor Wavelet and Linear Discriminant Analysis

Authors: A: Annis Fathima, V. Vaidehi, S. Ajitha

Abstract:

Face recognition system finds many applications in surveillance and human computer interaction systems. As the applications using face recognition systems are of much importance and demand more accuracy, more robustness in the face recognition system is expected with less computation time. In this paper, a hybrid approach for face recognition combining Gabor Wavelet and Linear Discriminant Analysis (HGWLDA) is proposed. The normalized input grayscale image is approximated and reduced in dimension to lower the processing overhead for Gabor filters. This image is convolved with bank of Gabor filters with varying scales and orientations. LDA, a subspace analysis techniques are used to reduce the intra-class space and maximize the inter-class space. The techniques used are 2-dimensional Linear Discriminant Analysis (2D-LDA), 2-dimensional bidirectional LDA ((2D)2LDA), Weighted 2-dimensional bidirectional Linear Discriminant Analysis (Wt (2D)2 LDA). LDA reduces the feature dimension by extracting the features with greater variance. k-Nearest Neighbour (k-NN) classifier is used to classify and recognize the test image by comparing its feature with each of the training set features. The HGWLDA approach is robust against illumination conditions as the Gabor features are illumination invariant. This approach also aims at a better recognition rate using less number of features for varying expressions. The performance of the proposed HGWLDA approaches is evaluated using AT&T database, MIT-India face database and faces94 database. It is found that the proposed HGWLDA approach provides better results than the existing Gabor approach.

Keywords: face recognition, Gabor wavelet, LDA, k-NN classifier

Procedia PDF Downloads 396