Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 37

Search results for: KNearest Neighbor (KNN)

37 Improvement in Power Transformer Intelligent Dissolved Gas Analysis Method

Authors: S. Qaedi, S. Seyedtabaii

Abstract:

Non-Destructive evaluation of in-service power transformer condition is necessary for avoiding catastrophic failures. Dissolved Gas Analysis (DGA) is one of the important methods. Traditional, statistical and intelligent DGA approaches have been adopted for accurate classification of incipient fault sources. Unfortunately, there are not often enough faulty patterns required for sufficient training of intelligent systems. By bootstrapping the shortcoming is expected to be alleviated and algorithms with better classification success rates to be obtained. In this paper the performance of an artificial neural network, K-Nearest Neighbour and support vector machine methods using bootstrapped data are detailed and shown that while the success rate of the ANN algorithms improves remarkably, the outcome of the others do not benefit so much from the provided enlarged data space. For assessment, two databases are employed: IEC TC10 and a dataset collected from reported data in papers. High average test success rate well exhibits the remarkable outcome.

Keywords: Dissolved gas analysis, Transformer incipient fault, Artificial Neural Network, Support Vector Machine (SVM), KNearest Neighbor (KNN)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
36 Determination of Neighbor Node in Consideration of the Imaging Range of Cameras in Automatic Human Tracking System

Authors: Kozo Tanigawa, Tappei Yotsumoto, Kenichi Takahashi, Takao Kawamura, Kazunori Sugahara

Abstract:

A automatic human tracking system using mobile agent technology is realized because a mobile agent moves in accordance with a migration of a target person. In this paper, we propose a method for determining the neighbor node in consideration of the imaging range of cameras.

Keywords: Human tracking, Mobile agent, Pan/Tilt/Zoom, Neighbor relation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
35 Fast and Accuracy Control Chart Pattern Recognition using a New cluster-k-Nearest Neighbor

Authors: Samir Brahim Belhaouari

Abstract:

By taking advantage of both k-NN which is highly accurate and K-means cluster which is able to reduce the time of classification, we can introduce Cluster-k-Nearest Neighbor as "variable k"-NN dealing with the centroid or mean point of all subclasses generated by clustering algorithm. In general the algorithm of K-means cluster is not stable, in term of accuracy, for that reason we develop another algorithm for clustering our space which gives a higher accuracy than K-means cluster, less subclass number, stability and bounded time of classification with respect to the variable data size. We find between 96% and 99.7 % of accuracy in the lassification of 6 different types of Time series by using K-means cluster algorithm and we find 99.7% by using the new clustering algorithm.

Keywords: Pattern recognition, Time series, k-Nearest Neighbor, k-means cluster, Gaussian Mixture Model, Classification

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
34 Neighbors of Indefinite Binary Quadratic Forms

Authors: Ahmet Tekcan

Abstract:

In this paper, we derive some algebraic identities on right and left neighbors R(F) and L(F) of an indefinite binary quadratic form F = F(x, y) = ax2 + bxy + cy2 of discriminant Δ = b2 -4ac. We prove that the proper cycle of F can be given by using its consecutive left neighbors. Also we construct a connection between right and left neighbors of F.

Keywords: Quadratic form, indefinite form, cycle, proper cycle, right neighbor, left neighbor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
33 A Robust Method for Finding Nearest-Neighbor using Hexagon Cells

Authors: Ahmad Attiq Al-Ogaibi, Ahmad Sharieh, Moh’d Belal Al-Zoubi, R. Bremananth

Abstract:

In pattern clustering, nearest neighborhood point computation is a challenging issue for many applications in the area of research such as Remote Sensing, Computer Vision, Pattern Recognition and Statistical Imaging. Nearest neighborhood computation is an essential computation for providing sufficient classification among the volume of pixels (voxels) in order to localize the active-region-of-interests (AROI). Furthermore, it is needed to compute spatial metric relationships of diverse area of imaging based on the applications of pattern recognition. In this paper, we propose a new methodology for finding the nearest neighbor point, depending on making a virtually grid of a hexagon cells, then locate every point beneath them. An algorithm is suggested for minimizing the computation and increasing the turnaround time of the process. The nearest neighbor query points Φ are fetched by seeking fashion of hexagon holistic. Seeking will be repeated until an AROI Φ is to be expected. If any point Υ is located then searching starts in the nearest hexagons in a circular way. The First hexagon is considered be level 0 (L0) and the surrounded hexagons is level 1 (L1). If Υ is located in L1, then search starts in the next level (L2) to ensure that Υ is the nearest neighbor for Φ. Based on the result and experimental results, we found that the proposed method has an advantage over the traditional methods in terms of minimizing the time complexity required for searching the neighbors, in turn, efficiency of classification will be improved sufficiently.

Keywords: Hexagon cells, k-nearest neighbors, Nearest Neighbor, Pattern recognition, Query pattern, Virtually grid

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
32 An Improved k Nearest Neighbor Classifier Using Interestingness Measures for Medical Image Mining

Authors: J. Alamelu Mangai, Satej Wagle, V. Santhosh Kumar

Abstract:

The exponential increase in the volume of medical image database has imposed new challenges to clinical routine in maintaining patient history, diagnosis, treatment and monitoring. With the advent of data mining and machine learning techniques it is possible to automate and/or assist physicians in clinical diagnosis. In this research a medical image classification framework using data mining techniques is proposed. It involves feature extraction, feature selection, feature discretization and classification. In the classification phase, the performance of the traditional kNN k nearest neighbor classifier is improved using a feature weighting scheme and a distance weighted voting instead of simple majority voting. Feature weights are calculated using the interestingness measures used in association rule mining. Experiments on the retinal fundus images show that the proposed framework improves the classification accuracy of traditional kNN from 78.57 % to 92.85 %.

Keywords: Medical Image Mining, Data Mining, Feature Weighting, Association Rule Mining, k nearest neighbor classifier.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
31 Feature Reduction of Nearest Neighbor Classifiers using Genetic Algorithm

Authors: M. Analoui, M. Fadavi Amiri

Abstract:

The design of a pattern classifier includes an attempt to select, among a set of possible features, a minimum subset of weakly correlated features that better discriminate the pattern classes. This is usually a difficult task in practice, normally requiring the application of heuristic knowledge about the specific problem domain. The selection and quality of the features representing each pattern have a considerable bearing on the success of subsequent pattern classification. Feature extraction is the process of deriving new features from the original features in order to reduce the cost of feature measurement, increase classifier efficiency, and allow higher classification accuracy. Many current feature extraction techniques involve linear transformations of the original pattern vectors to new vectors of lower dimensionality. While this is useful for data visualization and increasing classification efficiency, it does not necessarily reduce the number of features that must be measured since each new feature may be a linear combination of all of the features in the original pattern vector. In this paper a new approach is presented to feature extraction in which feature selection, feature extraction, and classifier training are performed simultaneously using a genetic algorithm. In this approach each feature value is first normalized by a linear equation, then scaled by the associated weight prior to training, testing, and classification. A knn classifier is used to evaluate each set of feature weights. The genetic algorithm optimizes a vector of feature weights, which are used to scale the individual features in the original pattern vectors in either a linear or a nonlinear fashion. By this approach, the number of features used in classifying can be finely reduced.

Keywords: Feature reduction, genetic algorithm, pattern classification, nearest neighbor rule classifiers (k-NNR).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
30 Classification Influence Index and its Application for k-Nearest Neighbor Classifier

Authors: Sejong Oh

Abstract:

Classification is an important topic in machine learning and bioinformatics. Many datasets have been introduced for classification tasks. A dataset contains multiple features, and the quality of features influences the classification accuracy of the dataset. The power of classification for each feature differs. In this study, we suggest the Classification Influence Index (CII) as an indicator of classification power for each feature. CII enables evaluation of the features in a dataset and improved classification accuracy by transformation of the dataset. By conducting experiments using CII and the k-nearest neighbor classifier to analyze real datasets, we confirmed that the proposed index provided meaningful improvement of the classification accuracy.

Keywords: accuracy, classification, dataset, data preprocessing

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
29 Improving Cryptographically Generated Address Algorithm in IPv6 Secure Neighbor Discovery Protocol through Trust Management

Authors: M. Moslehpour, S. Khorsandi

Abstract:

As transition to widespread use of IPv6 addresses has gained momentum, it has been shown to be vulnerable to certain security attacks such as those targeting Neighbor Discovery Protocol (NDP) which provides the address resolution functionality in IPv6. To protect this protocol, Secure Neighbor Discovery (SEND) is introduced. This protocol uses Cryptographically Generated Address (CGA) and asymmetric cryptography as a defense against threats on integrity and identity of NDP. Although SEND protects NDP against attacks, it is computationally intensive due to Hash2 condition in CGA. To improve the CGA computation speed, we parallelized CGA generation process and used the available resources in a trusted network. Furthermore, we focused on the influence of the existence of malicious nodes on the overall load of un-malicious ones in the network. According to the evaluation results, malicious nodes have adverse impacts on the average CGA generation time and on the average number of tries. We utilized a Trust Management that is capable of detecting and isolating the malicious node to remove possible incentives for malicious behavior. We have demonstrated the effectiveness of the Trust Management System in detecting the malicious nodes and hence improving the overall system performance.

Keywords: NDP, SEND, CGA, modifier, malicious node.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
28 Weighted k-Nearest-Neighbor Techniques for High Throughput Screening Data

Authors: Kozak K, M. Kozak, K. Stapor

Abstract:

The k-nearest neighbors (knn) is a simple but effective method of classification. In this paper we present an extended version of this technique for chemical compounds used in High Throughput Screening, where the distances of the nearest neighbors can be taken into account. Our algorithm uses kernel weight functions as guidance for the process of defining activity in screening data. Proposed kernel weight function aims to combine properties of graphical structure and molecule descriptors of screening compounds. We apply the modified knn method on several experimental data from biological screens. The experimental results confirm the effectiveness of the proposed method.

Keywords: biological screening, kernel methods, KNN, QSAR

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
27 Analysis of Genetic Variations in Camel Breeds (Camelus dromedarius)

Authors: Yasser M. Saad, Amr A. El Hanafy, Saleh A. Alkarim, Hussein A. Almehdar, Elrashdy M. Redwan

Abstract:

Camels are substantial providers of transport, milk, sport, meat, shelter, security and capital in many countries, particularly in Saudi Arabia. Inter simple sequence repeat technique was used to detect the genetic variations among some camel breeds (Majaheim, Safra, Wadah, and Hamara). Actual number of alleles, effective number of alleles, gene diversity, Shannon’s information index and polymorphic bands were calculated for each evaluated camel breed. Neighbor-joining tree that re-constructed for evaluated these camel breeds showed that, Hamara breed is distantly related from the other evaluated camels. In addition, the polymorphic sites, haplotypes and nucleotide diversity were identified for some camelidae cox1 gene sequences (obtained from NCBI). The distance value between C. bactrianus and C. dromedarius (0.072) was relatively low. Analysis of genetic diversity is an important way for conserving Camelus dromedarius genetic resources.

Keywords: Camel, genetics, ISSR, cox1, neighbor-joining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
26 Reducing SAGE Data Using Genetic Algorithms

Authors: Cheng-Hong Yang, Tsung-Mu Shih, Li-Yeh Chuang

Abstract:

Serial Analysis of Gene Expression is a powerful quantification technique for generating cell or tissue gene expression data. The profile of the gene expression of cell or tissue in several different states is difficult for biologists to analyze because of the large number of genes typically involved. However, feature selection in machine learning can successfully reduce this problem. The method allows reducing the features (genes) in specific SAGE data, and determines only relevant genes. In this study, we used a genetic algorithm to implement feature selection, and evaluate the classification accuracy of the selected features with the K-nearest neighbor method. In order to validate the proposed method, we used two SAGE data sets for testing. The results of this study conclusively prove that the number of features of the original SAGE data set can be significantly reduced and higher classification accuracy can be achieved.

Keywords: Serial Analysis of Gene Expression, Feature selection, Genetic Algorithm, K-nearest neighbor method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
25 An Accurate Method for Phylogeny Tree Reconstruction Based on a Modified Wild Dog Algorithm

Authors: Essam Al Daoud

Abstract:

This study solves a phylogeny problem by using modified wild dog pack optimization. The least squares error is considered as a cost function that needs to be minimized. Therefore, in each iteration, new distance matrices based on the constructed trees are calculated and used to select the alpha dog. To test the suggested algorithm, ten homologous genes are selected and collected from National Center for Biotechnology Information (NCBI) databanks (i.e., 16S, 18S, 28S, Cox 1, ITS1, ITS2, ETS, ATPB, Hsp90, and STN). The data are divided into three categories: 50 taxa, 100 taxa and 500 taxa. The empirical results show that the proposed algorithm is more reliable and accurate than other implemented methods.

Keywords: Least squares, neighbor joining, phylogenetic tree, wild dogpack.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
24 Pre-Operative Tool for Facial-Post-Surgical Estimation and Detection

Authors: Ayat E. Ali, Christeen R. Aziz, Merna A. Helmy, Mohammed M. Malek, Sherif H. El-Gohary

Abstract:

Goal: Purpose of the project was to make a plastic surgery prediction by using pre-operative images for the plastic surgeries’ patients and to show this prediction on a screen to compare between the current case and the appearance after the surgery. Methods: To this aim, we implemented a software which used data from the internet for facial skin diseases, skin burns, pre-and post-images for plastic surgeries then the post- surgical prediction is done by using K-nearest neighbor (KNN). So we designed and fabricated a smart mirror divided into two parts a screen and a reflective mirror so patient's pre- and post-appearance will be showed at the same time. Results: We worked on some skin diseases like vitiligo, skin burns and wrinkles. We classified the three degrees of burns using KNN classifier with accuracy 60%. We also succeeded in segmenting the area of vitiligo. Our future work will include working on more skin diseases, classify them and give a prediction for the look after the surgery. Also we will go deeper into facial deformities and plastic surgeries like nose reshaping and face slim down. Conclusion: Our project will give a prediction relates strongly to the real look after surgery and decrease different diagnoses among doctors. Significance: The mirror may have broad societal appeal as it will make the distance between patient's satisfaction and the medical standards smaller.

Keywords: K-nearest neighbor, face detection, vitiligo, bone deformity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
23 FCNN-MR: A Parallel Instance Selection Method Based on Fast Condensed Nearest Neighbor Rule

Authors: Lu Si, Jie Yu, Shasha Li, Jun Ma, Lei Luo, Qingbo Wu, Yongqi Ma, Zhengji Liu

Abstract:

Instance selection (IS) technique is used to reduce the data size to improve the performance of data mining methods. Recently, to process very large data set, several proposed methods divide the training set into some disjoint subsets and apply IS algorithms independently to each subset. In this paper, we analyze the limitation of these methods and give our viewpoint about how to divide and conquer in IS procedure. Then, based on fast condensed nearest neighbor (FCNN) rule, we propose a large data sets instance selection method with MapReduce framework. Besides ensuring the prediction accuracy and reduction rate, it has two desirable properties: First, it reduces the work load in the aggregation node; Second and most important, it produces the same result with the sequential version, which other parallel methods cannot achieve. We evaluate the performance of FCNN-MR on one small data set and two large data sets. The experimental results show that it is effective and practical.

Keywords: Instance selection, data reduction, MapReduce, kNN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
22 An Experimental Study of a Self-Supervised Classifier Ensemble

Authors: Neamat El Gayar

Abstract:

Learning using labeled and unlabelled data has received considerable amount of attention in the machine learning community due its potential in reducing the need for expensive labeled data. In this work we present a new method for combining labeled and unlabeled data based on classifier ensembles. The model we propose assumes each classifier in the ensemble observes the input using different set of features. Classifiers are initially trained using some labeled samples. The trained classifiers learn further through labeling the unknown patterns using a teaching signals that is generated using the decision of the classifier ensemble, i.e. the classifiers self-supervise each other. Experiments on a set of object images are presented. Our experiments investigate different classifier models, different fusing techniques, different training sizes and different input features. Experimental results reveal that the proposed self-supervised ensemble learning approach reduces classification error over the single classifier and the traditional ensemble classifier approachs.

Keywords: Multiple Classifier Systems, classifier ensembles, learning using labeled and unlabelled data, K-nearest neighbor classifier, Bayes classifier.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
21 Multilevel Classifiers in Recognition of Handwritten Kannada Numerals

Authors: Dinesh Acharya U., N. V. Subba Reddy, Krishnamoorthi Makkithaya

Abstract:

The recognition of handwritten numeral is an important area of research for its applications in post office, banks and other organizations. This paper presents automatic recognition of handwritten Kannada numerals based on structural features. Five different types of features, namely, profile based 10-segment string, water reservoir; vertical and horizontal strokes, end points and average boundary length from the minimal bounding box are used in the recognition of numeral. The effect of each feature and their combination in the numeral classification is analyzed using nearest neighbor classifiers. It is common to combine multiple categories of features into a single feature vector for the classification. Instead, separate classifiers can be used to classify based on each visual feature individually and the final classification can be obtained based on the combination of separate base classification results. One popular approach is to combine the classifier results into a feature vector and leaving the decision to next level classifier. This method is extended to extract a better information, possibility distribution, from the base classifiers in resolving the conflicts among the classification results. Here, we use fuzzy k Nearest Neighbor (fuzzy k-NN) as base classifier for individual feature sets, the results of which together forms the feature vector for the final k Nearest Neighbor (k-NN) classifier. Testing is done, using different features, individually and in combination, on a database containing 1600 samples of different numerals and the results are compared with the results of different existing methods.

Keywords: Fuzzy k Nearest Neighbor, Multiple Classifiers, Numeral Recognition, Structural features.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
20 A Distributed Cryptographically Generated Address Computing Algorithm for Secure Neighbor Discovery Protocol in IPv6

Authors: M. Moslehpour, S. Khorsandi

Abstract:

Due to shortage in IPv4 addresses, transition to IPv6 has gained significant momentum in recent years. Like Address Resolution Protocol (ARP) in IPv4, Neighbor Discovery Protocol (NDP) provides some functions like address resolution in IPv6. Besides functionality of NDP, it is vulnerable to some attacks. To mitigate these attacks, Internet Protocol Security (IPsec) was introduced, but it was not efficient due to its limitation. Therefore, SEND protocol is proposed to automatic protection of auto-configuration process. It is secure neighbor discovery and address resolution process. To defend against threats on NDP’s integrity and identity, Cryptographically Generated Address (CGA) and asymmetric cryptography are used by SEND. Besides advantages of SEND, its disadvantages like the computation process of CGA algorithm and sequentially of CGA generation algorithm are considerable. In this paper, we parallel this process between network resources in order to improve it. In addition, we compare the CGA generation time in self-computing and distributed-computing process. We focus on the impact of the malicious nodes on the CGA generation time in the network. According to the result, although malicious nodes participate in the generation process, CGA generation time is less than when it is computed in a one-way. By Trust Management System, detecting and insulating malicious nodes is easier.

Keywords: NDP, IPsec, SEND, CGA, Modifier, Malicious node, Self-Computing, Distributed-Computing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
19 Image Spam Detection Using Color Features and K-Nearest Neighbor Classification

Authors: T. Kumaresan, S. Sanjushree, C. Palanisamy

Abstract:

Image spam is a kind of email spam where the spam text is embedded with an image. It is a new spamming technique being used by spammers to send their messages to bulk of internet users. Spam email has become a big problem in the lives of internet users, causing time consumption and economic losses. The main objective of this paper is to detect the image spam by using histogram properties of an image. Though there are many techniques to automatically detect and avoid this problem, spammers employing new tricks to bypass those techniques, as a result those techniques are inefficient to detect the spam mails. In this paper we have proposed a new method to detect the image spam. Here the image features are extracted by using RGB histogram, HSV histogram and combination of both RGB and HSV histogram. Based on the optimized image feature set classification is done by using k- Nearest Neighbor(k-NN) algorithm. Experimental result shows that our method has achieved better accuracy. From the result it is known that combination of RGB and HSV histogram with k-NN algorithm gives the best accuracy in spam detection.

Keywords: File Type, HSV Histogram, k-NN, RGB Histogram, Spam Detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
18 Feature Extraction Technique for Prediction the Antigenic Variants of the Influenza Virus

Authors: Majid Forghani, Michael Khachay

Abstract:

In genetics, the impact of neighboring amino acids on a target site is referred as the nearest-neighbor effect or simply neighbor effect. In this paper, a new method called wavelet particle decomposition representing the one-dimensional neighbor effect using wavelet packet decomposition is proposed. The main idea lies in known dependence of wavelet packet sub-bands on location and order of neighboring samples. The method decomposes the value of a signal sample into small values called particles that represent a part of the neighbor effect information. The results have shown that the information obtained from the particle decomposition can be used to create better model variables or features. As an example, the approach has been applied to improve the correlation of test and reference sequence distance with titer in the hemagglutination inhibition assay.

Keywords: Antigenic variants, neighbor effect, wavelet packet, wavelet particle decomposition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
17 Study of Compaction in Hot-Mix Asphalt Using Computer Simulations

Authors: Kasthurirangan Gopalakrishnan, Naga Shashidhar, Xiaoxiong Zhong

Abstract:

During the process of compaction in Hot-Mix Asphalt (HMA) mixtures, the distance between aggregate particles decreases as they come together and eliminate air-voids. By measuring the inter-particle distances in a cut-section of a HMA sample the degree of compaction can be estimated. For this, a calibration curve is generated by computer simulation technique when the gradation and asphalt content of the HMA mixture are known. A two-dimensional cross section of HMA specimen was simulated using the mixture design information (gradation, asphalt content and air-void content). Nearest neighbor distance methods such as Delaunay triangulation were used to study the changes in inter-particle distance and area distribution during the process of compaction in HMA. Such computer simulations would enable making several hundreds of repetitions in a short period of time without the necessity to compact and analyze laboratory specimens in order to obtain good statistics on the parameters defined. The distributions for the statistical parameters based on computer simulations showed similar trends as those of laboratory specimens.

Keywords: Computer simulations, Hot-Mix Asphalt (HMA), inter-particle distance, image analysis, nearest neighbor

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
16 A Learning-Community Recommendation Approach for Web-Based Cooperative Learning

Authors: Jian-Wei Li, Yao-Tien Wang, Yi-Chun Chang

Abstract:

Cooperative learning has been defined as learners working together as a team to solve a problem to complete a task or to accomplish a common goal, which emphasizes the importance of interactions among members to promote the whole learning performance. With the popularity of society networks, cooperative learning is no longer limited to traditional classroom teaching activities. Since society networks facilitate to organize online learners, to establish common shared visions, and to advance learning interaction, the online community and online learning community have triggered the establishment of web-based societies. Numerous research literatures have indicated that the collaborative learning community is a critical issue to enhance learning performance. Hence, this paper proposes a learning community recommendation approach to facilitate that a learner joins the appropriate learning communities, which is based on k-nearest neighbor (kNN) classification. To demonstrate the viability of the proposed approach, the proposed approach is implemented for 117 students to recommend learning communities. The experimental results indicate that the proposed approach can effectively recommend appropriate learning communities for learners.

Keywords: k-nearest neighbor classification, learning community, Cooperative/Collaborative Learning and Environments.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
15 Identity Verification Using k-NN Classifiers and Autistic Genetic Data

Authors: Fuad M. Alkoot

Abstract:

DNA data have been used in forensics for decades. However, current research looks at using the DNA as a biometric identity verification modality. The goal is to improve the speed of identification. We aim at using gene data that was initially used for autism detection to find if and how accurate is this data for identification applications. Mainly our goal is to find if our data preprocessing technique yields data useful as a biometric identification tool. We experiment with using the nearest neighbor classifier to identify subjects. Results show that optimal classification rate is achieved when the test set is corrupted by normally distributed noise with zero mean and standard deviation of 1. The classification rate is close to optimal at higher noise standard deviation reaching 3. This shows that the data can be used for identity verification with high accuracy using a simple classifier such as the k-nearest neighbor (k-NN). 

Keywords: Biometrics, identity verification, genetic data, k-nearest neighbor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
14 Performance Comparison of Different Regression Methods for a Polymerization Process with Adaptive Sampling

Authors: Florin Leon, Silvia Curteanu

Abstract:

Developing complete mechanistic models for polymerization reactors is not easy, because complex reactions occur simultaneously; there is a large number of kinetic parameters involved and sometimes the chemical and physical phenomena for mixtures involving polymers are poorly understood. To overcome these difficulties, empirical models based on sampled data can be used instead, namely regression methods typical of machine learning field. They have the ability to learn the trends of a process without any knowledge about its particular physical and chemical laws. Therefore, they are useful for modeling complex processes, such as the free radical polymerization of methyl methacrylate achieved in a batch bulk process. The goal is to generate accurate predictions of monomer conversion, numerical average molecular weight and gravimetrical average molecular weight. This process is associated with non-linear gel and glass effects. For this purpose, an adaptive sampling technique is presented, which can select more samples around the regions where the values have a higher variation. Several machine learning methods are used for the modeling and their performance is compared: support vector machines, k-nearest neighbor, k-nearest neighbor and random forest, as well as an original algorithm, large margin nearest neighbor regression. The suggested method provides very good results compared to the other well-known regression algorithms.

Keywords: Adaptive sampling, batch bulk methyl methacrylate polymerization, large margin nearest neighbor regression, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
13 Smart Grids Cyber Security Issues and Challenges

Authors: Imen Aouini, Lamia Ben Azzouz

Abstract:

The energy need is growing rapidly due to the population growth and the large new usage of power. Several works put considerable efforts to make the electricity grid more intelligent to reduce essentially energy consumption and provide efficiency and reliability of power systems. The Smart Grid is a complex architecture that covers critical devices and systems vulnerable to significant attacks. Hence, security is a crucial factor for the success and the wide deployment of Smart Grids. In this paper, we present security issues of the Smart Grid architecture and we highlight open issues that will make the Smart Grid security a challenging research area in the future.

Keywords: Smart grids, smart meters, home area network, neighbor area network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
12 Performance Analysis of Genetic Algorithm with kNN and SVM for Feature Selection in Tumor Classification

Authors: C. Gunavathi, K. Premalatha

Abstract:

Tumor classification is a key area of research in the field of bioinformatics. Microarray technology is commonly used in the study of disease diagnosis using gene expression levels. The main drawback of gene expression data is that it contains thousands of genes and a very few samples. Feature selection methods are used to select the informative genes from the microarray. These methods considerably improve the classification accuracy. In the proposed method, Genetic Algorithm (GA) is used for effective feature selection. Informative genes are identified based on the T-Statistics, Signal-to-Noise Ratio (SNR) and F-Test values. The initial candidate solutions of GA are obtained from top-m informative genes. The classification accuracy of k-Nearest Neighbor (kNN) method is used as the fitness function for GA. In this work, kNN and Support Vector Machine (SVM) are used as the classifiers. The experimental results show that the proposed work is suitable for effective feature selection. With the help of the selected genes, GA-kNN method achieves 100% accuracy in 4 datasets and GA-SVM method achieves in 5 out of 10 datasets. The GA with kNN and SVM methods are demonstrated to be an accurate method for microarray based tumor classification.

Keywords: F-Test, Gene Expression, Genetic Algorithm, k- Nearest-Neighbor, Microarray, Signal-to-Noise Ratio, Support Vector Machine, T-statistics, Tumor Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
11 A Comparison between Heuristic and Meta-Heuristic Methods for Solving the Multiple Traveling Salesman Problem

Authors: San Nah Sze, Wei King Tiong

Abstract:

The multiple traveling salesman problem (mTSP) can be used to model many practical problems. The mTSP is more complicated than the traveling salesman problem (TSP) because it requires determining which cities to assign to each salesman, as well as the optimal ordering of the cities within each salesman's tour. Previous studies proposed that Genetic Algorithm (GA), Integer Programming (IP) and several neural network (NN) approaches could be used to solve mTSP. This paper compared the results for mTSP, solved with Genetic Algorithm (GA) and Nearest Neighbor Algorithm (NNA). The number of cities is clustered into a few groups using k-means clustering technique. The number of groups depends on the number of salesman. Then, each group is solved with NNA and GA as an independent TSP. It is found that k-means clustering and NNA are superior to GA in terms of performance (evaluated by fitness function) and computing time.

Keywords: Multiple Traveling Salesman Problem, GeneticAlgorithm, Nearest Neighbor Algorithm, k-Means Clustering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
10 Comparison of Phylogenetic Trees of Multiple Protein Sequence Alignment Methods

Authors: Khaddouja Boujenfa, Nadia Essoussi, Mohamed Limam

Abstract:

Multiple sequence alignment is a fundamental part in many bioinformatics applications such as phylogenetic analysis. Many alignment methods have been proposed. Each method gives a different result for the same data set, and consequently generates a different phylogenetic tree. Hence, the chosen alignment method affects the resulting tree. However in the literature, there is no evaluation of multiple alignment methods based on the comparison of their phylogenetic trees. This work evaluates the following eight aligners: ClustalX, T-Coffee, SAGA, MUSCLE, MAFFT, DIALIGN, ProbCons and Align-m, based on their phylogenetic trees (test trees) produced on a given data set. The Neighbor-Joining method is used to estimate trees. Three criteria, namely, the dNNI, the dRF and the Id_Tree are established to test the ability of different alignment methods to produce closer test tree compared to the reference one (true tree). Results show that the method which produces the most accurate alignment gives the nearest test tree to the reference tree. MUSCLE outperforms all aligners with respect to the three criteria and for all datasets, performing particularly better when sequence identities are within 10-20%. It is followed by T-Coffee at lower sequence identity (<10%), Align-m at 20-30% identity, and ClustalX and ProbCons at 30-50% identity. Also, it is noticed that when sequence identities are higher (>30%), trees scores of all methods become similar.

Keywords: Multiple alignment methods, phylogenetic trees, Neighbor-Joining method, Robinson-Foulds distance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
9 SC-LSH: An Efficient Indexing Method for Approximate Similarity Search in High Dimensional Space

Authors: Sanaa Chafik, ImaneDaoudi, Mounim A. El Yacoubi, Hamid El Ouardi

Abstract:

Locality Sensitive Hashing (LSH) is one of the most promising techniques for solving nearest neighbour search problem in high dimensional space. Euclidean LSH is the most popular variation of LSH that has been successfully applied in many multimedia applications. However, the Euclidean LSH presents limitations that affect structure and query performances. The main limitation of the Euclidean LSH is the large memory consumption. In order to achieve a good accuracy, a large number of hash tables is required. In this paper, we propose a new hashing algorithm to overcome the storage space problem and improve query time, while keeping a good accuracy as similar to that achieved by the original Euclidean LSH. The Experimental results on a real large-scale dataset show that the proposed approach achieves good performances and consumes less memory than the Euclidean LSH.

Keywords: Approximate Nearest Neighbor Search, Content based image retrieval (CBIR), Curse of dimensionality, Locality sensitive hashing, Multidimensional indexing, Scalability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
8 Methods of Geodesic Distance in Two-Dimensional Face Recognition

Authors: Rachid Ahdid, Said Safi, Bouzid Manaut

Abstract:

In this paper, we present a comparative study of three methods of 2D face recognition system such as: Iso-Geodesic Curves (IGC), Geodesic Distance (GD) and Geodesic-Intensity Histogram (GIH). These approaches are based on computing of geodesic distance between points of facial surface and between facial curves. In this study we represented the image at gray level as a 2D surface in a 3D space, with the third coordinate proportional to the intensity values of pixels. In the classifying step, we use: Neural Networks (NN), K-Nearest Neighbor (KNN) and Support Vector Machines (SVM). The images used in our experiments are from two wellknown databases of face images ORL and YaleB. ORL data base was used to evaluate the performance of methods under conditions where the pose and sample size are varied, and the database YaleB was used to examine the performance of the systems when the facial expressions and lighting are varied.

Keywords: 2D face recognition, Geodesic distance, Iso-Geodesic Curves, Geodesic-Intensity Histogram, facial surface, Neural Networks, K-Nearest Neighbor, Support Vector Machines.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF