Search results for: nearest neighbour.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 164

134 Classification Influence Index and its Application for k-Nearest Neighbor Classifier

Authors: Sejong Oh

Abstract:

Classification is an important topic in machine learning and bioinformatics. Many datasets have been introduced for classification tasks. A dataset contains multiple features, and the quality of features influences the classification accuracy of the dataset. The power of classification for each feature differs. In this study, we suggest the Classification Influence Index (CII) as an indicator of classification power for each feature. CII enables evaluation of the features in a dataset and improved classification accuracy by transformation of the dataset. By conducting experiments using CII and the k-nearest neighbor classifier to analyze real datasets, we confirmed that the proposed index provided meaningful improvement of the classification accuracy.

Keywords: accuracy, classification, dataset, data preprocessing

Downloads: 1445
133 The Classification Performance in Parametric and Nonparametric Discriminant Analysis for a Class-Unbalanced Data of Diabetes Risk Groups

Authors: Lily Ingsrisawang, Tasanee Nacharoen

Abstract:

The problems arising from unbalanced data sets generally appear in real world applications. Due to unequal class distribution, many researchers have found that the performance of existing classifiers tends to be biased towards the majority class. The k-nearest neighbors’ nonparametric discriminant analysis is a method that was proposed for classifying unbalanced classes with good performance. In this study, the methods of discriminant analysis are of interest in investigating misclassification error rates for class-imbalanced data of three diabetes risk groups. The purpose of this study was to compare the classification performance between parametric discriminant analysis and nonparametric discriminant analysis in a three-class classification of class-imbalanced data of diabetes risk groups. Data from a project maintaining healthy conditions for 599 employees of a government hospital in Bangkok were obtained for the classification problem. The employees were divided into three diabetes risk groups: non-risk (90%), risk (5%), and diabetic (5%). The original data, including the variables of diabetes risk group, age, gender, blood glucose, and BMI, were analyzed and bootstrapped for 50 and 100 samples, 599 observations per sample, for additional estimation of the misclassification error rate. Each data set was explored for departure from multivariate normality and for equality of the covariance matrices of the three risk groups. Both the original data and the bootstrap samples showed nonnormality and unequal covariance matrices. The parametric linear discriminant function, quadratic discriminant function, and the nonparametric k-nearest neighbors’ discriminant function were applied to the 50 and 100 bootstrap samples and to the original data. In searching for the optimal classification rule, the prior probabilities were set to both equal proportions (0.33:0.33:0.33) and unequal proportions of (0.90:0.05:0.05), (0.80:0.10:0.10) and (0.70:0.15:0.15).
The results from 50 and 100 bootstrap samples indicated that the k-nearest neighbors approach with k=3 or k=4 and prior probabilities of non-risk:risk:diabetic defined as 0.90:0.05:0.05 or 0.80:0.10:0.10 gave the smallest misclassification error rate. The k-nearest neighbors approach is therefore suggested for classifying three-class imbalanced data of diabetes risk groups.
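To make the role of the prior probabilities concrete, here is a minimal sketch of one common formulation of a k-nearest-neighbour discriminant rule, in which the posterior for a class is proportional to its prior times the fraction of that class found among the k neighbours. The abstract does not give the exact rule or software used, so the function name `knn_posterior`, the Euclidean distance, and the toy data are all assumptions for illustration.

```python
import math
from collections import Counter


def knn_posterior(train, query, k, priors):
    """Posterior class probabilities from a k-NN discriminant rule (illustrative).

    train  : list of (feature_tuple, label) pairs
    query  : feature tuple to classify
    priors : dict mapping label -> prior probability (assumed to sum to 1)
    """
    # Number of training observations in each class (n_c).
    class_sizes = Counter(label for _, label in train)
    # The k training points closest to the query (Euclidean distance assumed).
    neighbours = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    # Number of neighbours belonging to each class (k_c).
    votes = Counter(label for _, label in neighbours)
    # Score for class c is prior_c * k_c / n_c; normalise to get posteriors.
    scores = {c: priors[c] * votes.get(c, 0) / class_sizes[c] for c in class_sizes}
    total = sum(scores.values())
    if total == 0:
        # No neighbour belongs to a class with non-zero prior: fall back to priors.
        return dict(priors)
    return {c: s / total for c, s in scores.items()}


# Toy, made-up data: a large class "a" and a single observation of class "b".
toy = [((0,), "a"), ((1,), "a"), ((2,), "a"), ((3,), "a"), ((10,), "b")]
equal = knn_posterior(toy, (9,), k=3, priors={"a": 0.5, "b": 0.5})
skewed = knn_posterior(toy, (9,), k=3, priors={"a": 0.9, "b": 0.1})
```

With the toy data, equal priors favour the minority class near the query, while priors proportional to the class sizes pull the decision back towards the majority class, which is the kind of trade-off the choice of priors controls.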

Keywords: Bootstrap, diabetes risk groups, error rate, k-nearest neighbors.

Downloads: 1962
132 Searching k-Nearest Neighbors to be Appropriate under Gamming Environments

Authors: Jae Moon Lee

Abstract:

In general, algorithms for finding continuous k-nearest neighbors have been studied for location-based services, which periodically monitor moving objects such as vehicles and mobile phones. That research assumes an environment in which the number of query points is much smaller than the number of moving objects and the query points are fixed rather than moving. In gaming environments, the problem arises when computing the next movement of an object from its neighbors, as in flocking, crowd, and robot simulations. In this case, every moving object becomes a query point, so the number of query points equals the number of moving objects and the query points are themselves moving. In this paper, we analyze how the existing algorithms designed for location-based services perform under gaming environments.

Keywords: Flocking behavior, heterogeneous agents, similarity, simulation.

Downloads: 1504
131 The Selection of the Nearest Anchor Using Received Signal Strength Indication (RSSI)

Authors: Hichem Sassi, Tawfik Najeh, Noureddine Liouane

Abstract:

Localization information is crucial for the operation of wireless sensor networks (WSN). There are principally two types of localization algorithms. Range-based localization algorithms have strict hardware requirements and are therefore expensive to implement in practice. Range-free localization algorithms reduce the hardware cost, but they achieve high accuracy only in ideal scenarios. In this paper, we locate unknown nodes by combining the advantages of these two types of methods. The proposed algorithm makes each unknown node select the nearest anchor using the Received Signal Strength Indicator (RSSI) and choose two other anchors which are the most accurate to achieve the estimated location. Our algorithm improves the localization accuracy compared with previous algorithms, as demonstrated by the simulation results.
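As a rough illustration of the nearest-anchor step only, the sketch below picks the anchor with the strongest RSSI reading and, optionally, converts a reading to a distance with the standard log-distance path-loss model. The abstract does not state the propagation model or how the two other anchors are chosen, so the function names, the reference power of -40 dBm at 1 m, and the path-loss exponent of 2 are assumptions.

```python
def nearest_anchor(rssi_by_anchor):
    """Return the anchor id with the strongest (least negative) RSSI in dBm.

    Assumes that, all else being equal, a stronger signal means a closer anchor.
    """
    return max(rssi_by_anchor, key=rssi_by_anchor.get)


def rssi_to_distance(rssi_dbm, ref_power_dbm=-40.0, ref_distance_m=1.0, path_loss_exp=2.0):
    """Estimate distance in metres from RSSI with the log-distance path-loss model.

    ref_power_dbm is the RSSI expected at ref_distance_m; both defaults and the
    path-loss exponent are hypothetical values, not taken from the paper.
    """
    return ref_distance_m * 10 ** ((ref_power_dbm - rssi_dbm) / (10.0 * path_loss_exp))


# Hypothetical readings (dBm) seen by one unknown node from three anchors.
readings = {"A1": -70.0, "A2": -55.0, "A3": -80.0}
closest = nearest_anchor(readings)
estimated_distance = rssi_to_distance(readings[closest])
```

In practice RSSI is noisy, so readings would usually be averaged over several packets before comparing anchors.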

Keywords: WSN, localization, DV-hop, RSSI.

Downloads: 1766
130 CBIR Using Multi-Resolution Transform for Brain Tumour Detection and Stages Identification

Authors: H. Benjamin Fredrick David, R. Balasubramanian, A. Anbarasa Pandian

Abstract:

Image retrieval is one of the most interesting techniques in use today in our digital world. CBIR, commonly expanded as Content Based Image Retrieval, is an image processing technique which identifies the relevant images and retrieves them based on the patterns that are extracted from the digital images. In this paper, two research works are presented using CBIR. The first work provides an automated and interactive approach to the analysis of CBIR techniques. CBIR works on the principle of supervised machine learning, which involves feature selection followed by training and testing phases applied to a classifier in order to perform prediction. For feature extraction, image transforms such as Contourlet, Ridgelet and Shearlet can be used to retrieve the texture features from the images. The extracted features are used to train and build a classifier using classification algorithms such as Naïve Bayes, K-Nearest Neighbour and Multi-class Support Vector Machine. The testing phase then uses the trained classifier to predict the class of a new input image and label it as one of four classes, namely 1- Normal brain, 2- Benign tumour, 3- Malignant tumour and 4- Severe tumour. The second research work develops a tool for tumour stage identification using the best feature extraction method and classifier identified in the first work. Finally, the tool will be used to predict the tumour stage and provide suggestions based on the stage identified by the system. This paper presents these two approaches as a contribution to the medical field, giving better retrieval performance and supporting tumour stage identification.

Keywords: Brain tumour detection, content based image retrieval, classification of tumours, image retrieval.

Downloads: 708
129 A Selective 3-Anchor DV-Hop Algorithm Based On the Nearest Anchor for Wireless Sensor Network

Authors: Hichem Sassi, Tawfik Najeh, Noureddine Liouane

Abstract:

Information on nodes’ locations is an important criterion for many applications in Wireless Sensor Networks. In hop-based range-free localization methods, anchors transmit localization messages containing a hop count value to the whole network. Each node receives this message, calculates its own distance to the anchor in hops, and then approximates its own position. However, the estimated distances can introduce large errors and affect the localization precision. To solve this problem, this paper proposes an algorithm which makes the unknown nodes fix the nearest anchor as a reference and select two other anchors which are the most accurate to achieve the estimated location. Compared to the DV-Hop algorithm, experimental results illustrate that the proposed algorithm has a lower average localization error and is more effective.

Keywords: Wireless Sensors Networks, Localization problem, localization average error, DV–Hop Algorithm, MATLAB.

Downloads: 2904
128 On Finite Hjelmslev Planes of Parameters (p^{k−1}, p)

Authors: Atilla Akpinar

Abstract:

In this paper, we study finite projective Hjelmslev planes M(Z_q) coordinatized by the Hjelmslev ring Z_q (where q = p^k is a prime power). We obtain finite hyperbolic Klingenberg planes from these planes under certain conditions. We also give a combinatorial result on M(Z_q) related to deleting a line from the lines in the same neighbour.

Keywords: Finite Klingenberg plane, finite hyperbolic Klingenberg plane.

Downloads: 1105
127 A Comparison between Heuristic and Meta-Heuristic Methods for Solving the Multiple Traveling Salesman Problem

Authors: San Nah Sze, Wei King Tiong

Abstract:

The multiple traveling salesman problem (mTSP) can be used to model many practical problems. The mTSP is more complicated than the traveling salesman problem (TSP) because it requires determining which cities to assign to each salesman, as well as the optimal ordering of the cities within each salesman's tour. Previous studies proposed that Genetic Algorithm (GA), Integer Programming (IP) and several neural network (NN) approaches could be used to solve mTSP. This paper compares the results for mTSP solved with Genetic Algorithm (GA) and Nearest Neighbor Algorithm (NNA). The cities are clustered into a few groups using the k-means clustering technique. The number of groups depends on the number of salesmen. Then, each group is solved with NNA and GA as an independent TSP. It is found that k-means clustering and NNA are superior to GA in terms of performance (evaluated by fitness function) and computing time.

Keywords: Multiple Traveling Salesman Problem, Genetic Algorithm, Nearest Neighbor Algorithm, k-Means Clustering.
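For readers unfamiliar with the nearest neighbor heuristic for a single TSP, the following is a minimal sketch: starting from one city, repeatedly move to the closest unvisited city. It covers only the per-group tour construction, not the k-means clustering step or the GA, and the function names, Euclidean distances, and toy coordinates are assumptions rather than details from the paper.

```python
import math


def nearest_neighbour_tour(cities, start=0):
    """Greedy nearest-neighbour tour over a list of (x, y) city coordinates.

    Returns the visiting order as a list of city indices beginning at `start`.
    Ties are broken by the lower city index so the result is deterministic.
    """
    unvisited = set(range(len(cities)))
    unvisited.remove(start)
    tour = [start]
    while unvisited:
        last = cities[tour[-1]]
        # Closest city not yet visited, measured from the current position.
        nxt = min(unvisited, key=lambda i: (math.dist(last, cities[i]), i))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def tour_length(cities, tour):
    """Length of the closed tour, including the return leg to the first city."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


# Toy coordinates standing in for one cluster of cities (made-up data).
group = [(0, 0), (3, 0), (1, 0), (1, 1)]
order = nearest_neighbour_tour(group)
length = tour_length(group, order)
```

The heuristic runs in quadratic time in the number of cities and gives no optimality guarantee, which is why it is usually compared on both tour quality and computing time.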

Downloads: 3147
126 Comparative Study Using Weka for Red Blood Cells Classification

Authors: Jameela Ali Alkrimi, Hamid A. Jalab, Loay E. George, Abdul Rahim Ahmad, Azizah Suliman, Karim Al-Jashamy

Abstract:

Red blood cells (RBC) are the most common type of blood cell and are the most intensively studied in cell biology. A lack of RBCs is a condition in which the hemoglobin level is lower than normal and is referred to as “anemia”. Abnormalities in RBCs affect the exchange of oxygen. This paper presents a comparative study of various techniques for classifying RBCs as normal or abnormal (anemic) using WEKA. WEKA is an open source tool consisting of different machine learning algorithms for data mining applications. The algorithms tested are Radial Basis Function neural network, Support vector machine, and K-Nearest Neighbors algorithm. Two sets of combined features were utilized for classification of blood cell images. The first set, consisting exclusively of geometrical features, was used to identify whether the tested blood cell has a spherical or non-spherical shape, while the second set, consisting mainly of textural features, was used to recognize the types of the spherical cells. We have provided an evaluation based on applying these classification methods to our RBC image dataset, which was obtained from Serdang Hospital, Malaysia, and measuring the accuracy of the test results. The best achieved classification rates are 97%, 98%, and 79% for Support vector machines, Radial Basis Function neural network, and K-Nearest Neighbors algorithm respectively.

Keywords: K-Nearest Neighbors, Neural Network, Radial Basis Function, Red blood cells, Support vector machine.

Downloads: 2925
125 Artificial Intelligence-Based Detection of Individuals Suffering from Vestibular Disorder

Authors: D. Hişam, S. İkizoğlu

Abstract:

Identifying the problem behind balance disorder is one of the most interesting topics in medical literature. This study has considerably enhanced the development of artificial intelligence (AI) algorithms applying multiple machine learning (ML) models to sensory data on gait collected from humans to classify between normal people and those suffering from Vestibular System (VS) problems. Although AI is widely utilized as a diagnostic tool in medicine, AI models have not been used to perform feature extraction and identify VS disorders through training on raw data. In this study, three ML models, the Random Forest Classifier (RF), Extreme Gradient Boosting (XGB), and K-Nearest Neighbor (KNN), have been trained to detect VS disorder, and the performance comparison of the algorithms has been made using accuracy, recall, precision, and f1-score. With an accuracy of 95.28 %, Random Forest (RF) Classifier was the most accurate model.

Keywords: Vestibular disorder, machine learning, random forest classifier, k-nearest neighbor, extreme gradient boosting.

Downloads: 82
124 FCNN-MR: A Parallel Instance Selection Method Based on Fast Condensed Nearest Neighbor Rule

Authors: Lu Si, Jie Yu, Shasha Li, Jun Ma, Lei Luo, Qingbo Wu, Yongqi Ma, Zhengji Liu

Abstract:

Instance selection (IS) is used to reduce the data size in order to improve the performance of data mining methods. Recently, to process very large data sets, several proposed methods divide the training set into disjoint subsets and apply IS algorithms independently to each subset. In this paper, we analyze the limitations of these methods and give our viewpoint on how to divide and conquer in the IS procedure. Then, based on the fast condensed nearest neighbor (FCNN) rule, we propose an instance selection method for large data sets using the MapReduce framework. Besides ensuring the prediction accuracy and reduction rate, it has two desirable properties: first, it reduces the workload in the aggregation node; second and most important, it produces the same result as the sequential version, which other parallel methods cannot achieve. We evaluate the performance of FCNN-MR on one small data set and two large data sets. The experimental results show that it is effective and practical.
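To give a feel for what a condensed nearest neighbor rule does, here is a minimal sequential sketch of the classic condensed nearest neighbor idea: keep a point only if the points kept so far would misclassify it with a 1-NN rule. This is deliberately the simpler classic variant, not the FCNN rule or the MapReduce method of the paper; the function name and toy data are assumptions.

```python
import math


def condensed_nn(train):
    """Select a subset of `train` that classifies every training point correctly with 1-NN.

    train : non-empty list of (feature_tuple, label) pairs
    Returns the selected (feature_tuple, label) pairs in the order they were kept.
    """
    kept = [0]  # indices of selected instances; seed with the first one
    changed = True
    while changed:
        changed = False
        for i, (x, y) in enumerate(train):
            if i in kept:
                continue
            # Label of the closest already-selected instance (Euclidean distance assumed).
            nearest = min(kept, key=lambda j: math.dist(train[j][0], x))
            if train[nearest][1] != y:
                # The current subset gets this point wrong, so it must be kept.
                kept.append(i)
                changed = True
    return [train[i] for i in kept]


# Two well-separated toy classes (made-up data): one representative each suffices.
toy = [((0,), "a"), ((1,), "a"), ((2,), "a"), ((10,), "b"), ((11,), "b"), ((12,), "b")]
subset = condensed_nn(toy)
```

The loop stops once a full pass adds nothing; since each pass either adds an instance or ends the loop, it always terminates.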

Keywords: Instance selection, data reduction, MapReduce, kNN.

Downloads: 962
123 Image Spam Detection Using Color Features and K-Nearest Neighbor Classification

Authors: T. Kumaresan, S. Sanjushree, C. Palanisamy

Abstract:

Image spam is a kind of email spam where the spam text is embedded in an image. It is a new spamming technique used by spammers to send their messages to large numbers of internet users. Spam email has become a big problem in the lives of internet users, causing time consumption and economic losses. The main objective of this paper is to detect image spam by using histogram properties of an image. Though there are many techniques to automatically detect and avoid this problem, spammers employ new tricks to bypass them; as a result, those techniques are inefficient at detecting spam mails. In this paper we propose a new method to detect image spam. Here the image features are extracted by using the RGB histogram, the HSV histogram, and a combination of both. Based on the optimized image feature set, classification is done by using the k-Nearest Neighbor (k-NN) algorithm. Experimental results show that our method achieves better accuracy. The results show that the combination of RGB and HSV histograms with the k-NN algorithm gives the best accuracy in spam detection.
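The pipeline described above (colour histograms as features, k-NN as classifier) can be sketched as follows. This is only an illustration under stated assumptions: images are plain lists of (r, g, b) pixel tuples, each channel uses 4 bins, distances are Euclidean, and the function names and toy "images" are invented for the example; the paper's bin counts, feature optimisation step, and data set are not given in the abstract.

```python
import colorsys
import math
from collections import Counter


def channel_histogram(values, bins):
    """Normalised histogram of values that lie in the range [0, 1]."""
    hist = [0] * bins
    for v in values:
        hist[min(int(v * bins), bins - 1)] += 1
    return [h / len(values) for h in hist]


def colour_features(pixels, bins=4):
    """Concatenate per-channel RGB and HSV histograms for a list of (r, g, b) pixels (0-255)."""
    rgb = [(r / 255, g / 255, b / 255) for r, g, b in pixels]
    hsv = [colorsys.rgb_to_hsv(*p) for p in rgb]
    features = []
    for colour_space in (rgb, hsv):
        for channel in range(3):
            features += channel_histogram([p[channel] for p in colour_space], bins)
    return features


def knn_predict(train, query, k=3):
    """Majority vote among the k training feature vectors closest to `query`."""
    neighbours = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]


# Made-up four-pixel "images": reddish ones labelled spam, bluish ones labelled ham.
labelled = [
    ([(250, 10, 10)] * 4, "spam"),
    ([(240, 20, 5)] * 4, "spam"),
    ([(10, 10, 250)] * 4, "ham"),
    ([(5, 30, 240)] * 4, "ham"),
]
train = [(colour_features(pixels), label) for pixels, label in labelled]
prediction = knn_predict(train, colour_features([(245, 15, 15)] * 4), k=3)
```

Dropping either the RGB or the HSV half of the feature vector gives the single-histogram variants that the abstract compares against the combined one.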

Keywords: File Type, HSV Histogram, k-NN, RGB Histogram, Spam Detection.

Downloads: 2086
122 Methods of Geodesic Distance in Two-Dimensional Face Recognition

Authors: Rachid Ahdid, Said Safi, Bouzid Manaut

Abstract:

In this paper, we present a comparative study of three methods for 2D face recognition systems: Iso-Geodesic Curves (IGC), Geodesic Distance (GD) and Geodesic-Intensity Histogram (GIH). These approaches are based on computing the geodesic distance between points of the facial surface and between facial curves. In this study we represent the gray-level image as a 2D surface in a 3D space, with the third coordinate proportional to the intensity values of the pixels. In the classification step, we use Neural Networks (NN), K-Nearest Neighbor (KNN) and Support Vector Machines (SVM). The images used in our experiments are from two well-known databases of face images, ORL and YaleB. The ORL database was used to evaluate the performance of the methods under conditions where the pose and sample size are varied, and the YaleB database was used to examine the performance of the systems when the facial expressions and lighting are varied.

Keywords: 2D face recognition, Geodesic distance, Iso-Geodesic Curves, Geodesic-Intensity Histogram, facial surface, Neural Networks, K-Nearest Neighbor, Support Vector Machines.

Downloads: 1772
121 Study of Compaction in Hot-Mix Asphalt Using Computer Simulations

Authors: Kasthurirangan Gopalakrishnan, Naga Shashidhar, Xiaoxiong Zhong

Abstract:

During the process of compaction in Hot-Mix Asphalt (HMA) mixtures, the distance between aggregate particles decreases as they come together and eliminate air-voids. By measuring the inter-particle distances in a cut-section of a HMA sample the degree of compaction can be estimated. For this, a calibration curve is generated by computer simulation technique when the gradation and asphalt content of the HMA mixture are known. A two-dimensional cross section of HMA specimen was simulated using the mixture design information (gradation, asphalt content and air-void content). Nearest neighbor distance methods such as Delaunay triangulation were used to study the changes in inter-particle distance and area distribution during the process of compaction in HMA. Such computer simulations would enable making several hundreds of repetitions in a short period of time without the necessity to compact and analyze laboratory specimens in order to obtain good statistics on the parameters defined. The distributions for the statistical parameters based on computer simulations showed similar trends as those of laboratory specimens.

Keywords: Computer simulations, Hot-Mix Asphalt (HMA), inter-particle distance, image analysis, nearest neighbor

Downloads: 1841
120 Massive Lesions Classification using Features based on Morphological Lesion Differences

Authors: U. Bottigli, D. Cascio, F. Fauci, B. Golosio, R. Magro, G.L. Masala, P. Oliva, G. Raso, S. Stumbo

Abstract:

The purpose of this work is the development of an automatic classification system which could be useful for radiologists in the investigation of breast cancer. The software has been designed in the framework of the MAGIC-5 collaboration. In the automatic classification system, the suspicious regions with a high probability of including a lesion are extracted from the image as regions of interest (ROIs). Each ROI is characterized by some features based on morphological lesion differences. Classifiers such as a Feed Forward Neural Network, a K-Nearest Neighbours classifier and a Support Vector Machine are used to distinguish the pathological records from the healthy ones. The results obtained in terms of sensitivity (percentage of pathological ROIs correctly classified) and specificity (percentage of non-pathological ROIs correctly classified) will be presented through the Receiver Operating Characteristic (ROC) curve. In particular, the best performance is 88% ± 1 of area under the ROC curve, obtained with the Feed Forward Neural Network.

Keywords: Neural Networks, K-Nearest Neighbours, Support Vector Machine, Computer Aided Diagnosis.

Downloads: 1330
119 Reducing SAGE Data Using Genetic Algorithms

Authors: Cheng-Hong Yang, Tsung-Mu Shih, Li-Yeh Chuang

Abstract:

Serial Analysis of Gene Expression is a powerful quantification technique for generating cell or tissue gene expression data. The profile of the gene expression of cell or tissue in several different states is difficult for biologists to analyze because of the large number of genes typically involved. However, feature selection in machine learning can successfully reduce this problem. The method allows reducing the features (genes) in specific SAGE data, and determines only relevant genes. In this study, we used a genetic algorithm to implement feature selection, and evaluate the classification accuracy of the selected features with the K-nearest neighbor method. In order to validate the proposed method, we used two SAGE data sets for testing. The results of this study conclusively prove that the number of features of the original SAGE data set can be significantly reduced and higher classification accuracy can be achieved.

Keywords: Serial Analysis of Gene Expression, Feature selection, Genetic Algorithm, K-nearest neighbor method.

Downloads: 1564
118 A Learning-Community Recommendation Approach for Web-Based Cooperative Learning

Authors: Jian-Wei Li, Yao-Tien Wang, Yi-Chun Chang

Abstract:

Cooperative learning has been defined as learners working together as a team to solve a problem to complete a task or to accomplish a common goal, which emphasizes the importance of interactions among members to promote the whole learning performance. With the popularity of society networks, cooperative learning is no longer limited to traditional classroom teaching activities. Since society networks facilitate to organize online learners, to establish common shared visions, and to advance learning interaction, the online community and online learning community have triggered the establishment of web-based societies. Numerous research literatures have indicated that the collaborative learning community is a critical issue to enhance learning performance. Hence, this paper proposes a learning community recommendation approach to facilitate that a learner joins the appropriate learning communities, which is based on k-nearest neighbor (kNN) classification. To demonstrate the viability of the proposed approach, the proposed approach is implemented for 117 students to recommend learning communities. The experimental results indicate that the proposed approach can effectively recommend appropriate learning communities for learners.

Keywords: k-nearest neighbor classification, learning community, Cooperative/Collaborative Learning and Environments.

Downloads: 1858
117 Superior Performances of the Neural Network on the Masses Lesions Classification through Morphological Lesion Differences

Authors: U. Bottigli, R. Chiarucci, B. Golosio, G.L. Masala, P. Oliva, S. Stumbo, D. Cascio, F. Fauci, M. Glorioso, M. Iacomi, R. Magro, G. Raso

Abstract:

The purpose of this work is to develop an automatic classification system that could be useful for radiologists in breast cancer investigation. The software has been designed in the framework of the MAGIC-5 collaboration. In an automatic classification system, the suspicious regions with a high probability of including a lesion are extracted from the image as regions of interest (ROIs). Each ROI is characterized by some features based generally on morphological lesion differences. A study of the feature space representation is made, and some classifiers are tested to distinguish the pathological regions from the healthy ones. The results, provided in terms of sensitivity and specificity, will be presented through ROC (Receiver Operating Characteristic) curves. In particular, the best performances are obtained with the Neural Networks in comparison with the K-Nearest Neighbours and the Support Vector Machine: the Radial Basis Function supplies the best results with 0.89 ± 0.01 of area under the ROC curve, but similar results are obtained with the Probabilistic Neural Network and a Multi Layer Perceptron.

Keywords: Neural Networks, K-Nearest Neighbours, Support Vector Machine, Computer Aided Detection

Downloads: 1566
116 Comparison of Different k-NN Models for Speed Prediction in an Urban Traffic Network

Authors: Seyoung Kim, Jeongmin Kim, Kwang Ryel Ryu

Abstract:

Consider a database that records average traffic speeds measured at five-minute intervals for all the links in the traffic network of a metropolitan city. While learning models from this data that can predict future traffic speed would be beneficial for applications such as car navigation systems, building predictive models for every link becomes a nontrivial job if the number of links in a given network is huge. An advantage of adopting k-nearest neighbor (k-NN) as the predictive model is that it does not require any explicit model building. However, k-NN takes a long time to make a prediction because it needs to search for the k nearest neighbors in the database at prediction time. In this paper, we investigate how much we can speed up k-NN in making traffic speed predictions by reducing the amount of data to be searched without a significant sacrifice of prediction accuracy. The rationale behind this is that we had better look at only the recent data because traffic patterns not only repeat daily or weekly but also change over time. In our experiments, we build several different k-NN models employing different sets of features, which are the current and past traffic speeds of the target link and of the neighbor links upstream and downstream. The performance of these models is compared by measuring the average prediction accuracy and the average time taken to make a prediction using various amounts of data.
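The idea of trading search cost against accuracy by looking only at recent data can be sketched as a k-NN regressor with an optional recency window. The abstract does not specify the features, distance, or averaging scheme, so the function name `knn_speed_forecast`, the unweighted mean, the Euclidean distance, and the toy records below are assumptions.

```python
import math


def knn_speed_forecast(history, query, k=3, window=None):
    """Predict the next speed as the mean target of the k most similar past records.

    history : non-empty list of (feature_tuple, next_speed), ordered oldest to newest
    query   : feature tuple for the current situation (e.g. recent speeds on a link)
    window  : if given, only the `window` most recent records are searched
    """
    recent = history if window is None else history[-window:]
    # Linear scan over the (possibly truncated) history; cost grows with len(recent).
    neighbours = sorted(recent, key=lambda row: math.dist(row[0], query))[:k]
    return sum(speed for _, speed in neighbours) / len(neighbours)


# Made-up records: (speed two intervals ago, speed one interval ago) -> next speed, in km/h.
records = [
    ((50, 52), 55),
    ((30, 31), 28),
    ((49, 51), 54),
    ((31, 29), 30),
    ((48, 50), 52),
]
full_search = knn_speed_forecast(records, (49, 50), k=2)
recent_only = knn_speed_forecast(records, (49, 50), k=2, window=2)
```

Shrinking `window` makes each prediction cheaper but can force the model to use poor matches, which is the accuracy and time trade-off the experiments measure.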

Keywords: Big data, k-NN, machine learning, traffic speed prediction.

Downloads: 1326
115 Predicting Application Layer DDoS Attacks Using Machine Learning Algorithms

Authors: S. Umarani, D. Sharmila

Abstract:

A Distributed Denial of Service (DDoS) attack is a major threat to cyber security. It originates from the network layer or the application layer of compromised/attacker systems which are connected to the network. The impact of this attack ranges from the simple inconvenience of being unable to use a particular service to major failures at the targeted server. When there is heavy traffic flow to a target server, it is necessary to distinguish legitimate access from attacks. In this paper, a novel method is proposed to detect DDoS attacks from the traces of traffic flow. An access matrix is created from the traces. As the access matrix is multidimensional, Principal Component Analysis (PCA) is used to reduce the attributes used for detection. Two classifiers, Naive Bayes and K-Nearest neighborhood, are used to classify the traffic as normal or abnormal. The performance of the classifiers with PCA-selected attributes and with the actual attributes of the access matrix is compared by the detection rate and False Positive Rate (FPR).

Keywords: Distributed Denial of Service (DDoS) attack, Application layer DDoS, DDoS Detection, K-Nearest neighborhood classifier, Naive Bayes Classifier, Principal Component Analysis.

Downloads: 5214
114 Performance Analysis of Genetic Algorithm with kNN and SVM for Feature Selection in Tumor Classification

Authors: C. Gunavathi, K. Premalatha

Abstract:

Tumor classification is a key area of research in the field of bioinformatics. Microarray technology is commonly used in the study of disease diagnosis using gene expression levels. The main drawback of gene expression data is that it contains thousands of genes and very few samples. Feature selection methods are used to select the informative genes from the microarray. These methods considerably improve the classification accuracy. In the proposed method, Genetic Algorithm (GA) is used for effective feature selection. Informative genes are identified based on the T-Statistics, Signal-to-Noise Ratio (SNR) and F-Test values. The initial candidate solutions of GA are obtained from the top-m informative genes. The classification accuracy of the k-Nearest Neighbor (kNN) method is used as the fitness function for GA. In this work, kNN and Support Vector Machine (SVM) are used as the classifiers. The experimental results show that the proposed work is suitable for effective feature selection. With the help of the selected genes, the GA-kNN method achieves 100% accuracy in 4 datasets and the GA-SVM method achieves it in 5 out of 10 datasets. GA with kNN and SVM is demonstrated to be an accurate method for microarray-based tumor classification.

Keywords: F-Test, Gene Expression, Genetic Algorithm, k-Nearest-Neighbor, Microarray, Signal-to-Noise Ratio, Support Vector Machine, T-statistics, Tumor Classification.

Downloads: 4481
113 Implementing a Visual Servoing System for Robot Controlling

Authors: Maryam Vafadar, Alireza Behrad, Saeed Akbari

Abstract:

Nowadays, with the emergence of new applications like robot control in image processing, artificial vision for visual servoing is a rapidly growing discipline, and human-machine interaction plays a significant role in controlling the robot. This paper presents a new algorithm based on spatio-temporal volumes for visual servoing that aims to control robots. In this algorithm, after applying the necessary pre-processing to video frames, a spatio-temporal volume is constructed for each gesture and a feature vector is extracted. These volumes are then analyzed for matching in two consecutive stages. For hand gesture recognition and classification we tested different classifiers including k-Nearest neighbor, learning vector quantization and back propagation neural networks. We tested the proposed algorithm with the collected data set, and the results showed a correct gesture recognition rate of 99.58 percent. We also tested the algorithm with noisy images, where it showed a correct recognition rate of 97.92 percent.

Keywords: Back propagation neural network, Feature vector, Hand gesture recognition, k-Nearest Neighbor, Learning vector quantization neural network, Robot control, Spatio-temporal volume, Visual servoing

Downloads: 1626
112 Pre-Operative Tool for Facial-Post-Surgical Estimation and Detection

Authors: Ayat E. Ali, Christeen R. Aziz, Merna A. Helmy, Mohammed M. Malek, Sherif H. El-Gohary

Abstract:

Goal: The purpose of the project was to predict the outcome of plastic surgery from patients' pre-operative images and to show this prediction on a screen, so that the current appearance can be compared with the appearance after surgery. Methods: To this aim, we implemented software that uses data from the internet on facial skin diseases, skin burns, and pre- and post-operative images of plastic surgeries; the post-surgical prediction is then made using K-nearest neighbor (KNN). We designed and fabricated a smart mirror divided into two parts, a screen and a reflective mirror, so that the patient's pre- and post-operative appearance are shown at the same time. Results: We worked on skin conditions such as vitiligo, skin burns and wrinkles. We classified the three degrees of burns using a KNN classifier with an accuracy of 60%. We also succeeded in segmenting the area of vitiligo. Our future work will include working on more skin diseases, classifying them and predicting the appearance after surgery. We will also look more deeply into facial deformities and plastic surgeries such as nose reshaping and face slimming. Conclusion: Our project will give a prediction that relates strongly to the real appearance after surgery and will reduce differences in diagnosis among doctors. Significance: The mirror may have broad societal appeal, as it will narrow the gap between patient satisfaction and medical standards.

Keywords: K-nearest neighbor, face detection, vitiligo, bone deformity.

Downloads: 649
111 Visualization and Indexing of Spectral Databases

Authors: Tibor Kulcsar, Gabor Sarossy, Gabor Bereznai, Robert Auer, Janos Abonyi

Abstract:

On-line (near infrared) spectroscopy is widely used to support the operation of complex process systems. Information extracted from a spectral database can be used to estimate unmeasured product properties and monitor the operation of the process. These techniques are based on looking for similar spectra using nearest neighbor algorithms and distance-based search methods. Searching for nearest neighbors in the spectral space is computationally demanding: the cost increases with the number of points in the discrete spectrum and the number of samples in the database. To reduce the calculation time, some kind of indexing can be used. The main idea presented in this paper is to combine indexing and visualization techniques to reduce the computational requirements of estimation algorithms by providing a two-dimensional indexing that can also be used to visualize the structure of the spectral database. This 2D visualization of the spectral database not only supports the application of distance- and similarity-based techniques but also enables the use of advanced clustering and prediction algorithms based on the Delaunay tessellation of the mapped spectral space. This means that the prediction need not use the high-dimensional space but can be based on the mapped space as well. The results illustrate that the proposed method is able to segment (cluster) spectral databases and detect outliers that are not suitable for instance-based learning algorithms.
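
To make the dependence of search cost on spectrum length and database size concrete, here is a minimal brute-force sketch of estimating an unmeasured property from the most similar stored spectra. It is not the indexing method of the paper; the function name `knn_estimate`, the Euclidean distance and the plain averaging of neighbour values are assumptions chosen for illustration.

```python
import math


def knn_estimate(database, query, k=3):
    """Estimate a property as the mean over the k stored spectra closest to `query`.

    `database` is a list of (spectrum, property_value) pairs; every spectrum
    has the same number of points as `query`.
    """
    # Brute force: one distance over d spectral points for each of n samples,
    # so the work grows with both d and n.
    ranked = sorted(database, key=lambda item: math.dist(item[0], query))
    neighbours = ranked[:k]
    return sum(value for _, value in neighbours) / len(neighbours)


# Hypothetical three-point "spectra" with a measured property attached.
database = [
    ([0.0, 0.0, 0.0], 1.0),
    ([0.0, 0.0, 1.0], 2.0),
    ([5.0, 5.0, 5.0], 10.0),
    ([0.0, 1.0, 0.0], 3.0),
]
print(knn_estimate(database, [0.0, 0.0, 0.0], k=3))
```

An index built on a low-dimensional mapping, as the abstract proposes, would aim to avoid computing the distance to every stored spectrum.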

Keywords: indexing high dimensional databases, dimensional reduction, clustering, similarity, k-nn algorithm.

Downloads: 1719
110 Low Resolution Face Recognition Using Mixture of Experts

Authors: Fatemeh Behjati Ardakani, Fatemeh Khademian, Abbas Nowzari Dalini, Reza Ebrahimpour

Abstract:

Human activity is a major concern in a wide variety of applications, such as video surveillance, human-computer interfaces and face image database management. Detecting and recognizing faces is a crucial step in these applications. Furthermore, major advancements and initiatives in security applications in recent years have propelled face recognition technology into the spotlight. The performance of existing face recognition systems declines significantly if the resolution of the face image falls below a certain level. This is especially critical in surveillance imagery where often, for many reasons, only low-resolution video of faces is available. If these low-resolution images are passed to a face recognition system, the performance is usually unacceptable. Hence, resolution plays a key role in face recognition systems. In this paper we introduce a new low-resolution face recognition system based on a mixture of expert neural networks. To produce the low-resolution input images, we down-sampled the 48 × 48 ORL images to 12 × 12 using the nearest neighbor interpolation method; applying the bicubic interpolation method afterwards yields enhanced images, which are given to the Principal Component Analysis feature extractor. Comparison with some of the most closely related methods indicates that the proposed model yields an excellent recognition rate in low-resolution face recognition: 100% for the training set and 96.5% for the test set.
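
The down-sampling step mentioned in this abstract can be sketched as follows. This is an illustrative nearest neighbor resize on a plain 2D list; the pixel-centre mapping used here is one common convention and is an assumption, as the abstract does not say which convention or library the authors used.

```python
def downsample_nearest(image, out_h, out_w):
    """Resize a 2D list of pixels by nearest neighbor interpolation."""
    in_h, in_w = len(image), len(image[0])
    out = []
    for r in range(out_h):
        # Map the centre of each output pixel back to a source row.
        src_r = min(in_h - 1, int((r + 0.5) * in_h / out_h))
        row = []
        for c in range(out_w):
            src_c = min(in_w - 1, int((c + 0.5) * in_w / out_w))
            row.append(image[src_r][src_c])
        out.append(row)
    return out


# A synthetic 48 x 48 image whose pixel value encodes its position.
image = [[r * 48 + c for c in range(48)] for r in range(48)]
small = downsample_nearest(image, 12, 12)
print(len(small), len(small[0]))
```

Each output pixel simply copies one source pixel, so no new intensity values are created; this is why a smoother method such as bicubic interpolation is often applied afterwards.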

Keywords: Low resolution face recognition, Multilayered neural network, Mixture of experts neural network, Principal component analysis, Bicubic interpolation, Nearest neighbor interpolation.

Downloads: 1674
109 Feature Reduction of Nearest Neighbor Classifiers using Genetic Algorithm

Authors: M. Analoui, M. Fadavi Amiri

Abstract:

The design of a pattern classifier includes an attempt to select, among a set of possible features, a minimum subset of weakly correlated features that better discriminate the pattern classes. This is usually a difficult task in practice, normally requiring the application of heuristic knowledge about the specific problem domain. The selection and quality of the features representing each pattern have a considerable bearing on the success of subsequent pattern classification. Feature extraction is the process of deriving new features from the original features in order to reduce the cost of feature measurement, increase classifier efficiency, and allow higher classification accuracy. Many current feature extraction techniques involve linear transformations of the original pattern vectors to new vectors of lower dimensionality. While this is useful for data visualization and increasing classification efficiency, it does not necessarily reduce the number of features that must be measured since each new feature may be a linear combination of all of the features in the original pattern vector. In this paper a new approach is presented to feature extraction in which feature selection, feature extraction, and classifier training are performed simultaneously using a genetic algorithm. In this approach each feature value is first normalized by a linear equation, then scaled by the associated weight prior to training, testing, and classification. A knn classifier is used to evaluate each set of feature weights. The genetic algorithm optimizes a vector of feature weights, which are used to scale the individual features in the original pattern vectors in either a linear or a nonlinear fashion. By this approach, the number of features used in classifying can be finely reduced.
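
The normalize-then-weight step described in this abstract can be sketched as below. Min-max scaling as the linear normalization, a 1-nearest-neighbor evaluator and the function names are assumptions for illustration; the genetic algorithm that searches for the weight vector is not shown.

```python
import math


def normalize_and_weight(X, weights):
    """Min-max normalize each feature to [0, 1], then multiply by its weight."""
    n_features = len(X[0])
    lows = [min(row[j] for row in X) for j in range(n_features)]
    highs = [max(row[j] for row in X) for j in range(n_features)]
    scaled = []
    for row in X:
        new_row = []
        for j in range(n_features):
            span = highs[j] - lows[j]
            unit = (row[j] - lows[j]) / span if span else 0.0
            new_row.append(weights[j] * unit)
        scaled.append(new_row)
    return scaled


def active_features(weights, tol=1e-9):
    """Indices of features that still need to be measured (non-zero weight)."""
    return [j for j, w in enumerate(weights) if abs(w) > tol]


def loo_1nn_accuracy(X, y):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier."""
    correct = 0
    for i, query in enumerate(X):
        others = (k for k in range(len(X)) if k != i)
        j = min(others, key=lambda k: math.dist(X[k], query))
        correct += y[j] == y[i]
    return correct / len(X)


# Feature 0 carries the class; feature 1 has a large, uninformative range.
X = [[1.0, 100.0], [2.0, 900.0], [9.0, 150.0], [10.0, 950.0]]
y = [0, 0, 1, 1]
weights = [1.0, 0.0]  # hypothetical weight vector a GA might converge to
print(loo_1nn_accuracy(X, y), loo_1nn_accuracy(normalize_and_weight(X, weights), y))
print(active_features(weights))
```

A weight of zero removes a feature from the distance computation entirely, which is how weight optimization can double as feature selection.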

Keywords: Feature reduction, genetic algorithm, pattern classification, nearest neighbor rule classifiers (k-NNR).

Downloads: 1723
108 Accelerating GLA with an M-Tree

Authors: Olli Luoma, Johannes Tuikkala, Olli Nevalainen

Abstract:

In this paper, we propose a novel improvement for the generalized Lloyd Algorithm (GLA). Our algorithm makes use of an M-tree index built on the codebook which makes it possible to reduce the number of distance computations when the nearest code words are searched. Our method does not impose the use of any specific distance function, but works with any metric distance, making it more general than many other fast GLA variants. Finally, we present the positive results of our performance experiments.
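
To illustrate how a metric-only method can skip distance computations during nearest code word search, here is a small sketch that prunes candidates with the triangle inequality. It is not an M-tree and not the authors' algorithm; it is a simpler, swapped-in technique that relies on the same property of metric distances. The function name and the precomputed `code_dist` table are assumptions.

```python
import math


def nearest_codeword(x, codebook, code_dist, metric=math.dist):
    """Return (index, distance, evaluations) of the code word nearest to x.

    code_dist[i][j] holds the precomputed metric distance between code words.
    """
    best, best_d = 0, metric(x, codebook[0])
    evaluations = 1
    for j in range(1, len(codebook)):
        # Triangle inequality: d(x, c_j) >= d(c_best, c_j) - d(x, c_best).
        # If d(c_best, c_j) >= 2 * d(x, c_best), c_j cannot be closer.
        if code_dist[best][j] >= 2 * best_d:
            continue
        d = metric(x, codebook[j])
        evaluations += 1
        if d < best_d:
            best, best_d = j, d
    return best, best_d, evaluations


codebook = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
code_dist = [[math.dist(a, b) for b in codebook] for a in codebook]
print(nearest_codeword((1.0, 1.0), codebook, code_dist))
```

Any function satisfying the metric axioms can be passed as `metric`, which mirrors the abstract's point that the approach is not tied to a specific distance function.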

Keywords: Clustering, GLA, M-Tree, Vector Quantization.

Downloads: 1458
107 Improved Tropical Wood Species Recognition System based on Multi-feature Extractor and Classifier

Authors: Marzuki Khalid, Rubiyah Yusof, Anis Salwa Mohd Khairuddin

Abstract:

An automated wood recognition system is designed to classify tropical wood species. The wood features are extracted using two feature extractors: the Basic Grey Level Aura Matrix (BGLAM) technique and the statistical properties of pores distribution (SPPD) technique. Due to the nonlinearity of the separation boundaries between tropical wood species, a pre-classification stage is proposed which consists of K-means clustering and kernel discriminant analysis (KDA). Finally, a Linear Discriminant Analysis (LDA) classifier and a K-Nearest Neighbour (KNN) classifier are implemented for comparison purposes. The study compares the system with and without pre-classification using the KNN and LDA classifiers. The results show that the inclusion of the pre-classification stage improved the accuracy of both the LDA and KNN classifiers by more than 12%.

Keywords: Tropical wood species, nonlinear data, feature extractors, classification

Downloads: 1934
106 Spatial Mapping of Dengue Incidence: A Case Study in Hulu Langat District, Selangor, Malaysia

Authors: Er, A. C., Rosli, M. H., Asmahani A., Mohamad Naim M. R., Harsuzilawati M.

Abstract:

Dengue is a mosquito-borne infection whose incidence has risen to an alarming rate in recent decades. It is found in tropical and sub-tropical climates. In Malaysia, dengue has been declared one of the national health threats to the public. This study aimed to map the spatial distribution of dengue cases in the district of Hulu Langat, Selangor, using a combination of Geographic Information System (GIS) and spatial statistics tools. Data related to dengue were gathered from various government health agencies. The locations of dengue cases were geocoded using a handheld GPS (Trimble Juno SB). A total of 197 dengue cases occurring in 2003 were used in this study. The data were then aggregated to sub-district level and converted into GIS format. The study also used population (demographic) data as well as the boundary of Hulu Langat. To assess the spatial distribution of dengue cases, three spatial statistics methods (Moran's I, average nearest neighbor (ANN) and kernel density estimation) were applied together with spatial analysis in the GIS environment. These three methods were used to analyze the spatial distribution and average distance of dengue incidence and to locate the hot spots of dengue cases. The results indicated that the dengue cases were clustered (p < 0.01) when analyzed using Moran's I, with a z-score of 5.03. The results of the ANN analysis showed that the average nearest neighbor ratio was 0.518755, which is less than 1 (p < 0.0001). From this result, the pattern of dengue cases in Hulu Langat district can be expected to be clustered. The z-score for dengue incidence within the district was -13.0525 (p < 0.0001). It was also found that significant spatial autocorrelation of dengue incidence occurs at an average distance of 380.81 meters (p < 0.0001). Several locations, especially residential areas, were also identified as hot spots of dengue cases in the district.
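
For readers unfamiliar with the average nearest neighbor (ANN) ratio reported in this abstract, the sketch below computes it with the standard Clark-Evans formulation used by common GIS tools: the observed mean nearest neighbor distance divided by the distance expected under complete spatial randomness. The function name, the toy coordinates and the study-area values are assumptions; the abstract does not state which software or area figure the authors used.

```python
import math


def average_nearest_neighbor(points, area):
    """Return (ratio, z_score) of the average nearest neighbor statistic.

    ratio < 1 suggests clustering, ratio > 1 suggests dispersion.
    """
    n = len(points)
    # Observed mean distance from each point to its nearest other point.
    observed = sum(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / n
    # Expected mean distance for a random pattern of n points in `area`.
    expected = 0.5 / math.sqrt(n / area)
    std_error = 0.26136 / math.sqrt(n * n / area)
    ratio = observed / expected
    z_score = (observed - expected) / std_error
    return ratio, z_score


# Four hypothetical case locations packed into one corner of a large area.
cases = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
print(average_nearest_neighbor(cases, area=10000.0))
```

A ratio below 1 with a strongly negative z-score, as in the toy call above, is the pattern the abstract interprets as clustering.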

Keywords: Dengue, geographic information system (GIS), spatial analysis, spatial statistics

Downloads: 5288
105 Lung Cancer Detection and Multi Level Classification Using Discrete Wavelet Transform Approach

Authors: V. Veeraprathap, G. S. Harish, G. Narendra Kumar

Abstract:

Uncontrolled growth of abnormal cells in the lung forms a tumor that can be either benign (non-cancerous) or malignant (cancerous). Patients with Lung Cancer (LC) have an average life expectancy of five years provided that diagnosis, detection and prediction are carried out, which reduces the need for treatment options that carry the risk of invasive surgery and increases the survival rate. Computed Tomography (CT), Positron Emission Tomography (PET), and Magnetic Resonance Imaging (MRI) are commonly used for earlier detection of cancer. A Gaussian filter together with a median filter is used for smoothing and noise removal, and Histogram Equalization (HE) is used for image enhancement, giving the best results without inviting further opinions. The lung cavities are extracted and the background outside the two lung cavities is completely removed, with the right and left lungs segmented separately. Region properties (area, perimeter, diameter, centroid and eccentricity) are measured for the segmented tumor image, while texture is characterized by Gray-Level Co-occurrence Matrix (GLCM) functions; feature extraction provides the Region of Interest (ROI) given as input to the classifier. Two levels of classification are employed: K-Nearest Neighbor (KNN) is used to determine the patient's condition as normal or abnormal, while Artificial Neural Networks (ANN) are used to identify the cancer stage. The Discrete Wavelet Transform (DWT) algorithm is used for the main feature extraction, leading to the best efficiency. The developed technology shows encouraging results for real-time information and on-line detection for future research.
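
As a small illustration of the texture characterization mentioned in this abstract, the sketch below builds a Gray-Level Co-occurrence Matrix for one pixel offset and derives the contrast feature from it. The offset, the number of gray levels and the choice of contrast as the example feature are assumptions; the abstract does not list which GLCM functions were used.

```python
def glcm(image, levels, dr=0, dc=1):
    """Count co-occurrences of gray levels (i, j) at pixel offset (dr, dc)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """GLCM contrast: sum of (i - j)^2 weighted by normalized co-occurrence."""
    total = sum(map(sum, matrix))
    if total == 0:
        return 0.0
    return sum(
        (i - j) ** 2 * count / total
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )


# A tiny two-level image; pairs are taken with the right-hand neighbour.
patch = [[0, 0, 1],
         [1, 1, 0]]
m = glcm(patch, levels=2)
print(m, contrast(m))
```

Features such as contrast computed from the matrix could then be combined with the region properties to form the classifier input.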

Keywords: ANN, DWT, GLCM, KNN, ROI, artificial neural networks, discrete wavelet transform, gray-level co-occurrence matrix, k-nearest neighbor, region of interest.

Downloads: 899