Search results for: approximate nearest neighbor search
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2567

Search results for: approximate nearest neighbor search

2567 Nearest Neighbor Investigate Using R+ Tree

Authors: Rutuja Desai

Abstract:

Search engine is fundamentally a framework used to search the data which is pertinent to the client via WWW. Looking close-by spot identified with the keywords is an imperative concept in developing web advances. For such kind of searching, extent pursuit or closest neighbor is utilized. In range search the forecast is made whether the objects meet to query object. Nearest neighbor is the forecast of the focuses close to the query set by the client. Here, the nearest neighbor methodology is utilized where Data recovery R+ tree is utilized rather than IR2 tree. The disadvantages of IR2 tree is: The false hit number can surpass the limit and the mark in Information Retrieval R-tree must have Voice over IP bit for each one of a kind word in W set is recouped by Data recovery R+ tree. The inquiry is fundamentally subordinate upon the key words and the geometric directions.

Keywords: information retrieval, nearest neighbor search, keyword search, R+ tree

Procedia PDF Downloads 288
2566 SC-LSH: An Efficient Indexing Method for Approximate Similarity Search in High Dimensional Space

Authors: Sanaa Chafik, Imane Daoudi, Mounim A. El Yacoubi, Hamid El Ouardi

Abstract:

Locality Sensitive Hashing (LSH) is one of the most promising techniques for solving nearest neighbour search problem in high dimensional space. Euclidean LSH is the most popular variation of LSH that has been successfully applied in many multimedia applications. However, the Euclidean LSH presents limitations that affect structure and query performances. The main limitation of the Euclidean LSH is the large memory consumption. In order to achieve a good accuracy, a large number of hash tables is required. In this paper, we propose a new hashing algorithm to overcome the storage space problem and improve query time, while keeping a good accuracy as similar to that achieved by the original Euclidean LSH. The Experimental results on a real large-scale dataset show that the proposed approach achieves good performances and consumes less memory than the Euclidean LSH.

Keywords: approximate nearest neighbor search, content based image retrieval (CBIR), curse of dimensionality, locality sensitive hashing, multidimensional indexing, scalability

Procedia PDF Downloads 321
2565 Hybrid Fuzzy Weighted K-Nearest Neighbor to Predict Hospital Readmission for Diabetic Patients

Authors: Soha A. Bahanshal, Byung G. Kim

Abstract:

Identification of patients at high risk for hospital readmission is of crucial importance for quality health care and cost reduction. Predicting hospital readmissions among diabetic patients has been of great interest to many researchers and health decision makers. We build a prediction model to predict hospital readmission for diabetic patients within 30 days of discharge. The core of the prediction model is a modified k Nearest Neighbor called Hybrid Fuzzy Weighted k Nearest Neighbor algorithm. The prediction is performed on a patient dataset which consists of more than 70,000 patients with 50 attributes. We applied data preprocessing using different techniques in order to handle data imbalance and to fuzzify the data to suit the prediction algorithm. The model so far achieved classification accuracy of 80% compared to other models that only use k Nearest Neighbor.

Keywords: machine learning, prediction, classification, hybrid fuzzy weighted k-nearest neighbor, diabetic hospital readmission

Procedia PDF Downloads 185
2564 Urban Land Cover from GF-2 Satellite Images Using Object Based and Neural Network Classifications

Authors: Lamyaa Gamal El-Deen Taha, Ashraf Sharawi

Abstract:

China launched satellite GF-2 in 2014. This study deals with comparing nearest neighbor object-based classification and neural network classification methods for classification of the fused GF-2 image. Firstly, rectification of GF-2 image was performed. Secondly, a comparison between nearest neighbor object-based classification and neural network classification for classification of fused GF-2 was performed. Thirdly, the overall accuracy of classification and kappa index were calculated. Results indicate that nearest neighbor object-based classification is better than neural network classification for urban mapping.

Keywords: GF-2 images, feature extraction-rectification, nearest neighbour object based classification, segmentation algorithms, neural network classification, multilayer perceptron

Procedia PDF Downloads 388
2563 Identity Verification Using k-NN Classifiers and Autistic Genetic Data

Authors: Fuad M. Alkoot

Abstract:

DNA data have been used in forensics for decades. However, current research looks at using the DNA as a biometric identity verification modality. The goal is to improve the speed of identification. We aim at using gene data that was initially used for autism detection to find if and how accurate is this data for identification applications. Mainly our goal is to find if our data preprocessing technique yields data useful as a biometric identification tool. We experiment with using the nearest neighbor classifier to identify subjects. Results show that optimal classification rate is achieved when the test set is corrupted by normally distributed noise with zero mean and standard deviation of 1. The classification rate is close to optimal at higher noise standard deviation reaching 3. This shows that the data can be used for identity verification with high accuracy using a simple classifier such as the k-nearest neighbor (k-NN). 

Keywords: biometrics, genetic data, identity verification, k nearest neighbor

Procedia PDF Downloads 255
2562 Performance Comparison of Different Regression Methods for a Polymerization Process with Adaptive Sampling

Authors: Florin Leon, Silvia Curteanu

Abstract:

Developing complete mechanistic models for polymerization reactors is not easy, because complex reactions occur simultaneously; there is a large number of kinetic parameters involved and sometimes the chemical and physical phenomena for mixtures involving polymers are poorly understood. To overcome these difficulties, empirical models based on sampled data can be used instead, namely regression methods typical of machine learning field. They have the ability to learn the trends of a process without any knowledge about its particular physical and chemical laws. Therefore, they are useful for modeling complex processes, such as the free radical polymerization of methyl methacrylate achieved in a batch bulk process. The goal is to generate accurate predictions of monomer conversion, numerical average molecular weight and gravimetrical average molecular weight. This process is associated with non-linear gel and glass effects. For this purpose, an adaptive sampling technique is presented, which can select more samples around the regions where the values have a higher variation. Several machine learning methods are used for the modeling and their performance is compared: support vector machines, k-nearest neighbor, k-nearest neighbor and random forest, as well as an original algorithm, large margin nearest neighbor regression. The suggested method provides very good results compared to the other well-known regression algorithms.

Keywords: batch bulk methyl methacrylate polymerization, adaptive sampling, machine learning, large margin nearest neighbor regression

Procedia PDF Downloads 303
2561 Diabetes Diagnosis Model Using Rough Set and K- Nearest Neighbor Classifier

Authors: Usiobaifo Agharese Rosemary, Osaseri Roseline Oghogho

Abstract:

Diabetes is a complex group of disease with a variety of causes; it is a disorder of the body metabolism in the digestion of carbohydrates food. The application of machine learning in the field of medical diagnosis has been the focus of many researchers and the use of recognition and classification model as a decision support tools has help the medical expert in diagnosis of diseases. Considering the large volume of medical data which require special techniques, experience, and high diagnostic skill in the diagnosis of diseases, the application of an artificial intelligent system to assist medical personnel in order to enhance their efficiency and accuracy in diagnosis will be an invaluable tool. In this study will propose a diabetes diagnosis model using rough set and K-nearest Neighbor classifier algorithm. The system consists of two modules: the feature extraction module and predictor module, rough data set is used to preprocess the attributes while K-nearest neighbor classifier is used to classify the given data. The dataset used for this model was taken for University of Benin Teaching Hospital (UBTH) database. Half of the data was used in the training while the other half was used in testing the system. The proposed model was able to achieve over 80% accuracy.

Keywords: classifier algorithm, diabetes, diagnostic model, machine learning

Procedia PDF Downloads 335
2560 Feature Extraction Technique for Prediction the Antigenic Variants of the Influenza Virus

Authors: Majid Forghani, Michael Khachay

Abstract:

In genetics, the impact of neighboring amino acids on a target site is referred as the nearest-neighbor effect or simply neighbor effect. In this paper, a new method called wavelet particle decomposition representing the one-dimensional neighbor effect using wavelet packet decomposition is proposed. The main idea lies in known dependence of wavelet packet sub-bands on location and order of neighboring samples. The method decomposes the value of a signal sample into small values called particles that represent a part of the neighbor effect information. The results have shown that the information obtained from the particle decomposition can be used to create better model variables or features. As an example, the approach has been applied to improve the correlation of test and reference sequence distance with titer in the hemagglutination inhibition assay.

Keywords: antigenic variants, neighbor effect, wavelet packet, wavelet particle decomposition

Procedia PDF Downloads 151
2559 Comparison of Different k-NN Models for Speed Prediction in an Urban Traffic Network

Authors: Seyoung Kim, Jeongmin Kim, Kwang Ryel Ryu

Abstract:

A database that records average traffic speeds measured at five-minute intervals for all the links in the traffic network of a metropolitan city. While learning from this data the models that can predict future traffic speed would be beneficial for the applications such as the car navigation system, building predictive models for every link becomes a nontrivial job if the number of links in a given network is huge. An advantage of adopting k-nearest neighbor (k-NN) as predictive models is that it does not require any explicit model building. Instead, k-NN takes a long time to make a prediction because it needs to search for the k-nearest neighbors in the database at prediction time. In this paper, we investigate how much we can speed up k-NN in making traffic speed predictions by reducing the amount of data to be searched for without a significant sacrifice of prediction accuracy. The rationale behind this is that we had a better look at only the recent data because the traffic patterns not only repeat daily or weekly but also change over time. In our experiments, we build several different k-NN models employing different sets of features which are the current and past traffic speeds of the target link and the neighbor links in its up/down-stream. The performances of these models are compared by measuring the average prediction accuracy and the average time taken to make a prediction using various amounts of data.

Keywords: big data, k-NN, machine learning, traffic speed prediction

Procedia PDF Downloads 362
2558 A Selection Approach: Discriminative Model for Nominal Attributes-Based Distance Measures

Authors: Fang Gong

Abstract:

Distance measures are an indispensable part of many instance-based learning (IBL) and machine learning (ML) algorithms. The value difference metrics (VDM) and inverted specific-class distance measure (ISCDM) are among the top-performing distance measures that address nominal attributes. VDM performs well in some domains owing to its simplicity and poorly in others that exist missing value and non-class attribute noise. ISCDM, however, typically works better than VDM on such domains. To maximize their advantages and avoid disadvantages, in this paper, a selection approach: a discriminative model for nominal attributes-based distance measures is proposed. More concretely, VDM and ISCDM are built independently on a training dataset at the training stage, and the most credible one is recorded for each training instance. At the test stage, its nearest neighbor for each test instance is primarily found by any of VDM and ISCDM and then chooses the most reliable model of its nearest neighbor to predict its class label. It is simply denoted as a discriminative distance measure (DDM). Experiments are conducted on the 34 University of California at Irvine (UCI) machine learning repository datasets, and it shows DDM retains the interpretability and simplicity of VDM and ISCDM but significantly outperforms the original VDM and ISCDM and other state-of-the-art competitors in terms of accuracy.

Keywords: distance measure, discriminative model, nominal attributes, nearest neighbor

Procedia PDF Downloads 113
2557 Application of Fuzzy Approach to the Vibration Fault Diagnosis

Authors: Jalel Khelil

Abstract:

In order to improve reliability of Gas Turbine machine especially its generator equipment, a fault diagnosis system based on fuzzy approach is proposed. Three various methods namely K-NN (K-nearest neighbors), F-KNN (Fuzzy K-nearest neighbors) and FNM (Fuzzy nearest mean) are adopted to provide the measurement of relative strength of vibration defaults. Both applications consist of two major steps: Feature extraction and default classification. 09 statistical features are extracted from vibration signals. 03 different classes are used in this study which describes vibrations condition: Normal, unbalance defect, and misalignment defect. The use of the fuzzy approaches and the classification results are discussed. Results show that these approaches yield high successful rates of vibration default classification.

Keywords: fault diagnosis, fuzzy classification k-nearest neighbor, vibration

Procedia PDF Downloads 465
2556 Measuring Multi-Class Linear Classifier for Image Classification

Authors: Fatma Susilawati Mohamad, Azizah Abdul Manaf, Fadhillah Ahmad, Zarina Mohamad, Wan Suryani Wan Awang

Abstract:

A simple and robust multi-class linear classifier is proposed and implemented. For a pair of classes of the linear boundary, a collection of segments of hyper planes created as perpendicular bisectors of line segments linking centroids of the classes or part of classes. Nearest Neighbor and Linear Discriminant Analysis are compared in the experiments to see the performances of each classifier in discriminating ripeness of oil palm. This paper proposes a multi-class linear classifier using Linear Discriminant Analysis (LDA) for image identification. Result proves that LDA is well capable in separating multi-class features for ripeness identification.

Keywords: multi-class, linear classifier, nearest neighbor, linear discriminant analysis

Procedia PDF Downloads 537
2555 The Influence of Noise on Aerial Image Semantic Segmentation

Authors: Pengchao Wei, Xiangzhong Fang

Abstract:

Noise is ubiquitous in this world. Denoising is an essential technology, especially in image semantic segmentation, where noises are generally categorized into two main types i.e. feature noise and label noise. The main focus of this paper is aiming at modeling label noise, investigating the behaviors of different types of label noise on image semantic segmentation tasks using K-Nearest-Neighbor and Convolutional Neural Network classifier. The performance without label noise and with is evaluated and illustrated in this paper. In addition to that, the influence of feature noise on the image semantic segmentation task is researched as well and a feature noise reduction method is applied to mitigate its influence in the learning procedure.

Keywords: convolutional neural network, denoising, feature noise, image semantic segmentation, k-nearest-neighbor, label noise

Procedia PDF Downloads 219
2554 Minimization of Propagation Delay in Multi Unmanned Aerial Vehicle Network

Authors: Purva Joshi, Rohit Thanki, Omar Hanif

Abstract:

Unmanned aerial vehicles (UAVs) are becoming increasingly important in various industrial applications and sectors. Nowadays, a multi UAV network is used for specific types of communication (e.g., military) and monitoring purposes. Therefore, it is critical to reducing propagation delay during communication between UAVs, which is essential in a multi UAV network. This paper presents how the propagation delay between the base station (BS) and the UAVs is reduced using a searching algorithm. Furthermore, the iterative-based K-nearest neighbor (k-NN) algorithm and Travelling Salesmen Problem (TSP) algorthm were utilized to optimize the distance between BS and individual UAV to overcome the problem of propagation delay in multi UAV networks. The simulation results show that this proposed method reduced complexity, improved reliability, and reduced propagation delay in multi UAV networks.

Keywords: multi UAV network, optimal distance, propagation delay, K - nearest neighbor, traveling salesmen problem

Procedia PDF Downloads 198
2553 Towards a Balancing Medical Database by Using the Least Mean Square Algorithm

Authors: Kamel Belammi, Houria Fatrim

Abstract:

imbalanced data set, a problem often found in real world application, can cause seriously negative effect on classification performance of machine learning algorithms. There have been many attempts at dealing with classification of imbalanced data sets. In medical diagnosis classification, we often face the imbalanced number of data samples between the classes in which there are not enough samples in rare classes. In this paper, we proposed a learning method based on a cost sensitive extension of Least Mean Square (LMS) algorithm that penalizes errors of different samples with different weight and some rules of thumb to determine those weights. After the balancing phase, we applythe different classifiers (support vector machine (SVM), k- nearest neighbor (KNN) and multilayer neuronal networks (MNN)) for balanced data set. We have also compared the obtained results before and after balancing method.

Keywords: multilayer neural networks, k- nearest neighbor, support vector machine, imbalanced medical data, least mean square algorithm, diabetes

Procedia PDF Downloads 531
2552 Artificial Intelligence-Based Detection of Individuals Suffering from Vestibular Disorder

Authors: Dua Hişam, Serhat İkizoğlu

Abstract:

Identifying the problem behind balance disorder is one of the most interesting topics in the medical literature. This study has considerably enhanced the development of artificial intelligence (AI) algorithms applying multiple machine learning (ML) models to sensory data on gait collected from humans to classify between normal people and those suffering from Vestibular System (VS) problems. Although AI is widely utilized as a diagnostic tool in medicine, AI models have not been used to perform feature extraction and identify VS disorders through training on raw data. In this study, three machine learning (ML) models, the Random Forest Classifier (RF), Extreme Gradient Boosting (XGB), and K-Nearest Neighbor (KNN), have been trained to detect VS disorder, and the performance comparison of the algorithms has been made using accuracy, recall, precision, and f1-score. With an accuracy of 95.28 %, Random Forest Classifier (RF) was the most accurate model.

Keywords: vestibular disorder, machine learning, random forest classifier, k-nearest neighbor, extreme gradient boosting

Procedia PDF Downloads 68
2551 Searchable Encryption in Cloud Storage

Authors: Ren Junn Hwang, Chung-Chien Lu, Jain-Shing Wu

Abstract:

Cloud outsource storage is one of important services in cloud computing. Cloud users upload data to cloud servers to reduce the cost of managing data and maintaining hardware and software. To ensure data confidentiality, users can encrypt their files before uploading them to a cloud system. However, retrieving the target file from the encrypted files exactly is difficult for cloud server. This study proposes a protocol for performing multikeyword searches for encrypted cloud data by applying k-nearest neighbor technology. The protocol ranks the relevance scores of encrypted files and keywords, and prevents cloud servers from learning search keywords submitted by a cloud user. To reduce the costs of file transfer communication, the cloud server returns encrypted files in order of relevance. Moreover, when a cloud user inputs an incorrect keyword and the number of wrong alphabet does not exceed a given threshold; the user still can retrieve the target files from cloud server. In addition, the proposed scheme satisfies security requirements for outsourced data storage.

Keywords: fault-tolerance search, multi-keywords search, outsource storage, ranked search, searchable encryption

Procedia PDF Downloads 382
2550 FCNN-MR: A Parallel Instance Selection Method Based on Fast Condensed Nearest Neighbor Rule

Authors: Lu Si, Jie Yu, Shasha Li, Jun Ma, Lei Luo, Qingbo Wu, Yongqi Ma, Zhengji Liu

Abstract:

Instance selection (IS) technique is used to reduce the data size to improve the performance of data mining methods. Recently, to process very large data set, several proposed methods divide the training set into some disjoint subsets and apply IS algorithms independently to each subset. In this paper, we analyze the limitation of these methods and give our viewpoint about how to divide and conquer in IS procedure. Then, based on fast condensed nearest neighbor (FCNN) rule, we propose a large data sets instance selection method with MapReduce framework. Besides ensuring the prediction accuracy and reduction rate, it has two desirable properties: First, it reduces the work load in the aggregation node; Second and most important, it produces the same result with the sequential version, which other parallel methods cannot achieve. We evaluate the performance of FCNN-MR on one small data set and two large data sets. The experimental results show that it is effective and practical.

Keywords: instance selection, data reduction, MapReduce, kNN

Procedia PDF Downloads 253
2549 A Location-Based Search Approach According to Users’ Application Scenario

Authors: Shih-Ting Yang, Chih-Yun Lin, Ming-Yu Li, Jhong-Ting Syue, Wei-Ming Huang

Abstract:

Global positioning system (GPS) has become increasing precise in recent years, and the location-based service (LBS) has developed rapidly. Take the example of finding a parking lot (such as Parking apps). The location-based service can offer immediate information about a nearby parking lot, including the information about remaining parking spaces. However, it cannot provide expected search results according to the requirement situations of users. For that reason, this paper develops a “Location-based Search Approach according to Users’ Application Scenario” according to the location-based search and demand determination to help users obtain the information consistent with their requirements. The “Location-based Search Approach based on Users’ Application Scenario” of this paper consists of one mechanism and three kernel modules. First, in the Information Pre-processing Mechanism (IPM), this paper uses the cosine theorem to categorize the locations of users. Then, in the Information Category Evaluation Module (ICEM), the kNN (k-Nearest Neighbor) is employed to classify the browsing records of users. After that, in the Information Volume Level Determination Module (IVLDM), this paper makes a comparison between the number of users’ clicking the information at different locations and the average number of users’ clicking the information at a specific location, so as to evaluate the urgency of demand; then, the two-dimensional space is used to estimate the application situations of users. For the last step, in the Location-based Search Module (LBSM), this paper compares all search results and the average number of characters of the search results, categorizes the search results with the Manhattan Distance, and selects the results according to the application scenario of users. Additionally, this paper develops a Web-based system according to the methodology to demonstrate practical application of this paper. The application scenario-based estimate and the location-based search are used to evaluate the type and abundance of the information expected by the public at specific location, so that information demanders can obtain the information consistent with their application situations at specific location.

Keywords: data mining, knowledge management, location-based service, user application scenario

Procedia PDF Downloads 123
2548 Kernel-Based Double Nearest Proportion Feature Extraction for Hyperspectral Image Classification

Authors: Hung-Sheng Lin, Cheng-Hsuan Li

Abstract:

Over the past few years, kernel-based algorithms have been widely used to extend some linear feature extraction methods such as principal component analysis (PCA), linear discriminate analysis (LDA), and nonparametric weighted feature extraction (NWFE) to their nonlinear versions, kernel principal component analysis (KPCA), generalized discriminate analysis (GDA), and kernel nonparametric weighted feature extraction (KNWFE), respectively. These nonlinear feature extraction methods can detect nonlinear directions with the largest nonlinear variance or the largest class separability based on the given kernel function. Moreover, they have been applied to improve the target detection or the image classification of hyperspectral images. The double nearest proportion feature extraction (DNP) can effectively reduce the overlap effect and have good performance in hyperspectral image classification. The DNP structure is an extension of the k-nearest neighbor technique. For each sample, there are two corresponding nearest proportions of samples, the self-class nearest proportion and the other-class nearest proportion. The term “nearest proportion” used here consider both the local information and other more global information. With these settings, the effect of the overlap between the sample distributions can be reduced. Usually, the maximum likelihood estimator and the related unbiased estimator are not ideal estimators in high dimensional inference problems, particularly in small data-size situation. Hence, an improved estimator by shrinkage estimation (regularization) is proposed. Based on the DNP structure, LDA is included as a special case. In this paper, the kernel method is applied to extend DNP to kernel-based DNP (KDNP). In addition to the advantages of DNP, KDNP surpasses DNP in the experimental results. According to the experiments on the real hyperspectral image data sets, the classification performance of KDNP is better than that of PCA, LDA, NWFE, and their kernel versions, KPCA, GDA, and KNWFE.

Keywords: feature extraction, kernel method, double nearest proportion feature extraction, kernel double nearest feature extraction

Procedia PDF Downloads 343
2547 Comparison of Heuristic Methods for Solving Traveling Salesman Problem

Authors: Regita P. Permata, Ulfa S. Nuraini

Abstract:

Traveling Salesman Problem (TSP) is the most studied problem in combinatorial optimization. In simple language, TSP can be described as a problem of finding a minimum distance tour to a city, starting and ending in the same city, and exactly visiting another city. In product distribution, companies often get problems in determining the minimum distance that affects the time allocation. In this research, we aim to apply TSP heuristic methods to simulate nodes as city coordinates in product distribution. The heuristics used are sub tour reversal, nearest neighbor, farthest insertion, cheapest insertion, nearest insertion, and arbitrary insertion. We have done simulation nodes using Euclidean distances to compare the number of cities and processing time, thus we get optimum heuristic method. The results show that the optimum heuristic methods are farthest insertion and nearest insertion. These two methods can be recommended to solve product distribution problems in certain companies.

Keywords: Euclidean, heuristics, simulation, TSP

Procedia PDF Downloads 125
2546 Closest Possible Neighbor of a Different Class: Explaining a Model Using a Neighbor Migrating Generator

Authors: Hassan Eshkiki, Benjamin Mora

Abstract:

The Neighbor Migrating Generator is a simple and efficient approach to finding the closest potential neighbor(s) with a different label for a given instance and so without the need to calibrate any kernel settings at all. This allows determining and explaining the most important features that will influence an AI model. It can be used to either migrate a specific sample to the class decision boundary of the original model within a close neighborhood of that sample or identify global features that can help localising neighbor classes. The proposed technique works by minimizing a loss function that is divided into two components which are independently weighted according to three parameters α, β, and ω, α being self-adjusting. Results show that this approach is superior to past techniques when detecting the smallest changes in the feature space and may also point out issues in models like over-fitting.

Keywords: explainable AI, EX AI, feature importance, counterfactual explanations

Procedia PDF Downloads 189
2545 A Recommender System for Dynamic Selection of Undergraduates' Elective Courses

Authors: Adewale O. Ogunde, Emmanuel O. Ajibade

Abstract:

The task of selecting a few elective courses from a variety of available elective courses has been a difficult one for many students over the years. In many higher institutions, guidance and counselors or level advisers are usually employed to assist the students in picking the right choice of courses. In reality, these counselors and advisers are most times overloaded with too many students to attend to, and sometimes they do not have enough time for the students. Most times, the academic strength of the student based on past results are not considered in the new choice of electives. Recommender systems implement advanced data analysis techniques to help users find the items of their interest by producing a predicted likeliness score or a list of top recommended items for a given active user. Therefore, in this work, a collaborative filtering-based recommender system that will dynamically recommend elective courses to undergraduate students based on their past grades in related courses was developed. This approach employed the use of the k-nearest neighbor algorithm to discover hidden relationships between the related courses passed by students in the past and the currently available elective courses. Real students’ results dataset was used to build and test the recommendation model. The developed system will not only improve the academic performance of students, but it will also help reduce the workload on the level advisers and school counselors.

Keywords: collaborative filtering, elective courses, k-nearest neighbor algorithm, recommender systems

Procedia PDF Downloads 162
2544 Principle Component Analysis on Colon Cancer Detection

Authors: N. K. Caecar Pratiwi, Yunendah Nur Fuadah, Rita Magdalena, R. D. Atmaja, Sofia Saidah, Ocky Tiaramukti

Abstract:

Colon cancer or colorectal cancer is a type of cancer that attacks the last part of the human digestive system. Lymphoma and carcinoma are types of cancer that attack human’s colon. Colon cancer causes deaths about half a million people every year. In Indonesia, colon cancer is the third largest cancer case for women and second in men. Unhealthy lifestyles such as minimum consumption of fiber, rarely exercising and lack of awareness for early detection are factors that cause high cases of colon cancer. The aim of this project is to produce a system that can detect and classify images into type of colon cancer lymphoma, carcinoma, or normal. The designed system used 198 data colon cancer tissue pathology, consist of 66 images for Lymphoma cancer, 66 images for carcinoma cancer and 66 for normal / healthy colon condition. This system will classify colon cancer starting from image preprocessing, feature extraction using Principal Component Analysis (PCA) and classification using K-Nearest Neighbor (K-NN) method. Several stages in preprocessing are resize, convert RGB image to grayscale, edge detection and last, histogram equalization. Tests will be done by trying some K-NN input parameter setting. The result of this project is an image processing system that can detect and classify the type of colon cancer with high accuracy and low computation time.

Keywords: carcinoma, colorectal cancer, k-nearest neighbor, lymphoma, principle component analysis

Procedia PDF Downloads 205
2543 Detecting of Crime Hot Spots for Crime Mapping

Authors: Somayeh Nezami

Abstract:

The management of financial and human resources of police in metropolitans requires many information and exact plans to reduce a rate of crime and increase the safety of the society. Geographical Information Systems have an important role in providing crime maps and their analysis. By using them and identification of crime hot spots along with spatial presentation of the results, it is possible to allocate optimum resources while presenting effective methods for decision making and preventive solutions. In this paper, we try to explain and compare between some of the methods of hot spots analysis such as Mode, Fuzzy Mode and Nearest Neighbour Hierarchical spatial clustering (NNH). Then the spots with the highest crime rates of drug smuggling for one province in Iran with borderline with Afghanistan are obtained. We will show that among these three methods NNH leads to the best result.

Keywords: GIS, Hot spots, nearest neighbor hierarchical spatial clustering, NNH, spatial analysis of crime

Procedia PDF Downloads 329
2542 Pre-Operative Tool for Facial-Post-Surgical Estimation and Detection

Authors: Ayat E. Ali, Christeen R. Aziz, Merna A. Helmy, Mohammed M. Malek, Sherif H. El-Gohary

Abstract:

Goal: Purpose of the project was to make a plastic surgery prediction by using pre-operative images for the plastic surgeries’ patients and to show this prediction on a screen to compare between the current case and the appearance after the surgery. Methods: To this aim, we implemented a software which used data from the internet for facial skin diseases, skin burns, pre-and post-images for plastic surgeries then the post- surgical prediction is done by using K-nearest neighbor (KNN). So we designed and fabricated a smart mirror divided into two parts a screen and a reflective mirror so patient's pre- and post-appearance will be showed at the same time. Results: We worked on some skin diseases like vitiligo, skin burns and wrinkles. We classified the three degrees of burns using KNN classifier with accuracy 60%. We also succeeded in segmenting the area of vitiligo. Our future work will include working on more skin diseases, classify them and give a prediction for the look after the surgery. Also we will go deeper into facial deformities and plastic surgeries like nose reshaping and face slim down. Conclusion: Our project will give a prediction relates strongly to the real look after surgery and decrease different diagnoses among doctors. Significance: The mirror may have broad societal appeal as it will make the distance between patient's satisfaction and the medical standards smaller.

Keywords: k-nearest neighbor (knn), face detection, vitiligo, bone deformity

Procedia PDF Downloads 162
2541 Parkinson’s Disease Detection Analysis through Machine Learning Approaches

Authors: Muhtasim Shafi Kader, Fizar Ahmed, Annesha Acharjee

Abstract:

Machine learning and data mining are crucial in health care, as well as medical information and detection. Machine learning approaches are now being utilized to improve awareness of a variety of critical health issues, including diabetes detection, neuron cell tumor diagnosis, COVID 19 identification, and so on. Parkinson’s disease is basically a disease for our senior citizens in Bangladesh. Parkinson's Disease indications often seem progressive and get worst with time. People got affected trouble walking and communicating with the condition advances. Patients can also have psychological and social vagaries, nap problems, hopelessness, reminiscence loss, and weariness. Parkinson's disease can happen in both men and women. Though men are affected by the illness at a proportion that is around partial of them are women. In this research, we have to get out the accurate ML algorithm to find out the disease with a predictable dataset and the model of the following machine learning classifiers. Therefore, nine ML classifiers are secondhand to portion study to use machine learning approaches like as follows, Naive Bayes, Adaptive Boosting, Bagging Classifier, Decision Tree Classifier, Random Forest classifier, XBG Classifier, K Nearest Neighbor Classifier, Support Vector Machine Classifier, and Gradient Boosting Classifier are used.

Keywords: naive bayes, adaptive boosting, bagging classifier, decision tree classifier, random forest classifier, XBG classifier, k nearest neighbor classifier, support vector classifier, gradient boosting classifier

Procedia PDF Downloads 127
2540 Identification of Breast Anomalies Based on Deep Convolutional Neural Networks and K-Nearest Neighbors

Authors: Ayyaz Hussain, Tariq Sadad

Abstract:

Breast cancer (BC) is one of the widespread ailments among females globally. The early prognosis of BC can decrease the mortality rate. Exact findings of benign tumors can avoid unnecessary biopsies and further treatments of patients under investigation. However, due to variations in images, it is a tough job to isolate cancerous cases from normal and benign ones. The machine learning technique is widely employed in the classification of BC pattern and prognosis. In this research, a deep convolution neural network (DCNN) called AlexNet architecture is employed to get more discriminative features from breast tissues. To achieve higher accuracy, K-nearest neighbor (KNN) classifiers are employed as a substitute for the softmax layer in deep learning. The proposed model is tested on a widely used breast image database called MIAS dataset for experimental purposes and achieved 99% accuracy.

Keywords: breast cancer, DCNN, KNN, mammography

Procedia PDF Downloads 135
2539 Fast Bayesian Inference of Multivariate Block-Nearest Neighbor Gaussian Process (NNGP) Models for Large Data

Authors: Carlos Gonzales, Zaida Quiroz, Marcos Prates

Abstract:

Several spatial variables collected at the same location that share a common spatial distribution can be modeled simultaneously through a multivariate geostatistical model that takes into account the correlation between these variables and the spatial autocorrelation. The main goal of this model is to perform spatial prediction of these variables in the region of study. Here we focus on a geostatistical multivariate formulation that relies on sharing common spatial random effect terms. In particular, the first response variable can be modeled by a mean that incorporates a shared random spatial effect, while the other response variables depend on this shared spatial term, in addition to specific random spatial effects. Each spatial random effect is defined through a Gaussian process with a valid covariance function, but in order to improve the computational efficiency when the data are large, each Gaussian process is approximated to a Gaussian random Markov field (GRMF), specifically to the block nearest neighbor Gaussian process (Block-NNGP). This approach involves dividing the spatial domain into several dependent blocks under certain constraints, where the cross blocks allow capturing the spatial dependence on a large scale, while each individual block captures the spatial dependence on a smaller scale. The multivariate geostatistical model belongs to the class of Latent Gaussian Models; thus, to achieve fast Bayesian inference, it is used the integrated nested Laplace approximation (INLA) method. The good performance of the proposed model is shown through simulations and applications for massive data.

Keywords: Block-NNGP, geostatistics, gaussian process, GRMF, INLA, multivariate models.

Procedia PDF Downloads 97
2538 Determination of Neighbor Node in Consideration of the Imaging Range of Cameras in Automatic Human Tracking System

Authors: Kozo Tanigawa, Tappei Yotsumoto, Kenichi Takahashi, Takao Kawamura, Kazunori Sugahara

Abstract:

An automatic human tracking system using mobile agent technology is realized because a mobile agent moves in accordance with a migration of a target person. In this paper, we propose a method for determining the neighbor node in consideration of the imaging range of cameras.

Keywords: human tracking, mobile agent, Pan/Tilt/Zoom, neighbor relation

Procedia PDF Downloads 516