Search results for: Nearest neighbour

74 Development of the Academic Model to Predict Student Success at VUT-FSASEC Using Decision Trees

Authors: Langa Hendrick Musawenkosi, Twala Bhekisipho

Abstract:

The success or failure of students is a concern for every academic institution, college, university, governments and students themselves. Several approaches have been researched to address this concern. In this paper, a view is held that when a student enters a university or college or an academic institution, he or she enters an academic environment. The academic environment is unique concept used to develop the solution for making predictions effectively. This paper presents a model to determine the propensity of a student to succeed or fail in the French South African Schneider Electric Education Center (FSASEC) at the Vaal University of Technology (VUT). The Decision Tree algorithm is used to implement the model at FSASEC.

Keywords: Academic environment model, decision trees, FSASEC, K-nearest neighbor, machine learning, popularity index, support vector machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1078

73 Feature Extraction Technique for Prediction the Antigenic Variants of the Influenza Virus

Authors: Majid Forghani, Michael Khachay

Abstract:

In genetics, the impact of neighboring amino acids on a target site is referred as the nearest-neighbor effect or simply neighbor effect. In this paper, a new method called wavelet particle decomposition representing the one-dimensional neighbor effect using wavelet packet decomposition is proposed. The main idea lies in known dependence of wavelet packet sub-bands on location and order of neighboring samples. The method decomposes the value of a signal sample into small values called particles that represent a part of the neighbor effect information. The results have shown that the information obtained from the particle decomposition can be used to create better model variables or features. As an example, the approach has been applied to improve the correlation of test and reference sequence distance with titer in the hemagglutination inhibition assay.

Keywords: Antigenic variants, neighbor effect, wavelet packet, wavelet particle decomposition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 725

72 Using Historical Data for Stock Prediction of a Tech Company

Authors: Sofia Stoica

Abstract:

In this paper, we use historical data to predict the stock price of a tech company. To this end, we use a dataset consisting of the stock prices over the past five years of 10 major tech companies: Adobe, Amazon, Apple, Facebook, Google, Microsoft, Netflix, Oracle, Salesforce, and Tesla. We implemented and tested three models – a linear regressor model, a k-nearest neighbor model (KNN), and a sequential neural network – and two algorithms – Multiplicative Weight Update and AdaBoost. We found that the sequential neural network performed the best, with a testing error of 0.18%. Interestingly, the linear model performed the second best with a testing error of 0.73%. These results show that using historical data is enough to obtain high accuracies, and a simple algorithm like linear regression has a performance similar to more sophisticated models while taking less time and resources to implement.

Keywords: Finance, machine learning, opening price, stock market.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 300

71 Network Intrusion Detection Design Using Feature Selection of Soft Computing Paradigms

Authors: T. S. Chou, K. K. Yen, J. Luo

Abstract:

The network traffic data provided for the design of intrusion detection always are large with ineffective information and enclose limited and ambiguous information about users- activities. We study the problems and propose a two phases approach in our intrusion detection design. In the first phase, we develop a correlation-based feature selection algorithm to remove the worthless information from the original high dimensional database. Next, we design an intrusion detection method to solve the problems of uncertainty caused by limited and ambiguous information. In the experiments, we choose six UCI databases and DARPA KDD99 intrusion detection data set as our evaluation tools. Empirical studies indicate that our feature selection algorithm is capable of reducing the size of data set. Our intrusion detection method achieves a better performance than those of participating intrusion detectors.

Keywords: Intrusion detection, feature selection, k-nearest neighbors, fuzzy clustering, Dempster-Shafer theory

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1880

70 Machine Learning for Aiding Meningitis Diagnosis in Pediatric Patients

Authors: Karina Zaccari, Ernesto Cordeiro Marujo

Abstract:

This paper presents a Machine Learning (ML) approach to support Meningitis diagnosis in patients at a children’s hospital in Sao Paulo, Brazil. The aim is to use ML techniques to reduce the use of invasive procedures, such as cerebrospinal fluid (CSF) collection, as much as possible. In this study, we focus on predicting the probability of Meningitis given the results of a blood and urine laboratory tests, together with the analysis of pain or other complaints from the patient. We tested a number of different ML algorithms, including: Adaptative Boosting (AdaBoost), Decision Tree, Gradient Boosting, K-Nearest Neighbors (KNN), Logistic Regression, Random Forest and Support Vector Machines (SVM). Decision Tree algorithm performed best, with 94.56% and 96.18% accuracy for training and testing data, respectively. These results represent a significant aid to doctors in diagnosing Meningitis as early as possible and in preventing expensive and painful procedures on some children.

Keywords: Machine learning, medical diagnosis, meningitis detection, gradient boosting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1032

69 DWT Based Image Steganalysis

Authors: Indradip Banerjee, Souvik Bhattacharyya, Gautam Sanyal

Abstract:

‘Steganalysis’ is one of the challenging and attractive interests for the researchers with the development of information hiding techniques. It is the procedure to detect the hidden information from the stego created by known steganographic algorithm. In this paper, a novel feature based image steganalysis technique is proposed. Various statistical moments have been used along with some similarity metric. The proposed steganalysis technique has been designed based on transformation in four wavelet domains, which include Haar, Daubechies, Symlets and Biorthogonal. Each domain is being subjected to various classifiers, namely K-nearest-neighbor, K* Classifier, Locally weighted learning, Naive Bayes classifier, Neural networks, Decision trees and Support vector machines. The experiments are performed on a large set of pictures which are available freely in image database. The system also predicts the different message length definitions.

Keywords: Steganalysis, Moments, Wavelet Domain, KNN, K*, LWL, Naive Bayes Classifier, Neural networks, Decision trees, SVM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2535

68 Neural Networks for Short Term Wind Speed Prediction

Authors: K. Sreelakshmi, P. Ramakanthkumar

Abstract:

Predicting short term wind speed is essential in order to prevent systems in-action from the effects of strong winds. It also helps in using wind energy as an alternative source of energy, mainly for Electrical power generation. Wind speed prediction has applications in Military and civilian fields for air traffic control, rocket launch, ship navigation etc. The wind speed in near future depends on the values of other meteorological variables, such as atmospheric pressure, moisture content, humidity, rainfall etc. The values of these parameters are obtained from a nearest weather station and are used to train various forms of neural networks. The trained model of neural networks is validated using a similar set of data. The model is then used to predict the wind speed, using the same meteorological information. This paper reports an Artificial Neural Network model for short term wind speed prediction, which uses back propagation algorithm.

Keywords: Short term wind speed prediction, Neural networks, Back propagation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3018

67 Performance Study of Cascade Refrigeration System Using Alternative Refrigerants

Authors: Gulshan Sachdeva, Vaibhav Jain, S. S. Kachhwaha

Abstract:

Cascade refrigeration systems employ series of single stage vapor compression units which are thermally coupled with evaporator/condenser cascades. Different refrigerants are used in each of the circuit depending on the optimum characteristics shown by the refrigerant for a particular application. In the present research study, a steady state thermodynamic model is developed which simulates the working of an actual cascade system. The model provides COP and all other system parameters e.g. total compressor work, temperature, pressure, enthalpy and entropy at different state points. The working fluid in low temperature circuit (LTC) is CO₂ (R744) while Ammonia (R717), Propane (R290), Propylene (R1270), R404A and R12 are the refrigerants in high temperature circuit (HTC). The performance curves of Ammonia, Propane, Propylene, and R404A are compared with R12 to find its nearest substitute. Results show that Ammonia is the best substitute of R12.

Keywords: Cascade system, Refrigerants, Thermodynamic model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5690

66 A Distributed Topology Control Algorithm to Conserve Energy in Heterogeneous Wireless Mesh Networks

Authors: F. O. Aron, T. O. Olwal, A. Kurien, M. O. Odhiambo

Abstract:

A considerable amount of energy is consumed during transmission and reception of messages in a wireless mesh network (WMN). Reducing per-node transmission power would greatly increase the network lifetime via power conservation in addition to increasing the network capacity via better spatial bandwidth reuse. In this work, the problem of topology control in a hybrid WMN of heterogeneous wireless devices with varying maximum transmission ranges is considered. A localized distributed topology control algorithm is presented which calculates the optimal transmission power so that (1) network connectivity is maintained (2) node transmission power is reduced to cover only the nearest neighbours (3) networks lifetime is extended. Simulations and analysis of results are carried out in the NS-2 environment to demonstrate the correctness and effectiveness of the proposed algorithm.

Keywords: Topology Control, Wireless Mesh Networks, Backbone, Energy Efficiency, Localized Algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1349

65 Facility Location Problem in Emergency Logistic

Authors: Yousef Abu Nahleh, Arun Kumar, Fugen Daver

Abstract:

Facility location is one of the important problems affecting the relief operations. The location model in this paper is motivated by arranging the flow of relief materials from the main warehouse to continent warehouse and further to regional warehouse and from these to the disaster area. This flow makes the relief organization always ready to deal with the disaster situation during shortest possible time. The main purpose of this paper is merge the concept of just in time and the campaign system in emergency supply chain,so that when the disaster happens the affected country can request help from the nearest regional warehouse, which will supply the relief material and the required stuff to support and assist the victims in the disaster area. Furthermore, the regional warehouse places an order to the continent warehouse to replenish the material that is distributed to the disaster area. This way they will always be ready to respond to any type of disaster.

Keywords: Facility location, Center-of-Gravity Technique, Humanitarian relief, emergency supply chain.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3743

64 An Experimental Study of a Self-Supervised Classifier Ensemble

Authors: Neamat El Gayar

Abstract:

Learning using labeled and unlabelled data has received considerable amount of attention in the machine learning community due its potential in reducing the need for expensive labeled data. In this work we present a new method for combining labeled and unlabeled data based on classifier ensembles. The model we propose assumes each classifier in the ensemble observes the input using different set of features. Classifiers are initially trained using some labeled samples. The trained classifiers learn further through labeling the unknown patterns using a teaching signals that is generated using the decision of the classifier ensemble, i.e. the classifiers self-supervise each other. Experiments on a set of object images are presented. Our experiments investigate different classifier models, different fusing techniques, different training sizes and different input features. Experimental results reveal that the proposed self-supervised ensemble learning approach reduces classification error over the single classifier and the traditional ensemble classifier approachs.

Keywords: Multiple Classifier Systems, classifier ensembles, learning using labeled and unlabelled data, K-nearest neighbor classifier, Bayes classifier.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1595

63 Analysis of Feature Space for a 2d/3d Vision based Emotion Recognition Method

Authors: Robert Niese, Ayoub Al-Hamadi, Bernd Michaelis

Abstract:

In modern human computer interaction systems (HCI), emotion recognition is becoming an imperative characteristic. The quest for effective and reliable emotion recognition in HCI has resulted in a need for better face detection, feature extraction and classification. In this paper we present results of feature space analysis after briefly explaining our fully automatic vision based emotion recognition method. We demonstrate the compactness of the feature space and show how the 2d/3d based method achieves superior features for the purpose of emotion classification. Also it is exposed that through feature normalization a widely person independent feature space is created. As a consequence, the classifier architecture has only a minor influence on the classification result. This is particularly elucidated with the help of confusion matrices. For this purpose advanced classification algorithms, such as Support Vector Machines and Artificial Neural Networks are employed, as well as the simple k- Nearest Neighbor classifier.

Keywords: Facial expression analysis, Feature extraction, Image processing, Pattern Recognition, Application.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1870

62 Quantifying Freeway Capacity Reductions by Rainfall Intensities Based on Stochastic Nature of Flow Breakdown

Authors: Hoyoung Lee, Dong-Kyu Kim, Seung-Young Kho, R. Eddie Wilson

Abstract:

This study quantifies a decrement in freeway capacity during rainfall. Traffic and rainfall data were gathered from Highway Agencies and Wunderground weather service. Three inter-urban freeway sections and its nearest weather stations were selected as experimental sites. Capacity analysis found reductions of maximum and mean pre-breakdown flow rates due to rainfall. The Kruskal-Wallis test also provided some evidence to suggest that the variance in the pre-breakdown flow rate is statistically insignificant. Potential application of this study lies in the operation of real time traffic management schemes such as Variable Speed Limits (VSL), Hard Shoulder Running (HSR), and Ramp Metering System (RMS), where speed or flow limits could be set based on a number of factors, including rainfall events and their intensities.

Keywords: Capacity randomness, flow breakdown, freeway capacity, rainfall.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1257

61 On Speeding Up Support Vector Machines: Proximity Graphs Versus Random Sampling for Pre-Selection Condensation

Authors: Xiaohua Liu, Juan F. Beltran, Nishant Mohanchandra, Godfried T. Toussaint

Abstract:

Support vector machines (SVMs) are considered to be the best machine learning algorithms for minimizing the predictive probability of misclassification. However, their drawback is that for large data sets the computation of the optimal decision boundary is a time consuming function of the size of the training set. Hence several methods have been proposed to speed up the SVM algorithm. Here three methods used to speed up the computation of the SVM classifiers are compared experimentally using a musical genre classification problem. The simplest method pre-selects a random sample of the data before the application of the SVM algorithm. Two additional methods use proximity graphs to pre-select data that are near the decision boundary. One uses k-Nearest Neighbor graphs and the other Relative Neighborhood Graphs to accomplish the task.

Keywords: Machine learning, data mining, support vector machines, proximity graphs, relative-neighborhood graphs, k-nearestneighbor graphs, random sampling, training data condensation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1874

60 Deep Web Content Mining

Authors: Shohreh Ajoudanian, Mohammad Davarpanah Jazi

Abstract:

The rapid expansion of the web is causing the constant growth of information, leading to several problems such as increased difficulty of extracting potentially useful knowledge. Web content mining confronts this problem gathering explicit information from different web sites for its access and knowledge discovery. Query interfaces of web databases share common building blocks. After extracting information with parsing approach, we use a new data mining algorithm to match a large number of schemas in databases at a time. Using this algorithm increases the speed of information matching. In addition, instead of simple 1:1 matching, they do complex (m:n) matching between query interfaces. In this paper we present a novel correlation mining algorithm that matches correlated attributes with smaller cost. This algorithm uses Jaccard measure to distinguish positive and negative correlated attributes. After that, system matches the user query with different query interfaces in special domain and finally chooses the nearest query interface with user query to answer to it.

Keywords: Content mining, complex matching, correlation mining, information extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2239

59 A Novel Prediction Method for Tag SNP Selection using Genetic Algorithm based on KNN

Authors: Li-Yeh Chuang, Yu-Jen Hou, Jr., Cheng-Hong Yang

Abstract:

Single nucleotide polymorphisms (SNPs) hold much promise as a basis for disease-gene association. However, research is limited by the cost of genotyping the tremendous number of SNPs. Therefore, it is important to identify a small subset of informative SNPs, the so-called tag SNPs. This subset consists of selected SNPs of the genotypes, and accurately represents the rest of the SNPs. Furthermore, an effective evaluation method is needed to evaluate prediction accuracy of a set of tag SNPs. In this paper, a genetic algorithm (GA) is applied to tag SNP problems, and the K-nearest neighbor (K-NN) serves as a prediction method of tag SNP selection. The experimental data used was taken from the HapMap project; it consists of genotype data rather than haplotype data. The proposed method consistently identified tag SNPs with considerably better prediction accuracy than methods from the literature. At the same time, the number of tag SNPs identified was smaller than the number of tag SNPs in the other methods. The run time of the proposed method was much shorter than the run time of the SVM/STSA method when the same accuracy was reached.

Keywords: Genetic Algorithm (GA), Genotype, Single nucleotide polymorphism (SNP), tag SNPs.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1719

58 LiDAR Based Real Time Multiple Vehicle Detection and Tracking

Authors: Zhongzhen Luo, Saeid Habibi, Martin v. Mohrenschildt

Abstract:

Self-driving vehicle require a high level of situational awareness in order to maneuver safely when driving in real world condition. This paper presents a LiDAR based real time perception system that is able to process sensor raw data for multiple target detection and tracking in dynamic environment. The proposed algorithm is nonparametric and deterministic that is no assumptions and priori knowledge are needed from the input data and no initializations are required. Additionally, the proposed method is working on the three-dimensional data directly generated by LiDAR while not scarifying the rich information contained in the domain of 3D. Moreover, a fast and efficient for real time clustering algorithm is applied based on a radially bounded nearest neighbor (RBNN). Hungarian algorithm procedure and adaptive Kalman filtering are used for data association and tracking algorithm. The proposed algorithm is able to run in real time with average run time of 70ms per frame.

Keywords: LiDAR, real-time system, clustering, tracking, data association.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4618

57 Calculation of Water Economy Balance for Water Management

Authors: Vakhtang Geladze, Nana Bolashvili, Tamazi Karalashvili, Nino Machavariani, Ana Karalashvili, George Geladze, Nana Kvirkvelia

Abstract:

Fresh water deficit is one of the most important global problems today. It must be taken into consideration that in the nearest future fresh water crisis will become even more acute owing to the global climate warming and fast desertification processes in the world. Georgia is rich in water resources, but there are disbalance between the eastern and western parts of the country. The goal of the study is to integrate the recent mechanisms compatible with European standards into Georgian water resources management system on the basis of GIS. Moreover, to draw up water economy balance for the purpose of proper determination of water consumption priorities that will be an exchange ratio of water resources and water consumption of the concrete territory. For study region was choose south-eastern part of country, Kvemo kartli Region. This is typical agrarian region, tends to the desertification. The water supply of the region was assessed on the basis of water economy balance, which was first time calculated for this region.

Keywords: GIS, water economy balance, water resources.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 852

56 Study of Aluminum, Copper and Molybdenum Pollution in Groundwater Sources Surrounding (Miduk) Shahr-E- Babak Copper Complex Tailings Dam

Authors: Maryam Kargar, Neamatolah Khorasani, Mahmoud Karami, Gholam-Reza Rafiee, Reza Naseh

Abstract:

Interpolated contour maps drawn for aluminum, copper and molybdenum in downstream monitoring boreholes of water dam in Miduk Copper Complex and the values of pH, redox potential (Eh) and distance from water dam indicate different trends of variation and behavior of these three elements in downward groundwater resources. As these maps exhibit, aluminum is dominant in the most alkaline (pH = 9-11) borehole (MB5) to water dam. The highest concentration of molybdenum is found in the nearest borehole (MB6) to water dam. Main concentration of copper is observed in the most oxidized borehole (MB3 with Eh=293.2mV). The spatial difference among sampling stations can be attributed to the existence of faults and diaclases in the geologic structure of Miduk region which causes the groundwater sampling sites to be impressed by different contamination sources (toe seepage and upper seepage water originated from different zones of tailings dump).

Keywords: Contour maps, Monitoring borehole, Toe seepage, Upper seepage.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2168

55 Machine Learning in Production Systems Design Using Genetic Algorithms

Authors: Abu Qudeiri Jaber, Yamamoto Hidehiko Rizauddin Ramli

Abstract:

To create a solution for a specific problem in machine learning, the solution is constructed from the data or by use a search method. Genetic algorithms are a model of machine learning that can be used to find nearest optimal solution. While the great advantage of genetic algorithms is the fact that they find a solution through evolution, this is also the biggest disadvantage. Evolution is inductive, in nature life does not evolve towards a good solution but it evolves away from bad circumstances. This can cause a species to evolve into an evolutionary dead end. In order to reduce the effect of this disadvantage we propose a new a learning tool (criteria) which can be included into the genetic algorithms generations to compare the previous population and the current population and then decide whether is effective to continue with the previous population or the current population, the proposed learning tool is called as Keeping Efficient Population (KEP). We applied a GA based on KEP to the production line layout problem, as a result KEP keep the evaluation direction increases and stops any deviation in the evaluation.

Keywords: Genetic algorithms, Layout problem, Machinelearning, Production system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1587

54 Dynamic Time Warping in Gait Classificationof Motion Capture Data

Authors: Adam Świtoński, Agnieszka Michalczuk, Henryk Josiński, Andrzej Polański, KonradWojciechowski

Abstract:

The method of gait identification based on the nearest neighbor classification technique with motion similarity assessment by the dynamic time warping is proposed. The model based kinematic motion data, represented by the joints rotations coded by Euler angles and unit quaternions is used. The different pose distance functions in Euler angles and quaternion spaces are considered. To evaluate individual features of the subsequent joints movements during gait cycle, joint selection is carried out. To examine proposed approach database containing 353 gaits of 25 humans collected in motion capture laboratory is used. The obtained results are promising. The classifications, which takes into consideration all joints has accuracy over 91%. Only analysis of movements of hip joints allows to correctly identify gaits with almost 80% precision.

Keywords: Biometrics, dynamic time warping, gait identification, motion capture, time series classification, quaternion distance functions, attribute ranking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2565

53 Automatic Lip Contour Tracking and Visual Character Recognition for Computerized Lip Reading

Authors: Harshit Mehrotra, Gaurav Agrawal, M.C. Srivastava

Abstract:

Computerized lip reading has been one of the most actively researched areas of computer vision in recent past because of its crime fighting potential and invariance to acoustic environment. However, several factors like fast speech, bad pronunciation, poor illumination, movement of face, moustaches and beards make lip reading difficult. In present work, we propose a solution for automatic lip contour tracking and recognizing letters of English language spoken by speakers using the information available from lip movements. Level set method is used for tracking lip contour using a contour velocity model and a feature vector of lip movements is then obtained. Character recognition is performed using modified k nearest neighbor algorithm which assigns more weight to nearer neighbors. The proposed system has been found to have accuracy of 73.3% for character recognition with speaker lip movements as the only input and without using any speech recognition system in parallel. The approach used in this work is found to significantly solve the purpose of lip reading when size of database is small.

Keywords: Contour Velocity Model, Lip Contour Tracking, LipReading, Visual Character Recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2348

52 Delivery System Design of the Local Part to Reduce the Logistic Costs in an Automotive Industry

Authors: Inaki Maulida Hakim, Alesandro Romero

Abstract:

This research was conducted in an automotive company in Indonesia to overcome the problem of high logistics cost. The problem causes high of additional truck delivery. From the breakdown of the problem, chosen one route, which has the highest gap value, namely for RE-04. Research methodology will be started from calculating the ideal condition, making simulation, calculating the ideal logistic cost, and proposing an improvement. From the calculation of the ideal condition, box arrangement was done on the truck has efficiency with three trucks delivery per day. Route simulation making uses Tecnomatix Plant Simulation software as a visualization for the company about how the system is occurred on route RE-04 in ideal condition. The last step is proposing improvements on the area of route RE-04. The route arrangement is done by Saving Method and sequence of each supplier with the Nearest Neighbor. The results of the proposed improvements are three new route groups, where was expected to decrease logistics cost and increase the average of the truck efficiency per day.

Keywords: Logistic cost, milkrun, simulation, efficiency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1701

51 A Distance Function for Data with Missing Values and Its Application

Authors: Loai AbdAllah, Ilan Shimshoni

Abstract:

Missing values in data are common in real world applications. Since the performance of many data mining algorithms depend critically on it being given a good metric over the input space, we decided in this paper to define a distance function for unlabeled datasets with missing values. We use the Bhattacharyya distance, which measures the similarity of two probability distributions, to define our new distance function. According to this distance, the distance between two points without missing attributes values is simply the Mahalanobis distance. When on the other hand there is a missing value of one of the coordinates, the distance is computed according to the distribution of the missing coordinate. Our distance is general and can be used as part of any algorithm that computes the distance between data points. Because its performance depends strongly on the chosen distance measure, we opted for the k nearest neighbor classifier to evaluate its ability to accurately reflect object similarity. We experimented on standard numerical datasets from the UCI repository from different fields. On these datasets we simulated missing values and compared the performance of the kNN classifier using our distance to other three basic methods. Our experiments show that kNN using our distance function outperforms the kNN using other methods. Moreover, the runtime performance of our method is only slightly higher than the other methods.

Keywords: Missing values, Distance metric, Bhattacharyya distance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2693

50 DCBOR: A Density Clustering Based on Outlier Removal

Authors: A. M. Fahim, G. Saake, A. M. Salem, F. A. Torkey, M. A. Ramadan

Abstract:

Data clustering is an important data exploration technique with many applications in data mining. We present an enhanced version of the well known single link clustering algorithm. We will refer to this algorithm as DCBOR. The proposed algorithm alleviates the chain effect by removing the outliers from the given dataset. So this algorithm provides outlier detection and data clustering simultaneously. This algorithm does not need to update the distance matrix, since the algorithm depends on merging the most k-nearest objects in one step and the cluster continues grow as long as possible under specified condition. So the algorithm consists of two phases; at the first phase, it removes the outliers from the input dataset. At the second phase, it performs the clustering process. This algorithm discovers clusters of different shapes, sizes, densities and requires only one input parameter; this parameter represents a threshold for outlier points. The value of the input parameter is ranging from 0 to 1. The algorithm supports the user in determining an appropriate value for it. We have tested this algorithm on different datasets contain outlier and connecting clusters by chain of density points, and the algorithm discovers the correct clusters. The results of our experiments demonstrate the effectiveness and the efficiency of DCBOR.

Keywords: Data Clustering, Clustering Algorithms, Handling Noise, Arbitrary Shape of Clusters.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1887

49 Numerical Analysis of Laminar Flow around Square Cylinders with EHD Phenomenon

Authors: M. Salmanpour, O. Nourani Zonouz

Abstract:

In this research, a numerical simulation of an Electrohydrodynamic (EHD) actuator’s effects on the flow around a square cylinder by using a finite volume method has been investigated. This is one of the newest ways for controlling the fluid flows. Two plate electrodes are flush-mounted on the surface of the cylinder and one wire electrode is placed on the line with zero angle of attack relative to the stagnation point and excited with DC power supply. The discharge produces an electric force and changes the local momentum behaviors in the fluid layers. For this purpose, after selecting proper domain and boundary conditions, the electric field relating to the problem has been analyzed and then the results in the form of electrical body force have been entered in the governing equations of fluid field (Navier-Stokes equations). The effect of ionic wind resulted from the Electrohydrodynamic actuator, on the velocity, pressure and the wake behind cylinder has been considered. According to the results, it is observed that the fluid flow accelerates in the nearest wall of the frontal half of the cylinder and the pressure difference between frontal and hinder cylinder is increased.

Keywords: CFD, corona discharge, electro hydrodynamics, flow around square cylinders.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 822

48 Predication Model for Leukemia Diseases Based on Data Mining Classification Algorithms with Best Accuracy

Authors: Fahd Sabry Esmail, M. Badr Senousy, Mohamed Ragaie

Abstract:

In recent years, there has been an explosion in the rate of using technology that help discovering the diseases. For example, DNA microarrays allow us for the first time to obtain a "global" view of the cell. It has great potential to provide accurate medical diagnosis, to help in finding the right treatment and cure for many diseases. Various classification algorithms can be applied on such micro-array datasets to devise methods that can predict the occurrence of Leukemia disease. In this study, we compared the classification accuracy and response time among eleven decision tree methods and six rule classifier methods using five performance criteria. The experiment results show that the performance of Random Tree is producing better result. Also it takes lowest time to build model in tree classifier. The classification rules algorithms such as nearest- neighbor-like algorithm (NNge) is the best algorithm due to the high accuracy and it takes lowest time to build model in classification.

Keywords: Data mining, classification techniques, decision tree, classification rule, leukemia diseases, microarray data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2489

47 Predicting Protein-Protein Interactions from Protein Sequences Using Phylogenetic Profiles

Authors: Omer Nebil Yaveroglu, Tolga Can

Abstract:

In this study, a high accuracy protein-protein interaction prediction method is developed. The importance of the proposed method is that it only uses sequence information of proteins while predicting interaction. The method extracts phylogenetic profiles of proteins by using their sequence information. Combining the phylogenetic profiles of two proteins by checking existence of homologs in different species and fitting this combined profile into a statistical model, it is possible to make predictions about the interaction status of two proteins. For this purpose, we apply a collection of pattern recognition techniques on the dataset of combined phylogenetic profiles of protein pairs. Support Vector Machines, Feature Extraction using ReliefF, Naive Bayes Classification, K-Nearest Neighborhood Classification, Decision Trees, and Random Forest Classification are the methods we applied for finding the classification method that best predicts the interaction status of protein pairs. Random Forest Classification outperformed all other methods with a prediction accuracy of 76.93%

Keywords: Protein Interaction Prediction, Phylogenetic Profile, SVM , ReliefF, Decision Trees, Random Forest Classification

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1567

46 Codebook Generation for Vector Quantization on Orthogonal Polynomials based Transform Coding

Authors: R. Krishnamoorthi, N. Kannan

Abstract:

In this paper, a new algorithm for generating codebook is proposed for vector quantization (VQ) in image coding. The significant features of the training image vectors are extracted by using the proposed Orthogonal Polynomials based transformation. We propose to generate the codebook by partitioning these feature vectors into a binary tree. Each feature vector at a non-terminal node of the binary tree is directed to one of the two descendants by comparing a single feature associated with that node to a threshold. The binary tree codebook is used for encoding and decoding the feature vectors. In the decoding process the feature vectors are subjected to inverse transformation with the help of basis functions of the proposed Orthogonal Polynomials based transformation to get back the approximated input image training vectors. The results of the proposed coding are compared with the VQ using Discrete Cosine Transform (DCT) and Pairwise Nearest Neighbor (PNN) algorithm. The new algorithm results in a considerable reduction in computation time and provides better reconstructed picture quality.

Keywords: Orthogonal Polynomials, Image Coding, Vector Quantization, TSVQ, Binary Tree Classifier

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2095

45 Assessment of Time-Lapse in Visible and Thermal Face Recognition

Authors: Sajad Farokhi, Siti Mariyam Shamsuddin, Jan Flusser, Usman Ullah Sheikh

Abstract:

Although face recognition seems as an easy task for human, automatic face recognition is a much more challenging task due to variations in time, illumination and pose. In this paper, the influence of time-lapse on visible and thermal images is examined. Orthogonal moment invariants are used as a feature extractor to analyze the effect of time-lapse on thermal and visible images and the results are compared with conventional Principal Component Analysis (PCA). A new triangle square ratio criterion is employed instead of Euclidean distance to enhance the performance of nearest neighbor classifier. The results of this study indicate that the ideal feature vectors can be represented with high discrimination power due to the global characteristic of orthogonal moment invariants. Moreover, the effect of time-lapse has been decreasing and enhancing the accuracy of face recognition considerably in comparison with PCA. Furthermore, our experimental results based on moment invariant and triangle square ratio criterion show that the proposed approach achieves on average 13.6% higher in recognition rate than PCA.

Keywords: Infrared Face recognition, Time-lapse, Zernike moment invariants

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1735