Search results for: Clustering Validation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 881

Search results for: Clustering Validation

731 Predictive Clustering Hybrid Regression(pCHR) Approach and Its Application to Sucrose-Based Biohydrogen Production

Authors: Nikhil, Ari Visa, Chin-Chao Chen, Chiu-Yue Lin, Jaakko A. Puhakka, Olli Yli-Harja

Abstract:

A predictive clustering hybrid regression (pCHR) approach was developed and evaluated using dataset from H2- producing sucrose-based bioreactor operated for 15 months. The aim was to model and predict the H2-production rate using information available about envirome and metabolome of the bioprocess. Selforganizing maps (SOM) and Sammon map were used to visualize the dataset and to identify main metabolic patterns and clusters in bioprocess data. Three metabolic clusters: acetate coupled with other metabolites, butyrate only, and transition phases were detected. The developed pCHR model combines principles of k-means clustering, kNN classification and regression techniques. The model performed well in modeling and predicting the H2-production rate with mean square error values of 0.0014 and 0.0032, respectively.

Keywords: Biohydrogen, bioprocess modeling, clusteringhybrid regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1724
730 A Comparison between Heuristic and Meta-Heuristic Methods for Solving the Multiple Traveling Salesman Problem

Authors: San Nah Sze, Wei King Tiong

Abstract:

The multiple traveling salesman problem (mTSP) can be used to model many practical problems. The mTSP is more complicated than the traveling salesman problem (TSP) because it requires determining which cities to assign to each salesman, as well as the optimal ordering of the cities within each salesman's tour. Previous studies proposed that Genetic Algorithm (GA), Integer Programming (IP) and several neural network (NN) approaches could be used to solve mTSP. This paper compared the results for mTSP, solved with Genetic Algorithm (GA) and Nearest Neighbor Algorithm (NNA). The number of cities is clustered into a few groups using k-means clustering technique. The number of groups depends on the number of salesman. Then, each group is solved with NNA and GA as an independent TSP. It is found that k-means clustering and NNA are superior to GA in terms of performance (evaluated by fitness function) and computing time.

Keywords: Multiple Traveling Salesman Problem, GeneticAlgorithm, Nearest Neighbor Algorithm, k-Means Clustering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3150
729 Enhancing K-Means Algorithm with Initial Cluster Centers Derived from Data Partitioning along the Data Axis with the Highest Variance

Authors: S. Deelers, S. Auwatanamongkol

Abstract:

In this paper, we propose an algorithm to compute initial cluster centers for K-means clustering. Data in a cell is partitioned using a cutting plane that divides cell in two smaller cells. The plane is perpendicular to the data axis with the highest variance and is designed to reduce the sum squared errors of the two cells as much as possible, while at the same time keep the two cells far apart as possible. Cells are partitioned one at a time until the number of cells equals to the predefined number of clusters, K. The centers of the K cells become the initial cluster centers for K-means. The experimental results suggest that the proposed algorithm is effective, converge to better clustering results than those of the random initialization method. The research also indicated the proposed algorithm would greatly improve the likelihood of every cluster containing some data in it.

Keywords: Clustering algorithm, K-means algorithm, Datapartitioning, Initial cluster centers.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2813
728 Recursive Similarity Hashing of Fractal Geometry

Authors: Timothee G. Leleu

Abstract:

A new technique of topological multi-scale analysis is introduced. By performing a clustering recursively to build a hierarchy, and analyzing the co-scale and intra-scale similarities, an Iterated Function System can be extracted from any data set. The study of fractals shows that this method is efficient to extract self-similarities, and can find elegant solutions the inverse problem of building fractals. The theoretical aspects and practical implementations are discussed, together with examples of analyses of simple fractals.

Keywords: hierarchical clustering, multi-scale analysis, Similarity hashing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1813
727 An Optimal Unsupervised Satellite image Segmentation Approach Based on Pearson System and k-Means Clustering Algorithm Initialization

Authors: Ahmed Rekik, Mourad Zribi, Ahmed Ben Hamida, Mohamed Benjelloun

Abstract:

This paper presents an optimal and unsupervised satellite image segmentation approach based on Pearson system and k-Means Clustering Algorithm Initialization. Such method could be considered as original by the fact that it utilised K-Means clustering algorithm for an optimal initialisation of image class number on one hand and it exploited Pearson system for an optimal statistical distributions- affectation of each considered class on the other hand. Satellite image exploitation requires the use of different approaches, especially those founded on the unsupervised statistical segmentation principle. Such approaches necessitate definition of several parameters like image class number, class variables- estimation and generalised mixture distributions. Use of statistical images- attributes assured convincing and promoting results under the condition of having an optimal initialisation step with appropriated statistical distributions- affectation. Pearson system associated with a k-means clustering algorithm and Stochastic Expectation-Maximization 'SEM' algorithm could be adapted to such problem. For each image-s class, Pearson system attributes one distribution type according to different parameters and especially the Skewness 'β1' and the kurtosis 'β2'. The different adapted algorithms, K-Means clustering algorithm, SEM algorithm and Pearson system algorithm, are then applied to satellite image segmentation problem. Efficiency of those combined algorithms was firstly validated with the Mean Quadratic Error 'MQE' evaluation, and secondly with visual inspection along several comparisons of these unsupervised images- segmentation.

Keywords: Unsupervised classification, Pearson system, Satellite image, Segmentation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1982
726 Development of a Clustered Network based on Unique Hop ID

Authors: Hemanth Kumar, A. R., Sudhakar G, Satyanarayana B. S.

Abstract:

In this paper, Land Marks for Unique Addressing( LMUA) algorithm is develped to generate unique ID for each and every node which leads to the formation of overlapping/Non overlapping clusters based on unique ID. To overcome the draw back of the developed LMUA algorithm, the concept of clustering is introduced. Based on the clustering concept a Land Marks for Unique Addressing and Clustering(LMUAC) Algorithm is developed to construct strictly non-overlapping clusters and classify those nodes in to Cluster Heads, Member Nodes, Gate way nodes and generating the Hierarchical code for the cluster heads to operate in the level one hierarchy for wireless communication switching. The expansion of the existing network can be performed or not without modifying the cost of adding the clusterhead is shown. The developed algorithm shows one way of efficiently constructing the

Keywords: Cluster Dimension, Cluster Basis, Metric Dimension, Metric Basis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1262
725 Verification and Validation of Simulated Process Models of KALBR-SIM Training Simulator

Authors: T. Jayanthi, K. Velusamy, H. Seetha, S. A. V. Satya Murty

Abstract:

Verification and Validation of Simulated Process Model is the most important phase of the simulator life cycle. Evaluation of simulated process models based on Verification and Validation techniques checks the closeness of each component model (in a simulated network) with the real system/process with respect to dynamic behaviour under steady state and transient conditions. The process of Verification and Validation helps in qualifying the process simulator for the intended purpose whether it is for providing comprehensive training or design verification. In general, model verification is carried out by comparison of simulated component characteristics with the original requirement to ensure that each step in the model development process completely incorporates all the design requirements. Validation testing is performed by comparing the simulated process parameters to the actual plant process parameters either in standalone mode or integrated mode. A Full Scope Replica Operator Training Simulator for PFBR - Prototype Fast Breeder Reactor has been developed at IGCAR, Kalpakkam, INDIA named KALBR-SIM (Kalpakkam Breeder Reactor Simulator) where in the main participants are engineers/experts belonging to Modeling Team, Process Design and Instrumentation & Control design team. This paper discusses about the Verification and Validation process in general, the evaluation procedure adopted for PFBR operator training Simulator, the methodology followed for verifying the models, the reference documents and standards used etc. It details out the importance of internal validation by design experts, subsequent validation by external agency consisting of experts from various fields, model improvement by tuning based on expert’s comments, final qualification of the simulator for the intended purpose and the difficulties faced while co-coordinating various activities.

Keywords: Verification and Validation (V&V), Prototype Fast Breeder Reactor (PFBR), Kalpakkam Breeder Reactor Simulator (KALBR-SIM), Steady State, Transient State.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2470
724 Analysis of Cooperative Learning Behavior Based on the Data of Students' Movement

Authors: Wang Lin, Li Zhiqiang

Abstract:

The purpose of this paper is to analyze the cooperative learning behavior pattern based on the data of students' movement. The study firstly reviewed the cooperative learning theory and its research status, and briefly introduced the k-means clustering algorithm. Then, it used clustering algorithm and mathematical statistics theory to analyze the activity rhythm of individual student and groups in different functional areas, according to the movement data provided by 10 first-year graduate students. It also focused on the analysis of students' behavior in the learning area and explored the law of cooperative learning behavior. The research result showed that the cooperative learning behavior analysis method based on movement data proposed in this paper is feasible. From the results of data analysis, the characteristics of behavior of students and their cooperative learning behavior patterns could be found.

Keywords: Behavior pattern, cooperative learning, data analyze, K-means clustering algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 758
723 Simultaneous Clustering and Feature Selection Method for Gene Expression Data

Authors: T. Chandrasekhar, K. Thangavel, E. N. Sathishkumar

Abstract:

Microarrays are made it possible to simultaneously monitor the expression profiles of thousands of genes under various experimental conditions. It is used to identify the co-expressed genes in specific cells or tissues that are actively used to make proteins. This method is used to analysis the gene expression, an important task in bioinformatics research. Cluster analysis of gene expression data has proved to be a useful tool for identifying co-expressed genes, biologically relevant groupings of genes and samples. In this work K-Means algorithms has been applied for clustering of Gene Expression Data. Further, rough set based Quick reduct algorithm has been applied for each cluster in order to select the most similar genes having high correlation. Then the ACV measure is used to evaluate the refined clusters and classification is used to evaluate the proposed method. They could identify compact clusters with feature selection method used to genes are selected.

Keywords: Clustering, Feature selection, Gene expression data, Quick reduct.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1924
722 Validation of the Formal Model of Web Services Applications for Digital Reference Service of Library Information System

Authors: Zainab M. Musa, Nordin M. A. Rahman, Julaily A. Jusoh

Abstract:

The web services applications for digital reference service (WSDRS) of LIS model is an informal model that claims to reduce the problems of digital reference services in libraries. It uses web services technology to provide efficient way of satisfying users’ needs in the reference section of libraries. The formal WSDRS model consists of the Z specifications of all the informal specifications of the model. This paper discusses the formal validation of the Z specifications of WSDRS model. The authors formally verify and thus validate the properties of the model using Z/EVES theorem prover.

Keywords: Validation, verification, formal, theorem proving.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1277
721 A New Approach for Image Segmentation using Pillar-Kmeans Algorithm

Authors: Ali Ridho Barakbah, Yasushi Kiyoki

Abstract:

This paper presents a new approach for image segmentation by applying Pillar-Kmeans algorithm. This segmentation process includes a new mechanism for clustering the elements of high-resolution images in order to improve precision and reduce computation time. The system applies K-means clustering to the image segmentation after optimized by Pillar Algorithm. The Pillar algorithm considers the pillars- placement which should be located as far as possible from each other to withstand against the pressure distribution of a roof, as identical to the number of centroids amongst the data distribution. This algorithm is able to optimize the K-means clustering for image segmentation in aspects of precision and computation time. It designates the initial centroids- positions by calculating the accumulated distance metric between each data point and all previous centroids, and then selects data points which have the maximum distance as new initial centroids. This algorithm distributes all initial centroids according to the maximum accumulated distance metric. This paper evaluates the proposed approach for image segmentation by comparing with K-means and Gaussian Mixture Model algorithm and involving RGB, HSV, HSL and CIELAB color spaces. The experimental results clarify the effectiveness of our approach to improve the segmentation quality in aspects of precision and computational time.

Keywords: Image segmentation, K-means clustering, Pillaralgorithm, color spaces.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3326
720 Graph-Based Text Similarity Measurement by Exploiting Wikipedia as Background Knowledge

Authors: Lu Zhang, Chunping Li, Jun Liu, Hui Wang

Abstract:

Text similarity measurement is a fundamental issue in many textual applications such as document clustering, classification, summarization and question answering. However, prevailing approaches based on Vector Space Model (VSM) more or less suffer from the limitation of Bag of Words (BOW), which ignores the semantic relationship among words. Enriching document representation with background knowledge from Wikipedia is proven to be an effective way to solve this problem, but most existing methods still cannot avoid similar flaws of BOW in a new vector space. In this paper, we propose a novel text similarity measurement which goes beyond VSM and can find semantic affinity between documents. Specifically, it is a unified graph model that exploits Wikipedia as background knowledge and synthesizes both document representation and similarity computation. The experimental results on two different datasets show that our approach significantly improves VSM-based methods in both text clustering and classification.

Keywords: Text classification, Text clustering, Text similarity, Wikipedia

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2060
719 Effective Keyword and Similarity Thresholds for the Discovery of Themes from the User Web Access Patterns

Authors: Haider A Ramadhan, Khalil Shihab

Abstract:

Clustering techniques have been used by many intelligent software agents to group similar access patterns of the Web users into high level themes which express users intentions and interests. However, such techniques have been mostly focusing on one salient feature of the Web document visited by the user, namely the extracted keywords. The major aim of these techniques is to come up with an optimal threshold for the number of keywords needed to produce more focused themes. In this paper we focus on both keyword and similarity thresholds to generate themes with concentrated themes, and hence build a more sound model of the user behavior. The purpose of this paper is two fold: use distance based clustering methods to recognize overall themes from the Proxy log file, and suggest an efficient cut off levels for the keyword and similarity thresholds which tend to produce more optimal clusters with better focus and efficient size.

Keywords: Data mining, knowledge discovery, clustering, dataanalysis, Web log analysis, theme based searching.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1408
718 Semi-Automatic Method to Assist Expert for Association Rules Validation

Authors: Amdouni Hamida, Gammoudi Mohamed Mohsen

Abstract:

In order to help the expert to validate association rules extracted from data, some quality measures are proposed in the literature. We distinguish two categories: objective and subjective measures. The first one depends on a fixed threshold and on data quality from which the rules are extracted. The second one consists on providing to the expert some tools in the objective to explore and visualize rules during the evaluation step. However, the number of extracted rules to validate remains high. Thus, the manually mining rules task is very hard. To solve this problem, we propose, in this paper, a semi-automatic method to assist the expert during the association rule's validation. Our method uses rule-based classification as follow: (i) We transform association rules into classification rules (classifiers), (ii) We use the generated classifiers for data classification. (iii) We visualize association rules with their quality classification to give an idea to the expert and to assist him during validation process.

Keywords: Association rules, Rule-based classification, Classification quality, Validation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1745
717 Comparative Study of Ad Hoc Routing Protocols in Vehicular Ad-Hoc Networks for Smart City

Authors: Khadija Raissi, Bechir Ben Gouissem

Abstract:

In this paper, we perform the investigation of some routing protocols in Vehicular Ad-Hoc Network (VANET) context. Indeed, we study the efficiency of protocols like Dynamic Source Routing (DSR), Ad hoc On-demand Distance Vector Routing (AODV), Destination Sequenced Distance Vector (DSDV), Optimized Link State Routing convention (OLSR) and Vehicular Multi-hop algorithm for Stable Clustering (VMASC) in terms of packet delivery ratio (PDR) and throughput. The performance evaluation and comparison between the studied protocols shows that the VMASC is the best protocols regarding fast data transmission and link stability in VANETs. The validation of all results is done by the NS3 simulator.

Keywords: VANET, smart city, AODV, OLSR, DSR, OLSR, VMASC, routing protocols, NS3.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 971
716 A Balanced Cost Cluster-Heads Selection Algorithm for Wireless Sensor Networks

Authors: Ouadoudi Zytoune, Youssef Fakhri, Driss Aboutajdine

Abstract:

This paper focuses on reducing the power consumption of wireless sensor networks. Therefore, a communication protocol named LEACH (Low-Energy Adaptive Clustering Hierarchy) is modified. We extend LEACHs stochastic cluster-head selection algorithm by a modifying the probability of each node to become cluster-head based on its required energy to transmit to the sink. We present an efficient energy aware routing algorithm for the wireless sensor networks. Our contribution consists in rotation selection of clusterheads considering the remoteness of the nodes to the sink, and then, the network nodes residual energy. This choice allows a best distribution of the transmission energy in the network. The cluster-heads selection algorithm is completely decentralized. Simulation results show that the energy is significantly reduced compared with the previous clustering based routing algorithm for the sensor networks.

Keywords: Wireless Sensor Networks, Energy efficiency, WirelessCommunications, Clustering-based algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2597
715 Volterra Filter for Color Image Segmentation

Authors: M. B. Meenavathi, K. Rajesh

Abstract:

Color image segmentation plays an important role in computer vision and image processing areas. In this paper, the features of Volterra filter are utilized for color image segmentation. The discrete Volterra filter exhibits both linear and nonlinear characteristics. The linear part smoothes the image features in uniform gray zones and is used for getting a gross representation of objects of interest. The nonlinear term compensates for the blurring due to the linear term and preserves the edges which are mainly used to distinguish the various objects. The truncated quadratic Volterra filters are mainly used for edge preserving along with Gaussian noise cancellation. In our approach, the segmentation is based on K-means clustering algorithm in HSI space. Both the hue and the intensity components are fully utilized. For hue clustering, the special cyclic property of the hue component is taken into consideration. The experimental results show that the proposed technique segments the color image while preserving significant features and removing noise effects.

Keywords: Color image segmentation, HSI space, K–means clustering, Volterra filter.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1812
714 A Constrained Clustering Algorithm for the Classification of Industrial Ores

Authors: Luciano Nieddu, Giuseppe Manfredi

Abstract:

In this paper a Pattern Recognition algorithm based on a constrained version of the k-means clustering algorithm will be presented. The proposed algorithm is a non parametric supervised statistical pattern recognition algorithm, i.e. it works under very mild assumptions on the dataset. The performance of the algorithm will be tested, togheter with a feature extraction technique that captures the information on the closed two-dimensional contour of an image, on images of industrial mineral ores.

Keywords: K-means, Industrial ores classification, Invariant Features, Supervised Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1333
713 Face Recognition Based On Vector Quantization Using Fuzzy Neuro Clustering

Authors: Elizabeth B. Varghese, M. Wilscy

Abstract:

A face recognition system is a computer application for automatically identifying or verifying a person from a digital image or a video frame. A lot of algorithms have been proposed for face recognition. Vector Quantization (VQ) based face recognition is a novel approach for face recognition. Here a new codebook generation for VQ based face recognition using Integrated Adaptive Fuzzy Clustering (IAFC) is proposed. IAFC is a fuzzy neural network which incorporates a fuzzy learning rule into a competitive neural network. The performance of proposed algorithm is demonstrated by using publicly available AT&T database, Yale database, Indian Face database and a small face database, DCSKU database created in our lab. In all the databases the proposed approach got a higher recognition rate than most of the existing methods. In terms of Equal Error Rate (ERR) also the proposed codebook is better than the existing methods.

Keywords: Face Recognition, Vector Quantization, Integrated Adaptive Fuzzy Clustering, Self Organization Map.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2193
712 A Fuzzy Time Series Forecasting Model for Multi-Variate Forecasting Analysis with Fuzzy C-Means Clustering

Authors: Emrah Bulut, Okan Duru, Shigeru Yoshida

Abstract:

In this study, a fuzzy integrated logical forecasting method (FILF) is extended for multi-variate systems by using a vector autoregressive model. Fuzzy time series forecasting (FTSF) method was recently introduced by Song and Chissom [1]-[2] after that Chen improved the FTSF method. Rather than the existing literature, the proposed model is not only compared with the previous FTS models, but also with the conventional time series methods such as the classical vector autoregressive model. The cluster optimization is based on the C-means clustering method. An empirical study is performed for the prediction of the chartering rates of a group of dry bulk cargo ships. The root mean squared error (RMSE) metric is used for the comparing of results of methods and the proposed method has superiority than both traditional FTS methods and also the classical time series methods.

Keywords: C-means clustering, Fuzzy time series, Multi-variate design

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2249
711 A Proposal for Systematic Mapping Study of Software Security Testing, Verification and Validation

Authors: Adriano Bessa Albuquerque, Francisco Jose Barreto Nunes

Abstract:

Software vulnerabilities are increasing and not only impact services and processes availability as well as information confidentiality, integrity and privacy, but also cause changes that interfere in the development process. Security test could be a solution to reduce vulnerabilities. However, the variety of test techniques with the lack of real case studies of applying tests focusing on software development life cycle compromise its effective use. This paper offers an overview of how a Systematic Mapping Study (MS) about security verification, validation and test (VVT) was performed, besides presenting general results about this study.

Keywords: Software test, software security verification validation and test, security test institutionalization, systematic mapping study.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1569
710 A Comparative Study on Fuzzy and Neuro-Fuzzy Enabled Cluster Based Routing Protocols for Wireless Sensor Networks

Authors: Y. Harold Robinson, E. Golden Julie

Abstract:

Dynamic Routing in Wireless Sensor Networks (WSNs) has played a significant task in research for the recent years. Energy consumption and data delivery in time are the major parameters with the usage of sensor nodes that are significant criteria for these networks. The location of sensor nodes must not be prearranged. Clustering in WSN is a key methodology which is used to enlarge the life-time of a sensor network. It consists of numerous real-time applications. The features of WSNs are minimized the consumption of energy. Soft computing techniques can be included to accomplish improved performance. This paper surveys the modern trends in routing enclose fuzzy logic and Neuro-fuzzy logic based on the clustering techniques and implements a comparative study of the numerous related methodologies.

Keywords: Wireless sensor networks, clustering, fuzzy logic, neuro-fuzzy logic, energy efficiency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 947
709 One-Class Support Vector Machines for Aerial Images Segmentation

Authors: Chih-Hung Wu, Chih-Chin Lai, Chun-Yen Chen, Yan-He Chen

Abstract:

Interpretation of aerial images is an important task in various applications. Image segmentation can be viewed as the essential step for extracting information from aerial images. Among many developed segmentation methods, the technique of clustering has been extensively investigated and used. However, determining the number of clusters in an image is inherently a difficult problem, especially when a priori information on the aerial image is unavailable. This study proposes a support vector machine approach for clustering aerial images. Three cluster validity indices, distance-based index, Davies-Bouldin index, and Xie-Beni index, are utilized as quantitative measures of the quality of clustering results. Comparisons on the effectiveness of these indices and various parameters settings on the proposed methods are conducted. Experimental results are provided to illustrate the feasibility of the proposed approach.

Keywords: Aerial imaging, image segmentation, machine learning, support vector machine, cluster validity index

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1895
708 Estimation of Uncertainty of Thermal Conductivity Measurement with Single Laboratory Validation Approach

Authors: Saowaluck Ukrisdawithid

Abstract:

The thermal conductivity of thermal insulation materials are measured by Heat Flow Meter (HFM) apparatus. The components of uncertainty are complex and difficult on routine measurement by modelling approach. In this study, uncertainty of thermal conductivity measurement was estimated by single laboratory validation approach. The within-laboratory reproducibility was 1.1%. The standard uncertainty of method and laboratory bias by using SRM1453 expanded polystyrene board was dominant at 1.4%. However, it was assessed that there was no significant bias. For sample measurement, the sources of uncertainty were repeatability, density of sample and thermal conductivity resolution of HFM. From this approach to sample measurements, the combined uncertainty was calculated. In summary, the thermal conductivity of sample, polystyrene foam, was reported as 0.03367 W/m·K ± 3.5% (k = 2) at mean temperature 23.5 °C. The single laboratory validation approach is simple key of routine testing laboratory for estimation uncertainty of thermal conductivity measurement by using HFM, according to ISO/IEC 17025-2017 requirements. These are meaningful for laboratory competent improvement, quality control on products, and conformity assessment.

Keywords: Single laboratory validation approach, within-laboratory reproducibility, method and laboratory bias, certified reference material.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 723
707 Face Recognition Using Principal Component Analysis, K-Means Clustering, and Convolutional Neural Network

Authors: Zukisa Nante, Wang Zenghui

Abstract:

Face recognition is the problem of identifying or recognizing individuals in an image. This paper investigates a possible method to bring a solution to this problem. The method proposes an amalgamation of Principal Component Analysis (PCA), K-Means clustering, and Convolutional Neural Network (CNN) for a face recognition system. It is trained and evaluated using the ORL dataset. This dataset consists of 400 different faces with 40 classes of 10 face images per class. Firstly, PCA enabled the usage of a smaller network. This reduces the training time of the CNN. Thus, we get rid of the redundancy and preserve the variance with a smaller number of coefficients. Secondly, the K-Means clustering model is trained using the compressed PCA obtained data which select the K-Means clustering centers with better characteristics. Lastly, the K-Means characteristics or features are an initial value of the CNN and act as input data. The accuracy and the performance of the proposed method were tested in comparison to other Face Recognition (FR) techniques namely PCA, Support Vector Machine (SVM), as well as K-Nearest Neighbour (kNN). During experimentation, the accuracy and the performance of our suggested method after 90 epochs achieved the highest performance: 99% accuracy F1-Score, 99% precision, and 99% recall in 463.934 seconds. It outperformed the PCA that obtained 97% and KNN with 84% during the conducted experiments. Therefore, this method proved to be efficient in identifying faces in the images.

Keywords: Face recognition, Principal Component Analysis, PCA, Convolutional Neural Network, CNN, Rectified Linear Unit, ReLU, feature extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 427
706 Using Data Mining Techniques for Finding Cardiac Outlier Patients

Authors: Farhan Ismaeel Dakheel, Raoof Smko, K. Negrat, Abdelsalam Almarimi

Abstract:

In this paper we used data mining techniques to identify outlier patients who are using large amount of drugs over a long period of time. Any healthcare or health insurance system should deal with the quantities of drugs utilized by chronic diseases patients. In Kingdom of Bahrain, about 20% of health budget is spent on medications. For the managers of healthcare systems, there is no enough information about the ways of drug utilization by chronic diseases patients, is there any misuse or is there outliers patients. In this work, which has been done in cooperation with information department in the Bahrain Defence Force hospital; we select the data for Cardiac patients in the period starting from 1/1/2008 to December 31/12/2008 to be the data for the model in this paper. We used three techniques for finding the drug utilization for cardiac patients. First we applied a clustering technique, followed by measuring of clustering validity, and finally we applied a decision tree as classification algorithm. The clustering results is divided into three clusters according to the drug utilization, for 1603 patients, who received 15,806 prescriptions during this period can be partitioned into three groups, where 23 patients (2.59%) who received 1316 prescriptions (8.32%) are classified to be outliers. The classification algorithm shows that the use of average drug utilization and the age, and the gender of the patient can be considered to be the main predictive factors in the induced model.

Keywords: Data Mining, Clustering, Classification, Drug Utilization..

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1850
705 Unsupervised Clustering Methods for Identifying Rare Events in Anomaly Detection

Authors: Witcha Chimphlee, Abdul Hanan Abdullah, Mohd Noor Md Sap, Siriporn Chimphlee, Surat Srinoy

Abstract:

It is important problems to increase the detection rates and reduce false positive rates in Intrusion Detection System (IDS). Although preventative techniques such as access control and authentication attempt to prevent intruders, these can fail, and as a second line of defence, intrusion detection has been introduced. Rare events are events that occur very infrequently, detection of rare events is a common problem in many domains. In this paper we propose an intrusion detection method that combines Rough set and Fuzzy Clustering. Rough set has to decrease the amount of data and get rid of redundancy. Fuzzy c-means clustering allow objects to belong to several clusters simultaneously, with different degrees of membership. Our approach allows us to recognize not only known attacks but also to detect suspicious activity that may be the result of a new, unknown attack. The experimental results on Knowledge Discovery and Data Mining-(KDDCup 1999) Dataset show that the method is efficient and practical for intrusion detection systems.

Keywords: Network and security, intrusion detection, fuzzy cmeans, rough set.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2802
704 Applying Hybrid Graph Drawing and Clustering Methods on Stock Investment Analysis

Authors: Mouataz Zreika, Maria Estela Varua

Abstract:

Stock investment decisions are often made based on current events of the global economy and the analysis of historical data. Conversely, visual representation could assist investors’ gain deeper understanding and better insight on stock market trends more efficiently. The trend analysis is based on long-term data collection. The study adopts a hybrid method that combines the Clustering algorithm and Force-directed algorithm to overcome the scalability problem when visualizing large data. This method exemplifies the potential relationships between each stock, as well as determining the degree of strength and connectivity, which will provide investors another understanding of the stock relationship for reference. Information derived from visualization will also help them make an informed decision. The results of the experiments show that the proposed method is able to produced visualized data aesthetically by providing clearer views for connectivity and edge weights.

Keywords: Clustering, force-directed, graph drawing, stock investment analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1552
703 Clustering for Detection of Population Groups at Risk from Anticholinergic Medication

Authors: Amirali Shirazibeheshti, Tarik Radwan, Alireza Ettefaghian, Farbod Khanizadeh, George Wilson, Cristina Luca

Abstract:

Anticholinergic medication has been associated with events such as falls, delirium, and cognitive impairment in older patients. To further assess this, anticholinergic burden scores have been developed to quantify risk. A risk model based on clustering was deployed in a healthcare management system to cluster patients into multiple risk groups according to anticholinergic burden scores of multiple medicines prescribed to patients to facilitate clinical decision-making. To do so, anticholinergic burden scores of drugs were extracted from the literature which categorizes the risk on a scale of 1 to 3. Given the patients’ prescription data on the healthcare database, a weighted anticholinergic risk score was derived per patient based on the prescription of multiple anticholinergic drugs. This study was conducted on 300,000 records of patients currently registered with a major regional UK-based healthcare provider. The weighted risk scores were used as inputs to an unsupervised learning algorithm (mean-shift clustering) that groups patients into clusters that represent different levels of anticholinergic risk. This work evaluates the association between the average risk score and measures of socioeconomic status (index of multiple deprivation) and health (index of health and disability). The clustering identifies a group of 15 patients at the highest risk from multiple anticholinergic medication. Our findings show that this group of patients is located within more deprived areas of London compared to the population of other risk groups. Furthermore, the prescription of anticholinergic medicines is more skewed to female than male patients, suggesting that females are more at risk from this kind of multiple medication. The risk may be monitored and controlled in a healthcare management system that is well-equipped with tools implementing appropriate techniques of artificial intelligence.

Keywords: Anticholinergic medication, socioeconomic status, deprivation, clustering, risk analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 990
702 Using Data Mining for Learning and Clustering FCM

Authors: Somayeh Alizadeh, Mehdi Ghazanfari, Mohammad Fathian

Abstract:

Fuzzy Cognitive Maps (FCMs) have successfully been applied in numerous domains to show relations between essential components. In some FCM, there are more nodes, which related to each other and more nodes means more complex in system behaviors and analysis. In this paper, a novel learning method used to construct FCMs based on historical data and by using data mining and DEMATEL method, a new method defined to reduce nodes number. This method cluster nodes in FCM based on their cause and effect behaviors.

Keywords: Clustering, Data Mining, Fuzzy Cognitive Map(FCM), Learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1970