Search results for: model based clustering
37873 Support Vector Machine Based Retinal Therapeutic for Glaucoma Using Machine Learning Algorithm
Authors: P. S. Jagadeesh Kumar, Mingmin Pan, Yang Yung, Tracy Lin Huan
Abstract:
Glaucoma is a group of visual maladies represented by the scheduled optic nerve neuropathy; means to the increasing dwindling in vision ground, resulting in loss of sight. In this paper, a novel support vector machine based retinal therapeutic for glaucoma using machine learning algorithm is conservative. The algorithm has fitting pragmatism; subsequently sustained on correlation clustering mode, it visualizes perfect computations in the multi-dimensional space. Support vector clustering turns out to be comparable to the scale-space advance that investigates the cluster organization by means of a kernel density estimation of the likelihood distribution, where cluster midpoints are idiosyncratic by the neighborhood maxima of the concreteness. The predicted planning has 91% attainment rate on data set deterrent on a consolidation of 500 realistic images of resolute and glaucoma retina; therefore, the computational benefit of depending on the cluster overlapping system pedestal on machine learning algorithm has complete performance in glaucoma therapeutic.Keywords: machine learning algorithm, correlation clustering mode, cluster overlapping system, glaucoma, kernel density estimation, retinal therapeutic
Procedia PDF Downloads 25437872 Filtering Intrusion Detection Alarms Using Ant Clustering Approach
Authors: Ghodhbani Salah, Jemili Farah
Abstract:
With the growth of cyber attacks, information safety has become an important issue all over the world. Many firms rely on security technologies such as intrusion detection systems (IDSs) to manage information technology security risks. IDSs are considered to be the last line of defense to secure a network and play a very important role in detecting large number of attacks. However the main problem with today’s most popular commercial IDSs is generating high volume of alerts and huge number of false positives. This drawback has become the main motivation for many research papers in IDS area. Hence, in this paper we present a data mining technique to assist network administrators to analyze and reduce false positive alarms that are produced by an IDS and increase detection accuracy. Our data mining technique is unsupervised clustering method based on hybrid ANT algorithm. This algorithm discovers clusters of intruders’ behavior without prior knowledge of a possible number of classes, then we apply K-means algorithm to improve the convergence of the ANT clustering. Experimental results on real dataset show that our proposed approach is efficient with high detection rate and low false alarm rate.Keywords: intrusion detection system, alarm filtering, ANT class, ant clustering, intruders’ behaviors, false alarms
Procedia PDF Downloads 40337871 Time-Series Load Data Analysis for User Power Profiling
Authors: Mahdi Daghmhehci Firoozjaei, Minchang Kim, Dima Alhadidi
Abstract:
In this paper, we present a power profiling model for smart grid consumers based on real time load data acquired smart meters. It profiles consumers’ power consumption behaviour using the dynamic time warping (DTW) clustering algorithm. Due to the invariability of signal warping of this algorithm, time-disordered load data can be profiled and consumption features be extracted. Two load types are defined and the related load patterns are extracted for classifying consumption behaviour by DTW. The classification methodology is discussed in detail. To evaluate the performance of the method, we analyze the time-series load data measured by a smart meter in a real case. The results verify the effectiveness of the proposed profiling method with 90.91% true positive rate for load type clustering in the best case.Keywords: power profiling, user privacy, dynamic time warping, smart grid
Procedia PDF Downloads 14837870 An Empirical Study to Predict Myocardial Infarction Using K-Means and Hierarchical Clustering
Authors: Md. Minhazul Islam, Shah Ashisul Abed Nipun, Majharul Islam, Md. Abdur Rakib Rahat, Jonayet Miah, Salsavil Kayyum, Anwar Shadaab, Faiz Al Faisal
Abstract:
The target of this research is to predict Myocardial Infarction using unsupervised Machine Learning algorithms. Myocardial Infarction Prediction related to heart disease is a challenging factor faced by doctors & hospitals. In this prediction, accuracy of the heart disease plays a vital role. From this concern, the authors have analyzed on a myocardial dataset to predict myocardial infarction using some popular Machine Learning algorithms K-Means and Hierarchical Clustering. This research includes a collection of data and the classification of data using Machine Learning Algorithms. The authors collected 345 instances along with 26 attributes from different hospitals in Bangladesh. This data have been collected from patients suffering from myocardial infarction along with other symptoms. This model would be able to find and mine hidden facts from historical Myocardial Infarction cases. The aim of this study is to analyze the accuracy level to predict Myocardial Infarction by using Machine Learning techniques.Keywords: Machine Learning, K-means, Hierarchical Clustering, Myocardial Infarction, Heart Disease
Procedia PDF Downloads 20337869 Approach Based on Fuzzy C-Means for Band Selection in Hyperspectral Images
Authors: Diego Saqui, José H. Saito, José R. Campos, Lúcio A. de C. Jorge
Abstract:
Hyperspectral images and remote sensing are important for many applications. A problem in the use of these images is the high volume of data to be processed, stored and transferred. Dimensionality reduction techniques can be used to reduce the volume of data. In this paper, an approach to band selection based on clustering algorithms is presented. This approach allows to reduce the volume of data. The proposed structure is based on Fuzzy C-Means (or K-Means) and NWHFC algorithms. New attributes in relation to other studies in the literature, such as kurtosis and low correlation, are also considered. A comparison of the results of the approach using the Fuzzy C-Means and K-Means with different attributes is performed. The use of both algorithms show similar good results but, particularly when used attributes variance and kurtosis in the clustering process, however applicable in hyperspectral images.Keywords: band selection, fuzzy c-means, k-means, hyperspectral image
Procedia PDF Downloads 40737868 Web Proxy Detection via Bipartite Graphs and One-Mode Projections
Authors: Zhipeng Chen, Peng Zhang, Qingyun Liu, Li Guo
Abstract:
With the Internet becoming the dominant channel for business and life, many IPs are increasingly masked using web proxies for illegal purposes such as propagating malware, impersonate phishing pages to steal sensitive data or redirect victims to other malicious targets. Moreover, as Internet traffic continues to grow in size and complexity, it has become an increasingly challenging task to detect the proxy service due to their dynamic update and high anonymity. In this paper, we present an approach based on behavioral graph analysis to study the behavior similarity of web proxy users. Specifically, we use bipartite graphs to model host communications from network traffic and build one-mode projections of bipartite graphs for discovering social-behavior similarity of web proxy users. Based on the similarity matrices of end-users from the derived one-mode projection graphs, we apply a simple yet effective spectral clustering algorithm to discover the inherent web proxy users behavior clusters. The web proxy URL may vary from time to time. Still, the inherent interest would not. So, based on the intuition, by dint of our private tools implemented by WebDriver, we examine whether the top URLs visited by the web proxy users are web proxies. Our experiment results based on real datasets show that the behavior clusters not only reduce the number of URLs analysis but also provide an effective way to detect the web proxies, especially for the unknown web proxies.Keywords: bipartite graph, one-mode projection, clustering, web proxy detection
Procedia PDF Downloads 24537867 Identification of Biological Pathways Causative for Breast Cancer Using Unsupervised Machine Learning
Authors: Karthik Mittal
Abstract:
This study performs an unsupervised machine learning analysis to find clusters of related SNPs which highlight biological pathways that are important for the biological mechanisms of breast cancer. Studying genetic variations in isolation is illogical because these genetic variations are known to modulate protein production and function; the downstream effects of these modifications on biological outcomes are highly interconnected. After extracting the SNPs and their effect on different types of breast cancer using the MRBase library, two unsupervised machine learning clustering algorithms were implemented on the genetic variants: a k-means clustering algorithm and a hierarchical clustering algorithm; furthermore, principal component analysis was executed to visually represent the data. These algorithms specifically used the SNP’s beta value on the three different types of breast cancer tested in this project (estrogen-receptor positive breast cancer, estrogen-receptor negative breast cancer, and breast cancer in general) to perform this clustering. Two significant genetic pathways validated the clustering produced by this project: the MAPK signaling pathway and the connection between the BRCA2 gene and the ESR1 gene. This study provides the first proof of concept showing the importance of unsupervised machine learning in interpreting GWAS summary statistics.Keywords: breast cancer, computational biology, unsupervised machine learning, k-means, PCA
Procedia PDF Downloads 14637866 A Mixing Matrix Estimation Algorithm for Speech Signals under the Under-Determined Blind Source Separation Model
Authors: Jing Wu, Wei Lv, Yibing Li, Yuanfan You
Abstract:
The separation of speech signals has become a research hotspot in the field of signal processing in recent years. It has many applications and influences in teleconferencing, hearing aids, speech recognition of machines and so on. The sounds received are usually noisy. The issue of identifying the sounds of interest and obtaining clear sounds in such an environment becomes a problem worth exploring, that is, the problem of blind source separation. This paper focuses on the under-determined blind source separation (UBSS). Sparse component analysis is generally used for the problem of under-determined blind source separation. The method is mainly divided into two parts. Firstly, the clustering algorithm is used to estimate the mixing matrix according to the observed signals. Then the signal is separated based on the known mixing matrix. In this paper, the problem of mixing matrix estimation is studied. This paper proposes an improved algorithm to estimate the mixing matrix for speech signals in the UBSS model. The traditional potential algorithm is not accurate for the mixing matrix estimation, especially for low signal-to noise ratio (SNR).In response to this problem, this paper considers the idea of an improved potential function method to estimate the mixing matrix. The algorithm not only avoids the inuence of insufficient prior information in traditional clustering algorithm, but also improves the estimation accuracy of mixing matrix. This paper takes the mixing of four speech signals into two channels as an example. The results of simulations show that the approach in this paper not only improves the accuracy of estimation, but also applies to any mixing matrix.Keywords: DBSCAN, potential function, speech signal, the UBSS model
Procedia PDF Downloads 13537865 A 5G Architecture Based to Dynamic Vehicular Clustering Enhancing VoD Services Over Vehicular Ad hoc Networks
Authors: Lamaa Sellami, Bechir Alaya
Abstract:
Nowadays, video-on-demand (VoD) applications are becoming one of the tendencies driving vehicular network users. In this paper, considering the unpredictable vehicle density, the unexpected acceleration or deceleration of the different cars included in the vehicular traffic load, and the limited radio range of the employed communication scheme, we introduce the “Dynamic Vehicular Clustering” (DVC) algorithm as a new scheme for video streaming systems over VANET. The proposed algorithm takes advantage of the concept of small cells and the introduction of wireless backhauls, inspired by the different features and the performance of the Long Term Evolution (LTE)- Advanced network. The proposed clustering algorithm considers multiple characteristics such as the vehicle’s position and acceleration to reduce latency and packet loss. Therefore, each cluster is counted as a small cell containing vehicular nodes and an access point that is elected regarding some particular specifications.Keywords: video-on-demand, vehicular ad-hoc network, mobility, vehicular traffic load, small cell, wireless backhaul, LTE-advanced, latency, packet loss
Procedia PDF Downloads 13937864 Altered Network Organization in Mild Alzheimer's Disease Compared to Mild Cognitive Impairment Using Resting-State EEG
Authors: Chia-Feng Lu, Yuh-Jen Wang, Shin Teng, Yu-Te Wu, Sui-Hing Yan
Abstract:
Brain functional networks based on resting-state EEG data were compared between patients with mild Alzheimer’s disease (mAD) and matched patients with amnestic subtype of mild cognitive impairment (aMCI). We integrated the time–frequency cross mutual information (TFCMI) method to estimate the EEG functional connectivity between cortical regions and the network analysis based on graph theory to further investigate the alterations of functional networks in mAD compared with aMCI group. We aimed at investigating the changes of network integrity, local clustering, information processing efficiency, and fault tolerance in mAD brain networks for different frequency bands based on several topological properties, including degree, strength, clustering coefficient, shortest path length, and efficiency. Results showed that the disruptions of network integrity and reductions of network efficiency in mAD characterized by lower degree, decreased clustering coefficient, higher shortest path length, and reduced global and local efficiencies in the delta, theta, beta2, and gamma bands were evident. The significant changes in network organization can be used in assisting discrimination of mAD from aMCI in clinical.Keywords: EEG, functional connectivity, graph theory, TFCMI
Procedia PDF Downloads 43137863 Evaluation of Security and Performance of Master Node Protocol in the Bitcoin Peer-To-Peer Network
Authors: Muntadher Sallal, Gareth Owenson, Mo Adda, Safa Shubbar
Abstract:
Bitcoin is a digital currency based on a peer-to-peer network to propagate and verify transactions. Bitcoin is gaining wider adoption than any previous crypto-currency. However, the mechanism of peers randomly choosing logical neighbors without any knowledge about underlying physical topology can cause a delay overhead in information propagation, which makes the system vulnerable to double-spend attacks. Aiming at alleviating the propagation delay problem, this paper introduces proximity-aware extensions to the current Bitcoin protocol, named Master Node Based Clustering (MNBC). The ultimate purpose of the proposed protocol, that are based on how clusters are formulated and how nodes can define their membership, is to improve the information propagation delay in the Bitcoin network. In MNBC protocol, physical internet connectivity increases, as well as the number of hops between nodes, decreases through assigning nodes to be responsible for maintaining clusters based on physical internet proximity. We show, through simulations, that the proposed protocol defines better clustering structures that optimize the performance of the transaction propagation over the Bitcoin protocol. The evaluation of partition attacks in the MNBC protocol, as well as the Bitcoin network, was done in this paper. Evaluation results prove that even though the Bitcoin network is more resistant against the partitioning attack than the MNBC protocol, more resources are needed to be spent to split the network in the MNBC protocol, especially with a higher number of nodes.Keywords: Bitcoin network, propagation delay, clustering, scalability
Procedia PDF Downloads 11537862 Dissimilarity Measure for General Histogram Data and Its Application to Hierarchical Clustering
Authors: K. Umbleja, M. Ichino
Abstract:
Symbolic data mining has been developed to analyze data in very large datasets. It is also useful in cases when entry specific details should remain hidden. Symbolic data mining is quickly gaining popularity as datasets in need of analyzing are becoming ever larger. One type of such symbolic data is a histogram, which enables to save huge amounts of information into a single variable with high-level of granularity. Other types of symbolic data can also be described in histograms, therefore making histogram a very important and general symbolic data type - a method developed for histograms - can also be applied to other types of symbolic data. Due to its complex structure, analyzing histograms is complicated. This paper proposes a method, which allows to compare two histogram-valued variables and therefore find a dissimilarity between two histograms. Proposed method uses the Ichino-Yaguchi dissimilarity measure for mixed feature-type data analysis as a base and develops a dissimilarity measure specifically for histogram data, which allows to compare histograms with different number of bins and bin widths (so called general histogram). Proposed dissimilarity measure is then used as a measure for clustering. Furthermore, linkage method based on weighted averages is proposed with the concept of cluster compactness to measure the quality of clustering. The method is then validated with application on real datasets. As a result, the proposed dissimilarity measure is found producing adequate and comparable results with general histograms without the loss of detail or need to transform the data.Keywords: dissimilarity measure, hierarchical clustering, histograms, symbolic data analysis
Procedia PDF Downloads 16237861 Detection Method of Federated Learning Backdoor Based on Weighted K-Medoids
Authors: Xun Li, Haojie Wang
Abstract:
Federated learning is a kind of distributed training and centralized training mode, which is of great value in the protection of user privacy. In order to solve the problem that the model is vulnerable to backdoor attacks in federated learning, a backdoor attack detection method based on a weighted k-medoids algorithm is proposed. First of all, this paper collates the update parameters of the client to construct a vector group, then uses the principal components analysis (PCA) algorithm to extract the corresponding feature information from the vector group, and finally uses the improved k-medoids clustering algorithm to identify the normal and backdoor update parameters. In this paper, the backdoor is implanted in the federation learning model through the model replacement attack method in the simulation experiment, and the update parameters from the attacker are effectively detected and removed by the defense method proposed in this paper.Keywords: federated learning, backdoor attack, PCA, k-medoids, backdoor defense
Procedia PDF Downloads 11437860 A Comparison of South East Asian Face Emotion Classification based on Optimized Ellipse Data Using Clustering Technique
Authors: M. Karthigayan, M. Rizon, Sazali Yaacob, R. Nagarajan, M. Muthukumaran, Thinaharan Ramachandran, Sargunam Thirugnanam
Abstract:
In this paper, using a set of irregular and regular ellipse fitting equations using Genetic algorithm (GA) are applied to the lip and eye features to classify the human emotions. Two South East Asian (SEA) faces are considered in this work for the emotion classification. There are six emotions and one neutral are considered as the output. Each subject shows unique characteristic of the lip and eye features for various emotions. GA is adopted to optimize irregular ellipse characteristics of the lip and eye features in each emotion. That is, the top portion of lip configuration is a part of one ellipse and the bottom of different ellipse. Two ellipse based fitness equations are proposed for the lip configuration and relevant parameters that define the emotions are listed. The GA method has achieved reasonably successful classification of emotion. In some emotions classification, optimized data values of one emotion are messed or overlapped to other emotion ranges. In order to overcome the overlapping problem between the emotion optimized values and at the same time to improve the classification, a fuzzy clustering method (FCM) of approach has been implemented to offer better classification. The GA-FCM approach offers a reasonably good classification within the ranges of clusters and it had been proven by applying to two SEA subjects and have improved the classification rate.Keywords: ellipse fitness function, genetic algorithm, emotion recognition, fuzzy clustering
Procedia PDF Downloads 54637859 Predicting the Human Impact of Natural Onset Disasters Using Pattern Recognition Techniques and Rule Based Clustering
Authors: Sara Hasani
Abstract:
This research focuses on natural sudden onset disasters characterised as ‘occurring with little or no warning and often cause excessive injuries far surpassing the national response capacities’. Based on the panel analysis of the historic record of 4,252 natural onset disasters between 1980 to 2015, a predictive method was developed to predict the human impact of the disaster (fatality, injured, homeless) with less than 3% of errors. The geographical dispersion of the disasters includes every country where the data were available and cross-examined from various humanitarian sources. The records were then filtered into 4252 records of the disasters where the five predictive variables (disaster type, HDI, DRI, population, and population density) were clearly stated. The procedure was designed based on a combination of pattern recognition techniques and rule-based clustering for prediction and discrimination analysis to validate the results further. The result indicates that there is a relationship between the disaster human impact and the five socio-economic characteristics of the affected country mentioned above. As a result, a framework was put forward, which could predict the disaster’s human impact based on their severity rank in the early hours of disaster strike. The predictions in this model were outlined in two worst and best-case scenarios, which respectively inform the lower range and higher range of the prediction. A necessity to develop the predictive framework can be highlighted by noticing that despite the existing research in literature, a framework for predicting the human impact and estimating the needs at the time of the disaster is yet to be developed. This can further be used to allocate the resources at the response phase of the disaster where the data is scarce.Keywords: disaster management, natural disaster, pattern recognition, prediction
Procedia PDF Downloads 15337858 LiDAR Based Real Time Multiple Vehicle Detection and Tracking
Authors: Zhongzhen Luo, Saeid Habibi, Martin v. Mohrenschildt
Abstract:
Self-driving vehicle require a high level of situational awareness in order to maneuver safely when driving in real world condition. This paper presents a LiDAR based real time perception system that is able to process sensor raw data for multiple target detection and tracking in dynamic environment. The proposed algorithm is nonparametric and deterministic that is no assumptions and priori knowledge are needed from the input data and no initializations are required. Additionally, the proposed method is working on the three-dimensional data directly generated by LiDAR while not scarifying the rich information contained in the domain of 3D. Moreover, a fast and efficient for real time clustering algorithm is applied based on a radially bounded nearest neighbor (RBNN). Hungarian algorithm procedure and adaptive Kalman filtering are used for data association and tracking algorithm. The proposed algorithm is able to run in real time with average run time of 70ms per frame.Keywords: lidar, segmentation, clustering, tracking
Procedia PDF Downloads 42337857 Energy Efficient Clustering with Adaptive Particle Swarm Optimization
Authors: KumarShashvat, ArshpreetKaur, RajeshKumar, Raman Chadha
Abstract:
Wireless sensor networks have principal characteristic of having restricted energy and with limitation that energy of the nodes cannot be replenished. To increase the lifetime in this scenario WSN route for data transmission is opted such that utilization of energy along the selected route is negligible. For this energy efficient network, dandy infrastructure is needed because it impinges the network lifespan. Clustering is a technique in which nodes are grouped into disjoints and non–overlapping sets. In this technique data is collected at the cluster head. In this paper, Adaptive-PSO algorithm is proposed which forms energy aware clusters by minimizing the cost of locating the cluster head. The main concern is of the suitability of the swarms by adjusting the learning parameters of PSO. Particle Swarm Optimization converges quickly at the beginning stage of the search but during the course of time, it becomes stable and may be trapped in local optima. In suggested network model swarms are given the intelligence of the spiders which makes them capable enough to avoid earlier convergence and also help them to escape from the local optima. Comparison analysis with traditional PSO shows that new algorithm considerably enhances the performance where multi-dimensional functions are taken into consideration.Keywords: Particle Swarm Optimization, adaptive – PSO, comparison between PSO and A-PSO, energy efficient clustering
Procedia PDF Downloads 24637856 Issue Reorganization Using the Measure of Relevance
Authors: William Wong Xiu Shun, Yoonjin Hyun, Mingyu Kim, Seongi Choi, Namgyu Kim
Abstract:
Recently, the demand of extracting the R&D keywords from the issues and using them in retrieving R&D information is increasing rapidly. But it is hard to identify the related issues or to distinguish them. Although the similarity between the issues cannot be identified, but with the R&D lexicon, the issues that always shared the same R&D keywords can be determined. In details, the R&D keywords that associated with particular issue is implied the key technology elements that needed to solve the problem of the particular issue. Furthermore, the related issues that sharing the same R&D keywords can be showed in a more systematic way through the issue clustering constructed from the perspective of R&D. Thus, sharing of the R&D result and reusable of the R&D technology can be facilitated. Indirectly, the redundancy of investment on the same R&D can be reduce as the R&D information can be shared between those corresponding issues and reusability of the related R&D can be improved. Therefore, a methodology of constructing an issue clustering from the perspective of common R&D keywords is proposed to satisfy the demands mentioned.Keywords: clustering, social network analysis, text mining, topic analysis
Procedia PDF Downloads 57337855 Routing and Energy Efficiency through Data Coupled Clustering in Large Scale Wireless Sensor Networks (WSNs)
Authors: Jainendra Singh, Zaheeruddin
Abstract:
A typical wireless sensor networks (WSNs) consists of several tiny and low-power sensors which use radio frequency to perform distributed sensing tasks. The longevity of wireless sensor networks (WSNs) is a major issue that impacts the application of such networks. While routing protocols are striving to save energy by acting on sensor nodes, recent studies show that network lifetime can be enhanced by further involving sink mobility. A common approach for energy efficiency is partitioning the network into clusters with correlated data, where the representative nodes simply transmit or average measurements inside the cluster. In this paper, we propose an energy- efficient homogenous clustering (EHC) technique. In this technique, the decision of each sensor is based on their residual energy and an estimate of how many of its neighboring cluster heads (CHs) will benefit from it being a CH. We, also explore the routing algorithm in clustered WSNs. We show that the proposed schemes significantly outperform current approaches in terms of packet delay, hop count and energy consumption of WSNs.Keywords: wireless sensor network, energy efficiency, clustering, routing
Procedia PDF Downloads 26437854 Clustering Based and Centralized Routing Table Topology of Control Protocol in Mobile Wireless Sensor Networks
Authors: Mbida Mohamed, Ezzati Abdellah
Abstract:
A strong challenge in the wireless sensor networks (WSN) is to save the energy and have a long life time in the network without having a high rate of loss information. However, topology control (TC) protocols are designed in a way that the network is divided and having a standard system of exchange packets between nodes. In this article, we will propose a clustering based and centralized routing table protocol of TC (CBCRT) which delegates a leader node that will encapsulate a single routing table in every cluster nodes. Hence, if a node wants to send packets to the sink, it requests the information's routing table of the current cluster from the node leader in order to root the packet.Keywords: mobile wireless sensor networks, routing, topology of control, protocols
Procedia PDF Downloads 27237853 Wavelet Based Residual Method of Detecting GSM Signal Strength Fading
Authors: Danladi Ali, Onah Festus Iloabuchi
Abstract:
In this paper, GSM signal strength was measured in order to detect the type of the signal fading phenomenon using one-dimensional multilevel wavelet residual method and neural network clustering to determine the average GSM signal strength received in the study area. The wavelet residual method predicted that the GSM signal experienced slow fading and attenuated with MSE of 3.875dB. The neural network clustering revealed that mostly -75dB, -85dB and -95dB were received. This means that the signal strength received in the study is a weak signal.Keywords: one-dimensional multilevel wavelets, path loss, GSM signal strength, propagation, urban environment
Procedia PDF Downloads 33837852 Water Detection in Aerial Images Using Fuzzy Sets
Authors: Caio Marcelo Nunes, Anderson da Silva Soares, Gustavo Teodoro Laureano, Clarimar Jose Coelho
Abstract:
This paper presents a methodology to pixel recognition in aerial images using fuzzy $c$-means algorithm. This algorithm is a alternative to recognize areas considering uncertainties and inaccuracies. Traditional clustering technics are used in recognizing of multispectral images of earth's surface. This technics recognize well-defined borders that can be easily discretized. However, in the real world there are many areas with uncertainties and inaccuracies which can be mapped by clustering algorithms that use fuzzy sets. The methodology presents in this work is applied to multispectral images obtained from Landsat-5/TM satellite. The pixels are joined using the $c$-means algorithm. After, a classification process identify the types of surface according the patterns obtained from spectral response of image surface. The classes considered are, exposed soil, moist soil, vegetation, turbid water and clean water. The results obtained shows that the fuzzy clustering identify the real type of the earth's surface.Keywords: aerial images, fuzzy clustering, image processing, pattern recognition
Procedia PDF Downloads 48237851 An Observation of the Information Technology Research and Development Based on Article Data Mining: A Survey Study on Science Direct
Authors: Muhammet Dursun Kaya, Hasan Asil
Abstract:
One of the most important factors of research and development is the deep insight into the evolutions of scientific development. The state-of-the-art tools and instruments can considerably assist the researchers, and many of the world organizations have become aware of the advantages of data mining for the acquisition of the knowledge required for the unstructured data. This paper was an attempt to review the articles on the information technology published in the past five years with the aid of data mining. A clustering approach was used to study these articles, and the research results revealed that three topics, namely health, innovation, and information systems, have captured the special attention of the researchers.Keywords: information technology, data mining, scientific development, clustering
Procedia PDF Downloads 27837850 Health Trajectory Clustering Using Deep Belief Networks
Authors: Farshid Hajati, Federico Girosi, Shima Ghassempour
Abstract:
We present a Deep Belief Network (DBN) method for clustering health trajectories. Deep Belief Network (DBN) is a deep architecture that consists of a stack of Restricted Boltzmann Machines (RBM). In a deep architecture, each layer learns more complex features than the past layers. The proposed method depends on DBN in clustering without using back propagation learning algorithm. The proposed DBN has a better a performance compared to the deep neural network due the initialization of the connecting weights. We use Contrastive Divergence (CD) method for training the RBMs which increases the performance of the network. The performance of the proposed method is evaluated extensively on the Health and Retirement Study (HRS) database. The University of Michigan Health and Retirement Study (HRS) is a nationally representative longitudinal study that has surveyed more than 27,000 elderly and near-elderly Americans since its inception in 1992. Participants are interviewed every two years and they collect data on physical and mental health, insurance coverage, financial status, family support systems, labor market status, and retirement planning. The dataset is publicly available and we use the RAND HRS version L, which is easy to use and cleaned up version of the data. The size of sample data set is 268 and the length of the trajectories is equal to 10. The trajectories do not stop when the patient dies and represent 10 different interviews of live patients. Compared to the state-of-the-art benchmarks, the experimental results show the effectiveness and superiority of the proposed method in clustering health trajectories.Keywords: health trajectory, clustering, deep learning, DBN
Procedia PDF Downloads 36937849 Clustering of Panels and Shade Diffusion Techniques for Partially Shaded PV Array-Review
Authors: Shahida Khatoon, Mohd. Faisal Jalil, Vaishali Gautam
Abstract:
The Photovoltaic (PV) generated power is mainly dependent on environmental factors. The PV array’s lifetime and overall systems effectiveness reduce due to the partial shading condition. Clustering the electrical connections between solar modules is a viable strategy for minimizing these power losses by shade diffusion. This article comprehensively evaluates various PV array clustering/reconfiguration models for PV systems. These are static and dynamic reconfiguration techniques for extracting maximum power in mismatch conditions. This paper explores and analyzes current breakthroughs in solar PV performance improvement strategies that merit further investigation. Altogether, researchers and academicians working in the field of dedicated solar power generation will benefit from this research.Keywords: static reconfiguration, dynamic reconfiguration, photo voltaic array, partial shading, CTC configuration
Procedia PDF Downloads 11637848 Proposing an Algorithm to Cluster Ad Hoc Networks, Modulating Two Levels of Learning Automaton and Nodes Additive Weighting
Authors: Mohammad Rostami, Mohammad Reza Forghani, Elahe Neshat, Fatemeh Yaghoobi
Abstract:
An Ad Hoc network consists of wireless mobile equipment which connects to each other without any infrastructure, using connection equipment. The best way to form a hierarchical structure is clustering. Various methods of clustering can form more stable clusters according to nodes' mobility. In this research we propose an algorithm, which allocates some weight to nodes based on factors, i.e. link stability and power reduction rate. According to the allocated weight in the previous phase, the cellular learning automaton picks out in the second phase nodes which are candidates for being cluster head. In the third phase, learning automaton selects cluster head nodes, member nodes and forms the cluster. Thus, this automaton does the learning from the setting and can form optimized clusters in terms of power consumption and link stability. To simulate the proposed algorithm we have used omnet++4.2.2. Simulation results indicate that newly formed clusters have a longer lifetime than previous algorithms and decrease strongly network overload by reducing update rate.Keywords: mobile Ad Hoc networks, clustering, learning automaton, cellular automaton, battery power
Procedia PDF Downloads 41137847 An Integrated Label Propagation Network for Structural Condition Assessment
Authors: Qingsong Xiong, Cheng Yuan, Qingzhao Kong, Haibei Xiong
Abstract:
Deep-learning-driven approaches based on vibration responses have attracted larger attention in rapid structural condition assessment while obtaining sufficient measured training data with corresponding labels is relevantly costly and even inaccessible in practical engineering. This study proposes an integrated label propagation network for structural condition assessment, which is able to diffuse the labels from continuously-generating measurements by intact structure to those of missing labels of damage scenarios. The integrated network is embedded with damage-sensitive features extraction by deep autoencoder and pseudo-labels propagation by optimized fuzzy clustering, the architecture and mechanism which are elaborated. With a sophisticated network design and specified strategies for improving performance, the present network achieves to extends the superiority of self-supervised representation learning, unsupervised fuzzy clustering and supervised classification algorithms into an integration aiming at assessing damage conditions. Both numerical simulations and full-scale laboratory shaking table tests of a two-story building structure were conducted to validate its capability of detecting post-earthquake damage. The identifying accuracy of a present network was 0.95 in numerical validations and an average 0.86 in laboratory case studies, respectively. It should be noted that the whole training procedure of all involved models in the network stringently doesn’t rely upon any labeled data of damage scenarios but only several samples of intact structure, which indicates a significant superiority in model adaptability and feasible applicability in practice.Keywords: autoencoder, condition assessment, fuzzy clustering, label propagation
Procedia PDF Downloads 9537846 Analysis of Expression Data Using Unsupervised Techniques
Authors: M. A. I Perera, C. R. Wijesinghe, A. R. Weerasinghe
Abstract:
his study was conducted to review and identify the unsupervised techniques that can be employed to analyze gene expression data in order to identify better subtypes of tumors. Identifying subtypes of cancer help in improving the efficacy and reducing the toxicity of the treatments by identifying clues to find target therapeutics. Process of gene expression data analysis described under three steps as preprocessing, clustering, and cluster validation. Feature selection is important since the genomic data are high dimensional with a large number of features compared to samples. Hierarchical clustering and K Means are often used in the analysis of gene expression data. There are several cluster validation techniques used in validating the clusters. Heatmaps are an effective external validation method that allows comparing the identified classes with clinical variables and visual analysis of the classes.Keywords: cancer subtypes, gene expression data analysis, clustering, cluster validation
Procedia PDF Downloads 14937845 A Minimum Spanning Tree-Based Method for Initializing the K-Means Clustering Algorithm
Authors: J. Yang, Y. Ma, X. Zhang, S. Li, Y. Zhang
Abstract:
The traditional k-means algorithm has been widely used as a simple and efficient clustering method. However, the algorithm often converges to local minima for the reason that it is sensitive to the initial cluster centers. In this paper, an algorithm for selecting initial cluster centers on the basis of minimum spanning tree (MST) is presented. The set of vertices in MST with same degree are regarded as a whole which is used to find the skeleton data points. Furthermore, a distance measure between the skeleton data points with consideration of degree and Euclidean distance is presented. Finally, MST-based initialization method for the k-means algorithm is presented, and the corresponding time complexity is analyzed as well. The presented algorithm is tested on five data sets from the UCI Machine Learning Repository. The experimental results illustrate the effectiveness of the presented algorithm compared to three existing initialization methods.Keywords: degree, initial cluster center, k-means, minimum spanning tree
Procedia PDF Downloads 41137844 A Technique for Image Segmentation Using K-Means Clustering Classification
Authors: Sadia Basar, Naila Habib, Awais Adnan
Abstract:
The paper presents the Technique for Image Segmentation Using K-Means Clustering Classification. The presented algorithms were specific, however, missed the neighboring information and required high-speed computerized machines to run the segmentation algorithms. Clustering is the process of partitioning a group of data points into a small number of clusters. The proposed method is content-aware and feature extraction method which is able to run on low-end computerized machines, simple algorithm, required low-quality streaming, efficient and used for security purpose. It has the capability to highlight the boundary and the object. At first, the user enters the data in the representation of the input. Then in the next step, the digital image is converted into groups clusters. Clusters are divided into many regions. The same categories with same features of clusters are assembled within a group and different clusters are placed in other groups. Finally, the clusters are combined with respect to similar features and then represented in the form of segments. The clustered image depicts the clear representation of the digital image in order to highlight the regions and boundaries of the image. At last, the final image is presented in the form of segments. All colors of the image are separated in clusters.Keywords: clustering, image segmentation, K-means function, local and global minimum, region
Procedia PDF Downloads 375