Search results for: possibilistic clustering
558 Unsupervised Segmentation Technique for Acute Leukemia Cells Using Clustering Algorithms
Authors: N. H. Harun, A. S. Abdul Nasir, M. Y. Mashor, R. Hassan
Abstract:
Leukaemia is a blood cancer that contributes to the rising mortality rate in Malaysia each year. There are two main categories of leukaemia: acute and chronic. Acute leukaemia cells are produced and develop rapidly and uncontrollably. Therefore, if acute leukaemia cells can be identified quickly and effectively, proper treatment and medicine can be delivered. Because leukaemia requires prompt and accurate diagnosis, the current study proposes unsupervised pixel segmentation based on clustering algorithms in order to obtain a fully segmented abnormal white blood cell (blast) in an acute leukaemia image. To obtain the segmented blast, three clustering algorithms, namely k-means, fuzzy c-means and moving k-means, are applied to the saturation component image. Then, median filter and seeded region growing area extraction algorithms are applied to smooth the region of the segmented blast and to remove large unwanted regions from the image, respectively. The three clustering algorithms are compared in order to measure the performance of each on segmenting the blast area. Based on the good sensitivity value obtained, the results indicate that the moving k-means clustering algorithm successfully produces a fully segmented blast region in acute leukaemia images. Hence, the resultant images could be helpful to haematologists for further analysis of acute leukaemia.
Keywords: acute leukaemia images, clustering algorithms, image segmentation, moving k-means
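The clustering step described in this abstract can be pictured with a small sketch. It is only an illustration under stated assumptions, not the authors' implementation: it uses plain k-means (not the moving k-means variant), the function names `saturation` and `kmeans_1d` are hypothetical, the pixel values are made up, and the median filtering and region growing steps are omitted.

```python
import colorsys

def saturation(pixels):
    """Return the HSV saturation (0..1) of each (r, g, b) pixel with 0..255 channels."""
    return [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1] for r, g, b in pixels]

def kmeans_1d(values, k, iters=50):
    """Plain Lloyd's k-means on scalar values; returns (centers, labels)."""
    lo, hi = min(values), max(values)
    # Spread the initial centers evenly over the value range (an illustrative choice).
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

# Made-up pixels: three pale (low saturation) and three vivid (high saturation) colours.
pixels = [(200, 190, 195), (210, 205, 200), (180, 175, 178),
          (120, 20, 160), (100, 10, 150), (130, 30, 170)]
sat = saturation(pixels)
centers, labels = kmeans_1d(sat, k=2)
```

Here the cluster with the higher saturation center would be the candidate foreground; whether that corresponds to blast cells in a real image depends on staining and acquisition, which this sketch does not model.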
Procedia PDF Downloads 291
557 Ensuring Uniform Energy Consumption in Non-Deterministic Wireless Sensor Network to Protract Networks Lifetime
Authors: Vrince Vimal, Madhav J. Nigam
Abstract:
Wireless sensor networks have attracted much attention from researchers all around the world, owing to their extensive applicability in agricultural, industrial and military fields. Energy-conserving node deployment strategies play a notable role in the effective implementation of wireless sensor networks. Clustering is an approach that improves energy efficiency in the network. The clustering algorithm needs to have an optimum size and number of clusters, because clustering that is not implemented properly cannot effectively increase the life of the network. In this paper, an algorithm is proposed to address connectivity issues with the aim of ensuring uniform energy consumption of nodes in every part of the network. Simulation results show that the proposed algorithm has an edge over existing algorithms in terms of throughput and network lifetime.
Keywords: Wireless Sensor network (WSN), Random Deployment, Clustering, Isolated Nodes, Networks Lifetime
Procedia PDF Downloads 336
556 An Intelligent Traffic Management System Based on the WiFi and Bluetooth Sensing
Authors: Hamed Hossein Afshari, Shahrzad Jalali, Amir Hossein Ghods, Bijan Raahemi
Abstract:
This paper introduces an automated clustering solution that is applied to WiFi/Bluetooth sensing data and later used for traffic management applications. The paper first summarizes a number of clustering approaches and then shows their performance for noise removal. In this context, clustering is used to recognize WiFi and Bluetooth MAC addresses that belong to passengers traveling on a public urban transit bus. The main objective is to build an intelligent system that automatically filters out MAC addresses belonging to persons located outside the bus, for different routes in the city of Ottawa. The proposed intelligent system removes the need to define restrictive thresholds, which would otherwise reduce the accuracy and the range of applicability of the solution across different routes. The paper also discusses the performance benefits of the presented clustering approaches in terms of accuracy, time and space complexity, and ease of use. The clustering results can further be used for origin-destination estimation of individual passengers, for predicting the traffic load, and for intelligent management of urban bus schedules.
Keywords: WiFi-Bluetooth sensing, cluster analysis, artificial intelligence, traffic management
Procedia PDF Downloads 241
555 An Efficient Clustering Technique for Copy-Paste Attack Detection
Authors: N. Chaitawittanun, M. Munlin
Abstract:
Due to the rapid advancement of powerful image processing software, digital images are easy for ordinary people to manipulate and modify. Many digital images are edited for a specific purpose and are then difficult to distinguish from their originals. We propose a clustering method to detect copy-move image forgery in JPEG, BMP, TIFF, and PNG files. The process starts by reducing the colors of the photos. Then, we use the clustering technique to divide the measured data by Hausdorff distance. The results show that the proposed method is capable of inspecting the image file and correctly identifying the forgery.
Keywords: image detection, forgery image, copy-paste, attack detection
Procedia PDF Downloads 338
554 Application of Fuzzy Clustering on Classification Agile Supply Chain Firms
Authors: Hamidreza Fallah Lajimi, Elham Karami, Alireza Arab, Fatemeh Alinasab
Abstract:
Being responsive is an increasingly important skill for firms in today’s global economy; thus firms must be agile. Naturally, it follows that an organization’s agility depends on its supply chain being agile. However, achieving supply chain agility is a function of other abilities within the organization. This paper analyses results from a survey of 71 Iranian manufacturing companies in order to identify some of the factors that help agile organizations manage their supply chains. We then classify these companies into four clusters using the fuzzy c-means technique, and use four validation functions to determine the optimal number of clusters automatically.
Keywords: agile supply chain, clustering, fuzzy clustering, business engineering
Procedia PDF Downloads 713
553 Advances in Machine Learning and Deep Learning Techniques for Image Classification and Clustering
Authors: R. Nandhini, Gaurab Mudbhari
Abstract:
Ranging from health care to self-driving cars, machine learning and deep learning algorithms have revolutionized fields that rely on images and visually oriented data. Segmentation, regression, classification, clustering, dimensionality reduction, etc., are some of the machine learning tasks that have helped machine learning and deep learning models become state of the art in fields where images are the key datasets. Among these tasks, classification and clustering are essential but difficult because of the intricate and high-dimensional characteristics of image data. This study examines and assesses advanced techniques in supervised classification and unsupervised clustering for image datasets, emphasizing the relative efficiency of Convolutional Neural Networks (CNNs), Vision Transformers (ViTs), Deep Embedded Clustering (DEC), and self-supervised learning approaches. Because of the distinctive structural attributes present in images, conventional methods often fail to capture spatial patterns effectively, which has led to the development of models that use more advanced architectures and attention mechanisms. In image classification, we investigated both CNNs and ViTs. The CNN, one of the most promising models and well known for its ability to detect spatial hierarchies, serves as a core model in our study. The ViT also serves as a core model, reflecting a modern classification method whose self-attention mechanism makes it more robust by allowing it to learn global dependencies in images without relying on convolutional layers. This paper evaluates the performance of these two architectures based on accuracy, precision, recall, and F1-score across different image datasets, analyzing their appropriateness for various categories of images.
In the domain of clustering, we assess DEC, Variational Autoencoders (VAEs), and conventional clustering techniques like k-means, which are used on embeddings derived from CNN models. DEC, a prominent model in the field of clustering, has gained the attention of many ML engineers because of its ability to combine feature learning and clustering into a single framework; its main goal is to improve clustering quality through better feature representation. VAEs, on the other hand, are well known for using latent embeddings to group similar images without requiring prior labels, by means of a probabilistic clustering method.
Keywords: machine learning, deep learning, image classification, image clustering
Procedia PDF Downloads 11
552 Identifying Autism Spectrum Disorder Using Optimization-Based Clustering
Authors: Sharifah Mousli, Sona Taheri, Jiayuan He
Abstract:
Autism spectrum disorder (ASD) is a complex developmental condition involving persistent difficulties with social communication, restricted interests, and repetitive behavior. The challenges associated with ASD can interfere with an affected individual’s ability to function in social, academic, and employment settings. Although, to the best of our knowledge, there is no effective medication known to treat ASD, early intervention can significantly improve an affected individual’s overall development. Hence, an accurate diagnosis of ASD at an early phase is essential. The use of machine learning approaches improves and speeds up the diagnosis of ASD. In this paper, we focus on the application of unsupervised clustering methods in ASD, as the large volume of ASD data generated through hospitals, therapy centers, and mobile applications has no pre-existing labels. We conduct a comparative analysis using seven clustering approaches (K-means, agglomerative hierarchical, model-based, fuzzy-C-means, affinity propagation, self-organizing maps, and linear vector quantisation) as well as the recently developed optimization-based clustering (COMSEP-Clust) approach. We evaluate the performance of the clustering methods extensively on real-world ASD datasets encompassing different age groups: toddlers, children, adolescents, and adults. Our experimental results suggest that the COMSEP-Clust approach outperforms the other seven methods in recognizing ASD with well-separated clusters.
Keywords: autism spectrum disorder, clustering, optimization, unsupervised machine learning
Procedia PDF Downloads 116
551 Spectral Clustering from the Discrepancy View and Generalized Quasirandomness
Authors: Marianna Bolla
Abstract:
The aim of this paper is to compare spectral, discrepancy, and degree properties of expanding graph sequences. As we can prove equivalences and implications between them and the definition of the generalized (multiclass) quasirandomness of Lovasz–Sos (2008), they can be regarded as generalized quasirandom properties akin to the equivalent quasirandom properties of the seminal Chung-Graham-Wilson paper (1989) in the one-class scenario. Since these properties are valid for deterministic graph sequences, irrespective of stochastic models, the partial implications also justify low-dimensional embedding of large-scale graphs and discrepancy-minimizing spectral clustering.
Keywords: generalized random graphs, multiway discrepancy, normalized modularity spectra, spectral clustering
Procedia PDF Downloads 197
550 A Clustering Algorithm for Massive Texts
Authors: Ming Liu, Chong Wu, Bingquan Liu, Lei Chen
Abstract:
Internet users face a massive amount of textual data every day. Organizing texts into categories can help users dig useful information out of large-scale text collections. Clustering is one of the most promising tools for categorizing texts because it is unsupervised. Unfortunately, most traditional clustering algorithms lose quality on large-scale text collections, mainly because of the high-dimensional vectors generated from texts. To cluster large-scale text collections effectively and efficiently, this paper proposes a clustering algorithm based on vector reconstruction. Only the features that can represent the cluster are preserved in the cluster’s representative vector. The algorithm alternately repeats two sub-processes until it converges. The first is the partial tuning sub-process, in which each feature’s weight is fine-tuned iteratively. To accelerate clustering, an intersection-based similarity measurement and its corresponding neuron adjustment function are proposed and implemented in this sub-process. The second is the overall tuning sub-process, in which the features are reallocated among different clusters. In this sub-process, features that are useless for representing the cluster are removed from the cluster’s representative vector. Experimental results on three text collections (two small-scale and one large-scale) demonstrate that our algorithm obtains high quality on both small-scale and large-scale text collections.
Keywords: vector reconstruction, large-scale text clustering, partial tuning sub-process, overall tuning sub-process
Procedia PDF Downloads 435
549 Effect of Bi-Dispersity on Particle Clustering in Sedimentation
Authors: Ali Abbas Zaidi
Abstract:
In free settling or sedimentation, particles form clusters at high Reynolds numbers in dilute suspensions. This is due to the entrapment of particles in the wakes of upstream particles. In this paper, the effect of bi-dispersity of settling particles on particle clustering is investigated using particle-resolved direct numerical simulation. The immersed boundary method is used for particle-fluid interactions and the discrete element method is used for particle-particle interactions. The solid volume fraction used in the simulation is 1% and the Reynolds number based on the Sauter mean diameter is 350. Both the solid volume fraction and the Reynolds number lie in the clustering regime of sedimentation. In the simulations, the particle diameter ratio, i.e. the diameter of the larger particle to that of the smaller particle (d₁/d₂), takes the values 2:1, 3:1 and 4:1. For each particle diameter ratio, the ratio of solid volume fractions of the two particle sizes (φ₁/φ₂) takes the values 1:1, 1:2 and 2:1. For comparison, simulations are also performed for monodisperse particles. To study particle clustering, the radial distribution function and the instantaneous locations of particles in the computational domain are examined. It is observed that the degree of particle clustering decreases as the bi-dispersity of the settling particles increases. The smallest degree of particle clustering, i.e. the greatest dispersion of particles, is observed for particles with d₁/d₂ equal to 4:1 and φ₁/φ₂ equal to 1:2. The simulations show that the reduction in particle clustering with increasing bi-dispersity is due to the difference in the settling velocities of the particles. Larger particles settle faster and knock the smaller particles out of clustered regions in the computational domain.
Keywords: dispersion in bi-disperse settling particles, particle microstructures in bi-disperse suspensions, particle resolved direct numerical simulations, settling of bi-disperse particles
Procedia PDF Downloads 208
548 A Learning-Based EM Mixture Regression Algorithm
Authors: Yi-Cheng Tian, Miin-Shen Yang
Abstract:
The mixture likelihood approach to clustering is a popular clustering method in which the expectation-maximization (EM) algorithm is the most widely used technique. In the literature, the EM algorithm has been used for mixture regression models. However, these EM mixture regression algorithms are sensitive to initial values and require the number of clusters to be given a priori. In this paper, to resolve these drawbacks, we construct a learning-based schema for the EM mixture regression algorithm such that it is free of initializations and can automatically obtain an approximately optimal number of clusters. Numerical examples and comparisons demonstrate the superiority and usefulness of the proposed learning-based EM mixture regression algorithm.
Keywords: clustering, EM algorithm, Gaussian mixture model, mixture regression model
Procedia PDF Downloads 510
547 Communication of Sensors in Clustering for Wireless Sensor Networks
Authors: Kashish Sareen, Jatinder Singh Bal
Abstract:
The use of wireless sensor networks (WSNs) has grown vastly in recent years, highlighting the crucial need for scalable and energy-efficient routing, data gathering and aggregation protocols in large-scale environments. WSNs have recently emerged as an important computing platform and continue to grow in diverse areas, providing new opportunities for networking and services. However, the constrained energy and limited computing resources of the sensor nodes present major challenges in gathering data. The sensors collect data about their surroundings and forward it to a command centre through a base station. The past few years have witnessed increased interest in the potential use of WSNs, as they are very useful in target detection and other applications. Hierarchical clustering protocols have been the most widely used to improve overall system lifetime, scalability and energy efficiency. In this paper, the state of the art in hierarchical clustering approaches for large-scale WSN environments is presented.
Keywords: clustering, DLCC, MLCC, wireless sensor networks
Procedia PDF Downloads 482
546 Interpretation and Clustering Framework for Analyzing ECG Survey Data
Authors: Irum Matloob, Shoab Ahmad Khan, Fahim Arif
Abstract:
The Indo-Pak region has suffered from heart disease for many decades. Many surveys show that the percentage of cardiac patients in Pakistan is increasing day by day, and special attention needs to be paid to this issue. A framework is proposed for performing detailed analysis of ECG survey data collected to measure the prevalence of heart disease in Pakistan. The ECG survey data is evaluated and filtered using automated Minnesota codes, and only those ECGs that fulfil the standardized conditions set out in the Minnesota codes are used for further analysis. Feature selection is then performed by applying a proposed algorithm based on the discernibility matrix to select relevant features from the database. Clustering is performed to expose natural clusters in the ECG survey data by applying a spectral clustering algorithm that uses the fuzzy c-means algorithm. The hidden patterns and interesting relationships exposed by this analysis are useful for further detailed analysis and for many other purposes.
Keywords: arrhythmias, centroids, ECG, clustering, discernibility matrix
Procedia PDF Downloads 470
545 An Extraction of Cancer Region from MR Images Using Fuzzy Clustering Means and Morphological Operations
Authors: Ramandeep Kaur, Gurjit Singh Bhathal
Abstract:
Cancer diagnosis is a very difficult task. A magnetic resonance imaging (MRI) scan produces an image of any part of the body and provides an efficient way to diagnose cancer or tumors. In the existing method, fuzzy clustering means (FCM) is used for the diagnosis of tumors. In the proposed method, FCM is used to diagnose cancer of the foot. FCM finds the centroids of the clusters of the foot cancer obtained from MRI images. The FCM thresholding result shows the extracted region of the cancer. Morphological operations are then applied to obtain the extracted cancer region.
Keywords: magnetic resonance imaging (MRI), fuzzy C mean clustering, segmentation, morphological operations
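For readers unfamiliar with the centroid and thresholding steps mentioned above, the following is a minimal sketch of standard fuzzy c-means on scalar intensities. It is an assumption-laden illustration, not the authors' pipeline: the function name `fuzzy_c_means_1d`, the fuzzifier `m = 2`, the initialisation and the toy intensities are all chosen here for demonstration, and no morphological operations are applied.

```python
def fuzzy_c_means_1d(values, c=2, m=2.0, iters=100, tol=1e-6):
    """Standard fuzzy c-means on scalar values; returns (centers, memberships)."""
    lo, hi = min(values), max(values)
    # Evenly spaced initial centers (an illustrative choice).
    centers = [lo + (hi - lo) * (i + 0.5) / c for i in range(c)]
    p = 2.0 / (m - 1.0)
    u = []
    for _ in range(iters):
        # Membership update: u[k][i] = 1 / sum_j (d_ki / d_kj) ** (2 / (m - 1)).
        u = []
        for x in values:
            d = [abs(x - v) for v in centers]
            if 0.0 in d:
                # A value sitting exactly on a center belongs fully to it.
                hit = d.index(0.0)
                u.append([1.0 if i == hit else 0.0 for i in range(c)])
            else:
                u.append([1.0 / sum((d[i] / d[j]) ** p for j in range(c))
                          for i in range(c)])
        # Center update: mean of the values weighted by membership ** m.
        new_centers = []
        for i in range(c):
            w = [row[i] ** m for row in u]
            new_centers.append(sum(wk * x for wk, x in zip(w, values)) / sum(w))
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u

# Made-up grey-level intensities: a dark background and a bright region.
intensities = [10, 12, 11, 200, 205, 198]
centers, memberships = fuzzy_c_means_1d(intensities)
# Thresholding: keep a pixel when its membership in the brighter cluster dominates.
bright = centers.index(max(centers))
mask = [row[bright] > 0.5 for row in memberships]
```

The 0.5 cut-off is one simple way to turn soft memberships into a binary mask for two clusters; other thresholds or an argmax over more clusters are equally plausible.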
Procedia PDF Downloads 399
544 Analysis of ECGs Survey Data by Applying Clustering Algorithm
Authors: Irum Matloob, Shoab Ahmad Khan, Fahim Arif
Abstract:
The Indo-Pak region has suffered from heart disease for many decades. Many surveys show that the percentage of cardiac patients in Pakistan is increasing day by day, and special attention needs to be paid to this issue. A framework is proposed for performing detailed analysis of ECG survey data collected to measure the prevalence of heart disease in Pakistan. The ECG survey data is evaluated and filtered using automated Minnesota codes, and only those ECGs that fulfil the standardized conditions set out in the Minnesota codes are used for further analysis. Feature selection is then performed by applying a proposed algorithm based on the discernibility matrix to select relevant features from the database. Clustering is performed to expose natural clusters in the ECG survey data by applying a spectral clustering algorithm that uses the fuzzy c-means algorithm. The hidden patterns and interesting relationships exposed by this analysis are useful for further detailed analysis and for many other purposes.
Keywords: arrhythmias, centroids, ECG, clustering, discernibility matrix
Procedia PDF Downloads 351
543 Double Clustering as an Unsupervised Approach for Order Picking of Distributed Warehouses
Authors: Hsin-Yi Huang, Ming-Sheng Liu, Jiun-Yan Shiau
Abstract:
Planning the order picking lists of warehouses to achieve productive processes is a significant challenge. In the e-commerce era, this task is especially important when the costs associated with logistics on the operational performance are high. Nowadays, many order planning techniques employ supervised machine learning algorithms. However, defining which features should be processed by such algorithms is not a simple task, and it is crucial to the success of the proposed technique. Against this background, we consider whether unsupervised algorithms can enhance the planning of order-picking lists. A Zone2 picking approach, which is based on using clustering algorithms twice, is developed. A simplified example is given to demonstrate the merit of our approach.
Keywords: order picking, warehouse, clustering, unsupervised learning
Procedia PDF Downloads 159
542 Harmonic Data Preparation for Clustering and Classification
Authors: Ali Asheibi
Abstract:
The rapid increase in the size of databases required to store power quality monitoring data has demanded new techniques for analysing and understanding the data. One suggested technique to assist in analysis is data mining. Preparing raw data for data mining exploration takes up most of the effort and time spent in the whole data mining process. Clustering is an important technique in data mining and machine learning in which underlying and meaningful groups of data are discovered. Large amounts of harmonic data have been collected over three years from an actual harmonic monitoring system in a distribution system in Australia. This amount of acquired data makes it difficult to identify operational events that significantly impact the harmonics generated on the system. In this paper, harmonic data preparation processes that support a better understanding of the data are presented. Underlying classes in this data are then identified using a clustering technique based on the Minimum Message Length (MML) method. The underlying operational information contained within the clusters can be rapidly visualised by engineers. The C5.0 algorithm was used for classification and interpretation of the generated clusters.
Keywords: data mining, harmonic data, clustering, classification
Procedia PDF Downloads 248
541 Identification of Biological Pathways Causative for Breast Cancer Using Unsupervised Machine Learning
Authors: Karthik Mittal
Abstract:
This study performs an unsupervised machine learning analysis to find clusters of related SNPs that highlight biological pathways important for the biological mechanisms of breast cancer. Studying genetic variations in isolation is illogical because these variations are known to modulate protein production and function; the downstream effects of these modifications on biological outcomes are highly interconnected. After extracting the SNPs and their effects on different types of breast cancer using the MRBase library, two unsupervised machine learning clustering algorithms were applied to the genetic variants: a k-means clustering algorithm and a hierarchical clustering algorithm. Principal component analysis was also carried out to represent the data visually. These algorithms used each SNP’s beta value for the three types of breast cancer tested in this project (estrogen-receptor positive breast cancer, estrogen-receptor negative breast cancer, and breast cancer in general) to perform the clustering. Two significant genetic pathways validated the clustering produced by this project: the MAPK signaling pathway and the connection between the BRCA2 gene and the ESR1 gene. This study provides the first proof of concept showing the importance of unsupervised machine learning in interpreting GWAS summary statistics.
Keywords: breast cancer, computational biology, unsupervised machine learning, k-means, PCA
Procedia PDF Downloads 146
540 Energy-Efficient Clustering Protocol in Wireless Sensor Networks for Healthcare Monitoring
Authors: Ebrahim Farahmand, Ali Mahani
Abstract:
Wireless sensor networks (WSNs) can facilitate continuous monitoring of patients and increase early detection of emergency conditions and diseases. High-density WSNs help to monitor a remote environment accurately by intelligently combining the data from the individual nodes. Because of the limited energy capacity of sensors, enhancing the lifetime and the reliability of WSNs is an important factor in designing these networks. Clustering strategies have been verified as effective and practical algorithms for reducing energy consumption in WSNs and can tackle WSN limitations. In this paper, an Energy-efficient weight-based Clustering Protocol (EWCP) is presented. An artificial retina is selected as a case study of WSNs applied in body sensors. The selection of cluster heads (CHs) takes energy-efficiency parameters into account. Moreover, cluster members are selected based on their distance to the selected CHs. Compared with the other benchmark protocols, the lifetime of EWCP is improved significantly.
Keywords: WSN, healthcare monitoring, weighted based clustering, lifetime
Procedia PDF Downloads 309
539 Clustering Based Level Set Evaluation for Low Contrast Images
Authors: Bikshalu Kalagadda, Srikanth Rangu
Abstract:
The main objective of image segmentation is to extract objects with respect to some input features. One of the important methods for image segmentation is the level set method. Medical and synthetic images generally have a low-contrast pixel profile, which makes it difficult to locate features of interest. The conventional level set function develops irregularities while evolving the contour of objects, which destroys the stability of the evolution process. As a remedy, a new hybrid algorithm, Clustering Level Set Evolution, is proposed. Kernel fuzzy particle swarm optimization clustering is used with the Distance Regularized Level Set (DRLS) and Selective Binary and Gaussian Filtering Regularized Level Set (SBGFRLS) methods. Identifying different regions becomes easier, with improved speed. The efficiency of the modified method can be evaluated by comparing it with the previous method for similar specifications. The comparison can be carried out on medical and synthetic images.
Keywords: segmentation, clustering, level set function, re-initialization, Kernel fuzzy, swarm optimization
Procedia PDF Downloads 352
538 Component Based Testing Using Clustering and Support Vector Machine
Authors: Iqbaldeep Kaur, Amarjeet Kaur
Abstract:
Software reusability is an important part of software development. Component-based software development, as it applies to software testing, has therefore gained a lot of practical importance in software engineering, both among academic researchers and in the software development industry. Finding test cases for efficient reuse is one of the important problems targeted by researchers. Clustering reduces the search space and supports reuse of test cases by grouping similar entities according to requirements, which reduces time complexity because it shortens the search time for retrieving test cases. In this research paper, we propose an unsupervised approach for the reusability of test cases. For unsupervised learning, we propose k-means and Support Vector Machine. We have designed an algorithm for clustering requirement and test case documents according to their tf-idf vector space, and the output is a set of highly cohesive pattern groups.
Keywords: software testing, reusability, clustering, k-mean, SVM
Procedia PDF Downloads 430
537 Clustering Performance Analysis using New Correlation-Based Cluster Validity Indices
Authors: Nathakhun Wiroonsri
Abstract:
There are various cluster validity measures used for evaluating clustering results. One of the main objectives of using these measures is to seek the optimal unknown number of clusters. Some measures work well for clusters with different densities, sizes and shapes. Yet one weakness that those validity measures share is that they sometimes provide only one clear optimal number of clusters. That number is actually unknown, and there might be more than one potential sub-optimal option that a user may wish to choose based on different applications. We develop two new cluster validity indices based on the correlation between the actual distance between a pair of data points and the distance between the centroids of the clusters in which the two points are located. Our proposed indices consistently yield several peaks at different numbers of clusters, which overcomes the weakness stated above. Furthermore, the introduced correlation can also be used for evaluating the quality of a selected clustering result. Several experiments in different scenarios, including the well-known iris data set and a real-world marketing application, have been conducted to compare the proposed validity indices with several well-known ones.
Keywords: clustering algorithm, cluster validity measure, correlation, data partitions, iris data set, marketing, pattern recognition
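The correlation that this abstract describes can be sketched as follows. This is only an interpretation of the sentence above, assuming Euclidean distance and the Pearson coefficient; the two published indices built on top of this correlation are not reproduced here, and the function name `centroid_distance_correlation` and the toy data are hypothetical.

```python
from itertools import combinations
from math import dist, sqrt

def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def pearson(xs, ys):
    """Pearson correlation; assumes neither sequence is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def centroid_distance_correlation(points, labels):
    """Correlate each pair's point distance with the distance between the
    centroids of the clusters the two points belong to (needs >= 2 clusters)."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    cents = {lab: centroid(ps) for lab, ps in groups.items()}
    point_d, centroid_d = [], []
    for i, j in combinations(range(len(points)), 2):
        point_d.append(dist(points[i], points[j]))
        centroid_d.append(dist(cents[labels[i]], cents[labels[j]]))
    return pearson(point_d, centroid_d)

# Made-up data: two tight groups far apart.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = centroid_distance_correlation(points, [0, 0, 0, 1, 1, 1])
bad = centroid_distance_correlation(points, [0, 1, 0, 1, 0, 1])
```

On this toy data the labelling that matches the two groups gives a correlation close to 1, while the interleaved labelling gives a much lower value, which is the intuition behind using the correlation to judge a partition.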
Procedia PDF Downloads 103
536 Personalize E-Learning System Based on Clustering and Sequence Pattern Mining Approach
Authors: H. S. Saini, K. Vijayalakshmi, Rishi Sayal
Abstract:
Network-based education has been growing rapidly in size and quality. Knowledge clustering is becoming more important in personalized information retrieval for web learning. A personalized learning service can be provided after the learners’ knowledge has been classified with clustering. Through automatic analysis of learners’ behaviors, groups of learners with similar levels and interests may be discovered, so that learners can be given content that best matches their educational needs for collaborative learning. We present a specific mining tool and a recommender engine that we have integrated into the online learning system in order to help the teacher carry out the whole e-learning process. We propose to use sequential pattern mining algorithms to discover the paths most used by students and, from this information, to recommend links to new students automatically while they browse the course. We have developed a specific authoring tool to help the teacher apply the whole data mining process. We report on several experiments with real data to demonstrate the quality of using clustering and sequential pattern mining algorithms together to build personalized e-learning systems.
Keywords: e-learning, cluster, personalization, sequence, pattern
Procedia PDF Downloads 429
535 Data Clustering Algorithm Based on Multi-Objective Periodic Bacterial Foraging Optimization with Two Learning Archives
Authors: Chen Guo, Heng Tang, Ben Niu
Abstract:
Clustering splits objects into different groups based on similarity, so that objects in the same group have higher similarity and objects in different groups have lower similarity. Thus, clustering can be treated as an optimization problem that maximizes the intra-cluster similarity or the inter-cluster dissimilarity. In real-world applications, datasets often have complex characteristics: sparsity, overlap, high dimensionality, etc. When facing such datasets, simultaneously optimizing two or more objectives can yield better clustering results than optimizing a single objective. However, except for objective-weighting methods, traditional clustering approaches have difficulty solving multi-objective data clustering problems. For this reason, researchers have investigated evolutionary multi-objective optimization algorithms for optimizing multiple clustering objectives. In this paper, the Data Clustering algorithm based on Multi-objective Periodic Bacterial Foraging Optimization with two Learning Archives (DC-MPBFOLA) is proposed. Specifically, first, to reduce the high computational complexity of the original BFO, periodic BFO is employed as the basic algorithmic framework, and the periodic BFO is then transformed into a multi-objective variant. Second, two learning strategies are proposed, based on the two learning archives, to guide the bacterial swarm to move in a better direction. On the one hand, the global best is selected from the global learning archive according to the convergence index and the diversity index. On the other hand, the personal best is selected from the personal learning archive according to the sum of weighted objectives. A chemotaxis operation is designed according to these learning strategies. Third, an elite learning strategy is designed to provide fresh power to the objects in the two learning archives. 
When the objects in these two archives do not change two consecutive times, randomly reinitializing one dimension of the objects can prevent the proposed algorithm from falling into local optima. Fourth, to validate the performance of the proposed algorithm, DC-MPBFOLA is compared with four state-of-the-art evolutionary multi-objective optimization algorithms and one classical clustering algorithm on the evaluation indexes of the datasets. To further verify the effectiveness and feasibility of the strategies designed in DC-MPBFOLA, variants of DC-MPBFOLA are also proposed. Experimental results demonstrate that DC-MPBFOLA outperforms its competitors on all evaluation indexes and clustering partitions. These results also indicate that the designed strategies improve on the performance of the original BFO.
Keywords: data clustering, multi-objective optimization, bacterial foraging optimization, learning archives
Procedia PDF Downloads 139
534 Review: Wavelet New Tool for Path Loss Prediction
Authors: Danladi Ali, Abdullahi Mukaila
Abstract:
In this work, GSM signal strength (power) was monitored in an indoor environment. Samples of the GSM signal strength were measured on mobile equipment (ME). A one-dimensional multilevel wavelet was used to predict the fading phenomenon of the measured GSM signal, and neural network clustering was used to determine the average power received in the study area. The wavelet prediction revealed that the GSM signal is attenuated due to the fast fading phenomenon, which fades about 7 times faster than the radio wavelength, while the neural network clustering determined that -75 dBm appeared most frequently, followed by -85 dBm. The work revealed that a significant part of the measured signal is dominated by weak signal and that the signal followed a Rayleigh distribution more closely than a Gaussian one. This confirmed the wavelet prediction.
Keywords: decomposition, clustering, propagation, model, wavelet, signal strength and spectral efficiency
Procedia PDF Downloads 448
533 Improved Color-Based K-Mean Algorithm for Clustering of Satellite Image
Authors: Sangeeta Yadav, Mantosh Biswas
Abstract:
In this paper, we propose an improved color-based K-mean algorithm for clustering of satellite images (SAR). Our method comprises two stages. The first stage is an interactive selection process in which users are required to input the number of colors (ncolor) and the number of clusters, and are then prompted to select points in each color cluster. In the second stage, these points are given as input to the K-mean clustering algorithm, which clusters the image based on color and the minimum squared Euclidean distance. The proposed method reduces the mixed pixel problem to a great extent.
Keywords: cluster, ncolor method, K-mean method, interactive selection process
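The two stages described above can be sketched as a seeded K-means. This is only an illustration under stated assumptions: the interactive selection is replaced by a hard-coded list of sample points per cluster, each initial center is taken as the mean of the points selected for that cluster, and the function name `seeded_kmeans` is hypothetical. The abstract does not specify how the selected points initialize the algorithm.

```python
import numpy as np

def seeded_kmeans(pixels, seed_points, n_iter=20):
    """K-means on color vectors, started from user-selected sample points.
    pixels: (N, 3) array of colors; seed_points: one list of sample colors
    per cluster (stands in for the interactive selection step)."""
    pixels = np.asarray(pixels, dtype=float)
    # Assumption: the initial center of a cluster is the mean of its samples.
    centers = np.array([np.mean(np.asarray(s, dtype=float), axis=0)
                        for s in seed_points])
    for _ in range(n_iter):
        # Squared Euclidean distance from every pixel to every center.
        d2 = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = centers.copy()
        for k in range(len(centers)):
            members = pixels[labels == k]
            if len(members):  # keep the old center if a cluster is empty
                new_centers[k] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

For an image stored as an array of shape (height, width, 3), the pixels can be passed as `image.reshape(-1, 3)` and the returned labels reshaped back to (height, width).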
Procedia PDF Downloads 297
532 Issue Reorganization Using the Measure of Relevance
Authors: William Wong Xiu Shun, Yoonjin Hyun, Mingyu Kim, Seongi Choi, Namgyu Kim
Abstract:
Recently, the demand for extracting R&D keywords from issues and using them to retrieve R&D information has been increasing rapidly. However, it is hard to identify related issues or to distinguish between them. Although the similarity between issues cannot be identified directly, an R&D lexicon makes it possible to determine which issues consistently share the same R&D keywords. In detail, the R&D keywords associated with a particular issue imply the key technology elements needed to solve the problem that the issue poses. Furthermore, related issues that share the same R&D keywords can be shown in a more systematic way through an issue clustering constructed from the perspective of R&D. Thus, the sharing of R&D results and the reuse of R&D technology can be facilitated. Indirectly, redundant investment in the same R&D can be reduced, because R&D information can be shared between the corresponding issues and the reusability of the related R&D can be improved. Therefore, a methodology for constructing an issue clustering from the perspective of common R&D keywords is proposed to satisfy the demands mentioned.
Keywords: clustering, social network analysis, text mining, topic analysis
Procedia PDF Downloads 573
531 A Hybrid Method for Determination of Effective Poles Using Clustering Dominant Pole Algorithm
Authors: Anuj Abraham, N. Pappa, Daniel Honc, Rahul Sharma
Abstract:
In this paper, an analysis of some model order reduction techniques is presented. A new hybrid algorithm for model order reduction of linear time-invariant systems is compared with the conventional techniques, namely Balanced Truncation, Hankel Norm reduction and the Dominant Pole Algorithm (DPA). The proposed hybrid algorithm, known as the Clustering Dominant Pole Algorithm (CDPA), is able to compute the full set of dominant poles and their cluster centers efficiently. The dominant poles of a transfer function are specific eigenvalues of the state space matrix of the corresponding dynamical system. The effectiveness of this novel technique is shown through simulation results.
Keywords: balanced truncation, clustering, dominant pole, Hankel norm, model reduction
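The statement that dominant poles are specific eigenvalues of the state space matrix can be illustrated with a small sketch. It is not the CDPA itself: it simply computes all eigenvalues of a single-input single-output system (A, b, c) and ranks them by the measure |residue| / |Re(pole)|, a dominance measure commonly used with dominant pole methods and assumed here, since the abstract does not define one. The function name `rank_poles` is hypothetical, and A is assumed to be diagonalizable with no poles on the imaginary axis.

```python
import numpy as np

def rank_poles(A, b, c):
    """Rank the poles of H(s) = c^T (sI - A)^{-1} b by dominance.
    Assumptions: A is diagonalizable and no pole has zero real part."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    c = np.asarray(c, dtype=float)
    poles, V = np.linalg.eig(A)    # columns of V are right eigenvectors
    W = np.linalg.inv(V)           # rows of W are the matching left eigenvectors
    # Partial fractions: H(s) = sum_i residues[i] / (s - poles[i]).
    residues = (c @ V) * (W @ b)
    dominance = np.abs(residues) / np.abs(poles.real)
    order = np.argsort(-dominance)  # most dominant pole first
    return poles[order], dominance[order]
```

A full eigendecomposition is only practical for small systems, which is why iterative dominant pole methods exist for large ones.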
Procedia PDF Downloads 599
530 A Neural Network Based Clustering Approach for Imputing Multivariate Values in Big Data
Authors: S. Nickolas, Shobha K.
Abstract:
The treatment of incomplete data is an important step in data pre-processing. Missing values create a noisy environment in all applications and are an unavoidable problem in big data management and analysis. Researchers have introduced numerous techniques for handling missing data, such as discarding rows with missing values, mean imputation, expectation maximization, neural networks with evolutionary algorithms or optimized techniques, and hot deck imputation. Among these, imputation techniques play a positive role in filling in missing values when it is necessary to use all records in the data rather than discard records with missing values. In this paper we propose a novel artificial neural network based clustering algorithm, Adaptive Resonance Theory-2 (ART2), for imputation of missing values in mixed attribute data sets. ART2 can recognize learned models quickly and adapt to new objects rapidly. It carries out model-based clustering by using competitive learning and a self-steady mechanism in a dynamic environment without supervision. The proposed approach not only imputes the missing values but also provides information about handling the outliers.
Keywords: ART2, data imputation, clustering, missing data, neural network, pre-processing
Procedia PDF Downloads 274
529 Structure Clustering for Milestoning Applications of Complex Conformational Transitions
Authors: Amani Tahat, Serdal Kirmizialtin
Abstract:
Trajectory fragment methods such as Markov State Models (MSM), Milestoning (MS) and Transition Path sampling are the prime choice for extending the timescale of all-atom Molecular Dynamics simulations. In these approaches, a set of structures that covers the accessible phase space has to be chosen a priori using cluster analysis. Structural clustering serves to partition the conformational state into natural subgroups based on their similarity; it is an essential statistical methodology for analyzing the numerous sets of empirical data produced by Molecular Dynamics (MD) simulations. The local transition kernel among these clusters is later used to connect the metastable states, using a Markovian kinetic model in MSM and a non-Markovian model in MS. The choice of clustering approach in constructing such a kernel is crucial, since the high dimensionality of biomolecular structures can easily confuse the identification of clusters when the traditional hierarchical clustering methodology is used. In particular, in the case of MS, where the milestones are very close to each other, accurately determining the milestone identity of the trajectory becomes a challenging issue. In this work we present two cluster analysis methods applied to the cis–trans isomerism of dinucleotide AA. The reason for choosing nucleic acids over the more commonly used proteins for studying cluster analysis is twofold: i) the energy landscape is rugged, hence transitions are more complex, enabling a more realistic model for studying conformational transitions; ii) the conformational space of nucleic acids is high dimensional. A diverse set of internal coordinates is necessary to describe the metastable states in nucleic acids, which poses a challenge in studying the conformational transitions. We therefore need improved clustering methods that accurately and robustly identify the AA structure in its metastable states for a wide range of confused data conditions. 
The single linkage approach to hierarchical clustering, available in the GROMACS MD package, is the first clustering methodology applied to our data. The Self Organizing Map (SOM) neural network, also known as a Kohonen network, is the second. The performance of the neural network and of the hierarchical clustering method is compared by computing the mean first passage times for the cis-trans conformational rates. Our hope is that this study provides insight into the complexities of, and the need for, determining the appropriate clustering algorithm for kinetic analysis. Our results can improve the effectiveness of decisions based on clustering confused empirical data when studying conformational transitions in biomolecules.
Keywords: milestoning, self organizing map, single linkage, structure clustering
Procedia PDF Downloads 224