Search results for: grey clustering
757 Hierarchical Cluster Analysis of Raw Milk Samples Obtained from Organic and Conventional Dairy Farming in Autonomous Province of Vojvodina, Serbia
Authors: Lidija Jevrić, Denis Kučević, Sanja Podunavac-Kuzmanović, Strahinja Kovačević, Milica Karadžić
Abstract:
In the present study, the Hierarchical Cluster Analysis (HCA) was applied in order to determine the differences between the milk samples originating from a conventional dairy farm (CF) and an organic dairy farm (OF) in AP Vojvodina, Republic of Serbia. The clustering was based on the basis of the average values of saturated fatty acids (SFA) content and unsaturated fatty acids (UFA) content obtained for every season. Therefore, the HCA included the annual SFA and UFA content values. The clustering procedure was carried out on the basis of Euclidean distances and Single linkage algorithm. The obtained dendrograms indicated that the clustering of UFA in OF was much more uniform compared to clustering of UFA in CF. In OF, spring stands out from the other months of the year. The same case can be noticed for CF, where winter is separated from the other months. The results could be expected because the composition of fatty acids content is greatly influenced by the season and nutrition of dairy cows during the year.Keywords: chemometrics, clustering, food engineering, milk quality
Procedia PDF Downloads 281756 Sales Patterns Clustering Analysis on Seasonal Product Sales Data
Authors: Soojin Kim, Jiwon Yang, Sungzoon Cho
Abstract:
As a seasonal product is only in demand for a short time, inventory management is critical to profits. Both markdowns and stockouts decrease the return on perishable products; therefore, researchers have been interested in the distribution of seasonal products with the aim of maximizing profits. In this study, we propose a data-driven seasonal product sales pattern analysis method for individual retail outlets based on observed sales data clustering; the proposed method helps in determining distribution strategies.Keywords: clustering, distribution, sales pattern, seasonal product
Procedia PDF Downloads 595755 Microbial Bioagent Triggered Biochemical Response in Tea (Camellia sinensis) Inducing Resistance against Grey Blight Disease and Yield Enhancement
Authors: Popy Bora, L. C. Bora, A. Bhattacharya, Sehnaz S. Ahmed
Abstract:
Microbial bioagents, viz., Pseudomonas fluorescens, Bacillus subtilis, and Trichoderma viride were assessed for their ability to suppress grey blight caused by Pestalotiopsis theae, a major disease of tea crop in Assam. The expression of defense-related phytochemicals due to the application of these bioagents was also evaluated. The individual bioagents, as well as their combinations, were screened for their bioefficacy against P. theae in vitro using nutrient agar (NA) as basal medium. The treatment comprising a combination of the three bioagents, P. fluorescens, B. subtilis, and T. viride showed significantly the highest inhibition against the pathogen. Bioformulation of effective bioagent combinations was further evaluated under field condition, where significantly highest reduction of grey blight (90.30%), as well as the highest increase in the green leaf yield (10.52q/ha), was recorded due to application of the bioformulation containing the three bioagents. The application of the three bioformulation also recorded an enhanced level of caffeine (4.15%) and polyphenols (22.87%). A significant increase in the enzymatic activity of phenylalanine ammonia-lyase, peroxidase and polyphenol oxidase were recorded in the plants treated with the microbial bioformulation of the three bioagents. The present investigation indicates the role of microbial agents in suppressing disease, inducing plant defense response, as well as improving the quality of tea.Keywords: enzymatic activity, grey blight, microbial bioagents, Pestalotiopsis theae, phytochemicals, plant defense, tea
Procedia PDF Downloads 141754 Feature Weighting Comparison Based on Clustering Centers in the Detection of Diabetic Retinopathy
Authors: Kemal Polat
Abstract:
In this paper, three feature weighting methods have been used to improve the classification performance of diabetic retinopathy (DR). To classify the diabetic retinopathy, features extracted from the output of several retinal image processing algorithms, such as image-level, lesion-specific and anatomical components, have been used and fed them into the classifier algorithms. The dataset used in this study has been taken from University of California, Irvine (UCI) machine learning repository. Feature weighting methods including the fuzzy c-means clustering based feature weighting, subtractive clustering based feature weighting, and Gaussian mixture clustering based feature weighting, have been used and compered with each other in the classification of DR. After feature weighting, five different classifier algorithms comprising multi-layer perceptron (MLP), k- nearest neighbor (k-NN), decision tree, support vector machine (SVM), and Naïve Bayes have been used. The hybrid method based on combination of subtractive clustering based feature weighting and decision tree classifier has been obtained the classification accuracy of 100% in the screening of DR. These results have demonstrated that the proposed hybrid scheme is very promising in the medical data set classification.Keywords: machine learning, data weighting, classification, data mining
Procedia PDF Downloads 326753 Clustering the Wheat Seeds Using SOM Artificial Neural Networks
Authors: Salah Ghamari
Abstract:
In this study, the ability of self organizing map artificial (SOM) neural networks in clustering the wheat seeds varieties according to morphological properties of them was considered. The SOM is one type of unsupervised competitive learning. Experimentally, five morphological features of 300 seeds (including three varieties: gaskozhen, Md and sardari) were obtained using image processing technique. The results show that the artificial neural network has a good performance (90.33% accuracy) in classification of the wheat varieties despite of high similarity in them. The highest classification accuracy (100%) was achieved for sardari.Keywords: artificial neural networks, clustering, self organizing map, wheat variety
Procedia PDF Downloads 656752 GCM Based Fuzzy Clustering to Identify Homogeneous Climatic Regions of North-East India
Authors: Arup K. Sarma, Jayshree Hazarika
Abstract:
The North-eastern part of India, which receives heavier rainfall than other parts of the subcontinent, is of great concern now-a-days with regard to climate change. High intensity rainfall for short duration and longer dry spell, occurring due to impact of climate change, affects river morphology too. In the present study, an attempt is made to delineate the North-Eastern region of India into some homogeneous clusters based on the Fuzzy Clustering concept and to compare the resulting clusters obtained by using conventional methods and non conventional methods of clustering. The concept of clustering is adapted in view of the fact that, impact of climate change can be studied in a homogeneous region without much variation, which can be helpful in studies related to water resources planning and management. 10 IMD (Indian Meteorological Department) stations, situated in various regions of the North-east, have been selected for making the clusters. The results of the Fuzzy C-Means (FCM) analysis show different clustering patterns for different conditions. From the analysis and comparison it can be concluded that non conventional method of using GCM data is somehow giving better results than the others. However, further analysis can be done by taking daily data instead of monthly means to reduce the effect of standardization.Keywords: climate change, conventional and nonconventional methods of clustering, FCM analysis, homogeneous regions
Procedia PDF Downloads 386751 The Clustering of Multiple Sclerosis Subgroups through L2 Norm Multifractal Denoising Technique
Authors: Yeliz Karaca, Rana Karabudak
Abstract:
Multifractal Denoising techniques are used in the identification of significant attributes by removing the noise of the dataset. Magnetic resonance (MR) image technique is the most sensitive method so as to identify chronic disorders of the nervous system such as Multiple Sclerosis. MRI and Expanded Disability Status Scale (EDSS) data belonging to 120 individuals who have one of the subgroups of MS (Relapsing Remitting MS (RRMS), Secondary Progressive MS (SPMS), Primary Progressive MS (PPMS)) as well as 19 healthy individuals in the control group have been used in this study. The study is comprised of the following stages: (i) L2 Norm Multifractal Denoising technique, one of the multifractal technique, has been used with the application on the MS data (MRI and EDSS). In this way, the new dataset has been obtained. (ii) The new MS dataset obtained from the MS dataset and L2 Multifractal Denoising technique has been applied to the K-Means and Fuzzy C Means clustering algorithms which are among the unsupervised methods. Thus, the clustering performances have been compared. (iii) In the identification of significant attributes in the MS dataset through the Multifractal denoising (L2 Norm) technique using K-Means and FCM algorithms on the MS subgroups and control group of healthy individuals, excellent performance outcome has been yielded. According to the clustering results based on the MS subgroups obtained in the study, successful clustering results have been obtained in the K-Means and FCM algorithms by applying the L2 norm of multifractal denoising technique for the MS dataset. Clustering performance has been more successful with the MS Dataset (L2_Norm MS Data Set) K-Means and FCM in which significant attributes are obtained by applying L2 Norm Denoising technique.Keywords: clinical decision support, clustering algorithms, multiple sclerosis, multifractal techniques
Procedia PDF Downloads 168750 A Polynomial Time Clustering Algorithm for Solving the Assignment Problem in the Vehicle Routing Problem
Authors: Lydia Wahid, Mona F. Ahmed, Nevin Darwish
Abstract:
The vehicle routing problem (VRP) consists of a group of customers that needs to be served. Each customer has a certain demand of goods. A central depot having a fleet of vehicles is responsible for supplying the customers with their demands. The problem is composed of two subproblems: The first subproblem is an assignment problem where the number of vehicles that will be used as well as the customers assigned to each vehicle are determined. The second subproblem is the routing problem in which for each vehicle having a number of customers assigned to it, the order of visits of the customers is determined. Optimal number of vehicles, as well as optimal total distance, should be achieved. In this paper, an approach for solving the first subproblem (the assignment problem) is presented. In the approach, a clustering algorithm is proposed for finding the optimal number of vehicles by grouping the customers into clusters where each cluster is visited by one vehicle. Finding the optimal number of clusters is NP-hard. This work presents a polynomial time clustering algorithm for finding the optimal number of clusters and solving the assignment problem.Keywords: vehicle routing problems, clustering algorithms, Clarke and Wright Saving Method, agglomerative hierarchical clustering
Procedia PDF Downloads 393749 A Similarity Measure for Classification and Clustering in Image Based Medical and Text Based Banking Applications
Authors: K. P. Sandesh, M. H. Suman
Abstract:
Text processing plays an important role in information retrieval, data-mining, and web search. Measuring the similarity between the documents is an important operation in the text processing field. In this project, a new similarity measure is proposed. To compute the similarity between two documents with respect to a feature the proposed measure takes the following three cases into account: (1) The feature appears in both documents; (2) The feature appears in only one document and; (3) The feature appears in none of the documents. The proposed measure is extended to gauge the similarity between two sets of documents. The effectiveness of our measure is evaluated on several real-world data sets for text classification and clustering problems, especially in banking and health sectors. The results show that the performance obtained by the proposed measure is better than that achieved by the other measures.Keywords: document classification, document clustering, entropy, accuracy, classifiers, clustering algorithms
Procedia PDF Downloads 518748 Application of Fuzzy Clustering on Classification Agile Supply Chain
Authors: Hamidreza Fallah Lajimi , Elham Karami, Fatemeh Ali nasab, Mostafa Mahdavikia
Abstract:
Being responsive is an increasingly important skill for firms in today’s global economy; thus firms must be agile. Naturally, it follows that an organization’s agility depends on its supply chain being agile. However, achieving supply chain agility is a function of other abilities within the organization. This paper analyses results from a survey of 71 Iran manufacturing companies in order to identify some of the factors for agile organizations in managing their supply chains. Then we classification this company in four cluster with fuzzy c-mean technique and with four validations functional determine automatically the optimal number of clusters.Keywords: agile supply chain, clustering, fuzzy clustering
Procedia PDF Downloads 475747 Sustainability and Clustering: A Bibliometric Assessment
Authors: Fernanda M. Assef, Maria Teresinha A. Steiner, David Gabriel F. Barros
Abstract:
Review researches are useful in terms of analysis of research problems. Between the types of review documents, we commonly find bibliometric studies. This type of application often helps the global visualization of a research problem and helps academics worldwide to understand the context of a research area better. In this document, a bibliometric view surrounding clustering techniques and sustainability problems is presented. The authors aimed at which issues mostly use clustering techniques, and, even which sustainability issue would be more impactful on today’s moment of research. During the bibliometric analysis, we found ten different groups of research in clustering applications for sustainability issues: Energy; Environmental; Non-urban planning; Sustainable Development; Sustainable Supply Chain; Transport; Urban Planning; Water; Waste Disposal; and, Others. And, by analyzing the citations of each group, we discovered that the Environmental group could be classified as the most impactful research cluster in the area mentioned. Now, after the content analysis of each paper classified in the environmental group, we found that the k-means technique is preferred for solving sustainability problems with clustering methods since it appeared the most amongst the documents. The authors finally conclude that a bibliometric assessment could help indicate a gap of researches on waste disposal – which was the group with the least amount of publications – and the most impactful research on environmental problems.Keywords: bibliometric assessment, clustering, sustainability, territorial partitioning
Procedia PDF Downloads 109746 Multi-Cluster Overlapping K-Means Extension Algorithm (MCOKE)
Authors: Said Baadel, Fadi Thabtah, Joan Lu
Abstract:
Clustering involves the partitioning of n objects into k clusters. Many clustering algorithms use hard-partitioning techniques where each object is assigned to one cluster. In this paper, we propose an overlapping algorithm MCOKE which allows objects to belong to one or more clusters. The algorithm is different from fuzzy clustering techniques because objects that overlap are assigned a membership value of 1 (one) as opposed to a fuzzy membership degree. The algorithm is also different from other overlapping algorithms that require a similarity threshold to be defined as a priority which can be difficult to determine by novice users.Keywords: data mining, k-means, MCOKE, overlapping
Procedia PDF Downloads 575745 A Clustering-Sequencing Approach to the Facility Layout Problem
Authors: Saeideh Salimpour, Sophie-Charlotte Viaux, Ahmed Azab, Mohammed Fazle Baki
Abstract:
The Facility Layout Problem (FLP) is key to the efficient and cost-effective operation of a system. This paper presents a hybrid heuristic- and mathematical-programming-based approach that divides the problem conceptually into those of clustering and sequencing. First, clusters of vertically aligned facilities are formed, which are later on sequenced horizontally. The developed methodology provides promising results in comparison to its counterparts in the literature by minimizing the inter-distances for facilities which have more interactions amongst each other and aims at placing the facilities with more interactions at the centroid of the shop.Keywords: clustering-sequencing approach, mathematical modeling, optimization, unequal facility layout problem
Procedia PDF Downloads 333744 A Constructed Wetland as a Reliable Method for Grey Wastewater Treatment in Rwanda
Authors: Hussein Bizimana, Osman Sönmez
Abstract:
Constructed wetlands are current the most widely recognized waste water treatment option, especially in developing countries where they have the potential for improving water quality and creating valuable wildlife habitat in ecosystem with treatment requirement relatively simple for operation and maintenance cost. Lack of grey waste water treatment facilities in Kigali İnstitute of Science and Technology in Rwanda, causes pollution in the surrounding localities of Rugunga sector, where already a problem of poor sanitation is found. In order to treat grey water produced at Kigali İnstitute of Science and Technology, with high BOD concentration, high nutrients concentration and high alkalinity; a Horizontal Sub-surface Flow pilot-scale constructed wetland was designed and can operate in Kigali İnstitute of Science and Technology. The study was carried out in a sedimentation tank of 5.5 m x 1.42 m x 1.2 m deep and a Horizontal Sub-surface constructed wetland of 4.5 m x 2.5 m x 1.42 m deep. The grey waste water flow rate of 2.5 m3/d flew through vegetated wetland and sandy pilot plant. The filter media consisted of 0.6 to 2 mm of coarse sand, 0.00003472 m/s of hydraulic conductivity and cattails (Typha latifolia spp) were used as plants species. The effluent flow rate of the plant is designed to be 1.5 m3/ day and the retention time will be 24 hrs. 72% to 79% of BOD, COD, and TSS removals are estimated to be achieved, while the nutrients (Nitrogen and Phosphate) removal is estimated to be in the range of 34% to 53%. Every effluent characteristic will meet exactly the Rwanda Utility Regulatory Agency guidelines primarily because the retention time allowed is enough to make the reduction of contaminants within effluent raw waste water. Treated water reuse system was developed where water will be used in the campus irrigation system again.Keywords: constructed wetlands, hydraulic conductivity, grey waste water, cattails
Procedia PDF Downloads 609743 Unsupervised Segmentation Technique for Acute Leukemia Cells Using Clustering Algorithms
Authors: N. H. Harun, A. S. Abdul Nasir, M. Y. Mashor, R. Hassan
Abstract:
Leukaemia is a blood cancer disease that contributes to the increment of mortality rate in Malaysia each year. There are two main categories for leukaemia, which are acute and chronic leukaemia. The production and development of acute leukaemia cells occurs rapidly and uncontrollable. Therefore, if the identification of acute leukaemia cells could be done fast and effectively, proper treatment and medicine could be delivered. Due to the requirement of prompt and accurate diagnosis of leukaemia, the current study has proposed unsupervised pixel segmentation based on clustering algorithm in order to obtain a fully segmented abnormal white blood cell (blast) in acute leukaemia image. In order to obtain the segmented blast, the current study proposed three clustering algorithms which are k-means, fuzzy c-means and moving k-means algorithms have been applied on the saturation component image. Then, median filter and seeded region growing area extraction algorithms have been applied, to smooth the region of segmented blast and to remove the large unwanted regions from the image, respectively. Comparisons among the three clustering algorithms are made in order to measure the performance of each clustering algorithm on segmenting the blast area. Based on the good sensitivity value that has been obtained, the results indicate that moving k-means clustering algorithm has successfully produced the fully segmented blast region in acute leukaemia image. Hence, indicating that the resultant images could be helpful to haematologists for further analysis of acute leukaemia.Keywords: acute leukaemia images, clustering algorithms, image segmentation, moving k-means
Procedia PDF Downloads 291742 Ensuring Uniform Energy Consumption in Non-Deterministic Wireless Sensor Network to Protract Networks Lifetime
Authors: Vrince Vimal, Madhav J. Nigam
Abstract:
Wireless sensor networks have enticed much of the spotlight from researchers all around the world, owing to its extensive applicability in agricultural, industrial and military fields. Energy conservation node deployment stratagems play a notable role for active implementation of Wireless Sensor Networks. Clustering is the approach in wireless sensor networks which improves energy efficiency in the network. The clustering algorithm needs to have an optimum size and number of clusters, as clustering, if not implemented properly, cannot effectively increase the life of the network. In this paper, an algorithm has been proposed to address connectivity issues with the aim of ensuring the uniform energy consumption of nodes in every part of the network. The results obtained after simulation showed that the proposed algorithm has an edge over existing algorithms in terms of throughput and networks lifetime.Keywords: Wireless Sensor network (WSN), Random Deployment, Clustering, Isolated Nodes, Networks Lifetime
Procedia PDF Downloads 336741 An Intelligent Traffic Management System Based on the WiFi and Bluetooth Sensing
Authors: Hamed Hossein Afshari, Shahrzad Jalali, Amir Hossein Ghods, Bijan Raahemi
Abstract:
This paper introduces an automated clustering solution that applies to WiFi/Bluetooth sensing data and is later used for traffic management applications. The paper initially summarizes a number of clustering approaches and thereafter shows their performance for noise removal. In this context, clustering is used to recognize WiFi and Bluetooth MAC addresses that belong to passengers traveling by a public urban transit bus. The main objective is to build an intelligent system that automatically filters out MAC addresses that belong to persons located outside the bus for different routes in the city of Ottawa. The proposed intelligent system alleviates the need for defining restrictive thresholds that however reduces the accuracy as well as the range of applicability of the solution for different routes. This paper moreover discusses the performance benefits of the presented clustering approaches in terms of the accuracy, time and space complexity, and the ease of use. Note that results of clustering can further be used for the purpose of the origin-destination estimation of individual passengers, predicting the traffic load, and intelligent management of urban bus schedules.Keywords: WiFi-Bluetooth sensing, cluster analysis, artificial intelligence, traffic management
Procedia PDF Downloads 241740 An Efficient Clustering Technique for Copy-Paste Attack Detection
Authors: N. Chaitawittanun, M. Munlin
Abstract:
Due to rapid advancement of powerful image processing software, digital images are easy to manipulate and modify by ordinary people. Lots of digital images are edited for a specific purpose and more difficult to distinguish form their original ones. We propose a clustering method to detect a copy-move image forgery of JPEG, BMP, TIFF, and PNG. The process starts with reducing the color of the photos. Then, we use the clustering technique to divide information of measuring data by Hausdorff Distance. The result shows that the purposed methods is capable of inspecting the image file and correctly identify the forgery.Keywords: image detection, forgery image, copy-paste, attack detection
Procedia PDF Downloads 338739 Application of Fuzzy Clustering on Classification Agile Supply Chain Firms
Authors: Hamidreza Fallah Lajimi, Elham Karami, Alireza Arab, Fatemeh Alinasab
Abstract:
Being responsive is an increasingly important skill for firms in today’s global economy; thus firms must be agile. Naturally, it follows that an organization’s agility depends on its supply chain being agile. However, achieving supply chain agility is a function of other abilities within the organization. This paper analyses results from a survey of 71 Iran manufacturing companies in order to identify some of the factors for agile organizations in managing their supply chains. Then we classification this company in four cluster with fuzzy c-mean technique and with Four validations functional determine automatically the optimal number of clusters.Keywords: agile supply chain, clustering, fuzzy clustering, business engineering
Procedia PDF Downloads 713738 Advances in Machine Learning and Deep Learning Techniques for Image Classification and Clustering
Authors: R. Nandhini, Gaurab Mudbhari
Abstract:
Ranging from the field of health care to self-driving cars, machine learning and deep learning algorithms have revolutionized the field with the proper utilization of images and visual-oriented data. Segmentation, regression, classification, clustering, dimensionality reduction, etc., are some of the Machine Learning tasks that helped Machine Learning and Deep Learning models to become state-of-the-art models for the field where images are key datasets. Among these tasks, classification and clustering are essential but difficult because of the intricate and high-dimensional characteristics of image data. This finding examines and assesses advanced techniques in supervised classification and unsupervised clustering for image datasets, emphasizing the relative efficiency of Convolutional Neural Networks (CNNs), Vision Transformers (ViTs), Deep Embedded Clustering (DEC), and self-supervised learning approaches. Due to the distinctive structural attributes present in images, conventional methods often fail to effectively capture spatial patterns, resulting in the development of models that utilize more advanced architectures and attention mechanisms. In image classification, we investigated both CNNs and ViTs. One of the most promising models, which is very much known for its ability to detect spatial hierarchies, is CNN, and it serves as a core model in our study. On the other hand, ViT is another model that also serves as a core model, reflecting a modern classification method that uses a self-attention mechanism which makes them more robust as this self-attention mechanism allows them to lean global dependencies in images without relying on convolutional layers. This paper evaluates the performance of these two architectures based on accuracy, precision, recall, and F1-score across different image datasets, analyzing their appropriateness for various categories of images. In the domain of clustering, we assess DEC, Variational Autoencoders (VAEs), and conventional clustering techniques like k-means, which are used on embeddings derived from CNN models. DEC, a prominent model in the field of clustering, has gained the attention of many ML engineers because of its ability to combine feature learning and clustering into a single framework and its main goal is to improve clustering quality through better feature representation. VAEs, on the other hand, are pretty well known for using latent embeddings for grouping similar images without requiring for prior label by utilizing the probabilistic clustering method.Keywords: machine learning, deep learning, image classification, image clustering
Procedia PDF Downloads 10737 Identifying Autism Spectrum Disorder Using Optimization-Based Clustering
Authors: Sharifah Mousli, Sona Taheri, Jiayuan He
Abstract:
Autism spectrum disorder (ASD) is a complex developmental condition involving persistent difficulties with social communication, restricted interests, and repetitive behavior. The challenges associated with ASD can interfere with an affected individual’s ability to function in social, academic, and employment settings. Although there is no effective medication known to treat ASD, to our best knowledge, early intervention can significantly improve an affected individual’s overall development. Hence, an accurate diagnosis of ASD at an early phase is essential. The use of machine learning approaches improves and speeds up the diagnosis of ASD. In this paper, we focus on the application of unsupervised clustering methods in ASD as a large volume of ASD data generated through hospitals, therapy centers, and mobile applications has no pre-existing labels. We conduct a comparative analysis using seven clustering approaches such as K-means, agglomerative hierarchical, model-based, fuzzy-C-means, affinity propagation, self organizing maps, linear vector quantisation – as well as the recently developed optimization-based clustering (COMSEP-Clust) approach. We evaluate the performances of the clustering methods extensively on real-world ASD datasets encompassing different age groups: toddlers, children, adolescents, and adults. Our experimental results suggest that the COMSEP-Clust approach outperforms the other seven methods in recognizing ASD with well-separated clusters.Keywords: autism spectrum disorder, clustering, optimization, unsupervised machine learning
Procedia PDF Downloads 116736 Spectral Clustering from the Discrepancy View and Generalized Quasirandomness
Authors: Marianna Bolla
Abstract:
The aim of this paper is to compare spectral, discrepancy, and degree properties of expanding graph sequences. As we can prove equivalences and implications between them and the definition of the generalized (multiclass) quasirandomness of Lovasz–Sos (2008), they can be regarded as generalized quasirandom properties akin to the equivalent quasirandom properties of the seminal Chung-Graham-Wilson paper (1989) in the one-class scenario. Since these properties are valid for deterministic graph sequences, irrespective of stochastic models, the partial implications also justify for low-dimensional embedding of large-scale graphs and for discrepancy minimizing spectral clustering.Keywords: generalized random graphs, multiway discrepancy, normalized modularity spectra, spectral clustering
Procedia PDF Downloads 197735 A Clustering Algorithm for Massive Texts
Authors: Ming Liu, Chong Wu, Bingquan Liu, Lei Chen
Abstract:
Internet users have to face the massive amount of textual data every day. Organizing texts into categories can help users dig the useful information from large-scale text collection. Clustering, in fact, is one of the most promising tools for categorizing texts due to its unsupervised characteristic. Unfortunately, most of traditional clustering algorithms lose their high qualities on large-scale text collection. This situation mainly attributes to the high- dimensional vectors generated from texts. To effectively and efficiently cluster large-scale text collection, this paper proposes a vector reconstruction based clustering algorithm. Only the features that can represent the cluster are preserved in cluster’s representative vector. This algorithm alternately repeats two sub-processes until it converges. One process is partial tuning sub-process, where feature’s weight is fine-tuned by iterative process. To accelerate clustering velocity, an intersection based similarity measurement and its corresponding neuron adjustment function are proposed and implemented in this sub-process. The other process is overall tuning sub-process, where the features are reallocated among different clusters. In this sub-process, the features useless to represent the cluster are removed from cluster’s representative vector. Experimental results on the three text collections (including two small-scale and one large-scale text collections) demonstrate that our algorithm obtains high quality on both small-scale and large-scale text collections.Keywords: vector reconstruction, large-scale text clustering, partial tuning sub-process, overall tuning sub-process
Procedia PDF Downloads 435734 Wear Behavior of Grey Cast Iron Coated with Al2O3-13TiO2 and Ni20Cr Using Detonation Spray Process
Authors: Harjot Singh Gill, Neelkanth Grover, Jwala Parshad Singla
Abstract:
The main aim of this research work is to present the effect of coating on two different grades of grey cast iron using detonation spray method. Ni20Cr and Al2O3-13TiO2 powders were sprayed using detonation gun onto GI250 and GIHC substrates and the results as well as coating surface morphology of the coating is studied by XRD and SEM/EDAX analysis. The wear resistance of Ni20Cr and Al2O3-13TiO2 has been investigated on pin-on-disc tribometer using ASTM G99 standards. Cumulative wear rate and coefficient of friction (µ) were calculated under three normal load of 30N, 40N, 50N at constant sliding velocity of 1m/s. Worn out surfaces were analyzed by SEM/EDAX. The results show significant resistance to wear with Al2O3-13TiO2 coating as compared to Ni20Cr and bare substrates. SEM/EDAX analysis and cumulative wear loss bar charts clearly explain the wear behavior of coated as well as bare sample of GI250 and GIHC.Keywords: detonation spray, grey cast iron, wear rate, coefficient of friction
Procedia PDF Downloads 367733 Effect of Bi-Dispersity on Particle Clustering in Sedimentation
Authors: Ali Abbas Zaidi
Abstract:
In free settling or sedimentation, particles form clusters at high Reynolds number and dilute suspensions. It is due to the entrapment of particles in the wakes of upstream particles. In this paper, the effect of bi-dispersity of settling particles on particle clustering is investigated using particle-resolved direct numerical simulation. Immersed boundary method is used for particle fluid interactions and discrete element method is used for particle-particle interactions. The solid volume fraction used in the simulation is 1% and the Reynolds number based on Sauter mean diameter is 350. Both solid volume fraction and Reynolds number lie in the clustering regime of sedimentation. In simulations, the particle diameter ratio (i.e. diameter of larger particle to smaller particle (d₁/d₂)) is varied from 2:1, 3:1 and 4:1. For each case of particle diameter ratio, solid volume fraction for each particle size (φ₁/φ₂) is varied from 1:1, 1:2 and 2:1. For comparison, simulations are also performed for monodisperse particles. For studying particles clustering, radial distribution function and instantaneous location of particles in the computational domain are studied. It is observed that the degree of particle clustering decreases with the increase in the bi-dispersity of settling particles. The smallest degree of particle clustering or dispersion of particles is observed for particles with d₁/d₂ equal to 4:1 and φ₁/φ₂ equal to 1:2. Simulations showed that the reduction in particle clustering by increasing bi-dispersity is due to the difference in settling velocity of particles. Particles with larger size settle faster and knockout the smaller particles from clustered regions of particles in the computational domain.Keywords: dispersion in bi-disperse settling particles, particle microstructures in bi-disperse suspensions, particle resolved direct numerical simulations, settling of bi-disperse particles
Procedia PDF Downloads 207732 A Learning-Based EM Mixture Regression Algorithm
Authors: Yi-Cheng Tian, Miin-Shen Yang
Abstract:
The mixture likelihood approach to clustering is a popular clustering method where the expectation and maximization (EM) algorithm is the most used mixture likelihood method. In the literature, the EM algorithm had been used for mixture regression models. However, these EM mixture regression algorithms are sensitive to initial values with a priori number of clusters. In this paper, to resolve these drawbacks, we construct a learning-based schema for the EM mixture regression algorithm such that it is free of initializations and can automatically obtain an approximately optimal number of clusters. Some numerical examples and comparisons demonstrate the superiority and usefulness of the proposed learning-based EM mixture regression algorithm.Keywords: clustering, EM algorithm, Gaussian mixture model, mixture regression model
Procedia PDF Downloads 510731 An Image Processing Scheme for Skin Fungal Disease Identification
Authors: A. A. M. A. S. S. Perera, L. A. Ranasinghe, T. K. H. Nimeshika, D. M. Dhanushka Dissanayake, Namalie Walgampaya
Abstract:
Nowadays, skin fungal diseases are mostly found in people of tropical countries like Sri Lanka. A skin fungal disease is a particular kind of illness caused by fungus. These diseases have various dangerous effects on the skin and keep on spreading over time. It becomes important to identify these diseases at their initial stage to control it from spreading. This paper presents an automated skin fungal disease identification system implemented to speed up the diagnosis process by identifying skin fungal infections in digital images. An image of the diseased skin lesion is acquired and a comprehensive computer vision and image processing scheme is used to process the image for the disease identification. This includes colour analysis using RGB and HSV colour models, texture classification using Grey Level Run Length Matrix, Grey Level Co-Occurrence Matrix and Local Binary Pattern, Object detection, Shape Identification and many more. This paper presents the approach and its outcome for identification of four most common skin fungal infections, namely, Tinea Corporis, Sporotrichosis, Malassezia and Onychomycosis. The main intention of this research is to provide an automated skin fungal disease identification system that increase the diagnostic quality, shorten the time-to-diagnosis and improve the efficiency of detection and successful treatment for skin fungal diseases.Keywords: Circularity Index, Grey Level Run Length Matrix, Grey Level Co-Occurrence Matrix, Local Binary Pattern, Object detection, Ring Detection, Shape Identification
Procedia PDF Downloads 232730 Communication of Sensors in Clustering for Wireless Sensor Networks
Authors: Kashish Sareen, Jatinder Singh Bal
Abstract:
The use of wireless sensor networks (WSNs) has grown vastly in the last era, pointing out the crucial need for scalable and energy-efficient routing and data gathering and aggregation protocols in corresponding large-scale environments. Wireless Sensor Networks have now recently emerged as a most important computing platform and continue to grow in diverse areas to provide new opportunities for networking and services. However, the energy constrained and limited computing resources of the sensor nodes present major challenges in gathering data. The sensors collect data about their surrounding and forward it to a command centre through a base station. The past few years have witnessed increased interest in the potential use of wireless sensor networks (WSNs) as they are very useful in target detecting and other applications. However, hierarchical clustering protocols have maximum been used in to overall system lifetime, scalability and energy efficiency. In this paper, the state of the art in corresponding hierarchical clustering approaches for large-scale WSN environments is shown.Keywords: clustering, DLCC, MLCC, wireless sensor networks
Procedia PDF Downloads 481729 Interpretation and Clustering Framework for Analyzing ECG Survey Data
Authors: Irum Matloob, Shoab Ahmad Khan, Fahim Arif
Abstract:
As Indo-Pak has been the victim of heart diseases since many decades. Many surveys showed that percentage of cardiac patients is increasing in Pakistan day by day, and special attention is needed to pay on this issue. The framework is proposed for performing detailed analysis of ECG survey data which is conducted for measuring prevalence of heart diseases statistics in Pakistan. The ECG survey data is evaluated or filtered by using automated Minnesota codes and only those ECGs are used for further analysis which is fulfilling the standardized conditions mentioned in the Minnesota codes. Then feature selection is performed by applying proposed algorithm based on discernibility matrix, for selecting relevant features from the database. Clustering is performed for exposing natural clusters from the ECG survey data by applying spectral clustering algorithm using fuzzy c means algorithm. The hidden patterns and interesting relationships which have been exposed after this analysis are useful for further detailed analysis and for many other multiple purposes.Keywords: arrhythmias, centroids, ECG, clustering, discernibility matrix
Procedia PDF Downloads 470728 An Extraction of Cancer Region from MR Images Using Fuzzy Clustering Means and Morphological Operations
Authors: Ramandeep Kaur, Gurjit Singh Bhathal
Abstract:
Cancer diagnosis is very difficult task. Magnetic resonance imaging (MRI) scan is used to produce image of any part of the body and provides an efficient way for diagnosis of cancer or tumor. In existing method, fuzzy clustering mean (FCM) is used for the diagnosis of the tumor. In the proposed method FCM is used to diagnose the cancer of the foot. FCM finds the centroids of the clusters of the foot cancer obtained from MRI images. FCM thresholding result shows the extract region of the cancer. Morphological operations are applied to get extracted region of cancer.Keywords: magnetic resonance imaging (MRI), fuzzy C mean clustering, segmentation, morphological operations
Procedia PDF Downloads 398