Search results for: K-means clustering algorithm
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3911

Search results for: K-means clustering algorithm

3791 Method of Visual Prosthesis Design Based on Biologically Inspired Design

Authors: Shen Jian, Hu Jie, Zhu Guo Niu, Peng Ying Hong

Abstract:

There are two issues exited in the traditional visual prosthesis: lacking systematic method and the low level of humanization. To tackcle those obstacles, a visual prosthesis design method based on biologically inspired design is proposed. Firstly, a constrained FBS knowledge cell model is applied to construct the functional model of visual prosthesis in biological field. Then the clustering results of engineering domain are ob-tained with the use of the cross-domain knowledge cell clustering algorithm. Finally, a prototype system is designed to support the bio-logically inspired design where the conflict is digested by TRIZ and other tools, and the validity of the method is verified by the solution scheme

Keywords: knowledge-based engineering, visual prosthesis, biologically inspired design, biomedical engineering

Procedia PDF Downloads 157
3790 Design and Implementation of Machine Learning Model for Short-Term Energy Forecasting in Smart Home Management System

Authors: R. Ramesh, K. K. Shivaraman

Abstract:

The main aim of this paper is to handle the energy requirement in an efficient manner by merging the advanced digital communication and control technologies for smart grid applications. In order to reduce user home load during peak load hours, utility applies several incentives such as real-time pricing, time of use, demand response for residential customer through smart meter. However, this method provides inconvenience in the sense that user needs to respond manually to prices that vary in real time. To overcome these inconvenience, this paper proposes a convolutional neural network (CNN) with k-means clustering machine learning model which have ability to forecast energy requirement in short term, i.e., hour of the day or day of the week. By integrating our proposed technique with home energy management based on Bluetooth low energy provides predicted value to user for scheduling appliance in advanced. This paper describes detail about CNN configuration and k-means clustering algorithm for short-term energy forecasting.

Keywords: convolutional neural network, fuzzy logic, k-means clustering approach, smart home energy management

Procedia PDF Downloads 281
3789 Pattern Recognition Using Feature Based Die-Map Clustering in the Semiconductor Manufacturing Process

Authors: Seung Hwan Park, Cheng-Sool Park, Jun Seok Kim, Youngji Yoo, Daewoong An, Jun-Geol Baek

Abstract:

Depending on the big data analysis becomes important, yield prediction using data from the semiconductor process is essential. In general, yield prediction and analysis of the causes of the failure are closely related. The purpose of this study is to analyze pattern affects the final test results using a die map based clustering. Many researches have been conducted using die data from the semiconductor test process. However, analysis has limitation as the test data is less directly related to the final test results. Therefore, this study proposes a framework for analysis through clustering using more detailed data than existing die data. This study consists of three phases. In the first phase, die map is created through fail bit data in each sub-area of die. In the second phase, clustering using map data is performed. And the third stage is to find patterns that affect final test result. Finally, the proposed three steps are applied to actual industrial data and experimental results showed the potential field application.

Keywords: die-map clustering, feature extraction, pattern recognition, semiconductor manufacturing process

Procedia PDF Downloads 374
3788 Wireless Sensor Networks Optimization by Using 2-Stage Algorithm Based on Imperialist Competitive Algorithm

Authors: Hamid R. Lashgarian Azad, Seyed N. Shetab Boushehri

Abstract:

Wireless sensor networks (WSN) have become progressively popular due to their wide range of applications. Wireless Sensor Network is made of numerous tiny sensor nodes that are battery-powered. It is a very significant problem to maximize the lifetime of wireless sensor networks. In this paper, we propose a two-stage protocol based on an imperialist competitive algorithm (2S-ICA) to solve a sensor network optimization problem. The energy of the sensors can be greatly reduced and the lifetime of the network reduced by long communication distances between the sensors and the sink. We can minimize the overall communication distance considerably, thereby extending the lifetime of the network lifetime through connecting sensors into a series of independent clusters using 2SICA. Comparison results of the proposed protocol and LEACH protocol, which is common to solving WSN problems, show that our protocol has a better performance in terms of improving network life and increasing the number of transmitted data.

Keywords: wireless sensor network, imperialist competitive algorithm, LEACH protocol, k-means clustering

Procedia PDF Downloads 53
3787 Sales Patterns Clustering Analysis on Seasonal Product Sales Data

Authors: Soojin Kim, Jiwon Yang, Sungzoon Cho

Abstract:

As a seasonal product is only in demand for a short time, inventory management is critical to profits. Both markdowns and stockouts decrease the return on perishable products; therefore, researchers have been interested in the distribution of seasonal products with the aim of maximizing profits. In this study, we propose a data-driven seasonal product sales pattern analysis method for individual retail outlets based on observed sales data clustering; the proposed method helps in determining distribution strategies.

Keywords: clustering, distribution, sales pattern, seasonal product

Procedia PDF Downloads 568
3786 Cluster-Based Multi-Path Routing Algorithm in Wireless Sensor Networks

Authors: Si-Gwan Kim

Abstract:

Small-size and low-power sensors with sensing, signal processing and wireless communication capabilities is suitable for the wireless sensor networks. Due to the limited resources and battery constraints, complex routing algorithms used for the ad-hoc networks cannot be employed in sensor networks. In this paper, we propose node-disjoint multi-path hexagon-based routing algorithms in wireless sensor networks. We suggest the details of the algorithm and compare it with other works. Simulation results show that the proposed scheme achieves better performance in terms of efficiency and message delivery ratio.

Keywords: clustering, multi-path, routing protocol, sensor network

Procedia PDF Downloads 370
3785 Feature Weighting Comparison Based on Clustering Centers in the Detection of Diabetic Retinopathy

Authors: Kemal Polat

Abstract:

In this paper, three feature weighting methods have been used to improve the classification performance of diabetic retinopathy (DR). To classify the diabetic retinopathy, features extracted from the output of several retinal image processing algorithms, such as image-level, lesion-specific and anatomical components, have been used and fed them into the classifier algorithms. The dataset used in this study has been taken from University of California, Irvine (UCI) machine learning repository. Feature weighting methods including the fuzzy c-means clustering based feature weighting, subtractive clustering based feature weighting, and Gaussian mixture clustering based feature weighting, have been used and compered with each other in the classification of DR. After feature weighting, five different classifier algorithms comprising multi-layer perceptron (MLP), k- nearest neighbor (k-NN), decision tree, support vector machine (SVM), and Naïve Bayes have been used. The hybrid method based on combination of subtractive clustering based feature weighting and decision tree classifier has been obtained the classification accuracy of 100% in the screening of DR. These results have demonstrated that the proposed hybrid scheme is very promising in the medical data set classification.

Keywords: machine learning, data weighting, classification, data mining

Procedia PDF Downloads 301
3784 Remote Assessment and Change Detection of GreenLAI of Cotton Crop Using Different Vegetation Indices

Authors: Ganesh B. Shinde, Vijaya B. Musande

Abstract:

Cotton crop identification based on the timely information has significant advantage to the different implications of food, economic and environment. Due to the significant advantages, the accurate detection of cotton crop regions using supervised learning procedure is challenging problem in remote sensing. Here, classifiers on the direct image are played a major role but the results are not much satisfactorily. In order to further improve the effectiveness, variety of vegetation indices are proposed in the literature. But, recently, the major challenge is to find the better vegetation indices for the cotton crop identification through the proposed methodology. Accordingly, fuzzy c-means clustering is combined with neural network algorithm, trained by Levenberg-Marquardt for cotton crop classification. To experiment the proposed method, five LISS-III satellite images was taken and the experimentation was done with six vegetation indices such as Simple Ratio, Normalized Difference Vegetation Index, Enhanced Vegetation Index, Green Atmospherically Resistant Vegetation Index, Wide-Dynamic Range Vegetation Index, Green Chlorophyll Index. Along with these indices, Green Leaf Area Index is also considered for investigation. From the research outcome, Green Atmospherically Resistant Vegetation Index outperformed with all other indices by reaching the average accuracy value of 95.21%.

Keywords: Fuzzy C-Means clustering (FCM), neural network, Levenberg-Marquardt (LM) algorithm, vegetation indices

Procedia PDF Downloads 285
3783 A Hybrid Multi-Objective Firefly-Sine Cosine Algorithm for Multi-Objective Optimization Problem

Authors: Gaohuizi Guo, Ning Zhang

Abstract:

Firefly algorithm (FA) and Sine Cosine algorithm (SCA) are two very popular and advanced metaheuristic algorithms. However, these algorithms applied to multi-objective optimization problems have some shortcomings, respectively, such as premature convergence and limited exploration capability. Combining the privileges of FA and SCA while avoiding their deficiencies may improve the accuracy and efficiency of the algorithm. This paper proposes a hybridization of FA and SCA algorithms, named multi-objective firefly-sine cosine algorithm (MFA-SCA), to develop a more efficient meta-heuristic algorithm than FA and SCA.

Keywords: firefly algorithm, hybrid algorithm, multi-objective optimization, sine cosine algorithm

Procedia PDF Downloads 134
3782 Clustering the Wheat Seeds Using SOM Artificial Neural Networks

Authors: Salah Ghamari

Abstract:

In this study, the ability of self organizing map artificial (SOM) neural networks in clustering the wheat seeds varieties according to morphological properties of them was considered. The SOM is one type of unsupervised competitive learning. Experimentally, five morphological features of 300 seeds (including three varieties: gaskozhen, Md and sardari) were obtained using image processing technique. The results show that the artificial neural network has a good performance (90.33% accuracy) in classification of the wheat varieties despite of high similarity in them. The highest classification accuracy (100%) was achieved for sardari.

Keywords: artificial neural networks, clustering, self organizing map, wheat variety

Procedia PDF Downloads 610
3781 GCM Based Fuzzy Clustering to Identify Homogeneous Climatic Regions of North-East India

Authors: Arup K. Sarma, Jayshree Hazarika

Abstract:

The North-eastern part of India, which receives heavier rainfall than other parts of the subcontinent, is of great concern now-a-days with regard to climate change. High intensity rainfall for short duration and longer dry spell, occurring due to impact of climate change, affects river morphology too. In the present study, an attempt is made to delineate the North-Eastern region of India into some homogeneous clusters based on the Fuzzy Clustering concept and to compare the resulting clusters obtained by using conventional methods and non conventional methods of clustering. The concept of clustering is adapted in view of the fact that, impact of climate change can be studied in a homogeneous region without much variation, which can be helpful in studies related to water resources planning and management. 10 IMD (Indian Meteorological Department) stations, situated in various regions of the North-east, have been selected for making the clusters. The results of the Fuzzy C-Means (FCM) analysis show different clustering patterns for different conditions. From the analysis and comparison it can be concluded that non conventional method of using GCM data is somehow giving better results than the others. However, further analysis can be done by taking daily data instead of monthly means to reduce the effect of standardization.

Keywords: climate change, conventional and nonconventional methods of clustering, FCM analysis, homogeneous regions

Procedia PDF Downloads 354
3780 The Clustering of Multiple Sclerosis Subgroups through L2 Norm Multifractal Denoising Technique

Authors: Yeliz Karaca, Rana Karabudak

Abstract:

Multifractal Denoising techniques are used in the identification of significant attributes by removing the noise of the dataset. Magnetic resonance (MR) image technique is the most sensitive method so as to identify chronic disorders of the nervous system such as Multiple Sclerosis. MRI and Expanded Disability Status Scale (EDSS) data belonging to 120 individuals who have one of the subgroups of MS (Relapsing Remitting MS (RRMS), Secondary Progressive MS (SPMS), Primary Progressive MS (PPMS)) as well as 19 healthy individuals in the control group have been used in this study. The study is comprised of the following stages: (i) L2 Norm Multifractal Denoising technique, one of the multifractal technique, has been used with the application on the MS data (MRI and EDSS). In this way, the new dataset has been obtained. (ii) The new MS dataset obtained from the MS dataset and L2 Multifractal Denoising technique has been applied to the K-Means and Fuzzy C Means clustering algorithms which are among the unsupervised methods. Thus, the clustering performances have been compared. (iii) In the identification of significant attributes in the MS dataset through the Multifractal denoising (L2 Norm) technique using K-Means and FCM algorithms on the MS subgroups and control group of healthy individuals, excellent performance outcome has been yielded. According to the clustering results based on the MS subgroups obtained in the study, successful clustering results have been obtained in the K-Means and FCM algorithms by applying the L2 norm of multifractal denoising technique for the MS dataset. Clustering performance has been more successful with the MS Dataset (L2_Norm MS Data Set) K-Means and FCM in which significant attributes are obtained by applying L2 Norm Denoising technique.

Keywords: clinical decision support, clustering algorithms, multiple sclerosis, multifractal techniques

Procedia PDF Downloads 133
3779 Approximating Fixed Points by a Two-Step Iterative Algorithm

Authors: Safeer Hussain Khan

Abstract:

In this paper, we introduce a two-step iterative algorithm to prove a strong convergence result for approximating common fixed points of three contractive-like operators. Our algorithm basically generalizes an existing algorithm..Our iterative algorithm also contains two famous iterative algorithms: Mann iterative algorithm and Ishikawa iterative algorithm. Thus our result generalizes the corresponding results proved for the above three iterative algorithms to a class of more general operators. At the end, we remark that nothing prevents us to extend our result to the case of the iterative algorithm with error terms.

Keywords: contractive-like operator, iterative algorithm, fixed point, strong convergence

Procedia PDF Downloads 515
3778 A Similarity Measure for Classification and Clustering in Image Based Medical and Text Based Banking Applications

Authors: K. P. Sandesh, M. H. Suman

Abstract:

Text processing plays an important role in information retrieval, data-mining, and web search. Measuring the similarity between the documents is an important operation in the text processing field. In this project, a new similarity measure is proposed. To compute the similarity between two documents with respect to a feature the proposed measure takes the following three cases into account: (1) The feature appears in both documents; (2) The feature appears in only one document and; (3) The feature appears in none of the documents. The proposed measure is extended to gauge the similarity between two sets of documents. The effectiveness of our measure is evaluated on several real-world data sets for text classification and clustering problems, especially in banking and health sectors. The results show that the performance obtained by the proposed measure is better than that achieved by the other measures.

Keywords: document classification, document clustering, entropy, accuracy, classifiers, clustering algorithms

Procedia PDF Downloads 480
3777 Multi-Level Clustering Based Congestion Control Protocol for Cyber Physical Systems

Authors: Manpreet Kaur, Amita Rani, Sanjay Kumar

Abstract:

The Internet of Things (IoT), a cyber-physical paradigm, allows a large number of devices to connect and send the sensory data in the network simultaneously. This tremendous amount of data generated leads to very high network load consequently resulting in network congestion. It further amounts to frequent loss of useful information and depletion of significant amount of nodes’ energy. Therefore, there is a need to control congestion in IoT so as to prolong network lifetime and improve the quality of service (QoS). Hence, we propose a two-level clustering based routing algorithm considering congestion score and packet priority metrics that focus on minimizing the network congestion. In the proposed Priority based Congestion Control (PBCC) protocol the sensor nodes in IoT network form clusters that reduces the amount of traffic and the nodes are prioritized to emphasize important data. Simultaneously, a congestion score determines the occurrence of congestion at a particular node. The proposed protocol outperforms the existing Packet Discard Network Clustering (PDNC) protocol in terms of buffer size, packet transmission range, network region and number of nodes, under various simulation scenarios.

Keywords: internet of things, cyber-physical systems, congestion control, priority, transmission rate

Procedia PDF Downloads 281
3776 Cleaning of Scientific References in Large Patent Databases Using Rule-Based Scoring and Clustering

Authors: Emiel Caron

Abstract:

Patent databases contain patent related data, organized in a relational data model, and are used to produce various patent statistics. These databases store raw data about scientific references cited by patents. For example, Patstat holds references to tens of millions of scientific journal publications and conference proceedings. These references might be used to connect patent databases with bibliographic databases, e.g. to study to the relation between science, technology, and innovation in various domains. Problematic in such studies is the low data quality of the references, i.e. they are often ambiguous, unstructured, and incomplete. Moreover, a complete bibliographic reference is stored in only one attribute. Therefore, a computerized cleaning and disambiguation method for large patent databases is developed in this work. The method uses rule-based scoring and clustering. The rules are based on bibliographic metadata, retrieved from the raw data by regular expressions, and are transparent and adaptable. The rules in combination with string similarity measures are used to detect pairs of records that are potential duplicates. Due to the scoring, different rules can be combined, to join scientific references, i.e. the rules reinforce each other. The scores are based on expert knowledge and initial method evaluation. After the scoring, pairs of scientific references that are above a certain threshold, are clustered by means of single-linkage clustering algorithm to form connected components. The method is designed to disambiguate all the scientific references in the Patstat database. The performance evaluation of the clustering method, on a large golden set with highly cited papers, shows on average a 99% precision and a 95% recall. The method is therefore accurate but careful, i.e. it weighs precision over recall. Consequently, separate clusters of high precision are sometimes formed, when there is not enough evidence for connecting scientific references, e.g. in the case of missing year and journal information for a reference. The clusters produced by the method can be used to directly link the Patstat database with bibliographic databases as the Web of Science or Scopus.

Keywords: clustering, data cleaning, data disambiguation, data mining, patent analysis, scientometrics

Procedia PDF Downloads 169
3775 Application of Fuzzy Clustering on Classification Agile Supply Chain

Authors: Hamidreza Fallah Lajimi , Elham Karami, Fatemeh Ali nasab, Mostafa Mahdavikia

Abstract:

Being responsive is an increasingly important skill for firms in today’s global economy; thus firms must be agile. Naturally, it follows that an organization’s agility depends on its supply chain being agile. However, achieving supply chain agility is a function of other abilities within the organization. This paper analyses results from a survey of 71 Iran manufacturing companies in order to identify some of the factors for agile organizations in managing their supply chains. Then we classification this company in four cluster with fuzzy c-mean technique and with four validations functional determine automatically the optimal number of clusters.

Keywords: agile supply chain, clustering, fuzzy clustering

Procedia PDF Downloads 415
3774 Sustainability and Clustering: A Bibliometric Assessment

Authors: Fernanda M. Assef, Maria Teresinha A. Steiner, David Gabriel F. Barros

Abstract:

Review researches are useful in terms of analysis of research problems. Between the types of review documents, we commonly find bibliometric studies. This type of application often helps the global visualization of a research problem and helps academics worldwide to understand the context of a research area better. In this document, a bibliometric view surrounding clustering techniques and sustainability problems is presented. The authors aimed at which issues mostly use clustering techniques, and, even which sustainability issue would be more impactful on today’s moment of research. During the bibliometric analysis, we found ten different groups of research in clustering applications for sustainability issues: Energy; Environmental; Non-urban planning; Sustainable Development; Sustainable Supply Chain; Transport; Urban Planning; Water; Waste Disposal; and, Others. And, by analyzing the citations of each group, we discovered that the Environmental group could be classified as the most impactful research cluster in the area mentioned. Now, after the content analysis of each paper classified in the environmental group, we found that the k-means technique is preferred for solving sustainability problems with clustering methods since it appeared the most amongst the documents. The authors finally conclude that a bibliometric assessment could help indicate a gap of researches on waste disposal – which was the group with the least amount of publications – and the most impactful research on environmental problems.

Keywords: bibliometric assessment, clustering, sustainability, territorial partitioning

Procedia PDF Downloads 82
3773 A Clustering-Sequencing Approach to the Facility Layout Problem

Authors: Saeideh Salimpour, Sophie-Charlotte Viaux, Ahmed Azab, Mohammed Fazle Baki

Abstract:

The Facility Layout Problem (FLP) is key to the efficient and cost-effective operation of a system. This paper presents a hybrid heuristic- and mathematical-programming-based approach that divides the problem conceptually into those of clustering and sequencing. First, clusters of vertically aligned facilities are formed, which are later on sequenced horizontally. The developed methodology provides promising results in comparison to its counterparts in the literature by minimizing the inter-distances for facilities which have more interactions amongst each other and aims at placing the facilities with more interactions at the centroid of the shop.

Keywords: clustering-sequencing approach, mathematical modeling, optimization, unequal facility layout problem

Procedia PDF Downloads 303
3772 Understanding the Qualitative Nature of Product Reviews by Integrating Text Processing Algorithm and Usability Feature Extraction

Authors: Cherry Yieng Siang Ling, Joong Hee Lee, Myung Hwan Yun

Abstract:

The quality of a product to be usable has become the basic requirement in consumer’s perspective while failing the requirement ends up the customer from not using the product. Identifying usability issues from analyzing quantitative and qualitative data collected from usability testing and evaluation activities aids in the process of product design, yet the lack of studies and researches regarding analysis methodologies in qualitative text data of usability field inhibits the potential of these data for more useful applications. While the possibility of analyzing qualitative text data found with the rapid development of data analysis studies such as natural language processing field in understanding human language in computer, and machine learning field in providing predictive model and clustering tool. Therefore, this research aims to study the application capability of text processing algorithm in analysis of qualitative text data collected from usability activities. This research utilized datasets collected from LG neckband headset usability experiment in which the datasets consist of headset survey text data, subject’s data and product physical data. In the analysis procedure, which integrated with the text-processing algorithm, the process includes training of comments onto vector space, labeling them with the subject and product physical feature data, and clustering to validate the result of comment vector clustering. The result shows 'volume and music control button' as the usability feature that matches best with the cluster of comment vectors where centroid comments of a cluster emphasized more on button positions, while centroid comments of the other cluster emphasized more on button interface issues. When volume and music control buttons are designed separately, the participant experienced less confusion, and thus, the comments mentioned only about the buttons' positions. While in the situation where the volume and music control buttons are designed as a single button, the participants experienced interface issues regarding the buttons such as operating methods of functions and confusion of functions' buttons. The relevance of the cluster centroid comments with the extracted feature explained the capability of text processing algorithms in analyzing qualitative text data from usability testing and evaluations.

Keywords: usability, qualitative data, text-processing algorithm, natural language processing

Procedia PDF Downloads 250
3771 K-Means Based Matching Algorithm for Multi-Resolution Feature Descriptors

Authors: Shao-Tzu Huang, Chen-Chien Hsu, Wei-Yen Wang

Abstract:

Matching high dimensional features between images is computationally expensive for exhaustive search approaches in computer vision. Although the dimension of the feature can be degraded by simplifying the prior knowledge of homography, matching accuracy may degrade as a tradeoff. In this paper, we present a feature matching method based on k-means algorithm that reduces the matching cost and matches the features between images instead of using a simplified geometric assumption. Experimental results show that the proposed method outperforms the previous linear exhaustive search approaches in terms of the inlier ratio of matched pairs.

Keywords: feature matching, k-means clustering, SIFT, RANSAC

Procedia PDF Downloads 318
3770 Bridge Members Segmentation Algorithm of Terrestrial Laser Scanner Point Clouds Using Fuzzy Clustering Method

Authors: Donghwan Lee, Gichun Cha, Jooyoung Park, Junkyeong Kim, Seunghee Park

Abstract:

3D shape models of the existing structure are required for many purposes such as safety and operation management. The traditional 3D modeling methods are based on manual or semi-automatic reconstruction from close-range images. It occasions great expense and time consuming. The Terrestrial Laser Scanner (TLS) is a common survey technique to measure quickly and accurately a 3D shape model. This TLS is used to a construction site and cultural heritage management. However there are many limits to process a TLS point cloud, because the raw point cloud is massive volume data. So the capability of carrying out useful analyses is also limited with unstructured 3-D point. Thus, segmentation becomes an essential step whenever grouping of points with common attributes is required. In this paper, members segmentation algorithm was presented to separate a raw point cloud which includes only 3D coordinates. This paper presents a clustering approach based on a fuzzy method for this objective. The Fuzzy C-Means (FCM) is reviewed and used in combination with a similarity-driven cluster merging method. It is applied to the point cloud acquired with Lecia Scan Station C10/C5 at the test bed. The test-bed was a bridge which connects between 1st and 2nd engineering building in Sungkyunkwan University in Korea. It is about 32m long and 2m wide. This bridge was used as pedestrian between two buildings. The 3D point cloud of the test-bed was constructed by a measurement of the TLS. This data was divided by segmentation algorithm for each member. Experimental analyses of the results from the proposed unsupervised segmentation process are shown to be promising. It can be processed to manage configuration each member, because of the segmentation process of point cloud.

Keywords: fuzzy c-means (FCM), point cloud, segmentation, terrestrial laser scanner (TLS)

Procedia PDF Downloads 206
3769 An Intelligent Traffic Management System Based on the WiFi and Bluetooth Sensing

Authors: Hamed Hossein Afshari, Shahrzad Jalali, Amir Hossein Ghods, Bijan Raahemi

Abstract:

This paper introduces an automated clustering solution that applies to WiFi/Bluetooth sensing data and is later used for traffic management applications. The paper initially summarizes a number of clustering approaches and thereafter shows their performance for noise removal. In this context, clustering is used to recognize WiFi and Bluetooth MAC addresses that belong to passengers traveling by a public urban transit bus. The main objective is to build an intelligent system that automatically filters out MAC addresses that belong to persons located outside the bus for different routes in the city of Ottawa. The proposed intelligent system alleviates the need for defining restrictive thresholds that however reduces the accuracy as well as the range of applicability of the solution for different routes. This paper moreover discusses the performance benefits of the presented clustering approaches in terms of the accuracy, time and space complexity, and the ease of use. Note that results of clustering can further be used for the purpose of the origin-destination estimation of individual passengers, predicting the traffic load, and intelligent management of urban bus schedules.

Keywords: WiFi-Bluetooth sensing, cluster analysis, artificial intelligence, traffic management

Procedia PDF Downloads 212
3768 Complex Network Approach to International Trade of Fossil Fuel

Authors: Semanur Soyyigit Kaya, Ercan Eren

Abstract:

Energy has a prominent role for development of nations. Countries which have energy resources also have strategic power in the international trade of energy since it is essential for all stages of production in the economy. Thus, it is important for countries to analyze the weakness and strength of the system. On the other side, it is commonly believed that international trade has complex network properties. Complex network is a tool for the analysis of complex systems with heterogeneous agents and interaction between them. A complex network consists of nodes and the interactions between these nodes. Total properties which emerge as a result of these interactions are distinct from the sum of small parts (more or less) in complex systems. Thus, standard approaches to international trade are superficial to analyze these systems. Network analysis provides a new approach to analyze international trade as a network. In this network countries constitute nodes and trade relations (export or import) constitute edges. It becomes possible to analyze international trade network in terms of high degree indicators which are specific to complex systems such as connectivity, clustering, assortativity/disassortativity, centrality, etc. In this analysis, international trade of crude oil and coal which are types of fossil fuel has been analyzed from 2005 to 2014 via network analysis. First, it has been analyzed in terms of some topological parameters such as density, transitivity, clustering etc. Afterwards, fitness to Pareto distribution has been analyzed. Finally, weighted HITS algorithm has been applied to the data as a centrality measure to determine the real prominence of countries in these trade networks. Weighted HITS algorithm is a strong tool to analyze the network by ranking countries with regards to prominence of their trade partners. We have calculated both an export centrality and an import centrality by applying w-HITS algorithm to data.

Keywords: complex network approach, fossil fuel, international trade, network theory

Procedia PDF Downloads 305
3767 An Efficient Clustering Technique for Copy-Paste Attack Detection

Authors: N. Chaitawittanun, M. Munlin

Abstract:

Due to rapid advancement of powerful image processing software, digital images are easy to manipulate and modify by ordinary people. Lots of digital images are edited for a specific purpose and more difficult to distinguish form their original ones. We propose a clustering method to detect a copy-move image forgery of JPEG, BMP, TIFF, and PNG. The process starts with reducing the color of the photos. Then, we use the clustering technique to divide information of measuring data by Hausdorff Distance. The result shows that the purposed methods is capable of inspecting the image file and correctly identify the forgery.

Keywords: image detection, forgery image, copy-paste, attack detection

Procedia PDF Downloads 307
3766 Laser Data Based Automatic Generation of Lane-Level Road Map for Intelligent Vehicles

Authors: Zehai Yu, Hui Zhu, Linglong Lin, Huawei Liang, Biao Yu, Weixin Huang

Abstract:

With the development of intelligent vehicle systems, a high-precision road map is increasingly needed in many aspects. The automatic lane lines extraction and modeling are the most essential steps for the generation of a precise lane-level road map. In this paper, an automatic lane-level road map generation system is proposed. To extract the road markings on the ground, the multi-region Otsu thresholding method is applied, which calculates the intensity value of laser data that maximizes the variance between background and road markings. The extracted road marking points are then projected to the raster image and clustered using a two-stage clustering algorithm. Lane lines are subsequently recognized from these clusters by the shape features of their minimum bounding rectangle. To ensure the storage efficiency of the map, the lane lines are approximated to cubic polynomial curves using a Bayesian estimation approach. The proposed lane-level road map generation system has been tested on urban and expressway conditions in Hefei, China. The experimental results on the datasets show that our method can achieve excellent extraction and clustering effect, and the fitted lines can reach a high position accuracy with an error of less than 10 cm.

Keywords: curve fitting, lane-level road map, line recognition, multi-thresholding, two-stage clustering

Procedia PDF Downloads 106
3765 Application of Fuzzy Clustering on Classification Agile Supply Chain Firms

Authors: Hamidreza Fallah Lajimi, Elham Karami, Alireza Arab, Fatemeh Alinasab

Abstract:

Being responsive is an increasingly important skill for firms in today’s global economy; thus firms must be agile. Naturally, it follows that an organization’s agility depends on its supply chain being agile. However, achieving supply chain agility is a function of other abilities within the organization. This paper analyses results from a survey of 71 Iran manufacturing companies in order to identify some of the factors for agile organizations in managing their supply chains. Then we classification this company in four cluster with fuzzy c-mean technique and with Four validations functional determine automatically the optimal number of clusters.

Keywords: agile supply chain, clustering, fuzzy clustering, business engineering

Procedia PDF Downloads 667
3764 Identifying Autism Spectrum Disorder Using Optimization-Based Clustering

Authors: Sharifah Mousli, Sona Taheri, Jiayuan He

Abstract:

Autism spectrum disorder (ASD) is a complex developmental condition involving persistent difficulties with social communication, restricted interests, and repetitive behavior. The challenges associated with ASD can interfere with an affected individual’s ability to function in social, academic, and employment settings. Although there is no effective medication known to treat ASD, to our best knowledge, early intervention can significantly improve an affected individual’s overall development. Hence, an accurate diagnosis of ASD at an early phase is essential. The use of machine learning approaches improves and speeds up the diagnosis of ASD. In this paper, we focus on the application of unsupervised clustering methods in ASD as a large volume of ASD data generated through hospitals, therapy centers, and mobile applications has no pre-existing labels. We conduct a comparative analysis using seven clustering approaches such as K-means, agglomerative hierarchical, model-based, fuzzy-C-means, affinity propagation, self organizing maps, linear vector quantisation – as well as the recently developed optimization-based clustering (COMSEP-Clust) approach. We evaluate the performances of the clustering methods extensively on real-world ASD datasets encompassing different age groups: toddlers, children, adolescents, and adults. Our experimental results suggest that the COMSEP-Clust approach outperforms the other seven methods in recognizing ASD with well-separated clusters.

Keywords: autism spectrum disorder, clustering, optimization, unsupervised machine learning

Procedia PDF Downloads 81
3763 Spectral Clustering from the Discrepancy View and Generalized Quasirandomness

Authors: Marianna Bolla

Abstract:

The aim of this paper is to compare spectral, discrepancy, and degree properties of expanding graph sequences. As we can prove equivalences and implications between them and the definition of the generalized (multiclass) quasirandomness of Lovasz–Sos (2008), they can be regarded as generalized quasirandom properties akin to the equivalent quasirandom properties of the seminal Chung-Graham-Wilson paper (1989) in the one-class scenario. Since these properties are valid for deterministic graph sequences, irrespective of stochastic models, the partial implications also justify for low-dimensional embedding of large-scale graphs and for discrepancy minimizing spectral clustering.

Keywords: generalized random graphs, multiway discrepancy, normalized modularity spectra, spectral clustering

Procedia PDF Downloads 163
3762 Improving Cell Type Identification of Single Cell Data by Iterative Graph-Based Noise Filtering

Authors: Annika Stechemesser, Rachel Pounds, Emma Lucas, Chris Dawson, Julia Lipecki, Pavle Vrljicak, Jan Brosens, Sean Kehoe, Jason Yap, Lawrence Young, Sascha Ott

Abstract:

Advances in technology make it now possible to retrieve the genetic information of thousands of single cancerous cells. One of the key challenges in single cell analysis of cancerous tissue is to determine the number of different cell types and their characteristic genes within the sample to better understand the tumors and their reaction to different treatments. For this analysis to be possible, it is crucial to filter out background noise as it can severely blur the downstream analysis and give misleading results. In-depth analysis of the state-of-the-art filtering methods for single cell data showed that they do, in some cases, not separate noisy and normal cells sufficiently. We introduced an algorithm that filters and clusters single cell data simultaneously without relying on certain genes or thresholds chosen by eye. It detects communities in a Shared Nearest Neighbor similarity network, which captures the similarities and dissimilarities of the cells by optimizing the modularity and then identifies and removes vertices with a weak clustering belonging. This strategy is based on the fact that noisy data instances are very likely to be similar to true cell types but do not match any of these wells. Once the clustering is complete, we apply a set of evaluation metrics on the cluster level and accept or reject clusters based on the outcome. The performance of our algorithm was tested on three datasets and led to convincing results. We were able to replicate the results on a Peripheral Blood Mononuclear Cells dataset. Furthermore, we applied the algorithm to two samples of ovarian cancer from the same patient before and after chemotherapy. Comparing the standard approach to our algorithm, we found a hidden cell type in the ovarian postchemotherapy data with interesting marker genes that are potentially relevant for medical research.

Keywords: cancer research, graph theory, machine learning, single cell analysis

Procedia PDF Downloads 78