Search results for: nearest neighbor hierarchical spatial clustering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3656

3536 Multimodal Optimization of Density-Based Clustering Using Collective Animal Behavior Algorithm

Authors: Kristian Bautista, Ruben A. Idoy

Abstract:

A bio-inspired metaheuristic algorithm based on the theory of collective animal behavior (CAB) was integrated into density-based clustering modeled as a multimodal optimization problem. The algorithm was tested on synthetic, Iris, Glass, Pima and Thyroid data sets to measure its effectiveness relative to the CDE-based clustering algorithm. Preliminary testing showed that one of the parameter settings was ineffective for clustering when applied to the algorithm, which prompted the researcher to investigate. The investigation revealed that fine-tuning the distance δ3, which determines the extent to which a given data point will be clustered, helped improve the quality of the cluster output. Even though modifying δ3 significantly improved the solution quality and cluster output of the algorithm, the results suggest that there is no difference between the population means of the solutions obtained using the original and the modified parameter settings for all data sets. This implies that using either the original or the modified parameter setting will not affect obtaining the best global and local animal positions. The results also suggest that the CDE-based clustering algorithm is better than the CAB-density clustering algorithm for all data sets. Nevertheless, the CAB-density clustering algorithm is still a good clustering algorithm because it correctly identified the number of classes of some data sets more frequently in a thirty-trial run with a much smaller standard deviation, which indicates potential for clustering high-dimensional data sets. Thus, the researcher recommends further investigation of the post-processing stage of the algorithm.

Keywords: clustering, metaheuristics, collective animal behavior algorithm, density-based clustering, multimodal optimization

Procedia PDF Downloads 201
3535 Anomaly Detection Based Fuzzy K-Mode Clustering for Categorical Data

Authors: Murat Yazici

Abstract:

Anomalies are irregularities found in data that do not adhere to a well-defined standard of normal behavior. The identification of outliers or anomalies in data has been a subject of study within the statistics field since the 1800s. Over time, a variety of anomaly detection techniques have been developed in several research communities. Cluster analysis can be used to detect anomalies. It is the process of assigning data to clusters so that the members of a cluster are as similar as possible while the members of different clusters are dissimilar. Many traditional clustering algorithms have limitations in dealing with data sets containing categorical properties. To detect anomalies in categorical data, a fuzzy clustering approach can be used to advantage. The fuzzy k-Mode (FKM) clustering algorithm, one of the fuzzy clustering approaches and an extension of the k-means algorithm, has been reported for clustering datasets with categorical values. It is a form of clustering in which each point can be associated with more than one cluster. In this paper, anomaly detection is performed on two simulated data sets using the FKM clustering algorithm. The significance of the study is that, in contrast to numerous anomaly detection algorithms, the FKM clustering algorithm allows anomalies to be determined together with their degree of abnormality. According to the results, the FKM clustering algorithm performed well in detecting anomalies in data containing one anomaly as well as in data containing more than one anomaly.

Keywords: fuzzy k-mode clustering, anomaly detection, noise, categorical data

Procedia PDF Downloads 24
3534 Chemical Reaction Algorithm for Expectation Maximization Clustering

Authors: Li Ni, Pen ManMan, Li KenLi

Abstract:

Clustering has been an area of intensive research for some years because of its multifaceted applications in biology, information retrieval, medicine, business and so on. Expectation maximization (EM) is a kind of algorithm framework in clustering methods and one of the ten algorithms of machine learning. Traditionally, optimization of an objective function has been the standard approach in EM. Hence, research has investigated the utility of evolutionary computing and related techniques in this regard. Chemical Reaction Optimization (CRO) is a recently established method, and its properties are used here to solve optimization problems. This paper presents an algorithm framework (EM-CRO) with modified CRO operators based on EM clustering problems. The hybrid algorithm mainly addresses the initial value sensitivity of clustering algorithms based on objective function optimization. Our experiments take the classic EM algorithms k-means and fuzzy k-means as examples and use the CRO algorithm to optimize their initial values, yielding the K-means-CRO and FKM-CRO algorithms. The experimental results show improved efficiency in solving objective function optimization clustering problems.

Keywords: chemical reaction optimization, expectation maximization, initia, objective function clustering

Procedia PDF Downloads 685
3533 Self-Supervised Attributed Graph Clustering with Dual Contrastive Loss Constraints

Authors: Lijuan Zhou, Mengqi Wu, Changyong Niu

Abstract:

Attributed graph clustering can utilize the graph topology and node attributes to uncover hidden community structures and patterns in complex networks, aiding in the understanding and analysis of complex systems. Utilizing contrastive learning for attributed graph clustering can effectively exploit meaningful implicit relationships between data. However, existing attributed graph clustering methods based on contrastive learning suffer from the following drawbacks: 1) Complex data augmentation increases computational cost, and inappropriate data augmentation may lead to semantic drift. 2) The selection of positive and negative samples neglects the intrinsic cluster structure learned from graph topology and node attributes. Therefore, this paper proposes a method called self-supervised Attributed Graph Clustering with Dual Contrastive Loss constraints (AGC-DCL). Firstly, Siamese Multilayer Perceptron (MLP) encoders are employed to generate two views separately to avoid complex data augmentation. Secondly, the neighborhood contrastive loss is introduced to constrain node representation using local topological structure while effectively embedding attribute information through attribute reconstruction. Additionally, clustering-oriented contrastive loss is applied to fully utilize clustering information in global semantics for discriminative node representations, regarding the cluster centers from two views as negative samples to fully leverage effective clustering information from different views. Comparative clustering results with existing attributed graph clustering algorithms on six datasets demonstrate the superiority of the proposed method.

Keywords: attributed graph clustering, contrastive learning, clustering-oriented, self-supervised learning

Procedia PDF Downloads 18
3532 Assessing Functional Structure in European Marine Ecosystems Using a Vector-Autoregressive Spatio-Temporal Model

Authors: Katyana A. Vert-Pre, James T. Thorson, Thomas Trancart, Eric Feunteun

Abstract:

In marine ecosystems, spatial and temporal species structure is an important component of ecosystems’ response to anthropological and environmental factors. Although spatial distribution patterns and temporal series of fish abundance have been studied in the past, little research has been devoted to the joint dynamic spatio-temporal functional patterns in marine ecosystems and their use in multispecies management and conservation. Each species represents a function in the ecosystem, and the distribution of these species might not be random. A heterogeneous functional distribution will make an ecosystem more resilient to external factors. Applying a Vector-Autoregressive Spatio-Temporal (VAST) model for count data, we estimate the spatio-temporal distribution, shift in time, and abundance of 140 species of the Eastern English Channel, Bay of Biscay and Mediterranean Sea. From the model outputs, we determined spatio-temporal clusters, calculating p-values for hierarchical clustering via multiscale bootstrap resampling. Then, we designed a functional map given the defined clusters. We found that the species distribution within the ecosystem was not random. Indeed, species evolved in space and time in clusters. Moreover, these clusters remained similar over time because species of the same cluster often shifted in sync, keeping the overall structure of the ecosystem similar over time. Knowing the co-existing species within these clusters could help with predicting the distribution and abundance of data-poor species. Further analysis is being performed to assess the ecological functions represented in each cluster.

Keywords: cluster distribution shift, European marine ecosystems, functional distribution, spatio-temporal model

Procedia PDF Downloads 168
3531 Investigating Spatial Disparities in Health Status and Access to Health-Related Interventions among Tribals in Jharkhand

Authors: Parul Suraia, Harshit Sosan Lakra

Abstract:

Indigenous communities represent some of the most marginalized populations globally; those in India, labeled as tribals, experience particularly pronounced marginalization and a concerning decline in their numbers. These communities often inhabit geographically challenging regions characterized by low population densities, posing significant challenges to providing essential infrastructure services. Jharkhand, a Schedule 5 state, is infamous for its low-level health status due to disparities in access to health care. The primary objective of this study is to investigate the spatial inequalities in healthcare accessibility among tribal populations within the state and pinpoint critical areas requiring immediate attention. Health indicators were selected based on the tribal perspective and the association of Sustainable Development Goal 3 (Good Health and Wellbeing) with other SDGs. Focus group discussions with tribal people and tribal experts were conducted to finalize the indicators. Employing Principal Component Analysis, two essential indices were constructed: the Tribal Health Index (THI) and the Tribal Health Intervention Index (THII). Index values were calculated based on district-wise secondary data for Jharkhand. The bivariate spatial association technique, Moran’s I, was used to assess the spatial pattern of the variables and to determine whether there is any clustering (positive spatial autocorrelation) or dispersion (negative spatial autocorrelation) of values across Jharkhand. The results help in targeting policy interventions at deprived areas of Jharkhand.

Keywords: tribal health, health spatial disparities, health status, Jharkhand

Procedia PDF Downloads 62
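As a rough illustration of the spatial autocorrelation check mentioned in the abstract above, the following sketch computes the global, univariate Moran's I for a handful of areas. The binary adjacency weights, the function name `morans_i`, and the toy values are assumptions made for illustration only; the study itself applies a bivariate variant to district-level indices, which is not reproduced here.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weight matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted sum of products of deviations over all pairs of areas.
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * cross / variance_sum


# Hypothetical example: four areas in a row, each adjacent to its neighbours.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 1, 0, 0], chain)    # similar values sit together
alternating = morans_i([1, 0, 1, 0], chain)  # neighbours always differ
```

In this toy setup the first arrangement gives a positive value (clustering) and the second a negative one (dispersion), matching the interpretation given in the abstract.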
3530 Estimation of Missing Values in Aggregate Level Spatial Data

Authors: Amitha Puranik, V. S. Binu, Seena Biju

Abstract:

Missing data is a common problem in spatial analysis, especially at the aggregate level. Missing values can occur in the covariates, in the response variable, or in both at a given location. Many techniques are available to estimate missing data values, but not all of them can be applied to spatial data because the data are autocorrelated. Hence, there is a need to develop a method that estimates the missing values in both the response variable and the covariates in spatial data by taking the spatial autocorrelation into account. The present study aims to develop a model to estimate the missing data points at the aggregate level in spatial data by accounting for (a) spatial autocorrelation of the response variable, (b) spatial autocorrelation of the covariates, and (c) correlation between the covariates and the response variable. Estimating the missing values of spatial data requires a model that explicitly accounts for the spatial autocorrelation. The proposed model not only accounts for spatial autocorrelation but also utilizes the correlation that exists between covariates, within covariates, and between the response variable and the covariates. Precise estimation of the missing data points in spatial data will increase the precision of the estimated effects of independent variables on the response variable in spatial regression analysis.

Keywords: spatial regression, missing data estimation, spatial autocorrelation, simulation analysis

Procedia PDF Downloads 347
3529 Water Resources Green Efficiency in China: Evaluation, Spatial Association Network Structure Analysis, and Influencing Factors

Authors: Tingyu Zhang

Abstract:

This paper utilizes the Super-SBM model to assess water resources green efficiency (WRGE) among provinces in China and investigate its spatial and temporal features, based on the characteristic framework of “economy-environment-society.” The social network analysis is employed to examine the network pattern and spatial interaction of WRGE. Further, the quadratic assignment procedure method is utilized for examining the influencing factors of the spatial association of WRGE regarding “relationship.” The study reveals that: (1) the spatial distribution of WRGE demonstrates a distribution pattern of Eastern>Western>Central; (2) a remarkable spatial association exists among provinces; however, no strict hierarchical structure is observed. The internal structure of the WRGE network is characterized by the feature of "Eastern strong and Western weak". The block model analysis discovers that the members of the “net spillover” and “two-way spillover” blocks are mostly in the eastern and central provinces; “broker” block, which plays an intermediary role, is mostly in the central provinces; and members of the “net beneficiary” block are mostly in the western region. (3) Differences in economic development, degree of urbanization, water use environment, and water management have significant impacts on the spatial connection of WRGE. This study is dedicated to the realization of regional linkages and synergistic enhancement of WRGE, which provides a meaningful basis for building a harmonious society of human and water coexistence.

Keywords: water resources green efficiency, super-SBM model, social network analysis, quadratic assignment procedure

Procedia PDF Downloads 25
3528 Coastalization and Urban Sprawl in the Mediterranean: Using High-Resolution Multi-Temporal Data to Identify Typologies of Spatial Development

Authors: Apostolos Lagarias, Anastasia Stratigea

Abstract:

Coastal urbanization is heavily affecting the Mediterranean, taking the form of linear urban sprawl along the coastal zone. This process is putting extreme pressure on ecosystems, leading to an unsustainable model of growth. The aim of this research is to analyze coastal urbanization patterns in the Mediterranean using high-resolution multi-temporal data provided by the Global Human Settlement Layer (GHSL) database. The methodology involves the estimation of a set of spatial metrics characterizing the density, aggregation/clustering and dispersion of built-up areas. As case study areas, the Spanish Coast and the Adriatic Italian Coast are examined. Coastalization profiles are examined, and selected sub-areas massively affected by tourism development and suburbanization trends (Costa Blanca/Murcia, Costa del Sol, Puglia, Emilia-Romagna Coast) are analyzed and compared. Results show that there are considerable differences between the Spanish and the Italian typologies of spatial development, related to the land use structure and planning policies applied in each case. Monitoring and analyzing spatial patterns could inform integrated Mediterranean strategies for coastal areas and redirect spatial/environmental policies towards a more sustainable model of growth.

Keywords: coastalization, Mediterranean, multi-temporal, urban sprawl, spatial metrics

Procedia PDF Downloads 109
3527 A Bayesian Hierarchical Poisson Model with an Underlying Cluster Structure for the Analysis of Measles in Colombia

Authors: Ana Corberan-Vallet, Karen C. Florez, Ingrid C. Marino, Jose D. Bermudez

Abstract:

In 2016, the Region of the Americas was declared free of measles, a viral disease that can cause severe health problems. However, since 2017, measles has reemerged in Venezuela and has subsequently reached neighboring countries. In 2018, twelve American countries reported confirmed cases of measles. Governmental and health authorities in Colombia, a country that shares the longest land boundary with Venezuela, are aware of the need for a strong response to restrict the expanse of the epidemic. In this work, we apply a Bayesian hierarchical Poisson model with an underlying cluster structure to describe disease incidence in Colombia. Concretely, the proposed methodology provides relative risk estimates at the department level and identifies clusters of disease, which facilitates the implementation of targeted public health interventions. Socio-demographic factors, such as the percentage of migrants, gross domestic product, and entry routes, are included in the model to better describe the incidence of disease. Since the model does not impose any spatial correlation at any level of the model hierarchy, it avoids the spatial confounding problem and provides a suitable framework to estimate the fixed-effect coefficients associated with spatially-structured covariates.

Keywords: Bayesian analysis, cluster identification, disease mapping, risk estimation

Procedia PDF Downloads 119
3526 Use of Information Technology in the Government of a State

Authors: Pavel E. Golosov, Vladimir I. Gorelov, Oksana L. Karelova

Abstract:

There are visible changes in the world organization, environment and health of national conscience that create a background for discussion on possible redefinition of global, state and regional management goals. Authors apply the sustainable development criteria to a hierarchical management scheme that is to lead the world community to non-contradictory growth. Concrete definitions are discussed in respect of decision-making process representing the state mostly. With the help of system analysis it is highlighted how to understand who would carry the distinctive sign of world leadership in the nearest future.

Keywords: decision-making, information technology, public administration

Procedia PDF Downloads 480
3525 An Improved K-Means Algorithm for Gene Expression Data Clustering

Authors: Billel Kenidra, Mohamed Benmohammed

Abstract:

Data mining technique used in the field of clustering is a subject of active research and assists in biological pattern recognition and extraction of new knowledge from raw data. Clustering means the act of partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar between themselves and dissimilar to objects of other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters leading to an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers and a poor choice of centers may lead to a local optimum that is quite inferior to the global optimum, we propose a strategy to initiate K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results prove how the efficiency has been significantly improved.

Keywords: microarray data mining, biological pattern recognition, partitional clustering, k-means algorithm, centroid initialization

Procedia PDF Downloads 164
3524 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

Clustering is a well-known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can contain either categorical or numeric data. Each type of data has its own specific clustering algorithm. In this context, two algorithms are proposed: k-means for clustering numeric datasets and k-modes for categorical datasets. The main problem encountered in data mining applications is clustering the categorical data that is so common in the datasets. One main approach to clustering categorical values is to transform the categorical attributes into numeric measures and directly apply the k-means algorithm instead of k-modes. In this paper, we experiment with such an approach by transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previous method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are evaluated experimentally. The obtained results show that our proposed method outperforms the binary method in all cases.

Keywords: clustering, unsupervised learning, pattern recognition, categorical datasets, knowledge discovery, k-means

Procedia PDF Downloads 236
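The transformation described in the abstract above can be sketched as follows: every categorical value is replaced by the relative frequency of that value within its attribute, after which a numeric algorithm such as k-means could be applied. The function name `relative_frequency_encode` and the toy records are hypothetical, and this is only one plausible reading of the approach, not the authors' implementation.

```python
from collections import Counter


def relative_frequency_encode(rows):
    """Replace each categorical value by its relative frequency within its attribute."""
    n = len(rows)
    # One frequency table per attribute (column).
    counts = [Counter(column) for column in zip(*rows)]
    return [[counts[j][value] / n for j, value in enumerate(row)] for row in rows]


# Hypothetical toy records: (colour, size)
records = [("red", "small"), ("red", "medium"), ("blue", "small"), ("red", "small")]
encoded = relative_frequency_encode(records)
```

One consequence of this encoding is that two different values with the same frequency in an attribute map to the same number, so they become indistinguishable to the numeric clustering step.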
3523 Cluster Analysis of Retailers’ Benefits from Their Cooperation with Manufacturers: Business Models Perspective

Authors: M. K. Witek-Hajduk, T. M. Napiórkowski

Abstract:

A number of studies have discussed the benefits of retailer-manufacturer cooperation and coopetition. However, only a few publications focus on the benefits of cooperation and coopetition between retailers and their suppliers of durable consumer goods, especially in the context of the business models of the cooperating partners. This paper aims to provide a clustering approach to segment retailers selling consumer durables according to the benefits they obtain from their cooperation with key manufacturers and to differentiate these retailers in terms of the business models of the cooperating partners. For the purpose of the study, a survey (using the CATI method) collected data on 603 consumer durables retailers present on the Polish market. Retailers are clustered with both hierarchical and non-hierarchical methods. Five distinctive groups of consumer durables retailers are identified (based on the studied benefits) using the two-stage clustering approach. The clusters are then characterized with a set of exogenous variables, key among which are the business models employed by the retailer and its partnering key manufacturer. The paper finds that a combination of a medium-sized retailer classified as an Integrator with chiefly domestic capital and a manufacturer categorized as a Market Player yields the highest benefits. On the other side of the spectrum is the medium-sized Distributor retailer with solely domestic capital; in this case, the business model of the cooperating manufacturer appears to be irrelevant. This paper is one of the first empirical studies using cluster analysis on primary data to define the types of cooperation between consumer durables retailers and manufacturers, their key suppliers. The analysis integrates the perspectives of both retailers’ and manufacturers’ business models and matches them with individual and joint benefits.

Keywords: benefits of cooperation, business model, cluster analysis, retailer-manufacturer cooperation

Procedia PDF Downloads 233
3522 Generalization of Clustering Coefficient on Lattice Networks Applied to Criminal Networks

Authors: Christian H. Sanabria-Montaña, Rodrigo Huerta-Quintanilla

Abstract:

A lattice network is a special type of network in which all nodes have the same number of links and the boundary conditions are periodic. The most basic lattice network is the ring, a one-dimensional network with periodic boundary conditions. In contrast, the Cartesian product of d rings forms a d-dimensional lattice network. An analytical expression currently exists for the clustering coefficient in this type of network, but the theoretical value is valid only up to a certain connectivity value; in other words, the analytical expression is incomplete. Here we obtain analytically the clustering coefficient expression in d-dimensional lattice networks for any link density. Our analytical results show that the clustering coefficient of a lattice network whose link density tends to 1 approaches the clustering coefficient of a fully connected network. We developed a model in criminology in which the generalized clustering coefficient expression is applied. The model states that delinquents learn the know-how of the crime business by sharing knowledge, directly or indirectly, with their friends in the gang. This generalization sheds light on the network properties, which is important for developing new models in different fields where network structure plays an important role in the system dynamics, such as criminology, evolutionary game theory, and econophysics, among others.

Keywords: clustering coefficient, criminology, generalized, regular network d-dimensional

Procedia PDF Downloads 380
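To make the quantity discussed in the abstract above concrete, the sketch below measures the clustering coefficient of a one-dimensional ring lattice by brute force. The helper names `ring_lattice` and `clustering_coefficient` and the chosen sizes are assumptions for illustration; the paper's generalized analytical expression is not reproduced. For sparse rings, the commonly cited closed form is 3(k-2)/(4(k-1)), where k is the number of links per node, which gives 0.5 for k = 4.

```python
from itertools import combinations


def ring_lattice(n, k):
    """Adjacency sets for a ring of n nodes, each linked to its k nearest neighbours (k even)."""
    half = k // 2
    return {
        i: {(i + s) % n for s in range(-half, half + 1) if s != 0}
        for i in range(n)
    }


def clustering_coefficient(adj):
    """Average over nodes of (links among neighbours) / (possible links among neighbours)."""
    total = 0.0
    for node, neighbours in adj.items():
        degree = len(neighbours)
        if degree < 2:
            continue  # a node with fewer than two neighbours contributes zero
        links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
        total += links / (degree * (degree - 1) / 2)
    return total / len(adj)


sparse = clustering_coefficient(ring_lattice(20, 4))   # low link density
complete = clustering_coefficient(ring_lattice(7, 6))  # every node linked to every other
```

The second case illustrates the limit stated in the abstract: when the link density reaches 1, the lattice is fully connected and its clustering coefficient equals that of a complete network.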
3521 Hierarchical Piecewise Linear Representation of Time Series Data

Authors: Vineetha Bettaiah, Heggere S. Ranganath

Abstract:

This paper presents a Hierarchical Piecewise Linear Approximation (HPLA) for the representation of time series data in which the time series is treated as a curve in the time-amplitude image space. The curve is partitioned into segments by choosing perceptually important points as break points. Each segment between adjacent break points is recursively partitioned into two segments at the best point or midpoint until the error between the approximating line and the original curve becomes less than a pre-specified threshold. The HPLA representation achieves dimensionality reduction while preserving prominent local features and the general shape of the time series. The representation permits coarse-to-fine processing at different levels of detail, allows flexible definition of similarity based on mathematical measures or general time series shape, and supports time series data mining operations including query by content, clustering and classification based on whole or subsequence similarity.

Keywords: data mining, dimensionality reduction, piecewise linear representation, time series representation

Procedia PDF Downloads 250
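The recursive partitioning step described in the abstract above might look roughly like the following sketch. It covers only the recursion, not the initial selection of perceptually important points, and it assumes that the "best point" is the sample with the largest vertical deviation from the straight line joining the segment end points. That choice, the function names, and the toy series are all illustrative assumptions rather than the authors' definitions.

```python
def split_segment(series, start, end, threshold, breaks):
    """Recursively split series[start..end] until the line-fit error is below threshold."""
    if end - start < 2:
        return  # no interior samples left to split on
    y0, y1 = series[start], series[end]
    worst_index, worst_error = start, 0.0
    for i in range(start + 1, end):
        # Value of the straight line joining the two end points at position i.
        line = y0 + (y1 - y0) * (i - start) / (end - start)
        error = abs(series[i] - line)
        if error > worst_error:
            worst_index, worst_error = i, error
    if worst_error < threshold:
        return
    breaks.add(worst_index)
    split_segment(series, start, worst_index, threshold, breaks)
    split_segment(series, worst_index, end, threshold, breaks)


def piecewise_linear_breaks(series, threshold):
    """Return sorted break-point indices, always including both end points."""
    breaks = {0, len(series) - 1}
    split_segment(series, 0, len(series) - 1, threshold, breaks)
    return sorted(breaks)


# Hypothetical series: rises linearly to a peak at index 4, then falls linearly.
tent = [0, 1, 2, 3, 4, 3, 2, 1, 0]
breaks = piecewise_linear_breaks(tent, threshold=0.5)
```

Lowering the threshold yields more break points and a finer approximation, which is one way to read the coarse-to-fine processing mentioned in the abstract.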
3520 Lightweight Cryptographically Generated Address for IPv6 Neighbor Discovery

Authors: Amjed Sid Ahmed, Rosilah Hassan, Nor Effendy Othman

Abstract:

The limitations of the Internet Protocol version 4 (IPv4) have necessitated the development of the Internetworking Protocol next generation (IPng) to address them. The IPng is also referred to as the Internet Protocol version 6 (IPv6) and includes the Neighbor Discovery Protocol (NDP). The latter performs the roles of Address Auto-configuration, Router Discovery (RD), and Neighbor Discovery (ND). Furthermore, the role of the NDP entails redirecting the service, detecting duplicate addresses, and detecting unreachable services. Although NDP assumes that the nodes on a link trust each other, several crucial attacks may affect the protocol. The Internet Engineering Task Force (IETF) has therefore recommended implementation of the Secure Neighbor Discovery Protocol (SEND) to tackle safety issues in NDP. The SEND protocol is mainly used for validation of address rights, malicious response inhibiting techniques and router certification procedures. To perform these tasks, SEND utilizes the following options: Cryptographically Generated Address (CGA), RSA Signature, Nonce and Timestamp. CGA is produced at very high cost, which is the most notable disadvantage of SEND. This paper gives a clear description of the constituents of CGA and its operation, together with recommendations for improving its generation.

Keywords: CGA, IPv6, NDP, SEND

Procedia PDF Downloads 364
3519 First-Principles Calculations of Hydrogen Adsorbed in Multi-Layer Graphene

Authors: Mohammad Shafiul Alam, Mineo Saito

Abstract:

Graphene-based materials have attracted much attention because they are candidates for post-silicon materials. Since control of impurities is necessary to achieve nano devices, we study hydrogen impurities in multi-layer graphene. We perform local spin density approximation (LSDA) calculations in which a plane wave basis set and pseudopotentials are used. Hydrogen monomers and dimers in graphene have previously been well studied theoretically. However, hydrogen on multilayer graphene is still not well understood. Using first-principles electronic structure calculations based on the LSDA within density functional theory, we studied hydrogen monomers and dimers in two-layer graphene. We found that the monomers are spin-polarized and have a magnetic moment of 1 µB. We also found that the most stable dimer is much more stable than the monomer. In the most stable dimer structures in two-layer graphene, the two hydrogen atoms are bonded to host carbon atoms that are nearest neighbors. In this case, the two hydrogen atoms are located on opposite sides. In contrast, when the two hydrogen atoms are bonded to the same sublattice of the host material, magnetic moments of 2 µB appear in two-layer graphene. We found that when the two hydrogen atoms are bonded to third-nearest-neighbor carbon atoms, the electronic structure is nonmagnetic. We also studied hydrogen monomers and dimers in three-layer graphene. The results are the same as those for two-layer graphene. These results are very important in the field of carbon nanomaterials, as it is experimentally difficult to show the magnetic state of those materials.

Keywords: first-principles calculations, LSDA, multi-layer graphene, nanomaterials

Procedia PDF Downloads 308
3518 A Relative Entropy Regularization Approach for Fuzzy C-Means Clustering Problem

Authors: Ouafa Amira, Jiangshe Zhang

Abstract:

Clustering is an unsupervised machine learning technique; its aim is to extract the structure of the data so that similar data objects are grouped in the same cluster, whereas dissimilar objects are grouped in different clusters. Clustering methods are widely utilized in different fields, such as image processing, computer vision, and pattern recognition. Fuzzy c-means clustering (FCM) is one of the most well-known fuzzy clustering methods. It is based on solving an optimization problem in which a given cost function is minimized. This minimization aims to decrease the dissimilarity inside clusters, where the dissimilarity is measured by the distances between data objects and cluster centers. The degree to which a data point belongs to a cluster is measured by a membership function that takes values in the interval [0, 1]. In FCM clustering, the membership degree is constrained by the condition that the sum of a data object’s memberships in all clusters must be equal to one. This constraint can cause several problems, especially when the data objects lie in a noisy space. Regularization approaches have been applied to the fuzzy c-means clustering technique. Regularization introduces additional information in order to solve an ill-posed optimization problem. In this study, we focus on regularization by relative entropy, where our optimization problem aims to minimize the dissimilarity inside clusters. Finding an appropriate membership degree for each data object is our objective, because an appropriate membership degree leads to an accurate clustering result. Our clustering results on synthetic data sets, Gaussian-based data sets, and real-world data sets show that our proposed model achieves good accuracy.

Keywords: clustering, fuzzy c-means, regularization, relative entropy

Procedia PDF Downloads 242
3517 The Influence of 3D Printing Course on Middle School Students' Spatial Thinking Ability

Authors: Wang Xingjuan, Qian Dongming

Abstract:

As a common thinking ability, spatial thinking plays an increasingly important role in the information age. The key to cultivating students' spatial thinking ability is to cultivate their ability to process and transform graphics. The 3D printing course lets students constantly work with the rotation and movement of objects during the modeling process and understand spatial graphics from different views. To this end, this article uses the classic PSVT:R test to explore the impact of 3D printing courses on the spatial thinking ability of middle school students. The study found that: (1) after taking the 3D printing course, the students' spatial ability test scores improved significantly, which indirectly reflects an improvement in their level of spatial thinking ability; (2) the students' spatial thinking ability test results are influenced by their parents' occupations.

Keywords: 3D printing, middle school students, spatial thinking ability, influence

Procedia PDF Downloads 157
3516 Classification of Red, Green and Blue Values from Face Images Using k-NN Classifier to Predict the Skin or Non-Skin

Authors: Kemal Polat

Abstract:

In this study, we estimate whether skin is present by using RGB values obtained from the camera and a k-nearest neighbor (k-NN) classifier. The dataset used in this study has an unbalanced distribution and a linearly non-separable structure. This problem can also be called a big data problem. The Skin dataset was taken from the UCI machine learning repository. As the classifier, we used the k-NN method to handle this big data problem, with k set to 1. To train and test the k-NN classifier, a 50-50% training-testing partition was used. TP rate, FP rate, precision, recall, F-measure, and AUC were used as performance metrics to evaluate the k-NN classifier. The obtained results are, respectively: 0.999, 0.001, 0.999, 0.999, 0.999, and 1.00. As can be seen from these results, the proposed method could be used to predict whether the image is skin or not.
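The evaluation setup described above (1-nearest-neighbor, a 50-50 split, and rate-based metrics) can be sketched as follows. This is an illustrative example on synthetic, well-separated RGB-like points, not the UCI Skin dataset, so it does not reproduce the reported numbers. The function names and the toy cluster centers are assumptions.

```python
import numpy as np

def one_nn_predict(X_train, y_train, X_test):
    """Label each test point with the label of its nearest training point (k = 1)."""
    diffs = X_test[:, None, :] - X_train[None, :, :]
    d2 = (diffs ** 2).sum(axis=2)  # squared Euclidean distances
    return y_train[d2.argmin(axis=1)]

def binary_metrics(y_true, y_pred):
    """TP rate (recall), FP rate, precision and F-measure for labels in {0, 1}."""
    tp = int(((y_true == 1) & (y_pred == 1)).sum())
    fp = int(((y_true == 0) & (y_pred == 1)).sum())
    fn = int(((y_true == 1) & (y_pred == 0)).sum())
    tn = int(((y_true == 0) & (y_pred == 0)).sum())
    tp_rate = tp / (tp + fn) if tp + fn else 0.0
    fp_rate = fp / (fp + tn) if fp + tn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + tp_rate
    f_measure = 2 * precision * tp_rate / denom if denom else 0.0
    return {"tp_rate": tp_rate, "fp_rate": fp_rate,
            "precision": precision, "recall": tp_rate, "f_measure": f_measure}

def make_toy_rgb(n_per_class=40, seed=0):
    """Hypothetical toy data: two Gaussian blobs in RGB space (1 = skin, 0 = non-skin)."""
    rng = np.random.default_rng(seed)
    skin = rng.normal([200.0, 150.0, 130.0], 5.0, size=(n_per_class, 3))
    other = rng.normal([40.0, 60.0, 80.0], 5.0, size=(n_per_class, 3))
    X = np.vstack([skin, other])
    y = np.array([1] * n_per_class + [0] * n_per_class)
    perm = rng.permutation(len(y))  # shuffle before splitting
    return X[perm], y[perm]

if __name__ == "__main__":
    X, y = make_toy_rgb()
    half = len(y) // 2  # 50-50 training-testing partition
    pred = one_nn_predict(X[:half], y[:half], X[half:])
    print(binary_metrics(y[half:], pred))
```

The brute-force distance matrix is fine for a toy example; a dataset of the size the abstract describes would normally call for a spatial index or batched distance computation.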

Keywords: k-NN classifier, skin or non-skin classification, RGB values, classification

Procedia PDF Downloads 223
3515 Multi-Criteria Decision Support System for Modeling of Civic Facilities Using GIS Applications: A Case Study of F-11, Islamabad

Authors: Asma Shaheen Hashmi, Omer Riaz, Khalid Mahmood, Fahad Ullah, Tanveer Ahmad

Abstract:

Urban landscapes are changing with population growth and advances in technology. Urban sprawl patterns and land uses are related to local socioeconomic and physical conditions. Urban policy decisions are executed mostly through spatial planning. A decision support system (DSS) is a very powerful tool that provides a flexible knowledge-base method for urban planning. An application was developed using a geographical information system (GIS) for urban planning. A scenario-based DSS was developed to integrate hierarchical multi-criteria data on different aspects of the urban landscape. These were the physical environment, the dumping site, the spatial distribution of the road network, gas and water supply lines, urban watershed management, and selection criteria for new residential, recreational, commercial and industrial sites. The model provided a framework to incorporate sustainable future development. The data can be entered dynamically by planners according to the appropriate criteria for the management of urban landscapes.

Keywords: urban, GIS, spatial, criteria

Procedia PDF Downloads 607
3514 Unsteady Three-Dimensional Adaptive Spatial-Temporal Multi-Scale Direct Simulation Monte Carlo Solver to Simulate Rarefied Gas Flows in Micro/Nano Devices

Authors: Mirvat Shamseddine, Issam Lakkis

Abstract:

We present an efficient, three-dimensional parallel multi-scale Direct Simulation Monte Carlo (DSMC) algorithm for the simulation of unsteady rarefied gas flows in micro/nanosystems. The algorithm employs a novel spatiotemporal adaptivity scheme. The scheme performs a fully dynamic multi-level grid adaption based on the gradients of flow macro-parameters and an automatic temporal adaptation. The computational domain consists of a hierarchical octree-based Cartesian grid representation of the flow domain and a triangular mesh for the solid object surfaces. The hybrid mesh, combined with the spatiotemporal adaptivity scheme, allows for increased flexibility and efficient data management, rendering the framework suitable for efficient particle-tracing and dynamic grid refinement and coarsening. The parallel algorithm is optimized to run DSMC simulations of strongly unsteady, non-equilibrium flows over multiple cores. The presented method is validated by comparing with benchmark studies and then employed to improve the design of micro-scale hotwire thermal sensors in rarefied gas flows.

Keywords: DSMC, oct-tree hierarchical grid, ray tracing, spatial-temporal adaptivity scheme, unsteady rarefied gas flows

Procedia PDF Downloads 280
3513 Short Association Bundle Atlas for Lateralization Studies from dMRI Data

Authors: C. Román, M. Guevara, P. Salas, D. Duclap, J. Houenou, C. Poupon, J. F. Mangin, P. Guevara

Abstract:

Diffusion Magnetic Resonance Imaging (dMRI) allows the non-invasive study of human brain white matter. From diffusion data, it is possible to reconstruct fiber trajectories using tractography algorithms. Our previous work consists of an automatic method for the identification of short association bundles of the superficial white matter (SWM), based on a whole-brain inter-subject hierarchical clustering applied to a HARDI database. The method finds representative clusters of similar fibers, belonging to a group of subjects, according to a distance measure between fibers, using a non-linear registration (DTI-TK). The algorithm performs an automatic labeling based on the anatomy, defined by a cortex mesh parcellated with FreeSurfer software. The clustering was applied to two independent groups of 37 subjects. The clusters resulting from both groups were compared using a restrictive threshold of mean distance between each pair of bundles from different groups, in order to keep reproducible connections. In the left hemisphere, 48 reproducible bundles were found, while 43 bundles were found in the right hemisphere. An inter-hemispheric bundle correspondence was then applied. The symmetric horizontal reflection of the right bundles was calculated in order to obtain their position in the left hemisphere. Next, the intersection between similar bundles was calculated. The pairs of bundles with a fiber intersection percentage higher than 50% were considered similar. The similar bundles between both hemispheres were fused and symmetrized. We obtained 30 common bundles between hemispheres. An atlas was created with the resulting bundles and used to segment 78 new subjects from another HARDI database, using a distance threshold of 6-8 mm according to the bundle length. Finally, a laterality index was calculated based on the bundle volume. Seven bundles of the atlas presented right laterality (IP_SP_1i, LO_LO_1i, Op_Tr_0i, PoC_PoC_0i, PoC_PreC_2i, PreC_SM_0i, and RoMF_RoMF_0i) and one presented left laterality (IP_SP_2i); there is no tendency of lateralization according to the brain region. Many factors can affect the results, such as tractography artifacts, subject registration, and bundle segmentation. Further studies are necessary in order to establish the influence of these factors and evaluate SWM laterality.

Keywords: dMRI, hierarchical clustering, lateralization index, tractography

Procedia PDF Downloads 303
3512 Economics of Conflict: Core Economic Dimensions of the Georgian-South Ossetian Context

Authors: V. Charaia

Abstract:

This article presents a SWOT analysis of the Georgian-South Ossetian conflict. The research analyzes socio-economic aspects and considers future prospects for all sides, including neighboring countries and regions. It also covers the possibilities for positive intervention by neighboring countries to solve the conflict or to mitigate its negative results. The main question of the article is: What will it take to award Georgians and South Ossetians with a peace dividend?

Keywords: conflict economics, investments, trade, remittances

Procedia PDF Downloads 211
3511 Recurrent Neural Networks with Deep Hierarchical Mixed Structures for Chinese Document Classification

Authors: Zhaoxin Luo, Michael Zhu

Abstract:

Natural languages always contain complex semantic hierarchies. Obtaining a feature representation based on these hierarchies is key to the success of a model. Several RNN models have recently been proposed that use latent indicators to obtain the hierarchical structure of documents. However, a model that uses only a single-layer latent indicator cannot capture the true hierarchical structure of a language, especially a complex language like Chinese. In this paper, we propose a deep layered model that stacks arbitrarily many RNN layers equipped with latent indicators. By using EM and training hierarchically, our model solves the computational problem of stacking RNN layers and makes it possible to stack arbitrarily many of them. Our deep hierarchical model not only achieves results comparable to large pre-trained models on the Chinese short text classification problem but also achieves state-of-the-art results on the Chinese long text classification problem.

Keywords: natural language processing, recurrent neural network, hierarchical structure, document classification, Chinese

Procedia PDF Downloads 40
3510 Max-Entropy Feed-Forward Clustering Neural Network

Authors: Xiaohan Bookman, Xiaoyan Zhu

Abstract:

The outputs of a non-linear feed-forward neural network are positive and can be treated as probabilities when they are normalized to sum to one. If we take the Entropy-Based Principle into consideration, the outputs for each sample can be interpreted as the distribution of that sample over the clusters. The Entropy-Based Principle is the principle by which an unknown distribution can be estimated under some limited conditions. This paper defines two processes in the feed-forward neural network: our limited conditions are the abstracted features of the samples, which are worked out in the abstraction process, and the final outputs are the probability distributions over clusters, produced in the clustering process. Incorporating the Entropy-Based Principle into the feed-forward neural network yields a clustering method. We conducted experiments on six open UCI data sets, comparing against a few baselines and using purity as the measure. The results show that our method outperforms all of the baselines, which are among the most popular clustering methods.
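To make two ideas from the abstract concrete, the following sketch shows how positive outputs can be normalized into a per-sample distribution over clusters, and how purity is commonly computed. It is an illustration only and not the authors' network. The function names are assumptions, and purity follows its usual definition (the fraction of samples that belong to the majority class of their assigned cluster).

```python
from collections import Counter

def normalize_to_distribution(outputs):
    """Scale one sample's positive outputs so they sum to one."""
    total = sum(outputs)
    if total <= 0:
        raise ValueError("outputs must be positive")
    return [o / total for o in outputs]

def assign_cluster(outputs):
    """Pick the cluster with the highest normalized probability."""
    probs = normalize_to_distribution(outputs)
    return max(range(len(probs)), key=probs.__getitem__)

def purity(labels_true, labels_pred):
    """Fraction of samples in the majority true class of their cluster."""
    if len(labels_true) != len(labels_pred) or not labels_true:
        raise ValueError("label lists must be non-empty and of equal length")
    per_cluster = {}
    for truth, cluster in zip(labels_true, labels_pred):
        per_cluster.setdefault(cluster, Counter())[truth] += 1
    majority_total = sum(max(c.values()) for c in per_cluster.values())
    return majority_total / len(labels_true)
```

Purity ignores how clusters are numbered, so a clustering that swaps the cluster indices of a perfect partition still scores 1.0.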

Keywords: feed-forward neural network, clustering, max-entropy principle, probabilistic models

Procedia PDF Downloads 411
3509 Clustering of Extremes in Financial Returns: A Comparison between Developed and Emerging Markets

Authors: Sara Ali Alokley, Mansour Saleh Albarrak

Abstract:

This paper investigates the dependency or clustering of extremes in the financial returns data by estimating the extremal index value θ∈[0,1]. The smaller the value of θ the more clustering we have. Here we apply the method of Ferro and Segers (2003) to estimate the extremal index for a range of threshold values. We compare the dependency structure of extremes in the developed and emerging markets. We use the financial returns of the stock market index in the developed markets of US, UK, France, Germany and Japan and the emerging markets of Brazil, Russia, India, China and Saudi Arabia. We expect that more clustering occurs in the emerging markets. This study will help to understand the dependency structure of the financial returns data.
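The intervals estimator of Ferro and Segers (2003) works on the gaps between successive threshold exceedances. The sketch below follows the standard form of that estimator as an illustration. The function name, the choice of threshold, and the toy series in the usage note are assumptions, and the code is not taken from the paper.

```python
import numpy as np

def intervals_estimator(x, threshold):
    """Intervals estimator of the extremal index (illustrative sketch).

    x: 1-D series of observations (e.g. returns or losses).
    threshold: level above which an observation counts as an exceedance.
    Returns an estimate of theta in (0, 1]; smaller values mean more clustering.
    """
    x = np.asarray(x, dtype=float)
    exceed_idx = np.flatnonzero(x > threshold)
    if exceed_idx.size < 2:
        raise ValueError("need at least two exceedances")
    # Inter-exceedance times T_1, ..., T_{N-1}.
    T = np.diff(exceed_idx).astype(float)
    n = T.size
    if T.max() <= 2:
        theta = 2.0 * T.sum() ** 2 / (n * (T ** 2).sum())
    else:
        # Bias-adjusted form used when some gap is longer than 2.
        theta = 2.0 * (T - 1.0).sum() ** 2 / (n * ((T - 1.0) * (T - 2.0)).sum())
    return min(1.0, theta)
```

For example, a series whose exceedances arrive in runs of three separated by long quiet stretches yields an estimate near 0.5, whereas evenly spaced exceedances yield 1.0. Repeating the call over a grid of thresholds gives the kind of threshold range the abstract mentions.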

Keywords: clustering, extremes, returns, dependency, extremal index

Procedia PDF Downloads 370
3508 Case-Based Reasoning for Build Order in Real-Time Strategy Games

Authors: Ben G. Weber, Michael Mateas

Abstract:

We present a case-based reasoning technique for selecting build orders in a real-time strategy game. The case retrieval process generalizes features of the game state and selects cases using domain-specific recall methods, which perform exact matching on a subset of the case features. We demonstrate the performance of the technique by implementing it as a component of the integrated agent framework of McCoy and Mateas. Our results demonstrate that the technique outperforms nearest-neighbor retrieval when imperfect information is enforced in a real-time strategy game.

Keywords: case based reasoning, real time strategy systems, requirements elicitation, requirement analyst, artificial intelligence

Procedia PDF Downloads 413
3507 An Energy Efficient Clustering Approach for Underwater Wireless Sensor Networks

Authors: Mohammad Reza Taherkhani

Abstract:

Wireless sensor networks used to monitor a particular environment are formed from a large number of sensor nodes. The role of these sensors is to sense particular parameters of the surrounding environment and to communicate them. In these networks, the most important challenge is the management of energy usage. Clustering is one of the methods broadly used to face this challenge. In this paper, a distributed clustering protocol based on learning automata is proposed for underwater wireless sensor networks. The proposed algorithm, called LA-Clustering, forms clusters at the same energy level, based on the energy level of the nodes and the connection radius, regardless of the size and structure of the sensor network. The proposed approach is simulated and compared with several other protocols on metrics such as network lifetime, number of alive nodes, and amount of transmitted data. The simulation results demonstrate the efficiency of the proposed approach.

Keywords: underwater sensor networks, clustering, learning automata, energy consumption

Procedia PDF Downloads 336