Search results for: Cluster dimension
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 721

691 Enhancing K-Means Algorithm with Initial Cluster Centers Derived from Data Partitioning along the Data Axis with the Highest Variance

Authors: S. Deelers, S. Auwatanamongkol

Abstract:

In this paper, we propose an algorithm to compute initial cluster centers for K-means clustering. Data in a cell is partitioned using a cutting plane that divides the cell into two smaller cells. The plane is perpendicular to the data axis with the highest variance and is designed to reduce the sum of squared errors of the two cells as much as possible while keeping the two cells as far apart as possible. Cells are partitioned one at a time until the number of cells equals the predefined number of clusters, K. The centers of the K cells become the initial cluster centers for K-means. The experimental results suggest that the proposed algorithm is effective and converges to better clustering results than the random initialization method. The research also indicates that the proposed algorithm greatly improves the likelihood of every cluster containing some data.
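The partitioning idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the rule for choosing which cell to split next (the cell with the largest sum of squared errors) and the cut criterion (lowest combined SSE only, with no explicit term for keeping the cells apart) are simplifying assumptions, and the names `sse`, `centroid`, `split_cell` and `initial_centers` are hypothetical.

```python
from statistics import fmean


def centroid(points):
    """Mean of a cell's points along every axis."""
    return [fmean(p[d] for p in points) for d in range(len(points[0]))]


def sse(points):
    """Sum of squared distances from each point to the cell centroid."""
    if not points:
        return 0.0
    center = centroid(points)
    return sum((p[d] - center[d]) ** 2
               for p in points for d in range(len(center)))


def split_cell(points):
    """Cut one cell with a plane perpendicular to its highest-variance axis."""
    center = centroid(points)
    axis = max(range(len(center)),
               key=lambda d: sum((p[d] - center[d]) ** 2 for p in points))
    ordered = sorted(points, key=lambda p: p[axis])
    # Assumption: try every cut between consecutive points and keep the one
    # with the lowest combined SSE of the two resulting cells.
    cut = min(range(1, len(ordered)),
              key=lambda i: sse(ordered[:i]) + sse(ordered[i:]))
    return ordered[:cut], ordered[cut:]


def initial_centers(points, k):
    """Split cells one at a time until there are k cells; return their centers."""
    cells = [list(points)]
    while len(cells) < k:
        splittable = [c for c in cells if len(c) > 1]
        if not splittable:
            break  # fewer distinct points than requested clusters
        # Assumption: the cell with the largest SSE is split next.
        target = max(splittable, key=sse)
        cells.remove(target)
        cells.extend(split_cell(target))
    return [centroid(c) for c in cells]
```

The returned centers would then be passed to an ordinary K-means run in place of randomly chosen starting points.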

Keywords: Clustering algorithm, K-means algorithm, Data partitioning, Initial cluster centers.

Downloads: 2815
690 A Review: Comparative Analysis of Different Categorical Data Clustering Ensemble Methods

Authors: S. Sarumathi, N. Shanthi, M. Sharmila

Abstract:

Over the past years, a large amount of work has been done in data clustering research under the unsupervised learning technique in data mining. Several algorithms and methods have been proposed that focus on clustering different data types, the representation of cluster models, and the accuracy rates of the clusters. However, no single clustering algorithm proves to be the most efficient in providing the best results. To address this issue, a new technique called the cluster ensemble method has emerged. The cluster ensemble is a good alternative approach to the cluster analysis problem. Its main aim is to merge different clustering solutions in a way that achieves accuracy and improves the quality of individual data clusterings. The substantial and continuing development of new methods in data mining, together with the sustained interest in inventing new algorithms, makes it necessary to critically analyze the existing techniques and future directions. This paper presents a comparative study of different cluster ensemble methods along with their features, their systematic working process, and the average accuracy and error rates of each ensemble method. This comprehensive analysis should be useful for the community of clustering practitioners and help in deciding on the most suitable method for the problem at hand.

Keywords: Clustering, Cluster Ensemble methods, Co-association matrix, Consensus function, Median partition.

Downloads: 2555
689 Evaluation of Groundwater Quality and Its Suitability for Drinking and Agricultural Purposes Using Self-Organizing Maps

Authors: L. Belkhiri, L. Mouni, A. Tiri, T.S. Narany

Abstract:

In the present study, the self-organizing map (SOM) clustering technique was applied to identify homogeneous clusters of hydrochemical parameters in El Milia plain, Algeria, to assess the quality of groundwater for potable and agricultural purposes. The visualization of the SOM analysis indicated that the 35 groundwater samples collected in the study area were classified into three clusters, which showed a progressive increase in electrical conductivity from cluster one to cluster three. Samples belonging to cluster one are mostly located in the recharge zone and show a hard fresh water type; however, the water type gradually changes to a hard-brackish type in the discharge zone, which includes clusters two and three. Ionic ratio studies indicated the role of carbonate rock dissolution in increasing groundwater hardness, especially in cluster one. However, evaporation and evapotranspiration are the main processes increasing salinity in clusters two and three.

Keywords: Drinking water, groundwater quality, irrigation water, self-organizing maps.

Downloads: 1185
688 Optimizing Hadoop Block Placement Policy and Cluster Blocks Distribution

Authors: Nchimbi Edward Pius, Liu Qin, Fion Yang, Zhu Hong Ming

Abstract:

The current Hadoop block placement policy does not fairly and evenly distribute replicas of blocks written to datanodes in a Hadoop cluster.

This paper presents a new solution that helps to keep the cluster in a balanced state while an HDFS client is writing data to a file in a Hadoop cluster. The solution has been implemented, and tests have been conducted to evaluate its contribution to the Hadoop distributed file system.

It was found that the solution lowered the global execution time taken by the Hadoop balancer to 22 percent. It was also found that the Hadoop balancer over-replicates 1.75 and 3.3 percent of all redistributed blocks in the modified and original Hadoop clusters, respectively.

The feature that keeps the cluster in a balanced state works as a core part of the Hadoop system and not just as a utility like the traditional balancer. This is one of the significant achievements and distinguishing features of the solution developed during the course of this research work.

Keywords: Balancer, Datanode, Distributed file system, Hadoop, Replicas.

Downloads: 4899
687 Graphs with Metric Dimension Two-A Characterization

Authors: Sudhakara G, Hemanth Kumar A.R

Abstract:

In this paper, we define the distance partition of the vertex set of a graph G with reference to a vertex in it and, with its help, characterize graphs with metric dimension two (i.e. β(G) = 2). In the process, we develop a polynomial time algorithm that verifies whether the metric dimension of a given graph G is two. The same algorithm finds all metric bases of G whenever β(G) = 2. We also find a bound for the cardinality of any distance partite set with reference to a given vertex whenever β(G) = 2. In addition, for a graph G with β(G) = 2, a bound for the cardinality of any distance partite set, as well as a bound for the number of vertices in any subgraph H of G, is obtained in terms of diam H.
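For readers unfamiliar with the notion, the definition of metric dimension two can be checked directly by brute force. The sketch below is not the distance-partition algorithm of the paper; it simply tests every pair of vertices as a candidate metric basis, assuming a connected, undirected graph given as an adjacency dictionary. The function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def resolves(adj, dist, landmarks):
    """True if every vertex has a distinct distance vector to the landmarks."""
    codes = {tuple(dist[w][v] for w in landmarks) for v in adj}
    return len(codes) == len(adj)


def metric_bases_of_size_two(adj):
    """All vertex pairs that form a metric basis, or [] if β(G) != 2.

    Assumes adj describes a connected, undirected graph.
    """
    dist = {v: bfs_distances(adj, v) for v in adj}
    if any(resolves(adj, dist, (v,)) for v in adj):
        return []  # a single vertex already resolves G, so β(G) = 1
    return [pair for pair in combinations(adj, 2) if resolves(adj, dist, pair)]
```

For the 4-cycle, adjacent vertex pairs resolve the graph while antipodal pairs do not, so the function returns the four adjacent pairs; for a path it returns an empty list because one end vertex already suffices.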

Keywords: Metric basis, Distance partition, Metric dimension.

Downloads: 1808
686 Clustering Unstructured Text Documents Using Fading Function

Authors: Pallav Roxy, Durga Toshniwal

Abstract:

Clustering unstructured text documents is an important issue in the data mining community and has a number of applications, such as document archive filtering, document organization, topic detection and subject tracing. In the real world, some of the already clustered documents may lose importance while new documents of greater significance may emerge. Most of the work done so far in clustering unstructured text documents overlooks this aspect of clustering. This paper addresses the issue by using the Fading Function. The unstructured text documents are clustered, and for each cluster a statistics structure called the Cluster Profile (CP) is maintained. The Cluster Profile incorporates the Fading Function, which keeps account of the time-dependent importance of the cluster. The work proposes a novel algorithm, the Clustering n-ary Merge Algorithm (CnMA), for unstructured text documents that uses the Cluster Profile and the Fading Function. Experimental results illustrating the effectiveness of the proposed technique are also included.

Keywords: Clustering, Text Mining, Unstructured Text Documents, Fading Function.

Downloads: 1942
685 Cluster Analysis of Customer Churn in Telecom Industry

Authors: Abbas Al-Refaie

Abstract:

The research examines the factors that affect customer churn (CC) in the Jordanian telecom industry. A total of 700 surveys were distributed. Cluster analysis revealed three main clusters. Results showed that CC and customer satisfaction (CS) were the key determinants in forming the three clusters. In two clusters, the center values of CC were high, indicating that the customers were loyal and that the switching cost (SC) was expensive and time- and energy-consuming. Still, the mobile service provider (MSP) should enhance its communication (COM) and value added services (VASs), as well as its customer complaint management systems (CCMS). Finally, for the third cluster, the center of CC indicates a poor level of loyalty, which makes it easier for customers to churn to another MSP. The results of this study provide valuable feedback for MSP decision makers regarding approaches to improving their performance and reducing CC.

Keywords: Cluster analysis, telecom industry, switching cost, customer churn.

Downloads: 2479
684 Game Theory Based Diligent Energy Utilization Algorithm for Routing in Wireless Sensor Network

Authors: X. Mercilin Raajini, R. Raja Kumar, P. Indumathi, V. Praveen

Abstract:

Many cluster-based routing protocols have been proposed in the field of wireless sensor networks, in which groups of nodes are formed into clusters. A cluster head is selected from among those nodes based on residual energy, coverage area and number of hops. The cluster head gathers data from the various sensor nodes and forwards the aggregated data to the base station or to a relay node (another cluster head), which forwards the packet along with its own data packet to the base station. Here, a Game Theory based Diligent Energy Utilization Algorithm (GTDEA) for routing is proposed. In GTDEA, cluster head selection is done with the help of game theory, a decision making process that selects a cluster head based on three parameters: residual energy (RE), Received Signal Strength Index (RSSI) and Packet Reception Rate (PRR). Finding a feasible path to the destination with minimum utilization of available energy improves the network lifetime and is achieved by the proposed approach. In GTDEA, the packets are forwarded towards the base station using an inter-cluster routing technique, in which relaying cluster heads pass them on to the base station. Simulation results reveal that GTDEA improves the network performance in terms of throughput, lifetime, and power consumption.

Keywords: Cluster head, Energy utilization, Game Theory, LEACH, Sensor network.

Downloads: 1854
683 A Comprehensive Review on Different Mixed Data Clustering Ensemble Methods

Authors: S. Sarumathi, N. Shanthi, S. Vidhya, M. Sharmila

Abstract:

An extensive amount of work has been done in data clustering research under the unsupervised learning technique in data mining during the past two decades. Moreover, several approaches and methods have emerged that focus on clustering diverse data types, features of cluster models and similarity rates of clusters. However, no single clustering algorithm performs best at extracting efficient clusters. To address this issue, a new technique called the Cluster Ensemble method has emerged. This approach serves as an alternative method for the cluster analysis problem. The main objective of the Cluster Ensemble is to aggregate diverse clustering solutions in a way that attains accuracy and also improves the quality of the individual clustering algorithms. Due to the massive and rapid development of new methods in the field of data mining, it is necessary to critically analyze the existing techniques and future directions. This paper presents a comparative analysis of different cluster ensemble methods along with their methodologies and salient features. This analysis should be useful for the community of clustering experts and help in deciding on the most appropriate method to resolve the problem at hand.

Keywords: Clustering, Cluster Ensemble Methods, Co-association matrix, Consensus Function, Median Partition.

Downloads: 2069
682 K-Means for Spherical Clusters with Large Variance in Sizes

Authors: A. M. Fahim, G. Saake, A. M. Salem, F. A. Torkey, M. A. Ramadan

Abstract:

Data clustering is an important data exploration technique with many applications in data mining. The k-means algorithm is well known for its efficiency in clustering large data sets. However, this algorithm is suitable for spherical-shaped clusters of similar sizes and densities. The quality of the resulting clusters decreases when the data set contains spherical-shaped clusters with large variance in size. In this paper, we introduce a competent procedure to overcome this problem. The proposed method is based on shifting the center of the large cluster toward the small cluster and recomputing the membership of the small cluster's points. The experimental results reveal that the proposed algorithm produces satisfactory results.

Keywords: K-Means, Data Clustering, Cluster Analysis.

Downloads: 3241
681 Temporal Change of Fractal Dimension of Explosion Earthquakes and Harmonic Tremors at Semeru Volcano, East Java, Indonesia, using Critical Exponent Method

Authors: Sukir Maryanto, Iyan Mulyana

Abstract:

Fractal analyses of successive events of explosion earthquakes and harmonic tremor recorded at Semeru volcano were carried out to investigate the dynamical system with regard to their generating mechanism. The explosive eruptions are accompanied by explosion earthquakes and subsequent volcanic tremor, which are generated by continuous emission of volcanic ash. The fractal dimension of successive events of explosion earthquakes and harmonic tremor was estimated by the Critical Exponent Method (CEM). It was found that the method yields a higher fractal dimension for explosion earthquakes, which gradually decreases during the occurrence of harmonic tremor, and that the variation in fractal dimension can be considered to be correlated with the complexity of the source mechanism.

Keywords: Fractal dimension, Semeru volcano, explosion earthquake, harmonic tremor, Critical Exponent Method

Downloads: 1717
680 Effect of Particle Gravity on the Fractal Dimension of Particle Line in three-dimensional Turbulent Flows using Kinematic Simulation

Authors: A. Abou El-Azm Aly, F. Nicolleau, T. M. Michelitsch, A. F. Nowakowski

Abstract:

In this study, the dispersion of a line of heavy particles in an isotropic and incompressible three-dimensional turbulent flow has been studied using Kinematic Simulation techniques to determine the evolution of the fractal dimension of the line. The fractal dimension of the line is found for different particle gravities (in practice, different values of particle drift velocity) in the presence of small particle inertia, and is compared with that obtained in the diffusion case of a material line at the same Reynolds number. It can be concluded that, for the dispersion of a line of heavy particles in turbulent flow, the particle gravity affects the fractal dimension of the line for particle drift velocities in the range 0.2 < W < 2. As the particle drift velocity increases, the fractal dimension of the line decreases. This may be explained by the fact that the particles pass through many scales as they travel in the direction of gravity, and at high particle drift velocities their trajectories are not affected by these scales.

Keywords: Heavy particles, two-phase flow, Kinematic Simulation, Fractal dimension.

Downloads: 1395
679 Dual-Link Hierarchical Cluster-Based Interconnect Architecture for 3D Network on Chip

Authors: Guang Sun, Yong Li, Yuanyuan Zhang, Shijun Lin, Li Su, Depeng Jin, Lieguang Zeng

Abstract:

Network on Chip (NoC) has emerged as a promising on-chip communication infrastructure. The Three Dimensional Integrated Circuit (3D IC) provides short interconnection lengths between layers and interconnect scalability in the third dimension, which can further improve the performance of NoC. Therefore, in this paper, a hierarchical cluster-based interconnect architecture is merged with the 3D IC. This interconnect architecture significantly reduces the number of long wires. Since this architecture has only approximately a quarter of the routers in a 3D mesh-based architecture, the average number of hops is smaller, which leads to lower latency and higher throughput. Moreover, the smaller number of routers decreases the area overhead. In addition, some dual links are inserted at the communication bottlenecks to improve the performance of NoC. Simulation results support our theoretical analysis and show the advantages of our proposed architecture in latency, throughput and area when compared with the 3D mesh-based architecture.

Keywords: Network on Chip (NoC), interconnect architecture, performance, area, Three Dimensional Integrated Circuit (3D IC).

Downloads: 1478
678 Two-Photon Ionization of Silver Clusters

Authors: V. Paployan, K. Madoyan, A. Melikyan, H. Minassian

Abstract:

In this paper, we calculate the two-photon ionization (TPI) cross-section for a pump-probe scheme in a neutral Ag cluster. The pump photon energy is assumed to be close to the surface plasmon (SP) energy of the cluster in dielectric media. Due to this choice, the pump wave excites collective oscillations of electrons (the SP) and the probe wave causes ionization of the cluster. Since the interband transition energy in Ag exceeds the SP resonance energy, the main contribution to the TPI comes from the latter. The advantage of Ag clusters compared to the other noble metals is that the SP resonance in a silver cluster is much sharper because of the peculiarities of its dielectric function. The calculations are performed by separating the coordinates of electrons corresponding to the collective oscillations from those of the individual motion, which allows the resonance contribution of excited SP oscillations to be taken into account. It is shown that the ionization cross-section increases by two orders of magnitude if the energy of the pump photon matches the surface plasmon energy in the cluster.

Keywords: Resonance enhancement, silver clusters, surface plasmon, two-photon ionization.

Downloads: 1433
677 Network of Coupled Stochastic Oscillators and One-way Quantum Computations

Authors: Eugene Grichuk, Margarita Kuzmina, Eduard Manykin

Abstract:

A network of coupled stochastic oscillators is proposed for modeling a cluster of entangled qubits that is exploited as a computation resource in one-way quantum computation schemes. A qubit model has been designed as a stochastic oscillator formed by a pair of coupled limit cycle oscillators with chaotically modulated limit cycle radii and frequencies. The qubit simulates the behavior of the electric field of a polarized light beam and adequately imitates the states of a two-level quantum system. A cluster of entangled qubits can be associated with a beam of polarized light, the degree of light polarization being directly related to the degree of cluster entanglement. An oscillatory network imitating a qubit cluster is designed, and a system of equations for the network dynamics has been written. Constructions of one-qubit gates are suggested. The change in the degree of cluster entanglement caused by measurements can be calculated exactly.

Keywords: network of stochastic oscillators, one-way quantum computations, a beam of polarized light.

Downloads: 1361
676 Performance Comparison of Parallel Sorting Algorithms on the Cluster of Workstations

Authors: Lai Lai Win Kyi, Nay Min Tun

Abstract:

Sorting has received the most attention among all computational tasks over the past years because sorted data is at the heart of many computations. Sorting is of additional importance to parallel computing because of its close relation to the task of routing data among processes, which is an essential part of many parallel algorithms. Many parallel sorting algorithms have been investigated for a variety of parallel computer architectures. In this paper, three parallel sorting algorithms have been implemented and compared in terms of their overall execution time. The algorithms implemented are the odd-even transposition sort, parallel merge sort and parallel rank sort. A Cluster of Workstations, or Windows Compute Cluster, has been used to compare the implemented algorithms. The C# programming language is used to develop the sorting algorithms. The MPI (Message Passing Interface) library has been selected to establish the communication and synchronization between processors. The time complexity of each parallel sorting algorithm is also discussed and analyzed.
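Of the three algorithms named above, odd-even transposition sort is the simplest to show. The sketch below is a sequential simulation in Python, not the C#/MPI implementation of the paper: each pass of the outer loop stands for one parallel phase in which, on a real cluster, every compare-exchange would run on a different pair of neighbouring processes at the same time.

```python
def odd_even_transposition_sort(values):
    """Sequential simulation of odd-even transposition sort.

    Even phases compare index pairs (0,1), (2,3), ...; odd phases compare
    (1,2), (3,4), .... Within one phase the pairs are disjoint, so in a
    parallel setting they could all be processed simultaneously.
    """
    data = list(values)
    n = len(data)
    for phase in range(n):  # n phases are enough to sort n elements
        start = phase % 2
        for i in range(start, n - 1, 2):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data
```

With one element per process, the n phases give a parallel running time proportional to n, compared with the n² compare-exchanges performed by this sequential simulation.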

Keywords: Cluster of Workstations, Parallel sorting algorithms, performance analysis, parallel computing and MPI.

Downloads: 1440
675 A New Dimension of Business Intelligence: Location-based Intelligence

Authors: Zeljko Panian

Abstract:

Through the course of this paper we define Location-based Intelligence (LBI), which emerges from the amalgamation of geolocation and Business Intelligence. Amalgamating geolocation with traditional Business Intelligence (BI) results in a new dimension of BI named Location-based Intelligence. LBI is defined as leveraging unified location information for business intelligence. Collectively, enterprises can transform location data into business intelligence applications that will benefit all aspects of the enterprise. Expectations of this new dimension of business intelligence are high, and its future looks bright.

Keywords: Business intelligence, geolocation, location-based intelligence, innovation, location-intelligent business

Downloads: 2137
674 Parallelization and Optimization of SIFT Feature Extraction on Cluster System

Authors: Mingling Zheng, Zhenlong Song, Ke Xu, Hengzhu Liu

Abstract:

Scale Invariant Feature Transform (SIFT) has been widely applied, but extracting SIFT features is complicated and time-consuming. In this paper, to meet the demands of real-time applications, SIFT is parallelized and optimized on a cluster system; the resulting implementation is named pSIFT. Redundant storage and communication are used for boundary data to improve performance, and before feature descriptor representation, data reallocation is adopted to keep the load balanced in pSIFT. Experimental results show that pSIFT achieves good speedup and scalability.

Keywords: cluster, image matching, parallelization and optimization, SIFT.

Downloads: 1824
673 Color Image Segmentation Using Competitive and Cooperative Learning Approach

Authors: Yinggan Tang, Xinping Guan

Abstract:

Color image segmentation can be considered a clustering procedure in feature space. k-means and its adaptive version, the competitive learning approach, are powerful tools for data clustering. However, k-means and competitive learning suffer from several drawbacks, such as the dead-unit problem and the need to pre-specify the number of clusters. In this paper, we explore the use of the competitive and cooperative learning (CCL) approach to perform color image segmentation. In CCL, seed points not only compete with each other, but the winner also dynamically selects several nearest competitors to form a cooperative team that adapts to the input together. As a result, the approach can automatically select the correct number of clusters and avoid the dead-unit problem. Experimental results show that CCL can obtain better segmentation results.

Keywords: Color image segmentation, competitive learning, cluster, k-means algorithm, competitive and cooperative learning.

Downloads: 1572
672 A Text Clustering System based on k-means Type Subspace Clustering and Ontology

Authors: Liping Jing, Michael K. Ng, Xinhua Yang, Joshua Zhexue Huang

Abstract:

This paper presents a text clustering system based on a k-means type subspace clustering algorithm for clustering large, high dimensional and sparse text data. In this algorithm, a new step is added to the k-means clustering process to automatically calculate the weights of keywords in each cluster so that the important words of a cluster can be identified by their weight values. To aid the understanding and interpretation of clustering results, a few keywords that best represent the semantic topic are extracted from each cluster. Two methods are used to extract the representative words. The candidate words are first selected according to their weights as calculated by our new algorithm. Then, the candidates are fed to WordNet to identify the set of noun words and to consolidate synonyms and hyponyms. Experimental results have shown that the clustering algorithm is superior to other subspace clustering algorithms, such as PROCLUS and HARP, and to k-means type algorithms, e.g., Bisecting-KMeans. Furthermore, the word extraction method is effective in selecting the words that represent the topics of the clusters.

Keywords: Subspace Clustering, Text Mining, Feature Weighting, Cluster Interpretation, Ontology

Downloads: 2412
671 A Neural-Network-Based Fault Diagnosis Approach for Analog Circuits by Using Wavelet Transformation and Fractal Dimension as a Preprocessor

Authors: Wenji Zhu, Yigang He

Abstract:

This paper presents a new method of analog fault diagnosis based on back-propagation neural networks (BPNNs) using wavelet decomposition and fractal dimension as preprocessors. The proposed method is able to detect and identify faulty components in an analog electronic circuit with tolerance by analyzing its impulse response. Using wavelet decomposition to preprocess the impulse response drastically de-noises the inputs to the neural network. The second preprocessing step, based on fractal dimension, extracts unique features, which are then fed to a neural network as inputs for further classification. A comparison of our work with [1] and [6], which also employ back-propagation (BP) neural networks, reveals that our system requires a much smaller network and performs significantly better in fault diagnosis of analog circuits due to our proposed preprocessing techniques.

Keywords: Analog circuits, fault diagnosis, tolerance, wavelet transform, fractal dimension, box dimension.

Downloads: 2151
670 Marketing Segmentation of Students Willing to Study Abroad based on Cluster Analysis

Authors: Kamila Tislerova, Marta Zambochova

Abstract:

Market segmentation is one of the most fundamental strategic marketing concepts. The better the segment chosen for targeting by a particular organisation, the more successful the organisation is assumed to be in the marketplace. Higher education institutions also have to improve their marketing tools for attracting foreign students, particularly when charging tuition fees. This contribution aims to demonstrate the proper use of cluster analysis for segmentation (represented by students' willingness to study abroad) and, based on a large international survey, offers some practical marketing implications.

Keywords: Market Segmentation, Students' Preferences, Study Abroad, Cluster Analysis

Downloads: 2176
669 Formosa3: A Cloud-Enabled HPC Cluster in NCHC

Authors: Chin-Hung Li, Te-Ming Chen, Ying-Chuan Chen, Shuen-Tai Wang

Abstract:

This paper proposes a new approach to offering a private cloud service in HPC clusters. In particular, our approach relies on automatically scheduling users' customized environment requests as normal jobs in the batch system. After the virtualization request jobs finish, the guest operating systems are dismissed so that the compute nodes are released again for computing. We present initial work on the integration of an HPC batch system and virtualization tools that aims at coexistence while keeping interference to the minimum required by a traditional HPC cluster. Given the design of the initial infrastructure, the proposed effort has the potential to positively impact the synergy model. The experimental results indicate that the goal of provisioning a customized cluster environment can indeed be fulfilled by using virtual machines, and that efficiency can be improved with proper setup and arrangement.

Keywords: Cloud Computing, HPC Cluster, Private Cloud, Virtualization

Downloads: 1992
668 Analysis of Long-Term File System Activities on Cluster Systems

Authors: Hyeyoung Cho, Sungho Kim, Sik Lee

Abstract:

I/O workload is a critical factor in analyzing I/O patterns and maximizing file system performance. However, measuring I/O workload on a running distributed parallel file system is non-trivial due to collection overhead and the large volume of data. In this paper, we measured and analyzed file system activities on two large-scale cluster systems with TFlops-level high performance computation resources. By comparing file system activities in 2009 with those in 2006, we analyzed how I/O workloads changed with the development of system performance and high-speed network technology.

Keywords: I/O workload, Lustre, GPFS, Cluster File System

Downloads: 1413
667 Rigid Registration of Reduced Dimension Images using 1D Binary Projections

Authors: Panos D. Kotsas, Tony Dodd

Abstract:

The purpose of this work is to present a method for rigid registration of medical images using 1D binary projections when a part of one of the two images is missing. We use 1D binary projections and adjust the projection limits according to the reduced image in order to perform accurate registration. We use the variance of the weighted ratio as a registration function, which we have shown is able to register 2D and 3D images more accurately and robustly than mutual information methods. The function is computed explicitly for n=5 Chebyshev points in a [-9,+9] interval and is approximated using Chebyshev polynomials for all other points. The images used are MR scans of the head. We find that the method is able to register the two images with an average accuracy of 0.3 degrees for rotations and 0.2 pixels for translations for a y dimension of 156 with an initial dimension of 256. For a y dimension of 128/256, the accuracy decreases to 0.7 degrees for rotations and 0.6 pixels for translations.
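The sampling scheme mentioned in the abstract (explicit evaluation at n=5 Chebyshev points on [-9, +9], Chebyshev approximation elsewhere) can be sketched as follows. The abstract does not say which family of Chebyshev points is used, so the roots of the degree-n Chebyshev polynomial are assumed here, and `f` stands for any expensive scalar function such as a registration measure; all function names are hypothetical.

```python
import math


def chebyshev_nodes(n, a, b):
    """Roots of the degree-n Chebyshev polynomial, mapped from [-1, 1] to [a, b]."""
    return [0.5 * (a + b) + 0.5 * (b - a) * math.cos(math.pi * (2 * k + 1) / (2 * n))
            for k in range(n)]


def chebyshev_coefficients(f, n, a, b):
    """Coefficients of the degree-(n-1) Chebyshev interpolant of f on [a, b]."""
    thetas = [math.pi * (2 * k + 1) / (2 * n) for k in range(n)]
    # f is evaluated only at the n Chebyshev nodes.
    samples = [f(0.5 * (a + b) + 0.5 * (b - a) * math.cos(t)) for t in thetas]
    return [2.0 / n * sum(s * math.cos(j * t) for s, t in zip(samples, thetas))
            for j in range(n)]


def chebyshev_eval(coeffs, x, a, b):
    """Evaluate the interpolant at x in [a, b], using T_j(cos θ) = cos(jθ)."""
    t = (2.0 * x - a - b) / (b - a)
    theta = math.acos(max(-1.0, min(1.0, t)))
    return 0.5 * coeffs[0] + sum(c * math.cos(j * theta)
                                 for j, c in enumerate(coeffs) if j > 0)
```

With n=5 the expensive function is evaluated only five times; every other value in the interval comes from the degree-4 interpolant, which is exact for polynomials of degree up to four and an approximation otherwise.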

Keywords: binary projections, image registration, reduced dimension images.

Downloads: 1419
666 Digital Forensics Compute Cluster: A High Speed Distributed Computing Capability for Digital Forensics

Authors: Daniel Gonzales, Zev Winkelman, Trung Tran, Ricardo Sanchez, Dulani Woods, John Hollywood

Abstract:

We have developed a distributed computing capability, the Digital Forensics Compute Cluster (DFORC2), to speed up the ingestion and processing of digital evidence that is resident on computer hard drives. DFORC2 parallelizes the evidence ingestion and file processing steps. It can be run on a standalone computer cluster or in the Amazon Web Services (AWS) cloud. When running in a virtualized computing environment, its cluster resources can be dynamically scaled up or down using Kubernetes. DFORC2 is an open source project that uses Autopsy, Apache Spark and Kafka, and other open source software packages. It extends the proven open source digital forensics capabilities of Autopsy to compute clusters and cloud architectures, so digital forensics tasks can be accomplished efficiently by a scalable array of cluster compute nodes. In this paper, we describe DFORC2 and compare it with a standalone version of Autopsy when both are used to process evidence from hard drives of different sizes.

Keywords: Cloud computing, cybersecurity, digital forensics, Kafka, Kubernetes, Spark.

Downloads 1594
665 Optimal Feature Extraction Dimension in Finger Vein Recognition Using Kernel Principal Component Analysis

Authors: Amir Hajian, Sepehr Damavandinejadmonfared

Abstract:

In this paper, the issue of dimensionality reduction in finger vein recognition systems is investigated using Kernel Principal Component Analysis (KPCA). One aspect of KPCA is finding the most appropriate kernel function for finger vein recognition, as several kernel functions can be used within PCA-based algorithms. In this paper, however, another side of PCA-based algorithms, particularly KPCA, is investigated. The dimension of the feature vector in PCA-based algorithms is important, especially in real-world applications of such algorithms: a fixed feature vector dimension has to be set to reduce the dimension of the input and output data and to extract features from them. A classifier is then applied to classify the data and make the final decision. We analyze KPCA (Polynomial, Gaussian, and Laplacian) in detail and investigate the optimal feature extraction dimension in finger vein recognition using KPCA.
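
For orientation, the three kernel families named above and the centered Gram matrix that KPCA eigendecomposes might be sketched as below. The parameterizations (`degree`, `c`, `sigma`, and a Euclidean distance inside the Laplacian kernel) are assumptions; the abstract does not state which forms or values the authors used.

```python
import math

def polynomial_kernel(x, y, degree=2, c=1.0):
    """(x . y + c) ** degree; degree and c are illustrative defaults."""
    return (sum(a * b for a, b in zip(x, y)) + c) ** degree

def gaussian_kernel(x, y, sigma=1.0):
    """exp(-||x - y||^2 / (2 * sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))

def laplacian_kernel(x, y, sigma=1.0):
    """exp(-||x - y|| / sigma), using the Euclidean norm (one common convention)."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return math.exp(-dist / sigma)

def centered_gram(data, kernel):
    """Kernel matrix of `data`, centered in feature space as KPCA requires."""
    n = len(data)
    K = [[kernel(xi, xj) for xj in data] for xi in data]
    row_mean = [sum(r) / n for r in K]   # K is symmetric, so these are also column means
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean for j in range(n)]
            for i in range(n)]

# Toy feature vectors standing in for finger vein images (hypothetical data).
samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
Kc = centered_gram(samples, gaussian_kernel)
```

KPCA then keeps the d leading eigenvectors of this matrix, and d is the feature extraction dimension whose choice the paper studies.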

Keywords: Biometrics, finger vein recognition, Principal Component Analysis (PCA), Kernel Principal Component Analysis (KPCA).

Downloads 1923
664 Fractal Analysis on Human Colonic Pressure Activities based on the Box-counting Method

Authors: Rongguo Yan, Guozheng Yan, Banghua Yang

Abstract:

The colonic tissue is a complicated dynamic system, and the colonic activities it generates are composed of irregular segmental waves, which are referred to as erratic fluctuations or spikes. They are also highly irregular, with subunit fractal structure. Traditional time-frequency domain statistics such as the averaged amplitude, the motility index, and the power spectrum are insufficient to describe such fluctuations. The fractal box-counting dimension is therefore proposed, and the fractal scaling behaviors of human colonic pressure activities under physiological conditions are studied. It is shown that the dimension of the resting activity is smaller than that of the normal one, whereas the clipped version, which corresponds to the activity of the constipation patient, shows a higher fractal dimension. This may indicate a practical application in assessing colonic motility, which is often indicated by the colonic pressure activity.
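
A box-counting dimension estimate for a sampled pressure signal could look roughly like the sketch below. It is a simplified point-set variant written for illustration: the signal is rescaled to the unit square, boxes containing at least one sample are counted at several grid resolutions, and the dimension is the least-squares slope of log N against log(boxes per side). The function names and the choice of scales are assumptions, not the authors' procedure.

```python
import math

def box_count(signal, boxes_per_side):
    """Number of grid boxes containing at least one sample of the rescaled signal."""
    n = len(signal)
    if n < 2:
        raise ValueError("need at least two samples")
    lo, hi = min(signal), max(signal)
    span = (hi - lo) or 1.0              # avoid division by zero for a flat signal
    occupied = set()
    for i, value in enumerate(signal):
        x = i / (n - 1)                  # time axis rescaled to [0, 1]
        y = (value - lo) / span          # amplitude rescaled to [0, 1]
        col = min(int(x * boxes_per_side), boxes_per_side - 1)
        row = min(int(y * boxes_per_side), boxes_per_side - 1)
        occupied.add((col, row))
    return len(occupied)

def box_counting_dimension(signal, scales=(2, 4, 8, 16, 32)):
    """Least-squares slope of log N(s) versus log s over the given grid resolutions."""
    xs = [math.log(s) for s in scales]
    ys = [math.log(box_count(signal, s)) for s in scales]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# A smooth ramp behaves like a one-dimensional curve.
ramp = [float(i) for i in range(1025)]
ramp_dimension = box_counting_dimension(ramp)
```

Because only sample points are counted, the signal needs many more samples than the finest grid has columns; an irregular, spiky recording fills more boxes per column and yields a slope between 1 and 2.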

Keywords: Colonic pressure activity, erratic fluctuations, fractal dimension and spikes.

Downloads 1466
663 A New Evolutionary Algorithm for Cluster Analysis

Authors: B. Bahmani Firouzi, T. Niknam, M. Nayeripour

Abstract:

Clustering is a very well-known technique in data mining. One of the most widely used clustering techniques is the K-means algorithm. Solutions obtained from this technique depend on the initialization of the cluster centers, and the final solution may converge to a local minimum. To overcome these shortcomings of the K-means algorithm, this paper proposes a hybrid evolutionary algorithm based on the combination of the PSO, SA and K-means algorithms, called PSO-SA-K, which can find a better cluster partition. The performance is evaluated on several benchmark data sets. The simulation results show that the proposed algorithm outperforms previous approaches, such as PSO, SA and K-means, on the partitional clustering problem.
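
The dependence of K-means on its initial centers can be seen on a toy example. The sketch below is plain Lloyd-style K-means, not the PSO-SA-K hybrid; the four points and both initializations are hypothetical and chosen so that one start gets stuck in a poor local minimum.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(cluster)
    return tuple(sum(coord) / n for coord in zip(*cluster))

def kmeans(points, centers, max_iter=100):
    """Plain K-means from the given initial centers; returns (centers, SSE)."""
    centers = list(centers)
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        new_centers = [mean(c) if c else centers[k] for k, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse

# Corners of a wide rectangle: the natural clusters are the left and right pairs.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

_, sse_bad = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])    # splits top from bottom
_, sse_good = kmeans(points, [(0.0, 0.5), (4.0, 0.5)])   # splits left from right
```

Both runs are stable under further iterations, yet the first ends with a sum of squared errors of 16 and the second with 1, which is the kind of gap that better initialization or a global search aims to close.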

Keywords: Data clustering, Hybrid evolutionary optimization algorithm, K-means algorithm, Simulated Annealing (SA), Particle Swarm Optimization (PSO).

Downloads 2232
662 Fuzzy Based Particle Swarm Optimization Routing Technique for Load Balancing in Wireless Sensor Networks

Authors: S. Balaji, E. Golden Julie, M. Rajaram, Y. Harold Robinson

Abstract:

Improving network lifetime and handling uncertainty across multiple systems are key issues in wireless sensor network routing. This paper presents a fuzzy-based particle swarm optimization routing technique to improve network scalability. In the cluster formation procedure, a fuzzy-based system is used to handle uncertainty and balance the network. Cluster heads play an important role in reducing energy consumption: using the particle swarm optimization algorithm, each cluster head sends its information along with data packets to the cluster heads it is linked to. The simulation results show that the presented routing protocol can perform load balancing effectively and reduce the energy consumption of cluster heads.

Keywords: Wireless sensor networks, fuzzy logic, PSO, LEACH.

Downloads 1236