Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 17791

Search results for: cluster system

17761 Parallel Genetic Algorithms Clustering for Handling Recruitment Problem

Authors: Walid Moudani, Ahmad Shahin

Abstract:

This research presents a study of a recruitment services system. It aims to enhance a business intelligence system by embedding data mining in its core engine and to facilitate the link between job searchers and recruiting companies. The purpose of this study is to present an intelligent management system for supporting recruitment services based on data mining methods. The approach applies segmentation to the job postings extracted from the different recruiters. The details of the job postings are associated with a set of relevant features that are extracted from the web and are based on critical criteria in order to define consistent clusters. Thereafter, we assign each job searcher to the best cluster and provide a ranking of the job postings in the selected cluster. The performance of the proposed model is analyzed on a real case study, using several metrics on the clustered job postings dataset and the classified job searchers dataset.

Keywords: job postings, job searchers, clustering, genetic algorithms, business intelligence

Procedia PDF Downloads 302

17760 Role of Tourism Cluster in Improvement of Economic Competitiveness of Georgia

Authors: Alexander Sharashenidze

Abstract:

This article discusses the role of tourism in the economy of Georgia and justifies the need for several governmental support tools for diversifying the tourism product and increasing competitiveness. Tourism directions are characterized by exploring Georgia's tourism potential, considering cultural and geographical features; tools for forming supplemental products and the development opportunities of Tbilisi and the regions are set out for the case in which appropriate government policy is conducted. Tools are presented for offering innovative tourism products, improving service, decreasing taxes, and improving the availability of these products. The role of the tourism cluster in improving national competitiveness is substantiated. Based on the analysis of competitive factors influencing the development of the tourism cluster, conclusions are drawn and recommendations are suggested.

Keywords: economic competitiveness, enhancing competitiveness, Georgian economy, tourism cluster, tourism product

Procedia PDF Downloads 500

17759 Clustering Locations of Textile and Garment Industries to Compare with the Future Industrial Cluster in Thailand

Authors: Kanogkan Leerojanaprapa

Abstract:

The textile and garment industry used to be a major exporting industry of Thailand. The nation has lost price competitiveness because of the withdrawal of the EU's GSP (Generalised Scheme of Preferences) and the ‘Nationwide Minimum Wage Policy’, under which Thailand’s employers must pay all employees at least 300 baht (about $10) a day; as a result, the supply chains of the Thai textile and garment industry are affected and need to be reformed. Whether the Thai textile and garment industry will survive is therefore a concern. It is also a challenge for the government to decide which industries should be promoted as the future industries of Thailand. Recently the Thai government launched the Cluster-based Special Economic Development Zones Policy for promoting business clusters (in effect from September 16, 2015). The policy defines a cluster as the concentration of interconnected businesses and related institutions that operate within the same geographic area. Textiles and garments are one of the target industrial clusters, and 9 provinces are targeted (Bangkok, Kanchanaburi, Nakhon Pathom, Ratchaburi, Samut Sakhon, Chonburi, Chachoengsao, Prachinburi, and Sa Kaeo). The cluster zone is defined to link the west-east corridor, connecting manufacturing sources in Cambodia and Myanmar to Bangkok, which is promoted as a design, sourcing, and trading hub. The Thai government will provide tax and non-tax incentives for targeted industries within the clusters and expects these businesses to spread to where they can get the most benefit, which will identify the future industrial cluster. This research shows the difference between the current cluster and the future cluster defined by the target provinces for textiles and garments. The current cluster is analysed from secondary data.
The numbers of plants in four categories (spinning, weaving and finishing of textiles; manufacture of made-up textile articles, except apparel; manufacture of knitted and crocheted fabrics; and manufacture of other textiles, not elsewhere classified) across all 77 provinces are clustered by K-means cluster analysis and hierarchical cluster analysis. In addition, an ANOVA test confirms the clusters and shows which variables contribute the most to the defined cluster solution. The analysis groups the 22 provinces in which textile or garment plants are located into 3 clusters. Cluster 1, with a large number of plants, contains only Bangkok. Cluster 2, with a moderate number of plants, contains Samut Prakan, Samut Sakhon and Nakhon Pathom. Cluster 3, with a small number of plants, contains the other 18 provinces. The same methodology can be applied to other industries in future studies.

Keywords: ANOVA, hierarchical cluster analysis, industrial clusters, K-means cluster analysis, textile and garment industry

Procedia PDF Downloads 192

17758 Optimized Cluster Head Selection Algorithm Based on LEACH Protocol for Wireless Sensor Networks

Authors: Wided Abidi, Tahar Ezzedine

Abstract:

Low-Energy Adaptive Clustering Hierarchy (LEACH) is considered one of the effective hierarchical routing algorithms that optimize energy use and prolong the lifetime of the network. Since the selection of the Cluster Head (CH) in LEACH is carried out randomly, in this paper we propose an approach to electing the CH based on the LEACH protocol. Specifically, we present a formula for calculating the threshold that governs CH election. We adopt three principal criteria: the remaining energy of the node, the number of neighbors within cluster range, and the distance between the node and the CH. Simulation results show that the proposed approach outperforms the LEACH protocol in terms of prolonging the network lifetime and saving residual energy.
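The abstract does not reproduce the proposed threshold formula. As a point of reference, the sketch below computes the classic LEACH threshold T(n) = P / (1 - P * (r mod 1/P)) and then scales it by a score built from the three criteria named above. The scaling, the weights, and the names `weighted_threshold`, `energy_ratio`, `neighbor_ratio` and `distance_ratio` are illustrative assumptions, not the authors' formula.

```python
import random


def leach_threshold(p: float, round_no: int, eligible: bool) -> float:
    """Classic LEACH threshold for one node in round `round_no`.

    p is the desired fraction of cluster heads; a node is eligible if it
    has not yet served as cluster head in the current epoch of 1/p rounds.
    """
    if not eligible:
        return 0.0
    epoch_length = round(1 / p)
    return p / (1 - p * (round_no % epoch_length))


def weighted_threshold(p, round_no, eligible, energy_ratio, neighbor_ratio,
                       distance_ratio, weights=(0.5, 0.3, 0.2)):
    """Hypothetical variant: scale the classic threshold by a node score.

    All three ratios are assumed to be normalised to [0, 1]; the weights
    are placeholders and are not taken from the paper.
    """
    w_energy, w_neighbors, w_distance = weights
    score = (w_energy * energy_ratio
             + w_neighbors * neighbor_ratio
             + w_distance * (1 - distance_ratio))  # closer nodes score higher
    return leach_threshold(p, round_no, eligible) * score


def elects_itself(threshold: float, rng: random.Random) -> bool:
    """A node becomes cluster head when its random draw falls below T(n)."""
    return rng.random() < threshold
```

With this scaling, a node with little remaining energy, few neighbors, or a long distance to the current CH gets a lower threshold and is therefore less likely to elect itself than under plain LEACH.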

Keywords: wireless sensor networks, LEACH protocol, cluster head election, energy efficiency

Procedia PDF Downloads 297

17757 Pattern Recognition Based on Simulation of Chemical Senses (SCS)

Authors: Nermeen El Kashef, Yasser Fouad, Khaled Mahar

Abstract:

No AI-complete system can model the human brain or behavior without looking at the totality of the situation and incorporating a combination of senses. This paper proposes a pattern recognition model based on Simulation of Chemical Senses (SCS) for separation and classification of sign language. The model is based on the human taste control strategy. The main idea of the model is motivated by the fact that the tongue first clusters an input substance into its basic tastes, and then the brain recognizes its flavor. To implement this strategy, a two-level architecture inspired by the taste system is proposed. The separation level of the architecture focuses on clustering hand postures, while the classification level recognizes the sign language. The efficiency of the proposed model is demonstrated experimentally on an American Sign Language (ASL) data set. The recognition accuracy obtained for ASL numbers is 92.9 percent.

Keywords: artificial intelligence, biocybernetics, gustatory system, sign language recognition, taste sense

Procedia PDF Downloads 263

17756 Genomic Diversity of Clostridium perfringens Strains in Food and Human Sources

Authors: Asma Afshari, Abdollah Jamshidi, Jamshid Razmyar, Mehrnaz Rad

Abstract:

Clostridium perfringens is a serious pathogen which causes enteric diseases in domestic animals and food poisoning in humans. Spores can survive cooking processes and play an important role in the possible onset of disease. In this study, RAPD-PCR and REP-PCR were used to examine the genetic diversity of 49 isolates of C. perfringens type A from 3 different sources. The results of RAPD-PCR revealed the most genetic diversity among poultry isolates, while human isolates showed the least genetic diversity. Cluster analysis obtained from RAPD-PCR and based on the genetic distances split the 49 strains into five distinct major clusters (A, B, C, D, and E). Clusters A and C were composed of isolates from poultry meat, cluster B was composed of isolates from human feces, cluster D was composed of isolates from minced meat, poultry meat and human feces, and cluster E was composed of isolates from minced meat. Further characterization of these strains using (GTG)5 fingerprint repetitive sequence-based PCR analysis did not show further differentiation between the various types of strains. To our knowledge, this is the first study in which the genetic diversity of C. perfringens isolates from different types of meat and human feces has been investigated.

Keywords: C. perfringens, genetic diversity, RAPD-PCR, REP-PCR

Procedia PDF Downloads 459

17755 Event Driven Dynamic Clustering and Data Aggregation in Wireless Sensor Network

Authors: Ashok V. Sutagundar, Sunilkumar S. Manvi

Abstract:

Energy, delay and bandwidth are the prime issues in wireless sensor networks (WSNs). Energy usage optimization and efficient bandwidth utilization are particularly important. Event-triggered data aggregation facilitates both for the event-affected area of a WSN. Reliable delivery of critical information to the sink node is also a major challenge. To tackle these issues, we propose an event-driven dynamic clustering and data aggregation scheme for WSNs that extends the lifetime of the network by minimizing redundant data transmission. The proposed scheme operates as follows: (1) Whenever an event is triggered, the event-triggered node selects the cluster head. (2) The cluster head gathers data from the sensor nodes within the cluster. (3) The cluster head identifies and classifies the events in the collected data using a Bayesian classifier. (4) The data are aggregated using a statistical method. (5) The cluster head discovers paths to the sink node using residual energy, path distance and bandwidth. (6) If the aggregated data are critical, the cluster head sends them over multiple paths for reliable communication. (7) Otherwise, the aggregated data are transmitted towards the sink node over the single path with the most bandwidth and residual energy. The performance of the scheme is validated for various WSN scenarios to evaluate the effectiveness of the proposed approach in terms of aggregation time, cluster formation time and energy consumed for aggregation.

Keywords: wireless sensor network, dynamic clustering, data aggregation, wireless communication

Procedia PDF Downloads 415

17754 Digital Forensics Compute Cluster: A High Speed Distributed Computing Capability for Digital Forensics

Authors: Daniel Gonzales, Zev Winkelman, Trung Tran, Ricardo Sanchez, Dulani Woods, John Hollywood

Abstract:

We have developed a distributed computing capability, Digital Forensics Compute Cluster (DFORC2) to speed up the ingestion and processing of digital evidence that is resident on computer hard drives. DFORC2 parallelizes evidence ingestion and file processing steps. It can be run on a standalone computer cluster or in the Amazon Web Services (AWS) cloud. When running in a virtualized computing environment, its cluster resources can be dynamically scaled up or down using Kubernetes. DFORC2 is an open source project that uses Autopsy, Apache Spark and Kafka, and other open source software packages. It extends the proven open source digital forensics capabilities of Autopsy to compute clusters and cloud architectures, so digital forensics tasks can be accomplished efficiently by a scalable array of cluster compute nodes. In this paper, we describe DFORC2 and compare it with a standalone version of Autopsy when both are used to process evidence from hard drives of different sizes.

Keywords: digital forensics, cloud computing, cyber security, spark, Kubernetes, Kafka

Procedia PDF Downloads 367

17753 Building User Behavioral Models by Processing Web Logs and Clustering Mechanisms

Authors: Madhuka G. P. D. Udantha, Gihan V. Dias, Surangika Ranathunga

Abstract:

Today, websites host very interesting applications, but there are only a few methodologies for analyzing user navigation through websites and determining whether a website is being put to correct use. Web logs are typically consulted only when a major attack or malfunction occurs, even though they contain many interesting details about users' dealings with the system. Analyzing web logs has become a challenge due to the huge log volume. Finding interesting patterns is not easy because of the size and distribution of the logs and the importance of minor details in each log. Web logs contain very important data about users and sites that is not being put to good use. Retrieving interesting information from logs gives an idea of what users need, allows users to be grouped according to their needs, and helps improve a site so that it is effective and efficient. The model we built is able to detect attacks, malfunctions of the system, and anomalies. Logs become more complex as the volume of traffic and the size and complexity of a website grow. The solution uses unsupervised techniques and is fully automated; expert knowledge is used only in validation. In our approach, the logs are first cleaned and brought to a common platform with a standard format and structure. After the cleaning module, the web session builder is executed. It outputs two files, the Web Sessions file and the Indexed URLs file. The Indexed URLs file contains the list of URLs accessed and their indices. The Web Sessions file lists the indices of each web session. Then the DBSCAN and EM algorithms are used iteratively and recursively to get the best clustering results for the web sessions. Using homogeneity, completeness, V-measure, intra- and inter-cluster distance, and silhouette coefficient as evaluation measures, these algorithms evaluate their own results in order to choose better parameter values for subsequent runs. If a cluster is found to be too large, micro-clustering is used.
Using the Cluster Signature Module, the clusters are annotated with a unique signature called a fingerprint. In this module each cluster is fed to the Associative Rule Learning Module. If it outputs a confidence and support of 1 for an access sequence, that sequence is a potential signature for the cluster. The occurrences of the access sequence are then checked in the other clusters. If the sequence is found to be unique to the cluster considered, the cluster is annotated with the signature. These signatures are used in anomaly detection, preventing cyber attacks, real-time dashboards that visualize users accessing web pages, predicting user actions, and various other applications in finance, university websites, news and media websites, etc.

Keywords: anomaly detection, clustering, pattern recognition, web sessions

Procedia PDF Downloads 257

17752 Feature Selection of Personal Authentication Based on EEG Signal for K-Means Cluster Analysis Using Silhouettes Score

Authors: Jianfeng Hu

Abstract:

Personal authentication based on electroencephalography (EEG) signals is an important field of biometric technology. More and more researchers have used EEG signals as a data source for biometrics. However, biometrics based on EEG signals have some disadvantages. The proposed method employs entropy measures for feature extraction from EEG signals. Four types of entropy measures, sample entropy (SE), fuzzy entropy (FE), approximate entropy (AE) and spectral entropy (PE), were deployed as the feature set. In a silhouette calculation, the distances from each data point in a cluster to all other points within the same cluster and to all data points in the closest cluster are determined. Silhouettes thus provide a measure of how well a data point was classified when it was assigned to a cluster, and of the separation between clusters. This feature renders silhouettes potentially well suited for assessing cluster quality in personal authentication methods. In this study, silhouette scores were used to assess the cluster quality of the k-means clustering algorithm and to compare the performance on each EEG dataset. The main goals of this study are: (1) to represent each target as a tuple of multiple feature sets, (2) to assign a suitable measure to each feature set, (3) to combine different feature sets, (4) to determine the optimal feature weighting. Using precision/recall evaluations, the effectiveness of feature weighting in clustering was analyzed. EEG data from 22 subjects were collected. Results showed that: (1) It is possible to use fewer electrodes (3-4) for personal authentication. (2) There was a significant difference between electrodes for personal authentication (p<0.01). (3) There was no significant difference in authentication performance among feature sets (except feature PE).
Conclusion: The combination of k-means clustering algorithm and silhouette approach proved to be an accurate method for personal authentication based on EEG signals.
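The silhouette calculation described in the abstract can be sketched in a few lines. This is a minimal illustration using only the Python standard library and Euclidean distance; the function name `silhouette_scores` and the toy points are assumptions, and the sketch expects at least two clusters and not all points identical.

```python
import math


def silhouette_scores(points, labels):
    """Return the silhouette value s(i) = (b - a) / max(a, b) for each point.

    a is the mean distance to the other points in the same cluster;
    b is the mean distance to the points of the closest other cluster.
    """
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # common convention for single-point clusters
            continue
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b))
    return scores
```

Values close to 1 indicate a point that sits well inside its own cluster, values near 0 a point on a boundary, and negative values a point that is closer to a neighboring cluster than to its own.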

Keywords: personal authentication, K-means clustering, electroencephalogram, EEG, silhouettes

Procedia PDF Downloads 256

17751 Critical Psychosocial Risk Treatment for Engineers and Technicians

Authors: R. Berglund, T. Backström, M. Bellgran

Abstract:

This study explores how management addresses psychosocial risks in seven teams of engineers and technicians in the midst of the fourth industrial revolution. The sample is from an ongoing quasi-experiment about psychosocial risk management in a manufacturing company in Sweden. Each of the seven teams belongs to one of two clusters: a positive cluster or a negative cluster. The positive cluster reports a significantly positive change in psychosocial risk levels between two time-points and the negative cluster reports a significantly negative change. The data are collected using semi-structured interviews. The results of the computer-aided thematic analysis show that there are more differences than similarities when comparing the risk treatment actions taken in the two clusters. Findings show that the managers in the positive cluster use more enabling actions that foster and support formal and informal relationship building. In contrast, managers that use fewer enabling actions hinder the development of positive group processes and contribute to negative changes in psychosocial risk levels. This exploratory study sheds some light on how management can influence significant positive and negative changes in psychosocial risk levels during a risk management process.

Keywords: group process model, risk treatment, risk management, psychosocial

Procedia PDF Downloads 126

17750 An AI-Based Dynamical Resource Allocation Calculation Algorithm for Unmanned Aerial Vehicle

Authors: Zhou Luchen, Wu Yubing, Burra Venkata Durga Kumar

Abstract:

As networks become larger and more complex, the density of user devices is also increasing. Unmanned Aerial Vehicle (UAV) networks can collect and transform data efficiently by using software-defined network (SDN) technology. This paper proposes a three-layer distributed and dynamic cluster architecture that manages UAVs with an AI-based resource allocation calculation algorithm to address the network overloading problem. By separating the services of each UAV, the UAV hierarchical cluster system performs its main function of reducing the network load and transferring user requests, with three sub-tasks: data collection, communication channel organization, and data relaying. In each cluster, a head node and a vice head node UAV are selected considering the Central Processing Unit (CPU), operational (RAM) and permanent (ROM) memory of the devices, battery charge, and capacity. The vice head node acts as a backup that stores all the data in the head node. The k-means clustering algorithm is used to detect high-load regions and form the layered UAV clusters. The whole process of detecting high-load areas, forming and selecting UAV clusters, and moving the selected UAV cluster to the area is proposed as a traffic offloading algorithm.
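To make the k-means step concrete, the sketch below groups user request locations into regions with a plain Lloyd-style k-means written against the standard library. The input format (2-D coordinates of requests), the caller-supplied initial centroids, and the function names are assumptions for illustration; the abstract does not specify how the authors configure the algorithm.

```python
import math


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        for p in points
    ]


def k_means(points, initial_centroids, iterations=20):
    """Alternate assignment and centroid update until the centroids settle."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


def busiest_region(labels):
    """Index of the cluster with the most requests (a stand-in for 'high load')."""
    return max(set(labels), key=labels.count)
```

Under these assumptions, the centroid of the busiest region would be the target position to which a selected UAV cluster is moved.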

Keywords: k-means, resource allocation, SDN, UAV network, unmanned aerial vehicles

Procedia PDF Downloads 75

17749 An Enhanced Distributed Weighted Clustering Algorithm for Intra and Inter Cluster Routing in MANET

Authors: K. Gomathi

Abstract:

A Mobile Ad hoc Network (MANET) is defined as a collection of routable wireless mobile nodes that have no centralized administration and communicate with each other using radio signals. MANETs are often deployed in hostile environments, where attackers try to disturb secure data transfer and drain valuable network resources. Since a MANET is a battery-operated network, preserving network resources is essential. For resource-constrained computation, efficient routing, and increased network stability, the network is divided into smaller groups called clusters. The clustering architecture consists of a Cluster Head (CH), ordinary nodes, and gateways. The CH is responsible for inter- and intra-cluster routing. CH election is a prominent research area, and many algorithms have been developed using many different metrics. A CH with a longer life sustains the network lifetime; for this purpose a Secondary Cluster Head (SCH) is also elected, which is more economical. To nominate an efficient CH, an Enhanced Distributed Weighted Clustering Algorithm (EDWCA) is proposed. This approach considers metrics such as battery power, degree difference, and node speed for CH election. The proposed algorithm is evaluated and compared with an existing algorithm using Network Simulator (NS-2).
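A weighted election of this kind can be sketched as follows. The abstract names the metrics but not how they are combined, so the linear combination, the weights, the normalisation, and the rule that the runner-up becomes the SCH are all assumptions made for illustration rather than the EDWCA definition.

```python
from dataclasses import dataclass


@dataclass
class Node:
    node_id: str
    battery: float  # remaining battery as a fraction in [0, 1]
    degree: int     # number of one-hop neighbours
    speed: float    # current speed in m/s


def combined_weight(node, ideal_degree, max_speed, weights=(0.4, 0.3, 0.3)):
    """Lower is better: penalise low battery, degree mismatch and high speed.

    The weights are placeholders, not values from the paper.
    """
    w_battery, w_degree, w_speed = weights
    degree_diff = abs(node.degree - ideal_degree) / max(ideal_degree, 1)
    return (w_battery * (1 - node.battery)
            + w_degree * degree_diff
            + w_speed * (node.speed / max_speed))


def elect_heads(nodes, ideal_degree, max_speed):
    """Return (cluster head, secondary cluster head) by ascending weight."""
    ranked = sorted(nodes, key=lambda n: combined_weight(n, ideal_degree, max_speed))
    secondary = ranked[1] if len(ranked) > 1 else None
    return ranked[0], secondary
```

In this sketch, a node that is nearly stationary, well charged, and close to the ideal neighbour count receives the smallest weight and is elected CH, with the next-best node held as SCH.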

Keywords: MANET, EDWCA, clustering, cluster head

Procedia PDF Downloads 365

17748 Creation of Greater Mekong Subregion Regional Competitiveness through Cluster Mapping

Authors: Danuvasin Charoen

Abstract:

This research investigates cluster development in the area called the Greater Mekong Subregion (GMS), which consists of Thailand, the People’s Republic of China (PRC; specifically Yunnan Province and Guangxi Zhuang Autonomous Region), Myanmar, the Lao People’s Democratic Republic (Lao PDR), Cambodia, and Vietnam. The study utilized Porter’s competitiveness theory and the cluster mapping approach to analyze the competitiveness of the region. The data collection consists of interviews, focus groups, and the analysis of secondary data. The findings identify some evidence of cluster development in the GMS; however, there is no clear indication of collaboration among the components in the clusters. GMS clusters tend to be stand-alone. The clusters in Vietnam, Lao PDR, Myanmar, and Cambodia tend to be labor intensive, whereas the clusters in Thailand and the PRC (Yunnan) have the potential to successfully develop into innovative clusters. The collaboration and integration among the clusters in the GMS area are promising, though they could take a long time. The most likely relationship between the GMS countries is, for example, that suppliers of low-end, labor-intensive products will be located in low-income countries such as Myanmar, Lao PDR, and Cambodia, and these countries will provide input materials for innovative clusters in middle-income countries such as Thailand and the PRC.

Keywords: cluster, GMS, competitiveness, development

Procedia PDF Downloads 231

17747 Application of Artificial Immune Systems Combined with Collaborative Filtering in Movie Recommendation System

Authors: Pei-Chann Chang, Jhen-Fu Liao, Chin-Hung Teng, Meng-Hui Chen

Abstract:

This research combines an artificial immune system with user- and item-based collaborative filtering to create an efficient and accurate recommendation system. By applying the characteristics of antibodies and antigens in the artificial immune system and using the Pearson correlation coefficient as the affinity threshold to cluster the data, our collaborative filtering can effectively find useful users and items for rating prediction. This research uses the MovieLens dataset as the testing target to evaluate the effectiveness of the algorithm developed in this study. The experimental results show that the algorithm can effectively and accurately predict movie ratings. Compared to some state-of-the-art collaborative filtering systems, our system outperforms them in terms of the mean absolute error on the MovieLens dataset.
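The role of the Pearson correlation coefficient as an affinity threshold can be illustrated with a small sketch. Ratings are assumed to be dictionaries mapping item ids to scores, and the threshold of 0.5, the function names, and the toy data are hypothetical; the immune-system mechanics of the paper are not modelled here.

```python
import math


def pearson(ratings_a, ratings_b):
    """Pearson correlation over the items both users have rated."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0
    xs = [ratings_a[item] for item in common]
    ys = [ratings_b[item] for item in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a user with constant ratings carries no correlation signal
    return cov / math.sqrt(var_x * var_y)


def similar_users(target, all_ratings, threshold=0.5):
    """Users whose affinity to `target` meets the (assumed) threshold."""
    return [
        user for user, ratings in all_ratings.items()
        if user != target and pearson(all_ratings[target], ratings) >= threshold
    ]
```

Only users returned by `similar_users` would then contribute to the rating prediction for the target user.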

Keywords: artificial immune system, collaborative filtering, recommendation system, similarity

Procedia PDF Downloads 503

17746 Specific Frequency of Globular Clusters in Different Galaxy Types

Authors: Ahmed H. Abdullah, Pavel Kroupa

Abstract:

Globular clusters (GCs) are important objects for tracing the early evolution of a galaxy. We study the correlation between the cluster population and the global properties of the host galaxy. We found that the correlation between the cluster population (N_GC) and the baryonic mass (M_b) of the host galaxy is best described as N_GC = 10^(−5.6038) M_b. In order to understand the origin of the U-shaped relation between the GC specific frequency (S_N) and M_b (caused by the high value of S_N for dwarf galaxies and giant ellipticals and a minimum S_N for intermediate-mass galaxies, ≈ 10^10 M⊙), we derive a theoretical model for the specific frequency (S_N,th). The theoretical model for S_N,th is based on the slope of the power-law embedded cluster mass function (β) and different time scales (Δt) of the forming galaxy. Our results show good agreement between the observations and the model at certain values of β and Δt. The model seems able to reproduce higher values of S_N,th for β = 1.5 at intermediate formation time scales.
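The two quantities in this abstract can be evaluated numerically. The sketch below reads the fitted relation as N_GC = 10^(-5.6038) × M_b with M_b in solar masses (an assumption about the notation in the abstract) and uses the standard definition of specific frequency, S_N = N_GC × 10^(0.4 (M_V + 15)), which is not stated in the abstract itself. The function names are illustrative.

```python
def predicted_gc_count(baryonic_mass_msun: float) -> float:
    """Number of globular clusters from the fitted linear relation.

    Assumes the abstract's coefficient means 10**(-5.6038) per solar mass.
    """
    return (10 ** -5.6038) * baryonic_mass_msun


def specific_frequency(n_gc: float, abs_v_mag: float) -> float:
    """Standard specific frequency: GC count normalised to M_V = -15."""
    return n_gc * 10 ** (0.4 * (abs_v_mag + 15))
```

For a fixed number of clusters, a fainter host (larger M_V) yields a larger S_N, which is one reason dwarf galaxies sit on the high arm of the U-shaped relation.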

Keywords: galaxies: dwarf, globular cluster: specific frequency, number of globular clusters, formation time scale

Procedia PDF Downloads 292

17745 Clustering Performance Analysis using New Correlation-Based Cluster Validity Indices

Authors: Nathakhun Wiroonsri

Abstract:

There are various cluster validity measures used for evaluating clustering results. One of the main objectives of using these measures is to seek the optimal unknown number of clusters. Some measures work well for clusters with different densities, sizes and shapes. Yet, one of the weaknesses that those validity measures share is that they sometimes provide only one clear optimal number of clusters. That number is actually unknown, and there might be more than one potential sub-optimal option that a user may wish to choose based on different applications. We develop two new cluster validity indices based on a correlation between the actual distance between a pair of data points and the distance between the centroids of the clusters in which the two points are located. Our proposed indices consistently yield several peaks at different numbers of clusters, which overcomes the weakness previously stated. Furthermore, the introduced correlation can also be used for evaluating the quality of a selected clustering result. Several experiments in different scenarios, including the well-known iris data set and a real-world marketing application, have been conducted to compare the proposed validity indices with several well-known ones.

Keywords: clustering algorithm, cluster validity measure, correlation, data partitions, iris data set, marketing, pattern recognition

Procedia PDF Downloads 81

17744 Percolation Transition in an Agglomeration of Spherical Particles

Authors: Johannes J. Schneider, Mathias S. Weyland, Peter Eggenberger Hotz, William D. Jamieson, Oliver Castell, Alessia Faggian, Rudolf M. Füchslin

Abstract:

Agglomerations of polydisperse systems of spherical particles are created in computer simulations using a simplified stochastic-hydrodynamic model: Particles sink to the bottom of the cylinder, taking into account gravity reduced by the buoyant force, the Stokes friction force, the added mass effect, and random velocity changes. Two types of particles are considered, with one of them being able to create connections to neighboring particles of the same type, thus forming a network within the agglomeration at the bottom of a cylinder. Decreasing the fraction of these particles, a percolation transition occurs. The critical regime is determined by investigating the maximum cluster size and the percolation susceptibility.
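The maximum cluster size used to locate the transition can be computed from the list of particle connections with a union-find structure. The sketch below assumes particles are numbered from 0 and connections are given as index pairs; the susceptibility follows one common finite-size convention (sum of squared sizes of all clusters except the largest, divided by the particle count), which may differ from the definition used by the authors.

```python
from collections import Counter


def cluster_sizes(n_particles, bonds):
    """Sizes of connected clusters, largest first, via union-find."""
    parent = list(range(n_particles))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in bonds:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    counts = Counter(find(i) for i in range(n_particles))
    return sorted(counts.values(), reverse=True)


def max_cluster_and_susceptibility(n_particles, bonds):
    """Largest cluster size and an (assumed) susceptibility estimate."""
    sizes = cluster_sizes(n_particles, bonds)
    largest = sizes[0]  # requires n_particles >= 1
    susceptibility = sum(s * s for s in sizes[1:]) / n_particles
    return largest, susceptibility
```

Sweeping the fraction of connecting particles and plotting both quantities would show the largest cluster growing to span the system while the susceptibility peaks near the critical fraction.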

Keywords: binary system, maximum cluster size, percolation, polydisperse

Procedia PDF Downloads 22

17743 The Use of Ward Linkage in Cluster Integration with a Path Analysis Approach

Authors: Adji Achmad Rinaldo Fernandes

Abstract:

Path analysis is an analytical technique to study the causal relationship between independent and dependent variables. In this study, cluster integration with the Ward linkage method was used for various numbers of clusters in combination with path analysis. The variables used are character (x₁), capacity (x₂), capital (x₃), collateral (x₄), and condition of economy (x₅), which affect on-time payment (y₂) through the variable willingness to pay (y₁). The purpose of this study was to compare Ward linkage cluster integration with path analysis across various numbers of clusters in order to classify willingness to pay (y₁). The data used are primary data from questionnaires filled out by customers of Bank X, using purposive sampling. The measurement method used is the average score method. The results showed that Ward linkage cluster integration with path analysis on 2 clusters is the best method, based on a comparison of the coefficient of determination. The effect of character (x₁), capacity (x₂), capital (x₃), collateral (x₄), and condition of economy (x₅) on on-time payment (y₂) through willingness to pay (y₁) is explained to the extent of 58.3%, while the remaining 41.7% is explained by variables outside the model.

Keywords: cluster integration, linkage, path analysis, compliant paying behavior

Procedia PDF Downloads 147

17742 A Near-Optimal Domain Independent Approach for Detecting Approximate Duplicates

Authors: Abdelaziz Fellah, Allaoua Maamir

Abstract:

We propose a domain-independent merging-cluster filter approach complemented with a set of algorithms for identifying approximate duplicate entities efficiently and accurately within a single data source and across multiple data sources. The near-optimal merging-cluster filter (MCF) approach is based on the well-tuned Monge-Elkan algorithm and extended with an affine variant of the Smith-Waterman similarity measure. Then we present constant, variable, and function threshold algorithms that work conceptually in a divide-merge filtering fashion for detecting near duplicates as hierarchical clusters along with their corresponding representatives. The algorithms take recursive refinement approaches in the spirit of filtering, merging, and updating cluster representatives to detect approximate duplicates at each level of the cluster tree. Experiments show the high effectiveness and accuracy of the MCF approach in detecting approximate duplicates, outperforming the seminal Monge-Elkan algorithm on several real-world benchmarks and generated datasets.
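For readers unfamiliar with the Monge-Elkan measure, the sketch below shows its basic form: each token of one record is matched with its most similar token in the other record, and the best scores are averaged. The inner token similarity here is `difflib.SequenceMatcher.ratio()` from the Python standard library, swapped in as a simple stand-in for the affine Smith-Waterman variant used in the paper; the 0.85 threshold and the symmetrisation by `min` are likewise assumptions.

```python
from difflib import SequenceMatcher


def token_similarity(a: str, b: str) -> float:
    """Stand-in inner similarity in [0, 1] (not Smith-Waterman)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def monge_elkan(record_a: str, record_b: str) -> float:
    """Average, over tokens of record_a, of the best match in record_b."""
    tokens_a, tokens_b = record_a.split(), record_b.split()
    if not tokens_a or not tokens_b:
        return 0.0
    best_scores = [
        max(token_similarity(a, b) for b in tokens_b) for a in tokens_a
    ]
    return sum(best_scores) / len(tokens_a)


def is_approximate_duplicate(record_a, record_b, threshold=0.85):
    """Monge-Elkan is asymmetric, so require both directions to pass."""
    score = min(monge_elkan(record_a, record_b), monge_elkan(record_b, record_a))
    return score >= threshold
```

A constant-threshold filter in the sense of the abstract would apply a test like `is_approximate_duplicate` between a record and each cluster representative before merging.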

Keywords: data mining, data cleaning, approximate duplicates, near-duplicates detection, data mining applications and discovery

Procedia PDF Downloads 358

17741 Condition Monitoring System of Mine Air Compressors Based on Wireless Sensor Network

Authors: Sheng Fu, Yinbo Gao, Hao Lin

Abstract:

In current monitoring systems for mine air compressors, installation and maintenance are difficult because of the wired connections. To solve this problem, this paper introduces a new air compressor monitoring system based on ZigBee in which the monitoring parameters are transmitted wirelessly. The collecting devices are designed to form a cluster network that collects vibration, temperature, and pressure of the air cylinders, along with other parameters. All these devices are battery-powered. In addition, the monitoring software on the PC is developed using MFC. Experiments show that the designed wireless sensor network works well under on-site environmental conditions and that the system is very convenient to install thanks to the wireless connection. This monitoring system has wide application prospects in upgrading old air compressor monitoring systems.

Keywords: condition monitoring, wireless sensor network, air compressor, zigbee, data collecting

Procedia PDF Downloads 459

17740 Lambda-Levelwise Statistical Convergence of a Sequence of Fuzzy Numbers

Authors: F. Berna Benli, Özgür Keskin

Abstract:

Lately, many mathematicians have studied the statistical convergence of sequences of fuzzy numbers. It is known that λ-statistical convergence is a kind of convergence that lies between ordinary convergence and statistical convergence. In this paper, we introduce a new kind of convergence, namely λ-levelwise statistical convergence. We then define the concepts of λ-levelwise statistical cluster points and limit points of a sequence of fuzzy numbers, and we discuss the relations between the sets of λ-levelwise statistical cluster points and λ-levelwise statistical limit points of sequences of fuzzy numbers. The work is extended further by considering relations such as the case in which the λ-statistical limit inferior and λ-statistical limit superior of λ-statistically convergent sequences of fuzzy numbers are equal. Furthermore, the λ-statistical boundedness condition for different sequences of fuzzy numbers is studied.

Keywords: fuzzy number, λ-levelwise statistical cluster points, λ-levelwise statistical convergence, λ-levelwise statistical limit points, λ-statistical cluster points, λ-statistical convergence, λ-statistical limit points

Procedia PDF Downloads 437

17739 Spatio-Temporal Changes of Rainfall in São Paulo, Brazil (1973-2012): A Gamma Distribution and Cluster Analysis

Authors: Guilherme Henrique Gabriel, Lucí Hidalgo Nunes

Abstract:

An important feature of rainfall regimes is their variability, which is subject to the atmosphere’s general and regional dynamics, geographical position and relief. Despite being inherent to the climate system, it can harshly impact virtually all human activities. In turn, global climate change has the ability to significantly affect smaller-scale rainfall regimes by altering their current variability patterns. In this regard, it is useful to know whether regional climates are changing over time and whether it is possible to link these variations to climate change trends observed globally. This study is part of an international project (Metropole-FAPESP, Proc. 2012/51876-0 and Proc. 2015/11035-5), and its objective was to identify and evaluate possible changes in rainfall behavior in the state of São Paulo, southeastern Brazil, using rainfall data from 79 rain gauges for the last forty years. Cluster analysis and gamma distribution parameters were used to evaluate spatial and temporal trends, and the outcomes are presented by means of geographic information systems tools. Results show remarkable changes in rainfall distribution patterns in São Paulo over the years: changes in the shape and scale parameters of the gamma distribution indicate an increase in both the irregularity of rainfall distribution and the probability of occurrence of extreme events. Additionally, the spatial outcome of the cluster analysis, along with the gamma distribution parameters, suggests that changes occurred simultaneously over the whole area, indicating that they could be related to remote causes beyond the local and regional ones, especially in the current global climate change scenario.
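The abstract above does not say how the shape and scale parameters were estimated. As one possible illustration, the sketch below uses the method of moments, which follows from the gamma identities mean = shape × scale and variance = shape × scale². The function name is hypothetical, dropping zero-rainfall records is an assumption, and any input values are illustrative rather than data from the study.

```python
from statistics import fmean, pvariance


def gamma_moments(samples):
    """Method-of-moments estimate of gamma (shape, scale) from positive rainfall totals."""
    # Assumption: zero-rainfall records are excluded, since the gamma density needs x > 0.
    positive = [float(x) for x in samples if x > 0]
    if len(positive) < 2:
        raise ValueError("need at least two positive samples")
    mean = fmean(positive)
    var = pvariance(positive, mean)
    if var == 0:
        raise ValueError("samples have zero variance")
    shape = mean * mean / var  # from mean = shape * scale and var = shape * scale**2
    scale = var / mean
    return shape, scale
```

Comparing the estimates for an early and a late sub-period of one gauge's record is one simple way to see the kind of parameter change the abstract describes.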

Keywords: climate change, cluster analysis, gamma distribution, rainfall

Procedia PDF Downloads 289

17738 Comparing the Apparent Error Rate of Gender Specifying from Human Skeletal Remains by Using Classification and Cluster Methods

Authors: Jularat Chumnaul

Abstract:

In forensic science, corpses from homicides differ in condition: some are complete and others incomplete, depending on the cause of death or the form of homicide. For example, some corpses are cut into pieces, some are concealed by dumping them in a river, some are buried, and some are burned to destroy the evidence. If a corpse is incomplete, personal identification becomes difficult because some tissues and bones have been destroyed. The most precise method for specifying the gender of a corpse from skeletal remains is DNA identification. However, this method is costly and time-consuming, so other identification techniques are used instead. The first widely used technique is to consider the features of the bones. In general, evidence from the corpse, such as pieces of bone, especially the skull and pelvis, can be used to identify gender. This technique requires forensic scientists to have the observation skills to distinguish male from female bones. Although this technique is uncomplicated, saves time and cost, and allows forensic scientists to determine gender fairly accurately (apparently with an accuracy rate of 90% or more), its crucial disadvantage is that only some positions of the skeleton can be used to specify gender, such as the supraorbital ridge, nuchal crest, temporal lobe, mandible, and chin. Therefore, the skeletal remains used have to be complete. The other technique widely used for gender specification in forensic science and archeology is skeletal measurement. Its advantage is that it can be applied at several positions on a single bone, and it can be used even if the bones are not complete. In this study, classification and cluster analysis are applied to this technique, including Kth Nearest Neighbor Classification, Classification Tree, Ward Linkage Cluster, K-means Cluster, and Two Step Cluster.
The data contain 507 individuals and 9 skeletal measurements (diameter measurements), and the performance of the five methods is investigated by considering the apparent error rate (APER). The results indicate that the Two Step Cluster and Kth Nearest Neighbor methods seem suitable for specifying gender from human skeletal remains because they yield small apparent error rates of 0.20% and 4.14%, respectively. On the other hand, the Classification Tree, Ward Linkage Cluster, and K-means Cluster methods are not appropriate since they yield large apparent error rates of 10.65%, 10.65%, and 16.37%, respectively. However, there are other ways to evaluate classification performance, such as estimating the error rate using the holdout procedure or misclassification costs, and different methods can lead to different conclusions.
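The apparent error rate is the fraction of individuals that a fitted model misclassifies on the same data used to fit it. A minimal sketch follows; the function name and label encoding are assumptions, and for a clustering method the cluster labels are assumed to have already been mapped to the two classes.

```python
def apparent_error_rate(actual, predicted):
    """Fraction of observations whose predicted label differs from the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("label lists must have the same length")
    if not actual:
        raise ValueError("need at least one observation")
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)
```

With 507 individuals, for instance, a single misclassification would give 1/507, or roughly 0.20%. Because the rate is measured on the fitting data, it tends to be optimistic, which is why the abstract points to the holdout procedure as an alternative.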

Keywords: skeletal measurements, classification, cluster, apparent error rate

Procedia PDF Downloads 227

17737 Finding the Longest Common Subsequence in Normal DNA and Disease Affected Human DNA Using Self Organizing Map

Authors: G. Tamilpavai, C. Vishnuppriya

Abstract:

Bioinformatics is an active research area that combines biology and computer science. Finding the longest common subsequence (LCSS) is one of the major challenges in various bioinformatics applications. The computation of the LCSS plays a vital role in biomedicine and is an essential task in DNA sequence analysis in genetics, where it supports a wide range of disease diagnosis steps. The objective of the proposed system is to find the longest common subsequence present in normal and various disease-affected human DNA sequences using a Self Organizing Map (SOM) and LCSS. The human DNA sequences are collected from the National Center for Biotechnology Information (NCBI) database. Initially, each human DNA sequence is separated into k-mers using a k-mer separation rule. Mean and median values are calculated from each separated k-mer. These calculated values are fed as input to the Self Organizing Map for clustering. The obtained clusters are then given to the Longest Common Subsequence (LCSS) algorithm to find the common subsequences present in every cluster. It returns n × (n − 1)/2 subsequences for each cluster, where n is the number of k-mers in that cluster. Experimental outcomes of the proposed system give the possible longest common subsequences of normal and disease-affected DNA data. Thus, the proposed system can be a good initial aid for finding disease-causing sequences. Finally, performance analysis is carried out for different DNA sequences. The obtained values show that the LCSS is retrieved in a shorter time than with the existing system.
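The pairwise step described above can be sketched with the textbook dynamic-programming algorithm for the longest common subsequence, applied to every unordered pair of k-mers in a cluster. This is an illustration rather than the authors' implementation: the function names are hypothetical, and the SOM clustering stage is assumed to have already produced the cluster.

```python
from itertools import combinations


def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b (classic O(len(a) * len(b)) DP)."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    # Walk back through the table to recover the subsequence itself.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def pairwise_lcs(cluster):
    """LCS for every unordered pair of sequences: n * (n - 1) / 2 results for n sequences."""
    return [(x, y, lcs(x, y)) for x, y in combinations(cluster, 2)]
```

For a cluster of four k-mers this produces six results, matching the n × (n − 1)/2 count given in the abstract.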

Keywords: clustering, k-mers, longest common subsequence, SOM

Procedia PDF Downloads 231

17736 Impacts of Teachers’ Cluster Model Meeting Intervention on Pupils’ Learning, Academic Achievement and Attitudinal Development in Oyo State, Nigeria

Authors: Olusola Joseph Adesina, Abiodun Ezekiel Adesina

Abstract:

Efforts at improving the falling standard of education in the country call for a needs-based assessment of the primary tier of education in Nigeria. The teachers’ cluster meeting intervention is a step towards enhancing teachers’ professional competency and pupils’ academic achievement and attitudinal development. The study thus determined the impact of the intervention on pupils’ achievement in Oyo State, Nigeria. Three research questions and four hypotheses guided the study. A pre-test, post-test control group, quasi-experimental design was adopted. Eight intact classes from eight different schools were randomly selected into treatment and control groups. Two response instruments, the pupils’ academic achievement test (PAAT; r = 0.87) and the pupils’ attitude to lesson scale (PALS; r = 0.80), were used for data collection. Mean, standard deviation and analysis of covariance (ANCOVA) were used to analyse the collected data. The results showed that the teachers’ cluster meeting has a significant impact on pupils’ academic achievement (F(1,327) = 41.79; p < 0.05) and attitudinal development (F(1,327) = 26.01; p < 0.05) in the core subjects of primary schools in Oyo State, Nigeria. The study therefore recommended, among other things, that the teachers’ cluster meeting be sustained for teachers’ professional development and pupils’ advancement in the State.

Keywords: teachers’ cluster meeting, pupils’ academic achievement, pupils’ attitudinal development, academic achievement

Procedia PDF Downloads 435

17735 Improving the Bioprocess Phenotype of Chinese Hamster Ovary Cells Using CRISPR/Cas9 and Sponge Decoy Mediated MiRNA Knockdowns

Authors: Kevin Kellner, Nga Lao, Orla Coleman, Paula Meleady, Niall Barron

Abstract:

Chinese Hamster Ovary (CHO) cells are the most prominent cell line used in biopharmaceutical production. Genetic engineering plays an essential role in recent research aimed at improving yields and finding beneficial bioprocess phenotypes. The miR-23 cluster, specifically miR-24 and miR-27, was first identified as differentially expressed during hypothermic conditions, suggesting a role in proliferation and productivity in CHO cells. In this study, we used sponge decoy technology to stably deplete the miRNA expression of the cluster. Furthermore, we implemented the CRISPR/Cas9 system to knock down miRNA expression. Sponge constructs were designed for imperfect binding of the miRNA target, protecting them from RISC-mediated cleavage. Guide RNAs for the CRISPR/Cas9 system were designed to target the seed region of the miRNA. The expression of mature miRNA and precursor was confirmed using RT-qPCR. For both approaches, stably expressing mixed populations were generated and characterised in batch cultures. It was shown that CRISPR/Cas9 can be implemented in CHO cells, achieving high knockdown efficacy for every single member of the cluster. Targeting one miRNA member showed that its genomic paralog is successfully targeted as well. The stable depletion of miR-24 using CRISPR/Cas9 showed increased growth and specific productivity in a CHO-K1 mAb-expressing cell line. This phenotype was further characterized using quantitative label-free LC-MS/MS, which showed 186 differentially expressed proteins, 19 of them involved in proliferation and 26 involved in protein folding/translation. Targeting miR-27 in the same cell line showed increased viability in the late stages of the culture compared to the control. To evaluate the phenotype in an industry-relevant cell line, the miR-23 cluster, miR-24 and miR-27 were stably depleted in an Fc fusion CHO-S cell line, which showed increased batch titers of up to 1.5-fold.
In this work, we highlighted that the stable depletion of the miR-23 cluster and its members can improve the bioprocess phenotype concerning growth and productivity in two different cell lines. Furthermore, we showed that using CRISPR/Cas9 is comparable to the traditional sponge decoy technology.

Keywords: Chinese Hamster ovary cells, CRISPR/Cas9, microRNAs, sponge decoy technology

Procedia PDF Downloads 169

17734 A Clustering Algorithm for Massive Texts

Authors: Ming Liu, Chong Wu, Bingquan Liu, Lei Chen

Abstract:

Internet users face a massive amount of textual data every day. Organizing texts into categories can help users dig useful information out of large-scale text collections. Clustering is one of the most promising tools for categorizing texts because it is unsupervised. Unfortunately, most traditional clustering algorithms lose their high quality on large-scale text collections, mainly because of the high-dimensional vectors generated from texts. To cluster large-scale text collections effectively and efficiently, this paper proposes a vector reconstruction based clustering algorithm. Only the features that can represent a cluster are preserved in the cluster’s representative vector. The algorithm alternately repeats two sub-processes until it converges. One is the partial tuning sub-process, in which each feature’s weight is fine-tuned iteratively. To accelerate clustering, an intersection based similarity measurement and its corresponding neuron adjustment function are proposed and implemented in this sub-process. The other is the overall tuning sub-process, in which features are reallocated among different clusters. In this sub-process, features that are useless for representing a cluster are removed from the cluster’s representative vector. Experimental results on three text collections (two small-scale and one large-scale) demonstrate that our algorithm obtains high quality on both small-scale and large-scale text collections.
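The abstract above does not define its intersection based similarity measurement. The sketch below shows one plausible reading: with sparse vectors stored as feature-to-weight mappings, a cosine similarity whose dot product only visits the features that the text and the cluster representative share. The function name and the choice of cosine are assumptions, not the paper's formula.

```python
from math import sqrt


def intersection_cosine(text_vec: dict, cluster_vec: dict) -> float:
    """Cosine similarity of two sparse vectors, iterating only over shared features."""
    # Loop over the smaller mapping so the cost depends on the shorter vector.
    small, large = sorted((text_vec, cluster_vec), key=len)
    dot = sum(w * large[f] for f, w in small.items() if f in large)
    if dot == 0:
        return 0.0
    norm_a = sqrt(sum(w * w for w in text_vec.values()))
    norm_b = sqrt(sum(w * w for w in cluster_vec.values()))
    return dot / (norm_a * norm_b)
```

Since a pruned representative vector keeps only a few features, the shared set is small even when the vocabulary is very large, which is where a measure of this kind would save time.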

Keywords: vector reconstruction, large-scale text clustering, partial tuning sub-process, overall tuning sub-process

Procedia PDF Downloads 405

17733 An Energy-Balanced Clustering Method on Wireless Sensor Networks

Authors: Yu-Ting Tsai, Chiun-Chieh Hsu, Yu-Chun Chu

Abstract:

In recent years, due to the development of wireless network technology, many researchers have devoted themselves to the study of wireless sensor networks. Applications of wireless sensor networks mainly use sensor nodes to collect the required information and send it back to the users. Since the sensed area is difficult to reach, there are many restrictions on the design of the sensor nodes, the most important of which is their limited energy. Because of this limited energy, researchers have proposed a number of ways to reduce energy consumption and balance the load of sensor nodes in order to increase the network lifetime. In this paper, we propose the Energy-Balanced Clustering method with Auxiliary Members on Wireless Sensor Networks (EBCAM), based on cluster routing. The main purpose is to balance energy consumption over the sensed area and even out the distribution of dead nodes, in order to avoid the excessive energy consumption caused by increasing transmission distance. In addition, we use the residual energy and average energy consumption of the nodes within a cluster to choose the cluster heads, use multi-hop transmission to deliver the data, and dynamically adjust the transmission radius according to the load conditions. We also use auxiliary cluster members to change the delivery path according to the residual energy of the cluster head in order to reduce its load. Finally, we compare the proposed method with related algorithms via simulated experiments and analyze the results. The results reveal that the proposed method outperforms the other algorithms in the number of rounds and the average energy consumption.
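The abstract above names the two inputs to cluster head selection, residual energy and average energy consumption, but not how they are combined. The sketch below assumes one simple rule: pick the living node expected to last the most rounds, that is, the one with the largest ratio of residual energy to average consumption per round. The `Node` class, its field names, and the scoring rule are all hypothetical and are not the EBCAM formula.

```python
from dataclasses import dataclass


@dataclass
class Node:
    node_id: int
    residual_energy: float   # energy left, in arbitrary units
    avg_consumption: float   # average energy spent per round


def expected_rounds(node: Node) -> float:
    """Assumed score: how many more rounds the node could serve at its average drain."""
    if node.avg_consumption <= 0:
        return float("inf")
    return node.residual_energy / node.avg_consumption


def choose_cluster_head(nodes):
    """Pick the living node with the best score, breaking ties by residual energy."""
    alive = [n for n in nodes if n.residual_energy > 0]
    if not alive:
        raise ValueError("no living nodes in the cluster")
    return max(alive, key=lambda n: (expected_rounds(n), n.residual_energy))
```

A rule of this shape favours nodes that are both well charged and lightly loaded, which is the balancing intent the abstract describes.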

Keywords: auxiliary nodes, cluster, load balance, routing algorithm, wireless sensor network

Procedia PDF Downloads 253

17732 Industry 4.0 Platforms as 'Cluster' Ecosystems for Small and Medium Enterprises (SMEs)

Authors: Vivek Anand, Rainer Naegele

Abstract:

Industry 4.0 is a global mega-trend that is revolutionizing the world of advanced manufacturing but also bringing challenges for SMEs. In response, many regional as well as digital Industry 4.0 Platforms have been set up to boost the competencies of established enterprises as well as SMEs. The concept of 'Clusters' is a policy tool that aims to be a starting point for establishing sustainable and self-supporting structures in the industries of a region by identifying competencies and supporting cluster actors with services that match their growth needs. This paper is motivated by the idea that Clusters have the potential to enable firms, particularly SMEs, to accelerate the innovation process and the transition to digital technologies. In this research, the efficacy of Industry 4.0 Platforms as Cluster ecosystems is evaluated, especially for SMEs. Focusing on the Baden-Württemberg region in Germany, an action research method is employed to study how SMEs leverage other actors on Industry 4.0 Platforms to further their Industry 4.0 journeys. The aim is to evaluate how such Industry 4.0 Platforms stimulate innovation, cooperation and competitiveness. Additionally, the barriers that keep these platforms from fulfilling their promise to serve as capacity-building cluster ecosystems for SMEs in a region will be identified. The findings will be helpful for academicians and policymakers alike, who can leverage a ‘cluster policy’ to enable Industry 4.0 ecosystems in their regions. Furthermore, relevant management and policy implications stem from the analysis. This will also be of interest to the various players in a cluster ecosystem, such as SMEs and service providers, who benefit from the cooperation and competition. The paper will improve the understanding of how a dialogue orientation, a bottom-up approach and active integration of all involved cluster actors enhance the potential of Industry 4.0 Platforms.
A strong collaborative culture is a key driver of digital transformation and technology adoption across sectors, value chains and supply chains, and it will position Industry 4.0 Platforms at the forefront of the industrial renaissance. Motivated by this argument and based on the results of the qualitative research, a roadmap will be proposed to position Industry 4.0 Platforms as effective cluster ecosystems to support Industry 4.0 adoption in a region.

Keywords: cluster policy, digital transformation, industry 4.0, innovation clusters, innovation policy, SMEs and startups

Procedia PDF Downloads 185