Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 653

Search results for: globular clusters

563 Harmonic Data Preparation for Clustering and Classification

Abstract:

The rapid increase in the size of databases required to store power quality monitoring data has demanded new techniques for analysing and understanding the data. One suggested technique to assist in analysis is data mining. Preparing raw data for data mining exploration takes up most of the effort and time spent in the whole data mining process. Clustering is an important technique in data mining and machine learning in which underlying and meaningful groups of data are discovered. Large amounts of harmonic data have been collected over three years from an actual harmonic monitoring system in a distribution system in Australia. This amount of acquired data makes it difficult to identify operational events that significantly impact the harmonics generated on the system. In this paper, harmonic data preparation processes that support a better understanding of the data are presented. Underlying classes in this data have then been identified using a clustering technique based on the Minimum Message Length (MML) method. The underlying operational information contained within the clusters can be rapidly visualised by engineers. The C5.0 algorithm was used for classification and interpretation of the generated clusters.

Keywords: data mining, harmonic data, clustering, classification

Procedia PDF Downloads 248

562 Genome-Scale Analysis of Streptomyces Caatingaensis CMAA 1322 Metabolism, a New Abiotic Stress-Tolerant Actinomycete

Authors: Suikinai Nobre Santos, Ranko Gacesa, Paul F. Long, Itamar Soares de Melo

Abstract:

Extremophilic microorganisms are adapted to biotopes that combine several stress factors (temperature, pressure, radiation, salinity and pH), which makes them a rich and valuable resource for novel biotechnological processes and unique models for investigating their biomolecules (1, 2). This encouraged us to investigate, for bioprospecting purposes, the compounds synthesized by a novel thermotolerant actinomycete, designated Streptomyces caatingaensis CMAA 1322, isolated from a soil sample of tropical dry forest (Caatinga) in the Brazilian semiarid region (3-17°S and 35-45°W). This set of contrasting physical and climatic factors provides unique conditions and a diversity of well-adapted species, making the site interesting for biotechnological purposes. Preliminary studies have shown great potential for the production of cytotoxic, pesticidal and antimicrobial molecules (3). Thus, to extend knowledge of the gene clusters responsible for the biosynthetic pathways of natural products in strain CMAA1322, whole-genome shotgun (WGS) DNA sequencing was performed using paired-end long sequencing with PacBio RS (Pacific Biosciences). Genomic DNA was extracted from a pure culture grown overnight on LB medium using the PureLink genomic DNA kit (Life Technologies). An approximately 3- to 20-kb-insert PacBio library was constructed and sequenced on eight single-molecule real-time (SMRT) cells, yielding 116,269 reads (average length, 7,446 bp), which were assembled into 18 contigs, with 142.11x coverage and an N50 value of 20.548 bp (BioProject number PRJNA288757). The assembled data were analyzed by Rapid Annotations using Subsystems Technology (RAST) (4); the genome size was found to be 7,055,077 bp, comprising 6167 open reading frames (ORFs) and 413 subsystems. The G+C content was estimated to be 72 mol%.
The closest-neighbors tool, available in RAST through functional comparison of the genome, revealed that strain CMAA1322 is most closely related to Streptomyces hygroscopicus ATCC 53653 (similarity score value, 537), S. violaceusniger Tu 4113 (score value, 483), S. avermitilis MA-4680 (score value, 475), and S. albus J1074 (score value, 447). The Streptomyces sp. CMAA1322 genome contains 98 tRNA genes and 135 gene copies related to stress response, mainly osmotic stress (14), heat shock (16), and oxidative stress (49). Functional annotation by antiSMASH version 3.0 (5) identified 41 clusters for secondary metabolites (including two clusters for lanthipeptides, ten clusters for nonribosomal peptide synthetases [NRPS], three clusters for siderophores, fourteen for polyketide synthases [PKS], six clusters encoding a terpene, two clusters encoding a bacteriocin, and one cluster encoding a phenazine). Our work provides a comparative analysis of the genome and of the extract produced (data not published) by strain CMAA1322, revealing the potential of microorganisms from extreme environments such as the Caatinga to produce a wide range of biotechnologically relevant compounds.
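The N50 statistic quoted for the assembly can be illustrated with a minimal Python sketch. It assumes the usual definition (the contig length at which the longest contigs, taken in descending order, first cover at least half of the total assembly length). The contig lengths below are made-up toy values, not the CMAA1322 assembly.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    Contigs are taken from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of the
    total assembly length.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Hypothetical contig lengths (bp), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)  # total is 300; 80 + 70 reaches 150, so N50 is 70
```

A larger N50 generally indicates a more contiguous assembly, which is why it is commonly reported next to contig count and coverage.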

Keywords: caatinga, streptomyces, environmental stresses, biosynthetic pathways

Procedia PDF Downloads 242

561 A Local Tensor Clustering Algorithm to Annotate Uncharacterized Genes with Many Biological Networks

Authors: Paul Shize Li, Frank Alber

Abstract:

A fundamental task of clinical genomics is to unravel the functions of genes and their associations with disorders. Although experimental biology has made efforts to discover and elucidate the molecular mechanisms of individual genes over the past decades, about 40% of human genes still have unknown functions, not to mention the diseases they may be related to. For biologists who are interested in a particular gene with unknown functions, a powerful computational method tailored for inferring the functions and disease relevance of uncharacterized genes is strongly needed. Studies have shown that genes strongly linked to each other in multiple biological networks are more likely to have similar functions. This indicates that the densely connected subgraphs in multiple biological networks are useful in the functional and phenotypic annotation of uncharacterized genes. Therefore, in this work, we have developed an integrative network approach to identify frequent local clusters, which are defined as densely connected subgraphs that occur frequently in multiple biological networks and contain the query gene, which has few or no disease or function annotations. This is a local clustering algorithm that models multiple biological networks sharing the same gene set as a three-dimensional matrix, the so-called tensor, and employs a tensor-based optimization method to efficiently find the frequent local clusters. Specifically, massive public gene expression data sets that comprehensively cover dynamic, physiological, and environmental conditions are used to generate hundreds of gene co-expression networks. By integrating these gene co-expression networks, for a given uncharacterized gene of interest to a biologist, the proposed method can be applied to identify the frequent local clusters that contain this uncharacterized gene. Finally, those frequent local clusters are used for function and disease annotation of this uncharacterized gene.
This local tensor clustering algorithm outperformed the competing tensor-based algorithm in both module discovery and running time. We also demonstrated the use of the proposed method on real data comprising hundreds of gene co-expression networks and showed that it can comprehensively characterize the query gene. Therefore, this study provides a new tool for annotating uncharacterized genes and has great potential to assist clinical genomic diagnostics.
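The notion of a "frequent local cluster" can be made concrete with a small Python sketch. This is not the authors' tensor-based optimization; it only illustrates, under assumed definitions, how networks over a shared gene set stack into a three-dimensional array and how one might count the networks in which a candidate gene set containing the query gene is densely connected. The helper names, the density threshold, and the toy networks are all hypothetical.

```python
from itertools import combinations


def adjacency(n_genes, edges):
    """Build a symmetric 0/1 adjacency matrix from an edge list."""
    adj = [[0] * n_genes for _ in range(n_genes)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1
    return adj


def subgraph_density(adj, genes):
    """Fraction of possible gene pairs in `genes` that are linked in `adj`."""
    pairs = list(combinations(genes, 2))
    if not pairs:
        return 0.0
    return sum(adj[i][j] for i, j in pairs) / len(pairs)


def recurrence(tensor, genes, min_density=0.8):
    """Count the networks (tensor slices) in which `genes` is dense."""
    return sum(1 for adj in tensor if subgraph_density(adj, genes) >= min_density)


# Three toy co-expression networks over the same four genes; gene 0 plays
# the role of the query gene. tensor[k][i][j] is the edge (i, j) in network k.
tensor = [
    adjacency(4, [(0, 1), (0, 2), (1, 2)]),
    adjacency(4, [(0, 1), (0, 2), (1, 2), (2, 3)]),
    adjacency(4, [(0, 1)]),
]
candidate = (0, 1, 2)
support = recurrence(tensor, candidate)  # dense in networks 0 and 1 only
```

A real method has to search over candidate gene sets instead of scoring a given one, which is where the optimization over the tensor comes in.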

Keywords: local tensor clustering, query gene, gene co-expression network, gene annotation

Procedia PDF Downloads 168

560 Establishing a Computational Screening Framework to Identify Environmental Exposures Using Untargeted Gas-Chromatography High-Resolution Mass Spectrometry

Authors: Juni C. Kim, Anna R. Robuck, Douglas I. Walker

Abstract:

The human exposome, which includes chemical exposures over the lifetime and their effects, is now recognized as an important measure for understanding human health; however, the complexity of the data makes the identification of environmental chemicals challenging. The goal of our project was to establish a computational workflow for the improved identification of environmental pollutants containing chlorine or bromine. Using the “pattern.search” function available in the R package NonTarget, we wrote a multifunctional script that searches mass spectral clusters from untargeted gas-chromatography high-resolution mass spectrometry (GC-HRMS) for the presence of spectra consistent with chlorine- and bromine-containing organic compounds. The “pattern.search” function was incorporated into a different function that allows the evaluation of clusters containing multiple analyte fragments, has multi-core support, and provides a simplified output listing compounds containing chlorine and/or bromine. The new function was able to process 46,000 spectral clusters in under 8 seconds and identified over 150 potential halogenated spectra. We next applied our function to a deidentified dataset from patients diagnosed with primary biliary cholangitis (PBC), primary sclerosing cholangitis (PSC), and healthy controls. Twenty-two spectra corresponded to potential halogenated compounds in the PSC and PBC dataset, including six that were significantly different in PBC patients and four that differed in PSC patients. We have developed an improved algorithm for detecting halogenated compounds in GC-HRMS data, providing a strategy for prioritizing exposures in the study of human disease.
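The idea behind screening spectra for chlorine or bromine can be sketched in Python. This is not the `pattern.search` implementation and not the authors' script; it is a simplified illustration that assumes a single halogen atom and approximate natural isotope abundances (an M+2 peak at roughly 32% of M for one chlorine and roughly 97% for one bromine, about 1.9975 Da above M). The function name, tolerances, and toy peak lists are hypothetical.

```python
M2_DELTA = 1.9975  # approx. mass gap (Da) for 35Cl/37Cl and 79Br/81Br
CL_RATIO = 0.32    # approx. M+2 / M intensity ratio for one chlorine atom
BR_RATIO = 0.97    # approx. M+2 / M intensity ratio for one bromine atom


def halogen_hints(peaks, mz_tol=0.005, ratio_tol=0.1):
    """Return {"Cl"}, {"Br"}, both, or an empty set for a list of
    (m/z, intensity) peaks, based on M / M+2 isotope pairs."""
    hints = set()
    for mz_a, int_a in peaks:
        if int_a <= 0:
            continue
        for mz_b, int_b in peaks:
            # Only consider peak pairs separated by the M+2 mass gap.
            if abs((mz_b - mz_a) - M2_DELTA) > mz_tol:
                continue
            ratio = int_b / int_a
            if abs(ratio - CL_RATIO) <= ratio_tol:
                hints.add("Cl")
            elif abs(ratio - BR_RATIO) <= ratio_tol:
                hints.add("Br")
    return hints


# Toy spectra (m/z, relative intensity), for illustration only.
chloro_like = [(112.0080, 100.0), (114.0050, 32.0)]
bromo_like = [(155.9570, 100.0), (157.9550, 97.0)]
no_halogen = [(100.0000, 100.0), (101.0034, 6.5)]
```

Compounds with several halogen atoms show more complex patterns (for example M, M+2, M+4), which this sketch deliberately ignores.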

Keywords: exposome, metabolome, computational metabolomics, high-resolution mass spectrometry, exposure, pollutants

Procedia PDF Downloads 138

559 Developing a Cultural Policy Framework for Small Towns and Cities

Authors: Raymond Ndhlovu, Jen Snowball

Abstract:

It has long been known that the Cultural and Creative Industries (CCIs) have the potential to aid the physical, social and economic renewal and regeneration of towns and cities, hence their importance in regional development. The CCIs can act as a catalyst for activity and investment in an area because the ‘consumption’ of cultural activities leads to the use of other, non-cultural activities, for example hospitality, including restaurants and bars, as well as public transport. ‘Consumption’ of cultural activities also leads to employment creation and diversification. However, CCIs tend to be clustered, especially around large cities. There is, moreover, a case for the development of CCIs around smaller towns and cities, because they do not rely on high-technology inputs or long supply chains, and their direct link to rural and isolated places makes them vital to regional development. However, there is currently little research on how to craft cultural policy for regions with smaller towns and cities. Using the Sarah Baartman District (SBDM) in South Africa as an example, this paper describes the process of developing cultural policy for a region that has potential and existing cultural clusters, but currently no single, coherent policy relating to CCI development. The SBDM was chosen as a case study because it has no large cities but has some CCI clusters, and has identified them as potential drivers of local economic development. The process of developing cultural policy is discussed in stages: identification of the resources present, including human resources and soft and hard infrastructure; identification of clusters; analysis of CCI labour markets and ownership patterns; opportunities and challenges from the point of view of CCIs and other key stakeholders; alignment of regional policy aims with provincial and national policy objectives; and finally, design and implementation of a regional cultural policy.

Keywords: cultural and creative industries, economic impact, intrinsic value, regional development

Procedia PDF Downloads 233

558 Neural Network Approach For Clustering Host Community: Based on Perceptions Toward Tourism, Their Satisfaction Level and Demographic Attributes in Iran (Lahijan)

Authors: Nasibeh Mohammadpour, Ali Rajabzadeh, Adel Azar, Hamid Zargham Borujeni

Abstract:

Generally, the development of an industry depends on the support of its stakeholders and beneficiaries. Among the most important stakeholders in the tourism industry (which has become one of the most lucrative and employment-generating activities at the international level) are the host communities in tourist destinations, which both affect and are affected by the industry's development. Recognizing the host community and its segments can be important for gaining its support for future decisions and policy making. To identify these segments, this study clusters residents using tools that are designed to cope with human complexity and can model and generalize complex systems without needing initial cluster seeds, as classic methods do. Neural networks can meet these expectations. The research was designed to build a neural network-based mathematical model that clusters the host community effectively according to multiple criteria and identifies differences among segments. To achieve this goal, residents were segmented by demographic characteristics, their attitude towards tourism development, their level of satisfaction, and the type of support they give in this field. The applied method is a self-organizing neural network, and the results were compared with K-means. As the results show, the Self-Organizing Map (SOM) method provides much better results in terms of the cophenetic correlation and between-cluster variance coefficients. Based on these criteria, the host community is divided into five segments with unique and distinctive features, which is the best configuration (in comparison with other configurations), with a cophenetic correlation coefficient of 0.8769 and a between-cluster variance of 0.1412.
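The core update rule of a self-organizing map can be shown in a few lines of Python. The sketch below is a generic one-dimensional SOM under simplifying assumptions (linearly decaying learning rate and neighborhood radius, deterministic initialization from the data), not the model used in the study. All parameter values and the toy resident data are hypothetical.

```python
import math


def bmu(weights, x):
    """Index of the best matching unit: the unit whose weights are closest to x."""
    return min(
        range(len(weights)),
        key=lambda k: sum((w - v) ** 2 for w, v in zip(weights[k], x)),
    )


def train_som(data, n_units, epochs=30, lr0=0.5, radius0=0.5):
    """Train a 1-D SOM; units start on samples spread evenly through `data`."""
    step = (len(data) - 1) // max(n_units - 1, 1)
    weights = [list(data[k * step]) for k in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs            # decays from 1 towards 0
        lr = lr0 * frac
        radius = max(radius0 * frac, 1e-3)
        for x in data:
            win = bmu(weights, x)
            for k in range(n_units):
                # Gaussian neighborhood: units near the winner on the map
                # are pulled towards x more strongly than distant ones.
                h = math.exp(-((k - win) ** 2) / (2 * radius ** 2))
                weights[k] = [w + lr * h * (v - w) for w, v in zip(weights[k], x)]
    return weights


# Toy data: two well-separated groups of (attitude, satisfaction) scores.
residents = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
             (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
som = train_som(residents, n_units=2)
segments = [bmu(som, r) for r in residents]
```

Each resident is assigned to the segment of its best matching unit. The map starts from the data themselves, so no cluster seeds have to be supplied by the analyst.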

Keywords: Artificial Neural Network, Clustering, Resident, SOM, Tourism

Procedia PDF Downloads 183

557 Topological Analysis of Hydrogen Bonds in Pyruvic Acid-Water Mixtures

Authors: Ferid Hammami

Abstract:

The molecular geometries of the possible conformations of pyruvic acid-water complexes (PA-(H₂O)ₙ, n = 1-4) have been fully optimized at the DFT/B3LYP/6-311++G(d,p) level of calculation. Among several optimized molecular clusters, the most stable molecular arrangements obtained when one, two, three, and four water molecules are hydrogen-bonded to a central pyruvic acid molecule are presented in this paper. Appropriate topological and geometrical parameters are considered as primary indicators of H-bond strength. Atoms in molecules (AIM) analysis shows that pyruvic acid can form a ring structure with water, and the molecular structures are stabilized by both strong O-H...O and C-H...O hydrogen bonds. In large clusters, classical O-H...O hydrogen bonds still exist between water molecules, and a cage-like structure is built around some parts of the central pyruvic acid molecule. The molecular electrostatic potential (MEP) map and the HOMO-LUMO (highest occupied molecular orbital-lowest unoccupied molecular orbital) analysis have been performed for all considered complexes.

Keywords: pyruvic acid, PA-water complex, hydrogen bonding, DFT, AIM, MEP, HOMO-LUMO

Procedia PDF Downloads 214

556 Students’ Perception and Patterns of Listening Behaviour in an Online Forum Discussion

Authors: K. L. Wong, I. N. Umar

Abstract:

An online forum is part of a Learning Management System (LMS) environment in which students share opinions. This study investigates students' perceptions of the online forum and their patterns of listening behaviour during forum interaction. The students' perceptions were measured using a questionnaire covering seven dimensions: online experience, benefits of forum participation, cost of participation, perceived ease of use, usefulness, attitude and intention. Their patterns of listening behaviour were obtained from the log file extracted from the LMS. A total of 25 postgraduate students undertaking a course were involved in this study, and their activities in the forum session were recorded by the LMS and used as a log file. The results of the questionnaire analysis indicated that the students perceived the forum as easy to use, useful, and beneficial to them. They also showed a positive attitude towards the online forum and intend to use it in the future. Based on the log data, the participants were divided into six clusters of listening behaviour, which differ in terms of temporality, breadth, depth and speaking level. The findings were compared with previous cluster groupings, and future recommendations are also discussed.

Keywords: e-learning, learning management system, listening behavior, online forum

Procedia PDF Downloads 432

555 A Weighted K-Medoids Clustering Algorithm for Effective Stability in Vehicular Ad Hoc Networks

Authors: Rejab Hajlaoui, Tarek Moulahi, Hervé Guyennet

Abstract:

In a highway scenario, vehicle speed can exceed 120 km/h. Therefore, any vehicle can enter or leave the network within a very short time. This mobility adversely affects network connectivity and decreases the lifetime of all established links. To ensure effective stability in vehicular ad hoc networks with minimal broadcast storms, we have developed a weighted algorithm based on the k-medoids clustering algorithm (WKCA). The number of clusters and the initial cluster heads are not selected randomly as usual, but by considering the available transmission range and the environment size. Then, to ensure optimal assignment of nodes to clusters in both k-medoids phases, the combined weight of each node is computed according to additional metrics including direction, relative speed and proximity. Empirical results show that, in addition to the convergence speed that characterizes the k-medoids algorithm, our proposed model performs well against both the AODV-Clustering and OLSR-Clustering protocols under different densities and velocities in terms of end-to-end delay, packet delivery ratio, and throughput.
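How a combined weight might steer the assignment of a vehicle to a cluster head is sketched below in Python. The abstract names the metrics (direction, relative speed, proximity) but not the formula, so the normalization, the coefficients, and every name in this block are assumptions made for illustration, not the WKCA definition.

```python
import math
from dataclasses import dataclass


@dataclass
class Node:
    x: float        # position along the road (m)
    y: float        # lateral position (m)
    speed: float    # km/h
    heading: float  # degrees


def combined_weight(node, head, tx_range=300.0, max_speed=120.0,
                    coeffs=(0.4, 0.3, 0.3)):
    """Hypothetical dissimilarity between a node and a candidate head.

    Each term is scaled to roughly [0, 1]; lower means a better match.
    """
    angle = abs(node.heading - head.heading) % 360
    direction = min(angle, 360 - angle) / 180
    rel_speed = abs(node.speed - head.speed) / max_speed
    proximity = math.hypot(node.x - head.x, node.y - head.y) / tx_range
    a, b, c = coeffs
    return a * direction + b * rel_speed + c * proximity


def assign(node, heads):
    """Index of the cluster head with the lowest combined weight."""
    return min(range(len(heads)), key=lambda k: combined_weight(node, heads[k]))


# Toy scenario: the nearer head drives in the opposite direction.
heads = [Node(0.0, 0.0, 100.0, 0.0), Node(50.0, 0.0, 100.0, 180.0)]
vehicle = Node(40.0, 0.0, 100.0, 0.0)
chosen = assign(vehicle, heads)  # head 0: same direction outweighs distance
```

With these assumed coefficients, a head moving in the same direction is preferred over a closer one moving the opposite way, which is the kind of link-stability effect the weighting is meant to capture.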

Keywords: communication, clustering algorithm, k-medoids, sensor, vehicular ad hoc network

Procedia PDF Downloads 238

554 Anomaly Detection Based Fuzzy K-Mode Clustering for Categorical Data

Authors: Murat Yazici

Abstract:

Anomalies are irregularities found in data that do not adhere to a well-defined standard of normal behavior. The identification of outliers or anomalies in data has been a subject of study within the statistics field since the 1800s. Over time, a variety of anomaly detection techniques have been developed in several research communities. Cluster analysis can be used to detect anomalies. It is the process of assigning data to clusters so that members of the same cluster are as similar as possible while members of different clusters are dissimilar. Many traditional clustering algorithms have limitations in dealing with data sets containing categorical properties. To detect anomalies in categorical data, a fuzzy clustering approach can be used to advantage. The fuzzy k-mode (FKM) clustering algorithm, a fuzzy clustering approach that extends the k-means algorithm, has been reported for clustering data sets with categorical values. It is a form of soft clustering: each point can be associated with more than one cluster. In this paper, anomaly detection is performed on two simulated data sets using the FKM clustering algorithm. A significant contribution of the study is that, in contrast to numerous anomaly detection algorithms, the FKM clustering algorithm allows anomalies to be determined together with their degree of abnormality. According to the results, the FKM clustering algorithm performed well in anomaly detection on data containing one anomaly as well as data containing more than one anomaly.
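The fuzzy membership step for categorical data can be sketched in Python as follows. It uses the simple matching dissimilarity (number of mismatched attributes) and the standard fuzzy membership formula with fuzziness exponent m. The `abnormality` score at the end is a hypothetical illustration of how memberships could be turned into a degree of abnormality; the abstract does not specify how the authors compute theirs.

```python
def mismatch(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(1 for u, v in zip(a, b) if u != v)


def memberships(record, modes, m=2.0):
    """Fuzzy membership of `record` in each cluster, given the cluster modes."""
    d = [mismatch(record, mode) for mode in modes]
    if 0 in d:
        # The record coincides with at least one mode: full membership there.
        hits = [1.0 if x == 0 else 0.0 for x in d]
        return [h / sum(hits) for h in hits]
    p = 1.0 / (m - 1.0)
    return [1.0 / sum((d[i] / d[k]) ** p for k in range(len(d)))
            for i in range(len(d))]


def abnormality(record, modes, m=2.0):
    """Hypothetical score: 1 minus the largest membership."""
    return 1.0 - max(memberships(record, modes, m))


# Toy cluster modes and records, for illustration only.
modes = [("red", "small", "round"), ("blue", "large", "square")]
typical = ("red", "small", "square")   # one mismatch with mode 0, two with mode 1
odd = ("green", "medium", "oval")      # matches neither mode on any attribute
u_typical = memberships(typical, modes)
u_odd = memberships(odd, modes)
```

A record that sits close to one mode gets a dominant membership, while a record that is equally far from every mode gets evenly spread memberships and therefore a higher score under this assumed measure.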

Keywords: fuzzy k-mode clustering, anomaly detection, noise, categorical data

Procedia PDF Downloads 53

553 A Literature Review on the Effect of Industrial Clusters and the Absorptive Capacity on Innovation

Authors: Enrique Claver Cortés, Bartolomé Marco Lajara, Eduardo Sánchez García, Pedro Seva Larrosa, Encarnación Manresa Marhuenda, Lorena Ruiz Fernández, Esther Poveda Pareja

Abstract:

In recent decades, the analysis of the effects of clustering as an essential factor for the development of innovations and the competitiveness of enterprises has attracted great interest in different areas. Nowadays, companies have access to almost all tangible and intangible resources located and/or developed in any country in the world. However, despite the obvious advantages that this situation entails for companies, their geographical location has shown itself, increasingly clearly, to be a fundamental factor that positively influences their innovative performance and competitiveness. Industrial clusters could represent a unique level of analysis, positioned between the individual company and the industry, which makes them an ideal unit of analysis to determine the effects derived from company membership of a cluster. Also, the absorptive capacity (hereinafter 'AC') can mediate the process of innovation development by companies located in a cluster. The transformation and exploitation of knowledge could have a mediating effect between knowledge acquisition and innovative performance. The main objective of this work is to determine the key factors that affect the degree of generation and use of knowledge from the environment by companies and, consequently, their innovative performance and competitiveness. The elements analyzed are the companies' membership of a cluster and the AC. To this end, the 30 most relevant papers published on this subject in the "Web of Science" database have been reviewed. Our findings show that, within a cluster, the knowledge coming from the companies' environment can significantly influence their innovative performance and competitiveness, although in this relationship the degree to which companies access and exploit this knowledge plays a fundamental role, which depends on a series of elements both internal and external to the company.

Keywords: absorptive capacity, clusters, innovation, knowledge

Procedia PDF Downloads 131

552 Enhanced Cluster Based Connectivity Maintenance in Vehicular Ad Hoc Network

Authors: Manverpreet Kaur, Amarpreet Singh

Abstract:

The demand for vehicular ad hoc networks (VANETs) is increasing day by day because of the various applications and benefits they offer to users. Clustering is essential for overcoming the connectivity problems of VANETs. In this paper, we propose a new clustering technique, enhanced cluster-based connectivity maintenance in vehicular ad hoc networks. Our objective is to form long-lived clusters. The proposed approach groups vehicles into clusters on the basis of the longest list of neighbors. Cluster formation and cluster head selection are performed by the roadside unit (RSU), which may reduce the overhead on the network. In the cluster head selection procedure, the RSU elects as cluster head the vehicle whose speed is closest to the average speed; if two vehicles have the same speed closest to the average, they are compared on a new parameter, the distance to their respective destinations, and the vehicle with the largest distance to its destination is chosen as cluster head by the RSU. Our simulation results show that our technique performs better than the existing technique.
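The cluster head selection rule described above (speed closest to the average, ties broken by the largest distance to destination) can be written as a short Python sketch. The `Vehicle` fields and the toy values are hypothetical, and the sketch treats any two vehicles with equal deviation from the average speed as tied, which is a slight generalization of the rule as stated.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    vid: str
    speed: float                    # km/h
    distance_to_destination: float  # km


def elect_cluster_head(vehicles):
    """Pick the vehicle whose speed is closest to the cluster average.

    Ties are broken in favor of the vehicle that still has the longest
    way to go, since it is expected to stay in the cluster longer.
    """
    avg = sum(v.speed for v in vehicles) / len(vehicles)
    return min(
        vehicles,
        key=lambda v: (abs(v.speed - avg), -v.distance_to_destination),
    )


# Toy cluster: B and C both drive at the average speed of 90 km/h.
cluster = [
    Vehicle("A", 80.0, 20.0),
    Vehicle("B", 90.0, 5.0),
    Vehicle("C", 90.0, 12.0),
    Vehicle("D", 100.0, 30.0),
]
head = elect_cluster_head(cluster)  # C: same speed as B, farther destination
```

In the scheme described, this decision would be taken by the RSU, which is assumed here to know each vehicle's speed and destination distance.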

Keywords: VANETs, clustering, connectivity, cluster head, intelligent transportation system (ITS)

Procedia PDF Downloads 247

551 A Construction Management Tool: Determining a Project Schedule Typical Behaviors Using Cluster Analysis

Authors: Natalia Rudeli, Elisabeth Viles, Adrian Santilli

Abstract:

Delays in the construction industry are a global phenomenon. Many construction projects experience extensive delays exceeding the initially estimated completion time. The main purpose of this study is to identify the typical behaviors of construction projects in order to develop a prognosis and management tool. Knowing a construction project's schedule tendency enables evidence-based decision-making, so that resolutions can be made before delays occur. This study presents an innovative approach that uses the cluster analysis method to support predictions during Earned Value analyses. A cluster analysis was used to predict the future behavior of scheduling and of the principal Earned Value Management (EVM) and Earned Schedule (ES) indexes in construction projects. The analysis was made using a database of 90 different construction projects. It was validated with additional data extracted from the literature and with another 15 contrasting projects. For all projects, planned and executed schedules were collected and the principal EVM and ES indexes were calculated. A complete linkage classification method was used. In this method, the distance (or similarity) between two clusters is measured by their most disparate elements, i.e. the distance is given by the maximum span among their components. Finally, through the use of the EVM and ES indexes and Tukey and Fisher pairwise comparisons, the statistical dissimilarity was verified and four clusters were obtained. Construction projects show an average delay of 35% of their planned completion time. Furthermore, four typical behaviors were found, and for each of the obtained clusters, the interim milestones and the necessary rhythms of construction were identified.
In general, the detected typical behaviors are: (1) projects that complete 5% of the work in the first two tenths and maintain a constant rhythm until completion (greater than 10% for each remaining tenth), finishing within the initially estimated time; (2) projects that start with an adequate construction rate but suffer minor delays, culminating in a total delay of almost 27% of the planned time; (3) projects that start with a performance below the planned rate and end up with an average delay of 64%; and (4) projects that begin with poor performance, suffer great delays and end up with an average delay of 120% of the planned completion time. The obtained clusters compose a tool to identify the behavior of new construction projects by comparing their current work performance to the validated database, thus allowing the correction of initial estimations towards more accurate completion schedules.
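The complete linkage criterion mentioned above, where the distance between two clusters is the distance between their most disparate members, can be illustrated with a naive agglomerative sketch in Python. This is a generic textbook procedure, not the study's pipeline. The one-dimensional "delay fraction" values are hypothetical and are not taken from the project database.

```python
import math


def complete_linkage(c1, c2):
    """Distance between clusters = largest pairwise distance between members."""
    return max(math.dist(a, b) for a in c1 for b in c2)


def agglomerate(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: complete_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] += clusters.pop(j)
    return clusters


# Hypothetical final delays of six projects, as a fraction of planned time.
delays = [(0.0,), (0.05,), (0.27,), (0.30,), (0.64,), (1.20,)]
groups = agglomerate(delays, n_clusters=4)
```

Complete linkage tends to produce compact groups, because a merge is only cheap when even the farthest members of the two clusters are close to each other.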

Keywords: cluster analysis, construction management, earned value, schedule

Procedia PDF Downloads 265

550 Bioinformatics Identification of Rare Codon Clusters in Proteins Structure of HBV

Authors: Abdorrasoul Malekpour, Mohammad Ghorbani, Mojtaba Mortazavi, Mohammadreza Fattahi, Mohammad Hassan Meshkibaf, Ali Fakhrzad, Saeid Salehi, Saeideh Zahedi, Amir Ahmadimoghaddam, Parviz Farzadnia Dr., Mohammadreza Hajyani Asl Bs

Abstract:

Hepatitis B virus (HBV), the agent of an infectious disease, has eight main genotypes (A–H). The aim of this study is to bioinformatically identify rare codon clusters (RCCs) in the protein structures of HBV. To obtain the protein family (Pfam) accession numbers of HBV proteins, the UniProt database and the Pfam search tool were used. The obtained Pfam IDs were analyzed in the Sherlocc program, and RCCs in HBV proteins were detected. Further, the structures of the TrEMBL entry proteins were studied in the PDB database, and the 3D structures of the HBV proteins and the locations of RCCs were visualized and studied using Swiss PDB Viewer software. The Pfam search tool found nine significant hits and 0 insignificant hits in 3 frames. Analysis of the Pfam results in the Sherlocc program showed that it did not identify RCCs in the external core antigen (PF08290) or the truncated HBeAg protein (PF08290). By contrast, RCCs were identified in the hepatitis core antigen (PF00906), large envelope protein S (PF00695), X protein (PF00739), DNA polymerase (viral) N-terminal domain (PF00242) and protein P (PF00336). In the HBV genome, seven RCCs were identified, found in the hepatitis core antigen, large envelope protein S and DNA polymerase proteins; the protein structures of the TrEMBL entry sequences reported in the Sherlocc program outputs are not complete. Based on the location of the RCCs in the structure of HBV proteins, it is suggested that these RCCs are important in the HBV life cycle. We hope that this study provides a new and deep perspective for protein research and drug design for the treatment of HBV.

Keywords: rare codon clusters, hepatitis B virus, bioinformatic study, infectious disease

Procedia PDF Downloads 488

549 Enhancement of Density-Based Spatial Clustering Algorithm with Noise for Fire Risk Assessment and Warning in Metro Manila

Authors: Pinky Mae O. De Leon, Franchezka S. P. Flores

Abstract:

This study focuses on applying an enhanced density-based spatial clustering algorithm with noise (DBSCAN) to fire risk assessment and warning in Metro Manila. Unlike other clustering algorithms, DBSCAN is known for its ability to identify arbitrarily shaped clusters and for its resistance to noise. However, its performance diminishes when handling high-dimensional data, where it can read noise points as relevant data points. The algorithm also depends on the parameters (eps and minPts) set by the user; choosing the wrong parameters can greatly affect the clustering result. To overcome these challenges, the study proposes three key enhancements: first, using multiple MinHash and locality-sensitive hashing to decrease the dimensionality of the data set; second, applying Jaccard similarity before the parameter Epsilon to ensure that only similar data points are considered neighbors; and third, using the concept of a Jaccard neighborhood along with the parameter MinPts to improve the classification of core points and the identification of noise in the data set. The results show that the modified DBSCAN algorithm outperformed three other clustering methods: it produced fewer outliers, which facilitated a clearer identification of fire-prone areas; it achieved a high Silhouette score, indicating well-separated clusters that distinctly identify areas with potential fire hazards; and it achieved a low Davies-Bouldin Index and a high Calinski-Harabasz score, highlighting its ability to form compact and well-defined clusters, making it an effective tool for assessing fire hazard zones. This study is intended for assessing areas in Metro Manila that are most prone to fire risk.
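The building blocks named in the enhancements (Jaccard similarity, MinHash signatures, and a similarity-based neighborhood used to test for core points) can be sketched in Python with the standard library. How the study combines them with Epsilon is not detailed in the abstract, so the functions below, their names, the thresholds, and the toy incident records are assumptions for illustration only.

```python
import hashlib


def jaccard(a, b):
    """Jaccard similarity of two sets: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def minhash_signature(items, num_hashes=64):
    """MinHash signature: for each seeded hash, keep the minimum over items."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching signature slots approximates Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


def jaccard_neighbors(records, idx, min_similarity):
    """Indices of records at least `min_similarity` similar to records[idx]."""
    return [j for j, r in enumerate(records)
            if j != idx and jaccard(records[idx], r) >= min_similarity]


def is_core(records, idx, min_similarity, min_pts):
    """Core point test using the Jaccard neighborhood instead of a radius."""
    return len(jaccard_neighbors(records, idx, min_similarity)) >= min_pts


# Hypothetical categorical descriptions of three fire incidents.
incidents = [
    {"residential", "night", "electrical"},
    {"residential", "night", "cooking"},
    {"industrial", "day", "chemical"},
]
neighbors_of_first = jaccard_neighbors(incidents, 0, min_similarity=0.5)
```

MinHash signatures have a fixed length regardless of how many attributes a record carries, which is what makes them useful for reducing dimensionality before neighbors are compared.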

Keywords: DBSCAN, clustering, Jaccard similarity, MinHash LSH, fires

Procedia PDF Downloads 1

548 Quantifying User-Related, System-Related, and Context-Related Patterns of Smartphone Use

Authors: Andrew T. Hendrickson, Liven De Marez, Marijn Martens, Gytha Muller, Tudor Paisa, Koen Ponnet, Catherine Schweizer, Megan Van Meer, Mariek Vanden Abeele

Abstract:

Quantifying and understanding the myriad ways people use their phones and how that impacts their relationships, cognitive abilities, mental health, and well-being is increasingly important in our phone-centric society. However, most studies on the patterns of phone use have focused on theory-driven tests of specific usage hypotheses using self-report questionnaires or analyses of smaller datasets. In this work we present a series of analyses from a large corpus of over 3000 users that combine data-driven and theory-driven analyses to identify reliable smartphone usage patterns and clusters of similar users. Furthermore, we compare the stability of user clusters across user- and system-initiated sessions, as well as during the hypothesized ritualized behavior times directly before and after sleeping. Our results indicate support for some hypothesized usage patterns but present a more complete and nuanced view of how people use smartphones.

Keywords: data mining, experience sampling, smartphone usage, health and well being

Procedia PDF Downloads 163

547 The Impact of the Covid-19 Crisis on the Information Behavior in the B2B Buying Process

Authors: Stehr Melanie

Abstract:

The availability of apposite information is essential for the decision-making process of organizational buyers. Due to the constraints of the Covid-19 crisis, information channels that emphasize face-to-face contact (e.g. sales visits, trade shows) have been unavailable, and usage of digitally-driven information channels (e.g. videoconferencing, platforms) has skyrocketed. This paper explores the question in which areas the pandemic induced shift in the use of information channels could be sustainable and in which areas it is a temporary phenomenon. While information and buying behavior in B2C purchases has been regularly studied in the last decade, the last fundamental model of organizational buying behavior in B2B was introduced by Johnston and Lewin (1996) in times before the advent of the internet. Subsequently, research efforts in B2B marketing shifted from organizational buyers and their decision and information behavior to the business relationships between sellers and buyers. This study builds on the extensive literature on situational factors influencing organizational buying and information behavior and uses the economics of information theory as a theoretical framework. The research focuses on the German woodworking industry, which before the Covid-19 crisis was characterized by a rather low level of digitization of information channels. By focusing on an industry with traditional communication structures, a shift in information behavior induced by an exogenous shock is considered a ripe research setting. The study is exploratory in nature. The primary data source is 40 in-depth interviews based on the repertory-grid method. Thus, 120 typical buying situations in the woodworking industry and the information and channels relevant to them are identified. The results are combined into clusters, each of which shows similar information behavior in the procurement process. 
In the next step, the clusters are analyzed in terms of pre- and post-Covid-19 crisis behavior, identifying stable and dynamic aspects of information behavior. Initial results show that, for example, clusters representing search goods with low risk and complexity suggest a sustainable rise in the use of digitally-driven information channels. However, in clusters containing trust goods with high significance and novelty, an increased return to face-to-face information channels can be expected after the Covid-19 crisis. The results are interesting from both a scientific and a practical point of view. This study is one of the first to apply the economics of information theory to organizational buyers and their decision and information behavior in the digital information age. In particular, the focus on the dynamic aspects of information behavior after an exogenous shock might contribute new impulses to theoretical debates related to the economics of information theory. For practitioners - especially suppliers’ marketing managers and intermediaries such as publishers or trade show organizers from the woodworking industry - the study shows wide-ranging starting points for a future-oriented segmentation of their marketing program by highlighting the dynamic and stable preferences of the elaborated clusters in the choice of their information channels.

Keywords: B2B buying process, crisis, economics of information theory, information channel

Procedia PDF Downloads 184

546 Detecting Local Clusters of Childhood Malnutrition in the Island Province of Marinduque, Philippines Using Spatial Scan Statistic

Authors: Novee Lor C. Leyso, Maylin C. Palatino

Abstract:

Under-five malnutrition continues to persist in the Philippines, particularly in the island Province of Marinduque, with the prevalence of some forms of malnutrition even worsening in recent years. Local spatial cluster detection provides a spatial perspective for understanding this phenomenon and is key to analyzing patterns of geographic variation, identifying community-appropriate programs and interventions, and focusing targeting on high-risk areas. Using data from a province-wide household-based census conducted in 2014–2016, this study aimed to determine and evaluate spatial clusters of under-five malnutrition across the province and within each municipality at the individual level using household location. Malnutrition was defined as a weight-for-age z-score that falls outside 2 standard deviations from the median of the WHO reference population. Kulldorff’s elliptical spatial scan statistic under the binomial model was used to locate clusters with a high risk of malnutrition, while adjusting for age and membership in the government conditional cash transfer program as a proxy for socio-economic status. One large significant cluster of under-five malnutrition was found in the southwest of the province; living in these areas at least doubles the risk of malnutrition. Additionally, at least one significant cluster was identified within each municipality, mostly located along the coastal areas. All these indicate apparent geographical variations across and within municipalities in the province. There were also similarities and disparities in the patterns of risk of malnutrition in each cluster across municipalities, and even within municipalities, suggesting underlying causes at work that warrant further investigation. Therefore, community-appropriate programs and interventions should be identified and should be focused on high-risk areas to maximize limited government resources. 
Further studies are also recommended to determine factors affecting variations in childhood malnutrition considering the evidence of spatial clustering found in this study.

Keywords: Binomial model, Kulldorff’s elliptical spatial scan statistic, Philippines, under-five malnutrition

Procedia PDF Downloads 140

545 Hybridized Approach for Distance Estimation Using K-Means Clustering

Authors: Ritu Vashistha, Jitender Kumar

Abstract:

Clustering with the K-means algorithm is a very common way to understand and analyze output data. Grouping similar objects together is the basis of clustering. There are K objects and C clusters, in which K is always supposed to be less than C, and each cluster has its own centroid; the major problem is how to identify whether a cluster is correct based on the data. Forming the clusters is not a regular task for every tuple, row, record, or entity; it is done by an iterative process. Each record, tuple, or entity is checked, and its similarity and dissimilarity are examined. This iterative process can be very lengthy and may fail to give optimal output, both in the clusters found and in the time taken to find them. To overcome this drawback, we propose a formula to find the clusters at run time, so that the approach can give optimal results. The proposed approach uses the Euclidean distance formula as well as melanosis to find the minimum distance between slots, which we technically call clusters. We have also applied the same approach to the Ant Colony Optimization (ACO) algorithm, which results in the production of two- and multi-dimensional matrices.
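
The iterative process this abstract refers to can be sketched as one pass of standard K-means: assign each record to the centroid at minimum Euclidean distance, then move each centroid to the mean of its members. This is the textbook procedure, not the authors' proposed formula; the function names and sample points are assumptions for illustration.

```python
import math


def nearest_centroid(point, centroids):
    """Index of the centroid with the smallest Euclidean distance to `point`."""
    distances = [math.dist(point, c) for c in centroids]
    return min(range(len(centroids)), key=distances.__getitem__)


def assign(points, centroids):
    """Assignment step: one cluster index per point."""
    return [nearest_centroid(p, centroids) for p in points]


def update(points, labels, centroids):
    """Update step: move each centroid to the mean of its assigned points."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            new_centroids.append(old)  # keep an empty cluster's centroid as is
            continue
        mean = tuple(sum(axis) / len(members) for axis in zip(*members))
        new_centroids.append(mean)
    return new_centroids


# Hypothetical two-dimensional records and starting centroids.
points = [(0.0, 0.0), (1.0, 1.0), (9.0, 9.0), (10.0, 8.0)]
centroids = [(0.0, 0.0), (10.0, 10.0)]
labels = assign(points, centroids)
centroids = update(points, labels, centroids)
```

Repeating the two steps until the labels stop changing gives the usual K-means loop; the cost of that repetition is the overhead the abstract aims to reduce.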

Keywords: ant colony optimization, data clustering, centroids, data mining, k-means

Procedia PDF Downloads 128

544 Geographic Legacies for Modern Day Disease Research: Autism Spectrum Disorder as a Case-Control Study

Authors: Rebecca Richards Steed, James Van Derslice, Ken Smith, Richard Medina, Amanda Bakian

Abstract:

Elucidating gene-environment interactions for heritable disease outcomes is an emerging area of disease research, with genetic studies informing hypotheses for environment and gene interactions underlying some of the most confounding diseases of our time, like autism spectrum disorder (ASD). Geography has thus far played a key role in identifying environmental factors contributing to disease, but its use can be broadened to include genetic and environmental factors that have a synergistic effect on disease. Through the use of family pedigrees and disease outcomes with life-course residential histories, space-time clustering of generations at critical developmental windows can provide further understanding of (1) environmental factors that contribute to disease patterns in families, (2) susceptible critical windows of development most impacted by environment, (3) and that are most likely to lead to an ASD diagnosis. This paper introduces a retrospective case-control study that utilizes pedigree data, health data, and residential life-course location points to find space-time clustering of ancestors with a grandchild/child with a clinical diagnosis of ASD. Finding space-time clusters of ancestors at critical developmental windows serves as a proxy for shared environmental exposures. The authors refer to geographic life-course exposures as geographic legacies. Identifying space-time clusters of ancestors creates a bridge for researching exposures of past generations that may impact modern-day progeny health. Results from the space-time cluster analysis show multiple clusters for the maternal and paternal pedigrees. The paternal grandparent pedigree resulted in the most space-time clustering for birth and childhood developmental windows. No statistically significant clustering was found for adolescent years. These results will be further studied to identify the specific share of space-time environmental exposures. 
In conclusion, this study has found significant space-time clusters of parents and grandparents for both maternal and paternal lineages. These results will be used to identify which environmental exposures have been shared by family members at critical developmental windows, and additional analysis will be applied.

Keywords: family pedigree, environmental exposure, geographic legacy, medical geography, transgenerational inheritance

Procedia PDF Downloads 116

543 Clustering Locations of Textile and Garment Industries to Compare with the Future Industrial Cluster in Thailand

Authors: Kanogkan Leerojanaprapa

Abstract:

The textile and garment industry used to be a major exporting industry of Thailand. Because the nation has lost price competitiveness, following the end of the EU's GSP (Generalised Scheme of Preferences) and the ‘Nationwide Minimum Wage Policy’ under which Thailand’s employers must pay all employees at least 300 baht (about $10) a day, the supply chains of the Thai textile and garment industry are affected and need to be reformed. Whether the Thai textile and garment industry will survive is therefore a concern. It is also a challenge for the government to decide which industries should be promoted as the future industries of Thailand. The Thai government recently launched the Cluster-based Special Economic Development Zones Policy for promoting business clusters (in effect from September 16, 2015). It defines a cluster as a concentration of interconnected businesses and related institutions that operate within the same geographic area. Textiles and garments is one of the target industrial clusters, and 9 provinces are targeted (Bangkok, Kanchanaburi, Nakhon Pathom, Ratchaburi, Samut Sakhon, Chonburi, Chachoengsao, Prachinburi, and Sa Kaeo). The cluster zone is defined to link a west-east corridor connecting manufacturing sources in Cambodia and Myanmar to Bangkok, which is promoted as a design, sourcing, and trading hub. The Thai government will provide tax and non-tax incentives for targeted industries within the clusters and expects these businesses to spread to where they can get the most benefit, which will identify the future industrial cluster. This research shows the difference between the current cluster and the future cluster defined by the target provinces for textiles and garments. The current cluster is analysed from secondary data. 
The numbers of plants in four categories (spinning, weaving and finishing of textiles; manufacture of made-up textile articles, except apparel; manufacture of knitted and crocheted fabrics; and manufacture of other textiles, not elsewhere classified) across 77 provinces in total are clustered by K-means cluster analysis and hierarchical cluster analysis. In addition, an ANOVA test confirms the clusters and shows which variables contribute the most to the defined cluster solution. The analysis groups the 22 provinces in which textile or garment plants are located into 3 clusters. Cluster 1 has a large number of plants and contains only Bangkok. Cluster 2 has a moderate number of plants and contains Samut Prakan, Samut Sakhon, and Nakhon Pathom. Cluster 3 has a small number of plants and contains the other 18 provinces. The same methodology can be implemented in other industries in future studies.

Keywords: ANOVA, hierarchical cluster analysis, industrial clusters, K-means cluster analysis, textile and garment industry

Procedia PDF Downloads 213

542 Exploring Strategies Used by Victims of Intimate Partner Violence to Increase Sense of Safety: A Systematic Review and Quantitative Study

Authors: Thomas Nally, Jane Ireland, Roxanne Khan, Philip Birch

Abstract:

Intimate Partner Violence (IPV), a significant societal problem, affects individuals worldwide. However, the strategies victims use to keep safe are under-researched. IPV is significantly under-reported, and services often cannot be accessed by all victims. Victims are thus likely to use their own strategies to manage their victimization before being able to seek support. Two studies were completed to understand these strategies: a systematic review of the literature and a study with professionals who work with victims. In study 1, the papers identified by a systematic review of the literature (n=61 papers) were analyzed using Thematic Analysis. The results indicated that victims use a large array of behaviors to increase their sense of safety and to cope with emotions but also experience significant barriers to help-seeking. In study 2, sixty-nine professionals completed a measure exploring the likelihood and effectiveness of various victim strategies for increasing their sense of safety. Strategies included in the measure were obtained from those identified in study 1. Findings indicated that professionals perceived victims of IPV to be more likely to employ safety strategies and coping behaviors that may be ineffective, but not help-seeking behaviors. Further, the responses were analyzed using Cluster Analysis. Safety strategies resulted in five clusters: perpetrator-directed strategies, prevention strategies, cognitive reappraisal, safety planning and avoidance strategies. Help-seeking resulted in six clusters: information or practical support, abuse-related support, emotional support, secondary support and informal support. Finally, coping resulted in four clusters: emotional coping, self-directed coping, thought recording/change and cognitive coping. Both studies indicate that victims may use a variety of strategies to manage their safety besides seeking help. 
Professionals working with victims, using a strength-based approach, should understand what is used and is effective for victims who are unable to leave the relationships or access external support.

Keywords: intimate partner violence, help-seeking, professional support, victims, victim coping, victim safety

Procedia PDF Downloads 186

541 Spatial Distribution and Cluster Analysis of Sexual Risk Behaviors and STIs Reported by Chinese Adults in Guangzhou, China: A Representative Population-Based Study

Authors: Fangjing Zhou, Wen Chen, Brian J. Hall, Yu Wang, Carl Latkin, Li Ling, Joseph D. Tucker

Abstract:

Background: Economic and social reforms designed to open China to the world have been successful but also appear to have rapidly laid the foundation for the reemergence of STIs since the 1980s. Changes in sexual behaviors, relationships, and norms among Chinese people contributed to the STI epidemic. As massive populations moved during the last 30 years, early coital debut, multiple sexual partnerships, and unprotected sex have increased within the general population. Our objectives were to assess associations between residence location, sexual risk behaviors and sexually transmitted infections (STIs) among adults living in Guangzhou, China. Methods: Stratified cluster sampling following a two-step process was used to select populations aged 18-59 years in Guangzhou, China. Spatial methods including Geographic Information Systems (GIS) were utilized to identify 1400 coordinates with latitude and longitude. Face-to-face household interviews were conducted to collect self-reported data on sexual risk behaviors and diagnosed STIs. Kulldorff’s spatial scan statistic was used to detect the spatial distribution and clusters of sexual risk behaviors and STIs. The presence and location of statistically significant clusters were mapped in the study areas using ArcGIS software. Results: In this study, surveys were attempted in 1215 of 1400 households, with 368 refusals, resulting in a sample of 751 completed surveys. The prevalence of self-reported sexual risk behaviors was between 5.1% and 50.0%. The self-reported lifetime prevalence of diagnosed STIs was 7.06%. Anal intercourse clustered in an area located along the border within the rural-urban continuum (p=0.001). High-rate clusters for alcohol or other drug use before sex (p=0.008) and for migrants who had lived in Guangzhou less than one year (p=0.007) overlapped this cluster. Excess cases of sex without a condom (p=0.031) overlapped the cluster for college students (p<0.001). 
Conclusions: Short-term migrants and college students reported greater sexual risk behaviors. Programs to increase safer sex within these communities to reduce the risk of STIs are warranted in Guangzhou. Spatial analysis identified geographical clusters of sexual risk behaviors, which is critical for optimizing surveillance and targeting control measures for these locations in the future.

Keywords: cluster analysis, migrant, sexual risk behaviors, spatial distribution

Procedia PDF Downloads 340

540 Revealing the Genome Based Biosynthetic Potential of a Streptomyces sp. Isolate BR123 Presenting Broad Spectrum Antimicrobial Activities

Authors: Neelma Ashraf

Abstract:

Actinomycetes, particularly the genus Streptomyces, are of great importance due to their role in the discovery of new natural products, particularly antimicrobial secondary metabolites, in medicinal science and the biotechnology industry. Different Streptomyces strains were isolated from Helianthus annuus plants and tested for antibacterial and antifungal activities. The five most promising strains were chosen for further investigation, and growth conditions for antibiotic synthesis were optimised. The supernatants were extracted in different solvents, and the extracted products were analyzed using liquid chromatography-mass spectrometry (LC-MS) and biological testing. From one of the potent strains, Streptomyces globusus sp. BR123, the compound lavendamycin was identified using these analytical techniques. In addition, this potent strain also produces a strong antifungal polyene compound with a quasimolecular ion of 2072. Because of its promising antimicrobial potential, Streptomyces sp. BR123 was genome sequenced in order to identify the gene cluster responsible for the analyzed compound, lavendamycin. The genome analysis yielded candidate genes responsible for the production of this potent compound. A genome sequence of 8.15 Mb with a GC content of 72.63% and 8103 protein-coding genes was obtained for Streptomyces sp. isolate BR123. Many antimicrobial, antiparasitic, and anticancer compounds were detected through multiple biosynthetic gene clusters predicted by in silico analysis. However, the novelty of the metabolites was indicated by their low similarity to known biosynthetic gene clusters. The current study gives insight into the bioactive potential of Streptomyces sp. isolate BR123 with respect to the synthesis of bioactive secondary metabolites through genomic and spectrometric analysis. 
Moreover, the comparative genome study revealed the connection of isolate BR123 with other Streptomyces strains, which could expand the knowledge of this genus and the mechanism involved in the discovery of new antimicrobial metabolites.

Keywords: streptomyces, secondary metabolites, genome, biosynthetic gene clusters, high performance liquid chromatography, mass spectrometry

Procedia PDF Downloads 70

539 An Improved K-Means Algorithm for Gene Expression Data Clustering

Authors: Billel Kenidra, Mohamed Benmohammed

Abstract:

Data mining techniques used in the field of clustering are a subject of active research and assist in biological pattern recognition and the extraction of new knowledge from raw data. Clustering means partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar to each other and dissimilar to objects of other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters, leading to an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers, and a poor choice of centers may lead to a local optimum that is quite inferior to the global optimum, we propose a strategy to initialize the K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results show that efficiency has been significantly improved.
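
The abstract does not describe the proposed initialization strategy, so the sketch below swaps in a well-known alternative, k-means++ seeding, purely to show what a center-initialization step looks like. The function name `kmeans_pp_init`, the `seed` parameter, and the sample points are assumptions for the example.

```python
import math
import random


def kmeans_pp_init(points, k, seed=0):
    """Pick k initial centers with k-means++ seeding (not the paper's method).

    The first center is drawn uniformly; each later center is drawn with
    probability proportional to its squared distance to the nearest center
    chosen so far, which spreads the starting centers across the data.
    """
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(math.dist(p, c) ** 2 for c in centers) for p in points]
        if sum(d2) == 0:
            # Every point coincides with a chosen center: fall back to uniform.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


# Hypothetical two-dimensional expression profiles forming three loose groups.
points = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
    (5.0, 5.1), (5.2, 4.9), (4.8, 5.0),
    (9.9, 0.1), (10.0, 0.3), (10.2, 0.0),
]
centers = kmeans_pp_init(points, k=3, seed=42)
```

Because already-chosen points have zero weight, the returned centers are distinct whenever the input points are distinct, which avoids the degenerate start of two identical centers.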

Keywords: microarray data mining, biological pattern recognition, partitional clustering, k-means algorithm, centroid initialization

Procedia PDF Downloads 190

538 Critical Psychosocial Risk Treatment for Engineers and Technicians

Authors: R. Berglund, T. Backström, M. Bellgran

Abstract:

This study explores how management addresses psychosocial risks in seven teams of engineers and technicians in the midst of the fourth industrial revolution. The sample is from an ongoing quasi-experiment about psychosocial risk management in a manufacturing company in Sweden. Each of the seven teams belongs to one of two clusters: a positive cluster or a negative cluster. The positive cluster reports a significantly positive change in psychosocial risk levels between two time-points, and the negative cluster reports a significantly negative change. The data are collected using semi-structured interviews. The results of the computer-aided thematic analysis show that there are more differences than similarities when comparing the risk treatment actions taken in the two clusters. Findings show that the managers in the positive cluster use more enabling actions that foster and support formal and informal relationship building. In contrast, managers who use fewer enabling actions hinder the development of positive group processes and contribute to negative changes in psychosocial risk levels. This exploratory study sheds some light on how management can influence significant positive and negative changes in psychosocial risk levels during a risk management process.

Keywords: group process model, risk treatment, risk management, psychosocial

Procedia PDF Downloads 160

537 Success Measurement in Corporate Venturing: Integrating Three Decades of Research

Authors: Maurice Steinhoff, Lucas Costantino, Dominik Kanbach

Abstract:

Measurement approaches to corporate venturing (CV) success are highly diverse in the extant literature. Furthermore, these approaches rarely build on each other, making it difficult to derive comparable conclusions about CV outcomes. Employing a systematic literature review of three decades of research, the objective of this study is to provide transparency and structure in the broad field of CV research. The paper examines 28 studies in detail, resulting in two main contributions to the research field. First, three structural dimensions of measurement approaches are derived from the studies in the sample, namely, “level of analysis” (parent, program, and venture levels), “measurement perspective” (objective, subjective, and mixed measurement), and “locus of opportunity” (internal, external, and general CV activities). Second, an integrated overview of nine unique clusters structures the different measurement approaches. These clusters make it possible to encapsulate measurement approaches, but also make visible the approaches’ heterogeneity, as well as specific measurement items. Thereby, the study contributes to CV research by revealing and reconciling the variety of CV success-measurement approaches. The study also provides relevant insights for practitioners by making transparent the various approaches to measuring the success of CV activities and presenting a list of 114 concrete and distinct measurement items.

Keywords: corporate venturing, measurement items, success measurement, structured literature review

Procedia PDF Downloads 178

536 Analysis of Entrepreneurship in Industrial Cluster

Authors: Wen-Hsiang Lai

Abstract:

Besides the internal aspects of entrepreneurship (i.e. motivation, opportunity perspective and alertness), there are external aspects that affect entrepreneurship (i.e. the industrial cluster). By comparing machinery companies located inside and outside the industrial district, this study aims to explore the cluster effects on the entrepreneurship of companies in Taiwan machinery clusters (TMC). In this study, three factors affecting entrepreneurship in TMC are considered: “competition”, “embedded-ness” and “specialized knowledge”. “Competition” in the industrial cluster is defined as the competitive advantages that companies gain in the form of demand effects and diversified strategies; “embedded-ness” refers to the quality (relational embedded-ness) and range (structural embedded-ness) of company relations with the industry components (universities, customers and complementary) that affect knowledge transfer and knowledge generation; “specialized knowledge” refers to the internal knowledge shared within industrial clusters. This study finds that, compared with companies outside the cluster, the industrial cluster has a positive influence on entrepreneurship. Additionally, the factor of “relational embedded-ness” has a significant impact on entrepreneurship and affects the adaptation ability of companies in TMC. Finally, the factor of “competition” shows a partial influence on entrepreneurship.

Keywords: entrepreneurship, industrial cluster, industrial district, economies of agglomerations, Taiwan Machinery Cluster (TMC)

Procedia PDF Downloads 388

535 Barriers to Tuberculosis Detection in Portuguese Prisons

Authors: M. F. Abreu, A. I. Aguiar, R. Gaio, R. Duarte

Abstract:

Background: Prison establishments constitute high-risk environments for the transmission and spread of tuberculosis (TB), given their epidemiological context and the difficulty of implementing preventive and control measures. Guidelines for the control and prevention of tuberculosis in prisons have been described as incomplete and heterogeneous internationally, due to several identified obstacles, for example the scarcity of human resources and of funding for prisoner health services. In Portugal, a protocol was created in 2014 with the aim of defining and standardizing procedures for the detection and prevention of tuberculosis within prisons. Objective: The main objective of this study was to identify and describe barriers to tuberculosis detection in prisons of the Porto and Lisbon districts in Portugal. Methods: A cross-sectional study was conducted from 2 January 2018 to 30 June 2018. Semi-structured questionnaires were applied to health care professionals working in the prisons of the districts of Porto (n=6) and Lisbon (n=8). The inclusion criterion was work experience in the area of tuberculosis (in diagnosis, treatment, or follow-up). The questionnaires were self-administered, in paper format. Descriptive analyses of the questionnaire variables were made using frequencies and medians. Afterwards, a hierarchical agglomerative cluster analysis was performed. After obtaining the clusters, the chi-square test was applied to study the association between the variables collected and the clusters. The level of significance considered was 0.05. Results: Of the total of 186 health professionals, 139 met the inclusion criteria and 82 health professionals were interviewed (62.2% participation). Most were female nurses with a median age of 34 years and a term employment contract. From the cluster analysis, two groups were identified with different characteristics and behaviors regarding the procedures of this protocol. 
Statistically significant results were found: elements of cluster 1 (78% of the total participants) have worked in prisons for a longer time (p=0.003), with 45.3% working for more than 4 years, while 50% of the elements of cluster 2 have worked for less than a year; elements of cluster 1 also more frequently answered that they know and apply the procedures of the protocol (p=0.000). Both clusters frequently reported the need for theoretical-practical training for TB (p=0.000), especially in the areas of diagnosis, treatment and prevention, and a scarcity of funding for prisoner health services (p=0.000). Regarding procedures for TB screening (periodic and contact screening) and procedures for transferring a prisoner with this disease, cluster 1 also more frequently answered that they perform them (p=0.000). They also reported that the material/equipment for TB screening is accessible and available (p=0.000). From these clusters we identified as barriers the scarcity of human resources, the need for theoretical-practical training for tuberculosis, inexperience in working in prison health services and limited knowledge of protocol procedures. Conclusions: The barriers found in this study are the same as those described internationally. This protocol is mostly being applied in Portuguese prisons. The study also showed the need to invest in human and material resources. This investigation bridged gaps in knowledge that could help prison health services optimize the care provided for early detection and adherence of prisoners to treatment of tuberculosis.
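
The chi-square test of association between cluster membership and a questionnaire variable, as described in the Methods, can be sketched for the simplest 2x2 case as follows. The counts in `table` are invented for illustration and are not taken from the study; the function name is also an assumption. No continuity correction is applied.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 contingency table.

    With one degree of freedom, the upper-tail probability of the chi-square
    distribution equals erfc(sqrt(statistic / 2)), so no external library
    is needed.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Made-up counts: rows are clusters 1 and 2, columns are
# "applies the protocol" yes / no.
table = [[50, 14], [6, 12]]
statistic, p_value = chi_square_2x2(table)
significant = p_value < 0.05  # significance level stated in the abstract
```

For tables with small expected counts, an exact test is usually preferred over the chi-square approximation.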

Keywords: barriers, health care professionals, prisons, protocol, tuberculosis

Procedia PDF Downloads 146

534 Decision Support System in Air Pollution Using Data Mining

Authors: E. Fathallahi Aghdam, V. Hosseini

Abstract:

Environmental pollution is not limited to a specific region or country; that is why sustainable development, as a necessary process for improvement, pays attention to issues such as the destruction of natural resources, degradation of biological systems, global pollution, and climate change in the world, especially in developing countries. According to the World Health Organization, Tehran (the capital of Iran), as a developing city, is one of the most polluted cities in the world in terms of air pollution. In this study, three pollutants, namely particulate matter smaller than 10 microns, nitrogen oxides, and sulfur dioxide, were evaluated in Tehran using data mining techniques and the CRISP approach. The data from 21 air pollution measuring stations in different areas of Tehran were collected from 1999 to 2013. The commercial software Clementine was selected for this study. Using the software, Tehran was divided into distinct clusters in terms of the mentioned pollutants. As a data mining technique, clustering is usually used as a prologue to other analyses; therefore, the similarity of clusters was evaluated in this study by analyzing local conditions, traffic behavior, and industrial activities. The results of this research can support decision-making systems, help managers improve performance and decision making, and assist in urban studies.

Keywords: data mining, clustering, air pollution, crisp approach

Procedia PDF Downloads 427