Search results for: tags' clusters
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 675

Search results for: tags' clusters

585 Time Series Regression with Meta-Clusters

Authors: Monika Chuchro

Abstract:

This paper presents a preliminary attempt to apply classification of time series using meta-clusters in order to improve the quality of regression models. In this case, clustering was performed as a method to obtain a subgroups of time series data with normal distribution from inflow into waste water treatment plant data which Composed of several groups differing by mean value. Two simple algorithms: K-mean and EM were chosen as a clustering method. The rand index was used to measure the similarity. After simple meta-clustering, regression model was performed for each subgroups. The final model was a sum of subgroups models. The quality of obtained model was compared with the regression model made using the same explanatory variables but with no clustering of data. Results were compared by determination coefficient (R2), measure of prediction accuracy mean absolute percentage error (MAPE) and comparison on linear chart. Preliminary results allows to foresee the potential of the presented technique.

Keywords: clustering, data analysis, data mining, predictive models

Procedia PDF Downloads 448
584 Disclosure on Adherence of the King Code's Audit Committee Guidance: Cluster Analyses to Determine Strengths and Weaknesses

Authors: Philna Coetzee, Clara Msiza

Abstract:

In modern society, audit committees are seen as the custodians of accountability and the conscience of management and the board. But who holds the audit committee accountable for their actions or non-actions and how do we know what they are supposed to be doing and what they are doing? The purpose of this article is to provide greater insight into the latter part of this problem, namely, determine what best practises for audit committees and the disclosure of what is the realities are. In countries where governance is well established, the roles and responsibilities of the audit committee are mostly clearly guided by legislation and/or guidance documents, with countries increasingly providing guidance on this topic. With high cost involved to adhere to governance guidelines, the public (for public organisations) and shareholders (for private organisations) expect to see the value of their ‘investment’. For audit committees, the dividends on the investment should reflect in less fraudulent activities, less corruption, higher efficiency and effectiveness, improved social and environmental impact, and increased profits, to name a few. If this is not the case (which is reflected in the number of fraudulent activities in both the private and the public sector), stakeholders have the right to ask: where was the audit committee? Therefore, the objective of this article is to contribute to the body of knowledge by comparing the adherence of audit committee to best practices guidelines as stipulated in the King Report across public listed companies, national and provincial government departments, state-owned enterprises and local municipalities. After constructs were formed, based on the literature, factor analyses were conducted to reduce the number of variables in each construct. Thereafter, cluster analyses, which is an explorative analysis technique that classifies a set of objects in such a way that objects that are more similar are grouped into the same group, were conducted. The SPSS TwoStep Clustering Component was used, being capable of handling both continuous and categorical variables. In the first step, a pre-clustering procedure clusters the objects into small sub-clusters, after which it clusters these sub-clusters into the desired number of clusters. The cluster analyses were conducted for each construct and the measure, namely the audit opinion as listed in the external audit report, were included. Analysing 228 organisations' information, the results indicate that there is a clear distinction between the four spheres of business that has been included in the analyses, indicating certain strengths and certain weaknesses within each sphere. The results may provide the overseers of audit committees’ insight into where a specific sector’s strengths and weaknesses lie. Audit committee chairs will be able to improve the areas where their audit committee is lacking behind. The strengthening of audit committees should result in an improvement of the accountability of boards, leading to less fraud and corruption.

Keywords: audit committee disclosure, cluster analyses, governance best practices, strengths and weaknesses

Procedia PDF Downloads 143
583 Identification of Promising Infant Clusters to Obtain Improved Block Layout Designs

Authors: Mustahsan Mir, Ahmed Hassanin, Mohammed A. Al-Saleh

Abstract:

The layout optimization of building blocks of unequal areas has applications in many disciplines including VLSI floorplanning, macrocell placement, unequal-area facilities layout optimization, and plant or machine layout design. A number of heuristics and some analytical and hybrid techniques have been published to solve this problem. This paper presents an efficient high-quality building-block layout design technique especially suited for solving large-size problems. The higher efficiency and improved quality of optimized solutions are made possible by introducing the concept of Promising Infant Clusters in a constructive placement procedure. The results presented in the paper demonstrate the improved performance of the presented technique for benchmark problems in comparison with published heuristic, analytic, and hybrid techniques.

Keywords: block layout problem, building-block layout design, CAD, optimization, search techniques

Procedia PDF Downloads 368
582 Harmonic Data Preparation for Clustering and Classification

Authors: Ali Asheibi

Abstract:

The rapid increase in the size of databases required to store power quality monitoring data has demanded new techniques for analysing and understanding the data. One suggested technique to assist in analysis is data mining. Preparing raw data to be ready for data mining exploration take up most of the effort and time spent in the whole data mining process. Clustering is an important technique in data mining and machine learning in which underlying and meaningful groups of data are discovered. Large amounts of harmonic data have been collected from an actual harmonic monitoring system in a distribution system in Australia for three years. This amount of acquired data makes it difficult to identify operational events that significantly impact the harmonics generated on the system. In this paper, harmonic data preparation processes to better understanding of the data have been presented. Underlying classes in this data has then been identified using clustering technique based on the Minimum Message Length (MML) method. The underlying operational information contained within the clusters can be rapidly visualised by the engineers. The C5.0 algorithm was used for classification and interpretation of the generated clusters.

Keywords: data mining, harmonic data, clustering, classification

Procedia PDF Downloads 228
581 Genome-Scale Analysis of Streptomyces Caatingaensis CMAA 1322 Metabolism, a New Abiotic Stress-Tolerant Actinomycete

Authors: Suikinai Nobre Santos, Ranko Gacesa, Paul F. Long, Itamar Soares de Melo

Abstract:

Extremophilic microorganism are adapted to biotopes combining several stress factors (temperature, pressure, radiation, salinity and pH), which indicate the richness valuable resource for the exploitation of novel biotechnological processes and constitute unique models for investigations their biomolecules (1, 2). The above information encourages us investigate bioprospecting synthesized compounds by a noval actinomycete, designated thermotolerant Streptomyces caatingaensis CMAA 1322, isolated from sample soil tropical dry forest (Caatinga) in the Brazilian semiarid region (3-17°S and 35-45°W). This set of constrating physical and climatic factores provide the unique conditions and a diversity of well adapted species, interesting site for biotechnological purposes. Preliminary studies have shown the great potential in the production of cytotoxic, pesticidal and antimicrobial molecules (3). Thus, to extend knowledge of the genes clusters responsible for producing biosynthetic pathways of natural products in strain CMAA1322, whole-genome shotgun (WGS) DNA sequencing was performed using paired-end long sequencing with PacBio RS (Pacific Biosciences). Genomic DNA was extracted from a pure culture grown overnight on LB medium using the PureLink genomic DNA kit (Life Technologies). An approximately 3- to 20-kb-insert PacBio library was constructed and sequenced on an 8 single-molecule real-time (SMRT) cell, yielding 116,269 reads (average length, 7,446 bp), which were allocated into 18 contigs, with 142.11x coverage and N50 value of 20.548 bp (BioProject number PRJNA288757). The assembled data were analyzed by Rapid Annotations using Subsystems Technology (RAST) (4) the genome size was found to be 7.055.077 bp, comprising 6167 open reading frames (ORFs) and 413 subsystems. The G+C content was estimated to be 72 mol%. The closest-neighbors tool, available in RAST through functional comparison of the genome, revealed that strain CMAA1322 is more closely related to Streptomyces hygroscopicus ATCC 53653 (similarity score value, 537), S. violaceusniger Tu 4113 (score value, 483), S. avermitilis MA-4680 (score value, 475), S. albus J1074 (score value, 447). The Streptomyces sp. CMAA1322 genome contains 98 tRNA genes and 135 genes copies related to stress response, mainly osmotic stress (14), heat shock (16), oxidative stress (49). Functional annotation by antiSMASH version 3.0 (5) identified 41 clusters for secondary metabolites (including two clusters for lanthipeptides, ten clusters for nonribosomal peptide synthetases [NRPS], three clusters for siderophores, fourteen for polyketide synthetase [PKS], six clusters encoding a terpene, two clusters encoding a bacteriocin, and one cluster encoding a phenazine). Our work provide in comparative analyse of genome and extract produced (data no published) by lineage CMAA1322, revealing the potential of microorganisms accessed from extreme environments as Caatinga” to produce a wide range of biotechnological relevant compounds.

Keywords: caatinga, streptomyces, environmental stresses, biosynthetic pathways

Procedia PDF Downloads 222
580 A Local Tensor Clustering Algorithm to Annotate Uncharacterized Genes with Many Biological Networks

Authors: Paul Shize Li, Frank Alber

Abstract:

A fundamental task of clinical genomics is to unravel the functions of genes and their associations with disorders. Although experimental biology has made efforts to discover and elucidate the molecular mechanisms of individual genes in the past decades, still about 40% of human genes have unknown functions, not to mention the diseases they may be related to. For those biologists who are interested in a particular gene with unknown functions, a powerful computational method tailored for inferring the functions and disease relevance of uncharacterized genes is strongly needed. Studies have shown that genes strongly linked to each other in multiple biological networks are more likely to have similar functions. This indicates that the densely connected subgraphs in multiple biological networks are useful in the functional and phenotypic annotation of uncharacterized genes. Therefore, in this work, we have developed an integrative network approach to identify the frequent local clusters, which are defined as those densely connected subgraphs that frequently occur in multiple biological networks and consist of the query gene that has few or no disease or function annotations. This is a local clustering algorithm that models multiple biological networks sharing the same gene set as a three-dimensional matrix, the so-called tensor, and employs the tensor-based optimization method to efficiently find the frequent local clusters. Specifically, massive public gene expression data sets that comprehensively cover dynamic, physiological, and environmental conditions are used to generate hundreds of gene co-expression networks. By integrating these gene co-expression networks, for a given uncharacterized gene that is of biologist’s interest, the proposed method can be applied to identify the frequent local clusters that consist of this uncharacterized gene. Finally, those frequent local clusters are used for function and disease annotation of this uncharacterized gene. This local tensor clustering algorithm outperformed the competing tensor-based algorithm in both module discovery and running time. We also demonstrated the use of the proposed method on real data of hundreds of gene co-expression data and showed that it can comprehensively characterize the query gene. Therefore, this study provides a new tool for annotating the uncharacterized genes and has great potential to assist clinical genomic diagnostics.

Keywords: local tensor clustering, query gene, gene co-expression network, gene annotation

Procedia PDF Downloads 116
579 Establishing a Computational Screening Framework to Identify Environmental Exposures Using Untargeted Gas-Chromatography High-Resolution Mass Spectrometry

Authors: Juni C. Kim, Anna R. Robuck, Douglas I. Walker

Abstract:

The human exposome, which includes chemical exposures over the lifetime and their effects, is now recognized as an important measure for understanding human health; however, the complexity of the data makes the identification of environmental chemicals challenging. The goal of our project was to establish a computational workflow for the improved identification of environmental pollutants containing chlorine or bromine. Using the “pattern. search” function available in the R package NonTarget, we wrote a multifunctional script that searches mass spectral clusters from untargeted gas-chromatography high-resolution mass spectrometry (GC-HRMS) for the presence of spectra consistent with chlorine and bromine-containing organic compounds. The “pattern. search” function was incorporated into a different function that allows the evaluation of clusters containing multiple analyte fragments, has multi-core support, and provides a simplified output identifying listing compounds containing chlorine and/or bromine. The new function was able to process 46,000 spectral clusters in under 8 seconds and identified over 150 potential halogenated spectra. We next applied our function to a deidentified dataset from patients diagnosed with primary biliary cholangitis (PBC), primary sclerosing cholangitis (PSC), and healthy controls. Twenty-two spectra corresponded to potential halogenated compounds in the PSC and PBC dataset, including six significantly different in PBC patients, while four differed in PSC patients. We have developed an improved algorithm for detecting halogenated compounds in GC-HRMS data, providing a strategy for prioritizing exposures in the study of human disease.

Keywords: exposome, metabolome, computational metabolomics, high-resolution mass spectrometry, exposure, pollutants

Procedia PDF Downloads 119
578 Developing a Cultural Policy Framework for Small Towns and Cities

Authors: Raymond Ndhlovu, Jen Snowball

Abstract:

It has long been known that the Cultural and Creative Industries (CCIs) have the potential to aid in physical, social and economic renewal and regeneration of towns and cities, hence their importance when dealing with regional development. The CCIs can act as a catalyst for activity and investment in an area because the ‘consumption’ of cultural activities will lead to the activities and use of other non-cultural activities, for example, hospitality development including restaurants and bars, as well as public transport. ‘Consumption’ of cultural activities also leads to employment creation, and diversification. However, CCIs tend to be clustered, especially around large cities. There is, moreover, a case for development of CCIs around smaller towns and cities, because they do not rely on high technology inputs, and long supply chains, and, their direct link to rural and isolated places makes them vital in regional development. However, there is currently little research on how to craft cultural policy for regions with smaller towns and cities. Using the Sarah Baartman District (SBDM) in South Africa as an example, this paper describes the process of developing cultural policy for a region that has potential, and existing, cultural clusters, but currently no one, coherent policy relating to CCI development. The SBDM was chosen as a case study because it has no large cities, but has some CCI clusters, and has identified them as potential drivers of local economic development. The process of developing cultural policy is discussed in stages: Identification of what resources are present; including human resources, soft and hard infrastructure; Identification of clusters; Analysis of CCI labour markets and ownership patterns; Opportunities and challenges from the point of view of CCIs and other key stakeholders; Alignment of regional policy aims with provincial and national policy objectives; and finally, design and implementation of a regional cultural policy.

Keywords: cultural and creative industries, economic impact, intrinsic value, regional development

Procedia PDF Downloads 212
577 Neural Network Approach For Clustering Host Community: Based on Perceptions Toward Tourism, Their Satisfaction Level and Demographic Attributes in Iran (Lahijan)

Authors: Nasibeh Mohammadpour, Ali Rajabzadeh, Adel Azar, Hamid Zargham Borujeni,

Abstract:

Generally, various industries development depends on their stakeholders and beneficiaries supports. One of the most important stakeholders in tourism industry ( which has become one of the most important lucrative and employment-generating activities at the international level these days) are host communities in tourist destination which are affected and effect on this industry development. Recognizing host community and its segmentations can be important to get their support for future decisions and policy making. In order to identify these segments, in this study, clustering of the residents has been done by using some tools that are designed to encounter human complexities and have ability to model and generalize complex systems without any needs for the initial clusters’ seeds like classic methods. Neural networks can help to meet these expectations. The research have been planned to design neural networks-based mathematical model for clustering the host community effectively according to multi criteria, and identifies differences among segments. In order to achieve this goal, the residents’ segmentation has been done by demographic characteristics, their attitude towards the tourism development, the level of satisfaction and the type of their support in this field. The applied method is self-organized neural networks and the results have compared with K-means. As the results show, the use of Self- Organized Map (SOM) method provides much better results by considering the Cophenetic correlation and between clusters variance coefficients. Based on these criteria, the host community is divided into five sections with unique and distinctive features, which are in the best condition (in comparison other modes) according to Cophenetic correlation coefficient of 0.8769 and between clusters variance of 0.1412.

Keywords: Artificial Nural Network, Clustering , Resident, SOM, Tourism

Procedia PDF Downloads 156
576 Topological Analysis of Hydrogen Bonds in Pyruvic Acid-Water Mixtures

Authors: Ferid Hammami

Abstract:

The molecular geometries of the possible conformations of pyruvic acid-water complexes (PA-(H₂O)ₙ = 1- 4) have been fully optimized at DFT/B3LYP/6-311G ++ (d, p) levels of calculation. Among several optimized molecular clusters, the most stable molecular arrangements obtained when one, two, three, and four water molecules are hydrogen-bonded to a central pyruvic acid molecule are presented in this paper. Apposite topological and geometrical parameters are considered as primary indicators of H-bond strength. Atoms in molecules (AIM) analysis shows that pyruvic acid can form a ring structure with water, and the molecular structures are stabilized by both strong O-H...O and C-H...O hydrogen bonds. In large clusters, classical O-H...O hydrogen bonds still exist between water molecules, and a cage-like structure is built around some parts of the central molecule of pyruvic acid. The electrostatic potential energy map (MEP) and the HOMO-LUMO molecular orbital (highest occupied molecular orbital-lowest unoccupied molecular orbital) analysis has been performed for all considered complexes.

Keywords: pyruvic acid, PA-water complex, hydrogen bonding, DFT, AIM, MEP, HOMO-LUMO

Procedia PDF Downloads 192
575 Approximately Similarity Measurement of Web Sites Using Genetic Algorithms and Binary Trees

Authors: Doru Anastasiu Popescu, Dan Rădulescu

Abstract:

In this paper, we determine the similarity of two HTML web applications. We are going to use a genetic algorithm in order to determine the most significant web pages of each application (we are not going to use every web page of a site). Using these significant web pages, we will find the similarity value between the two applications. The algorithm is going to be efficient because we are going to use a reduced number of web pages for comparisons but it will return an approximate value of the similarity. The binary trees are used to keep the tags from the significant pages. The algorithm was implemented in Java language.

Keywords: Tag, HTML, web page, genetic algorithm, similarity value, binary tree

Procedia PDF Downloads 338
574 Cryptocurrency-Based Mobile Payments with Near-Field Communication-Enabled Devices

Authors: Marko Niinimaki

Abstract:

Cryptocurrencies are getting increasingly popular, but very few of them can be conveniently used in daily mobile phone purchases. To solve this problem, we demonstrate how to build a functional prototype of a mobile cryptocurrency-based e-commerce application the communicates with Near-Field Communication (NFC) tags. Using the system, users are able to purchase physical items with an NFC tag that contains an e-commerce URL. The payment is done simply by touching the tag with a mobile device and accepting the payment. Our method is constructive: we describe the design and technologies used in the implementation and evaluate the security and performance of the solution. Our main finding is that the analysis and measurements show that our solution is feasible for e-commerce.

Keywords: cryptocurrency, e-commerce, NFC, mobile devices

Procedia PDF Downloads 160
573 Students’ Perception and Patterns of Listening Behaviour in an Online Forum Discussion

Authors: K. L. Wong, I. N. Umar

Abstract:

Online forum is part of a Learning Management System (LMS) environment in which students share opinions. This study attempts to investigate the perceptions of students towards online forum and their patterns of listening behaviour during the forum interaction. The students’ perceptions were measured using a questionnaire, in which seven dimensions were used including online experience, benefits of forum participation, cost of participation, perceived ease of use, usefulness, attitude and intention. Meanwhile, their patterns of listening behaviours were obtained using the log file extracted from the LMS. A total of 25 postgraduate students undertaking a course were involved in this study, and their activities in the forum session were recorded by the LMS and used as a log file. The results from the questionnaire analysis indicated that the students perceived that the forum is easy to use, useful, and bring benefits to them. Also, they showed positive attitude towards online forum, and they have the intention to use it in future. Based on the log data, the participants were also divided into six clusters of listening behaviour, in which they are different in terms of temporality, breadth, depth and speaking level. The findings were compared to previous clusters grouping and future recommendations are also discussed.

Keywords: e-learning, learning management system, listening behavior, online forum

Procedia PDF Downloads 410
572 A Weighted K-Medoids Clustering Algorithm for Effective Stability in Vehicular Ad Hoc Networks

Authors: Rejab Hajlaoui, Tarek Moulahi, Hervé Guyennet

Abstract:

In a highway scenario, the vehicle speed can exceed 120 kmph. Therefore, any vehicle can enter or leave the network within a very short time. This mobility adversely affects the network connectivity and decreases the life time of all established links. To ensure an effective stability in vehicular ad hoc networks with minimum broadcasting storm, we have developed a weighted algorithm based on the k-medoids clustering algorithm (WKCA). Indeed, the number of clusters and the initial cluster heads will not be selected randomly as usual, but considering the available transmission range and the environment size. Then, to ensure optimal assignment of nodes to clusters in both k-medoids phases, the combined weight of any node will be computed according to additional metrics including direction, relative speed and proximity. Empirical results prove that in addition to the convergence speed that characterizes the k-medoids algorithm, our proposed model performs well both AODV-Clustering and OLSR-Clustering protocols under different densities and velocities in term of end-to-end delay, packet delivery ratio, and throughput.

Keywords: communication, clustering algorithm, k-medoids, sensor, vehicular ad hoc network

Procedia PDF Downloads 217
571 Anomaly Detection Based Fuzzy K-Mode Clustering for Categorical Data

Authors: Murat Yazici

Abstract:

Anomalies are irregularities found in data that do not adhere to a well-defined standard of normal behavior. The identification of outliers or anomalies in data has been a subject of study within the statistics field since the 1800s. Over time, a variety of anomaly detection techniques have been developed in several research communities. The cluster analysis can be used to detect anomalies. It is the process of associating data with clusters that are as similar as possible while dissimilar clusters are associated with each other. Many of the traditional cluster algorithms have limitations in dealing with data sets containing categorical properties. To detect anomalies in categorical data, fuzzy clustering approach can be used with its advantages. The fuzzy k-Mode (FKM) clustering algorithm, which is one of the fuzzy clustering approaches, by extension to the k-means algorithm, is reported for clustering datasets with categorical values. It is a form of clustering: each point can be associated with more than one cluster. In this paper, anomaly detection is performed on two simulated data by using the FKM cluster algorithm. As a significance of the study, the FKM cluster algorithm allows to determine anomalies with their abnormality degree in contrast to numerous anomaly detection algorithms. According to the results, the FKM cluster algorithm illustrated good performance in the anomaly detection of data, including both one anomaly and more than one anomaly.

Keywords: fuzzy k-mode clustering, anomaly detection, noise, categorical data

Procedia PDF Downloads 31
570 A Literature Review on the Effect of Industrial Clusters and the Absorptive Capacity on Innovation

Authors: Enrique Claver Cortés, Bartolomé Marco Lajara, Eduardo Sánchez García, Pedro Seva Larrosa, Encarnación Manresa Marhuenda, Lorena Ruiz Fernández, Esther Poveda Pareja

Abstract:

In recent decades, the analysis of the effects of clustering as an essential factor for the development of innovations and the competitiveness of enterprises has raised great interest in different areas. Nowadays, companies have access to almost all tangible and intangible resources located and/or developed in any country in the world. However, despite the obvious advantages that this situation entails for companies, their geographical location has shown itself, increasingly clearly, to be a fundamental factor that positively influences their innovative performance and competitiveness. Industrial clusters could represent a unique level of analysis, positioned between the individual company and the industry, which makes them an ideal unit of analysis to determine the effects derived from company membership of a cluster. Also, the absorptive capacity (hereinafter 'AC') can mediate the process of innovation development by companies located in a cluster. The transformation and exploitation of knowledge could have a mediating effect between knowledge acquisition and innovative performance. The main objective of this work is to determine the key factors that affect the degree of generation and use of knowledge from the environment by companies and, consequently, their innovative performance and competitiveness. The elements analyzed are the companies' membership of a cluster and the AC. To this end, 30 most relevant papers published on this subject in the "Web of Science" database have been reviewed. Our findings show that, within a cluster, the knowledge coming from the companies' environment can significantly influence their innovative performance and competitiveness, although in this relationship, the degree of access and exploitation of the companies to this knowledge plays a fundamental role, which depends on a series of elements both internal and external to the company.

Keywords: absorptive capacity, clusters, innovation, knowledge

Procedia PDF Downloads 112
569 Enhanced Cluster Based Connectivity Maintenance in Vehicular Ad Hoc Network

Authors: Manverpreet Kaur, Amarpreet Singh

Abstract:

The demand of Vehicular ad hoc networks is increasing day by day, due to offering the various applications and marvelous benefits to VANET users. Clustering in VANETs is most important to overcome the connectivity problems of VANETs. In this paper, we proposed a new clustering technique Enhanced cluster based connectivity maintenance in vehicular ad hoc network. Our objective is to form long living clusters. The proposed approach is grouping the vehicles, on the basis of the longest list of neighbors to form clusters. The cluster formation and cluster head selection process done by the RSU that may results it reduces the chances of overhead on to the network. The cluster head selection procedure is the vehicle which has closest speed to average speed will elect as a cluster Head by the RSU and if two vehicles have same speed which is closest to average speed then they will be calculate by one of the new parameter i.e. distance to their respective destination. The vehicle which has largest distance to their destination will be choosing as a cluster Head by the RSU. Our simulation outcomes show that our technique performs better than the existing technique.

Keywords: VANETs, clustering, connectivity, cluster head, intelligent transportation system (ITS)

Procedia PDF Downloads 222
568 A Construction Management Tool: Determining a Project Schedule Typical Behaviors Using Cluster Analysis

Authors: Natalia Rudeli, Elisabeth Viles, Adrian Santilli

Abstract:

Delays in the construction industry are a global phenomenon. Many construction projects experience extensive delays exceeding the initially estimated completion time. The main purpose of this study is to identify construction projects typical behaviors in order to develop a prognosis and management tool. Being able to know a construction projects schedule tendency will enable evidence-based decision-making to allow resolutions to be made before delays occur. This study presents an innovative approach that uses Cluster Analysis Method to support predictions during Earned Value Analyses. A clustering analysis was used to predict future scheduling, Earned Value Management (EVM), and Earned Schedule (ES) principal Indexes behaviors in construction projects. The analysis was made using a database with 90 different construction projects. It was validated with additional data extracted from literature and with another 15 contrasting projects. For all projects, planned and executed schedules were collected and the EVM and ES principal indexes were calculated. A complete linkage classification method was used. In this way, the cluster analysis made considers that the distance (or similarity) between two clusters must be measured by its most disparate elements, i.e. that the distance is given by the maximum span among its components. Finally, through the use of EVM and ES Indexes and Tukey and Fisher Pairwise Comparisons, the statistical dissimilarity was verified and four clusters were obtained. It can be said that construction projects show an average delay of 35% of its planned completion time. Furthermore, four typical behaviors were found and for each of the obtained clusters, the interim milestones and the necessary rhythms of construction were identified. In general, detected typical behaviors are: (1) Projects that perform a 5% of work advance in the first two tenths and maintain a constant rhythm until completion (greater than 10% for each remaining tenth), being able to finish on the initially estimated time. (2) Projects that start with an adequate construction rate but suffer minor delays culminating with a total delay of almost 27% of the planned time. (3) Projects which start with a performance below the planned rate and end up with an average delay of 64%, and (4) projects that begin with a poor performance, suffer great delays and end up with an average delay of a 120% of the planned completion time. The obtained clusters compose a tool to identify the behavior of new construction projects by comparing their current work performance to the validated database, thus allowing the correction of initial estimations towards more accurate completion schedules.

Keywords: cluster analysis, construction management, earned value, schedule

Procedia PDF Downloads 241
567 Bioinformatics Identification of Rare Codon Clusters in Proteins Structure of HBV

Authors: Abdorrasoul Malekpour, Mohammad Ghorbani Mojtaba Mortazavi, Mohammadreza Fattahi, Mohammad Hassan Meshkibaf, Ali Fakhrzad, Saeid Salehi, Saeideh Zahedi, Amir Ahmadimoghaddam, Parviz Farzadnia Dr., Mohammadreza Hajyani Asl Bs

Abstract:

Hepatitis B as an infectious disease has eight main genotypes (A–H). The aim of this study is to Bioinformatically identify Rare Codon Clusters (RCC) in proteins structure of HBV. For detection of protein family accession numbers (Pfam) of HBV proteins; used of uni-prot database and Pfam search tool were used. Obtained Pfam IDs were analyzed in Sherlocc program and RCCs in HBV proteins were detected. In further, the structures of TrEMBL entries proteins studied in PDB database and 3D structures of the HBV proteins and locations of RCCs were visualized and studied using Swiss PDB Viewer software. Pfam search tool have found nine significant hits and 0 insignificant hits in 3 frames. Results of Pfams studied in the Sherlocc program show this program not identified RCCs in the external core antigen (PF08290) and truncated HBeAg protein (PF08290). By contrast the RCCs become identified in Hepatitis core antigen (PF00906) Large envelope protein S (PF00695), X protein (PF00739), DNA polymerase (viral) N-terminal domain (PF00242) and Protein P (Pf00336). In HBV genome, seven RCC identified that found in hepatitis core antigen, large envelope protein S and DNA polymerase proteins and proteins structures of TrEMBL entries sequences that reported in Sherlocc program outputs are not complete. Based on situation of RCC in structure of HBV proteins, it suggested those RCCs are important in HBV life cycle. We hoped that this study provide a new and deep perspective in protein research and drug design for treatment of HBV.

Keywords: rare codon clusters, hepatitis B virus, bioinformatic study, infectious disease

Procedia PDF Downloads 458
566 Quantifying User-Related, System-Related, and Context-Related Patterns of Smartphone Use

Authors: Andrew T. Hendrickson, Liven De Marez, Marijn Martens, Gytha Muller, Tudor Paisa, Koen Ponnet, Catherine Schweizer, Megan Van Meer, Mariek Vanden Abeele

Abstract:

Quantifying and understanding the myriad ways people use their phones and how that impacts their relationships, cognitive abilities, mental health, and well-being is increasingly important in our phone-centric society. However, most studies on the patterns of phone use have focused on theory-driven tests of specific usage hypotheses using self-report questionnaires or analyses of smaller datasets. In this work we present a series of analyses from a large corpus of over 3000 users that combine data-driven and theory-driven analyses to identify reliable smartphone usage patterns and clusters of similar users. Furthermore, we compare the stability of user clusters across user- and system-initiated sessions, as well as during the hypothesized ritualized behavior times directly before and after sleeping. Our results indicate support for some hypothesized usage patterns but present a more complete and nuanced view of how people use smartphones.

Keywords: data mining, experience sampling, smartphone usage, health and well being

Procedia PDF Downloads 142
565 The Impact of the Covid-19 Crisis on the Information Behavior in the B2B Buying Process

Authors: Stehr Melanie

Abstract:

The availability of apposite information is essential for the decision-making process of organizational buyers. Due to the constraints of the Covid-19 crisis, information channels that emphasize face-to-face contact (e.g. sales visits, trade shows) have been unavailable, and usage of digitally-driven information channels (e.g. videoconferencing, platforms) has skyrocketed. This paper explores the question in which areas the pandemic induced shift in the use of information channels could be sustainable and in which areas it is a temporary phenomenon. While information and buying behavior in B2C purchases has been regularly studied in the last decade, the last fundamental model of organizational buying behavior in B2B was introduced by Johnston and Lewin (1996) in times before the advent of the internet. Subsequently, research efforts in B2B marketing shifted from organizational buyers and their decision and information behavior to the business relationships between sellers and buyers. This study builds on the extensive literature on situational factors influencing organizational buying and information behavior and uses the economics of information theory as a theoretical framework. The research focuses on the German woodworking industry, which before the Covid-19 crisis was characterized by a rather low level of digitization of information channels. By focusing on an industry with traditional communication structures, a shift in information behavior induced by an exogenous shock is considered a ripe research setting. The study is exploratory in nature. The primary data source is 40 in-depth interviews based on the repertory-grid method. Thus, 120 typical buying situations in the woodworking industry and the information and channels relevant to them are identified. The results are combined into clusters, each of which shows similar information behavior in the procurement process. In the next step, the clusters are analyzed in terms of the post and pre-Covid-19 crisis’ behavior identifying stable and dynamic information behavior aspects. Initial results show that, for example, clusters representing search goods with low risk and complexity suggest a sustainable rise in the use of digitally-driven information channels. However, in clusters containing trust goods with high significance and novelty, an increased return to face-to-face information channels can be expected after the Covid-19 crisis. The results are interesting from both a scientific and a practical point of view. This study is one of the first to apply the economics of information theory to organizational buyers and their decision and information behavior in the digital information age. Especially the focus on the dynamic aspects of information behavior after an exogenous shock might contribute new impulses to theoretical debates related to the economics of information theory. For practitioners - especially suppliers’ marketing managers and intermediaries such as publishers or trade show organizers from the woodworking industry - the study shows wide-ranging starting points for a future-oriented segmentation of their marketing program by highlighting the dynamic and stable preferences of elaborated clusters in the choice of their information channels.

Keywords: B2B buying process, crisis, economics of information theory, information channel

Procedia PDF Downloads 171
564 Detecting Local Clusters of Childhood Malnutrition in the Island Province of Marinduque, Philippines Using Spatial Scan Statistic

Authors: Novee Lor C. Leyso, Maylin C. Palatino

Abstract:

Under-five malnutrition continues to persist in the Philippines, particularly in the island Province of Marinduque, with prevalence of some forms of malnutrition even worsening in recent years. Local spatial cluster detection provides a spatial perspective in understanding this phenomenon as key in analyzing patterns of geographic variation, identification of community-appropriate programs and interventions, and focused targeting on high-risk areas. Using data from a province-wide household-based census conducted in 2014–2016, this study aimed to determine and evaluate spatial clusters of under-five malnutrition, across the province and within each municipality at the individual level using household location. Malnutrition was defined as weight-for-age z-score that fall outside the 2 standard deviations from the median of the WHO reference population. The Kulldorff’s elliptical spatial scan statistic in binomial model was used to locate clusters with high-risk of malnutrition, while adjusting for age and membership to government conditional cash transfer program as proxy for socio-economic status. One large significant cluster of under-five malnutrition was found southwest of the province, in which living in these areas at least doubles the risk of malnutrition. Additionally, at least one significant cluster were identified within each municipality—mostly located along the coastal areas. All these indicate apparent geographical variations across and within municipalities in the province. There were also similarities and disparities in the patterns of risk of malnutrition in each cluster across municipalities, and even within municipality, suggesting underlying causes at work that warrants further investigation. Therefore, community-appropriate programs and interventions should be identified and should be focused on high-risk areas to maximize limited government resources. Further studies are also recommended to determine factors affecting variations in childhood malnutrition considering the evidence of spatial clustering found in this study.

Keywords: Binomial model, Kulldorff’s elliptical spatial scan statistic, Philippines, under-five malnutrition

Procedia PDF Downloads 119
563 Hybridized Approach for Distance Estimation Using K-Means Clustering

Authors: Ritu Vashistha, Jitender Kumar

Abstract:

Clustering using the K-means algorithm is a very common way to understand and analyze the obtained output data. When a similar object is grouped, this is called the basis of Clustering. There is K number of objects and C number of cluster in to single cluster in which k is always supposed to be less than C having each cluster to be its own centroid but the major problem is how is identify the cluster is correct based on the data. Formulation of the cluster is not a regular task for every tuple of row record or entity but it is done by an iterative process. Each and every record, tuple, entity is checked and examined and similarity dissimilarity is examined. So this iterative process seems to be very lengthy and unable to give optimal output for the cluster and time taken to find the cluster. To overcome the drawback challenge, we are proposing a formula to find the clusters at the run time, so this approach can give us optimal results. The proposed approach uses the Euclidian distance formula as well melanosis to find the minimum distance between slots as technically we called clusters and the same approach we have also applied to Ant Colony Optimization(ACO) algorithm, which results in the production of two and multi-dimensional matrix.

Keywords: ant colony optimization, data clustering, centroids, data mining, k-means

Procedia PDF Downloads 110
562 Geographic Legacies for Modern Day Disease Research: Autism Spectrum Disorder as a Case-Control Study

Authors: Rebecca Richards Steed, James Van Derslice, Ken Smith, Richard Medina, Amanda Bakian

Abstract:

Elucidating gene-environment interactions for heritable disease outcomes is an emerging area of disease research, with genetic studies informing hypotheses for environment and gene interactions underlying some of the most confounding diseases of our time, like autism spectrum disorder (ASD). Geography has thus far played a key role in identifying environmental factors contributing to disease, but its use can be broadened to include genetic and environmental factors that have a synergistic effect on disease. Through the use of family pedigrees and disease outcomes with life-course residential histories, space-time clustering of generations at critical developmental windows can provide further understanding of (1) environmental factors that contribute to disease patterns in families, (2) susceptible critical windows of development most impacted by environment, (3) and that are most likely to lead to an ASD diagnosis. This paper introduces a retrospective case-control study that utilizes pedigree data, health data, and residential life-course location points to find space-time clustering of ancestors with a grandchild/child with a clinical diagnosis of ASD. Finding space-time clusters of ancestors at critical developmental windows serves as a proxy for shared environmental exposures. The authors refer to geographic life-course exposures as geographic legacies. Identifying space-time clusters of ancestors creates a bridge for researching exposures of past generations that may impact modern-day progeny health. Results from the space-time cluster analysis show multiple clusters for the maternal and paternal pedigrees. The paternal grandparent pedigree resulted in the most space-time clustering for birth and childhood developmental windows. No statistically significant clustering was found for adolescent years. These results will be further studied to identify the specific share of space-time environmental exposures. In conclusion, this study has found significant space-time clusters of parents, and grandparents for both maternal and paternal lineage. These results will be used to identify what environmental exposures have been shared with family members at critical developmental windows of time, and additional analysis will be applied.

Keywords: family pedigree, environmental exposure, geographic legacy, medical geography, transgenerational inheritance

Procedia PDF Downloads 102
561 Clustering Locations of Textile and Garment Industries to Compare with the Future Industrial Cluster in Thailand

Authors: Kanogkan Leerojanaprapa

Abstract:

Textile and garment industry is used to a major exporting industry of Thailand. According to lacking of the nation's price-competitiveness by stopping the EU's GSP (Generalised Scheme of Preferences) and ‘Nationwide Minimum Wage Policy’ that Thailand’s employers must pay all employees at least 300 baht (about $10) a day, the supply chains of the Thai textile and garment industry is affected and need to be reformed. Therefore, either Thai textile or garment industry will be existed or not would be concerned. This is also challenged for the government to decide which industries should be promoted the future industries of Thailand. Recently Thai government launch The Cluster-based Special Economic Development Zones Policy for promoting business cluster (effect on September 16, 2015). They define a cluster as the concentration of interconnected businesses and related institutions that operate within the same geographic areas and textiles and garment is one of target industrial clusters and 9 provinces are targeted (Bangkok, Kanchanaburi, Nakhon Pathom, Ratchaburi, Samut Sakhon, Chonburi, Chachoengsao, Prachinburi, and Sa Kaeo). The cluster zone are defined to link west-east corridor connected to manufacturing source in Cambodia and Mynmar to Bangkok where are promoted to be design, sourcing, and trading hub. The Thai government will provide tax and non-tax incentives for targeted industries within the clusters and expects these businesses are scattered to where they can get the most benefit which will identify future industrial cluster. This research will show the difference between the current cluster and future cluster following the target provinces of the textile and garment. The current cluster is analysed from secondary data. The four characteristics of the numbers of plants in Spinning, weaving and finishing of textiles, Manufacture of made-up textile articles, except apparel, Manufacture of knitted and crocheted fabrics, and Manufacture of other textiles, not elsewhere classified in particular 77 provinces (in total) are clustered by K-means cluster analysis and Hierarchical Cluster Analysis. In addition, the cluster can be confirmed and showed which variables contribute the most to defined cluster solution with ANOVA test. The results of analysis can identify 22 provinces (which the textile or garment plants are located) into 3 clusters. Plants in cluster 1 tend to be large numbers of plants which is only Bangkok, Next plants in cluster 2 tend to be moderate numbers of plants which are Samut Prakan, Samut Sakhon and Nakhon Pathom. Finally plants in cluster 3 tend to be little numbers of plants which are other 18 provinces. The same methodology can be implemented in other industries for future study.

Keywords: ANOVA, hierarchical cluster analysis, industrial clusters, K -means cluster analysis, textile and garment industry

Procedia PDF Downloads 197
560 Exploring Strategies Used by Victims of Intimate Partner Violence to Increase Sense of Safety: A Systematic Review and Quantitative Study

Authors: Thomas Nally, Jane Ireland, Roxanne Khan, Philip Birch

Abstract:

Intimate Partner Violence (IPV), a significant societal problem, affects individuals worldwide. However, the strategies victims use to keep safe are under-researched. IPV is significantly under-reported, and services often are not able to be accessed by all victims. Thus they are likely to use their own strategies to manage their victimization before being able to seek support. Two studies were completed to understand these strategies. A systematic review of the literature and study completed with professionals who work with victims was undertaken to understand this area. In study one, a systematic review of the literature (n=61 papers), were analyzed using Thematic Analysis. The results indicated that victims use a large array of behaviors to increase their sense of safety and coping with emotions but also experience significant barriers to help-seeking. In study 2, sixty-nine professionals completed a measure exploring the likelihood and effectiveness of various victim strategies regarding increasing their sense of safety. Strategies included in the measure were obtained from those identified in study 1. Findings indicated that professionals perceived victims of IPV to be more likely to employ safety strategies and coping behaviors that may be ineffective but not help-seeking behaviors. Further, the responses were analyzed using Cluster Analysis. Safety strategies resulted in five clusters; perpetrator-directed strategies, prevention strategies, cognitive reappraisal, safety planning and avoidance strategies. Help-Seeking resulted in six clusters; information or practical support, abuse-related support, emotional support, secondary support and informal support. Finally, coping resulted in four clusters; emotional coping, self-directed coping, thought recording/change and cognitive coping. Both studies indicate that victims may use a variety of strategies to manage their safety besides seeking help. Professionals working with victims, using a strength-based approach, should understand what is used and is effective for victims who are unable to leave the relationships or access external support.

Keywords: intimate partner violence, help-seeking, professional support, victims, victim coping, victim safety

Procedia PDF Downloads 171
559 Spatial Distribution and Cluster Analysis of Sexual Risk Behaviors and STIs Reported by Chinese Adults in Guangzhou, China: A Representative Population-Based Study

Authors: Fangjing Zhou, Wen Chen, Brian J. Hall, Yu Wang, Carl Latkin, Li Ling, Joseph D. Tucker

Abstract:

Background: Economic and social reforms designed to open China to the world has been successful, but also appear to have rapidly laid the foundation for the reemergence of STIs since 1980s. Changes in sexual behaviors, relationships, and norms among Chinese contributed to the STIs epidemic. As the massive population moved during the last 30 years, early coital debut, multiple sexual partnerships, and unprotected sex have increased within the general population. Our objectives were to assess associations between residences location, sexual risk behaviors and sexually transmitted infections (STIs) among adults living in Guangzhou, China. Methods: Stratified cluster sampling followed a two-step process was used to select populations aged 18-59 years in Guangzhou, China. Spatial methods including Geographic Information Systems (GIS) were utilized to identify 1400 coordinates with latitude and longitude. Face-to-face household interviews were conducted to collect self-report data on sexual risk behaviors and diagnosed STIs. Kulldorff’s spatial scan statistic was implemented to identify and detect spatial distribution and clusters of sexual risk behaviors and STIs. The presence and location of statistically significant clusters were mapped in the study areas using ArcGIS software. Results: In this study, 1215 of 1400 households attempted surveys, with 368 refusals, resulting in a sample of 751 completed surveys. The prevalence of self-reported sexual risk behaviors was between 5.1% and 50.0%. The self-reported lifetime prevalence of diagnosed STIs was 7.06%. Anal intercourse clustered in an area located along the border within the rural-urban continuum (p=0.001). High rate clusters for alcohol or other drugs using before sex (p=0.008) and migrants who lived in Guangzhou less than one year (p=0.007) overlapped this cluster. Excess cases for sex without a condom (p=0.031) overlapped the cluster for college students (p<0.001). Conclusions: Short-term migrants and college students reported greater sexual risk behaviors. Programs to increase safer sex within these communities to reduce the risk of STIs are warranted in Guangzhou. Spatial analysis identified geographical clusters of sexual risk behaviors, which is critical for optimizing surveillance and targeting control measures for these locations in the future.

Keywords: cluster analysis, migrant, sexual risk behaviors, spatial distribution

Procedia PDF Downloads 321
558 Revealing the Genome Based Biosynthetic Potential of a Streptomyces sp. Isolate BR123 Presenting Broad Spectrum Antimicrobial Activities

Authors: Neelma Ashraf

Abstract:

Actinomycetes, particularly genus Streptomyces is of great importance due to their role in the discovery of new natural products, particularly antimicrobial secondary metabolites in the medicinal science and biotechnology industry. Different Streptomyces strains were isolated from Helianthus annuus plants and tested for antibacterial and antifungal activities. The most promising five strains were chosen for further investigation, and growth conditions for antibiotic synthesis were optimised. The supernatants were extracted in different solvents, and the extracted products were analyzed using liquid chromatography-mass spectrometry (LC-MS) and biological testing. From one of the potent strains Streptomyces globusus sp. BR123, a compound lavendamycin was identified using these analytical techniques. In addition, this potent strain also produces a strong antifungal polyene compound with a quasimolecular ion of 2072. Streptomyces sp. BR123 was genome sequenced because of its promising antimicrobial potential in order to identify the gene cluster responsible for analyzed compound “lavendamycin”. The genome analysis yielded candidate genes responsible for the production of this potent compound. The genome sequence of 8.15 Mb of Streptomyces sp. isolate BR123 with a GC content of 72.63% and 8103 protein coding genes was attained. Many antimicrobial, antiparasitic, and anticancerous compounds were detected through multiple biosynthetic gene clusters predicted by in-Silico analysis. Though, the novelty of metabolites was determined through the insignificant resemblance with known biosynthetic gene clusters. The current study gives insight into the bioactive potential of Streptomyces sp. isolate BR123 with respect to the synthesis of bioactive secondary metabolites through genomic and spectrometric analysis. Moreover, the comparative genome study revealed the connection of isolate BR123 with other Streptomyces strains, which could expand the knowledge of this genus and the mechanism involved in the discovery of new antimicrobial metabolites.

Keywords: streptomyces, secondary metabolites, genome, biosynthetic gene clusters, high performance liquid chromatography, mass spectrometry

Procedia PDF Downloads 55
557 Book Recommendation Using Query Expansion and Information Retrieval Methods

Authors: Ritesh Kumar, Rajendra Pamula

Abstract:

In this paper, we present our contribution for book recommendation. In our experiment, we combine the results of Sequential Dependence Model (SDM) and exploitation of book information such as reviews, tags and ratings. This social information is assigned by users. For this, we used CLEF-2016 Social Book Search Track Suggestion task. Finally, our proposed method extensively evaluated on CLEF -2015 Social Book Search datasets, and has better performance (nDCG@10) compared to other state-of-the-art systems. Recently we got the good performance in CLEF-2016.

Keywords: sequential dependence model, social information, social book search, query expansion

Procedia PDF Downloads 270
556 An Improved K-Means Algorithm for Gene Expression Data Clustering

Authors: Billel Kenidra, Mohamed Benmohammed

Abstract:

Data mining technique used in the field of clustering is a subject of active research and assists in biological pattern recognition and extraction of new knowledge from raw data. Clustering means the act of partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar between themselves and dissimilar to objects of other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters leading to an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers and a poor choice of centers may lead to a local optimum that is quite inferior to the global optimum, we propose a strategy to initiate K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results prove how the efficiency has been significantly improved.

Keywords: microarray data mining, biological pattern recognition, partitional clustering, k-means algorithm, centroid initialization

Procedia PDF Downloads 172