Search results for: k-means cluster
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 378

Search results for: k-means cluster

228 Real Time Approach for Data Placement in Wireless Sensor Networks

Authors: Sanjeev Gupta, Mayank Dave

Abstract:

The issue of real-time and reliable report delivery is extremely important for taking effective decision in a real world mission critical Wireless Sensor Network (WSN) based application. The sensor data behaves differently in many ways from the data in traditional databases. WSNs need a mechanism to register, process queries, and disseminate data. In this paper we propose an architectural framework for data placement and management. We propose a reliable and real time approach for data placement and achieving data integrity using self organized sensor clusters. Instead of storing information in individual cluster heads as suggested in some protocols, in our architecture we suggest storing of information of all clusters within a cell in the corresponding base station. For data dissemination and action in the wireless sensor network we propose to use Action and Relay Stations (ARS). To reduce average energy dissipation of sensor nodes, the data is sent to the nearest ARS rather than base station. We have designed our architecture in such a way so as to achieve greater energy savings, enhanced availability and reliability.

Keywords: Cluster head, data reliability, real time communication, wireless sensor networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1761
227 Cluster Algorithm for Genetic Diversity

Authors: Manpreet Singh, Keerat Kaur, Bhavdeep Singh

Abstract:

With the hardware technology advancing, the cost of storing is decreasing. Thus there is an urgent need for new techniques and tools that can intelligently and automatically assist us in transferring this data into useful knowledge. Different techniques of data mining are developed which are helpful for handling these large size databases [7]. Data mining is also finding its role in the field of biotechnology. Pedigree means the associated ancestry of a crop variety. Genetic diversity is the variation in the genetic composition of individuals within or among species. Genetic diversity depends upon the pedigree information of the varieties. Parents at lower hierarchic levels have more weightage for predicting genetic diversity as compared to the upper hierarchic levels. The weightage decreases as the level increases. For crossbreeding, the two varieties should be more and more genetically diverse so as to incorporate the useful characters of the two varieties in the newly developed variety. This paper discusses the searching and analyzing of different possible pairs of varieties selected on the basis of morphological characters, Climatic conditions and Nutrients so as to obtain the most optimal pair that can produce the required crossbreed variety. An algorithm was developed to determine the genetic diversity between the selected wheat varieties. Cluster analysis technique is used for retrieving the results.

Keywords: Genetic diversity, pedigree, nutrients.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1750
226 Grouping and Indexing Color Features for Efficient Image Retrieval

Authors: M. V. Sudhamani, C. R. Venugopal

Abstract:

Content-based Image Retrieval (CBIR) aims at searching image databases for specific images that are similar to a given query image based on matching of features derived from the image content. This paper focuses on a low-dimensional color based indexing technique for achieving efficient and effective retrieval performance. In our approach, the color features are extracted using the mean shift algorithm, a robust clustering technique. Then the cluster (region) mode is used as representative of the image in 3-D color space. The feature descriptor consists of the representative color of a region and is indexed using a spatial indexing method that uses *R -tree thus avoiding the high-dimensional indexing problems associated with the traditional color histogram. Alternatively, the images in the database are clustered based on region feature similarity using Euclidian distance. Only representative (centroids) features of these clusters are indexed using *R -tree thus improving the efficiency. For similarity retrieval, each representative color in the query image or region is used independently to find regions containing that color. The results of these methods are compared. A JAVA based query engine supporting query-by- example is built to retrieve images by color.

Keywords: Content-based, indexing, cluster, region.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1760
225 Performance Evaluation of Energy Efficient Communication Protocol for Mobile Ad Hoc Networks

Authors: Toshihiko Sasama, Kentaro Kishida, Kazunori Sugahara, Hiroshi Masuyama

Abstract:

A mobile ad hoc network is a network of mobile nodes without any notion of centralized administration. In such a network, each mobile node behaves not only as a host which runs applications but also as a router to forward packets on behalf of others. Clustering has been applied to routing protocols to achieve efficient communications. A CH network expresses the connected relationship among cluster-heads. This paper discusses the methods for constructing a CH network, and produces the following results: (1) The required running costs of 3 traditional methods for constructing a CH network are not so different from each other in the static circumstance, or in the dynamic circumstance. Their running costs in the static circumstance do not differ from their costs in the dynamic circumstance. Meanwhile, although the routing costs required for the above 3 methods are not so different in the static circumstance, the costs are considerably different from each other in the dynamic circumstance. Their routing costs in the static circumstance are also very different from their costs in the dynamic circumstance, and the former is one tenths of the latter. The routing cost in the dynamic circumstance is mostly the cost for re-routing. (2) On the strength of the above results, we discuss new 2 methods regarding whether they are tolerable or not in the dynamic circumstance, that is, whether the times of re-routing are small or not. These new methods are revised methods that are based on the traditional methods. We recommended the method which produces the smallest routing cost in the dynamic circumstance, therefore producing the smallest total cost.

Keywords: cluster, mobile ad hoc network, re-routing cost, simulation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1296
224 Study of Chest Pain and its Risk Factors in Over 30 Year-Old Individuals

Authors: S. Dabiran

Abstract:

Chest pain is one of the most prevalent complaints among adults that cause the people to attend to medical centers. The aim was to determine the prevalence and risk factors of chest pain among over 30 years old people in Tehran. In this cross-sectional study, 787 adults took part from Apr 2005 until Apr 2006. The sampling method was random cluster sampling and there were 25 clusters. In each cluster, interviews were performed with 32 over 30 years old, people lived in those houses. In cases with chest pain, extra questions asked. The prevalence of CP was 9% (71 cases). Of them 21 cases (6.5%) were in 41-60 year age ranges and the remainders were over 61 year old. 19 cases (26.8%) mentioned CP in resting state and all of the cases had exertion onset CP. The CP duration was 10 minutes or less in all of the cases and in most of them (84.5%), the location of pain mentioned left anterior part of chest, left anterior part of sternum and or left arm. There was positive history of myocardial infarction in 12 cases (17%). There was significant relation between CP and age, sex and between history of myocardial infarction and marital state of study people. Our results are similar to other studies- results in most parts, however it is necessary to perform supplementary tests and follow up studies to differentiate between cardiac and non-cardiac CP exactly.

Keywords: Chest pain, myocardial infarction, risk factor, prevalence

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1413
223 The Algorithm of Semi-Automatic Thai Spoonerism Words for Bi-Syllable

Authors: Nutthapat Kaewrattanapat, Wannarat Bunchongkien

Abstract:

The purposes of this research are to study and develop the algorithm of Thai spoonerism words by semi-automatic computer programs, that is to say, in part of data input, syllables are already separated and in part of spoonerism, the developed algorithm is utilized, which can establish rules and mechanisms in Thai spoonerism words for bi-syllables by utilizing analysis in elements of the syllables, namely cluster consonant, vowel, intonation mark and final consonant. From the study, it is found that bi-syllable Thai spoonerism has 1 case of spoonerism mechanism, namely transposition in value of vowel, intonation mark and consonant of both 2 syllables but keeping consonant value and cluster word (if any). From the study, the rules and mechanisms in Thai spoonerism word were applied to develop as Thai spoonerism word software, utilizing PHP program. the software was brought to conduct a performance test on software execution; it is found that the program performs bi-syllable Thai spoonerism correctly or 99% of all words used in the test and found faults on the program at 1% as the words obtained from spoonerism may not be spelling in conformity with Thai grammar and the answer in Thai spoonerism could be more than 1 answer.

Keywords: Algorithm, Spoonerism, Computational Linguistics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2299
222 Diversity Analysis of a Quinoa (Chenopodium quinoa Willd.) Germplasm during Two Seasons

Authors: M. Mhada, E. N. Jellen, S. E. Jacobsen, O. Benlhabib

Abstract:

The present work has been carried out to evaluate the diversity of a collection of 78 quinoa accessions developed through recurrent selection from Andean germplasm introduced to Morocco in the winter of 2000. Twenty-three quantitative and qualitative characters were used for the evaluation of genetic diversity and the relationship between the accessions, and also for the establishment of a core collection in Morocco. Important variation was found among the accessions in terms of plant morphology and growth behavior. Data analysis showed positive correlation of the plant height, the plant fresh and the dry weight with the grain yield, while days to flowering was found to be negatively correlated with grain yield. The first four PCs contributed 74.76% of the variability; the first PC showed significant variation with 42.86% of the total variation, PC2 with 15.37%, PC3 with 9.05% and PC4 contributed 7.49% of the total variation. Plant size, days to grain filling and days to maturity are correlated to the PC1; and seed size, inflorescence density and mildew resistance are correlated to the PC2. Hierarchical cluster analysis rearranged the 78 quinoa accessions into four main groups and ten sub-clusters. Clustering was found in associations with days to maturity and also with plant size and seed-size traits.

Keywords: Character association, Chenopodium quinoa, Diversity analysis, Morphotypic cluster, Multivariate analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2526
221 Comparative Study of Complexity in Streetscape Composition

Authors: Ahmed Mansouri, Naoji Matsumoto

Abstract:

This research is a comparative study of complexity, as a multidimensional concept, in the context of streetscape composition in Algeria and Japan. 80 streetscapes visual arrays have been collected and then presented to 20 participants, with different cultural backgrounds, in order to be categorized and classified according to their degrees of complexity. Three analysis methods have been used in this research: cluster analysis, ranking method and Hayashi Quantification method (Method III). The results showed that complexity, disorder, irregularity and disorganization are often conflicting concepts in the urban context. Algerian daytime streetscapes seem to be balanced, ordered and regular, and Japanese daytime streetscapes seem to be unbalanced, regular and vivid. Variety, richness and irregularity with some aspects of order and organization seem to characterize Algerian night streetscapes. Japanese night streetscapes seem to be more related to balance, regularity, order and organization with some aspects of confusion and ambiguity. Complexity characterized mainly Algerian avenues with green infrastructure. Therefore, for Japanese participants, Japanese traditional night streetscapes were complex. And for foreigners, Algerian and Japanese avenues nightscapes were the most complex visual arrays.

Keywords: Streetscape, Nightscape, Complexity, Visual Array, Affordance, Cluster Analysis, Hayashi Quantification Method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2286
220 Mining Network Data for Intrusion Detection through Naïve Bayesian with Clustering

Authors: Dewan Md. Farid, Nouria Harbi, Suman Ahmmed, Md. Zahidur Rahman, Chowdhury Mofizur Rahman

Abstract:

Network security attacks are the violation of information security policy that received much attention to the computational intelligence society in the last decades. Data mining has become a very useful technique for detecting network intrusions by extracting useful knowledge from large number of network data or logs. Naïve Bayesian classifier is one of the most popular data mining algorithm for classification, which provides an optimal way to predict the class of an unknown example. It has been tested that one set of probability derived from data is not good enough to have good classification rate. In this paper, we proposed a new learning algorithm for mining network logs to detect network intrusions through naïve Bayesian classifier, which first clusters the network logs into several groups based on similarity of logs, and then calculates the prior and conditional probabilities for each group of logs. For classifying a new log, the algorithm checks in which cluster the log belongs and then use that cluster-s probability set to classify the new log. We tested the performance of our proposed algorithm by employing KDD99 benchmark network intrusion detection dataset, and the experimental results proved that it improves detection rates as well as reduces false positives for different types of network intrusions.

Keywords: Clustering, detection rate, false positive, naïveBayesian classifier, network intrusion detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5486
219 Normalizing Scientometric Indicators of Individual Publications Using Local Cluster Detection Methods on Citation Networks

Authors: Levente Varga, Dávid Deritei, Mária Ercsey-Ravasz, Răzvan Florian, Zsolt I. Lázár, István Papp, Ferenc Járai-Szabó

Abstract:

One of the major shortcomings of widely used scientometric indicators is that different disciplines cannot be compared with each other. The issue of cross-disciplinary normalization has been long discussed, but even the classification of publications into scientific domains poses problems. Structural properties of citation networks offer new possibilities, however, the large size and constant growth of these networks asks for precaution. Here we present a new tool that in order to perform cross-field normalization of scientometric indicators of individual publications relays on the structural properties of citation networks. Due to the large size of the networks, a systematic procedure for identifying scientific domains based on a local community detection algorithm is proposed. The algorithm is tested with different benchmark and real-world networks. Then, by the use of this algorithm, the mechanism of the scientometric indicator normalization process is shown for a few indicators like the citation number, P-index and a local version of the PageRank indicator. The fat-tail trend of the article indicator distribution enables us to successfully perform the indicator normalization process.

Keywords: Citation networks, scientometric indicator, cross-field normalization, local cluster detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 671
218 Unsupervised Segmentation Technique for Acute Leukemia Cells Using Clustering Algorithms

Authors: N. H. Harun, A. S. Abdul Nasir, M. Y. Mashor, R. Hassan

Abstract:

Leukaemia is a blood cancer disease that contributes to the increment of mortality rate in Malaysia each year. There are two main categories for leukaemia, which are acute and chronic leukaemia. The production and development of acute leukaemia cells occurs rapidly and uncontrollable. Therefore, if the identification of acute leukaemia cells could be done fast and effectively, proper treatment and medicine could be delivered. Due to the requirement of prompt and accurate diagnosis of leukaemia, the current study has proposed unsupervised pixel segmentation based on clustering algorithm in order to obtain a fully segmented abnormal white blood cell (blast) in acute leukaemia image. In order to obtain the segmented blast, the current study proposed three clustering algorithms which are k-means, fuzzy c-means and moving k-means algorithms have been applied on the saturation component image. Then, median filter and seeded region growing area extraction algorithms have been applied, to smooth the region of segmented blast and to remove the large unwanted regions from the image, respectively. Comparisons among the three clustering algorithms are made in order to measure the performance of each clustering algorithm on segmenting the blast area. Based on the good sensitivity value that has been obtained, the results indicate that moving kmeans clustering algorithm has successfully produced the fully segmented blast region in acute leukaemia image. Hence, indicating that the resultant images could be helpful to haematologists for further analysis of acute leukaemia.

Keywords: Acute Leukaemia Images, Clustering Algorithms, Image Segmentation, Moving k-Means.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2718
217 In vitro Study of Laser Diode Radiation Effect on the Photo-Damage of MCF-7 and MCF-10A Cell Clusters

Authors: A. Dashti, M. Eskandari, L. Farahmand, P. Parvin, A. Jafargholi

Abstract:

Breast Cancer is one of the most considerable diseases in the United States and other countries and is the second leading cause of death in women. Common breast cancer treatments would lead to adverse side effects such as loss of hair, nausea, and weakness. These complications arise because these cancer treatments damage some healthy cells while eliminating the cancer cells. In an effort to address these complications, laser radiation was utilized and tested as a targeted cancer treatment for breast cancer. In this regard, tissue engineering approaches are being employed by using an electrospun scaffold in order to facilitate the growth of breast cancer cells. Polycaprolacton (PCL) was used as a material for scaffold fabricating because of its biocompatibility, biodegradability, and supporting cell growth. The specific breast cancer cells have the ability to create a three-dimensional cell cluster due to the spontaneous accumulation of cells in the porosity of the scaffold under some specific conditions. Therefore, we are looking for a higher density of porosity and larger pore size. Fibers showed uniform diameter distribution and final scaffold had optimum characteristics with approximately 40% porosity. The images were taken by SEM and the density and the size of the porosity were determined with the Image. After scaffold preparation, it has cross-linked by glutaraldehyde. Then, it has been washed with glycine and phosphate buffer saline (PBS), in order to neutralize the residual glutaraldehyde. 3-(4,5-Dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromidefor (MTT) results have represented approximately 91.13% viability of the scaffolds for cancer cells. In order to create a cluster, Michigan Cancer Foundation-7 (MCF-7, breast cancer cell line) and Michigan Cancer Foundation-10A (MCF-10A, human mammary epithelial cell line) cells were cultured on the scaffold in 24 well plate for five days. Then, we have exposed the cluster to the laser diode 808 nm radiation to investigate the effect of laser on the tumor with different power and time. Under the same conditions, cancer cells lost their viability more than the healthy ones. In conclusion, laser therapy is a viable method to destroy the target cells and has a minimum effect on the healthy tissues and cells and it can improve the other method of cancer treatments limitations.

Keywords: Breast cancer, electrospun scaffold, polycaprolacton, laser diode, cancer treatment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 730
216 Electricity Generation from Renewables and Targets: An Application of Multivariate Statistical Techniques

Authors: Filiz Ersoz, Taner Ersoz, Tugrul Bayraktar

Abstract:

Renewable energy is referred to as "clean energy" and common popular support for the use of renewable energy (RE) is to provide electricity with zero carbon dioxide emissions. This study provides useful insight into the European Union (EU) RE, especially, into electricity generation obtained from renewables, and their targets. The objective of this study is to identify groups of European countries, using multivariate statistical analysis and selected indicators. The hierarchical clustering method is used to decide the number of clusters for EU countries. The conducted statistical hierarchical cluster analysis is based on the Ward’s clustering method and squared Euclidean distances. Hierarchical cluster analysis identified eight distinct clusters of European countries. Then, non-hierarchical clustering (k-means) method was applied. Discriminant analysis was used to determine the validity of the results with data normalized by Z score transformation. To explore the relationship between the selected indicators, correlation coefficients were computed. The results of the study reveal the current situation of RE in European Union Member States.

Keywords: Share of electricity generation, CO2 emission, targets, multivariate methods, hierarchical clustering, K-means clustering, discriminant analyzed, correlation, EU member countries.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1191
215 A Study on the Relation among Primary Care Professionals Serving the Disadvantaged Community, Socioeconomic Status, and Adverse Health Outcome

Authors: Chau-Kuang Chen, Juanita Buford, Colette Davis, Raisha Allen, John Hughes, Jr., James Tyus, Dexter Samuels

Abstract:

During the post-Civil War era, the city of Nashville, Tennessee, had the highest mortality rate in the United States. The elevated death and disease rates among former slaves were attributable to lack of quality healthcare. To address the paucity of healthcare services, Meharry Medical College, an institution with the mission of educating minority professionals and serving the underserved population, was established in 1876. Purpose: The social ecological framework and partial least squares (PLS) path modeling were used to quantify the impact of socioeconomic status and adverse health outcome on primary care professionals serving the disadvantaged community. Thus, the study results could demonstrate the accomplishment of the College’s mission of training primary care professionals to serve in underserved areas. Methods: Various statistical methods were used to analyze alumni data from 1975 – 2013. K-means cluster analysis was utilized to identify individual medical and dental graduates in the cluster groups of the practice communities (Disadvantaged or Non-disadvantaged Communities). Discriminant analysis was implemented to verify the classification accuracy of cluster analysis. The independent t-test was performed to detect the significant mean differences of respective clustering and criterion variables. Chi-square test was used to test if the proportions of primary care and non-primary care specialists are consistent with those of medical and dental graduates practicing in the designated community clusters. Finally, the PLS path model was constructed to explore the construct validity of analytic model by providing the magnitude effects of socioeconomic status and adverse health outcome on primary care professionals serving the disadvantaged community. Results: Approximately 83% (3,192/3,864) of Meharry Medical College’s medical and dental graduates from 1975 to 2013 were practicing in disadvantaged communities. Independent t-test confirmed the content validity of the cluster analysis model. Also, the PLS path modeling demonstrated that alumni served as primary care professionals in communities with significantly lower socioeconomic status and higher adverse health outcome (p < .001). The PLS path modeling exhibited the meaningful interrelation between primary care professionals practicing communities and surrounding environments (socioeconomic statues and adverse health outcome), which yielded model reliability, validity, and applicability. Conclusion: This study applied social ecological theory and analytic modeling approaches to assess the attainment of Meharry Medical College’s mission of training primary care professionals to serve in underserved areas, particularly in communities with low socioeconomic status and high rates of adverse health outcomes. In summary, the majority of medical and dental graduates from Meharry Medical College provided primary care services to disadvantaged communities with low socioeconomic status and high adverse health outcome, which demonstrated that Meharry Medical College has fulfilled its mission. The high reliability, validity, and applicability of this model imply that it could be replicated for comparable universities and colleges elsewhere.

Keywords: Disadvantaged Community, K-means Cluster Analysis, PLS Path Modeling, Primary care.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1971
214 Non-Coplanar Nuclei in Heavy-Ion Reactions

Authors: Sahila Chopra, Hemdeep, Arshdeep Kaur, Raj K. Gupta

Abstract:

In recent times, we noticed an interesting and important role of non-coplanar degree-of-freedom (Φ = 00) in heavy ion reactions. Using the dynamical cluster-decay model (DCM) with Φ degree-of-freedom included, we have studied three compound systems 246Bk∗, 164Yb∗ and 105Ag∗. Here, within the DCM with pocket formula for nuclear proximity potential, we look for the effects of including compact, non-coplanar configurations (Φc = 00) on the non-compound nucleus (nCN) contribution in total fusion cross section σfus. For 246Bk∗, formed in 11B+235U and 14N+232Th reaction channels, the DCM with coplanar nuclei (Φc = 00) shows an nCN contribution for 11B+235U channel, but none for 14N+232Th channel, which on including Φ gives both reaction channels as pure compound nucleus decays. In the case of 164Yb∗, formed in 64Ni+100Mo, the small nCN effects for Φ=00 are reduced to almost zero for Φ = 00. Interestingly, however, 105Ag∗ for Φ = 00 shows a small nCN contribution, which gets strongly enhanced for Φ = 00, such that the characteristic property of PCN presents a change of behaviour, like that of a strongly fissioning superheavy element to a weakly fissioning nucleus; note that 105Ag∗ is a weakly fissioning nucleus and Psurv behaves like one for a weakly fissioning nucleus for both Φ = 00 and Φ = 00. Apparently, Φ is presenting itself like a good degree-of-freedom in the DCM.

Keywords: Dynamical cluster-decay model, fusion cross sections, non-compound nucleus effects, non-coplanarity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1124
213 Info-participation of the Disabled Using the Mixed Preference Data in Improving Their Travel Quality

Authors: Y. Duvarci, S. Mizokami

Abstract:

Today, the preferences and participation of the TD groups such as the elderly and disabled is still lacking in decision-making of transportation planning, and their reactions to certain type of policies are not well known. Thus, a clear methodology is needed. This study aimed to develop a method to extract the preferences of the disabled to be used in the policy-making stage that can also guide to future estimations. The method utilizes the combination of cluster analysis and data filtering using the data of the Arao city (Japan). The method is a process that follows: defining the TD group by the cluster analysis tool, their travel preferences in tabular form from the household surveys by policy variableimpact pairs, zones, and by trip purposes, and the final outcome is the preference probabilities of the disabled. The preferences vary by trip purpose; for the work trips, accessibility and transit system quality policies with the accompanying impacts of modal shifts towards public mode use as well as the decreasing travel costs, and the trip rate increase; for the social trips, the same accessibility and transit system policies leading to the same mode shift impact, together with the travel quality policy area leading to trip rate increase. These results explain the policies to focus and can be used in scenario generation in models, or any other planning purpose as decision support tool.

Keywords: Transportation Disadvantaged, Disabled, Mixed Preference, Stated Preference Data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1020
212 A Growing Natural Gas Approach for Evaluating Quality of Software Modules

Authors: Parvinder S. Sandhu, Sandeep Khimta, Kiranpreet Kaur

Abstract:

The prediction of Software quality during development life cycle of software project helps the development organization to make efficient use of available resource to produce the product of highest quality. “Whether a module is faulty or not" approach can be used to predict quality of a software module. There are numbers of software quality prediction models described in the literature based upon genetic algorithms, artificial neural network and other data mining algorithms. One of the promising aspects for quality prediction is based on clustering techniques. Most quality prediction models that are based on clustering techniques make use of K-means, Mixture-of-Guassians, Self-Organizing Map, Neural Gas and fuzzy K-means algorithm for prediction. In all these techniques a predefined structure is required that is number of neurons or clusters should be known before we start clustering process. But in case of Growing Neural Gas there is no need of predetermining the quantity of neurons and the topology of the structure to be used and it starts with a minimal neurons structure that is incremented during training until it reaches a maximum number user defined limits for clusters. Hence, in this work we have used Growing Neural Gas as underlying cluster algorithm that produces the initial set of labeled cluster from training data set and thereafter this set of clusters is used to predict the quality of test data set of software modules. The best testing results shows 80% accuracy in evaluating the quality of software modules. Hence, the proposed technique can be used by programmers in evaluating the quality of modules during software development.

Keywords: Growing Neural Gas, data clustering, fault prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1806
211 Improving Similarity Search Using Clustered Data

Authors: Deokho Kim, Wonwoo Lee, Jaewoong Lee, Teresa Ng, Gun-Ill Lee, Jiwon Jeong

Abstract:

This paper presents a method for improving object search accuracy using a deep learning model. A major limitation to provide accurate similarity with deep learning is the requirement of huge amount of data for training pairwise similarity scores (metrics), which is impractical to collect. Thus, similarity scores are usually trained with a relatively small dataset, which comes from a different domain, causing limited accuracy on measuring similarity. For this reason, this paper proposes a deep learning model that can be trained with a significantly small amount of data, a clustered data which of each cluster contains a set of visually similar images. In order to measure similarity distance with the proposed method, visual features of two images are extracted from intermediate layers of a convolutional neural network with various pooling methods, and the network is trained with pairwise similarity scores which is defined zero for images in identical cluster. The proposed method outperforms the state-of-the-art object similarity scoring techniques on evaluation for finding exact items. The proposed method achieves 86.5% of accuracy compared to the accuracy of the state-of-the-art technique, which is 59.9%. That is, an exact item can be found among four retrieved images with an accuracy of 86.5%, and the rest can possibly be similar products more than the accuracy. Therefore, the proposed method can greatly reduce the amount of training data with an order of magnitude as well as providing a reliable similarity metric.

Keywords: Visual search, deep learning, convolutional neural network, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 763
210 Comparison of Methods of Estimation for Use in Goodness of Fit Tests for Binary Multilevel Models

Authors: I. V. Pinto, M. R. Sooriyarachchi

Abstract:

It can be frequently observed that the data arising in our environment have a hierarchical or a nested structure attached with the data. Multilevel modelling is a modern approach to handle this kind of data. When multilevel modelling is combined with a binary response, the estimation methods get complex in nature and the usual techniques are derived from quasi-likelihood method. The estimation methods which are compared in this study are, marginal quasi-likelihood (order 1 & order 2) (MQL1, MQL2) and penalized quasi-likelihood (order 1 & order 2) (PQL1, PQL2). A statistical model is of no use if it does not reflect the given dataset. Therefore, checking the adequacy of the fitted model through a goodness-of-fit (GOF) test is an essential stage in any modelling procedure. However, prior to usage, it is also equally important to confirm that the GOF test performs well and is suitable for the given model. This study assesses the suitability of the GOF test developed for binary response multilevel models with respect to the method used in model estimation. An extensive set of simulations was conducted using MLwiN (v 2.19) with varying number of clusters, cluster sizes and intra cluster correlations. The test maintained the desirable Type-I error for models estimated using PQL2 and it failed for almost all the combinations of MQL. Power of the test was adequate for most of the combinations in all estimation methods except MQL1. Moreover, models were fitted using the four methods to a real-life dataset and performance of the test was compared for each model.

Keywords: Goodness-of-fit test, marginal quasi-likelihood, multilevel modelling, type-I error, penalized quasi-likelihood, power, quasi-likelihood.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 680
209 Localization of Geospatial Events and Hoax Prediction in the UFO Database

Authors: Harish Krishnamurthy, Anna Lafontant, Ren Yi

Abstract:

Unidentified Flying Objects (UFOs) have been an interesting topic for most enthusiasts and hence people all over the United States report such findings online at the National UFO Report Center (NUFORC). Some of these reports are a hoax and among those that seem legitimate, our task is not to establish that these events confirm that they indeed are events related to flying objects from aliens in outer space. Rather, we intend to identify if the report was a hoax as was identified by the UFO database team with their existing curation criterion. However, the database provides a wealth of information that can be exploited to provide various analyses and insights such as social reporting, identifying real-time spatial events and much more. We perform analysis to localize these time-series geospatial events and correlate with known real-time events. This paper does not confirm any legitimacy of alien activity, but rather attempts to gather information from likely legitimate reports of UFOs by studying the online reports. These events happen in geospatial clusters and also are time-based. We look at cluster density and data visualization to search the space of various cluster realizations to decide best probable clusters that provide us information about the proximity of such activity. A random forest classifier is also presented that is used to identify true events and hoax events, using the best possible features available such as region, week, time-period and duration. Lastly, we show the performance of the scheme on various days and correlate with real-time events where one of the UFO reports strongly correlates to a missile test conducted in the United States.

Keywords: Time-series clustering, feature extraction, hoax prediction, geospatial events.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 792
208 PoPCoRN: A Power-Aware Periodic Surveillance Scheme in Convex Region using Wireless Mobile Sensor Networks

Authors: A. K. Prajapati

Abstract:

In this paper, the periodic surveillance scheme has been proposed for any convex region using mobile wireless sensor nodes. A sensor network typically consists of fixed number of sensor nodes which report the measurements of sensed data such as temperature, pressure, humidity, etc., of its immediate proximity (the area within its sensing range). For the purpose of sensing an area of interest, there are adequate number of fixed sensor nodes required to cover the entire region of interest. It implies that the number of fixed sensor nodes required to cover a given area will depend on the sensing range of the sensor as well as deployment strategies employed. It is assumed that the sensors to be mobile within the region of surveillance, can be mounted on moving bodies like robots or vehicle. Therefore, in our scheme, the surveillance time period determines the number of sensor nodes required to be deployed in the region of interest. The proposed scheme comprises of three algorithms namely: Hexagonalization, Clustering, and Scheduling, The first algorithm partitions the coverage area into fixed sized hexagons that approximate the sensing range (cell) of individual sensor node. The clustering algorithm groups the cells into clusters, each of which will be covered by a single sensor node. The later determines a schedule for each sensor to serve its respective cluster. Each sensor node traverses all the cells belonging to the cluster assigned to it by oscillating between the first and the last cell for the duration of its life time. Simulation results show that our scheme provides full coverage within a given period of time using few sensors with minimum movement, less power consumption, and relatively less infrastructure cost.

Keywords: Sensor Network, Graph Theory, MSN, Communication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1410
207 Influence of Drought on Yield and Yield Components in White Bean

Authors: Gholamreza Habibi

Abstract:

In order to study seed yield and seed yield components in bean under reduced irrigation condition and assessment drought tolerance of genotypes, 15 lines of White beans were evaluated in two separate RCB design with 3 replications under stress and non stress conditions. Analysis of variance showed that there were significant differences among varieties in terms of traits under study, indicating the existence of genetic variation among varieties. The results indicate that drought stress reduced seed yield, number of seed per plant, biological yield and number of pod in White been. In non stress condition, yield was highly correlated with the biological yield, whereas in stress condition it was highly correlated with harvest index. Results of stepwise regression showed that, selection can we done based on, biological yield, harvest index, number of seed per pod, seed length, 100 seed weight. Result of path analysis showed that the highest direct effect, being positive, was related to biological yield in non stress and to harvest index in stress conditions. Factor analysis were accomplished in stress and nonstress condition a, there were 4 factors that explained more than 76 percent of total variations. We used several selection indices such as Stress Susceptibility Index ( SSI ), Geometric Mean Productivity ( GMP ), Mean Productivity ( MP ), Stress Tolerance Index ( STI ) and Tolerance Index ( TOL ) to study drought tolerance of genotypes, we found that the best Stress Index for selection tolerance genotypes were STI, GMP and MP were the greatest correlations between these Indices and seed yield under stress and non stress conditions. In classification of genotypes base on phenotypic characteristics, using cluster analysis ( UPGMA ), all allels classified in 5 separate groups in stress and non stress conditions.

Keywords: Cluster analysis, factor analysis, path analysis, selection index, White bean

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2082
206 The Relationship between the Feeling of Distributive Justice and National Identity of the Youth

Authors: Leila Batmany

Abstract:

This research studies the relationship between the feeling of distributive justice and national identity of the youth. The present analysis intends to experimentally investigate the various dimensions of the justice feeling and its effect on the national identity components. The study has taken justice into consideration from four different points of view on the basis of availability of valuable social sources such as power, wealth, knowledge and status in the political, economic, and cultural and status justice respectively. Furthermore, the national identity has been considered as the feeling of honour, attachment and commitment towards national society and its seven components i.e. history, language, culture, political system, religion, geographical territory and society. The 'field study' has been used as the method for the research with the individual as unit, taking 368 young between the age of 18 and 29 living in Tehran, chosen randomly according to Cochran formula. The individual samples have been randomly chosen among five districts in north, south, west, east, and centre of Tehran, based on the multistage cluster sampling. The data collection has been performed with the use of questionnaire and interview. The most important results are as follows: i) The feeling of economic justice is the weakest one among the youth. ii) The strongest and the weakest dimensions of the national identity are, respectively, the historical and the social dimension. iii) There is a positive and meaningful relationship between the feeling political and statues justice and then national identity, whereas no meaningful relationship exists between the economic and cultural justice and the national identity. iv) There is a positive and meaningful relationship between the feeling of justice in all dimensions and legitimacy of the political system. There is also such a relationship between the legitimacy of the political system and national identity. v) Generally, there is a positive and meaningful relationship between the feeling of distributive justice and national identity among the youth. vi) It is through the legitimacy of the political system that justice feeling can have an influence on the national identity.

Keywords: Distributive justice, national identity, legitimacy of political system, Cochran formula, multistage cluster sampling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 457
205 Mapping the Quotidian Life of Practitioners of Various Religious Sects in Late Medieval Bengal: Portrayals on the Front Façades of the Baranagar Temple Cluster

Authors: I. Gupta, B. Karmakar

Abstract:

Bengal has a long history (8th century A.D. onwards) of decorating the wall of brick-built temples with curved terracotta plaques on a diverse range of subjects. These could be considered as one of the most significant visual archives to understand the various facets of the then contemporary societies. The temples under focus include Char-bangla temple complex (circa 1755 A.D.), Bhavanishvara temple (circa 1755 A.D.) and the Gangeshvara Shiva Jor-bangla temple (circa 1753 A.D.), located within a part of the river Bhagirathi basin in Baranagar, Murshidabad, West Bengal, India. Though, a diverse range of subjects have been intricately carved mainly on the front façades of the Baranagar temple cluster, the study specifically concentrates on depictions related to religious and non-religious acts performed by practitioners of various religious sects of late medieval Bengal with the intention to acquire knowledge about the various facets of their life. Apart from this, the paper also mapped the spatial location of these religious performers on the temples’ façades to examine if any systematic plan or arrangement had been employed for connoting a particular idea. Further, an attempt is made to provide a commentary on the attire worn by followers of various religious sects of late medieval Bengal. The primary materials for the study comprise the depictions which denote religious activities carved on the terracotta plaques. The secondary material has been collected from published and unpublished theses, journals and books. These data have been further supplemented with photographic documentation, some useful line-drawings and descriptions in table format to get a clear understanding of the concerned issues.

Keywords: Attire, scheme of allocation, terracotta temple, various religious sect.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 721
204 An Analysis of Institutional Environments on Corporate Social Responsibility Practices in Nigerian Renewable Energy Firms

Authors: Bolanle Deborah Motilewa, E. K. Rowland Worlu, Gbenga Mayowa Agboola, Ayodele Maxwell Olokundun

Abstract:

Several studies have proposed a one-size fit all approach to Corporate Social Responsibility (CSR) practices, such that CSR as it applies to developed countries is adapted to developing countries, ignoring the differing institutional environments (such as the regulative, economic, social and political environments), which affects the profitability and practices of businesses operating in them. CSR as it applies to filling institutional gaps in developing countries, was categorized into four themes: environmental protection, product and service innovation, social innovation and local cluster development. Based on the four themes, the study employed a qualitative research approach through the use of interviews and review of available publications to study the influence of institutional environments on CSR practices engaged in by three renewable energy firms operating in Nigeria. Over the course of three 60-minutes sessions with the top management and selected workers of the firms, four propositions were made: regulatory environment influences environmental protection practice of Nigerian renewable firms, economic environment influences product and service innovation practice of Nigerian renewable energy firms, the social environment impacts on social innovation in Nigerian renewable energy firms, and political environment affects local cluster development practice of Nigerian renewable energy firms. It was also observed that beyond institutional environments, the international exposure of an organization’s managers reflected in their approach to CSR. This finding on the influence of international exposure on CSR practices creates an area for further study. Insights from this paper are set to help policy makers in developing countries, CSR managers, and future researchers.

Keywords: Corporate social responsibility, institutional environment, renewable energy firms, developing countries, international exposure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1374
203 Tribological Aspects of Advanced Roll Material in Cold Rolling of Stainless Steel

Authors: Mohammed Tahir, Jonas Lagergren

Abstract:

Vancron 40, a nitrided powder metallurgical tool Steel, is used in cold work applications where the predominant failure mechanisms are adhesive wear or galling. Typical applications of Vancron 40 are among others fine blanking, cold extrusion, deep drawing and cold work rolls for cluster mills. Vancron 40 positive results for cold work rolls for cluster mills and as a tool for some severe metal forming process makes it competitive compared to other type of work rolls that require higher precision, among others in cold rolling of thin stainless steel, which required high surface finish quality. In this project, three roll materials for cold rolling of stainless steel strip was examined, Vancron 40, Narva 12B (a high-carbon, high-chromium tool steel alloyed with tungsten) and Supra 3 (a Chromium-molybdenum tungsten-vanadium alloyed high speed steel). The purpose of this project was to study the depth profiles of the ironed stainless steel strips, emergence of galling and to study the lubrication performance used by steel industries. Laboratory experiments were conducted to examine scratch of the strip, galling and surface roughness of the roll materials under severe tribological conditions. The critical sliding length for onset of galling was estimated for stainless steel with four different lubricants. Laboratory experiments result of performance evaluation of resistance capability of rolls toward adhesive wear under severe conditions for low and high reductions. Vancron 40 in combination with cold rolling lubricant gave good surface quality, prevents galling of metal surfaces and good bearing capacity.

Keywords: Adhesive wear, Cold rolling, Lubricant, Stainless steel, Surface finish, Vancron 40.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2707
202 Resting-State Functional Connectivity Analysis Using an Independent Component Approach

Authors: Eric Jacob Bacon, Chaoyang Jin, Dianning He, Shuaishuai Hu, Lanbo Wang, Han Li, Shouliang Qi

Abstract:

Refractory epilepsy is a complicated type of epilepsy that can be difficult to diagnose. Recent technological advancements have made resting-state functional magnetic resonance (rsfMRI) a vital technique for studying brain activity. However, there is still much to learn about rsfMRI. Investigating rsfMRI connectivity may aid in the detection of abnormal activities. In this paper, we propose studying the functional connectivity of rsfMRI candidates to diagnose epilepsy. 45 rsfMRI candidates, comprising 26 with refractory epilepsy and 19 healthy controls, were enrolled in this study. A data-driven approach known as Independent Component Analysis (ICA) was used to achieve our goal. First, rsfMRI data from both patients and healthy controls were analyzed using group ICA. The components that were obtained were then spatially sorted to find and select meaningful ones. A two-sample t-test was also used to identify abnormal networks in patients and healthy controls. Finally, based on the fractional amplitude of low-frequency fluctuations (fALFF), a chi-square statistic test was used to distinguish the network properties of the patient and healthy control groups. The two-sample t-test analysis yielded abnormal in the default mode network, including the left superior temporal lobe and the left supramarginal. The right precuneus was found to be abnormal in the dorsal attention network. In addition, the frontal cortex showed an abnormal cluster in the medial temporal gyrus. In contrast, the temporal cortex showed an abnormal cluster in the right middle temporal gyrus and the right fronto-operculum gyrus. Finally, the chi-square statistic test was significant, producing a p-value of 0.001 for the analysis. This study offers evidence that investigating rsfMRI connectivity provides an excellent diagnosis option for refractory epilepsy.

Keywords: Independent Component Analysis, Resting State Network, refractory epilepsy, rsfMRI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 183
201 A Simple User Administration View of Computing Clusters

Authors: Valeria M. Bastos, Myrian A. Costa, Matheus Ambrozio, Nelson F. F. Ebecken

Abstract:

In this paper a very simple and effective user administration view of computing clusters systems is implemented in order of friendly provide the configuration and monitoring of distributed application executions. The user view, the administrator view, and an internal control module create an illusionary management environment for better system usability. The architecture, properties, performance, and the comparison with others software for cluster management are briefly commented.

Keywords: Big data, computing clusters, administration view, user view.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1506
200 Hierarchical Clustering Algorithms in Data Mining

Authors: Z. Abdullah, A. R. Hamdan

Abstract:

Clustering is a process of grouping objects and data into groups of clusters to ensure that data objects from the same cluster are identical to each other. Clustering algorithms in one of the area in data mining and it can be classified into partition, hierarchical, density based and grid based. Therefore, in this paper we do survey and review four major hierarchical clustering algorithms called CURE, ROCK, CHAMELEON and BIRCH. The obtained state of the art of these algorithms will help in eliminating the current problems as well as deriving more robust and scalable algorithms for clustering.

Keywords: Clustering, method, algorithm, hierarchical, survey.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3324
199 Growth of Droplet in Radiation-Induced Plasma of Own Steam

Authors: Pavlo Selyshchev

Abstract:

The theoretical approach is developed to describe the change of drops in the atmosphere of own steam and buffer gas under irradiation. It is shown that the irradiation influences on size of stable droplet and on the conditions under which the droplet exists. Under irradiation the change of drop becomes more complex: the not monotone and periodical change of size of drop becomes possible. All possible solutions are represented by means of phase portrait. It is found all qualitatively different phase portraits as function of critical parameters: rate generation of clusters and substance density.

Keywords: Irradiation, steam, plasma, cluster formation, liquid droplets, evolution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2039