Search results for: multiple indicator cluster survey
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 10982

10862 A Minimum Spanning Tree-Based Method for Initializing the K-Means Clustering Algorithm

Authors: J. Yang, Y. Ma, X. Zhang, S. Li, Y. Zhang

Abstract:

The traditional k-means algorithm has been widely used as a simple and efficient clustering method. However, the algorithm often converges to local minima because it is sensitive to the initial cluster centers. In this paper, an algorithm for selecting initial cluster centers on the basis of the minimum spanning tree (MST) is presented. The set of vertices in the MST with the same degree is regarded as a whole and is used to find the skeleton data points. Furthermore, a distance measure between the skeleton data points that takes both degree and Euclidean distance into account is presented. Finally, an MST-based initialization method for the k-means algorithm is presented, and the corresponding time complexity is analyzed as well. The presented algorithm is tested on five data sets from the UCI Machine Learning Repository. The experimental results illustrate the effectiveness of the presented algorithm compared to three existing initialization methods.
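
The abstract does not give the details of the skeleton-point selection, so the following is only a minimal sketch of the two ingredients it names: building an MST over the data points and reading off each vertex's degree. Prim's algorithm, the function names, and the sample points are assumptions made for illustration; this is not the authors' initialization method.

```python
import math

def minimum_spanning_tree(points):
    """Return MST edges (i, j) over the points, using Prim's algorithm
    with Euclidean distance (an illustrative choice, O(n^2))."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known link into the tree
    best_from = [-1] * n         # tree vertex providing that link
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u))
        # Relax the links from the newly added vertex.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges

def vertex_degrees(n, edges):
    """Count how many MST edges touch each vertex."""
    degrees = [0] * n
    for i, j in edges:
        degrees[i] += 1
        degrees[j] += 1
    return degrees

# Hypothetical sample: a small star around the origin plus one far point.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (10.0, 10.0)]
edges = minimum_spanning_tree(points)
degrees = vertex_degrees(len(points), edges)
```

In this sample the origin ends up with the highest degree, which is the kind of structural signal a degree-based selection of candidate centers could build on.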

Keywords: degree, initial cluster center, k-means, minimum spanning tree

Procedia PDF Downloads 411
10861 Verification & Validation of Map Reduce Program Model for Parallel K-Mediod Algorithm on Hadoop Cluster

Authors: Trapti Sharma, Devesh Kumar Srivastava

Abstract:

This paper analyzes a MapReduce implementation and verifies and validates the MapReduce solution model for the parallel K-Mediod algorithm on a Hadoop cluster. MapReduce is a programming model that enables the processing of huge amounts of data in parallel on a large number of devices. It is especially well suited to constant or moderately changing data sets, since the startup cost of a job is usually high. MapReduce has gradually become the framework of choice for “big data”. The MapReduce model allows systematic and fast processing of large-scale data with a cluster of compute nodes. One of the primary concerns in Hadoop is how to minimize the completion time (i.e., makespan) of a set of MapReduce jobs. In this paper, we have verified and validated various MapReduce applications such as wordcount, grep, terasort, and the parallel K-Mediod clustering algorithm. We found that the completion time decreases as the number of nodes increases.
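
To make the programming model concrete, here is a small single-process sketch of the wordcount application mentioned above, split into map, shuffle, and reduce phases. It runs in plain Python and does not use Hadoop; the function names and sample documents are assumptions for illustration only.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)

def word_count(documents):
    """Run the three phases in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())

# Hypothetical input documents.
counts = word_count(["big data on hadoop",
                     "hadoop runs mapreduce",
                     "big clusters"])
```

On a real cluster the map and reduce calls run on different nodes and the shuffle moves data over the network, which is where the makespan discussed in the abstract comes from.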

Keywords: hadoop, mapreduce, k-mediod, validation, verification

Procedia PDF Downloads 369
10860 Analysing Industry Clustering to Develop Competitive Advantage for Wualai Silver Handicraft

Authors: Khanita Tumphasuwan

Abstract:

The Wualai community of Northern Thailand represents important intellectual and social capital and their silver handicraft products are desirable tourist souvenirs within Chiang Mai Province. This community has been in danger of losing this social and intellectual capital due to the application of an improper tool, the Scottish Enterprise model of clustering. This research aims to analyze and increase its competitive advantages for preventing the loss of social and intellectual capital. To improve the Wualai’s competitive advantage, analysis is undertaken using a Porterian cluster approach, including the diamond model, five forces model and cluster mapping. Research results suggest that utilizing the community’s Buddhist beliefs can foster collaboration between community members and is the only way to improve cluster effectiveness, increase competitive advantage, and in turn conserve the Wualai community.

Keywords: industry clustering, silver handicraft, competitive advantage, intellectual capital, social capital

Procedia PDF Downloads 566
10859 Sustainability Rating System for Infrastructure Projects in UAE

Authors: Amrutha Venugopal, Rabee Rustum

Abstract:

Despite huge investments and the vital role infrastructure plays in the economy of the UAE, the country has not yet developed an assessment scheme to measure the sustainability of infrastructure projects and development. The aim of this study was to develop a sustainability rating system for infrastructure projects in the UAE using weighted indicator scoring. A list of 66 indicators was identified by content analysis. The sources for the content analysis were government guidelines, research literature, and sustainability rating systems for infrastructure projects, namely BCA Greenmark for Infrastructure (Singapore), ISCA (Australia), and Envision (USA). These indicators were shortlisted based on their relevance in the UAE. A mixture of qualitative and quantitative research methods was used to find the weighting to be applied to the indicators and to suggest measures for improving infrastructure sustainability in this region. Interviews and surveys were conducted with a broad mix of experts from the industry. The data collected from the interviews were collated to provide suggested measures for improving infrastructure sustainability. The collected survey data were analyzed using statistical analysis techniques to find the indicator weighting. The indicators were shortlisted by 75% to minimize the effort and investment required by the process. The weighting of the deleted indicators was distributed among the critical clusters identified by Pareto analysis. Finally, a simple Microsoft Excel tool was developed as the rating tool using the calculated indicator weightings.
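
The Pareto step and the redistribution of weights can be sketched as follows. The abstract gives neither the threshold nor the redistribution rule, so the 80% cutoff, the proportional redistribution, the cluster names, and the weights below are all assumptions chosen for illustration.

```python
def pareto_critical(weights, threshold=0.8):
    """Return the highest-weight items whose cumulative weight first
    reaches `threshold` of the total (80% is an assumed cutoff)."""
    total = sum(weights.values())
    cumulative = 0.0
    critical = []
    for name, value in sorted(weights.items(),
                              key=lambda kv: kv[1], reverse=True):
        if cumulative >= threshold * total:
            break
        critical.append(name)
        cumulative += value
    return critical

def redistribute(weights, critical):
    """Spread the weight of dropped items over the critical ones,
    proportionally to their existing weight (an assumed rule)."""
    dropped = sum(w for name, w in weights.items() if name not in critical)
    kept_total = sum(weights[name] for name in critical)
    return {name: weights[name] + dropped * weights[name] / kept_total
            for name in critical}

# Hypothetical cluster weights in percent.
weights = {"energy": 40, "water": 25, "materials": 15,
           "ecology": 10, "transport": 6, "noise": 4}
critical = pareto_critical(weights)
final_weights = redistribute(weights, critical)
```

The total weight is preserved by construction, so scores computed with the reduced indicator set stay on the same scale as the full set.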

Keywords: infrastructure, rating system, suggestive measures, sustainability, UAE

Procedia PDF Downloads 305
10858 Hierarchical Clustering Algorithms in Data Mining

Authors: Z. Abdullah, A. R. Hamdan

Abstract:

Clustering is the process of grouping data objects into clusters so that objects in the same cluster are similar to each other. Clustering is one of the areas of data mining, and its algorithms can be classified as partition-based, hierarchical, density-based, and grid-based. In this paper, we survey and review four major hierarchical clustering algorithms: CURE, ROCK, CHAMELEON, and BIRCH. The resulting state-of-the-art overview of these algorithms will help in eliminating current problems, as well as in deriving more robust and scalable algorithms for clustering.
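
For readers unfamiliar with the hierarchical family, the sketch below shows the basic bottom-up (agglomerative) idea with single linkage on one-dimensional data. It is deliberately naive and is not an implementation of CURE, ROCK, CHAMELEON, or BIRCH; the function name and sample values are assumptions for illustration.

```python
def single_linkage(points, k):
    """Merge the two closest clusters repeatedly until k clusters remain.
    Cluster distance is the smallest gap between any two members."""
    if k < 1 or not points:
        raise ValueError("need at least one point and k >= 1")
    clusters = [[p] for p in points]   # start with one cluster per point
    while len(clusters) > k:
        best = None                    # (distance, i, j) of closest pair
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])  # merge j into i
        del clusters[j]
    return clusters

# Hypothetical one-dimensional data with three obvious groups.
clusters = single_linkage([1.0, 1.2, 5.0, 5.1, 9.0], k=3)
```

The pairwise search makes this cubic in the number of points, which is exactly the scalability problem that the surveyed algorithms were designed to address.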

Keywords: clustering, unsupervised learning, algorithms, hierarchical

Procedia PDF Downloads 885
10857 Comparison of Solar Radiation Models

Authors: O. Behar, A. Khellaf, K. Mohammedi, S. Ait Kaci

Abstract:

Up to now, most validation studies have been based on the MBE and RMSE and have therefore focused only on long- and short-term performance to test and classify solar radiation models. This traditional analysis does not take into account the quality of modeling and linearity. In our analysis, we tested 22 solar radiation models that are capable of providing instantaneous direct and global radiation at any given location worldwide. We introduce a new indicator, named the Global Accuracy Indicator (GAI), to examine the linear relationship between the measured and predicted values and the quality of modeling, in addition to long- and short-term performance. The quality of a model is represented by the t-statistic test, the model linearity by the correlation coefficient, and the long- and short-term performance by the MBE and RMSE, respectively. An important finding of this research is that using the GAI avoids the faulty validation that can occur with the traditional methodology, which might result in erroneous predictions of the performance of solar power conversion systems.
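
The four statistics that the abstract says feed into the GAI can be computed as in the sketch below. The abstract does not state how they are combined, so no GAI formula is attempted here. The t-statistic uses the form commonly seen in solar model validation, t = sqrt((n − 1)·MBE² / (RMSE² − MBE²)); that the paper uses this exact form is an assumption, as are the function name and the sample data.

```python
import math

def validation_statistics(measured, predicted):
    """Return MBE, RMSE, Pearson r, and a t-statistic for paired values.
    Assumes both series vary (non-zero variance)."""
    n = len(measured)
    errors = [p - m for m, p in zip(measured, predicted)]
    mbe = sum(errors) / n                            # long-term bias
    rmse = math.sqrt(sum(e * e for e in errors) / n) # short-term scatter
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p)
              for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_m * var_p)               # linearity
    spread = rmse ** 2 - mbe ** 2
    # If every error is identical the spread is zero and t is unbounded.
    t_stat = (math.sqrt((n - 1) * mbe ** 2 / spread)
              if spread > 0 else math.inf)
    return {"mbe": mbe, "rmse": rmse, "r": r, "t": t_stat}

# Hypothetical irradiance values in W/m^2.
stats = validation_statistics([100, 200, 300, 400],
                              [110, 190, 310, 420])
```

A smaller t value indicates better model performance, while MBE and RMSE on their own can disagree about which model is best, which is the motivation for a combined indicator.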

Keywords: solar radiation model, parametric model, performance analysis, Global Accuracy Indicator (GAI)

Procedia PDF Downloads 351
10856 Life Satisfaction of Non-Luxembourgish and Native Luxembourgish Postgraduate Students

Authors: Chrysoula Karathanasi, Senad Karavdic, Angela Odero, Michèle Baumann

Abstract:

It is not only economic determinants that affect living conditions; maintaining a good level of life satisfaction (LS) may also be an important challenge. In Luxembourg, university students receive financial aid from the government and are then registered at the Centre for Documentation and Information on Higher Education (CEDIES). Luxembourg is built on migration, with almost half its population consisting of foreigners. On this basis, our research aims to analyze the associations between mental health factors (health satisfaction, psychological quality of life, worry), perceived financial situation, career attitudes (adaptability, optimism, knowledge, planning), and LS for non-Luxembourgish and native postgraduate students. Between 2012 and 2013, postgraduates registered at CEDIES were contacted by post and asked to participate in an online survey in either English or French. The study population comprised 644 respondents. Our statistical analysis excluded those born abroad who had Luxembourgish citizenship and those born in Luxembourg who did not have citizenship. Two groups were formed, one consisting of 147 non-Luxembourgish students and the other of 284 natives. A single item measured LS (1 = not at all satisfied to 10 = very satisfied). Bivariate tests, correlations, and multiple linear regression models were used, into which only significant relationships (p<0.05) were integrated. No differences in LS indicators were found between the two groups (7.8/10 non-Luxembourgish; 8.0/10 natives), and both were higher than the European indicator of 7.2/10 (for 25-34 years). Non-Luxembourgish students were older than natives (29.3 years vs. 26.3 years), perceived their financial situation as more difficult, and a higher percentage of their parents had an education level higher than a Bachelor's degree (father 59.2% vs. 44.6% for natives; mother 51.4% vs. 33.7% for natives).
In addition, their father’s education was related to their LS: the higher the score, the greater its contribution to LS. For native students, higher scores for health satisfaction and career optimism were associated with a higher LS score. For both groups, LS was linked to mental health-related factors, perception of their financial situation, career optimism, adaptability, and planning. The higher the psychological quality of life score, the greater the postgraduates’ LS. Good health and positive attitudes toward the job market enhanced their LS indicator.

Keywords: career attributes, father's education level, life satisfaction, mental health

Procedia PDF Downloads 371
10855 Optimization of Territorial Spatial Functional Partitioning in Coal Resource-based Cities Based on Ecosystem Service Clusters - The Case of Gujiao City in Shanxi Province

Authors: Gu Sihao

Abstract:

The coordinated development of "ecology-production-life" in cities has received close national attention, and the transformation and sustainable development of resource-based cities has become a hot research topic. Coal resource-based cities, an important part of China's resource-based cities, are numerous and widely distributed. However, owing to the adjustment of the national energy structure and the gradual exhaustion of urban coal resources, their development vitality is gradually declining. Many studies find that the deterioration of the ecological environment, caused by an "emphasis on economy and neglect of ecology", has become the main problem restricting their urban transformation and sustainable development. Since the 18th National Congress of the Communist Party of China (CPC), the Central Government has been deepening territorial space planning and development. On the premise of optimizing the territorial space development pattern, it has completed the demarcation of ecological protection red lines and carried out ecological zoning and ecosystem evaluation, which have become an important basis and scientific guarantee for ecological modernization and ecological civilization construction. Understanding a region's multiple ecosystem services is the precondition for ecosystem management. In studying the relationships among multiple ecosystem services, ecosystem service clusters can identify the interactions between them, and regional ecological function zoning based on the characteristics of the clusters enables better social-ecological system management. Based on this understanding, this study optimizes the spatial function zoning of Gujiao, a coal resource-based city, in order to provide a new theoretical basis for its sustainable development.
Based on a detailed analysis of the characteristics and utilization of Gujiao's territorial space, this study uses SOFM neural networks to identify local ecosystem service clusters, delineates ecological function zones according to the scope and function of the clusters, balances and coordinates the strengths of different ecosystem services, establishes the relationship between clusters and land use, and adjusts the functions of territorial space within each zone. Then, taking the characteristics of coal resource-based cities and of national spatial function zoning as the driving factors of land change, it uses a cellular automata simulation program to simulate the future development trend of the city under different restoration strategies. The study provides relevant theories and technical methods for the "third-line" demarcations of Gujiao's territorial space planning, optimizes territorial space functions, and puts forward targeted strategies for the promotion of regional ecosystem services, providing theoretical support for the improvement of human well-being and the sustainable development of resource-based cities.

Keywords: coal resource-based city, territorial spatial planning, ecosystem service cluster, GMOP model, GeoSOS-FLUS model, functional zoning optimization and upgrading

Procedia PDF Downloads 61
10854 The Effects of Yield and Yield Components of Some Quality Increase Applications on Ismailoglu Grape Type in Turkey

Authors: Yaşar Önal, Aydın Akın

Abstract:

This study was conducted on 15-year-old, own-rooted vines of the Ismailoglu grape type (Vitis vinifera L.) during the 2013 vegetation period in Nevşehir province, Turkey. The research investigated the effects of the following applications on yield and yield components of the Ismailoglu grape type: Control (C), 1/3 cluster tip reduction (1/3 CTR), shoot tip reduction (STR), 1/3 CTR + STR, TKI-HUMAS (TKI-HM) (Soil) (S), TKI-HM (Foliar) (F), TKI-HM (S + F), 1/3 CTR + TKI-HM (S), 1/3 CTR + TKI-HM (F), 1/3 CTR + TKI-HM (S + F), STR + TKI-HM (S), STR + TKI-HM (F), STR + TKI-HM (S + F), 1/3 CTR + STR + TKI-HM (S), 1/3 CTR + STR + TKI-HM (F), and 1/3 CTR + STR + TKI-HM (S + F). The highest fresh grape yield (16.15 kg/vine) was obtained with TKI-HM (S), the highest cluster weight (652.39 g) with 1/3 CTR + STR, the highest 100 berry weight (419.07 g) with 1/3 CTR + STR + TKI-HM (F), the highest maturity index (44.06) with 1/3 CTR, the highest must yield (810.00 ml) with STR + TKI-HM (F), the highest intensity of L* color (42.04) with TKI-HM (S + F), the highest intensity of a* color (2.60) with 1/3 CTR + TKI-HM (S), and the highest intensity of b* color (7.16) with 1/3 CTR + TKI-HM (S). The TKI-HM (S) application can be recommended to increase the fresh grape yield of the Ismailoglu grape type.

Keywords: 1/3 cluster tip reduction, shoot tip reduction, TKI-Humas application, yield and yield components

Procedia PDF Downloads 399
10853 Evaluation of Yield and Yield Components of Malaysian Palm Oil Board-Senegal Oil Palm Germplasm Using Multivariate Tools

Authors: Khin Aye Myint, Mohd Rafii Yusop, Mohd Yusoff Abd Samad, Shairul Izan Ramlee, Mohd Din Amiruddin, Zulkifli Yaakub

Abstract:

The narrow genetic base is the main obstacle to breeding and genetic improvement in the oil palm industry. To broaden the genetic base, the Malaysian Palm Oil Board (MPOB) has extensively collected wild germplasm from its area of origin in 11 African countries: Nigeria, Senegal, Gambia, Guinea, Sierra Leone, Ghana, Cameroon, Zaire, Angola, Madagascar, and Tanzania. The germplasm collections were established and are maintained as a field gene bank at the MPOB Research Station in Kluang, Johor, Malaysia, to conserve a wide range of oil palm genetic resources for the genetic improvement of the Malaysian oil palm industry. Assessing the performance and genetic diversity of the wild materials is therefore very important for understanding the genetic structure of natural oil palm populations and for exploring genetic resources. Principal component analysis (PCA) and cluster analysis are very efficient multivariate tools for evaluating the genetic variation of germplasm and have been applied in many crops. In this study, eight populations of MPOB-Senegal oil palm germplasm were studied to explore the pattern of genetic variation using PCA and cluster analysis. A total of 20 yield and yield component traits were analyzed by PCA and Ward’s clustering using SAS version 9.4 software. The first four principal components, which had eigenvalues >1, accounted for 93% of the total variation, with 44%, 19%, 18%, and 12% for the respective components. PC1 showed the highest positive correlation with fresh fruit bunch (0.315), bunch number (0.321), oil yield (0.317), kernel yield (0.326), total economic product (0.324), and total oil (0.324), while PC2 had the largest positive association with oil to wet mesocarp (0.397) and oil to fruit (0.458). The oil palm populations were grouped into four distinct clusters based on the 20 evaluated traits, which implies that high genetic variation exists among the germplasm.
Cluster 1 contains two populations, SEN 12 and SEN 10, while cluster 2 has only one population, SEN 3. Cluster 3 consists of three populations, SEN 4, SEN 6, and SEN 7, while SEN 2 and SEN 5 were grouped in cluster 4. Cluster 4 showed the highest mean values of fresh fruit bunch, bunch number, oil yield, kernel yield, total economic product, and total oil, and cluster 1 was characterized by high oil to wet mesocarp and oil to fruit. The desired traits that have the largest positive correlation with the extracted PCs could be utilized to improve the oil palm breeding program. The populations from different clusters with the highest cluster means could be used for hybridization. The information from this study can be utilized for effective conservation and selection of the MPOB-Senegal oil palm germplasm for future breeding programs.

Keywords: cluster analysis, genetic variability, germplasm, oil palm, principal component analysis

Procedia PDF Downloads 164
10852 Aggregation of Fractal Aggregates Inside Fractal Cages in Irreversible Diffusion Limited Cluster Aggregation Binary Systems

Authors: Zakiya Shireen, Sujin B. Babu

Abstract:

Irreversible diffusion-limited cluster aggregation (DLCA) of binary sticky spheres was simulated by modifying Brownian Cluster Dynamics (BCD). We randomly distribute N spheres in a 3D box of size L; the volume fraction is given by Φtot = (π/6)N/L³. We identify NA and NB spheres as species A and B, respectively, both having identical size. In these systems, both A and B particles undergo Brownian motion. Irreversible bond formation happens only between particles of the same species, and particles of different species interact only through hard-core repulsion. Simulating with BCD, we observe the formation of binary gels. We observed that species B always percolates (cluster size equal to L), as expected for the monomeric case, whereas species A does not percolate below a critical ratio that differs for different volume fractions. We also show that the accessible volume of the system increases compared with the monomeric case, which means that species A aggregates inside the cage created by B. For moderate Φtot, the system undergoes a transition from the flocculation regime to the percolation regime, indicated by the change in fractal dimension from 1.8 to 2.5. For a smaller ratio of A, species A stays in the flocculation regime even though B has already crossed over to the percolation regime. Thus, we observe two fractal dimensions in the same system.
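
The volume fraction formula takes the sphere diameter as the unit of length, so each sphere has volume π/6. A short sketch of using it in both directions follows; the function names and the example values of N, L, and Φtot are assumptions for illustration.

```python
import math

SPHERE_VOLUME = math.pi / 6   # volume of a sphere with unit diameter

def volume_fraction(n_spheres, box_size):
    """Phi_tot = (pi/6) * N / L^3 for unit-diameter spheres."""
    return SPHERE_VOLUME * n_spheres / box_size ** 3

def spheres_for_fraction(phi, box_size):
    """Nearest whole number of spheres giving volume fraction phi."""
    return round(phi * box_size ** 3 / SPHERE_VOLUME)

# Hypothetical setup: target 5% volume fraction in a box of size 50.
n_total = spheres_for_fraction(0.05, 50)
phi_actual = volume_fraction(n_total, 50)
```

Because N must be an integer, the achieved fraction differs slightly from the target, and the gap shrinks as the box grows.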

Keywords: BCD, fractals, percolation, sticky spheres

Procedia PDF Downloads 280
10851 The Influence of Microsilica on the Cluster Cracks' Geometry of Cement Paste

Authors: Maciej Szeląg

Abstract:

The changing environmental conditions in which cement composites operate cause a number of phenomena in the structure of the material that result in volume deformation of the composite. These strains can cause the composite to crack. Cracks merge by propagation or intersect to form a characteristic structure known as cluster cracks. This characteristic mesh of cracks is relevant to almost all building materials working under service load conditions. Particularly dangerous for a cement matrix is a sudden load of elevated temperature, i.e., thermal shock. The large temperature gradient that arises in a relatively short time between the outer surface and the material’s interior can result in crack formation on the surface and in the volume of the material. In this paper, image analysis tools were used to analyze the geometry of the cluster cracks of cement pastes. Four series of specimens made of two different Portland cements were tested. In addition, two of the series included microsilica as a substitute for 10% of the cement. Within each series, specimens were prepared at three w/b (water/binder) ratios: 0.4, 0.5, and 0.6. The cluster cracks were created by suddenly exposing the samples to an elevated temperature of 250°C. Images of the cracked surfaces were obtained by scanning at 2400 DPI. Digital processing and measurements were performed using ImageJ v. 1.46r software. To describe the structure of the cluster cracks, three stereological parameters were proposed: the average cluster area $\bar{A}$, the average length of the cluster perimeter $\bar{L}$, and the average opening width of a crack between clusters $\bar{I}$. The aim of the study was to identify and evaluate the relationships between the measured stereological parameters and the compressive strength and bulk density of the modified cement pastes. The tests of the mechanical and physical properties were carried out in accordance with EN standards.
The curves describing the relationships were developed using the least squares method, and the quality of the curve fit to the empirical data was evaluated using three diagnostic statistics: the coefficient of determination $R^2$, the standard error of estimation $S_e$, and the coefficient of random variation $W$. The use of image analysis allowed a quantitative description of the cluster cracks’ geometry. The results showed a strong correlation between $\bar{A}$ and $\bar{L}$, reflecting the fractal nature of the cluster crack formation process. The compressive strength and the bulk density of the cement pastes decrease as the values of the stereological parameters increase. The main factors affecting the cluster cracks’ geometry were found to be the cement particle size and the overall binder content in the volume of the material. Microsilica reduced the $\bar{A}$, $\bar{L}$, and $\bar{I}$ values compared with those of the classical cement paste samples, which is attributed to the pozzolanic properties of microsilica.
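
The fitting and diagnostics described above can be illustrated with a straight-line least squares fit. The abstract does not say which curve types were fitted, so the linear model is an assumption, as is the definition of the coefficient of random variation used here (standard error divided by the mean response, in percent). The data are invented for the example.

```python
import math

def fit_line(x, y):
    """Ordinary least squares fit y = slope * x + intercept, with
    R^2, standard error of estimation Se, and W = 100 * Se / mean(y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2
                 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    se = math.sqrt(ss_res / (n - 2))   # two fitted parameters
    w = 100 * se / mean_y              # assumed definition of W, in %
    return {"slope": slope, "intercept": intercept,
            "r2": r2, "se": se, "w": w}

# Hypothetical measurements, not data from the paper.
fit = fit_line([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.0])
```

An $R^2$ close to 1 together with a small $S_e$ indicates that the fitted curve explains most of the variation in the measurements.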

Keywords: cement paste, cluster cracks, elevated temperature, image analysis, microsilica, stereological parameters

Procedia PDF Downloads 246
10850 A Memetic Algorithm Approach to Clustering in Mobile Wireless Sensor Networks

Authors: Masood Ahmad, Ataul Aziz Ikram, Ishtiaq Wahid

Abstract:

A wireless sensor network (WSN) is an interconnection of mobile wireless nodes with limited energy and memory. These networks can be deployed for many critical applications, such as military operations, rescue management, and fire detection. In a flat routing structure, every node plays an equal role as sensor and router. The topology may change very frequently because of the mobile nature of nodes in WSNs, and topology maintenance may produce additional overhead messages. To avoid these overhead messages, this paper proposes an optimized cluster-based mobile wireless sensor network using a memetic algorithm. The nodes in this network are first divided into clusters. The cluster leaders then transmit data to the base station. The network is validated through an extensive simulation study. The results show that the proposed technique outperforms existing techniques.

Keywords: WSN, routing, cluster based, meme, memetic algorithm

Procedia PDF Downloads 481
10849 An Investigation of the Quantitative Correlation between Urban Spatial Morphology Indicators and Block Wind Environment

Authors: Di Wei, Xing Hu, Yangjun Chen, Baofeng Li, Hong Chen

Abstract:

To guide the spatial morphology design of blocks through indicators so as to obtain a good wind environment, it is necessary to find the most suitable type and value range of each urban spatial morphology indicator. At present, most relevant research is based on numerical simulation of ideal block shapes and rarely reports results based on complex actual block types. Therefore, this paper first made theoretical speculations about the main factors influencing the indicators' effectiveness by analyzing the physical significance and formulation principle of each indicator. These were then verified by field wind environment measurements and statistical analysis, indicating that porosity (P₀) can be used as an important indicator to guide the design of the block wind environment in the case of deep street canyons, while frontal area density (λF) can be used as a supplement in the case of shallow street canyons with no height difference. Finally, computational fluid dynamics (CFD) was used to quantify the impact of block height difference and street canyon depth on λF and P₀, finding the suitable type and value range of λF and P₀. This paper provides a feasible wind environment index system for urban designers.

Keywords: urban spatial morphology indicator, urban microclimate, computational fluid dynamics, block ventilation, correlation analysis

Procedia PDF Downloads 137
10848 Assessment of Soil Quality Indicators in Rice Soil of Tamil Nadu

Authors: Kaleeswari R. K., Seevagan L .

Abstract:

Soil quality in an agroecosystem is influenced by the cropping system and by water and soil fertility management. A valid soil quality index would help to assess soil and crop management practices for desired productivity and soil health. Soil quality indices also provide an early indication of soil degradation and of the remedial and rehabilitation measures needed. Imbalanced fertilization and inadequate organic carbon dynamics deteriorate soil quality in an intensive cropping system. The rice soil ecosystem differs from other arable systems because rice is grown under submergence, which requires a different set of key soil attributes for enhancing soil quality and productivity. Assessment of the soil quality index involves indicator selection, indicator scoring, and integration of the scores into one index. The most appropriate indicators for evaluating soil quality can be selected by establishing the minimum data set, which can be screened by linear and multiple regression, factor analysis, and score functions. This investigation was carried out in the intensive rice-cultivating regions (having >1.0 lakh hectares) of Tamil Nadu, viz., Thanjavur, Thiruvarur, Nagapattinam, Villupuram, Thiruvannamalai, Cuddalore, and Ramanathapuram districts. In each district, an intensive rice-growing block was identified. In each block, two sampling grids (10 x 10 sq.km) were used with a sampling depth of 10 – 15 cm. Soil sampling was carried out at various locations in the study area using GIS coordinates. The number of soil sampling points was 41, 28, 28, 32, 37, 29, and 29 in Thanjavur, Thiruvarur, Nagapattinam, Cuddalore, Villupuram, Thiruvannamalai, and Ramanathapuram districts, respectively. Principal Component Analysis is a data reduction tool used to select potential indicators. A Principal Component is a linear combination of variables that represents the maximum variance of the dataset.
Principal Components with eigenvalues equal to or higher than 1.0 were taken for the minimum data set. Principal Component Analysis was used to select the representative soil quality indicators in rice soils based on factor loading values and contribution percent values. Variables having significant differences within the production system were used for the preparation of the minimum data set. Each Principal Component explained a certain amount of variation (%) in the total dataset, and this percentage provided the weight for the variables. The final Principal Component Analysis based soil quality equation is $\mathrm{SQI} = \sum_{i=1}^{n} (W_i \times S_i)$, where $S_i$ is the score for the $i$-th variable, $W_i$ is the weighting factor derived from PCA, and $n$ is the number of indicators in the minimum data set. Higher index scores mean better soil quality. Soil respiration, soil available nitrogen, and potentially mineralizable nitrogen were assessed as soil quality indicators in rice soil of the Cauvery Delta zone covering Thanjavur, Thiruvarur, and Nagapattinam districts. Soil available phosphorus could be used as a soil quality indicator of rice soils in the Cuddalore district. In rain-fed rice ecosystems of coastal sandy soil, DTPA-Zn could be used as an effective soil quality indicator. Among the soil parameters selected by Principal Component Analysis, microbial biomass nitrogen could be used as a quality indicator for rice soils of the Villupuram district. The Cauvery Delta zone has a better SQI than the other intensive rice-growing zones of Tamil Nadu.
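
The weighted-sum form of the index can be sketched as below. Deriving each weight as the component's share of the explained variance is a common convention and is an assumption here, as are the function names, the variance percentages, and the indicator scores, none of which come from the study.

```python
def weights_from_variance(variance_percent):
    """Turn the variance (%) explained by each retained principal
    component into weights that sum to 1 (assumed normalization)."""
    total = sum(variance_percent)
    return [v / total for v in variance_percent]

def soil_quality_index(weights, scores):
    """SQI = sum of W_i * S_i over the indicators in the minimum data set."""
    if len(weights) != len(scores):
        raise ValueError("weights and scores must have the same length")
    return sum(w * s for w, s in zip(weights, scores))

# Hypothetical example: three retained components and three indicator
# scores already scaled to the 0..1 range.
weights = weights_from_variance([40, 25, 15])
sqi = soil_quality_index(weights, [0.8, 0.6, 0.9])
```

With scores scaled to 0..1 and weights summing to 1, the index itself stays within 0..1, which makes districts directly comparable.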

Keywords: soil quality index, soil attributes, soil mapping, and rice soil

Procedia PDF Downloads 86
10847 Cas9-Assisted Direct Cloning and Refactoring of a Silent Biosynthetic Gene Cluster

Authors: Peng Hou

Abstract:

Natural products produced by marine bacteria serve as an immense reservoir of anti-infective drugs and therapeutic agents. Heterologous expression of gene clusters of interest has been widely adopted as an effective strategy for natural product discovery. Briefly, the heterologous expression workflow is: biosynthetic gene cluster identification, pathway construction and expression, and product detection. However, gene cluster capture using the traditional transformation-associated recombination (TAR) protocol is inefficient (0.5% positive colony rate). To make things worse, most of these putative new natural products are only predicted by bioinformatics analysis such as antiSMASH, and their corresponding biosynthetic pathways are either not expressed or expressed at very low levels under laboratory conditions. These setbacks have prompted us to seek new technologies to efficiently edit and refactor biosynthetic gene clusters. Recently, two cutting-edge techniques have attracted our attention: CRISPR-Cas9 and Gibson Assembly. So far, we have pretreated Brevibacillus laterosporus genomic DNA with CRISPR-Cas9 nucleases that specifically generate breaks near the gene cluster of interest. This trial resulted in an increase in the efficiency of gene cluster capture (9%). Moreover, by using Gibson Assembly to add or delete certain operons and tailoring enzymes regardless of end compatibility, the silent construct (~80 kb) was successfully refactored into an active one, yielding the expected series of analogs. With the emergence of these novel molecular tools, we are confident that a mature, high-throughput pipeline for DNA assembly, transformation, product isolation, and identification is no longer a daydream for marine natural product discovery.

Keywords: biosynthesis, CRISPR-Cas9, DNA assembly, refactor, TAR cloning

Procedia PDF Downloads 282
10846 Structuring Highly Iterative Product Development Projects by Using Agile-Indicators

Authors: Guenther Schuh, Michael Riesener, Frederic Diels

Abstract:

Nowadays, manufacturing companies are faced with the challenge of meeting heterogeneous customer requirements in short product life cycles with a variety of product functions. So far, some of the functional requirements remain unknown until late stages of the product development. A way to handle these uncertainties is the highly iterative product development (HIP) approach. By structuring the development project as a highly iterative process, this method provides customer oriented and marketable products. There are first approaches for combined, hybrid models comprising deterministic-normative methods like the Stage-Gate process and empirical-adaptive development methods like SCRUM on a project management level. However, almost unconsidered is the question, which development scopes can preferably be realized with either empirical-adaptive or deterministic-normative approaches. In this context, a development scope constitutes a self-contained section of the overall development objective. Therefore, this paper focuses on a methodology that deals with the uncertainty of requirements within the early development stages and the corresponding selection of the most appropriate development approach. For this purpose, internal influencing factors like a company’s technology ability, the prototype manufacturability and the potential solution space as well as external factors like the market accuracy, relevance and volatility will be analyzed and combined into an Agile-Indicator. The Agile-Indicator is derived in three steps. First of all, it is necessary to rate each internal and external factor in terms of the importance for the overall development task. Secondly, each requirement has to be evaluated for every single internal and external factor appropriate to their suitability for empirical-adaptive development. Finally, the total sums of internal and external side are composed in the Agile-Indicator. 
Thus, the Agile-Indicator constitutes a company-specific and application-related criterion, on which the allocation of empirical-adaptive and deterministic-normative development scopes can be made. In a last step, this indicator will be used for a specific clustering of development scopes by application of the fuzzy c-means (FCM) clustering algorithm. The FCM-method determines sub-clusters within functional clusters based on the empirical-adaptive environmental impact of the Agile-Indicator. By means of the methodology presented in this paper, it is possible to classify requirements, which are uncertainly carried out by the market, into empirical-adaptive or deterministic-normative development scopes.
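
The fuzzy c-means step described above can be illustrated with a minimal sketch. It assumes each development scope has already been reduced to a single Agile-Indicator score between 0 and 1. The scores, the function name `fuzzy_c_means`, and all parameter defaults are hypothetical and are not taken from the paper.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means. X has shape (n_samples, n_features).

    Returns (centers, U), where U[i, k] is the degree of membership
    of sample i in cluster k and every row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    # Start from random memberships, normalised per sample.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m  # fuzzified memberships
        # Cluster centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        # Standard update: u_ik proportional to d_ik^(-2 / (m - 1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Hypothetical Agile-Indicator scores for six development scopes.
scores = np.array([[0.10], [0.15], [0.20], [0.80], [0.85], [0.90]])
centers, U = fuzzy_c_means(scores, n_clusters=2)
hard_labels = U.argmax(axis=1)  # most likely cluster per scope
```

Unlike k-means, each scope keeps a graded membership in both clusters, so borderline scopes can be identified by memberships close to 0.5 rather than being forced into one group.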

Keywords: agile, highly iterative development, agile-indicator, product development

Procedia PDF Downloads 246
10845 External Sulphate Attack: Advanced Testing and Performance Specifications

Authors: G. Massaad, E. Roziere, A. Loukili, L. Izoret

Abstract:

Based on the monitoring of mass, hydrostatic weighing, and the amount of leached OH-, we deduced the nature of the leached and precipitated minerals, the amount of lost aggregates, and the evolution of porosity and cracking during the sulphate attack. Using this information, we are able to trace the volume/mass changes brought about by mineralogical variations and cracking of the cement matrix. We then defined a new performance indicator, the averaged density, capable of summarizing, over the course of the sulphate attack test, the physicochemical variations occurring in the cementitious matrix and then highlight.

Keywords: monitoring strategy, performance indicator, sulphate attack, mechanism of degradation

Procedia PDF Downloads 321
10844 Genetic Diversity and Discovery of Unique SNPs in Five Country Cultivars of Sesamum indicum by Next-Generation Sequencing

Authors: Nam-Kuk Kim, Jin Kim, Soomin Park, Changhee Lee, Mijin Chu, Seong-Hun Lee

Abstract:

In this study, we conducted whole genome re-sequencing of 10 cultivars originating from five countries (Korea, China, India, Pakistan and Ethiopia), with the Sesamum indicum (Zhongzho No. 13) genome as a reference. Almost 80% of the reference genome sequence could be covered by the sequenced reads. Numerous SNPs and InDels were detected by bioinformatic analysis. Among these variants, 266,051 SNPs were identified as unique to individual countries. Pakistan and Ethiopia had high densities of SNPs compared to the other countries. Three main clusters (cluster 1: Korea; cluster 2: Pakistan and India; cluster 3: Ethiopia and China) were recovered by neighbor-joining analysis using all variants. Interestingly, some variants were detected in the DGAT1 (diacylglycerol O-acyltransferase 1) and FADS (fatty acid desaturase) genes, which are known to be related to fatty acid synthesis and metabolism. These results can provide useful information for understanding regional characteristics and developing DNA markers for origin discrimination of sesame.

Keywords: Sesamum indicum, NGS, SNP, DNA marker

Procedia PDF Downloads 327
10843 A Survey of Semantic Integration Approaches in Bioinformatics

Authors: Chaimaa Messaoudi, Rachida Fissoune, Hassan Badir

Abstract:

Technological advances in computer science and data analysis are continuously producing huge volumes of biological data, which are available on the web. Such advances require powerful data integration techniques to extract the knowledge and information pertinent to a specific question. Biomedical exploration of these big data often requires complex queries across multiple autonomous, heterogeneous and distributed data sources. Semantic integration is an active area of research in several disciplines, such as databases, information integration, and ontology. We provide a survey of some approaches and techniques for integrating biological data, focusing on those developed in the ontology community.

Keywords: biological ontology, linked data, semantic data integration, semantic web

Procedia PDF Downloads 449
10842 An AI-Based Dynamical Resource Allocation Calculation Algorithm for Unmanned Aerial Vehicle

Authors: Zhou Luchen, Wu Yubing, Burra Venkata Durga Kumar

Abstract:

As networks become larger and more complex than before, the density of user devices is also increasing. Unmanned Aerial Vehicle (UAV) networks are able to collect and transform data efficiently by using software-defined network (SDN) technology. This paper proposes a three-layer distributed and dynamic cluster architecture that manages UAVs with an AI-based resource allocation calculation algorithm to address the network overloading problem. By separating the services of each UAV, the hierarchical UAV cluster system performs the main function of reducing the network load and transferring user requests, with three sub-tasks: data collection, communication channel organization, and data relaying. In this cluster, a head node and a vice head node UAV are selected by considering the Central Processing Unit (CPU), operational (RAM), and permanent (ROM) memory of the devices, battery charge, and capacity. The vice head node acts as a backup that stores all the data in the head node. The k-means clustering algorithm is used to detect high-load regions and form the layered UAV clusters. The whole process of detecting high-load areas, forming and selecting UAV clusters, and moving the selected UAV cluster to that area is proposed as a traffic offloading algorithm.
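
As a rough illustration of how k-means could flag a high-load region, the sketch below clusters hypothetical 2-D request locations and treats the cluster with the most requests as the high-load area. The coordinates, the function name `kmeans`, and the "most requests" criterion are assumptions made for this example; the abstract does not specify how load is measured.

```python
import numpy as np


def kmeans(points, k, n_iter=50, seed=0):
    """Lloyd's algorithm. Returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Initialise centers with k distinct data points chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        dist = np.linalg.norm(points[:, None] - centers[None], axis=2)
        labels = dist.argmin(axis=1)
        # Move each center to the mean of its points (keep it if empty).
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels


# Hypothetical user-request locations (arbitrary map units).
requests = np.array([
    [0.0, 0.0], [1.0, 0.0],
    [10.0, 10.0], [11.0, 10.0], [10.0, 11.0], [11.0, 11.0],
])
centers, labels = kmeans(requests, k=2)
counts = np.bincount(labels, minlength=2)
busiest = counts.argmax()             # cluster with the most requests
target_position = centers[busiest]    # where a UAV cluster could be sent
```

In this toy setting the selected UAV cluster would be moved toward `target_position`; a real deployment would presumably weight requests by traffic volume rather than simply counting them.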

Keywords: k-means, resource allocation, SDN, UAV network, unmanned aerial vehicles

Procedia PDF Downloads 111
10841 Applying Semi-Automatic Digital Aerial Survey Technology and Canopy Characters Classification for Surface Vegetation Interpretation of Archaeological Sites

Authors: Yung-Chung Chuang

Abstract:

The cultural layers of archaeological sites are mainly affected by surface land use, land cover, and the root systems of surface vegetation. For this reason, continuous monitoring of land use and land cover change is important for the protection and management of archaeological sites. However, in actual operation, on-site investigation and orthogonal photograph interpretation require a lot of time and manpower. It is therefore necessary to find a good alternative for surveying surface vegetation in an automated or semi-automated manner. In this study, we applied semi-automatic digital aerial survey technology and canopy characters classification with very high-resolution aerial photographs for surface vegetation interpretation of archaeological sites. The main idea is that different landscape or forest types can easily be distinguished by canopy characters (e.g., specific texture distribution, shadow effects and gap characters) extracted by semi-automatic image classification. A novel methodology to classify the shape of canopy characters using landscape indices and multivariate statistics was also proposed. Non-hierarchical cluster analysis was used to assess the optimal number of canopy character clusters, and canonical discriminant analysis was used to generate the discriminant functions for canopy character classification (seven categories). The forest type and vegetation land cover can therefore be predicted easily from the corresponding canopy character category. The results showed that the semi-automatic classification could effectively extract the canopy characters of forest and vegetation land cover. For forest type and vegetation type prediction, the average prediction accuracy ranged from 80.3% to 91.7% with different sizes of test frame. This indicates that the technology is useful for archaeological site surveys and can improve classification efficiency and the data update rate.

Keywords: digital aerial survey, canopy characters classification, archaeological sites, multivariate statistics

Procedia PDF Downloads 142
10840 Low Overhead Dynamic Channel Selection with Cluster-Based Spatial-Temporal Station Reporting in Wireless Networks

Authors: Zeyad Abdelmageid, Xianbin Wang

Abstract:

Choosing the operational channel for a WLAN access point (AP) in WLAN networks has been a static channel assignment process initiated by the user during the deployment process of the AP, which fails to cope with the dynamic conditions of the assigned channel at the station side afterward. However, the dramatically growing number of Wi-Fi APs and stations operating in the unlicensed band has led to dynamic, distributed, and often severe interference. This highlights the urgent need for the AP to dynamically select the best overall channel of operation for the basic service set (BSS) by considering the distributed and changing channel conditions at all stations. Consequently, dynamic channel selection algorithms which consider feedback from the station side have been developed. Despite the significant performance improvement, existing channel selection algorithms suffer from very high feedback overhead. Feedback latency from the STAs, due to the high overhead, can cause the eventually selected channel to no longer be optimal for operation due to the dynamic sharing nature of the unlicensed band. This has inspired us to develop our own dynamic channel selection algorithm with reduced overhead through the proposed low-overhead, cluster-based station reporting mechanism. The main idea behind the cluster-based station reporting is the observation that STAs which are very close to each other tend to have very similar channel conditions. Instead of requesting each STA to report on every candidate channel while causing high overhead, the AP divides STAs into clusters then assigns each STA in each cluster one channel to report feedback on. With the proper design of the cluster based reporting, the AP does not lose any information about the channel conditions at the station side while reducing feedback overhead. The simulation results show equal performance and, at times, better performance with a fraction of the overhead. 
We believe that this algorithm has great potential in designing future dynamic channel selection algorithms with low overhead.

Keywords: channel assignment, Wi-Fi networks, clustering, DBSCAN, overhead

Procedia PDF Downloads 119
10839 Adopting the Two-Stage Nested Mixed Analysis of Variance Test to the Eco Indicator 99 to Evaluate Building Technologies under LCA Uncertainties

Authors: Svetlana Pushkar

Abstract:

Eco-indicator 99 (EI99) considers fundamental life cycle assessment (LCA) uncertainties via egalitarian/egalitarian (e/e), hierarchist/hierarchist (h/h), individualist/individualist (i/i), individualist/average (i/a), egalitarian/average (e/a), and hierarchist/average (h/a) methodological options. The objective of this study is to provide a reliable two-stage nested mixed balanced Analysis of Variance (ANOVA) test as a supplemental test to EI99 to address the problematic combination of similarly and not similarly produced materials usually found in building technologies. The robustness of the test was determined from both the “EI99 (all options)” stage (including e/e, i/i, h/h, e/a, i/a, and h/a - all methodological options) and the “EI99 (perspectives)” stage (including e/e, i/i, and h/h methodological options of EI99 - the methodological options with their particular weighting set or e/a, i/a, and h/a methodological options of EI99 - the methodological options with the average weighting set) of evaluating building technologies.

Keywords: building technologies, LCA uncertainty, Eco-indicator 99, two-stage nested mixed ANOVA test

Procedia PDF Downloads 310
10838 Estimation of a Finite Population Mean under Random Non Response Using Improved Nadaraya and Watson Kernel Weights

Authors: Nelson Bii, Christopher Ouma, John Odhiambo

Abstract:

Non-response is a potential source of errors in sample surveys. It introduces bias and large variance in the estimation of finite population parameters. Regression models have been recognized as one of the techniques of reducing bias and variance due to random non-response using auxiliary data. In this study, it is assumed that random non-response occurs in the survey variable in the second stage of cluster sampling, assuming full auxiliary information is available throughout. Auxiliary information is used at the estimation stage via a regression model to address the problem of random non-response. In particular, the auxiliary information is used via an improved Nadaraya-Watson kernel regression technique to compensate for random non-response. The asymptotic bias and mean squared error of the estimator proposed are derived. Besides, a simulation study conducted indicates that the proposed estimator has smaller values of the bias and smaller mean squared error values compared to existing estimators of finite population mean. The proposed estimator is also shown to have tighter confidence interval lengths at a 95% coverage rate. The results obtained in this study are useful, for instance, in choosing efficient estimators of the finite population mean in demographic sample surveys.
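
To make the idea of kernel-based compensation concrete, the sketch below uses the standard Nadaraya-Watson estimator with a Gaussian kernel, not the improved weights proposed in the paper, and it ignores the two-stage cluster sampling design. The function names, the bandwidth, and the data are illustrative assumptions only.

```python
import numpy as np


def nadaraya_watson(x_query, x_obs, y_obs, bandwidth):
    """Standard Nadaraya-Watson regression with a Gaussian kernel.

    Each prediction is a weighted average of the observed y values,
    with weights that decay with the distance |x_query - x_obs|.
    """
    x_query = np.atleast_1d(np.asarray(x_query, dtype=float))
    x_obs = np.asarray(x_obs, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    u = (x_query[:, None] - x_obs[None, :]) / bandwidth
    weights = np.exp(-0.5 * u ** 2)
    # Note: if a query lies very far from every observation, all weights
    # underflow to zero and the result is NaN.
    return (weights @ y_obs) / weights.sum(axis=1)


def mean_with_imputation(x_resp, y_resp, x_nonresp, bandwidth):
    """Estimate a mean by imputing non-respondents from auxiliary x."""
    y_imputed = nadaraya_watson(x_nonresp, x_resp, y_resp, bandwidth)
    return np.concatenate([np.asarray(y_resp, dtype=float), y_imputed]).mean()


# Hypothetical data: auxiliary variable x is known for every unit,
# survey variable y is known for respondents only.
x_resp = [1.0, 2.0, 3.0, 4.0]
y_resp = [2.0, 4.0, 6.0, 8.0]
x_nonresp = [2.5]
estimate = mean_with_imputation(x_resp, y_resp, x_nonresp, bandwidth=1.0)
```

The bandwidth controls the trade-off between bias and variance: a small value follows the nearest respondents closely, while a large value pulls every imputed value toward the overall respondent mean.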

Keywords: mean squared error, random non-response, two-stage cluster sampling, confidence interval lengths

Procedia PDF Downloads 140
10837 Analysis of Formyl Peptide Receptor 1 Protein Value as an Indicator of Neutrophil Chemotaxis Dysfunction in Aggressive Periodontitis

Authors: Prajna Metta, Yanti Rusyanti, Nunung Rusminah, Bremmy Laksono

Abstract:

A decrease in neutrophil chemotaxis function may cause increased susceptibility to aggressive periodontitis (AP). Neutrophil chemotaxis is affected by formyl peptide receptor 1 (FPR1), which, when activated, responds to the bacterial chemotactic peptide formyl methionyl leucyl phenylalanine (FMLP). FPR1 protein value is decreased in response to a wide number of inflammatory stimuli in AP patients. This study aimed to assess the alteration of FPR1 protein value in AP patients and whether FPR1 protein value could be used as an indicator of neutrophil chemotaxis dysfunction in AP. This is a case-control study with 20 AP patients and 20 control subjects. Three milliliters of peripheral blood were drawn and analyzed for FPR1 protein value with ELISA. The data were statistically analyzed with the Mann-Whitney test (p>0.05). Results showed that the mean FPR1 protein value in the AP group is 0.353 pg/mL (0.11 to 1.18 pg/mL) and the mean FPR1 protein value in the control group is 0.296 pg/mL (0.05 to 0.88 pg/mL). A p value of 0.787 > 0.05 suggested that there is no significant difference in FPR1 protein value between the two groups. The present study suggests that FPR1 protein value shows no significant alteration in AP patients and could not be used as an indicator of neutrophil chemotaxis dysfunction.

Keywords: aggressive periodontitis, chemotaxis dysfunction, FPR1 protein value, neutrophil

Procedia PDF Downloads 218
10836 Dataset Quality Index: Development of Composite Indicator Based on Standard Data Quality Indicators

Authors: Sakda Loetpiparwanich, Preecha Vichitthamaros

Abstract:

Nowadays, poor data quality is considered one of the major costs of a data project. A data project with data quality awareness spends much of its time on data quality processes, while a data project without data quality awareness suffers negative impacts on financial resources, efficiency, productivity, and credibility. One of the processes that takes a long time is defining the expectations and measurements of data quality, because the expectations differ according to the purpose of each data project. This is especially true of big data projects, which may involve many datasets and stakeholders and therefore take a long time to discuss and define quality expectations and measurements. Therefore, this study aimed at developing meaningful indicators that describe the overall data quality of each dataset to allow quick comparison and prioritization. The objectives of this study were to: (1) develop practical data quality indicators and measurements, (2) develop data quality dimensions based on statistical characteristics, and (3) develop a Composite Indicator that can describe the overall data quality of each dataset. The sample consisted of more than 500 datasets from public sources obtained by random sampling. After the datasets were collected, the Dataset Quality Index (SDQI) was developed in five steps. First, we defined standard data quality expectations. Second, we identified indicators that can be measured directly from the data within datasets. Third, the indicators were aggregated into dimensions using factor analysis. Next, the indicators and dimensions were weighted by the effort required for the data preparation process and by usability. Finally, the dimensions were aggregated into the Composite Indicator. The results of these analyses showed that: (1) The developed indicators and measurements comprised ten indicators. (2) For the data quality dimensions based on statistical characteristics, the ten indicators could be reduced to 4 dimensions. 
(3) For the developed Composite Indicator, we found that the SDQI can describe the overall quality of each dataset and can separate datasets into 3 levels: Good Quality, Acceptable Quality, and Poor Quality. In conclusion, the SDQI provides an overall description of data quality within datasets together with a meaningful composition. We can use the SDQI to assess all data in a data project, to estimate effort, and to set priorities. The SDQI also works well with Agile methods: it can be used for assessment in the first sprint, and after the initial evaluation is passed, more specific data quality indicators can be added in the next sprint.

Keywords: data quality, dataset quality, data quality management, composite indicator, factor analysis, principal component analysis

Procedia PDF Downloads 139
10835 Understanding the Qualitative Nature of Product Reviews by Integrating Text Processing Algorithm and Usability Feature Extraction

Authors: Cherry Yieng Siang Ling, Joong Hee Lee, Myung Hwan Yun

Abstract:

Usability has become a basic requirement from the consumer’s perspective, and a product that fails this requirement ends up not being used by the customer. Identifying usability issues by analyzing quantitative and qualitative data collected from usability testing and evaluation activities aids the process of product design, yet the lack of studies on analysis methodologies for qualitative text data in the usability field inhibits the potential of these data for more useful applications. Meanwhile, the possibility of analyzing qualitative text data has emerged with the rapid development of data analysis research, such as natural language processing, which enables computers to understand human language, and machine learning, which provides predictive models and clustering tools. Therefore, this research aims to study the applicability of text processing algorithms to the analysis of qualitative text data collected from usability activities. This research used datasets collected from an LG neckband headset usability experiment, consisting of headset survey text data, subject data and product physical data. The analysis procedure, which integrates the text-processing algorithm, includes training comments onto a vector space, labeling them with the subject and product physical feature data, and clustering to validate the result of comment vector clustering. The results show 'volume and music control button' as the usability feature that best matches the clusters of comment vectors: centroid comments of one cluster emphasized button positions, while centroid comments of the other cluster emphasized button interface issues. When the volume and music control buttons are designed separately, the participants experienced less confusion, and thus the comments mentioned only the buttons' positions. 
While in the situation where the volume and music control buttons are designed as a single button, the participants experienced interface issues regarding the buttons such as operating methods of functions and confusion of functions' buttons. The relevance of the cluster centroid comments with the extracted feature explained the capability of text processing algorithms in analyzing qualitative text data from usability testing and evaluations.

Keywords: usability, qualitative data, text-processing algorithm, natural language processing

Procedia PDF Downloads 285
10834 Coping Strategies among Caregivers of Children with Autism Spectrum Disorders: A Cluster Analysis

Authors: Noor Ismael, Lisa Mische Lawson, Lauren Little, Murad Moqbel

Abstract:

Background/Significance: Caregivers of children with Autism Spectrum Disorders (ASD) develop coping mechanisms to overcome daily challenges to successfully parent their child. There is variability in coping strategies used among caregivers of children with ASD. Capturing homogeneity among such variable groups may help elucidate targeted intervention approaches for caregivers of children with ASD. Study Purpose: This study aimed to identify groups of caregivers of children with ASD based on coping mechanisms, and to examine whether there are differences among these groups in terms of strain level. Methods: This study utilized a secondary data analysis, and included survey responses of 273 caregivers of children with ASD. Measures consisted of the COPE Inventory and the Caregiver Strain Questionnaire. Data analyses consisted of cluster analysis to group caregiver coping strategies, and analysis of variance to compare the caregiver coping groups on strain level. Results: Cluster analysis results showed four distinct groups with different combinations of coping strategies: Social-Supported/Planning (group one), Spontaneous/Reactive (group two), Self-Supporting/Reappraisal (group three), and Religious/Expressive (group four). Caregivers in group one (Social-Supported/Planning) demonstrated significantly higher levels than the other three groups in the use of the following coping strategies: planning, use of instrumental social support, and use of emotional social support. Caregivers in group two (Spontaneous/Reactive) used less restraint and less suppression of competing activities as coping strategies relative to the other three groups. Also, group two showed significantly lower levels of religious coping as compared to the other three groups. 
In contrast to group one, caregivers in group three (Self-Supporting/Reappraisal) demonstrated significantly lower levels of the use of instrumental social support and the use of emotional social support relative to the other three groups. Additionally, caregivers in group three showed more acceptance, positive reinterpretation and growth coping strategies. Caregivers in group four (Religious/Expressive) demonstrated significantly higher levels of religious coping relative to the other three groups and utilized more venting of emotions strategies. Analysis of Variance results showed no significant differences between the four groups on the strain scores. Conclusions: There are four distinct groups with different combinations of coping strategies: Social-Supported/Planning, Spontaneous/Reactive, Self-Supporting/Reappraisal, and Religious/Expressive. Each caregiver group engaged in a combination of coping strategies to overcome the strain of caregiving.

Keywords: autism, caregivers, cluster analysis, coping strategies

Procedia PDF Downloads 282
10833 A Spatial Approach to Model Mortality Rates

Authors: Yin-Yee Leong, Jack C. Yue, Hsin-Chung Wang

Abstract:

Human longevity has been experiencing its largest increase since the end of World War II, and modeling mortality rates is therefore the focus of many studies. Among all mortality models, the Lee–Carter model is the most popular approach since it is fairly easy to use and has good accuracy in predicting mortality rates (e.g., for Japan and the USA). However, empirical studies from several countries have shown that the age parameters of the Lee–Carter model are not constant in time. Many modifications of the Lee–Carter model have been proposed to deal with this problem, including adding an extra cohort effect and adding another period effect. In this study, we propose a spatial modification and use clusters to explain why the age parameters of the Lee–Carter model are not constant. In spatial analysis, clusters are areas with unusually high or low mortality rates compared with their neighbors, where the “location” of a mortality rate is measured by age and time, that is, a 2-dimensional coordinate. We use a popular cluster detection method, spatial scan statistics, a local statistical test based on the likelihood ratio test, to evaluate whether there are locations with mortality rates that cannot be described well by the Lee–Carter model. We first use computer simulation to demonstrate that the cluster effect is a possible source of the non-constant age parameters. Next, we show that adding the cluster effect can solve the non-constant problem. We also apply the proposed approach to mortality data from Japan, France, the USA, and Taiwan. The empirical results show that our approach has better-fitting results and smaller mean absolute percentage errors than the Lee–Carter model.
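
For readers unfamiliar with the baseline, the Lee–Carter model writes the log mortality rate at age x and time t as a_x + b_x k_t plus an error term, with the usual constraints that the b_x sum to 1 and the k_t sum to 0. The sketch below fits this baseline by singular value decomposition on synthetic data. It does not include the spatial cluster effect proposed in the paper, and the function name and the numbers are illustrative assumptions.

```python
import numpy as np


def fit_lee_carter(log_rates):
    """Fit log m[x, t] = a[x] + b[x] * k[t] by SVD.

    log_rates has shape (n_ages, n_years).
    Returns (a, b, k) with sum(b) == 1 and sum(k) == 0.
    """
    log_rates = np.asarray(log_rates, dtype=float)
    # a[x] is the average log rate at each age over time.
    a = log_rates.mean(axis=1)
    centered = log_rates - a[:, None]
    # The leading singular triple gives the best rank-1 approximation.
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    scale = U[:, 0].sum()  # assumes this sum is not zero
    b = U[:, 0] / scale    # normalise so that sum(b) == 1
    k = S[0] * Vt[0, :] * scale
    return a, b, k


# Synthetic example that follows the model exactly (no noise).
true_a = np.array([-5.0, -4.0, -3.0])
true_b = np.array([0.2, 0.3, 0.5])
true_k = np.array([2.0, 1.0, 0.0, -1.0, -2.0])
log_m = true_a[:, None] + true_b[:, None] * true_k[None, :]
a, b, k = fit_lee_carter(log_m)
```

If the age pattern of mortality improvement really were constant over time, the residuals `log_m - (a[:, None] + b[:, None] * k[None, :])` would show no structure; the clusters discussed in the abstract correspond to age-time regions where such residuals are systematically high or low.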

Keywords: mortality improvement, Lee–Carter model, spatial statistics, cluster detection

Procedia PDF Downloads 171