Search results for: multistage cluster sampling.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 902

722 Using Pattern Search Methods for Minimizing Clustering Problems

Authors: Parvaneh Shabanzadeh, Malik Hj Abu Hassan, Leong Wah June, Maryam Mohagheghtabar

Abstract:

Clustering is an interesting data mining topic that can be applied in many fields. Recently, the problem of cluster analysis has been formulated as a nonsmooth, nonconvex optimization problem, and an algorithm for solving it based on nonsmooth optimization techniques has been developed. This optimization problem has a number of characteristics that make it challenging: it has many local minima, the optimization variables can be either continuous or categorical, and there are no exact analytical derivatives. In this study we show how to apply a particular class of optimization methods known as pattern search methods to address these challenges. These methods do not explicitly use derivatives, an important feature that has not been addressed in previous studies. Results of numerical experiments are presented which demonstrate the effectiveness of the proposed method.

Keywords: Clustering functions, Non-smooth Optimization, Nonconvex Optimization, Pattern Search Method.
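
As a rough illustration of the derivative-free idea described in this abstract, the sketch below runs a compass search (one simple member of the pattern search family) on a sum-of-squares clustering objective. The objective, the function names `cluster_objective` and `compass_search`, the step-halving rule, and the toy data are assumptions made for illustration; the abstract does not specify the authors' actual formulation.

```python
def cluster_objective(centers, points):
    """Mean squared distance from each point to its nearest center.

    The inner min() makes this function nonsmooth and nonconvex in the centers.
    """
    total = 0.0
    for p in points:
        total += min(sum((pi - ci) ** 2 for pi, ci in zip(p, c)) for c in centers)
    return total / len(points)


def compass_search(points, centers, step=1.0, tol=1e-6, max_iter=10_000):
    """Derivative-free compass search over the coordinates of all centers.

    Each coordinate is polled at +step and -step; an improving move is kept.
    If a full poll finds no improvement, the step is halved.
    """
    centers = [list(c) for c in centers]
    best = cluster_objective(centers, points)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for c in centers:
            for d in range(len(c)):
                old = c[d]
                for trial in (old + step, old - step):
                    c[d] = trial
                    value = cluster_objective(centers, points)
                    if value < best:
                        best, improved = value, True
                        break  # keep the improving move
                    c[d] = old  # revert a non-improving move
        if not improved:
            step /= 2.0
    return centers, best


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated groups in the plane.
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(compass_search(data, [(0, 0), (10, 10)]))
```

Only objective values are compared, so no gradient of the nonsmooth objective is ever needed; like any local method, the result depends on the starting centers.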

Downloads: 1585

721 The Effect of Brand Mascots on Consumers' Purchasing Behaviors

Authors: Isari Pairoa, Proud Arunrangsiwed

Abstract:

Brand mascots are cartoon characters designed mainly for advertising or other related marketing purposes. Many brand mascots are extremely popular because they are presented in commercial advertisements and Line Stickers. Brand Line Stickers could lead users to identify with the brand and its mascots, which might influence users to become loyal customers and share an identity with the brand. The objective of the current study is to examine the effect of brand mascots on consumers’ decisions and their intention to purchase the product. This study involved 400 participants, selected using cluster sampling from 50 districts in the Bangkok metropolitan area. The descriptive analysis shows that using a brand mascot leads to a positive consumer attitude toward the products and also heightens the likelihood of purchasing them. The current study suggests a new type of marketing strategy, namely brand fandom. This study also contributes knowledge to the areas of integrated marketing communication and identification theory.

Keywords: Brand mascot, consumers’ behavior, marketing communication, purchasing.

Downloads: 7109

720 Using Spectral Vectors and M-Tree for Graph Clustering and Searching in Graph Databases of Protein Structures

Authors: Do Phuc, Nguyen Thi Kim Phung

Abstract:

In this paper, we represent protein structures as graphs, so that a protein structure database becomes a graph database. Each graph is represented by a spectral vector. We use the Jacobi rotation algorithm to calculate the eigenvalues of the normalized Laplacian representation of the adjacency matrix of the graph. To measure the similarity between two graphs, we calculate the Euclidean distance between their spectral vectors. To cluster the graphs, we use an M-tree with the Euclidean distance to cluster the spectral vectors. In addition, the M-tree can be used for graph searching in the graph database. Our proposed method was tested on a graph database of 100 graphs representing 100 protein structures downloaded from the Protein Data Bank (PDB), and we compare the results with the SCOP hierarchical structure.

Keywords: Eigenvalues, m-tree, graph database, protein structure, spectral graph theory.
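
The spectral-vector pipeline in this abstract can be sketched in a few lines: build the normalized Laplacian, obtain its eigenvalues with cyclic Jacobi rotations, and compare two graphs by the Euclidean distance between the resulting vectors. The function names, the descending sort order, and the zero-padding used for graphs of different sizes are assumptions for illustration, and the M-tree indexing step is not shown.

```python
import math


def normalized_laplacian(adj):
    """L = I - D^(-1/2) A D^(-1/2) for a symmetric adjacency matrix (list of lists)."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                lap[i][j] = 1.0 if deg[i] > 0 else 0.0
            elif adj[i][j] and deg[i] > 0 and deg[j] > 0:
                lap[i][j] = -adj[i][j] / math.sqrt(deg[i] * deg[j])
    return lap


def jacobi_eigenvalues(matrix, tol=1e-20, max_sweeps=100):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations, sorted descending."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[p][q] ** 2 for p in range(n) for q in range(p + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                apq = a[p][q]
                if abs(apq) < 1e-300:
                    continue
                theta = (a[q][q] - a[p][p]) / (2.0 * apq)
                t = math.copysign(1.0, theta) / (abs(theta) + math.sqrt(theta * theta + 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                # Rotate so that the (p, q) entry becomes zero.
                a[p][p] -= t * apq
                a[q][q] += t * apq
                a[p][q] = a[q][p] = 0.0
                for r in range(n):
                    if r != p and r != q:
                        arp, arq = a[r][p], a[r][q]
                        a[r][p] = a[p][r] = c * arp - s * arq
                        a[r][q] = a[q][r] = s * arp + c * arq
    return sorted((a[i][i] for i in range(n)), reverse=True)


def spectral_distance(adj_a, adj_b):
    """Euclidean distance between spectral vectors (shorter vector zero-padded: an assumption)."""
    va = jacobi_eigenvalues(normalized_laplacian(adj_a))
    vb = jacobi_eigenvalues(normalized_laplacian(adj_b))
    size = max(len(va), len(vb))
    va += [0.0] * (size - len(va))
    vb += [0.0] * (size - len(vb))
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(va, vb)))
```

For example, `spectral_distance` applied to a three-node path and a triangle compares the spectra (2, 1, 0) and (1.5, 1.5, 0).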

Downloads: 1600

719 The Relationship between Students' Socio-Economic Backgrounds and Student Residential Satisfaction

Authors: Nurul ‘Ulyani Mohd Najib, Nor’ Aini Yusof, Zulkifli Osman

Abstract:

Residential satisfaction has been vigorously debated in the family house setting. Nonetheless, little attention has been given to student residential satisfaction in the campus housing setting. This study tries to fill the gap by focusing on the relationship between students' socio-economic backgrounds and student residential satisfaction with their on-campus student housing facilities. A two-stage cluster sampling method was employed to select the respondents. Then, self-administered questionnaires were distributed face-to-face to the students. In general, it was confirmed that the students' socio-economic backgrounds significantly influence the students' satisfaction with their on-campus student housing facilities. The main influential factors were the economic status, sense of sharing, and ethnicity of roommates. This study could also provide useful feedback for university administrations seeking to improve their student housing facilities.

Keywords: Malaysia, Socio-economic, Student housing, Student residential satisfaction
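
Since the abstract above relies on two-stage cluster sampling, a minimal sketch of the general technique may help: first draw whole clusters at random, then draw units within each chosen cluster. The function name, the equal-probability draws at both stages, and the hall and student identifiers are hypothetical; the abstract does not describe the study's actual sampling frame.

```python
import random


def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: sample clusters; stage 2: sample units inside each sampled cluster.

    `clusters` maps a cluster name to the list of units it contains.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1
    sample = {}
    for name in chosen:
        units = clusters[name]
        k = min(n_per_cluster, len(units))
        sample[name] = rng.sample(units, k)  # stage 2
    return sample


if __name__ == "__main__":
    # Hypothetical frame: residence halls (clusters) and student IDs (units).
    halls = {
        "Hall A": list(range(100, 140)),
        "Hall B": list(range(200, 260)),
        "Hall C": list(range(300, 330)),
        "Hall D": list(range(400, 450)),
    }
    print(two_stage_cluster_sample(halls, n_clusters=2, n_per_cluster=5, seed=42))
```

A multistage design repeats the same idea with more levels; estimates from such samples generally need design weights, which this sketch does not compute.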

Downloads: 1953

718 Information Measures Based on Sampling Distributions

Authors: Om Parkash, A. K. Thukral, C. P. Gandhi

Abstract:

Information theory and statistics play an important role in the biological sciences when information measures are used to study diversity and equitability. In this communication, we develop the link among the three disciplines and prove that sampling distributions can be used to develop new information measures. Our study is interdisciplinary and will find applications in biological systems.

Keywords: Entropy, concavity, symmetry, arithmetic mean, diversity, equitability.

Downloads: 1341

717 Growing Self Organising Map Based Exploratory Analysis of Text Data

Authors: Sumith Matharage, Damminda Alahakoon

Abstract:

Textual data plays an important role in the modern world. The possibilities of applying data mining techniques to uncover hidden information present in large volumes of text collections are immense. The Growing Self Organizing Map (GSOM) is a highly successful member of the Self Organising Map family and has been used as a clustering and visualisation tool across a wide range of disciplines to discover hidden patterns present in the data. A comprehensive analysis of the GSOM’s capabilities as a text clustering and visualisation tool has so far not been published. These functionalities, namely map visualisation capabilities, automatic cluster identification and hierarchical clustering capabilities, are presented in this paper and are further demonstrated with experiments on a benchmark text corpus.

Keywords: Text Clustering, Growing Self Organizing Map, Automatic Cluster Identification, Hierarchical Clustering.

Downloads: 1900

716 Evaluation of Negative Air Ions in Bioaerosol Removal: Indoor Concentration of Airborne Bacterial and Fungal in Residential Building in Qom City, Iran

Authors: Z. Asadgol, A. Nadali, H. Arfaeinia, M. Khalifeh Gholi, R. Fateh, M. Fahiminia

Abstract:

The present investigation was conducted to detect the types and concentrations of bacterial and fungal bioaerosols in one room (bedroom) of each selected residential building located in different regions of Qom during February 2015 (n=9) to July 2016 (n=11). Moreover, we evaluated the efficiency of negative air ions (NAIs) in reducing bioaerosols in the indoor air of residential buildings. In the first step, the mean concentrations of bacterial and fungal bioaerosols in the nine sampling sites evaluated in winter were 744 and 579 colony forming units (CFU)/m3, while these values were 1628.6 and 231 CFU/m3 in the 11 sampling sites evaluated in summer, respectively. The predominant bacterial genera in all sampling sites were Micrococcus spp. and Staphylococcus spp., and the predominant fungal genera were Aspergillus spp. and Penicillium spp. Bacterial and fungal concentrations exceeded the recommended levels at 95% and 45% of sampling sites, respectively. In the removal step, we achieved reductions ranging from 38% to 93% for bacterial genera and from 25% to 100% for fungal genera by using NAIs. The results suggest that NAIs are a highly effective, simple and efficient technique for reducing bacterial and fungal concentrations in the indoor air of residential buildings.

Keywords: Bacterial, fungal, negative air ions, indoor air, Iran.

Downloads: 902

715 The Effects of Seasonal Variation on the Microbial-N Flow to the Small Intestine and Prediction of Feed Intake in Grazing Karayaka Sheep

Authors: Mustafa Salman, Nurcan Cetinkaya, Zehra Selcuk, Bugra Genc

Abstract:

The objectives of the present study were to estimate the microbial-N flow to the small intestine and to predict the digestible organic matter intake (DOMI) in grazing Karayaka sheep based on urinary excretion of purine derivatives (xanthine, hypoxanthine, uric acid, and allantoin) by the use of spot urine sampling under field conditions. In the trial, 10 Karayaka sheep from 2 to 3 years of age were used. The animals were grazed in a pasture for ten months and fed with concentrate and vetch plus oat hay for the other two months (January and February) indoors. Highly significant linear and cubic relationships (P<0.001) were found among months for purine derivatives index, purine derivatives excretion, purine derivatives absorption, microbial-N and DOMI. Through urine sampling and the determination of levels of excreted urinary PD and Purine Derivatives / Creatinine ratio (PDC index), microbial-N values were estimated and they indicated that the protein nutrition of the sheep was insufficient.

In conclusion, the prediction of protein nutrition of sheep under field conditions may be possible with the use of spot urine sampling, urinary excreted PD and the PDC index. The mean purine derivative levels in spot urine samples from sheep were highest in June, July and October. Protein nutrition of pastured sheep may be affected by weather changes, including rainfall. Spot urine sampling may be useful in modeling the feed consumption of pasturing sheep. However, further studies are required under different field conditions with different breeds of sheep to develop spot urine sampling as a model.

Keywords: Karayaka sheep, spot sampling, urinary purine derivatives, PDC index, microbial-N, feed intake.

Downloads: 2031

714 Energy-Aware Routing in Mobile Wireless Sensor Networks

Authors: R. Geetha, G. Umarani Srikanth, S. Prabhu

Abstract:

Wireless sensor networks are resource-constrained networks in which energy is the major resource. Therefore, energy conservation is a major aspect of the deployment of a wireless sensor network. This work makes use of an extended Greedy Perimeter Stateless Routing (eGPSR) protocol that mainly focuses on energy-efficient data transmission. This data transmission is based on the fact that a message sent to a distant node consumes more energy than a message sent over a short range. Every cluster contains a head set that consists of many virtual cluster heads. Routing is decided by the head set members. The energy level of the received signal is the major constraint in choosing the head set from its members. The experimental results show that the use of eGPSR in routing improves throughput with comparatively less delay.

Keywords: eGPSR, energy efficiency, routing, wireless sensor networks, WSN.

Downloads: 873

713 Clustering Multivariate Empiric Characteristic Functions for Multi-Class SVM Classification

Authors: María-Dolores Cubiles-de-la-Vega, Rafael Pino-Mejías, Esther-Lydia Silva-Ramírez

Abstract:

A dissimilarity measure between the empiric characteristic functions of the subsamples associated to the different classes in a multivariate data set is proposed. This measure can be efficiently computed, and it depends on all the cases of each class. It may be used to find groups of similar classes, which could be joined for further analysis, or it could be employed to perform an agglomerative hierarchical cluster analysis of the set of classes. The final tree can serve to build a family of binary classification models, offering an alternative approach to the multi-class SVM problem. We have compared this dendrogram-based SVM approach with the one-against-one SVM approach over four publicly available data sets, three of them being microarray data. Both performances have been found equivalent, but the first solution requires a smaller number of binary SVM models.

Keywords: Cluster Analysis, Empiric Characteristic Function, Multi-class SVM, R.

Downloads: 1820

712 Constant Factor Approximation Algorithm for p-Median Network Design Problem with Multiple Cable Types

Authors: Chaghoub Soraya, Zhang Xiaoyan

Abstract:

This research presents the first constant approximation algorithm for the p-median network design problem with multiple cable types. This problem was previously addressed with a single cable type, for which there is a bifactor approximation algorithm. To the best of our knowledge, the algorithm proposed in this paper is the first constant approximation algorithm for the p-median network design problem with multiple cable types. The addressed problem is a combination of two well-studied problems, the p-median problem and the network design problem. The introduced algorithm is a constant-factor random sampling approximation algorithm, conceived by using some random sampling techniques from the literature. It is based on a redistribution lemma from the literature and uses a Steiner tree problem as a subproblem. This algorithm is simple, and it relies on the notions of random sampling and probability. The proposed approach gives an approximate solution with one constant ratio without violating any of the constraints, in contrast to the one proposed in the literature. This paper provides a (21 + 2)-approximation algorithm for the p-median network design problem with multiple cable types using random sampling techniques.

Keywords: Approximation algorithms, buy-at-bulk, combinatorial optimization, network design, p-median.

Downloads: 535

711 Neural Network Imputation in Complex Survey Design

Authors: Safaa R. Amer

Abstract:

Missing data yields many analysis challenges. In the case of complex survey design, in addition to dealing with missing data, researchers need to account for the sampling design to achieve useful inferences. Methods for incorporating sampling weights in neural network imputation were investigated to account for complex survey designs. An estimate of variance to account for the imputation uncertainty as well as the sampling design using neural networks will be provided. A simulation study was conducted to compare estimation results based on complete case analysis, multiple imputation using Markov Chain Monte Carlo, and neural network imputation. Furthermore, a public-use dataset was used as an example to illustrate neural network imputation under a complex survey design.

Keywords: Complex survey, estimate, imputation, neural networks, variance.

Downloads: 1937

710 Long-Term Monitoring and Seasonal Analysis of PM10-Bound Benzo(a)pyrene in the Ambient Air of Northwestern Hungary

Authors: Zs. Csanádi, A. Szabó Nagy, J. Szabó, J. Erdős

Abstract:

Atmospheric aerosols have several important environmental impacts and health effects with respect to air quality. Monitoring the PM10-bound polycyclic aromatic hydrocarbons (PAHs) could have important environmental significance and health protection aspects. Benzo(a)pyrene (BaP) is the most relevant indicator of these PAH compounds. In Hungary, the Hungarian Air Quality Network provides air quality monitoring data for several air pollutants including BaP, but these data show only the annual mean concentrations and maximum values. The seasonal variation of BaP concentrations between the heating and non-heating periods could also play an important role. For this reason, the main objective of this study was to assess the annual concentration and seasonal variation of BaP associated with PM10 in the ambient air of Northwestern Hungary at seven different sampling sites (six urban and one rural) in the sampling period of 2008–2013. A total of 1475 PM10 aerosol samples were collected at the different sampling sites and analyzed for BaP by gas chromatography. The BaP concentrations ranged from undetected to 8 ng/m3, with mean values in the range of 0.50-0.96 ng/m3 across all sampling sites. Relatively higher concentrations of BaP were detected in samples collected at each sampling site in the heating seasons compared with the non-heating periods. The annual mean BaP concentrations were comparable with the published data of other Hungarian sites.

Keywords: Air quality, benzo(a)pyrene, PAHs, polycyclic aromatic hydrocarbons.

Downloads: 1350

709 The Role of Knowledge Management in Innovation: Spanish Evidence

Authors: María Jesús Luengo-Valderrey, Mónica Moso-Díez

Abstract:

In the knowledge-based economy, innovation is considered essential in order to achieve survival and growth in organizations. On the other hand, knowledge management is currently understood as one of the keys to the innovation process. Both factors are generally accepted as generators of competitive advantage in organizations. Specifically, R&D&I activities and those that generate internal knowledge have a positive influence on innovation results. This paper examines this effect and aims to quantify whether or not it is similar. We focus on the impact that the proportion of knowledge workers, the R&D&I investment, and the amounts destined for ICTs and training for innovation have on the variation of tangible and intangible returns for the high and medium technology sector in Spain. To do this, we have performed an empirical analysis of the results of questionnaires about innovation in enterprises in Spain, collected by the National Statistics Institute. First, using cluster methodology, the behavior of these enterprises regarding knowledge management is identified. Then, using SEM methodology, we studied, for each cluster, the cause-effect relationships among constructs defined through variables, establishing their type and quantification. The cluster analysis results in four groups, in which clusters 1 and 3 present the best performance in innovation, with differentiating nuances between them, while clusters 2 and 4 obtained divergent results from a similar innovative effort. However, the results of the SEM analysis for each cluster show that, in all cases, knowledge workers are those that affect innovation performance most, regardless of the level of investment, and that there is a strong correlation between knowledge workers and investment in knowledge generation.
The main finding is that Spanish high and medium technology companies improve their innovation performance by investing in internal knowledge generation measures, especially in terms of R&D activities, and underinvest in external ones. This, and the strong correlation between knowledge workers and the set of activities that promote knowledge generation, should be taken into account by company managers when making decisions about their investments in innovation, since they are key to improving their opportunities in the global market.

Keywords: High and medium technology sector, innovation, knowledge management, Spanish companies.

Downloads: 2138

708 An Adaptive Fuzzy Clustering Approach for the Network Management

Authors: Amal Elmzabi, Mostafa Bellafkih, Mohammed Ramdani

Abstract:

Chiu's method, which generates a Takagi-Sugeno Fuzzy Inference System (FIS), is a method of fuzzy rule extraction. The rule output is a linear function of the inputs. In addition, these rules are not explicit for the expert. In this paper, we develop a method which generates a Mamdani FIS, where the rule output is fuzzy. The method proceeds in two steps: first, it uses the subtractive clustering principle to estimate both the number of clusters and the initial locations of the cluster centers. Each obtained cluster corresponds to a Mamdani fuzzy rule. Then, it optimizes the fuzzy model parameters by applying a genetic algorithm. This method is illustrated on a traffic network management application. We also suggest a Mamdani fuzzy rule generation method for the case where the expert wants to classify the output variables into some predefined fuzzy classes.

Keywords: Fuzzy entropy, fuzzy inference systems, genetic algorithms, network management, subtractive clustering.
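
The first step named in this abstract, subtractive clustering, can be sketched as follows: every data point receives a potential based on how many neighbours it has, the highest-potential point becomes a cluster center, and the potential around that center is then subtracted before the next pick. The parameter names `ra`, `squash` and `stop_ratio`, their default values, and the single-threshold stopping rule are simplifying assumptions (the original procedure uses separate accept and reject ratios), and the genetic-algorithm tuning step is not shown.

```python
import math


def subtractive_clustering(points, ra=0.5, squash=1.5, stop_ratio=0.15):
    """Estimate cluster centers from data assumed to be scaled to the unit hypercube.

    ra         -- neighbourhood radius used for the potential
    squash     -- factor giving the (larger) radius used when subtracting potential
    stop_ratio -- stop once the best remaining potential drops below this
                  fraction of the first peak (simplified stopping rule)
    """
    alpha = 4.0 / ra ** 2
    beta = 4.0 / (squash * ra) ** 2

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Potential of each point: a sum of Gaussian contributions from all points.
    potential = [sum(math.exp(-alpha * sqdist(p, q)) for q in points) for p in points]
    first_peak = max(potential)
    centers = []
    while True:
        idx = max(range(len(points)), key=potential.__getitem__)
        peak = potential[idx]
        if peak < stop_ratio * first_peak:
            break
        centers.append(points[idx])
        # Remove the influence of the new center from every point's potential.
        for i, p in enumerate(points):
            potential[i] -= peak * math.exp(-beta * sqdist(p, points[idx]))
    return centers
```

The number of centers returned gives the number of rules, which is why no cluster count has to be supplied up front.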

Downloads: 1839

707 Liver Lesion Extraction with Fuzzy Thresholding in Contrast Enhanced Ultrasound Images

Authors: Abder-Rahman Ali, Adélaïde Albouy-Kissi, Manuel Grand-Brochier, Viviane Ladan-Marcus, Christine Hoeffl, Claude Marcus, Antoine Vacavant, Jean-Yves Boire

Abstract:

In this paper, we present a new segmentation approach for focal liver lesions in contrast enhanced ultrasound imaging. This approach, based on a two-cluster Fuzzy C-Means methodology, considers type-II fuzzy sets to handle uncertainty due to the image modality (presence of speckle noise, low contrast, etc.), and to calculate the optimum inter-cluster threshold. Fine boundaries are detected by a local recursive merging of ambiguous pixels. The method has been tested on a representative database. Compared to both Otsu and type-I Fuzzy C-Means techniques, the proposed method significantly reduces the segmentation errors.

Keywords: Defuzzification, fuzzy clustering, image segmentation, type-II fuzzy sets.

Downloads: 2246

706 An Energy Aware Data Aggregation in Wireless Sensor Network Using Connected Dominant Set

Authors: M. Santhalakshmi, P Suganthi

Abstract:

Wireless Sensor Networks (WSNs) have many advantages. Their deployment is easier and faster than that of wired sensor networks or other wireless networks, as they do not need fixed infrastructure. Nodes are partitioned into many small groups, named clusters, to aggregate data through network organization. WSN clustering guarantees performance achievement of sensor nodes. Sensor node energy consumption is reduced by eliminating redundant energy use and balancing energy use across the sensor nodes of a network. The aim of such clustering protocols is to prolong network life. Low Energy Adaptive Clustering Hierarchy (LEACH) is a popular protocol in WSNs. LEACH is a clustering protocol in which random rotation of local cluster heads is used to distribute the energy load among all sensor nodes in the network. This paper proposes Connected Dominant Set (CDS) based cluster formation. CDS-based data aggregation is a promising approach for reducing routing overhead, since messages are transmitted only within the virtual backbone formed by the CDS, and data aggregation also lowers the ratio of responding hosts to the hosts existing in virtual backbones. CDS tries to increase network lifetime by considering parameters such as sensor lifetime, remaining energy and consumed energy in order to achieve almost optimal data aggregation within networks. Experimental results showed that CDS outperformed LEACH regarding number of cluster formations, average packet loss rate, average end-to-end delay, life computation, and remaining energy computation.

Keywords: Wireless sensor network, connected dominant set, clustering, data aggregation.
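
The random rotation of cluster heads that this abstract attributes to LEACH can be illustrated with the election threshold from the original LEACH description, T = p / (1 - p (r mod 1/p)), applied only to nodes that have not yet served in the current epoch. The function names and the simulation loop are assumptions for illustration; the sketch does not model energy, radio costs, or the CDS-based scheme proposed in the paper.

```python
import random


def leach_threshold(p, round_number):
    """Election threshold for a node that has not yet been cluster head this epoch.

    p is the desired fraction of cluster heads; an epoch lasts round(1 / p) rounds.
    """
    period = round(1 / p)
    return p / (1 - p * (round_number % period))


def simulate_epoch(node_ids, p, seed=0):
    """Run one epoch and return the set of cluster heads elected in each round."""
    rng = random.Random(seed)
    period = round(1 / p)
    eligible = set(node_ids)  # nodes that have not served yet
    heads_per_round = []
    for r in range(period):
        threshold = leach_threshold(p, r)
        heads = {n for n in sorted(eligible) if rng.random() < threshold}
        eligible -= heads  # a node serves at most once per epoch
        heads_per_round.append(heads)
    return heads_per_round


if __name__ == "__main__":
    for r, heads in enumerate(simulate_epoch(range(20), p=0.2)):
        print(r, sorted(heads))
```

Because the threshold rises to 1 in the last round of an epoch, every node ends up serving as cluster head once per epoch, which is how the energy load is spread.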

Downloads: 1090

705 Multi-Agent Systems for Intelligent Clustering

Authors: Jung-Eun Park, Kyung-Whan Oh

Abstract:

Intelligent systems are required in order to quickly and accurately analyze enormous quantities of data in the Internet environment. In intelligent systems, information extracting processes can be divided into supervised learning and unsupervised learning. This paper investigates intelligent clustering by unsupervised learning. Intelligent clustering is a clustering system which determines the clustering model for data analysis and evaluates the results by itself. This system can make a clustering model more rapidly, objectively and accurately than an analyzer. The methodology for the automatic clustering intelligent system is a multi-agent system that comprises a clustering agent and a cluster performance evaluation agent. An agent exchanges information about clusters with another agent, and the system determines the optimal cluster number through this information. Experiments using data sets in the UCI Machine Learning Repository are performed in order to prove the validity of the system.

Keywords: Intelligent Clustering, Multi-Agent System, PCA, SOM, VC (Variance Criterion)

Downloads: 1681

704 Cross-Cultural Socio-Economic Status Attainment between Muslim and Santal Couple in Rural Bangladesh

Authors: Md. Emaj Uddin

Abstract:

This study compared socio-economic status attainment between Muslim and Santal couples in rural Bangladesh. For this we hypothesized that the socio-economic status attainment (occupation, education and income) of the Muslim couples was higher than that of the Santal couples in rural Bangladesh. In order to examine the hypothesis, 288 couples (145 Muslim couples and 143 Santal couples) selected by cluster random sampling from Kalna village, Bangladesh were individually interviewed using a semi-structured questionnaire. The results of the Pearson Chi-Square test suggest that there were significant differences in socio-economic status attainment between the two communities' couples. In addition, Pearson correlation coefficients also suggest that there were significant associations between the socio-economic statuses attained by the two communities' couples in rural Bangladesh. Further cross-cultural study should be conducted on how inter-community relations in the rural social structure of Bangladesh influence the differences among the couples' socio-economic status attainment.

Keywords: Bangladesh, Couple, Cross-Cultural Comparison, Muslim, Socio-Economic Status Attainment, Santal.

Downloads: 2193

703 A Software of Intrusion Detection Mechanism for Virtual Platforms

Authors: Ying-Chuan Chen, Shuen-Tai Wang

Abstract:

Security is an interesting and significant issue for popular virtual platforms, such as virtualization clusters and cloud platforms. Virtualization is a powerful technology for cloud computing services. Virtual machine tools, called hypervisors, offer many benefits: they can quickly deploy all kinds of virtual operating systems on a single platform, control all virtual system resources effectively, reduce the cost of system platform deployment, and provide customization, high elasticity and high reliability. However, some important security problems in virtual platforms need to be addressed and resolved, including viruses, malicious programs, illegal operations and intrusion behavior. In this paper, we present Intrusion Detection Mechanism (IDM) software that not only automatically analyzes all of the system's operations using the accounting journal database, but also monitors the system's state for virtual platforms.

Keywords: security, cluster, cloud, virtualization, virtual machine, virus, intrusion detection

Downloads: 1493

702 Distribution Sampling of Vector Variance without Duplications

Authors: Erna T. Herdiani, Maman A. Djauhari

Abstract:

In recent years, the use of vector variance as a measure of multivariate variability has received much attention in a wide range of statistical work. This paper deals with a more economical measure of multivariate variability, defined as vector variance minus all duplicated elements. For high-dimensional data, this increases the computational efficiency by almost 50% compared to the original vector variance. Its sampling distribution will be investigated to make its applications possible.

Keywords: Asymptotic distribution, covariance matrix, likelihood ratio test, vector variance.
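
To make the "almost 50%" figure above concrete, the sketch below takes vector variance to be the sum of squares of all entries of the covariance matrix (the trace of its square) and reads "without duplications" as counting each off-diagonal covariance once. That reading, and the function names, are assumptions; the abstract does not give the exact definition.

```python
def vector_variance(cov):
    """Sum of squares of all p * p entries of a covariance matrix, i.e. trace(cov @ cov)."""
    return sum(v * v for row in cov for v in row)


def vector_variance_without_duplicates(cov):
    """Sum of squares over the upper triangle (diagonal included).

    Assumed reading of 'without duplications': since cov[i][j] == cov[j][i],
    each off-diagonal covariance is counted once instead of twice.
    """
    p = len(cov)
    return sum(cov[i][j] ** 2 for i in range(p) for j in range(i, p))


def terms_used(p):
    """Number of squared terms in the full and the duplicate-free measure."""
    return p * p, p * (p + 1) // 2


if __name__ == "__main__":
    cov = [[2.0, 1.0], [1.0, 3.0]]  # hypothetical 2 x 2 covariance matrix
    print(vector_variance(cov), vector_variance_without_duplicates(cov))
    print(terms_used(100))
```

The term count falls from p^2 to p(p+1)/2, so the saving approaches one half as the dimension p grows.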

Downloads: 1511

701 Simultaneous Clustering and Feature Selection Method for Gene Expression Data

Authors: T. Chandrasekhar, K. Thangavel, E. N. Sathishkumar

Abstract:

Microarrays have made it possible to simultaneously monitor the expression profiles of thousands of genes under various experimental conditions. They are used to identify the co-expressed genes in specific cells or tissues that are actively used to make proteins. This method is used to analyze gene expression, an important task in bioinformatics research. Cluster analysis of gene expression data has proved to be a useful tool for identifying co-expressed genes and biologically relevant groupings of genes and samples. In this work, the K-Means algorithm has been applied for clustering gene expression data. Further, the rough set based Quick reduct algorithm has been applied to each cluster in order to select the most similar genes having high correlation. Then the ACV measure is used to evaluate the refined clusters, and classification is used to evaluate the proposed method. The proposed method could identify compact clusters, with the feature selection method used to select the genes.

Keywords: Clustering, Feature selection, Gene expression data, Quick reduct.

Downloads: 1924

700 Clustering Protein Sequences with Tailored General Regression Model Technique

Authors: G. Lavanya Devi, Allam Appa Rao, A. Damodaram, GR Sridhar, G. Jaya Suma

Abstract:

Cluster analysis divides data into groups that are meaningful, useful, or both. Analysis of biological data is creating a new generation of epidemiologic, prognostic, diagnostic and treatment modalities. Clustering of protein sequences is one of the current research topics in the field of computer science. Linear relations are valuable in rule discovery for given data, such as "if value X goes up by 1, value Y will go down by 3". Classical linear regression models the linear relation of two sequences perfectly. However, if we need to cluster a large repository of protein sequences into groups where sequences have a strong linear relationship with each other, it is prohibitively expensive to compare sequences one by one. In this paper, we propose a new technique named General Regression Model Technique Clustering Algorithm (GRMTCA) to handle the problem of clustering linear sequences. GRMT gives a measure, GR*, that tells the degree of linearity of multiple sequences without having to compare each pair of them.

Keywords: Clustering, General Regression Model, Protein Sequences, Similarity Measure.

Downloads: 1516

699 Spatio-temporal Variations in Heavy Metal Concentrations in Sediment of Qua Iboe River Estuary, Nigeria

Authors: Justina I. R. Udotong, Ime R. Udotong, Offiong U. Eka

Abstract:

The concentrations of heavy metals in sediments of Qua Iboe River Estuary (QIRE) were monitored at four different sampling locations in wet and dry seasons. A preliminary survey to determine the four sampling stations along the river continuum showed variations in salinity and other physicochemical parameters, with salinity ranging from <0.1‰ at the control station to 21.5‰ at the fourth station. The estuary was found to be polluted with heavy metals from point and nonpoint sources to varying degrees. Mean values of 7.80 mg/kg, 4.97 mg/kg and 2.80 mg/kg of nickel were obtained for sediment samples from Douglas creek, Qua Iboe and Atlantic sampling locations, respectively, in the dry season. The wet season nickel concentrations were, however, lower. The entire study area was grossly contaminated by iron. At Douglas creek, the concentration of iron in sediment was 9274 ± 9.54 mg/kg, while copper, nickel, lead and vanadium were <0.5 mg/kg each as compared to iron. Bioaccumulation was therefore suspected within the study area, as values of 31.00 ± 0.79, 36.00 ± 0.10 and 55.00 ± 0.05 mg/kg of zinc were recorded in sediment at Douglas creek, Atlantic and the control sampling locations. The results from this study showed that these heavy metals came from point sources such as the corrosion of steel pipes from old bridges as well as oily sludge wastes from the Qua Iboe Terminal / tank farm located within the vicinity of the study area.

Keywords: Heavy metal, Qua Iboe River Estuary, seasonal variations, sediment.

Downloads: 2073
698 Weight Functions for Signal Reconstruction Based On Level Crossings

Authors: Nagesha, G. Hemantha Kumar

Abstract:

Although the level crossing concept has been the subject of intensive investigation over the last few years, certain problems of great interest remain unsolved. One of these concerns the distribution of threshold levels. This paper presents new threshold level allocation schemes for level crossing based on nonuniform sampling. Intuitively, it is more reasonable if the information-rich regions of the signal are sampled more finely and those with sparse information are sampled more coarsely. To achieve this objective, we propose non-linear quantization functions which dynamically assign the number of quantization levels depending on the importance of the given amplitude range. Two new approaches to determine the importance of the given amplitude segment are presented. The proposed methods are based on exponential and logarithmic functions. Various aspects of the proposed techniques are discussed and experimentally validated. Their efficacy is investigated by comparison with uniform sampling.
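The idea of level-crossing sampling with nonuniform thresholds can be sketched as follows. A sample is recorded whenever the signal crosses one of the threshold levels, so the placement of the levels decides where the signal is sampled finely. The sketch assumes a densely sampled array stands in for the continuous signal and uses linear interpolation for the crossing time. The geometric spacing of the levels is only a stand-in chosen for this example; the exponential and logarithmic weight functions of the paper are not given in the abstract.

```python
import numpy as np

def level_crossings(t, x, levels):
    """Return sorted (time, level) pairs where x crosses any given level."""
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    events = []
    for i in range(len(x) - 1):
        for level in levels:
            # A crossing occurs when "below the level" changes between samples.
            if (x[i] < level) != (x[i + 1] < level):
                frac = (level - x[i]) / (x[i + 1] - x[i])
                events.append((t[i] + frac * (t[i + 1] - t[i]), level))
    events.sort()
    return events

# Test signal: one period of a sine wave on a dense grid.
t = np.linspace(0.0, 1.0, 1001)
x = np.sin(2 * np.pi * t)

# Uniform thresholds versus hypothetical geometrically spaced thresholds
# that are denser near zero amplitude.
uniform_levels = np.linspace(-0.9, 0.9, 8)
positive = np.geomspace(0.05, 0.9, 4)
nonuniform_levels = np.concatenate([-positive[::-1], positive])

uniform_events = level_crossings(t, x, uniform_levels)
nonuniform_events = level_crossings(t, x, nonuniform_levels)
```

Both level sets have eight thresholds, so any difference in where the events fall comes only from how the thresholds are distributed over the amplitude range.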

Keywords: speech signals, sampling, signal reconstruction, asynchronous delta modulation, non-linear quantization.

Downloads: 1608
697 Mathematical Programming on Multivariate Calibration Estimation in Stratified Sampling

Authors: Dinesh Rao, M.G.M. Khan, Sabiha Khan

Abstract:

Calibration estimation is a method of adjusting the original design weights to improve the survey estimates by using auxiliary information such as the known population total (or mean) of the auxiliary variables. A calibration estimator uses calibrated weights that are determined to minimize a given distance measure to the original design weights while satisfying a set of constraints related to the auxiliary information. In this paper, we propose a new multivariate calibration estimator for the population mean in the stratified sampling design, which incorporates information available for more than one auxiliary variable. The problem of determining the optimum calibrated weights is formulated as a Mathematical Programming Problem (MPP) that is solved using the Lagrange multiplier technique.
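For orientation, the sketch below solves the generic calibration problem for a single sample with the chi-square distance, where the Lagrange multiplier technique gives a closed form. Minimizing the sum of (w_i - d_i)^2 / (d_i q_i) subject to the constraint that the weighted auxiliary totals match the known totals yields w_i = d_i (1 + q_i x_i' λ). This is the standard textbook case, not the stratified multivariate estimator proposed in the paper; the function name, the tuning constants `q`, and the toy numbers are assumptions.

```python
import numpy as np

def calibrate_weights(d, x, totals, q=None):
    """Chi-square-distance calibration of design weights d.

    d      : design weights, shape (n,)
    x      : auxiliary variables, shape (n, p)
    totals : known population totals of the auxiliary variables, shape (p,)
    q      : optional positive tuning constants, shape (n,)
    """
    d = np.asarray(d, dtype=float)
    x = np.asarray(x, dtype=float)
    totals = np.asarray(totals, dtype=float)
    q = np.ones_like(d) if q is None else np.asarray(q, dtype=float)
    # M = sum_i d_i q_i x_i x_i'
    m = (x * (d * q)[:, None]).T @ x
    # Lagrange multipliers from the calibration constraints.
    lam = np.linalg.solve(m, totals - d @ x)
    return d * (1.0 + q * (x @ lam))

# Hypothetical sample of 4 units with an intercept and one auxiliary variable.
d = np.array([10.0, 10.0, 10.0, 10.0])
x = np.array([[1.0, 2.0], [1.0, 3.0], [1.0, 5.0], [1.0, 7.0]])
totals = np.array([42.0, 180.0])
w = calibrate_weights(d, x, totals)
```

The calibrated weights `w` reproduce the known totals exactly while staying as close as possible to `d` in the chosen distance.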

Keywords: Calibration estimation, Stratified sampling, Multivariate auxiliary information, Mathematical programming problem, Lagrange multiplier technique.

Downloads: 1882
696 Atomic Clusters: A Unique Building Motif for Future Smart Nanomaterials

Authors: Debesh R. Roy

Abstract:

Understanding the origin and growth mechanism of nanomaterials from a fundamental unit is a major challenge for scientists. Recently, the prediction of exceptionally stable atomic cluster units as the building units for future smart materials has attracted immense attention from researchers. In the present study, a systematic investigation of the stability and electronic properties of a series of bimetallic (semiconductor-alkaline earth) clusters, viz., BxMg3 (x=1-5), is performed in search of exceptionally and/or unusually stable motifs. A very popular hybrid exchange-correlation functional, B3LYP, along with a higher basis set, viz., 6-31+G(d,p), is employed for this purpose under the density functional formalism. The magic stability among the concerned clusters is explained using the jellium model. It is evident from the present study that the magic stability of the B4Mg3 cluster arises due to the jellium shell closure.
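The shell-closure argument can be checked with a simple electron count. The sketch below assumes each boron atom contributes 3 valence electrons and each magnesium atom contributes 2, and compares the totals against the commonly quoted closed-shell counts of the spherical jellium model. Both the counting rule and the list of closures are assumptions of this example, not values taken from the paper.

```python
# Assumed valence electrons per atom (B: 2s2 2p1, Mg: 3s2).
VALENCE = {"B": 3, "Mg": 2}

# Commonly quoted spherical jellium shell closures (first few only).
JELLIUM_CLOSURES = {2, 8, 18, 20, 34, 40}

def valence_electrons(n_boron, n_magnesium=3):
    """Total valence electron count of a B(x)Mg(3) cluster."""
    return n_boron * VALENCE["B"] + n_magnesium * VALENCE["Mg"]

# Electron counts for x = 1..5 and the x values that hit a closed shell.
counts = {x: valence_electrons(x) for x in range(1, 6)}
magic_x = [x for x, n in counts.items() if n in JELLIUM_CLOSURES]
```

Under these assumptions, B4Mg3 is the only member of the series whose electron count (18) coincides with a closed jellium shell.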

Keywords: Atomic Clusters, Density Functional Theory, Jellium Model, Magic Clusters, Smart Nanomaterials.

Downloads: 2179
695 Fuzzy Clustering of Locations for Degree of Accident Proneness based on Vehicle User Perceptions

Authors: Jayanth Jacob, C. V. Hariharakrishnan, Suganthi L.

Abstract:

The rapid urbanization of cities has brought a bane in the form of road accidents that cause extensive damage to life and limb. A number of location-based factors are enablers of road accidents in the city. The speed of travel of vehicles is non-uniform among locations within a city. In this study, the perception of vehicle users is captured on a 10-point rating scale regarding the degree of variation in speed of travel at chosen locations in the city. The average rating is used to cluster locations using fuzzy c-means clustering and classify them as low, moderate and high speed of travel locations. The high speed of travel locations can be classified proactively to ensure that accidents do not occur due to the speeding of vehicles at such locations. The advantage of fuzzy c-means clustering is that a location may be a part of more than one cluster to a varying degree, and this gives a better picture of the location with respect to the characteristic (speed of travel) being studied.
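A minimal sketch of fuzzy c-means on one-dimensional average ratings is given below. It shows how each location receives a degree of membership in every cluster rather than a single label. The fuzzifier m = 2, the evenly spread initial centers, and the ratings themselves are assumptions made for this example and are not taken from the study.

```python
import numpy as np

def memberships(x, centers, m):
    """Membership of each point in each cluster; every row sums to 1."""
    dist = np.abs(x - centers.T) + 1e-12          # shape (n, c)
    inv = dist ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fuzzy_c_means(values, c=3, m=2.0, n_iter=100, tol=1e-6):
    """Fuzzy c-means for 1-D data; returns memberships and cluster centers."""
    x = np.asarray(values, dtype=float).reshape(-1, 1)
    # Deterministic start: centers spread evenly over the data range.
    centers = np.linspace(x.min(), x.max(), c).reshape(-1, 1)
    for _ in range(n_iter):
        um = memberships(x, centers, m) ** m
        # Each center is the membership-weighted mean of the data.
        new_centers = (um.T @ x) / um.sum(axis=0)[:, None]
        shift = np.max(np.abs(new_centers - centers))
        centers = new_centers
        if shift < tol:
            break
    return memberships(x, centers, m), centers.ravel()

# Hypothetical average ratings (10-point scale) for eight locations.
ratings = [2.0, 2.5, 3.0, 5.0, 5.5, 8.5, 9.0, 9.5]
u, centers = fuzzy_c_means(ratings, c=3)
hard_labels = u.argmax(axis=1)   # strongest cluster per location
```

Each row of `u` holds the degrees to which one location belongs to the three clusters, so a borderline location shows up as split membership instead of being forced into one class.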

Keywords: C-means clustering, Location Specific, Road Accidents.

Downloads: 1788
694 Delay Analysis of Sampled-Data Systems in Hard RTOS

Authors: A. M. Azad, M. Alam, C. M. Hussain

Abstract:

In this paper, we present the effect of varying time delays on performance and stability in a single-channel multirate sampled-data system in a hard real-time (RT-Linux) environment. The sampling task requires a response time that might exceed the capacity of RT-Linux. A straight implementation with RT-Linux is therefore not feasible because of the latency of the system, and hence the sampling period should be smaller to handle this task. The best sampling rate chosen for the sampled-data system is the slowest rate that meets all performance requirements. RT-Linux is consistent with its specifications, and the real-time resolution is taken to be 0.01 seconds to achieve an efficient result. The test results of our laboratory experiment show that the multi-rate control technique in a hard real-time operating system (RTOS) can improve the stability problem caused by random access delays and asynchronization.

Keywords: Multi-rate, PID, RT-Linux, Sampled-data, Servo.

Downloads: 1403
693 An Improved Sub-Nyquist Sampling Jamming Method for Deceiving Inverse Synthetic Aperture Radar

Authors: Yanli Qi, Ning Lv, Jing Li

Abstract:

The sub-Nyquist sampling jamming (SNSJ) method is a well-known deception jamming method for inverse synthetic aperture radar (ISAR). However, decoys produced by the SNSJ method are easy to counter, since the amplitudes of the false-target images are weaker than that of the real-target image, the false-target images always lag behind the real-target image, and all targets are located in the same cross-range. In order to overcome the drawbacks mentioned above, a simple modulation based on SNSJ (M-SNSJ) is presented in this paper. The method first uses an amplitude modulation factor to make the amplitude of the false-target images consistent with the real-target image, then uses a down-range modulation factor and a cross-range modulation factor to make the false-target images move freely in down-range and cross-range, respectively, so that the capacity for deception is improved. Finally, simulation results on the six available combinations of the three modulation factors are given to illustrate our conclusion.

Keywords: Inverse synthetic aperture radar, ISAR, deceptive jamming, Sub-Nyquist sampling jamming method, SNSJ, modulation based on Sub-Nyquist sampling jamming method, M-SNSJ.

Downloads: 1232