Search results for: Constrained clustering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 627

Search results for: Constrained clustering

327 Iterative Methods for An Inverse Problem

Authors: Minghui Wang, Shanrui Hu

Abstract:

An inverse problem of doubly center matrices is discussed. By translating the constrained problem into unconstrained problem, two iterative methods are proposed. A numerical example illustrate our algorithms.

Keywords: doubly center matrix, electric network theory, iterative methods, least-square problem.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1446
326 Clustering Mixed Data Using Non-normal Regression Tree for Process Monitoring

Authors: Youngji Yoo, Cheong-Sool Park, Jun Seok Kim, Young-Hak Lee, Sung-Shick Kim, Jun-Geol Baek

Abstract:

In the semiconductor manufacturing process, large amounts of data are collected from various sensors of multiple facilities. The collected data from sensors have several different characteristics due to variables such as types of products, former processes and recipes. In general, Statistical Quality Control (SQC) methods assume the normality of the data to detect out-of-control states of processes. Although the collected data have different characteristics, using the data as inputs of SQC will increase variations of data, require wide control limits, and decrease performance to detect outof- control. Therefore, it is necessary to separate similar data groups from mixed data for more accurate process control. In the paper, we propose a regression tree using split algorithm based on Pearson distribution to handle non-normal distribution in parametric method. The regression tree finds similar properties of data from different variables. The experiments using real semiconductor manufacturing process data show improved performance in fault detecting ability.

Keywords: Semiconductor, non-normal mixed process data, clustering, Statistical Quality Control (SQC), regression tree, Pearson distribution system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1751
325 Weighted Clustering Coefficient for Identifying Modular Formations in Protein-Protein Interaction Networks

Authors: Zelmina Lubovac, Björn Olsson, Jonas Gamalielsson

Abstract:

This paper describes a novel approach for deriving modules from protein-protein interaction networks, which combines functional information with topological properties of the network. This approach is based on weighted clustering coefficient, which uses weights representing the functional similarities between the proteins. These weights are calculated according to the semantic similarity between the proteins, which is based on their Gene Ontology terms. We recently proposed an algorithm for identification of functional modules, called SWEMODE (Semantic WEights for MODule Elucidation), that identifies dense sub-graphs containing functionally similar proteins. The rational underlying this approach is that each module can be reduced to a set of triangles (protein triplets connected to each other). Here, we propose considering semantic similarity weights of all triangle-forming edges between proteins. We also apply varying semantic similarity thresholds between neighbours of each node that are not neighbours to each other (and hereby do not form a triangle), to derive new potential triangles to include in module-defining procedure. The results show an improvement of pure topological approach, in terms of number of predicted modules that match known complexes.

Keywords: Modules, systems biology, protein interactionnetworks, yeast.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2076
324 Automatic Choice of Topics for Seminars by Clustering Students According to Their Profile

Authors: J.R. Quevedo, E. Montañés, J. Ranilla, A. Bahamonde

Abstract:

The new framework the Higher Education is immersed in involves a complete change in the way lecturers must teach and students must learn. Whereas the lecturer was the main character in traditional education, the essential goal now is to increase the students' participation in the process. Thus, one of the main tasks of lecturers in this new context is to design activities of different nature in order to encourage such participation. Seminars are one of the activities included in this environment. They are active sessions that enable going in depth into specific topics as support of other activities. They are characterized by some features such as favoring interaction between students and lecturers or improving their communication skills. Hence, planning and organizing strategic seminars is indeed a great challenge for lecturers with the aim of acquiring knowledge and abilities. This paper proposes a method using Artificial Intelligence techniques to obtain student profiles from their marks and preferences. The goal of building such profiles is twofold. First, it facilitates the task of splitting the students into different groups, each group with similar preferences and learning difficulties. Second, it makes it easy to select adequate topics to be a candidate for the seminars. The results obtained can be either a guarantee of what the lecturers could observe during the development of the course or a clue to reconsider new methodological strategies in certain topics.

Keywords: artificial intelligence, clustering, organizingseminars, student profile

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1338
323 Determining Optimum Time Multiplier Setting of Overcurrent Relays Using Mixed Integer Linear Programming

Authors: P. N. Korde, P. P. Bedekar

Abstract:

The time coordination of overcurrent relays (OCR) in a power distribution network is of great importance, as it reduces the power outages by avoiding the mal-operation of the backup relays. For this, the optimum value of the time multiplier setting (TMS) of OCRs should be chosen. The problem of determining the optimum value of TMS of OCRs in power distribution networks is formulated as a constrained optimization problem. The objective is to find the optimum value of TMS of OCRs to minimize the time of operation of relays under the constraint of maintaining the coordination of relays. A power distribution network can have a combination of numerical and electromechanical relays. The TMS of numerical relays can be set to any real value (which satisfies the constraints of the problem), whereas the TMS of electromechanical relays can be set in fixed step (0 to 1 in steps of 0.05). The main contribution of this paper is a formulation of the problem as a mixed-integer linear programming (MILP) problem and application of Gomory's cutting plane method to find the optimum value of TMS of OCRs. The TMS of electromechanical relays are taken as integers in the range 1 to 20 in the step of 1, and these values are mapped to 0.05 to 1 in the step of 0.05. The results obtained are compared with those obtained using a simplex method and its variants. It has been shown that the mixed-integer linear programming method outperforms the simplex method (and its variants) in the case of a system having a combination of numerical and electromechanical relays.

Keywords: Backup protection, constrained optimization, Gomory's cutting plane method, mixed-integer linear programming, overcurrent relay coordination, simplex method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 389
322 Illumination Invariant Face Recognition using Supervised and Unsupervised Learning Algorithms

Authors: Shashank N. Mathur, Anil K. Ahlawat, Virendra P. Vishwakarma

Abstract:

In this paper, a comparative study of application of supervised and unsupervised learning algorithms on illumination invariant face recognition has been carried out. The supervised learning has been carried out with the help of using a bi-layered artificial neural network having one input, two hidden and one output layer. The gradient descent with momentum and adaptive learning rate back propagation learning algorithm has been used to implement the supervised learning in a way that both the inputs and corresponding outputs are provided at the time of training the network, thus here is an inherent clustering and optimized learning of weights which provide us with efficient results.. The unsupervised learning has been implemented with the help of a modified Counterpropagation network. The Counterpropagation network involves the process of clustering followed by application of Outstar rule to obtain the recognized face. The face recognition system has been developed for recognizing faces which have varying illumination intensities, where the database images vary in lighting with respect to angle of illumination with horizontal and vertical planes. The supervised and unsupervised learning algorithms have been implemented and have been tested exhaustively, with and without application of histogram equalization to get efficient results.

Keywords: Artificial Neural Networks, back propagation, Counterpropagation networks, face recognition, learning algorithms.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1657
321 Identification of Disease Causing DNA Motifs in Human DNA Using Clustering Approach

Authors: G. Tamilpavai, C. Vishnuppriya

Abstract:

Studying DNA (deoxyribonucleic acid) sequence is useful in biological processes and it is applied in the fields such as diagnostic and forensic research. DNA is the hereditary information in human and almost all other organisms. It is passed to their generations. Earlier stage detection of defective DNA sequence may lead to many developments in the field of Bioinformatics. Nowadays various tedious techniques are used to identify defective DNA. The proposed work is to analyze and identify the cancer-causing DNA motif in a given sequence. Initially the human DNA sequence is separated as k-mers using k-mer separation rule. The separated k-mers are clustered using Self Organizing Map (SOM). Using Levenshtein distance measure, cancer associated DNA motif is identified from the k-mer clusters. Experimental results of this work indicate the presence or absence of cancer causing DNA motif. If the cancer associated DNA motif is found in DNA, it is declared as the cancer disease causing DNA sequence. Otherwise the input human DNA is declared as normal sequence. Finally, elapsed time is calculated for finding the presence of cancer causing DNA motif using clustering formation. It is compared with normal process of finding cancer causing DNA motif. Locating cancer associated motif is easier in cluster formation process than the other one. The proposed work will be an initiative aid for finding genetic disease related research.

Keywords: Bioinformatics, cancer motif, DNA, k-mers, Levenshtein distance, SOM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1324
320 Constraint Based Frequent Pattern Mining Technique for Solving GCS Problem

Authors: First G.M. Karthik, Second Ramachandra.V.Pujeri, Dr.

Abstract:

Generalized Center String (GCS) problem are generalized from Common Approximate Substring problem and Common substring problems. GCS are known to be NP-hard allowing the problems lies in the explosion of potential candidates. Finding longest center string without concerning the sequence that may not contain any motifs is not known in advance in any particular biological gene process. GCS solved by frequent pattern-mining techniques and known to be fixed parameter tractable based on the fixed input sequence length and symbol set size. Efficient method known as Bpriori algorithms can solve GCS with reasonable time/space complexities. Bpriori 2 and Bpriori 3-2 algorithm are been proposed of any length and any positions of all their instances in input sequences. In this paper, we reduced the time/space complexity of Bpriori algorithm by Constrained Based Frequent Pattern mining (CBFP) technique which integrates the idea of Constraint Based Mining and FP-tree mining. CBFP mining technique solves the GCS problem works for all center string of any length, but also for the positions of all their mutated copies of input sequence. CBFP mining technique construct TRIE like with FP tree to represent the mutated copies of center string of any length, along with constraints to restraint growth of the consensus tree. The complexity analysis for Constrained Based FP mining technique and Bpriori algorithm is done based on the worst case and average case approach. Algorithm's correctness compared with the Bpriori algorithm using artificial data is shown.

Keywords: Constraint Based Mining, FP tree, Data mining, GCS problem, CBFP mining technique.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1662
319 Skin Lesion Segmentation Using Color Channel Optimization and Clustering-based Histogram Thresholding

Authors: Rahil Garnavi, Mohammad Aldeen, M. Emre Celebi, Alauddin Bhuiyan, Constantinos Dolianitis, George Varigos

Abstract:

Automatic segmentation of skin lesions is the first step towards the automated analysis of malignant melanoma. Although numerous segmentation methods have been developed, few studies have focused on determining the most effective color space for melanoma application. This paper proposes an automatic segmentation algorithm based on color space analysis and clustering-based histogram thresholding, a process which is able to determine the optimal color channel for detecting the borders in dermoscopy images. The algorithm is tested on a set of 30 high resolution dermoscopy images. A comprehensive evaluation of the results is provided, where borders manually drawn by four dermatologists, are compared to automated borders detected by the proposed algorithm, applying three previously used metrics of accuracy, sensitivity, and specificity and a new metric of similarity. By performing ROC analysis and ranking the metrics, it is demonstrated that the best results are obtained with the X and XoYoR color channels, resulting in an accuracy of approximately 97%. The proposed method is also compared with two state-of-theart skin lesion segmentation methods.

Keywords: Border detection, Color space analysis, Dermoscopy, Histogram thresholding, Melanoma, Segmentation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2216
318 Mining Network Data for Intrusion Detection through Naïve Bayesian with Clustering

Authors: Dewan Md. Farid, Nouria Harbi, Suman Ahmmed, Md. Zahidur Rahman, Chowdhury Mofizur Rahman

Abstract:

Network security attacks are the violation of information security policy that received much attention to the computational intelligence society in the last decades. Data mining has become a very useful technique for detecting network intrusions by extracting useful knowledge from large number of network data or logs. Naïve Bayesian classifier is one of the most popular data mining algorithm for classification, which provides an optimal way to predict the class of an unknown example. It has been tested that one set of probability derived from data is not good enough to have good classification rate. In this paper, we proposed a new learning algorithm for mining network logs to detect network intrusions through naïve Bayesian classifier, which first clusters the network logs into several groups based on similarity of logs, and then calculates the prior and conditional probabilities for each group of logs. For classifying a new log, the algorithm checks in which cluster the log belongs and then use that cluster-s probability set to classify the new log. We tested the performance of our proposed algorithm by employing KDD99 benchmark network intrusion detection dataset, and the experimental results proved that it improves detection rates as well as reduces false positives for different types of network intrusions.

Keywords: Clustering, detection rate, false positive, naïveBayesian classifier, network intrusion detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5511
317 Normalization and Constrained Optimization of Measures of Fuzzy Entropy

Authors: K.C. Deshmukh, P.G. Khot, Nikhil

Abstract:

In the literature of information theory, there is necessity for comparing the different measures of fuzzy entropy and this consequently, gives rise to the need for normalizing measures of fuzzy entropy. In this paper, we have discussed this need and hence developed some normalized measures of fuzzy entropy. It is also desirable to maximize entropy and to minimize directed divergence or distance. Keeping in mind this idea, we have explained the method of optimizing different measures of fuzzy entropy.

Keywords: Fuzzy set, Uncertainty, Fuzzy entropy, Normalization, Membership function

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1429
316 Resistance and Sub-Resistances of RC Beams Subjected to Multiple Failure Modes

Authors: F. Sangiorgio, J. Silfwerbrand, G. Mancini

Abstract:

Geometric and mechanical properties all influence the resistance of RC structures and may, in certain combination of property values, increase the risk of a brittle failure of the whole system. This paper presents a statistical and probabilistic investigation on the resistance of RC beams designed according to Eurocodes 2 and 8, and subjected to multiple failure modes, under both the natural variation of material properties and the uncertainty associated with cross-section and transverse reinforcement geometry. A full probabilistic model based on JCSS Probabilistic Model Code is derived. Different beams are studied through material nonlinear analysis via Monte Carlo simulations. The resistance model is consistent with Eurocode 2. Both a multivariate statistical evaluation and the data clustering analysis of outcomes are then performed. Results show that the ultimate load behaviour of RC beams subjected to flexural and shear failure modes seems to be mainly influenced by the combination of the mechanical properties of both longitudinal reinforcement and stirrups, and the tensile strength of concrete, of which the latter appears to affect the overall response of the system in a nonlinear way. The model uncertainty of the resistance model used in the analysis plays undoubtedly an important role in interpreting results.

Keywords: Modelling, Monte Carlo Simulations, Probabilistic Models, Data Clustering, Reinforced Concrete Members, Structural Design.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2085
315 Production Scheduling Improvements in an Automotive Sector Company

Authors: Govind Sharan Dangayach, Himanshu Bhatt

Abstract:

The paper attempts to overcome the fluctuations occurring in demand of the components in an automotive sector company. Resource and time being the strict constraints, the production is not able to match the pace of the fluctuating demand. So, we introduce some production schedules that help in meeting out the required demand. The merits and demerits of the approaches are also highlighted.

Keywords: Production scheduling, Demand rise, Capacity constrained resource (CCR), Overtime.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1877
314 TheAnalyzer: Clustering-Based System for Improving Business Productivity by Analyzing User Profiles to Enhance Human-Computer Interaction

Authors: D. S. A. Nanayakkara, K. J. P. G. Perera

Abstract:

E-commerce platforms have revolutionized the shopping experience, offering convenient ways for consumers to make purchases. To improve interactions with customers and optimize marketing strategies, it is essential for businesses to understand user behavior, preferences, and needs on these platforms. This paper focuses on recommending businesses to customize interactions with users based on their behavioral patterns, leveraging data-driven analysis and machine learning techniques. Businesses can improve engagement and boost the adoption of e-commerce platforms by aligning behavioral patterns with user goals of usability and satisfaction. We propose TheAnalyzer, a clustering-based system designed to enhance business productivity by analyzing user-profiles and improving human-computer interaction. TheAnalyzer seamlessly integrates with business applications, collecting relevant data points based on users' natural interactions without additional burdens such as questionnaires or surveys. It defines five key user analytics as features for its dataset, which are easily captured through users' interactions with e-commerce platforms. This research presents a study demonstrating the successful distinction of users into specific groups based on the five key analytics considered by TheAnalyzer. With the assistance of domain experts, customized business rules can be attached to each group, enabling TheAnalyzer to influence business applications and provide an enhanced personalized user experience. The outcomes are evaluated quantitatively and qualitatively, demonstrating that utilizing TheAnalyzer’s capabilities can optimize business outcomes, enhance customer satisfaction, and drive sustainable growth. The findings of this research contribute to the advancement of personalized interactions in e-commerce platforms. By leveraging user behavioral patterns and analyzing both new and existing users, businesses can effectively tailor their interactions to improve customer satisfaction, loyalty and ultimately drive sales.

Keywords: Data clustering, data standardization, dimensionality reduction, human-computer interaction, user profiling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 172
313 Novel Hybrid Method for Gene Selection and Cancer Prediction

Authors: Liping Jing, Michael K. Ng, Tieyong Zeng

Abstract:

Microarray data profiles gene expression on a whole genome scale, therefore, it provides a good way to study associations between gene expression and occurrence or progression of cancer. More and more researchers realized that microarray data is helpful to predict cancer sample. However, the high dimension of gene expressions is much larger than the sample size, which makes this task very difficult. Therefore, how to identify the significant genes causing cancer becomes emergency and also a hot and hard research topic. Many feature selection algorithms have been proposed in the past focusing on improving cancer predictive accuracy at the expense of ignoring the correlations between the features. In this work, a novel framework (named by SGS) is presented for stable gene selection and efficient cancer prediction . The proposed framework first performs clustering algorithm to find the gene groups where genes in each group have higher correlation coefficient, and then selects the significant genes in each group with Bayesian Lasso and important gene groups with group Lasso, and finally builds prediction model based on the shrinkage gene space with efficient classification algorithm (such as, SVM, 1NN, Regression and etc.). Experiment results on real world data show that the proposed framework often outperforms the existing feature selection and prediction methods, say SAM, IG and Lasso-type prediction model.

Keywords: Gene Selection, Cancer Prediction, Lasso, Clustering, Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2015
312 Visualization and Indexing of Spectral Databases

Authors: Tibor Kulcsar, Gabor Sarossy, Gabor Bereznai, Robert Auer, Janos Abonyi

Abstract:

On-line (near infrared) spectroscopy is widely used to support the operation of complex process systems. Information extracted from spectral database can be used to estimate unmeasured product properties and monitor the operation of the process. These techniques are based on looking for similar spectra by nearest neighborhood algorithms and distance based searching methods. Search for nearest neighbors in the spectral space is an NP-hard problem, the computational complexity increases by the number of points in the discrete spectrum and the number of samples in the database. To reduce the calculation time some kind of indexing could be used. The main idea presented in this paper is to combine indexing and visualization techniques to reduce the computational requirement of estimation algorithms by providing a two dimensional indexing that can also be used to visualize the structure of the spectral database. This 2D visualization of spectral database does not only support application of distance and similarity based techniques but enables the utilization of advanced clustering and prediction algorithms based on the Delaunay tessellation of the mapped spectral space. This means the prediction has not to use the high dimension space but can be based on the mapped space too. The results illustrate that the proposed method is able to segment (cluster) spectral databases and detect outliers that are not suitable for instance based learning algorithms.

Keywords: indexing high dimensional databases, dimensional reduction, clustering, similarity, k-nn algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1736
311 Multi-Objective Evolutionary Computation Based Feature Selection Applied to Behaviour Assessment of Children

Authors: F. Jiménez, R. Jódar, M. Martín, G. Sánchez, G. Sciavicco

Abstract:

Abstract—Attribute or feature selection is one of the basic strategies to improve the performances of data classification tasks, and, at the same time, to reduce the complexity of classifiers, and it is a particularly fundamental one when the number of attributes is relatively high. Its application to unsupervised classification is restricted to a limited number of experiments in the literature. Evolutionary computation has already proven itself to be a very effective choice to consistently reduce the number of attributes towards a better classification rate and a simpler semantic interpretation of the inferred classifiers. We present a feature selection wrapper model composed by a multi-objective evolutionary algorithm, the clustering method Expectation-Maximization (EM), and the classifier C4.5 for the unsupervised classification of data extracted from a psychological test named BASC-II (Behavior Assessment System for Children - II ed.) with two objectives: Maximizing the likelihood of the clustering model and maximizing the accuracy of the obtained classifier. We present a methodology to integrate feature selection for unsupervised classification, model evaluation, decision making (to choose the most satisfactory model according to a a posteriori process in a multi-objective context), and testing. We compare the performance of the classifier obtained by the multi-objective evolutionary algorithms ENORA and NSGA-II, and the best solution is then validated by the psychologists that collected the data.

Keywords: Feature selection, multi-objective evolutionary computation, unsupervised classification, behavior assessment system for children.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1413
310 Conjugate Gradient Algorithm for the Symmetric Arrowhead Solution of Matrix Equation AXB=C

Authors: Minghui Wang, Luping Xu, Juntao Zhang

Abstract:

Based on the conjugate gradient (CG) algorithm, the constrained matrix equation AXB=C and the associate optimal approximation problem are considered for the symmetric arrowhead matrix solutions in the premise of consistency. The convergence results of the method are presented. At last, a numerical example is given to illustrate the efficiency of this method.

Keywords: Iterative method, symmetric arrowhead matrix, conjugate gradient algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1383
309 Feature Based Unsupervised Intrusion Detection

Authors: Deeman Yousif Mahmood, Mohammed Abdullah Hussein

Abstract:

The goal of a network-based intrusion detection system is to classify activities of network traffics into two major categories: normal and attack (intrusive) activities. Nowadays, data mining and machine learning plays an important role in many sciences; including intrusion detection system (IDS) using both supervised and unsupervised techniques. However, one of the essential steps of data mining is feature selection that helps in improving the efficiency, performance and prediction rate of proposed approach. This paper applies unsupervised K-means clustering algorithm with information gain (IG) for feature selection and reduction to build a network intrusion detection system. For our experimental analysis, we have used the new NSL-KDD dataset, which is a modified dataset for KDDCup 1999 intrusion detection benchmark dataset. With a split of 60.0% for the training set and the remainder for the testing set, a 2 class classifications have been implemented (Normal, Attack). Weka framework which is a java based open source software consists of a collection of machine learning algorithms for data mining tasks has been used in the testing process. The experimental results show that the proposed approach is very accurate with low false positive rate and high true positive rate and it takes less learning time in comparison with using the full features of the dataset with the same algorithm.

Keywords: Information Gain (IG), Intrusion Detection System (IDS), K-means Clustering, Weka.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2738
308 Web Proxy Detection via Bipartite Graphs and One-Mode Projections

Authors: Zhipeng Chen, Peng Zhang, Qingyun Liu, Li Guo

Abstract:

With the Internet becoming the dominant channel for business and life, many IPs are increasingly masked using web proxies for illegal purposes such as propagating malware, impersonate phishing pages to steal sensitive data or redirect victims to other malicious targets. Moreover, as Internet traffic continues to grow in size and complexity, it has become an increasingly challenging task to detect the proxy service due to their dynamic update and high anonymity. In this paper, we present an approach based on behavioral graph analysis to study the behavior similarity of web proxy users. Specifically, we use bipartite graphs to model host communications from network traffic and build one-mode projections of bipartite graphs for discovering social-behavior similarity of web proxy users. Based on the similarity matrices of end-users from the derived one-mode projection graphs, we apply a simple yet effective spectral clustering algorithm to discover the inherent web proxy users behavior clusters. The web proxy URL may vary from time to time. Still, the inherent interest would not. So, based on the intuition, by dint of our private tools implemented by WebDriver, we examine whether the top URLs visited by the web proxy users are web proxies. Our experiment results based on real datasets show that the behavior clusters not only reduce the number of URLs analysis but also provide an effective way to detect the web proxies, especially for the unknown web proxies.

Keywords: Bipartite graph, clustering, one-mode projection, web proxy detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 713
307 Statistics of Exon Lengths in Animals, Plants, Fungi, and Protists

Authors: Alexander Kaplunovsky, Vladimir Khailenko, Alexander Bolshoy, Shara Atambayeva, AnatoliyIvashchenko

Abstract:

Eukaryotic protein-coding genes are interrupted by spliceosomal introns, which are removed from the RNA transcripts before translation into a protein. The exon-intron structures of different eukaryotic species are quite different from each other, and the evolution of such structures raises many questions. We try to address some of these questions using statistical analysis of whole genomes. We go through all the protein-coding genes in a genome and study correlations between the net length of all the exons in a gene, the number of the exons, and the average length of an exon. We also take average values of these features for each chromosome and study correlations between those averages on the chromosomal level. Our data show universal features of exon-intron structures common to animals, plants, and protists (specifically, Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster, Cryptococcus neoformans, Homo sapiens, Mus musculus, Oryza sativa, and Plasmodium falciparum). We have verified linear correlation between the number of exons in a gene and the length of a protein coded by the gene, while the protein length increases in proportion to the number of exons. On the other hand, the average length of an exon always decreases with the number of exons. Finally, chromosome clustering based on average chromosome properties and parameters of linear regression between the number of exons in a gene and the net length of those exons demonstrates that these average chromosome properties are genome-specific features.

Keywords: Comparative genomics, exon-intron structure, eukaryotic clustering, linear regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2540
306 The Survey Research and Evaluation of Green Residential Building Based on the Improved Group Analytical Hierarchy Process Method in Yinchuan

Authors: Yun-na Wu, Zhen Wang

Abstract:

Due to the economic downturn and the deterioration of the living environment, the development of residential buildings as high energy consuming building is gradually changing from “extensive” to green building in China. So, the evaluation system of green building is continuously improved, but the current evaluation work has the following problems: (1) There are differences in the cost of the actual investment and the purchasing power of residents, also construction target of green residential building is single and lacks multi-objective performance development. (2) Green building evaluation lacks regional characteristics and cannot reflect the different regional residents demand. (3) In the process of determining the criteria weight, the experts’ judgment matrix is difficult to meet the requirement of consistency. Therefore, to solve those problems, questionnaires which are about the green residential building for Ningxia area are distributed, and the results of questionnaires can feedback the purchasing power of residents and the acceptance of the green building cost. Secondly, combined with the geographical features of Ningxia minority areas, the evaluation criteria system of green residential building is constructed. Finally, using the improved group AHP method and the grey clustering method, the criteria weight is determined, and a real case is evaluated, which is located in Xing Qing district, Ningxia. A conclusion can be obtained that the professional evaluation for this project and good social recognition is basically the same.

Keywords: Evaluation, green residential building, grey clustering method, group AHP.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 804
305 Comparative Study of Tensile Properties of Cast and Hot Forged Alumina Nanoparticle Reinforced Composites

Authors: S. Ghanaraja, Subrata Ray, S. K. Nath

Abstract:

Particle reinforced Metal Matrix Composite (MMC) succeeds in synergizing the metallic matrix with ceramic particle reinforcements to result in improved strength, particularly at elevated temperatures, but adversely it affects the ductility of the matrix because of agglomeration and porosity. The present study investigates the outcome of tensile properties in a cast and hot forged composite reinforced simultaneously with coarse and fine particles. Nano-sized alumina particles have been generated by milling mixture of aluminum and manganese dioxide powders. Milled particles after drying are added to molten metal and the resulting slurry is cast. The microstructure of the composites shows good distribution of both the size categories of particles without significant clustering. The presence of nanoparticles along with coarser particles in a composite improves both strength and ductility considerably. Delay in debonding of coarser particles to higher stress is due to reduced mismatch in extension caused by increased strain hardening in presence of the nanoparticles. However, higher addition of powder mix beyond a limit results in deterioration of mechanical properties, possibly due to clustering of nanoparticles. The porosity in cast composite generally increases with the increasing addition of powder mix as observed during process and on forging it has got reduced. The base alloy and nanocomposites show improvement in flow stress which could be attributed to lowering of porosity and grain refinement as a consequence of forging.

Keywords: Aluminum, alumina, nanoparticle reinforced composites, porosity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1449
304 Development of a Value Evaluation Model of Highway Box-Girder Bridge

Authors: Hao Hsi Tseng

Abstract:

Taiwan’s infrastructure is gradually deteriorating, while resources for maintenance and replacement are increasingly limited, raising the urgent need for methods for maintaining existing infrastructure within constrained budgets. Infrastructure value evaluation is used to enhance the efficiency of infrastructure maintenance work, allowing administrators to quickly assess the maintenance needs and performance by observing variation in infrastructure value. This research establishes a value evaluation model for Taiwan’s highway box girder bridges. The operating mechanism and process of the model are illustrated in a practical case.

Keywords: Box girder bridge, deterioration, infrastructure, maintenance, value evaluation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1210
303 Manifold Analysis by Topologically Constrained Isometric Embedding

Authors: Guy Rosman, Alexander M. Bronstein, Michael M. Bronstein, Ron Kimmel

Abstract:

We present a new algorithm for nonlinear dimensionality reduction that consistently uses global information, and that enables understanding the intrinsic geometry of non-convex manifolds. Compared to methods that consider only local information, our method appears to be more robust to noise. Unlike most methods that incorporate global information, the proposed approach automatically handles non-convexity of the data manifold. We demonstrate the performance of our algorithm and compare it to state-of-the-art methods on synthetic as well as real data.

Keywords: Dimensionality reduction, manifold learning, multidimensional scaling, geodesic distance, boundary detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1420
302 PoPCoRN: A Power-Aware Periodic Surveillance Scheme in Convex Region using Wireless Mobile Sensor Networks

Authors: A. K. Prajapati

Abstract:

In this paper, the periodic surveillance scheme has been proposed for any convex region using mobile wireless sensor nodes. A sensor network typically consists of fixed number of sensor nodes which report the measurements of sensed data such as temperature, pressure, humidity, etc., of its immediate proximity (the area within its sensing range). For the purpose of sensing an area of interest, there are adequate number of fixed sensor nodes required to cover the entire region of interest. It implies that the number of fixed sensor nodes required to cover a given area will depend on the sensing range of the sensor as well as deployment strategies employed. It is assumed that the sensors to be mobile within the region of surveillance, can be mounted on moving bodies like robots or vehicle. Therefore, in our scheme, the surveillance time period determines the number of sensor nodes required to be deployed in the region of interest. The proposed scheme comprises of three algorithms namely: Hexagonalization, Clustering, and Scheduling, The first algorithm partitions the coverage area into fixed sized hexagons that approximate the sensing range (cell) of individual sensor node. The clustering algorithm groups the cells into clusters, each of which will be covered by a single sensor node. The later determines a schedule for each sensor to serve its respective cluster. Each sensor node traverses all the cells belonging to the cluster assigned to it by oscillating between the first and the last cell for the duration of its life time. Simulation results show that our scheme provides full coverage within a given period of time using few sensors with minimum movement, less power consumption, and relatively less infrastructure cost.

Keywords: Sensor Network, Graph Theory, MSN, Communication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1433
301 Accelerating GLA with an M-Tree

Authors: Olli Luoma, Johannes Tuikkala, Olli Nevalainen

Abstract:

In this paper, we propose a novel improvement for the generalized Lloyd Algorithm (GLA). Our algorithm makes use of an M-tree index built on the codebook which makes it possible to reduce the number of distance computations when the nearest code words are searched. Our method does not impose the use of any specific distance function, but works with any metric distance, making it more general than many other fast GLA variants. Finally, we present the positive results of our performance experiments.

Keywords: Clustering, GLA, M-Tree, Vector Quantization .

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1482
300 A Valley Detection for Path Planning

Authors: In-Geun Lim, Jin-Soo Kim, Chirl-Hwa Lee

Abstract:

This paper presents a constrained valley detection algorithm. The intent is to find valleys in the map for the path planning that enables a robot or a vehicle to move safely. The constraint to the valley is a desired width and a desired depth to ensure the space for movement when a vehicle passes through the valley. We propose an algorithm to find valleys satisfying these 2 dimensional constraints. The merit of our algorithm is that the pre-processing and the post-processing are not necessary to eliminate undesired small valleys. The algorithm is validated through simulation using digitized elevation data.

Keywords: valley, width, depth, path planning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1415
299 New Data Reuse Adaptive Filters with Noise Constraint

Authors: Young-Seok Choi

Abstract:

We present a new framework of the data-reusing (DR) adaptive algorithms by incorporating a constraint on noise, referred to as a noise constraint. The motivation behind this work is that the use of the statistical knowledge of the channel noise can contribute toward improving the convergence performance of an adaptive filter in identifying a noisy linear finite impulse response (FIR) channel. By incorporating the noise constraint into the cost function of the DR adaptive algorithms, the noise constrained DR (NC-DR) adaptive algorithms are derived. Experimental results clearly indicate their superior performance over the conventional DR ones.

Keywords: Adaptive filter, data-reusing, least-mean square (LMS), affine projection (AP), noise constraint.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1594
298 Cluster Analysis of Retailers’ Benefits from Their Cooperation with Manufacturers: Business Models Perspective

Authors: M. K. Witek-Hajduk, T. M. Napiórkowski

Abstract:

A number of studies discussed the topic of benefits of retailers-manufacturers cooperation and coopetition. However, there are only few publications focused on the benefits of cooperation and coopetition between retailers and their suppliers of durable consumer goods; especially in the context of business model of cooperating partners. This paper aims to provide a clustering approach to segment retailers selling consumer durables according to the benefits they obtain from their cooperation with key manufacturers and differentiate the said retailers’ in term of the business models of cooperating partners. For the purpose of the study, a survey (with a CATI method) collected data on 603 consumer durables retailers present on the Polish market. Retailers are clustered both, with hierarchical and non-hierarchical methods. Five distinctive groups of consumer durables’ retailers are (based on the studied benefits) identified using the two-stage clustering approach. The clusters are then characterized with a set of exogenous variables, key of which are business models employed by the retailer and its partnering key manufacturer. The paper finds that the a combination of a medium sized retailer classified as an Integrator with a chiefly domestic capital and a manufacturer categorized as a Market Player will yield the highest benefits. On the other side of the spectrum is medium sized Distributor retailer with solely domestic capital – in this case, the business model of the cooperating manufactrer appears to be irreleveant. This paper is the one of the first empirical study using cluster analysis on primary data that defines the types of cooperation between consumer durables’ retailers and manufacturers – their key suppliers. The analysis integrates a perspective of both retailers’ and manufacturers’ business models and matches them with individual and joint benefits.

Keywords: Business model, cooperation, cluster analysis, retailer-manufacturer relationships.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1089