Search results for: hierarchical clustering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 612

72 Mining Correlated Bicluster from Web Usage Data Using Discrete Firefly Algorithm Based Biclustering Approach

Authors: K. Thangavel, R. Rathipriya

Abstract:

Over the past decade, biclustering has become a popular data mining technique, not only in biological data analysis but also in other applications with high-dimensional two-way datasets, such as text mining and market data analysis. Biclustering clusters both the rows and columns of a dataset simultaneously, as opposed to traditional clustering, which clusters either rows or columns. It retrieves subgroups of objects that are similar in one subgroup of variables and different in the remaining variables. The Firefly Algorithm (FA) is a recently proposed metaheuristic inspired by the collective behavior of fireflies. This paper provides a preliminary assessment of a discrete version of FA (DFA) applied to the task of mining coherent, large-volume biclusters from web usage datasets. The experiments were conducted on two web usage datasets from a public dataset repository, and the performance of FA was compared with that of another population-based metaheuristic, binary Particle Swarm Optimization (PSO). The results demonstrate the usefulness of DFA in tackling the biclustering problem.

Keywords: Biclustering, Binary Particle Swarm Optimization, Discrete Firefly Algorithm, Firefly Algorithm, Usage profile, Web usage mining.

Downloads: 2133
71 A Monte Carlo Method to Data Stream Analysis

Authors: Kittisak Kerdprasop, Nittaya Kerdprasop, Pairote Sattayatham

Abstract:

Data stream analysis is the process of computing various summaries and derived values from large amounts of data that are continuously generated at a rapid rate. The nature of a stream does not allow each data element to be revisited. Furthermore, data processing must be fast to produce timely analysis results. These requirements impose constraints on the design of the algorithms, which must balance correctness against timely responses. Several techniques have been proposed over the past few years to address these challenges. These techniques can be categorized as either data-oriented or task-oriented. The data-oriented approach analyzes a subset of the data or a smaller transformed representation, whereas the task-oriented scheme solves the problem directly via approximation techniques. We propose a hybrid approach to tackle the data stream analysis problem. The data stream is both statistically transformed to a smaller size and computationally approximated in its characteristics. We adopt a Monte Carlo method in the approximation step. The data reduction is performed horizontally and vertically through our EMR sampling method. The proposed method is analyzed in a series of experiments. We apply our algorithm to clustering and classification tasks to evaluate the utility of our approach.

Keywords: Data Stream, Monte Carlo, Sampling, Density Estimation.

Downloads: 1417
70 Motivated Support Vector Regression using Structural Prior Knowledge

Authors: Wei Zhang, Yao-Yu Li, Yi-Fan Zhu, Qun Li, Wei-Ping Wang

Abstract:

It is known that incorporating prior knowledge into support vector regression (SVR) can help to improve the approximation performance. Most research is concerned with the incorporation of knowledge in the form of numerical relationships. Little work, however, has been done to incorporate prior knowledge on the structural relationships among the variables (referred to as Structural Prior Knowledge, SPK). This paper explores the incorporation of SPK in SVR by constructing an appropriate admissible support vector kernel (SV kernel) based on the properties of the reproducing kernel (R.K). Three levels of SPK specification are studied, with the corresponding sub-levels of prior knowledge that can be considered for the method. These include Hierarchical SPK (HSPK); Interactional SPK (ISPK), consisting of independence, global and local interaction; and Functional SPK (FSPK), composed of exterior-FSPK and interior-FSPK. A convenient tool for describing the SPK, namely the Description Matrix of SPK, is introduced. Subsequently, a new SVR, namely Motivated Support Vector Regression (MSVR), whose structure is motivated in part by SPK, is proposed. Synthetic examples show that it is possible to incorporate a wide variety of SPK and that doing so helps to improve the approximation performance in complex cases. The benefits of MSVR are finally shown on a real-life military application, air-to-ground battle simulation, which shows great potential for applying MSVR to complex military applications.

Keywords: admissible support vector kernel, reproducing kernel, structural prior knowledge, motivated support vector regression

Downloads: 1623
69 Motivated Support Vector Regression with Structural Prior Knowledge

Authors: Wei Zhang, Yao-Yu Li, Yi-Fan Zhu, Qun Li, Wei-Ping Wang

Abstract:

It is known that incorporating prior knowledge into support vector regression (SVR) can help to improve the approximation performance. Most research is concerned with the incorporation of knowledge in the form of numerical relationships. Little work, however, has been done to incorporate prior knowledge on the structural relationships among the variables (referred to as Structural Prior Knowledge, SPK). This paper explores the incorporation of SPK in SVR by constructing an appropriate admissible support vector kernel (SV kernel) based on the properties of the reproducing kernel (R.K). Three levels of SPK specification are studied, with the corresponding sub-levels of prior knowledge that can be considered for the method. These include Hierarchical SPK (HSPK); Interactional SPK (ISPK), consisting of independence, global and local interaction; and Functional SPK (FSPK), composed of exterior-FSPK and interior-FSPK. A convenient tool for describing the SPK, namely the Description Matrix of SPK, is introduced. Subsequently, a new SVR, namely Motivated Support Vector Regression (MSVR), whose structure is motivated in part by SPK, is proposed. Synthetic examples show that it is possible to incorporate a wide variety of SPK and that doing so helps to improve the approximation performance in complex cases. The benefits of MSVR are finally shown on a real-life military application, air-to-ground battle simulation, which shows great potential for applying MSVR to complex military applications.

Keywords: admissible support vector kernel, reproducing kernel, structural prior knowledge, motivated support vector regression

Downloads: 1400
68 Grouping and Indexing Color Features for Efficient Image Retrieval

Authors: M. V. Sudhamani, C. R. Venugopal

Abstract:

Content-based Image Retrieval (CBIR) aims at searching image databases for specific images that are similar to a given query image, based on matching features derived from the image content. This paper focuses on a low-dimensional color-based indexing technique for achieving efficient and effective retrieval performance. In our approach, the color features are extracted using the mean shift algorithm, a robust clustering technique. The cluster (region) mode is then used as the representative of the image in 3-D color space. The feature descriptor consists of the representative color of a region and is indexed using a spatial indexing method based on the R*-tree, thus avoiding the high-dimensional indexing problems associated with the traditional color histogram. Alternatively, the images in the database are clustered based on region feature similarity using Euclidean distance. Only the representative features (centroids) of these clusters are indexed using the R*-tree, thus improving efficiency. For similarity retrieval, each representative color in the query image or region is used independently to find regions containing that color. The results of these methods are compared. A Java-based query engine supporting query-by-example is built to retrieve images by color.

Keywords: Content-based, indexing, cluster, region.

Downloads: 1811
67 Performance Evaluation of Energy Efficient Communication Protocol for Mobile Ad Hoc Networks

Authors: Toshihiko Sasama, Kentaro Kishida, Kazunori Sugahara, Hiroshi Masuyama

Abstract:

A mobile ad hoc network is a network of mobile nodes without any notion of centralized administration. In such a network, each mobile node behaves not only as a host that runs applications but also as a router that forwards packets on behalf of others. Clustering has been applied to routing protocols to achieve efficient communications. A CH network expresses the connection relationships among cluster-heads. This paper discusses methods for constructing a CH network and produces the following results. (1) The running costs of three traditional methods for constructing a CH network do not differ much from each other in either the static or the dynamic circumstance, and their running costs in the static circumstance do not differ from those in the dynamic circumstance. Meanwhile, although the routing costs of these three methods do not differ much in the static circumstance, they differ considerably from each other in the dynamic circumstance. Their routing costs in the static circumstance are also very different from those in the dynamic circumstance: the former is one tenth of the latter. The routing cost in the dynamic circumstance is mostly the cost of re-routing. (2) On the strength of these results, we discuss two new methods in terms of whether they are tolerable in the dynamic circumstance, that is, whether the number of re-routings is small. These new methods are revisions of the traditional methods. We recommend the method that produces the smallest routing cost in the dynamic circumstance and therefore the smallest total cost.

Keywords: cluster, mobile ad hoc network, re-routing cost, simulation

Downloads: 1350
66 A Deep Learning Framework for Polarimetric SAR Change Detection Using Capsule Network

Authors: Sanae Attioui, Said Najah

Abstract:

The Earth's surface is constantly changing through forces of nature and human activities. Reliable, accurate, and timely change detection is critical to environmental monitoring, resource management, and planning activities. Recently, interest in deep learning algorithms, especially convolutional neural networks, has increased in the field of image change detection due to their powerful ability to extract multi-level image features automatically. However, these networks are prone to drawbacks that limit their applications, which reside in their inability to capture spatial relationships between image instances, as this necessitates a large amount of training data. As an alternative, Capsule Network has been proposed to overcome these shortcomings. Although its effectiveness in remote sensing image analysis has been experimentally verified, its application in change detection tasks remains very sparse. Motivated by its greater robustness towards improved hierarchical object representation, this study aims to apply a capsule network for PolSAR image Change Detection. The experimental results demonstrate that the proposed change detection method can yield a significantly higher detection rate compared to methods based on convolutional neural networks.

Keywords: Change detection, capsule network, deep network, Convolutional Neural Networks, polarimetric synthetic aperture radar images, PolSAR images.

Downloads: 498
65 Discovery of Human HMG-Coa Reductase Inhibitors Using Structure-Based Pharmacophore Modeling Combined with Molecular Dynamics Simulation Methodologies

Authors: Minky Son, Chanin Park, Ayoung Baek, Shalini John, Keun Woo Lee

Abstract:

3-hydroxy-3-methylglutaryl coenzyme A reductase (HMGR) catalyzes the conversion of HMG-CoA to mevalonate using NADPH, and the enzyme is involved in the rate-controlling step of the mevalonate pathway. Inhibition of HMGR is considered an effective way to lower cholesterol levels, so it is a drug target for treating hypercholesterolemia, a major risk factor for cardiovascular disease. To discover novel HMGR inhibitors, we performed structure-based pharmacophore modeling combined with molecular dynamics (MD) simulation. Four HMGR inhibitors were used for MD simulation, and a representative structure of each simulation was selected by clustering analysis. Four structure-based pharmacophore models were generated using the representative structures. The generated models were validated and used in virtual screening to find novel scaffolds for inhibiting HMGR. The screened compounds were filtered by applying drug-like properties and used in molecular docking. Finally, four hit compounds were obtained, and these complexes were refined using energy minimization. These compounds might be potential leads for designing novel HMGR inhibitors.

Keywords: Anti-hypercholesterolemia drug, HMGR inhibitor, Molecular dynamics simulation, Structure-based pharmacophore modeling.

Downloads: 1948
64 A Novel Nucleus-Based Classifier for Discrimination of Osteoclasts and Mesenchymal Precursor Cells in Mouse Bone Marrow Cultures

Authors: Andreas Heindl, Alexander K. Seewald, Martin Schepelmann, Radu Rogojanu, Giovanna Bises, Theresia Thalhammer, Isabella Ellinger

Abstract:

Bone remodeling occurs by the balanced action of bone-resorbing osteoclasts (OC) and bone-building osteoblasts. Increased bone resorption by excessive OC activity contributes to malignant and non-malignant diseases including osteoporosis. To study OC differentiation and function, OC formed in in vitro cultures are currently counted manually, a tedious procedure that is prone to inter-observer differences. With the aim of an automated OC-quantification system, classification of OC and precursor cells was performed on fluorescence microscope images based on the distinct appearance of fluorescent nuclei. Following ellipse fitting to nuclei, a combination of eight features enabled clustering of OC and precursor cell nuclei. After different machine-learning techniques were evaluated, LOGREG achieved 74% correctly classified OC and precursor cell nuclei, outperforming human experts (best expert: 55%). In combination with the automated detection of total cell areas, this system allows various cell parameters to be measured and, most importantly, proteins involved in osteoclastogenesis to be quantified.

Keywords: osteoclasts, machine learning, ellipse fitting.

Downloads: 1913
63 Combined Feature Based Hyperspectral Image Classification Technique Using Support Vector Machines

Authors: Mrs. K. Kavitha, S. Arivazhagan

Abstract:

A spatial classification technique incorporating a state-of-the-art feature extraction algorithm is proposed in this paper for classifying the heterogeneous classes present in hyperspectral images. The classification accuracy can be improved if and only if both the feature extraction and the classifier selection are proper. As the classes in hyperspectral images are assumed to have different textures, textural classification is adopted. Run Length feature extraction is used along with the Principal Components and Independent Components. A hyperspectral image of the Indiana site taken by AVIRIS is used for the experiment. Among the original 220 bands, a subset of 120 bands is selected. The Gray Level Run Length Matrix (GLRLM) is calculated for forty of the selected bands. From the GLRLMs, the Run Length features for individual pixels are calculated. The Principal Components are calculated for another forty bands, and Independent Components are calculated for the next forty bands. As Principal and Independent Components have the ability to represent the textural content of pixels, they are treated as features. The summation of Run Length features, Principal Components, and Independent Components forms the Combined Features, which are used for classification. An SVM with a Binary Hierarchical Tree is used to classify the hyperspectral image. Results are validated against ground truth, and accuracies are calculated.

Keywords: Multi-class, Run Length features, PCA, ICA, classification and Support Vector Machines.

Downloads: 1522
62 A General Framework for Knowledge Discovery Using High Performance Machine Learning Algorithms

Authors: S. Nandagopalan, N. Pradeep

Abstract:

The aim of this paper is to propose a general framework for storing, analyzing, and extracting knowledge from two-dimensional echocardiographic images, color Doppler images, non-medical images, and general data sets. A number of high performance data mining algorithms have been used to carry out this task. Our framework encompasses four layers, namely physical storage, object identification, knowledge discovery, and user level. Techniques such as an active contour model to identify the cardiac chambers, pixel classification to segment the color Doppler echo image, a universal model for image retrieval, a Bayesian method for classification, parallel algorithms for image segmentation, etc., were employed. Using the feature vector database that has been efficiently constructed, one can perform various data mining tasks such as clustering, classification, etc. with efficient algorithms, along with image mining given a query image. All these facilities are included in the framework, which is supported by a state-of-the-art user interface (UI). The algorithms were tested with actual patient data and the Coral image database, and the results show that their performance is better than previously reported results.

Keywords: Active Contour, Bayesian, Echocardiographic image, Feature vector.

Downloads: 1713
61 Dominating Set Algorithm and Trust Evaluation Scheme for Secured Cluster Formation and Data Transferring

Authors: Y. Harold Robinson, M. Rajaram, E. Golden Julie, S. Balaji

Abstract:

This paper describes a proficient way of choosing the cluster head based on a dominating set algorithm in a wireless sensor network (WSN). The algorithm overcomes energy deterioration problems through this cluster head selection process. Clustering algorithms such as LEACH, EEHC and HEED enhance scalability in WSNs. The dominating set algorithm keeps the first node alive longer than the protocols previously used. As the dominating set of cluster heads is directly connected to each node, network energy is saved by eliminating the intermediate nodes in the WSN. Security and trust are pivotal in network messaging. The cluster head is secured with a unique key. Members can connect with the cluster head if and only if they are secured too. The secured trust model provides security for data transmission in the dominated set network with the group key. The concept can be extended by adding a mobile sink for each cluster or for a number of clusters to transmit data or messages between cluster heads and to the base station. Data security is preferably high, and data loss can be prevented. The simulation demonstrates the concept of choosing cluster heads by the dominating set algorithm and trust evaluation using DSTE. The research done is rationalized.

Keywords: Wireless Sensor Networks, LEACH, EEHC, HEED, DSTE.

Downloads: 1405
60 Location Update Cost Analysis of Mobile IPv6 Protocols

Authors: Brahmjit Singh

Abstract:

Mobile IP has been developed to provide continuous information network access to mobile users. In IP-based mobile networks, location management is an important component of mobility management. It enables the system to track the location of a mobile node between consecutive communications. It includes two important tasks: location update and call delivery. Location update is associated with signaling load. Frequent updates lead to degradation in the overall performance of the network and underutilization of resources. It is therefore necessary to devise a mechanism to minimize the update rate. Mobile IPv6 (MIPv6) and Hierarchical MIPv6 (HMIPv6) have been potential candidates for deployment in mobile IP networks for mobility management. Studies have shown that HMIPv6 performs better than MIPv6. It reduces the signaling overhead traffic by making the registration process local. In this paper, we present a performance analysis of MIPv6 and HMIPv6 using an analytical model. A location update cost function is formulated based on the fluid flow mobility model. The impact of cell residence time, cell residence probability and user's mobility is investigated. Numerical results are obtained and presented in graphical form. It is shown that HMIPv6 outperforms MIPv6 for high-mobility users only; for low-mobility users, the performance of the two schemes is almost equivalent.

Keywords: Wireless networks, Mobile IP networks, Mobility management, performance analysis, Handover.

Downloads: 1754
59 Validation and Selection between Machine Learning Technique and Traditional Methods to Reduce Bullwhip Effects: a Data Mining Approach

Authors: Hamid R. S. Mojaveri, Seyed S. Mousavi, Mojtaba Heydar, Ahmad Aminian

Abstract:

The aim of this paper is to present a three-step methodology for forecasting supply chain demand. In the first step, various data mining techniques are applied in order to prepare data for entry into forecasting models. In the second step, the modeling step, an artificial neural network and a support vector machine are presented after the Mean Absolute Percentage Error index is defined for measuring error. The structure of the artificial neural network is selected based on previous researchers' results, and in this article the accuracy of the network is increased by using sensitivity analysis. The best forecast from classical forecasting methods (Moving Average, Exponential Smoothing, and Exponential Smoothing with Trend) is obtained based on the prepared data, and this forecast is compared with the results of the support vector machine and the proposed artificial neural network. The results show that the artificial neural network can forecast more precisely than the other methods. Finally, the stability of the forecasting methods is analyzed using raw data, and the effectiveness of clustering analysis is also measured.
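
As an illustrative aside (not taken from the paper), the error index and two of the classical baselines named in this abstract can be sketched as follows. The function names, the one-step-ahead forecasting convention, and the choice of initial smoothing level are assumptions made for this sketch; the paper's own formulation may differ.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent.

    Assumes both sequences have equal, non-zero length and that no
    actual value is zero (MAPE is undefined in that case).
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def moving_average_forecast(series, window):
    """One-step-ahead forecasts: each forecast is the mean of the
    previous `window` observations. Returns len(series) - window values."""
    return [sum(series[t - window:t]) / window
            for t in range(window, len(series))]


def exponential_smoothing_forecast(series, alpha):
    """One-step-ahead simple exponential smoothing.

    Assumption: the initial level is the first observation.
    Returns forecasts for series[1:].
    """
    level = series[0]
    forecasts = []
    for x in series[1:]:
        forecasts.append(level)              # forecast made before seeing x
        level = alpha * x + (1 - alpha) * level
    return forecasts


# Hypothetical demand figures, for illustration only.
demand = [100, 110, 120, 130, 140]
ma = moving_average_forecast(demand, window=2)
print(ma, mape(demand[2:], ma))
```

Comparing candidate models on the same held-out observations with a single index such as MAPE is what makes their forecasts directly comparable.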

Keywords: Artificial Neural Networks (ANN), bullwhip effect, demand forecasting, Support Vector Machine (SVM).

Downloads: 2010
58 Authenticity of Lipid and Soluble Sugar Profiles of Various Oat Cultivars (Avena sativa)

Authors: Marijana M. Ačanski, Kristian A. Pastor, Djura N. Vujić

Abstract:

The lipid and soluble sugar components were identified in flour samples of different cultivars belonging to the common oat species (Avena sativa L.): spring oat, winter oat and hulless oat. Fatty acids were extracted from flour samples with n-hexane and derivatized into volatile methyl esters using TMSH (trimethylsulfonium hydroxide in methanol). Soluble sugars were then extracted from defatted and dried samples of oat flour with 96% ethanol and further derivatized into the corresponding TMS-oximes using hydroxylamine hydrochloride solution and BSTFA (N,O-bis-(trimethylsilyl)-trifluoroacetamide). The hexane and ethanol extracts of each oat cultivar were analyzed using a GC-MS system. Lipid and simple sugar compositions are very similar in all samples of the investigated cultivars. A chemometric tool was applied to the numeric values of the automatically integrated surface areas of the detected lipid and simple sugar components in their corresponding derivatized forms. Hierarchical cluster analysis shows a very high similarity between the investigated flour samples of oat cultivars according to fatty acid content (0.9955). Moderate similarity was observed according to the content of soluble sugars (0.50). These preliminary results support the idea of establishing methods for oat flour authentication and provide a means of distinguishing oat flour samples, regardless of variety, from flour samples made of other cereal species, just by lipid and simple sugar profile analysis.

Keywords: Authentication, chemometrics, GC-MS, lipid and soluble sugar composition, oat cultivars.

Downloads: 1373
57 Analytical Authentication of Butter Using Fourier Transform Infrared Spectroscopy Coupled with Chemometrics

Authors: M. Bodner, M. Scampicchio

Abstract:

Fourier Transform Infrared (FT-IR) spectroscopy coupled with chemometrics was used to distinguish between butter samples and non-butter samples. Further, quantification of the margarine content in adulterated butter samples was investigated. The fingerprinting region (1400-800 cm⁻¹) was used to develop unsupervised pattern recognition (Principal Component Analysis, PCA), supervised modeling (Soft Independent Modelling by Class Analogy, SIMCA), classification (Partial Least Squares Discriminant Analysis, PLS-DA) and regression (Partial Least Squares Regression, PLS-R) models. PCA of the fingerprinting region shows a clustering of the two sample types. All samples were classified in their rightful class by the SIMCA approach; however, nine adulterated samples (between 1% and 30% w/w of margarine) were classified as belonging to both the butter class and the non-butter class. For the two-class PLS-DA model (R² = 0.73; Root Mean Square Error of Prediction, RMSEP = 0.26% w/w), sensitivity was 71.4% and Positive Predictive Value (PPV) was 100%. Its threshold was calculated at 7% w/w of margarine in adulterated butter samples. Finally, a PLS-R model (R² = 0.84, RMSEP = 16.54%) was developed. PLS-DA was a suitable classification tool and PLS-R a proper quantification approach. The results demonstrate that FT-IR spectroscopy combined with PLS-R can be used as a rapid, simple and safe method to distinguish pure butter samples from adulterated ones and to determine the degree of margarine adulteration in butter samples.
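
For readers unfamiliar with the figures of merit quoted in this abstract, their standard definitions are given below. The symbols are notation introduced here, not taken from the paper: TP, FN and FP are the counts of true positives, false negatives and false positives, and ŷ_i and y_i are the predicted and reference values for the i-th of n validation samples. The abstract does not state which class is treated as positive.

```latex
\[
\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad
\text{PPV} = \frac{TP}{TP + FP}, \qquad
\text{RMSEP} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(\hat{y}_i - y_i\right)^2}
\]
```

Under these definitions, a PPV of 100% corresponds to FP = 0, and a sensitivity of 71.4% means that roughly 71 of every 100 truly positive samples were flagged as positive.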

Keywords: Adulterated butter, margarine, PCA, PLS-DA, PLS-R, SIMCA.

Downloads: 781
56 Using the Combined Model of PROMETHEE and Fuzzy Analytic Network Process for Determining Question Weights in Scientific Exams through Data Mining Approach

Authors: Hassan Haleh, Amin Ghaffari, Parisa Farahpour

Abstract:

The need for an appropriate system for evaluating students' educational development is a key problem in achieving predefined educational goals. The volume of related papers in recent years that try to prove or disprove the necessity and adequacy of student assessment corroborates this. Some of these studies tried to increase the precision of determining question weights in scientific examinations. But all of them attempt to adjust the initial question weights, while the accuracy and precision of those initial question weights remain in question. Thus, in order to increase the precision of the process of assessing students' educational development, the present study proposes a new method for determining the initial question weights by considering question factors such as difficulty, importance and complexity, and by implementing a combined method of PROMETHEE and fuzzy analytic network process using a data mining approach to improve the model's inputs. The result of the implemented case study demonstrates the improved performance and precision of the proposed model.

Keywords: Assessing students, Analytic network process, Clustering, Data mining, Fuzzy sets, Multi-criteria decision making, and Preference function.

Downloads: 1581
55 BIDENS: Iterative Density Based Biclustering Algorithm With Application to Gene Expression Analysis

Authors: Mohamed A. Mahfouz, M. A. Ismail

Abstract:

Biclustering is a very useful data mining technique for identifying patterns in which different genes are co-related based on a subset of conditions in gene expression analysis. Association rule mining is an efficient approach to biclustering, as in the BIMODULE algorithm, but it is sensitive to the values given to its input parameters and to the discretization procedure used in the preprocessing step. Also, when noise is present, classical association rule miners discover multiple small fragments of the true bicluster but miss the true bicluster itself. This paper formally presents a generalized noise-tolerant bicluster model, termed μBicluster. An iterative algorithm based on the proposed model, termed BIDENS, is introduced that can discover a set of k possibly overlapping biclusters simultaneously. Our model uses a more flexible method to partition the dimensions in order to preserve meaningful and significant biclusters. The proposed algorithm allows the discovery of biclusters that are hard for BIMODULE to discover. An experimental study on yeast and human gene expression data and several artificial datasets shows that our algorithm offers substantial improvements over several previously proposed biclustering algorithms.

Keywords: Machine learning, biclustering, bi-dimensional clustering, gene expression analysis, data mining.

Downloads: 1963
54 Oracle JDE Enterprise One ERP Implementation: A Case Study

Authors: Abhimanyu Pati, Krishna Kumar Veluri

Abstract:

The paper intends to bring out a real life experience encountered during actual implementation of a large scale Tier-1 Enterprise Resource Planning (ERP) system in a multi-location, discrete manufacturing organization in India, involved in manufacturing of auto components and aggregates. The business complexities, prior to the implementation of ERP, include multi-product with hierarchical product structures, geographically distributed multiple plant locations with disparate business practices, lack of inter-plant broadband connectivity, existence of disparate legacy applications for different business functions, and non-standardized codifications of products, machines, employees, and accounts apart from others. On the other hand, the manufacturing environment consisted of processes like Assemble-to-Order (ATO), Make-to-Stock (MTS), and Engineer-to-Order (ETO) with a mix of discrete and process operations. The paper has highlighted various business plan areas and concerns, prior to the implementation, with specific focus on strategic issues and objectives. Subsequently, it has dealt with the complete process of ERP implementation, starting from strategic planning, project planning, resource mobilization, and finally, the program execution. The step-by-step process provides a very good learning opportunity about the implementation methodology. At the end, various organizational challenges and lessons emerged, which will act as guidelines and checklist for organizations to successfully align and implement ERP and achieve their business objectives.

Keywords: ERP, ATO, MTS, ETO, discrete manufacturing, strategic planning.

Downloads: 1800
53 Biosensor Design through Molecular Dynamics Simulation

Authors: Wenjun Zhang, Yunqing Du, Steven W. Cranford, Ming L. Wang

Abstract:

The beginning of the 21st century has witnessed new advancements in the design and use of new materials for biosensing applications, from nano to macro, protein to tissue. Traditional analytical methods lack a complete toolset to describe the complexities introduced by living systems, pathological relations, discrete hierarchical materials, cross-phase interactions, and structure-property dependencies. Materiomics, via systematic molecular dynamics (MD) simulation, can provide structure-process-property relations by using a materials science approach that links mechanisms across scales and enables oriented biosensor design. With this approach, DNA biosensors can be used to detect disease biomarkers present in individuals’ breath, such as acetone for diabetes. Our wireless sensor array based on single-stranded DNA (ssDNA)-decorated single-walled carbon nanotubes (SWNT) has successfully detected trace amounts of various chemicals in vapor, differentiated by pattern recognition. Here, we present how MD simulation can revolutionize the design and screening of DNA aptamers for targeting biomarkers related to oral diseases and oral health monitoring. It demonstrates great potential for building a library of DNA sequences for reliable detection of several biomarkers of one specific disease, and it also provides a new methodology for creating, designing, and applying biosensors.

Keywords: Biosensor, design, DNA, molecular dynamics simulation.

Downloads: 3036
52 Fuzzy Control of the Air Conditioning System at Different Operating Pressures

Authors: Mohanad Alata, Moh'd Al-Nimr, Rami Al-Jarrah

Abstract:

The present work demonstrates the design and simulation of a fuzzy controller for an air conditioning system at different pressures. A first-order Sugeno fuzzy inference system is used to model the system and create the controller. In addition, the heat transfer rate and the mass flow rate of water injected into or withdrawn from the air conditioning system are estimated by the fuzzy IF-THEN rules. The approach starts by generating the input/output data. Then, the subtractive clustering algorithm, along with least squares estimation (LSE), generates the fuzzy rules that describe the relationship between the input and output data. The fuzzy rules are tuned by an Adaptive Neuro-Fuzzy Inference System (ANFIS). The results show that, as the pressure increases, the water flow rate and heat transfer rate decrease within the lower ranges of inlet dry bulb temperature. On the other hand, as the pressure increases, the water flow rate and heat transfer rate increase within the higher ranges of inlet dry bulb temperature. The inflection in the pressure effect trend occurs at lower temperatures as the inlet air humidity increases.
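
As a rough illustration of the first-order Sugeno inference mentioned in the abstract above, the sketch below combines linear rule outputs by a firing-strength-weighted average. The Gaussian membership functions, the single input variable, and the two rules with their coefficients are all hypothetical; the abstract does not give the authors' rule base, which they derive by subtractive clustering and tune with ANFIS.

```python
import math

def gaussian(x, center, sigma):
    """Gaussian membership degree of x in a fuzzy set."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

# Hypothetical rule base: (center, sigma, slope, intercept).
# Each rule reads: IF x is near `center` THEN y = slope * x + intercept.
RULES = [
    (0.0, 1.0, 1.0, 0.0),
    (4.0, 1.0, 0.0, 2.0),
]

def sugeno_output(x, rules=RULES):
    """First-order Sugeno output: weighted average of linear consequents."""
    weights = [gaussian(x, c, s) for c, s, _, _ in rules]
    outputs = [a * x + b for _, _, a, b in rules]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    return sum(w * y for w, y in zip(weights, outputs)) / total
```

Midway between the two rule centers both rules fire equally, so the output is the plain average of their consequents; near one center, that rule's linear model dominates.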

Keywords: Air Conditioning, ANFIS, Fuzzy Control, Sugeno System.

Downloads: 3366
51 Measuring the Structural Similarity of Web-based Documents: A Novel Approach

Authors: Matthias Dehmer, Frank Emmert Streib, Alexander Mehler, Jürgen Kilian

Abstract:

Most known methods for measuring the structural similarity of document structures are based on, e.g., tag measures, path metrics, and tree measures in terms of their DOM-Trees. Other methods measure the similarity in the framework of the well-known vector space model. In contrast to these, we present a new approach to measuring the structural similarity of web-based documents represented by so-called generalized trees, which are more general than DOM-Trees, which represent only directed rooted trees. We design a new similarity measure for graphs representing web-based hypertext structures. Our similarity measure is mainly based on a novel representation of a graph as linear integer strings whose components represent structural properties of the graph. The similarity of two graphs is then defined via the optimal alignment of the underlying property strings. In this paper we apply the well-known technique of sequence alignment to solve a novel and challenging problem: measuring the structural similarity of generalized trees. More precisely, we first transform our graphs, considered as high-dimensional objects, into linear structures. Then we derive similarity values from the alignments of the property strings in order to measure the structural similarity of generalized trees. Hence, we transform a graph similarity problem into a string similarity problem. We demonstrate that our similarity measure captures important structural information by applying it to two different test sets consisting of graphs representing web-based documents.
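
The idea of reducing graph similarity to string alignment can be sketched with a standard global alignment (Needleman-Wunsch) over integer sequences. This is only an illustration: the scoring values and the notion of a "property string" used here (for instance, one integer per tree level) are assumptions, not the similarity measure defined by the authors.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of two sequences (Needleman-Wunsch).

    Only the previous row of the dynamic-programming table is kept.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]

# Hypothetical property strings, e.g. the number of nodes on each tree level.
tree_a = [1, 2, 3]
tree_b = [1, 3]
score = alignment_score(tree_a, tree_b)
```

A higher score means the two integer strings can be aligned with fewer mismatches and gaps; turning the score into a normalized similarity value is left open here.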

Keywords: Graph similarity, hierarchical and directed graphs, hypertext, generalized trees, web structure mining.

Downloads: 2557
50 Dynamic Clustering Estimation of Tool Flank Wear in Turning Process using SVD Models of the Emitted Sound Signals

Authors: A. Samraj, S. Sayeed, J. E. Raja, J. Hossen, A. Rahman

Abstract:

Monitoring tool flank wear without affecting throughput is considered the prudent method in production technology. The examination has to be done without affecting the machining process. In this paper we propose a novel method to determine tool flank wear by observing the sound signals emitted during the turning process. The work-piece materials used here are steel and aluminum, and the cutting insert is made of carbide. Two different cutting speeds were used in this work. The feed rate and the cutting depth were constant, whereas the flank wear was variable. The sound signals emitted during the turning process by a fresh tool (0 mm flank wear), a slightly worn tool (0.2-0.25 mm flank wear), and a severely worn tool (0.4 mm flank wear and above) were recorded separately using a highly sensitive microphone. Singular Value Decomposition (SVD) analysis was performed on these sound signals to extract the feature sound components. The results showed that an increase in tool flank wear correlates with an increase in the values of the SVD features extracted from the sound signals for both materials. Hence it can be concluded that monitoring tool flank wear during the turning process using SVD features of the emitted sound signal with Fuzzy C-means classification is a promising and relatively simple method.

Keywords: Fuzzy c means, Microphone, Singular Value Decomposition, Tool Flank Wear.

Downloads: 1898
49 Hand Gesture Recognition Based on Combined Features Extraction

Authors: Mahmoud Elmezain, Ayoub Al-Hamadi, Bernd Michaelis

Abstract:

Hand gesture recognition is an active area of research in the vision community, mainly for the purposes of sign language recognition and Human Computer Interaction. In this paper, we propose a system to recognize alphabet characters (A-Z) and numbers (0-9) in real time from stereo color image sequences using Hidden Markov Models (HMMs). Our system is based on three main stages: automatic segmentation and preprocessing of the hand regions, feature extraction, and classification. In the automatic segmentation and preprocessing stage, color and a 3D depth map are used to detect the hands, and the hand trajectory is obtained in a further step using the Mean-shift algorithm and a Kalman filter. In the feature extraction stage, 3D combined features of location, orientation, and velocity with respect to Cartesian systems are used. Then, k-means clustering is employed to generate the HMM codewords. In the final stage, classification, the Baum-Welch algorithm is used to fully train the HMM parameters. The gestures of alphabet characters and numbers are recognized using a Left-Right Banded model in conjunction with the Viterbi algorithm. Experimental results demonstrate that our system can successfully recognize hand gestures with a 98.33% recognition rate.
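
The step in which k-means clustering produces HMM codewords can be sketched as vector quantization: continuous feature vectors are clustered, and each vector is then replaced by the index of its nearest centroid, giving the discrete symbols a discrete HMM expects. The two-dimensional toy features, the fixed initial centroids, and the function names below are illustrative assumptions, not the authors' implementation.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )

def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm, starting from the given initial centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(dim) / len(g) for dim in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids

# Toy feature vectors standing in for location/orientation/velocity features.
features = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9)]
codebook = kmeans(features, centroids=[features[0], features[2]])
symbols = [nearest(f, codebook) for f in features]  # discrete observation sequence
```

The resulting `symbols` list is the kind of observation sequence that Baum-Welch training and Viterbi decoding operate on.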

Keywords: Gesture Recognition, Computer Vision & Image Processing, Pattern Recognition.

Downloads: 4032
48 Discovering Complex Regularities: from Tree to Semi-Lattice Classifications

Authors: A. Faro, D. Giordano, F. Maiorana

Abstract:

Data mining uses a variety of techniques, each of which is useful for some particular task. It is important to have a deep understanding of each technique and to be able to perform sophisticated analysis. In this article we describe a tool built to simulate a variation of the Kohonen network to perform unsupervised clustering and to support the entire data mining process up to the visualization of results. A graphical representation helps the user find a strategy to optimize classification by adding, moving, or deleting a neuron in order to change the number of classes. The tool can automatically suggest a strategy to optimize the number of classes, and it also supports both tree classifications and semi-lattice organizations of the classes, giving users the possibility of passing from one class to the ones with which it has some aspects in common. Examples of using tree and semi-lattice classifications are given to illustrate advantages and problems. The tool is applied to classify macroeconomic data that report the imports and exports of the most developed countries. It is possible to classify the countries based on their economic behaviour and to use the tool to characterize the commercial behaviour of a country in a selected class by analyzing the positive and negative features that contribute to class formation. Possible interrelationships between the classes and their meaning are also discussed.

Keywords: Unsupervised classification, Kohonen networks, macroeconomics, Visual data mining, Cluster interpretation.

Downloads: 1542
47 A State Aggregation Approach to Singularly Perturbed Markov Reward Processes

Authors: Dali Zhang, Baoqun Yin, Hongsheng Xi

Abstract:

In this paper, we propose a single-sample-path-based algorithm with state aggregation to optimize the average rewards of singularly perturbed Markov reward processes (SPMRPs) with large-scale state spaces. It is assumed that such a reward process depends on a set of parameters. Unlike other kinds of Markov chains, SPMRPs have their own hierarchical structure. Based on this special structure, our algorithm can alleviate the load of the performance optimization. Moreover, our method can be applied online because it evolves with the simulated sample path. In contrast to the original algorithm applied to these problems for general MRPs, a new gradient formula for the average reward performance metric in SPMRPs is introduced, which is proved in the Appendix. Based on these gradients, the schedule of the iterative algorithm, which is based on a single sample path, is presented. Finally, a special case in which the parameters only dominate the disturbance matrices is analyzed, and a precise comparison is made between our algorithm and the older ones that aim to solve these problems in general Markov reward processes. When applied to SPMRPs, our method proceeds at a fast pace in these cases. Furthermore, to illustrate the practical value of SPMRPs, a simple example in multiple programming in computer systems is presented and simulated. Corresponding to some practical models, the physical meanings of SPMRPs in networks of queues are clarified.

Keywords: Singularly perturbed Markov processes, Gradient of average reward, Differential reward, State aggregation, Perturbed close network.

Downloads: 1636
46 Automatic Segmentation of Dermoscopy Images Using Histogram Thresholding on Optimal Color Channels

Authors: Rahil Garnavi, Mohammad Aldeen, M. Emre Celebi, Alauddin Bhuiyan, Constantinos Dolianitis, George Varigos

Abstract:

Automatic segmentation of skin lesions is the first step towards the development of a computer-aided diagnosis system for melanoma. Although numerous segmentation methods have been developed, few studies have focused on determining the most discriminative and effective color space for melanoma applications. This paper proposes a novel automatic segmentation algorithm using color space analysis and clustering-based histogram thresholding, which is able to determine the optimal color channel for the segmentation of skin lesions. To demonstrate the validity of the algorithm, it is tested on a set of 30 high-resolution dermoscopy images, and a comprehensive evaluation of the results is provided in which borders manually drawn by four dermatologists are compared to the automated borders detected by the proposed algorithm. The evaluation is carried out by applying three previously used metrics (accuracy, sensitivity, and specificity) and a new metric of similarity. Through ROC analysis and ranking of the metrics, it is shown that the best results are obtained with the X and XoYoR color channels, which result in an accuracy of approximately 97%. The proposed method is also compared with two state-of-the-art skin lesion segmentation methods, which demonstrates the effectiveness and superiority of the proposed segmentation method.
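
The abstract does not say which clustering-based thresholding rule is used. As one well-known example of the family, the sketch below implements Otsu's method on a single-channel intensity histogram: it picks the threshold that maximizes the between-class variance of the two resulting pixel groups. The toy histogram is made up, and the color-channel selection step described in the abstract is not shown.

```python
def otsu_threshold(hist):
    """Return the bin index t that maximizes between-class variance.

    Class 0 holds bins 0..t, class 1 holds bins t+1..end.
    """
    total = sum(hist)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # pixel count in class 0
    sum0 = 0    # intensity sum in class 0
    for t, h in enumerate(hist[:-1]):
        w0 += h
        sum0 += t * h
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (total_sum - sum0) / w1
        # Between-class variance, up to the constant factor 1 / total**2.
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t

# Toy bimodal histogram: dark lesion pixels in bins 0-1, bright skin in bins 6-7.
threshold = otsu_threshold([5, 5, 0, 0, 0, 0, 5, 5])
```

Ties are resolved in favor of the lowest bin, so on this toy histogram the threshold sits at the upper edge of the dark mode.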

Keywords: Border detection, Color space analysis, Dermoscopy, Histogram thresholding, Melanoma, Segmentation.

Downloads: 2085
45 A Two-Stage Expert System for Diagnosis of Leukemia Based on Type-2 Fuzzy Logic

Authors: Ali Akbar Sadat Asl

Abstract:

Diagnosis and decision-making about diseases in medicine face inherent uncertainty, which can affect the whole process of treatment. The decision is made based on expert knowledge and on the way in which an expert interprets the patient's condition, and different experts may interpret the same condition differently. Fuzzy logic can provide mathematical modeling for many concepts, variables, and systems that are unclear and ambiguous, and it can also provide a framework for reasoning, inference, control, and decision making under uncertainty. In systems with high uncertainty and high complexity, fuzzy logic is a suitable modeling method. In this paper, we use type-2 fuzzy logic to model the uncertainty in the diagnosis of leukemia. The proposed system uses an indirect-direct approach and consists of two stages. In the first stage, the state of the blood test is inferred. In this stage, we use an indirect approach in which the rules are extracted automatically by a clustering approach. In the second stage, the signs of leukemia, the duration of the disease until its progress, and the output of the first stage are combined to obtain the final diagnosis of the system. In this stage, the system uses a direct approach, and the final diagnosis is determined by the expert. The results show that the type-2 fuzzy expert system can diagnose leukemia with an average accuracy of about 97%.

Keywords: Expert system, leukemia, medical diagnosis, type-2 fuzzy logic.

Downloads: 1053
44 Surrogate based Evolutionary Algorithm for Design Optimization

Authors: Maumita Bhattacharya

Abstract:

Optimization is often a critical issue for most system design problems. Evolutionary Algorithms are population-based, stochastic search techniques, widely used as efficient global optimizers. However, finding an optimal solution to complex, high-dimensional, multimodal problems often requires computationally expensive function evaluations and hence is practically prohibitive. The Dynamic Approximate Fitness based Hybrid EA (DAFHEA) model presented in our earlier work [14] reduced computation time by the controlled use of meta-models to partially replace the actual function evaluation with approximate function evaluation. However, the underlying assumption in DAFHEA is that the training samples for the meta-model are generated from a single uniform model. Situations such as model formation involving variable input dimensions and noisy data cannot be covered by this assumption. In this paper we present an enhanced version of DAFHEA that incorporates a multiple-model based learning approach for the SVM approximator. DAFHEA-II (the enhanced version of the DAFHEA framework) also overcomes the high computational expense of the additional clustering requirements of the original DAFHEA framework. The proposed framework has been tested on several benchmark functions, and the empirical results illustrate the advantages of the proposed technique.

Keywords: Evolutionary algorithm, Fitness function, Optimization, Meta-model, Stochastic method.

Downloads: 1576
43 Applications of Support Vector Machines on Smart Phone Systems for Emotional Speech Recognition

Authors: Wernhuar Tarng, Yuan-Yuan Chen, Chien-Lung Li, Kun-Rong Hsie, Mingteh Chen

Abstract:

An emotional speech recognition system for smart phone applications is proposed in this study; it is combined with 3G mobile communications and social networks to provide users and their groups with more interaction and care. This study developed a mechanism using support vector machines (SVM) to recognize emotions in speech such as happiness, anger, sadness, and normal. The mechanism uses a hierarchical classifier to adjust the weights of acoustic features and divides the various parameters into the categories of energy and frequency for training. In this study, 28 commonly used acoustic features, including pitch and volume, were used for training. In addition, a time-frequency parameter obtained by continuous wavelet transforms was used to identify the accent and intonation in a sentence during the recognition process. The Berlin Database of Emotional Speech was used, with the speech divided into male and female data sets for training. According to the experimental results, the accuracies on the male and female test sets increased by 4.6% and 5.2%, respectively, after using the time-frequency parameter for classifying happy and angry emotions. For the classification of all emotions, the average accuracy, including male and female data, was 63.5% for the test set and 90.9% for the whole data set.

Keywords: Smart phones, emotional speech recognition, social networks, support vector machines, time-frequency parameter, Mel-scale frequency cepstral coefficients (MFCC).

Downloads: 1842