Search results for: eukaryotic clustering
55 Gene Expression Signature for Classification of Metastasis Positive and Negative Oral Cancer in Homosapiens
Authors: A. Shukla, A. Tarsauliya, R. Tiwari, S. Sharma
Abstract:
Classifying cancers into their corresponding cohorts has been a key area of research in bioinformatics, aiming at better prognosis of the disease. The high dimensionality of gene data makes this a complex task and requires a technique that identifies significant data in order to reduce the dimensionality and isolate the significant information. In this paper, we propose a novel approach for classifying oral cancer patients as metastasis positive or negative. We used significance analysis of microarrays (SAM) to identify the significant genes that constitute a gene signature. Three different gene signatures were identified using SAM from three different combinations of training datasets, and their classification accuracy was calculated on the corresponding testing datasets using k-Nearest Neighbour (kNN), Fuzzy C-Means Clustering (FCM), Support Vector Machine (SVM) and Backpropagation Neural Network (BPNN). A final gene signature of only 9 genes was obtained from the above three individual gene signatures. The classification capability of the 9-gene signature was compared using the same classifiers on the same testing datasets. The experimental results show that the 9-gene signature classified all samples in the testing dataset accurately, while individual genes could not classify all samples accurately.
Keywords: Cancer, Gene Signature, SAM, Classification.
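As a minimal sketch of the kNN classifier named in the abstract above (not the authors' implementation), the snippet below labels a sample by majority vote among its k nearest training samples. The function name `knn_predict`, the two-feature toy data, and the labels are hypothetical; real inputs would be expression values of the signature genes.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training samples nearest to query.

    train is a list of (feature_tuple, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two features per sample instead of real expression values.
toy_train = [
    ((0.0, 0.0), "negative"),
    ((0.0, 1.0), "negative"),
    ((5.0, 5.0), "positive"),
    ((6.0, 5.0), "positive"),
    ((5.0, 6.0), "positive"),
]

print(knn_predict(toy_train, (5.0, 4.0)))
```

An odd k avoids tied votes in a two-class problem such as this one.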
Downloads 2076
54 Modeling of Pulping of Sugar Maple Using Advanced Neural Network Learning
Authors: W. D. Wan Rosli, Z. Zainuddin, R. Lanouette, S. Sathasivam
Abstract:
This paper reports work done to improve the modeling of complex processes when only small experimental data sets are available. Neural networks are used to capture the nonlinear underlying phenomena contained in the data set and to partly eliminate the burden of having to specify completely the structure of the model. Two different types of neural networks were applied to the sugar maple pulping problem. Three-layer feedforward neural networks trained with Preconditioned Conjugate Gradient (PCG) methods were used in this investigation. Preconditioning is a method to improve convergence by lowering the condition number and increasing the clustering of the eigenvalues. The idea is to solve a modified problem in which M is a positive-definite preconditioner that is closely related to A. We mainly focused on Preconditioned Conjugate Gradient-based training methods which originated from optimization theory, namely Preconditioned Conjugate Gradient with Fletcher-Reeves Update (PCGF), Preconditioned Conjugate Gradient with Polak-Ribiere Update (PCGP) and Preconditioned Conjugate Gradient with Powell-Beale Restarts (PCGB). The behavior of the PCG methods in the simulations proved to be robust against phenomena such as oscillations caused by a large step size.
Keywords: Convergence, Modeling, Neural Networks, Preconditioned Conjugate Gradient.
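To illustrate the preconditioning idea in the abstract above, here is a minimal sketch of the linear preconditioned conjugate gradient method for a symmetric positive-definite system A x = b. It is not the authors' training code: their PCGF, PCGP and PCGB variants are applied to network training, whereas this example assumes a small linear system and a Jacobi (diagonal) preconditioner chosen purely for illustration. The names `pcg` and `m_inv_diag` are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def pcg(A, b, m_inv_diag, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive-definite A.

    m_inv_diag holds the diagonal of M^-1 (a Jacobi preconditioner,
    an assumption made for this sketch).
    """
    x = [0.0] * len(b)
    r = list(b)                                      # residual b - A x for x = 0
    z = [mi * ri for mi, ri in zip(m_inv_diag, r)]   # preconditioned residual
    p = list(z)                                      # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if math.sqrt(dot(r, r)) < tol:               # stop once the residual is small
            break
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)                      # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [mi * ri for mi, ri in zip(m_inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz                           # Fletcher-Reeves-style update
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x

A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = pcg(A, b, m_inv_diag=[1.0 / A[i][i] for i in range(len(b))])
print(x)
```

For this 2 x 2 system the exact solution is x = (1/11, 7/11), which the iteration reaches in two steps up to rounding.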
Downloads 1685
53 A Design Framework for Event Recommendation in Novice Low-Literacy Communities
Authors: Yimeng Deng, Klarissa T.T. Chang
Abstract:
The proliferation of user-generated content (UGC) results in huge opportunities to explore event patterns. However, existing event recommendation systems primarily focus on advanced information technology users. Little work has been done to address novice and low-literacy users. The next billion users providing and consuming UGC are likely to include communities from developing countries who are ready to use affordable technologies for subsistence goals. Therefore, we propose a design framework for providing event recommendations to address the needs of such users. Grounded in information integration theory (IIT), our framework advocates that effective event recommendation is supported by systems capable of (1) reliable information gathering through structured user input, (2) accurate sense making through spatial-temporal analytics, and (3) intuitive information dissemination through interactive visualization techniques. A mobile pest management application is developed as an instantiation of the design framework. Our preliminary study suggests a set of design principles for novice and low-literacy users.
Keywords: Event recommendation, iconic interface, information integration, spatial-temporal clustering, user-generated content, visualization techniques
Downloads 1656
52 Logistic Model Tree and Expectation-Maximization for Pollen Recognition and Grouping
Authors: Endrick Barnacin, Jean-Luc Henry, Jack Molinié, Jimmy Nagau, Hélène Delatte, Gérard Lebreton
Abstract:
Palynology is a field of interest for many disciplines. It has multiple applications such as chronological dating, climatology, allergy treatment, and even honey characterization. Unfortunately, the analysis of a pollen slide is a complicated and time-consuming task that requires the intervention of experts in the field, who are becoming increasingly rare due to economic and social conditions. The automation of this task is therefore a necessity. Pollen slide analysis is mainly a visual process, as it is carried out with the naked eye. For this reason, a primary method to automate palynology is the use of digital image processing. This method has the lowest cost and relatively good accuracy in pollen retrieval. In this work, we propose a system combining recognition and grouping of pollen. It uses a Logistic Model Tree to classify pollen already known to the proposed system while detecting any unknown species. The unknown pollen species are then divided using a cluster-based approach. Success rates for the recognition of known species have been achieved, and automated clustering seems to be a promising approach.
Keywords: Pollen recognition, logistic model tree, expectation-maximization, local binary pattern.
Downloads 770
51 Plant Varieties Selection System
Authors: Kitti Koonsanit, Chuleerat Jaruskulchai, Poonsak Miphokasap, Apisit Eiumnoh
Abstract:
Meteorological and environmental data are now widely used in applications such as plant variety selection systems. Selecting the plant variety for a planted area is of utmost importance for all crops, including sugarcane, which has many varieties. A variety that was not selected for the planted area may not be adapted to its climate or soil conditions. Poor growth, bloom drop, poor fruit, and low prices result from varieties that were not recommended for the planted area. This paper presents a plant variety selection system for planted areas in Thailand that applies decision tree techniques to meteorological and environmental data. Developed as an environmental data analysis tool, the software makes analyzing results easier and faster. Our software is a front end for WEKA that provides fundamental data mining functions such as classification, clustering, and analysis. It also supports pre-processing, analysis, and decision tree output, with export of results. The software can then export the results to the Google Maps API in order to display them and plot plant icons effectively.
Keywords: Plant varieties selection system, decision tree, expert recommendation.
Downloads 1793
50 Extraction of Symbolic Rules from Artificial Neural Networks
Authors: S. M. Kamruzzaman, Md. Monirul Islam
Abstract:
Although backpropagation ANNs generally predict better than decision trees for pattern classification problems, they are often regarded as black boxes, i.e., their predictions cannot be explained as those of decision trees can. In many applications, it is desirable to extract knowledge from trained ANNs so that users gain a better understanding of how the networks solve the problems. A new rule extraction algorithm, called rule extraction from artificial neural networks (REANN), is proposed and implemented to extract symbolic rules from ANNs. A standard three-layer feedforward ANN is the basis of the algorithm. A four-phase training algorithm is proposed for backpropagation learning. Explicitness of the extracted rules is supported by comparing them to the symbolic rules generated by other methods. Extracted rules are comparable with those of other methods in terms of number of rules, average number of conditions per rule, and predictive accuracy. Extensive experimental studies on several benchmark classification problems, such as breast cancer, iris, diabetes, and season classification, demonstrate the effectiveness of the proposed approach with good generalization ability.
Keywords: Backpropagation, clustering algorithm, constructive algorithm, continuous activation function, pruning algorithm, rule extraction algorithm, symbolic rules.
Downloads 1616
49 Mining Correlated Bicluster from Web Usage Data Using Discrete Firefly Algorithm Based Biclustering Approach
Authors: K. Thangavel, R. Rathipriya
Abstract:
Over the past decade, biclustering has become a popular data mining technique, not only in the field of biological data analysis but also in other applications with high-dimensional two-way datasets, such as text mining and market data analysis. Biclustering clusters both rows and columns of a dataset simultaneously, as opposed to traditional clustering, which clusters either rows or columns of a dataset. It retrieves subgroups of objects that are similar in one subgroup of variables and different in the remaining variables. Firefly Algorithm (FA) is a recently proposed metaheuristic inspired by the collective behavior of fireflies. This paper provides a preliminary assessment of a discrete version of FA (DFA) on the task of mining coherent, large-volume biclusters from web usage datasets. The experiments were conducted on two web usage datasets from a public dataset repository, and the performance of FA was compared with that of another population-based metaheuristic, binary Particle Swarm Optimization (PSO). The results demonstrate the usefulness of DFA in tackling the biclustering problem.
Keywords: Biclustering, Binary Particle Swarm Optimization, Discrete Firefly Algorithm, Firefly Algorithm, Usage profile, Web usage mining.
Downloads 2133
48 A Monte Carlo Method to Data Stream Analysis
Authors: Kittisak Kerdprasop, Nittaya Kerdprasop, Pairote Sattayatham
Abstract:
Data stream analysis is the process of computing various summaries and derived values from large amounts of data which are continuously generated at a rapid rate. The nature of a stream does not allow each data element to be revisited. Furthermore, data processing must be fast to produce timely analysis results. These requirements impose constraints on the design of the algorithms, which must balance correctness against timely responses. Several techniques have been proposed over the past few years to address these challenges. These techniques can be categorized as either data-oriented or task-oriented. The data-oriented approach analyzes a subset of the data or a smaller transformed representation, whereas the task-oriented scheme solves the problem directly via approximation techniques. We propose a hybrid approach to tackle the data stream analysis problem. The data stream is statistically transformed to a smaller size, and its characteristics are computationally approximated. We adopt a Monte Carlo method in the approximation step. The data reduction is performed horizontally and vertically through our EMR sampling method. The proposed method is analyzed by a series of experiments. We apply our algorithm to clustering and classification tasks to evaluate the utility of our approach.
Keywords: Data Stream, Monte Carlo, Sampling, Density Estimation.
Downloads 1417
47 Grouping and Indexing Color Features for Efficient Image Retrieval
Authors: M. V. Sudhamani, C. R. Venugopal
Abstract:
Content-based Image Retrieval (CBIR) aims at searching image databases for specific images that are similar to a given query image based on matching of features derived from the image content. This paper focuses on a low-dimensional color-based indexing technique for achieving efficient and effective retrieval performance. In our approach, the color features are extracted using the mean shift algorithm, a robust clustering technique. The cluster (region) mode is then used as the representative of the image in 3-D color space. The feature descriptor consists of the representative color of a region and is indexed using a spatial indexing method based on the R*-tree, thus avoiding the high-dimensional indexing problems associated with the traditional color histogram. Alternatively, the images in the database are clustered based on region feature similarity using Euclidean distance. Only the representative (centroid) features of these clusters are indexed using the R*-tree, thus improving efficiency. For similarity retrieval, each representative color in the query image or region is used independently to find regions containing that color. The results of these methods are compared. A Java-based query engine supporting query-by-example is built to retrieve images by color.
Keywords: Content-based, indexing, cluster, region.
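The mode-seeking step of mean shift, which the abstract above uses to find a representative color per region, can be sketched as follows. This is a minimal illustration with a flat kernel, not the authors' implementation; the function name `mean_shift_mode`, the bandwidth value, and the toy RGB pixels are assumptions.

```python
import math

def mean_shift_mode(start, points, bandwidth, max_iter=100, tol=1e-6):
    """Move start uphill to a density mode using a flat kernel.

    Each step replaces the current estimate by the mean of all points
    that lie within bandwidth of it.
    """
    x = start
    for _ in range(max_iter):
        neighbours = [p for p in points if math.dist(p, x) <= bandwidth]
        new_x = tuple(sum(c) / len(neighbours) for c in zip(*neighbours))
        if math.dist(new_x, x) < tol:   # converged: the estimate stopped moving
            return new_x
        x = new_x
    return x

# Hypothetical pixels in RGB space: a reddish group and a bluish group.
pixels = [(250, 0, 0), (255, 5, 0), (245, 0, 5), (0, 0, 250), (5, 0, 255)]
red_mode = mean_shift_mode(pixels[0], pixels, bandwidth=30.0)
blue_mode = mean_shift_mode(pixels[3], pixels, bandwidth=30.0)
print(red_mode, blue_mode)
```

Starting the search from each pixel and merging the modes that coincide yields one representative color per cluster, which is the kind of low-dimensional descriptor the abstract indexes.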
Downloads 1811
46 Performance Evaluation of Energy Efficient Communication Protocol for Mobile Ad Hoc Networks
Authors: Toshihiko Sasama, Kentaro Kishida, Kazunori Sugahara, Hiroshi Masuyama
Abstract:
A mobile ad hoc network is a network of mobile nodes without any notion of centralized administration. In such a network, each mobile node behaves not only as a host which runs applications but also as a router to forward packets on behalf of others. Clustering has been applied to routing protocols to achieve efficient communications. A CH network expresses the connected relationship among cluster-heads. This paper discusses the methods for constructing a CH network, and produces the following results: (1) The running costs required by three traditional methods for constructing a CH network do not differ much from each other in either the static or the dynamic circumstance. Their running costs in the static circumstance do not differ from their costs in the dynamic circumstance. Meanwhile, although the routing costs required by these three methods do not differ much in the static circumstance, they differ considerably from each other in the dynamic circumstance. Their routing costs in the static circumstance are also very different from their costs in the dynamic circumstance, the former being one tenth of the latter. The routing cost in the dynamic circumstance is mostly the cost of re-routing. (2) On the strength of the above results, we discuss whether two new methods are tolerable in the dynamic circumstance, that is, whether their number of re-routings is small. These new methods are revisions of the traditional methods. We recommend the method which produces the smallest routing cost in the dynamic circumstance and therefore the smallest total cost.
Keywords: cluster, mobile ad hoc network, re-routing cost, simulation
Downloads 1350
45 Discovery of Human HMG-Coa Reductase Inhibitors Using Structure-Based Pharmacophore Modeling Combined with Molecular Dynamics Simulation Methodologies
Authors: Minky Son, Chanin Park, Ayoung Baek, Shalini John, Keun Woo Lee
Abstract:
3-hydroxy-3-methylglutaryl coenzyme A reductase (HMGR) catalyzes the conversion of HMG-CoA to mevalonate using NADPH, and the enzyme is involved in the rate-controlling step of the mevalonate pathway. Inhibition of HMGR is considered an effective way to lower cholesterol levels, so it is a drug target for treating hypercholesterolemia, a major risk factor for cardiovascular disease. To discover novel HMGR inhibitors, we performed structure-based pharmacophore modeling combined with molecular dynamics (MD) simulation. Four HMGR inhibitors were used for MD simulation, and a representative structure of each simulation was selected by clustering analysis. Four structure-based pharmacophore models were generated using the representative structures. The generated models were validated and used in virtual screening to find novel scaffolds for inhibiting HMGR. The screened compounds were filtered by applying drug-like properties and used in molecular docking. Finally, four hit compounds were obtained, and these complexes were refined using energy minimization. These compounds might be potential leads for designing novel HMGR inhibitors.
Keywords: Anti-hypercholesterolemia drug, HMGR inhibitor, Molecular dynamics simulation, Structure-based pharmacophore modeling.
Downloads 1948
44 A Novel Nucleus-Based Classifier for Discrimination of Osteoclasts and Mesenchymal Precursor Cells in Mouse Bone Marrow Cultures
Authors: Andreas Heindl, Alexander K. Seewald, Martin Schepelmann, Radu Rogojanu, Giovanna Bises, Theresia Thalhammer, Isabella Ellinger
Abstract:
Bone remodeling occurs by the balanced action of bone-resorbing osteoclasts (OC) and bone-building osteoblasts. Increased bone resorption by excessive OC activity contributes to malignant and non-malignant diseases including osteoporosis. To study OC differentiation and function, OC formed in in vitro cultures are currently counted manually, a tedious procedure which is prone to inter-observer differences. Aiming for an automated OC-quantification system, classification of OC and precursor cells was done on fluorescence microscope images based on the distinct appearance of fluorescent nuclei. Following ellipse fitting to nuclei, a combination of eight features enabled clustering of OC and precursor cell nuclei. After evaluating different machine-learning techniques, LOGREG achieved 74% correctly classified OC and precursor cell nuclei, outperforming human experts (best expert: 55%). In combination with the automated detection of total cell areas, this system makes it possible to measure various cell parameters and, most importantly, to quantify proteins involved in osteoclastogenesis.
Keywords: osteoclasts, machine learning, ellipse fitting.
Downloads 1913
43 A General Framework for Knowledge Discovery Using High Performance Machine Learning Algorithms
Authors: S. Nandagopalan, N. Pradeep
Abstract:
The aim of this paper is to propose a general framework for storing, analyzing, and extracting knowledge from two-dimensional echocardiographic images, color Doppler images, non-medical images, and general data sets. A number of high performance data mining algorithms have been used to carry out this task. Our framework encompasses four layers, namely physical storage, object identification, knowledge discovery, and user level. Techniques such as an active contour model to identify the cardiac chambers, pixel classification to segment the color Doppler echo image, a universal model for image retrieval, a Bayesian method for classification, and parallel algorithms for image segmentation were employed. Using the feature vector database that has been efficiently constructed, one can perform various data mining tasks such as clustering and classification with efficient algorithms, along with image mining given a query image. All these facilities are included in the framework, which is supported by a state-of-the-art user interface (UI). The algorithms were tested with actual patient data and the Coral image database, and the results show that their performance is better than previously reported results.
Keywords: Active Contour, Bayesian, Echocardiographic image, Feature vector.
Downloads 1713
42 Dominating Set Algorithm and Trust Evaluation Scheme for Secured Cluster Formation and Data Transferring
Authors: Y. Harold Robinson, M. Rajaram, E. Golden Julie, S. Balaji
Abstract:
This paper describes an efficient way of choosing the cluster head based on a dominating set algorithm in a wireless sensor network (WSN). The algorithm overcomes energy deterioration problems through this selection process of cluster heads. Clustering algorithms such as LEACH, EEHC and HEED enhance scalability in WSNs. The dominating set algorithm keeps the first node alive longer than the protocols previously used. As the dominating set of cluster heads is directly connected to each node, the energy of the network is saved by eliminating the intermediate nodes in the WSN. Security and trust are pivotal in network messaging. Each cluster head is secured with a unique key. A member can connect with the cluster head if and only if it is secured too. The secured trust model provides security for data transmission in the dominated set network with the group key. The concept can be extended by adding a mobile sink for each cluster, or for a number of clusters, to transmit data or messages between cluster heads and to the base station. Data security is preferably high and data loss can be prevented. The simulation demonstrates the concept of choosing cluster heads by the dominating set algorithm and trust evaluation using DSTE. The research done is rationalized.
Keywords: Wireless Sensor Networks, LEACH, EEHC, HEED, DSTE.
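The abstract above does not spell out its dominating set algorithm, so the following is only a generic greedy approximation, shown to make the idea concrete: every sensor node either is a cluster head or is directly connected to one. The function name `greedy_dominating_set` and the five-node path topology are hypothetical.

```python
def greedy_dominating_set(adjacency):
    """Pick cluster heads so that every node is a head or adjacent to one.

    adjacency maps each node to the set of its direct neighbours.
    At each step the node covering the most still-uncovered nodes is chosen;
    ties go to the smallest node identifier.
    """
    uncovered = set(adjacency)
    heads = []
    while uncovered:
        best = max(
            sorted(adjacency),
            key=lambda node: len(({node} | adjacency[node]) & uncovered),
        )
        heads.append(best)
        uncovered -= {best} | adjacency[best]
    return heads

# Hypothetical sensor topology: five nodes in a line, 1 - 2 - 3 - 4 - 5.
topology = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
cluster_heads = greedy_dominating_set(topology)
print(cluster_heads)
```

Because each member is one hop from a head, no intermediate relay nodes are needed inside a cluster, which is the energy argument the abstract makes.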
Downloads 1405
41 Validation and Selection between Machine Learning Technique and Traditional Methods to Reduce Bullwhip Effects: a Data Mining Approach
Authors: Hamid R. S. Mojaveri, Seyed S. Mousavi, Mojtaba Heydar, Ahmad Aminian
Abstract:
The aim of this paper is to present a three-step methodology for forecasting supply chain demand. In the first step, various data mining techniques are applied to prepare the data for the forecasting models. In the second step, the modeling step, an artificial neural network and a support vector machine are presented after defining the Mean Absolute Percentage Error index for measuring error. The structure of the artificial neural network is selected based on previous researchers' results, and in this article the accuracy of the network is increased by using sensitivity analysis. The best forecast of the classical forecasting methods (Moving Average, Exponential Smoothing, and Exponential Smoothing with Trend) is obtained from the prepared data and compared with the results of the support vector machine and the proposed artificial neural network. The results show that the artificial neural network forecasts more precisely than the other methods. Finally, the stability of the forecasting methods is analyzed using raw data, and the effectiveness of clustering analysis is also measured.
Keywords: Artificial Neural Networks (ANN), bullwhip effect, demand forecasting, Support Vector Machine (SVM).
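The error index and two of the classical baselines named in the abstract above can be sketched in a few lines. This is an illustrative sketch, not the authors' code; the function names, the window of 2, the smoothing factor of 0.5, and the demand series are all hypothetical.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

def moving_average_forecast(series, window):
    """One-step-ahead forecasts for periods window .. len(series) - 1."""
    return [sum(series[t - window:t]) / window for t in range(window, len(series))]

def exponential_smoothing_forecast(series, alpha):
    """One-step-ahead forecasts for periods 1 .. len(series) - 1.

    The first forecast is initialised with the first observation (an assumption).
    """
    forecasts = [series[0]]
    for t in range(1, len(series) - 1):
        forecasts.append(alpha * series[t] + (1 - alpha) * forecasts[-1])
    return forecasts

# Hypothetical demand series.
demand = [100.0, 110.0, 120.0, 130.0]
ma = moving_average_forecast(demand, window=2)
es = exponential_smoothing_forecast(demand, alpha=0.5)
print(mape(demand[2:], ma), mape(demand[1:], es))
```

Computing the same index for every candidate model on the same held-out periods is what allows the kind of comparison the abstract reports.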
Downloads 2010
40 Diversity Analysis of a Quinoa (Chenopodium quinoa Willd.) Germplasm during Two Seasons
Authors: M. Mhada, E. N. Jellen, S. E. Jacobsen, O. Benlhabib
Abstract:
The present work was carried out to evaluate the diversity of a collection of 78 quinoa accessions developed through recurrent selection from Andean germplasm introduced to Morocco in the winter of 2000. Twenty-three quantitative and qualitative characters were used to evaluate genetic diversity and the relationship between the accessions, and also to establish a core collection in Morocco. Important variation was found among the accessions in terms of plant morphology and growth behavior. Data analysis showed a positive correlation of plant height and of plant fresh and dry weight with grain yield, while days to flowering was negatively correlated with grain yield. The first four PCs contributed 74.76% of the variability; the first PC showed significant variation with 42.86% of the total variation, PC2 with 15.37%, PC3 with 9.05% and PC4 with 7.49%. Plant size, days to grain filling and days to maturity are correlated with PC1; seed size, inflorescence density and mildew resistance are correlated with PC2. Hierarchical cluster analysis arranged the 78 quinoa accessions into four main groups and ten sub-clusters. Clustering was found to be associated with days to maturity and also with plant size and seed-size traits.
Keywords: Character association, Chenopodium quinoa, Diversity analysis, Morphotypic cluster, Multivariate analysis.
Downloads 2586
39 Analytical Authentication of Butter Using Fourier Transform Infrared Spectroscopy Coupled with Chemometrics
Authors: M. Bodner, M. Scampicchio
Abstract:
Fourier Transform Infrared (FT-IR) spectroscopy coupled with chemometrics was used to distinguish between butter samples and non-butter samples. Further, quantification of the content of margarine in adulterated butter samples was investigated. The fingerprinting region (1400-800 cm⁻¹) was used to develop unsupervised pattern recognition (Principal Component Analysis, PCA), supervised modeling (Soft Independent Modelling by Class Analogy, SIMCA), classification (Partial Least Squares Discriminant Analysis, PLS-DA) and regression (Partial Least Squares Regression, PLS-R) models. PCA of the fingerprinting region shows a clustering of the two sample types. All samples were classified in their rightful class by the SIMCA approach; however, nine adulterated samples (between 1% and 30% w/w of margarine) were classified as belonging to both the butter class and the non-butter class. The two-class PLS-DA model (R² = 0.73; Root Mean Square Error of Prediction, RMSEP = 0.26% w/w) had a sensitivity of 71.4% and a Positive Predictive Value (PPV) of 100%. Its threshold was calculated at 7% w/w of margarine in adulterated butter samples. Finally, a PLS-R model (R² = 0.84, RMSEP = 16.54%) was developed. PLS-DA was a suitable classification tool and PLS-R a proper quantification approach. The results demonstrate that FT-IR spectroscopy combined with PLS-R can be used as a rapid, simple and safe method to distinguish pure butter samples from adulterated ones and to determine the grade of adulteration of margarine in butter samples.
Keywords: Adulterated butter, margarine, PCA, PLS-DA, PLS-R, SIMCA.
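For readers unfamiliar with the two classification metrics quoted in the abstract above, the sketch below shows how sensitivity and Positive Predictive Value are computed from a confusion matrix. The counts are hypothetical: the abstract does not report them, and they were chosen only because they reproduce the rounded percentages.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual positives that the model detects."""
    return true_pos / (true_pos + false_neg)

def positive_predictive_value(true_pos, false_pos):
    """Fraction of predicted positives that are actually positive."""
    return true_pos / (true_pos + false_pos)

# Hypothetical counts, not taken from the paper.
tp, fn, fp = 5, 2, 0
print(round(100 * sensitivity(tp, fn), 1))
print(round(100 * positive_predictive_value(tp, fp), 1))
```

A PPV of 100% means the model raised no false alarms, while a sensitivity below 100% means some positive samples were missed.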
Downloads 781
38 Using the Combined Model of PROMETHEE and Fuzzy Analytic Network Process for Determining Question Weights in Scientific Exams through Data Mining Approach
Authors: Hassan Haleh, Amin Ghaffari, Parisa Farahpour
Abstract:
The need for an appropriate system for evaluating students' educational development is a key problem in achieving predefined educational goals. The large number of related papers in recent years that try to prove or disprove the necessity and adequacy of student assessment corroborates this. Some of these studies tried to increase the precision of determining question weights in scientific examinations. All of them, however, attempt to adjust the initial question weights while the accuracy and precision of those initial weights remain in question. Thus, in order to increase the precision of assessing students' educational development, the present study proposes a new method for determining the initial question weights by considering question factors such as difficulty, importance and complexity, and by implementing a combined method of PROMETHEE and fuzzy analytic network process, using a data mining approach to improve the model's inputs. The result of the implemented case study demonstrates the improved performance and precision of the proposed model.
Keywords: Assessing students, Analytic network process, Clustering, Data mining, Fuzzy sets, Multi-criteria decision making, and Preference function.
Downloads 1581
37 BIDENS: Iterative Density Based Biclustering Algorithm With Application to Gene Expression Analysis
Authors: Mohamed A. Mahfouz, M. A. Ismail
Abstract:
Biclustering is a very useful data mining technique for identifying patterns where different genes are co-related based on a subset of conditions in gene expression analysis. Association rule mining is an efficient approach to biclustering, as in the BIMODULE algorithm, but it is sensitive to the values given to its input parameters and to the discretization procedure used in the preprocessing step. In addition, when noise is present, classical association rule miners discover multiple small fragments of the true bicluster but miss the true bicluster itself. This paper formally presents a generalized noise-tolerant bicluster model, termed μBicluster. An iterative algorithm based on the proposed model, termed BIDENS, is introduced that can discover a set of k possibly overlapping biclusters simultaneously. Our model uses a more flexible method to partition the dimensions in order to preserve meaningful and significant biclusters. The proposed algorithm can discover biclusters that are hard for BIMODULE to discover. An experimental study on yeast and human gene expression data and several artificial datasets shows that our algorithm offers substantial improvements over several previously proposed biclustering algorithms.
Keywords: Machine learning, biclustering, bi-dimensional clustering, gene expression analysis, data mining.
Downloads 1963
36 Fuzzy Control of the Air Conditioning System at Different Operating Pressures
Authors: Mohanad Alata, Moh'd Al-Nimr, Rami Al-Jarrah
Abstract:
The present work demonstrates the design and simulation of a fuzzy controller for an air conditioning system at different pressures. A first-order Sugeno fuzzy inference system is used to model the system and create the controller. In addition, the heat transfer rate and the mass flow rate of water injected into or withdrawn from the air conditioning system are estimated by the fuzzy IF-THEN rules. The approach starts by generating the input/output data. Then, the subtractive clustering algorithm together with least squares estimation (LSE) generates the fuzzy rules that describe the relationship between the input and output data. The fuzzy rules are tuned by an Adaptive Neuro-Fuzzy Inference System (ANFIS). The results show that as the pressure increases, the water flow rate and heat transfer rate decrease within the lower ranges of inlet dry bulb temperature. On the other hand, as the pressure increases, the water flow rate and heat transfer rate increase within the higher ranges of inlet dry bulb temperature. The inflection in the pressure effect trend occurs at lower temperatures as the inlet air humidity increases.
Keywords: Air Conditioning, ANFIS, Fuzzy Control, Sugeno System.
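A first-order Sugeno system, as used in the abstract above, combines rules whose consequents are linear functions of the input and returns their firing-strength-weighted average. The sketch below shows that inference step for a single input with Gaussian membership functions. The rule parameters are hypothetical, not values from the paper, where they would come from subtractive clustering, LSE and ANFIS tuning.

```python
import math

def gaussian(x, centre, sigma):
    """Gaussian membership degree of x."""
    return math.exp(-0.5 * ((x - centre) / sigma) ** 2)

def sugeno_first_order(x, rules):
    """Evaluate a single-input first-order Sugeno system.

    Each rule is (centre, sigma, a, b), read as:
    IF x is Gaussian(centre, sigma) THEN y = a * x + b.
    The output is the firing-strength-weighted average of the rule outputs.
    """
    weights = [gaussian(x, centre, sigma) for centre, sigma, _, _ in rules]
    outputs = [a * x + b for _, _, a, b in rules]
    return sum(w * y for w, y in zip(weights, outputs)) / sum(weights)

# Hypothetical rule base with two rules.
rules = [(0.0, 1.0, 1.0, 0.0), (2.0, 1.0, 0.0, 5.0)]
y = sugeno_first_order(1.0, rules)
print(y)
```

At x = 1 both rules fire equally, so the output is the plain average of the two consequents, 1 and 5.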
Downloads 3366
35 Dynamic Clustering Estimation of Tool Flank Wear in Turning Process using SVD Models of the Emitted Sound Signals
Authors: A. Samraj, S. Sayeed, J. E. Raja, J. Hossen, A. Rahman
Abstract:
Monitoring tool flank wear without affecting throughput is considered the prudent approach in production technology; the examination has to be done without disturbing the machining process. In this paper we propose a novel method that determines tool flank wear by observing the sound signals emitted during the turning process. The work-piece materials used here are steel and aluminum, and the cutting insert is carbide. Two different cutting speeds were used. The feed rate and the cutting depth were constant, whereas the flank wear was variable. The sound signals emitted during turning by a fresh tool (0 mm flank wear), a slightly worn tool (0.2-0.25 mm flank wear) and a severely worn tool (0.4 mm and above flank wear) were recorded separately using a highly sensitive microphone. Singular Value Decomposition (SVD) was applied to these sound signals to extract the feature sound components. The results showed that an increase in tool flank wear correlates with an increase in the values of the SVD features derived from the sound signals for both materials. Hence it can be concluded that monitoring tool flank wear during turning using SVD features of the emitted sound signal with Fuzzy C-means classification is a promising and relatively simple method.
Keywords: Fuzzy c means, Microphone, Singular Value Decomposition, Tool Flank Wear.
Downloads 1898
34 Hand Gesture Recognition Based on Combined Features Extraction
Authors: Mahmoud Elmezain, Ayoub Al-Hamadi, Bernd Michaelis
Abstract:
Hand gesture recognition is an active area of research in the vision community, mainly for the purposes of sign language recognition and Human Computer Interaction. In this paper, we propose a system to recognize alphabet characters (A-Z) and numbers (0-9) in real time from stereo color image sequences using Hidden Markov Models (HMMs). Our system is based on three main stages: automatic segmentation and preprocessing of the hand regions, feature extraction, and classification. In the automatic segmentation and preprocessing stage, color and a 3D depth map are used to detect the hands, and the hand trajectory is then obtained in a further step using the Mean-shift algorithm and a Kalman filter. In the feature extraction stage, 3D combined features of location, orientation and velocity with respect to the Cartesian system are used. k-means clustering is then employed to produce the HMM codewords. In the final stage, classification, the Baum-Welch algorithm is used to fully train the HMM parameters. The gestures for alphabet characters and numbers are recognized using a Left-Right Banded model in conjunction with the Viterbi algorithm. Experimental results demonstrate that our system can successfully recognize hand gestures with a 98.33% recognition rate.
Keywords: Gesture Recognition, Computer Vision & Image Processing, Pattern Recognition.
Downloads 4032
33 Discovering Complex Regularities: from Tree to Semi-Lattice Classifications
Authors: A. Faro, D. Giordano, F. Maiorana
Abstract:
Data mining uses a variety of techniques, each of which is useful for some particular task. It is important to have a deep understanding of each technique and to be able to perform sophisticated analysis. In this article we describe a tool built to simulate a variation of the Kohonen network to perform unsupervised clustering and to support the entire data mining process up to the visualization of results. A graphical representation helps the user to find a strategy to optimize the classification by adding, moving or deleting a neuron in order to change the number of classes. The tool is able to automatically suggest a strategy to optimize the number of classes, and it also supports both tree classifications and semi-lattice organizations of the classes, giving users the possibility of passing from one class to the ones with which it has some aspects in common. Examples of using tree and semi-lattice classifications are given to illustrate advantages and problems. The tool is applied to classify macroeconomic data that report the imports and exports of the most developed countries. It is possible to classify the countries based on their economic behaviour and to use the tool to characterize the commercial behaviour of a country in a selected class by analysing the positive and negative features that contribute to class formation. Possible interrelationships between the classes and their meaning are also discussed.
Keywords: Unsupervised classification, Kohonen networks, macroeconomics, Visual data mining, Cluster interpretation.
Downloads 1542
32 Automatic Segmentation of Dermoscopy Images Using Histogram Thresholding on Optimal Color Channels
Authors: Rahil Garnavi, Mohammad Aldeen, M. Emre Celebi, Alauddin Bhuiyan, Constantinos Dolianitis, George Varigos
Abstract:
Automatic segmentation of skin lesions is the first step towards the development of a computer-aided diagnosis of melanoma. Although numerous segmentation methods have been developed, few studies have focused on determining the most discriminative and effective color space for melanoma applications. This paper proposes a novel automatic segmentation algorithm using color space analysis and clustering-based histogram thresholding, which is able to determine the optimal color channel for segmentation of skin lesions. To demonstrate the validity of the algorithm, it is tested on a set of 30 high-resolution dermoscopy images, and a comprehensive evaluation of the results is provided in which borders manually drawn by four dermatologists are compared to automated borders detected by the proposed algorithm. The evaluation is carried out by applying three previously used metrics (accuracy, sensitivity, and specificity) and a new metric of similarity. Through ROC analysis and ranking of the metrics, it is shown that the best results are obtained with the X and XoYoR color channels, which result in an accuracy of approximately 97%. The proposed method is also compared with two state-of-the-art skin lesion segmentation methods, which demonstrates the effectiveness and superiority of the proposed segmentation method.
Keywords: Border detection, Color space analysis, Dermoscopy, Histogram thresholding, Melanoma, Segmentation.
Downloads 2085
31 A Two-Stage Expert System for Diagnosis of Leukemia Based on Type-2 Fuzzy Logic
Authors: Ali Akbar Sadat Asl
Abstract:
Diagnosing diseases and making decisions about them in the medical field involves innate uncertainty, which can affect the whole process of treatment. The decision is based on expert knowledge and on the way in which an expert interprets the patient's condition, and different experts may interpret the same condition differently. Fuzzy logic can provide mathematical models for many concepts, variables, and systems that are unclear and ambiguous, and it can also provide a framework for reasoning, inference, control, and decision making under uncertainty. In systems with high uncertainty and high complexity, fuzzy logic is a suitable modeling method. In this paper, we use type-2 fuzzy logic to model the uncertainty in the diagnosis of leukemia. The proposed system uses an indirect-direct approach and consists of two stages. In the first stage, the state of the blood test is inferred; here we use an indirect approach in which the rules are extracted automatically by a clustering method. In the second stage, the signs of leukemia, the duration of the disease until its progress, and the output of the first stage are combined to obtain the final diagnosis of the system; here the system uses a direct approach and the final diagnosis is determined by the expert. The results show that the type-2 fuzzy expert system can diagnose leukemia with an average accuracy of about 97%.
Keywords: Expert system, leukemia, medical diagnosis, type-2 fuzzy logic.
Downloads 1053
30 Surrogate based Evolutionary Algorithm for Design Optimization
Authors: Maumita Bhattacharya
Abstract:
Optimization is often a critical issue in system design problems. Evolutionary Algorithms are population-based, stochastic search techniques, widely used as efficient global optimizers. However, finding optimal solutions to complex high-dimensional, multimodal problems often requires computationally expensive function evaluations and is hence practically prohibitive. The Dynamic Approximate Fitness based Hybrid EA (DAFHEA) model presented in our earlier work [14] reduced computation time by the controlled use of meta-models to partially replace actual function evaluations with approximate ones. However, the underlying assumption in DAFHEA is that the training samples for the meta-model are generated from a single uniform model. Situations such as model formation involving variable input dimensions and noisy data cannot be covered by this assumption. In this paper we present an enhanced version of DAFHEA that incorporates a multiple-model based learning approach for the SVM approximator. DAFHEA-II (the enhanced version of the DAFHEA framework) also overcomes the high computational expense of the additional clustering requirements of the original DAFHEA framework. The proposed framework has been tested on several benchmark functions, and the empirical results illustrate the advantages of the proposed technique.
Keywords: Evolutionary algorithm, Fitness function, Optimization, Meta-model, Stochastic method.
Downloads 1576
29 An Approach for Vocal Register Recognition Based on Spectral Analysis of Singing
Authors: Aleksandra Zysk, Pawel Badura
Abstract:
Recognizing and controlling vocal registers during singing is a difficult task for beginner vocalists. It requires, among other things, identifying which part of the natural resonators is being used when a sound propagates through the body. Thus, an application has been designed that allows for sound recording and automatic vocal register recognition (VRR), with a graphical user interface providing real-time visualization of the signal and the recognition results. Six spectral features are determined for each time frame and passed to a support vector machine classifier, which yields a binary decision on the head or chest register assignment of the segment. The classification training and testing data were recorded by ten professional female singers (soprano, aged 19-29) performing sounds in both the chest and head registers. The classification accuracy exceeded 93% in each of the various validation schemes. Apart from a hard two-class clustering, the support vector classifier also returns information on the distance between a particular feature vector and the discrimination hyperplane in the feature space. Such information reflects the level of certainty of the vocal register classification in a fuzzy way. Thus, the designed recognition and training application is able to assess and visualize the continuous trend in singing in a user-friendly graphical mode, providing an easy way to control the vocal emission.
Keywords: Classification, singing, spectral analysis, vocal emission, vocal register.
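The idea of reading a confidence level from the distance to the discrimination hyperplane can be sketched as below. This is an illustration under stated assumptions, not the authors' code: a linear decision function with hypothetical weights `w` and bias `b` is assumed (the abstract does not name the kernel), the mapping of the positive side to "head" is arbitrary, and the logistic squashing of the distance is just one possible way to express certainty.

```python
import math


def signed_distance(w, b, x):
    """Signed Euclidean distance from feature vector x to the hyperplane w.x + b = 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return score / math.sqrt(sum(wi * wi for wi in w))


def register_with_confidence(w, b, x, scale=1.0):
    """Return (label, distance, confidence) for one time frame.

    Assumptions: the positive side is labelled "head", and confidence is a
    logistic function of |distance| (0.5 on the hyperplane, approaching 1 far away).
    """
    d = signed_distance(w, b, x)
    label = "head" if d >= 0 else "chest"
    confidence = 1.0 / (1.0 + math.exp(-abs(d) / scale))
    return label, d, confidence


# Hypothetical two-feature example.
label, distance, confidence = register_with_confidence((3.0, 4.0), -5.0, (3.0, 4.0))
```

Frames close to the hyperplane get a confidence near 0.5, which is the kind of graded feedback a training application could visualize.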
Downloads 1313
28 Automatic Detection of Breast Tumors in Sonoelastographic Images Using DWT
Authors: A. Sindhuja, V. Sadasivam
Abstract:
Breast cancer is the most common malignancy in women and the second leading cause of death for women all over the world. The earlier the cancer is detected, the better the treatment. The diagnosis and treatment of the cancer rely on the segmentation of sonoelastographic images. Texture features have not previously been considered for sonoelastographic segmentation. Sonoelastographic images of 15 patients containing both benign and malignant tumors are considered for experimentation. The images are enhanced to remove noise in order to improve contrast and emphasize the tumor boundary. Each image is then decomposed into sub-bands using single-level Daubechies wavelets varying from one coefficient to six coefficients. Grey Level Co-occurrence Matrix (GLCM) and Local Binary Pattern (LBP) features are extracted from each sub-band and then selected by ranking them using the Sequential Floating Forward Selection (SFFS) technique. The resultant images undergo K-Means clustering and then a few post-processing steps to remove false spots. The tumor boundary is detected from the segmented image. It is proposed that the LBP from the vertical coefficients of the Daubechies wavelet with two coefficients is best suited for segmentation of sonoelastographic breast images among the wavelet members using one to six coefficients for decomposition. The results are also quantified with the help of an expert radiologist. The proposed work can be used in a further diagnostic process to decide whether the segmented tumor is benign or malignant.
Keywords: Breast Cancer, Segmentation, Sonoelastography, Tumor Detection.
Downloads 2207
27 An AI-Based Dynamical Resource Allocation Calculation Algorithm for Unmanned Aerial Vehicle
Authors: Zhou Luchen, Wu Yubing, Burra Venkata Durga Kumar
Abstract:
As networks become larger and more complex than before, the density of user devices is also increasing. Unmanned Aerial Vehicle (UAV) networks are able to collect and transform data efficiently by using software-defined network (SDN) technology. This paper proposes a three-layer distributed and dynamic cluster architecture that manages UAVs with an AI-based resource allocation calculation algorithm to address the network overloading problem. By separating the services of each UAV, the UAV hierarchical cluster system performs the main function of reducing the network load and transferring user requests, with three sub-tasks: data collection, communication channel organization, and data relaying. In this cluster, a head node and a vice head node UAV are selected considering the CPU, RAM, and ROM of the devices, as well as battery charge and capacity. The vice head node acts as a backup that stores all the data in the head node. The k-means clustering algorithm is used to detect high-load regions and form the UAV layered clusters. The whole process of detecting high-load areas, forming and selecting UAV clusters, and moving the selected UAV cluster to that area is proposed as an offloading traffic algorithm.
Keywords: k-means, resource allocation, SDN, UAV network, unmanned aerial vehicles.
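The use of k-means to detect high-load regions, as described in the abstract above, can be sketched as follows. This is a minimal sketch under assumptions that are not in the source: user requests are represented as 2D positions, a region counts as "high load" when its cluster holds at least `threshold` requests, and a deterministic farthest-point initialisation is used for reproducibility. The function names are hypothetical.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with farthest-point initialisation (an assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def high_load_regions(points, k, threshold):
    """Return (center, request_count) for clusters with at least `threshold` requests."""
    centers, labels = kmeans(points, k)
    counts = [labels.count(j) for j in range(k)]
    return [(centers[j], counts[j]) for j in range(k) if counts[j] >= threshold]


requests = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (11, 10)]
hot = high_load_regions(requests, k=2, threshold=3)
```

The returned centers would be natural target positions for a UAV cluster sent to relieve the load; how clusters are selected and moved is not covered by this sketch.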
Downloads 351
26 Non-Overlapping Hierarchical Index Structure for Similarity Search
Authors: Mounira Taileb, Sid Lamrous, Sami Touati
Abstract:
In order to accelerate similarity search in high-dimensional databases, we propose a new hierarchical indexing method. It is composed of an offline and an online phase, and our contribution concerns both. In the offline phase, after gathering all of the data into clusters and constructing a hierarchical index, the main originality of our contribution is a method for constructing bounding forms of clusters that avoids overlapping. For the online phase, our idea considerably improves the performance of similarity search, and we have also developed an adapted search algorithm for this second phase. Our method, named NOHIS (Non-Overlapping Hierarchical Index Structure), uses Principal Direction Divisive Partitioning (PDDP) as its clustering algorithm. The principle of PDDP is to divide the data recursively into two sub-clusters; the division is done using the hyperplane that is orthogonal to the principal direction derived from the covariance matrix and that passes through the centroid of the cluster to be divided. The data of each of the two resulting sub-clusters are enclosed in a minimum bounding rectangle (MBR). The two MBRs are oriented along the principal direction. Consequently, the two forms are guaranteed not to overlap. The experiments use databases containing image descriptors. The results show that the proposed method outperforms sequential scan and SR-tree in processing k-nearest neighbor queries.
Keywords: K-nearest neighbour search, multi-dimensional indexing, multimedia databases, similarity search.
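A single PDDP division step, as described in the abstract above, can be sketched as follows. This is an illustrative sketch rather than the NOHIS implementation: the principal direction is approximated by power iteration on the centered data (an assumption chosen to keep the example dependency-free), the function names are hypothetical, and the construction of the oriented MBRs is not shown.

```python
import math


def principal_direction(centered, iters=200):
    """Approximate the leading eigenvector of the covariance matrix by power iteration.

    Limitation of this sketch: the fixed start vector could, in rare cases,
    be orthogonal to the principal direction.
    """
    d = len(centered[0])
    v = [float(k + 1) for k in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        # w = X^T (X v), which is proportional to (covariance matrix) * v.
        proj = [sum(row[k] * v[k] for k in range(d)) for row in centered]
        w = [sum(proj[i] * centered[i][k] for i in range(len(centered)))
             for k in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def pddp_split(points):
    """Split points with the hyperplane through the centroid, orthogonal to the principal direction."""
    n, d = len(points), len(points[0])
    centroid = [sum(p[k] for p in points) / n for k in range(d)]
    centered = [[p[k] - centroid[k] for k in range(d)] for p in points]
    direction = principal_direction(centered)
    left, right = [], []
    for p, c in zip(points, centered):
        # The sign of the projection tells on which side of the hyperplane p lies.
        side = sum(c[k] * direction[k] for k in range(d))
        (left if side <= 0 else right).append(p)
    return left, right


descriptors = [(0.0, 0.0), (1.0, 0.1), (0.0, 0.2),
               (10.0, 0.0), (11.0, 0.1), (10.0, 0.2)]
left, right = pddp_split(descriptors)
```

Applying `pddp_split` recursively to `left` and `right` yields the binary hierarchy; because each split is a hyperplane cut, boxes aligned with the principal direction on either side cannot overlap.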
Downloads 1562