Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 14

Search results for: Microarrays

14 Predication Model for Leukemia Diseases Based on Data Mining Classification Algorithms with Best Accuracy

Authors: Fahd Sabry Esmail, M. Badr Senousy, Mohamed Ragaie

Abstract:

In recent years, there has been an explosion in the rate of using technology that help discovering the diseases. For example, DNA microarrays allow us for the first time to obtain a "global" view of the cell. It has great potential to provide accurate medical diagnosis, to help in finding the right treatment and cure for many diseases. Various classification algorithms can be applied on such micro-array datasets to devise methods that can predict the occurrence of Leukemia disease. In this study, we compared the classification accuracy and response time among eleven decision tree methods and six rule classifier methods using five performance criteria. The experiment results show that the performance of Random Tree is producing better result. Also it takes lowest time to build model in tree classifier. The classification rules algorithms such as nearest- neighbor-like algorithm (NNge) is the best algorithm due to the high accuracy and it takes lowest time to build model in classification.

Keywords: Data Mining, microarray data, Decision Tree, classification techniques, classification rule, leukemia diseases

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1921
13 A Hybrid Gene Selection Technique Using Improved Mutual Information and Fisher Score for Cancer Classification Using Microarrays

Authors: M. Anidha, K. Premalatha

Abstract:

Feature Selection is significant in order to perform constructive classification in the area of cancer diagnosis. However, a large number of features compared to the number of samples makes the task of classification computationally very hard and prone to errors in microarray gene expression datasets. In this paper, we present an innovative method for selecting highly informative gene subsets of gene expression data that effectively classifies the cancer data into tumorous and non-tumorous. The hybrid gene selection technique comprises of combined Mutual Information and Fisher score to select informative genes. The gene selection is validated by classification using Support Vector Machine (SVM) which is a supervised learning algorithm capable of solving complex classification problems. The results obtained from improved Mutual Information and F-Score with SVM as a classifier has produced efficient results.

Keywords: classification, Gene Selection, SVM, mutual information, Fisher score

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 755
12 A Cuckoo Search with Differential Evolution for Clustering Microarray Gene Expression Data

Authors: M. Pandi, K. Premalatha

Abstract:

A DNA microarray technology is a collection of microscopic DNA spots attached to a solid surface. Scientists use DNA microarrays to measure the expression levels of large numbers of genes simultaneously or to genotype multiple regions of a genome. Elucidating the patterns hidden in gene expression data offers a tremendous opportunity for an enhanced understanding of functional genomics. However, the large number of genes and the complexity of biological networks greatly increase the challenges of comprehending and interpreting the resulting mass of data, which often consists of millions of measurements. It is handled by clustering which reveals the natural structures and identifying the interesting patterns in the underlying data. In this paper, gene based clustering in gene expression data is proposed using Cuckoo Search with Differential Evolution (CS-DE). The experiment results are analyzed with gene expression benchmark datasets. The results show that CS-DE outperforms CS in benchmark datasets. To find the validation of the clustering results, this work is tested with one internal and one external cluster validation indexes.

Keywords: Genomics, Clustering, Dna, Differential Evolution, Microarray, Gene Expression Data, cuckoo search

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 957
11 Gene Selection Guided by Feature Interdependence

Authors: Hung-Ming Lai, Andreas Albrecht, Kathleen Steinhöfel

Abstract:

Cancers could normally be marked by a number of differentially expressed genes which show enormous potential as biomarkers for a certain disease. Recent years, cancer classification based on the investigation of gene expression profiles derived by high-throughput microarrays has widely been used. The selection of discriminative genes is, therefore, an essential preprocess step in carcinogenesis studies. In this paper, we have proposed a novel gene selector using information-theoretic measures for biological discovery. This multivariate filter is a four-stage framework through the analyses of feature relevance, feature interdependence, feature redundancy-dependence and subset rankings, and having been examined on the colon cancer data set. Our experimental result show that the proposed method outperformed other information theorem based filters in all aspect of classification errors and classification performance.

Keywords: Colon Cancer, Microarray Data Analysis, Gene Selection, feature interdependence, feature subset selection

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1706
10 Simultaneous Clustering and Feature Selection Method for Gene Expression Data

Authors: T. Chandrasekhar, K. Thangavel, E. N. Sathishkumar

Abstract:

Microarrays are made it possible to simultaneously monitor the expression profiles of thousands of genes under various experimental conditions. It is used to identify the co-expressed genes in specific cells or tissues that are actively used to make proteins. This method is used to analysis the gene expression, an important task in bioinformatics research. Cluster analysis of gene expression data has proved to be a useful tool for identifying co-expressed genes, biologically relevant groupings of genes and samples. In this work K-Means algorithms has been applied for clustering of Gene Expression Data. Further, rough set based Quick reduct algorithm has been applied for each cluster in order to select the most similar genes having high correlation. Then the ACV measure is used to evaluate the refined clusters and classification is used to evaluate the proposed method. They could identify compact clusters with feature selection method used to genes are selected.

Keywords: Clustering, Feature selection, Gene Expression Data, Quick reduct

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1545
9 Gene Expression Signature for Classification of Metastasis Positive and Negative Oral Cancer in Homosapiens

Authors: A. Shukla, A. Tarsauliya, R. Tiwari, S. Sharma

Abstract:

Cancer classification to their corresponding cohorts has been key area of research in bioinformatics aiming better prognosis of the disease. High dimensionality of gene data has been makes it a complex task and requires significance data identification technique in order to reducing the dimensionality and identification of significant information. In this paper, we have proposed a novel approach for classification of oral cancer into metastasis positive and negative patients. We have used significance analysis of microarrays (SAM) for identifying significant genes which constitutes gene signature. 3 different gene signatures were identified using SAM from 3 different combination of training datasets and their classification accuracy was calculated on corresponding testing datasets using k-Nearest Neighbour (kNN), Fuzzy C-Means Clustering (FCM), Support Vector Machine (SVM) and Backpropagation Neural Network (BPNN). A final gene signature of only 9 genes was obtained from above 3 individual gene signatures. 9 gene signature-s classification capability was compared using same classifiers on same testing datasets. Results obtained from experimentation shows that 9 gene signature classified all samples in testing dataset accurately while individual genes could not classify all accurately.

Keywords: Cancer, classification, SAM, Gene Signature

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1701
8 Differentiation of Gene Expression Profiles Data for Liver and Kidney of Pigs

Authors: Khlopova N.S., Glazko V.I., Glazko T.T.

Abstract:

Using DNA microarrays the comparative analysis of a gene expression profiles is carried out in a liver and kidneys of pigs. The hypothesis of a cross hybridization of one probe with different cDNA sites of the same gene or different genes is checked up, and it is shown, that cross hybridization can be a source of essential errors at revealing of a key genes in organ-specific transcriptome. It is reveald that distinctions in profiles of a gene expression are well coordinated with function, morphology, biochemistry and histology of these organs.

Keywords: Microarray, gene expression profiles, key genes

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1169
7 Multidimensional Visualization Tools for Analysis of Expression Data

Authors: Urska Cvek, Marjan Trutschl, Randolph Stone II, Zanobia Syed, John L. Clifford, Anita L. Sabichi

Abstract:

Expression data analysis is based mostly on the statistical approaches that are indispensable for the study of biological systems. Large amounts of multidimensional data resulting from the high-throughput technologies are not completely served by biostatistical techniques and are usually complemented with visual, knowledge discovery and other computational tools. In many cases, in biological systems we only speculate on the processes that are causing the changes, and it is the visual explorative analysis of data during which a hypothesis is formed. We would like to show the usability of multidimensional visualization tools and promote their use in life sciences. We survey and show some of the multidimensional visualization tools in the process of data exploration, such as parallel coordinates and radviz and we extend them by combining them with the self-organizing map algorithm. We use a time course data set of transitional cell carcinoma of the bladder in our examples. Analysis of data with these tools has the potential to uncover additional relationships and non-trivial structures.

Keywords: Visualization, Self-Organizing Maps, microarrays, parallel coordinates, radviz

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2119
6 Systholic Boolean Orthonormalizer Network in Wavelet Domain for Microarray Denoising

Authors: Mario Mastriani

Abstract:

We describe a novel method for removing noise (in wavelet domain) of unknown variance from microarrays. The method is based on the following procedure: We apply 1) Bidimentional Discrete Wavelet Transform (DWT-2D) to the Noisy Microarray, 2) scaling and rounding to the coefficients of the highest subbands (to obtain integer and positive coefficients), 3) bit-slicing to the new highest subbands (to obtain bit-planes), 4) then we apply the Systholic Boolean Orthonormalizer Network (SBON) to the input bit-plane set and we obtain two orthonormal otput bit-plane sets (in a Boolean sense), we project a set on the other one, by means of an AND operation, and then, 5) we apply re-assembling, and, 6) rescaling. Finally, 7) we apply Inverse DWT-2D and reconstruct a microarray from the modified wavelet coefficients. Denoising results compare favorably to the most of methods in use at the moment.

Keywords: microarrays, Wavelets, denoising, Bit-Plane, Boolean Orthonormalization Process

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1178
5 Analysis of DNA Microarray Data using Association Rules: A Selective Study

Authors: M. Anandhavalli Gauthaman

Abstract:

DNA microarrays allow the measurement of expression levels for a large number of genes, perhaps all genes of an organism, within a number of different experimental samples. It is very much important to extract biologically meaningful information from this huge amount of expression data to know the current state of the cell because most cellular processes are regulated by changes in gene expression. Association rule mining techniques are helpful to find association relationship between genes. Numerous association rule mining algorithms have been developed to analyze and associate this huge amount of gene expression data. This paper focuses on some of the popular association rule mining algorithms developed to analyze gene expression data.

Keywords: Gene expression, association rule mining, DNA microarray

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1724
4 An Automatic Gridding and Contour Based Segmentation Approach Applied to DNA Microarray Image Analysis

Authors: Alexandra Oliveros, Miguel Sotaquirá

Abstract:

DNA microarray technology is widely used by geneticists to diagnose or treat diseases through gene expression. This technology is based on the hybridization of a tissue-s DNA sequence into a substrate and the further analysis of the image formed by the thousands of genes in the DNA as green, red or yellow spots. The process of DNA microarray image analysis involves finding the location of the spots and the quantification of the expression level of these. In this paper, a tool to perform DNA microarray image analysis is presented, including a spot addressing method based on the image projections, the spot segmentation through contour based segmentation and the extraction of relevant information due to gene expression.

Keywords: Image Processing, Segmentation, DNA Microarrays, Edge Detection, Contour segmentation, spot addressing

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1022
3 Evaluation of Clustering Based on Preprocessing in Gene Expression Data

Authors: Seo Young Kim, Toshimitsu Hamasaki

Abstract:

Microarrays have become the effective, broadly used tools in biological and medical research to address a wide range of problems, including classification of disease subtypes and tumors. Many statistical methods are available for analyzing and systematizing these complex data into meaningful information, and one of the main goals in analyzing gene expression data is the detection of samples or genes with similar expression patterns. In this paper, we express and compare the performance of several clustering methods based on data preprocessing including strategies of normalization or noise clearness. We also evaluate each of these clustering methods with validation measures for both simulated data and real gene expression data. Consequently, clustering methods which are common used in microarray data analysis are affected by normalization and degree of noise and clearness for datasets.

Keywords: Clustering, Gene expression, Data Preprocessing

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1350
2 Dynamical Analysis of Circadian Gene Expression

Authors: Carla Layana Luis Diambra

Abstract:

Microarrays technique allows the simultaneous measurements of the expression levels of thousands of mRNAs. By mining this data one can identify the dynamics of the gene expression time series. By recourse of principal component analysis, we uncover the circadian rhythmic patterns underlying the gene expression profiles from Cyanobacterium Synechocystis. We applied PCA to reduce the dimensionality of the data set. Examination of the components also provides insight into the underlying factors measured in the experiments. Our results suggest that all rhythmic content of data can be reduced to three main components.

Keywords: Clustering, Gene expression, PCA, circadian rhythms

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1240
1 Microarrays Denoising via Smoothing of Coefficients in Wavelet Domain

Authors: Mario Mastriani, Alberto E. Giraldez

Abstract:

We describe a novel method for removing noise (in wavelet domain) of unknown variance from microarrays. The method is based on a smoothing of the coefficients of the highest subbands. Specifically, we decompose the noisy microarray into wavelet subbands, apply smoothing within each highest subband, and reconstruct a microarray from the modified wavelet coefficients. This process is applied a single time, and exclusively to the first level of decomposition, i.e., in most of the cases, it is not necessary a multirresoltuion analysis. Denoising results compare favorably to the most of methods in use at the moment.

Keywords: microarrays, Wavelets, denoising, thresholding, Directional smoothing, edge preservation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1119