Search results for: gene polymorphism

219 UTMGO: A Tool for Searching a Group of Semantically Related Gene Ontology Terms and Application to Annotation of Anonymous Protein Sequence

Authors: Razib M. Othman, Safaai Deris, Rosli M. Illias

Abstract:

Gene Ontology terms have been actively used to annotate various protein sets. SWISS-PROT, TrEMBL, and InterPro are protein databases that are annotated according to the Gene Ontology terms. However, direct implementation of the Gene Ontology terms for annotation of anonymous protein sequences is not easy, especially for species not commonly represented in biological databases. UTMGO is developed as a tool that allows the user to quickly and easily search for a group of semantically related Gene Ontology terms. The applicability of the UTMGO is demonstrated by applying it to annotation of anonymous protein sequence. The extended UTMGO uses the Gene Ontology terms together with protein sequences associated with the terms to perform the annotation task. GOPET, GOtcha, GoFigure, and JAFA are used to compare the performance of the extended UTMGO.

Keywords: Anonymous protein sequence, Gene Ontology, Protein sequence annotation, Protein sequence alignment

Downloads: 1439

218 Dynamical Analysis of Circadian Gene Expression

Authors: Carla Layana, Luis Diambra

Abstract:

The microarray technique allows the simultaneous measurement of the expression levels of thousands of mRNAs. By mining these data one can identify the dynamics of the gene expression time series. Using principal component analysis (PCA), we uncover the circadian rhythmic patterns underlying the gene expression profiles of the cyanobacterium Synechocystis. We applied PCA to reduce the dimensionality of the data set. Examination of the components also provides insight into the underlying factors measured in the experiments. Our results suggest that all rhythmic content of the data can be reduced to three main components.

Keywords: circadian rhythms, clustering, gene expression, PCA.

Downloads: 1591
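The dimensionality reduction described in the abstract above can be illustrated with a minimal sketch. It assumes a genes × time-points expression matrix and computes PCA through a singular value decomposition; the synthetic 24-hour rhythmic data, the function name `pca`, and all parameter values are illustrative assumptions, not the authors' data or code.

```python
import numpy as np

def pca(expression, n_components=3):
    """PCA via SVD on a (genes x time points) matrix; columns are centered first."""
    centered = expression - expression.mean(axis=0, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)           # fraction of variance per component
    scores = u[:, :n_components] * s[:n_components]
    return scores, vt[:n_components], explained[:n_components]

# Synthetic stand-in data (assumption): 200 genes sampled every 4 h for 48 h,
# each oscillating with a 24 h period, a random phase, and a little noise.
rng = np.random.default_rng(0)
t = np.arange(0, 48, 4)
phases = rng.uniform(0, 2 * np.pi, size=200)
amplitudes = rng.uniform(0.5, 2.0, size=200)
data = amplitudes[:, None] * np.cos(2 * np.pi * t[None, :] / 24 + phases[:, None])
data += 0.05 * rng.normal(size=data.shape)

scores, components, explained = pca(data, n_components=3)
print(explained)
```

In this toy setting almost all variance falls on the first two components, because a sinusoid of fixed period with arbitrary phase is a combination of one sine and one cosine; real expression data would generally need its own inspection of the explained-variance profile.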

217 A Hybrid Gene Selection Technique Using Improved Mutual Information and Fisher Score for Cancer Classification Using Microarrays

Authors: M. Anidha, K. Premalatha

Abstract:

Feature selection is essential for effective classification in cancer diagnosis. However, in microarray gene expression datasets the number of features is large compared to the number of samples, which makes classification computationally hard and error-prone. In this paper, we present an innovative method for selecting highly informative gene subsets from gene expression data that effectively classifies the cancer data into tumorous and non-tumorous. The hybrid gene selection technique combines Mutual Information and Fisher score to select informative genes. The gene selection is validated by classification using a Support Vector Machine (SVM), a supervised learning algorithm capable of solving complex classification problems. The improved Mutual Information and F-Score with SVM as a classifier produced efficient results.

Keywords: Gene selection, mutual information, Fisher score, classification, SVM.

Downloads: 1152
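As a rough illustration of combining the two criteria named in the abstract above, the sketch below scores each gene by a Fisher score and by mutual information with the class label, then merges the two rankings. The abstract does not say how the scores are combined or how mutual information is "improved"; the rank-sum combination, the tertile discretization, and the function names are assumptions made only for this example.

```python
import numpy as np

def fisher_score(X, y):
    """Between-class over within-class variance, computed per gene (column)."""
    overall = X.mean(axis=0)
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        num += len(Xc) * (Xc.mean(axis=0) - overall) ** 2
        den += len(Xc) * Xc.var(axis=0)
    return num / (den + 1e-12)

def mutual_information(x, y, bins=3):
    """MI (in nats) between a quantile-discretized gene and the class label."""
    edges = np.quantile(x, np.linspace(0, 1, bins + 1)[1:-1])
    xd = np.digitize(x, edges)
    mi = 0.0
    for a in np.unique(xd):
        for b in np.unique(y):
            p_ab = np.mean((xd == a) & (y == b))
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (np.mean(xd == a) * np.mean(y == b)))
    return mi

def ranks(values):
    """Rank 0 for the smallest value, n-1 for the largest."""
    return np.argsort(np.argsort(values))

def select_genes(X, y, k):
    """Assumed combination rule: sum of the two rankings, keep the top k genes."""
    f = fisher_score(X, y)
    m = np.array([mutual_information(X[:, j], y) for j in range(X.shape[1])])
    combined = ranks(f) + ranks(m)
    return np.argsort(combined)[::-1][:k]

# Synthetic data (assumption): 40 samples, 50 genes, only genes 0 and 1 informative.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 50))
X[y == 1, :2] += 3.0
top = select_genes(X, y, k=2)
print(sorted(top.tolist()))
```

The selected columns would then be passed to a classifier such as an SVM for validation, as the abstract describes.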

216 Advanced Polymorphic Techniques

Authors: Philippe Beaucamps

Abstract:

Nowadays viruses use polymorphic techniques to mutate their code on each replication, thus evading detection by antivirus software. However, detection by emulation can defeat simple polymorphism; metamorphic techniques are therefore used, which thoroughly change the viral code, even after decryption. We briefly detail this evolution of virus protection techniques against detection and then study the METAPHOR virus, today's most advanced metamorphic virus.

Keywords: Computer virus, Viral mutation, Polymorphism, Metamorphism, MetaPHOR, Virus history, Obfuscation, Viral genetic techniques.

Downloads: 2486

215 Inhibiting Gene for a Late-Heading Gene Responsible for Photoperiod Sensitivity in Rice (Oryza sativa)

Authors: Amol Dahal, Shunsuke Hori, Haruki Nakazawa, Kazumitsu Onishi, Toshio Kawano, Masayuki Murai

Abstract:

Two indica varieties, IR36 and ‘Suweon 258’ (“S”), are middle-heading in southern Japan. 36U, also middle-heading, is an isogenic line of IR36 carrying the Ur1 (Undulate rachis-1) gene. However, late-heading plants segregated in the F2 population from the F1 of S × 36U, and in the following generations. The lateness gene concerned is designated Ex. From the F8 generation, an isogenic pair of early-heading and late-heading lines, denoted “E” (ex/ex) and “L” (Ex/Ex), was developed. Genetic analyses of heading time were conducted using F1s and F2s among L, E, S and 36U. The following inferences were drawn from the experimental results: 1) L harbors Ex, whereas both E and 36U harbor ex; 2) besides Ex, S harbors an inhibitor gene for it, I-Ex, which is a novel finding of the present study; 3) Ex is a dominant allele at the E1 locus.

Keywords: Basic vegetative phase, heading time, lateness gene, photoperiod-sensitive phase.

Downloads: 1301

214 Genetic Polymorphisms and Haplotype Structure of the Organic Cation Transporter 1 Gene in the Zulu Population of South Africa

Authors: N. Hoosain, S. Nene, B. Pearce, C. Jacobs, M. Du Plessis, M. Benjeddou

Abstract:

Organic cation transporter (OCT) 1 could influence an individual’s response to various treatments and increase their susceptibility to diseases. Genotypic and allelic frequencies of nineteen non-synonymous and one intronic Single Nucleotide Polymorphism (SNP) from the OCT1 gene were determined in 101 unrelated healthy Zulu participants, using a SNaPshot® multiplex assay. Minor allele frequencies (MAF) were compared to representative populations of Africa, Asia and Europe, from Ensembl. MAFs for S14F, V519F, rs622342 and P341L were 2.0%, 6.0%, 6.0% and 1.0%, respectively. Sixteen of nineteen investigated non-synonymous SNPs were monomorphic. No study participant harbored variant alleles for S189L, G220V, P283L, G401S, M420V, M440I, G465R, I542V, R61C, R287G, C88S, A306T, A413V, I421F, C436F and V501E. Haplotype CGTCGCCGCGCAAGAGGTGA was most frequently observed (81.23%). Further investigations are encouraged to evaluate potential roles these SNPs could play in the therapeutic efficacy of clinically important drugs and in the development of various diseases in the Zulu population.

Keywords: OCT1, PCR, SNaPshot assay, Zulu population.

Downloads: 2272

213 ACTN3 Genotype Association with Motoric Performance of Roma Children

Authors: J. Bernasovska, I. Boronova, J. Poracova, M. Mydlarova Blascakova, V. Szabadosova, P. Ruzbarsky, E. Petrejcikova, I. Bernasovsky

Abstract:

The paper presents the results of molecular genetic analysis in sports research, with special emphasis on the use of genetic information in diagnosing motoric predispositions in Roma boys from East Slovakia. The ability to move is a basic characteristic of all living organisms. Phenotypes are influenced by a combination of genetic and environmental factors. Genetic tests differ in principle from traditional motoric tests, because the DNA of an individual does not change during life. The aim of the presented study was to examine motion abilities and to determine the frequency of the ACTN3 (R577X) polymorphism in Roma children. Genotype data were obtained from 138 Roma and 155 Slovak boys aged 7 to 15 years. Children were investigated on physical performance level in association with their genotype. Biological material for genetic analyses comprised buccal swab samples. Genotypes were determined using the real-time high-resolution melting PCR method (Rotor-Gene 6000 Corbett and LightCycler 480 Roche). The software allows creating reports of any analysis, showing information on the specific analysis, normalized and differential graphs, and further information on the samples. Roma children in the analyzed group lagged behind non-Roma children of the same age in all the compared tests. The percentage distribution of R and X alleles in Roma children differed from controls. The genotype frequencies were 9.26% for XX, 46.33% for RX and 44.41% for RR; the XX frequency is comparable to that of an Indian population. Data were analyzed with the ANOVA test.

Keywords: ACTN3 gene, R577X polymorphism, Roma children, Slovakia, sports performance.

Downloads: 1206
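The genotype frequencies reported in the abstract above (RR 44.41%, RX 46.33%, XX 9.26%) determine the R and X allele frequencies by simple gene counting: each heterozygote contributes half its frequency to each allele. The short sketch below works this out; the function name is an illustrative choice, and the Hardy-Weinberg expectations are shown only as a common follow-up check, not as something the abstract reports.

```python
def allele_frequencies(rr, rx, xx):
    """Allele frequencies from genotype frequencies (or counts) by gene counting."""
    total = rr + rx + xx
    r = (rr + rx / 2) / total
    x = (xx + rx / 2) / total
    return r, x

# Genotype percentages as given in the abstract.
r, x = allele_frequencies(rr=44.41, rx=46.33, xx=9.26)
print(f"R allele: {r:.3f}, X allele: {x:.3f}")

# Genotype proportions expected under Hardy-Weinberg equilibrium for these alleles.
expected = {"RR": r * r, "RX": 2 * r * x, "XX": x * x}
print({g: round(p, 4) for g, p in expected.items()})
```

With these inputs the R allele frequency comes out at about 67.6% and the X allele frequency at about 32.4%.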

212 Combining Gene and Chemo Therapy using Multifunctional Polymeric Micelles

Authors: Hong Yi Huang, Wei Ti Kuo, Yi You Huang

Abstract:

Non-viral gene carriers composed of biodegradable polymers or lipids have been considered as a safer alternative for gene carriers over viral vectors. We have developed multi-functional nano-micelles for both drug and gene delivery application. Polyethyleneimine (PEI) was modified by grafting stearic acid (SA) and formulated to polymeric micelles (PEI-SA) with positive surface charge for gene and drug delivery. Our results showed that PEI-SA micelles provided high siRNA binding efficiency. In addition, siRNA delivered by PEI-SA carriers also demonstrated significantly high cellular uptake even in the presence of serum proteins. The post-transcriptional gene silencing efficiency was greatly improved by the polyplex formulated by 10k PEI-SA/siRNA. The amphiphilic structure of PEI-SA micelles provided advantages for multifunctional tasks; where the hydrophilic shell modified with cationic charges can electrostatically interact with DNA or siRNA, and the hydrophobic core can serve as payloads for hydrophobic drugs, making it a promising multifunctional vehicle for both genetic and chemotherapy application.

Keywords: polyethyleneimine, gene delivery, micelles, siRNA

Downloads: 1887

211 DNA Polymorphism Studies of β-Lactoglobulin Gene in Saudi Goats

Authors: Amr A. El Hanafy, Muhammad Qureshi, Jamal Sabir, Mohamed Mutawakil, Mohamed M. Ahmed, Hassan El Ashmaoui, Hassan Ramadan, Mohamed Abou-Alsoud, Mahmoud Abdel Sadek

Abstract:

Domestic goats (Capra hircus) are an extremely diverse species and a principal animal genetic resource of the developing world. They provide a persistent supply of meat, milk, fibre, and skin and are considered important revenue generators in small pastoral environments. This study aimed to fingerprint the β-LG gene at the PCR-RFLP level in native Saudi goat breeds (Ardi, Habsi and Harri) in an attempt to obtain a preliminary picture of β-LG genotypic patterns in Saudi breeds as compared to foreign breeds such as Indian and Egyptian. Phylogenetic analysis was also done to investigate evolutionary trends and similarities between the caprine β-LG gene and that of other domestic species, viz. cow, buffalo and sheep. Blood samples were collected from 300 animals (100 for each breed) and genomic DNA was extracted. A fragment of the β-LG gene (427 bp) was amplified using specific primers. Subsequent digestion with SacII restriction endonuclease revealed two alleles (A and B) and three different banding patterns or genotypes, i.e. AA, AB and BB. The statistical analysis showed a general trend that the β-LG AA genotype had higher milk yield than the β-LG AB and β-LG BB genotypes. Nucleotide sequencing of the selected β-LG fragments was done and the sequences were submitted to GenBank NCBI (Accession No. KJ544248, KJ588275, KJ588276, KJ783455, KJ783456 and KJ874959). Phylogenetic analysis on the basis of nucleotide sequences of native Saudi goats indicated evolutionary similarity with the GenBank reference sequences of goat, Bubalus bubalis and Bos taurus. However, sheep, which is the most closely related from the evolutionary point of view, was located some distance away.

Keywords: β-Lactoglobulin, Saudi goats, PCR-RFLP, Phylogenetic analysis.

Downloads: 6141
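The PCR-RFLP genotyping logic in the abstract above (one restriction site, two alleles, three banding patterns) can be sketched in a few lines. SacII recognizes CCGCGG and cuts after CCGC on the top strand; everything else here is hypothetical: the two 427 bp sequences are synthetic placeholders, the position of the site is invented, and the sketch does not say which of the two alleles corresponds to A or B in the study.

```python
SACII_SITE = "CCGCGG"
SACII_CUT_OFFSET = 4  # SacII cuts CCGC^GG on the top strand

def digest(sequence, site=SACII_SITE, cut_offset=SACII_CUT_OFFSET):
    """Return the fragment lengths obtained by cutting at every occurrence of the site."""
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

def banding_pattern(allele_1, allele_2):
    """Distinct band sizes expected on a gel for a diploid genotype, largest first."""
    return sorted(set(digest(allele_1) + digest(allele_2)), reverse=True)

# Hypothetical 427 bp amplicons: one allele carries the site, the other has a
# single-base change that destroys it. Neither is a real beta-LG sequence.
cut_allele = "A" * 96 + "CCGCGG" + "T" * 325
uncut_allele = "A" * 96 + "CCACGG" + "T" * 325

print(banding_pattern(cut_allele, cut_allele))      # homozygous, site present
print(banding_pattern(cut_allele, uncut_allele))    # heterozygous
print(banding_pattern(uncut_allele, uncut_allele))  # homozygous, site absent
```

A homozygote for the cut allele shows two bands, a homozygote for the uncut allele shows the single full-length band, and a heterozygote shows all three, which is how two alleles yield three banding patterns.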

210 Application of KL Divergence for Estimation of Each Metabolic Pathway Genes

Authors: Shohei Maruyama, Yasuo Matsuyama, Sachiyo Aburatani

Abstract:

Development of a method to estimate gene functions is an important task in bioinformatics. One of the approaches for the annotation is the identification of the metabolic pathway that genes are involved in. Since gene expression data reflect various intracellular phenomena, those data are considered to be related to genes' functions. However, it has been difficult to estimate gene function with high accuracy. The low accuracy of the estimation is thought to be caused by the difficulty of accurately measuring gene expression: even when measured under the same conditions, gene expression levels usually vary. In this study, we propose a feature extraction method focusing on the variability of gene expression to estimate the genes' metabolic pathway accurately. First, we estimated the distribution of each gene's expression from replicate data. Next, we calculated the similarity between all gene pairs by KL divergence, which quantifies how much one distribution differs from another. Finally, we utilized the similarity vectors as feature vectors and trained a multiclass SVM for identifying the genes' metabolic pathway. To evaluate our method, we applied it to budding yeast and trained the multiclass SVM to identify seven metabolic pathways. As a result, the accuracy obtained with our method was higher than that obtained from the raw gene expression data. Thus, our method combined with KL divergence is useful for identifying the genes' metabolic pathway.

Keywords: Metabolic pathways, gene expression data, microarray, Kullback–Leibler divergence, KL divergence, support vector machines, SVM, machine learning.

Downloads: 2336
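The feature-extraction step in the abstract above can be sketched as follows. The abstract does not state which distribution family is fitted to the replicates; this example assumes a univariate Gaussian per gene, for which the KL divergence has the closed form KL(p‖q) = ½[ln(σ_q²/σ_p²) + (σ_p² + (μ_p − μ_q)²)/σ_q² − 1]. Function names and the toy replicate data are illustrative.

```python
import numpy as np

def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(p || q) for univariate Gaussians (assumed distribution family)."""
    return 0.5 * (np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)

def kl_feature_matrix(replicates):
    """Row i holds KL(gene_i || gene_j) for every gene j; input is genes x replicates."""
    mu = replicates.mean(axis=1)
    var = replicates.var(axis=1, ddof=1)
    return gaussian_kl(mu[:, None], var[:, None], mu[None, :], var[None, :])

# Toy replicate data (assumption): 3 genes, 6 replicate measurements each.
rng = np.random.default_rng(0)
replicates = rng.normal(size=(3, 6)) + np.array([[0.0], [0.0], [5.0]])
K = kl_feature_matrix(replicates)
print(np.round(K, 2))
```

Each row of `K` could then serve as the feature vector for one gene when training a multiclass classifier. Note that KL divergence is asymmetric and grows as distributions become less alike, so it acts as a dissimilarity rather than a similarity in the usual sense.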

209 Annotations of Gene Pathways Images in Biomedical Publications Using Siamese Network

Authors: Micheal Olaolu Arowolo, Muhammad Azam, Fei He, Mihail Popescu, Dong Xu

Abstract:

As the number of biological articles rises, so does the number of biological pathway figures. Each pathway figure shows gene names and relationships. Manually annotating pathway diagrams is time-consuming. Advanced image understanding models could speed up curation, but they must be more precise. There is rich information in biological pathway figures. The first step in performing image understanding of these figures is to recognize gene names automatically. Classical optical character recognition methods have been employed for gene name recognition, but they are not optimized for literature mining data. This study devised a method that treats the image bounding box of a gene name as a photo and recognizes it using deep Siamese neural network models, aiming to outperform existing methods. Using ResNet, DenseNet and Inception architectures, the method obtained about 84% accuracy.

Keywords: Biological pathway, gene identification, object detection, Siamese network, ResNet.

Downloads: 247

208 Novel Hybrid Method for Gene Selection and Cancer Prediction

Authors: Liping Jing, Michael K. Ng, Tieyong Zeng

Abstract:

Microarray data profile gene expression on a whole-genome scale and therefore provide a good way to study associations between gene expression and the occurrence or progression of cancer. More and more researchers have realized that microarray data are helpful for predicting cancer samples. However, the dimension of the gene expression data is much larger than the sample size, which makes this task very difficult. Therefore, identifying the significant genes causing cancer has become an urgent, hot and hard research topic. Many feature selection algorithms proposed in the past focus on improving cancer predictive accuracy at the expense of ignoring the correlations between the features. In this work, a novel framework (named SGS) is presented for stable gene selection and efficient cancer prediction. The proposed framework first performs a clustering algorithm to find gene groups in which genes have higher correlation coefficients, then selects the significant genes in each group with Bayesian Lasso and the important gene groups with group Lasso, and finally builds a prediction model on the shrunken gene space with an efficient classification algorithm (such as SVM, 1NN or regression). Experimental results on real-world data show that the proposed framework often outperforms existing feature selection and prediction methods, such as SAM, IG and Lasso-type prediction models.

Keywords: Gene Selection, Cancer Prediction, Lasso, Clustering, Classification.

Downloads: 2043

207 A Phenomic Algorithm for Reconstruction of Gene Networks

Authors: Rio G. L. D'Souza, K. Chandra Sekaran, A. Kandasamy

Abstract:

The goal of Gene Expression Analysis is to understand the processes that underlie the regulatory networks and pathways controlling inter-cellular and intra-cellular activities. In recent times microarray datasets are extensively used for this purpose. The scope of such analysis has broadened in recent times towards reconstruction of gene networks and other holistic approaches of Systems Biology. Evolutionary methods are proving to be successful in such problems and a number of such methods have been proposed. However all these methods are based on processing of genotypic information. Towards this end, there is a need to develop evolutionary methods that address phenotypic interactions together with genotypic interactions. We present a novel evolutionary approach, called Phenomic algorithm, wherein the focus is on phenotypic interaction. We use the expression profiles of genes to model the interactions between them at the phenotypic level. We apply this algorithm to the yeast sporulation dataset and show that the algorithm can identify gene networks with relative ease.

Keywords: Evolutionary computing, gene expression analysis, gene networks, microarray data analysis, phenomic algorithms.

Downloads: 1924

206 Dynamic Metrics for Polymorphism in Object Oriented Systems

Authors: Parvinder Singh Sandhu, Gurdev Singh

Abstract:

Measurement is the process by which numbers or symbols are assigned to attributes of entities in the real world in such a way as to describe them according to clearly defined rules. Software metrics are instruments or ways of measuring all aspects of a software product. These metrics are used throughout a software project to assist in estimation, quality control, productivity assessment, and project control. Object oriented software metrics focus on measurements that are applied to the class and other characteristics. These measurements convey to the software engineer the behavior of the software and how changes can be made that will reduce complexity and improve the continuing capability of the software. Object oriented software metrics can be classified into two types, static and dynamic. Static metrics are concerned with measuring the software by static analysis, and dynamic metrics with measuring the software at run time. Most previous work has focused on static metrics. Some work has also been done on the dynamic nature of software measurements, but this area demands more research. In this paper we give a set of dynamic metrics specifically for polymorphism in object oriented systems.

Keywords: Metrics, Software, Quality, Object oriented system, Polymorphism.

Downloads: 1762

205 Comparative Study on Swarm Intelligence Techniques for Biclustering of Microarray Gene Expression Data

Authors: R. Balamurugan, A. M. Natarajan, K. Premalatha

Abstract:

Microarray gene expression data play a vital role in understanding biological processes, gene regulation and disease mechanisms. A bicluster in gene expression data is a subset of the genes exhibiting consistent patterns under a subset of the conditions. Finding biclusters is an optimization problem. In recent years, swarm intelligence techniques have become popular because many real-world problems are increasingly large, complex and dynamic. Given the size and complexity of the problems, it is necessary to find an optimization technique that reaches a near-optimal solution within a reasonable amount of time. In this paper, the algorithmic concepts of the Particle Swarm Optimization (PSO), Shuffled Frog Leaping (SFL) and Cuckoo Search (CS) algorithms are analyzed on four benchmark gene expression datasets. The experimental results show that CS outperforms PSO and SFL on three datasets and SFL gives better performance on one dataset. This work also determines the biological relevance of the biclusters with Gene Ontology in terms of function, process and component.

Keywords: Particle swarm optimization, Shuffled frog leaping, Cuckoo search, biclustering, gene expression data.

Downloads: 2663

204 Construction of a Fusion Gene Carrying E10A and K5 with 2A Peptide-Linked by Using Overlap Extension PCR

Authors: Tiancheng Lan

Abstract:

E10A is a replication-defective adenovirus that carries the human endostatin gene to inhibit the growth of tumors. Kringle 5 (K5) has almost the same function as angiostatin in inhibiting tumor growth, since both are byproducts of the proteolytic cleavage of plasminogen. Increases in tumor size can be suppressed because both endostatin and K5 restrain the angiogenesis process. Therefore, in order to improve the treatment effect on tumors, a 2A peptide is used to construct a fusion gene carrying both E10A and K5. Using a 2A peptide is an ideal strategy when a fusion gene is expressed because it can avoid many problems during the expression of more than one kind of protein. Overlap extension PCR is used to connect the 2A peptide with E10A and K5. The final fusion gene construct, E10A-2A-K5, can provide a possible new method of anti-angiogenesis treatment with better expression performance.

Keywords: E10A, Kringle 5, 2A peptide, overlap extension PCR.

Downloads: 395

203 Metamorphism, Formal Grammars and Undecidable Code Mutation

Authors: Eric Filiol

Abstract:

This paper presents a formalisation of the different existing code mutation techniques (polymorphism and metamorphism) by means of formal grammars. While very few theoretical results are known about the detection complexity of viral mutation techniques, we exhaustively address this critical issue by considering the Chomsky classification of formal grammars. This enables us to determine which family of code mutation techniques are likely to be detected or on the contrary are bound to remain undetected. As an illustration we then present, on a formal basis, a proof-of-concept metamorphic mutation engine denoted PB MOT, whose detection has been proven to be undecidable.

Keywords: Polymorphism, Metamorphism, Formal Grammars, Formal Languages, Language Decision, Code Mutation, Word Problem

Downloads: 2428

202 Bioinformatics Profiling of Missense Mutations

Authors: I. Nassiri, B. Goliaei, M. Tavassoli

Abstract:

Distinguishing missense nucleotide substitutions that contribute to harmful effects from those that do not is a difficult problem, usually addressed through functional in vivo analyses. In this study, instead of current biochemical methods, the effects of missense mutations upon protein structure and function were assayed by means of computational methods and information from databases. To this end, the effects of new missense mutations in exon 5 of the PTEN gene upon protein structure and function were examined. The gene coding for PTEN was identified and localized on chromosome region 10q23.3 as a tumor suppressor gene. These methods showed that the c.319G>A and c.341T>G missense mutations, which were recognized in patients with breast cancer and Cowden disease, could be pathogenic. This method could be used for the analysis of missense mutations in other genes.

Keywords: Bioinformatics, missense mutations, PTEN tumor suppressor gene.

Downloads: 2389

201 Simultaneous Clustering and Feature Selection Method for Gene Expression Data

Authors: T. Chandrasekhar, K. Thangavel, E. N. Sathishkumar

Abstract:

Microarrays have made it possible to simultaneously monitor the expression profiles of thousands of genes under various experimental conditions. They are used to identify the co-expressed genes in specific cells or tissues that are actively used to make proteins. Analyzing gene expression in this way is an important task in bioinformatics research. Cluster analysis of gene expression data has proved to be a useful tool for identifying co-expressed genes and biologically relevant groupings of genes and samples. In this work, the K-Means algorithm has been applied for clustering gene expression data. Further, the rough set based Quick reduct algorithm has been applied to each cluster in order to select the most similar genes having high correlation. Then the ACV measure is used to evaluate the refined clusters, and classification is used to evaluate the proposed method. The method could identify compact clusters using the genes selected by the feature selection method.

Keywords: Clustering, Feature selection, Gene expression data, Quick reduct.

Downloads: 1967
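The clustering step named in the abstract above is K-Means. A minimal sketch of the standard Lloyd iteration is given below, covering only that step (not the Quick reduct selection or the ACV evaluation). The deterministic farthest-first initialization is a choice made here for reproducibility, not something the abstract specifies, and the two-group expression matrix is synthetic.

```python
import numpy as np

def kmeans(X, k, n_iter=100):
    """Lloyd's K-Means with farthest-first initialization (an assumed choice)."""
    centroids = [X[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen centroid.
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    centroids = np.array(centroids, dtype=float)

    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Update step: each centroid moves to the mean of its members.
        updated = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels, centroids

# Synthetic expression profiles (assumption): 60 genes x 5 conditions, two groups.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, size=(30, 5)),
               rng.normal(10.0, 0.5, size=(30, 5))])
labels, centroids = kmeans(X, k=2)
print(np.bincount(labels))
```

On real microarray data the number of clusters and the distance measure would need to be chosen with care; correlation-based distances are a common alternative to Euclidean distance for expression profiles.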

200 Analysis of DNA Microarray Data using Association Rules: A Selective Study

Authors: M. Anandhavalli Gauthaman

Abstract:

DNA microarrays allow the measurement of expression levels for a large number of genes, perhaps all genes of an organism, within a number of different experimental samples. Because most cellular processes are regulated by changes in gene expression, it is very important to extract biologically meaningful information from this huge amount of expression data to know the current state of the cell. Association rule mining techniques are helpful for finding association relationships between genes. Numerous association rule mining algorithms have been developed to analyze and associate this huge amount of gene expression data. This paper focuses on some of the popular association rule mining algorithms developed to analyze gene expression data.

Keywords: DNA microarray, gene expression, association rule mining.

Downloads: 2144
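To make the idea of association rules between genes concrete, the sketch below computes the two standard measures, support and confidence, for pairwise rules. It assumes each sample has already been reduced to the set of genes flagged as over-expressed; the gene labels `g1` to `g3`, the thresholds, and the brute-force pair enumeration are illustrative and do not correspond to any particular algorithm surveyed in the paper.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions (samples) that contain every item in the itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def pairwise_rules(transactions, min_support, min_confidence):
    """Enumerate rules 'antecedent -> consequent' over all gene pairs."""
    items = sorted(set().union(*transactions))
    found = []
    for a, b in combinations(items, 2):
        s = support({a, b}, transactions)
        if s < min_support:
            continue
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = s / support({antecedent}, transactions)
            if confidence >= min_confidence:
                found.append((antecedent, consequent, s, confidence))
    return found

# Hypothetical samples: the genes considered over-expressed in each one.
samples = [{"g1", "g2"}, {"g1", "g2", "g3"}, {"g1"}, {"g2", "g3"}]
result = pairwise_rules(samples, min_support=0.5, min_confidence=0.9)
print(result)
```

Here the only rule that passes both thresholds is g3 -> g2: both genes are over-expressed together in half of the samples, and every sample with g3 also has g2. Practical miners such as Apriori avoid enumerating all candidate itemsets by pruning on support.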

199 Evaluation of Clustering Based on Preprocessing in Gene Expression Data

Authors: Seo Young Kim, Toshimitsu Hamasaki

Abstract:

Microarrays have become effective, broadly used tools in biological and medical research to address a wide range of problems, including classification of disease subtypes and tumors. Many statistical methods are available for analyzing and systematizing these complex data into meaningful information, and one of the main goals in analyzing gene expression data is the detection of samples or genes with similar expression patterns. In this paper, we examine and compare the performance of several clustering methods based on data preprocessing, including strategies of normalization and noise reduction. We also evaluate each of these clustering methods with validation measures for both simulated data and real gene expression data. We find that clustering methods commonly used in microarray data analysis are affected by normalization and by the degree of noise in the datasets.

Keywords: Gene expression, clustering, data preprocessing.

Downloads: 1739

198 A Cuckoo Search with Differential Evolution for Clustering Microarray Gene Expression Data

Authors: M. Pandi, K. Premalatha

Abstract:

A DNA microarray is a collection of microscopic DNA spots attached to a solid surface. Scientists use DNA microarrays to measure the expression levels of large numbers of genes simultaneously or to genotype multiple regions of a genome. Elucidating the patterns hidden in gene expression data offers a tremendous opportunity for an enhanced understanding of functional genomics. However, the large number of genes and the complexity of biological networks greatly increase the challenges of comprehending and interpreting the resulting mass of data, which often consists of millions of measurements. Clustering handles this by revealing the natural structures and identifying the interesting patterns in the underlying data. In this paper, gene-based clustering in gene expression data is proposed using Cuckoo Search with Differential Evolution (CS-DE). The experimental results are analyzed with gene expression benchmark datasets. The results show that CS-DE outperforms CS on the benchmark datasets. To validate the clustering results, this work is tested with one internal and one external cluster validation index.

Keywords: DNA, Microarray, genomics, Cuckoo Search, Differential Evolution, Gene expression data, Clustering.

Downloads: 1483

197 Mutational Analysis of CTLA4 Gene in Pakistani SLE Patients

Authors: N. Hussain, G. Jaffery, A.N. Sabri, S. Hasnain

Abstract:

The main aim is to perform mutational analysis of CTLA4 gene exon 1 in SLE patients. A total of 61 SLE patients fulfilling the American College of Rheumatology (ACR) criteria and 61 controls were enrolled in this study. The region of CTLA4 gene exon 1 was amplified using a step-down PCR technique. The extracted DNA band of 354 bp was sequenced to analyze mutations in exon 1 of the CTLA4 gene. Protein sequences were then derived from the nucleotide sequences of CTLA4 exon 1 using ExPASy software. No variations were found when the patients' sequences were compared with the control sequence. Furthermore, BLASTP analysis showed that the CTLA4 protein sequences of Pakistani SLE patients were similar to those of the Chinese SLE population. Thus the CTLA4 gene may not be responsible for the autoimmune disease SLE.

Keywords: American College of Rheumatology criteria, autoimmune disease, Cytotoxic T Lymphocyte Antigen-4, Polymerase Chain Reaction, Systemic Lupus Erythematosus

Downloads: 1530

196 Identification of Differentially Expressed Gene(DEG) in Atherosclerotic Lesion by Annealing Control Primer (ACP)-Based Genefishing™ PCR

Authors: M. Maimunah, G. A. Froemming, H. Nawawi, M. I. Nafeeza, O. Effat, M. Y. Rosmadi, M. S. Mohamed Saifulaman

Abstract:

Atherosclerosis was identified as a chronic inflammatory process resulting from interactions between plasma lipoproteins, cellular components (monocyte, macrophages, T lymphocytes, endothelial cells and smooth muscle cells) and the extracellular matrix of the arterial wall. Several types of genes were known to express during formation of atherosclerosis. This study is carried out to identify unknown differentially expressed gene (DEG) in atherogenesis. Rabbit’s aorta tissues were stained by H&E for histomorphology. GeneFishing™ PCR analysis was performed from total RNA extracted from the aorta tissues. The DNA fragment from DEG was cloned, sequenced and validated by Real-time PCR. Histomorphology showed intimal thickening in the aorta. DEG detected from ACP-41 was identified as cathepsin B gene and showed upregulation at week-8 and week-12 of atherogenesis. Therefore, ACP-based GeneFishing™ PCR facilitated identification of cathepsin B gene which was differentially expressed during development of atherosclerosis.

Keywords: Atherosclerosis, GeneFishing™ PCR, cathepsin B gene.

Downloads 1956

195 Gene Network Analysis of PPAR-γ: A Bioinformatics Approach Using STRING

Authors: S. Bag, S. Ramaiah, P. Anitha, K. M. Kumar, P. Lavanya, V. Sivasakhthi, A. Anbarasu

Abstract:

Gene networks present a graphical view at the level of gene activities and genetic functions and help us to understand complex interactions in a meaningful manner. In the present study, we have analyzed the gene interactions of PPAR-γ (peroxisome proliferator-activated receptor gamma) using the Search Tool for the Retrieval of Interacting Genes (STRING). We find that PPAR-γ is highly networked by genetic interactions with 10 genes: RXRA (retinoid X receptor, alpha), PPARGC1A (peroxisome proliferator-activated receptor gamma, coactivator 1 alpha), NCOA1 (nuclear receptor coactivator 1), NR0B2 (nuclear receptor subfamily 0, group B, member 2), HDAC3 (histone deacetylase 3), MED1 (mediator complex subunit 1), INS (insulin), NCOR2 (nuclear receptor co-repressor 2), PAX8 (paired box 8) and ADIPOQ (adiponectin). This is consistent with the fact that obesity and several other metabolic disorders are interrelated.

Keywords: Gene networks, NCOA1, PPARγ, PPARGC1A, RXRA.

Downloads 4543

194 Molecular Identification of ESBL Genes blaGES-1, blaVEB-1, blaCTX-M, blaOXA-1, blaOXA-4, blaOXA-10 and blaPER-1 in Pseudomonas aeruginosa Strains Isolated from Burn Patients by PCR, RFLP and Sequencing Techniques

Authors: Fereshteh Shacheraghi, Mohammad Reza Shakibaie, Hanieh Noveiri

Abstract:

Forty-one strains of ESBL-producing P. aeruginosa, previously isolated from burn patients in Kerman University general hospital, Iran, were subjected to PCR, RFLP and sequencing in order to determine the type of extended-spectrum β-lactamases (ESBL), the restriction digestion pattern and the possibility of mutation among the detected genes. DNA extraction was carried out by the phenol-chloroform method. PCR for detection of bla genes was performed using a specific primer for each gene. Restriction Fragment Length Polymorphism (RFLP) analysis of the ESBL genes was carried out using the EcoRI, NheI, PvuII, EcoRV, DdeI and PstI restriction enzymes. The PCR products were subjected to direct sequencing of both strands for identification of the ESBL genes. The blaCTX-M, blaVEB-1, blaPER-1, blaGES-1, blaOXA-1, blaOXA-4 and blaOXA-10 genes were detected in 2.4% (n=1), 100% (n=41), 68.3% (n=28), 24.4% (n=10), 70.7% (n=29), 17.1% (n=7) and 92.7% (n=38) of the ESBL-producing isolates, respectively. The RFLP analysis showed that each ESBL gene had an identical digestion pattern among the isolated strains. Sequencing of the ESBL genes confirmed the authenticity of the PCR products and revealed no mutation in the restriction sites of these genes. From the results of the present investigation it can be concluded that blaVEB-1 and blaCTX-M were the most and the least frequently isolated ESBL genes, respectively, among the P. aeruginosa strains isolated from burn patients. The RFLP and sequencing analyses indicated that the same clone of the bla genes was present among the antibiotic-resistant strains.
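As a rough illustration of the RFLP step described above, the following Python sketch performs an in-silico digestion of a linear DNA sequence and reports the fragment lengths. The example sequence is hypothetical and is not taken from the study; the function name `fragment_lengths` is likewise an illustrative choice. The recognition sites and cut positions used for EcoRI (G^AATTC) and EcoRV (GAT^ATC) are the standard ones, and only the top strand is considered.

```python
def fragment_lengths(seq, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every
    occurrence of `site`; the cut falls `cut_offset` bases into the site."""
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)  # also catches overlapping sites
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical 18-base amplicon containing one EcoRI and one EcoRV site.
amplicon = "AAGAATTCCCGATATCGG"

print(fragment_lengths(amplicon, "GAATTC", 1))  # EcoRI cuts G^AATTC
print(fragment_lengths(amplicon, "GATATC", 3))  # EcoRV cuts GAT^ATC
```

Two isolates whose amplicons give the same list of fragment lengths for every enzyme would show the same digestion pattern, which is the kind of comparison the abstract reports.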

Keywords: ESBL genes, PCR, RFLP, Sequencing, P.aeruginosa

Downloads 2973

193 A New Hybrid K-Mean-Quick Reduct Algorithm for Gene Selection

Authors: E. N. Sathishkumar, K. Thangavel, T. Chandrasekhar

Abstract:

Feature selection is the process of selecting the most informative features. It is one of the important steps in knowledge discovery. The problem is that not all genes in gene expression data are important. Some of the genes may be redundant, and others may be irrelevant and noisy. Here a novel approach, the Hybrid K-Mean-Quick Reduct (KMQR) algorithm, is proposed for gene selection from gene expression data. In this study, the entire dataset is divided into clusters by applying the K-Means algorithm. Each cluster contains similar genes. The highly class-discriminating genes are selected based on their degree of dependence by applying the Quick Reduct algorithm to all the clusters. The Average Correlation Value (ACV) is calculated for the highly class-discriminating genes. The clusters that have an ACV of 1 are identified as significant clusters, whose classification accuracy is equal to or higher than the accuracy obtained on the entire dataset. The proposed algorithm is evaluated and compared using WEKA classifiers. The proposed method achieves high classification accuracy.
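The Quick Reduct step mentioned above can be sketched in Python as follows. This is a minimal version of the generic rough-set QuickReduct procedure, not the authors' hybrid KMQR implementation: it assumes expression values have already been discretized, treats samples as rows and genes as column indices, and uses a small hypothetical data set. The function names and the tie-breaking rule (lowest gene index wins) are illustrative choices.

```python
from collections import defaultdict


def dependency(rows, labels, attrs):
    """Rough-set dependency degree: the fraction of samples whose
    equivalence class on `attrs` contains a single class label."""
    seen_labels = defaultdict(set)
    counts = defaultdict(int)
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        seen_labels[key].add(label)
        counts[key] += 1
    positive = sum(counts[k] for k, labs in seen_labels.items() if len(labs) == 1)
    return positive / len(rows)


def quick_reduct(rows, labels):
    """Greedily add the gene that raises the dependency degree most,
    until it matches the dependency of the full gene set."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    reduct, best = [], dependency(rows, labels, [])
    while best < target:
        candidates = [a for a in all_attrs if a not in reduct]
        # Ties are broken in favour of the lowest gene index.
        best, chosen = max(
            ((dependency(rows, labels, reduct + [a]), a) for a in candidates),
            key=lambda pair: (pair[0], -pair[1]),
        )
        reduct.append(chosen)
    return reduct


# Hypothetical discretized expression table: 4 samples x 3 genes.
rows = [[0, 1, 0], [0, 1, 1], [1, 0, 0], [1, 0, 1]]
labels = ["normal", "normal", "tumor", "tumor"]

print(quick_reduct(rows, labels))
```

In the pipeline described in the abstract, a procedure of this kind would be run separately on the genes of each K-Means cluster.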

Keywords: Clustering, Gene Selection, K-Mean-Quick Reduct, Rough Sets.

Downloads 2298

192 Computational Method for Annotation of Protein Sequence According to Gene Ontology Terms

Authors: Razib M. Othman, Safaai Deris, Rosli M. Illias

Abstract:

Annotation of a protein sequence is pivotal for the understanding of its function. The accuracy of manual annotation provided by curators is still questionable because of its weak evidence strength, and manual annotation remains a hard and time-consuming task. A number of computational methods and tools have been developed to tackle this challenging task. However, they require high-cost hardware, are difficult for bioscientists to set up, or depend on time-intensive and blind sequence similarity searches such as the Basic Local Alignment Search Tool. This paper introduces a new method of assigning highly correlated Gene Ontology terms of annotated protein sequences to partially annotated or newly discovered protein sequences. This method is fully based on Gene Ontology data and annotations. Two problems had to be solved to realize this method. The first problem relates to splitting the single monolithic Gene Ontology RDF/XML file into a set of smaller files that can be easily accessed and processed, so that these files can be enriched with protein sequences and Inferred from Electronic Annotation evidence associations. The second problem involves searching for a set of Gene Ontology terms that are semantically similar to a given query. The macro and micro problems involved, their solutions and the objective of this study are described in detail. This paper also describes protein sequence annotation and the Gene Ontology. The methodology of this study and the Gene Ontology based protein sequence annotation tool, namely extended UTMGO, are presented. Furthermore, its basic version, a Gene Ontology browser based on semantic similarity search, is also introduced.

Keywords: automatic clustering, bioinformatics tool, gene ontology, protein sequence annotation, semantic similarity search

Downloads 3127

191 Neural Network Based Determination of Splice Junctions by ROC Analysis

Authors: S. Makal, L. Ozyilmaz, S. Palavaroglu

Abstract:

The gene, the principal unit of inheritance, is an ordered sequence of nucleotides. The genes of eukaryotic organisms include alternating segments of exons and introns. A region of deoxyribonucleic acid (DNA) within a gene that contains instructions for coding a protein is called an exon. Non-coding regions called introns, on the other hand, are parts of the DNA that regulate gene expression and are removed from the messenger ribonucleic acid (RNA) in a splicing process. This paper proposes to determine splice junctions, which are exon-intron boundaries, by analyzing DNA sequences. A splice junction can be either exon-intron (EI) or intron-exon (IE). Because of the popularity and compatibility of the artificial neural network (ANN) in genetic fields, various ANN models are applied in this research. Multi-layer Perceptron (MLP), Radial Basis Function (RBF) and Generalized Regression Neural Networks (GRNN) are used to analyze and detect the splice junctions of gene sequences. 10-fold cross-validation is used to demonstrate the accuracy of the networks. The real performance of these networks is determined by applying Receiver Operating Characteristic (ROC) analysis.
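To make the ROC analysis concrete, the Python sketch below builds a ROC curve from classifier scores and computes the area under it with the trapezoidal rule. The labels and scores are hypothetical (1 = true splice junction, 0 = non-junction) and do not come from the paper; the function names are illustrative, and any classifier that outputs a score per candidate site could supply the input.

```python
def roc_curve(labels, scores):
    """Return ROC points (false positive rate, true positive rate),
    sweeping the decision threshold from the highest score downwards."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    positives = sum(labels)
    negatives = len(labels) - positives
    tp = fp = 0
    points = [(0.0, 0.0)]
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Process all samples sharing this score as one threshold step.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical network outputs for six candidate sites.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

print(auc(roc_curve(labels, scores)))
```

An area of 1.0 would mean every true junction is scored above every non-junction, while 0.5 corresponds to random ranking, which is why the area is a convenient single number for comparing the MLP, RBF and GRNN models.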

Keywords: Gene, neural networks, ROC analysis, splice junctions.

Downloads 1657

190 A Systems Approach to Gene Ranking from DNA Microarray Data of Cervical Cancer

Authors: Frank Emmert Streib, Matthias Dehmer, Jing Liu, Max Mühlhauser

Abstract:

In this paper we present a method for gene ranking from DNA microarray data. More precisely, we calculate correlation networks, which are unweighted and undirected graphs, from microarray data of cervical cancer, where each network represents a tissue of a certain tumor stage and each node in the network represents a gene. From these networks we extract one tree for each gene by a local decomposition of the correlation network. A tree is interpreted as representing the n-nearest neighbor genes on its n-th level, measured by the Dijkstra distance, and hence gives the local embedding of a gene within the correlation network. For the obtained trees we measure the pairwise similarity between trees rooted at the same gene from normal to cancerous tissues. This evaluates the modification of the tree topology due to progression of the tumor. Finally, we rank the obtained similarity values from all tissue comparisons and select the top-ranked genes. For these genes the local neighborhood in the correlation networks changes most between normal and cancerous tissues. As a result we find that the top-ranked genes are candidates suspected to be involved in tumor growth, which indicates that our method captures essential information from the underlying DNA microarray data of cervical cancer.
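The network construction and tree extraction described above can be sketched in Python as follows. This is an illustrative reading of the abstract, not the authors' code: the correlation threshold of 0.9, the use of absolute Pearson correlation and the toy expression profiles are all assumptions. Because the graph is unweighted, the Dijkstra distance reduces to the hop count found by breadth-first search.

```python
from collections import deque
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_network(expr, threshold):
    """Unweighted, undirected graph: genes are linked when the absolute
    correlation of their profiles reaches the (assumed) threshold."""
    genes = list(expr)
    adj = {g: set() for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(expr[g], expr[h])) >= threshold:
                adj[g].add(h)
                adj[h].add(g)
    return adj


def tree_levels(adj, root):
    """Group genes by shortest-path distance from `root` (BFS levels)."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        g = queue.popleft()
        for h in adj[g]:
            if h not in dist:
                dist[h] = dist[g] + 1
                queue.append(h)
    levels = {}
    for g, d in dist.items():
        levels.setdefault(d, set()).add(g)
    return levels


# Hypothetical expression profiles of four genes across four samples.
expr = {"A": [1, 2, 3, 4], "B": [1, 2, 3, 5], "C": [1, 1, 2, 6], "D": [4, 1, 3, 2]}
network = correlation_network(expr, threshold=0.9)

print(tree_levels(network, "A"))
```

Comparing the level sets obtained for the same root gene in networks built from normal and cancerous tissue is one simple way to quantify how that gene's local neighborhood changes; the similarity measure actually used in the paper is not specified in the abstract.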

Keywords: Graph similarity, DNA microarray data, cancer.

Downloads 1755