Search results for: Cry Genes

82 Codes beyond Bits and Bytes: A Blueprint for Artificial Life

Authors: Rishabh Garg, Anuja Vyas, Aamna Khan, Muhammad Azwan Tariq

Abstract:

The present study focuses on integrating Machine Learning and Genomics, hereafter termed ‘GenoLearning’, to develop Artificial Life (AL). This is achieved by leveraging gene editing to imbue genes with sequences capable of performing desired functions. To accomplish this, a specialized sub-network of Siamese Neural Network (SNN), named Transformer Architecture specialized in Sequence Analysis of Genes (TASAG), compares two sequences: the desired and target sequences. Differences between these sequences are analyzed, and necessary edits are made on-screen to incorporate the desired sequence into the target sequence. The edited sequence can then be synthesized chemically using a Computerized DNA Synthesizer (CDS). The CDS fabricates DNA strands according to the sequence displayed on a computer screen, aided by microprocessors. These synthesized DNA strands can be inserted into an ovum to initiate further development, eventually leading to the creation of an Embot, and ultimately, an H-Bot. While this study aims to explore the potential benefits of Artificial Intelligence (AI) technology, it also acknowledges and addresses the ethical considerations associated with its implementation.

Keywords: Machine Learning, Genomics, Genetronics, DNA, Transformer, Siamese Neural Network, Gene Editing, Artificial Life, H-Bot, Zoobot.

Downloads 75

81 Construction of Recombinant E. coli Expressing Fusion Protein to Produce 1,3-Propanediol

Authors: Rosarin Rujananon, Poonsuk Prasertsan, Amornrat Phongdara, Tanate Panrat, Jibin Sun, Sugima Rappert, An-Ping Zeng

Abstract:

In this study, a synthetic pathway was created by assembling genes from Clostridium butyricum and Escherichia coli in different combinations. Among the genes were dhaB1 and dhaB2 from C. butyricum VPI1718, coding for glycerol dehydratase (GDHt) and its activator (GDHtAc), respectively, which are involved in the conversion of glycerol to 3-hydroxypropionaldehyde (3-HPA). The yqhD gene from E. coli BL21 was also included; it codes for an NADPH-dependent 1,3-propanediol oxidoreductase isoenzyme (PDORI) that reduces 3-HPA to 1,3-propanediol (1,3-PD). Molecular modeling analysis indicated that the conformation of the fusion protein of YQHD and DHAB1 was favorable for direct molecular channeling of the intermediate 3-HPA. According to the simulation results, the yqhD and dhaB1 genes were assembled upstream of dhaB2 to express a fusion protein, yielding the recombinant strain E. coli BL21 (DE3)//pET22b+::yqhD-dhaB1_dhaB2 (strain BP41Y3). Strain BP41Y3 gave a 10-fold higher 1,3-PD concentration than E. coli BL21 (DE3)//pET22b+::yqhD-dhaB1_dhaB2 (strain BP31Y2), which expresses the recombinant enzymes simultaneously but in a non-fusion mode. This is the first report using a gene fusion approach to enhance the biological conversion of glycerol to the value-added compound 1,3-PD.

Keywords: Recombinant E. coli, 1,3-propanediol, glycerol, fusion protein.

Downloads 2014

80 An Ant-based Clustering System for Knowledge Discovery in DNA Chip Analysis Data

Authors: Minsoo Lee, Yun-mi Kim, Yearn Jeong Kim, Yoon-kyung Lee, Hyejung Yoon

Abstract:

Biological data has several characteristics that strongly differentiate it from typical business data. It is much more complex, usually large in size, and changes continuously. Until recently, business data has been the main target for discovering trends, patterns or future expectations. However, with the recent rise in biotechnology, the powerful technology that was used for analyzing business data is now being applied to biological data. With the advanced technology at hand, the main trend in biological research is rapidly changing from structural DNA analysis to understanding the cellular functions of DNA sequences. DNA chips are now being used to perform experiments, and DNA analysis processes are being used by researchers. Clustering is one of the important processes used for grouping together similar entities. There are many clustering algorithms, such as hierarchical clustering, self-organizing maps, K-means clustering and so on. In this paper, we propose a clustering algorithm that imitates the ecosystem, taking into account the features of biological data. We implemented the system using an Ant-Colony clustering algorithm. The system decides the number of clusters automatically. The system processes the input biological data, runs the Ant-Colony algorithm, draws the Topic Map, assigns clusters to the genes and displays the output. We tested the algorithm with test data of 100 to 1000 genes and 24 samples and show promising results for applying this algorithm to clustering DNA chip data.

Keywords: Ant colony system, biological data, clustering, DNA chip.

Downloads 1973

79 BeamGA Median: A Hybrid Heuristic Search Approach

Authors: Ghada Badr, Manar Hosny, Nuha Bintayyash, Eman Albilali, Souad Larabi Marie-Sainte

Abstract:

The median problem is widely applied to derive the most reasonable rearrangement phylogenetic tree for many species. More specifically, the problem is concerned with finding a permutation that minimizes the sum of distances between itself and a set of three signed permutations. Genomes with an equal number of genes but a different gene order can be represented as permutations. In this paper, an algorithm, namely BeamGA median, is proposed that combines a heuristic search approach (local beam) as an initialization step to generate a number of solutions, and then a Genetic Algorithm (GA) is applied in order to refine the solutions, aiming to achieve a better median with the smallest possible reversal distance from the three original permutations. In this approach, any genome rearrangement distance can be applied. In this paper, we use the reversal distance. To the best of our knowledge, the proposed approach has not been applied before to solving the median problem. Our approach considers a true biological evolution scenario by applying the concept of common intervals during the GA optimization process. This allows us to imitate true biological behavior and improve the convergence time of the genetic approach. We were able to handle permutations with a large number of genes, within an acceptable time performance and with the same or better accuracy as compared to existing algorithms.

Keywords: Median problem, phylogenetic tree, permutation, genetic algorithm, beam search, genome rearrangement distance.

Downloads 979

78 Using Genetic Algorithms to Outline Crop Rotations and a Cropping-System Model

Authors: Nicolae Bold, Daniel Nijloveanu

Abstract:

The cropping system is a method used by farmers. It is an environmentally friendly method that protects natural resources (soil, water, air, nutritive substances) and increases production at the same time, taking into account some crop particularities. Combining this powerful method with the concepts of genetic algorithms makes it possible to generate sequences of crops that form a rotation. Algorithms of this type have been efficient in solving optimization problems, and their polynomial complexity allows them to be used for more difficult and varied problems. In our case, the optimization consists in finding the most profitable crop rotation. One of the expected results is to optimize the usage of resources in order to minimize the costs and maximize the profit. In order to achieve these goals, a genetic algorithm was designed. This algorithm finds several optimized cropping-system solutions which have the highest profit and, thus, which minimize the costs. The algorithm uses genetic-based methods (mutation, crossover) and structures (genes, chromosomes). A candidate cropping system is considered a chromosome, and a crop within the rotation is a gene within a chromosome. Results about the efficiency of this method will be presented in a special section. The implementation of this method would benefit farmers by giving them hints and helping them to use resources efficiently.

Keywords: Genetic algorithm, chromosomes, genes, cropping, agriculture.

Downloads 1602
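
The encoding described in the abstract above (a rotation as a chromosome, each crop as a gene, refined by crossover and mutation) can be illustrated with a minimal sketch. This is not the authors' algorithm: the crop names, the profit table, the repeat penalty, and the selection scheme (keep the better half, breed the rest) are all assumptions made for illustration.

```python
import random

# Hypothetical crops and per-season profits (illustrative values only).
CROPS = ["wheat", "maize", "soy", "rapeseed"]
PROFIT = {"wheat": 3.0, "maize": 4.0, "soy": 2.5, "rapeseed": 3.5}
REPEAT_PENALTY = 5.0  # assumed cost of growing the same crop twice in a row


def fitness(rotation):
    """Total profit of a rotation minus a penalty for consecutive repeats."""
    total = sum(PROFIT[crop] for crop in rotation)
    repeats = sum(1 for a, b in zip(rotation, rotation[1:]) if a == b)
    return total - REPEAT_PENALTY * repeats


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: head of one parent, tail of the other."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]


def mutate(rotation, rng, rate=0.1):
    """Replace each gene (crop) with a random crop with probability `rate`."""
    return [rng.choice(CROPS) if rng.random() < rate else gene for gene in rotation]


def evolve(length=6, pop_size=30, generations=40, seed=0):
    """Evolve rotations of `length` seasons (length >= 2); return the fittest."""
    rng = random.Random(seed)
    population = [[rng.choice(CROPS) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 2]  # the better half survives unchanged
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        population = elite + children
    return max(population, key=fitness)


best = evolve()
print(best, fitness(best))
```

Because the better half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next. A real cropping-system model would replace the toy fitness function with agronomic and economic constraints.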

77 Antibody Reactivity of Synthetic Peptides Belonging to Proteins Encoded by Genes Located in Mycobacterium tuberculosis-Specific Genomic Regions of Differences

Authors: Abu Salim Mustafa

Abstract:

Comparisons of mycobacterial genomes have identified several Mycobacterium tuberculosis-specific genomic regions that are absent in other mycobacteria and are known as regions of differences. Due to their M. tuberculosis-specificity, the peptides encoded by these regions could be useful in the specific diagnosis of tuberculosis. To explore this possibility, overlapping synthetic peptides corresponding to 39 proteins predicted to be encoded by genes present in regions of differences were tested for antibody reactivity with sera from tuberculosis patients and healthy subjects. The results identified four immunodominant peptides corresponding to four different proteins, with three of the peptides showing significantly stronger antibody reactivity and rate of positivity with sera from tuberculosis patients than with sera from healthy subjects. The fourth peptide was recognized equally well by the sera of tuberculosis patients and healthy subjects. Prediction of antibody epitopes by bioinformatics analyses using the ABCpred server predicted multiple linear epitopes in each peptide. Furthermore, peptide sequence analysis for sequence identity using BLAST suggested M. tuberculosis-specificity for the three peptides that had preferential reactivity with sera from tuberculosis patients, but the peptide with equal reactivity with sera of TB patients and healthy subjects showed significant identity with sequences present in non-tuberculous mycobacteria. The three identified M. tuberculosis-specific immunodominant peptides may be useful in the serological diagnosis of tuberculosis.

Keywords: Genomic regions of differences, Mycobacterium tuberculosis, peptides, serodiagnosis.

Downloads 929

76 Cloning and Functional Characterization of Promoter Elements of the D Hordein Gene from the Barley (Hordeum vulgare L.) by Bioinformatic Tools

Authors: Kobra Nalbandi, Bahram Baghban Kohnehrouz, Khalil Alami Saeed

Abstract:

The low level of foreign gene expression in transgenic plants is a key factor that limits plant genetic engineering. Because promoters play a critical regulatory role in gene transcription, they are studied extensively to improve the efficiency of plant transgenic systems. Strong constitutive promoters, such as the CaMV 35S promoter and the maize Ubiquitin 1 promoter, are commonly used in plant biotechnology research. However, expression of foreign genes in all tissues is often undesirable. Using a strong seed-specific promoter to limit gene expression to the seed solves this problem. The purpose of this study is to isolate one of the seed-specific promoters of Hordeum vulgare. One of the common varieties of Hordeum vulgare in Iran was selected, its genomic DNA was extracted, and the D-Hordein promoter was amplified using specifically designed primers. The amplified fragment was then cloned into an appropriate vector and transformed into E. coli. Finally, to confirm their accuracy, the cloned fragments were sent for sequencing. Sequencing analysis showed that the cloned fragment DHP contained motifs such as the TATA box, CAAT-box, CCGTCC-box, AMYBOX1 and E-box, which constitute the seed-specific promoter activity. The results were compared with sequences existing in data banks. The D-Hordein promoter of Alger showed 99% similarity at 100% coverage. The results also showed that the D-Hordein promoter of barley and the HMW promoter of wheat are very similar.

Keywords: Barley, Seed specific promoter, Hordein.

Downloads 2635

75 Statistics of Exon Lengths in Animals, Plants, Fungi, and Protists

Authors: Alexander Kaplunovsky, Vladimir Khailenko, Alexander Bolshoy, Shara Atambayeva, Anatoliy Ivashchenko

Abstract:

Eukaryotic protein-coding genes are interrupted by spliceosomal introns, which are removed from the RNA transcripts before translation into a protein. The exon-intron structures of different eukaryotic species are quite different from each other, and the evolution of such structures raises many questions. We try to address some of these questions using statistical analysis of whole genomes. We go through all the protein-coding genes in a genome and study correlations between the net length of all the exons in a gene, the number of the exons, and the average length of an exon. We also take average values of these features for each chromosome and study correlations between those averages on the chromosomal level. Our data show universal features of exon-intron structures common to animals, plants, and protists (specifically, Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster, Cryptococcus neoformans, Homo sapiens, Mus musculus, Oryza sativa, and Plasmodium falciparum). We have verified linear correlation between the number of exons in a gene and the length of a protein coded by the gene, while the protein length increases in proportion to the number of exons. On the other hand, the average length of an exon always decreases with the number of exons. Finally, chromosome clustering based on average chromosome properties and parameters of linear regression between the number of exons in a gene and the net length of those exons demonstrates that these average chromosome properties are genome-specific features.

Keywords: Comparative genomics, exon-intron structure, eukaryotic clustering, linear regression.

Downloads 2572

74 Survivability of Verhulst-free Populations under Mutation Accumulation

Authors: Chrysline Margus N. Piñol, Jenifer DP. De Maligaya, Ahl G. Balitaon

Abstract:

Stable nonzero populations without random deaths caused by the Verhulst factor (Verhulst-free) are a rarity. The majority either grow without bounds or die of excessive harmful mutations. To delay the accumulation of bad genes or diseases, a new environmental parameter Γ is introduced in the simulation. Current results demonstrate that stability may be achieved by setting Γ = 0.1. These steady states approach a maximum size that scales inversely with reproduction age.

Keywords: Aging, mutation accumulation, population dynamics.

Downloads 1275

73 Characteristics of Intronic and Intergenic Human miRNAs and Features of their Interaction with mRNA

Authors: Assel S. Issabekova, Olga A. Berillo, Vladimir A. Khailenko, Shara A. Atambayeva, Mireille Regnier, Anatoly T. Ivachshenko

Abstract:

Regulatory relationships of 686 intronic miRNAs and 784 intergenic miRNAs with mRNAs of 51 intronic miRNA coding genes were established. Interaction features of the studied miRNAs with the 5'UTR, CDS and 3'UTR of the mRNA of each gene were revealed. Functional regions of mRNA were shown to be significantly heterogeneous in the number of miRNA binding sites and in the location density of these sites.

Keywords: 5'UTR, 3'UTR, CDS, miRNA, target mRNA

Downloads 1704

72 A Novel Cytokine Derived Fusion Tag for Over-Expression of Heterologous Proteins in E. coli

Authors: S. Banerjee, A. Apte Deshpande, N. Mandi, S. Padmanabhan

Abstract:

We report a novel fusion tag for expressing recombinant proteins in E. coli. The fusion tag is the C-terminal part of the human GMCSF gene product, comprising 45 amino acids, which aids in the over-expression of otherwise non-expressible genes. Expression of hIFN a2b with this fusion tag also escapes the requirement of rare codons for expression. This is also the first report of a small fusion tag of human origin with affinity to a heparin sepharose column, which facilitates purification of the fusion protein.

Keywords: fusion tag, bacterial expression, rare codons, human GMCSF

Downloads 1897

71 Kinetics Study for the Recombinant Cellulosome to the Degradation of Chlorella Cell Residuals

Authors: C.-C. Lin, S.-C. Kan, C.-W. Yeh, C.-I Chen, C.-J. Shieh, Y.-C. Liu

Abstract:

In this study, lipid-deprived residuals of microalgae were hydrolyzed for the production of reducing sugars by using the recombinant Bacillus cellulosome, carrying eight genes from the Clostridium thermocellum ATCC27405. The obtained cellulosome was found to exist mostly in the broth supernatant with a cellulosome activity of 2.4 U/mL. Furthermore, the Michaelis-Menten constant (Km) and Vmax of cellulosome were found to be 14.832 g/L and 3.522 U/mL. The activation energy of the cellulosome to hydrolyze microalgae LDRs was calculated as 32.804 kJ/mol.

Keywords: Lipid-deprived residuals of microalgae, cellulosome, cellulose, reducing sugars, kinetics.

Downloads 1857
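
The kinetic constants reported in the abstract above can be plugged into the standard Michaelis-Menten rate law, v = Vmax * S / (Km + S). The sketch below assumes that standard form (the abstract reports Km and Vmax but does not state the equation); the function name and unit conventions are illustrative.

```python
# Constants reported in the abstract.
KM_G_PER_L = 14.832    # Michaelis-Menten constant Km, g/L
VMAX_U_PER_ML = 3.522  # maximum rate Vmax, U/mL


def michaelis_menten_rate(substrate_g_per_l, km=KM_G_PER_L, vmax=VMAX_U_PER_ML):
    """Return the rate v = vmax * S / (km + S) for substrate concentration S."""
    if substrate_g_per_l < 0:
        raise ValueError("substrate concentration must be non-negative")
    return vmax * substrate_g_per_l / (km + substrate_g_per_l)


# At S = Km the rate is half of Vmax, by definition of Km.
print(michaelis_menten_rate(KM_G_PER_L))
```

The rate approaches Vmax only asymptotically: a substrate concentration of ten times Km still gives about 91% of Vmax.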

70 Properties of Adipose Tissue Derived Mesenchymal Stem Cells with Long-Term Cryopreservation

Authors: Jienny Lee, In-Soo Cho, Sang-Ho Cha

Abstract:

Adult mesenchymal stem cells (MSCs) have been investigated using preclinical approaches for tissue regeneration. Porcine MSCs (pMSCs) are capable of growing and attaching to plastic with a fibroblast-like morphology and then differentiating into bone, adipose, and cartilage tissues in vitro. This study was conducted to investigate the proliferating abilities, differentiation potentials, and multipotency of miniature pig adipose tissue-derived MSCs (mpAD-MSCs) with or without long-term cryopreservation, considering that cryostorage has the potential for use in clinical applications. After confirming the characteristics of the mpAD-MSCs, we examined the effect of long-term cryopreservation (> 2 years) on expression of cell surface markers (CD34, CD90 and CD105), proliferating abilities (cumulative population doubling level, doubling time, colony-forming unit, and MTT assay) and differentiation potentials into mesodermal cell lineages. The expression of cell surface markers was similar between thawed and fresh mpAD-MSCs. However, long-term cryopreservation significantly lowered the differentiation potentials (adipogenic, chondrogenic, and osteogenic) of mpAD-MSCs. When compared with fresh mpAD-MSCs, thawed mpAD-MSCs exhibited lower expression of mesodermal cell lineage-related genes such as peroxisome proliferator-activated receptor-γ2, lipoprotein lipase, collagen Type II alpha 1, osteonectin, and osteocalcin. Interestingly, long-term cryopreserved mpAD-MSCs exhibited significantly higher cell viability than the fresh mpAD-MSCs. Long-term cryopreservation induced a 30% increase in the cell viability of mpAD-MSCs when compared with the fresh mpAD-MSCs at 5 days after thawing. However, long-term cryopreservation significantly lowered expression of stemness markers such as Oct3/4, Sox2, and Nanog. Furthermore, long-term cryopreservation negatively affected expression of senescence-associated genes such as telomerase reverse transcriptase and heat shock protein 90 of mpAD-MSCs when compared with the fresh mpAD-MSCs. The results from this study might be important for the successful application of MSCs in clinical trials after long-term cryopreservation.

Keywords: Mesenchymal stem cells, Cryopreservation, Stemness, Senescence.

Downloads 2103

69 Computing the Similarity and the Diversity in the Species Based on Cronobacter Genome

Authors: E. Al Daoud

Abstract:

The purpose of computing the similarity and the diversity among species is to trace the process of evolution, to find the relationships between species, and to discover the unique, the special, the common and the universal proteins. The proteins of the whole genomes of 40 species are compared with the Cronobacter genome, which is used as the reference genome. More than 3 billion pairwise alignments are performed using blastp. Several findings are introduced in this study; for example, we found 172 proteins in the Cronobacter genome which have insignificant hits in other species, 116 significant proteins in all tested species with very high score values, and 129 proteins that are common in plants but have insignificant hits in mammals, birds, fishes, and insects.

Keywords: Genome, species, blastp, conserved genes, Cronobacter.

Downloads 1007

68 An Integrated Predictor for Cis-Regulatory Modules

Authors: Darby Tien-Hao Chang, Guan-Yu Shiu, You-Jie Sun

Abstract:

Various cis-regulatory module (CRM) predictors have been proposed in the last decade. Several well-established CRM predictors adopt different categories of prediction strategies, including window clustering, probabilistic modeling and phylogenetic footprinting. Appropriate integration of them has the potential to achieve high-quality CRM prediction. This study analyzed four existing CRM predictors (ClusterBuster, MSCAN, CisModule and MultiModule) to seek a predictor combination that delivers higher accuracy than individual CRM predictors. 465 CRMs across 140 Drosophila melanogaster genes from the REDfly database were used to evaluate the integrated CRM predictor proposed in this study. The results show that four predictor combinations outperformed the best individual CRM predictor.

Keywords: Cis-regulatory module, transcription factor binding site.

Downloads 1650

67 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analyzing DNA microarray data sets is a great challenge for bioinformaticians because of the complexity of applying statistical and machine learning techniques. The challenge is doubled if the microarray data sets contain missing data, which happens regularly, because these techniques cannot deal with missing data. One of the most important data analysis processes on a microarray data set is feature selection. This process finds the most important genes that affect a certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.

Keywords: DNA microarray, feature selection, missing data, bioinformatics.

Downloads 2791

66 An Accurate Method for Phylogeny Tree Reconstruction Based on a Modified Wild Dog Algorithm

Authors: Essam Al Daoud

Abstract:

This study solves a phylogeny problem by using modified wild dog pack optimization. The least squares error is considered as a cost function that needs to be minimized. Therefore, in each iteration, new distance matrices based on the constructed trees are calculated and used to select the alpha dog. To test the suggested algorithm, ten homologous genes are selected and collected from National Center for Biotechnology Information (NCBI) databanks (i.e., 16S, 18S, 28S, Cox 1, ITS1, ITS2, ETS, ATPB, Hsp90, and STN). The data are divided into three categories: 50 taxa, 100 taxa and 500 taxa. The empirical results show that the proposed algorithm is more reliable and accurate than other implemented methods.

Keywords: Least squares, neighbor joining, phylogenetic tree, wild dog pack.

Downloads 1392
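
The cost function named in the abstract above, the least squares error between the observed distances and the distances implied by a candidate tree, can be sketched as follows. This assumes the plain unweighted form summed over unordered taxon pairs; the paper may use a weighted variant, and the function name is illustrative.

```python
import numpy as np


def least_squares_error(observed, tree_distances):
    """Sum of squared differences between observed pairwise distances and the
    path-length distances implied by a candidate tree, over pairs i < j."""
    observed = np.asarray(observed, dtype=float)
    tree_distances = np.asarray(tree_distances, dtype=float)
    if observed.shape != tree_distances.shape or observed.ndim != 2:
        raise ValueError("both inputs must be square matrices of the same shape")
    if observed.shape[0] != observed.shape[1]:
        raise ValueError("distance matrices must be square")
    upper = np.triu_indices(observed.shape[0], k=1)  # each pair counted once
    return float(np.sum((observed[upper] - tree_distances[upper]) ** 2))


# Toy example with three taxa (values are illustrative only).
observed = [[0, 5, 9], [5, 0, 10], [9, 10, 0]]
candidate = [[0, 5, 8], [5, 0, 11], [8, 11, 0]]
print(least_squares_error(observed, candidate))  # 2.0
```

An optimizer such as the one described in the abstract would evaluate this cost for each candidate tree and prefer the tree with the smallest value.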

65 Iterative Clustering Algorithm for Analyzing Temporal Patterns of Gene Expression

Authors: Seo Young Kim, Jae Won Lee, Jong Sung Bae

Abstract:

Microarray experiments are information rich; however, extensive data mining is required to identify the patterns that characterize the underlying mechanisms of action. For biologists, a key aim when analyzing microarray data is to group genes based on the temporal patterns of their expression levels. In this paper, we used an iterative clustering method to find temporal patterns of gene expression. We evaluated the performance of this method by applying it to real sporulation data and simulated data. The patterns obtained using the iterative clustering were found to be superior to those obtained using existing clustering algorithms.

Keywords: Clustering, microarray experiment, temporal pattern of gene expression data.

Downloads 1354

64 Clustering Approach to Unveiling Relationships between Gene Regulatory Networks

Authors: Hiba Hasan, Khalid Raza

Abstract:

Reverse engineering of a genetic regulatory network involves modeling the given gene expression data in the form of a network. Relationships between genes, so-called gene regulatory networks (GRNs), can be derived computationally and can help to find genomics- and proteomics-based diagnostic approaches for any disease. In this paper, a clustering-based method has been used to reconstruct a genetic regulatory network from time-series gene expression data. A supercoiling data set from Escherichia coli has been used to demonstrate the proposed method.

Keywords: Gene expression, gene regulatory networks (GRNs), clustering, data preprocessing, network visualization.

Downloads 2152

63 Bioinformatics Profiling of Missense Mutations

Authors: I. Nassiri, B. Goliaei, M. Tavassoli

Abstract:

The ability to distinguish missense nucleotide substitutions that contribute to harmful effects from those that do not is a difficult problem usually addressed through functional in vivo analyses. In this study, instead of current biochemical methods, the effects of missense mutations upon protein structure and function were assayed by means of computational methods and information from databases. To this end, the effects of new missense mutations in exon 5 of the PTEN gene upon protein structure and function were examined. The gene coding for PTEN was identified and localized on chromosome region 10q23.3 as a tumor suppressor gene. These methods showed that the c.319G>A and c.341T>G missense mutations, which were recognized in patients with breast cancer and Cowden disease, could be pathogenic. This method could be used for the analysis of missense mutations in other genes.

Keywords: Bioinformatics, missense mutations, PTEN tumor suppressor gene.

Downloads 2389

62 A New Predictor of Coding Regions in Genomic Sequences using a Combination of Different Approaches

Authors: Aníbal Rodríguez Fuentes, Juan V. Lorenzo Ginori, Ricardo Grau Ábalo

Abstract:

Identifying protein coding regions in DNA sequences is a basic step in the location of genes. Several approaches based on signal processing tools have been applied to solve this problem, trying to achieve more accurate predictions. This paper presents a new predictor that improves the efficacy of three techniques that use the Fourier Transform to predict coding regions, and that could be computed using an algorithm that reduces the computation load. Some ideas about the combination of the predictor with other methods are discussed. ROC curves are used to demonstrate the efficacy of the proposed predictor, based on the computation of 25 DNA sequences from three different organisms.

Keywords: Bioinformatics, Coding region prediction, Computational load reduction, Digital Signal Processing, Fourier Transform.

Downloads 1667
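
Fourier-based coding-region predictors of the kind mentioned in the abstract above commonly exploit the period-3 signal that codon structure leaves in a DNA sequence. The sketch below computes the spectral power at frequency 1/3 from the four base-indicator sequences. It illustrates this generic measure only, not the new predictor the paper proposes; the function name and the division by sequence length are assumptions.

```python
import numpy as np


def period3_power(sequence):
    """Spectral power at frequency 1/3, summed over the A, C, G and T
    indicator sequences and divided by the sequence length."""
    sequence = sequence.upper()
    n = len(sequence)
    if n == 0:
        raise ValueError("sequence must not be empty")
    positions = np.arange(n)
    kernel = np.exp(-2j * np.pi * positions / 3)  # Fourier kernel at frequency 1/3
    total = 0.0
    for base in "ACGT":
        indicator = np.array([char == base for char in sequence], dtype=float)
        total += abs(np.sum(indicator * kernel)) ** 2
    return total / n


print(period3_power("ATG" * 30))  # strong period-3 signal
print(period3_power("AC" * 45))   # period-2 sequence, essentially zero
```

In practice this measure is evaluated in a sliding window along the sequence, and windows with high period-3 power are flagged as candidate coding regions.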

61 An Automatic Gridding and Contour Based Segmentation Approach Applied to DNA Microarray Image Analysis

Authors: Alexandra Oliveros, Miguel Sotaquirá

Abstract:

DNA microarray technology is widely used by geneticists to diagnose or treat diseases through gene expression. This technology is based on the hybridization of a tissue's DNA sequence onto a substrate and the further analysis of the image formed by the thousands of genes in the DNA as green, red or yellow spots. The process of DNA microarray image analysis involves finding the location of the spots and quantifying their expression levels. In this paper, a tool to perform DNA microarray image analysis is presented, including a spot addressing method based on the image projections, spot segmentation through contour-based segmentation, and the extraction of relevant information related to gene expression.

Keywords: Contour segmentation, DNA microarrays, edge detection, image processing, segmentation, spot addressing.

Downloads 1389

60 Principal Component Analysis using Singular Value Decomposition of Microarray Data

Authors: Dong Hoon Lim

Abstract:

A series of microarray experiments produces observations of differential expression for thousands of genes across multiple conditions. Principal component analysis (PCA) has been widely used in multivariate data analysis to reduce the dimensionality of the data in order to simplify subsequent analysis and allow for summarization of the data in a parsimonious manner. PCA, which can be implemented via a singular value decomposition (SVD), is useful for the analysis of microarray data. To apply PCA using SVD, we use the DNA microarray data for the small round blue cell tumors (SRBCT) of childhood by Khan et al. (2001). To decide the number of components that account for a sufficient amount of information, we draw a scree plot. The biplot, a graphic display associated with PCA, reveals important features that exhibit the relationships between variables and also the relationships of variables with observations.

Keywords: Principal component analysis, singular value decomposition, microarray data, SRBCT

Downloads 3250
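
The relationship between PCA and SVD mentioned in the abstract above can be made concrete with a short sketch. It assumes a samples-by-genes matrix and uses random numbers in place of the SRBCT data, which are not reproduced here; the function and variable names are illustrative.

```python
import numpy as np


def pca_svd(data, n_components):
    """PCA of a samples-by-genes matrix via SVD of the column-centered data."""
    centered = data - data.mean(axis=0)
    u, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    # Variance explained by each component; these are the values a scree plot shows.
    explained_variance = singular_values**2 / (data.shape[0] - 1)
    explained_ratio = explained_variance / explained_variance.sum()
    scores = u[:, :n_components] * singular_values[:n_components]  # sample coordinates
    loadings = vt[:n_components].T                                 # gene weights
    return scores, loadings, explained_ratio


rng = np.random.default_rng(0)
data = rng.normal(size=(20, 50))  # placeholder data: 20 samples, 50 genes
scores, loadings, explained_ratio = pca_svd(data, n_components=2)
print(scores.shape, loadings.shape, explained_ratio[:2])
```

Plotting `explained_ratio` against the component index gives the scree plot, and plotting `scores` together with `loadings` on the same axes gives a biplot.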

59 Evaluation of Clustering Based on Preprocessing in Gene Expression Data

Authors: Seo Young Kim, Toshimitsu Hamasaki

Abstract:

Microarrays have become effective, broadly used tools in biological and medical research to address a wide range of problems, including classification of disease subtypes and tumors. Many statistical methods are available for analyzing and systematizing these complex data into meaningful information, and one of the main goals in analyzing gene expression data is the detection of samples or genes with similar expression patterns. In this paper, we compare the performance of several clustering methods based on data preprocessing, including normalization and noise-cleaning strategies. We also evaluate each of these clustering methods with validation measures for both simulated data and real gene expression data. The results show that clustering methods commonly used in microarray data analysis are affected by normalization and by the degree of noise and clearness of the datasets.

Keywords: Gene expression, clustering, data preprocessing.

Downloads 1739

58 Modeling of Alpha-Particles’ Epigenetic Effects in Short-Term Test on Drosophila melanogaster

Authors: Z. M. Biyasheva, M. Zh. Tleubergenova, Y. A. Zaripova, A. L. Shakirov, V. V. Dyachkov

Abstract:

In recent years, interest in ecogenetic and biomedical problems related to the effects of radon and its daughter decay products on the population has increased significantly. Of particular interest is the assessment of the consequences of irradiation in radon-hazardous areas, which include the Almaty region owing to its large number of tectonic faults that enhance radon emanation. Accordingly, the purpose of this work was to study the genetic effects of exposure to supernormal radon doses using an alpha-radiation model. Irradiation does not affect the growth of the cell, but rather its ability to differentiate. In addition, irradiation can lead to somatic mutations, morphoses and modifications. This damage most likely arises from changes in the composition of the substances of the cell. Such changes are epigenetic since they affect the regulatory processes of ontogenesis. Variability in the expression of regulatory genes refers to conditional mutations that modify the formation of traits of intraspecific similarity. Characteristic features of these conditional mutations are the dominant type of their manifestation, phenotypic asymmetry and their instability across generations. Currently, the terms “morphosis” and “modification” are used to describe epigenetic variability, which is maintained in Drosophila melanogaster cultures using linked X-chromosomes, with the mutant X-chromosome transmitted along the paternal line. In this paper, we investigated the epigenetic effects of alpha particles, whose source in nature is mainly radon and its daughter decay products. In the experiment, the isotope plutonium-238 (²³⁸Pu), which emits alpha particles with an energy of about 5500 keV, was used as the source of alpha particles. In the first generation (F₁), deformities or morphoses were found, which can be called "radiation syndromes" or mutations, the manifestation of which is similar to the pleiotropic action of genes.
The proportion of morphoses was 1.8% in the experiment and 0.4% in the control. In this experiment, the morphoses in flies of the first and second generations appeared as black spots, or melanomas, on different parts of the imago body; "generalized" melanomas; curled or curved wings; a shortened wing; a bubble on one wing; absence of one wing; deformation of the thorax; interruption and disruption of tergite patterns; disrupted distribution of ocular facets and bristles; and absence of pigmentation on the second and third legs. Statistical analysis by the chi-square method showed that the difference between experiment and control was significant at P ≤ 0.01. On this basis, it can be concluded that alpha particles, which in the environment are mainly generated by radon and its isotopes, have a mutagenic effect that manifests itself mainly in the formation of morphoses or deformities.

Keywords: Alpha-radiation, genotoxicity, morphoses, radioecology, radon.
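
As a rough illustration of the chi-square comparison mentioned in the abstract above, the sketch below computes the Pearson statistic for a 2x2 table of flies with and without morphoses. The abstract reports only the proportions (1.8% and 0.4%), not the sample sizes, so the group size of 1000 flies used here is purely hypothetical, and the result is not a reproduction of the authors' analysis.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table

        [[a, b],   # experiment: with morphoses, without morphoses
         [c, d]]   # control:    with morphoses, without morphoses
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical counts: 1000 flies per group is an assumption made only
# so that the proportions match the reported 1.8% and 0.4%.
statistic = chi_square_2x2(18, 982, 4, 996)

# Critical value of the chi-square distribution with 1 degree of
# freedom at the 0.01 significance level.
CRITICAL_VALUE_P01 = 6.635
significant = statistic > CRITICAL_VALUE_P01
```

With these assumed counts the statistic exceeds the critical value; with smaller groups the same proportions may not reach significance, which is why the real sample sizes matter.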

Downloads 943

57 Identifying New Sequence Features for Exon-Intron Discrimination by Rescaled-Range Frameshift Analysis

Authors: Sing-Wu Liou, Yin-Fu Huang

Abstract:

To identify the sequence features that discriminate between exons and introns, a new paradigm, rescaled-range frameshift analysis (RRFA), was proposed. Using RRFA, two new sequence features, the frameshift sensitivity (FS) and the accumulative penta-mer complexity (APC), were discovered and further integrated into a new, larger-scale feature, the persistency in anti-mutation (PAM). Feature-validation experiments were performed on six model organisms to test the power of discrimination. All the experimental results strongly support that FS, APC and PAM are distinguishing features between exons and introns. These newly identified sequence features provide new insights into the sequence composition of genes and have great potential to form a new basis for recognizing exon-intron boundaries in gene sequences.

Keywords: Exon-Intron Discrimination, Rescaled-Range Frameshift Analysis, Frameshift Sensitivity, Accumulative Sequence Complexity

Downloads 1173

56 The Role of MAOA Gene in the Etiology of Autism Spectrum Disorder in Males

Authors: Jana Kisková, Dana Gabriková

Abstract:

The monoamine oxidase A gene (MAOA) is considered a candidate gene implicated in many neuropsychiatric disorders, including autism spectrum disorder (ASD). This meta-analytic review evaluates the relationship between ASD and MAOA markers, such as the 30 bp variable number tandem repeat in the promoter region (uVNTR) and single nucleotide polymorphisms (SNPs), using findings from recently published studies. It seems that in Caucasian males the risk of developing ASD increases with the presence of the 4-repeat allele in the promoter region of the MAOA gene, whereas no differences were found between autistic patients and controls in Egyptian, West Bengal and Korean populations. Some studies point to the importance of specific haplotype groups of SNPs and the interaction of MAOA with other genes (e.g., FOXP2 or SRY). The results of existing studies are insufficient, and further research is needed.

Keywords: Autism spectrum disorder, MAOA, uVNTR, single nucleotide polymorphism.

Downloads 3439

55 MIM: A Species Independent Approach for Classifying Coding and Non-Coding DNA Sequences in Bacterial and Archaeal Genomes

Authors: Achraf El Allali, John R. Rose

Abstract:

A number of competing methodologies have been developed to identify genes and classify DNA sequences into coding and non-coding sequences. This classification process is fundamental to gene finding and gene annotation tools and is one of the most challenging tasks in bioinformatics and computational biology. An information theory measure based on mutual information has shown good accuracy in classifying DNA sequences as coding or non-coding. In this paper we describe a species-independent iterative approach that distinguishes coding from non-coding sequences using the mutual information measure (MIM). A set of sixty prokaryotes is used to extract universal training data. To facilitate comparisons with the published results of other researchers, a test set of 51 bacterial and archaeal genomes was used to evaluate MIM. These results demonstrate that MIM produces superior results while remaining species-independent.

Keywords: Coding Non-coding Classification, Entropy, Gene Recognition, Mutual Information.
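
To make the idea of a mutual-information measure on DNA concrete, here is a minimal sketch of the textbook mutual information between nucleotides separated by a fixed distance `k`. This is the generic quantity only; the abstract does not spell out how the authors' MIM is defined or how the iterative training works, so the function name and the choice of `k` are assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(seq, k):
    """Mutual information (in bits) between the nucleotide at position i
    and the nucleotide at position i + k, estimated from one sequence."""
    pairs = Counter((seq[i], seq[i + k]) for i in range(len(seq) - k))
    total = sum(pairs.values())
    if total == 0:
        return 0.0

    # Marginal counts of the first and second symbol of each pair.
    first, second = Counter(), Counter()
    for (a, b), n in pairs.items():
        first[a] += n
        second[b] += n

    mi = 0.0
    for (a, b), n in pairs.items():
        p_ab = n / total
        p_a = first[a] / total
        p_b = second[b] / total
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Toy sequences, for illustration only.
mi_alternating = mutual_information("ACACACAC", 1)
mi_constant = mutual_information("AAAAAAAA", 1)
```

In the alternating toy sequence each base fully determines its neighbor, so the value is high; in the constant sequence there is nothing to learn and the value is zero.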

Downloads 1726

54 Multiple Sequence Alignment Using Optimization Algorithms

Authors: M. F. Omar, R. A. Salam, R. Abdullah, N. A. Rashid

Abstract:

Proteins or genes that have similar sequences are likely to perform the same function. One of the most widely used techniques for sequence comparison is sequence alignment. Sequence alignment allows mismatches and insertions/deletions, which represent biological mutations. Sequence alignment is usually performed on only two sequences. Multiple sequence alignment is a natural extension of two-sequence alignment, in which the emphasis is on finding an optimal alignment for a group of sequences. Several applicable techniques were examined in this research, from traditional methods such as dynamic programming to widely used stochastic optimization methods such as Genetic Algorithms (GAs) and Simulated Annealing. A framework combining a Genetic Algorithm and Simulated Annealing is presented to solve the multiple sequence alignment problem. The Genetic Algorithm phase tries to find new regions of the solution space, while Simulated Annealing can be considered an alignment improver for any near-optimal solution produced by the GA.

Keywords: Simulated annealing, genetic algorithm, sequence alignment, multiple sequence alignment.
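
The dynamic-programming baseline mentioned in the abstract above can be sketched for the two-sequence case as follows. This is a standard global alignment score in the style of Needleman-Wunsch, not the authors' GA and Simulated Annealing framework, and the scoring values (match +1, mismatch -1, gap -1) are illustrative assumptions.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of sequences a and b, computed by
    dynamic programming while keeping only one previous row in memory."""
    # Row 0: aligning the empty prefix of a against prefixes of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # prefix of a against the empty prefix of b
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            ))
        prev = curr
    return prev[-1]


score_identical = global_alignment_score("GATTACA", "GATTACA")
score_one_deletion = global_alignment_score("GATTACA", "GATACA")
```

The cost of exact dynamic programming grows exponentially with the number of sequences, which is the usual motivation for stochastic methods in the multiple-sequence setting.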

Downloads 2408

53 Bioinformatic Analysis of Retroelement-Associated Sequences in Human and Mouse Promoters

Authors: Nadezhda M. Usmanova, Nikolai V. Tomilin

Abstract:

Mammalian genomes contain a large number of retroelements (SINEs, LINEs and LTRs) which could affect the expression of protein-coding genes through associated transcription factor binding sites (TFBS). The activity of retroelement-associated TFBS in many genes has been confirmed experimentally, but their global functional impact remains unclear. Human SINEs (Alu repeats) and mouse SINEs (B1 and B2 repeats) are known to be clustered in GC-rich, gene-rich genome segments, consistent with the view that they can contribute to the regulation of gene expression. We have shown earlier that Alu elements are involved in the formation of cis-regulatory modules (clusters of TFBS) in human promoters, and other authors reported that Alu elements located near promoter CpG islands have an increased frequency of CpG dinucleotides, suggesting that these Alu elements are undermethylated. Human Alu and mouse B1/B2 elements have an internal bipartite promoter for RNA polymerase III containing a conserved sequence motif called the B-box, which can bind the basal transcription complex TFIIIC. It has recently been shown that TFIIIC binding to the B-box leads to the formation of a boundary which limits the spread of repressive chromatin modifications in S. pombe. SINE-associated B-boxes may have a similar function, but the conservation of TFIIIC binding sites in SINEs located near mammalian promoters has not been studied before. Here we analysed the abundance and distribution of retroelements (SINEs, LINEs and LTRs) in annotated sequences of the Database of mammalian transcription start sites (DBTSS). The fractions of SINEs in human and mouse promoters are slightly lower than in the whole genome, but >40% of human and mouse promoters contain Alu or B1/B2 elements within the -1000 to +200 bp interval relative to the transcription start site (TSS). Most of these SINEs are associated with distal segments of promoters (-1000 to -200 bp relative to the TSS), indicating that their insertion at distances >200 bp upstream of the TSS is tolerated during evolution.
The distribution of SINEs in promoters correlates negatively with the distribution of CpG sequences. Analysis of the abundance of 12-mer motifs from the B1 and Alu consensus sequences in the genome and in DBTSS confirmed that some subsegments of Alu and B1 elements are poorly conserved, which depends in part on the presence of CpG dinucleotides. One of these CpG-containing subsegments in B1 elements overlaps with the SINE-associated B-box, and it shows better conservation in DBTSS than in genomic sequences. We also studied the conservation, in DBTSS and in the genome, of the B-box-containing segments of old (AluJ, AluS) and young (AluY) Alu repeats and found that the CpG sequence of the B-box of old Alu elements is better conserved in DBTSS than in the genome. This indicates that B-box-associated CpGs in promoters are better protected from methylation and mutation than B-box-associated CpGs in genomic SINEs. These results are consistent with the view that potential TFIIIC binding motifs in SINEs associated with human and mouse promoters may be functionally important. These motifs may protect promoters from repressive histone modifications which spread from adjacent sequences. This can potentially explain the well-known clustering of SINEs in GC-rich, gene-rich genome compartments and the existence of unmethylated CpG islands.

Keywords: Retroelement, promoter, CpG island, DNA methylation.
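
The motif-abundance analysis described in the abstract above relies on counting fixed-length words in sequences. A minimal sketch of such counting is given below, assuming plain uppercase or lowercase DNA strings; the function names and the toy sequence are hypothetical, and the abstract does not describe the authors' actual pipeline or how mismatches to the consensus were handled.

```python
from collections import Counter


def kmer_counts(sequence, k=12):
    """Count every overlapping k-mer in a DNA sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def motif_frequency(sequence, motif):
    """Fraction of k-length windows in `sequence` that exactly match
    `motif`, where k is the motif length."""
    k = len(motif)
    windows = len(sequence) - k + 1
    if windows <= 0:
        return 0.0
    return kmer_counts(sequence, k)[motif.upper()] / windows


# Toy sequence with k = 3 so the counts are easy to follow by hand.
toy_counts = kmer_counts("ACGACGA", k=3)
toy_frequency = motif_frequency("ACGACGA", "ACG")
```

Comparing such a frequency between a promoter set and a genome-wide background is one simple way to express "better conserved in DBTSS than in the genome" numerically.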

Downloads 1572