Search results for: genome assembly
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 355

235 In silico Subtractive Genomics Approach for Identification of Strain-Specific Putative Drug Targets among Hypothetical Proteins of Drug-Resistant Klebsiella pneumoniae Strain 825795-1

Authors: Umairah Natasya Binti Mohd Omeershffudin, Suresh Kumar

Abstract:

Klebsiella pneumoniae is a Gram-negative enteric bacterium that causes nosocomial and urinary tract infections. Of particular concern is the global emergence of multidrug-resistant (MDR) strains of Klebsiella pneumoniae. Characterization of antibiotic resistance determinants at the genomic level plays a critical role in understanding, and potentially controlling, the spread of MDR pathogens. In this study, drug-resistant Klebsiella pneumoniae strain 825795-1 was investigated with extensive computational approaches aimed at identifying novel drug targets among hypothetical proteins. We analyzed the 1099 hypothetical proteins available in the genome, using an in silico genome subtraction methodology to identify potential pathogen-specific drug targets against Klebsiella pneumoniae. We employed bioinformatics tools to subtract the strain-specific paralogous and host-specific homologous sequences from the bacterial proteome. The remaining 645 proteins were further refined to identify the essential genes in the pathogenic bacterium using the Database of Essential Genes (DEG). We found 135 unique essential proteins in the target proteome that could be used as novel targets for drug design. We then identified 49 cytoplasmic proteins as potential drug targets through sub-cellular localization prediction. Finally, we searched for these proteins in the DrugBank database, and 11 of the unique essential proteins showed druggability according to the FDA-approved drug entries in DrugBank, with diverse broad-spectrum properties. The results of this study will facilitate discovery of new drugs against Klebsiella pneumoniae.
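The subtraction steps described in this abstract can be pictured as a sequence of set operations. The following is only a minimal sketch under stated assumptions: the function name `subtractive_funnel`, the protein identifiers, and the toy annotation lists are all hypothetical, and the abstract does not say whether the druggability check was applied to all essential proteins or only to the cytoplasmic ones, so the sketch reports both as independent subsets of the essential set.

```python
def subtractive_funnel(hypothetical, paralogs, host_homologs,
                       essential, cytoplasmic, druggable):
    """Toy subtractive-genomics funnel built from plain set operations.

    Every argument is an iterable of protein identifiers. In a real study
    these lists would come from homology searches and database lookups;
    here they are hypothetical inputs supplied by the caller.
    """
    # Step 1: subtract paralogs and proteins with host homologs.
    non_redundant = set(hypothetical) - set(paralogs) - set(host_homologs)
    # Step 2: keep only proteins flagged as essential.
    essential_hits = non_redundant & set(essential)
    # Steps 3 and 4: annotate the essential set (assumed to be independent).
    return {
        "input": set(hypothetical),
        "after_subtraction": non_redundant,
        "essential": essential_hits,
        "cytoplasmic": essential_hits & set(cytoplasmic),
        "druggable": essential_hits & set(druggable),
    }


toy = subtractive_funnel(
    hypothetical=["p1", "p2", "p3", "p4", "p5", "p6"],
    paralogs=["p2"],
    host_homologs=["p3"],
    essential=["p1", "p4", "p5", "p9"],
    cytoplasmic=["p1", "p4"],
    druggable=["p4", "p6"],
)
for stage, ids in toy.items():
    print(stage, len(ids))
```

Keeping every intermediate set, not just the final one, makes it easy to report counts at each stage in the way the abstract does (1099, 645, 135, 49, 11).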

Keywords: pneumonia, drug target, hypothetical protein, subtractive genomics

Procedia PDF Downloads 153
234 Metagenomics-Based Molecular Epidemiology of Viral Diseases

Authors: Vyacheslav Furtak, Merja Roivainen, Olga Mirochnichenko, Majid Laassri, Bella Bidzhieva, Tatiana Zagorodnyaya, Vladimir Chizhikov, Konstantin Chumakov

Abstract:

Molecular epidemiology and environmental surveillance are parts of a rational strategy to control infectious diseases. They have been widely used in the worldwide campaign to eradicate poliomyelitis, which otherwise would be complicated by the inability to rapidly respond to outbreaks and determine sources of the infection. The conventional scheme involves isolation of viruses from patients and the environment, followed by their identification by nucleotide sequence analysis to determine phylogenetic relationships. This is a tedious and time-consuming process that yields definitive results when it may be too late to implement countermeasures. Because of the difficulty of high-throughput full-genome sequencing, most such studies are conducted by sequencing only capsid genes or their parts. Therefore, important information about the contribution of other parts of the genome and of inter- and intra-species recombination to viral evolution is not captured. Here we propose a new approach based on the rapid concentration of sewage samples with tangential flow filtration, followed by deep sequencing and reconstruction of nucleotide sequences of viruses present in the samples. The entire nucleic acid content of each sample is sequenced, thus preserving in digital format the complete spectrum of viruses. A set of rapid algorithms was developed to separate deep sequence reads into discrete populations corresponding to each virus and assemble them into full-length consensus contigs, as well as to generate a complete profile of sequence heterogeneities in each of them. This provides an effective approach to study molecular epidemiology and evolution of natural viral populations.

Keywords: poliovirus, eradication, environmental surveillance, laboratory diagnosis

Procedia PDF Downloads 250
233 Optimization for Guide RNA and CRISPR/Cas9 System Nanoparticle Mediated Delivery into Plant Cell for Genome Editing

Authors: Andrey V. Khromov, Antonida V. Makhotenko, Ekaterina A. Snigir, Svetlana S. Makarova, Natalia O. Kalinina, Valentin V. Makarov, Mikhail E. Taliansky

Abstract:

Due to its simplicity, CRISPR/Cas9 has become widely used and is capable of inducing mutations in the genes of organisms of various kingdoms. The aim of this work was to develop applications for the efficient modification of the DNA coding sequences of the phytoene desaturase (PDS), coilin and vacuolar invertase genes of Solanum tuberosum, and to develop an efficient new nanoparticle (NP) carrier technology to deliver the CRISPR/Cas9 system for editing the plant genome. For each of the genes (coilin, PDS and vacuolar invertase), five single guide RNAs (sgRNAs) were synthesized. To determine the most suitable nanoplatform, two types of NP platforms were used: magnetic NPs (MNPs) and gold NPs (AuNPs). To test the penetration efficiency, they were functionalized with fluorescent agents (BSA-FITC and GFP) as well as Cy3-labeled small RNA. Efficiency was measured by fluorescence and confocal microscopy. AuNPs proved to be the best option for both proteins and RNA. The next step was to check the possibility of delivering components of the CRISPR/Cas9 system to plant cells for editing target genes. AuNPs were functionalized with a ribonucleoprotein complex consisting of Cas9 and the sgRNAs corresponding to the target genes, and were delivered by biolistic bombardment into axillary buds and apical meristems of potato plants. After treatment with the best NP carrier, potato meristems were grown into adult plants. DNA isolated from these plants was subjected to preliminary fragment analysis to screen out the non-transformed samples, and then to NGS. The present work was carried out with the financial support from the Russian Science Foundation (grant No. 16-16-04019).

Keywords: biobombardment, coilin, CRISPR/Cas9, nanoparticles, NPs, PDS, sgRNA, vacuolar invertase

Procedia PDF Downloads 288
232 An Analysis on Clustering Based Gene Selection and Classification for Gene Expression Data

Authors: K. Sathishkumar, V. Thiagarasu

Abstract:

Due to recent advances in DNA microarray technology, it is now feasible to obtain gene expression profiles of tissue samples at relatively low cost. Many scientists around the world take advantage of gene profiling to characterize complex biological circumstances and diseases. Microarray techniques used in genome-wide gene expression and genome mutation analysis help scientists and physicians to understand pathophysiological mechanisms, to make diagnoses and prognoses, and to choose treatment plans. DNA microarray technology has made it possible to simultaneously monitor the expression levels of thousands of genes during important biological processes and across collections of related samples. Elucidating the patterns hidden in gene expression data offers a tremendous opportunity for an enhanced understanding of functional genomics. However, the large number of genes and the complexity of biological networks greatly increase the challenges of comprehending and interpreting the resulting mass of data, which often consists of millions of measurements. A first step toward addressing this challenge is the use of clustering techniques, which are essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. This work presents an analysis of several algorithms proposed to deal with gene expression data effectively. Existing algorithms such as Support Vector Machine (SVM), K-means and evolutionary algorithms are analyzed thoroughly to identify their advantages and limitations. The performance of the existing algorithms is evaluated to determine the best approach. In order to improve the classification performance of the best approach in terms of accuracy, convergence behavior and processing time, a hybrid clustering-based optimization approach is proposed.
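As an illustration of the kind of clustering discussed here, the following is a minimal K-means sketch in pure Python applied to a toy expression matrix (one row per gene, one column per sample). It is not the hybrid approach the authors propose, which the abstract does not specify; the function name `kmeans`, the fixed random seed, and the toy values are assumptions made for the example.

```python
import random


def kmeans(profiles, k, iterations=50, seed=0):
    """Cluster expression profiles (lists of floats) into k groups."""
    rng = random.Random(seed)
    # Initialise centroids from k randomly chosen profiles.
    centroids = [list(p) for p in rng.sample(profiles, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2
                                  for x, y in zip(p, centroids[c])))
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


# Toy data: two low-expression genes and two high-expression genes.
expression = [
    [1.0, 1.1, 0.9],
    [1.2, 1.0, 1.1],
    [5.0, 5.2, 4.9],
    [5.1, 4.8, 5.0],
]
labels, centroids = kmeans(expression, k=2)
print(labels)
```

K-means is sensitive to initialisation and to the choice of k, which is one reason comparative evaluations of this kind usually report convergence behavior alongside accuracy.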

Keywords: microarray technology, gene expression data, clustering, gene selection

Procedia PDF Downloads 289
231 Evolution of DNA-Binding With-One-Finger Transcriptional Factor Family in Diploid Cotton Gossypium raimondii

Authors: Waqas Shafqat Chattha, Muhammad Iqbal, Amir Shakeel

Abstract:

Transcription factors are proteins that play a vital role in regulating the transcription of target genes in different biological processes and are being widely studied in different plant species. In the current era of genomics, plant genome sequencing has led to the genome-wide identification, analysis and categorization of diverse transcription factor families and has hence provided key insights into their structural as well as functional diversity. The DNA-binding with One Finger (DOF) proteins belong to the C2-C2-type zinc finger protein family. DOF proteins are plant-specific transcription factors implicated in diverse functions including seed maturation and germination, phytohormone signalling, light-mediated gene regulation, cotton-fiber elongation and responses of the plant to biotic as well as abiotic stresses. In this context, a genome-wide in silico analysis of the DOF transcription factor family in the diploid cotton species Gossypium raimondii has enabled us to identify 55 non-redundant genes encoding DOF proteins, renamed GrDofs (Gossypium raimondii Dof). Gene distribution studies have shown that the GrDof genes are unevenly distributed across 12 of the 13 G. raimondii chromosomes. Gene structure analysis showed that 34 of the 55 GrDof genes are intron-less, while the remaining 21 genes have a single intron. Protein sequence-based phylogenetic analysis of the 55 putative GrDOFs divided these proteins into 5 major groups with various paralogous gene pairs. Molecular evolutionary studies, aided by conserved domain and gene structure analysis, suggested that segmental duplications were the principal contributors to the expansion of Dof genes in G. raimondii.

Keywords: diploid cotton, G. raimondii, phylogenetic analysis, transcription factor

Procedia PDF Downloads 119
230 Persistent Ribosomal In-Frame Mis-Translation of Stop Codons as Amino Acids in Multiple Open Reading Frames of a Human Long Non-Coding RNA

Authors: Leonard Lipovich, Pattaraporn Thepsuwan, Anton-Scott Goustin, Juan Cai, Donghong Ju, James B. Brown

Abstract:

Two-thirds of human genes do not encode any known proteins. Aside from long non-coding RNA (lncRNA) genes with recently-discovered functions, the ~40,000 non-protein-coding human genes remain poorly understood, and a role for their transcripts as de-facto unconventional messenger RNAs has not been formally excluded. Ribosome profiling (Riboseq) predicts translational potential, but without independent evidence of proteins from lncRNA open reading frames (ORFs), ribosome binding of lncRNAs does not prove translation. Previously, we mass-spectrometrically documented translation of specific lncRNAs in human K562 and GM12878 cells. We now examined lncRNA translation in human MCF7 cells, integrating strand-specific Illumina RNAseq, Riboseq, and deep mass spectrometry in biological quadruplicates performed at two core facilities (BGI, China; City of Hope, USA). We excluded known-protein matches. UCSC Genome Browser-assisted manual annotation of imperfect (tryptic-digest-peptides)-to-(lncRNA-three-frame-translations) alignments revealed three peptides hypothetically explicable by 'stop-to-nonstop' in-frame replacement of stop codons by amino acids in two ORFs of the lncRNA MMP24-AS1. To search for this phenomenon genomewide, we designed and implemented a novel pipeline, matching tryptic-digest spectra to wildcard-instead-of-stop versions of repeat-masked, six-frame, whole-genome translations. Along with singleton putative stop-to-nonstop events affecting four other lncRNAs, we identified 24 additional peptides with stop-to-nonstop in-frame substitutions from multiple positive-strand MMP24-AS1 ORFs. Only UAG and UGA, never UAA, stop codons were impacted. All MMP24-AS1-matching spectra met the same significance thresholds as high-confidence known-protein signatures. Targeted resequencing of MMP24-AS1 genomic DNA and cDNA from the same samples did not reveal any mutations, polymorphisms, or sequencing-detectable RNA editing. 
This unprecedented apparent gene-specific violation of the genetic code highlights the importance of matching peptides to whole-genome, not known-genes-only, ORFs in mass-spectrometry workflows, and suggests a new mechanism enhancing the combinatorial complexity of the proteome. Funding: NIH Director’s New Innovator Award 1DP2-CA196375 to LL.

Keywords: genetic code, lncRNA, long non-coding RNA, mass spectrometry, proteogenomics, ribo-seq, ribosome, RNAseq

Procedia PDF Downloads 203
229 Breeding Cotton for Annual Growth Habit: Remobilizing End-of-season Perennial Reserves for Increased Yield

Authors: Salman Naveed, Nitant Gandhi, Grant Billings, Zachary Jones, B. Todd Campbell, Michael Jones, Sachin Rustgi

Abstract:

Cotton (Gossypium spp.) is the primary source of natural fiber in the U.S. and a major crop in the Southeastern U.S. Despite constant efforts to increase the cotton fiber yield, the yield gain has stagnated. Therefore, we undertook a novel approach to improve the cotton fiber yield by altering its growth habit from perennial to annual. In this effort, we identified genotypes with high-expression alleles of five floral induction and meristem identity genes (FT, SOC1, FUL, LFY, and AP1) from an upland cotton mini-core collection and crossed them in various combinations to develop cotton lines with annual growth habit, optimal flowering time and enhanced productivity. To facilitate the characterization of genotypes with the desired combinations of stacked alleles, we identified markers associated with the gene expression traits via genome-wide association analysis using a 63K SNP Array (Hulse-Kemp et al. 2015 G3 5:1187). Over 14,500 SNPs showed polymorphism and were used for association analysis. A total of 396 markers showed association with expression traits. Out of these 396 markers, 159 mapped to genes, 50 to untranslated regions, and 187 to random genomic regions. Biased genomic distribution of associated markers was observed where more trait-associated markers mapped to the cotton D sub-genome. Many quantitative trait loci coincided at specific genomic regions. This observation has implications as these traits could be bred together. The analysis also allowed the identification of candidate regulators of the expression patterns of these floral induction and meristem identity genes whose functions will be validated via virus-induced gene silencing.

Keywords: cotton, GWAS, QTL, expression traits

Procedia PDF Downloads 123
228 Genome-Wide Homozygosity Analysis of the Longevous Phenotype in the Amish Population

Authors: Sandra Smieszek, Jonathan Haines

Abstract:

Introduction: Numerous research efforts have focused on searching for ‘longevity genes’. However, attempts to decipher the genetic component of the longevous phenotype have met with limited success, and the mechanisms governing longevity remain to be explained. We conducted a genome-wide homozygosity analysis (GWHA) of the founder population of the Amish community in central Ohio. While genome-wide association studies using unrelated individuals have revealed many interesting longevity-associated variants, these variants are typically of small effect and cannot explain the observed patterns of heritability for this complex trait. The Amish provide a large cohort of extended kinships allowing for in-depth analysis via a family-based approach excellent population due to its. Heritability of longevity increases with age, with a significant genetic contribution being seen in individuals living beyond 60 years of age. In our present analysis we show that the heritability of longevity is estimated to increase with age, particularly on the paternal side. Methods: The present analysis integrated both phenotypic and genotypic data and led to the discovery of a series of variants, distinct for stratified populations across ages and distinct for paternal and maternal cohorts. Specifically, 5437 subjects were analyzed and a subset of 893 successfully genotyped individuals was used to assess CHIP heritability. We conducted the homozygosity analysis to examine whether homozygosity is associated with increased odds of living beyond 90. We analyzed an Amish cohort genotyped for 614,957 SNPs. Results: We delineated 10 significant regions of homozygosity (ROH) specific for the age group of interest (>90). Of particular interest was an ROH on chromosome 13, P < 0.0001. The lead SNPs rs7318486 and rs9645914 point to COL4A2 and our lead SNP. 
COL25A1 encodes one of the six subunits of type IV collagen; the C-terminal portion of the protein, known as canstatin, is an inhibitor of angiogenesis and tumor growth. COL4A2 mutations have been reported with a broader spectrum of cerebrovascular, renal, ophthalmological, cardiac, and muscular abnormalities. The second region of interest points to IRS2. Furthermore, we built a classifier using the SNPs obtained from the significant ROH region, with an AUC of 0.945, giving it the ability to discriminate between those living up to 90 years of age and those living beyond. Conclusion: Our results suggest that a history of longevity does indeed contribute to increasing the odds of individual longevity. Preliminary results are consistent with the conjecture that heritability of longevity is substantial when we start looking at the oldest fifth and smaller percentiles of survival, specifically in males. We will validate all the candidate variants in independent cohorts of centenarians to test whether they are robustly associated with human longevity. The regions of interest identified via ROH analysis could be of profound importance for the understanding of the genetic underpinnings of longevity.
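To make the notion of a region of homozygosity concrete, the sketch below scans an ordered list of genotype calls for uninterrupted stretches of homozygous SNPs. It is a deliberately simplified illustration and not the authors' pipeline: the function name `runs_of_homozygosity`, the two-letter genotype encoding, and the `min_snps` threshold are assumptions, and production tools typically also consider physical length, SNP density, and a tolerance for occasional heterozygous or missing calls.

```python
def runs_of_homozygosity(genotypes, min_snps=3):
    """Return (start, end) index pairs, inclusive, of homozygous runs.

    genotypes: two-letter calls such as "AA" or "AG", ordered by
    chromosomal position. Anything that is not a valid homozygous call
    (heterozygous, missing, malformed) ends the current run.
    """
    runs = []
    start = None
    for i, call in enumerate(genotypes):
        homozygous = (len(call) == 2 and call[0] == call[1]
                      and call[0] in "ACGT")
        if homozygous:
            if start is None:
                start = i  # open a new run
        else:
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last SNP.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes) - 1))
    return runs


calls = ["AA", "GG", "AG", "CC", "CC", "TT", "AA", "CT", "GG"]
print(runs_of_homozygosity(calls, min_snps=3))  # [(3, 6)]
```

In the toy input, the first two homozygous calls are too short to count as a run, and the four consecutive homozygous calls at indices 3 to 6 form the only reported region.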

Keywords: regions of homozygosity, longevity, SNP, Amish

Procedia PDF Downloads 207
227 Changing the Landscape of Fungal Genomics: New Trends

Authors: Igor V. Grigoriev

Abstract:

Understanding of biological processes encoded in fungi is instrumental in addressing future food, feed, and energy demands of the growing human population. Genomics is a powerful and quickly evolving tool to understand these processes. The Fungal Genomics Program of the US Department of Energy Joint Genome Institute (JGI) partners with researchers around the world to explore fungi in several large scale genomics projects, changing the fungal genomics landscape. The key trends of these changes include: (i) rapidly increasing scale of sequencing and analysis, (ii) developing approaches to go beyond culturable fungi and explore fungal ‘dark matter,’ or unculturables, and (iii) functional genomics and multi-omics data integration. Power of comparative genomics has been recently demonstrated in several JGI projects targeting mycorrhizae, plant pathogens, wood decay fungi, and sugar fermenting yeasts. The largest JGI project ‘1000 Fungal Genomes’ aims at exploring the diversity across the Fungal Tree of Life in order to better understand fungal evolution and to build a catalogue of genes, enzymes, and pathways for biotechnological applications. At this point, at least 65% of over 700 known families have one or more reference genomes sequenced, enabling metagenomics studies of microbial communities and their interactions with plants. For many of the remaining families no representative species are available from culture collections. To sequence genomes of unculturable fungi two approaches have been developed: (a) sequencing DNA from fruiting bodies of ‘macro’ and (b) single cell genomics using fungal spores. The latter has been tested using zoospores from the early diverging fungi and resulted in several near-complete genomes from underexplored branches of the Fungal Tree, including the first genomes of Zoopagomycotina. Genome sequence serves as a reference for transcriptomics studies, the first step towards functional genomics. 
In the JGI fungal mini-ENCODE project, transcriptomes of the model fungus Neurospora crassa grown on a spectrum of carbon sources have been collected to build regulatory gene networks. Epigenomics is another tool to understand gene regulation, and recently introduced single-molecule sequencing platforms not only provide better genome assemblies but can also detect DNA modifications. For example, the 6mC methylome was surveyed across many diverse fungi, and the highest levels of 6mC methylation among Eukaryota have been reported. Finally, data production at such scale requires data integration to enable efficient data analysis. Over 700 fungal genomes and other -omes have been integrated in the JGI MycoCosm portal and equipped with comparative genomics tools to enable researchers to address a broad spectrum of biological questions and applications for bioenergy and biotechnology.

Keywords: fungal genomics, single cell genomics, DNA methylation, comparative genomics

Procedia PDF Downloads 181
226 Whole Exome Sequencing Data Analysis of Rare Diseases: Non-Coding Variants and Copy Number Variations

Authors: S. Fahiminiya, J. Nadaf, F. Rauch, L. Jerome-Majewska, J. Majewski

Abstract:

Background: Sequencing of the protein-coding regions of the human genome (Whole Exome Sequencing; WES) has been highly successful in identifying causal mutations for several rare genetic disorders in humans. Most WES studies have focused on rare variants in coding exons and splice sites, where missense substitutions lead to alteration of the protein product. Although focusing on this category of variants has resolved many inherited genetic diseases in recent years, a subset of them remains inconclusive. Here, we present the results of our WES studies in which analyzing only rare variants in coding regions was not conclusive, but further investigation revealed the involvement of non-coding variants and copy number variations (CNVs) in the etiology of the diseases. Methods: Whole exome sequencing was performed using our standard protocols at Genome Quebec Innovation Center, Montreal, Canada. All bioinformatics analyses were done using an in-house WES pipeline. Results: To date, we have successfully identified several disease-causing mutations within gene coding regions (e.g. SCARF2: Van den Ende-Gupta syndrome and SNAP29: 22q11.2 deletion syndrome) by using WES. In addition, we showed that variants in non-coding regions and CNVs are also important and should not be ignored or filtered out during bioinformatics analysis of WES data. For instance, in patients with osteogenesis imperfecta type V and in patients with glucocorticoid deficiency, we identified variants in the 5'UTR, resulting in the production of longer or truncated non-functional proteins. Furthermore, CNVs were identified as the main cause of the disease in patients with metaphyseal dysplasia with maxillary hypoplasia and brachydactyly and in patients with osteogenesis imperfecta type VII. 
Conclusions: Our study highlights the importance of considering non-coding variants and CNVs during interpretation of WES data, as they can be the only cause of disease under investigation.

Keywords: whole exome sequencing data, non-coding variants, copy number variations, rare diseases

Procedia PDF Downloads 388
225 Genomic Resilience and Ecological Vulnerability in Coffea Arabica: Insights from Whole Genome Resequencing at Its Center of Origin

Authors: Zewdneh Zana Zate

Abstract:

The study focuses on the evolutionary and ecological genomics of both wild and cultivated Coffea arabica L. at its center of origin, Ethiopia, aiming to uncover how this vital species may withstand future climate changes. Utilizing bioclimatic models, we project the future distribution of Arabica under varied climate scenarios for 2050 and 2080, identifying potential conservation zones and immediate risk areas. Through whole-genome resequencing of accessions from Ethiopian gene banks, this research assesses genetic diversity and divergence between wild and cultivated populations. It explores relationships, demographic histories, and potential hybridization events among Coffea arabica accessions to better understand the species' origins and its connection to parental species. This genomic analysis also seeks to detect signs of natural or artificial selection across populations. Integrating these genomic discoveries with ecological data, the study evaluates the current and future ecological and genomic vulnerabilities of wild Coffea arabica, emphasizing necessary adaptations for survival. We have identified key genomic regions linked to environmental stress tolerance, which could be crucial for breeding more resilient Arabica varieties. Additionally, our ecological modeling predicted a contraction of suitable habitats, urging immediate conservation actions in identified key areas. This research not only elucidates the evolutionary history and adaptive strategies of Arabica but also informs conservation priorities and breeding strategies to enhance resilience to climate change. By synthesizing genomic and ecological insights, we provide a robust framework for developing effective management strategies aimed at sustaining Coffea arabica, a species of profound global importance, in its native habitat under evolving climatic conditions.

Keywords: coffea arabica, climate change adaptation, conservation strategies, genomic resilience

Procedia PDF Downloads 9
224 Detection, Analysis and Determination of the Origin of Copy Number Variants (CNVs) in Intellectual Disability/Developmental Delay (ID/DD) Patients and Autistic Spectrum Disorders (ASD) Patients by Molecular and Cytogenetic Methods

Authors: Pavlina Capkova, Josef Srovnal, Vera Becvarova, Marie Trkova, Zuzana Capkova, Andrea Stefekova, Vaclava Curtisova, Alena Santava, Sarka Vejvalkova, Katerina Adamova, Radek Vodicka

Abstract:

ASDs are heterogeneous and complex developmental diseases with a significant genetic background. Recurrent CNVs are known to be a frequent cause of ASD. These CNVs can have, however, a variable expressivity which results in a spectrum of phenotypes from asymptomatic to ID/DD/ASD. ASD is associated with ID in ~75% individuals. Various platforms are used to detect pathogenic mutations in the genome of these patients. The performed study is focused on a determination of the frequency of pathogenic mutations in a group of ASD patients and a group of ID/DD patients using various strategies along with a comparison of their detection rate. The possible role of the origin of these mutations in aetiology of ASD was assessed. The study included 35 individuals with ASD and 68 individuals with ID/DD (64 males and 39 females in total), who underwent rigorous genetic, neurological and psychological examinations. Screening for pathogenic mutations involved karyotyping, screening for FMR1 mutations and for metabolic disorders, a targeted MLPA test with probe mixes Telomeres 3 and 5, Microdeletion 1 and 2, Autism 1, MRX and a chromosomal microarray analysis (CMA) (Illumina or Affymetrix). Chromosomal aberrations were revealed in 7 (1 in the ASD group) individuals by karyotyping. FMR1 mutations were discovered in 3 (1 in the ASD group) individuals. The detection rate of pathogenic mutations in ASD patients with a normal karyotype was 15.15% by MLPA and CMA. The frequencies of the pathogenic mutations were 25.0% by MLPA and 35.0% by CMA in ID/DD patients with a normal karyotype. CNVs inherited from asymptomatic parents were more abundant than de novo changes in ASD patients (11.43% vs. 5.71%) in contrast to the ID/DD group where de novo mutations prevailed over inherited ones (26.47% vs. 16.18%). ASD patients shared more frequently their mutations with their fathers than patients from ID/DD group (8.57% vs. 1.47%). 
Maternally inherited mutations predominated in the ID/DD group in comparison with the ASD group (14.7% vs. 2.86 %). CNVs of an unknown significance were found in 10 patients by CMA and in 3 patients by MLPA. Although the detection rate is the highest when using CMA, recurrent CNVs can be easily detected by MLPA. CMA proved to be more efficient in the ID/DD group where a larger spectrum of rare pathogenic CNVs was revealed. This study determined that maternally inherited highly penetrant mutations and de novo mutations more often resulted in ID/DD without ASD in patients. The paternally inherited mutations could be, however, a source of the greater variability in the genome of the ASD patients and contribute to the polygenic character of the inheritance of ASD. As the number of the subjects in the group is limited, a larger cohort is needed to confirm this conclusion. Inherited CNVs have a role in aetiology of ASD possibly in combination with additional genetic factors - the mutations elsewhere in the genome. The identification of these interactions constitutes a challenge for the future. Supported by MH CZ – DRO (FNOl, 00098892), IGA UP LF_2016_010, TACR TE02000058 and NPU LO1304.
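The cohort-level percentages reported for inherited and de novo CNVs can be cross-checked against the stated group sizes (35 ASD and 68 ID/DD individuals). The patient counts in the sketch below are not given in the abstract; they are inferred here as the whole numbers that reproduce the reported percentages, so they should be read as an assumption rather than as reported data.

```python
ASD_TOTAL = 35    # group size stated in the abstract
IDDD_TOTAL = 68   # group size stated in the abstract


def pct(count, total):
    """Percentage of a group, rounded to two decimals."""
    return round(100 * count / total, 2)


# (label, inferred count, group size, percentage reported in the abstract)
checks = [
    ("ASD, inherited CNVs", 4, ASD_TOTAL, 11.43),
    ("ASD, de novo CNVs", 2, ASD_TOTAL, 5.71),
    ("ID/DD, de novo CNVs", 18, IDDD_TOTAL, 26.47),
    ("ID/DD, inherited CNVs", 11, IDDD_TOTAL, 16.18),
    ("ASD, paternally inherited", 3, ASD_TOTAL, 8.57),
    ("ID/DD, paternally inherited", 1, IDDD_TOTAL, 1.47),
    ("ID/DD, maternally inherited", 10, IDDD_TOTAL, 14.7),
    ("ASD, maternally inherited", 1, ASD_TOTAL, 2.86),
]

for label, count, total, reported in checks:
    computed = pct(count, total)
    # Allow 0.01 of slack for rounding or truncation in the abstract.
    status = "ok" if abs(computed - reported) <= 0.011 else "mismatch"
    print(f"{label}: {count}/{total} = {computed}% "
          f"(reported {reported}%) {status}")
```

Seen as counts, the differences between groups rest on small numbers (for example, three versus one paternally inherited CNV), which fits the authors' own caution that a larger cohort is needed.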

Keywords: autistic spectrum disorders, copy number variant, chromosomal microarray, intellectual disability, karyotyping, MLPA, multiplex ligation-dependent probe amplification

Procedia PDF Downloads 326
223 Predicting Open Chromatin Regions in Cell-Free DNA Whole Genome Sequencing Data by Correlation Clustering  

Authors: Fahimeh Palizban, Farshad Noravesh, Amir Hossein Saeidian, Mahya Mehrmohamadi

Abstract:

In the past decade, the emergence of liquid biopsy has significantly improved cancer monitoring and detection. Dying cells, including those originating from tumors, shed their DNA into the blood and contribute to a pool of circulating fragments called cell-free DNA (cfDNA). Accordingly, identifying the tissue of origin of these DNA fragments from plasma can result in faster and more accurate disease diagnosis and more precise treatment protocols. Open chromatin regions (OCRs) are important epigenetic features of DNA that reflect cell types of origin. Profiling these features by DNase-seq, ATAC-seq, and histone ChIP-seq provides insights into tissue-specific and disease-specific regulatory mechanisms. Several studies in the area of cancer liquid biopsy integrate distinct genomic and epigenomic features for early cancer detection along with tissue-of-origin detection. However, multimodal analysis requires several types of experiments to cover the genomic and epigenomic aspects of a single sample, which leads to substantial cost and time. To overcome these limitations, predicting OCRs from whole genome sequencing (WGS) data is of particular importance. In this regard, we propose a computational approach for predicting open chromatin regions, as an important epigenetic feature, from cell-free DNA whole genome sequencing data. To fulfill this objective, local sequencing depth is fed to our proposed algorithm, which predicts the most probable open chromatin regions from whole genome sequencing data. Our method integrates signal processing with sequencing depth data and includes count normalization, Discrete Fourier Transform conversion, graph construction, graph cut optimization by linear programming, and clustering. 
To validate the proposed method, we compared the output of the clustering (open chromatin region+, open chromatin region-) with previously validated open chromatin regions from human blood samples in the ATAC-DB database. The overlap between predicted open chromatin regions and the experimentally validated regions obtained by ATAC-seq in ATAC-DB is greater than 67%, which indicates meaningful prediction. OCRs are mostly located at the transcription start sites (TSS) of genes. We therefore compared the predicted OCRs with the human gene TSS regions obtained from refTSS, which showed concordance of about 52.04% with all genes and ~78% with housekeeping genes. Accurately detecting open chromatin regions from plasma cell-free DNA-seq data is a very challenging computational problem due to the existence of several confounding factors, such as technical and biological variation. Although this approach is in its infancy, there has already been an attempt to apply it, which led to a tool named OCRDetector, with some restrictions such as the need for high-depth cfDNA WGS data, prior information about OCR distribution, and the consideration of multiple features. In contrast, we implemented graph signal clustering based on a single depth feature in an unsupervised learning manner, which resulted in faster performance and decent accuracy. Overall, we investigated the epigenomic pattern of a cell-free DNA sample from a new computational perspective that can be used along with other tools to investigate genetic and epigenetic aspects of a single whole genome sequencing dataset for efficient liquid biopsy-related analysis.
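The first two stages listed for the method, count normalization and Discrete Fourier Transform conversion of local sequencing depth, can be sketched in a few lines. This is only an illustration under assumptions: the abstract does not state how counts are normalized (scaling by the window mean is assumed here), the function names are hypothetical, and the later stages (graph construction, graph cut by linear programming, clustering) are not attempted.

```python
import cmath


def normalize_counts(depth):
    """Scale per-bin read depth by the window mean (assumed scheme)."""
    mean = sum(depth) / len(depth)
    if mean == 0:
        raise ValueError("window has no coverage")
    return [d / mean for d in depth]


def dft_magnitudes(signal):
    """Magnitude of each Discrete Fourier Transform coefficient."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]


# Toy depth profile alternating between low and high coverage bins.
depth = [10, 30, 10, 30, 10, 30, 10, 30]
normalized = normalize_counts(depth)
magnitudes = dft_magnitudes(normalized)
print([round(m, 6) for m in magnitudes])
```

For this alternating profile, all of the spectral energy sits in coefficient 0 (the mean level) and coefficient 4 (the two-bin periodicity), which shows how a frequency-domain representation can summarise a depth pattern compactly before any clustering step. The direct summation is quadratic in window length; an FFT routine would be the usual choice for real data.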

Keywords: open chromatin regions, cancer, cell-free DNA, epigenomics, graph signal processing, correlation clustering

Procedia PDF Downloads 110
222 Exploring Emerging Viruses From a Protected Reserve

Authors: Nemat Sokhandan Bashir

Abstract:

Threats from viruses to agricultural crops could be even larger than the losses caused by other pathogens because, in many cases, the viral infection is latent but crucial from an epidemic point of view. Wild vegetation can be a source of many viruses that eventually find their way into crop plants. Although often asymptomatic in wild plants due to adaptation, they can potentially cause serious losses in crops. Therefore, exploring viruses in wild vegetation is very important. Recently, omics approaches have been quite useful for exploring plant viruses from various plant sources, especially wild vegetation. For instance, we have discovered viruses such as Ambrosia asymptomatic virus 1 (AAV-1) through the application of metagenomics to samples from the Oklahoma Prairie Reserve. In this approach, extracts from randomly sampled plants are subjected to high-speed centrifugation and ultracentrifugation to separate virus-like particles (VLPs); nucleic acids in the form of DNA or RNA are then extracted from the VLPs by treatment with phenol-chloroform and subsequent precipitation with ethanol. The nucleic acid preparations are separately treated with RNase or DNase to determine the genome component of the VLPs. In the case of RNA, complementary DNA (cDNA) is synthesized before submission for DNA sequencing. For VLPs with DNA contents, the procedure is more straightforward because no cDNA synthesis is needed. Because the length of the nucleic acid content of VLPs can differ, various strategies are employed to achieve sequencing. Techniques similar to so-called "chromosome walking" may be used to obtain sequences of long segments. Once the nucleotide sequence data were obtained, they were subjected to BLAST analysis to determine the most closely related previously reported virus sequences. In one case, we identified the novel virus AAV-1, for which sequence comparison and analysis revealed that the reads were closest to the Indian citrus ringspot virus (ICRSV).
AAV-1 had an RNA genome 7408 nucleotides in length that contained six open reading frames (ORFs). Based on phylogenies inferred from the replicase and coat protein ORFs of the virus, it was placed in the genus Mandarivirus.

Keywords: wild, plant, novel, metagenomics

Procedia PDF Downloads 43
221 Microbial Bioproduction with Design of Metabolism and Enzyme Engineering

Authors: Tomokazu Shirai, Akihiko Kondo

Abstract:

Technologies of metabolic engineering or synthetic biology are essential for effective microbial bioproduction. It is especially important to develop an in silico tool for designing a metabolic pathway producing an unnatural and valuable chemical such as fossil materials of fuel or plastics. We here demonstrated two in silico tools for designing novel metabolic pathways: BioProV and HyMeP. Furthermore, we succeeded in creating an artificial metabolic pathway by enzyme engineering.

Keywords: bioinformatics, metabolic engineering, synthetic biology, genome scale model

Procedia PDF Downloads 311
220 Gene Expression Profiling of Iron-Related Genes of Pasteurella multocida Serotype A Strain PMTB2.1

Authors: Shagufta Jabeen, Faez Jesse Firdaus Abdullah, Zunita Zakaria, Nurulfiza Mat Isa, Yung Chie Tan, Wai Yan Yee, Abdul Rahman Omar

Abstract:

Pasteurella multocida is associated with acute as well as chronic infections in avian and bovine hosts, such as pasteurellosis and hemorrhagic septicemia (HS) in cattle and buffaloes. Iron is one of the most important nutrients for pathogenic bacteria including Pasteurella; it acts as a cofactor or prosthetic group in several essential enzymes and is needed for amino acid, pyrimidine, and DNA biosynthesis. In our recent study, we showed that 2% of the genome of Pasteurella multocida serotype A strain PMTB2.1 encodes iron-regulating genes (Accession number CP007205.1). Genome sequencing of other Pasteurella multocida serotypes, namely PM70 and HB01, also indicated that up to 2.5% of the respective genomes encode iron-regulating genes, suggesting that the Pasteurella multocida genome comprises multiple systems for iron uptake. Because more than 40 of the 2097 CDSs of P. multocida PMTB2.1 (approximately 2%) encode iron-regulated genes, the expression of four iron-regulating genes, namely fbpb, yfea, fece and fur, was profiled under an iron-restricted environment. The P. multocida strain PMTB2.1 was grown in broth with and without an iron-chelating agent, and samples were collected at different time points. The relative mRNA expression profile of these genes was determined using a TaqMan probe-based real-time PCR assay. Data analysis, normalization with two housekeeping genes, and quantification of fold changes were carried out using Bio-Rad CFX Manager software version 3.1. The results show that an iron-reduced environment has a significant effect on the expression profile of iron-regulating genes (p < 0.05) compared to the control (normal broth), and that the evaluated genes respond differently to iron reduction in the media. The highest relative fold change of the fece gene was observed at the early stage of treatment, indicating that PMTB2.1 may utilize its periplasmic protein at an early stage to acquire iron.
Furthermore, down-regulation of fece together with the elevated expression of other genes at later time points suggests that PMTB2.1 controls its iron requirements in response to iron availability by down-regulating the expression of iron proteins. Moreover, the significantly high relative fold change (p ≤ 0.05) of the fbpb gene is probably associated with the ability of P. multocida to directly use host iron complexes such as heme and hemoglobin. In addition, the significant increase (p ≤ 0.05) in fbpb and yfea expression also reflects the utilization of multiple iron systems in P. multocida strain PMTB2.1. These findings are important because the relative scarcity of free iron within hosts creates a major barrier to microbial growth inside the host, and the utilization of the outer-membrane protein system in iron acquisition probably occurs at an early stage of infection with P. multocida. In conclusion, the presence and utilization of multiple iron systems in P. multocida strain PMTB2.1 reveal the importance of iron in the survival of P. multocida.

Keywords: iron-related genes, real-time PCR, gene expression profiling, fold changes

Procedia PDF Downloads 416
219 Bean in Turkey: Characterization, Inter Gene Pool Hybridization Events, Breeding, Utilizations

Authors: Faheem Shahzad Baloch, Muhammad Azhar Nadeem, Muhammad Amjad Nawaz, Ephrem Habyarimana, Gonul Comertpay, Tolga Karakoy, Rustu Hatipoglu, Mehmet Zahit Yeken, Vahdettin Ciftci

Abstract:

Turkey is considered a bridge between Europe, Asia, and Africa and possibly played an important role in the distribution of many crops including common bean. Hundreds of common bean landraces can be found in Turkey, particularly in farmers’ fields, and they consistently contribute to the overall production. To investigate the existing genetic diversity and hybridization events between the Andean and Mesoamerican gene pools in the Turkish common bean, 188 common bean accessions (182 landraces and 6 modern cultivars as controls) were collected from 19 different Turkish geographic regions. These accessions were characterized using phenotypic data (growth habit and seed weight), geographic provenance, and 12557 high-quality whole-genome DArTseq markers; 3767 novel DArTseq loci were also identified. The clustering algorithms resolved the Turkish common bean landrace germplasm into the two recognized gene pools, the Mesoamerican and Andean gene pools. Hybridization events were observed in both gene pools (14.36% of the accessions) but mostly in the Mesoamerican (7.97% of the accessions), and their frequency was low relative to previous European studies. The lower level of hybridization indicates that Turkish common bean germplasm has remained closer to its original form than European germplasm. The Mesoamerican gene pool showed a higher level of diversity, while the Andean gene pool was predominant (56.91% of the accessions) but genetically less diverse and phenotypically purer, reflecting farmers’ greater preference for the Andean gene pool. We also found some genetically distinct landraces and, overall, a meaningful level of genetic variability that can be used by the scientific community in breeding efforts to develop superior common bean strains.

Keywords: bean germplasm, DArTseq markers, genotyping by sequencing, Turkey, whole genome diversity

Procedia PDF Downloads 208
218 Targeting Mre11 Nuclease Overcomes Platinum Resistance and Induces Synthetic Lethality in Platinum Sensitive XRCC1 Deficient Epithelial Ovarian Cancers

Authors: Adel Alblihy, Reem Ali, Mashael Algethami, Ahmed Shoqafi, Michael S. Toss, Juliette Brownlie, Natalie J. Tatum, Ian Hickson, Paloma Ordonez Moran, Anna Grabowska, Jennie N. Jeyapalan, Nigel P. Mongan, Emad A. Rakha, Srinivasan Madhusudan

Abstract:

Platinum resistance is a clinical challenge in ovarian cancer. Platinating agents induce DNA damage, which activates Mre11 nuclease-directed DNA damage signalling and response (DDR). Upregulation of DDR may promote chemotherapy resistance. Here we have comprehensively evaluated Mre11 in epithelial ovarian cancers. In a clinical cohort that received platinum-based chemotherapy (n=331), Mre11 protein overexpression was associated with an aggressive phenotype and poor progression-free survival (PFS) (p=0.002). In the ovarian cancer cohort of The Cancer Genome Atlas (TCGA) (n=498), Mre11 gene amplification was observed in a subset of serous tumours (5%) and correlated highly with Mre11 mRNA levels (p<0.0001). Altered Mre11 levels were linked with genome-wide alterations that can influence platinum sensitivity. At the transcriptomic level (n=1259), Mre11 overexpression was associated with poor PFS (p=0.003). ROC analysis showed an area under the curve (AUC) of 0.642 for response to platinum-based chemotherapy. Pre-clinically, Mre11 depletion by gene knockdown or blockade by a small-molecule inhibitor (Mirin) reversed platinum resistance in ovarian cancer cells and in 3D spheroid models. Importantly, Mre11 inhibition was synthetically lethal in platinum-sensitive XRCC1-deficient ovarian cancer cells and 3D spheroids. Selective cytotoxicity was associated with DNA double-strand break (DSB) accumulation, S-phase cell cycle arrest and increased apoptosis. We conclude that pharmaceutical development of Mre11 inhibitors is a viable clinical strategy for platinum sensitization and synthetic lethality in ovarian cancer.

Keywords: MRE11, XRCC1, ovarian cancer, platinum sensitization, synthetic lethality

Procedia PDF Downloads 94
217 The First Complete Mitochondrial Genome of Melon Thrips, Thrips palmi (Thripinae: Thysanoptera): Vector for Tospoviruses

Authors: Kaomud Tyagi, Rajasree Chakraborty, Shantanu Kundu, Devkant Singha, Kailash Chandra, Vikas Kumar

Abstract:

The melon thrips, Thrips palmi, is a serious pest of a wide range of agricultural crops and also acts as a vector for plant viruses (genus Tospovirus, family Bunyaviridae). More molecular data on this species are required to understand its cryptic speciation and evolutionary affiliations. Mitochondrial genomes have been widely used in phylogenetic and evolutionary studies in insects. So far, mitogenomes of five thrips species (Anaphothrips obscurus, Frankliniella intonsa, Frankliniella occidentalis, Scirtothrips dorsalis and Thrips imaginis) are available in the GenBank database. In this study, we sequenced the first complete mitogenome of T. palmi and compared it with available thrips mitogenomes. We assembled the mitogenome from whole genome sequencing data generated using Illumina HiSeq 2500. Annotation was performed using the MITOS web server to estimate the location of protein-coding genes (PCGs), transfer RNAs (tRNAs), ribosomal RNAs (rRNAs) and their secondary structures. The boundaries of PCGs and rRNAs were confirmed manually in NCBI. Phylogenetic analyses were performed on the 13 PCGs using maximum likelihood (ML) in PAUP and Bayesian inference (BI) in MrBayes 3.2. The complete mitogenome of T. palmi was 15,333 base pairs (bp), which was larger than the genomes of A. obscurus (14,890 bp), F. intonsa (15,215 bp), F. occidentalis (14,889 bp) and S. dorsalis South Asia strain (SA1) (14,283 bp), but smaller than the genomes of T. imaginis (15,407 bp) and S. dorsalis East Asia strain (EA1) (15,343 bp). As in other thrips species, the mitochondrial genome of T. palmi comprised 37 genes, including 13 PCGs, large and small ribosomal RNA (rrnL and rrnS) genes, and 22 transfer RNA (tRNA) genes (with one extra gene for trn-Serine), as well as two A+T-rich control regions (CR1 and CR2). Thirty-one genes were located on the heavy (H) strand and six genes on the light (L) strand.
Six tRNA genes (trnG, trnK, trnY, trnW, trnF, and trnH) were found to be conserved in all thrips mitogenomes in their locations relative to an upstream or downstream protein-coding or rRNA gene. The gene arrangement of T. palmi is very close to that of T. imaginis, except for rearrangements in tRNA genes: trnR (arginine) and trnE (glutamic acid), located between cox3 and CR2 in T. imaginis, were translocated between atp6 and CR1 in T. palmi; trnL1 (leucine) and trnS1 (serine), located between atp6 and CR1 in T. imaginis, were translocated between cox3 and CR2 in T. palmi. The location of CR1 upstream of the nad5 gene, suggested to be the ancestral condition of thrips species in the subfamily Thripinae, was also observed in T. palmi. Both the maximum likelihood (ML) and Bayesian inference (BI) phylogenetic trees had similar topologies. T. palmi clustered with T. imaginis. We conclude that more molecular data on diverse thrips species from different hierarchical levels are needed to understand the phylogenetic and evolutionary relationships among them.

Keywords: thrips, comparative mitogenomics, gene rearrangements, phylogenetic analysis

Procedia PDF Downloads 139
216 Identification of Candidate Congenital Heart Defects Biomarkers by Applying a Random Forest Approach on DNA Methylation Data

Authors: Kan Yu, Khui Hung Lee, Eben Afrifa-Yamoah, Jing Guo, Katrina Harrison, Jack Goldblatt, Nicholas Pachter, Jitian Xiao, Guicheng Brad Zhang

Abstract:

Background and Significance of the Study: Congenital Heart Defects (CHDs) are the most common malformation at birth and one of the leading causes of infant death. Although the exact etiology remains a significant challenge, epigenetic modifications, such as DNA methylation, are thought to contribute to the pathogenesis of congenital heart defects. At present, no DNA methylation biomarkers are used for early detection of CHDs. The existing CHD diagnostic techniques are time-consuming and costly and can only be used to diagnose CHDs after an infant is born. The present study employed a machine learning technique to analyse genome-wide methylation data in children with and without CHDs with the aim of finding methylation biomarkers for CHDs. Methods: The Illumina Human Methylation EPIC BeadChip was used to screen the genome-wide DNA methylation profiles of 24 infants diagnosed with congenital heart defects and 24 healthy infants without congenital heart defects. Primary pre-processing was conducted using the RnBeads and limma packages. The methylation levels of the top 600 genes with the lowest p-values were selected and further investigated using a random forest approach. ROC curves were used to analyse the sensitivity and specificity of each biomarker in both the training and test sample sets. The functions of selected genes with high sensitivity and specificity were then assessed in molecular processes. Major Findings of the Study: Three genes (MIR663, FGF3, and FAM64A) were identified from both the training and validation data by random forests, with an average sensitivity and specificity of 85% and 95%, respectively. GO analyses for the top 600 genes showed that these putative differentially methylated genes were primarily associated with regulation of lipid metabolic process, protein-containing complex localization, and Notch signalling pathway.
The present findings highlight that aberrant DNA methylation may play a significant role in the pathogenesis of congenital heart defects.
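
For readers less familiar with the two metrics reported in this abstract, the following Python sketch shows how sensitivity and specificity are computed from a confusion matrix. The counts are hypothetical values chosen for a 24-case, 24-control design like the one described; they are not results from the study.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of affected samples that the classifier flags as affected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of unaffected samples that the classifier leaves unflagged."""
    return true_neg / (true_neg + false_pos)


# Hypothetical confusion-matrix counts for 24 CHD cases and 24 controls
tp, fn = 20, 4   # cases correctly / incorrectly classified
tn, fp = 23, 1   # controls correctly / incorrectly classified

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
```

A ROC curve, as used in the abstract, traces these two quantities while the classification threshold is varied.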

Keywords: biomarker, congenital heart defects, DNA methylation, random forest

Procedia PDF Downloads 131
215 Phosphate Use Efficiency in Plants: A GWAS Approach to Identify the Pathways Involved

Authors: Azizah M. Nahari, Peter Doerner

Abstract:

Phosphate (Pi) is one of the essential macronutrients in plant growth and development, and it plays a central role in metabolic processes in plants, particularly photosynthesis and respiration. Limitation of crop productivity by Pi is widespread and is likely to increase in the future. Applications of Pi fertilizers have improved soil Pi fertility and crop production; however, they have also caused environmental damage. Therefore, in order to reduce dependence on unsustainable Pi fertilizers, a better understanding of phosphate use efficiency (PUE) is required for engineering nutrient-efficient crop plants. Enhanced Pi efficiency can be achieved by improved productivity per unit Pi taken up. We aim to identify, by using association mapping, general features of the most important loci that contribute to increased PUE to allow us to delineate the physiological pathways involved in defining this trait in the model plant Arabidopsis. As PUE is in part determined by the efficiency of uptake, we designed a hydroponic system to avoid confounding effects due to differences in root system architecture leading to differences in Pi uptake. In this system, 18 parental lines and 217 lines of the MAGIC population (Multiparent Advanced Generation Inter-Cross) were grown in high and low Pi availability conditions. The results revealed a large variation of PUE in the parental lines, indicating that the MAGIC population was well suited to identify PUE loci and pathways. Two of the 18 parental lines had the highest PUE in low Pi, while some lines responded strongly and increased PUE with increased Pi. Examination of the 217 MAGIC lines revealed considerable variance in PUE. A general feature was the trend of most lines to exhibit higher PUE when grown in low Pi conditions. Association mapping is currently in progress, but initial observations indicate that a wide variety of physiological processes are involved in influencing PUE in Arabidopsis.
The combination of hydroponic growth methods and genome-wide association mapping is a powerful tool to identify the physiological pathways underpinning complex quantitative traits in plants.

Keywords: hydroponic system growth, phosphate use efficiency (PUE), Genome-wide association mapping, MAGIC population

Procedia PDF Downloads 296
214 Wide Dissemination of CTX-M-Type Extended-Spectrum β-Lactamases in Korean Swine Farms

Authors: Young Ah Kim, Hyunsoo Kim, Eun-Jeong Yoon, Young Hee Seo, Kyungwon Lee

Abstract:

Extended-spectrum β-lactamase (ESBL)-producing Escherichia coli from food animals are considered a reservoir for transmission of ESBL genes to humans. The aim of this study was to assess the prevalence and molecular epidemiology of ESBL-producing E. coli colonization in pigs, farm workers, and farm environments to elucidate the transmission of multidrug-resistant clones from animals to humans. Nineteen pig farms across Korea were enrolled from August to December 2017. ESBL-producing E. coli isolates were screened for in 190 pigs, 38 farm workers, and 112 sites of farm environments using ChromID ESBL (bioMerieux, Marcy l'Etoile, France), directly (stool or perirectal swab) or after enrichment (sewage). Antimicrobial susceptibility tests were done with disk diffusion methods, and blaTEM, blaSHV, and blaCTX-M were detected with PCR and sequencing. The genomes of the four CTX-M-55-producing E. coli isolates from various sources in one farm were entirely sequenced to assess the relatedness of the strains. Whole genome sequencing (WGS) was performed with the PacBio RS II system (Pacific Biosciences, Menlo Park, CA, USA). ESBL genotypes were 85 CTX-M-1 group (one CTX-M-3, 23 CTX-M-15, one CTX-M-28, 59 CTX-M-55, one CTX-M-69) and 60 CTX-M-9 group (41 CTX-M-14, one CTX-M-17, one CTX-M-27, 13 CTX-M-65, 4 CTX-M-102) in a total of 145 isolates. The rectal colonization rates were 53.2% (101/190) in pigs and 39.5% (15/38) in farm workers. In WGS, sequence types (STs) were determined as ST69 (E. coli PJFH115 isolate from a human carrier), ST457 (two E. coli isolates, PJFE101 recovered from a fence and PJFA1104 from a pig) and ST5899 (E. coli PJFA173 isolate from the other pig). The four plasmids encoding CTX-M-55 (88,456 to 149,674 base pairs), whether of the IncFIB or IncFIC-IncFIB type, shared an IncF backbone furnishing the conjugal elements, suggesting that the genes originated from the same ancestor. In conclusion, the prevalence of ESBL-producing E. coli in swine farms was surprisingly high, and many of the isolates shared ESBL genotypes common among clinical isolates in Korea, such as CTX-M-14, 15, and 55. These genes could spread by horizontal transfer between isolates from different reservoirs (human-animal-environment).
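
The genotype tallies and colonization rates quoted in this abstract can be cross-checked with a few lines of Python. This is only an arithmetic sketch using the counts as reported; the variable and function names are illustrative.

```python
# Genotype counts as reported in the abstract
ctx_m_1_group = {"CTX-M-3": 1, "CTX-M-15": 23, "CTX-M-28": 1, "CTX-M-55": 59, "CTX-M-69": 1}
ctx_m_9_group = {"CTX-M-14": 41, "CTX-M-17": 1, "CTX-M-27": 1, "CTX-M-65": 13, "CTX-M-102": 4}


def rate_percent(positive, total):
    """Colonization rate as a percentage, rounded to one decimal place."""
    return round(100 * positive / total, 1)


group_1_total = sum(ctx_m_1_group.values())    # reported as 85
group_9_total = sum(ctx_m_9_group.values())    # reported as 60
all_isolates = group_1_total + group_9_total   # reported as 145

pig_rate = rate_percent(101, 190)     # reported as 53.2%
worker_rate = rate_percent(15, 38)    # reported as 39.5%
```

The per-genotype counts sum to the reported group totals, and both colonization rates match the stated fractions.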

Keywords: Escherichia coli, extended-spectrum β-lactamase, prevalence, whole genome sequencing

Procedia PDF Downloads 175
213 PARP1 Links Transcription of a Subset of RBL2-Dependent Genes with Cell Cycle Progression

Authors: Ewelina Wisnik, Zsolt Regdon, Kinga Chmielewska, Laszlo Virag, Agnieszka Robaszkiewicz

Abstract:

Apart from protecting the genome, PARP1 has been documented to regulate many intracellular processes, inter alia gene transcription, by physically interacting with chromatin-bound proteins and by ADP-ribosylating them. Our recent findings indicate that expression of PARP1 decreases during the differentiation of human CD34+ hematopoietic stem cells to monocytes as a consequence of differentiation-associated cell growth arrest and formation of the E2F4-RBL2-HDAC1-SWI/SNF repressive complex at the promoter of this gene. Since the RBL2 complexes repress genes in an E2F-dependent manner and are widespread in the genome in G0-arrested cells, we asked (a) if RBL2 directly contributes to defining monocyte phenotype and function by targeting gene promoters and (b) if RBL2 controls gene transcription indirectly by repressing PARP1. To identify genes controlled by RBL2 and/or PARP1, we used primer libraries for surface receptors and TLR signaling mediators; genes were silenced by siRNA or shRNA; gene promoter occupancy by selected proteins was analysed by ChIP-qPCR; statistical analysis was carried out in GraphPad Prism 5 and STATISTICA; and ChIP-Seq data were analysed in Galaxy 2.5.0.0. Of the 28 genes regulated by RBL2, we identified only four repressed solely by the RBL2-E2F4-HDAC1-BRM complex. Surprisingly, 24 of the 28 genes controlled by RBL2 were co-regulated by PARP1 in six different manners. In one mode of RBL2/PARP1 co-operation, represented by MAP2K6 and MAPK3, PARP1 was found to associate with gene promoters upon RBL2 silencing, which was previously shown to restore PARP1 expression in monocytes. The effect of PARP1 on gene transcription was observed only in the presence of active EP300, which acetylated gene promoters and activated transcription.
Further analysis revealed that PARP1 binding to the MAP2K6 and MAPK3 promoters enabled recruitment of EP300 in monocytes, while in proliferating cancer cell lines, which actively transcribe PARP1, this protein maintained EP300 at the promoters of MAP2K6 and MAPK3. Genome-wide analysis revealed a similar distribution of PARP1 and EP300 around transcription start sites and the co-occupancy of some gene promoters by PARP1 and EP300 in cancer cells. Here, we describe a new RBL2/PARP1/EP300 axis which controls gene transcription regardless of the cell type. In this model, cell cycle-dependent transcription of PARP1 regulates expression of some genes repressed by RBL2 upon cell cycle limitation. Thus, RBL2 may indirectly regulate transcription of some genes by controlling the expression of EP300-recruiting PARP1. Acknowledgement: This work was financed by Polish National Science Centre grants nr DEC-2013/11/D/NZ2/00033 and DEC-2015/19/N/NZ2/01735. L.V. is funded by the National Research, Development and Innovation Office grants GINOP-2.3.2-15-2016-00020 TUMORDNS, GINOP-2.3.2-15-2016-00048-STAYALIVE and OTKA K112336. AR is supported by Polish Ministry of Science and Higher Education 776/STYP/11/2016.

Keywords: retinoblastoma transcriptional co-repressor like 2 (RBL2), poly(ADP-ribose) polymerase 1 (PARP1), E1A binding protein p300 (EP300), monocytes

Procedia PDF Downloads 176
212 Deleterious SNP’s Detection Using Machine Learning

Authors: Hamza Zidoum

Abstract:

This paper investigates the impact of human genetic variation on the function of human proteins using machine-learning algorithms. Single-nucleotide polymorphisms represent the most common form of human genome variation. We focus on single amino-acid polymorphisms located in the coding region, as they can affect protein function and lead to pathologic phenotypic change. We use several supervised machine learning methods to identify structural properties correlated with an increased risk of a missense mutation being damaging. An SVM combined with Principal Component Analysis gives the best performance.

Keywords: single-nucleotide polymorphism, machine learning, feature selection, SVM

Procedia PDF Downloads 348
211 In Silico Screening, Identification and Validation of Cryptosporidium hominis Hypothetical Protein and Virtual Screening of Inhibitors as Therapeutics

Authors: Arpit Kumar Shrivastava, Subrat Kumar, Rajani Kanta Mohapatra, Priyadarshi Soumyaranjan Sahu

Abstract:

Computational approaches to predict structure, function and other biological characteristics of proteins are becoming more common in comparison to the traditional methods in drug discovery. Cryptosporidiosis is a major zoonotic diarrheal disease, particularly in children, which is caused primarily by Cryptosporidium hominis and Cryptosporidium parvum. Currently, there are no vaccines for cryptosporidiosis and the recommended drugs are not effective. With the availability of the complete genome sequence of C. hominis, new targets have been recognized for the development of effective and better drugs and/or vaccines. We identified a unique hypothetical epitopic protein in the C. hominis genome through BLASTP analysis. A 3D model of the hypothetical protein was generated using the I-TASSER server through threading methodology. The quality of the model was validated through a Ramachandran plot by the PROCHECK server. The functional annotation of the hypothetical protein through the DALI server revealed structural similarity with human Transportin 3. Phylogenetic analysis also showed that the C. hominis hypothetical protein (CUV04613) was closely related to the human transportin 3 protein. The 3D protein model was further subjected to a virtual screening study with inhibitors from the ZINC database using DOCK Blaster software. The docking study reported N-(3-chlorobenzyl) ethane-1,2-diamine as the best inhibitor in terms of docking score. Docking analysis elucidated that Leu 525, Ile 526, Glu 528, and Glu 529 are critical residues for ligand–receptor interactions. A molecular dynamics simulation was performed to assess the reliability of the binding pose of the inhibitor-protein complex using GROMACS software at a 10 ns time point. Trajectories were analyzed at each 2.5 ns time interval, among which H-bonds with LEU-525 and GLY-530 are significantly present in the MD trajectories. Furthermore, antigenic determinants of the protein were determined with the help of DNA Star software.
Our study findings showed a great potential in order to provide insights in the development of new drug(s) or vaccine(s) for control as well as prevention of cryptosporidiosis among humans and animals.

Keywords: cryptosporidium hominis, hypothetical protein, molecular docking, molecular dynamics simulation

Procedia PDF Downloads 340
210 Genomic Characterisation of Equine Sarcoid-derived Bovine Papillomavirus Type 1 and 2 Using Nanopore-Based Sequencing

Authors: Lien Gysens, Bert Vanmechelen, Maarten Haspeslagh, Piet Maes, Ann Martens

Abstract:

Bovine papillomavirus (BPV) types 1 and 2 play a central role in the etiology of the most common neoplasm in horses, the equine sarcoid. The unknown mechanism behind the unique variety in clinical presentation on the one hand and the host-dependent clinical outcome of BPV-1 infection on the other hand indicates the involvement of additional factors. Earlier studies have reported the potential functional significance of intratypic sequence variants, along with the existence of sarcoid-sourced BPV variants. Therefore, intratypic sequence variation seems to be an important emerging viral factor. This study aimed to give a broad insight into sarcoid-sourced BPV variation and explore its potential association with disease presentation. In order to do this, a nanopore sequencing approach was successfully optimized for screening a wide spectrum of clinical samples. Specimens of each tumour were initially screened for BPV-1/-2 by quantitative real-time PCR. A custom-designed primer set was used on BPV-positive samples to amplify the complete viral genome in two multiplex PCR reactions, resulting in a set of overlapping amplicons. For phylogenetic analysis, separate alignments were made of all available complete genome sequences for BPV-1/-2. The resulting alignments were used to infer Bayesian phylogenetic trees. We found substantial genetic variation among sarcoid-derived BPV-1, although this variation could not be linked to disease severity. Several of the BPV-1 genomes had multiple major deletions. Remarkably, the majority of them cluster within the region coding for late viral genes. Together with the extensiveness (up to 603 nucleotides) of the described deletions, this suggests an altered function of L1/L2 in disease pathogenesis.
By generating a significant amount of complete-length BPV genomes, we succeeded in introducing next-generation sequencing into veterinary research focusing on the equine sarcoid, thus facilitating the first report of both nanopore-based sequencing of complete sarcoid-sourced BPV-1/-2 and the simultaneous nanopore sequencing of multiple complete genomes originating from a single clinical sample.

Keywords: Bovine papillomavirus, equine sarcoid, horse, nanopore sequencing, phylogenetic analysis

Procedia PDF Downloads 149
209 Transcriptomic Analysis of Acanthamoeba castellanii Virulence Alteration by Epigenetic DNA Methylation

Authors: Yi-Hao Wong, Li-Li Chan, Chee-Onn Leong, Stephen Ambu, Joon-Wah Mak, Priyasashi Sahu

Abstract:

Background: Acanthamoeba is a genus of amoebae which lives as a free-living in nature or as a human pathogen that causes severe brain and eye infections. Virulence potential of Acanthamoeba is not constant and can change with growth conditions. DNA methylation, an epigenetic process which adds methyl groups to DNA, is used by eukaryotic cells, including several human parasites to control their gene expression. We used qPCR, siRNA gene silencing, and RNA sequencing (RNA-Seq) to study DNA-methyltransferase gene family (DNMT) in order to indicate the possibility of its involvement in programming Acanthamoeba virulence potential. Methods: A virulence-attenuated Acanthamoeba isolate (designation: ATCC; original isolate: ATCC 50492) was subjected to mouse passages to restore its pathogenicity; a virulence-reactivated isolate (designation: AC/5) was generated. Several established factors associated with Acanthamoeba virulence phenotype were examined to confirm the succession of reactivation process. Differential gene expression of DNMT between ATCC and AC/5 isolates was performed by qPCR. Silencing on DNMT gene expression in AC/5 isolate was achieved by siRNA duplex. Total RNAs extracted from ATCC, AC/5, and siRNA-treated (designation: si-146) were subjected to RNA-Seq for comparative transcriptomic analysis in order to identify the genome-wide effect of DNMT in regulating Acanthamoeba gene expression. qPCR was performed to validate the RNA-Seq results. Results: Physiological and cytophatic assays demonstrated an increased in virulence potential of AC/5 isolate after mouse passages. DNMT gene expression was significantly higher in AC/5 compared to ATCC isolate (p ≤ 0.01) by qPCR. si-146 duplex reduced DNMT gene expression in AC/5 isolate by 30%. 
Comparative transcriptome analysis identified differentially expressed genes: 3768 genes in AC/5 vs ATCC isolate, 2102 genes in si-146 vs AC/5 isolate, and 3422 genes in si-146 vs ATCC isolate (fold-change of ≥ 2 or ≤ 0.5, adjusted p-value (padj) < 0.05). Of the 2102 genes in si-146 vs AC/5 isolate, 840 were upregulated and 1262 were downregulated. Eukaryotic orthologous group (KOG) assignments revealed that a higher percentage of the genes downregulated in si-146 compared to AC/5 isolate were related to posttranslational modification, signal transduction, and energy production. Gene Ontology (GO) terms for those downregulated genes were associated with transport activity, oxidation-reduction process, and metabolic process. Among these downregulated genes were putative genes encoding heat shock proteins, transporters, ubiquitin-related proteins, proteins for vesicular trafficking (small GTPases), and oxidoreductases. Similar predicted proteins have been functionally described in other parasitic protozoa as contributing to survival and pathogenicity. Decreased expression of these genes in the si-146-treated isolate may account in part for the reduced pathogenicity of Acanthamoeba. qPCR on 6 selected genes upregulated in AC/5 compared to ATCC isolate corroborated the RNA sequencing findings, indicating good concordance between the two analyses. Conclusion: To the best of our knowledge, this study represents the first genome-wide analysis of DNA methylation and its effects on gene expression in Acanthamoeba spp. The present data indicate that DNA methylation has a substantial effect on global gene expression, allowing further dissection of the genome-wide effects of the DNA-methyltransferase gene in regulating Acanthamoeba pathogenicity.
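The differential-expression thresholds quoted in the abstract (fold-change of ≥ 2 or ≤ 0.5, padj < 0.05) can be illustrated with a minimal Python sketch. The record type `GeneResult`, the function names, and the example values are hypothetical and are not taken from the authors' pipeline; the sketch only assumes that each gene comes with a linear fold-change and an adjusted p-value.

```python
from collections import Counter
from typing import NamedTuple


class GeneResult(NamedTuple):
    """Hypothetical per-gene record: linear fold-change and adjusted p-value."""
    gene_id: str
    fold_change: float
    padj: float


def classify(result: GeneResult, fc_cutoff: float = 2.0,
             padj_cutoff: float = 0.05) -> str:
    """Label a gene as 'up', 'down', or 'not_significant'.

    Thresholds follow the abstract: fold-change >= 2 or <= 0.5, padj < 0.05.
    """
    if result.padj >= padj_cutoff:
        return "not_significant"
    if result.fold_change >= fc_cutoff:
        return "up"
    if result.fold_change <= 1.0 / fc_cutoff:
        return "down"
    return "not_significant"


def summarize(results: list[GeneResult]) -> Counter:
    """Count how many genes fall into each category."""
    return Counter(classify(r) for r in results)


# Illustrative values only; these are not data from the study.
example = [
    GeneResult("gene_a", 4.0, 0.001),
    GeneResult("gene_b", 0.25, 0.01),
    GeneResult("gene_c", 3.0, 0.2),
    GeneResult("gene_d", 1.5, 0.001),
]
counts = summarize(example)
```

Applied to a full comparison such as si-146 vs AC/5, the "up" and "down" counts from a filter of this kind would correspond to the 840 and 1262 genes reported above.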

Keywords: Acanthamoeba, DNA methylation, RNA sequencing, virulence

Procedia PDF Downloads 169
208 Prevalence and Mechanisms of Antibiotic Resistance in Escherichia coli Isolated from Mastitic Dairy Cattle in Canada

Authors: Satwik Majumder, Dongyun Jung, Jennifer Ronholm, Saji George

Abstract:

Bovine mastitis is the most common infectious disease in dairy cattle, with major economic implications for the dairy industry worldwide. Continuous monitoring for the emergence of antimicrobial resistance (AMR) among bacterial isolates from dairy farms is vital not only for animal husbandry but also for public health. In this study, the prevalence of AMR in 113 Escherichia coli isolates from cases of bovine clinical mastitis in Canada was investigated. The Kirby-Bauer disk diffusion test with 18 antibiotics and the microdilution method with three heavy metals (copper, zinc, and silver) were performed to determine antibiotic and heavy-metal susceptibility. Resistant strains were assessed for efflux and β-lactamase activities as well as for biofilm formation and hemolysis. Whole-genome sequences of the isolates were examined to detect the presence of genes corresponding to the observed AMR and virulence factors. Phenotypic analysis revealed that 32 isolates were resistant to one or more antibiotics, and 107 showed resistance against at least one heavy metal. Quinolones and silver were the most effective against the tested isolates. Among the AMR isolates, AcrAB-TolC efflux activity and β-lactamase enzyme activity were detected in 13 and 14 isolates, respectively. All isolates produced biofilm but with different capacities, and 33 isolates showed α-hemolysin activity. A positive correlation (Pearson r = +0.89) between efflux pump activity and quantity of biofilm was observed. Genes associated with aggregation, adhesion, cyclic di-GMP, and quorum sensing were detected in the AMR isolates, corroborating the phenotypic observations. This investigation showed the prevalence of AMR in E. coli isolates from bovine clinical mastitis. The results also suggest that antimicrobials with a single mode of action are inadequate to curtail AMR bacteria with multiple mechanisms of resistance and virulence factors.
Therefore, combinatorial therapy is called for to manage AMR infections effectively in dairy farms and to combat their potential transmission to the food supply chain through milk and dairy products.
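The Pearson correlation reported between efflux pump activity and biofilm quantity can be computed as in the following minimal Python sketch. The function name `pearson_r` and the input lists are illustrative assumptions; the measurements behind the reported r = +0.89 are not given in the abstract and are not reproduced here.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centred products; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired readings (arbitrary units), one pair per isolate.
efflux_activity = [1.0, 2.0, 3.0]
biofilm_quantity = [2.0, 4.0, 6.0]
r = pearson_r(efflux_activity, biofilm_quantity)
```

A value close to +1 indicates that isolates with higher efflux activity also tend to produce more biofilm; it does not by itself establish a causal link.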

Keywords: antimicrobial resistance, E. coli, bovine mastitis, antibiotics, heavy-metals, efflux pump, β-lactamase enzyme, biofilm, whole-genome sequencing

Procedia PDF Downloads 178
207 Towards End-To-End Disease Prediction from Raw Metagenomic Data

Authors: Maxence Queyrel, Edi Prifti, Alexandre Templier, Jean-Daniel Zucker

Abstract:

Analysis of the human microbiome using metagenomic sequencing data has proven highly effective at discriminating between various human diseases. Raw metagenomic sequencing data require multiple complex and computationally heavy bioinformatics steps prior to data analysis. Such data contain millions of short sequences read from fragmented DNA and stored as fastq files. Conventional processing pipelines consist of multiple steps, including quality control, filtering, and alignment of sequences against genomic catalogs (genes, species, taxonomic levels, functional pathways, etc.). These pipelines are complex to use, time-consuming, and rely on a large number of parameters that often introduce variability and affect the estimation of the microbiome elements. Training Deep Neural Networks directly from raw sequencing data is a promising approach to bypass some of the challenges associated with mainstream bioinformatics pipelines. Most of these methods use the concept of word and sentence embeddings to create a meaningful numerical representation of DNA sequences, while extracting features and reducing the dimensionality of the data. In this paper we present metagenome2vec, an end-to-end approach that classifies patients into disease groups directly from raw metagenomic reads. This approach is composed of four steps: (i) generating a vocabulary of k-mers and learning their numerical embeddings; (ii) learning DNA sequence (read) embeddings; (iii) identifying the genome from which the sequence is most likely to come; and (iv) training a multiple instance learning classifier which predicts the phenotype based on the vector representation of the raw data. An attention mechanism is applied in the network so that the model can be interpreted, assigning a weight to the influence of each genome on the prediction.
Using two public real-life datasets as well as a simulated one, we demonstrated that this original approach reaches high performance, comparable with state-of-the-art methods applied to data processed through mainstream bioinformatics workflows. These results are encouraging for this proof-of-concept work. We believe that with further development, the DNN models have the potential to surpass mainstream bioinformatics workflows in disease classification tasks.
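Step (i) of the approach, generating a vocabulary of k-mers, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the choice of k, and the toy reads are hypothetical, and the embedding learning, genome assignment, and multiple instance learning steps of metagenome2vec are not shown.

```python
from collections import Counter


def kmers(read: str, k: int) -> list[str]:
    """Return all overlapping substrings of length k in a read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_vocabulary(reads: list[str], k: int) -> dict[str, int]:
    """Map every k-mer observed in the reads to an integer index.

    Indices are assigned in lexicographic k-mer order so the mapping is
    deterministic; a real system might order by frequency instead.
    """
    counts: Counter = Counter()
    for read in reads:
        counts.update(kmers(read.upper(), k))
    return {kmer: idx for idx, kmer in enumerate(sorted(counts))}


def encode_read(read: str, vocab: dict[str, int], k: int) -> list[int]:
    """Turn a read into a sequence of vocabulary indices, skipping unknown k-mers."""
    return [vocab[km] for km in kmers(read.upper(), k) if km in vocab]


# Toy reads for illustration; real input would be parsed from fastq files.
toy_reads = ["ACGT", "CGTA"]
vocab = build_vocabulary(toy_reads, k=2)
encoded = encode_read("GTAC", vocab, k=2)
```

The resulting index sequences play the role of "sentences" of k-mer "words", which is the form of input that word-embedding methods expect.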

Keywords: deep learning, disease prediction, end-to-end machine learning, metagenomics, multiple instance learning, precision medicine

Procedia PDF Downloads 97
206 CRISPR/Cas9 Based Gene Stacking in Plants for Virus Resistance Using Site-Specific Recombinases

Authors: Sabin Aslam, Sultan Habibullah Khan, James G. Thomson, Abhaya M. Dandekar

Abstract:

Losses due to viral diseases pose a serious threat to crop production. The rapid breakdown of resistance to viruses like Cotton Leaf Curl Virus (CLCuV) demands an efficient technology to engineer durable resistance. Gene stacking has recently emerged as a potential approach for integrating multiple genes in crop plants. In the present study, recombinase technology has been used for site-specific gene stacking. A target vector (pG-Rec) was designed for engineering a predetermined specific site in the plant genome whereby genes can be stacked repeatedly. Using Agrobacterium-mediated transformation, pG-Rec was transformed into Coker-312 along with Nicotiana tabacum L. cv. Xanthi and Nicotiana benthamiana. The transgene analysis of target lines was conducted through junction PCR. The transgene-positive target lines were used for further transformations to site-specifically stack two genes of interest using Bxb1 and PhiC31 recombinases. In the first instance, Cas9 driven by multiplex gRNAs (for the Rep gene of CLCuV) was site-specifically integrated into the target lines, and integration was confirmed by junction PCR and real-time PCR. The resulting plants were subsequently used to stack the second gene of interest (AVP3 gene from Arabidopsis for enhancing cotton plant growth). The genes are added simultaneously with the removal of marker genes, which can then be recycled in the next round of gene stacking. Consequently, transgenic marker-free plants were produced with two genes stacked at the specific site. These transgenic plants can be potential germplasm to introduce resistance against various strains of cotton leaf curl virus (CLCuV) and abiotic stresses. The results of the research demonstrate gene stacking in crop plants, a technology that can be used to introduce multiple genes sequentially at predefined genomic sites.
The current climate change scenario highlights the need for such technologies, so that major environmental challenges can be tackled with several traits in a single step. After virus resistance has been evaluated in the resulting plants, the lines can serve as a starting point for stacking further genes in cotton for other traits, as well as for molecular breeding with elite cotton lines.

Keywords: cotton, CRISPR/Cas9, gene stacking, genome editing, recombinases

Procedia PDF Downloads 117