Search results for: small whole genome sequencing
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5433

5403 The Cleavage of DNA by the Anti-Tumor Drug Bleomycin at the Transcription Start Sites of Human Genes Using Genome-Wide Techniques

Authors: Vincent Murray

Abstract:

The glycopeptide bleomycin is used in the treatment of testicular cancer, Hodgkin's lymphoma, and squamous cell carcinoma. Bleomycin damages and cleaves DNA in human cells, and this is considered to be the main mode of action for bleomycin's anti-tumor activity. In particular, double-strand breaks are thought to be the main mechanism for the cellular toxicity of bleomycin. Using Illumina next-generation DNA sequencing techniques, the genome-wide sequence specificity of bleomycin-induced double-strand breaks was determined in human cells. The degree of bleomycin cleavage was also assessed at the transcription start sites (TSSs) of actively transcribed genes and compared with non-transcribed genes. It was observed that bleomycin preferentially cleaved at the TSSs of actively transcribed human genes. There was a correlation between the degree of this enhanced cleavage at TSSs and the level of transcriptional activity. Bleomycin cleavage is also affected by chromatin structure and at TSSs, the peaks of bleomycin cleavage were approximately 200 bp apart. This indicated that bleomycin was able to detect phased nucleosomes at the TSSs of actively transcribed human genes. The genome-wide cleavage pattern of the bleomycin analogues 6′-deoxy-BLM Z and zorbamycin was also investigated in human cells. As found for bleomycin, these bleomycin analogues also preferentially cleaved at the TSSs of actively transcribed human genes. The cytotoxicity (IC₅₀ values) of these bleomycin analogues was determined. It was found that the degree of enhanced cleavage at TSSs was inversely correlated with the IC₅₀ values of the bleomycin analogues. This suggested that the level of cleavage at the TSSs of actively transcribed human genes was important for the cytotoxicity of bleomycin and analogues. Hence this study provided a deeper understanding of the cellular processes involved in the cancer chemotherapeutic activity of bleomycin.

Keywords: anti-tumour activity, bleomycin analogues, chromatin structure, genome-wide study, Illumina DNA sequencing

Procedia PDF Downloads 100
5402 Genome Sequencing and Analysis of the Spontaneous Nanosilver Resistant Bacterium Proteus mirabilis Strain scdr1

Authors: Amr Saeb, Khalid Al-Rubeaan, Mohamed Abouelhoda, Manojkumar Selvaraju, Hamsa Tayeb

Abstract:

Background: P. mirabilis is a common uropathogenic bacterium that can cause major complications in patients with long-standing indwelling catheters or patients with urinary tract anomalies. In addition, P. mirabilis is a common cause of chronic osteomyelitis in diabetic foot ulcer (DFU) patients. Methodology: P. mirabilis SCDR1 was isolated from a diabetic ulcer patient. We examined the levels of resistance of P. mirabilis SCDR1 against nano-silver colloids, commercial nano-silver and silver-containing bandages, and commonly used antibiotics. We utilized next-generation sequencing (NGS) techniques, bioinformatics, phylogenetic analysis, and pathogenomics in the identification and characterization of the infectious pathogen. Results: P. mirabilis SCDR1 is a multi-drug resistant isolate that also showed high levels of resistance against nano-silver colloids, nano-silver chitosan composite, and the commercially available nano-silver and silver bandages. The P. mirabilis SCDR1 genome size is 3,815,621 bp with a G+C content of 38.44%. The P. mirabilis SCDR1 genome contains a total of 3,533 genes: 3,414 coding DNA sequence genes; 11, 10, and 18 rRNAs (5S, 16S, and 23S, respectively); and 76 tRNAs. Our isolate contains all the pathogenicity and virulence factors required to establish a successful infection. P. mirabilis SCDR1 is a potentially virulent pathogen that, despite its original isolation site (a wound), can establish kidney infection and its associated complications. P. mirabilis SCDR1 contains several mechanisms for antibiotic and metal resistance, including biofilm formation, swarming mobility, efflux systems, and enzymatic detoxification. Conclusion: P. mirabilis SCDR1 is a spontaneous nano-silver resistant bacterial strain. The P. mirabilis SCDR1 strain contains all reported pathogenic and virulence factors characteristic of the species. In addition, it possesses several mechanisms that may lead to the observed nano-silver resistance.
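
The G+C content reported above is a simple summary statistic of an assembled sequence. The following is a minimal sketch of how such a percentage can be computed; the function name and the toy sequence are illustrative assumptions and are not taken from the SCDR1 assembly or the authors' pipeline.

```python
def gc_content(sequence: str) -> float:
    """Return the G+C percentage of a DNA sequence, ignoring non-ACGT symbols."""
    seq = sequence.upper()
    # Count only unambiguous bases so that N or gap characters do not dilute the value.
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


# Toy sequence (hypothetical, not from the SCDR1 genome).
toy = "ATGCGCATTAAT"
print(f"G+C content: {gc_content(toy):.2f}%")
```

Applied to a full assembly, the same ratio would be taken over all contigs combined rather than over a single short string.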

Keywords: Proteus mirabilis, multi-drug resistance, silver nanoparticles, resistance, next generation sequencing techniques, genome analysis, bioinformatics, phylogeny, pathogenomics, diabetic foot ulcer, xenobiotics, multidrug resistance efflux, biofilm formation, swarming mobility, resistome, glutathione S-transferase, copper/silver efflux system, altruism

Procedia PDF Downloads 310
5401 CMPD: Cancer Mutant Proteome Database

Authors: Po-Jung Huang, Chi-Ching Lee, Bertrand Chin-Ming Tan, Yuan-Ming Yeh, Julie Lichieh Chu, Tin-Wen Chen, Cheng-Yang Lee, Ruei-Chi Gan, Hsuan Liu, Petrus Tang

Abstract:

Whole-exome sequencing, which focuses on the protein-coding regions of disease- and cancer-associated genes based on a priori knowledge, is the most cost-effective method to study the association between genetic alterations and disease. Recent advances in high-throughput sequencing technologies and proteomic techniques have provided an opportunity to integrate genomics and proteomics, allowing mutated peptides corresponding to mutated genes to be readily detected. Since sequence database search is the most widely used method for protein identification in mass spectrometry (MS)-based proteomics, a mutant proteome database is required to better approximate the real protein pool and improve the identification of disease-associated mutated proteins. Large-scale whole-exome/genome sequencing studies were launched by the National Cancer Institute (NCI), the Broad Institute, and The Cancer Genome Atlas (TCGA), which provide not only a comprehensive report on the analysis of coding variants in diverse samples and cell lines but also an invaluable resource for the wider research community. No existing database collects the mutant protein sequences related to the variants identified in these studies. CMPD is designed to address this issue, serving as a bridge between genomic data and proteomic studies and focusing on protein sequence-altering variations originating from both germline and cancer-associated somatic variations.
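
To illustrate why a mutant proteome database helps database search, the sketch below applies a single missense substitution to a protein and lists the tryptic peptides that exist only in the mutant form. It is a simplified illustration, not CMPD's actual procedure: the function names, the toy protein, and the simplified trypsin rule (cleave after K or R unless followed by P) are all assumptions.

```python
import re


def apply_missense(protein: str, position: int, ref: str, alt: str) -> str:
    """Return the protein with one amino-acid substitution (1-based position)."""
    found = protein[position - 1]
    if found != ref:
        raise ValueError(f"expected {ref} at position {position}, found {found}")
    return protein[:position - 1] + alt + protein[position:]


def tryptic_peptides(protein: str) -> list[str]:
    """In-silico digest: cleave after K or R unless the next residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


# Hypothetical wild-type protein and variant, for illustration only.
wild_type = "MKTAYIAKQRQISFVK"
mutant = apply_missense(wild_type, 5, "Y", "C")

# Peptides that a search against the wild-type database alone could not match.
mutant_only = set(tryptic_peptides(mutant)) - set(tryptic_peptides(wild_type))
print(sorted(mutant_only))
```

A spectrum produced by the mutated peptide has no counterpart in a reference-only database, which is the gap a mutant proteome database is meant to close.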

Keywords: TCGA, cancer, mutant, proteome

Procedia PDF Downloads 566
5400 Genodata: The Human Genome Variation Using BigData

Authors: Surabhi Maiti, Prajakta Tamhankar, Prachi Uttam Mehta

Abstract:

Since the completion of the Human Genome Project, there has been an unparalleled escalation in the sequencing of genomic data. This project has been the first major leap in the field of medical research, especially in genomics. This project won accolades by using a concept called Bigdata, which was earlier used extensively to gain value for business. Bigdata makes use of data sets that are generally in the form of files of terabyte, petabyte, or exabyte size; these data sets were traditionally managed using Excel sheets and RDBMS. The voluminous data made the process tedious and time-consuming, and hence a stronger framework called Hadoop was introduced in the field of genetic sciences to make data processing faster and more efficient. This paper focuses on using SPARK, which is gaining momentum with the advancement of BigData technologies. Cloud storage is an effective medium for storing the large data sets generated from genetic research and the result sets produced by SPARK analysis.

Keywords: human genome project, Bigdata, genomic data, SPARK, cloud storage, Hadoop

Procedia PDF Downloads 229
5399 Insights into the Annotated Genome Sequence of Defluviitoga tunisiensis L3 Isolated from a Thermophilic Rural Biogas Producing Plant

Authors: Irena Maus, Katharina Gabriella Cibis, Andreas Bremges, Yvonne Stolze, Geizecler Tomazetto, Daniel Wibberg, Helmut König, Alfred Pühler, Andreas Schlüter

Abstract:

Within the agricultural sector, the production of biogas from organic substrates represents an economically attractive technology to generate bioenergy. Complex consortia of microorganisms are responsible for biomass decomposition and biogas production. Recently, species belonging to the phylum Thermotogae were detected in thermophilic biogas-production plants utilizing renewable primary products for biomethanation. To analyze adaptive genome features of representative Thermotogae strains, Defluviitoga tunisiensis L3 was isolated from a rural thermophilic biogas plant (54°C) and completely sequenced on an Illumina MiSeq system. Sequencing and assembly of the D. tunisiensis L3 genome yielded a circular chromosome with a size of 2,053,097 bp and a mean GC content of 31.38%. Functional annotation of the complete genome sequence revealed that the thermophilic strain L3 encodes several genes predicted to facilitate growth of this microorganism on arabinose, galactose, maltose, mannose, fructose, raffinose, ribose, cellobiose, lactose, xylose, xylan, lactate and mannitol. Acetate, hydrogen (H2) and carbon dioxide (CO2) are presumed to be end products of the fermentation process. These end products are metabolites for methanogenic archaea, the key players in the final step of the anaerobic digestion process. To determine the degree of relatedness of dominant biogas community members within selected digester systems to D. tunisiensis L3, metagenome sequences from corresponding communities were mapped on the L3 genome. These fragment recruitments revealed that metagenome reads originating from a thermophilic biogas plant covered 95% of the D. tunisiensis L3 genome sequence. In conclusion, availability of the D. tunisiensis L3 genome sequence and insights into its metabolic capabilities provide the basis for biotechnological exploitation of genome features involved in thermophilic fermentation processes utilizing renewable primary products.
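
The coverage figure from fragment recruitment can be understood as the fraction of reference positions overlapped by at least one mapped read. The sketch below computes that fraction from mapped intervals; the interval representation (0-based, half-open) and the toy coordinates are assumptions made for illustration and do not reflect the tooling used in the study.

```python
def covered_fraction(intervals, genome_length):
    """Fraction of a genome covered by at least one (start, end) interval.

    Intervals are 0-based and half-open; overlapping or adjacent intervals
    are merged so that no position is counted twice.
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        # Clip each interval to the reference boundaries.
        start, end = max(start, 0), min(end, genome_length)
        if start >= end:
            continue
        if current_end is None or start > current_end:
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / genome_length


# Toy mapping on a hypothetical 100 bp reference.
reads = [(0, 50), (40, 80), (90, 100)]
print(f"covered: {covered_fraction(reads, 100):.0%}")
```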

Keywords: genome sequence, thermophilic biogas plant, Thermotogae, Defluviitoga tunisiensis

Procedia PDF Downloads 470
5398 Predictive Pathogen Biology: Genome-Based Prediction of Pathogenic Potential and Countermeasures Targets

Authors: Debjit Ray

Abstract:

Horizontal gene transfer (HGT) and recombination lead to the emergence of bacterial antibiotic resistance and pathogenic traits. HGT events can be identified by comparing a large number of fully sequenced genomes across a species or genus, which also makes it possible to define the phylogenetic range of HGT and find potential sources of new resistance genes. In-depth comparative phylogenomics can also identify subtle genome or plasmid structural changes or mutations associated with phenotypic changes. Comparative phylogenomics requires accurately sequenced, complete, and properly annotated genomes of the organism. Assembling closed genomes requires additional mate-pair reads or “long read” sequencing data to accompany short-read paired-end data. To bring down the cost and time required to produce assembled genomes and annotate genome features that inform drug resistance and pathogenicity, we are analyzing the performance for genome assembly of data from the Illumina NextSeq, which has faster throughput than the Illumina HiSeq (~1-2 days versus ~1 week), and shorter reads (150bp paired-end versus 300bp paired end) but higher capacity (150-400M reads per run versus ~5-15M) compared to the Illumina MiSeq. Bioinformatics improvements are also needed to make rapid, routine production of complete genomes a reality. Modern assemblers such as SPAdes 3.6.0 running on a standard Linux blade are capable of converting mixes of reads from different library preps into high-quality assemblies with only a few gaps in a few hours. Remaining breaks in scaffolds, which are generally due to repeats (e.g., rRNA genes), are addressed by our gap-closure software, which avoids custom PCR or targeted sequencing. Our goal is to improve the understanding of the emergence of pathogenesis using sequencing, comparative genomics, and machine learning analysis of ~1000 pathogen genomes.
Machine learning algorithms will be used to digest the diverse features (change in virulence genes, recombination, horizontal gene transfer, patient diagnostics). Temporal data and evolutionary models can thus determine whether the origin of a particular isolate is likely to have been from the environment (that is, whether it could have evolved from previous isolates). They can also be useful for comparing differences in virulence along or across the tree. More intriguingly, they can test whether there is a direction to virulence strength. This would open new avenues in the prediction of uncharacterized clinical bugs, multidrug resistance evolution, and pathogen emergence.

Keywords: genomics, pathogens, genome assembly, superbugs

Procedia PDF Downloads 175
5397 Metagenomics-Based Molecular Epidemiology of Viral Diseases

Authors: Vyacheslav Furtak, Merja Roivainen, Olga Mirochnichenko, Majid Laassri, Bella Bidzhieva, Tatiana Zagorodnyaya, Vladimir Chizhikov, Konstantin Chumakov

Abstract:

Molecular epidemiology and environmental surveillance are parts of a rational strategy to control infectious diseases. They have been widely used in the worldwide campaign to eradicate poliomyelitis, which otherwise would be complicated by the inability to rapidly respond to outbreaks and determine sources of the infection. The conventional scheme involves isolation of viruses from patients and the environment, followed by their identification by nucleotide sequence analysis to determine phylogenetic relationships. This is a tedious and time-consuming process that yields definitive results when it may be too late to implement countermeasures. Because of the difficulty of high-throughput full-genome sequencing, most such studies are conducted by sequencing only capsid genes or their parts. Therefore, important information about the contribution of other parts of the genome and of inter- and intra-species recombination to viral evolution is not captured. Here we propose a new approach based on the rapid concentration of sewage samples with tangential flow filtration followed by deep sequencing and reconstruction of nucleotide sequences of viruses present in the samples. The entire nucleic acid content of each sample is sequenced, thus preserving in digital format the complete spectrum of viruses. A set of rapid algorithms was developed to separate deep sequence reads into discrete populations corresponding to each virus and assemble them into full-length consensus contigs, as well as to generate a complete profile of sequence heterogeneities in each of them. This provides an effective approach to study molecular epidemiology and evolution of natural viral populations.
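
The idea of building a consensus sequence together with a profile of sequence heterogeneities can be sketched as a per-column tally over aligned reads. The example below is an illustration under strong assumptions (reads already aligned to a common coordinate system and padded with '-' where they do not cover a position); it is not the set of algorithms developed by the authors, and all names and data are hypothetical.

```python
from collections import Counter


def consensus_profile(aligned_reads):
    """Return (consensus, minor_fractions) for equal-length, pre-aligned reads.

    '-' marks positions a read does not cover and is ignored. The minor
    fraction at a position is the share of covering reads that disagree
    with the consensus base there.
    """
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    consensus, minor = [], []
    for column in zip(*aligned_reads):
        counts = Counter(base for base in column if base != "-")
        if not counts:
            # No read covers this position.
            consensus.append("N")
            minor.append(0.0)
            continue
        base, top = counts.most_common(1)[0]
        consensus.append(base)
        minor.append(1.0 - top / sum(counts.values()))
    return "".join(consensus), minor


# Toy alignment (hypothetical reads).
reads = ["ACGT", "ACGT", "ATGT", "AC-T"]
sequence, heterogeneity = consensus_profile(reads)
print(sequence, heterogeneity)
```

Positions with a non-zero minor fraction are the candidate heterogeneous sites; a real pipeline would also need to weigh base quality and sequencing error before interpreting them.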

Keywords: poliovirus, eradication, environmental surveillance, laboratory diagnosis

Procedia PDF Downloads 252
5396 Development of Microsatellite Markers for Dalmatian Pyrethrum Using Next-Generation Sequencing

Authors: Ante Turudic, Filip Varga, Zlatko Liber, Jernej Jakse, Zlatko Satovic, Ivan Radosavljevic, Martina Grdisa

Abstract:

Microsatellites (SSRs) are highly informative repetitive sequences of 2-6 base pairs, which are the most used molecular markers in assessing the genetic diversity of plant species. Dalmatian pyrethrum (Tanacetum cinerariifolium /Trevir./ Sch. Bip.) is an outcrossing diploid (2n = 18) endemic to the eastern Adriatic coast and a source of the natural insecticide pyrethrin. Due to the high repetitiveness and large size of the genome (haploid genome size of 9.58 pg), previous attempts to develop microsatellite markers using the standard methods were unsuccessful. A next-generation sequencing (NGS) approach was applied to genomic DNA extracted from fresh leaves of Dalmatian pyrethrum. The sequencing was conducted using an Illumina NovaSeq 6000 sequencer, after which almost 400 million high-quality paired-end reads were obtained, with a read length of 150 base pairs. Short reads were assembled by combining two approaches: (1) de novo assembly and (2) joining of overlapping paired-end reads. In total, 6,909,675 contigs were obtained, with an average contig length of 249 base pairs. Of the resulting contigs, 31,380 contained one or multiple microsatellite sequences; in total, 35,556 microsatellite loci were identified. Of the detected microsatellites, dinucleotide repeats were the most frequent, accounting for more than half of all microsatellites identified (21,212; 59.7%), followed by trinucleotide repeats (9,204; 25.9%). Tetra-, penta- and hexanucleotides had similar frequencies of 1,822 (5.1%), 1,472 (4.1%), and 1,846 (5.2%), respectively. Contigs containing microsatellites were further filtered by SSR pattern type, transposon occurrences, assembly characteristics, GC content, and the number of occurrences against the draft genome of T. cinerariifolium published previously. After the selection process, 50 microsatellite loci were used for primer design. Designed primers were tested on samples from five distinct populations, and 25 of them showed a high degree of polymorphism.
The selected loci were then genotyped on 20 samples belonging to one population, resulting in 17 microsatellite markers. Availability of codominant SSR markers will significantly improve the knowledge of population genetic diversity and structure as well as the complex genetics and biochemistry of this species. Acknowledgment: This work has been fully supported by the Croatian Science Foundation under the project ‘Genetic background of Dalmatian pyrethrum (Tanacetum cinerariifolium /Trevir/ Sch. Bip.) insecticidal potential’ - (PyrDiv) (IP-06-2016-9034).
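
Detecting perfect microsatellites with 2-6 bp motifs in contigs can be sketched with a back-referencing regular expression. The example below is only an illustration: the minimum repeat counts, the function names, and the toy contig are assumptions, and the abstract does not state which SSR search tool or thresholds were actually used.

```python
import re

# Minimum number of repeats per motif length (illustrative assumption).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(sequence: str):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = sequence.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # One motif copy captured, then at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if not is_primitive(motif):
                continue  # e.g. ATAT is reported as an AT repeat instead
            hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)


# Toy contig (hypothetical): an (AT)7 and a (GAA)5 repeat with flanking bases.
contig = "GGC" + "AT" * 7 + "CCG" + "GAA" * 5 + "TTC"
for start, motif, count in find_ssrs(contig):
    print(start, motif, count)
```

The primitive-motif check keeps a dinucleotide repeat from being counted again as a tetra- or hexanucleotide repeat, which matters when tabulating repeat classes as in the abstract.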

Keywords: genome assembly, NGS, SSR, Tanacetum cinerariifolium

Procedia PDF Downloads 104
5395 Applying Massively Parallel Sequencing to Forensic Soil Bacterial Profiling

Authors: Hui Li, Xueying Zhao, Ke Ma, Yu Cao, Fan Yang, Qingwen Xu, Wenbin Liu

Abstract:

Soil can often link a person or item to a crime scene, which makes it valuable evidence in forensic casework. Several techniques have been utilized in forensic soil discrimination in previous studies. Because soil contains a vast number of microorganisms, the analysis of soil microbiomes is expected to be a promising way to characterise soil evidence. In this study, we applied massively parallel sequencing (MPS) to soil bacterial profiling on the Ion Torrent Personal Genome Machine (PGM). Soils from different regions were collected repeatedly. The V3 and V4 regions of the bacterial 16S rRNA gene were sequenced by MPS. Operational taxonomic units (OTUs, 97%) were used to analyse soil bacteria. Several bioinformatics methods (PCoA, NMDS, Metastats, LEfSe, and heatmaps) were applied to the bacterial profiles. Our results demonstrate that MPS can provide a more detailed picture of soil microbiomes and that the bacterial composition of soils from different regions was distinctive. In conclusion, soil bacterial profiling via MPS of the 16S rRNA gene has potential value in characterising soil evidence and associating it with its place of origin, which can play an important role in forensic science in the future.
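
Grouping reads into OTUs at a 97% threshold can be pictured as greedy centroid clustering: each read joins the first existing cluster whose representative it matches at 97% identity or more, and otherwise founds a new cluster. The sketch below assumes equal-length, pre-aligned sequences and uses hypothetical names and data; the abstract does not say which OTU-picking software was used, and production tools handle alignment and abundance sorting that are omitted here.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(sequences, threshold=0.97):
    """Assign each sequence to the first centroid it matches at >= threshold."""
    centroids, clusters = [], []
    for seq in sequences:
        for index, centroid in enumerate(centroids):
            if identity(seq, centroid) >= threshold:
                clusters[index].append(seq)
                break
        else:
            # No centroid was close enough: this sequence founds a new OTU.
            centroids.append(seq)
            clusters.append([seq])
    return clusters


# Toy 100 bp reads (hypothetical): one close variant and one divergent read.
base = "ACGT" * 25
close_variant = "T" + base[1:]      # 1 mismatch, 99% identity
divergent = "TTTTT" + base[5:]      # 4 mismatches, 96% identity
otus = greedy_otus([base, close_variant, divergent])
print(len(otus), [len(cluster) for cluster in otus])
```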

Keywords: bacterial profiling, forensic, massively parallel sequencing, soil evidence

Procedia PDF Downloads 534
5394 Enzymatic Repair Prior To DNA Barcoding, Aspirations, and Restraints

Authors: Maxime Merheb, Rachel Matar

Abstract:

The retrieval of ancient DNA sequences, which in turn permits whole-genome sequencing from fossils, has improved extraordinarily in recent years thanks to sequencing technology and other methodological advances. Nevertheless, the search for ancient DNA is still obstructed by the damage that accumulates in DNA after the death of an organism. This damage falls into three main categories: (i) physical abnormalities such as strand breaks, which lead to the presence of short DNA fragments; (ii) modified bases (mainly cytosine deamination), which cause errors in the sequence through the incorporation of a false nucleotide during DNA amplification; and (iii) DNA modifications referred to as blocking lesions, which halt PCR extension and in turn affect the amplification and sequencing process. The issues arising from breakage and coding errors have decreased significantly in recent years. Fast sequencing of short DNA fragments was made possible by high-throughput sequencing platforms, and most coding errors were found to be consequences of cytosine deamination, which can be easily removed from the DNA using enzymatic treatment. The methodology to repair DNA sequences is still in development; it can basically be explained as the process of reintroducing cytosine in place of uracil. This technique is thus restricted to amplified DNA molecules. Eliminating every type of damage (particularly damage that blocks PCR) still awaits complete repair methodologies, so DNA detection right after extraction is highly needed. Before investing resources in extensive, unreasonable, and uncertain repair techniques, it is vital to distinguish between two possible hypotheses: (i) there is no DNA to be amplified to begin with, and the sample is therefore completely un-repairable; or (ii) the DNA is refractory to PCR and is worth repairing and amplifying.
Hence, it is extremely important to develop a non-enzymatic technique to detect the most degraded DNA.

Keywords: ancient DNA, DNA barcoding, enzymatic repair, PCR

Procedia PDF Downloads 382
5393 Re-Stating the Origin of Tetrapod Using Measures of Phylogenetic Support for Phylogenomic Data

Authors: Yunfeng Shan, Xiaoliang Wang, Youjun Zhou

Abstract:

Whole-genome data from two lungfish species, along with other species, present a valuable opportunity to re-investigate the longstanding debate regarding the evolutionary relationships among tetrapods, lungfishes, and coelacanths. However, the use of bootstrap support has become outdated for large-scale phylogenomic data. Without robust phylogenetic support, the phylogenetic trees become meaningless. Therefore, it is necessary to re-evaluate the phylogenies of tetrapods, lungfishes, and coelacanths using novel measures of phylogenetic support specifically designed for phylogenomic data, as the previous phylogenies were based on 100% bootstrap support. Our findings consistently provide strong evidence favoring lungfish as the closest living relative of tetrapods. This conclusion is based on high internode certainty, relative gene support, and high gene concordance factor. The evidence stems from five previous datasets derived from lungfish transcriptomes. These results yield fresh insights into the three hypotheses regarding the phylogenies of tetrapods, lungfishes, and coelacanths. Importantly, these hypotheses are not mere conjectures but are substantiated by a significant number of genes. Analyzing real biological data further demonstrates that the inclusion of additional taxa leads to more diverse tree topologies. Consequently, gene trees and species trees may not be identical even when whole-genome sequencing data is utilized. However, it is worth noting that many gene trees can accurately reflect the species tree if an appropriate number of taxa, typically ranging from six to ten, are sampled. Therefore, it is crucial to carefully select the number of taxa and an appropriate outgroup, such as slow-evolving species, while excluding fast-evolving taxa as outgroups to mitigate the adverse effects of long-branch attraction and achieve an accurate reconstruction of the species tree. 
This is particularly important as more whole-genome sequencing data becomes available.
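
A gene concordance factor can be read as the percentage of decisive gene trees that contain a given branch of the species tree. The sketch below is a deliberately simplified version that treats each gene tree as a collection of rooted clades rather than unrooted bipartitions or quartets; the function name, the data structure, and the toy gene trees are assumptions and do not reproduce the authors' datasets or software.

```python
def gene_concordance_factor(clade, gene_tree_clades):
    """Percentage of decisive gene trees that contain `clade` as a group.

    Each gene tree is a collection of clades (sets of taxon names). A tree
    is decisive only if it includes every taxon of the clade.
    """
    clade = frozenset(clade)
    decisive = concordant = 0
    for tree in gene_tree_clades:
        tree_clades = {frozenset(c) for c in tree}
        taxa = frozenset().union(*tree_clades) if tree_clades else frozenset()
        if not clade <= taxa:
            continue  # the gene is missing a taxon, so it cannot vote
        decisive += 1
        if clade in tree_clades:
            concordant += 1
    if decisive == 0:
        raise ValueError("no decisive gene trees")
    return 100.0 * concordant / decisive


# Toy gene trees (hypothetical) voting on three competing groupings.
ingroup = {"lungfish", "tetrapod", "coelacanth"}
trees = (
    [[{"lungfish", "tetrapod"}, ingroup]] * 3
    + [[{"coelacanth", "tetrapod"}, ingroup]]
    + [[{"lungfish", "coelacanth"}, ingroup]]
)
print(gene_concordance_factor({"lungfish", "tetrapod"}, trees))
```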

Keywords: novel measures of phylogenetic support for phylogenomic data, gene concordance factor confidence, relative gene support, internode certainty, origin of tetrapods

Procedia PDF Downloads 34
5392 Unifying RSV Evolutionary Dynamics and Epidemiology Through Phylodynamic Analyses

Authors: Lydia Tan, Philippe Lemey, Lieselot Houspie, Marco Viveen, Darren Martin, Frank Coenjaerts

Abstract:

Introduction: Human respiratory syncytial virus (hRSV) is the leading cause of severe respiratory tract infections in infants under the age of two. Genomic substitutions and related evolutionary dynamics of hRSV are of great influence on virus transmission behavior. The evolutionary patterns formed are due to a precarious interplay between the host immune response and RSV, thereby selecting the most viable and least immunogenic strains. Studying genomic profiles can teach us which genes and consequent proteins play an important role in RSV survival and transmission dynamics. Study design: In this study, genetic diversity and evolutionary rate analyses were conducted on 36 RSV subgroup B whole genome sequences and 37 subgroup A genome sequences. Clinical RSV isolates were obtained from nasopharyngeal aspirates and swabs of children between 2 weeks and 5 years of age. These strains were collected during epidemic seasons from 2001 to 2011 in the Netherlands and Belgium and sequenced by either conventional or 454 sequencing. Sequences were analyzed for genetic diversity, recombination events, synonymous/non-synonymous substitution ratios, and epistasis, and the translational consequences of mutations were mapped to known 3D protein structures. We used Bayesian statistical inference to estimate the rate of RSV genome evolution and the rate of variability across the genome. Results: The A and B profiles were described in detail and compared to each other. Overall, the majority of the whole RSV genome is highly conserved among all strains. The attachment protein G was the most variable protein, and its gene had, similar to the non-coding regions in RSV, two-fold higher substitution rates than other genes. In addition, the G gene has been identified as the major target for diversifying selection. Overall, less gene and protein variability was found within RSV-B compared to RSV-A, and most protein variation between the subgroups was found in the F, G, SH and M2-2 proteins.
For the F protein, mutations and correlated amino acid changes are largely located in the F2 ligand-binding domain. The small hydrophobic phosphoprotein and nucleoprotein are the most conserved proteins. The evolutionary rates were similar in both subgroups (A: 6.47E-04, B: 7.76E-04 substitution/site/yr), but estimates of the time to the most recent common ancestor were much lower for RSV-B (B: 19, A: 46.8 yrs), indicating that there is more turnover in this subgroup. Conclusion: This study provides a detailed description of whole RSV genome mutations, their effect on translation products, and the first estimate of the RSV genome evolution tempo. The immunogenic G protein seems to require high substitution rates in order to select less immunogenic strains, and other conserved proteins are most likely essential to preserve RSV viability. The resulting G gene variability makes its protein a less interesting target for RSV intervention methods. The more conserved RSV F protein, with less antigenic epitope shedding, is therefore more suitable for developing therapeutic strategies or vaccines.
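
The reported rates and ancestor ages can be combined into a rough back-of-the-envelope figure: under a strict molecular clock, two present-day lineages are expected to differ by about twice the rate multiplied by the time to their most recent common ancestor. The sketch below applies that formula to the values quoted in the abstract. The strict-clock assumption and the function name are illustrative; the study's Bayesian estimates may rest on a different clock model.

```python
def expected_pairwise_divergence(rate_per_site_per_year, tmrca_years):
    """Expected substitutions per site between two present-day lineages.

    Under a strict clock both lineages accumulate substitutions
    independently since the common ancestor, hence the factor of two.
    """
    return 2.0 * rate_per_site_per_year * tmrca_years


# Rates (substitutions/site/yr) and tMRCA (yrs) as quoted in the abstract.
estimates = {
    "RSV-A": (6.47e-4, 46.8),
    "RSV-B": (7.76e-4, 19.0),
}
for subgroup, (rate, tmrca) in estimates.items():
    divergence = expected_pairwise_divergence(rate, tmrca)
    print(f"{subgroup}: ~{divergence:.3f} substitutions/site")
```

Under these assumptions the deepest RSV-A lineages would differ at roughly 6% of sites and the deepest RSV-B lineages at roughly 3%, which is consistent with the lower variability reported within RSV-B.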

Keywords: drug target selection, epidemiology, respiratory syncytial virus, RSV

Procedia PDF Downloads 384
5391 High-Throughput Mechanized Microfluidic Test Groundwork for Precise Microbial Genomics

Authors: Pouya Karimi, Ramin Gasemi Shayan, Parsa Sheykhzade

Abstract:

Low-cost shotgun DNA sequencing is transforming the microbial sciences. Sequencing instruments are so effective that sample preparation is now the key limiting factor. Here, we present a microfluidic sample preparation platform that integrates the key steps in cells-to-sequencing-library sample preparation for up to 96 samples and reduces DNA input requirements 100-fold while maintaining or improving data quality. The general-purpose microarchitecture we demonstrate supports workflows with arbitrary numbers of reaction and clean-up or capture steps. By reducing the sample quantity requirements, we enabled low-input (∼10,000 cells) whole-genome shotgun (WGS) sequencing of Mycobacterium tuberculosis and soil micro-colonies with superior results. We also used the enhanced throughput to sequence ∼400 clinical Pseudomonas aeruginosa libraries and demonstrate excellent single-nucleotide polymorphism detection performance that explained phenotypically observed antibiotic resistance. Fully integrated lab-on-chip sample preparation overcomes technical barriers to enable broader deployment of genomics across many basic research and translational applications.

Keywords: clinical microbiology, DNA, microbiology, microbial genomics

Procedia PDF Downloads 102
5390 Measures of Phylogenetic Support for Phylogenomic and the Whole Genomes of Two Lungfish Restate Lungfish and Origin of Land Vertebrates

Authors: Yunfeng Shan, Xiaoliang Wang, Youjun Zhou

Abstract:

Whole-genome data from two lungfish species, along with other species, present a valuable opportunity to reassess the longstanding debate regarding the evolutionary relationships among tetrapods, lungfishes, and coelacanths. However, the use of bootstrap support has become outdated for large-scale phylogenomic data. Without robust phylogenetic support, the phylogenetic trees become meaningless. Therefore, it is necessary to re-evaluate the phylogenies of tetrapods, lungfishes, and coelacanths using novel measures of phylogenetic support specifically designed for phylogenomic data, as the previous phylogenies were based on 100% bootstrap support. Our findings consistently provide strong evidence favoring lungfish as the closest living relative of tetrapods. This conclusion is based on high gene support confidence with confidence intervals exceeding 95%, high internode certainty, and high gene concordance factor. The evidence stems from two datasets containing recently deciphered whole genomes of two lungfish species, as well as five previous datasets derived from lungfish transcriptomes. These results yield fresh insights into the three hypotheses regarding the phylogenies of tetrapods, lungfishes, and coelacanths. Importantly, these hypotheses are not mere conjectures but are substantiated by a significant number of genes. Analyzing real biological data further demonstrates that the inclusion of additional taxa diminishes the number of orthologues and leads to more diverse tree topologies. Consequently, gene trees and species trees may not be identical even when whole-genome sequencing data is utilized. However, it is worth noting that many gene trees can accurately reflect the species tree if an appropriate number of taxa, typically ranging from six to ten, are sampled. 
Therefore, it is crucial to carefully select the number of taxa and an appropriate outgroup while excluding fast-evolving taxa as outgroups to mitigate the adverse effects of long-branch attraction (LBA) and achieve an accurate reconstruction of the species tree. This is particularly important as more whole-genome sequencing data becomes available.
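
Internode certainty quantifies how strongly gene trees favor one bipartition over its most prevalent conflicting alternative. The sketch below assumes the commonly used two-bipartition, entropy-based definition, in which the value is 1 when there is no conflict and 0 when both alternatives are equally frequent; the function name and the example counts are hypothetical and do not come from the study.

```python
import math


def internode_certainty(support_count, conflict_count):
    """Internode certainty from the two most prevalent conflicting bipartitions.

    IC = 1 + p1*log2(p1) + p2*log2(p2), where p1 and p2 are the counts
    normalised to sum to one.
    """
    total = support_count + conflict_count
    if total == 0:
        raise ValueError("at least one gene tree must contain a bipartition")
    ic = 1.0
    for count in (support_count, conflict_count):
        p = count / total
        if p > 0:
            ic += p * math.log2(p)
    return ic


# Hypothetical counts of gene trees for a branch and its main conflict.
for support, conflict in [(100, 0), (90, 10), (50, 50)]:
    print(support, conflict, round(internode_certainty(support, conflict), 3))
```

Unlike a bootstrap percentage, this value drops quickly as soon as a sizeable minority of gene trees supports a conflicting grouping, which is why such measures are proposed for phylogenomic data.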

Keywords: gene support confidence (GSC), origin of land vertebrates, coelacanth, two whole genomes of lungfishes, confidence intervals

Procedia PDF Downloads 54
5389 Biotechnological Interventions for Crop Improvement in Nutricereal Pearl Millet

Authors: Supriya Ambawat, Subaran Singh, C. Tara Satyavathi, B. S. Rajpurohit, Ummed Singh, Balraj Singh

Abstract:

Pearl millet [Pennisetum glaucum (L.) R. Br.] is an important staple food of the arid and semiarid tropical regions of Asia, Africa, and Latin America. It is rightly termed a nutricereal as it has high nutritional value and is a good source of carbohydrate, protein, fat, ash, dietary fiber, potassium, magnesium, iron, zinc, etc. Pearl millet has a low prolamine fraction and is gluten free, which is useful for people with a gluten allergy. It has several health benefits like reduction in blood pressure, thyroid, diabetes, cardiovascular and celiac diseases, but its direct consumption as food has significantly declined due to several reasons. Keeping this in view, it is important to reorient the efforts to generate demand through value addition and quality improvement and to create awareness of the nutritional merits of pearl millet. In India, through the Indian Council of Agricultural Research-All India Coordinated Research Project on Pearl Millet, multilocational coordinated trials for developed hybrids were conducted at various centers. The gene banks of pearl millet contain varieties with high levels of iron and zinc, which were bred with high-yielding varieties to produce new pearl millet varieties with elevated iron levels. Thus, using breeding approaches and biochemical analysis, a total of 167 hybrids and 61 varieties were identified and released for cultivation in different agro-ecological zones of the country, including some biofortified hybrids rich in Fe and Zn. Further, several biotechnological interventions such as molecular markers, next-generation sequencing (NGS), association mapping, nested association mapping (NAM), MAGIC populations, genome editing, genotyping by sequencing (GBS), and genome-wide association studies (GWAS) have made advances in millet improvement possible by identifying and tagging genes underlying a trait in the genome. Using DArT markers, very high-density linkage maps were constructed for pearl millet.
Improved HHB67 has been released using marker-assisted selection (MAS) strategies, and genomic tools were used to identify Fe-Zn Quantitative Trait Loci (QTL). The draft genome sequence of millet has also opened various ways to explore pearl millet. Further, the genomic positions of simple sequence repeat (SSR) markers significantly associated with iron and zinc content in the consensus map are being identified, and research is in progress towards mapping QTLs for flour rancidity. The sequence information is being used to explore genes and enzymatic pathways responsible for rancidity of flour. Thus, development and application of several biotechnological approaches along with biofortification can accelerate the genetic gain targets for pearl millet improvement and help improve its quality.

Keywords: Biotechnological approaches, genomic tools, malnutrition, MAS, nutricereal, pearl millet, sequencing.

Procedia PDF Downloads 145
5388 Wide Dissemination of CTX-M-Type Extended-Spectrum β-Lactamases in Korean Swine Farms

Authors: Young Ah Kim, Hyunsoo Kim, Eun-Jeong Yoon, Young Hee Seo, Kyungwon Lee

Abstract:

Extended-spectrum β-lactamase (ESBL)-producing Escherichia coli from food animals are considered a reservoir for transmission of ESBL genes to humans. The aim of this study was to assess the prevalence and molecular epidemiology of ESBL-producing E. coli colonization in pigs, farm workers, and farm environments in order to elucidate the transmission of multidrug-resistant clones from animals to humans. Nineteen pig farms across Korea were enrolled from August to December 2017. Samples from 190 pigs, 38 farm workers, and 112 farm environment sites were screened for ESBL-producing E. coli using ChromID ESBL (bioMerieux, Marcy l'Etoile, France), either directly (stool or perirectal swab) or after enrichment (sewage). Antimicrobial susceptibility tests were done with disk diffusion methods, and blaTEM, blaSHV, and blaCTX-M were detected with PCR and sequencing. The genomes of four CTX-M-55-producing E. coli isolates from various sources in one farm were sequenced in full to assess the relatedness of the strains. Whole genome sequencing (WGS) was performed with the PacBio RS II system (Pacific Biosciences, Menlo Park, CA, USA). Among a total of 145 isolates, the ESBL genotypes were 85 CTX-M-1 group (one CTX-M-3, 23 CTX-M-15, one CTX-M-28, 59 CTX-M-55, one CTX-M-69) and 60 CTX-M-9 group (41 CTX-M-14, one CTX-M-17, one CTX-M-27, 13 CTX-M-65, 4 CTX-M-102). The rectal colonization rates were 53.2% (101/190) in pigs and 39.5% (15/38) in farm workers. By WGS, the sequence types (STs) were determined as ST69 (E. coli isolate PJFH115 from a human carrier), ST457 (two E. coli isolates, PJFE101 recovered from a fence and PJFA1104 from a pig), and ST5899 (E. coli isolate PJFA173 from the other pig). The four plasmids encoding CTX-M-55 (88,456 to 149,674 base pairs), whether they belonged to the IncFIB or the IncFIC-IncFIB type, shared an IncF backbone furnishing the conjugal elements, suggesting that the genes originated from the same ancestor. In conclusion, the prevalence of ESBL-producing E. coli in swine farms was surprisingly high, and many of the isolates shared ESBL genotypes common among clinical isolates in Korea, such as CTX-M-14, 15, and 55. These genes could spread by horizontal transfer between isolates from different reservoirs (human-animal-environment).

Keywords: Escherichia coli, extended-spectrum β-lactamase, prevalence, whole genome sequencing

Procedia PDF Downloads 178
5387 BingleSeq: A User-Friendly R Package for Single-Cell RNA-Seq Data Analysis

Authors: Quan Gu, Daniel Dimitrov

Abstract:

BingleSeq was developed as a Shiny-based, intuitive, and comprehensive application that enables the analysis of single-cell RNA sequencing count data. This was achieved by incorporating three state-of-the-art software packages for each type of RNA sequencing analysis, alongside functional annotation analysis and a way to assess the overlap of differential expression method results. In its current state, the functionality implemented within BingleSeq is comparable to that of other applications also developed with the purpose of lowering the entry requirements for RNA sequencing analyses. BingleSeq is available on GitHub and will be submitted to R/Bioconductor.

Keywords: bioinformatics, functional annotation analysis, single-cell RNA-sequencing, transcriptomics

Procedia PDF Downloads 170
5386 Transcriptomic Analysis of Acanthamoeba castellanii Virulence Alteration by Epigenetic DNA Methylation

Authors: Yi-Hao Wong, Li-Li Chan, Chee-Onn Leong, Stephen Ambu, Joon-Wah Mak, Priyasashi Sahu

Abstract:

Background: Acanthamoeba is a genus of amoebae that live freely in nature or as human pathogens that cause severe brain and eye infections. The virulence potential of Acanthamoeba is not constant and can change with growth conditions. DNA methylation, an epigenetic process which adds methyl groups to DNA, is used by eukaryotic cells, including several human parasites, to control their gene expression. We used qPCR, siRNA gene silencing, and RNA sequencing (RNA-Seq) to study the DNA-methyltransferase (DNMT) gene family in order to assess its possible involvement in programming Acanthamoeba virulence potential. Methods: A virulence-attenuated Acanthamoeba isolate (designation: ATCC; original isolate: ATCC 50492) was subjected to mouse passages to restore its pathogenicity, generating a virulence-reactivated isolate (designation: AC/5). Several established factors associated with the Acanthamoeba virulence phenotype were examined to confirm the success of the reactivation process. Differential expression of the DNMT gene between the ATCC and AC/5 isolates was measured by qPCR. Silencing of DNMT gene expression in the AC/5 isolate was achieved with an siRNA duplex. Total RNAs extracted from the ATCC, AC/5, and siRNA-treated (designation: si-146) isolates were subjected to RNA-Seq for comparative transcriptomic analysis in order to identify the genome-wide effect of DNMT in regulating Acanthamoeba gene expression. qPCR was performed to validate the RNA-Seq results. Results: Physiological and cytopathic assays demonstrated an increase in the virulence potential of the AC/5 isolate after mouse passages. DNMT gene expression was significantly higher in the AC/5 isolate than in the ATCC isolate (p ≤ 0.01) by qPCR. The si-146 duplex reduced DNMT gene expression in the AC/5 isolate by 30%. Comparative transcriptome analysis identified differentially expressed genes: 3768 genes in AC/5 vs ATCC isolate, 2102 genes in si-146 vs AC/5 isolate, and 3422 genes in si-146 vs ATCC isolate (fold-change of ≥ 2 or ≤ 0.5, adjusted p-value (padj) < 0.05). Of the 2102 genes in si-146 vs AC/5 isolate, 840 were upregulated and 1262 were downregulated. Eukaryotic orthologous group (KOG) assignments revealed that a higher percentage of the genes downregulated in si-146 compared to the AC/5 isolate were related to posttranslational modification, signal transduction, and energy production. Gene Ontology (GO) terms for those downregulated genes were associated with transport activity, oxidation-reduction process, and metabolic process. Among these downregulated genes were putative genes encoding heat shock proteins, transporters, ubiquitin-related proteins, proteins for vesicular trafficking (small GTPases), and oxidoreductases. Similar predicted proteins have been functionally described in other parasitic protozoa in relation to their survival and pathogenicity. Decreased expression of these genes in the si-146-treated isolate may account in part for the reduced pathogenicity of Acanthamoeba. qPCR on 6 selected genes upregulated in the AC/5 isolate compared to the ATCC isolate corroborated the RNA sequencing findings, indicating good concordance between the two analyses. Conclusion: To the best of our knowledge, this study represents the first genome-wide analysis of DNA methylation and its effects on gene expression in Acanthamoeba spp. The present data indicate that DNA methylation has a substantial effect on global gene expression, allowing further dissection of the genome-wide effects of the DNA-methyltransferase gene in regulating Acanthamoeba pathogenicity.
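
The thresholds quoted in this abstract (fold-change of ≥ 2 or ≤ 0.5, padj < 0.05) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `GeneResult` record, the function names, and the demo values are hypothetical and only show how such a filter is typically applied to a table of per-gene results.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    """Hypothetical per-gene record from a differential expression table."""
    gene_id: str
    fold_change: float  # expression ratio, e.g. si-146 / AC/5
    padj: float         # multiple-testing adjusted p-value


def classify(result, fc_up=2.0, fc_down=0.5, alpha=0.05):
    """Label a gene using the thresholds quoted in the abstract."""
    if result.padj >= alpha:
        return "not_significant"
    if result.fold_change >= fc_up:
        return "up"
    if result.fold_change <= fc_down:
        return "down"
    return "not_significant"


def summarize(results):
    """Count genes per category."""
    counts = {"up": 0, "down": 0, "not_significant": 0}
    for r in results:
        counts[classify(r)] += 1
    return counts


# Made-up values for illustration only; these are not data from the study.
demo = [
    GeneResult("gene_a", 3.1, 0.001),  # large increase, significant
    GeneResult("gene_b", 0.4, 0.020),  # large decrease, significant
    GeneResult("gene_c", 2.5, 0.200),  # large increase, not significant
    GeneResult("gene_d", 1.2, 0.001),  # significant but small change
]
print(summarize(demo))
```

Requiring both a fold-change threshold and an adjusted p-value threshold keeps genes whose change is statistically supported and large enough to be of biological interest.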

Keywords: Acanthamoeba, DNA methylation, RNA sequencing, virulence

Procedia PDF Downloads 170
5385 Exploring an Exome Target Capture Method for Cross-Species Population Genetic Studies

Authors: Benjamin A. Ha, Marco Morselli, Xinhui Paige Zhang, Elizabeth A. C. Heath-Heckman, Jonathan B. Puritz, David K. Jacobs

Abstract:

Next-generation sequencing has enhanced the ability to acquire massive amounts of sequence data to address classic population genetic questions for non-model organisms. Targeted approaches allow for cost-effective or more precise analyses of relevant sequences; however, many such techniques require a known genome, and it can be costly to purchase probes from a company. This is challenging for non-model organisms with no published genome and can be expensive for large population genetic studies. Expressed exome capture sequencing (EecSeq) synthesizes probes in the lab from expressed mRNA; the probes are used to capture and sequence the coding regions of genomic DNA from a pooled suite of samples. A normalization step produces probes to recover transcripts from a wide range of expression levels. This approach offers low-cost recovery of a broad range of genes in the genome. This research project expands on EecSeq to investigate whether mRNA from one taxon may be used to capture relevant sequences from a series of increasingly less closely related taxa. For this purpose, we propose to use the endangered Northern Tidewater goby, Eucyclogobius newberryi, a non-model organism that inhabits California coastal lagoons. mRNA will be extracted from E. newberryi to create probes and capture exomes from eight other taxa, including the more at-risk Southern Tidewater goby, E. kristinae, and more divergent species. Captured exomes will be sequenced, analyzed bioinformatically and phylogenetically, then compared to previously generated phylogenies across this group of gobies. This will provide an assessment of the utility of the technique in cross-species studies and for analyzing low genetic variation within species, as is the case for E. kristinae. This method could provide an economical way to expand population genetic and evolutionary biology studies of non-model organisms.

Keywords: coastal lagoons, endangered species, non-model organism, target capture method

Procedia PDF Downloads 166
5384 An Exploration of the Pancreatic Cancer miRNome during the Progression of the Disease

Authors: Barsha Saha, Shouvik Chakravarty, Sukanta Ray, Kshaunish Das, Nidhan K. Biswas, Srikanta Goswami

Abstract:

Pancreatic ductal adenocarcinoma (PDAC) is a well-recognised cause of cancer death with a five-year survival rate of about 9%, and its incidence in India has increased manifold in recent years. Due to delayed detection, this highly metastatic disease has a poor prognosis. Several molecular alterations happen during the progression of the disease from pre-cancerous conditions, and many such alterations could be investigated for their biomarker potential. MicroRNAs have been shown to be prognostic for PDAC patients in a variety of studies. We used NGS technologies to evaluate the role of small RNA changes during pancreatic cancer development from chronic pancreatitis. Plasma samples were collected from pancreatic cancer patients (n=16), chronic pancreatitis patients (n=8), and normal individuals (n=16). Pancreatic tumour tissue (n=5) and adjacent normal tissue samples (n=5) were also collected. Small RNAs were isolated from the plasma and tissue samples and then sequenced. We find that certain microRNAs are highly deregulated in pancreatic cancer patients in comparison to normal samples. A combinatorial analysis of plasma and tissue microRNAs and subsequent exploration of their targets and altered molecular pathways could not only identify potential biomarkers for disease diagnosis but also help to understand the underlying mechanism.

Keywords: small RNA sequencing, pancreatic cancer, biomarkers, tissue sample

Procedia PDF Downloads 70
5383 Clinical Impact of Ultra-Deep Versus Sanger Sequencing Detection of Minority Mutations on the HIV-1 Drug Resistance Genotype Interpretations after Virological Failure

Authors: S. Mohamed, D. Gonzalez, C. Sayada, P. Halfon

Abstract:

Drug resistance mutations are routinely detected using standard Sanger sequencing, which does not detect minor variants with a frequency below 20%. The impact of detecting minor variants generated by ultra-deep sequencing (UDS) on HIV drug-resistance (DR) interpretations has not yet been studied. Fifty HIV-1 patients who experienced virological failure were included in this retrospective study. The HIV-1 UDS protocol allowed the detection and quantification of HIV-1 protease and reverse transcriptase variants related to genotypes A, B, C, E, F, and G. DeepChek®-HIV simplified DR interpretation software was used to compare Sanger sequencing and UDS. The total time required for the UDS protocol was found to be approximately three times longer than Sanger sequencing with equivalent reagent costs. UDS detected all of the mutations found by population sequencing and identified additional resistance variants in all patients. An analysis of DR revealed a total of 643 and 224 clinically relevant mutations by UDS and Sanger sequencing, respectively. Three resistance mutations with > 20% prevalence were detected solely by UDS: A98S (23%), E138A (21%) and V179I (25%). A significant difference in the DR interpretations for 19 antiretroviral drugs was observed between the UDS and Sanger sequencing methods. Y181C and T215Y were the most frequent mutations associated with interpretation differences. A combination of UDS and DeepChek® software for the interpretation of DR results would help clinicians provide suitable treatments. A cut-off of 1% allowed a better characterisation of the viral population by identifying additional resistance mutations and improving the DR interpretation.

Keywords: HIV-1, ultra-deep sequencing, Sanger sequencing, drug resistance

Procedia PDF Downloads 308
5382 TAXAPRO, A Streamlined Pipeline to Analyze Shotgun Metagenomes

Authors: Sofia Sehli, Zainab El Ouafi, Casey Eddington, Soumaya Jbara, Kasambula Arthur Shem, Islam El Jaddaoui, Ayorinde Afolayan, Olaitan I. Awe, Allissa Dillman, Hassan Ghazal

Abstract:

The ability to promptly sequence whole genomes at a relatively low cost has revolutionized the way we study the microbiome. Microbiologists are no longer limited to studying what can be grown in a laboratory and instead are given the opportunity to rapidly identify the makeup of microbial communities in a wide variety of environments. Analyzing whole genome sequencing (WGS) data is a complex process that involves multiple moving parts and might be rather unintuitive for scientists that don’t typically work with this type of data. Thus, to help lower the barrier for less-computationally inclined individuals, TAXAPRO was developed at the first Omics Codeathon held virtually by the African Society for Bioinformatics and Computational Biology (ASBCB) in June 2021. TAXAPRO is an advanced metagenomics pipeline that accurately assembles organelle genomes from whole-genome sequencing data. TAXAPRO seamlessly combines WGS analysis tools to create a pipeline that automatically processes raw WGS data and presents organism abundance information in both a tabular and graphical format. TAXAPRO was evaluated using COVID-19 patient gut microbiome data. Analysis performed by TAXAPRO demonstrated a high abundance of Clostridia and Bacteroidia genera and a low abundance of Proteobacteria genera relative to others in the gut microbiome of patients hospitalized with COVID-19, consistent with the original findings derived using a different analysis methodology. This provides crucial evidence that the TAXAPRO workflow dispenses reliable organism abundance information overnight without the hassle of performing the analysis manually.

Keywords: metagenomics, shotgun metagenomic sequence analysis, COVID-19, pipeline, bioinformatics

Procedia PDF Downloads 182
5381 Electrochemical APEX for Genotyping MYH7 Gene: A Low Cost Strategy for Minisequencing of Disease Causing Mutations

Authors: Ahmed M. Debela, Mayreli Ortiz, Ciara K. O'Sullivan

Abstract:

The completion of the Human Genome Project (HGP) has paved the way for mapping the diversity in the overall genome sequence, which helps to understand the genetic causes of inherited diseases and susceptibility to drugs or environmental toxins. Arrayed primer extension (APEX) is a microarray-based minisequencing strategy for screening disease-causing mutations. It is derived from Sanger DNA sequencing and uses fluorescently labelled dideoxynucleotides (ddNTPs) to terminate a growing DNA strand extended from a primer whose 3' end is designed to lie immediately upstream of a site where a single nucleotide polymorphism (SNP) occurs. The use of DNA polymerase gives APEX very high accuracy and specificity, which in turn makes it a method of choice for multiplex SNP detection. Coupling the high specificity of this method with the high sensitivity, low cost, and compatibility with miniaturization of electrochemical techniques would offer an excellent platform for the detection of mutations as well as the sequencing of DNA templates. We are developing an electrochemical APEX for the analysis of SNPs found in the MYH7 gene in a group of cardiomyopathy patients. ddNTPs were labelled with four different redox-active compounds with four distinct potentials. Thiolated oligonucleotide probes were immobilised on gold and glassy carbon substrates, followed by hybridisation with complementary target DNA immediately adjacent to the base to be extended by the polymerase. Electrochemical interrogation was performed after incorporation of the redox-labelled dideoxynucleotide. The work involved the synthesis and characterisation of the redox-labelled ddNTPs, and the optimisation and characterisation of surface functionalisation strategies and nucleotide incorporation assays.

Keywords: array based primer extension, labelled ddNTPs, electrochemical, mutations

Procedia PDF Downloads 222
5380 Difference in Virulence Factor Genes Between Transient and Persistent Streptococcus uberis Intramammary Infection in Dairy Cattle

Authors: Anyaphat Srithanasuwan, Noppason Pangprasit, Montira Intanon, Phongsakorn Chuammitri, Witaya Suriyasathaporn, Ynte H. Schukken

Abstract:

Streptococcus uberis is one of the most common mastitis-causing pathogens, with a wide range of intramammary infection (IMI) durations and pathogenicity. This study aimed to compare shared or unique virulence factor gene clusters distinguishing persistent and transient strains of S. uberis. A total of 139 S. uberis strains were isolated from three small-holder dairy herds with a high prevalence of S. uberis mastitis. The duration of IMI was used to categorize the bacteria into two groups: transient strains, with an IMI duration of less than 1 month, and persistent strains, with an IMI duration of longer than 2 months. Six representative S. uberis strains, three from each group (transient and persistent), were selected for analysis. All transient strains exhibited multi-locus sequence types (MLST), indicating a highly diverse population of transient S. uberis. In contrast, the MLST of the persistent strains was available in an online database (pubMLST). Identification of virulence genes was performed using whole-genome sequencing (WGS) data. Differences in genome size and in the number of virulence genes were found. For example, the BCA gene (alpha-C protein), which is important for attachment and invasion, and the genes associated with capsule formation (hasAB), which are important for evasion of antimicrobial mechanisms and for persistent survival, were found in the persistent strains. These findings suggest a genetic-level difference between the two strain types. Consequently, a comprehensive study of the 139 S. uberis isolates will be conducted to perform an in-depth genetic assessment through WGS analysis on an Illumina platform.

Keywords: Streptococcus uberis, mastitis, whole genome sequence, intramammary infection, persistent S. uberis, transient S. uberis

Procedia PDF Downloads 29
5379 Predicting Open Chromatin Regions in Cell-Free DNA Whole Genome Sequencing Data by Correlation Clustering

Authors: Fahimeh Palizban, Farshad Noravesh, Amir Hossein Saeidian, Mahya Mehrmohamadi

Abstract:

In the past decade, the emergence of liquid biopsy has significantly improved cancer monitoring and detection. Dying cells, including those originating from tumors, shed their DNA into the blood and contribute to a pool of circulating fragments called cell-free DNA (cfDNA). Accordingly, identifying the tissue of origin of these DNA fragments from the plasma can result in faster and more accurate disease diagnosis and more precise treatment protocols. Open chromatin regions (OCRs) are important epigenetic features of DNA that reflect the cell types of origin. Profiling these features by DNase-seq, ATAC-seq, and histone ChIP-seq provides insights into tissue-specific and disease-specific regulatory mechanisms. Several studies in the area of cancer liquid biopsy have integrated distinct genomic and epigenomic features for early cancer detection along with tissue-of-origin detection. However, multimodal analysis requires several types of experiments to cover the genomic and epigenomic aspects of a single sample, which leads to considerable cost and time. To overcome these limitations, predicting OCRs from whole genome sequencing (WGS) data is of particular importance. In this regard, we propose a computational approach for predicting open chromatin regions, as an important epigenetic feature, from cell-free DNA whole genome sequence data. To this end, local sequencing depth is fed to our proposed algorithm, which predicts the most probable open chromatin regions from the whole genome sequencing data. Our method applies signal processing to sequencing depth data and includes count normalization, discrete Fourier transform conversion, graph construction, graph cut optimization by linear programming, and clustering. To validate the proposed method, we compared the output of the clustering (open chromatin region+, open chromatin region-) with previously validated open chromatin regions for human blood samples in the ATAC-DB database. The percentage of overlap between the predicted open chromatin regions and the experimentally validated regions obtained by ATAC-seq in ATAC-DB is greater than 67%, which indicates meaningful prediction. OCRs are mostly located at the transcription start sites (TSSs) of genes. We therefore compared the predicted OCRs with the TSS regions of human genes obtained from refTSS; the concordance was 52.04% for all genes and ~78% for housekeeping genes. Accurately detecting open chromatin regions from plasma cell-free DNA-seq data is a very challenging computational problem due to several confounding factors, such as technical and biological variation. Although this approach is in its infancy, there has already been an attempt to apply it, which led to a tool named OCRDetector; that tool has some restrictions, such as the need for high-depth cfDNA WGS data, prior information about the distribution of OCRs, and the consideration of multiple features. In contrast, we implemented graph signal clustering based on a single depth feature in an unsupervised learning manner, which resulted in faster performance and decent accuracy. Overall, we investigated the epigenomic pattern of a cell-free DNA sample from a new computational perspective that can be used along with other tools to investigate the genetic and epigenetic aspects of a single whole genome sequencing dataset for efficient liquid biopsy-related analysis.
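
The abstract reports an overlap percentage between predicted and experimentally validated regions but does not state how overlap was counted. The Python sketch below shows one common convention, under stated assumptions: regions are half-open `(start, end)` intervals on a single chromosome, and a predicted region counts as a hit if it shares at least one base with any validated region. The function names and coordinates are hypothetical.

```python
def intervals_overlap(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) share a base."""
    return a[0] < b[1] and b[0] < a[1]


def percent_overlapping(predicted, validated):
    """Percentage of predicted regions that overlap at least one validated region.

    Assumption: all intervals lie on the same chromosome. A simple pairwise
    scan is used for clarity; an interval tree or sorted sweep would be
    preferable for genome-scale inputs.
    """
    if not predicted:
        return 0.0
    hits = sum(
        1 for p in predicted
        if any(intervals_overlap(p, v) for v in validated)
    )
    return 100.0 * hits / len(predicted)


# Made-up coordinates for illustration only.
predicted_ocrs = [(100, 200), (300, 400), (500, 600)]
validated_ocrs = [(150, 250), (590, 700)]
print(round(percent_overlapping(predicted_ocrs, validated_ocrs), 2))
```

Other conventions, such as requiring a minimum fraction of reciprocal overlap, would give different percentages, so the counting rule should be reported alongside the figure.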

Keywords: open chromatin regions, cancer, cell-free DNA, epigenomics, graph signal processing, correlation clustering

Procedia PDF Downloads 117
5378 Isolation and Molecular Characterization of Lytic Bacteriophage against Carbapenem Resistant Klebsiella pneumoniae

Authors: Guna Raj Dhungana, Roshan Nepal, Apshara Parajuli, Archana Maharjan, Shyam K. Mishra, Pramod Aryal, Rajani Malla

Abstract:

Introduction: Klebsiella pneumoniae is a well-known opportunistic human pathogen, primarily causing healthcare-associated infections. The global emergence of carbapenemase-producing K. pneumoniae, which is often extensively multidrug resistant, is a major public health burden. Thus, because these ‘superbugs’ are difficult to treat and pose a menace that some term the ‘apocalypse’ of the post-antibiotic era, an alternative approach to controlling this pathogen is prudent, and one such approach is phage-mediated control and/or treatment. Objective: In this study, we aimed to isolate novel bacteriophages against carbapenemase-producing K. pneumoniae and characterize them for potential use in phage therapy. Material and Methods: Twenty lytic phages were isolated from river water using the double-layer agar assay and purified. The biological features, physicochemical characteristics, burst size, host specificity, and activity spectrum of the phages were determined. The most potent phage, TU_Kle10O, was selected and characterized by electron microscopy. The whole genome sequence of the phage was analyzed for the presence or absence of virulence factors and lysin genes. Results: The novel phage TU_Kle10O showed a broad host range within its own genus and did not induce any BIM up to the 5th generation of the host’s life cycle. Electron microscopy confirmed that the phage was tailed and belonged to the order Caudovirales. Next-generation sequencing revealed its genome to be 166.2 kb. Bioinformatic analysis further confirmed that the phage genome did not contain any bacterial genes, which ruled out the concern about transfer of virulence genes. A specific ‘lysin’ enzyme was identified in the phage, which could be used as an ‘antibiotic’. Conclusion: Extensively multidrug-resistant bacteria like carbapenemase-producing K. pneumoniae could be treated efficiently by phages. The absence of virulence genes of bacterial origin and the presence of lysin proteins within the phage genome make phages excellent candidates for therapeutics.

Keywords: bacteriophage, Klebsiella pneumoniae, MDR, phage therapy, carbapenemase

Procedia PDF Downloads 158
5377 A Pipeline for Detecting Copy Number Variation from Whole Exome Sequencing Using Comprehensive Tools

Authors: Cheng-Yang Lee, Petrus Tang, Tzu-Hao Chang

Abstract:

Copy number variations (CNVs) play an important role in many kinds of human diseases, such as autism, schizophrenia, and a number of cancers. Many disease-associated variants are found in the coding regions of the genome, and whole exome sequencing (WES) is a cost-effective and powerful technology for detecting variants that are enriched in exons, with potential applications in clinical settings. Although several algorithms have been developed to detect CNVs using WES, and each has been compared with other algorithms on its developers' own samples to find the most suitable method, there have been no consistent datasets across most algorithms for evaluating CNV detection ability. In addition, most of the algorithms use a command line interface, which may greatly limit the analysis capability of many laboratories. We created a series of simulated WES datasets from UCSC hg19 chromosome 22 and then evaluated the CNV detection ability of 19 algorithms from the OMICtools database using these simulated datasets. We computed the sensitivity, specificity, and accuracy of each algorithm to validate the exome-derived CNVs. After comparing the 19 algorithms, we constructed a platform that installs all of the algorithms in a virtual machine, such as VirtualBox, which can be set up conveniently on local computers, and created a simple, easy-to-use script for detecting CNVs with the algorithms selected by the user. We also built a table describing properties of each algorithm, such as input requirements and CNV detection ability, to give users a specification for choosing the optimum algorithms.
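
The three evaluation measures named in this abstract follow from the counts of true and false calls against the simulated ground truth. A minimal Python sketch using the standard definitions is given below; the function name and the example counts are hypothetical and are not results from the study.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp: true CNV regions that were called
    fp: non-CNV regions that were called
    tn: non-CNV regions that were not called
    fn: true CNV regions that were missed
    A metric is returned as None when its denominator is zero.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    sensitivity = ratio(tp, tp + fn)          # fraction of true CNVs found
    specificity = ratio(tn, tn + fp)          # fraction of non-CNVs left uncalled
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Made-up counts for illustration only.
sens, spec, acc = detection_metrics(tp=8, fp=2, tn=85, fn=5)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

Because simulated datasets usually contain far more non-CNV than CNV regions, accuracy alone can look high even when sensitivity is poor, which is one reason to report all three measures.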

Keywords: whole exome sequencing, copy number variations, omictools, pipeline

Procedia PDF Downloads 288
5376 First Attempts Using High-Throughput Sequencing in Senecio from the Andes

Authors: L. Salomon, P. Sklenar

Abstract:

The Andes hold the highest plant species diversity in the world. How this occurred is one of the most intriguing questions in studies addressing the origin and patterning of plant diversity worldwide. Recently, the explosive adaptive radiations found in high Andean groups have been pointed to as triggers of this spectacular diversity. The Andes is the most species-rich area for Senecio, the largest genus in the Asteraceae family. There, the genus presents an incredible diversity of species, striking growth form variation, and a large niche span. Although some studies have tried to disentangle the evolutionary history of some Andean species of Senecio, they obtained partially resolved and poorly supported phylogenies, as expected for recently radiated groups. High-throughput sequencing (HTS) approaches have proved to be a powerful tool for answering phylogenetic questions in groups whose evolutionary histories are recent and for which traditional techniques like Sanger sequencing are not informative enough. Although these tools have been used to understand the evolution of an increasing number of Andean groups, they have not yet been applied to Senecio. This project aims to contribute to a better knowledge of the mechanisms shaping the hyperdiversity of Senecio in the Andean region, using HTS and focusing on the recently recircumscribed Senecio ser. Culcitium (Asteraceae). The first aim is to reconstruct a highly resolved and supported phylogeny, and the second is to assess the role of allopatric differentiation, hybridization, and genome duplication in the diversification of the group. Using the Hyb-Seq approach, which combines target enrichment using Asteraceae COS loci baits with genome skimming, more than 100 new accessions were generated. The HybPhyloMaker and HybPiper pipelines were used for the phylogenetic analyses, and another pipeline in development (Paralogue Wizard) was used to deal with paralogues. RAxML was used to generate gene trees and Astral for species tree reconstruction. PhyParts was used as a first step to explore gene tree discordance along the clades. Fully resolved trees with moderate support were obtained, showing Senecio ser. Culcitium as monophyletic. Within the group, some species formed well-supported clades with morphologically related species, while other species appear not to have exclusive ancestry, in concordance with previous studies using amplified fragment length polymorphism (AFLP) that showed geographical differentiation. Discordance between gene trees was detected. Paralogues were detected for many loci, indicating possible genome duplications; ploidy level estimation using flow cytometry will be carried out in the coming months in order to identify the role of this process in the diversification of the group. Likewise, the TreeSetViz package for Mesquite, the hierarchical likelihood ratio congruence test using Concaterpillar, and the Procrustean Approach to Cophylogeny (PACo) will be used to evaluate the congruence among different inheritance patterns. In order to evaluate the influence of hybridization and incomplete lineage sorting (ILS) in each resultant clade of the phylogeny, the method of Joly et al. (2009) in a coalescent scenario and Patterson’s D-statistic will be applied. Although the main sources of discordance between gene trees have not yet been explored in detail, the data show that, at least to some degree, processes such as genome duplication, hybridization, and/or ILS could be involved in the evolution of the group.
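
Patterson's D-statistic, mentioned above as a planned analysis, is commonly computed from counts of two discordant site patterns in a four-taxon arrangement (P1, P2, P3, outgroup): D = (nABBA - nBABA) / (nABBA + nBABA). The Python sketch below illustrates that basic count-based form only, assuming one allele per taxon per site; the input format and function name are hypothetical, and a real analysis would also need a significance test such as a block jackknife.

```python
def patterson_d(sites):
    """Compute Patterson's D from biallelic site patterns.

    Each site is a 4-character string giving the allele in (P1, P2, P3, O),
    where O is the outgroup. ABBA sites group P2 with P3; BABA sites group
    P1 with P3. Returns None if no informative sites are present.
    """
    abba = 0
    baba = 0
    for p1, p2, p3, out in sites:
        if p1 == p2:
            continue  # P1 and P2 agree, so the site is uninformative for D
        if p1 == out and p2 == p3:
            abba += 1
        elif p2 == out and p1 == p3:
            baba += 1
    total = abba + baba
    if total == 0:
        return None
    return (abba - baba) / total


# Made-up alignment columns for illustration only.
demo_sites = ["ATTA", "ATTA", "TATA", "AAAA", "ATTA"]
print(patterson_d(demo_sites))
```

Under incomplete lineage sorting alone, ABBA and BABA sites are expected in equal numbers and D is close to zero; a marked excess of one pattern is consistent with gene flow between P3 and one of the two sister taxa.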

Keywords: adaptive radiations, Andes, genome duplication, hybridization, Senecio

Procedia PDF Downloads 111
5375 Complete Genome Sequence Analysis of Pasteurella multocida Subspecies multocida Serotype A Strain PMTB2.1

Authors: Shagufta Jabeen, Faez J. Firdaus Abdullah, Zunita Zakaria, Nurulfiza M. Isa, Yung C. Tan, Wai Y. Yee, Abdul R. Omar

Abstract:

Pasteurella multocida (PM) is an important opportunistic veterinary pathogen particularly associated with septicemic pasteurellosis, pneumonic pasteurellosis and hemorrhagic septicemia in cattle and buffaloes. P. multocida serotype A has been reported to cause fatal pneumonia and septicemia. The Malaysian isolate PMTB2.1, a Pasteurella multocida subspecies multocida strain of serotype A, was first isolated from buffaloes that died of septicemia. In this study, the genome of P. multocida strain PMTB2.1 was sequenced using third-generation sequencing technology (the PacBio RS2 system) and analyzed bioinformatically by de novo assembly followed by in-depth comparative genomics. De novo assembly of the PacBio raw reads generated 3 contigs; gap filling of the aligned contigs by PCR sequencing then produced a single contiguous circular chromosome with a genomic size of 2,315,138 bp and a GC content of approximately 40.32% (accession number CP007205). The PMTB2.1 genome comprises 2,176 protein-coding sequences, 6 rRNA operons, 56 tRNAs and 4 ncRNAs. A comparative analysis of nine complete genomes, comprising Actinobacillus pleuropneumoniae, Haemophilus parasuis, Escherichia coli, five P. multocida genomes (PM70, PM36950, PMHN06, PM3480 and PMHB01) and PMTB2.1, was carried out based on OrthoMCL analysis and a Venn diagram. The analysis showed that 282 CDSs (13%) are unique to PMTB2.1 and that 1,125 CDSs have orthologs in all of the genomes. This reflects the overall close relationship of these bacteria and supports their classification in the Gamma subdivision of the Proteobacteria. In addition, genomic distance analysis among all nine genomes indicated that PMTB2.1 is closely related to the other five P. multocida genomes, with a genomic distance of less than 0.13.
Synteny analysis showed subtle differences in genetic structure among the P. multocida strains, indicating frequent gene transfer events among them. However, PM3480 and PM70 exhibited exceptionally large structural variation, as they are swine and chicken isolates, respectively. Furthermore, the genomic structure of PMTB2.1 most closely resembles that of PM36950, with a genomic size difference of approximately 34,380 bp (PMTB2.1 being the smaller), and the strain-specific integrative and conjugative element (ICE) found only in PM36950 is absent from PMTB2.1. Meanwhile, two intact prophage sequences of approximately 62 kb were found only in PMTB2.1. One of the phages is similar to the transposable phage SfMu. The phylogenomic tree was constructed based on OrthoMCL analysis and rooted with E. coli, A. pleuropneumoniae and H. parasuis. P. multocida strain PMTB2.1 clustered with the bovine isolates PM36950 and PMHB01, was separated from the avian isolate PM70 and the swine isolates PM3480 and PMHN06, and was distant from Actinobacillus and Haemophilus. Previous studies based on single nucleotide polymorphisms (SNPs) and multilocus sequence typing (MLST) were unable to show a clear phylogenetic relatedness between Pasteurella multocida strains and their different hosts. In conclusion, this study has provided insight into the genomic structure of PMTB2.1 in terms of potential genes that can function as virulence factors, for future studies elucidating the mechanisms by which the bacterium causes disease in susceptible animals.
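
The summary statistics quoted in this abstract (GC content, share of strain-specific coding sequences) are simple ratios. The sketch below shows how such figures could be computed; the helper name `gc_content` is hypothetical, and only the two counts (282 and 2,176) are taken from the abstract. It is an illustration, not the authors' analysis code.

```python
def gc_content(sequence: str) -> float:
    """Return GC content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


# Counts reported in the abstract for PMTB2.1.
unique_cds = 282   # CDSs unique to PMTB2.1
total_cds = 2176   # protein-coding sequences in the genome

# Share of strain-specific CDSs; rounds to the 13% quoted in the abstract.
pct_unique = 100.0 * unique_cds / total_cds
```

Applied to the full chromosome sequence (for example, one read from a FASTA file), `gc_content` would be expected to return a value close to the reported 40.32%, assuming the same treatment of ambiguous bases.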

Keywords: comparative genomics, DNA sequencing, phage, phylogenomics

Procedia PDF Downloads 157
5374 Changing the Landscape of Fungal Genomics: New Trends

Authors: Igor V. Grigoriev

Abstract:

Understanding the biological processes encoded in fungi is instrumental in addressing the future food, feed, and energy demands of the growing human population. Genomics is a powerful and quickly evolving tool for understanding these processes. The Fungal Genomics Program of the US Department of Energy Joint Genome Institute (JGI) partners with researchers around the world to explore fungi in several large-scale genomics projects, changing the fungal genomics landscape. The key trends of these changes include: (i) a rapidly increasing scale of sequencing and analysis, (ii) the development of approaches that go beyond culturable fungi to explore fungal ‘dark matter,’ or unculturables, and (iii) functional genomics and multi-omics data integration. The power of comparative genomics has recently been demonstrated in several JGI projects targeting mycorrhizae, plant pathogens, wood decay fungi, and sugar-fermenting yeasts. The largest JGI project, ‘1000 Fungal Genomes,’ aims to explore the diversity across the Fungal Tree of Life in order to better understand fungal evolution and to build a catalogue of genes, enzymes, and pathways for biotechnological applications. At this point, at least 65% of over 700 known families have one or more reference genomes sequenced, enabling metagenomics studies of microbial communities and their interactions with plants. For many of the remaining families, no representative species are available from culture collections. To sequence the genomes of unculturable fungi, two approaches have been developed: (a) sequencing DNA from fruiting bodies of ‘macro’ and (b) single-cell genomics using fungal spores. The latter has been tested using zoospores from the early diverging fungi and resulted in several near-complete genomes from underexplored branches of the Fungal Tree, including the first genomes of Zoopagomycotina. A genome sequence serves as a reference for transcriptomics studies, the first step towards functional genomics.
In the JGI fungal mini-ENCODE project, transcriptomes of the model fungus Neurospora crassa grown on a spectrum of carbon sources have been collected to build regulatory gene networks. Epigenomics is another tool for understanding gene regulation, and recently introduced single-molecule sequencing platforms not only provide better genome assemblies but can also detect DNA modifications. For example, the 6mC methylome was surveyed across many diverse fungi, and the highest levels of 6mC methylation among Eukaryota have been reported. Finally, data production at such a scale requires data integration to enable efficient data analysis. Over 700 fungal genomes and other -omes have been integrated into the JGI MycoCosm portal and equipped with comparative genomics tools to enable researchers to address a broad spectrum of biological questions and applications for bioenergy and biotechnology.

Keywords: fungal genomics, single cell genomics, DNA methylation, comparative genomics

Procedia PDF Downloads 183