Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 355

Search results for: genome assembly

325 Implementation of CNV-CH Algorithm Using Map-Reduce Approach

Authors: Aishik Deb, Rituparna Sinha

Abstract:

We have developed an algorithm to detect abnormal segments (structural variations) in the genome across a number of samples. We have worked on simulated as well as real data from BAM files and have designed a segmentation algorithm in which abnormal segments are detected. This algorithm aims to improve the accuracy and performance of the existing CNV-CH algorithm. The next-generation sequencing (NGS) approach is very fast and can generate large sequences in a reasonable time. The resulting huge volume of sequence information gives rise to the need for Big Data and parallel approaches to segmentation. Therefore, we have designed a map-reduce approach for the existing CNV-CH algorithm in which a large amount of sequence data can be segmented and structural variations in the human genome can be detected. We have compared the efficiency of the traditional and map-reduce algorithms with respect to precision, sensitivity, and F-score. The advantages of our algorithm are that it is fast and has better accuracy. This algorithm can be applied to detect structural variations within a genome, which in turn can be used to detect various genetic disorders such as cancer. The defects may be caused by new mutations or changes to the DNA and generally result in abnormally high or low base coverage and quantification values.
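The three comparison measures named in the abstract can be computed from counts of correctly and incorrectly detected segments. The following is a minimal sketch, assuming detected segments have already been matched against a truth set; the function name and the example counts are hypothetical and are not taken from the paper.

```python
def precision_sensitivity_f(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, sensitivity, F-score) from segment-level counts.

    tp: abnormal segments detected and present in the truth set
    fp: segments detected but absent from the truth set
    fn: truth-set segments that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    # F-score is the harmonic mean of precision and sensitivity.
    f_score = 2 * precision * sensitivity / denom if denom else 0.0
    return precision, sensitivity, f_score


# Hypothetical counts, for illustration only.
p, s, f = precision_sensitivity_f(tp=8, fp=2, fn=2)
```

The same function can be applied to the output of both the traditional and the map-reduce variants so that the two are scored on identical terms.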

Keywords: cancer detection, convex hull segmentation, map reduce, next generation sequencing

Procedia PDF Downloads 102

324 Genome-Wide Mining of Potential Guide RNAs for Streptococcus pyogenes and Neisseria meningitides CRISPR-Cas Systems for Genome Engineering

Authors: Farahnaz Sadat Golestan Hashemi, Mohd Razi Ismail, Mohd Y. Rafii

Abstract:

The clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated protein (Cas) system can facilitate targeted genome editing in organisms. Dual or single guide RNA (gRNA) can program the Cas9 nuclease to cut target DNA in particular areas, thus introducing precise mutations either via error-prone non-homologous end-joining repair or via incorporation of foreign DNA by homologous recombination between donor DNA and the target area. In spite of the high demand for such a promising technology, developing a well-organized procedure for reliable mining of potential gRNA target sites in large genomic data is still challenging. Hence, we aimed to perform high-throughput detection of target sites by specific PAMs not only for the common Streptococcus pyogenes (SpCas9) but also for the Neisseria meningitidis (NmCas9) CRISPR-Cas system. Previous research confirmed the successful application of such RNA-guided Cas9 orthologs for effective gene targeting and subsequent genome manipulation. However, Cas9 orthologs need their particular PAM sequence for DNA cleavage activity. Activity levels depend on the sequence of the protospacer and specific combinations of favorable PAM bases. Therefore, based on the specific length and sequence of the PAM followed by a constant length of the target site for the two Cas9 orthologs, we created a reliable procedure to explore possible gRNA sequences. To mine CRISPR target sites, four different modes of searching for sgRNA binding to the target DNA strand were applied: i) coding strand searching, ii) anti-coding strand searching, iii) both strand searching, and iv) paired-gRNA searching. Finally, a complete list of all potential gRNAs along with their locations, strands, and PAM sequence orientation can be provided for both SpCas9 and another potential Cas9 ortholog (NmCas9).
The artificial design of potential gRNAs in a genome of interest can accelerate functional genomic studies. Consequently, the application of such a novel genome editing tool (CRISPR/Cas technology) will be enhanced by increased versatility and efficiency.
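A PAM-anchored search of the kind described above can be sketched as follows. This is only an illustration and not the authors' tool: the PAM patterns and protospacer lengths (NGG with 20 nt for SpCas9, NNNNGATT with 24 nt for NmCas9) are commonly cited values assumed here, the function names are hypothetical, and the paired-gRNA mode is not covered.

```python
import re

# Assumed settings; PAMs are written 5'->3' on the scanned strand, N = any base.
SYSTEMS = {
    "SpCas9": {"pam": "NGG", "protospacer_len": 20},
    "NmCas9": {"pam": "NNNNGATT", "protospacer_len": 24},
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def find_targets(seq: str, system: str, strands: str = "both"):
    """Return (start, strand, protospacer, pam) tuples.

    start is the 0-based leftmost position of the whole site (protospacer
    plus PAM) on the given sequence, regardless of strand.
    strands is one of "coding", "anticoding", "both".
    """
    cfg = SYSTEMS[system]
    n = cfg["protospacer_len"]
    pam_regex = cfg["pam"].replace("N", "[ACGT]")
    # A lookahead lets overlapping candidate sites be reported.
    pattern = re.compile(f"(?=([ACGT]{{{n}}})({pam_regex}))")
    seq = seq.upper()
    hits = []
    if strands in ("coding", "both"):
        for m in pattern.finditer(seq):
            hits.append((m.start(), "+", m.group(1), m.group(2)))
    if strands in ("anticoding", "both"):
        rc = reverse_complement(seq)
        for m in pattern.finditer(rc):
            site_end = m.start() + n + len(m.group(2))
            # Map the site back to coordinates on the original sequence.
            hits.append((len(seq) - site_end, "-", m.group(1), m.group(2)))
    return hits


example = "A" * 20 + "TGG"
sp_hits = find_targets(example, "SpCas9")
```

Scanning the reverse complement with the same pattern keeps the PAM definition in one place, at the cost of one extra copy of the sequence in memory.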

Keywords: CRISPR/Cas9 genome editing, gRNA mining, SpCas9, NmCas9

Procedia PDF Downloads 227

323 RNA-Seq Based Transcriptomic Analysis of Wheat Cultivars for Unveiling of Genomic Variations and Isolation of Drought Tolerant Genes for Genome Editing

Authors: Ghulam Muhammad Ali

Abstract:

The identification of genes involved in drought tolerance and root architecture through transcriptomic analyses has remained fragmented, limiting further improvement of wheat through genome editing. The purpose of this research was to unveil the variations in different genes implicated in drought tolerance and root architecture in wheat through RNA-seq data analysis. In this study, 8-day-old seedlings of 6 wheat cultivars, namely Batis, Blue Silver, Local White, UZ888, Chakwal 50 and Synthetic wheat S22, were subjected to transcriptomic analysis for root and shoot genes. A total of 12 RNA samples were sequenced by Illumina. Using updated wheat transcripts from Ensembl and IWGC references with 54,175 gene models, we found that 49,621 out of 54,175 (91.5%) genes are expressed at an RPKM of 0.1 or more (in at least 1 sample). The number of genes expressed was higher in Local White than in Batis. The number of differentially expressed genes (DEGs) was higher in Chakwal 50. Expression-based clustering indicated conserved function of DRO1 and RPK1 between Arabidopsis and wheat. The dendrogram showed that Local White is sister to Chakwal 50 while Batis is closely related to Blue Silver. This study reveals transcriptomic sequence variations in different cultivars, including mutations in genes associated with drought that may directly contribute to drought tolerance. The DRO1 and RPK1 genes were isolated for genome editing. These genes are being edited in wheat through CRISPR-Cas9 for yield enhancement.
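The expression criterion used above (an RPKM of 0.1 or more in at least one sample) can be illustrated with a short sketch. The RPKM formula is the standard one; the function names and the example numbers are hypothetical and do not come from the study's data.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def expressed_genes(rpkm_by_gene: dict[str, list[float]], threshold: float = 0.1) -> set[str]:
    """Genes whose RPKM reaches the threshold in at least one sample."""
    return {
        gene
        for gene, values in rpkm_by_gene.items()
        if any(v >= threshold for v in values)
    }


# Illustrative values: 100 reads on a 2 kb gene in a library of 10 million mapped reads.
value = rpkm(100, 2000, 10_000_000)
kept = expressed_genes({"geneA": [0.0, 0.2], "geneB": [0.05, 0.0]})
```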

Keywords: transcriptomic, wheat, genome editing, drought, CRISPR-Cas9, yield enhancement

Procedia PDF Downloads 118

322 Habitat-Specific Divergences in the Gene Repertoire among the Reference Prevotella Genomes of the Human Microbiome

Authors: Vinod Kumar Gupta, Narendrakumar M. Chaudhari, Suchismitha Iskepalli, Chitra Dutta

Abstract:

Background: The community composition of the human microbiome is known to vary at distinct anatomical niches. But little is known about the nature of variations, if any, at the genome/sub-genome levels of a specific microbial community across different niches. The present report aims to explore, as a case study, the variations in the gene repertoire of 28 Prevotella reference draft genomes derived from different human body sites, as reported earlier by the Human Microbiome Consortium. Results: The analysis reveals the exclusive presence of 11798, 3673, 3348 and 934 gene families and the exclusive absence of 17, 221, 115 and 645 gene families in Prevotella genomes derived from the human oral cavity, gastro-intestinal tract (GIT), urogenital tract (UGT) and skin, respectively. The pan-genome for Prevotella remains “open”. The distribution of various functional COG categories differs appreciably among the habitat-specific genes, within the Prevotella pan-genome and between the GIT-derived Bacteroides and Prevotella. The skin and GIT isolates of Prevotella are enriched in singletons involved in signal transduction mechanisms, while the UGT and oral isolates show higher representation of the defense mechanisms category. No niche-specific variations could be observed in the distribution of KEGG pathways. Conclusion: Prevotella may have developed distinct genetic strategies for adaptation to different anatomical habitats through selective, niche-specific acquisition and elimination of suitable gene families. In addition, individual microorganisms tend to develop their own distinctive adaptive stratagems through large repertoires of singletons. Such in situ, habitat-driven refurbishment of the genetic makeup can impart substantial intra-lineage genome diversity within the microbes without perturbing their general taxonomic heritage.
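The notions of exclusive presence and exclusive absence can be made concrete with set operations. The sketch below rests on assumed definitions that the abstract does not spell out: a family is "exclusively present" in a habitat if it occurs in at least one genome from that habitat and in none from the others, and "exclusively absent" if it occurs in every genome from the other habitats but in none from that habitat. All names and the toy data are hypothetical.

```python
def habitat_specific_families(genomes: dict[str, tuple[str, set]]) -> dict[str, dict[str, set]]:
    """genomes maps a genome name to (habitat, set of gene-family identifiers)."""
    habitats = {habitat for habitat, _ in genomes.values()}
    result = {}
    for h in habitats:
        inside = [fams for hab, fams in genomes.values() if hab == h]
        outside = [fams for hab, fams in genomes.values() if hab != h]
        union_in = set().union(*inside)
        union_out = set().union(*outside)
        # Families shared by every genome outside this habitat.
        core_out = set.intersection(*outside) if outside else set()
        result[h] = {
            "exclusively_present": union_in - union_out,
            "exclusively_absent": core_out - union_in,
        }
    return result


toy = {
    "a": ("oral", {1, 2, 3}),
    "b": ("oral", {1, 4}),
    "c": ("skin", {1, 5, 6}),
    "d": ("skin", {1, 5}),
}
summary = habitat_specific_families(toy)
```

Under a stricter definition (for example, requiring presence in every genome of the habitat), the `union_in` term would be replaced by an intersection, which would shrink the exclusive sets.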

Keywords: body niche adaptation, human microbiome, pangenome, Prevotella

Procedia PDF Downloads 226

321 Novel Recombinant Betasatellite Associated with Vein Thickening Symptoms on Okra Plants in Saudi Arabia

Authors: Adel M. Zakri, Mohammed A. Al-Saleh, Judith K. Brown, Ali M. Idris

Abstract:

Betasatellites are small circular single-stranded DNA molecules found associated with begomoviruses on symptomatic plants in the field. Their genome size is about half that of the helper begomovirus, ranging between 1.3 and 1.4 kb. The helper begomoviruses are usually members of the family Geminiviridae. Okra leaves showing vein thickening were collected from okra plants growing in Jazan, Saudi Arabia. Total DNA was extracted from the leaves and used as a template to amplify circular DNA using rolling circle amplification (RCA) technology. Products were digested with PstI to linearize the helper viral genome(s) and associated DNA satellite(s), yielding a 2.8 kbp and a 1.4 kbp fragment, respectively. The linearized fragments were cloned into the pGEM-5Zf (+) vector and subjected to DNA sequencing. The 2.8 kb fragment was identified as a Cotton leaf curl Gezira virus genome, at 2780 bp, an isolate closely related to strains reported previously from Saudi Arabia. A clone obtained from the 1.4 kb fragments was searched against the GenBank database using BLAST and found to be a betasatellite. The genome of the betasatellite was 1357 bp in size. It was found to be a recombinant containing one fragment (877 bp) that shared 91% nt identity with Cotton leaf curl Gezira betasatellite [KM279620], and a smaller fragment (133 bp) that shared 86% nt identity with Tomato leaf curl Sudan virus [JX483708]. This satellite is thus a recombinant between a malvaceous-infecting satellite and a solanaceous-infecting begomovirus.

Keywords: begomovirus, betasatellites, cotton leaf curl Gezira virus, okra plants

Procedia PDF Downloads 311

320 Isolate-Specific Variations among Clinical Isolates of Brucella Identified by Whole-Genome Sequencing, Bioinformatics and Comparative Genomics

Authors: Abu S. Mustafa, Mohammad W. Khan, Faraz Shaheed Khan, Nazima Habibi

Abstract:

Brucellosis is a zoonotic disease of worldwide prevalence. There are at least four species and several strains of Brucella that cause human disease. Brucella genomes have very limited variation across strains, which hinders strain identification using classical molecular techniques, including PCR and 16S rDNA sequencing. The aim of this study was to perform whole genome sequencing of clinical isolates of Brucella and to perform bioinformatics and comparative genomics analyses to determine the existence of genetic differences across the isolates of a single Brucella species and strain. The draft sequence data were generated from 15 clinical isolates of Brucella melitensis (biovar 2 strain 63/9) using the MiSeq next generation sequencing platform. The generated reads were used for further assembly and analysis. All analyses were performed on a bioinformatics workstation (8 core i7 processor, 8GB RAM with the Bio-Linux operating system). FastQC was used to determine the quality of reads, and low quality reads were trimmed or eliminated using Fastx_trimmer. Assembly was done using the Velvet and ABySS software. The ordering of assembled contigs was performed with Mauve. The online server RAST was employed to annotate the contig assemblies. Annotated genomes were compared using the Mauve and ACT tools. The QC score for DNA sequence data generated by MiSeq was higher than 30 for 80% of reads with more than 100x coverage, which suggested that the data could be utilized for further analysis. However, when analyzed by FastQC, the read quality of four samples was not good enough for creating a complete genome draft, so the remaining 11 samples were used for further analysis. The comparative genome analyses showed that despite sharing the same gene sets, single nucleotide polymorphisms and insertions/deletions existed across different genomes, which provided a variable extent of diversity to these bacteria.
In conclusion, next generation sequencing, bioinformatics, and comparative genome analysis can be utilized to find variations (point mutations, insertions and deletions) across different genomes of Brucella within a single strain. This information could be useful in surveillance and epidemiological studies. The study was supported by Kuwait University Research Sector grants MI04/15 and SRUL02/13.

Keywords: brucella, bioinformatics, comparative genomics, whole genome sequencing

Procedia PDF Downloads 345

319 Evolutionary Genomic Analysis of Adaptation Genomics

Authors: Agostinho Antunes

Abstract:

The completion of human genome sequencing in 2003 opened a new perspective on the importance of whole genome sequencing projects, and currently multiple species are having their genomes completely sequenced, from simple organisms, such as bacteria, to more complex taxa, such as mammals. The voluminous sequencing data generated across multiple organisms also provide the framework to better understand the genetic makeup of such species and related ones, making it possible to explore the genetic changes underlying the evolution of diverse phenotypic traits. Here, recent results from our group, retrieved from comparative evolutionary genomic analyses of varied species, will be considered to exemplify how gene novelty and gene enhancement by positive selection might have been determinant in the success of adaptive radiations into diverse habitats and lifestyles.

Keywords: adaptation, animals, evolution, genomics

Procedia PDF Downloads 395

318 Reconstruction of a Genome-Scale Metabolic Model to Simulate Uncoupled Growth of Zymomonas mobilis

Authors: Maryam Saeidi, Ehsan Motamedian, Seyed Abbas Shojaosadati

Abstract:

Zymomonas mobilis is known as an example of the uncoupled growth phenomenon. This microorganism also has a unique metabolism that degrades glucose by the Entner–Doudoroff (ED) pathway. In this paper, a genome-scale metabolic model including 434 genes, 757 reactions and 691 metabolites was reconstructed to simulate uncoupled growth and study its effect on flux distribution in the central metabolism. The model properly predicted that ATPase was activated at the experimental growth yields of Z. mobilis. The flux distribution obtained from the model indicates that the major carbon flux passed through the ED pathway, which resulted in the production of ethanol. Small amounts of the carbon source entered the pentose phosphate pathway and the TCA cycle to produce biomass precursors. The predicted flux distribution was in good agreement with experimental data. The model results also indicated that Z. mobilis metabolism is able to produce biomass with a maximum growth yield of 123.7 g (mol glucose)⁻¹ if ATP synthase is coupled with growth and produces 82 mmol ATP gDCW⁻¹ h⁻¹. Coupling growth and energy reduced ethanol secretion and changed the flux distribution to produce biomass precursors.

Keywords: genome-scale metabolic model, Zymomonas mobilis, uncoupled growth, flux distribution, ATP dissipation

Procedia PDF Downloads 453

317 Genome-Wide Significant SNPs Proximal to Nicotinic Receptor Genes Impact Cognition in Schizophrenia

Authors: Mohammad Ahangari

Abstract:

Schizophrenia is a psychiatric disorder with symptoms that include cognitive deficits, and nicotine has been suggested to have an effect on cognition. In recent years, the advent of genome-wide association studies (GWAS) has advanced our understanding of the genetic causes of complex disorders such as schizophrenia, and studying the role of genome-wide significant genes could potentially lead to the development of new therapeutic agents for the treatment of cognitive deficits in schizophrenia. The current study identified six single nucleotide polymorphisms (SNPs) from schizophrenia and smoking GWAS that are located on or in close proximity to the nicotinic receptor gene cluster (CHRN) and studied their association with cognition in an Irish sample of 1297 cases and controls using linear regression analysis. In addition, the interaction between the CHRN gene cluster and the dopamine receptor D2 gene (DRD2) during working memory was investigated. The effect of these polymorphisms on nicotinic and dopaminergic neurotransmission, which is disrupted in schizophrenia, has been characterized in terms of their effects on memory, attention, social cognition and IQ as measured by a neuropsychological test battery, and significant effects of two polymorphisms were found across the global IQ domain of the test battery.

Keywords: cognition, dopamine, GWAS, nicotine, schizophrenia, SNPs

Procedia PDF Downloads 306

316 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Mpho Mokoatle, Darlington Mapiye, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in the availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotations, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on a k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isolates. Cluster analysis showed that k-mers may be used to discriminate phenotypes and that the discrimination becomes more concise as the size of the k-mers increases. The best performing classification model had a k-mer size of 10 (the longest k-mer) and an accuracy, recall, precision, specificity, and Matthews correlation coefficient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4, respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction.
The analysis also highlights the importance of increasing the k-mer size to produce more biologically explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data and to identify phenotype relationships, which are important especially in explaining complex biological mechanisms.
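The k-mer representation mentioned above can be sketched in a few lines. This is a generic illustration and not the authors' pipeline: the function names are hypothetical, windows containing ambiguous bases are simply skipped, and the resampling and classification steps are not shown.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq: str, k: int) -> Counter:
    """Count overlapping k-mers, skipping windows with non-ACGT characters."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def kmer_vector(seq: str, k: int) -> list[int]:
    """Fixed-length feature vector over all 4**k possible k-mers, in lexicographic order."""
    counts = kmer_counts(seq, k)
    return [counts["".join(p)] for p in product("ACGT", repeat=k)]


counts = kmer_counts("ACGTACGT", 2)
vector = kmer_vector("ACGTACGT", 2)
```

The vector length grows as 4^k, so k = 10 already gives 1,048,576 features per genome, which is one way to see the trade-off between k-mer size and computing resources that the abstract points to.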

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 133

315 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Darlington Mapiye, Mpho Mokoatle, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in the availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotations, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on a k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isolates. Cluster analysis showed that k-mers may be used to discriminate phenotypes and that the discrimination becomes more concise as the size of the k-mers increases. The best performing classification model had a k-mer size of 10 (the longest k-mer) and an accuracy, recall, precision, specificity, and Matthews correlation coefficient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4, respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction.
The analysis also highlights the importance of increasing the k-mer size to produce more biologically explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data and to identify phenotype relationships, which are important especially in explaining complex biological mechanisms.

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 122

314 Isolation and Molecular Characterization of Lytic Bacteriophage against Carbapenem Resistant Klebsiella pneumoniae

Authors: Guna Raj Dhungana, Roshan Nepal, Apshara Parajuli, Archana Maharjan, Shyam K. Mishra, Pramod Aryal, Rajani Malla

Abstract:

Introduction: Klebsiella pneumoniae is a well-known opportunistic human pathogen, primarily causing healthcare-associated infections. The global emergence of carbapenemase-producing K. pneumoniae, which is often extensively multidrug resistant, is a major public health burden. Because these ‘superbugs’ are difficult to treat, a menace that some term the ‘apocalypse’ of the post-antibiotic era, an alternative approach to controlling this pathogen is prudent, and one such approach is phage-mediated control and/or treatment. Objective: In this study, we aimed to isolate novel bacteriophages against carbapenemase-producing K. pneumoniae and characterize them for potential use in phage therapy. Material and Methods: Twenty lytic phages were isolated from river water using the double-layer agar assay and purified. Biological features, physicochemical characteristics, burst size, host specificity and activity spectrum of the phages were determined. The most potent phage, Phage TU_Kle10O, was selected and characterized by electron microscopy. Whole genome sequences of the phage were analyzed for the presence or absence of virulence factors and lysin genes. Results: The novel phage TU_Kle10O showed a multiple host range within its own genus and did not induce any bacteriophage-insensitive mutants (BIMs) up to the 5th generation of the host’s life cycle. Electron microscopy confirmed that the phage was tailed and belonged to the order Caudovirales. Next generation sequencing revealed its genome to be 166.2 kb. Bioinformatic analysis further confirmed that the phage genome did not contain any bacterial genes, which ruled out the concern about transfer of virulence genes. A specific lysin enzyme was identified in the phage, which could be used as an ‘antibiotic’. Conclusion: Extensively multidrug resistant bacteria like carbapenemase-producing K. pneumoniae could be treated efficiently by phages. The absence of virulence genes of bacterial origin and the presence of lysin proteins within the phage genome make phages excellent candidates for therapeutics.

Keywords: bacteriophage, Klebsiella pneumoniae, MDR, phage therapy, carbapenemase

Procedia PDF Downloads 153

313 Mining the Proteome of Fusobacterium nucleatum for Potential Therapeutics Discovery

Authors: Abdul Musaweer Habib, Habibul Hasan Mazumder, Saiful Islam, Sohel Sikder, Omar Faruk Sikder

Abstract:

The plethora of bacterial genome sequence information in recent times has ushered in many novel strategies for antibacterial drug discovery and enabled medical science to take up the challenge of the increasing resistance of pathogenic bacteria to current antibiotics. In this study, we adopted a subtractive genomics approach to analyze the whole genome sequence of Fusobacterium nucleatum, a human oral pathogen associated with colorectal cancer. Our study revealed 1499 proteins of Fusobacterium nucleatum which have no homologs in the human genome. These proteins were screened further using the Database of Essential Genes (DEG), which resulted in the identification of 32 vitally important proteins for the bacterium. Subsequent analysis of the identified pivotal proteins using the KEGG Automated Annotation Server (KAAS) singled out 3 key enzymes of F. nucleatum that may be good candidates as potential drug targets, since they are unique to the bacterium and absent in humans. In addition, we have demonstrated the 3-D structure of these three proteins. Finally, determination of the ligand binding sites of the key proteins, as well as screening for functional inhibitors that best fitted the ligand sites, was conducted to discover effective novel therapeutic compounds against Fusobacterium nucleatum.

Keywords: colorectal cancer, drug target, Fusobacterium nucleatum, homology modeling, ligands

Procedia PDF Downloads 357

312 Genomic Diversity and Relationship among Arabian Peninsula Dromedary Camels Using Full Genome Sequencing Approach

Authors: H. Bahbahani, H. Musa, F. Al Mathen

Abstract:

Dromedary camels (Camelus dromedarius) are single-humped even-toed ungulates populating the African Sahara, Arabian Peninsula, and Southwest Asia. The genome of this desert-adapted species has been minimally investigated using autosomal microsatellite and mitochondrial DNA markers. In this study, the genomes of 33 dromedary camel samples from different parts of the Arabian Peninsula were sequenced using the Illumina Next Generation Sequencing (NGS) platform. These data were combined with Genotyping-by-Sequencing (GBS) data from African (Sudanese) dromedaries to investigate the genomic relationship between African and Arabian Peninsula dromedary camels. Principal Component Analysis (PCA) and average genome-wide admixture analysis were conducted on these data to address the objectives of the study. Both analyses revealed phylogeographic distinction between these two camel populations. However, no breed-wise genetic classification was revealed among the African (Sudanese) camel breeds. The Arabian Peninsula camel populations also show higher heterozygosity than the Sudanese camels. The results of this study explain the evolutionary history and migration of African dromedary camels from their center of domestication in the southern Arabian Peninsula. These outputs help scientists to further understand the evolutionary history of dromedary camels, which might help in conserving the favorable genetics of this species.

Keywords: dromedary, genotyping-by-sequencing, Arabian Peninsula, Sudan

Procedia PDF Downloads 166

311 Genome-Wide Functional Analysis of Phosphatase in Cryptococcus neoformans

Authors: Jae-Hyung Jin, Kyung-Tae Lee, Yee-Seul So, Eunji Jeong, Yeonseon Lee, Dongpil Lee, Dong-Gi Lee, Yong-Sun Bahn

Abstract:

Cryptococcus neoformans causes cryptococcal meningoencephalitis, mainly in immunocompromised patients but also in immunocompetent people. However, therapeutic options for treating cryptococcosis are limited. Some signaling pathways, including the cyclic AMP pathway, the MAPK pathway, and the calcineurin pathway, play a central role in the regulation of the growth, differentiation, and virulence of C. neoformans. To understand the signaling networks regulating the virulence of C. neoformans, we selected 114 putative phosphatase genes, phosphatases being one of the major components of signaling networks, in the genome of C. neoformans. We identified putative phosphatases based on annotation in the C. neoformans var. grubii genome database provided by the Broad Institute and the National Center for Biotechnology Information (NCBI) and performed a BLAST search of phosphatases of Saccharomyces cerevisiae, Aspergillus nidulans, Candida albicans and Fusarium graminearum against Cryptococcus neoformans. We classified the putative phosphatases into 14 groups based on InterPro phosphatase domain annotation. Here, we constructed 170 signature-tagged gene-deletion strains through homologous recombination methods for 91 putative phosphatases. We examined their phenotypic traits under 30 different in vitro conditions, including growth, differentiation, stress response, antifungal resistance and virulence-factor production.

Keywords: human fungal pathogen, phosphatase, deletion library, functional genomics

Procedia PDF Downloads 333

310 Cassava Plant Architecture: Insights from Genome-Wide Association Studies

Authors: Abiodun Olayinka, Daniel Dzidzienyo, Pangirayi Tongoona, Samuel Offei, Edwige Gaby Nkouaya Mbanjo, Chiedozie Egesi, Ismail Yusuf Rabbi

Abstract:

Cassava (Manihot esculenta Crantz) is a major source of starch for various industrial applications. However, the traditional cultivation and harvesting methods of cassava are labour-intensive and inefficient, limiting the supply of fresh cassava roots for industrial starch production. To achieve improved productivity and quality of fresh cassava roots through mechanized cultivation, cassava cultivars with compact plant architecture and moderate plant height are needed. Plant architecture-related traits, such as plant height, harvest index, stem diameter, branching angle, and lodging tolerance, are critical for crop productivity and suitability for mechanized cultivation. However, the genetics of cassava plant architecture remain poorly understood. This study aimed to identify the genetic bases of the relationships between plant architecture traits and productivity-related traits, particularly starch content. A panel of 453 clones developed at the International Institute of Tropical Agriculture, Nigeria, was genotyped and phenotyped for 18 plant architecture and productivity-related traits at four locations in Nigeria. A genome-wide association study (GWAS) was conducted using the phenotypic data from this panel and 61,238 high-quality Diversity Arrays Technology sequencing (DArTseq) derived Single Nucleotide Polymorphism (SNP) markers that are evenly distributed across the cassava genome. Five significant associations between ten SNPs and three plant architecture component traits were identified through GWAS. We found five SNPs on chromosomes 6 and 16 that were significantly associated with shoot weight, harvest index, and total yield through genome-wide association mapping. We also discovered an essential candidate gene that is co-located with peak SNPs linked to these traits in M. esculenta.
A review of the cassava reference genome v7.1 revealed that the SNP on chromosome 6 is in proximity to Manes.06G101600.1, a gene that regulates endodermal differentiation and root development in plants. The findings of this study provide insights into the genetic basis of plant architecture and yield in cassava. Cassava breeders could leverage this knowledge to optimize plant architecture and yield in cassava through marker-assisted selection and targeted manipulation of the candidate gene.

Keywords: Manihot esculenta Crantz, plant architecture, DArTseq, SNP markers, genome-wide association study

Procedia PDF Downloads 37

309 Motif Search-Aided Screening of the Pseudomonas syringae pv. Maculicola Genome for Genes Encoding Tertiary Alcohol Ester Hydrolases

Authors: M. L. Mangena, N. Mokoena, K. Rashamuse, M. G. Tlou

Abstract:

Tertiary alcohol ester (TAE) hydrolases are a group of esterases (EC 3.1.1.-) that catalyze the kinetic resolution of TAEs and, as a result, are sought after for the production of optically pure tertiary alcohols (TAs), which are useful building blocks for a number of biologically active compounds. What sets these enzymes apart is the presence of a GGG(A)X-motif in the active site, which appears to be the main reason behind their activity towards the sterically demanding TAEs. The genome of Pseudomonas syringae pv. maculicola (Psm) comprises a multitude of genes that encode esterases. We therefore hypothesize that some of these genes encode TAE hydrolases. In this study, Psm was screened for TAE hydrolase activity using the linalyl acetate (LA) plate assay, and a positive reaction was observed. As a result, the genome of Psm was screened for esterases with a GGG(A)X-motif using the motif search tool, and two potential TAE hydrolase genes (PsmEST1 and PsmEST2, 1100 and 1000 bp, respectively) were identified. PsmEST1 was amplified by PCR and the gene sequenced for confirmation. Analysis of the sequence data with the SignalP 4.1 server revealed that the protein comprises a signal peptide (22 amino acid residues) at the N-terminus. Primers specific for the gene encoding the mature protein (without the signal peptide) were designed such that they contain NdeI and XhoI restriction sites for directional cloning of the PCR products into pET28a. The gene was expressed in E. coli JM109 (DE3) and the clones screened for TAE hydrolase activity using the LA plate assay. A positive clone was selected and overexpressed, and the protein was purified using nickel affinity chromatography. The activity of the esterase towards LA was confirmed using thin layer chromatography.
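
The motif screen described above can be illustrated with a regular expression. This is a minimal sketch, not the motif search tool the authors used: it assumes GGG(A)X is read as Gly, Gly, then Gly or Ala, then any residue, and the function name and toy sequences are hypothetical.

```python
import re

# Assumed reading of GGG(A)X: G, G, then G or A, then any residue.
# The lookahead makes overlapping hits (e.g. in "GGGGA") visible.
GGGAX = re.compile(r"(?=(GG[GA].))")


def find_gggax(protein):
    """Return (1-based position, matched tetrapeptide) for every motif hit."""
    return [(m.start() + 1, m.group(1)) for m in GGGAX.finditer(protein)]


# Made-up toy sequences for illustration; these are not Psm proteins.
toy_proteins = {
    "est_a": "MKTLHGGGFSAGDSAGG",
    "est_b": "MKTLHGDSAGGNL",
}

# Keep only the sequences that carry at least one motif hit.
candidates = {}
for name, seq in toy_proteins.items():
    hits = find_gggax(seq)
    if hits:
        candidates[name] = hits
```

A sequence hit alone says nothing about activity; in the abstract the candidates were still confirmed experimentally with the LA plate assay.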

Keywords: hydrolases, tertiary alcohol esters, tertiary alcohols, screening, Pseudomonas syringae pv. maculicola genome, esterase activity, linalyl acetate

Procedia PDF Downloads 324

308 Development and Characterization of Polymorphic Genomic-SSR Markers in Asian Long-Horned Beetle (Anoplophora glabripennis)

Authors: Zhao Yang Liu, Jing Tao

Abstract:

The Asian long-horned beetle, Anoplophora glabripennis (Motschulsky) (Coleoptera: Cerambycidae: Lamiinae), is a polyphagous wood-boring xylophage native to Asia that kills healthy trees. Because of the serious danger it poses to trees, the beetle has attracted close attention worldwide. However, genetic markers for the species, especially microsatellites, are limited. In this study, 24 novel simple sequence repeat (SSR) molecular markers, a powerful tool for genetic diversity studies and linkage map construction, were developed and characterized from whole genome shotgun sequences. We identified 9895 perfect SSR loci with repeat units of 2 to 6 bp; the SSR density was one SSR per 56.57 kb, the abundance was 0.02/kb, and 140 types of repeat motifs were found. Half of the 48 SSR primer pairs that we selected randomly from 1222 primer pairs were polymorphic (4 di-, 7 tri-, 2 tetra- and 11 hexanucleotide SSRs). The number of alleles for these markers in 48 individuals varied from 3 to 21 with an average of 7.71, and the number of effective alleles ranged from 1.22 to 9.97 with an average of 3.54. The polymorphic information content (PIC) ranged from 0.18 to 0.89 with a mean of 0.65, and Shannon's information index (I) ranged from 0.46 to 2.62 with an average of 1.44. The results suggest that the method for screening SSRs in the whole genome is feasible and efficient. The SSR markers developed in this study can be used for population genetic studies of A. glabripennis. Moreover, they may also be helpful for the development of microsatellites for other Coleoptera.
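
The per-locus statistics reported above (effective number of alleles, PIC, Shannon's index) are all functions of the allele frequencies at a locus. The sketch below uses the standard textbook definitions; the abstract does not say which software or exact formulas were used, so treat this as an assumption. The function name and frequencies are hypothetical.

```python
import math


def locus_diversity(freqs):
    """Return (effective alleles, PIC, Shannon index) for one locus.

    freqs: allele frequencies at the locus, expected to sum to 1.
    """
    homozygosity = sum(p * p for p in freqs)
    # Effective number of alleles: Ne = 1 / sum(p_i^2)
    ne = 1.0 / homozygosity
    # PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    pic = 1.0 - homozygosity - cross
    # Shannon's information index: I = -sum(p_i * ln p_i)
    shannon = -sum(p * math.log(p) for p in freqs if p > 0)
    return ne, pic, shannon


# Hypothetical locus with two equally frequent alleles.
ne, pic, shannon = locus_diversity([0.5, 0.5])
```

For two equally frequent alleles this gives Ne = 2, PIC = 0.375 and I = ln 2, which is a convenient sanity check before running real genotype data.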

Keywords: SSR markers, Anoplophora glabripennis, genetic diversity, whole genome

Procedia PDF Downloads 359

307 The Cleavage of DNA by the Anti-Tumor Drug Bleomycin at the Transcription Start Sites of Human Genes Using Genome-Wide Techniques

Authors: Vincent Murray

Abstract:

The glycopeptide bleomycin is used in the treatment of testicular cancer, Hodgkin's lymphoma, and squamous cell carcinoma. Bleomycin damages and cleaves DNA in human cells, and this is considered to be the main mode of action for bleomycin's anti-tumor activity. In particular, double-strand breaks are thought to be the main mechanism for the cellular toxicity of bleomycin. Using Illumina next-generation DNA sequencing techniques, the genome-wide sequence specificity of bleomycin-induced double-strand breaks was determined in human cells. The degree of bleomycin cleavage was also assessed at the transcription start sites (TSSs) of actively transcribed genes and compared with non-transcribed genes. It was observed that bleomycin preferentially cleaved at the TSSs of actively transcribed human genes. There was a correlation between the degree of this enhanced cleavage at TSSs and the level of transcriptional activity. Bleomycin cleavage is also affected by chromatin structure and at TSSs, the peaks of bleomycin cleavage were approximately 200 bp apart. This indicated that bleomycin was able to detect phased nucleosomes at the TSSs of actively transcribed human genes. The genome-wide cleavage pattern of the bleomycin analogues 6′-deoxy-BLM Z and zorbamycin was also investigated in human cells. As found for bleomycin, these bleomycin analogues also preferentially cleaved at the TSSs of actively transcribed human genes. The cytotoxicity (IC₅₀ values) of these bleomycin analogues was determined. It was found that the degree of enhanced cleavage at TSSs was inversely correlated with the IC₅₀ values of the bleomycin analogues. This suggested that the level of cleavage at the TSSs of actively transcribed human genes was important for the cytotoxicity of bleomycin and analogues. Hence this study provided a deeper understanding of the cellular processes involved in the cancer chemotherapeutic activity of bleomycin.

Keywords: anti-tumour activity, bleomycin analogues, chromatin structure, genome-wide study, Illumina DNA sequencing

Procedia PDF Downloads 95

306 Unifying RSV Evolutionary Dynamics and Epidemiology Through Phylodynamic Analyses

Authors: Lydia Tan, Philippe Lemey, Lieselot Houspie, Marco Viveen, Darren Martin, Frank Coenjaerts

Abstract:

Introduction: Human respiratory syncytial virus (hRSV) is the leading cause of severe respiratory tract infections in infants under the age of two. Genomic substitutions and related evolutionary dynamics of hRSV have a great influence on virus transmission behavior. The evolutionary patterns formed are due to a precarious interplay between the host immune response and RSV, thereby selecting the most viable and less immunogenic strains. Studying genomic profiles can teach us which genes and consequent proteins play an important role in RSV survival and transmission dynamics. Study design: In this study, genetic diversity and evolutionary rate analyses were conducted on 36 RSV subgroup B whole genome sequences and 37 subgroup A genome sequences. Clinical RSV isolates were obtained from nasopharyngeal aspirates and swabs of children between 2 weeks and 5 years of age. These strains were collected during epidemic seasons from 2001 to 2011 in the Netherlands and Belgium and sequenced by either conventional or 454 sequencing. Sequences were analyzed for genetic diversity, recombination events, synonymous/non-synonymous substitution ratios, and epistasis, and the translational consequences of mutations were mapped to known 3D protein structures. We used Bayesian statistical inference to estimate the rate of RSV genome evolution and the rate of variability across the genome. Results: The A and B profiles were described in detail and compared to each other. Overall, the majority of the whole RSV genome is highly conserved among all strains. The attachment protein G was the most variable protein, and its gene had, similar to the non-coding regions in RSV, more elevated (two-fold) substitution rates than other genes. In addition, the G gene has been identified as the major target for diversifying selection. Overall, less gene and protein variability was found within RSV-B compared to RSV-A, and most protein variation between the subgroups was found in the F, G, SH and M2-2 proteins. 
For the F protein mutations and correlated amino acid changes are largely located in the F2 ligand-binding domain. The small hydrophobic phosphoprotein and nucleoprotein are the most conserved proteins. The evolutionary rates were similar in both subgroups (A: 6.47E-04, B: 7.76E-04 substitution/site/yr), but estimates of the time to the most recent common ancestor were much lower for RSV-B (B: 19, A: 46.8 yrs), indicating that there is more turnover in this subgroup. Conclusion: This study provides a detailed description of whole RSV genome mutations, the effect on translation products and the first estimate of the RSV genome evolution tempo. The immunogenic G protein seems to require high substitution rates in order to select less immunogenic strains and other conserved proteins are most likely essential to preserve RSV viability. The resulting G gene variability makes its protein a less interesting target for RSV intervention methods. The more conserved RSV F protein with less antigenic epitope shedding is, therefore, more suitable for developing therapeutic strategies or vaccines.

Keywords: drug target selection, epidemiology, respiratory syncytial virus, RSV

Procedia PDF Downloads 383

305 CRISPR-Mediated Genome Editing for Yield Enhancement in Tomato

Authors: Aswini M. S.

Abstract:

Tomato (Solanum lycopersicum L.) is one of the most significant vegetable crops in terms of its economic benefits. Both fresh and processed tomatoes are consumed. Tomatoes have a limited genetic base, which makes breeding extremely challenging. Plant breeding has become much simpler and more effective with the genome editing tools of CRISPR and CRISPR-associated protein 9 (CRISPR/Cas9), which address the problems with traditional breeding, chemical/physical mutagenesis, and transgenics. With the use of CRISPR/Cas9, a number of tomato traits have been functionally characterized and edited. These traits include plant architecture as well as flower characters (leaf, flower, male sterility, and parthenocarpy), fruit ripening, quality and nutrition (lycopene, carotenoid, GABA, TSS, and shelf-life), disease resistance (late blight, TYLCV, and powdery mildew), tolerance to abiotic stress (heat, drought, and salinity) and resistance to herbicides. This study explores the potential of CRISPR/Cas9 genome editing for enhancing yield in tomato plants. The study utilized the CRISPR/Cas9 genome editing technology to functionally edit various traits in tomatoes. The de novo domestication of elite features from wild relatives to cultivated tomatoes and vice versa has been demonstrated using CRISPR/Cas9. Cas9-mediated editing of the CycB (lycopene beta-cyclase) gene increased the lycopene content in tomato. Also, Cas9-mediated editing of the AGL6 (Agamous-like 6) gene resulted in parthenocarpic fruit development under heat-stress conditions. The advent of CRISPR/Cas has made it possible to use digital resources for single guide RNA design and multiplexing, cloning (such as Golden Gate cloning, GoldenBraid, etc.), creating robust CRISPR/Cas constructs, and implementing effective transformation protocols such as the Agrobacterium and DNA-free protoplast methods for Cas9-gRNA ribonucleoprotein (RNP) complexes. 
Additionally, homologous recombination (HR)-based gene knock-in (HKI) via geminivirus replicon and base/prime editing (Target-AID technology) remains possible. Hence, CRISPR/Cas facilitates fast and efficient breeding in the improvement of tomatoes.

Keywords: CRISPR-Cas, biotic and abiotic stress, flower and fruit traits, genome editing, polygenic trait, tomato and trait introgression

Procedia PDF Downloads 39

304 Black-Brown and Yellow-Brown-Red Skin Pigmentation Elements are Shared in Common: Using Art and Science for Multicultural Education

Authors: Mary Kay Bacallao

Abstract:

New research on the human genome has revealed secrets to the variation in skin pigmentation found in all human populations. Application of this research to multicultural education has a profound effect on students from all backgrounds. This paper identifies the four locations in the human genome that code for variation in skin pigmentation worldwide. The research makes this new knowledge accessible to students of all ages as they participate in an art project that brings these scientific multicultural concepts to life. Students participate in the application of breakthrough scientific principles through hands-on art activities where they simulate the work of the DNA coding to create their own skin tone using the colors expressed to varying degrees in every people group. As students create their own artwork handprint from the palette of colors, they realize that each color on the palette is essential to creating every tone of skin. This research project serves to bring people together and appreciate the variety and diversity in skin tones. As students explore the variations, they create pigmentation with the use of the eumelanins, which are the black-brown sources of pigmentation, and the pheomelanins, which are the yellow-reddish-brown sources of pigmentation. The research project dispels myths about skin tones that have divided people in the past. As a group project, this research leads to greater appreciation and understanding of the diverse family groups.

Keywords: diversity, multicultural, skin pigmentation, eumelanins, pheomelanins, handprint, artwork, science, genome, human

Procedia PDF Downloads 39

303 Multivariate Genome-Wide Association Studies for Identifying Additional Loci for Myopia

Authors: Qiao Fan, Xiaobo Guo, Junxian Zhu, Xiaohu Ding, Ching-Yu Cheng, Tien-Yin Wong, Mingguang He, Heping Zhang, Xueqin Wang

Abstract:

The systematic, simultaneous analysis of multiple phenotypes in genome-wide association studies (GWASs) has drawn great attention because it integrates the signals from single phenotypes with increased power. However, the lack of an interpretable and efficient multivariate GWAS analysis impedes the application of this approach. In this study, we propose to decompose the multivariate model into a series of simple univariate models. This transformation illuminates exactly what each individual trait contributes to the significant signals from the multivariate analyses. By employing our approach in the analysis of three myopia-related endophenotypes from the Singapore Malay Eye Study (SIMES), we identify novel candidate loci, which were successfully validated in the independent Guangzhou Twin Eye Study (GTES).

Keywords: GWAS multivariate, multiple traits, myopia, association

Procedia PDF Downloads 195

302 Brachypodium: A Model Genus to Study Grass Genome Organisation at the Cytomolecular Level

Authors: R. Hasterok, A. Betekhtin, N. Borowska, A. Braszewska-Zalewska, E. Breda, K. Chwialkowska, R. Gorkiewicz, D. Idziak, J. Kwasniewska, M. Kwasniewski, D. Siwinska, A. Wiszynska, E. Wolny

Abstract:

In contrast to animals, the organisation of plant genomes at the cytomolecular level is still relatively poorly studied and understood. However, the Brachypodium genus in general and B. distachyon in particular represent exceptionally good model systems for such study. This is due not only to their highly desirable ‘model’ biological features, such as small nuclear genome, low chromosome number and complex phylogenetic relations, but also to the rapidly and continuously growing repertoire of experimental tools, such as large collections of accessions, WGS information, large insert (BAC) libraries of genomic DNA, etc. Advanced cytomolecular techniques, such as fluorescence in situ hybridisation (FISH) with evermore sophisticated probes, empowered by cutting-edge microscope and digital image acquisition and processing systems, offer unprecedented insight into chromatin organisation at various phases of the cell cycle. A good example is chromosome painting which uses pools of chromosome-specific BAC clones, and enables the tracking of individual chromosomes not only during cell division but also during interphase. This presentation outlines the present status of molecular cytogenetic analyses of plant genome structure, dynamics and evolution using B. distachyon and some of its relatives. The current projects focus on important scientific questions, such as: What mechanisms shape the karyotypes? Is the distribution of individual chromosomes within an interphase nucleus determined? Are there hot spots of structural rearrangement in Brachypodium chromosomes? Which epigenetic processes play a crucial role in B. distachyon embryo development and selective silencing of rRNA genes in Brachypodium allopolyploids? The authors acknowledge financial support from the Polish National Science Centre (grants no. 2012/04/A/NZ3/00572 and 2011/01/B/NZ3/00177)

Keywords: Brachypodium, B. distachyon, chromosome, FISH, molecular cytogenetics, nucleus, plant genome organisation

Procedia PDF Downloads 323

301 Performance of High Density Genotyping in Sahiwal Cattle Breed

Authors: Hamid Mustafa, Huson J. Heather, Kim Eiusoo, Adeela Ajmal, Tad S. Sonstegard

Abstract:

The objective of this study was to evaluate the informativeness of bovine high-density SNP genotyping in the Sahiwal cattle population. This is the first attempt to assess the Bovine HD SNP genotyping array in any Pakistani indigenous cattle population. To evaluate these SNPs on a genome-wide scale, we considered 777,962 SNPs spanning all autosomes and the X chromosome in the Sahiwal cattle population. Fifteen (15) unrelated gDNA samples were genotyped with the Bovine HD Infinium array. Approximately 500,939 SNPs were found to be polymorphic (MAF > 0.05) in the Sahiwal cattle population. The results of this study indicate the potential application of bovine high-density SNP genotyping in Pakistani indigenous cattle populations. The information generated from this array can be applied in genetic prediction, characterization and genome-wide association studies of the Pakistani Sahiwal cattle population.
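
The MAF > 0.05 criterion above can be sketched as a simple filter. This is an illustrative sketch, not the authors' pipeline: it assumes genotypes are coded as 0, 1 or 2 copies of one allele with None for missing calls, and the SNP names and data are hypothetical.

```python
def minor_allele_frequency(genotypes):
    """MAF for one SNP; genotypes are 0/1/2 allele counts, None if missing."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None  # no usable calls for this SNP
    freq = sum(called) / (2 * len(called))  # frequency of the counted allele
    return min(freq, 1 - freq)


def polymorphic_snps(genotype_table, threshold=0.05):
    """Return ids of SNPs whose MAF exceeds the threshold."""
    keep = []
    for snp, genotypes in genotype_table.items():
        maf = minor_allele_frequency(genotypes)
        if maf is not None and maf > threshold:
            keep.append(snp)
    return keep


# Hypothetical genotypes for four animals; not data from the study.
toy_table = {
    "snp_1": [0, 1, 2, None],  # MAF 0.5
    "snp_2": [2, 2, 2, 2],     # monomorphic, MAF 0.0
    "snp_3": [None, None, None, None],
}
kept = polymorphic_snps(toy_table)
```

With 15 fully genotyped animals there are 30 alleles per SNP, so a single heterozygote gives MAF = 1/30 (about 0.033) and would not pass a 0.05 cutoff; at least two copies of the minor allele are needed.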

Keywords: Sahiwal cattle, polymorphic SNPs, genotyping, Pakistan

Procedia PDF Downloads 397

300 Expression Profiling and Immunohistochemical Analysis of Squamous Cell Carcinoma of Head and Neck (Tumor, Transition Zone, Normal) by Whole Genome Scale Sequencing

Authors: Veronika Zivicova, Petr Broz, Zdenek Fik, Alzbeta Mifkova, Jan Plzak, Zdenek Cada, Herbert Kaltner, Jana Fialova Kucerova, Hans-Joachim Gabius, Karel Smetana Jr.

Abstract:

The possibility to determine genome-wide expression profiles of cells and tissues opens a new level of analysis in the quest to define dysregulation in malignancy and thus identify new tumor markers. Toward this long-term aim, we here address two issues on this level for head and neck cancer specimen: i) defining profiles in different regions, i.e. the tumor, the transition zone and normal control and ii) comparing complete data sets for seven individual patients. Special focus in the flanking immunohistochemical part is given to adhesion/growth-regulatory galectins that upregulate chemo- and cytokine expression in an NF-κB-dependent manner, to these regulators and to markers of differentiation, i.e. keratins. The detailed listing of up- and down-regulations, also available in printed form (1), not only served to unveil new candidates for testing as marker but also let the impact of the tumor in the transition zone become apparent. The extent of interindividual variation raises a strong cautionary note on assuming uniformity of regulatory events, to be noted when considering therapeutic implications. Thus, a combination of test targets (and a network analysis for galectins and their downstream effectors) is (are) advised prior to reaching conclusions on further perspectives.

Keywords: galectins, genome scale sequencing, squamous cell carcinoma, transition zone

Procedia PDF Downloads 208

299 Genome-Wide Analysis of BES1/BZR1 Gene Family in Five Plant Species

Authors: Jafar Ahmadi, Zhohreh Asiaban, Sedigheh Fabriki Ourang

Abstract:

Brassinosteroids (BRs) regulate cell elongation, vascular differentiation, senescence and stress responses. BRs signal through the BES1/BZR1 family of transcription factors, which regulate hundreds of target genes involved in this pathway. In this research, a comprehensive genome-wide analysis of the BES1/BZR1 gene family was carried out in Arabidopsis thaliana, Cucumis sativus, Vitis vinifera, Glycine max, and Brachypodium distachyon. Sequence specifications, dot plots and hydropathy plots were analyzed for the protein and genome sequences of the five plant species. The longest protein sequence was Brdic3g with 374 aa, and the shortest was Gm7g with 163 aa. The highest instability index was found for protein sequence AT1G19350 (79.99), and the lowest for protein sequence Gm5g (33.22). The aliphatic index of these protein sequences ranged from 47.82 to 78.79 in Arabidopsis thaliana, 49.91 to 57.50 in Vitis vinifera, 55.09 to 82.43 in Glycine max, 54.09 to 54.28 in Brachypodium distachyon, and 55.36 to 56.83 in Cucumis sativus. Overall, the data obtained from our investigation contribute to a better understanding of the complexity of the BES1/BZR1 gene family and provide a first step towards directing future experimental designs for systematic analysis of the functions of the BES1/BZR1 gene family.
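
The aliphatic index reported above is commonly computed with Ikai's formula, AI = X(Ala) + 2.9 X(Val) + 3.9 (X(Ile) + X(Leu)), where X is the mole percent of each residue. The sketch below assumes that definition; the abstract does not name the tool used, and the example sequences are made up.

```python
def aliphatic_index(protein):
    """Aliphatic index of a protein given as a one-letter amino acid string."""
    protein = protein.upper()
    n = len(protein)
    if n == 0:
        raise ValueError("empty sequence")

    def mole_percent(residue):
        return 100.0 * protein.count(residue) / n

    # Ikai's coefficients: 2.9 for Val, 3.9 for Ile and Leu, relative to Ala.
    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


# Made-up peptides for illustration; not BES1/BZR1 sequences.
ai_mixed = aliphatic_index("AVIL")  # 25 % each of Ala, Val, Ile, Leu
ai_none = aliphatic_index("GGGG")   # no aliphatic side chains
```

Higher values indicate a larger relative volume of aliphatic side chains, which is usually read as a rough indicator of thermostability.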

Keywords: BES1/BZR1, brassinosteroids, phylogenetic analysis, transcription factor

Procedia PDF Downloads 307

298 Prediction of Solanum Lycopersicum Genome Encoded microRNAs Targeting Tomato Spotted Wilt Virus

Authors: Muhammad Shahzad Iqbal, Zobia Sarwar, Salah-ud-Din

Abstract:

Tomato spotted wilt virus (TSWV) belongs to the genus Tospovirus (family Bunyaviridae). It is one of the most devastating pathogens of tomato (Solanum lycopersicum) and heavily damages crop yield each year around the globe. In this study, we retrieved 329 mature miRNA sequences from two microRNA databases (miRBase and miRSoldb) and checked for putative target sites in the downloaded genome sequence of TSWV. A consensus of three miRNA target prediction tools (RNA22, miRanda and psRNATarget) was used to screen out false-positive microRNA target sites in the TSWV genome. These tools predicted different target sites by calculating minimum free energy (mfe), site complementarity, minimum folding energy and other microRNA-mRNA binding factors. The R language was used to plot the predicted target-site data. All the genes having possible target sites for different miRNAs were screened by building a consensus table. Of the 329 mature miRNAs evaluated with the three algorithms, only eight met all the criteria/threshold specifications. MC-Fold and MC-Sym were used to predict the three-dimensional structures of the miRNAs, which were further analyzed in UCSF Chimera to visualize the structural and conformational changes before and after microRNA-mRNA interactions. The results of the current study show that the eight predicted miRNAs could be further evaluated by in vitro experiments to develop TSWV-resistant transgenic tomato plants in the future.
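
One simple way to build a consensus across three prediction tools is to keep only the miRNAs that every tool reports. The sketch below assumes that reading (a set intersection); the abstract does not spell out the exact consensus rule or thresholds, and the miRNA identifiers are hypothetical placeholders rather than real tool output.

```python
def consensus_targets(*tool_predictions):
    """Return the miRNA ids predicted by every tool (set intersection)."""
    sets = [set(predictions) for predictions in tool_predictions]
    if not sets:
        return set()
    return set.intersection(*sets)


# Hypothetical predictions for illustration; not output from the named tools.
rna22_hits = {"miR-a", "miR-b", "miR-c"}
miranda_hits = {"miR-b", "miR-c", "miR-d"}
psrnatarget_hits = {"miR-c", "miR-b", "miR-e"}

shared = consensus_targets(rna22_hits, miranda_hits, psrnatarget_hits)
```

Requiring agreement from all tools trades sensitivity for specificity; a looser rule, such as two out of three, would keep more candidates at the cost of more false positives.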

Keywords: tomato spotted wilt virus (TSWV), Solanum lycopersicum, plant virus, miRNAs, microRNA target prediction, mRNA

Procedia PDF Downloads 120

297 COVID-19 Genomic Analysis and Complete Evaluation

Authors: Narin Salehiyan, Ramin Ghasemi Shayan

Abstract:

The manipulation of coronavirus clones and complementary DNAs (cDNAs) of defective-interfering (DI) RNAs in order to investigate coronavirus RNA replication, transcription, recombination, protein processing and transport, virion assembly, the identification of coronavirus-specific cell receptors, and polymerase processing is the subject of this chapter. The coronavirus genome is nonsegmented, single-stranded, positive-sense RNA. Compared to other RNA viruses, its size is significantly greater, ranging from 27 to 32 kb. The gene encoding the large surface glycoprotein is up to 4.4 kb long and encodes an imposing, trimeric, highly glycosylated protein. This protein rises some 20 nm above the virion envelope, giving the virus the appearance, with a little imagination, of a crown or coronet. Coronavirus research has contributed to the understanding of many aspects of molecular biology in general, such as the mechanism of RNA synthesis, translational control, and protein transport and processing. It remains a treasure capable of generating unexpected insights.

Keywords: covid-19, corona, virus, genome, genetic

Procedia PDF Downloads 41

296 Unraveling the Evolution of Mycoplasma Hominis Through Its Genome Sequence

Authors: Boutheina Ben Abdelmoumen Mardassi, Salim Chibani, Safa Boujemaa, Amaury Vaysse, Julien Guglielmini, Elhem Yacoub

Abstract:

Background and aim: Mycoplasma hominis (MH) is a pathogenic bacterium belonging to the Mollicutes class. It causes a wide range of gynecological infections and infertility among adults. Recently, we explored for the first time the phylodistribution of Tunisian M. hominis clinical strains using an expanded MLST. We demonstrated that they separate into two pure lineages, each corresponding to a specific pathotype: genital infections and infertility. The aim of this project is to gain further insight into the evolutionary dynamics and the specific genetic factors that distinguish MH pathotypes. Methods: Whole genome sequencing of Mycoplasma hominis clinical strains was performed using Illumina MiSeq. De novo assembly was performed using a publicly available in-house pipeline. We used Prokka to annotate the genomes, Panaroo to generate the gene presence matrix and JolyTree to establish the phylogenetic tree. We used treeWAS to identify genetic loci associated with the pathotype of interest from the presence matrix and phylogenetic tree. Results: Our results revealed a clear categorization of the 62 MH clinical strains into two distinct genetic lineages, each corresponding to a specific pathotype: gynecological infections and infertility. Genome annotation showed that the GC content ranges between 26 and 27%, which is a known characteristic of Mycoplasma genomes. Housekeeping genes belonging to the core genome are highly conserved among our strains. treeWAS identified four virulence genes associated with the gynecological infection pathotype, encoding asparagine--tRNA ligase, restriction endonuclease subunit S, Eco47II restriction endonuclease, and transcription regulator XRE (involved in tolerance to oxidative stress). 
Five genes were identified that have a statistical association with infertility: two lipoproteins, one hypothetical protein, a glycosyl transferase involved in capsule synthesis, and pyruvate kinase involved in biofilm formation. All strains harbored an efflux pump that belongs to the family of multidrug resistance ABC transporters, which confers resistance to a wide range of antibiotics. In addition, many adhesion factors and lipoproteins (p120, p120', p60, p80, Vaa) were checked and confirmed in our strains, with conserved domains covering 96 % to 99 % and hypervariable domains covering 1 % to 4 % of the reference sequences extracted from GenBank. Conclusion: In summary, this study led to the identification of specific genetic loci associated with distinct pathotypes in M. hominis.
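
The GC content figure quoted in the results is a simple ratio over the assembled sequence. As a minimal sketch, assuming the assembly is available as a plain nucleotide string and that ambiguous bases such as N are excluded from the denominator (an assumption, since the abstract does not describe the calculation):

```python
def gc_content(sequence):
    """Percent G+C among unambiguous bases; None if there are none."""
    sequence = sequence.upper()
    counts = {base: sequence.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        return None  # sequence holds no A, C, G or T
    return 100.0 * (counts["G"] + counts["C"]) / total


# Made-up toy sequences; not M. hominis assemblies.
gc_balanced = gc_content("ATGC")
gc_with_gaps = gc_content("AATTNNGC")  # the two N bases are ignored
```

For a multi-contig assembly the same function can be applied to the concatenated contigs, so that longer contigs weigh proportionally more.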

Keywords: mycoplasma hominis, infertility, gynecological infections, virulence genes, antibiotic resistance

Procedia PDF Downloads 53