Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 26863

Search results for: genome analysis

26803 RNA-Seq Analysis of the Wild Barley (H. spontaneum) Leaf Transcriptome under Salt Stress

Authors: Ahmed Bahieldin, Ahmed Atef, Jamal S. M. Sabir, Nour O. Gadalla, Sherif Edris, Ahmed M. Alzohairy, Nezar A. Radhwan, Mohammed N. Baeshen, Ahmed M. Ramadan, Hala F. Eissa, Sabah M. Hassan, Nabih A. Baeshen, Osama Abuzinadah, Magdy A. Al-Kordy, Fotouh M. El-Domyati, Robert K. Jansen

Abstract:

Wild salt-tolerant barley (Hordeum spontaneum) is the ancestor of cultivated barley (Hordeum vulgare or H. vulgare). Although the cultivated barley genome is well studied, little is known about genome structure and function of its wild ancestor. In the present study, RNA-Seq analysis was performed on young leaves of wild barley treated with salt (500 mM NaCl) at four different time intervals. Transcriptome sequencing yielded 103 to 115 million reads for all replicates of each treatment, corresponding to over 10 billion nucleotides per sample. Of the total reads, between 74.8 and 80.3% could be mapped and 77.4 to 81.7% of the transcripts were found in the H. vulgare unigene database (unigene-mapped). The unmapped wild barley reads for all treatments and replicates were assembled de novo and the resulting contigs were used as a new reference genome. This resulted in 94.3 to 95.3% of the unmapped reads mapping to the new reference. The number of differentially expressed transcripts was 9277, 3861 of which were unigene-mapped. The annotated unigene- and de novo-mapped transcripts (5100) were utilized to generate expression clusters across time of salt stress treatment. Two-dimensional hierarchical clustering classified differential expression profiles into nine expression clusters, four of which were selected for further analysis. Differentially expressed transcripts were assigned to the main functional categories. The most important groups were ‘response to external stimulus’ and ‘electron-carrier activity’. Highly expressed transcripts are involved in several biological processes, including electron transport and exchanger mechanisms, flavonoid biosynthesis, reactive oxygen species (ROS) scavenging, ethylene production, signaling network and protein refolding. The comparisons demonstrated that mRNA-Seq is an efficient method for the analysis of differentially expressed genes and biological processes under salt stress.

Keywords: electron transport, flavonoid biosynthesis, reactive oxygen species, rnaseq

Procedia PDF Downloads 358

26802 Reconstruction of a Genome-Scale Metabolic Model to Simulate Uncoupled Growth of Zymomonas mobilis

Authors: Maryam Saeidi, Ehsan Motamedian, Seyed Abbas Shojaosadati

Abstract:

Zymomonas mobilis is known as an example of the uncoupled growth phenomenon. This microorganism also has a unique metabolism that degrades glucose by the Entner–Doudoroff (ED) pathway. In this paper, a genome-scale metabolic model including 434 genes, 757 reactions and 691 metabolites was reconstructed to simulate uncoupled growth and study its effect on flux distribution in the central metabolism. The model properly predicted that ATPase was activated in experimental growth yields of Z. mobilis. The flux distribution obtained from the model indicates that the major carbon flux passed through the ED pathway, resulting in the production of ethanol. Small amounts of the carbon source entered the pentose phosphate pathway and the TCA cycle to produce biomass precursors. The predicted flux distribution was in good agreement with experimental data. The model results also indicated that Z. mobilis metabolism is able to produce biomass with a maximum growth yield of 123.7 g (mol glucose)⁻¹ if ATP synthase is coupled with growth and produces 82 mmol ATP gDCW⁻¹ h⁻¹. Coupling growth and energy reduced ethanol secretion and changed the flux distribution to produce biomass precursors.

Keywords: genome-scale metabolic model, Zymomonas mobilis, uncoupled growth, flux distribution, ATP dissipation

Procedia PDF Downloads 452

26801 Evolution of DNA-Binding With-One-Finger Transcriptional Factor Family in Diploid Cotton Gossypium raimondii

Authors: Waqas Shafqat Chattha, Muhammad Iqbal, Amir Shakeel

Abstract:

Transcription factors are proteins that play a vital role in regulating the transcription of target genes in different biological processes and are widely studied in different plant species. In the current era of genomics, plant genome sequencing has led to the genome-wide identification, analysis and categorization of diverse transcription factor families and has hence provided key insights into their structural as well as functional diversity. The DNA-binding with One Finger (DOF) proteins belong to the C2-C2-type zinc finger protein family. DOF proteins are plant-specific transcription factors implicated in diverse functions including seed maturation and germination, phytohormone signalling, light-mediated gene regulation, cotton-fiber elongation and responses of the plant to biotic as well as abiotic stresses. In this context, a genome-wide in silico analysis of the DOF TF family in the diploid cotton species Gossypium raimondii enabled us to identify 55 non-redundant genes encoding DOF proteins, renamed GrDofs (Gossypium raimondii Dof). Gene distribution studies showed that the GrDof genes are unevenly distributed across 12 of the 13 G. raimondii chromosomes. Gene structure analysis showed that 34 of the 55 GrDof genes are intron-less, while the remaining 21 genes have a single intron. Protein sequence-based phylogenetic analysis of the 55 putative GrDOFs divided these proteins into 5 major groups with various paralogous gene pairs. Molecular evolutionary studies, aided by conserved domain and gene structure analysis, suggested that segmental duplications were the principal contributors to the expansion of Dof genes in G. raimondii.

Keywords: diploid cotton, G. raimondii, phylogenetic analysis, transcription factor

Procedia PDF Downloads 117

26800 Microarrays: Wide Clinical Utilities and Advances in Healthcare

Authors: Salma M. Wakil

Abstract:

Advances in the field of genetics have enabled the detection of a large number of inherited disorders at the molecular level and have led to the development of innovative technologies. These innovations include gene sequencing, prenatal mutation detection, pre-implantation genetic diagnosis, population-based carrier screening and genome-wide analyses using microarrays. Microarrays are widely used in establishing clinical and diagnostic setups for genetic anomalies at a massive level, with the advent of cytoscan molecular karyotyping as a clinical utility card for detecting chromosomal aberrations with high coverage across the entire human genome. Unlike a regular karyotype, which relies on the microscopic inspection of chromosomes, molecular karyotyping with cytoscan constructs virtual chromosomes based on copy number analysis of DNA, which improves its resolution 100-fold. We have been investigating a large number of patients with developmental delay and intellectual disability on this platform to establish microdeletion syndromes and have detected a number of novel CNVs with clinical relevance in the Arabian population.

Keywords: microarrays, molecular karyotyping, developmental delay, genetics

Procedia PDF Downloads 424

26799 Genome-Wide Analysis Identifies Locus Associated with Parathyroid Hormone Levels

Authors: Antonela Matana, Dubravka Brdar, Vesela Torlak, Marijana Popovic, Ivana Gunjaca, Ozren Polasek, Vesna Boraska Perica, Maja Barbalic, Ante Punda, Caroline Hayward, Tatijana Zemunik

Abstract:

Parathyroid hormone (PTH) plays a critical role in the regulation of bone mineral metabolism and calcium homeostasis. Higher PTH levels are associated with heart failure, hypertension, coronary artery disease, cardiovascular mortality and poorer bone health. A twin study estimated that 60% of the variation in PTH concentrations is genetically determined. Only one GWAS of PTH concentration has been reported to date. The identified loci explained 4.5% of the variance in circulating PTH, suggesting that additional genetic variants remain undiscovered. Therefore, the aim of this study was to identify novel genetic variants associated with PTH levels in a general population. We performed a GWAS meta-analysis on 2596 individuals originating from three Croatian cohorts: the City of Split and the Islands of Korčula and Vis, within the large-scale “10,001 Dalmatians” project. A total of 7,411,206 variants, imputed using the 1000 Genomes reference panel, with minor allele frequency ≥ 1% and Rsq ≥ 0.5 were analyzed for association. GWAS within each data set was performed under an additive model, controlling for age, gender and relatedness. Meta-analysis was conducted using the inverse-variance fixed-effects method. Furthermore, to identify sex-specific effects, we conducted GWAS meta-analyses analyzing males and females separately. In addition, we performed biological pathway analysis. Four SNPs, representing one locus, reached genome-wide significance. The most significant SNP was rs11099476 on chromosome 4 (P = 1.15 × 10⁻⁸), which explained 1.14% of the variance in PTH. The SNP is located near the protein-coding gene RASGEF1B. Additionally, we detected suggestive associations with two SNPs: rs77178854, located on chromosome 2 in the DPP10 gene (P = 2.46 × 10⁻⁷), and rs481121, located on chromosome 1 (P = 3.58 × 10⁻⁷) near the GRIK1 gene. One of the top hits detected in the main meta-analysis, the intron variant rs77178854 located within the DPP10 gene, reached genome-wide significance in females (P = 2.21 × 10⁻⁹). No single locus was identified in the meta-analysis in males. Fifteen biological pathways were functionally enriched at P < 0.01, including muscle contraction, ion homeostasis and cardiac conduction as the most significant pathways. RASGEF1B is a guanine nucleotide exchange factor, known to be associated with height, bone density, and hip. DPP10 encodes a membrane protein that is a member of the serine protease family, which binds specific voltage-gated potassium channels and alters their expression and biophysical properties. In conclusion, we identified 2 novel loci associated with PTH levels in a general population, providing us with further insights into the genetics of this complex trait.
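The inverse-variance fixed-effects method named in the abstract above can be illustrated with a minimal sketch. It pools one variant's per-cohort effect estimates by weighting each with the reciprocal of its squared standard error. The function name and the example numbers are hypothetical, and the sketch assumes a normal approximation for the two-sided P value; it is not the authors' pipeline.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Pool per-cohort effect sizes with inverse-variance weights.

    Returns (pooled_beta, pooled_se, z_score, two_sided_p).
    """
    if not betas or len(betas) != len(standard_errors):
        raise ValueError("need one standard error per effect estimate")
    # Each cohort is weighted by 1 / SE^2, so precise cohorts count more.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z_score = pooled_beta / pooled_se
    # Two-sided P value under a standard normal approximation.
    p_value = math.erfc(abs(z_score) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z_score, p_value


# Hypothetical effect estimates for one SNP in three cohorts.
beta, se, z, p = fixed_effect_meta([0.12, 0.09, 0.15], [0.03, 0.04, 0.05])
```

Because the pooled standard error shrinks as cohorts are added, a variant that is only suggestive in each cohort can reach genome-wide significance in the combined analysis.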

Keywords: general population, genome-wide association analysis, parathyroid hormone, single nucleotide polymorphisms

Procedia PDF Downloads 198

26798 Predictive Pathogen Biology: Genome-Based Prediction of Pathogenic Potential and Countermeasures Targets

Authors: Debjit Ray

Abstract:

Horizontal gene transfer (HGT) and recombination lead to the emergence of bacterial antibiotic resistance and pathogenic traits. Comparing a large number of fully sequenced genomes across a species or genus can identify HGT events, define the phylogenetic range of HGT, and find potential sources of new resistance genes. In-depth comparative phylogenomics can also identify subtle genome or plasmid structural changes or mutations associated with phenotypic changes. Comparative phylogenomics requires accurately sequenced, complete and properly annotated genomes of the organism. Assembling closed genomes requires additional mate-pair reads or “long read” sequencing data to accompany short-read paired-end data. To bring down the cost and time required to produce assembled genomes and annotate genome features that inform drug resistance and pathogenicity, we are analyzing the genome assembly performance of data from the Illumina NextSeq, which has faster throughput than the Illumina HiSeq (~1-2 days versus ~1 week), and shorter reads (150 bp paired-end versus 300 bp paired-end) but higher capacity (150-400M reads per run versus ~5-15M) compared to the Illumina MiSeq. Bioinformatics improvements are also needed to make rapid, routine production of complete genomes a reality. Modern assemblers such as SPAdes 3.6.0 running on a standard Linux blade can, in a few hours, convert mixes of reads from different library preps into high-quality assemblies with only a few gaps. Remaining breaks in scaffolds, which are generally due to repeats (e.g., rRNA genes), are addressed by our gap-closure software, which avoids custom PCR or targeted sequencing. Our goal is to improve the understanding of the emergence of pathogenesis using sequencing, comparative genomics, and machine learning analysis of ~1000 pathogen genomes. Machine learning algorithms will be used to digest the diverse features (change in virulence genes, recombination, horizontal gene transfer, patient diagnostics). Temporal data and evolutionary models can thus determine whether the origin of a particular isolate is likely to have been from the environment (could it have evolved from previous isolates). It can be useful for comparing differences in virulence along or across the tree. More intriguing, it can test whether there is a direction to virulence strength. This would open new avenues in the prediction of uncharacterized clinical bugs and multidrug resistance evolution and pathogen emergence.

Keywords: genomics, pathogens, genome assembly, superbugs

Procedia PDF Downloads 170

26797 Cassava Plant Architecture: Insights from Genome-Wide Association Studies

Authors: Abiodun Olayinka, Daniel Dzidzienyo, Pangirayi Tongoona, Samuel Offei, Edwige Gaby Nkouaya Mbanjo, Chiedozie Egesi, Ismail Yusuf Rabbi

Abstract:

Cassava (Manihot esculenta Crantz) is a major source of starch for various industrial applications. However, the traditional cultivation and harvesting methods of cassava are labour-intensive and inefficient, limiting the supply of fresh cassava roots for industrial starch production. To achieve improved productivity and quality of fresh cassava roots through mechanized cultivation, cassava cultivars with compact plant architecture and moderate plant height are needed. Plant architecture-related traits, such as plant height, harvest index, stem diameter, branching angle, and lodging tolerance, are critical for crop productivity and suitability for mechanized cultivation. However, the genetics of cassava plant architecture remain poorly understood. This study aimed to identify the genetic bases of the relationships between plant architecture traits and productivity-related traits, particularly starch content. A panel of 453 clones developed at the International Institute of Tropical Agriculture, Nigeria, was genotyped and phenotyped for 18 plant architecture and productivity-related traits at four locations in Nigeria. A genome-wide association study (GWAS) was conducted using the phenotypic data from a panel of 453 clones and 61,238 high-quality Diversity Arrays Technology sequencing (DArTseq) derived Single Nucleotide Polymorphism (SNP) markers that are evenly distributed across the cassava genome. Five significant associations between ten SNPs and three plant architecture component traits were identified through GWAS. We found five SNPs on chromosomes 6 and 16 that were significantly associated with shoot weight, harvest index, and total yield through genome-wide association mapping. We also discovered an essential candidate gene that is co-located with peak SNPs linked to these traits in M. esculenta. 
A review of the cassava reference genome v7.1 revealed that the SNP on chromosome 6 is in proximity to Manes.06G101600.1, a gene that regulates endodermal differentiation and root development in plants. The findings of this study provide insights into the genetic basis of plant architecture and yield in cassava. Cassava breeders could leverage this knowledge to optimize plant architecture and yield in cassava through marker-assisted selection and targeted manipulation of the candidate gene.

Keywords: Manihot esculenta Crantz, plant architecture, DArTseq, SNP markers, genome-wide association study

Procedia PDF Downloads 36

26796 Genomic and Evolutionary Diversity of Long Terminal Repeat (LTR) Retrotransposons in Date Palm (Phoenix dactylifera)

Authors: Faisal Nouroz, Mukaramin Mukaramin

Abstract:

Of the transposable elements (TEs), the retrotransposons are the most copious elements identified from many sequenced genomes. They have played a major role in genome evolution, rearrangement, and expansion through their copy-and-paste mode of proliferation. They are further divided into LTR and non-LTR retrotransposons. The purpose of the current study was to identify the LTR REs in the sequenced Phoenix dactylifera genome and to study their structural diversity. A total of 150 P. dactylifera bacterial artificial chromosome (BAC) sequences with sizes > 60 kb were randomly retrieved from the National Center for Biotechnology Information (NCBI) database and screened for the presence of LTR retrotransposons. Seven BAC sequences showed full-length LTR retrotransposons with 4 Copia and 3 Gypsy families having variable copy numbers in the respective families. The reverse transcriptase (RT) domain was found to be the most conserved domain among the Copia and Gypsy superfamilies and was used for evolutionary analysis. The amino acid residues among various RT sequences showed variability in their percentages, indicating post-divergence evolution. The amino acid leucine was found in the highest proportion, followed by lysine, while methionine and tryptophan were present in the lowest percentages. The phylogenetic analysis based on RT domains confirmed that, despite the RT regions being the most conserved, several evolutionary events occurred, causing nucleotide polymorphisms and hence clustering of Gypsy and Copia superfamilies into their respective lineages. The study will be helpful in the identification and annotation of these elements in other species and genera and in studying their distribution patterns on chromosomes by fluorescent in situ hybridization techniques.
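The amino acid percentages reported for the RT domains above are simple composition counts. The following minimal sketch shows how such percentages could be computed for a single protein sequence; the function name and the example peptide are hypothetical and are not taken from the study.

```python
from collections import Counter


def amino_acid_percentages(protein_sequence):
    """Return {residue: percentage} for a one-letter protein sequence,
    ordered from the most to the least frequent residue."""
    # Ignore gaps, stop symbols and whitespace; keep letters only.
    residues = [ch for ch in protein_sequence.upper() if ch.isalpha()]
    if not residues:
        raise ValueError("sequence contains no residues")
    counts = Counter(residues)
    total = len(residues)
    return {aa: 100.0 * n / total for aa, n in counts.most_common()}


# Hypothetical RT-like fragment used only to exercise the function.
composition = amino_acid_percentages("LLKDLKLAMWLKL")
```

Comparing these dictionaries across RT sequences gives the kind of per-residue variability the abstract describes.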

Keywords: transposable elements, Phoenix dactylifera, retrotransposons, phylogenetic analysis

Procedia PDF Downloads 102

26795 Development and Characterization of Polymorphic Genomic-SSR Markers in Asian Long-Horned Beetle (Anoplophora glabripennis)

Authors: Zhao Yang Liu, Jing Tao

Abstract:

The Asian long-horned beetle, Anoplophora glabripennis (Motschulsky) (Coleoptera: Cerambycidae: Lamiinae), is a polyphagous, xylophagous wood-borer native to Asia that kills healthy trees. Because it poses a serious danger to trees, the beetle has attracted close attention worldwide. However, genetic markers for the species, especially microsatellites, are limited. In this study, 24 novel simple sequence repeat (SSR) molecular markers, a powerful tool for genetic diversity studies and linkage map construction, were developed and characterized from whole-genome shotgun sequences. We identified 9895 perfect SSR loci with repeat units of 2 to 6 nucleotides; the SSR density was one SSR per 56.57 kb, the SSR abundance was 0.02/kb, and 140 types of repeat motifs were found. Half of the 48 SSR primer pairs (covering 4 di-, 7 tri-, 2 tetra- and 11 hexanucleotide SSRs) that we selected randomly from 1222 primer pairs were polymorphic. The number of alleles for these markers in 48 individuals varied from 3 to 21 with an average of 7.71, and the number of effective alleles ranged from 1.22 to 9.97 with an average of 3.54. The polymorphic information content (PIC) ranged from 0.18 to 0.89 with a mean of 0.65, and Shannon's information index (I) ranged from 0.46 to 2.62 with an average of 1.44. The results suggest that this method of screening for SSRs across the whole genome is feasible and efficient. The SSR markers developed in this study can be used for population genetic studies of A. glabripennis. Moreover, they may also be helpful for the development of microsatellites for other Coleoptera.
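The per-marker statistics quoted above (effective number of alleles, PIC and Shannon's information index) all derive from allele frequencies at one locus. The sketch below uses the standard textbook definitions: Ne = 1/Σp², I = −Σ p ln p, and PIC = 1 − Σp² − ΣΣ 2pᵢ²pⱼ² for i < j. The function name and the allele counts are hypothetical, and the abstract does not state which software the authors used.

```python
import math
from itertools import combinations


def locus_diversity(allele_counts):
    """Compute diversity statistics for one SSR locus from allele counts."""
    total = sum(allele_counts)
    if total <= 0:
        raise ValueError("allele counts must sum to a positive number")
    freqs = [count / total for count in allele_counts if count > 0]
    sum_sq = sum(p * p for p in freqs)
    # Cross term of the PIC formula, summed over unordered allele pairs.
    cross = sum(2.0 * p * p * q * q for p, q in combinations(freqs, 2))
    return {
        "n_alleles": len(freqs),
        "effective_alleles": 1.0 / sum_sq,
        "shannon_index": -sum(p * math.log(p) for p in freqs),
        "pic": 1.0 - sum_sq - cross,
    }


# Hypothetical allele counts at one locus scored in 48 diploid individuals.
stats = locus_diversity([40, 30, 16, 10])
```

A locus with many alleles at even frequencies scores high on all three measures, which is why the three ranges reported in the abstract track one another.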

Keywords: SSR markers, Anoplophora glabripennis, genetic diversity, whole genome

Procedia PDF Downloads 358

26794 Genome-Wide Homozygosity Analysis of the Longevous Phenotype in the Amish Population

Authors: Sandra Smieszek, Jonathan Haines

Abstract:

Introduction: Numerous research efforts have focused on searching for ‘longevity genes’. However, attempts to decipher the genetic component of the longevous phenotype have had limited success, and the mechanisms governing longevity remain to be explained. We conducted a genome-wide homozygosity analysis (GWHA) of the founder population of the Amish community in central Ohio. While genome-wide association studies using unrelated individuals have revealed many interesting longevity-associated variants, these variants are typically of small effect and cannot explain the observed patterns of heritability for this complex trait. The Amish provide a large cohort of extended kinships allowing for in-depth analysis via family-based approach excellent population due to its. Heritability of longevity increases with age, with a significant genetic contribution seen in individuals living beyond 60 years of age. In our present analysis, we show that the heritability of longevity is estimated to increase with age, particularly on the paternal side. Methods: The present analysis integrated both phenotypic and genotypic data and led to the discovery of a series of variants, distinct for stratified populations across ages and distinct for paternal and maternal cohorts. Specifically, 5437 subjects were analyzed, and a subset of 893 successfully genotyped individuals was used to assess CHIP heritability. We conducted the homozygosity analysis to examine whether homozygosity is associated with increased risk of living beyond 90. We analyzed the Amish cohort genotyped for 614,957 SNPs. Results: We delineated 10 significant regions of homozygosity (ROH) specific for the age group of interest (>90). Of particular interest was an ROH on chromosome 13, P < 0.0001. The lead SNPs rs7318486 and rs9645914 point to COL4A2 and our lead SNP. COL25A1 encodes one of the six subunits of type IV collagen; the C-terminal portion of the protein, known as canstatin, is an inhibitor of angiogenesis and tumor growth. COL4A2 mutations have been reported with a broader spectrum of cerebrovascular, renal, ophthalmological, cardiac, and muscular abnormalities. The second region of interest points to IRS2. Furthermore, we built a classifier using the SNPs obtained from the significant ROH region, with an AUC of 0.945, giving the ability to discriminate between those living up to 90 years of age and those living beyond. Conclusion: Our results suggest that a history of longevity does indeed contribute to increasing the odds of individual longevity. Preliminary results are consistent with the conjecture that heritability of longevity is substantial when we look at the oldest fifth and smaller percentiles of survival, specifically in males. We will validate all the candidate variants in independent cohorts of centenarians to test whether they are robustly associated with human longevity. The regions of interest identified via ROH analysis could be of profound importance for understanding the genetic underpinnings of longevity.
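A region of homozygosity is, at its simplest, a stretch of consecutive SNPs at which an individual carries two copies of the same allele. The sketch below scans one individual's genotypes along a chromosome and reports such stretches. It is a deliberately simplified illustration: the function name, the 0/1/2 genotype coding, and the `min_snps` threshold are assumptions, and dedicated ROH tools typically use sliding windows that tolerate occasional heterozygous or missing calls, which this sketch does not.

```python
def find_roh(genotypes, positions, min_snps=3):
    """Return (start_bp, end_bp, n_snps) for each run of homozygous calls.

    genotypes: minor-allele counts per SNP (0 or 2 = homozygous,
               1 = heterozygous; anything else, e.g. None, breaks a run).
    positions: base-pair coordinate of each SNP, in ascending order.
    """
    if len(genotypes) != len(positions):
        raise ValueError("genotypes and positions must have the same length")
    runs = []
    run_start = None  # index where the current homozygous run began
    # A trailing heterozygous sentinel flushes a run that reaches the end.
    for i, call in enumerate(list(genotypes) + [1]):
        if call in (0, 2):
            if run_start is None:
                run_start = i
        elif run_start is not None:
            n_snps = i - run_start
            if n_snps >= min_snps:
                runs.append((positions[run_start], positions[i - 1], n_snps))
            run_start = None
    return runs


# Hypothetical genotypes for ten SNPs spaced 100 bp apart.
example_runs = find_roh([0, 2, 2, 1, 0, 0, 0, 0, 1, 2],
                        list(range(100, 1100, 100)))
```

Runs found this way in each individual can then be compared between age groups to look for regions enriched among the oldest participants.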

Keywords: regions of homozygosity, longevity, SNP, Amish

Procedia PDF Downloads 206

26793 The Cleavage of DNA by the Anti-Tumor Drug Bleomycin at the Transcription Start Sites of Human Genes Using Genome-Wide Techniques

Authors: Vincent Murray

Abstract:

The glycopeptide bleomycin is used in the treatment of testicular cancer, Hodgkin's lymphoma, and squamous cell carcinoma. Bleomycin damages and cleaves DNA in human cells, and this is considered to be the main mode of action for bleomycin's anti-tumor activity. In particular, double-strand breaks are thought to be the main mechanism for the cellular toxicity of bleomycin. Using Illumina next-generation DNA sequencing techniques, the genome-wide sequence specificity of bleomycin-induced double-strand breaks was determined in human cells. The degree of bleomycin cleavage was also assessed at the transcription start sites (TSSs) of actively transcribed genes and compared with non-transcribed genes. It was observed that bleomycin preferentially cleaved at the TSSs of actively transcribed human genes. There was a correlation between the degree of this enhanced cleavage at TSSs and the level of transcriptional activity. Bleomycin cleavage is also affected by chromatin structure and at TSSs, the peaks of bleomycin cleavage were approximately 200 bp apart. This indicated that bleomycin was able to detect phased nucleosomes at the TSSs of actively transcribed human genes. The genome-wide cleavage pattern of the bleomycin analogues 6′-deoxy-BLM Z and zorbamycin was also investigated in human cells. As found for bleomycin, these bleomycin analogues also preferentially cleaved at the TSSs of actively transcribed human genes. The cytotoxicity (IC₅₀ values) of these bleomycin analogues was determined. It was found that the degree of enhanced cleavage at TSSs was inversely correlated with the IC₅₀ values of the bleomycin analogues. This suggested that the level of cleavage at the TSSs of actively transcribed human genes was important for the cytotoxicity of bleomycin and analogues. Hence this study provided a deeper understanding of the cellular processes involved in the cancer chemotherapeutic activity of bleomycin.

Keywords: anti-tumour activity, bleomycin analogues, chromatin structure, genome-wide study, Illumina DNA sequencing

Procedia PDF Downloads 94

26792 CRISPR-Mediated Genome Editing for Yield Enhancement in Tomato

Authors: Aswini M. S.

Abstract:

Tomato (Solanum lycopersicum L.) is one of the most significant vegetable crops in terms of its economic benefits. Both fresh and processed tomatoes are consumed. Tomatoes have a limited genetic base, which makes breeding extremely challenging. Plant breeding has become much simpler and more effective with the genome editing tools of CRISPR and CRISPR-associated protein 9 (CRISPR/Cas9), which address the problems with traditional breeding, chemical/physical mutagenesis, and transgenics. With the use of CRISPR/Cas9, a number of tomato traits have been functionally characterized and edited. These traits include plant architecture as well as flower characters (leaf, flower, male sterility, and parthenocarpy), fruit ripening, quality and nutrition (lycopene, carotenoid, GABA, TSS, and shelf-life), disease resistance (late blight, TYLCV, and powdery mildew), tolerance to abiotic stress (heat, drought, and salinity) and resistance to herbicides. This study explores the potential of CRISPR/Cas9 genome editing for enhancing yield in tomato plants. The study utilized the CRISPR/Cas9 genome editing technology to functionally edit various traits in tomatoes. The de novo domestication of elite features from wild cousins to cultivated tomatoes and vice versa has been demonstrated by the introgression of CRISPR/Cas9. Cas9-mediated editing of the CycB (lycopene beta-cyclase) gene increased the lycopene content in tomato. Also, Cas9-mediated editing of the AGL6 (Agamous-like 6) gene resulted in parthenocarpic fruit development under heat-stress conditions. The advent of CRISPR/Cas has made it possible to use digital resources for single guide RNA design and multiplexing, cloning (such as Golden Gate cloning, GoldenBraid, etc.), creating robust CRISPR/Cas constructs, and implementing effective transformation protocols such as Agrobacterium-mediated transformation and the DNA-free protoplast method for Cas9-gRNA ribonucleoprotein (RNP) complexes. 
Additionally, homologous recombination (HR)-based gene knock-in (HKI) via geminivirus replicon and base/prime editing (Target-AID technology) remains possible. Hence, CRISPR/Cas facilitates fast and efficient breeding in the improvement of tomatoes.

Keywords: CRISPR-Cas, biotic and abiotic stress, flower and fruit traits, genome editing, polygenic trait, tomato and trait introgression

Procedia PDF Downloads 36

26791 An Analysis on Clustering Based Gene Selection and Classification for Gene Expression Data

Authors: K. Sathishkumar, V. Thiagarasu

Abstract:

Due to recent advances in DNA microarray technology, it is now feasible to obtain gene expression profiles of tissue samples at relatively low cost. Many scientists around the world take advantage of this gene profiling to characterize complex biological circumstances and diseases. Microarray techniques used in genome-wide gene expression and genome mutation analysis help scientists and physicians in understanding pathophysiological mechanisms, in making diagnoses and prognoses, and in choosing treatment plans. DNA microarray technology has now made it possible to simultaneously monitor the expression levels of thousands of genes during important biological processes and across collections of related samples. Elucidating the patterns hidden in gene expression data offers a tremendous opportunity for an enhanced understanding of functional genomics. However, the large number of genes and the complexity of biological networks greatly increase the challenges of comprehending and interpreting the resulting mass of data, which often consists of millions of measurements. A first step toward addressing this challenge is the use of clustering techniques, which are essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. This work presents an analysis of several clustering algorithms proposed to deal with gene expression data effectively. Existing algorithms such as Support Vector Machine (SVM), the K-means algorithm and evolutionary algorithms are analyzed thoroughly to identify their advantages and limitations. The performance of the existing algorithms is evaluated to determine the best approach. In order to improve the classification performance of the best approach in terms of accuracy, convergence behavior and processing time, a hybrid clustering-based optimization approach has been proposed.
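Of the algorithms named above, K-means is the simplest to show in full. The sketch below groups gene expression profiles (one list of expression values per gene) by alternating between assigning each profile to its nearest centroid and recomputing the centroids. The function names and the toy data are hypothetical; for reproducibility the sketch seeds the centroids with the first k profiles, whereas practical implementations usually use randomized or k-means++ initialization. It does not represent the hybrid approach the abstract proposes.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, max_iter=100):
    """Cluster expression profiles; returns (labels, centroids)."""
    if not 0 < k <= len(profiles):
        raise ValueError("k must be between 1 and the number of profiles")
    # Deterministic seeding: the first k profiles become the centroids.
    centroids = [list(p) for p in profiles[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid for every profile.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical expression values for four genes across three samples.
toy_profiles = [[1.0, 1.1, 0.9], [5.0, 5.2, 4.9], [1.2, 0.9, 1.0], [5.1, 4.8, 5.0]]
toy_labels, toy_centroids = kmeans(toy_profiles, k=2)
```

The result depends on the initial centroids and on the choice of k, which is one of the limitations such comparative studies typically examine.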

Keywords: microarray technology, gene expression data, clustering, gene selection

Procedia PDF Downloads 286

26790 Efficient Reuse of Exome Sequencing Data for Copy Number Variation Callings

Authors: Chen Wang, Jared Evans, Yan Asmann

Abstract:

With the rapid evolution of next-generation sequencing techniques, whole-exome or exome-panel data have become a cost-effective way to detect small exonic mutations, but there has been a growing desire to accurately detect copy number variations (CNVs) as well. To address these research and clinical needs, we developed a sequencing coverage pattern-based method for copy number detection, data integrity checks, CNV calling, and visualization reports. The developed methodologies include complete automation to increase usability, genome content-coverage bias correction, CNV segmentation, data quality reports, and publication-quality images. Poor-quality outlier samples were identified and removed automatically. Multiple experimental batches were routinely detected and further reduced to a clean subset of samples before analysis. Algorithm improvements were also made to improve somatic CNV detection as well as germline CNV detection in family trios. Additionally, a set of utilities was included to help users produce CNV plots for focused genes of interest. We demonstrate the somatic CNV enhancements by accurately detecting CNVs in whole-exome data from The Cancer Genome Atlas cancer samples and a lymphoma case study with paired tumor and normal samples. We also show efficient reuse of existing exome sequencing data for improved germline CNV calling in a family trio from the phase III study of the 1000 Genomes Project to detect CNVs with various modes of inheritance. The performance of the developed method is evaluated by comparing CNV calling results with results from other orthogonal copy number platforms.
Our case studies show that reusing exome sequencing data for CNV calling offers several notable benefits, including better quality control for exome sequencing data, improved joint analysis with single nucleotide variant calls, and novel genomic discovery from under-utilized existing whole-exome and custom exome panel data.
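The coverage-based idea behind this kind of CNV calling can be illustrated with a minimal sketch. It is not the authors' method: the median normalization, the pseudocount and the gain/loss thresholds below are illustrative assumptions, and the function names are hypothetical.

```python
import math
from statistics import median

def log2_ratios(sample_depth, reference_depth, pseudocount=0.5):
    """Per-exon log2 coverage ratio of a sample against a reference.

    Each depth vector is scaled by its own median so that differences in
    total sequencing yield cancel out. The pseudocount (an assumption)
    avoids taking the log of zero for uncovered exons.
    """
    s_med = median(sample_depth)
    r_med = median(reference_depth)
    ratios = []
    for s, r in zip(sample_depth, reference_depth):
        s_norm = (s + pseudocount) / s_med
        r_norm = (r + pseudocount) / r_med
        ratios.append(math.log2(s_norm / r_norm))
    return ratios

def call_cnv(ratios, gain=0.4, loss=-0.6):
    """Label each exon; the thresholds are illustrative, not published values."""
    return ["gain" if x >= gain else "loss" if x <= loss else "neutral"
            for x in ratios]

# Toy depths for five exons: exon 3 looks duplicated, exon 4 looks deleted.
sample = [100, 100, 200, 50, 100]
reference = [100, 100, 100, 100, 100]
calls = call_cnv(log2_ratios(sample, reference))
```

A real caller would add the bias correction and segmentation steps the abstract mentions; this sketch only shows the per-exon ratio they operate on.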

Keywords: bioinformatics, computational genetics, copy number variations, data reuse, exome sequencing, next generation sequencing

Procedia PDF Downloads 232

26789 Breeding Cotton for Annual Growth Habit: Remobilizing End-of-season Perennial Reserves for Increased Yield

Authors: Salman Naveed, Nitant Gandhi, Grant Billings, Zachary Jones, B. Todd Campbell, Michael Jones, Sachin Rustgi

Abstract:

Cotton (Gossypium spp.) is the primary source of natural fiber in the U.S. and a major crop in the Southeastern U.S. Despite constant efforts to increase the cotton fiber yield, the yield gain has stagnated. Therefore, we undertook a novel approach to improve the cotton fiber yield by altering its growth habit from perennial to annual. In this effort, we identified genotypes with high-expression alleles of five floral induction and meristem identity genes (FT, SOC1, FUL, LFY, and AP1) from an upland cotton mini-core collection and crossed them in various combinations to develop cotton lines with annual growth habit, optimal flowering time and enhanced productivity. To facilitate the characterization of genotypes with the desired combinations of stacked alleles, we identified markers associated with the gene expression traits via genome-wide association analysis using a 63K SNP Array (Hulse-Kemp et al. 2015 G3 5:1187). Over 14,500 SNPs showed polymorphism and were used for association analysis. A total of 396 markers showed association with expression traits. Out of these 396 markers, 159 mapped to genes, 50 to untranslated regions, and 187 to random genomic regions. Biased genomic distribution of associated markers was observed where more trait-associated markers mapped to the cotton D sub-genome. Many quantitative trait loci coincided at specific genomic regions. This observation has implications as these traits could be bred together. The analysis also allowed the identification of candidate regulators of the expression patterns of these floral induction and meristem identity genes whose functions will be validated via virus-induced gene silencing.

Keywords: cotton, GWAS, QTL, expression traits

Procedia PDF Downloads 122

26788 A Comprehensive Analysis of LACK (Leishmania Homologue of Receptors for Activated C Kinase) in the Context of Visceral Leishmaniasis

Authors: Sukrat Sinha, Abhay Kumar, Shanthy Sundaram

Abstract:

The Leishmania homologue of receptors for activated C kinase (LACK) is a known T cell epitope from soluble Leishmania antigens (SLA) that confers protection against Leishmania challenge. This antigen has been found to be highly conserved among Leishmania strains. LACK has been shown to be protective against L. donovani challenge. A comprehensive analysis of several LACK sequences was completed. The analysis shows a high level of conservation, lower variability and higher antigenicity in specific portions of the LACK protein. This information supports the consideration of LACK as a putative vaccine target candidate for visceral leishmaniasis.

Keywords: bioinformatics, genome assembly, Leishmania activated protein kinase C (LACK), next-generation sequencing

Procedia PDF Downloads 307

26787 Black-Brown and Yellow-Brown-Red Skin Pigmentation Elements are Shared in Common: Using Art and Science for Multicultural Education

Authors: Mary Kay Bacallao

Abstract:

New research on the human genome has revealed secrets to the variation in skin pigmentation found in all human populations. Application of this research to multicultural education has a profound effect on students from all backgrounds. This paper identifies the four locations in the human genome that code for variation in skin pigmentation worldwide. The research makes this new knowledge accessible to students of all ages as they participate in an art project that brings these scientific multicultural concepts to life. Students participate in the application of breakthrough scientific principles through hands-on art activities where they simulate the work of the DNA coding to create their own skin tone using the colors expressed to varying degrees in every people group. As students create their own artwork handprint from the palette of colors, they realize that each color on the palette is essential to creating every tone of skin. This research project serves to bring people together and appreciate the variety and diversity in skin tones. As students explore the variations, they create pigmentation with the use of the eumelanins, which are the black-brown sources of pigmentation, and the pheomelanins, which are the yellow-reddish-brown sources of pigmentation. The research project dispels myths about skin tones that have divided people in the past. As a group project, this research leads to greater appreciation and understanding of the diverse family groups.

Keywords: diversity, multicultural, skin pigmentation, eumelanins, pheomelanins, handprint, artwork, science, genome, human

Procedia PDF Downloads 38

26786 Brachypodium: A Model Genus to Study Grass Genome Organisation at the Cytomolecular Level

Authors: R. Hasterok, A. Betekhtin, N. Borowska, A. Braszewska-Zalewska, E. Breda, K. Chwialkowska, R. Gorkiewicz, D. Idziak, J. Kwasniewska, M. Kwasniewski, D. Siwinska, A. Wiszynska, E. Wolny

Abstract:

In contrast to animals, the organisation of plant genomes at the cytomolecular level is still relatively poorly studied and understood. However, the Brachypodium genus in general and B. distachyon in particular represent exceptionally good model systems for such study. This is due not only to their highly desirable ‘model’ biological features, such as a small nuclear genome, low chromosome number and complex phylogenetic relations, but also to the rapidly and continuously growing repertoire of experimental tools, such as large collections of accessions, WGS information, large insert (BAC) libraries of genomic DNA, etc. Advanced cytomolecular techniques, such as fluorescence in situ hybridisation (FISH) with ever more sophisticated probes, empowered by cutting-edge microscope and digital image acquisition and processing systems, offer unprecedented insight into chromatin organisation at various phases of the cell cycle. A good example is chromosome painting, which uses pools of chromosome-specific BAC clones and enables the tracking of individual chromosomes not only during cell division but also during interphase. This presentation outlines the present status of molecular cytogenetic analyses of plant genome structure, dynamics and evolution using B. distachyon and some of its relatives. The current projects focus on important scientific questions, such as: What mechanisms shape the karyotypes? Is the distribution of individual chromosomes within an interphase nucleus determined? Are there hot spots of structural rearrangement in Brachypodium chromosomes? Which epigenetic processes play a crucial role in B. distachyon embryo development and selective silencing of rRNA genes in Brachypodium allopolyploids? The authors acknowledge financial support from the Polish National Science Centre (grants no. 2012/04/A/NZ3/00572 and 2011/01/B/NZ3/00177).

Keywords: Brachypodium, B. distachyon, chromosome, FISH, molecular cytogenetics, nucleus, plant genome organisation

Procedia PDF Downloads 321

26785 Identification and Characterization of Antimicrobial Peptides Isolated from Entophytic Bacteria and Their Activity against Multidrug-Resistance Gram-Negative Bacteria in South Korea

Authors: Maryam Beiranvand

Abstract:

Multi-drug resistance in various microorganisms has increased globally in many healthcare facilities, and the reduced effectiveness of antimicrobial drug therapies has become a problem for infection control. Since 1980, no new type of antimicrobial drug has been identified, even though combinations of antibiotic drugs have been discovered almost every decade. Between 1981 and 2006, over 70% of novel pharmaceuticals and chemical agents came from natural sources. Microorganisms have yielded almost 22,000 natural compounds. The identification of antimicrobial components from endophytic bacteria could help overcome the threat posed by multi-drug resistant strains. The project aims to analyze and identify antimicrobial peptides isolated from endophytic bacteria and their activity against multidrug-resistant Gram-negative bacteria in South Korea. Endophytic Paenibacillus polymyxa 4G3, isolated from the plant Gynura procumbens, exhibited considerable antimicrobial activity against methicillin-resistant Staphylococcus aureus and Escherichia coli. Rapid Annotations using Subsystems Technology (RAST) showed that the total size of the draft genome was 5,739,603 bp, containing 5178 genes with 45.8% G+C content. Genome annotation using antiSMASH version 6.0.0 predicted the most common types of non-ribosomal peptide synthetase (NRPS) and polyketide synthase (PKS). In this study, diethyl aminoethyl cellulose (DEAEC) resin was used as the first step in purifying unknown peptides, and then the target protein was identified using hydrophilic and hydrophobic solutions, optimal pH, and step-by-step tests for antimicrobial activity. The crude extract was subjected to C18 chromatography and eluted with 0%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, and 100% methanol, respectively. Only the fractions eluted with 20%-60% methanol demonstrated good antimicrobial activity against MDR E. coli.
The concentration of the active fraction was measured by the Bradford assay and by Protein A280 (Thermo Fisher Scientific), and purity was finally confirmed by SDS-PAGE on a 10% acrylamide resolving gel. Based on the combined results of the analysis and purification, our study showed that P. polymyxa 4G3 has a high potential for producing novel functions of polymyxin E and bacitracin against bacterial pathogens.

Keywords: endophytic bacteria, antimicrobial activity, antimicrobial peptide, whole genome sequencing analysis, multi-drug resistance gram negative bacteria

Procedia PDF Downloads 44

26784 Genome-Scale Analysis of Streptomyces Caatingaensis CMAA 1322 Metabolism, a New Abiotic Stress-Tolerant Actinomycete

Authors: Suikinai Nobre Santos, Ranko Gacesa, Paul F. Long, Itamar Soares de Melo

Abstract:

Extremophilic microorganisms are adapted to biotopes combining several stress factors (temperature, pressure, radiation, salinity and pH), which makes them a rich and valuable resource for the exploitation of novel biotechnological processes and unique models for investigating their biomolecules (1, 2). This encouraged us to investigate, by bioprospecting, the compounds synthesized by a novel thermotolerant actinomycete, designated Streptomyces caatingaensis CMAA 1322, isolated from a soil sample of the tropical dry forest (Caatinga) in the Brazilian semiarid region (3-17°S and 35-45°W). This set of contrasting physical and climatic factors provides unique conditions and a diversity of well-adapted species, making the region an interesting site for biotechnological purposes. Preliminary studies have shown great potential for the production of cytotoxic, pesticidal and antimicrobial molecules (3). Thus, to extend knowledge of the gene clusters responsible for the biosynthetic pathways of natural products in strain CMAA1322, whole-genome shotgun (WGS) DNA sequencing was performed using paired-end long sequencing with PacBio RS (Pacific Biosciences). Genomic DNA was extracted from a pure culture grown overnight on LB medium using the PureLink genomic DNA kit (Life Technologies). An approximately 3- to 20-kb-insert PacBio library was constructed and sequenced on 8 single-molecule real-time (SMRT) cells, yielding 116,269 reads (average length, 7,446 bp), which were assembled into 18 contigs, with 142.11x coverage and N50 value of 20.548 bp (BioProject number PRJNA288757). The assembled data were analyzed by Rapid Annotations using Subsystems Technology (RAST) (4); the genome size was found to be 7,055,077 bp, comprising 6167 open reading frames (ORFs) and 413 subsystems. The G+C content was estimated to be 72 mol%.
The closest-neighbors tool, available in RAST through functional comparison of the genome, revealed that strain CMAA1322 is most closely related to Streptomyces hygroscopicus ATCC 53653 (similarity score value, 537), S. violaceusniger Tu 4113 (score value, 483), S. avermitilis MA-4680 (score value, 475), and S. albus J1074 (score value, 447). The Streptomyces sp. CMAA1322 genome contains 98 tRNA genes and 135 gene copies related to stress response, mainly osmotic stress (14), heat shock (16) and oxidative stress (49). Functional annotation by antiSMASH version 3.0 (5) identified 41 clusters for secondary metabolites (including two clusters for lanthipeptides, ten clusters for nonribosomal peptide synthetases [NRPS], three clusters for siderophores, fourteen for polyketide synthetase [PKS], six clusters encoding a terpene, two clusters encoding a bacteriocin, and one cluster encoding a phenazine). Our work provides a comparative analysis of the genome and the extract produced (data not published) by strain CMAA1322, revealing the potential of microorganisms from extreme environments such as the Caatinga to produce a wide range of biotechnologically relevant compounds.
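Assembly statistics such as the N50 and G+C content quoted in this abstract follow from simple definitions. The sketch below shows the standard N50 definition (the length of the contig at which the cumulative sum of lengths, sorted from longest to shortest, first reaches half of the assembly size) and a plain G+C fraction. The function names and the toy inputs are illustrative and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    raise ValueError("no contigs supplied")

def gc_fraction(sequence):
    """Fraction of G and C bases among the A/C/G/T bases of a sequence."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return gc / acgt if acgt else 0.0

# Toy assembly: total length 32, so N50 is reached once 16 bases are covered.
toy_n50 = n50([2, 2, 2, 3, 3, 4, 8, 8])
toy_gc = gc_fraction("GGCCAT")
```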

Keywords: caatinga, streptomyces, environmental stresses, biosynthetic pathways

Procedia PDF Downloads 212

26783 Difference in Virulence Factor Genes Between Transient and Persistent Streptococcus Uberis Intramammary Infection in Dairy Cattle

Authors: Anyaphat Srithanasuwan, Noppason Pangprasit, Montira Intanon, Phongsakorn Chuammitri, Witaya Suriyasathaporn, Ynte H. Schukken

Abstract:

Streptococcus uberis is one of the most common mastitis-causing pathogens, with a wide range of intramammary infection (IMI) durations and pathogenicity. This study aimed to compare shared or unique virulence factor gene clusters distinguishing persistent and transient strains of S. uberis. A total of 139 S. uberis strains were isolated from three small-holder dairy herds with a high prevalence of S. uberis mastitis. The duration of IMI was used to categorize bacteria into two groups: transient and persistent strains with an IMI duration of less than 1 month and longer than 2 months, respectively. Six representative S. uberis strains, three from each group (transient and persistent), were selected for analysis. All transient strains exhibited multi-locus sequence types (MLST), indicating a highly diverse population of transient S. uberis. In contrast, MLST of persistent strains was available in an online database (pubMLST). Identification of virulence genes was performed using whole-genome sequencing (WGS) data. Differences in genome size and number of virulence genes were found. For example, the BCA gene or alpha-c protein and the gene associated with capsule formation (hasAB), found in persistent strains, are important for attachment and invasion, and for evasion of antimicrobial mechanisms and persistent survival, respectively. These findings suggest a genetic-level difference between the two strain types. Consequently, a comprehensive study of 139 S. uberis isolates will be conducted to perform an in-depth genetic assessment through WGS analysis on an Illumina platform.

Keywords: Streptococcus uberis, mastitis, whole genome sequence, intramammary infection, persistent S. uberis, transient S. uberis

Procedia PDF Downloads 18

26782 Genome-Wide Identification and Characterization of MLO Family Genes in Pumpkin (Cucurbita maxima Duch.)

Authors: Khin Thanda Win, Chunying Zhang, Sanghyeob Lee

Abstract:

Mildew resistance locus o (Mlo), a plant-specific gene family encoding proteins with seven transmembrane (TM) domains, plays an important role in plant resistance to powdery mildew (PM). PM caused by Podosphaera xanthii is a widespread plant disease and probably represents the major fungal threat for many cucurbits. The recent Cucurbita maxima genome sequence data provide an opportunity to identify and characterize the MLO gene family in this species. A total of twenty genes (designated CmaMLO1 through CmaMLO20) were identified by using an in silico cloning method with the MLO gene sequences of Cucumis sativus, Cucumis melo, Citrullus lanatus and Cucurbita pepo as probes. These CmaMLOs were evenly distributed on 15 of the 20 C. maxima chromosomes without any obvious clustering. Multiple sequence alignment showed that the common structural features of the MLO gene family, such as TM domains, a calmodulin-binding domain and 30 amino acid residues important for MLO function, were well conserved. Phylogenetic analysis of the CmaMLO genes and those of other plant species reveals seven different clades (I through VII), and only clade IV is specific to monocots (rice, barley, and wheat). Phylogenetic and structural analyses provided preliminary evidence that five genes belonging to clade V could be susceptibility genes that may play an important role in PM resistance. To our knowledge, this study is the first comprehensive report on MLO genes in C. maxima. These findings will facilitate the functional analysis of the MLOs related to PM susceptibility and are valuable resources for the development of disease resistance in pumpkin.

Keywords: Mildew resistance locus o (Mlo), powdery mildew, phylogenetic relationship, susceptibility genes

Procedia PDF Downloads 154

26781 Performance of High Density Genotyping in Sahiwal Cattle Breed

Authors: Hamid Mustafa, Huson J. Heather, Kim Eiusoo, Adeela Ajmal, Tad S. Sonstegard

Abstract:

The objective of this study was to evaluate the informativeness of Bovine high-density (HD) SNP genotyping in the Sahiwal cattle population. This is the first attempt to assess the Bovine HD SNP genotyping array in any Pakistani indigenous cattle population. To evaluate these SNPs on a genome-wide scale, we considered 777,962 SNPs spanning the autosomes and the X chromosome in the Sahiwal cattle population. Fifteen (15) unrelated gDNA samples were genotyped with the Bovine HD Infinium array. Approximately 500,939 SNPs were found to be polymorphic (MAF > 0.05) in the Sahiwal cattle population. The results of this study indicate the potential application of Bovine high-density SNP genotyping in Pakistani indigenous cattle populations. The information generated from this array can be applied in genetic prediction, characterization and genome-wide association studies of the Pakistani Sahiwal cattle population.
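The polymorphism filter described above (MAF > 0.05) can be sketched as follows. The genotype coding (0, 1 or 2 copies of the alternate allele, `None` for a missing call), the function names and the toy SNP identifiers are assumptions made for illustration; the study's actual software is not stated in the abstract.

```python
def minor_allele_frequency(genotypes):
    """MAF for one SNP from genotypes coded as alternate-allele counts.

    Missing calls (None) are skipped. Returns None if no call is available.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    alt_freq = sum(called) / (2 * len(called))  # two alleles per animal
    return min(alt_freq, 1 - alt_freq)

def polymorphic_snps(genotype_table, maf_threshold=0.05):
    """Return the IDs of SNPs whose MAF exceeds the threshold."""
    keep = []
    for snp_id, genotypes in genotype_table.items():
        maf = minor_allele_frequency(genotypes)
        if maf is not None and maf > maf_threshold:
            keep.append(snp_id)
    return keep

# Hypothetical genotypes for three SNPs in five animals.
table = {
    "snp1": [0, 0, 1, 2, None],
    "snp2": [0, 0, 0, 0, 0],
    "snp3": [2, 2, 2, 1, 2],
}
kept = polymorphic_snps(table)
```

With 15 animals there are 30 alleles per SNP, so a single copy of the minor allele gives a MAF of 1/30 (about 0.033) and fails the 0.05 threshold, while two copies (about 0.067) pass.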

Keywords: Sahiwal cattle, polymorphic SNPs, genotyping, Pakistan

Procedia PDF Downloads 396

26780 TARF: Web Toolkit for Annotating RNA-Related Genomic Features

Authors: Jialin Ma, Jia Meng

Abstract:

Genomic features, expressed as genome-based coordinates, are commonly used to represent biological features such as genes, RNA transcripts and transcription factor binding sites. For the analysis of RNA-related genomic features, such as RNA modification sites, a common task is to correlate these features with transcript components (5'UTR, CDS, 3'UTR) to explore their distribution characteristics in terms of transcriptomic coordinates, e.g., to examine whether a specific type of biological feature is enriched near transcription start sites. Existing approaches for performing these tasks involve the manipulation of a gene database, conversion from genome-based coordinates to transcript-based coordinates, and visualization methods that are capable of showing RNA transcript components and the distribution of the features. These steps are complicated and time consuming, especially for researchers who are not familiar with the relevant tools. To overcome this obstacle, we developed a dedicated web app, TARF, which stands for web Toolkit for Annotating RNA-related genomic Features. The TARF web tool is intended to provide a web-based way to easily annotate and visualize RNA-related genomic features. Once a user has uploaded the features in BED format and specified a built-in transcript database or uploaded a customized gene database in GTF format, the tool can fulfill its three main functions. First, it adds annotation on gene and RNA transcript components. For every feature provided by the user, the overlap with RNA transcript components is identified, and the information is combined in one table which is available for copy and download. Summary statistics about ambiguous assignments are also computed. Second, the tool provides a convenient visualization method of the features on the single gene/transcript level.
For the selected gene, the tool shows the features with the gene model in a genome-based view, and also maps the features to transcript-based coordinates and shows the distribution against one single spliced RNA transcript. Third, a global transcriptomic view of the genomic features is generated utilizing the Guitar R/Bioconductor package. The distribution of features on RNA transcripts is normalized with respect to RNA transcript landmarks, and the enrichment of the features on different RNA transcript components is demonstrated. We tested the newly developed TARF toolkit with 3 different types of genomic features related to chromatin H3K4me3, RNA N6-methyladenosine (m6A) and RNA 5-methylcytosine (m5C), which are obtained from ChIP-Seq, MeRIP-Seq and RNA BS-Seq data, respectively. TARF successfully revealed their respective distribution characteristics, i.e., H3K4me3, m6A and m5C are enriched near transcription start sites, stop codons and 5'UTRs, respectively. Overall, TARF is a useful web toolkit for annotation and visualization of RNA-related genomic features, and should help simplify the analysis of various RNA-related genomic features, especially those related to RNA modifications.
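The genome-to-transcript coordinate conversion that the abstract describes as a key step can be sketched as below. This is not TARF's implementation: it is a minimal illustration that assumes BED-style 0-based half-open exon intervals, and the function names are hypothetical.

```python
def genome_to_transcript(pos, exons, strand="+"):
    """Map a genomic position to a 0-based position on the spliced transcript.

    exons: list of (start, end) intervals, 0-based half-open (BED style).
    Returns None when the position falls in an intron or outside the gene.
    """
    # Walk exons in transcription order: ascending on +, descending on -.
    ordered = sorted(exons, reverse=(strand == "-"))
    offset = 0  # transcript bases contributed by exons already passed
    for start, end in ordered:
        if start <= pos < end:
            if strand == "+":
                return offset + (pos - start)
            return offset + (end - 1 - pos)
        offset += end - start
    return None

def transcript_component(tpos, cds_start, cds_end):
    """Classify a transcript position given CDS bounds in transcript coordinates."""
    if tpos < cds_start:
        return "5'UTR"
    if tpos < cds_end:
        return "CDS"
    return "3'UTR"

# Toy two-exon transcript with a 100-base intron between the exons.
exons = [(100, 200), (300, 400)]
t = genome_to_transcript(350, exons, "+")
label = transcript_component(t, cds_start=50, cds_end=180)
```

Positions returned this way can then be rescaled relative to landmarks such as the start and stop codons to compare features across transcripts of different lengths.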

Keywords: RNA-related genomic features, annotation, visualization, web server

Procedia PDF Downloads 180

26779 Prediction of Solanum Lycopersicum Genome Encoded microRNAs Targeting Tomato Spotted Wilt Virus

Authors: Muhammad Shahzad Iqbal, Zobia Sarwar, Salah-ud-Din

Abstract:

Tomato spotted wilt virus (TSWV) belongs to the genus Tospovirus (family Bunyaviridae). It is one of the most devastating pathogens of tomato (Solanum lycopersicum) and heavily damages the crop yield each year around the globe. In this study, we retrieved 329 mature miRNA sequences from two microRNA databases (miRBase and miRSoldb) and checked for putative target sites in the downloaded genome sequence of TSWV. A consensus of three miRNA target prediction tools (RNA22, miRanda and psRNATarget) was used to screen out false-positive microRNA target sites in the TSWV genome. These tools predict target sites from minimum free energy (mfe), site complementarity, minimum folding energy and other microRNA-mRNA binding factors. The R language was used to plot the predicted target-site data. All the genes having possible target sites for different miRNAs were screened by building a consensus table. Of the 329 mature miRNAs evaluated by the three algorithms, only eight met all the criteria/threshold specifications. MC-Fold and MC-Sym were used to predict three-dimensional structures of the miRNAs, which were further analyzed in UCSF Chimera to visualize the structural and conformational changes before and after microRNA-mRNA interactions. The results of the current study show that the predicted eight miRNAs could further be evaluated by in vitro experiments to develop TSWV-resistant transgenic tomato plants in the future.
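The consensus step, keeping only predictions supported by all three tools, amounts to a set intersection. The sketch below assumes each tool's output has already been parsed into (miRNA, target gene) pairs; the identifiers are placeholders, not results from the study, and the function name is hypothetical.

```python
def consensus_targets(predictions):
    """Return the (miRNA, target) pairs predicted by every tool.

    predictions: dict mapping a tool name to an iterable of
    (miRNA_id, target_gene) pairs.
    """
    tool_sets = [set(pairs) for pairs in predictions.values()]
    if not tool_sets:
        return set()
    return set.intersection(*tool_sets)

# Placeholder predictions; real inputs would come from parsed tool output.
predictions = {
    "RNA22": {("miR-a", "gene1"), ("miR-b", "gene2"), ("miR-c", "gene1")},
    "miRanda": {("miR-a", "gene1"), ("miR-c", "gene1"), ("miR-d", "gene3")},
    "psRNATarget": {("miR-a", "gene1"), ("miR-b", "gene2"), ("miR-c", "gene1")},
}
shared = consensus_targets(predictions)
```

Requiring agreement between tools trades sensitivity for specificity: pairs found by only one or two tools are dropped even if they are genuine.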

Keywords: tomato spotted wilt virus (TSWV), Solanum lycopersicum, plant virus, miRNAs, microRNA target prediction, mRNA

Procedia PDF Downloads 118

26778 Isolation and Characterization of a Narrow-Host Range Aeromonas hydrophila Lytic Bacteriophage

Authors: Sumeet Rai, Anuj Tyagi, B. T. Naveen Kumar, Shubhkaramjeet Kaur, Niraj K. Singh

Abstract:

Since their discovery, indiscriminate use of antibiotics in human, veterinary and aquaculture systems has resulted in the global emergence and spread of multidrug-resistant bacterial pathogens. Thus, the need for alternative approaches to control bacterial infections has become of utmost importance. The high selectivity/specificity of bacteriophages (phages) permits the targeting of specific bacteria without affecting the desirable flora. In this study, a lytic phage (Ahp1) specific to Aeromonas hydrophila subsp. hydrophila was isolated from a finfish aquaculture pond. The host range of Ahp1 was tested against 10 isolates of A. hydrophila, 7 isolates of A. veronii, 25 Vibrio cholerae isolates, 4 V. parahaemolyticus isolates and one isolate each of V. harveyi and Salmonella enterica collected previously. Except for the host A. hydrophila subsp. hydrophila strain, no lytic activity against any other bacterium was detected. During the adsorption rate and one-step growth curve analysis, 69.7% of phage particles were adsorbed on the host cell, followed by the release of 93 ± 6 phage progeny per host cell after a latent period of ~30 min. Phage nucleic acid was extracted by column purification methods. After determining the nature of the phage nucleic acid as dsDNA, the phage genome was subjected to next-generation sequencing by generating paired-end (PE, 2 x 300 bp) reads on an Illumina MiSeq system. De novo assembly of sequencing reads generated a circular phage genome of 42,439 bp with a G+C content of 58.95%. During open reading frame (ORF) prediction and annotation, 22 ORFs (out of 49 total predicted ORFs) were functionally annotated and the rest encoded hypothetical proteins. Proteins involved in major functions such as phage structure formation and packaging, DNA replication and repair, DNA transcription and host cell lysis were encoded by the phage genome. The complete genome sequence of Ahp1 along with gene annotation was submitted to NCBI GenBank (accession number MF683623).
Stability of Ahp1 preparations at storage temperatures of 4 °C, 30 °C, and 40 °C was studied over a period of 9 months. At 40 °C storage, phage counts declined by 4 log units within one month, with a total loss of viability after 2 months. At 30 °C, the phage preparation was stable for < 5 months. On the other hand, phage counts decreased by only 2 log units over a period of 9 months during storage at 4 °C. As some phages have also been reported to be glycerol sensitive, the stability of Ahp1 preparations in glycerol stocks (0%, 15%, 30% and 45%) was also studied during storage at -80 °C over a period of 9 months. The phage counts decreased by only 2 log units during storage, and no significant difference in phage counts was observed at different concentrations of glycerol. The Ahp1 phage discovered in our study had a very narrow host range and may be useful for phage typing applications. Moreover, the endolysin and holin genes in the Ahp1 genome could be ideal candidates for recombinant cloning and expression of antimicrobial proteins.

Keywords: Aeromonas hydrophila, endolysin, phage, narrow host range

Procedia PDF Downloads 142

26777 Computational Pipeline for Lynch Syndrome Detection: Integrating Alignment, Variant Calling, and Annotations

Authors: Rofida Gamal, Mostafa Mohammed, Mariam Adel, Marwa Gamal, Marwa kamal, Ayat Saber, Maha Mamdouh, Amira Emad, Mai Ramadan

Abstract:

Lynch Syndrome is an inherited genetic condition associated with an increased risk of colorectal and other cancers. Detecting Lynch Syndrome in individuals is crucial for early intervention and preventive measures. This study proposes a computational pipeline for Lynch Syndrome detection by integrating alignment, variant calling, and annotation. The pipeline leverages popular tools such as FastQC, Trimmomatic, BWA, bcftools, and ANNOVAR to process the input FASTQ file, perform quality trimming, align reads to the reference genome, call variants, and annotate them. It is believed that the computational pipeline was applied to a dataset of Lynch Syndrome cases, and its performance was evaluated. It is believed that the quality check step ensured the integrity of the sequencing data, while the trimming process is thought to have removed low-quality bases and adaptors. In the alignment step, it is believed that the reads were accurately mapped to the reference genome, and the subsequent variant calling step is believed to have identified potential genetic variants. The annotation step is believed to have provided functional insights into the detected variants, including their effects on known Lynch Syndrome-associated genes. The results obtained from the pipeline revealed Lynch Syndrome-related positions in the genome, providing valuable information for further investigation and clinical decision-making. The pipeline's effectiveness was demonstrated through its ability to streamline the analysis workflow and identify potential genetic markers associated with Lynch Syndrome. It is believed that the computational pipeline presents a comprehensive and efficient approach to Lynch Syndrome detection, contributing to early diagnosis and intervention. The modularity and flexibility of the pipeline are believed to enable customization and adaptation to various datasets and research settings. 
Further optimization and validation are believed to be necessary to enhance performance and applicability across diverse populations.
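The order of steps named in this abstract (FastQC, Trimmomatic, BWA, bcftools, ANNOVAR) can be sketched as a dry-run command builder. This is an illustrative layout rather than the authors' pipeline: the single-end mode, the trimming settings, the `refGene` annotation, the `hg19` build, the use of samtools for sorting and indexing, and the availability of a `trimmomatic` wrapper script on the PATH are all assumptions. The function only builds command strings and does not execute them.

```python
import shlex

def build_pipeline(fastq, reference, prefix, annovar_db="humandb/", build="hg19"):
    """Return shell command strings for a minimal single-sample workflow."""
    fq, ref, db = shlex.quote(fastq), shlex.quote(reference), shlex.quote(annovar_db)
    trimmed = shlex.quote(f"{prefix}.trimmed.fastq.gz")
    bam = shlex.quote(f"{prefix}.sorted.bam")
    vcf = shlex.quote(f"{prefix}.vcf")
    out = shlex.quote(prefix)
    return [
        # 1. Quality report on the raw reads.
        f"fastqc {fq}",
        # 2. Quality trimming (sliding window and minimum length are assumed values).
        f"trimmomatic SE -phred33 {fq} {trimmed} SLIDINGWINDOW:4:20 MINLEN:36",
        # 3. Index the reference, align, then sort and index the alignments.
        f"bwa index {ref}",
        f"bwa mem {ref} {trimmed} | samtools sort -o {bam} -",
        f"samtools index {bam}",
        # 4. Call variants with the multiallelic caller, variants only.
        f"bcftools mpileup -f {ref} {bam} | bcftools call -mv -Ov -o {vcf}",
        # 5. Gene-based annotation of the VCF with ANNOVAR.
        f"table_annovar.pl {vcf} {db} -buildver {build} -out {out} "
        f"-protocol refGene -operation g -vcfinput",
    ]

# Hypothetical file names; print the commands for inspection before running them.
commands = build_pipeline("sample.fastq.gz", "ref.fa", "sample")
```

Lynch Syndrome screening would then filter the annotated variants for the mismatch repair genes of interest, a step the abstract does not detail and that is left out here.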

Keywords: Lynch Syndrome, computational pipeline, alignment, variant calling, annotation, genetic markers

Procedia PDF Downloads 36

26776 Association of Nuclear – Mitochondrial Epistasis with BMI in Type 1 Diabetes Mellitus Patients

Authors: Agnieszka H. Ludwig-Slomczynska, Michal T. Seweryn, Przemyslaw Kapusta, Ewelina Pitera, Katarzyna Cyganek, Urszula Mantaj, Lucja Dobrucka, Ewa Wender-Ozegowska, Maciej T. Malecki, Pawel Wolkow

Abstract:

Obesity results from an imbalance between energy intake and its expenditure. Genome-Wide Association Study (GWAS) analyses have led to discovery of only about 100 variants influencing body mass index (BMI), which explain only a small portion of genetic variability. Analysis of gene epistasis gives a chance to discover another part. Since it was shown that interaction and communication between nuclear and mitochondrial genome are indispensable for normal cell function, we have looked for epistatic interactions between the two genomes to find their correlation with BMI. Methods: The analysis was performed on 366 T1DM patients using Illumina Infinium OmniExpressExome-8 chip and followed by imputation on Michigan Imputation Server. Only genes which influence mitochondrial functioning (listed in Human MitoCarta 2.0) were included in the analysis – variants of nuclear origin (MAF > 5%) in 1140 genes and 42 mitochondrial variants (MAF > 1%). Gene expression analysis was performed on GTex data. Association analysis between genetic variants and BMI was performed with the use of Linear Mixed Models as implemented in the package 'GENESIS' in R. Analysis of association between mRNA expression and BMI was performed with the use of linear models and standard significance tests in R. Results: Among variants involved in epistasis between mitochondria and nucleus we have identified one in mitochondrial transcription factor, TFB2M (rs6701836). It interacted with mitochondrial variants localized to MT-RNR1 (p=0.0004, MAF=15%), MT-ND2 (p=0.07, MAF=5%) and MT-ND4 (p=0.01, MAF=1.1%). Analysis of the interaction between nuclear variant rs6701836 (nuc) and rs3021088 localized to MT-ND2 mitochondrial gene (mito) has shown that the combination of the two led to BMI decrease (p=0.024). Each of the variants on its own does not correlate with higher BMI [p(nuc)=0.856, p(mito)=0.116)]. Although rs6701836 is intronic, it influences gene expression in the thyroid (p=0.000037). 
rs3021088 is a missense variant that leads to an alanine-to-threonine substitution in MT-ND2, a gene encoding a subunit of complex I of the electron transport chain. The analysis of the influence of genetic variants on gene expression confirmed the trend described above: the interaction of the two genes leads to a decrease in BMI (p=0.0308). Each of the mRNAs on its own is associated with higher BMI (p(mito)=0.0244 and p(nuc)=0.0269). Conclusions: Our results show that nuclear-mitochondrial epistasis can influence BMI in T1DM patients. The correlation between transcription factor expression and mitochondrial genetic variants will be subject to further analysis.
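
To illustrate what a statistical interaction between a nuclear and a mitochondrial variant means, the following minimal sketch computes the interaction effect for two binary carrier indicators as a difference of differences of group means. This is not the authors' analysis, which used Linear Mixed Models in GENESIS; the function name, the carrier coding and all BMI values here are hypothetical and synthetic, chosen only to mimic the reported pattern of no single-variant effect and a joint decrease.

```python
from statistics import mean

def interaction_effect(records):
    """Estimate main and interaction effects for two binary variants.

    records: iterable of (nuc, mito, bmi), where nuc and mito are 0/1
    carrier indicators (hypothetical coding) and bmi is a float.
    For a saturated 2x2 design, these contrasts equal the coefficients
    of the linear model bmi ~ nuc + mito + nuc:mito.
    """
    cells = {(0, 0): [], (0, 1): [], (1, 0): [], (1, 1): []}
    for nuc, mito, bmi in records:
        cells[(nuc, mito)].append(bmi)
    m = {key: mean(values) for key, values in cells.items()}
    main_nuc = m[(1, 0)] - m[(0, 0)]    # nuclear variant alone
    main_mito = m[(0, 1)] - m[(0, 0)]   # mitochondrial variant alone
    interaction = m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + m[(0, 0)]
    return main_nuc, main_mito, interaction

# Synthetic BMI values (not from the study): no effect of either
# variant alone, lower BMI when both are present.
toy = [
    (0, 0, 25.0), (0, 0, 27.0),
    (1, 0, 26.0), (1, 0, 26.0),
    (0, 1, 27.0), (0, 1, 25.0),
    (1, 1, 23.0), (1, 1, 23.0),
]
print(interaction_effect(toy))  # (0.0, 0.0, -3.0)
```

A negative interaction term with main effects near zero corresponds to the situation described in the abstract, where only the combination of the two variants is associated with lower BMI. A real analysis would also need covariates, relatedness correction and a significance test.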

Keywords: body mass index, epistasis, mitochondria, type 1 diabetes

Procedia PDF Downloads 147

26775 Unraveling the Evolution of Mycoplasma Hominis Through Its Genome Sequence

Authors: Boutheina Ben Abdelmoumen Mardassi, Salim Chibani, Safa Boujemaa, Amaury Vaysse, Julien Guglielmini, Elhem Yacoub

Abstract:

Background and aim: Mycoplasma hominis (MH) is a pathogenic bacterium belonging to the class Mollicutes. It causes a wide range of gynecological infections and infertility among adults. Recently, we explored for the first time the phylodistribution of Tunisian M. hominis clinical strains using an expanded MLST scheme. We demonstrated that they separate into two pure lineages, each corresponding to a specific pathotype: genital infections and infertility. The aim of this project is to gain further insight into the evolutionary dynamics and the specific genetic factors that distinguish MH pathotypes. Methods: Whole-genome sequencing of Mycoplasma hominis clinical strains was performed using Illumina MiSeq. De novo assembly was performed using a publicly available in-house pipeline. We used Prokka to annotate the genomes, Panaroo to generate the gene presence matrix and JolyTree to establish the phylogenetic tree. We used treeWAS to identify genetic loci associated with the pathotype of interest from the presence matrix and phylogenetic tree. Results: Our results revealed a clear categorization of the 62 MH clinical strains into two distinct genetic lineages, each corresponding to a specific pathotype: gynecological infections and infertility. Genome annotation showed that GC content ranges between 26 and 27%, which is a known characteristic of Mycoplasma genomes. Housekeeping genes belonging to the core genome are highly conserved among our strains. treeWAS identified 4 virulence genes associated with the gynecological infection pathotype, encoding asparagine--tRNA ligase, restriction endonuclease subunit S, Eco47II restriction endonuclease, and the transcription regulator XRE (involved in tolerance to oxidative stress).
Five genes were statistically associated with infertility: two lipoproteins, one hypothetical protein, a glycosyl transferase involved in capsule synthesis, and a pyruvate kinase involved in biofilm formation. All strains harbored an efflux pump belonging to the family of multidrug resistance ABC transporters, which confers resistance to a wide range of antibiotics. In addition, many adhesion factors and lipoproteins (p120, p120', p60, p80, Vaa) were checked and confirmed in our strains, with a conserved domain covering roughly 96 to 99% and a hypervariable domain representing 1 to 4% of the reference sequence extracted from GenBank. Conclusion: In summary, this study led to the identification of specific genetic loci associated with distinct pathotypes in M. hominis.
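
The GC content reported above is the fraction of G and C bases among all unambiguous bases of a genome. A minimal sketch of that calculation follows; the function name and the toy sequences are illustrative assumptions, not part of the authors' pipeline, and ambiguous bases such as N are simply ignored.

```python
def gc_content(sequence):
    """Return the fraction of G and C among the A, C, G, T bases.

    Ambiguous symbols (for example N) are ignored. The input may be
    upper or lower case.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return gc / (gc + at)

# Toy sequence (not from any M. hominis genome): 2 of 10 bases are G or C.
print(f"{gc_content('AATTAATTGC'):.1%}")
```

For an assembled genome, the same function would be applied to the concatenated contigs; a value of about 0.26 to 0.27 would match the range stated in the abstract.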

Keywords: Mycoplasma hominis, infertility, gynecological infections, virulence genes, antibiotic resistance

Procedia PDF Downloads 50

26774 Integrative Omics-Portrayal Disentangles Molecular Heterogeneity and Progression Mechanisms of Cancer

Authors: Binder Hans

Abstract:

Cancer is no longer seen as solely a genetic disease where genetic defects such as mutations and copy number variations affect gene regulation and eventually lead to aberrant cell functioning which can be monitored by transcriptome analysis. It has become obvious that epigenetic alterations represent a further important layer of (de-)regulation of gene activity. For example, aberrant DNA methylation is a hallmark of many cancer types, and methylation patterns have been used successfully to subtype cancer heterogeneity. Hence, unraveling the interplay between different omics levels such as genome, transcriptome and epigenome is indispensable for a mechanistic understanding of the molecular deregulation causing complex diseases such as cancer. This objective requires powerful downstream integrative bioinformatics methods as an essential prerequisite to discover the whole-genome mutational, transcriptome and epigenome landscapes of cancer specimens and to discover cancer genesis, progression and heterogeneity. Basic challenges and tasks arise 'beyond sequencing' because of the large size of the data, their complexity, the need to search for hidden structures in the data, for knowledge mining to discover biological function, and for systems biology conceptual models to deduce developmental interrelations between different cancer states. These tasks are tightly related to cancer biology as an (epi-)genetic disease giving rise to aberrant genomic regulation under micro-environmental control and clonal evolution, which leads to heterogeneous cellular states. Machine learning algorithms such as self-organizing maps (SOM) represent one interesting option to tackle these bioinformatics tasks. The SOM method enables recognizing complex patterns in large-scale data generated by high-throughput omics technologies. It portrays molecular phenotypes by generating individualized, easy-to-interpret images of the data landscape in combination with comprehensive analysis options.
Our image-based, reductionist machine learning methods offer one interesting perspective on how to deal with massive data in the study of complex diseases such as gliomas, melanomas and colon cancer at the molecular level. As an important new challenge, we address the combined portrayal of different omics data such as genome-wide genomic, transcriptomic and methylomic data. The integrative-omics portrayal approach is based on the joint training of the data, and it provides separate personalized data portraits for each patient and data type, which can be analyzed by visual inspection as one option. The new method enables an integrative genome-wide view of the omics data types and the underlying regulatory modes. It is applied to high- and low-grade gliomas and to melanomas, where it disentangles transversal and longitudinal molecular heterogeneity in terms of distinct molecular subtypes and progression paths with prognostic impact.
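
The core idea of a self-organizing map can be sketched in a few lines: each input is assigned to its best-matching node, and that node and its neighbors on the map are pulled toward the input. The example below is a minimal one-dimensional SOM in plain Python and is not the author's portrayal software. The function names, the linear initialization, the decay schedules and the six toy profiles are all assumptions made for illustration.

```python
import math
import random

def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)),
    )

def train_som(data, n_nodes=4, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a 1-D SOM (a chain of n_nodes nodes) on a list of vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    lo = [min(x[d] for x in data) for d in range(dim)]
    hi = [max(x[d] for x in data) for d in range(dim)]
    # Linear initialization: spread nodes evenly across the data range.
    weights = [
        [lo[d] + (hi[d] - lo[d]) * i / (n_nodes - 1) for d in range(dim)]
        for i in range(n_nodes)
    ]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                    # decaying learning rate
        sigma = max(sigma0 * (1 - frac), 0.3)    # shrinking neighborhood
        order = list(data)
        rng.shuffle(order)
        for x in order:
            bmu = best_matching_unit(weights, x)
            for i, w in enumerate(weights):
                # Gaussian neighborhood on the map, centered on the BMU.
                h = math.exp(-((i - bmu) ** 2) / (2 * sigma ** 2))
                for d in range(dim):
                    w[d] += lr * h * (x[d] - w[d])
    return weights

# Six toy two-feature profiles forming two well-separated groups.
profiles = [
    (0.0, 0.2), (0.3, 0.0), (0.1, 0.1),
    (10.0, 9.8), (9.7, 10.0), (9.9, 9.9),
]
som = train_som(profiles)
print([best_matching_unit(som, p) for p in profiles])
```

After training, profiles from the two groups map to different nodes of the chain. Portrayal methods of the kind described above typically use a two-dimensional grid with many more nodes and color each node by its weight, which yields one image per sample.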

Keywords: integrative bioinformatics, machine learning, molecular mechanisms of cancer, gliomas and melanomas

Procedia PDF Downloads 117