Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 370

Search results for: genome

370 Genome Analyses of Pseudomonas Fluorescens b29b from Coastal Kerala

Abstract:

Pseudomonas fluorescens B29B, which has asparaginase enzymatic activity, was isolated from the surface coastal seawater of Trivandrum, India. We report the complete Pseudomonas fluorescens B29B genome sequenced, identified, and annotated from a marine source. We find the genome at most minuscule a 7,331,508 bp single circular chromosome with a GC content of 62.19% and 6883 protein-coding genes. Three hundred forty subsystems were identified, including two predicted asparaginases from the genome analysis of P. fluorescens B29B for further investigation. This genome data will help further industrial biotechnology applications of proteins in general and asparaginase as a target.

Keywords: pseudomonas, marine, asparaginases, Kerala, whole-genome

Procedia PDF Downloads 213

369 Computing the Similarity and the Diversity in the Species Based on Cronobacter Genome

Authors: E. Al Daoud

Abstract:

The purpose of computing the similarity and the diversity in the species is to trace the process of evolution and to find the relationship between the species and discover the unique, the special, the common and the universal proteins. The proteins of the whole genome of 40 species are compared with the cronobacter genome which is used as reference genome. More than 3 billion pairwise alignments are performed using blastp. Several findings are introduced in this study, for example, we found 172 proteins in cronobacter genome which have insignificant hits in other species, 116 significant proteins in the all tested species with very high score value and 129 common proteins in the plants but have insignificant hits in mammals, birds, fishes, and insects.

Keywords: genome, species, blastp, conserved genes, Cronobacter

Procedia PDF Downloads 495

368 An Improved Ant Colony Algorithm for Genome Rearrangements

Authors: Essam Al Daoud

Abstract:

Genome rearrangement is an important area in computational biology and bioinformatics. The basic problem in genome rearrangements is to compute the edit distance, i.e., the minimum number of operations needed to transform one genome into another. Unfortunately, unsigned genome rearrangement problem is NP-hard. In this study an improved ant colony optimization algorithm to approximate the edit distance is proposed. The main idea is to convert the unsigned permutation to signed permutation and evaluate the ants by using Kaplan algorithm. Two new operations are added to the standard ant colony algorithm: Replacing the worst ants by re-sampling the ants from a new probability distribution and applying the crossover operations on the best ants. The proposed algorithm is tested and compared with the improved breakpoint reversal sort algorithm by using three datasets. The results indicate that the proposed algorithm achieves better accuracy ratio than the previous methods.

Keywords: ant colony algorithm, edit distance, genome breakpoint, genome rearrangement, reversal sort

Procedia PDF Downloads 343

367 The Role and Importance of Genome Sequencing in Prediction of Cancer Risk

Authors: M. Sadeghi, H. Pezeshk, R. Tusserkani, A. Sharifi Zarchi, A. Malekpour, M. Foroughmand, S. Goliaei, M. Totonchi, N. Ansari–Pour

Abstract:

The role and relative importance of intrinsic and extrinsic factors in the development of complex diseases such as cancer still remains a controversial issue. Determining the amount of variation explained by these factors needs experimental data and statistical models. These models are nevertheless based on the occurrence and accumulation of random mutational events during stem cell division, thus rendering cancer development a stochastic outcome. We demonstrate that not only individual genome sequencing is uninformative in determining cancer risk, but also assigning a unique genome sequence to any given individual (healthy or affected) is not meaningful. Current whole-genome sequencing approaches are therefore unlikely to realize the promise of personalized medicine. In conclusion, since genome sequence differs from cell to cell and changes over time, it seems that determining the risk factor of complex diseases based on genome sequence is somewhat unrealistic, and therefore, the resulting data are likely to be inherently uninformative.

Keywords: cancer risk, extrinsic factors, genome sequencing, intrinsic factors

Procedia PDF Downloads 268

366 Genome Editing in Sorghum: Advancements and Future Possibilities: A Review

Authors: Micheale Yifter Weldemichael, Hailay Mehari Gebremedhn, Teklehaimanot Hailesslasie

Abstract:

The advancement of target-specific genome editing tools, including clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein9 (Cas9), mega-nucleases, base editing (BE), prime editing (PE), transcription activator-like endonucleases (TALENs), and zinc-finger nucleases (ZFNs), have paved the way for a modern era of gene editing. CRISPR/Cas9, as a versatile, simple, cost-effective and robust system for genome editing, has dominated the genome manipulation field over the last few years. The application of CRISPR/Cas9 in sorghum improvement is particularly vital in the context of ecological, environmental and agricultural challenges, as well as global climate change. In this context, gene editing using CRISPR/Cas9 can improve nutritional value, yield, resistance to pests and disease and tolerance to different abiotic stress. Moreover, CRISPR/Cas9 can potentially perform complex editing to reshape already available elite varieties and new genetic variations. However, existing research is targeted at improving even further the effectiveness of the CRISPR/Cas9 genome editing techniques to fruitfully edit endogenous sorghum genes. These findings suggest that genome editing is a feasible and successful venture in sorghum. Newer improvements and developments of CRISPR/Cas9 techniques have further qualified researchers to modify extra genes in sorghum with improved efficiency. The fruitful application and development of CRISPR techniques for genome editing in sorghum will not only help in gene discovery, creating new, improved traits in sorghum regulating gene expression sorghum functional genomics, but also in making site-specific integration events.

Keywords: CRISPR/Cas9, genome editing, quality, sorghum, stress, yield

Procedia PDF Downloads 57

365 Genomic Adaptation to Local Climate Conditions in Native Cattle Using Whole Genome Sequencing Data

Authors: Rugang Tian

Abstract:

In this study, we generated whole-genome sequence (WGS) data from110 native cattle. Together with whole-genome sequences from world-wide cattle populations, we estimated the genetic diversity and population genetic structure of different cattle populations. Our findings revealed clustering of cattle groups in line with their geographic locations. We identified noticeable genetic diversity between indigenous cattle breeds and commercial populations. Among all studied cattle groups, lower genetic diversity measures were found in commercial populations, however, high genetic diversity were detected in some local cattle, particularly in Rashoki and Mongolian breeds. Our search for potential genomic regions under selection in native cattle revealed several candidate genes related with immune response and cold shock protein on multiple chromosomes such as TRPM8, NMUR1, PRKAA2, SMTNL2 and OXR1 that are involved in energy metabolism and metabolic homeostasis.

Keywords: cattle, whole-genome, population structure, adaptation

Procedia PDF Downloads 72

364 Genome Sequencing, Assembly and Annotation of Gelidium Pristoides from Kenton-on-Sea, South Africa

Authors: Sandisiwe Mangali, Graeme Bradley

Abstract:

Genome is complete set of the organism's hereditary information encoded as either deoxyribonucleic acid or ribonucleic acid in most viruses. The three different types of genomes are nuclear, mitochondrial and the plastid genome and their sequences which are uncovered by genome sequencing are known as an archive for all genetic information and enable researchers to understand the composition of a genome, regulation of gene expression and also provide information on how the whole genome works. These sequences enable researchers to explore the population structure, genetic variations, and recent demographic events in threatened species. Particularly, genome sequencing refers to a process of figuring out the exact arrangement of the basic nucleotide bases of a genome and the process through which all the afore-mentioned genomes are sequenced is referred to as whole or complete genome sequencing. Gelidium pristoides is South African endemic Rhodophyta species which has been harvested in the Eastern Cape since the 1950s for its high economic value which is one motivation for its sequencing. Its endemism further motivates its sequencing for conservation biology as endemic species are more vulnerable to anthropogenic activities endangering a species. As sequencing, mapping and annotating the Gelidium pristoides genome is the aim of this study. To accomplish this aim, the genomic DNA was extracted and quantified using the Nucleospin Plank Kit, Qubit 2.0 and Nanodrop. Thereafter, the Ion Plus Fragment Library was used for preparation of a 600bp library which was then sequenced through the Ion S5 sequencing platform for two runs. The produced reads were then quality-controlled and assembled through the SPAdes assembler with default parameters and the genome assembly was quality assessed through the QUAST software. From this assembly, the plastid and the mitochondrial genomes were then sampled out using Gelidiales organellar genomes as search queries and ordered according to them using the Geneious software. The Qubit and the Nanodrop instruments revealed an A260/A280 and A230/A260 values of 1.81 and 1.52 respectively. A total of 30792074 reads were obtained and produced a total of 94140 contigs with resulted into a sequence length of 217.06 Mbp with N50 value of 3072 bp and GC content of 41.72%. A total length of 179281bp and 25734 bp was obtained for plastid and mitochondrial respectively. Genomic data allows a clear understanding of the genomic constituent of an organism and is valuable as foundation information for studies of individual genes and resolving the evolutionary relationships between organisms including Rhodophytes and other seaweeds.

Keywords: Gelidium pristoides, genome, genome sequencing and assembly, Ion S5 sequencing platform

Procedia PDF Downloads 148

363 Genome-Wide Association Study Identify COL2A1 as a Susceptibility Gene for the Hand Development Failure of Kashin-Beck Disease

Authors: Feng Zhang

Abstract:

Kashin-Beck disease (KBD) is a chronic osteochondropathy. The mechanism of hand growth and development failure of KBD remains elusive now. In this study, we conducted a two-stage genome-wide association study (GWAS) of palmar length-width ratio (LWR) of KBD, totally involving 493 Chinese Han KBD patients. Affymetrix Genome Wide Human SNP Array 6.0 was applied for SNP genotyping. Association analysis was conducted by PLINK software. Imputation analysis was performed by IMPUTE against the reference panel of the 1000 genome project. In the GWAS, the most significant association was observed between palmar LWR and rs2071358 of COL2A1 gene (P value = 4.68×10-8). Imputation analysis identified 3 SNPs surrounding rs2071358 with significant or suggestive association signals. Replication study observed additional significant association signals at both rs2071358 (P value = 0.017) and rs4760608 (P value = 0.002) of COL2A1 gene after Bonferroni correction. Our results suggest that COL2A1 gene was a novel susceptibility gene involved in the growth and development failure of hand of KBD.

Keywords: Kashin-Beck disease, genome-wide association study, COL2A1, hand

Procedia PDF Downloads 218

362 Genome Characterization and Phylogeny Analysis of Viruses Infected Invertebrates, Parvoviridae Family

Authors: Niloofar Fariborzi, Hamzeh Alipour, Kourosh Azizi, Neda Eskandarzade, Abozar Ghorbani

Abstract:

The family Parvoviridae consists of a large diversity of single-stranded DNA viruses, which cause mild to severe diseases in both vertebrates and invertebrates. The Parvoviridae are classified into three subfamilies: Parvovirinae infect vertebrates, Densovirinae infects invertebrates, while Hamaparovirinae infects both vertebrates and invertebrates. Except for the NS1 region, which is the prime criterion for phylogeny analysis, other parts of the parvoviruses genome, such as UTRs, are diverse even among closely related viruses or within the same genus. It is believed that host switching in parvoviruses may be related to genetic changes in regions other than NS1; therefore, whole-genome screening is valuable for studying parvoviruses' host-virus interactions. The aim of this study was to analyze genome organization and phylogeny of the complete genome sequence of the 132 Paroviridae family members, focusing on viruses that infect invertebrates. The maximum and minimum divergence within each subfamily belonged to Densovirinae and Parvovirinae, respectively. The greatest evolutionary divergence was between Hamaparovirinae and Parvovirinae. Unclassified viruses were mostly from Parovirinae and had the highest divergence to densoviruses and the lowest divergence to Parovirinae viruses. In a phylogenetic tree, all hamparoviruses were found in the center of densoviruses, with the exception of Syngnathid Ichthamaparvovirus 1 (NC_055527), which was positioned between two Parvovirinae members (NC _022089 and NC_038544). The proximity of hamparoviruses members to some densoviruses strengthens the possibility that densoviruses may be the ancestors of hamaparoviruses or vice versa. Therefore, examination and phylogeny analysis of the whole genome is necessary to understand Parvoviridae family host selection.

Keywords: densoviruses, parvoviridae, bioinformatics, phylogeny

Procedia PDF Downloads 91

361 Genome-Wide Analysis of Long Terminal Repeat (LTR) Retrotransposons in Rabbit (Oryctolagus cuniculus)

Authors: Zeeshan Khan, Faisal Nouroz, Shumaila Noureen

Abstract:

European or common rabbit (Oryctolagus cuniculus) belongs to class Mammalia, order Lagomorpha of family Leporidae. They are distributed worldwide and are native to Europe (France, Spain and Portugal) and Africa (Morocco and Algeria). LTR retrotransposons are major Class I mobile genetic elements of eukaryotic genomes and play a crucial role in genome expansion, evolution and diversification. They were mostly annotated in various genomes by conventional approaches of homology searches, which restricted the annotation of novel elements. Present work involved de novo identification of LTR retrotransposons by LTR_FINDER in haploid genome of rabbit (2247.74 Mb) distributed in 22 chromosomes, of which 7,933 putative full-length or partial copies were identified containing 69.38 Mb of elements, accounting 3.08% of the genome. Highest copy numbers (731) were found on chromosome 7, followed by chromosome 12 (705), while the lowest copy numbers (27) were detected in chromosome 19 with no elements identified from chromosome 21 due to partially sequenced chromosome, unidentified nucleotides (N) and repeated simple sequence repeats (SSRs). The identified elements ranged in sizes from 1.2 - 25.8 Kb with average sizes between 2-10 Kb. Highest percentage (4.77%) of elements was found in chromosome 15, while lowest (0.55%) in chromosome 19. The most frequent tRNA type was Arginine present in majority of the elements. Based on gained results, it was estimated that rabbit exhibits 15,866 copies having 137.73 Mb of elements accounting 6.16% of diploid genome (44 chromosomes). Further molecular analyses will be helpful in chromosomal localization and distribution of these elements on chromosomes.

Keywords: rabbit, LTR retrotransposons, genome, chromosome

Procedia PDF Downloads 147

360 Genetic Diversity and Discovery of Unique SNPs in Five Country Cultivars of Sesamum indicum by Next-Generation Sequencing

Authors: Nam-Kuk Kim, Jin Kim, Soomin Park, Changhee Lee, Mijin Chu, Seong-Hun Lee

Abstract:

In this study, we conducted whole genome re-sequencing of 10 cultivars originated from five countries including Korea, China, India, Pakistan and Ethiopia with Sesamum indicum (Zhongzho No. 13) genome as a reference. Almost 80% of the whole genome sequences of the reference genome could be covered by sequenced reads. Numerous SNP and InDel were detected by bioinformatic analysis. Among these variants, 266,051 SNPs were identified as unique to countries. Pakistan and Ethiopia had high densities of SNPs compared to other countries. Three main clusters (cluster 1: Korea, cluster 2: Pakistan and India, cluster 3: Ethiopia and China) were recovered by neighbor-joining analysis using all variants. Interestingly, some variants were detected in DGAT1 (diacylglycerol O-acyltransferase 1) and FADS (fatty acid desaturase) genes, which are known to be related with fatty acid synthesis and metabolism. These results can provide useful information to understand the regional characteristics and develop DNA markers for origin discrimination of sesame.

Keywords: Sesamum indicum, NGS, SNP, DNA marker

Procedia PDF Downloads 326

359 Genomics of Aquatic Adaptation

Authors: Agostinho Antunes

Abstract:

The completion of the human genome sequencing in 2003 opened a new perspective into the importance of whole genome sequencing projects, and currently multiple species are having their genomes completed sequenced, from simple organisms, such as bacteria, to more complex taxa, such as mammals. This voluminous sequencing data generated across multiple organisms provides also the framework to better understand the genetic makeup of such species and related ones, allowing to explore the genetic changes underlining the evolution of diverse phenotypic traits. Here, recent results from our group retrieved from comparative evolutionary genomic analyses of selected marine animal species will be considered to exemplify how gene novelty and gene enhancement by positive selection might have been determinant in the success of adaptive radiations into diverse habitats and lifestyles.

Keywords: comparative genomics, adaptive evolution, bioinformatics, phylogenetics, genome mining

Procedia PDF Downloads 532

358 Insights into the Annotated Genome Sequence of Defluviitoga tunisiensis L3 Isolated from a Thermophilic Rural Biogas Producing Plant

Authors: Irena Maus, Katharina Gabriella Cibis, Andreas Bremges, Yvonne Stolze, Geizecler Tomazetto, Daniel Wibberg, Helmut König, Alfred Pühler, Andreas Schlüter

Abstract:

Within the agricultural sector, the production of biogas from organic substrates represents an economically attractive technology to generate bioenergy. Complex consortia of microorganisms are responsible for biomass decomposition and biogas production. Recently, species belonging to the phylum Thermotogae were detected in thermophilic biogas-production plants utilizing renewable primary products for biomethanation. To analyze adaptive genome features of representative Thermotogae strains, Defluviitoga tunisiensis L3 was isolated from a rural thermophilic biogas plant (54°C) and completely sequenced on an Illumina MiSeq system. Sequencing and assembly of the D. tunisiensis L3 genome yielded a circular chromosome with a size of 2,053,097 bp and a mean GC content of 31.38%. Functional annotation of the complete genome sequence revealed that the thermophilic strain L3 encodes several genes predicted to facilitate growth of this microorganism on arabinose, galactose, maltose, mannose, fructose, raffinose, ribose, cellobiose, lactose, xylose, xylan, lactate and mannitol. Acetate, hydrogen (H2) and carbon dioxide (CO2) are supposed to be end products of the fermentation process. The latter gene products are metabolites for methanogenic archaea, the key players in the final step of the anaerobic digestion process. To determine the degree of relatedness of dominant biogas community members within selected digester systems to D. tunisiensis L3, metagenome sequences from corresponding communities were mapped on the L3 genome. These fragment recruitments revealed that metagenome reads originating from a thermophilic biogas plant covered 95% of D. tunisiensis L3 genome sequence. In conclusion, availability of the D. tunisiensis L3 genome sequence and insights into its metabolic capabilities provide the basis for biotechnological exploitation of genome features involved in thermophilic fermentation processes utilizing renewable primary products.

Keywords: genome sequence, thermophilic biogas plant, Thermotogae, Defluviitoga tunisiensis

Procedia PDF Downloads 498

357 Genomics of Adaptation in the Sea

Authors: Agostinho Antunes

Abstract:

Keywords: marine genomics, evolutionary bioinformatics, human genome sequencing, genomic analyses

Procedia PDF Downloads 610

356 Modified Genome-Scale Metabolic Model of Escherichia coli by Adding Hyaluronic Acid Biosynthesis-Related Enzymes (GLMU2 and HYAD) from Pasteurella multocida

Authors: P. Pasomboon, P. Chumnanpuen, T. E-kobon

Abstract:

Hyaluronic acid (HA) consists of linear heteropolysaccharides repeat of D-glucuronic acid and N-acetyl-D-glucosamine. HA has various useful properties to maintain skin elasticity and moisture, reduce inflammation, and lubricate the movement of various body parts without causing immunogenic allergy. HA can be found in several animal tissues as well as in the capsule component of some bacteria including Pasteurella multocida. This study aimed to modify a genome-scale metabolic model of Escherichia coli using computational simulation and flux analysis methods to predict HA productivity under different carbon sources and nitrogen supplement by the addition of two enzymes (GLMU2 and HYAD) from P. multocida to improve the HA production under the specified amount of carbon sources and nitrogen supplements. Result revealed that threonine and aspartate supplement raised the HA production by 12.186%. Our analyses proposed the genome-scale metabolic model is useful for improving the HA production and narrows the number of conditions to be tested further.

Keywords: Pasteurella multocida, Escherichia coli, hyaluronic acid, genome-scale metabolic model, bioinformatics

Procedia PDF Downloads 121

355 In silico Comparative Analysis of Chloroplast Genome (cpDNA) and Some Individual Genes (rbcL and trnH-psbA) in Pooideae Subfamily Members

Authors: Ibrahim Ilker Ozyigit, Ertugrul Filiz, Ilhan Dogan

Abstract:

An in silico analysis of Brachypodium distachyon, Triticum aestivum, Festuca arundinacea, Lolium perenne, Hordeum vulgare subsp. vulgare of the Pooideaea was performed based on complete chloroplast genomes including rbcL coding and trnH-psbA intergenic spacer regions alone to compare phylogenetic resolving power. Neighbor-joining, Minimum Evolution, and Unweighted Pair Group Method with arithmetic mean methods were used to reconstruct phylogenies with the highest bootstrap supported the obtained data from whole chloroplast genome sequence. The highest and lowest values from nucleotide diversity (π) analysis were found to be 0.315813 and 0.043495 in rbcL coding region in chloroplast genome and complete chloroplast genome, respectively. The highest transition/transversion bias (R) value was recorded as 1.384 in complete chloroplast genomes. F. arudinacea-L. perenne clade was uncovered in all phylogenies. Sequences of rbcL and trnH-psbA regions were not able to resolve the Pooideae phylogenies due to lack of genetic variation.

Keywords: chloroplast DNA, Pooideae, phylogenetic analysis, rbcL, trnH-psbA

Procedia PDF Downloads 375

354 Societal Acceptability Conditions of Genome Editing for Upland Rice in Madagascar

Authors: Anny Lucrece Nlend Nkott, Ludovic Temple

Abstract:

The appearance in 2012 of the CRISPR-CaS9 genome editing technique marks a turning point in the field of genetics. This technique would make it possible to create new varieties quickly and cheaply. Although some consider CRISPR-CaS9 to be revolutionary, others consider it a potential societal threat. To document the controversy, we explain the socioeconomic conditions under which this technique could be accepted for the creation of a rainfed rice variety in Madagascar. The methodological framework is based on 38 individual and semistructured interviews, a multistakeholder forum with 27 participants, and a survey of 148 rice producers. Results reveal that the acceptability of genome editing requires (i) strengthening the seed system through the operationalization of regulatory structures and the upgrading of stakeholders' knowledge of genetically modified organisms, (ii) assessing the effects of the edited variety on biodiversity and soil nitrogen dynamics, and (iii) strengthening the technical and human capacities of the biosafety body. Structural mechanisms for regulating the seed system are necessary to ensure safe experimentation of genome editing techniques. Organizational innovation also appears to be necessary. The study documents how collective learning between communities of scientists and nonscientists is a component of systemic processes of varietal innovation. This study was carried out with the financial support of the GENERICE project (Generation and Deployment of Genome-Edited, Nitrogen-use-Efficient Rice Varieties), funded by the Agropolis Foundation.

Keywords: CRISPR-CaS9, varietal innovation, seed system, innovation system

Procedia PDF Downloads 153

353 Analysis of Endogenous Sirevirus in Germinating Barley (Hordeum vulgare L.)

Authors: Nermin Gozukirmizi, Buket Cakmak, Sevgi Marakli

Abstract:

Sireviruses are genera of copia LTR retrotransposons with a unique genome structure among retrotransposons. Barley (Hordeum vulgare L.) is an economically important plant and has been studied as a model plant regarding its short annual life cycle and seven chromosome pairs. In this study, we used mature barley embryos, 10-day-old roots and 10-day-old leaves derived from the same barley plant to investigate SIRE1 retrotransposon movements by Inter-Retrotransposon Amplified Polymorphism (IRAP) technique. We found polymorphism rates between 0-64% among embryos, roots and leaves. Polymorphism rates were detected to be 0-27% among embryos, 8-60% among roots, and 11-50% among leaves. Polymorphisms were observed not only among the parts of different individuals, but also on the parts of the same plant (23-64%). The internal domains of SIRE1 (gag, env and rt) were also analyzed in the embryos, roots and leaves. Analysis of band profiles showed no polymorphism for gag, however, different band patterns were observed among samples for rt and env. The sequencing of SIRE1 gag, env and rt domains revealed 79% similarity for gag, 95% for env and 84% for rt to Ty1-copia retrotransposons. SIRE1 retrotransposon was identified in the soybean genome and has been studied on other plants (maize, rice, tomatoe etc.). This study is the first detailed investigation of SIRE1 in barley genome. The obtained findings are expected to contribute to the comprehension of SIRE1 retrotransposon and its role in barley genome.

Keywords: barley, polymorphism, retrotransposon, SIRE1 virus

Procedia PDF Downloads 307

352 CRISPR-DT: Designing gRNAs for the CRISPR-Cpf1 System with Improved Target Efficiency and Specificity

Authors: Houxiang Zhu, Chun Liang

Abstract:

The CRISPR-Cpf1 system has been successfully applied in genome editing. However, target efficiency of the CRISPR-Cpf1 system varies among different gRNA sequences. The published CRISPR-Cpf1 gRNA data was reanalyzed. Many sequences and structural features of gRNAs (e.g., the position-specific nucleotide composition, position-nonspecific nucleotide composition, GC content, minimum free energy, and melting temperature) correlated with target efficiency were found. Using machine learning technology, a support vector machine (SVM) model was created to predict target efficiency for any given gRNAs. The first web service application, CRISPR-DT (CRISPR DNA Targeting), has been developed to help users design optimal gRNAs for the CRISPR-Cpf1 system by considering both target efficiency and specificity. CRISPR-DT will empower researchers in genome editing.

Keywords: CRISPR-Cpf1, genome editing, target efficiency, target specificity

Procedia PDF Downloads 261

351 DeepOmics: Deep Learning for Understanding Genome Functioning and the Underlying Genetic Causes of Disease

Authors: Vishnu Pratap Singh Kirar, Madhuri Saxena

Abstract:

Advancement in sequence data generation technologies is churning out voluminous omics data and posing a massive challenge to annotate the biological functional features. With so much data available, the use of machine learning methods and tools to make novel inferences has become obvious. Machine learning methods have been successfully applied to a lot of disciplines, including computational biology and bioinformatics. Researchers in computational biology are interested to develop novel machine learning frameworks to classify the huge amounts of biological data. In this proposal, it plan to employ novel machine learning approaches to aid the understanding of how apparently innocuous mutations (in intergenic DNA and at synonymous sites) cause diseases. We are also interested in discovering novel functional sites in the genome and mutations in which can affect a phenotype of interest.

Keywords: genome wide association studies (GWAS), next generation sequencing (NGS), deep learning, omics

Procedia PDF Downloads 96

350 From Genome to Field: Applying Genome Wide Association Study for Sustainable Ascochyta Blight Management in Faba Beans

Authors: Rabia Faridi, Rizwana Maqbool, Umara Sahar Rana, Zaheer Ahmad

Abstract:

Climate change impacts agriculture, notably in Germany, where spring faba beans predominate. However, improved winter hardiness aligns with milder winters, enabling autumn-sown varieties. Genetic resistance to Ascochyta blight is vital for crop integration. Traditional breeding faces challenges due to complex inheritance. This study assessed 224 homozygous faba bean lines for Ascochyta resistance traits. To achieve h²>70%, 12 replicates were required (realized h²=87%). Genetic variation and strong trait correlations were observed. Five lines outperformed 29H, while three were highly susceptible. A genome-wide association study (GWAS) with 188 inbred lines and 2058 markers, including 17 guide SNP markers, identified 12 markers associated with resistance traits, potentially indicating new resistance genes. One guide marker (Vf-Mt1g014230-001) on chromosome III validated a known QTL. The guided marker approach complemented GWAS, facilitating marker-assisted selection for Ascochyta resistance. The Göttingen Winter Bean Population offers promise for resistance breeding.

Keywords: genome wide association studies, marker assisted breeding, faba bean, ascochyta blight

Procedia PDF Downloads 58

349 Genome Sequencing of the Yeast Saccharomyces cerevisiae Strain 202-3

Authors: Yina A. Cifuentes Triana, Andrés M. Pinzón Velásco, Marío E. Velásquez Lozano

Abstract:

In this work the sequencing and genome characterization of a natural isolate of Saccharomyces cerevisiae yeast (strain 202-3), identified with potential for the production of second generation ethanol from sugarcane bagasse hydrolysates is presented. This strain was selected because its capability to consume xylose during the fermentation of sugarcane bagasse hydrolysates, taking into account that many strains of S. cerevisiae are incapable of processing this sugar. This advantage and other prominent positive aspects during fermentation profiles evaluated in bagasse hydrolysates made the strain 202-3 a candidate strain to improve the production of second-generation ethanol, which was proposed as a first step to study the strain at the genomic level. The molecular characterization was carried out by genome sequencing with the Illumina HiSeq 2000 platform paired end; the assembly was performed with different programs, finally choosing the assembler ABYSS with kmer 89. Gene prediction was developed with the approach of hidden Markov models with Augustus. The genes identified were scored based on similarity with public databases of nucleotide and protein. Records were organized from ontological functions at different hierarchical levels, which identified central metabolic functions and roles of the S. cerevisiae strain 202-3, highlighting the presence of four possible new proteins, two of them probably associated with the positive consumption of xylose.

Keywords: cellulosic ethanol, Saccharomyces cerevisiae, genome sequencing, xylose consumption

Procedia PDF Downloads 319

348 Systematic Identification of Noncoding Cancer Driver Somatic Mutations

Authors: Zohar Manber, Ran Elkon

Abstract:

Accumulation of somatic mutations (SMs) in the genome is a major driving force of cancer development. Most SMs in the tumor's genome are functionally neutral; however, some cause damage to critical processes and provide the tumor with a selective growth advantage (termed cancer driver mutations). Current research on functional significance of SMs is mainly focused on finding alterations in protein coding sequences. However, the exome comprises only 3% of the human genome, and thus, SMs in the noncoding genome significantly outnumber those that map to protein-coding regions. Although our understanding of noncoding driver SMs is very rudimentary, it is likely that disruption of regulatory elements in the genome is an important, yet largely underexplored mechanism by which somatic mutations contribute to cancer development. The expression of most human genes is controlled by multiple enhancers, and therefore, it is conceivable that regulatory SMs are distributed across different enhancers of the same target gene. Yet, to date, most statistical searches for regulatory SMs have considered each regulatory element individually, which may reduce statistical power. The first challenge in considering the cumulative activity of all the enhancers of a gene as a single unit is to map enhancers to their target promoters. Such mapping defines for each gene its set of regulating enhancers (termed "set of regulatory elements" (SRE)). Considering multiple enhancers of each gene as one unit holds great promise for enhancing the identification of driver regulatory SMs. However, the success of this approach is greatly dependent on the availability of comprehensive and accurate enhancer-promoter (E-P) maps. To date, the discovery of driver regulatory SMs has been hindered by insufficient sample sizes and statistical analyses that often considered each regulatory element separately. In this study, we analyzed more than 2,500 whole-genome sequence (WGS) samples provided by The Cancer Genome Atlas (TCGA) and The International Cancer Genome Consortium (ICGC) in order to identify such driver regulatory SMs. Our analyses took into account the combinatorial aspect of gene regulation by considering all the enhancers that control the same target gene as one unit, based on E-P maps from three genomics resources. The identification of candidate driver noncoding SMs is based on their recurrence. We searched for SREs of genes that are "hotspots" for SMs (that is, they accumulate SMs at a significantly elevated rate). To test the statistical significance of recurrence of SMs within a gene's SRE, we used both global and local background mutation rates. Using this approach, we detected - in seven different cancer types - numerous "hotspots" for SMs. To support the functional significance of these recurrent noncoding SMs, we further examined their association with the expression level of their target gene (using gene expression data provided by the ICGC and TCGA for samples that were also analyzed by WGS).

Keywords: cancer genomics, enhancers, noncoding genome, regulatory elements

Procedia PDF Downloads 103

347 Revealing the Genome Based Biosynthetic Potential of a Streptomyces sp. Isolate BR123 Presenting Broad Spectrum Antimicrobial Activities

Authors: Neelma Ashraf

Abstract:

Actinomycetes, particularly genus Streptomyces is of great importance due to their role in the discovery of new natural products, particularly antimicrobial secondary metabolites in the medicinal science and biotechnology industry. Different Streptomyces strains were isolated from Helianthus annuus plants and tested for antibacterial and antifungal activities. The most promising five strains were chosen for further investigation, and growth conditions for antibiotic synthesis were optimised. The supernatants were extracted in different solvents, and the extracted products were analyzed using liquid chromatography-mass spectrometry (LC-MS) and biological testing. From one of the potent strains Streptomyces globusus sp. BR123, a compound lavendamycin was identified using these analytical techniques. In addition, this potent strain also produces a strong antifungal polyene compound with a quasimolecular ion of 2072. Streptomyces sp. BR123 was genome sequenced because of its promising antimicrobial potential in order to identify the gene cluster responsible for analyzed compound “lavendamycin”. The genome analysis yielded candidate genes responsible for the production of this potent compound. The genome sequence of 8.15 Mb of Streptomyces sp. isolate BR123 with a GC content of 72.63% and 8103 protein coding genes was attained. Many antimicrobial, antiparasitic, and anticancerous compounds were detected through multiple biosynthetic gene clusters predicted by in-Silico analysis. Though, the novelty of metabolites was determined through the insignificant resemblance with known biosynthetic gene clusters. The current study gives insight into the bioactive potential of Streptomyces sp. isolate BR123 with respect to the synthesis of bioactive secondary metabolites through genomic and spectrometric analysis. Moreover, the comparative genome study revealed the connection of isolate BR123 with other Streptomyces strains, which could expand the knowledge of this genus and the mechanism involved in the discovery of new antimicrobial metabolites.

Keywords: streptomyces, secondary metabolites, genome, biosynthetic gene clusters, high performance liquid chromatography, mass spectrometry

Procedia PDF Downloads 70

346 A Biophysical Model of CRISPR/Cas9 on- and off-Target Binding for Rational Design of Guide RNAs

Authors: Iman Farasat, Howard M. Salis

Abstract:

The CRISPR/Cas9 system has revolutionized genome engineering by enabling site-directed and high-throughput genome editing, genome insertion, and gene knockdowns in several species, including bacteria, yeast, flies, worms, and human cell lines. This technology has the potential to enable human gene therapy to treat genetic diseases and cancer at the molecular level; however, the current CRISPR/Cas9 system suffers from seemingly sporadic off-target genome mutagenesis that prevents its use in gene therapy. A comprehensive mechanistic model that explains how the CRISPR/Cas9 functions would enable the rational design of the guide-RNAs responsible for target site selection while minimizing unexpected genome mutagenesis. Here, we present the first quantitative model of the CRISPR/Cas9 genome mutagenesis system that predicts how guide-RNA sequences (crRNAs) control target site selection and cleavage activity. We used statistical thermodynamics and law of mass action to develop a five-step biophysical model of cas9 cleavage, and examined it in vivo and in vitro. To predict a crRNA's binding specificities and cleavage rates, we then compiled a nearest neighbor (NN) energy model that accounts for all possible base pairings and mismatches between the crRNA and the possible genomic DNA sites. These calculations correctly predicted crRNA specificity across 5518 sites. Our analysis reveals that cas9 activity and specificity are anti-correlated, and, the trade-off between them is the determining factor in performing an RNA-mediated cleavage with minimal off-targets. To find an optimal solution, we first created a scheme of safe-design criteria for Cas9 target selection by systematic analysis of available high throughput measurements. We then used our biophysical model to determine the optimal Cas9 expression levels and timing that maximizes on-target cleavage and minimizes off-target activity. We successfully applied this approach in bacterial and mammalian cell lines to reduce off-target activity to near background mutagenesis level while maintaining high on-target cleavage rate.

Keywords: biophysical model, CRISPR, Cas9, genome editing

Procedia PDF Downloads 404

345 Genome Sequencing of Infectious Bronchitis Virus QX-Like Strain Isolated in Malaysia

Authors: M. Suwaibah, S. W. Tan, I. Aiini, K. Yusoff, A. R. Omar

Abstract:

Respiratory diseases are the most important infectious diseases affecting poultry worldwide. One of the avian respiratory virus of global importance causing significant economic losses is Infectious Bronchitis Virus (IBV). The virus causes a wide spectrum disease known as Infectious Bronchitis (IB), affecting not only the respiratory system but also the kidney and the reproductive system, depending on its strain. IB and Newcastle disease are two of the most prevalent diseases affecting poultry in Malaysia. However, a study on the molecular characterization of Malaysian IBV is lacking. In this study, an IBV strain IBS130 which was isolated in 2015 was fully sequenced using next-gene sequencing approach. Sequence analysis of IBS130 based on the complete genome, polyprotein 1ab and S1 genes were compared with other IBV sequences available in Genbank, National Center for Biotechnology Information (NCBI). IBV strain IBS130 is characterised as QX-like strain based on whole genome and S1 gene sequence analysis. Comparisons of the virus with other IBV strains showed that the nucleotide identity ranged from 67% to 99.2%, depending on the region analysed. The similarity in whole genome nucleotide ranging from 84.9% to 90.7% with the least similar was from Singapore strains (84.9%) and highly similar with China QX-like strains. Meanwhile, the similarity in polyprotein 1ab ranging from 85.3% to 89.9% with the least similar to Singapore strains (85.3%) and highly similar with Mass strains from USA.

Keywords: infectious bronchitis virus, phylogenetic analysis, chicken, Malaysia

Procedia PDF Downloads 185

344 Generalized Correlation Coefficient in Genome-Wide Association Analysis of Cognitive Ability in Twins

Authors: Afsaneh Mohammadnejad, Marianne Nygaard, Jan Baumbach, Shuxia Li, Weilong Li, Jesper Lund, Jacob v. B. Hjelmborg, Lene Christensen, Qihua Tan

Abstract:

Cognitive impairment in the elderly is a key issue affecting the quality of life. Despite a strong genetic background in cognition, only a limited number of single nucleotide polymorphisms (SNPs) have been found. These explain a small proportion of the genetic component of cognitive function, thus leaving a large proportion unaccounted for. We hypothesize that one reason for this missing heritability is the misspecified modeling in data analysis concerning phenotype distribution as well as the relationship between SNP dosage and the phenotype of interest. In an attempt to overcome these issues, we introduced a model-free method based on the generalized correlation coefficient (GCC) in a genome-wide association study (GWAS) of cognitive function in twin samples and compared its performance with two popular linear regression models. The GCC-based GWAS identified two genome-wide significant (P-value < 5e-8) SNPs; rs2904650 near ZDHHC2 on chromosome 8 and rs111256489 near CD6 on chromosome 11. The kinship model also detected two genome-wide significant SNPs, rs112169253 on chromosome 4 and rs17417920 on chromosome 7, whereas no genome-wide significant SNPs were found by the linear mixed model (LME). Compared to the linear models, more meaningful biological pathways like GABA receptor activation, ion channel transport, neuroactive ligand-receptor interaction, and the renin-angiotensin system were found to be enriched by SNPs from GCC. The GCC model outperformed the linear regression models by identifying more genome-wide significant genetic variants and more meaningful biological pathways related to cognitive function. Moreover, GCC-based GWAS was robust in handling genetically related twin samples, which is an important feature in handling genetic confounding in association studies.

Keywords: cognition, generalized correlation coefficient, GWAS, twins

Procedia PDF Downloads 123

343 Genodata: The Human Genome Variation Using BigData

Authors: Surabhi Maiti, Prajakta Tamhankar, Prachi Uttam Mehta

Abstract:

Since the accomplishment of the Human Genome Project, there has been an unparalled escalation in the sequencing of genomic data. This project has been the first major vault in the field of medical research, especially in genomics. This project won accolades by using a concept called Bigdata which was earlier, extensively used to gain value for business. Bigdata makes use of data sets which are generally in the form of files of size terabytes, petabytes, or exabytes and these data sets were traditionally used and managed using excel sheets and RDBMS. The voluminous data made the process tedious and time consuming and hence a stronger framework called Hadoop was introduced in the field of genetic sciences to make data processing faster and efficient. This paper focuses on using SPARK which is gaining momentum with the advancement of BigData technologies. Cloud Storage is an effective medium for storage of large data sets which is generated from the genetic research and the resultant sets produced from SPARK analysis.

Keywords: human genome project, Bigdata, genomic data, SPARK, cloud storage, Hadoop

Procedia PDF Downloads 259

342 Molecular-Genetics Studies of New Unknown APMV Isolated from Wild Bird in Ukraine

Authors: Borys Stegniy, Anton Gerilovych, Oleksii Solodiankin, Vitaliy Bolotin, Anton Stegniy, Denys Muzyka, Claudio Afonso

Abstract:

New APMV was isolated from white fronted goose in Ukraine. This isolate was tested serologically using monoclonal antibodies in haemagglutination-inhibition tests against APMV1-9. As the results obtained isolate showed cross reactions with APMV7. Following investigations were provided for the full genome sequencing using random primers and cloning into pCRII-TOPO. Analysis of 100 transformed colonies of E.coli using traditional sequencing gave us possibilities to find only 3 regions, which could identify by BLAST. The first region with the length of 367 bp had 70 % nucleotide sequence identity to the APMV 12 isolate Wigeon/Italy/3920_1/2005 at genome position 2419-2784. Next region (344 bp) had 66 % identity to the same APMV 12 isolate at position 4760-5103. The last region (365 bp) showed 71 % identity to Newcastle disease virus strain M4 at position 12569-12928.

Keywords: APMV, Newcastle disease virus, Ukraine, full genome sequencing

Procedia PDF Downloads 442

341 Exploring MPI-Based Parallel Computing in Analyzing Very Large Sequences

Authors: Bilal Wajid, Erchin Serpedin

Abstract:

The health industry is aiming towards personalized medicine. If the patient’s genome needs to be sequenced it is important that the entire analysis be completed quickly. This paper explores use of parallel computing to analyze very large sequences. Two cases have been considered. In the first case, the sequence is kept constant and the effect of increasing the number of MPI-based processes is evaluated in terms of execution time, speed and efficiency. In the second case the number of MPI-based processes have been kept constant whereas, the length of the sequence was increased.

Keywords: parallel computing, alignment, genome assembly, alignment

Procedia PDF Downloads 272