Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 92

Search results for: genomics

92 Analysis of Genomics Big Data in Cloud Computing Using Fuzzy Logic

Authors: Mohammad Vahed, Ana Sadeghitohidi, Majid Vahed, Hiroki Takahashi

Abstract:

In the genomics field, huge amounts of data have been produced by next-generation sequencers (NGS). Data volumes are growing very rapidly, as it is postulated that more than one billion bases will be produced per year in 2020. The growth rate of produced data is much faster than Moore's law in computer technology. This makes it more difficult to deal with genomics data, for example in storing data, searching information, and finding hidden information. An analysis platform for genomics big data therefore needs to be developed. Newly developed cloud computing enables us to deal with big data more efficiently. Hadoop is a distributed computing framework that serves as the core of Big Data as a Service (BDaaS). Although many services have adopted this technology, e.g., Amazon, there are few applications in the biology field. Here, we propose a new algorithm to deal more efficiently with genomics big data, e.g., sequencing data. Our algorithm consists of two parts. First, BDaaS is applied to handle the data more efficiently. Second, a hybrid method of MapReduce and fuzzy logic is applied for data processing; this step can be parallelized in implementation. Our algorithm has great potential in computational analysis of genomics big data, e.g., de novo genome assembly and sequence similarity search. We will discuss our algorithm and its feasibility.
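The abstract does not spell out how MapReduce and fuzzy logic are combined, so the following is only an illustrative sketch under stated assumptions: the map step emits k-mers from each read, weighted by a fuzzy "high quality" membership of the base qualities, and the reduce step sums the weights per k-mer. The function names, the membership thresholds, and the toy reads are all hypothetical and are not taken from the authors' algorithm.

```python
from collections import defaultdict
from itertools import chain


def high_quality(phred, low=10.0, high=30.0):
    """Fuzzy membership of a base quality in the set 'high quality'.

    Returns 0.0 at or below `low`, 1.0 at or above `high`, and a linear
    ramp in between. The thresholds are illustrative assumptions.
    """
    if phred <= low:
        return 0.0
    if phred >= high:
        return 1.0
    return (phred - low) / (high - low)


def map_read(read, k=3):
    """Map step: emit (k-mer, weight) pairs for one (sequence, qualities) read."""
    seq, quals = read
    for i in range(len(seq) - k + 1):
        # Fuzzy AND (minimum) over the membership of every base in the k-mer.
        weight = min(high_quality(q) for q in quals[i:i + k])
        yield seq[i:i + k], weight


def reduce_counts(pairs):
    """Reduce step: sum the fuzzy weights per k-mer."""
    totals = defaultdict(float)
    for kmer, weight in pairs:
        totals[kmer] += weight
    return dict(totals)


# Toy reads: (sequence, per-base Phred scores). In a Hadoop job each mapper
# would receive its own split of reads; here the map outputs are chained locally.
reads = [("ACGTA", [30, 30, 30, 20, 10]), ("ACGT", [40, 40, 40, 40])]
counts = reduce_counts(chain.from_iterable(map_read(r) for r in reads))
print(counts)
```

Because each read is mapped independently and the reduce step is a plain sum, the map calls can be distributed across workers without coordination, which is the property the abstract refers to when it says the step can be parallelized.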

Keywords: big data, fuzzy logic, MapReduce, Hadoop, cloud computing

Procedia PDF Downloads 294

91 Changing the Landscape of Fungal Genomics: New Trends

Authors: Igor V. Grigoriev

Abstract:

Understanding the biological processes encoded in fungi is instrumental in addressing the future food, feed, and energy demands of the growing human population. Genomics is a powerful and quickly evolving tool to understand these processes. The Fungal Genomics Program of the US Department of Energy Joint Genome Institute (JGI) partners with researchers around the world to explore fungi in several large-scale genomics projects, changing the fungal genomics landscape. The key trends of these changes include: (i) rapidly increasing scale of sequencing and analysis, (ii) developing approaches to go beyond culturable fungi and explore fungal ‘dark matter,’ or unculturables, and (iii) functional genomics and multi-omics data integration. The power of comparative genomics has recently been demonstrated in several JGI projects targeting mycorrhizae, plant pathogens, wood decay fungi, and sugar fermenting yeasts. The largest JGI project, ‘1000 Fungal Genomes’, aims at exploring the diversity across the Fungal Tree of Life in order to better understand fungal evolution and to build a catalogue of genes, enzymes, and pathways for biotechnological applications. At this point, at least 65% of over 700 known families have one or more reference genomes sequenced, enabling metagenomics studies of microbial communities and their interactions with plants. For many of the remaining families, no representative species are available from culture collections. To sequence genomes of unculturable fungi, two approaches have been developed: (a) sequencing DNA from fruiting bodies of ‘macro’ and (b) single cell genomics using fungal spores. The latter has been tested using zoospores from the early diverging fungi and resulted in several near-complete genomes from underexplored branches of the Fungal Tree, including the first genomes of Zoopagomycotina. A genome sequence serves as a reference for transcriptomics studies, the first step towards functional genomics.
In the JGI fungal mini-ENCODE project, transcriptomes of the model fungus Neurospora crassa grown on a spectrum of carbon sources have been collected to build regulatory gene networks. Epigenomics is another tool to understand gene regulation, and recently introduced single molecule sequencing platforms not only provide better genome assemblies but can also detect DNA modifications. For example, the 6mC methylome was surveyed across many diverse fungi, and the highest levels of 6mC methylation among Eukaryota have been reported. Finally, data production at such scale requires data integration to enable efficient data analysis. Over 700 fungal genomes and other -omes have been integrated in the JGI MycoCosm portal and equipped with comparative genomics tools to enable researchers to address a broad spectrum of biological questions and applications for bioenergy and biotechnology.

Keywords: fungal genomics, single cell genomics, DNA methylation, comparative genomics

Procedia PDF Downloads 201

90 High-Throughput Mechanized Microfluidic Test Groundwork for Precise Microbial Genomics

Authors: Pouya Karimi, Ramin Gasemi Shayan, Parsa Sheykhzade

Abstract:

Low-cost shotgun DNA sequencing is transforming the microbial sciences. Sequencing instruments are so effective that sample preparation is now the key limiting factor. Here, we present a microfluidic sample preparation platform that integrates the key steps in cells-to-sequence-library sample preparation for up to 96 samples and reduces DNA input requirements 100-fold while maintaining or improving data quality. The general-purpose microarchitecture we demonstrate supports workflows with arbitrary numbers of reaction and clean-up or capture steps. By reducing the sample quantity requirements, we enabled low-input (∼10,000 cells) whole genome shotgun (WGS) sequencing of Mycobacterium tuberculosis and soil micro-colonies with superior results. We also used the enhanced throughput to sequence ∼400 clinical Pseudomonas aeruginosa libraries and demonstrate excellent single-nucleotide polymorphism detection performance that explained phenotypically observed antibiotic resistance. Fully integrated lab-on-chip sample preparation overcomes technical barriers to enable broader deployment of genomics across many basic research and translational applications.

Keywords: clinical microbiology, DNA, microbiology, microbial genomics

Procedia PDF Downloads 119

89 Evolutionary Genomic Analysis of Adaptation Genomics

Authors: Agostinho Antunes

Abstract:

The completion of human genome sequencing in 2003 opened a new perspective on the importance of whole genome sequencing projects, and currently multiple species are having their genomes completely sequenced, from simple organisms, such as bacteria, to more complex taxa, such as mammals. The voluminous sequencing data generated across multiple organisms also provide the framework to better understand the genetic makeup of such species and related ones, allowing exploration of the genetic changes underlying the evolution of diverse phenotypic traits. Here, recent results from our group, retrieved from comparative evolutionary genomic analyses of varied species, will be considered to exemplify how gene novelty and gene enhancement by positive selection might have been determinant in the success of adaptive radiations into diverse habitats and lifestyles.

Keywords: adaptation, animals, evolution, genomics

Procedia PDF Downloads 425

88 Genomics of Aquatic Adaptation

Authors: Agostinho Antunes

Abstract:

The completion of human genome sequencing in 2003 opened a new perspective on the importance of whole genome sequencing projects, and currently multiple species are having their genomes completely sequenced, from simple organisms, such as bacteria, to more complex taxa, such as mammals. The voluminous sequencing data generated across multiple organisms also provide the framework to better understand the genetic makeup of such species and related ones, allowing exploration of the genetic changes underlying the evolution of diverse phenotypic traits. Here, recent results from our group, retrieved from comparative evolutionary genomic analyses of selected marine animal species, will be considered to exemplify how gene novelty and gene enhancement by positive selection might have been determinant in the success of adaptive radiations into diverse habitats and lifestyles.

Keywords: comparative genomics, adaptive evolution, bioinformatics, phylogenetics, genome mining

Procedia PDF Downloads 524

87 Genomics of Adaptation in the Sea

Authors: Agostinho Antunes

Abstract:

The completion of human genome sequencing in 2003 opened a new perspective on the importance of whole genome sequencing projects, and currently multiple species are having their genomes completely sequenced, from simple organisms, such as bacteria, to more complex taxa, such as mammals. The voluminous sequencing data generated across multiple organisms also provide the framework to better understand the genetic makeup of such species and related ones, allowing exploration of the genetic changes underlying the evolution of diverse phenotypic traits. Here, recent results from our group, retrieved from comparative evolutionary genomic analyses of selected marine animal species, will be considered to exemplify how gene novelty and gene enhancement by positive selection might have been determinant in the success of adaptive radiations into diverse habitats and lifestyles.

Keywords: marine genomics, evolutionary bioinformatics, human genome sequencing, genomic analyses

Procedia PDF Downloads 602

86 A Systems Approach to Targeting Cyclooxygenase: Genomics, Bioinformatics and Metabolomics Analysis of COX-1-/- and COX-2-/- Lung Fibroblasts Providing Indication of Sterile Inflammation

Authors: Abul B. M. M. K. Islam, Mandar Dave, Roderick V. Jensen, Ashok R. Amin

Abstract:

A systems approach was applied to characterize differentially expressed transcripts, bioinformatics pathways, proteins, and prostaglandins (PGs) from lung fibroblasts procured from wild-type (WT), COX-1-/- and COX-2-/- mice to understand system-level control mechanisms. Bioinformatics analysis showed that COX-2 and COX-1 ablated cells induced COX-1- and COX-2-specific signatures, respectively, which significantly overlapped with an 'IL-1β induced inflammatory signature'. This defined novel cross-talk signals that orchestrated coordinated activation of pathways of sterile inflammation sensed by cellular stress. The overlapping signals showed significant over-representation of shared pathways for interferon γ and immune responses, T cell functions, NOD, and toll-like receptor signaling. Gene Ontology Biological Process (GOBP) and pathway enrichment analysis specifically showed an increase in mRNA expression associated with: (a) organ development and homeostasis in COX-1-/- cells and (b) oxidative stress and response, spliceosome and proteasome activity, and mTOR and p53 signaling in COX-2-/- cells. COX-1 and COX-2 showed signs of functional pathways committed to cell cycle and DNA replication at the genomics level. Compared to WT, metabolomics analysis revealed a significant increase in COX-1 mRNA and synthesis of basal levels of eicosanoids (PGE2, PGD2, TXB2, LTB4, PGF1α, and PGF2α) in COX-2 ablated cells and an increase in synthesis of PGE2 and PGF1α in COX-1 null cells. There was a compensation of PGE2 and PGF1α in COX-1-/- and COX-2-/- cells. Collectively, these results support a broader, differential and collaborative regulation of both COX-1 and COX-2 pathways at the metabolic, signaling, and genomics levels in cellular homeostasis and sterile inflammation induced by cellular stress.

Keywords: cyclooxygenases, inflammation, lung fibroblasts, systemic

Procedia PDF Downloads 288

85 Nutritional Genomics Profile Based Personalized Sport Nutrition

Authors: Eszter Repasi, Akos Koller

Abstract:

Our genetic information determines our appearance, physiology, sports performance, and all our other features. To maximize athletes' performance, a science-based approach to nutritional support has been adopted. Genetic studies have now blended with nutritional sciences, and a new, dynamically evolving research field has appeared. Nutritional experts need to make use of nutritional genomics. This recent field of nutritional science can help athletes reach their best sports performance by using correlations between the athlete’s genome, nutrition, and molecules, including the human microbiome (links between food, microbiome, and epigenetics), nutrigenomics, and nutrigenetics. Nutritional genomics has tremendous potential to change the future of dietary guidelines and personal recommendations. Experts need to use new technology to obtain information about athletes, such as a nutritional genomics profile (including determination of the oral and gut microbiome and the DNA-coded response to food components), which can modify the preparation period and sports performance. The influence of nutrients on gene expression is called nutrigenomics. The heterogeneous response of gene variants to nutrients and dietary components is called nutrigenetics. The human microbiome plays a critical role in health and well-being, and there are further links between food or nutrition and the composition of the human microbiome, which can lead to diseases and epigenetic changes as well. A nutritional genomics-based profile of athletes can be the best technique for a dietitian to make a unique sports nutrition diet plan. Using functional food and the right food components can affect the state of health and thus sports performance. Scientists need to determine the best response, given that nutrients affect health by altering the genome, promoting metabolites, and changing physiology.
Nutritional biochemistry explains why polymorphisms in genes for the absorption, circulation, or metabolism of essential nutrients (such as n-3 polyunsaturated fatty acids or epigallocatechin-3-gallate) would affect the efficacy of that nutrient. Controlling nutritional deficiencies and failures, and preventing changes in the state of health or a newly discovered food intolerance, under the observation of a proper medical team, can support better sports performance. It is important that the dietetics profession is informed about gene-diet interactions that may lead to optimal health and a reduced risk of injury or disease. A special medical application for documenting and monitoring health-state data and risk factors can support and alert the medical team for early action and help it to provide a proper health service in time. This model can set up personalized nutrition advice from status control, through recovery, to monitoring. However, more studies are needed to understand the mechanisms and to be able to change the composition of the microbiome and the environmental and genetic risk factors in athletes.

Keywords: gene-diet interaction, multidisciplinary team, microbiome, diet plan

Procedia PDF Downloads 164

84 Isolate-Specific Variations among Clinical Isolates of Brucella Identified by Whole-Genome Sequencing, Bioinformatics and Comparative Genomics

Authors: Abu S. Mustafa, Mohammad W. Khan, Faraz Shaheed Khan, Nazima Habibi

Abstract:

Brucellosis is a zoonotic disease of worldwide prevalence. There are at least four species and several strains of Brucella that cause human disease. Brucella genomes have very limited variation across strains, which hinders strain identification using classical molecular techniques, including PCR and 16S rDNA sequencing. The aim of this study was to perform whole genome sequencing of clinical isolates of Brucella and to perform bioinformatics and comparative genomics analyses to determine the existence of genetic differences across the isolates of a single Brucella species and strain. The draft sequence data were generated from 15 clinical isolates of Brucella melitensis (biovar 2 strain 63/9) using the MiSeq next generation sequencing platform. The generated reads were used for further assembly and analysis. All analyses were performed on a bioinformatics workstation (8-core i7 processor, 8 GB RAM, Bio-Linux operating system). FastQC was used to determine the quality of reads, and low-quality reads were trimmed or eliminated using Fastx_trimmer. Assembly was done using the Velvet and ABySS software packages. The ordering of assembled contigs was performed by Mauve. The online server RAST was employed to annotate the assembled contigs. Annotated genomes were compared using the Mauve and ACT tools. The QC score for the DNA sequence data generated by MiSeq was higher than 30 for 80% of reads with more than 100x coverage, which suggested that the data could be utilized for further analysis. However, when analyzed by FastQC, the read quality of four samples was not good enough for creating a complete genome draft, so the remaining 11 samples were used for further analysis. The comparative genome analyses showed that, despite sharing the same gene sets, single nucleotide polymorphisms and insertions/deletions existed across different genomes, which provided a variable extent of diversity to these bacteria.
In conclusion, next generation sequencing, bioinformatics, and comparative genome analysis can be utilized to find variations (point mutations, insertions and deletions) across different genomes of Brucella within a single strain. This information could be useful in surveillance and epidemiological studies. The study was supported by Kuwait University Research Sector grants MI04/15 and SRUL02/13.
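The quality criterion quoted in the abstract (a score above 30 for 80% of reads) can be illustrated with a small check on FASTQ quality strings. This is a minimal sketch, not the authors' pipeline or the FastQC implementation: it assumes Phred+33 encoding, uses the mean quality per read as the score, and treats a score of exactly 30 as passing. The function names and toy quality strings are hypothetical.

```python
def mean_phred(quality_line, offset=33):
    """Mean Phred score of one FASTQ quality string (Phred+33 assumed)."""
    return sum(ord(c) - offset for c in quality_line) / len(quality_line)


def fraction_passing(quality_lines, threshold=30.0):
    """Fraction of reads whose mean Phred score reaches the threshold."""
    lines = list(quality_lines)
    passing = sum(1 for q in lines if mean_phred(q) >= threshold)
    return passing / len(lines)


def sample_passes(quality_lines, threshold=30.0, min_fraction=0.8):
    """True if at least `min_fraction` of the reads reach the threshold."""
    return fraction_passing(quality_lines, threshold) >= min_fraction


# Toy quality strings: 'I' encodes Q40, '?' encodes Q30, '5' encodes Q20.
qualities = ["IIII", "IIII", "????", "IIII", "5555"]
print(fraction_passing(qualities), sample_passes(qualities))
```

In practice the quality lines would be every fourth line of a FASTQ file; a sample failing such a check would correspond to the samples the abstract excludes from assembly.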

Keywords: brucella, bioinformatics, comparative genomics, whole genome sequencing

Procedia PDF Downloads 371

83 Diversity, Biochemical and Genomic Assessment of Selected Benthic Species of Two Tropical Lagoons, Southwest Nigeria

Authors: G. F. Okunade, M. O. Lawal, R. E. Uwadiae, D. Portnoy

Abstract:

The diversity, physico-chemical, biochemical, and genomics assessment of macrofauna species of Ologe and Badagry Lagoons was carried out between August 2016 and July 2018. The concentrations of Fe, Zn, Mn, Cd, Cr, and Pb in water were determined by Atomic Absorption Spectrophotometer (AAS). Particle size distribution was determined by wet sieving and sedimentation using the hydrometer method. Genomics analyses were carried out using 25 P. fusca (quadriseriata) and 25 P. fusca from each lagoon, owing to their abundance in both lagoons throughout the two years of collection. DNA was isolated from each sample using the Mag-Bind Blood and Tissue DNA HD 96 kit, a method designed to isolate high-quality DNA. The biochemical characteristics were analysed in the dominant species (P. aurita and T. fuscatus) using ELISA kits. Physico-chemical parameters such as pH, total dissolved solids (TDS), dissolved oxygen, and conductivity were analysed using APHA standard protocols. The physico-chemical parameters of water quality recorded mean values of 32.46 ± 0.66 mg/L and 41.93 ± 0.65 mg/L for COD, 27.28 ± 0.97 and 34.82 ± 0.1 mg/L for BOD, 0.04 ± 4.71 mg/L for DO, and 6.65 and 6.58 for pH in Ologe and Badagry Lagoons, with significant variations (p ≤ 0.05) across seasons. The mean and standard deviation of salinity for Ologe and Badagry Lagoons ranged from 0.43 ± 0.30 to 0.27 ± 0.09. A total of 4210 individuals were recorded: 2008 in Ologe Lagoon, belonging to one phylum, two classes, and four families, and 2202 in Badagry Lagoon, belonging to one phylum, two classes, and five families. In Ologe Lagoon, the classes comprised 99% gastropods and 1% bivalves, while in Badagry Lagoon gastropods contributed 98.91% and bivalves 1.09%.
Particle sizes ranged from 0.002 mm to 2.00 mm. Ologe Lagoon recorded 0.83% gravel, 97.83% sand, and 1.33% silt particles, while Badagry Lagoon recorded 7.43% sand, 24.71% silt, and 67.86% clay particles, which is attributed to the excessive dredging activities going on in the lagoon. The maximum percentage of sand (100%) was seen at station 6 in Ologe Lagoon, while the minimum (96%) was found at station 1. P. aurita (Ologe Lagoon) and T. fuscatus (Badagry Lagoon) were the most abundant benthic species, contributing 61.05% and 64.35%, respectively. The enzymatic activities of P. aurita in Ologe Lagoon had mean values of 21.03 mg/dl for AST, 10.33 mg/dl for ALP, 82.16 mg/dl for ALT, and 73.06 mg/dl for CHO, while T. fuscatus in Badagry Lagoon recorded mean values of 29.76 mg/dl for AST, 11.69 mg/L for ALP, 140.58 mg/dl for ALT, and 45.98 mg/dl for CHO. There were significant variations (P < 0.05) in AST and CHO activity levels in the muscles of the species.

Keywords: benthos, biochemical responses, genomics, metals, particle size

Procedia PDF Downloads 122

82 Genomic Surveillance of Bacillus Anthracis in South Africa Revealed a Unique Genetic Cluster of B-Clade Strains

Authors: Kgaugelo Lekota, Ayesha Hassim, Henriette Van Heerden

Abstract:

Bacillus anthracis, the causative agent of anthrax, is composed of three genetic groups, namely A, B, and C. Clade A is distributed worldwide, while sub-clade B has been identified in Kruger National Park (KNP), South Africa. KNP is one of the endemic anthrax regions in South Africa with distinctive genetic diversity. Genomic surveillance of KNP B. anthracis strains was carried out on historical culture collection isolates (n=67) dating from the 1990s to 2015 using a whole genome sequencing approach. Whole genome single nucleotide polymorphism (SNP) and pan-genomics analyses were used to define the B. anthracis genetic population structure. This study showed that KNP has heterologous B. anthracis strains grouping in the A-clade, with the ABr.005/006 (Ancient A) SNP lineage being more prominent. The 2012 and 2015 anthrax isolates are dispersed amongst minor sub-clades that prevail in non-stabilized genetic evolution strains. This was augmented with non-parsimony informative SNPs of the B. anthracis strains across minor sub-clades of the Ancient A clade. Pan-genomics of B. anthracis showed a clear distinction between A- and B-clade genomes, with 11 374 predicted clusters of protein coding genes. Unique accessory genes of B-clade genomes included biosynthetic cell wall genes and fosfomycin multidrug resistance genes. South Africa harbours diverse B. anthracis strains with uniquely defined SNPs. The B. anthracis strains sequenced in this study will serve as a means to further trace the dissemination of B. anthracis outbreaks globally and especially in South Africa.

Keywords: bacillus anthracis, whole genome single nucleotide polymorphisms, pangenomics, kruger national park

Procedia PDF Downloads 137

81 C-eXpress: A Web-Based Analysis Platform for Comparative Functional Genomics and Proteomics in Human Cancer Cell Line, NCI-60 as an Example

Authors: Chi-Ching Lee, Po-Jung Huang, Kuo-Yang Huang, Petrus Tang

Abstract:

Background: Recent advances in high-throughput research technologies such as new-generation sequencing and multi-dimensional liquid chromatography make it possible to dissect the complete transcriptome and proteome in a single run for the first time. However, it is almost impossible for many laboratories to handle and analyze these “BIG” data without the support of a bioinformatics team. We aimed to provide a web-based analysis platform for users with only limited knowledge of bio-computing to study functional genomics and proteomics. Method: We use NCI-60 as an example dataset to demonstrate the power of the web-based analysis platform and data delivery system. C-eXpress takes a simple text file that contains the standard NCBI gene or protein ID and expression levels (rpkm or fold) as input to generate a distribution map of gene/protein expression levels as a heatmap diagram organized by color gradients. The diagram is hyper-linked to a dynamic html table that allows users to filter the datasets based on various gene features. A dynamic summary chart is generated automatically after each filtering process. Results: We implemented an integrated database that contains pre-defined annotations such as gene/protein properties (ID, name, length, MW, pI); pathways based on KEGG and GO biological process; subcellular localization based on GO cellular component; and functional classification based on GO molecular function, kinase, peptidase and transporter. Multiple ways of sorting columns and rows are also provided for comparative analysis and visualization of multiple samples.
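The input described above (a text file of gene or protein IDs with expression levels, followed by filtering) can be sketched as follows. The exact file layout accepted by C-eXpress is not given in the abstract, so the tab-separated two-column format, the `#` comment convention, the function names, and the sample values here are all assumptions made for illustration.

```python
import csv
import io


def parse_expression(text):
    """Parse 'ID<TAB>level' lines into a dict; skip blanks and '#' comments."""
    levels = {}
    for fields in csv.reader(io.StringIO(text), delimiter="\t"):
        if not fields or fields[0].startswith("#"):
            continue
        levels[fields[0]] = float(fields[1])
    return levels


def filter_by_level(levels, minimum):
    """Keep only the entries whose expression level is at least `minimum`."""
    return {gene: value for gene, value in levels.items() if value >= minimum}


# Hypothetical input: gene IDs in the first column, rpkm in the second.
sample = "#gene_id\trpkm\n7157\t12.5\n672\t0.8\n\n1956\t33.0\n"
levels = parse_expression(sample)
expressed = filter_by_level(levels, minimum=10.0)
print(expressed)
```

A filtered dictionary of this kind is the sort of intermediate result that a heatmap or summary chart would then be drawn from.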

Keywords: cancer, visualization, database, functional annotation

Procedia PDF Downloads 610

80 In silico Subtractive Genomics Approach for Identification of Strain-Specific Putative Drug Targets among Hypothetical Proteins of Drug-Resistant Klebsiella pneumoniae Strain 825795-1

Authors: Umairah Natasya Binti Mohd Omeershffudin, Suresh Kumar

Abstract:

Klebsiella pneumoniae is a Gram-negative enteric bacterium that causes nosocomial and urinary tract infections. Of particular concern is the global emergence of multidrug-resistant (MDR) strains of Klebsiella pneumoniae. Characterization of antibiotic resistance determinants at the genomic level plays a critical role in understanding, and potentially controlling, the spread of MDR pathogens. In this study, drug-resistant Klebsiella pneumoniae strain 825795-1 was investigated with extensive computational approaches aimed at identifying novel drug targets among hypothetical proteins. We analyzed the 1099 hypothetical proteins available in the genome. We used an in silico genome subtraction methodology to design potential, pathogen-specific drug targets against Klebsiella pneumoniae. We employed bioinformatics tools to subtract the strain-specific paralogous and host-specific homologous sequences from the bacterial proteome. The resulting 645 proteins were further refined to identify the essential genes in the pathogenic bacterium using the Database of Essential Genes (DEG). We found 135 unique essential proteins in the target proteome that could be utilized as novel targets to design newer drugs. Further, we identified 49 cytoplasmic proteins as potential drug targets through sub-cellular localization prediction. Finally, we investigated these proteins in the DrugBank databases, and 11 of the unique essential proteins showed druggability according to the FDA-approved DrugBank databases, with diverse broad-spectrum properties. The results of this study will facilitate the discovery of new drugs against Klebsiella pneumoniae.
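The successive subtraction and selection steps described above amount to a chain of set operations. The sketch below assumes that each upstream analysis (paralog detection, host homology search, DEG lookup, localization prediction, DrugBank search) has already produced a set of protein identifiers, and that the steps are applied strictly in sequence; the function name, stage names, and identifiers are hypothetical.

```python
def subtractive_filter(proteins, paralogs, host_homologs, essential,
                       cytoplasmic, druggable):
    """Apply the subtraction and selection steps in order; return each stage."""
    stages = {}
    # Subtraction: drop paralogous and host-homologous sequences.
    current = set(proteins) - set(paralogs) - set(host_homologs)
    stages["non_paralogous_non_host"] = current
    # Selection: keep proteins flagged as essential.
    current = current & set(essential)
    stages["essential"] = current
    # Selection: keep proteins predicted to be cytoplasmic.
    current = current & set(cytoplasmic)
    stages["cytoplasmic"] = current
    # Selection: keep proteins with a match among known drug targets.
    current = current & set(druggable)
    stages["druggable"] = current
    return stages


# Toy identifiers standing in for hypothetical proteins.
stages = subtractive_filter(
    proteins={"hp1", "hp2", "hp3", "hp4", "hp5", "hp6"},
    paralogs={"hp2"},
    host_homologs={"hp3"},
    essential={"hp1", "hp3", "hp4", "hp5"},
    cytoplasmic={"hp1", "hp4"},
    druggable={"hp4", "hp6"},
)
sizes = {name: len(members) for name, members in stages.items()}
print(sizes)
```

Returning every intermediate stage makes it easy to report the shrinking candidate counts at each step, in the way the abstract reports 1099, 645, 135, 49, and 11.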

Keywords: pneumonia, drug target, hypothetical protein, subtractive genomics

Procedia PDF Downloads 168

79 Platform Integration for High-Throughput Functional Screening Applications

Authors: Karolis Leonavičius, Dalius Kučiauskas, Dangiras Lukošius, Arnoldas Jasiūnas, Kostas Zdanys, Rokas Stanislovas, Emilis Gegevičius, Žana Kapustina, Juozas Nainys

Abstract:

Screening throughput is a common bottleneck in many research areas, including functional genomics, drug discovery, and directed evolution. High-throughput screening techniques can be classified into two main categories: (i) affinity-based screening and (ii) functional screening. The first one relies on binding assays that provide information about the affinity of a test molecule for a target binding site. Binding assays are relatively easy to establish; however, they reveal no functional activity. In contrast, functional assays show an effect triggered by the interaction of a ligand at a target binding site. Functional assays might be based on a broad range of readouts, such as cell proliferation, reporter gene expression, downstream signaling, and other effects that are a consequence of ligand binding. Screening of large cell or gene libraries based on direct activity rather than binding affinity is now a preferred strategy in many areas of research as functional assays more closely resemble the context where entities of interest are anticipated to act. Droplet sorting is the basis of high-throughput functional biological screening, yet its applicability is limited due to the technical complexity of integrating high-performance droplet analysis and manipulation systems. As a solution, the Droplet Genomics Styx platform enables custom droplet sorting workflows, which are necessary for the development of early-stage or complex biological therapeutics or industrially important biocatalysts. The poster will focus on the technical design considerations of Styx in the context of its application spectra.

Keywords: functional screening, droplet microfluidics, droplet sorting, dielectrophoresis

Procedia PDF Downloads 125

78 The Development and Provision of a Knowledge Management Ecosystem, Optimized for Genomics

Authors: Matthew I. Bellgard

Abstract:

The field of bioinformatics has made, and continues to make, substantial progress and contributions to life science research and development. However, this paper contends that a systems approach integrates bioinformatics activities for any project in a defined manner. The application of critical control points in this bioinformatics systems approach may be useful to identify and evaluate points in a pathway where specified activity risk can be reduced, monitored and quality enhanced.

Keywords: bioinformatics, food security, personalized medicine, systems approach

Procedia PDF Downloads 418

77 The Various Legal Dimensions of Genomic Data

Authors: Amy Gooden

Abstract:

When human genomic data is considered, this is often done through only one dimension of the law, or the interplay between the various dimensions is not considered, thus providing an incomplete picture of the legal framework. This research considers and analyzes the various dimensions in South African law applicable to genomic sequence data – including property rights, personality rights, and intellectual property rights. The effective use of personal genomic sequence data requires the acknowledgement and harmonization of the rights applicable to such data.

Keywords: artificial intelligence, data, law, genomics, rights

Procedia PDF Downloads 135

76 Partial Least Square Regression for High-Dimensional and High-Correlated Data

Authors: Mohammed Abdullah Alshahrani

Abstract:

The research focuses on investigating the use of partial least squares (PLS) methodology for addressing challenges associated with high-dimensional correlated data. Recent technological advancements have led to experiments producing data characterized by a large number of variables compared to observations, with substantial inter-variable correlations. Such data patterns are common in chemometrics, where near-infrared (NIR) spectrometer calibrations record chemical absorbance levels across hundreds of wavelengths, and in genomics, where thousands of genomic regions' copy number alterations (CNA) are recorded from cancer patients. PLS serves as a widely used method for analyzing high-dimensional data, functioning as a regression tool in chemometrics and a classification method in genomics. It handles data complexity by creating latent variables (components) from original variables. However, applying PLS can present challenges. The study investigates key areas to address these challenges, including unifying interpretations across three main PLS algorithms and exploring unusual negative shrinkage factors encountered during model fitting. The research presents an alternative approach to addressing the interpretation challenge of predictor weights associated with PLS. Sparse estimation of predictor weights is employed using a penalty function combining a lasso penalty for sparsity and a Cauchy distribution-based penalty to account for variable dependencies. The results demonstrate sparse and grouped weight estimates, aiding interpretation and prediction tasks in genomic data analysis. High-dimensional data scenarios, where predictors outnumber observations, are common in regression analysis applications. Ordinary least squares regression (OLS), the standard method, performs inadequately with high-dimensional and highly correlated data. 
Copy number alterations (CNA) in key genes have been linked to disease phenotypes, highlighting the importance of accurate classification of gene expression data in bioinformatics and biology using regularized methods like PLS for regression and classification.
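The idea of building latent variables (components) from the original predictors can be illustrated with a minimal sketch. The code below is an illustrative single-response PLS (PLS1) fit by NIPALS-style deflation in plain Python; it is not the author's implementation, it does not include the sparse lasso/Cauchy penalty described in the abstract, and the function names `pls1_fit` and `pls1_predict` are hypothetical.

```python
import math

def pls1_fit(X, y, n_components):
    """Fit PLS1 by deflation. X is a list of rows, y a list of responses."""
    n, p = len(X), len(X[0])
    # Center the predictors and the response.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    f = [v - y_mean for v in y]
    components = []
    for _ in range(n_components):
        # Weight vector: direction in predictor space with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0:
            break  # nothing left to explain
        w = [v / norm for v in w]
        # Latent variable (score) for every observation.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        # Loadings of X and of y on this component.
        p_load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate: remove what this component already explains.
        E = [[E[i][j] - t[i] * p_load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p_load, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict the response for one new observation x."""
    x_mean, y_mean, components = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, p_load, q in components:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * p_load[j] for j in range(len(e))]
    return y_hat
```

In this sketch a data set with more predictors than observations (for example 5 perfectly correlated columns and 3 rows) can still be fitted with a single component, a setting where ordinary least squares has no unique solution.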

Keywords: partial least square regression, genetics data, negative filter factors, high dimensional data, high correlated data

Procedia PDF Downloads 44

75 Systematic Identification of Noncoding Cancer Driver Somatic Mutations

Authors: Zohar Manber, Ran Elkon

Abstract:

Accumulation of somatic mutations (SMs) in the genome is a major driving force of cancer development. Most SMs in the tumor's genome are functionally neutral; however, some cause damage to critical processes and provide the tumor with a selective growth advantage (termed cancer driver mutations). Current research on functional significance of SMs is mainly focused on finding alterations in protein coding sequences. However, the exome comprises only 3% of the human genome, and thus, SMs in the noncoding genome significantly outnumber those that map to protein-coding regions. Although our understanding of noncoding driver SMs is very rudimentary, it is likely that disruption of regulatory elements in the genome is an important, yet largely underexplored mechanism by which somatic mutations contribute to cancer development. The expression of most human genes is controlled by multiple enhancers, and therefore, it is conceivable that regulatory SMs are distributed across different enhancers of the same target gene. Yet, to date, most statistical searches for regulatory SMs have considered each regulatory element individually, which may reduce statistical power. The first challenge in considering the cumulative activity of all the enhancers of a gene as a single unit is to map enhancers to their target promoters. Such mapping defines for each gene its set of regulating enhancers (termed "set of regulatory elements" (SRE)). Considering multiple enhancers of each gene as one unit holds great promise for enhancing the identification of driver regulatory SMs. However, the success of this approach is greatly dependent on the availability of comprehensive and accurate enhancer-promoter (E-P) maps. To date, the discovery of driver regulatory SMs has been hindered by insufficient sample sizes and statistical analyses that often considered each regulatory element separately. 
In this study, we analyzed more than 2,500 whole-genome sequence (WGS) samples provided by The Cancer Genome Atlas (TCGA) and The International Cancer Genome Consortium (ICGC) in order to identify such driver regulatory SMs. Our analyses took into account the combinatorial aspect of gene regulation by considering all the enhancers that control the same target gene as one unit, based on E-P maps from three genomics resources. The identification of candidate driver noncoding SMs is based on their recurrence. We searched for SREs of genes that are "hotspots" for SMs (that is, they accumulate SMs at a significantly elevated rate). To test the statistical significance of recurrence of SMs within a gene's SRE, we used both global and local background mutation rates. Using this approach, we detected - in seven different cancer types - numerous "hotspots" for SMs. To support the functional significance of these recurrent noncoding SMs, we further examined their association with the expression level of their target gene (using gene expression data provided by the ICGC and TCGA for samples that were also analyzed by WGS).
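The recurrence test described above can be sketched as a count-based significance test. The abstract does not specify the statistical model, so the following is an assumption: mutation counts in a gene's SRE are treated as Poisson with an expectation derived from a background rate, and the global and local backgrounds are combined by taking the larger (more conservative) p-value. All function and parameter names are hypothetical.

```python
import math

def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    # Accumulate P(X <= k - 1) term by term, then take the complement.
    term = math.exp(-lam)
    cdf = term
    for i in range(1, k):
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)

def sre_hotspot_pvalue(observed, sre_length_bp, n_samples,
                       global_rate, local_rate):
    """P-value for observing at least `observed` SMs in one gene's SRE.

    Rates are mutations per base pair per sample (assumed units).
    The larger of the two p-values is returned, so a hotspot must be
    significant against both the global and the local background.
    """
    p_values = []
    for rate in (global_rate, local_rate):
        expected = sre_length_bp * n_samples * rate
        p_values.append(poisson_upper_tail(observed, expected))
    return max(p_values)
```

Because the tail is computed as one minus a cumulative sum, very small p-values lose precision; a production analysis would more likely rely on a statistics library and correct for testing many genes.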

Keywords: cancer genomics, enhancers, noncoding genome, regulatory elements

Procedia PDF Downloads 99

74 Predictive Pathogen Biology: Genome-Based Prediction of Pathogenic Potential and Countermeasures Targets

Authors: Debjit Ray

Abstract:

Horizontal gene transfer (HGT) and recombination lead to the emergence of bacterial antibiotic resistance and pathogenic traits. Comparing a large number of fully sequenced genomes across a species or genus can identify HGT events, define the phylogenetic range of HGT, and find potential sources of new resistance genes. In-depth comparative phylogenomics can also identify subtle genome or plasmid structural changes or mutations associated with phenotypic changes. Comparative phylogenomics requires accurately sequenced, complete, and properly annotated genomes of the organism. Assembling closed genomes requires additional mate-pair reads or “long read” sequencing data to accompany short-read paired-end data. To bring down the cost and time required to produce assembled genomes and to annotate genome features that inform drug resistance and pathogenicity, we are analyzing the genome assembly performance of data from the Illumina NextSeq, which has faster throughput than the Illumina HiSeq (~1-2 days versus ~1 week), and shorter reads (150 bp paired-end versus 300 bp paired-end) but higher capacity (150-400M reads per run versus ~5-15M) compared to the Illumina MiSeq. Bioinformatics improvements are also needed to make rapid, routine production of complete genomes a reality. Modern assemblers such as SPAdes 3.6.0 running on a standard Linux blade can, in a few hours, convert mixes of reads from different library preps into high-quality assemblies with only a few gaps. Remaining breaks in scaffolds, which are generally due to repeats (e.g., rRNA genes), are addressed by our gap closure software, which avoids custom PCR or targeted sequencing. Our goal is to improve the understanding of the emergence of pathogenesis using sequencing, comparative genomics, and machine learning analysis of ~1000 pathogen genomes. 
Machine learning algorithms will be used to digest the diverse features (change in virulence genes, recombination, horizontal gene transfer, patient diagnostics). Temporal data and evolutionary models can thus determine whether the origin of a particular isolate is likely to have been from the environment (could it have evolved from previous isolates). It can be useful for comparing differences in virulence along or across the tree. More intriguing, it can test whether there is a direction to virulence strength. This would open new avenues in the prediction of uncharacterized clinical bugs and multidrug resistance evolution and pathogen emergence.

Keywords: genomics, pathogens, genome assembly, superbugs

Procedia PDF Downloads 194

73 Crop Breeding for Low Input Farming Systems and Appropriate Breeding Strategies

Authors: Baye Berihun Getahun, Mulugeta Atnaf Tiruneh, Richard G. F. Visser

Abstract:

Resource-poor farmers practice low-input farming systems, and yet most breeding programs give little attention to this huge farming system, which serves as a source of food and income for many people in developing countries. The high-input conventional breeding system appears to have failed to adequately meet the needs and requirements of 'difficult' environments operating under this system. Moreover, the scarcity of resources for crop production is approaching its peak, the environment is being degraded by excessive use of agrochemicals, crop productivity has reached a plateau, particularly in developed nations, the world population is increasing, and food shortages persist in poor societies. In various parts of the world, genetic gain at the farmers' level remains low, which could be associated with low adoption of crop varieties that have been developed under high-input systems. Farmers usually use their local varieties and apply minimum inputs as a risk-avoiding and cost-minimizing strategy. This evidence indicates that the conventional high-input plant breeding system has failed to feed the world population, and the world is moving further away from the United Nations' goals of ending hunger, food insecurity, and malnutrition. In this review, we discuss the rationale for breeding programs focused on low-input farming systems and the technical aspects of crop breeding that accommodate future food needs, as well as their significance for developing countries as the resources required for crop production decline. To this end, the review discusses the application of exotic introgression techniques such as polyploidization, pan-genomics, comparative genomics, and de novo domestication as pre-breeding techniques to exploit the untapped genetic diversity of crop wild relatives (CWRs). 
Desired recombinants developed at the pre-breeding stage are exploited through appropriate breeding approaches such as evolutionary plant breeding (EPB), rhizosphere-related traits breeding, and participatory plant breeding. We also review populations advanced through evolutionary breeding, such as composite cross populations (CCPs), and the rhizosphere-associated traits breeding approach, which provides opportunities for improving tolerance to abiotic and biotic soil stress, nutrient acquisition capacity, and crop-microbe interaction in improved varieties. Overall, we conclude that the low-input farming system is a huge farming system that requires distinctive breeding approaches, and that exotic pre-breeding introgression techniques, together with appropriate breeding approaches that deploy the skills and knowledge of both breeders and farmers, are vital to develop heterogeneous landrace populations that are effective for farmers practicing low-input farming across the world.

Keywords: low input farming, evolutionary plant breeding, composite cross population, participatory plant breeding

Procedia PDF Downloads 35

72 PTFE Capillary-Based DNA Amplification within an Oscillatory Thermal Cycling Device

Authors: Jyh J. Chen, Fu H. Yang, Ming H. Liao

Abstract:

This study describes a capillary-based device integrated with heating and cooling modules for polymerase chain reaction (PCR). The device consists of a polytetrafluoroethylene (PTFE) reaction capillary and aluminum blocks, and is equipped with two cartridge heaters, a thermoelectric (TE) cooler, a fan, and several thermocouples for temperature control. The cartridge heaters are placed in the heating blocks and maintained at two different temperatures to achieve the denaturation and extension steps. Thermocouples inserted into the capillary are used to obtain the transient temperature profiles of the reaction sample during thermal cycles. A 483-bp DNA template is amplified successfully in both the designed system and a traditional thermal cycler. This work should be of interest to those working on high-temperature reactions and on genomics or cell analysis.

Keywords: polymerase chain reaction, thermal cycles, capillary, TE cooler

Procedia PDF Downloads 445

71 A New Approach for Improving Accuracy of Multi Label Stream Data

Authors: Kunal Shah, Swati Patel

Abstract:

Many real-world problems involve data that can be considered multi-label data streams. Efficient methods exist for multi-label classification in non-streaming scenarios. However, learning in evolving streaming scenarios is more challenging, as the learners must be able to adapt to change using limited time and memory. Classification is used to predict the class of an unseen instance as accurately as possible. Multi-label classification is a variant of single-label classification in which a set of labels is associated with a single instance. Multi-label classification is used by modern applications such as text classification, functional genomics, image classification, and music categorization. This paper introduces the task of multi-label classification, methods for multi-label classification, and evaluation measures for multi-label classification. A comparative analysis of multi-label classification methods was also carried out, first on the basis of theoretical study and then on the basis of simulation on various data sets.
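To make the multi-label setting concrete, the sketch below shows how a multi-label data set can be decomposed into one binary problem per label (the binary relevance idea named in the keywords) and how predictions can be scored with two commonly used measures, Hamming loss and subset accuracy. It is an illustration only, not the paper's method; the function names are hypothetical and label sets are represented as Python sets.

```python
def binary_relevance_targets(label_sets, labels):
    """Turn multi-label targets into one 0/1 target vector per label."""
    return {label: [1 if label in s else 0 for s in label_sets]
            for label in labels}

def hamming_loss(true_sets, pred_sets, labels):
    """Fraction of (instance, label) pairs that are predicted incorrectly."""
    # The symmetric difference holds labels that are missed or wrongly added.
    wrong = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return wrong / (len(true_sets) * len(labels))

def subset_accuracy(true_sets, pred_sets):
    """Fraction of instances whose full label set is predicted exactly."""
    exact = sum(1 for t, p in zip(true_sets, pred_sets) if t == p)
    return exact / len(true_sets)
```

Hamming loss gives partial credit for partly correct label sets, whereas subset accuracy counts an instance only when every label matches, so the two can rank the same classifiers differently.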

Keywords: binary relevance, concept drift, data stream mining, MLSC, multiple window with buffer

Procedia PDF Downloads 577

70 Tip60’s Novel RNA-Binding Function Modulates Alternative Splicing of Pre-mRNA Targets Implicated in Alzheimer’s Disease

Authors: Felice Elefant, Akanksha Bhatnaghar, Keegan Krick, Elizabeth Heller

Abstract:

Context: The severity of Alzheimer’s Disease (AD) progression involves an interplay of genetics, age, and environmental factors orchestrated by histone acetyltransferase (HAT) mediated neuroepigenetic mechanisms. While disruption of Tip60 HAT action in neural gene control is implicated in AD, alternative mechanisms underlying Tip60 function remain unexplored. Altered RNA splicing has recently been highlighted as a widespread hallmark in the AD transcriptome that is implicated in the disease. Research Aim: The aim of this study was to identify a novel RNA binding/splicing function for Tip60 in human hippocampus and to determine whether it is impaired in brains from AD fly models and AD patients. Methodology/Analysis: The authors performed RNA immunoprecipitation using RNA isolated from 200 pooled wild-type Drosophila brains for each of the 3 biological replicates. To identify Tip60’s RNA targets, they performed genome sequencing (DNB-Sequencing™ technology, BGI Genomics) on 3 replicates for Input RNA and RNA IPs by Tip60. Findings: The authors' transcriptomic analysis of RNA bound to Tip60 by Tip60-RNA immunoprecipitation (RIP) revealed Tip60 RNA targets enriched for critical neuronal processes implicated in AD. Remarkably, 79% of Tip60’s RNA targets overlap with its chromatin gene targets, supporting a model by which Tip60 orchestrates bi-level transcriptional regulation at both the chromatin and RNA level, a function unprecedented for any HAT to date. Since RNA splicing occurs co-transcriptionally and splicing defects are implicated in AD, the authors investigated whether Tip60-RNA targeting modulates splicing decisions and whether this function is altered in AD. Replicate multivariate analysis of transcript splicing (rMATS) of RNA-Seq data sets from wild-type and AD fly brains revealed a multitude of mammalian-like alternative splicing (AS) defects. 
Strikingly, over half of these altered RNAs were bona fide Tip60-RNA targets enriched in the AD-gene curated database, with some AS alterations prevented by increasing Tip60 in the fly brain. Importantly, human orthologs of several Tip60-modulated spliced genes in Drosophila are well-characterized aberrantly spliced genes in human AD brains, implicating disruption of Tip60’s splicing function in AD pathogenesis. Theoretical Importance: The authors' findings support a novel RNA interaction and splicing regulatory function for Tip60 that may underlie AS impairments that hallmark AD etiology. Data Collection: The authors collected data from RNA immunoprecipitation experiments using RNA isolated from 200 pooled wild-type Drosophila brains for each of the 3 biological replicates. They also performed genome sequencing (DNB-Sequencing™ technology, BGI Genomics) on 3 replicates for Input RNA and RNA IPs by Tip60. Questions: The question addressed by this study was whether Tip60 has a novel RNA binding/splicing function in human hippocampus and whether this function is impaired in brains from AD fly models and AD patients. Conclusions: The authors' findings support a novel RNA interaction and splicing regulatory function for Tip60 that may underlie AS impairments that hallmark AD etiology.

Keywords: Alzheimer's disease, cognition, aging, neuroepigenetics

Procedia PDF Downloads 68

69 Genodata: The Human Genome Variation Using BigData

Authors: Surabhi Maiti, Prajakta Tamhankar, Prachi Uttam Mehta

Abstract:

Since the completion of the Human Genome Project, there has been an unparalleled escalation in the sequencing of genomic data. This project was the first major leap in the field of medical research, especially in genomics. It won accolades by using a concept called Bigdata, which had earlier been used extensively to gain value for business. Bigdata makes use of data sets that are generally files of terabytes, petabytes, or exabytes in size; these data sets were traditionally used and managed using Excel sheets and RDBMS. The voluminous data made the process tedious and time-consuming, and hence a stronger framework called Hadoop was introduced in the field of genetic sciences to make data processing faster and more efficient. This paper focuses on using SPARK, which is gaining momentum with the advancement of BigData technologies. Cloud storage is an effective medium for storing the large data sets generated by genetic research and the result sets produced by SPARK analysis.

Keywords: human genome project, Bigdata, genomic data, SPARK, cloud storage, Hadoop

Procedia PDF Downloads 251

68 Single Cell Analysis of Circulating Monocytes in Prostate Cancer Patients

Authors: Leander Van Neste, Kirk Wojno

Abstract:

The innate immune system reacts to foreign insult in several unique ways, one of which is phagocytosis of perceived threats such as cancer, bacteria, and viruses. The goal of this study was to look for evidence of phagocytosed RNA from tumor cells in circulating monocytes. While all monocytes possess phagocytic capabilities, the non-classical CD14+/FCGR3A+ monocytes and the intermediate CD14++/FCGR3A+ monocytes most actively remove threatening ‘external’ cellular materials. Purified CD14-positive monocyte samples from fourteen patients recently diagnosed with clinically localized prostate cancer (PCa) were investigated by single-cell RNA sequencing using the 10X Genomics protocol followed by paired-end sequencing on Illumina’s NovaSeq. Control samples were processed in the same way, i.e., from one patient who underwent biopsy but was found not to harbor prostate cancer (benign), three young, healthy men, and three men previously diagnosed with prostate cancer who recently underwent (curative) radical prostatectomy (post-RP). Sequencing data were mapped using 10X Genomics’ Cell Ranger software, and viable cells were subsequently identified using CellBender, removing technical artifacts such as doublets and non-cellular RNA. Next, data analysis was performed in R, using the Seurat package. Because the main goal was to identify differences between PCa patients and ‘control’ patients, rather than exploring differences between individual subjects, the individual Seurat objects of all 21 patients were merged into one Seurat object per Seurat’s recommendation. Finally, the single-cell dataset was normalized as a whole prior to further analysis. Cell identity was assessed using the SingleR and celldex packages. The Monaco Immune Data was selected as the reference dataset, consisting of bulk RNA-seq data of sorted human immune cells. 
The Monaco classification was supplemented with normalized PCa data obtained from The Cancer Genome Atlas (TCGA), which consists of bulk RNA sequencing data from 499 prostate tumor tissues (including 1 metastatic) and 52 (adjacent) normal prostate tissues. SingleR was subsequently run on the combined immune cell and PCa datasets. As expected, the vast majority of cells were labeled as having a monocytic origin (~90%), with the most noticeable difference being the larger number of intermediate monocytes in the PCa patients (13.6% versus 7.1%; p<.001). In men harboring PCa, 0.60% of all purified monocytes were classified as harboring PCa signals when the TCGA data were included. This was 3-fold, 7.5-fold, and 4-fold higher compared to post-RP, benign, and young men, respectively (all p<.001). In addition, with 7.91%, the number of unclassified cells, i.e., cells with pruned labels due to high uncertainty of the assigned label, was also highest in men with PCa, compared to 3.51%, 2.67%, and 5.51% of cells in post-RP, benign, and young men, respectively (all p<.001). It can be postulated that actively phagocytosing cells are hardest to classify due to their dual immune cell and foreign cell nature. Hence, the higher number of unclassified cells and intermediate monocytes in PCa patients might reflect higher phagocytic activity due to tumor burden. This also illustrates that small numbers (~1%) of circulating peripheral blood monocytes that have interacted with tumor cells might still possess detectable phagocytosed tumor RNA.

Keywords: circulating monocytes, phagocytic cells, prostate cancer, tumor immune response

Procedia PDF Downloads 157

67 SPARK: An Open-Source Knowledge Discovery Platform That Leverages Non-Relational Databases and Massively Parallel Computational Power for Heterogeneous Genomic Datasets

Authors: Thilina Ranaweera, Enes Makalic, John L. Hopper, Adrian Bickerstaffe

Abstract:

Data are the primary asset of biomedical researchers, and the engine for both discovery and research translation. As the volume and complexity of research datasets increase, especially with new technologies such as large single nucleotide polymorphism (SNP) chips, so too does the requirement for software to manage, process and analyze the data. Researchers often need to execute complicated queries and conduct complex analyses of large-scale datasets. Existing tools to analyze such data, and other types of high-dimensional data, unfortunately suffer from one or more major problems. They typically require a high level of computing expertise, are too simplistic (i.e., do not fit realistic models that allow for complex interactions), are limited by computing power, do not exploit the computing power of large-scale parallel architectures (e.g. supercomputers, GPU clusters etc.), or are limited in the types of analysis available, compounded by the fact that integrating new analysis methods is not straightforward. Solutions to these problems, such as those developed and implemented on parallel architectures, are currently available to only a relatively small portion of medical researchers with access and know-how. The past decade has seen a rapid expansion of data management systems for the medical domain. Much attention has been given to systems that manage phenotype datasets generated by medical studies. The introduction of heterogeneous genomic data for research subjects that reside in these systems has highlighted the need for substantial improvements in software architecture. To address this problem, we have developed SPARK, an enabling and translational system for medical research, leveraging existing high performance computing resources, and analysis techniques currently available or being developed. It builds these into The Ark, an open-source web-based system designed to manage medical data. 
SPARK provides a next-generation biomedical data management solution that is based upon a novel Micro-Service architecture and Big Data technologies. The system serves to demonstrate the applicability of Micro-Service architectures for the development of high performance computing applications. When applied to high-dimensional medical datasets such as genomic data, relational data management approaches with normalized data structures suffer from unfeasibly high execution times for basic operations such as insert (i.e. importing a GWAS dataset) and the queries that are typical of the genomics research domain. SPARK resolves these problems by incorporating non-relational NoSQL databases that have been driven by the emergence of Big Data. SPARK provides researchers across the world with user-friendly access to state-of-the-art data management and analysis tools while eliminating the need for high-level informatics and programming skills. The system will benefit health and medical research by eliminating the burden of large-scale data management, querying, cleaning, and analysis. SPARK represents a major advancement in genome research technologies, vastly reducing the burden of working with genomic datasets, and enabling cutting edge analysis approaches that have previously been out of reach for many medical researchers.

Keywords: biomedical research, genomics, information systems, software

Procedia PDF Downloads 263

66 Complete Genome Sequence Analysis of Pasteurella multocida Subspecies multocida Serotype A Strain PMTB2.1

Authors: Shagufta Jabeen, Faez J. Firdaus Abdullah, Zunita Zakaria, Nurulfiza M. Isa, Yung C. Tan, Wai Y. Yee, Abdul R. Omar

Abstract:

Pasteurella multocida (PM) is an important opportunistic veterinary pathogen particularly associated with septicemic pasteurellosis, pneumonic pasteurellosis, and hemorrhagic septicemia in cattle and buffaloes. P. multocida serotype A has been reported to cause fatal pneumonia and septicemia. The Malaysian isolate PMTB2.1 of Pasteurella multocida subspecies multocida serotype A was first isolated from buffaloes that died of septicemia. In this study, the genome of P. multocida strain PMTB2.1 was sequenced using third-generation sequencing technology (PacBio RS2 system) and analyzed bioinformatically via de novo analysis followed by in-depth analysis based on comparative genomics. De novo assembly of PacBio raw reads generated 3 contigs; gap filling of the aligned contigs with PCR sequencing then generated a single contiguous circular chromosome with a genomic size of 2,315,138 bp and a GC content of approximately 40.32% (Accession number CP007205). The PMTB2.1 genome comprises 2,176 protein-coding sequences, 6 rRNA operons, 56 tRNAs, and 4 ncRNA sequences. The comparative genome sequence analysis of nine complete genomes, comprising Actinobacillus pleuropneumoniae, Haemophilus parasuis, Escherichia coli, five other P. multocida genomes (PM70, PM36950, PMHN06, PM3480, and PMHB01), and PMTB2.1, was carried out based on OrthoMCL analysis and a Venn diagram. The analysis showed that 282 CDSs (13%) are unique to PMTB2.1 and 1,125 CDSs have orthologs in all genomes. This reflects the overall close relationship of these bacteria and supports their classification in the Gamma subdivision of the Proteobacteria. In addition, genomic distance analysis among all nine genomes indicated that PMTB2.1 is closely related to the other five P. multocida strains, with a genomic distance of less than 0.13. 
Synteny analysis shows subtle differences in genetic structure among different P. multocida strains, indicating the dynamics of frequent gene transfer events among them. However, PM3480 and PM70 exhibited exceptionally large structural variation since they were swine and chicken isolates. Furthermore, the genomic structure of PMTB2.1 most closely resembles that of PM36950, with a genomic size difference of approximately 34,380 bp (PMTB2.1 being smaller), and the strain-specific Integrative and Conjugative Element (ICE) found only in PM36950 is absent from PMTB2.1. Meanwhile, two intact prophage sequences of approximately 62 kb were found only in PMTB2.1. One of the phages is similar to the transposable phage SfMu. The phylogenomic tree was constructed based on OrthoMCL analysis and rooted with E. coli, A. pleuropneumoniae, and H. parasuis. The genome of P. multocida strain PMTB2.1 clustered with the bovine isolates PM36950 and PMHB01, was separated from the avian isolate PM70 and the swine isolates PM3480 and PMHN06, and was distant from Actinobacillus and Haemophilus. Previous studies based on single nucleotide polymorphisms (SNPs) and multilocus sequence typing (MLST) were unable to show a clear phylogenetic relatedness between Pasteurella multocida and its different hosts. In conclusion, this study has provided insight into the genomic structure of PMTB2.1 in terms of potential genes that can function as virulence factors, for future studies elucidating the mechanisms by which the bacteria cause disease in susceptible animals.
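A summary statistic such as the GC content reported above is simple to compute from an assembled sequence. The following is a minimal sketch, assuming the chromosome is available as a plain nucleotide string; the function name `gc_content` is hypothetical, and ambiguous bases such as N are left out of the denominator by choice.

```python
def gc_content(sequence):
    """Return the GC content of a nucleotide sequence as a percentage.

    Only A, C, G and T are counted; ambiguous bases (e.g. N) are ignored.
    """
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted
```

For a real genome the sequence would typically be read from a FASTA file, with the header line skipped and the remaining lines joined before calling the function.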

Keywords: comparative genomics, DNA sequencing, phage, phylogenomics

Procedia PDF Downloads 182

65 Fuzzy Data, Random Drift, and a Theoretical Model for the Sequential Emergence of Religious Capacity in Genus Homo

Authors: Margaret Boone Rappaport, Christopher J. Corbally

Abstract:

The ancient ape ancestral population from which living great ape and human species evolved had demographic features affecting their evolution. The population was large, had great genetic variability, and natural selection was effective at honing adaptations. The emerging populations of chimpanzees and humans were affected more by founder effects and genetic drift because they were smaller. Natural selection did not disappear, but it was not as strong. Consequences of the 'population crash' and the human effective population size are introduced briefly. The history of the ancient apes is written in the genomes of living humans and great apes. The expansion of the brain began before the human line emerged. Coalescence times for some genes are very old – up to several million years, long before Homo sapiens. The mismatch between gene trees and species trees highlights the anthropoid speciation processes, and gives the human genome history a fuzzy, probabilistic quality. However, it suggests traits that might form a foundation for capacities emerging later. A theoretical model is presented in which the genomes of early ape populations provide the substructure for the emergence of religious capacity later on the human line. The model does not search for religion, but its foundations. It suggests a course by which an evolutionary line that began with prosimians eventually produced a human species with biologically based religious capacity. The model of the sequential emergence of religious capacity relies on cognitive science, neuroscience, paleoneurology, primate field studies, cognitive archaeology, genomics, and population genetics. And, it emphasizes five trait types: (1) Documented, positive selection of sensory capabilities on the human line may have favored survival, but also eventually enriched human religious experience. 
(2) The bonobo model suggests a possible down-regulation of aggression and increase in tolerance while feeding, as well as paedomorphism – but, in a human species that remains cognitively sharp (unlike the bonobo). The two species emerged from the same ancient ape population, so it is logical to search for shared traits. (3) An up-regulation of emotional sensitivity and compassion seems to have occurred on the human line. This finds support in modern genetic studies. (4) The authors’ published model of morality's emergence in Homo erectus encompasses a cognitively based, decision-making capacity that was hypothetically overtaken, in part, by religious capacity. Together, they produced a strong, variable, biocultural capability to support human sociability. (5) The full flowering of human religious capacity came with the parietal expansion and smaller face (klinorhynchy) found only in Homo sapiens. Details from paleoneurology suggest the stage was set for human theologies. Larger parietal lobes allowed humans to imagine inner spaces, processes, and beings, and, with the frontal lobe, led to the first theologies composed of structured and integrated theories of the relationships between humans and the supernatural. The model leads to the evolution of a small population of African hominins that was ready to emerge with religious capacity when the species Homo sapiens evolved two hundred thousand years ago. By 50-60,000 years ago, when human ancestors left Africa, they were fully enabled.

Keywords: genetic drift, genomics, parietal expansion, religious capacity

Procedia PDF Downloads 333

64 Proposing an Architecture for Drug Response Prediction by Integrating Multiomics Data and Utilizing Graph Transformers

Authors: Nishank Raisinghani

Abstract:

Efficiently predicting drug response remains a challenge in the realm of drug discovery. To address this issue, we propose four model architectures that combine graphical representation with varying positions of multiheaded self-attention mechanisms. By leveraging two types of multi-omics data, transcriptomics and genomics, we create a comprehensive representation of target cells and enable drug response prediction in precision medicine. A majority of our architectures utilize multiple transformer models, one with a graph attention mechanism and the other with a multiheaded self-attention mechanism, to generate latent representations of both drug and omics data, respectively. Our model architectures apply an attention mechanism to both drug and multiomics data, with the goal of procuring more comprehensive latent representations. The latent representations are then concatenated and input into a fully connected network to predict the IC-50 score, a measure of cell drug response. We experiment with all four of these architectures and extract results from all of them. Our study greatly contributes to the future of drug discovery and precision medicine by looking to optimize the time and accuracy of drug response prediction.

Keywords: drug discovery, transformers, graph neural networks, multiomics

Procedia PDF Downloads 141

63 Development of DNA Fingerprints in Selected Medicinal Plants of India

Authors: V. Verma, Hazi Raja

Abstract:

Conventionally, morphological descriptors are routinely used for establishing the identity of varieties. But these morphological descriptors suffer from many drawbacks, such as the influence of environment on trait expression, epistatic interactions, and pleiotropic effects. Furthermore, the paucity of these descriptors for unequivocal identification of the increasing number of reference collection varieties makes it necessary to look for alternatives. Therefore, DNA fingerprint-based techniques were selected to define the systematic position of selected medicinal plants such as Plumbago zeylanica, Desmodium gangeticum, and Uraria picta. DNA fingerprinting of herbal plants can be useful in authenticating the various claims of medical uses related to the plants, and in germplasm characterization and conservation. In plants, it has not only helped in identifying species but also in defining a new realm in plant genomics, plant breeding, and biodiversity conservation. With the world paving the way for developments in biotechnology, DNA fingerprinting promises to be a very powerful tool in our future endeavors. Data will be presented on the development of microsatellite (SSR) markers used to fingerprint, characterize, and assess genetic diversity among 12 accessions of Plumbago zeylanica, 4 accessions of Desmodium gangeticum, and 4 accessions of Uraria picta.

Keywords: Plumbago zeylanica, Desmodium gangeticum, Uraria picta, microsatellite markers

Procedia PDF Downloads 210