Search results for: genome wide association studies
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 15839

15779 A Systematic Review for the Association between Active Smoking and Latent Tuberculosis Infection

Authors: Pui Hong Chung, Wing Chi Ho, Jun Li, Cyrus Leung, Ek Yeoh

Abstract:

Background: Cigarette smoking is associated with poor tuberculosis (TB) outcomes in terms of progression of active TB, relapse of TB and TB-related mortality, but the association with latent tuberculosis infection (LTBI) is unclear. This systematic review aimed to study the association between active smoking and LTBI, and the likelihood of a dose-response relationship. Methods: Two independent reviewers searched three electronic databases, PubMed, Medline by EBSCOhost, and the Excerpta Medica Database (EMBASE), from inception up to 31st Dec 2015 for studies reporting data on current smoking and LTBI with tuberculin skin test (TST) or interferon-γ release assay (IGRA) results, comparing the odds ratios (ORs) of the TST or IGRA outcome measure among current smokers with 95% confidence intervals (CI). Results: Seven studies were identified, including six cross-sectional studies and one longitudinal cohort study. The outcome measures were TST in three studies, IGRAs in three studies, and both tests in one study. For TST, ORs ranged from 1.39 to 3.40 (95% CI), with all studies showing a positive association between cigarette smoking and LTBI. For IGRAs, ORs ranged from 0.47 to 1.89 (95% CI), with one study showing a negative association that might be related to impaired interferon-gamma production in immunosuppressed persons. One identified study demonstrated a positive dose-response relationship in TST results. Conclusions: Cigarette smoking is likely to be a risk factor for LTBI. This has important implications for TB and tobacco control programs seeking to halt TB by empowering public health policy. Further study is also needed to provide more evidence of the dose-response model/relationship.
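The odds ratios with 95% confidence intervals compared in this review can be illustrated with a minimal sketch. It assumes a single 2x2 table and the standard Wald interval on the log odds ratio; the function name and the counts are hypothetical and are not taken from any of the reviewed studies.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: smokers testing positive,     b: smokers testing negative,
    c: non-smokers testing positive, d: non-smokers testing negative.
    z = 1.96 gives an approximate 95% interval.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only.
print(odds_ratio_ci(40, 60, 25, 75))
```

An interval that excludes 1 would indicate an association at the 5% level for that single table; pooling across studies would need a separate meta-analytic step.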

Keywords: latent tuberculosis infection, systematic review, active smoking, model

Procedia PDF Downloads 256
15778 Genome Characterization and Phylogeny Analysis of Viruses Infected Invertebrates, Parvoviridae Family

Authors: Niloofar Fariborzi, Hamzeh Alipour, Kourosh Azizi, Neda Eskandarzade, Abozar Ghorbani

Abstract:

The family Parvoviridae consists of a large diversity of single-stranded DNA viruses, which cause mild to severe diseases in both vertebrates and invertebrates. The Parvoviridae are classified into three subfamilies: Parvovirinae infects vertebrates, Densovirinae infects invertebrates, and Hamaparvovirinae infects both vertebrates and invertebrates. Except for the NS1 region, which is the prime criterion for phylogeny analysis, other parts of the parvovirus genome, such as UTRs, are diverse even among closely related viruses or within the same genus. It is believed that host switching in parvoviruses may be related to genetic changes in regions other than NS1; therefore, whole-genome screening is valuable for studying parvoviruses' host-virus interactions. The aim of this study was to analyze the genome organization and phylogeny of the complete genome sequences of 132 Parvoviridae family members, focusing on viruses that infect invertebrates. The maximum and minimum divergence within a subfamily belonged to Densovirinae and Parvovirinae, respectively. The greatest evolutionary divergence was between Hamaparvovirinae and Parvovirinae. Unclassified viruses were mostly from Parvovirinae and had the highest divergence from densoviruses and the lowest divergence from Parvovirinae viruses. In a phylogenetic tree, all hamaparvoviruses were found in the center of densoviruses, with the exception of Syngnathid Ichthamaparvovirus 1 (NC_055527), which was positioned between two Parvovirinae members (NC_022089 and NC_038544). The proximity of hamaparvovirus members to some densoviruses strengthens the possibility that densoviruses may be the ancestors of hamaparvoviruses or vice versa. Therefore, examination and phylogeny analysis of the whole genome is necessary to understand Parvoviridae family host selection.

Keywords: densoviruses, parvoviridae, bioinformatics, phylogeny

Procedia PDF Downloads 93
15777 Genetic Diversity and Discovery of Unique SNPs in Five Country Cultivars of Sesamum indicum by Next-Generation Sequencing

Authors: Nam-Kuk Kim, Jin Kim, Soomin Park, Changhee Lee, Mijin Chu, Seong-Hun Lee

Abstract:

In this study, we conducted whole genome re-sequencing of 10 cultivars originating from five countries (Korea, China, India, Pakistan and Ethiopia), with the Sesamum indicum (Zhongzhi No. 13) genome as a reference. Almost 80% of the reference genome sequence could be covered by sequenced reads. Numerous SNPs and InDels were detected by bioinformatic analysis. Among these variants, 266,051 SNPs were identified as unique to countries. Pakistan and Ethiopia had high densities of SNPs compared to other countries. Three main clusters (cluster 1: Korea, cluster 2: Pakistan and India, cluster 3: Ethiopia and China) were recovered by neighbor-joining analysis using all variants. Interestingly, some variants were detected in the DGAT1 (diacylglycerol O-acyltransferase 1) and FADS (fatty acid desaturase) genes, which are known to be related to fatty acid synthesis and metabolism. These results can provide useful information for understanding regional characteristics and developing DNA markers for origin discrimination of sesame.

Keywords: Sesamum indicum, NGS, SNP, DNA marker

Procedia PDF Downloads 327
15776 Genomics of Aquatic Adaptation

Authors: Agostinho Antunes

Abstract:

The completion of the human genome sequencing in 2003 opened a new perspective into the importance of whole genome sequencing projects, and currently multiple species are having their genomes completely sequenced, from simple organisms, such as bacteria, to more complex taxa, such as mammals. This voluminous sequencing data generated across multiple organisms also provides the framework to better understand the genetic makeup of such species and related ones, allowing exploration of the genetic changes underlying the evolution of diverse phenotypic traits. Here, recent results from our group, retrieved from comparative evolutionary genomic analyses of selected marine animal species, will be considered to exemplify how gene novelty and gene enhancement by positive selection might have been determinant in the success of adaptive radiations into diverse habitats and lifestyles.

Keywords: comparative genomics, adaptive evolution, bioinformatics, phylogenetics, genome mining

Procedia PDF Downloads 533
15775 Bayesian Meta-Analysis to Account for Heterogeneity in Studies Relating Life Events to Disease

Authors: Elizabeth Stojanovski

Abstract:

Associations between life events and various forms of cancers have been identified. The purpose of a recent random-effects meta-analysis was to identify studies that examined the association between adverse events associated with changes to financial status including decreased income and breast cancer risk. The same association was studied in four separate studies which displayed traits that were not consistent between studies such as the study design, location and time frame. It was of interest to pool information from various studies to help identify characteristics that differentiated study results. Two random-effects Bayesian meta-analysis models are proposed to combine the reported estimates of the described studies. The proposed models allow major sources of variation to be taken into account, including study level characteristics, between study variance, and within study variance and illustrate the ease with which uncertainty can be incorporated using a hierarchical Bayesian modelling approach.
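As a point of comparison for the random-effects pooling described above, the following is a minimal sketch of the classical (frequentist) DerSimonian-Laird estimator. It is not the hierarchical Bayesian model proposed in the abstract; it only illustrates how within-study variance and between-study variance (tau²) enter the pooled estimate. The function name and the inputs are hypothetical.

```python
import math


def random_effects_pool(estimates, variances):
    """DerSimonian-Laird random-effects pooled estimate.

    estimates: per-study effect estimates (for example, log relative risks).
    variances: per-study within-study variances.
    Returns (pooled estimate, its standard error, between-study variance tau2).
    """
    k = len(estimates)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    # Fixed-effect estimate, needed for Cochran's Q.
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Method-of-moments estimate of between-study variance, truncated at 0.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Hypothetical log relative risks and variances for four studies.
print(random_effects_pool([0.10, 0.45, 0.30, 0.80], [0.04, 0.09, 0.05, 0.12]))
```

A Bayesian version would place priors on the pooled effect and on tau², and could add study-level covariates such as design or location, which this sketch leaves out.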

Keywords: random-effects, meta-analysis, Bayesian, variation

Procedia PDF Downloads 160
15774 Evaluation of Adaptive Fitness of Indian Teak (Tectona grandis L. F.) Metapopulation through Inter Simple Sequence Repeat Markers

Authors: Vivek Vaishnav, Shamim Akhtar Ansari

Abstract:

Teak (Tectona grandis L.f.), which belongs to the plant family Lamiaceae and is the most commercialized timber species, is endemic to South Asia. The adaptive fitness of the species metapopulation was evaluated through its genetic differentiation and by assessing the influence of geo-climatic conditions. A total of 290 genotypes were sampled from 29 locations across its natural distribution, and the genetic data were combined with geo-climatic parameters. Through Bayesian analysis of 43 highly polymorphic ISSR markers, six homogeneous clusters (0.8% genetic variability) were identified. The six clusters were associated with different temperature-range regimes: I, 9.10±1.35°C; II, 6.35±0.21°C; III, 12.21±0.43°C; IV, 10.8±1.06°C; V, 11.67±3.04°C; and VI, 12.35±0.21°C. The population had a very high percentage of LD (21.48%) among the amplified loci, possibly due to restricted gene flow as well as co-adaptation and association of distant/diverse loci/alleles as a result of the stabilized climatic conditions and countless cycles of historical recombination events on a large geological timescale. The same possibly accounts for the narrow distribution of teak as a climax species in the tropical deciduous forests of the country. The regions of strong LD in the teak genome that are significantly associated with climatic parameters also suggest that the species tolerates a wide range of temperature regimes and may withstand global warming and climate change in the coming millennium.

Keywords: Bayesian analysis, inter simple sequence repeat, linkage disequilibrium, marker-geoclimatic association

Procedia PDF Downloads 263
15773 Association of Genetically Proxied Cholesterol-Lowering Drug Targets and Head and Neck Cancer Survival: A Mendelian Randomization Analysis

Authors: Danni Cheng

Abstract:

Background: Preclinical and epidemiological studies have reported potential protective effects of low-density lipoprotein cholesterol (LDL-C) lowering drugs on head and neck squamous cell cancer (HNSCC) survival, but the evidence for causality has not been consistent. Genetic variants associated with LDL-C lowering drug targets can predict the effects of their therapeutic inhibition on disease outcomes. Objective: We aimed to evaluate the causal association of genetically proxied cholesterol-lowering drug targets and circulating lipid traits with cancer survival in HNSCC patients stratified by human papillomavirus (HPV) status using two-sample Mendelian randomization (MR) analyses. Method: Single-nucleotide polymorphisms (SNPs) in the gene regions of LDL-C lowering drug targets (HMGCR, NPC1L1, CETP, PCSK9, and LDLR) associated with LDL-C levels in a genome-wide association study (GWAS) from the Global Lipids Genetics Consortium (GLGC) were used to proxy LDL-C lowering drug action. SNPs proxying circulating lipids (LDL-C, HDL-C, total cholesterol, triglycerides, apoprotein A and apoprotein B) were also derived from the GLGC data. Genetic associations of these SNPs with cancer survival were derived from 1,120 HPV-positive oropharyngeal squamous cell carcinoma (OPSCC) and 2,570 non-HPV-driven HNSCC patients in the VOYAGER program. We estimated the causal associations of LDL-C lowering drugs and circulating lipids with HNSCC survival using the inverse-variance weighted method. Results: Genetically proxied HMGCR inhibition was significantly associated with worse overall survival (OS) in non-HPV-driven HNSCC patients (inverse variance-weighted hazard ratio (HR IVW), 2.64 [95% CI, 1.28-5.43]; P = 0.01) but better OS in HPV-positive OPSCC patients (HR IVW, 0.11 [95% CI, 0.02-0.56]; P = 0.01). Estimates for NPC1L1 were strongly associated with worse OS in both total HNSCC (HR IVW, 4.17 [95% CI, 1.06-16.36]; P = 0.04) and non-HPV-driven HNSCC patients (HR IVW, 7.33 [95% CI, 1.63-32.97]; P = 0.01). Similarly, genetically proxied PCSK9 inhibitors were significantly associated with poor OS in non-HPV-driven HNSCC (HR IVW, 1.56 [95% CI, 1.02-2.39]). Conclusion: Genetically proxied long-term HMGCR inhibition was significantly associated with decreased OS in non-HPV-driven HNSCC and increased OS in HPV-positive OPSCC, while genetically proxied NPC1L1 and PCSK9 were associated with worse OS in total and non-HPV-driven HNSCC patients. Further research is needed to understand whether these drugs have consistent associations with head and neck tumor outcomes.
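The inverse-variance weighted (IVW) method named in the abstract can be sketched as follows. This is a minimal fixed-effect version that assumes uncorrelated SNPs and summary statistics only; the function name and the numbers are hypothetical, and the actual analysis may have used additional adjustments.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW Mendelian randomization estimate.

    beta_exposure: SNP effects on the exposure (for example, LDL-C).
    beta_outcome:  SNP effects on the outcome (for example, log hazard ratios).
    se_outcome:    standard errors of the outcome effects.
    Returns the causal estimate and its standard error on the outcome scale.
    """
    rows = list(zip(beta_exposure, beta_outcome, se_outcome))
    numerator = sum(bx * by / se ** 2 for bx, by, se in rows)
    denominator = sum(bx ** 2 / se ** 2 for bx, by, se in rows)
    beta = numerator / denominator
    se = math.sqrt(1.0 / denominator)
    return beta, se


# Hypothetical summary statistics for three SNPs.
beta, se = ivw_estimate([0.10, 0.08, 0.12], [0.09, 0.10, 0.11], [0.05, 0.06, 0.04])
hazard_ratio = math.exp(beta)
ci = (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se))
print(hazard_ratio, ci)
```

When the outcome effects are log hazard ratios, exponentiating the estimate and its interval gives a hazard ratio per unit change in the exposure, which is the form reported in the results.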

Keywords: Mendelian randomization analysis, head and neck cancer, cancer survival, cholesterol, statin

Procedia PDF Downloads 100
15772 Insights into the Annotated Genome Sequence of Defluviitoga tunisiensis L3 Isolated from a Thermophilic Rural Biogas Producing Plant

Authors: Irena Maus, Katharina Gabriella Cibis, Andreas Bremges, Yvonne Stolze, Geizecler Tomazetto, Daniel Wibberg, Helmut König, Alfred Pühler, Andreas Schlüter

Abstract:

Within the agricultural sector, the production of biogas from organic substrates represents an economically attractive technology to generate bioenergy. Complex consortia of microorganisms are responsible for biomass decomposition and biogas production. Recently, species belonging to the phylum Thermotogae were detected in thermophilic biogas-production plants utilizing renewable primary products for biomethanation. To analyze adaptive genome features of representative Thermotogae strains, Defluviitoga tunisiensis L3 was isolated from a rural thermophilic biogas plant (54°C) and completely sequenced on an Illumina MiSeq system. Sequencing and assembly of the D. tunisiensis L3 genome yielded a circular chromosome with a size of 2,053,097 bp and a mean GC content of 31.38%. Functional annotation of the complete genome sequence revealed that the thermophilic strain L3 encodes several genes predicted to facilitate growth of this microorganism on arabinose, galactose, maltose, mannose, fructose, raffinose, ribose, cellobiose, lactose, xylose, xylan, lactate and mannitol. Acetate, hydrogen (H2) and carbon dioxide (CO2) are predicted to be end products of the fermentation process. These end products are substrates for methanogenic archaea, the key players in the final step of the anaerobic digestion process. To determine the degree of relatedness of dominant biogas community members within selected digester systems to D. tunisiensis L3, metagenome sequences from corresponding communities were mapped on the L3 genome. These fragment recruitments revealed that metagenome reads originating from a thermophilic biogas plant covered 95% of the D. tunisiensis L3 genome sequence. In conclusion, availability of the D. tunisiensis L3 genome sequence and insights into its metabolic capabilities provide the basis for biotechnological exploitation of genome features involved in thermophilic fermentation processes utilizing renewable primary products.

Keywords: genome sequence, thermophilic biogas plant, Thermotogae, Defluviitoga tunisiensis

Procedia PDF Downloads 499
15771 Genomics of Adaptation in the Sea

Authors: Agostinho Antunes

Abstract:

The completion of the human genome sequencing in 2003 opened a new perspective into the importance of whole genome sequencing projects, and currently multiple species are having their genomes completely sequenced, from simple organisms, such as bacteria, to more complex taxa, such as mammals. This voluminous sequencing data generated across multiple organisms also provides the framework to better understand the genetic makeup of such species and related ones, allowing exploration of the genetic changes underlying the evolution of diverse phenotypic traits. Here, recent results from our group, retrieved from comparative evolutionary genomic analyses of selected marine animal species, will be considered to exemplify how gene novelty and gene enhancement by positive selection might have been determinant in the success of adaptive radiations into diverse habitats and lifestyles.

Keywords: marine genomics, evolutionary bioinformatics, human genome sequencing, genomic analyses

Procedia PDF Downloads 611
15770 Elucidating the Genetic Determinism of Seed Protein Plasticity in Response to the Environment Using Medicago truncatula

Authors: K. Cartelier, D. Aime, V. Vernoud, J. Buitink, J. M. Prosperi, K. Gallardo, C. Le Signor

Abstract:

Legumes can produce protein-rich seeds without nitrogen fertilizer through root symbiosis with nitrogen-fixing rhizobia. Rich in lysine, these proteins are used for human nutrition and animal feed. However, the instability of seed protein yield and quality due to environmental fluctuations limits the wider use of legumes such as pea. Breeding efforts are needed to optimize and stabilize seed nutritional value, which requires identifying the genetic determinism of seed protein plasticity in response to the environment. Towards this goal, we have studied the plasticity of protein content and composition of seeds from a collection of 200 Medicago truncatula ecotypes grown under four controlled conditions (optimal, drought, and winter/spring sowing). A quantitative analysis of one-dimensional protein profiles of these mature seeds was performed, and plasticity indices were calculated for each abundant protein band. Genome-Wide Association Studies (GWAS) on these data identified major GWAS hotspots, from which a list of candidate genes was obtained. A Gene Ontology Enrichment Analysis revealed an over-representation of genes involved in several amino acid metabolic pathways. This led us to propose that environmental variations are likely to modulate amino acid balance, thus impacting seed protein composition. The selection of candidate genes for controlling the plasticity of seed protein composition was refined using transcriptomics data from developing Medicago truncatula seeds. The pea orthologs of key genes were identified for functional studies by means of TILLING (Targeting Induced Local Lesions in Genomes) lines in this crop. We will present how this study highlighted mechanisms that could govern seed protein plasticity, providing new cues towards the stabilization of legume seed quality.

Keywords: GWAS, Medicago truncatula, plasticity, seed, storage proteins

Procedia PDF Downloads 142
15769 Proteome-Wide Convergent Evolution on Vocal Learning Birds Reveals Insight into cAMP-Based Learning Pathway

Authors: Chul Lee, Seoae Cho, Erich D. Jarvis, Heebal Kim

Abstract:

Vocal learning, the ability to imitate vocalizations based on auditory experience, is a homoplastic character state observed in different independent lineages of animals such as songbirds, parrots, hummingbirds and humans. With the recent expansion of avian genome data, it has become possible to perform genome-wide molecular analyses across vocal learners and vocal non-learners. We analyzed the whole genomes of human and 48 avian species, including those belonging to the three avian vocal learning lineages, to determine whether behavioral and neural convergence are associated with molecular convergence in divergent species of vocal learners. Analyses of 8295 orthologous genes across bird species revealed 141 genes with amino acid substitutions specific to vocal learners. Of these, 25 genes have vocal learner-specific genetic homoplasies, and their functions were enriched for learning. Several sites in these genes are estimated to be under convergent evolution and positive selection. A potential role for a subset of these genes in vocal learning was supported by associations with gene expression profiles in vocal learning brain regions of songbirds and with human diseases that cause language dysfunctions. The key candidate gene with multiple independent lines of evidence specific to vocal learners was DRD5. Our findings suggest a cAMP-based learning pathway in avian vocal learners, indicating molecular homoplastic changes associated with a complex behavioral trait, vocal learning.

Keywords: amino acid substitutions, convergent evolution, positive selection, vocal learning

Procedia PDF Downloads 341
15768 Investigation of the IL23R Psoriasis/PsA Susceptibility Locus

Authors: Shraddha Rane, Richard Warren, Stephen Eyre

Abstract:

IL-23 is a pro-inflammatory molecule that signals T cells to release cytokines such as IL-17A and IL-22. Psoriasis is driven by a dysregulated immune response, within which IL-23 is now thought to play a key role. Genome-wide association studies (GWAS) have identified a number of genetic risk loci that support the involvement of IL-23 signalling in psoriasis, in particular a robust susceptibility locus at a gene encoding a subunit of the IL-23 receptor (IL23R) (Stuart et al., 2015; Tsoi et al., 2012). The lead psoriasis-associated SNP rs9988642 is located approximately 500 bp downstream of IL23R but is in tight linkage disequilibrium (LD) with a missense SNP rs11209026 (R381Q) within IL23R (r² = 0.85). The minor (G) allele of rs11209026 is present in approximately 7% of the population and is protective for psoriasis and several other autoimmune diseases including IBD, ankylosing spondylitis, RA and asthma. The psoriasis-associated missense SNP R381Q causes an arginine to glutamine substitution in a region of the IL23R protein between the transmembrane domain and the putative JAK2 binding site in the cytoplasmic portion. This substitution is expected to affect the receptor’s surface localisation or signalling ability, rather than IL23R expression. Recent studies have also identified a psoriatic arthritis (PsA)-specific signal at IL23R, thought to be independent of the psoriasis association (Bowes et al., 2015; Budu-Aggrey et al., 2016). The lead PsA-associated SNP rs12044149 is intronic to IL23R and is in LD with likely causal SNPs intersecting promoter and enhancer marks in memory CD8+ T cells (Budu-Aggrey et al., 2016). It is therefore likely that the PsA-specific SNPs affect IL23R function via a different mechanism compared with the psoriasis-specific SNPs. It could be hypothesised that the risk allele for PsA located within the IL23R promoter causes an increase in IL23R expression, relative to the protective allele. An increased expression of IL23R might then lead to an exaggerated immune response. The independent genetic signals identified for psoriasis and PsA in this locus indicate that different mechanisms underlie these two conditions, although both likely affect the function of IL23R. It is very important to further characterise these mechanisms in order to better understand how the IL-23 receptor and its downstream signalling are affected in both diseases. This will help to determine how psoriasis and PsA patients might differentially respond to therapies, particularly IL-23 biologics. To investigate this further, we have developed an in vitro model using CD4 T cells which express either wild-type IL23R and IL12Rβ1 or mutant IL23R (R381Q) and IL12Rβ1. A model expressing different isotypes of IL23R is also underway to investigate the effects on IL23R expression. We propose to further investigate the variants for Ps and PsA and characterise key intracellular processes related to the variants.
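The linkage disequilibrium statistic quoted above (0.85 between the lead SNP and the missense SNP) is the squared correlation between alleles at two loci. A minimal sketch of how it is computed from haplotype frequencies follows; the function name and the frequencies are hypothetical and do not reproduce the reported value.

```python
def ld_r2(p_ab, p_a, p_b):
    """Squared correlation (r^2) between alleles at two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and
          allele B at locus 2.
    p_a:  frequency of allele A at locus 1.
    p_b:  frequency of allele B at locus 2.
    """
    # D measures the departure of the haplotype frequency from independence.
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical frequencies: two alleles at about 7% that usually travel together.
print(ld_r2(0.065, 0.07, 0.075))
```

A value near 1 means one SNP is an almost perfect proxy for the other, which is why a lead SNP outside the gene can tag a missense variant inside it.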

Keywords: IL23R, psoriasis, psoriatic arthritis, SNP

Procedia PDF Downloads 168
15767 Measuring Fluctuating Asymmetry in Human Faces Using High-Density 3D Surface Scans

Authors: O. Ekrami, P. Claes, S. Van Dongen

Abstract:

Fluctuating asymmetry (FA) has been studied for many years as an indicator of developmental stability or ‘genetic quality’ based on the assumption that perfect symmetry is ideally the expected outcome for a bilateral organism. Further studies have also investigated the possible link between FA and attractiveness or levels of masculinity or femininity. These hypotheses have been mostly examined using 2D images, and the structure of interest is usually represented using a limited number of landmarks. Such methods have the downside of simplifying and reducing the dimensionality of the structure, which in turn increases the error of the analysis. In an attempt to reach more conclusive and accurate results, in this study we have used high-resolution 3D scans of human faces and have developed an algorithm to measure and localize FA, taking a spatially dense approach. A symmetric spatially dense anthropometric mask with paired vertices is non-rigidly mapped onto target faces using an Iterative Closest Point (ICP) registration algorithm. A set of 19 manually indicated landmarks was used to examine the precision of our mapping step. The protocol’s accuracy in measuring and localizing FA is assessed using simulated faces with known amounts of asymmetry added to them. The validation results show that the algorithm is capable of locating and measuring FA in 3D simulated faces. With the use of such an algorithm, the additional captured information on asymmetry can be used to improve studies of FA as an indicator of fitness or attractiveness. This algorithm can be of particular benefit in studies with a large number of subjects due to its automated and time-efficient nature. Additionally, taking a spatially dense approach provides us with information about the locality of FA, which is impossible to obtain using conventional methods. It also enables us to analyze the asymmetry of morphological structures in a multivariate manner. This can be achieved by using methods such as Principal Components Analysis (PCA) or Factor Analysis, which can be a step towards understanding the underlying processes of asymmetry. This method can also be used in combination with genome-wide association studies to help unravel the genetic bases of FA. To conclude, we introduced an algorithm to study and analyze asymmetry in human faces, with the possibility of extending the application to other morphological structures, in an automated, accurate and multivariate framework.
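The idea of measuring asymmetry on a mask with paired vertices can be sketched as below. This is not the authors' algorithm: it assumes the faces are already registered and aligned so that the midsagittal plane is x = 0, it treats the sample-mean asymmetry at each vertex pair as directional asymmetry, and it reports the remaining deviation as FA. All names and coordinates are hypothetical.

```python
import math


def asymmetry_vectors(face, pairs):
    """Difference between each left vertex and its mirrored right partner.

    face:  list of (x, y, z) vertex coordinates.
    pairs: list of (left_index, right_index) tuples of paired vertices.
    The right vertex is mirrored across the plane x = 0 (x -> -x).
    """
    vectors = []
    for left, right in pairs:
        lx, ly, lz = face[left]
        rx, ry, rz = face[right]
        vectors.append((lx + rx, ly - ry, lz - rz))
    return vectors


def fluctuating_asymmetry(faces, pairs):
    """Per-face, per-pair FA magnitude after removing the mean asymmetry."""
    per_face = [asymmetry_vectors(face, pairs) for face in faces]
    n = len(per_face)
    # Mean asymmetry vector per vertex pair across the sample.
    mean = [
        tuple(sum(vecs[i][k] for vecs in per_face) / n for k in range(3))
        for i in range(len(pairs))
    ]
    return [[math.dist(v, m) for v, m in zip(vecs, mean)] for vecs in per_face]


# Two hypothetical faces with a single vertex pair each.
faces = [[(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
         [(1.2, 0.0, 0.0), (-1.0, 0.0, 0.0)]]
print(fluctuating_asymmetry(faces, [(0, 1)]))
```

Because the output keeps one value per vertex pair, it preserves the locality of asymmetry, and the per-pair vectors could be passed to a multivariate method such as PCA.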

Keywords: developmental stability, fluctuating asymmetry, morphometrics, 3D image processing

Procedia PDF Downloads 140
15766 Expression Profiling and Immunohistochemical Analysis of Squamous Cell Carcinoma of Head and Neck (Tumor, Transition Zone, Normal) by Whole Genome Scale Sequencing

Authors: Veronika Zivicova, Petr Broz, Zdenek Fik, Alzbeta Mifkova, Jan Plzak, Zdenek Cada, Herbert Kaltner, Jana Fialova Kucerova, Hans-Joachim Gabius, Karel Smetana Jr.

Abstract:

The possibility to determine genome-wide expression profiles of cells and tissues opens a new level of analysis in the quest to define dysregulation in malignancy and thus identify new tumor markers. Toward this long-term aim, we here address two issues on this level for head and neck cancer specimens: i) defining profiles in different regions, i.e. the tumor, the transition zone and normal control, and ii) comparing complete data sets for seven individual patients. Special focus in the flanking immunohistochemical part is given to adhesion/growth-regulatory galectins that upregulate chemo- and cytokine expression in an NF-κB-dependent manner, to these regulators and to markers of differentiation, i.e. keratins. The detailed listing of up- and down-regulations, also available in printed form (1), not only served to unveil new candidates for testing as markers but also let the impact of the tumor in the transition zone become apparent. The extent of interindividual variation raises a strong cautionary note on assuming uniformity of regulatory events, to be noted when considering therapeutic implications. Thus, a combination of test targets (and a network analysis for galectins and their downstream effectors) is (are) advised prior to reaching conclusions on further perspectives.

Keywords: galectins, genome scale sequencing, squamous cell carcinoma, transition zone

Procedia PDF Downloads 239
15765 Comparison of Selected Pier-Scour Equations for Wide Piers Using Field Data

Authors: Nordila Ahmad, Thamer Mohammad, Bruce W. Melville, Zuliziana Suif

Abstract:

Current methods for predicting local scour at wide bridge piers were developed on the basis of laboratory studies, and very few scour predictions have been tested with field data. This study evaluates a laboratory-derived wide-pier scour equation from previous findings against field data. A wide range of field data was used, covering both live-bed and clear-water scour. A method for assessing the quality of the data was developed and applied to the data set. Three other wide-pier scour equations from the literature were used to compare the performance of each predictive method. The best-performing scour equation was identified using statistical analysis. Comparisons of computed and observed scour depths indicate that the equation from the previous publication produced the smallest discrepancy ratio and RMSE value when compared with the large amount of laboratory and field data.
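The two comparison statistics named in the abstract can be sketched as follows. The abstract does not define the discrepancy ratio, so this sketch assumes the common definition of computed depth divided by observed depth; the function names and depth values are hypothetical.

```python
import math


def discrepancy_ratios(computed, observed):
    """Ratio of computed to observed scour depth for each data point.

    A ratio of 1 is a perfect prediction; above 1 is over-prediction.
    """
    return [c / o for c, o in zip(computed, observed)]


def rmse(computed, observed):
    """Root-mean-square error between computed and observed scour depths."""
    squared = [(c - o) ** 2 for c, o in zip(computed, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical scour depths in metres.
observed = [1.2, 2.5, 0.8, 3.1]
computed = [1.5, 2.4, 1.1, 3.6]
ratios = discrepancy_ratios(computed, observed)
print(sum(ratios) / len(ratios), rmse(computed, observed))
```

Running each candidate equation through the same two functions on the same field records gives a like-for-like ranking of the predictive methods.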

Keywords: field data, local scour, scour equation, wide piers

Procedia PDF Downloads 414
15764 Modified Genome-Scale Metabolic Model of Escherichia coli by Adding Hyaluronic Acid Biosynthesis-Related Enzymes (GLMU2 and HYAD) from Pasteurella multocida

Authors: P. Pasomboon, P. Chumnanpuen, T. E-kobon

Abstract:

Hyaluronic acid (HA) is a linear heteropolysaccharide of repeating D-glucuronic acid and N-acetyl-D-glucosamine units. HA has various useful properties: it maintains skin elasticity and moisture, reduces inflammation, and lubricates the movement of various body parts without causing an immunogenic or allergic response. HA can be found in several animal tissues as well as in the capsule component of some bacteria, including Pasteurella multocida. This study aimed to modify a genome-scale metabolic model of Escherichia coli by adding two enzymes (GLMU2 and HYAD) from P. multocida, and to use computational simulation and flux analysis methods to predict HA productivity under different carbon sources and nitrogen supplements, with the goal of improving HA production under the specified amounts of carbon sources and nitrogen supplements. Results revealed that threonine and aspartate supplements raised HA production by 12.186%. Our analyses suggest that the genome-scale metabolic model is useful for improving HA production and narrows the number of conditions to be tested further.

Keywords: Pasteurella multocida, Escherichia coli, hyaluronic acid, genome-scale metabolic model, bioinformatics

Procedia PDF Downloads 123
15763 Analysis of the Simulation Merger and Economic Benefit of Local Farmers' Associations in Taiwan

Authors: Lu Yung-Hsiang, Chang Kuming, Dai Yi-Fang, Liao Ching-Yi

Abstract:

Taiwan’s future land planning and administrative redivision may lead farmers’ associations and their service areas to face reorganization or merger. The possible merger combinations and the economic benefit to the farmers’ associations are therefore worth discussing. In a merger, some associations may remain unconsolidated, while two or more may be consolidated into one association. Under which conditions a merger yields the greatest benefit is one focus of this study. In addition, previous research did not use simulation methods and examined only the credit department rather than the whole farmers’ association. Therefore, this paper uses a simulation approach to examine both the merger of farmers’ associations and the conditions under which the benefits are greatest. The data set of this study includes 266 farmers’ associations in Taiwan for the period 2012 to 2013. Empirical results showed that the optimal simulated combination comprises 108 farmers’ associations. After the first-stage merger, the number of farmers’ associations can be reduced by 60%. The cost-saving effects after the merger do not differ. The cost efficiency of the farmers’ associations improved. The economies of scale and scope would decrease as a result of the merger. We hope these findings will benefit future mergers of farmers’ associations.

Keywords: simulation merger, farmer association, assurance region, data envelopment analysis

Procedia PDF Downloads 350
15762 Isolate-Specific Variations among Clinical Isolates of Brucella Identified by Whole-Genome Sequencing, Bioinformatics and Comparative Genomics

Authors: Abu S. Mustafa, Mohammad W. Khan, Faraz Shaheed Khan, Nazima Habibi

Abstract:

Brucellosis is a zoonotic disease of worldwide prevalence. There are at least four species and several strains of Brucella that cause human disease. Brucella genomes have very limited variation across strains, which hinders strain identification using classical molecular techniques, including PCR and 16S rDNA sequencing. The aim of this study was to perform whole genome sequencing of clinical isolates of Brucella, followed by bioinformatics and comparative genomics analyses, to determine whether genetic differences exist across isolates of a single Brucella species and strain. Draft sequence data were generated from 15 clinical isolates of Brucella melitensis (biovar 2 strain 63/9) using the MiSeq next generation sequencing platform. The generated reads were used for assembly and further analysis. All analyses were performed on a bioinformatics workstation (8-core i7 processor, 8 GB RAM, Bio-Linux operating system). FastQC was used to assess read quality, and low-quality reads were trimmed or eliminated using Fastx_trimmer. Assembly was done with the Velvet and ABySS software. The assembled contigs were ordered with Mauve. The online server RAST was used to annotate the contig assemblies. Annotated genomes were compared using the Mauve and ACT tools. The QC score for the DNA sequence data generated by MiSeq was higher than 30 for 80% of reads with more than 100x coverage, which suggested that the data could be used for further analysis. However, when analyzed by FastQC, the read quality of four samples was not good enough to create a complete genome draft, so the remaining 11 samples were used for further analysis. The comparative genome analyses showed that, despite sharing the same gene sets, single nucleotide polymorphisms and insertions/deletions existed across the different genomes, which provided a variable extent of diversity to these bacteria.
In conclusion, next generation sequencing, bioinformatics, and comparative genome analysis can be used to find variations (point mutations, insertions and deletions) across different genomes of Brucella within a single strain. This information could be useful in surveillance and epidemiological studies. The study was supported by Kuwait University Research Sector grants MI04/15 and SRUL02/13.
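
The read-quality criterion mentioned in the abstract (a quality score above 30 for most reads) can be illustrated with a small sketch. It assumes FASTQ quality strings in the common ASCII-offset-33 encoding and assumes that a read "passes" when its mean Phred score reaches the threshold; the abstract does not say whether the authors applied the threshold per base or per read, and the function names and toy reads are hypothetical.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into Phred scores (assumes ASCII offset 33)."""
    return [ord(ch) - offset for ch in quality_string]


def fraction_reads_at_least(quality_strings, threshold=30):
    """Fraction of reads whose mean Phred score is at least `threshold`.

    Using the per-read mean is an assumption made for this sketch.
    """
    if not quality_strings:
        raise ValueError("no reads supplied")
    passing = 0
    for quality in quality_strings:
        scores = phred_scores(quality)
        if sum(scores) / len(scores) >= threshold:
            passing += 1
    return passing / len(quality_strings)


# Toy quality strings: 'I' encodes Q40, '?' encodes Q30, '5' encodes Q20.
toy_reads = ["IIII", "5555", "??II", "I5I5"]
print(fraction_reads_at_least(toy_reads))
```

In practice a tool such as FastQC reports these distributions directly; the sketch only shows what the threshold means.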

Keywords: brucella, bioinformatics, comparative genomics, whole genome sequencing

Procedia PDF Downloads 383
15761 In silico Comparative Analysis of Chloroplast Genome (cpDNA) and Some Individual Genes (rbcL and trnH-psbA) in Pooideae Subfamily Members

Authors: Ibrahim Ilker Ozyigit, Ertugrul Filiz, Ilhan Dogan

Abstract:

An in silico analysis of Brachypodium distachyon, Triticum aestivum, Festuca arundinacea, Lolium perenne, and Hordeum vulgare subsp. vulgare of the Pooideae was performed based on complete chloroplast genomes, and on the rbcL coding and trnH-psbA intergenic spacer regions alone, to compare phylogenetic resolving power. Neighbor-Joining, Minimum Evolution, and Unweighted Pair Group Method with Arithmetic Mean (UPGMA) methods were used to reconstruct phylogenies; the highest bootstrap support was obtained with data from whole chloroplast genome sequences. The highest and lowest values from nucleotide diversity (π) analysis were 0.315813 in the rbcL coding region and 0.043495 in the complete chloroplast genome, respectively. The highest transition/transversion bias (R) value, 1.384, was recorded in complete chloroplast genomes. An F. arundinacea-L. perenne clade was recovered in all phylogenies. Sequences of the rbcL and trnH-psbA regions were not able to resolve the Pooideae phylogenies due to lack of genetic variation.
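
As a rough illustration of the nucleotide diversity statistic reported above, the sketch below computes π as the mean proportion of differing sites over all pairs of aligned sequences, and counts transitions and transversions per pair. It is a simplified sketch, not the authors' pipeline: gap and ambiguous sites are skipped when counting differences but still counted in the alignment length, the raw transition/transversion counts are not the same quantity as the model-based bias R quoted in the abstract, and the toy alignment is invented.

```python
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def pairwise_differences(seq_a, seq_b):
    """Return (transitions, transversions) between two aligned sequences."""
    transitions = transversions = 0
    for a, b in zip(seq_a, seq_b):
        # Skip identical sites and anything that is not an unambiguous base.
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


def nucleotide_diversity(alignment):
    """Mean proportion of differing sites over all sequence pairs (pi)."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(alignment, 2))
    total = sum(sum(pairwise_differences(a, b)) for a, b in pairs)
    return total / (len(pairs) * length)


# Invented three-sequence alignment, for illustration only.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACATACGT"]
print(nucleotide_diversity(toy_alignment))
```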

Keywords: chloroplast DNA, Pooideae, phylogenetic analysis, rbcL, trnH-psbA

Procedia PDF Downloads 379
15760 Genomic Surveillance of Bacillus Anthracis in South Africa Revealed a Unique Genetic Cluster of B-Clade Strains

Authors: Kgaugelo Lekota, Ayesha Hassim, Henriette Van Heerden

Abstract:

Bacillus anthracis, the causative agent of anthrax, is composed of three genetic groups, namely A, B, and C. Clade A is distributed worldwide, while B sub-clades have been identified in Kruger National Park (KNP), South Africa. KNP is one of the endemic anthrax regions in South Africa, with distinctive genetic diversity. Genomic surveillance of KNP B. anthracis strains was carried out on historical culture collection isolates (n=67) dating from the 1990s to 2015 using a whole genome sequencing approach. Whole genome single nucleotide polymorphism (SNP) and pan-genomic analyses were used to define the B. anthracis genetic population structure. This study showed that KNP has heterologous B. anthracis strains grouping in the A-clade, with the ABr.005/006 (Ancient A) SNP lineage more prominent. The 2012 and 2015 anthrax isolates are dispersed amongst minor sub-clades that prevail in non-stabilized genetic evolution strains. This was augmented with non-parsimony informative SNPs of the B. anthracis strains across minor sub-clades of the Ancient A clade. Pan-genomics of B. anthracis showed a clear distinction between A- and B-clade genomes, with 11 374 predicted clusters of protein-coding genes. Unique accessory genes of B-clade genomes included cell wall biosynthetic genes and genes associated with fosfomycin resistance. South Africa harbours diverse B. anthracis strains with uniquely defined SNPs. The B. anthracis strains sequenced in this study will serve as a means to further trace the dissemination of B. anthracis outbreaks globally and especially in South Africa.

Keywords: bacillus anthracis, whole genome single nucleotide polymorphisms, pangenomics, kruger national park

Procedia PDF Downloads 150
15759 Societal Acceptability Conditions of Genome Editing for Upland Rice in Madagascar

Authors: Anny Lucrece Nlend Nkott, Ludovic Temple

Abstract:

The appearance in 2012 of the CRISPR-Cas9 genome editing technique marks a turning point in the field of genetics. This technique would make it possible to create new varieties quickly and cheaply. Although some consider CRISPR-Cas9 to be revolutionary, others consider it a potential societal threat. To document the controversy, we explain the socioeconomic conditions under which this technique could be accepted for the creation of a rainfed rice variety in Madagascar. The methodological framework is based on 38 individual semistructured interviews, a multistakeholder forum with 27 participants, and a survey of 148 rice producers. Results reveal that the acceptability of genome editing requires (i) strengthening the seed system through the operationalization of regulatory structures and the upgrading of stakeholders' knowledge of genetically modified organisms, (ii) assessing the effects of the edited variety on biodiversity and soil nitrogen dynamics, and (iii) strengthening the technical and human capacities of the biosafety body. Structural mechanisms for regulating the seed system are necessary to ensure safe experimentation with genome editing techniques. Organizational innovation also appears to be necessary. The study documents how collective learning between communities of scientists and nonscientists is a component of systemic processes of varietal innovation. This study was carried out with the financial support of the GENERICE project (Generation and Deployment of Genome-Edited, Nitrogen-use-Efficient Rice Varieties), funded by the Agropolis Foundation.

Keywords: CRISPR-Cas9, varietal innovation, seed system, innovation system

Procedia PDF Downloads 154
15758 Genetically Informed Precision Drug Repurposing for Rheumatoid Arthritis

Authors: Sahar El Shair, Laura Greco, William Reay, Murray Cairns

Abstract:

Background: Rheumatoid arthritis (RA) is a chronic, systemic, inflammatory, autoimmune disease that involves damage to joints and erosion of the associated bone and cartilage, resulting in reduced physical function and disability. RA is a multifactorial disorder influenced by heterogeneous genetic and environmental factors. Whilst different medications have proven successful in reducing the inflammation associated with RA, they often come with significant side effects and limited efficacy. To address this, the novel pharmagenic enrichment score (PES) algorithm was tested in self-reported RA patients from the UK Biobank (UKBB), a cohort of predominantly European ancestry, to identify individuals with high genetic risk in clinically actionable biological pathways and thereby reveal novel opportunities for precision interventions and drug repurposing to treat RA. Methods and materials: Genetic association data for rheumatoid arthritis were derived from publicly available genome-wide association study (GWAS) summary statistics (N=97173). The PES framework exploits competitive gene set enrichment to identify pathways that are associated with RA in order to explore novel treatment opportunities. These data were then integrated with the WebGestalt, Drug Gene Interaction Database (DGIdb) and DrugBank databases to identify compounds with existing use or potential for repurposed use. The PES for each of these candidates was then profiled in individuals with RA in the UKBB (Ncases = 3,719, Ncontrols = 333,160). Results: A total of 209 pathways with known drug targets were identified after multiple testing correction. Several pathways, including interferon gamma signaling and the TID pathway (which relates to a chaperone that modulates interferon signaling), were significantly associated with self-reported RA in the UKBB when adjusting for age, sex, assessment centre month and location, RA polygenic risk and 10 principal components.
These pathways have a major role in RA pathogenesis, including autoimmune attacks against certain citrullinated proteins, synovial inflammation, and bone loss. Encouragingly, many also relate to the mechanism of action of existing RA medications. The analyses also revealed a statistically significant association between RA polygenic scores and self-reported RA with individual PES scores, highlighting the potential utility of the PES algorithm in uncovering additional genetic insights that could aid in the identification of individuals at risk for RA and provide opportunities for more targeted interventions. Conclusions: In this study, pharmacologically annotated genetic risk was explored through the PES framework to overcome inter-individual heterogeneity and enable precision drug repurposing in RA. The results showed a statistically significant association between RA polygenic scores and self-reported RA and individual PES scores for 3,719 RA patients. Interestingly, several enriched PES pathways were targeted by already approved RA drugs. In addition, the analysis revealed genetically supported drug repurposing opportunities for future treatment of RA with a relatively safe profile.

Keywords: rheumatoid arthritis, precision medicine, drug repurposing, system biology, bioinformatics

Procedia PDF Downloads 76
15757 Association Rules Mining Task Using Metaheuristics: Review

Authors: Abir Derouiche, Abdesslem Layeb

Abstract:

Association Rule Mining (ARM) is one of the most popular data mining tasks and is widely used in various areas. The search for association rules is an NP-complete problem, which is why metaheuristics have been widely used to solve it. This paper presents ARM as an optimization problem and surveys the metaheuristic-based approaches proposed in the literature.
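
To make the optimization view concrete, the sketch below computes the three standard quality measures of a rule X → Y (support, confidence and lift) that a metaheuristic could combine into a fitness function. The function names and the toy transactions are hypothetical, and the review itself does not prescribe any particular fitness function.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    s_xy = support(transactions, set(antecedent) | set(consequent))
    s_x = support(transactions, antecedent)
    s_y = support(transactions, consequent)
    confidence = s_xy / s_x if s_x else 0.0
    lift = confidence / s_y if s_y else 0.0
    return {"support": s_xy, "confidence": confidence, "lift": lift}


# Invented market-basket data, for illustration only.
toy_transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(rule_metrics(toy_transactions, {"bread"}, {"milk"}))
```

A candidate solution in a metaheuristic search would encode one rule (or a set of rules), and these measures would be evaluated on every candidate, which is where most of the computational cost lies.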

Keywords: Optimization, Metaheuristics, Data Mining, Association rules Mining

Procedia PDF Downloads 159
15756 Genome-Scale Analysis of Streptomyces Caatingaensis CMAA 1322 Metabolism, a New Abiotic Stress-Tolerant Actinomycete

Authors: Suikinai Nobre Santos, Ranko Gacesa, Paul F. Long, Itamar Soares de Melo

Abstract:

Extremophilic microorganisms are adapted to biotopes combining several stress factors (temperature, pressure, radiation, salinity and pH), which makes them a rich and valuable resource for the exploitation of novel biotechnological processes and unique models for investigating their biomolecules (1, 2). This encouraged us to investigate, by bioprospecting, the compounds synthesized by a novel thermotolerant actinomycete, designated Streptomyces caatingaensis CMAA 1322, isolated from a soil sample of tropical dry forest (Caatinga) in the Brazilian semiarid region (3-17°S and 35-45°W). This set of contrasting physical and climatic factors provides unique conditions and a diversity of well-adapted species, making it an interesting site for biotechnological purposes. Preliminary studies have shown great potential for the production of cytotoxic, pesticidal and antimicrobial molecules (3). Thus, to extend knowledge of the gene clusters responsible for the biosynthetic pathways of natural products in strain CMAA1322, whole-genome shotgun (WGS) DNA sequencing was performed using paired-end long sequencing with PacBio RS (Pacific Biosciences). Genomic DNA was extracted from a pure culture grown overnight on LB medium using the PureLink genomic DNA kit (Life Technologies). An approximately 3- to 20-kb-insert PacBio library was constructed and sequenced on 8 single-molecule real-time (SMRT) cells, yielding 116,269 reads (average length, 7,446 bp), which were assembled into 18 contigs with 142.11x coverage and an N50 value of 20.548 bp (BioProject number PRJNA288757). The assembled data were analyzed by Rapid Annotations using Subsystems Technology (RAST) (4); the genome size was found to be 7,055,077 bp, comprising 6167 open reading frames (ORFs) and 413 subsystems. The G+C content was estimated to be 72 mol%.
The closest-neighbors tool, available in RAST through functional comparison of the genome, revealed that strain CMAA1322 is most closely related to Streptomyces hygroscopicus ATCC 53653 (similarity score value, 537), S. violaceusniger Tu 4113 (score value, 483), S. avermitilis MA-4680 (score value, 475), and S. albus J1074 (score value, 447). The Streptomyces sp. CMAA1322 genome contains 98 tRNA genes and 135 gene copies related to stress response, mainly osmotic stress (14), heat shock (16), and oxidative stress (49). Functional annotation by antiSMASH version 3.0 (5) identified 41 clusters for secondary metabolites (including two clusters for lanthipeptides, ten clusters for nonribosomal peptide synthetases [NRPS], three clusters for siderophores, fourteen for polyketide synthases [PKS], six clusters encoding a terpene, two clusters encoding a bacteriocin, and one cluster encoding a phenazine). Our work provides a comparative analysis of the genome and of the extract produced by strain CMAA1322 (data not published), revealing the potential of microorganisms from extreme environments such as the Caatinga to produce a wide range of biotechnologically relevant compounds.
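
For readers unfamiliar with the N50 assembly statistic quoted in this abstract, a minimal sketch of its usual definition follows: the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the assembly size. The contig lengths are invented; the abstract does not list the actual ones.

```python
def n50(contig_lengths):
    """Length of the contig at which the cumulative sum of lengths,
    taken from longest to shortest, first reaches half the assembly size."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    raise ValueError("no contigs supplied")


# Invented contig lengths (bp), for illustration only.
print(n50([100, 80, 60, 40, 20]))  # -> 80
```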

Keywords: caatinga, streptomyces, environmental stresses, biosynthetic pathways

Procedia PDF Downloads 242
15755 Efficient Reuse of Exome Sequencing Data for Copy Number Variation Callings

Authors: Chen Wang, Jared Evans, Yan Asmann

Abstract:

With the rapid evolution of next-generation sequencing techniques, whole-exome and exome-panel sequencing have become a cost-effective way to detect small exonic mutations, but there has been a growing desire to accurately detect copy number variations (CNVs) as well. To address this research and clinical need, we developed a sequencing coverage pattern-based method not only for copy number detection but also for data integrity checks, CNV calling, and visualization reports. The developed methodologies include complete automation to increase usability, genome content-coverage bias correction, CNV segmentation, data quality reports, and publication-quality images. Poor-quality outlier samples were identified and removed automatically. Multiple experimental batches were routinely detected and reduced to a clean subset of samples before analysis. Algorithm improvements were also made to improve somatic CNV detection as well as germline CNV detection in family trios. Additionally, a set of utilities was included to help users produce CNV plots for genes of interest. We demonstrate the somatic CNV enhancements by accurately detecting CNVs in whole-exome data from The Cancer Genome Atlas cancer samples and in a lymphoma case study with paired tumor and normal samples. We also show efficient reuse of existing exome sequencing data for improved germline CNV calling in a family trio from phase III of the 1000 Genomes Project, detecting CNVs with various modes of inheritance. The performance of the developed method was evaluated by comparing its CNV calls with results from other, orthogonal copy number platforms.
Our case studies show that reusing exome sequencing data for CNV calling offers several notable benefits, including better quality control of exome sequencing data, improved joint analysis with single nucleotide variant calls, and novel genomic discovery from under-utilized existing whole-exome and custom exome panel data.

Keywords: bioinformatics, computational genetics, copy number variations, data reuse, exome sequencing, next generation sequencing

Procedia PDF Downloads 257
15754 Analysis of Endogenous Sirevirus in Germinating Barley (Hordeum vulgare L.)

Authors: Nermin Gozukirmizi, Buket Cakmak, Sevgi Marakli

Abstract:

Sireviruses are a genus of copia LTR retrotransposons with a unique genome structure among retrotransposons. Barley (Hordeum vulgare L.) is an economically important plant and has been studied as a model plant owing to its short annual life cycle and seven chromosome pairs. In this study, we used mature barley embryos, 10-day-old roots and 10-day-old leaves derived from the same barley plant to investigate SIRE1 retrotransposon movements by the Inter-Retrotransposon Amplified Polymorphism (IRAP) technique. We found polymorphism rates of 0-64% among embryos, roots and leaves. Polymorphism rates were 0-27% among embryos, 8-60% among roots, and 11-50% among leaves. Polymorphisms were observed not only among the parts of different individuals but also among the parts of the same plant (23-64%). The internal domains of SIRE1 (gag, env and rt) were also analyzed in the embryos, roots and leaves. Analysis of band profiles showed no polymorphism for gag; however, different band patterns were observed among samples for rt and env. Sequencing of the SIRE1 gag, env and rt domains revealed 79% similarity for gag, 95% for env and 84% for rt to Ty1-copia retrotransposons. The SIRE1 retrotransposon was first identified in the soybean genome and has been studied in other plants (maize, rice, tomato, etc.). This study is the first detailed investigation of SIRE1 in the barley genome. The findings are expected to contribute to the understanding of the SIRE1 retrotransposon and its role in the barley genome.
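
The polymorphism rates above compare IRAP band profiles between samples. The abstract does not state which formula was used; one common choice, assumed here, is the percentage of bands not shared by two profiles (100 × (1 − Jaccard index)). Band sizes in the example are invented.

```python
def polymorphism_rate(bands_a, bands_b):
    """Percentage of bands not shared by two profiles.

    Computed as 100 * (1 - Jaccard index); this formula is an assumption,
    since the abstract does not specify the one the authors used.
    """
    a, b = set(bands_a), set(bands_b)
    union = a | b
    if not union:
        return 0.0
    return 100.0 * (1 - len(a & b) / len(union))


# Invented band sizes (bp) scored as present in two samples.
embryo_bands = {300, 500, 800}
root_bands = {300, 500, 1200}
print(polymorphism_rate(embryo_bands, root_bands))  # -> 50.0
```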

Keywords: barley, polymorphism, retrotransposon, SIRE1 virus

Procedia PDF Downloads 308
15753 CRISPR-DT: Designing gRNAs for the CRISPR-Cpf1 System with Improved Target Efficiency and Specificity

Authors: Houxiang Zhu, Chun Liang

Abstract:

The CRISPR-Cpf1 system has been successfully applied in genome editing. However, the target efficiency of the CRISPR-Cpf1 system varies among different gRNA sequences. Published CRISPR-Cpf1 gRNA data were reanalyzed, and many sequence and structural features of gRNAs (e.g., position-specific nucleotide composition, position-nonspecific nucleotide composition, GC content, minimum free energy, and melting temperature) were found to correlate with target efficiency. Using machine learning, a support vector machine (SVM) model was created to predict the target efficiency of any given gRNA. The first web service application, CRISPR-DT (CRISPR DNA Targeting), has been developed to help users design optimal gRNAs for the CRISPR-Cpf1 system by considering both target efficiency and specificity. CRISPR-DT will empower researchers in genome editing.
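
The sequence-derived features named in the abstract can be sketched as a simple feature vector. The sketch below covers only position-specific composition (one-hot encoding), position-nonspecific composition (base counts) and GC content; minimum free energy and melting temperature need external tools and are omitted. Function names, the feature ordering and the example sequence are hypothetical and are not taken from CRISPR-DT.

```python
def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def position_specific_one_hot(seq, alphabet="ACGT"):
    """One 0/1 indicator per (position, base) pair."""
    seq = seq.upper()
    return [1 if base == letter else 0 for base in seq for letter in alphabet]


def base_counts(seq, alphabet="ACGT"):
    """Position-nonspecific composition: how often each base occurs."""
    seq = seq.upper()
    return [seq.count(letter) for letter in alphabet]


def grna_features(seq):
    """Concatenate the features into one vector for a classifier such as an SVM."""
    return position_specific_one_hot(seq) + base_counts(seq) + [gc_content(seq)]


# Invented 23-nt sequence, for illustration only.
example = "TTTAGCATCGGATCCGATTACGA"
print(len(grna_features(example)))  # 23*4 one-hot + 4 counts + 1 GC value = 97
```

Vectors like this one, paired with measured efficiencies, would form the training set for the SVM; the choice of kernel and hyperparameters is not given in the abstract.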

Keywords: CRISPR-Cpf1, genome editing, target efficiency, target specificity

Procedia PDF Downloads 262
15752 Association between Polygenic Risk of Alzheimer's Dementia, Brain MRI and Cognition in UK Biobank

Authors: Rachana Tank, Donald. M. Lyall, Kristin Flegal, Joey Ward, Jonathan Cavanagh

Abstract:

Alzheimer's Research UK estimates that by 2050, 2 million individuals will be living with late-onset Alzheimer's disease (LOAD). However, individuals experience considerable cognitive deficits and brain pathology over decades before reaching clinically diagnosable LOAD, and studies have used genetic approaches such as genome-wide association studies (GWAS) and polygenic risk (PGR) scores to identify high-risk individuals and potential pathways. This investigation aims to determine whether high genetic risk of LOAD is associated with worse brain MRI and cognitive performance in healthy older adults within the UK Biobank cohort. Previous studies investigating associations of PGR for LOAD with measures of MRI or cognitive functioning have focused on specific aspects of hippocampal structure, in relatively small samples and with poor control for confounders such as smoking. To our knowledge, both the sample size of this study and the discovery GWAS sample are larger than those of previous studies. Genetic interactions between the loci showing the largest effects in GWAS have not been extensively studied, and APOE e4 is known to pose the largest genetic risk of LOAD, with potential gene-gene and gene-environment interactions; for this reason we also analyse genetic interactions of PGR with the APOE e4 genotype. High genetic loading based on a polygenic risk score of 21 SNPs for LOAD is associated with worse brain MRI and cognitive outcomes in healthy individuals within the UK Biobank cohort. Summary statistics from the Kunkle et al. GWAS meta-analysis (case: n=30,344, control: n=52,427) will be used to create polygenic risk scores based on 21 SNPs, and analyses will be carried out in N=37,000 participants in the UK Biobank. This will be the largest study to date investigating PGR of LOAD in relation to MRI. MRI outcome measures include white matter (WM) tracts and structural volumes.
Cognitive function measures include reaction time, pairs matching, trail making, digit symbol substitution and prospective memory. Interaction of the APOE e4 alleles and PGR will be analysed by including APOE status as an interaction term coded as 0, 1 or 2 e4 alleles. Models will be partially adjusted for age, BMI, sex, genotyping chip, smoking, depression and social deprivation. Preliminary results suggest that the PGR score for LOAD is associated with decreased hippocampal volumes, including hippocampal body (standardised beta = -0.04, P = 0.022) and tail (standardised beta = -0.037, P = 0.030), but not with hippocampal head. There were also associations of genetic risk with decreased cognitive performance, including fluid intelligence (standardised beta = -0.08, P<0.01) and reaction time (standardised beta = 2.04, P<0.01). No genetic interactions were found between APOE e4 dose and PGR score for MRI or cognitive measures. The generalisability of these results is limited by selection bias within the UK Biobank, as participants are less likely to be obese, to smoke, or to be socioeconomically deprived, and have fewer self-reported health conditions, when compared to the general population. The lack of a unified or standardised method for calculating genetic risk scores may also be a limitation of these analyses. Further discussion and results are pending.
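
A polygenic risk score of the kind described here is usually a weighted sum of effect-allele dosages, with weights taken from GWAS summary statistics. The sketch below shows that calculation, standardisation of the scores, and construction of a PGR × APOE e4 interaction term coded 0, 1 or 2. All SNP identifiers, weights and dosages are invented, and the authors' exact scoring procedure is not given in the abstract.

```python
import statistics


def polygenic_risk_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1 or 2) over the scored SNPs."""
    return sum(weights[snp] * dosages[snp] for snp in weights)


def standardise(scores):
    """Convert raw scores to z-scores (population standard deviation)."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)
    return [(score - mean) / sd for score in scores]


# Hypothetical per-SNP weights (e.g. log odds ratios) and genotype dosages.
toy_weights = {"rsA": 0.10, "rsB": -0.20, "rsC": 0.30}
toy_people = [
    {"rsA": 2, "rsB": 0, "rsC": 1},
    {"rsA": 0, "rsB": 1, "rsC": 0},
    {"rsA": 1, "rsB": 1, "rsC": 2},
]
toy_e4_counts = [0, 1, 2]  # number of APOE e4 alleles per person

raw = [polygenic_risk_score(person, toy_weights) for person in toy_people]
z = standardise(raw)
# Interaction term that would enter a regression alongside z and the e4 count.
interaction = [score * e4 for score, e4 in zip(z, toy_e4_counts)]
print(raw, z, interaction)
```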

Keywords: Alzheimer's dementia, cognition, polygenic risk, MRI

Procedia PDF Downloads 113
15751 Association Between Swallowing Disorders and Cognitive Disorders in Adults: Systematic Review and Metaanalysis

Authors: Shiva Ebrahimian Dehaghani, Afsaneh Doosti, Morteza Zare

Abstract:

Background: There is no consensus regarding the association between dysphagia and cognition. Purpose: The aim of this study was to quantitatively and qualitatively analyze the available evidence on the direction and strength of the association between dysphagia and cognition. Methodology: PubMed, Scopus, Embase and Web of Science were searched for studies on the association between dysphagia and cognition. A random-effects model was used to determine weighted odds ratios (OR) and 95% confidence intervals (CI). Sensitivity analysis was performed to determine the impact of each individual study on the pooled results. Results: Pooled data from a total of 1427 participants showed that some cognitive disorders were significantly associated with dysphagia (OR = 3.23; 95% CI, 2.33–4.48). Conclusion: The association between cognition and swallowing disorders suggests that multiple neuroanatomical systems are involved in these two functions.
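
The random-effects pooling mentioned in the methodology can be sketched with the DerSimonian-Laird estimator, which is one widely used random-effects method; the abstract does not say which estimator the authors applied. Standard errors are recovered from the reported 95% confidence intervals on the log scale, and the study values below are invented.

```python
import math


def pooled_odds_ratio(studies, z=1.96):
    """DerSimonian-Laird random-effects pooling.

    `studies` is a list of (odds_ratio, ci_lower, ci_upper) tuples with 95% CIs.
    Returns (pooled_or, ci_lower, ci_upper, tau_squared).
    """
    y = [math.log(odds) for odds, _, _ in studies]
    se = [(math.log(hi) - math.log(lo)) / (2 * z) for _, lo, hi in studies]
    w = [1 / s ** 2 for s in se]

    # Fixed-effect estimate and Cochran's Q, needed for tau^2.
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add the between-study variance tau^2.
    w_star = [1 / (s ** 2 + tau2) for s in se]
    mu = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se_mu = math.sqrt(1 / sum(w_star))
    return math.exp(mu), math.exp(mu - z * se_mu), math.exp(mu + z * se_mu), tau2


# Invented study results, for illustration only.
toy_studies = [(2.5, 1.4, 4.5), (3.8, 2.0, 7.2), (3.1, 1.9, 5.1)]
print(pooled_odds_ratio(toy_studies))
```

A leave-one-out sensitivity analysis, as described in the abstract, would simply call the function once per study with that study removed from the list.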

Keywords: adult, association, cognitive impairment, dysphagia, systematic review

Procedia PDF Downloads 161
15750 Frequent Pattern Mining for Digenic Human Traits

Authors: Atsuko Okazaki, Jurg Ott

Abstract:

Some genetic diseases (‘digenic traits’) are due to the interaction between two DNA variants. For example, certain forms of Retinitis Pigmentosa (a genetic form of blindness) occur in the presence of two mutant variants, one in the ROM1 gene and one in the RDS gene, while the occurrence of only one of these mutant variants leads to a completely normal phenotype. Detecting such digenic traits by genetic methods is difficult. A common approach to finding disease-causing variants is to compare 100,000s of variants between individuals with a trait (cases) and those without the trait (controls). Such genome-wide association studies (GWASs) have been very successful but hinge on genetic effects of single variants, that is, there should be a difference in allele or genotype frequencies between cases and controls at a disease-causing variant. Frequent pattern mining (FPM) methods offer an avenue for detecting digenic traits even in the absence of single-variant effects. The idea is to enumerate pairs of genotypes (genotype patterns), with the two genotypes originating from different variants that may be located at very different genomic positions. What is needed is for genotype patterns to be significantly more common in cases than in controls. Let Y = 2 refer to cases and Y = 1 to controls, with X denoting a specific genotype pattern. We are seeking association rules, ‘X → Y’, with high confidence, P(Y = 2|X), significantly higher than the proportion of cases in the study, P(Y = 2). Clearly, generally available FPM methods are very suitable for detecting disease-associated genotype patterns. We use fpgrowth as the basic FPM algorithm and built a framework around it to enumerate high-frequency digenic genotype patterns and to evaluate their statistical significance by permutation analysis. Application to a published dataset on opioid dependence furnished results that could not be found with classical GWAS methodology.
There were 143 cases and 153 healthy controls, each genotyped for 82 variants in eight genes of the opioid system. The aim was to find out whether any of these variants were disease-associated. The single-variant analysis did not lead to significant results. Application of our FPM implementation resulted in one significant (p < 0.01) genotype pattern with both genotypes in the pattern being heterozygous and originating from two variants on different chromosomes. This pattern occurred in 14 cases and none of the controls. Thus, the pattern seems quite specific to this form of substance abuse and is also rather predictive of disease. An algorithm called Multifactor Dimension Reduction (MDR) was developed some 20 years ago and has been in use in human genetics ever since. This and our algorithms share some similar properties, but they are also very different in other respects. The main difference seems to be that our algorithm focuses on patterns of genotypes while the main object of inference in MDR is the 3 × 3 table of genotypes at two variants.
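
The pattern search and permutation test described in this abstract can be sketched on a toy scale. The sketch below enumerates genotype pairs by brute force rather than with fpgrowth, uses the highest confidence P(Y = 2|X) among sufficiently frequent patterns as the test statistic, and shuffles case/control labels to obtain a p-value. These are simplifying assumptions, not the authors' implementation; the data, function names and the `min_support` threshold are invented.

```python
import random
from collections import Counter
from itertools import combinations


def pattern_counts(genotypes, labels):
    """Count each digenic pattern (variant i, genotype, variant j, genotype)
    in all samples and separately in cases (label 2)."""
    total, in_cases = Counter(), Counter()
    for row, y in zip(genotypes, labels):
        for (i, gi), (j, gj) in combinations(enumerate(row), 2):
            key = (i, gi, j, gj)
            total[key] += 1
            if y == 2:
                in_cases[key] += 1
    return total, in_cases


def best_pattern(genotypes, labels, min_support=2):
    """Pattern with the highest confidence P(Y=2 | X) among frequent patterns."""
    total, in_cases = pattern_counts(genotypes, labels)
    best_key, best_conf = None, 0.0
    for key, n in total.items():
        if n < min_support:
            continue
        conf = in_cases[key] / n
        if conf > best_conf:
            best_key, best_conf = key, conf
    return best_key, best_conf


def permutation_p_value(genotypes, labels, n_perm=200, seed=0, min_support=2):
    """Share of label permutations whose best confidence matches or beats the observed one."""
    rng = random.Random(seed)
    _, observed = best_pattern(genotypes, labels, min_support)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        _, conf = best_pattern(genotypes, shuffled, min_support)
        if conf >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Invented genotypes (0/1/2 copies of an allele at three variants); 2 = case, 1 = control.
toy_genotypes = [[1, 1, 0], [1, 1, 2], [1, 1, 0], [1, 0, 0], [0, 1, 0], [0, 0, 2]]
toy_labels = [2, 2, 2, 1, 1, 1]
print(best_pattern(toy_genotypes, toy_labels))
print(permutation_p_value(toy_genotypes, toy_labels))
```

Taking the maximum over all patterns inside each permutation keeps the p-value adjusted for the number of patterns examined, which is the usual reason for preferring permutation analysis in this setting.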

Keywords: digenic traits, DNA variants, epistasis, statistical genetics

Procedia PDF Downloads 122