Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 10103

Search results for: next generation sequencing techniques

10103 Accurate HLA Typing at High-Digit Resolution from NGS Data

Authors: Yazhi Huang, Jing Yang, Dingge Ying, Yan Zhang, Vorasuk Shotelersuk, Nattiya Hirankarn, Pak Chung Sham, Yu Lung Lau, Wanling Yang

Abstract:

Human leukocyte antigen (HLA) typing from next generation sequencing (NGS) data has the potential for applications in clinical laboratories and population genetic studies. Here we introduce a novel technique for HLA typing from NGS data based on read-mapping using a comprehensive reference panel containing all known HLA alleles and de novo assembly of the gene-specific short reads. An accurate HLA typing at high-digit resolution was achieved when it was tested on publicly available NGS data, outperforming other newly-developed tools such as HLAminer and PHLAT.

Keywords: human leukocyte antigens, next generation sequencing, whole exome sequencing, HLA typing

Procedia PDF Downloads 657

10102 Insights into Archaeological Human Sample Microbiome Using 16S rRNA Gene Sequencing

Authors: Alisa Kazarina, Guntis Gerhards, Elina Petersone-Gordina, Ilva Pole, Viktorija Igumnova, Janis Kimsis, Valentina Capligina, Renate Ranka

Abstract:

Human body is inhabited by a vast number of microorganisms, collectively known as the human microbiome, and there is a tremendous interest in evolutionary changes in human microbial ecology, diversity and function. The field of paleomicrobiology, study of ancient human microbiome, is powered by modern techniques of Next Generation Sequencing (NGS), which allows extracting microbial genomic data directly from archaeological sample of interest. One of the major techniques is 16S rRNA gene sequencing, by which certain 16S rRNA gene hypervariable regions are being amplified and sequenced. However, some limitations of this method exist including the taxonomic precision and efficacy of different regions used. The aim of this study was to evaluate the phylogenetic sensitivity of different 16S rRNA gene hypervariable regions for microbiome studies in the archaeological samples. Towards this aim, archaeological bone samples and corresponding soil samples from each burial environment were collected in Medieval cemeteries in Latvia. The Ion 16S™ Metagenomics Kit targeting different 16S rRNA gene hypervariable regions was used for library construction (Ion Torrent technologies). Sequenced data were analysed by using appropriate bioinformatic techniques; alignment and taxonomic representation was done using Mothur program. Sequences of most abundant genus were further aligned to E. coli 16S rRNA gene reference sequence using MEGA7 in order to identify the hypervariable region of the segment of interest. Our results showed that different hypervariable regions had different discriminatory power depending on the groups of microbes, as well as the nature of samples. On the basis of our results, we suggest that wider range of primers used can provide more accurate recapitulation of microbial communities in archaeological samples. Acknowledgements. This work was supported by the ERAF grant Nr. 1.1.1.1/16/A/101.

Keywords: 16S rRNA gene, ancient human microbiome, archaeology, bioinformatics, genomics, microbiome, molecular biology, next-generation sequencing

Procedia PDF Downloads 184

10101 Automatic Reporting System for Transcriptome Indel Identification and Annotation Based on Snapshot of Next-Generation Sequencing Reads Alignment

Authors: Shuo Mu, Guangzhi Jiang, Jinsa Chen

Abstract:

The analysis of Indel for RNA sequencing of clinical samples is easily affected by sequencing experiment errors and software selection. In order to improve the efficiency and accuracy of analysis, we developed an automatic reporting system for Indel recognition and annotation based on image snapshot of transcriptome reads alignment. This system includes sequence local-assembly and realignment, target point snapshot, and image-based recognition processes. We integrated high-confidence Indel dataset from several known databases as a training set to improve the accuracy of image processing and added a bioinformatical processing module to annotate and filter Indel artifacts. Subsequently, the system will automatically generate data, including data quality levels and images results report. Sanger sequencing verification of the reference Indel mutation of cell line NA12878 showed that the process can achieve 83% sensitivity and 96% specificity. Analysis of the collected clinical samples showed that the interpretation accuracy of the process was equivalent to that of manual inspection, and the processing efficiency showed a significant improvement. This work shows the feasibility of accurate Indel analysis of clinical next-generation sequencing (NGS) transcriptome. This result may be useful for RNA study for clinical samples with microsatellite instability in immunotherapy in the future.

Keywords: automatic reporting, indel, next-generation sequencing, NGS, transcriptome

Procedia PDF Downloads 179

10100 De Novo Assembly and Characterization of the Transcriptome during Seed Development, and Generation of Genic-SSR Markers in Pomegranate (Punica granatum L.)

Authors: Ozhan Simsek, Dicle Donmez, Burhanettin Imrak, Ahsen Isik Ozguven, Yildiz Aka Kacar

Abstract:

Pomegranate (Punica granatum L.) is known to be one of the oldest edible fruit tree species, with a wide geographical global distribution. Fruits from the two defined varieties (Hicaznar and 33N26) were taken at intervals after pollination and fertilization at different sizes. Seed samples were used for transcriptome sequencing. Primary sequencing was produced by Illumina Hi-Seq™ 2000. Firstly, we had raw reads, and it was subjected to quality control (QC). Raw reads were filtered into clean reads and aligned to the reference sequences. De novo analysis was performed to detect genes expressed in seeds of pomegranate varieties. We performed downstream analysis to determine differentially expressed genes. We generated about 27.09 gb bases in total after Illumina Hi-Seq sequencing. All samples were assembled together, we got 59,264 Unigenes, the total length, average length, N50, and GC content of Unigenes are 84.547.276 bp, 1.426 bp, 2,137 bp, and 46.20 %, respectively. Unigenes were annotated with 7 functional databases, finally, 42.681(NR: 72.02%), 39.660 (NT: 66.92%), 30.790 (Swissprot: 51.95%), 20.212 (COG: 34.11%), 27.689 (KEGG: 46.72%), 12.328 (GO: 20.80%), and 33,833 (Interpro: 57.09%) Unigenes were annotated. With functional annotation results, we detected 42.376 CDS, and 4.999 SSR distribute on 16.143 Unigenes.

Keywords: next generation sequencing, SSR, RNA-Seq, Illumina

Procedia PDF Downloads 234

10099 Efficient Reuse of Exome Sequencing Data for Copy Number Variation Callings

Authors: Chen Wang, Jared Evans, Yan Asmann

Abstract:

With the quick evolvement of next-generation sequencing techniques, whole-exome or exome-panel data have become a cost-effective way for detection of small exonic mutations, but there has been a growing desire to accurately detect copy number variations (CNVs) as well. In order to address this research and clinical needs, we developed a sequencing coverage pattern-based method not only for copy number detections, data integrity checks, CNV calling, and visualization reports. The developed methodologies include complete automation to increase usability, genome content-coverage bias correction, CNV segmentation, data quality reports, and publication quality images. Automatic identification and removal of poor quality outlier samples were made automatically. Multiple experimental batches were routinely detected and further reduced for a clean subset of samples before analysis. Algorithm improvements were also made to improve somatic CNV detection as well as germline CNV detection in trio family. Additionally, a set of utilities was included to facilitate users for producing CNV plots in focused genes of interest. We demonstrate the somatic CNV enhancements by accurately detecting CNVs in whole exome-wide data from the cancer genome atlas cancer samples and a lymphoma case study with paired tumor and normal samples. We also showed our efficient reuses of existing exome sequencing data, for improved germline CNV calling in a family of the trio from the phase-III study of 1000 Genome to detect CNVs with various modes of inheritance. The performance of the developed method is evaluated by comparing CNV calling results with results from other orthogonal copy number platforms. Through our case studies, reuses of exome sequencing data for calling CNVs have several noticeable functionalities, including a better quality control for exome sequencing data, improved joint analysis with single nucleotide variant calls, and novel genomic discovery of under-utilized existing whole exome and custom exome panel data.

Keywords: bioinformatics, computational genetics, copy number variations, data reuse, exome sequencing, next generation sequencing

Procedia PDF Downloads 251

10098 Isolate-Specific Variations among Clinical Isolates of Brucella Identified by Whole-Genome Sequencing, Bioinformatics and Comparative Genomics

Authors: Abu S. Mustafa, Mohammad W. Khan, Faraz Shaheed Khan, Nazima Habibi

Abstract:

Brucellosis is a zoonotic disease of worldwide prevalence. There are at least four species and several strains of Brucella that cause human disease. Brucella genomes have very limited variation across strains, which hinder strain identification using classical molecular techniques, including PCR and 16 S rDNA sequencing. The aim of this study was to perform whole genome sequencing of clinical isolates of Brucella and perform bioinformatics and comparative genomics analyses to determine the existence of genetic differences across the isolates of a single Brucella species and strain. The draft sequence data were generated from 15 clinical isolates of Brucella melitensis (biovar 2 strain 63/9) using MiSeq next generation sequencing platform. The generated reads were used for further assembly and analysis. All the analysis was performed using Bioinformatics work station (8 core i7 processor, 8GB RAM with Bio-Linux operating system). FastQC was used to determine the quality of reads and low quality reads were trimmed or eliminated using Fastx_trimmer. Assembly was done by using Velvet and ABySS softwares. The ordering of assembled contigs was performed by Mauve. An online server RAST was employed to annotate the contigs assembly. Annotated genomes were compared using Mauve and ACT tools. The QC score for DNA sequence data, generated by MiSeq, was higher than 30 for 80% of reads with more than 100x coverage, which suggested that data could be utilized for further analysis. However when analyzed by FastQC, quality of four reads was not good enough for creating a complete genome draft so remaining 11 samples were used for further analysis. The comparative genome analyses showed that despite sharing same gene sets, single nucleotide polymorphisms and insertions/deletions existed across different genomes, which provided a variable extent of diversity to these bacteria. In conclusion, the next generation sequencing, bioinformatics, and comparative genome analysis can be utilized to find variations (point mutations, insertions and deletions) across different genomes of Brucella within a single strain. This information could be useful in surveillance and epidemiological studies supported by Kuwait University Research Sector grants MI04/15 and SRUL02/13.

Keywords: brucella, bioinformatics, comparative genomics, whole genome sequencing

Procedia PDF Downloads 375

10097 Genome Sequencing of the Yeast Saccharomyces cerevisiae Strain 202-3

Authors: Yina A. Cifuentes Triana, Andrés M. Pinzón Velásco, Marío E. Velásquez Lozano

Abstract:

In this work the sequencing and genome characterization of a natural isolate of Saccharomyces cerevisiae yeast (strain 202-3), identified with potential for the production of second generation ethanol from sugarcane bagasse hydrolysates is presented. This strain was selected because its capability to consume xylose during the fermentation of sugarcane bagasse hydrolysates, taking into account that many strains of S. cerevisiae are incapable of processing this sugar. This advantage and other prominent positive aspects during fermentation profiles evaluated in bagasse hydrolysates made the strain 202-3 a candidate strain to improve the production of second-generation ethanol, which was proposed as a first step to study the strain at the genomic level. The molecular characterization was carried out by genome sequencing with the Illumina HiSeq 2000 platform paired end; the assembly was performed with different programs, finally choosing the assembler ABYSS with kmer 89. Gene prediction was developed with the approach of hidden Markov models with Augustus. The genes identified were scored based on similarity with public databases of nucleotide and protein. Records were organized from ontological functions at different hierarchical levels, which identified central metabolic functions and roles of the S. cerevisiae strain 202-3, highlighting the presence of four possible new proteins, two of them probably associated with the positive consumption of xylose.

Keywords: cellulosic ethanol, Saccharomyces cerevisiae, genome sequencing, xylose consumption

Procedia PDF Downloads 317

10096 Analyzing Test Data Generation Techniques Using Evolutionary Algorithms

Authors: Arslan Ellahi, Syed Amjad Hussain

Abstract:

Software Testing is a vital process in software development life cycle. We can attain the quality of software after passing it through software testing phase. We have tried to find out automatic test data generation techniques that are a key research area of software testing to achieve test automation that can eventually decrease testing time. In this paper, we review some of the approaches presented in the literature which use evolutionary search based algorithms like Genetic Algorithm, Particle Swarm Optimization (PSO), etc. to validate the test data generation process. We also look into the quality of test data generation which increases or decreases the efficiency of testing. We have proposed test data generation techniques for model-based testing. We have worked on tuning and fitness function of PSO algorithm.

Keywords: search based, evolutionary algorithm, particle swarm optimization, genetic algorithm, test data generation

Procedia PDF Downloads 184

10095 Genome Sequencing and Analysis of the Spontaneous Nanosilver Resistant Bacterium Proteus mirabilis Strain scdr1

Authors: Amr Saeb, Khalid Al-Rubeaan, Mohamed Abouelhoda, Manojkumar Selvaraju, Hamsa Tayeb

Abstract:

Background: P. mirabilis is a common uropathogenic bacterium that can cause major complications in patients with long-standing indwelling catheters or patients with urinary tract anomalies. In addition, P. mirabilis is a common cause of chronic osteomyelitis in diabetic foot ulcer (DFU) patients. Methodology: P. mirabilis SCDR1 was isolated from a diabetic ulcer patient. We examined P. mirabilis SCDR1 levels of resistance against nano-silver colloids, the commercial nano-silver and silver containing bandages and commonly used antibiotics. We utilized next generation sequencing techniques (NGS), bioinformatics, phylogenetic analysis and pathogenomics in the identification and characterization of the infectious pathogen. Results: P. mirabilis SCDR1 is a multi-drug resistant isolate that also showed high levels of resistance against nano-silver colloids, nano-silver chitosan composite and the commercially available nano-silver and silver bandages. The P. mirabilis-SCDR1 genome size is 3,815,621 bp with G+C content of 38.44%. P. mirabilis-SCDR1 genome contains a total of 3,533 genes, 3,414 coding DNA sequence genes, 11, 10, 18 rRNAs (5S, 16S, and 23S), and 76 tRNAs. Our isolate contains all the required pathogenicity and virulence factors to establish a successful infection. P. mirabilis SCDR1 isolate is a potential virulent pathogen that despite its original isolation site, wound, it can establish kidney infection and its associated complications. P. mirabilis SCDR1 contains several mechanisms for antibiotics and metals resistance including, biofilm formation, swarming mobility, efflux systems, and enzymatic detoxification. Conclusion: P. mirabilis SCDR1 is the spontaneous nano-silver resistant bacterial strain. P. mirabilis SCDR1 strain contains all reported pathogenic and virulence factors characteristic for the species. In addition, it possesses several mechanisms that may lead to the observed nano-silver resistance.

Keywords: Proteus mirabilis, multi-drug resistance, silver nanoparticles, resistance, next generation sequencing techniques, genome analysis, bioinformatics, phylogeny, pathogenomics, diabetic foot ulcer, xenobiotics, multidrug resistance efflux, biofilm formation, swarming mobility, resistome, glutathione S-transferase, copper/silver efflux system, altruism

Procedia PDF Downloads 326

10094 Metagenomics Analysis of Bacteria in Sorghum Using next Generation Sequencing

Authors: Kedibone Masenya, Memory Tekere, Jasper Rees

Abstract:

Sorghum is an important cereal crop in the world. In particular, it has attracted breeders due to capacity to serve as food, feed, fiber and bioenergy crop. Like any other plant, sorghum hosts a variety of microbes, which can either, have a neutral, negative and positive influence on the plant. In the current study, regions (V3/V4) of 16 S rRNA were targeted to extensively assess bacterial multitrophic interactions in the phyllosphere of sorghum. The results demonstrated that the presence of a pathogen has a significant effect on the endophytic bacterial community. Understanding these interactions is key to develop new strategies for plant protection.

Keywords: bacteria, multitrophic, sorghum, target sequencing

Procedia PDF Downloads 277

10093 The Genetic Architecture Underlying Dilated Cardiomyopathy in Singaporeans

Authors: Feng Ji Mervin Goh, Edmund Chee Jian Pua, Stuart Alexander Cook

Abstract:

Dilated cardiomyopathy (DCM) is a common cause of heart failure. Genetic mutations account for 50% of DCM cases with TTN mutations being the most common, accounting for up to 25% of DCM cases. However, the genetic architecture underlying Asian DCM patients is unknown. We evaluated 68 patients (female= 17) with DCM who underwent follow-up at the National Heart Centre, Singapore from 2013 through 2014. Clinical data were obtained and analyzed retrospectively. Genomic DNA was subjected to next-generation targeted sequencing. Nextera Rapid Capture Enrichment was used to capture the exons of a panel of 169 cardiac genes. DNA libraries were sequenced as paired-end 150-bp reads on Illumina MiSeq. Raw sequence reads were processed and analysed using standard bioinformatics techniques. The average age of onset of DCM was 46.1±10.21 years old. The average left ventricular ejection fraction (LVEF), left ventricular diastolic internal diameter (LVIDd), left ventricular systolic internal diameter (LVIDs) were 26.1±11.2%, 6.20±0.83cm, and 5.23±0.92cm respectively. The frequencies of mutations in major DCM-associated genes were as follows TTN (5.88% vs published frequency of 20%), LMNA (4.41% vs 6%), MYH7 (5.88% vs 4%), MYH6 (5.88% vs 4%), and SCN5a (4.41% vs 3%). The average callability at 10 times coverage of each major gene were: TTN (99.7%), LMNA (87.1%), MYH7 (94.8%), MYH6 (95.5%), and SCN5a (94.3%). In conclusion, TTN mutations are not common in Singaporean DCM patients. The frequencies of other major DCM-associated genes are comparable to frequencies published in the current literature.

Keywords: heart failure, dilated cardiomyopathy, genetics, next-generation sequencing

Procedia PDF Downloads 240

10092 Next Generation Sequencing Analysis of Circulating MiRNAs in Rheumatoid Arthritis and Osteoarthritis

Authors: Khalda Amr, Noha Eltaweel, Sherif Ismail, Hala Raslan

Abstract:

Introduction: Osteoarthritis is the most common form of arthritis that involves the wearing away of the cartilage that caps the bones in the joints. While rheumatoid arthritis is an autoimmune disease in which the immune system attacks the joints, beginning with the lining of joints. In this study, we aimed to study the top deregulated miRNAs that might be the cause of pathogenesis in both diseases. Methods: Eight cases were recruited in this study: 4 rheumatoid arthritis (RA), 2 osteoarthritis (OA) patients, as well as 2 healthy controls. Total RNA was isolated from plasma to be subjected to miRNA profiling by NGS. Sequencing libraries were constructed and generated using the NEBNextR UltraTM small RNA Sample Prep Kit for Illumina R (NEB, USA), according to the manufacturer’s instructions. The quality of samples were checked using fastqc and multiQC. Results were compared RA vs Controls and OA vs. Controls. Target gene prediction and functional annotation of the deregulated miRNAs were done using Mienturnet. The top deregulated miRNAs in each disease were selected for further validation using qRT-PCR. Results: The average number of sequencing reads per sample exceeded 2.2 million, of which approximately 57% were mapped to the human reference genome. The top DEMs in RA vs controls were miR-6724-5p, miR-1469, miR-194-3p (up), miR-1468-5p, miR-486-3p (down). In comparison, the top DEMs in OA vs controls were miR-1908-3p, miR-122b-3p, miR-3960 (up), miR-1468-5p, miR-15b-3p (down). The functional enrichment of the selected top deregulated miRNAs revealed the highly enriched KEGG pathways and GO terms. Six of the deregulated miRNAs (miR-15b, -128, -194, -328, -542 and -3180) had multiple target genes in the RA pathway, so they are more likely to affect the RA pathogenesis. Conclusion: Six of our studied deregulated miRNAs (miR-15b, -128, -194, -328, -542 and -3180) might be highly involved in the disease pathogenesis. Further functional studies are crucial to assess their functions and actual target genes.

Keywords: next generation sequencing, mirnas, rheumatoid arthritis, osteoarthritis

Procedia PDF Downloads 81

10091 BingleSeq: A User-Friendly R Package for Single-Cell RNA-Seq Data Analysis

Authors: Quan Gu, Daniel Dimitrov

Abstract:

BingleSeq was developed as a shiny-based, intuitive, and comprehensive application that enables the analysis of single-Cell RNA-Sequencing count data. This was achieved via incorporating three state-of-the-art software packages for each type of RNA sequencing analysis, alongside functional annotation analysis and a way to assess the overlap of differential expression method results. At its current state, the functionality implemented within BingleSeq is comparable to that of other applications, also developed with the purpose of lowering the entry requirements to RNA Sequencing analyses. BingleSeq is available on GitHub and will be submitted to R/Bioconductor.

Keywords: bioinformatics, functional annotation analysis, single-cell RNA-sequencing, transcriptomics

Procedia PDF Downloads 198

10090 Clinical Impact of Ultra-Deep Versus Sanger Sequencing Detection of Minority Mutations on the HIV-1 Drug Resistance Genotype Interpretations after Virological Failure

Authors: S. Mohamed, D. Gonzalez, C. Sayada, P. Halfon

Abstract:

Drug resistance mutations are routinely detected using standard Sanger sequencing, which does not detect minor variants with a frequency below 20%. The impact of detecting minor variants generated by ultra-deep sequencing (UDS) on HIV drug-resistance (DR) interpretations has not yet been studied. Fifty HIV-1 patients who experienced virological failure were included in this retrospective study. The HIV-1 UDS protocol allowed the detection and quantification of HIV-1 protease and reverse transcriptase variants related to genotypes A, B, C, E, F, and G. DeepChek®-HIV simplified DR interpretation software was used to compare Sanger sequencing and UDS. The total time required for the UDS protocol was found to be approximately three times longer than Sanger sequencing with equivalent reagent costs. UDS detected all of the mutations found by population sequencing and identified additional resistance variants in all patients. An analysis of DR revealed a total of 643 and 224 clinically relevant mutations by UDS and Sanger sequencing, respectively. Three resistance mutations with > 20% prevalence were detected solely by UDS: A98S (23%), E138A (21%) and V179I (25%). A significant difference in the DR interpretations for 19 antiretroviral drugs was observed between the UDS and Sanger sequencing methods. Y181C and T215Y were the most frequent mutations associated with interpretation differences. A combination of UDS and DeepChek® software for the interpretation of DR results would help clinicians provide suitable treatments. A cut-off of 1% allowed a better characterisation of the viral population by identifying additional resistance mutations and improving the DR interpretation.

Keywords: HIV-1, ultra-deep sequencing, Sanger sequencing, drug resistance

Procedia PDF Downloads 327

10089 South African Breast Cancer Mutation Spectrum: Pitfalls to Copy Number Variation Detection Using Internationally Designed Multiplex Ligation-Dependent Probe Amplification and Next Generation Sequencing Panels

Authors: Jaco Oosthuizen, Nerina C. Van Der Merwe

Abstract:

The National Health Laboratory Services in Bloemfontien has been the diagnostic testing facility for 1830 patients for familial breast cancer since 1997. From the cohort, 540 were comprehensively screened using High-Resolution Melting Analysis or Next Generation Sequencing for the presence of point mutations and/or indels. Approximately 90% of these patients stil remain undiagnosed as they are BRCA1/2 negative. Multiplex ligation-dependent probe amplification was initially added to screen for copy number variation detection, but with the introduction of next generation sequencing in 2017, was substituted and is currently used as a confirmation assay. The aim was to investigate the viability of utilizing internationally designed copy number variation detection assays based on mostly European/Caucasian genomic data for use within a South African context. The multiplex ligation-dependent probe amplification technique is based on the hybridization and subsequent ligation of multiple probes to a targeted exon. The ligated probes are amplified using conventional polymerase chain reaction, followed by fragment analysis by means of capillary electrophoresis. The experimental design of the assay was performed according to the guidelines of MRC-Holland. For BRCA1 (P002-D1) and BRCA2 (P045-B3), both multiplex assays were validated, and results were confirmed using a secondary probe set for each gene. The next generation sequencing technique is based on target amplification via multiplex polymerase chain reaction, where after the amplicons are sequenced parallel on a semiconductor chip. Amplified read counts are visualized as relative copy numbers to determine the median of the absolute values of all pairwise differences. Various experimental parameters such as DNA quality, quantity, and signal intensity or read depth were verified using positive and negative patients previously tested internationally. DNA quality and quantity proved to be the critical factors during the verification of both assays. The quantity influenced the relative copy number frequency directly whereas the quality of the DNA and its salt concentration influenced denaturation consistency in both assays. Multiplex ligation-dependent probe amplification produced false positives due to ligation failure when ligation was inhibited due to a variant present within the ligation site. Next generation sequencing produced false positives due to read dropout when primer sequences did not meet optimal multiplex binding kinetics due to population variants in the primer binding site. The analytical sensitivity and specificity for the South African population have been proven. Verification resulted in repeatable reactions with regards to the detection of relative copy number differences. Both multiplex ligation-dependent probe amplification and next generation sequencing multiplex panels need to be optimized to accommodate South African polymorphisms present within the genetically diverse ethnic groups to reduce the false copy number variation positive rate and increase performance efficiency.

Keywords: familial breast cancer, multiplex ligation-dependent probe amplification, next generation sequencing, South Africa

Procedia PDF Downloads 228

10088 CMOS Solid-State Nanopore DNA System-Level Sequencing Techniques Enhancement

Authors: Syed Islam, Yiyun Huang, Sebastian Magierowski, Ebrahim Ghafar-Zadeh

Abstract:

This paper presents system level CMOS solid-state nanopore techniques enhancement for speedup next generation molecular recording and high throughput channels. This discussion also considers optimum number of base-pair (bp) measurements through channel as an important role to enhance potential read accuracy. Effective power consumption estimation offered suitable rangeof multi-channel configuration. Nanopore bp extraction model in statistical method could contribute higher read accuracy with longer read-length (200 < read-length). Nanopore ionic current switching with Time Multiplexing (TM) based multichannel readout system contributed hardware savings.

Keywords: DNA, nanopore, amplifier, ADC, multichannel

Procedia PDF Downloads 449

10087 Discrete Breeding Swarm for Cost Minimization of Parallel Job Shop Scheduling Problem

Authors: Tarek Aboueldahab, Hanan Farag

Abstract:

Parallel Job Shop Scheduling Problem (JSP) is a multi-objective and multi constrains NP- optimization problem. Traditional Artificial Intelligence techniques have been widely used; however, they could be trapped into the local minimum without reaching the optimum solution, so we propose a hybrid Artificial Intelligence model (AI) with Discrete Breeding Swarm (DBS) added to traditional Artificial Intelligence to avoid this trapping. This model is applied in the cost minimization of the Car Sequencing and Operator Allocation (CSOA) problem. The practical experiment shows that our model outperforms other techniques in cost minimization.

Keywords: parallel job shop scheduling problem, artificial intelligence, discrete breeding swarm, car sequencing and operator allocation, cost minimization

Procedia PDF Downloads 181

10086 Full Length Transcriptome Sequencing and Differential Expression Gene Analysis of Hybrid Larch under PEG Stress

Authors: Zhang Lei, Zhao Qingrong, Wang Chen, Zhang Sufang, Zhang Hanguo

Abstract:

Larch is the main afforestation and timber tree species in Northeast China, and drought is one of the main factors limiting the growth of Larch and other organisms in Northeast China. In order to further explore the mechanism of Larch drought resistance, PEG was used to simulate drought stress. The full-length sequencing of Larch embryogenic callus under PEG simulated drought stress was carried out by combining Illumina-Hiseq and SMRT-seq. A total of 20.3Gb clean reads and 786492 CCS reads were obtained from the second and third generation sequencing. The de-redundant transcript sequences were predicted by lncRNA, 2083 lncRNAs were obtained, and the target genes were predicted, and a total of 2712 target genes were obtained. The de-redundant transcripts were further screened, and 1654 differentially expressed genes (DEGs )were obtained. Among them, different DEGs respond to drought stress in different ways, such as oxidation-reduction process, starch and sucrose metabolism, plant hormone pathway, carbon metabolism, lignin catabolic/biosynthetic process and so on. This study provides basic full-length sequencing data for the study of Larch drought resistance, and excavates a large number of DEGs in response to drought stress, which helps us to further understand the function of Larch drought resistance genes and provides a reference for in-depth analysis of the molecular mechanism of Larch drought resistance.

Keywords: larch, drought stress, full-length transcriptome sequencing, differentially expressed genes

Procedia PDF Downloads 165

10085 Implementation of CNV-CH Algorithm Using Map-Reduce Approach

Authors: Aishik Deb, Rituparna Sinha

Abstract:

We have developed an algorithm to detect the abnormal segment/"structural variation in the genome across a number of samples. We have worked on simulated as well as real data from the BAM Files and have designed a segmentation algorithm where abnormal segments are detected. This algorithm aims to improve the accuracy and performance of the existing CNV-CH algorithm. The next-generation sequencing (NGS) approach is very fast and can generate large sequences in a reasonable time. So the huge volume of sequence information gives rise to the need for Big Data and parallel approaches of segmentation. Therefore, we have designed a map-reduce approach for the existing CNV-CH algorithm where a large amount of sequence data can be segmented and structural variations in the human genome can be detected. We have compared the efficiency of the traditional and map-reduce algorithms with respect to precision, sensitivity, and F-Score. The advantages of using our algorithm are that it is fast and has better accuracy. This algorithm can be applied to detect structural variations within a genome, which in turn can be used to detect various genetic disorders such as cancer, etc. The defects may be caused by new mutations or changes to the DNA and generally result in abnormally high or low base coverage and quantification values.

Keywords: cancer detection, convex hull segmentation, map reduce, next generation sequencing

Procedia PDF Downloads 128

10084 Genomic Diversity and Relationship among Arabian Peninsula Dromedary Camels Using Full Genome Sequencing Approach

Authors: H. Bahbahani, H. Musa, F. Al Mathen

Abstract:

The dromedary camels (Camelus dromedarius) are single-humped even-toed ungulates populating the African Sahara, Arabian Peninsula, and Southwest Asia. The genome of this desert-adapted species has been minimally investigated using autosomal microsatellite and mitochondrial DNA markers. In this study, the genomes of 33 dromedary camel samples from different parts of the Arabian Peninsula were sequenced using Illumina Next Generation Sequencing (NGS) platform. These data were combined with Genotyping-by-Sequencing (GBS) data from African (Sudanese) dromedaries to investigate the genomic relationship between African and Arabian Peninsula dromedary camels. Principle Component Analysis (PCA) and average genome-wide admixture analysis were be conducted on these data to tackle the objectives of these studies. Both of the two analyses conducted revealed phylogeographic distinction between these two camel populations. However, no breed-wise genetic classification has been revealed among the African (Sudanese) camel breeds. The Arabian Peninsula camel populations also show higher heterozygosity than the Sudanese camels. The results of this study explain the evolutionary history and migration of African dromedary camels from their center of domestication in the southern Arabian Peninsula. These outputs help scientists to further understand the evolutionary history of dromedary camels, which might impact in conserving the favorable genetic of this species.

Keywords: dromedary, genotyping-by-sequencing, Arabian Peninsula, Sudan

Procedia PDF Downloads 195

10083 Microbial Dark Matter Analysis Using 16S rRNA Gene Metagenomics Sequences

Authors: Hana Barak, Alex Sivan, Ariel Kushmaro

Abstract:

Microorganisms are the most diverse and abundant life forms on Earth and account for a large portion of the Earth’s biomass and biodiversity. To date though, our knowledge regarding microbial life is lacking, as it is based mainly on information from cultivated organisms. Indeed, microbiologists have borrowed from astrophysics and termed the ‘uncultured microbial majority’ as ‘microbial dark matter’. The realization of how diverse and unexplored microorganisms are, actually stems from recent advances in molecular biology, and in particular from novel methods for sequencing microbial small subunit ribosomal RNA genes directly from environmental samples termed next-generation sequencing (NGS). This has led us to use NGS that generates several gigabases of sequencing data in a single experimental run, to identify and classify environmental samples of microorganisms. In metagenomics sequencing analysis (both 16S and shotgun), sequences are compared to reference databases that contain only small part of the existing microorganisms and therefore their taxonomy assignment may reveal groups of unknown microorganisms or origins. These unknowns, or the ‘microbial sequences dark matter’, are usually ignored in spite of their great importance. The goal of this work was to develop an improved bioinformatics method that enables more complete analyses of the microbial communities in numerous environments. Therefore, NGS was used to identify previously unknown microorganisms from three different environments (industrials wastewater, Negev Desert’s rocks and water wells at the Arava valley). 16S rRNA gene metagenome analysis of the microorganisms from those three environments produce about ~4 million reads for 75 samples. Between 0.1-12% of the sequences in each sample were tagged as ‘Unassigned’. Employing relatively simple methodology for resequencing of original gDNA samples through Sanger or MiSeq Illumina with specific primers, this study demonstrates that the mysterious ‘Unassigned’ group apparently contains sequences of candidate phyla. Those unknown sequences can be located on a phylogenetic tree and thus provide a better understanding of the ‘sequences dark matter’ and its role in the research of microbial communities and diversity. Studying this ‘dark matter’ will extend the existing databases and could reveal the hidden potential of the ‘microbial dark matter’.

Keywords: bacteria, bioinformatics, dark matter, Next Generation Sequencing, unknown

Procedia PDF Downloads 251

10082 An Analysis System for Integrating High-Throughput Transcript Abundance Data with Metabolic Pathways in Green Algae

Authors: Han-Qin Zheng, Yi-Fan Chiang-Hsieh, Chia-Hung Chien, Wen-Chi Chang

Abstract:

As the most important non-vascular plants, algae have many research applications, including high species diversity, biofuel sources, adsorption of heavy metals and, following processing, health supplements. With the increasing availability of next-generation sequencing (NGS) data for algae genomes and transcriptomes, an integrated resource for retrieving gene expression data and metabolic pathway is essential for functional analysis and systems biology in algae. However, gene expression profiles and biological pathways are displayed separately in current resources, and making it impossible to search current databases directly to identify the cellular response mechanisms. Therefore, this work develops a novel AlgaePath database to retrieve gene expression profiles efficiently under various conditions in numerous metabolic pathways. AlgaePath, a web-based database, integrates gene information, biological pathways, and next-generation sequencing (NGS) datasets in Chlamydomonasreinhardtii and Neodesmus sp. UTEX 2219-4. Users can identify gene expression profiles and pathway information by using five query pages (i.e. Gene Search, Pathway Search, Differentially Expressed Genes (DEGs) Search, Gene Group Analysis, and Co-Expression Analysis). The gene expression data of 45 and 4 samples can be obtained directly on pathway maps in C. reinhardtii and Neodesmus sp. UTEX 2219-4, respectively. Genes that are differentially expressed between two conditions can be identified in Folds Search. Furthermore, the Gene Group Analysis of AlgaePath includes pathway enrichment analysis, and can easily compare the gene expression profiles of functionally related genes in a map. Finally, Co-Expression Analysis provides co-expressed transcripts of a target gene. The analysis results provide a valuable reference for designing further experiments and elucidating critical mechanisms from high-throughput data. More than an effective interface to clarify the transcript response mechanisms in different metabolic pathways under various conditions, AlgaePath is also a data mining system to identify critical mechanisms based on high-throughput sequencing.

Keywords: next-generation sequencing (NGS), algae, transcriptome, metabolic pathway, co-expression

Procedia PDF Downloads 401

10081 Performance Analysis of Shunt Active Power Filter for Various Reference Current Generation Techniques

Authors: Vishal V. Choudhari, Gaurao A. Dongre, S. P. Diwan

Abstract:

A number of reference current generation have been developed for analysis of shunt active power filter to mitigate the load compensation. Depending upon the type of load the technique has to be chosen. In this paper, six reference current generation techniques viz. instantaneous reactive power theory(IRP), Synchronous reference frame theory(SRF), Perfect harmonic cancellation(PHC), Unity power factor method(UPF), Self-tuning filter method(STF), Predictive filtering method(PFM) are compared for different operating conditions. The harmonics are introduced because of non-linear loads in the system. These harmonics are eliminated using above techniques. The results and performance of system simulated on MATLAB/Simulink platform. The system is experimentally implemented using DS1104 card of dSPACE system.

Keywords: SAPF, power quality, THD, IRP, SRF, dSPACE module DS1104

Procedia PDF Downloads 586

10080 Genomics of Adaptation in the Sea

Authors: Agostinho Antunes

Abstract:

The completion of the human genome sequencing in 2003 opened a new perspective into the importance of whole genome sequencing projects, and currently multiple species are having their genomes completed sequenced, from simple organisms, such as bacteria, to more complex taxa, such as mammals. This voluminous sequencing data generated across multiple organisms provides also the framework to better understand the genetic makeup of such species and related ones, allowing to explore the genetic changes underlining the evolution of diverse phenotypic traits. Here, recent results from our group retrieved from comparative evolutionary genomic analyses of selected marine animal species will be considered to exemplify how gene novelty and gene enhancement by positive selection might have been determinant in the success of adaptive radiations into diverse habitats and lifestyles.

Keywords: marine genomics, evolutionary bioinformatics, human genome sequencing, genomic analyses

Procedia PDF Downloads 605

10079 Liquid Biopsy Based Microbial Biomarker in Coronary Artery Disease Diagnosis

Authors: Eyup Ozkan, Ozkan U. Nalbantoglu, Aycan Gundogdu, Mehmet Hora, A. Emre Onuk

Abstract:

The human microbiome has been associated with cardiological conditions and this relationship is becoming to be defined beyond the gastrointestinal track. In this study, we investigate the alteration in circulatory microbiota in the context of Coronary Artery Disease (CAD). We received circulatory blood samples from suspected CAD patients and maintain 16S ribosomal RNA sequencing to identify each patient’s microbiome. It was found that Corynebacterium and Methanobacteria genera show statistically significant differences between healthy and CAD patients. The overall biodiversities between the groups were observed to be different revealed by machine learning classification models. We also achieve and demonstrate the performance of a diagnostic method using circulatory blood microbiome-based estimation.

Keywords: coronary artery disease, blood microbiome, machine learning, angiography, next-generation sequencing

Procedia PDF Downloads 146

10078 Enzymatic Repair Prior To DNA Barcoding, Aspirations, and Restraints

Authors: Maxime Merheb, Rachel Matar

Abstract:

Retrieving ancient DNA sequences which in return permit the entire genome sequencing from fossils have extraordinarily improved in recent years, thanks to sequencing technology and other methodological advances. In any case, the quest to search for ancient DNA is still obstructed by the damage inflicted on DNA which accumulates after the death of a living organism. We can characterize this damage into three main categories: (i) Physical abnormalities such as strand breaks which lead to the presence of short DNA fragments. (ii) Modified bases (mainly cytosine deamination) which cause errors in the sequence due to an incorporation of a false nucleotide during DNA amplification. (iii) DNA modifications referred to as blocking lesions, will halt the PCR extension which in return will also affect the amplification and sequencing process. We can clearly see that the issues arising from breakage and coding errors were significantly decreased in recent years. Fast sequencing of short DNA fragments was empowered by platforms for high-throughput sequencing, most of the coding errors were uncovered to be the consequences of cytosine deamination which can be easily removed from the DNA using enzymatic treatment. The methodology to repair DNA sequences is still in development, it can be basically explained by the process of reintroducing cytosine rather than uracil. This technique is thus restricted to amplified DNA molecules. To eliminate any type of damage (particularly those that block PCR) is a process still pending the complete repair methodologies; DNA detection right after extraction is highly needed. Before using any resources into extensive, unreasonable and uncertain repair techniques, it is vital to distinguish between two possible hypotheses; (i) DNA is none existent to be amplified to begin with therefore completely un-repairable, (ii) the DNA is refractory to PCR and it is worth to be repaired and amplified. Hence, it is extremely important to develop a non-enzymatic technique to detect the most degraded DNA.

Keywords: ancient DNA, DNA barcodong, enzymatic repair, PCR

Procedia PDF Downloads 398

10077 Applying Massively Parallel Sequencing to Forensic Soil Bacterial Profiling

Authors: Hui Li, Xueying Zhao, Ke Ma, Yu Cao, Fan Yang, Qingwen Xu, Wenbin Liu

Abstract:

Soil can often link a person or item to a crime scene, which makes it a valuable evidence in forensic casework. Several techniques have been utilized in forensic soil discrimination in previous studies. Because soil contains a vast number of microbiomes, the analyse of soil microbiomes is expected to be a potential way to characterise soil evidence. In this study, we applied massively parallel sequencing (MPS) to soil bacterial profiling on the Ion Torrent Personal Genome Machine (PGM). Soils from different regions were collected repeatedly. V-region 3 and 4 of Bacterial 16S rRNA gene were detected by MPS. Operational taxonomic units (OTU, 97%) were used to analyse soil bacteria. Several bioinformatics methods (PCoA, NMDS, Metastats, LEfse, and Heatmap) were applied in bacterial profiles. Our results demonstrate that MPS can provide a more detailed picture of the soil microbiomes and the composition of soil bacterial components from different region was individualistic. In conclusion, the utility of soil bacterial profiling via MPS of the 16S rRNA gene has potential value in characterising soil evidences and associating them with their place of origin, which can play an important role in forensic science in the future.

Keywords: bacterial profiling, forensic, massively parallel sequencing, soil evidence

Procedia PDF Downloads 556

10076 Comparison and Validation of a dsDNA biomimetic Quality Control Reference for NGS based BRCA CNV analysis versus MLPA

Authors: A. Delimitsou, C. Gouedard, E. Konstanta, A. Koletis, S. Patera, E. Manou, K. Spaho, S. Murray

Abstract:

Background: There remains a lack of International Standard Control Reference materials for Next Generation Sequencing-based approaches or device calibration. We have designed and validated dsDNA biomimetic reference materials for targeted such approaches incorporating proprietary motifs (patent pending) for device/test calibration. They enable internal single-sample calibration, alleviating sample comparisons to pooled historical population-based data assembly or statistical modelling approaches. We have validated such an approach for BRCA Copy Number Variation analytics using iQRS™-CNVSUITE versus Mixed Ligation-dependent Probe Amplification. Methods: Standard BRCA Copy Number Variation analysis was compared between mixed ligation-dependent probe amplification and next generation sequencing using a cohort of 198 breast/ovarian cancer patients. Next generation sequencing based copy number variation analysis of samples spiked with iQRS™ dsDNA biomimetics were analysed using proprietary CNVSUITE software. Mixed ligation-dependent probe amplification analyses were performed on an ABI-3130 Sequencer and analysed with Coffalyser software. Results: Concordance of BRCA – copy number variation events for mixed ligation-dependent probe amplification and CNVSUITE indicated an overall sensitivity of 99.88% and specificity of 100% for iQRS™-CNVSUITE. The negative predictive value of iQRS-CNVSUITE™ for BRCA was 100%, allowing for accurate exclusion of any event. The positive predictive value was 99.88%, with no discrepancy between mixed ligation-dependent probe amplification and iQRS™-CNVSUITE. For device calibration purposes, precision was 100%, spiking of patient DNA demonstrated linearity to 1% (±2.5%) and range from 100 copies. Traditional training was supplemented by predefining the calibrator to sample cut-off (lock-down) for amplicon gain or loss based upon a relative ratio threshold, following training of iQRS™-CNVSUITE using spiked iQRS™ calibrator and control mocks. BRCA copy number variation analysis using iQRS™-CNVSUITE™ was successfully validated and ISO15189 accredited and now enters CE-IVD performance evaluation. Conclusions: The inclusion of a reference control competitor (iQRS™ dsDNA mimetic) to next generation sequencing-based sequencing offers a more robust sample-independent approach for the assessment of copy number variation events compared to mixed ligation-dependent probe amplification. The approach simplifies data analyses, improves independent sample data analyses, and allows for direct comparison to an internal reference control for sample-specific quantification. Our iQRS™ biomimetic reference materials allow for single sample copy number variation analytics and further decentralisation of diagnostics to single patient sample assessment.

Keywords: validation, diagnostics, oncology, copy number variation, reference material, calibration

Procedia PDF Downloads 62

10075 Metagenomic analysis of Irish cattle faecal samples using Oxford Nanopore MinION Next Generation Sequencing

Authors: Niamh Higgins, Dawn Howard

Abstract:

The Irish agri-food sector is of major importance to Ireland’s manufacturing sector and to the Irish economy through employment and the exporting of animal products worldwide. Infectious diseases and parasites have an impact on farm animal health causing profitability and productivity to be affected. For the sustainability of Irish dairy farming, there must be the highest standard of animal health. There can be a lack of information in accounting for > 1% of complete microbial diversity in an environment. There is the tendency of culture-based methods of microbial identification to overestimate the prevalence of species which grow easily on an agar surface. There is a need for new technologies to address these issues to assist with animal health. Metagenomic approaches provide information on both the whole genome and transcriptome present through DNA sequencing of total DNA from environmental samples producing high determination of functional and taxonomic information. Nanopore Next Generation Technologies have the ability to be powerful sequencing technologies. They provide high throughput, low material requirements and produce ultra-long reads, simplifying the experimental process. The aim of this study is to use a metagenomics approach to analyze dairy cattle faecal samples using the Oxford Nanopore MinION Next Generation Sequencer and to establish an in-house pipeline for metagenomic characterization of complex samples. Faecal samples will be obtained from Irish dairy farms, DNA extracted and the MinION will be used for sequencing, followed by bioinformatics analysis. Of particular interest, will be the parasite Buxtonella sulcata, which there has been little research on and which there is no research on its presence on Irish dairy farms. Preliminary results have shown the ability of the MinION to produce hundreds of reads in a relatively short time frame of eight hours. The faecal samples were obtained from 90 dairy cows on a Galway farm. The results from Oxford Nanopore ‘What’s in my pot’ (WIMP) using the Epi2me workflow, show that from a total of 926 classified reads, 87% were from the Kingdom Bacteria, 10% were from the Kingdom Eukaryota, 3% were from the Kingdom Archaea and < 1% were from the Kingdom Viruses. The most prevalent bacteria were those from the Genus Acholeplasma (71 reads), Bacteroides (35 reads), Clostridium (33 reads), Acinetobacter (20 reads). The most prevalent species present were those from the Genus Acholeplasma and included Acholeplasma laidlawii (39 reads) and Acholeplasma brassicae (26 reads). The preliminary results show the ability of the MinION for the identification of microorganisms to species level coming from a complex sample. With ongoing optimization of the pipe-line, the number of classified reads are likely to increase. Metagenomics has the potential in animal health for diagnostics of microorganisms present on farms. This would support wprevention rather than a cure approach as is outlined in the DAFMs National Farmed Animal Health Strategy 2017-2022.

Keywords: animal health, buxtonella sulcata, infectious disease, irish dairy cattle, metagenomics, minION, next generation sequencing

Procedia PDF Downloads 145

10074 Genome-Wide Identification of Genes Resistance to Nitric Oxide in Vibrio parahaemolyticus

Authors: Yantao Li, Jun Zheng

Abstract:

Food poison caused by consumption of contaminated food, especially seafood, is one of most serious public health threats worldwide. Vibrio parahaemolyticus is emerging bacterial pathogen and the leading cause of human gastroenteritis associated with food poison, especially in the southern coastal region of China. To successfully cause disease in host, bacterial pathogens need to overcome the host-derived stresses encountered during infection. One of the toxic chemical species elaborated by the host is nitric oxide (NO). NO is generated by acidified nitrite in the stomach and by enzymes of the inducible NO synthase (iNOS) in the host cell, and is toxic to bacteria. Bacterial pathogens have evolved some mechanisms to battle with this toxic stress. Such mechanisms include genes to sense NO produced from immune system and activate others to detoxify NO toxicity, and genes to repair the damage caused by toxic reactive nitrogen species (RNS) generated during NO toxic stress. However, little is known about the NO resistance in V. parahaemolyticus. In this study, a transposon coupled with next generation sequencing (Tn-seq) technology will be utilized to identify genes for NO resistance in V. parahaemolyticus. Our strategy will include construction the saturating transposon insertion library, transposon library challenging with NO, next generation sequencing (NGS), bioinformatics analysis and verification of the identified genes in vitro and in vivo.

Keywords: vibrio parahaemolyticus, nitric oxide, tn-seq, virulence

Procedia PDF Downloads 263