Search results for: de novo transcriptome sequencing
611 Scalable and Accurate Detection of Pathogens from Whole-Genome Shotgun Sequencing
Authors: Janos Juhasz, Sandor Pongor, Balazs Ligeti
Abstract:
Next-generation sequencing, especially whole-genome shotgun sequencing, is becoming a common approach to gain insight into microbiomes in a culture-independent way, even in clinical practice. It not only provides information about the species composition of an environmental sample but also opens the possibility of detecting antimicrobial resistance and novel, or currently unknown, pathogens. Accurately and reliably detecting microbial strains is a challenging task. Here we present a sensitive approach for detecting pathogens in metagenomic samples, with special regard to detecting novel variants of known pathogens. We have developed a pipeline that uses fast short-read aligner programs (i.e., Bowtie2/BWA) and comprehensive nucleotide databases. Taxonomic binning is based on the lowest common ancestor (LCA) principle; each read is assigned to a taxon covering the most significantly hit taxa. This approach helps balance sensitivity and running time. The program was tested on both experimental and synthetic data. The results indicate that our method performs as well as the state-of-the-art BLAST-based ones and, in some cases, even better, while running two orders of magnitude faster. It is sensitive and capable of identifying taxa present only in small abundance. Moreover, it needs two orders of magnitude fewer reads to complete the identification than MetaPhlAn2 does. We analyzed an experimental anthrax dataset (B. anthracis strain BA104). The majority of the reads (96.50%) were classified as Bacillus anthracis; a small portion, 1.2%, was classified as other species from the Bacillus genus. We demonstrate that the evaluation of high-throughput sequencing data is feasible in a reasonable time with good classification accuracy.
Keywords: metagenomics, taxonomy binning, pathogens, microbiome, B. anthracis
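The LCA binning principle mentioned in this abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the score cutoff `min_score_fraction`, and the root-to-leaf lineage lists are assumptions chosen only for illustration, and alignment scores are assumed to be positive.

```python
def lineage_lca(lineages):
    """Return the deepest taxon shared by all lineages.

    Each lineage is a root-to-leaf list of taxon names (hypothetical format).
    """
    lca = None
    for ranks in zip(*lineages):
        # Walk down the ranks while every lineage agrees.
        if all(name == ranks[0] for name in ranks):
            lca = ranks[0]
        else:
            break
    return lca


def assign_read(hits, min_score_fraction=0.9):
    """Assign one read to a taxon from its alignment hits.

    hits: list of (alignment_score, lineage) tuples for a single read.
    Only hits scoring at least min_score_fraction of the best score are
    kept (an assumed cutoff); the read is assigned to their LCA.
    """
    if not hits:
        return None
    best = max(score for score, _ in hits)
    kept = [lineage for score, lineage in hits
            if score >= min_score_fraction * best]
    return lineage_lca(kept)
```

A read that hits two Bacillus species equally well is placed at the genus level, while a read with a single strong hit keeps its species-level label.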
Procedia PDF Downloads 137
610 Analysis of the Lung Microbiome in Cystic Fibrosis Patients Using 16S Sequencing
Authors: Manasvi Pinnaka, Brianna Chrisman
Abstract:
Cystic fibrosis patients often develop lung infections that range anywhere in severity from mild to life-threatening due to the presence of thick and sticky mucus that fills their airways. Since many of these infections are chronic, they not only affect a patient’s ability to breathe but also increase the chances of mortality by respiratory failure. With a publicly available dataset of DNA sequences from bacterial species in the lung microbiome of cystic fibrosis patients, the correlations between different microbial species in the lung and the extent of deterioration of lung function were investigated. 16S sequencing technologies were used to determine the microbiome composition of the samples in the dataset. For the statistical analyses, referencing helped distinguish between taxonomies, and the proportions of certain taxa relative to another were determined. It was found that the Fusobacterium, Actinomyces, and Leptotrichia microbial types all had a positive correlation with the FEV1 score, indicating the potential displacement of these species by pathogens as the disease progresses. However, the dominant pathogens themselves, including Pseudomonas aeruginosa and Staphylococcus aureus, did not have statistically significant negative correlations with the FEV1 score as described by past literature. Examining the lung microbiology of cystic fibrosis patients can help with the prediction of the current condition of lung function, with the potential to guide doctors when designing personalized treatment plans for patients.
Keywords: bacterial infections, cystic fibrosis, lung microbiome, 16S sequencing
Procedia PDF Downloads 99
609 Molecular Detection of mRNA bcr-abl and Circulating Leukemic Stem Cells CD34+ in Patients with Acute Lymphoblastic Leukemia and Chronic Myeloid Leukemia and Its Association with Clinical Parameters
Authors: B. Gonzalez-Yebra, H. Barajas, P. Palomares, M. Hernandez, O. Torres, M. Ayala, A. L. González, G. Vazquez-Ortiz, M. L. Guzman
Abstract:
Leukemia arises by molecular alterations of the normal hematopoietic stem cell (HSC), transforming it into a leukemic stem cell (LSC) with high cell proliferation, self-renewal, and cell differentiation. Chronic myeloid leukemia (CML) originates from an LSC, leading to elevated proliferation of myeloid cells, and acute lymphoblastic leukemia (ALL) originates from an LSC, leading to elevated proliferation of lymphoid cells. In both cases, LSC can be identified by multicolor flow cytometry using several antibodies. However, to date, LSC levels in peripheral blood (PB) are not well established in ALL and CML patients. On the other hand, the detection of minimal residual disease (MRD) in leukemia is mainly based on the identification of mRNA bcr-abl in CML patients and some other genes in ALL patients. There is no suitable biomarker to detect MRD in both types of leukemia. The objective of this study was to determine mRNA bcr-abl and the percentage of LSC in the peripheral blood of patients with CML and ALL and to identify a possible association between the amount of LSC in PB and clinical data. We included 19 patients with leukemia in this study. A PB sample was collected from each patient, and leukocytes were obtained by Ficoll gradient. The immunophenotype for LSC CD34+ was determined by flow cytometry analysis with CD33, CD2, CD14, CD16, CD64, HLA-DR, CD13, CD15, CD19, CD10, CD20, CD34, CD38, CD71, CD90, CD117, and CD123 monoclonal antibodies. In addition, to identify the presence of mRNA bcr-abl by RT-PCR, the RNA was isolated using TRIZOL reagent. Molecular (presence of mRNA bcr-abl and LSC CD34+) and clinical results were analyzed with descriptive statistics, and a multiple regression analysis was performed to determine statistically significant associations. In total, 19 patients (8 patients with ALL and 11 patients with CML) were analyzed, 9 patients with de novo leukemia (ALL = 6 and CML = 3) and 10 under treatment (ALL = 5 and CML = 5).
The overall frequency of mRNA bcr-abl was 31% (6/19); it was negative in ALL patients and positive in 80% of CML patients. On the other hand, LSC was detected in 16/19 leukemia patients (%LSC = 0.02-17.3). The de novo patients had a higher percentage of LSC (0.26 to 17.3%) than patients under treatment (0 to 5.93%). The variables significantly associated with the amount of LSC were absence of treatment, absence of splenomegaly, and a lower number of leukocytes; a negative association was found for the clinical variables age, sex, blasts, and mRNA bcr-abl. In conclusion, patients with de novo leukemia had a higher percentage of circulating LSC than patients under treatment, and this was associated with clinical parameters such as lack of treatment, absence of splenomegaly, and a lower number of leukocytes. The mRNA bcr-abl detection was only possible in the series of patients with CML, and LSC could be identified in the peripheral blood of all leukemia patients. We believe the identification of circulating LSC may be used as a biomarker for the detection of MRD in leukemia patients.
Keywords: stem cells, leukemia, biomarkers, flow cytometry
Procedia PDF Downloads 356
608 Impact of Totiviridae L-A dsRNA Virus on Saccharomyces Cerevisiae Host: Transcriptomic and Proteomic Approach
Authors: Juliana Lukša, Bazilė Ravoitytė, Elena Servienė, Saulius Serva
Abstract:
Totiviridae L-A virus is a persistent Saccharomyces cerevisiae dsRNA virus. It encodes the major structural capsid protein Gag and the Gag-Pol fusion protein, responsible for virus replication and encapsulation. These features also enable the copying of satellite dsRNAs (called M dsRNAs) encoding a secreted toxin (known as killer toxin) and immunity to it. The viral capsid pore presumably functions in nucleotide uptake and viral mRNA release. During cell division, sporogenesis, and cell fusion, the virions remain intracellular and are transferred to daughter cells. By employing high-throughput RNA sequencing data analysis, we describe the influence of the L-A virus alone on the expression of genes in three different S. cerevisiae hosts. We provide new insight into Totiviridae L-A virus-related transcriptional regulation, encompassing multiple bioinformatics analyses. Transcriptional responses to L-A infection were similar to those induced upon stress or availability of nutrients. This study also delves into the connection between the cell metabolism and L-A virus-conferred demands on the host transcriptome by uncovering host proteins that may be associated with intact virions. To better understand the virus-host interaction, we applied differential proteomic analysis of virus particle-enriched fractions of yeast strains that either harbor the complete killer system (L-A-lus and M-2 virus), are M-2 depleted, or are virus-free. Our analysis resulted in the identification of host proteins associated with structural proteins of the virus (Gag and Gag-Pol). This research was funded by the European Social Fund under the No. 09.3.3-LMT-K-712-19-0157 “Development of Competences of Scientists, other Researchers, and Students through Practical Research Activities” measure.
Keywords: totiviridae, killer virus, proteomics, transcriptomics
Procedia PDF Downloads 146
607 Microbial Dark Matter Analysis Using 16S rRNA Gene Metagenomics Sequences
Authors: Hana Barak, Alex Sivan, Ariel Kushmaro
Abstract:
Microorganisms are the most diverse and abundant life forms on Earth and account for a large portion of the Earth’s biomass and biodiversity. To date, though, our knowledge regarding microbial life is lacking, as it is based mainly on information from cultivated organisms. Indeed, microbiologists have borrowed from astrophysics and termed the ‘uncultured microbial majority’ ‘microbial dark matter’. The realization of how diverse and unexplored microorganisms are actually stems from recent advances in molecular biology, and in particular from novel methods for sequencing microbial small subunit ribosomal RNA genes directly from environmental samples, termed next-generation sequencing (NGS). This has led us to use NGS, which generates several gigabases of sequencing data in a single experimental run, to identify and classify environmental samples of microorganisms. In metagenomics sequencing analysis (both 16S and shotgun), sequences are compared to reference databases that contain only a small part of the existing microorganisms, and therefore their taxonomy assignment may reveal groups of unknown microorganisms or origins. These unknowns, or the ‘microbial sequences dark matter’, are usually ignored in spite of their great importance. The goal of this work was to develop an improved bioinformatics method that enables more complete analyses of the microbial communities in numerous environments. Therefore, NGS was used to identify previously unknown microorganisms from three different environments (industrial wastewater, Negev Desert rocks, and water wells in the Arava valley). 16S rRNA gene metagenome analysis of the microorganisms from those three environments produced ~4 million reads for 75 samples. Between 0.1% and 12% of the sequences in each sample were tagged as ‘Unassigned’.
Employing a relatively simple methodology for resequencing of original gDNA samples through Sanger or Illumina MiSeq with specific primers, this study demonstrates that the mysterious ‘Unassigned’ group apparently contains sequences of candidate phyla. Those unknown sequences can be located on a phylogenetic tree and thus provide a better understanding of the ‘sequences dark matter’ and its role in the research of microbial communities and diversity. Studying this ‘dark matter’ will extend the existing databases and could reveal the hidden potential of the ‘microbial dark matter’.
Keywords: bacteria, bioinformatics, dark matter, Next Generation Sequencing, unknown
Procedia PDF Downloads 257
606 Bioinformatics Approach to Support Genetic Research in Autism in Mali
Authors: M. Kouyate, M. Sangare, S. Samake, S. Keita, H. G. Kim, D. H. Geschwind
Abstract:
Background & Objectives: Human genetic studies can be expensive, even unaffordable, in developing countries, partly due to the sequencing costs. Our aim is to pilot the use of bioinformatics tools to guide scientifically valid, locally relevant, and economically sound autism genetic research in Mali. Methods: The following databases, NCBI, HGMD, and LSDB, were used to identify hot point mutations. Phenotype, transmission pattern, theoretical protein expression in the brain, and the impact of the mutation on the 3D structure of the protein were used to prioritize selected autism genes. We used the protein database, Modeller, and Clustal W. Results: We found Mef2c (Gly27Ala/Leu38Gln), Pten (Thr131Ile), Prodh (Leu289Met), Nme1 (Ser120Gly), and Dhcr7 (Pro227Thr/Glu224Lys). These mutations were associated with endonucleases BseRI, NspI, PfrJS2IV, BspGI, BsaBI, and SpoDI, respectively. Gly27Ala/Leu38Gln mutations impacted the 3D structure of the Mef2c protein. Mef2c protein sequences across species showed a high percentage of similarity with a highly conserved MADS domain. Discussion: Mef2c, Pten, Prodh, Nme1, and Dhcr7 gene mutation frequencies in the Malian population will be very informative. PCR coupled with restriction enzyme digestion can be used to screen the targeted gene mutations. Sanger sequencing will be used for confirmation only. This will considerably cut down the sequencing cost for gene-to-gene mutation screening. The knowledge of the 3D structure and potential impact of the mutations on Mef2c protein informed the protein family and altered function (e.g., Leu38Gln). Conclusion & Future Work: Bioinformatics will positively impact autism research in Mali. Our approach can be applied to other neuropsychiatric disorders.
Keywords: bioinformatics, endonucleases, autism, Sanger sequencing, point mutations
Procedia PDF Downloads 83
605 Tip60’s Novel RNA-Binding Function Modulates Alternative Splicing of Pre-mRNA Targets Implicated in Alzheimer’s Disease
Authors: Felice Elefant, Akanksha Bhatnaghar, Keegan Krick, Elizabeth Heller
Abstract:
Context: The severity of Alzheimer’s Disease (AD) progression involves an interplay of genetics, age, and environmental factors orchestrated by histone acetyltransferase (HAT) mediated neuroepigenetic mechanisms. While disruption of Tip60 HAT action in neural gene control is implicated in AD, alternative mechanisms underlying Tip60 function remain unexplored. Altered RNA splicing has recently been highlighted as a widespread hallmark in the AD transcriptome that is implicated in the disease. Research Aim: The aim of this study was to identify a novel RNA binding/splicing function for Tip60 in human hippocampus and to determine whether it is impaired in brains from AD fly models and AD patients. Methodology/Analysis: The authors performed RNA immunoprecipitation using RNA isolated from 200 pooled wild-type Drosophila brains for each of the 3 biological replicates. To identify Tip60’s RNA targets, they performed genome sequencing (DNB-Sequencing™ technology, BGI genomics) on 3 replicates for Input RNA and RNA IPs by Tip60. Findings: The authors' transcriptomic analysis of RNA bound to Tip60 by Tip60-RNA immunoprecipitation (RIP) revealed Tip60 RNA targets enriched for critical neuronal processes implicated in AD. Remarkably, 79% of Tip60’s RNA targets overlap with its chromatin gene targets, supporting a model by which Tip60 orchestrates bi-level transcriptional regulation at both the chromatin and RNA level, a function unprecedented for any HAT to date. Since RNA splicing occurs co-transcriptionally and splicing defects are implicated in AD, the authors investigated whether Tip60-RNA targeting modulates splicing decisions and whether this function is altered in AD. Replicate multivariate analysis of transcript splicing (rMATS) of RNA-Seq data sets from wild-type and AD fly brains revealed a multitude of mammalian-like alternative splicing (AS) defects.
Strikingly, over half of these altered RNAs were bona fide Tip60-RNA targets enriched in the curated AD-gene database, with some AS alterations prevented by increasing Tip60 in the fly brain. Importantly, human orthologs of several Tip60-modulated spliced genes in Drosophila are well-characterized aberrantly spliced genes in human AD brains, implicating disruption of Tip60’s splicing function in AD pathogenesis. Theoretical Importance: The authors' findings support a novel RNA interaction and splicing regulatory function for Tip60 that may underlie AS impairments that hallmark AD etiology. Data Collection: The authors collected data from RNA immunoprecipitation experiments using RNA isolated from 200 pooled wild-type Drosophila brains for each of the 3 biological replicates. They also performed genome sequencing (DNB-Sequencing™ technology, BGI genomics) on 3 replicates for Input RNA and RNA IPs by Tip60. Questions: The question addressed by this study was whether Tip60 has a novel RNA binding/splicing function in human hippocampus and whether this function is impaired in brains from AD fly models and AD patients. Conclusions: The authors' findings support a novel RNA interaction and splicing regulatory function for Tip60 that may underlie AS impairments that hallmark AD etiology.
Keywords: Alzheimer's disease, cognition, aging, neuroepigenetics
Procedia PDF Downloads 76
604 Transcriptomine: The Nuclear Receptor Signaling Transcriptome Database
Authors: Scott A. Ochsner, Christopher M. Watkins, Apollo McOwiti, David L. Steffen, Lauren B. Becnel, Neil J. McKenna
Abstract:
Understanding signaling by nuclear receptors (NRs) requires an appreciation of their cognate ligand- and tissue-specific transcriptomes. While target gene regulation data are abundant in this field, they reside in hundreds of discrete publications in formats refractory to routine query and analysis and, accordingly, their full value to the NR signaling community has not been realized. One of the mandates of the Nuclear Receptor Signaling Atlas (NURSA) is to facilitate access of the community to existing public datasets. Pursuant to this mandate, we are developing a freely accessible community web resource, Transcriptomine, to bring together the sum total of available expression array and RNA-Seq data points generated by the field in a single location. Transcriptomine currently contains over 25,000,000 gene fold change datapoints from over 1200 contrasts relevant to over 100 NRs, ligands and coregulators in over 200 tissues and cell lines. Transcriptomine is designed to accommodate a spectrum of end users ranging from the bench researcher to those with advanced bioinformatic training. Visualization tools allow users to build custom charts to compare and contrast patterns of gene regulation across different tissues and in response to different ligands. Our resource affords an entirely new paradigm for leveraging gene expression data in the NR signaling field, empowering users to query gene fold changes across diverse regulatory molecules, tissues and cell lines, target genes, biological functions and disease associations, a task that would otherwise be prohibitive in terms of time and effort. Transcriptomine will be regularly updated with gene lists from future genome-wide expression array and expression-sequencing datasets in the NR signaling field.
Keywords: target gene database, informatics, gene expression, transcriptomics
Procedia PDF Downloads 273
603 Development and Performance of Aerobic Granular Sludge at Elevated Temperature
Authors: Mustafa M. Bob, Siti Izaidah Azmi, Mohd Hakim Ab Halim, Nur Syahida Abdul Jamal, Aznah Nor-Anuar, Zaini Ujang
Abstract:
In this research, the formation and development of aerobic granular sludge (AGS) for domestic wastewater treatment application in hot climate conditions was studied using a sequencing batch reactor (SBR). The performance of the developed AGS in the removal of organic matter and nutrients from wastewater was also investigated. The operation of the reactor was based on the sequencing batch system with a complete cycle time of 3 hours that included feeding, aeration, settling, discharging and idling. The reactor was seeded with sludge collected from the municipal wastewater treatment plant in Madinah city, Saudi Arabia and operated at a temperature of 40 °C using synthetic wastewater as influent. Results showed that granular sludge was developed after an operation period of 30 days. The developed granular sludge had a good settling ability with the average size of the granules ranging from 1.03 to 2.42 mm. The removal efficiencies of chemical oxygen demand (COD), ammonia nitrogen (NH3-N) and total phosphorus (TP) were 87.31%, 91.93% and 61.25%, respectively. These results show that AGS can be developed at elevated temperatures and that it is a promising technique to treat domestic wastewater in hot and low humidity climate conditions such as those encountered in Saudi Arabia.
Keywords: aerobic granular sludge, hot climate, sequencing batch reactor, domestic wastewater treatment
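The removal efficiencies quoted in this abstract follow the usual definition: the percentage drop from influent to effluent concentration. A minimal Python sketch is given below; the function name and the example concentrations used with it are hypothetical and are not taken from the study.

```python
def removal_efficiency(influent, effluent):
    """Percent removal of a pollutant between influent and effluent.

    Both concentrations must use the same unit (e.g. mg/L).
    """
    if influent <= 0:
        raise ValueError("influent concentration must be positive")
    return 100.0 * (influent - effluent) / influent
```

For instance, a hypothetical influent COD of 200 mg/L reduced to 20 mg/L in the effluent corresponds to `removal_efficiency(200, 20)`, a 90% removal.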
Procedia PDF Downloads 358
602 Pollutants Removal from Synthetic Wastewater by the Combined Electrochemical Sequencing Batch Reactor
Authors: Amin Mojiri, Akiyoshi Ohashi, Tomonori Kindaichi
Abstract:
Synthetic domestic wastewater was treated by combining treatment methods, including electrochemical oxidation, adsorption, and a sequencing batch reactor (SBR). In the upper part of the reactor, an anode and a cathode (Ti/RuO2-IrO2) were arranged in parallel for the electrochemical oxidation procedure. Sodium sulfate (Na2SO4) at a concentration of 2.5 g/L was applied as the electrolyte. The voltage and current were fixed at 7.50 V and 0.40 A, respectively. Then, 15% of the working volume of the reactor was filled with activated sludge, and the remaining 85% was filled with synthetic wastewater. Powdered cockleshell, 1.5 g/L, was added to the reactor for ion exchange. Response surface methodology was employed for statistical analysis. Reaction time (h) and pH were considered as independent factors. A total of 97.0% of biochemical oxygen demand, 99.9% of phosphorus and 88.6% of cadmium were eliminated at the optimum reaction time (80.0 min) and pH (6.4).
Keywords: adsorption, electrochemical oxidation, metals, SBR
Procedia PDF Downloads 210
601 Liquid Biopsy Based Microbial Biomarker in Coronary Artery Disease Diagnosis
Authors: Eyup Ozkan, Ozkan U. Nalbantoglu, Aycan Gundogdu, Mehmet Hora, A. Emre Onuk
Abstract:
The human microbiome has been associated with cardiological conditions, and this relationship is beginning to be defined beyond the gastrointestinal tract. In this study, we investigated the alteration in circulatory microbiota in the context of Coronary Artery Disease (CAD). We received circulatory blood samples from suspected CAD patients and performed 16S ribosomal RNA sequencing to identify each patient’s microbiome. It was found that the Corynebacterium and Methanobacteria genera show statistically significant differences between healthy individuals and CAD patients. The overall biodiversities of the groups were observed to be different, as revealed by machine learning classification models. We also demonstrate the performance of a diagnostic method using circulatory blood microbiome-based estimation.
Keywords: coronary artery disease, blood microbiome, machine learning, angiography, next-generation sequencing
Procedia PDF Downloads 156
600 To Study the Performance of FMS under Different Manufacturing Strategies
Authors: Mohammed Ali
Abstract:
A flexible manufacturing system has been studied under different manufacturing strategies. The aim of this paper is to test the impact of the number of pallets and routing flexibility (design strategy) on system performance under different sequencing and dispatching rules (control strategies) at an unbalanced load condition (planning strategy). A computer simulation model is developed to evaluate the effects of the aforementioned strategies on the make-span time, which is taken as the system performance measure. The impact of the number of pallets is shown at different levels of routing flexibility. In this paper, the same manufacturing system is modeled under different combinations of sequencing and dispatching rules. The results of the simulation show that there is a definite range of pallets for each level of routing flexibility at which the system performs satisfactorily.
Keywords: flexible manufacturing system, manufacturing, strategy, makespan
Procedia PDF Downloads 668
599 Rapid Start-Up and Efficient Long-Term Nitritation of Low Strength Ammonium Wastewater with a Sequencing Batch Reactor Containing Immobilized Cells
Authors: Hammad Khan, Wookeun Bae
Abstract:
Major concerns regarding nitritation of low-strength ammonium wastewaters include low ammonium loading rates (usually below 0.2 kg/m3-d) and uncertainty about long-term stability of the process. The purpose of this study was to test a sequencing batch reactor (SBR) filled with cell-immobilized polyethylene glycol (PEG) pellets to see if it could achieve efficient and stable nitritation under various environmental conditions. The SBR was fed with synthetic ammonium wastewater of 30±2 mg-N/L and pH 8±0.05, maintaining the dissolved oxygen (DO) concentration at 1.7±0.2 mg/L and the temperature at 30±1 °C. The reaction was easily converted to partial nitrification mode within a month by feeding relatively high ammonium substrate (~100 mg-N/L) in the beginning. We observed stable nitritation over 300 days with high ammonium loading rates (as high as ~1.1 kg-N/m3-d), nitrite accumulation rates (mostly over 97%) and ammonium removal rates (mostly over 95%). DO was a major limiting substrate when the DO concentration was below ~4 mg/L and the NH4+-N concentration was above 5 mg/L, giving an almost linear increase in the ammonium oxidation rate with the bulk DO increase. Low temperatures mainly affected the reaction rate, which could be compensated for by increasing the pellet volume (i.e., biomass). Our results demonstrated that an SBR filled with small cell-immobilized PEG pellets could achieve very efficient and stable nitritation of a low-strength ammonium wastewater.
Keywords: ammonium loading rate (ALR), cell-immobilization, long-term nitritation, sequencing batch reactor (SBR), sewage treatment
Procedia PDF Downloads 273
598 Isolation and Characterization of a Narrow-Host Range Aeromonas hydrophila Lytic Bacteriophage
Authors: Sumeet Rai, Anuj Tyagi, B. T. Naveen Kumar, Shubhkaramjeet Kaur, Niraj K. Singh
Abstract:
Since their discovery, indiscriminate use of antibiotics in human, veterinary and aquaculture systems has resulted in the global emergence and spread of multidrug-resistant bacterial pathogens. Thus, the need for alternative approaches to control bacterial infections has become of utmost importance. The high selectivity/specificity of bacteriophages (phages) permits the targeting of specific bacteria without affecting the desirable flora. In this study, a lytic phage (Ahp1) specific to Aeromonas hydrophila subsp. hydrophila was isolated from a finfish aquaculture pond. The host range of Ahp1 was tested against 10 isolates of A. hydrophila, 7 isolates of A. veronii, 25 Vibrio cholerae isolates, 4 V. parahaemolyticus isolates and one isolate each of V. harveyi and Salmonella enterica collected previously. Except for the host A. hydrophila subsp. hydrophila strain, no lytic activity against any other bacterial isolate was detected. During the adsorption rate and one-step growth curve analysis, 69.7% of phage particles were adsorbed on the host cell, followed by the release of 93 ± 6 phage progenies per host cell after a latent period of ~30 min. Phage nucleic acid was extracted by column purification methods. After determining the nature of the phage nucleic acid as dsDNA, the phage genome was subjected to next-generation sequencing by generating paired-end (PE, 2 × 300 bp) reads on the Illumina MiSeq system. De novo assembly of sequencing reads generated a circular phage genome of 42,439 bp with a G+C content of 58.95%. During open reading frame (ORF) prediction and annotation, 22 ORFs (out of 49 total predicted ORFs) were functionally annotated and the rest encoded hypothetical proteins. Proteins involved in major functions such as phage structure formation and packaging, DNA replication and repair, DNA transcription and host cell lysis were encoded by the phage genome. The complete genome sequence of Ahp1 along with gene annotation was submitted to NCBI GenBank (accession number MF683623).
Stability of Ahp1 preparations at storage temperatures of 4 °C, 30 °C, and 40 °C was studied over a period of 9 months. At 40 °C storage, phage counts declined by 4 log units within one month, with a total loss of viability after 2 months. At 30 °C, the phage preparation was stable for < 5 months. On the other hand, phage counts decreased by only 2 log units over a period of 9 months during storage at 4 °C. As some phages have also been reported to be glycerol sensitive, the stability of Ahp1 preparations in glycerol stocks (0%, 15%, 30% and 45%) was also studied during storage at -80 °C over a period of 9 months. The phage counts decreased by only 2 log units during storage, and no significant difference in phage counts was observed at different concentrations of glycerol. The Ahp1 phage discovered in our study had a very narrow host range, and it may be useful for phage typing applications. Moreover, the endolysin and holin genes in the Ahp1 genome could be ideal candidates for recombinant cloning and expression of antimicrobial proteins.
Keywords: Aeromonas hydrophila, endolysin, phage, narrow host range
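The "log unit" declines reported in this abstract are base-10 logarithms of the ratio between the initial and final phage titers. A minimal Python sketch of that arithmetic follows; the function name and the titers used in the example are hypothetical, not values from the study.

```python
import math


def log_reduction(initial_titer, final_titer):
    """Base-10 log reduction between two phage titers (e.g. in PFU/mL)."""
    if initial_titer <= 0 or final_titer <= 0:
        raise ValueError("titers must be positive")
    return math.log10(initial_titer / final_titer)
```

Under this definition, a hypothetical stock that falls from 1e9 to 1e5 PFU/mL has lost about 4 log units, and one that falls from 1e9 to 1e7 PFU/mL has lost about 2.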
Procedia PDF Downloads 162
597 Insights into Archaeological Human Sample Microbiome Using 16S rRNA Gene Sequencing
Authors: Alisa Kazarina, Guntis Gerhards, Elina Petersone-Gordina, Ilva Pole, Viktorija Igumnova, Janis Kimsis, Valentina Capligina, Renate Ranka
Abstract:
The human body is inhabited by a vast number of microorganisms, collectively known as the human microbiome, and there is tremendous interest in evolutionary changes in human microbial ecology, diversity and function. The field of paleomicrobiology, the study of the ancient human microbiome, is powered by modern techniques of Next Generation Sequencing (NGS), which allow microbial genomic data to be extracted directly from the archaeological sample of interest. One of the major techniques is 16S rRNA gene sequencing, by which certain 16S rRNA gene hypervariable regions are amplified and sequenced. However, this method has some limitations, including the taxonomic precision and efficacy of the different regions used. The aim of this study was to evaluate the phylogenetic sensitivity of different 16S rRNA gene hypervariable regions for microbiome studies in archaeological samples. Towards this aim, archaeological bone samples and corresponding soil samples from each burial environment were collected in Medieval cemeteries in Latvia. The Ion 16S™ Metagenomics Kit targeting different 16S rRNA gene hypervariable regions was used for library construction (Ion Torrent technologies). Sequenced data were analysed using appropriate bioinformatic techniques; alignment and taxonomic representation were done using the Mothur program. Sequences of the most abundant genus were further aligned to the E. coli 16S rRNA gene reference sequence using MEGA7 in order to identify the hypervariable region of the segment of interest. Our results showed that different hypervariable regions had different discriminatory power depending on the groups of microbes, as well as the nature of the samples. On the basis of our results, we suggest that using a wider range of primers can provide a more accurate recapitulation of microbial communities in archaeological samples. Acknowledgements. This work was supported by the ERAF grant Nr. 1.1.1.1/16/A/101.
Keywords: 16S rRNA gene, ancient human microbiome, archaeology, bioinformatics, genomics, microbiome, molecular biology, next-generation sequencing
Procedia PDF Downloads 190
596 Single Cell and Spatial Transcriptomics: A Beginners Viewpoint from the Conceptual Pipeline
Authors: Leo Nnamdi Ozurumba-Dwight
Abstract:
Messenger ribonucleic acid (mRNA) molecules carry the instructions for making proteins. These protein-encoding mRNA molecules (which collectively make up the transcriptome), when analyzed by RNA sequencing (RNAseq), unveil the nature of gene expression. The obtained gene expression profile provides clues about cellular traits and their dynamics, which can be studied in relation to function and responses. RNAseq is a practical tool in genomics as it enables detection and quantitative analysis of mRNA molecules. Single cell and spatial transcriptomics offer different avenues for exploring the genomic characteristics of single cells and pooled cells in disease conditions such as cancer, auto-immune diseases and hematopoietic diseases, among others, from investigated biological tissue samples. Single cell transcriptomics allows a direct assessment of each building unit of a tissue (the cell) during diagnosis and molecular gene expression studies. A typical technique for this is single-cell RNA sequencing (scRNAseq), which enables high throughput gene expression studies. However, this technique generates gene expression data for many cells without information on the cells' positional coordinates within the tissue. Complementary pre-established tissue reference maps, built using molecular and bioinformatics techniques, have emerged and are now used to resolve this setback, producing both levels of data in a single scRNAseq analysis. This is an emerging methodological approach for integrative and progressively more dependable transcriptomics analysis. It can support in-situ style analysis for a better understanding of tissue functional organization, unveil new biomarkers for early-stage detection of diseases and for therapeutic targets in drug development, and reveal the nature of cell-to-cell interactions. 
These are also vital genomic signatures and characterizations for clinical applications. Over the past decades, RNAseq has generated a wide array of information that is driving breakthroughs and innovations in biomedicine. Spatial transcriptomics, on the other hand, is tissue-level based and is used to study biological specimens with heterogeneous features. It reveals the gross identity of investigated mammalian tissues, which can then be used to study cell differentiation, track cell line trajectory patterns and behavior, and examine regulatory homeostasis in disease states. It also requires referenced positional analysis to make up the genomic signatures that will be assessed from the single cells in the tissue sample. Given these two approaches to transcriptomics study in varying quantities of cell lines, each with avenues for appropriate resolution, both have made the study of gene expression from mRNA molecules interesting and progressive, and both are helping to tackle health challenges head-on.
Keywords: transcriptomics, RNA sequencing, single cell, spatial, gene expression.
Procedia PDF Downloads 122
595 High Temperature Tolerance of Chironomus Sulfurosus and Its Molecular Mechanisms
Authors: Tettey Afi Pamela, Sotaro Fujii, Hidetoshi Saito, Kawaii Koichiro
Abstract:
Introduction: Organisms employ adaptive mechanisms when faced with a stressor or the risk of being wiped out. This has made it possible for them to survive in harsh environmental conditions such as increasing temperature, low pH, and anoxia. Some of the mechanisms they utilize include the expression of heat shock proteins, synthesis of cryoprotectants, and anhydrobiosis. Heat shock proteins (HSPs) have been widely studied to determine their involvement in stress tolerance among various organisms, and chironomid species have been no exception. We examined the survival and expression of genes encoding five heat shock proteins (HSP70, HSP67, HSP60, HSP27, and HSP23) in Chironomus sulfurosus larvae reared from the 1st instar at 25°C, 30°C, 35°C, and 40°C. Results: The highest survival rate was recorded at 30°C, followed by 25°C, then 35°C. Only a small percentage of C. sulfurosus survived at 40°C (14.5%). With regard to HSP expression, some HSPs responded to an increase in temperature. The relative expression levels were lowest at 30°C for HSP70, HSP60, HSP27, and HSP23. At 25°C and 40°C, HSP70, HSP67, HSP60, HSP27, and HSP23 had the highest expression. At 35°C, all had the lowest expression. Discussion: The expression of heat shock proteins varies from one species to another. We designated the genes as HSP70, HSP67, HSP60, HSP27, and HSP23 based on transcriptome analysis of C. sulfurosus. Our study can be termed a long heat shock study, as C. sulfurosus was reared from the first instar to the fourth instar, and this might have led to continuous induction of HSPs at 25°C. The 40°C treatment had the lowest survival but the highest HSP expression, as C. sulfurosus larvae had to utilize HSPs for sustenance. 
These results, together with future high-throughput studies at both the transcriptome and proteome level, will improve the information needed to predict the future geographic distribution of these species within the context of global warming.
Keywords: chironomid, heat shock proteins, high temperature, heat shock protein expression
Procedia PDF Downloads 95
594 A Pipeline for Detecting Copy Number Variation from Whole Exome Sequencing Using Comprehensive Tools
Authors: Cheng-Yang Lee, Petrus Tang, Tzu-Hao Chang
Abstract:
Copy number variations (CNVs) play an important role in many kinds of human diseases, such as autism, schizophrenia and a number of cancers. Many disease-associated variants are found in genome coding regions, and whole exome sequencing (WES) is a cost-effective and powerful technology for detecting variants that are enriched in exons, with potential applications in clinical settings. Although several algorithms have been developed to detect CNVs using WES, and have been compared with other algorithms on their authors' own samples to find the most suitable methods, there has been no consistent dataset across most algorithms with which to evaluate CNV detection ability. In addition, most algorithms use a command line interface, which may greatly limit the analysis capability of many laboratories. We create a series of simulated WES datasets from UCSC hg19 chromosome 22 and then evaluate the CNV detection ability of 19 algorithms from the OMICtools database using these datasets. We compute the sensitivity, specificity and accuracy of each algorithm to validate the exome-derived CNVs. After comparing the 19 algorithms, we construct a platform that installs all of them in a virtual machine such as VirtualBox, which can be set up conveniently on local computers, and create a simple, easy-to-use script for detecting CNVs with the algorithms selected by users. We also build a table describing characteristics such as input requirements and CNV detection ability for all of the algorithms, giving users a specification for choosing the optimum algorithms.
Keywords: whole exome sequencing, copy number variations, omictools, pipeline
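The sensitivity, specificity and accuracy mentioned in this abstract can be sketched as follows. This is only a minimal illustration, assuming each simulated region is labelled CNV or non-CNV and compared with one caller's output; the function name `cnv_metrics` and the counts are hypothetical and are not taken from the study.

```python
def cnv_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                  # true CNVs that were called
    specificity = tn / (tn + fp)                  # non-CNV regions left uncalled
    accuracy = (tp + tn) / (tp + fp + tn + fn)    # all correct decisions
    return sensitivity, specificity, accuracy


# Hypothetical counts for one caller on one simulated dataset
sens, spec, acc = cnv_metrics(tp=40, fp=5, tn=945, fn=10)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

Because simulated datasets contain far more non-CNV than CNV regions, accuracy alone can look high even for a caller with modest sensitivity, which is presumably why all three figures are reported.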
Procedia PDF Downloads 319
593 Comprehensive Multi-Omics Study Highlights Osteopontin/SPP1 in Ovarian Aging Control
Authors: Chia-Jung Li, Li-Te Lin, Kuan-Hao Tsui
Abstract:
The study identifies SPP1 as a potential gene associated with ovarian aging, revealing a significant decline in its expression in aged ovaries. SPP1, also known as osteopontin (OPN), is a multifunctional glycoprotein involved with regulatory proteins and pro-inflammatory immune chemokines. However, its genetic links to ovarian aging have not been extensively explored. Spatial transcriptomic analyses were conducted on ovaries from young and aged female mice, along with a sample from a 73-year-old individual. Additionally, single-cell RNA sequencing analysis was performed to identify associations between SPP1 and key genes. The study focused on crucial genes, including ITGAV, ITGB1, CD44, MMP3, and FN1, with a particular emphasis on the correlation between SPP1 and ITGB1. The findings indicate a significant decline in SPP1 expression in aged ovaries, which was consistent in the 73-year-old sample. Single-cell RNA sequencing unveiled associations between SPP1 and key genes, emphasizing a strong co-expression correlation between SPP1 and ITGB1. While the study provides valuable insights, further research is necessary to understand the broader implications and potential applications of SPP1 in ovarian aging. Translating these findings to clinical settings requires careful consideration. The identification of SPP1 as a gene implicated in ovarian aging opens new avenues for advancing precision medicine and refining treatment strategies for conditions related to ovarian aging.
Keywords: SPP1, ovarian aging, spatial transcriptomic, single-cell RNA sequencing
Procedia PDF Downloads 35
592 Anaerobic Digestion Batch Study of Taxonomic Variations in Microbial Communities during Adaptation of Consortium to Different Lignocellulosic Substrates Using Targeted Sequencing
Authors: Priyanka Dargode, Suhas Gore, Manju Sharma, Arvind Lali
Abstract:
Anaerobic digestion has been widely used for production of methane from different biowastes. However, the complexity of the microbial communities involved in the process is poorly understood. The productivity of the biogas production process is closely coupled to its microbial community structure and the syntrophic interactions amongst the community members. The present study aims at understanding the taxonomic variations occurring in a starter inoculum when acclimatised to different lignocellulosic biomass (LBM) feedstocks in relation to time of digestion. The work underlines the use of high throughput Next Generation Sequencing (NGS) for validating the changes in taxonomic patterns of microbial communities. Biomethane Potential (BMP) batches were set up with different pretreated and non-pretreated LBM residues using the same microbial consortium, and samples were withdrawn to study the changes in the microbial community in terms of its structure and predominance with respect to changes in the metabolic profile of the process. DNA was extracted from samples withdrawn at different time intervals, chosen with reference to performance changes of the digestion process, and subjected to 16S rRNA amplicon sequencing on the Illumina platform. Biomethane potential and substrate consumption were monitored using Gas Chromatography (GC) and reduction in Chemical Oxygen Demand (COD), respectively. Taxonomic analysis of the data with QIIME revealed that microbial community structure changes with different substrates as well as at different time intervals. It was observed that the biomethane potential of each substrate was relatively similar, but the time required for substrate utilization and its conversion to biomethane differed between substrates. This could be attributed to the nature of the substrate and, consequently, to differences in the dominance of microbial communities with regard to different substrates and different phases of the anaerobic digestion process. 
Knowledge of the microbial communities involved would allow a rational, substrate-specific consortium design, which would help to reduce the consortium adaptation period and enhance substrate utilisation, resulting in improved efficacy of the biogas process.
Keywords: amplicon sequencing, biomethane potential, community predominance, taxonomic analysis
Procedia PDF Downloads 532
591 Detection, Analysis and Determination of the Origin of Copy Number Variants (CNVs) in Intellectual Disability/Developmental Delay (ID/DD) Patients and Autistic Spectrum Disorders (ASD) Patients by Molecular and Cytogenetic Methods
Authors: Pavlina Capkova, Josef Srovnal, Vera Becvarova, Marie Trkova, Zuzana Capkova, Andrea Stefekova, Vaclava Curtisova, Alena Santava, Sarka Vejvalkova, Katerina Adamova, Radek Vodicka
Abstract:
ASDs are heterogeneous and complex developmental diseases with a significant genetic background. Recurrent CNVs are known to be a frequent cause of ASD. These CNVs can have, however, a variable expressivity which results in a spectrum of phenotypes from asymptomatic to ID/DD/ASD. ASD is associated with ID in ~75% individuals. Various platforms are used to detect pathogenic mutations in the genome of these patients. The performed study is focused on a determination of the frequency of pathogenic mutations in a group of ASD patients and a group of ID/DD patients using various strategies along with a comparison of their detection rate. The possible role of the origin of these mutations in aetiology of ASD was assessed. The study included 35 individuals with ASD and 68 individuals with ID/DD (64 males and 39 females in total), who underwent rigorous genetic, neurological and psychological examinations. Screening for pathogenic mutations involved karyotyping, screening for FMR1 mutations and for metabolic disorders, a targeted MLPA test with probe mixes Telomeres 3 and 5, Microdeletion 1 and 2, Autism 1, MRX and a chromosomal microarray analysis (CMA) (Illumina or Affymetrix). Chromosomal aberrations were revealed in 7 (1 in the ASD group) individuals by karyotyping. FMR1 mutations were discovered in 3 (1 in the ASD group) individuals. The detection rate of pathogenic mutations in ASD patients with a normal karyotype was 15.15% by MLPA and CMA. The frequencies of the pathogenic mutations were 25.0% by MLPA and 35.0% by CMA in ID/DD patients with a normal karyotype. CNVs inherited from asymptomatic parents were more abundant than de novo changes in ASD patients (11.43% vs. 5.71%) in contrast to the ID/DD group where de novo mutations prevailed over inherited ones (26.47% vs. 16.18%). ASD patients shared more frequently their mutations with their fathers than patients from ID/DD group (8.57% vs. 1.47%). 
Maternally inherited mutations predominated in the ID/DD group in comparison with the ASD group (14.7% vs. 2.86%). CNVs of unknown significance were found in 10 patients by CMA and in 3 patients by MLPA. Although the detection rate is highest when using CMA, recurrent CNVs can be easily detected by MLPA. CMA proved to be more efficient in the ID/DD group, where a larger spectrum of rare pathogenic CNVs was revealed. This study determined that maternally inherited highly penetrant mutations and de novo mutations more often resulted in ID/DD without ASD. The paternally inherited mutations could, however, be a source of the greater variability in the genome of the ASD patients and contribute to the polygenic character of the inheritance of ASD. As the number of subjects in the group is limited, a larger cohort is needed to confirm this conclusion. Inherited CNVs have a role in the aetiology of ASD, possibly in combination with additional genetic factors, namely mutations elsewhere in the genome. The identification of these interactions constitutes a challenge for the future. Supported by MH CZ – DRO (FNOl, 00098892), IGA UP LF_2016_010, TACR TE02000058 and NPU LO1304.
Keywords: autistic spectrum disorders, copy number variant, chromosomal microarray, intellectual disability, karyotyping, MLPA, multiplex ligation-dependent probe amplification
Procedia PDF Downloads 349
590 Analysis of Pathogen Populations Occurring in Oilseed Rape Using DNA Sequencing Techniques
Authors: Elizabeth Starzycka-Korbas, Michal Starzycki, Wojciech Rybinski, Mirosława Dabert
Abstract:
For several years, the populations of pathogenic fungi occurring in winter oilseed rape in Malyszyn were analyzed. In Poland and worldwide, Brassica napus L. is a source of energy for humans (oil) and for animals (post-extraction middlings), as well as a motor fuel (oil, biofuel); studies of this type are therefore very important. The species composition of pathogenic fungi can be an indicator of seed yield. The occurrence of oilseed rape pathogens over several years was analyzed by sequencing the ITS DNA region. The results were compared against the gene bank using NCBI BLAST. Under field conditions, the presence of pathogens infesting B. napus was assessed before harvest. For example, in 2015, 150 samples were isolated and plated on PDA medium for species identification. From the whole population, mycelium of 83 isolates was selected and sequenced. The remaining 67 isolates were pathogenic fungi of the genus Alternaria, which are easy to recognize. The pathogenic species identified on oilseed rape after analysis of the ITS DNA were: Leptosphaeria sp. 38 (L. maculans 25, L. biglobosa 13), Alternaria sp. 29, Fusarium sp. 3, Sclerotinia sclerotiorum 7, heterogeneous 6, for a total of 83 isolates. Fungi of the genus Alternaria accounted for the largest share of B. napus pathogens in particular years. Another dangerous genus for oilseed rape was Leptosphaeria. The pathogen populations differed from year to year. The number and composition of pathogens occurring in the field are very important for breeders and farmers, because they allow the most resistant genotypes to be selected for sowing in the next growing season.
Keywords: B. napus, DNA ITS Sequencing, pathogenic fungi, population
Procedia PDF Downloads 288
589 Exploring the Correlation between Body Constitution of an Individual as Per Ayurveda and Gut Microbiome in Healthy, Multi Ethnic Urban Population in Bangalore, India
Authors: Shalini TV, Gangadharan GG, Sriranjini S Jaideep, ASN Seshasayee, Awadhesh Pandit
Abstract:
Introduction: Prakriti (the body-mind constitution of an individual) is a conventional, customized and unique concept, an understanding of which is essential for the personalized medicine described in Ayurveda, the Indian System of Medicine. Based on the Doshas (functional, bio-humoral units in the body), individuals are categorized into three major Prakriti: Vata, Pitta, and Kapha. The human gut microbiome hosts a large number of highly diverse and metabolically active microorganisms, dominated mainly by bacteria, which are known to influence the physiology of an individual. Few studies have shown a correlation between Prakriti and biochemical parameters. In this study, an attempt was made to explore any correlation between Prakriti (the phenotype of an individual) and the genetic makeup of the gut microbiome in healthy individuals. Materials and methods: 270 multi-ethnic, healthy volunteers of both sexes, aged between 18 and 40 years, with no history of antibiotic use in the last 6 months, were recruited into three groups: Vata, Pitta, and Kapha. The Prakriti of each individual was determined using Ayusoft, a software package designed by CDAC, Pune, India. The volunteers underwent initial screening of height, weight, Body Mass Index, vital signs and blood investigations to ensure they were healthy. Stool and saliva samples of the recruited volunteers were collected as per the standard operating procedure developed, and bacterial DNA was isolated using Qiagen kits. The extracted DNA was subjected to 16S rRNA sequencing using Illumina kits. The sequencing libraries targeted the variable V3 and V4 regions of the 16S rRNA gene. Paired-end sequencing was done on the MiSeq system, and data were analyzed using the CLC Genomics Workbench 11. Results: The 16S rRNA sequencing of the V3 and V4 regions showed a diverse pattern in both the oral and stool microbial DNA. 
The study did not reveal any specific pattern of bacterial flora amongst the Prakriti groups. All the p-values were greater than the effective alpha values for all OTUs in both the buccal cavity and stool samples. Therefore, no significant enrichment of any OTU was observed in either the buccal cavity or stool samples. Conclusion: In healthy, multi-ethnic volunteers, owing to the influence of various factors, no correlation between Prakriti and the gut microbiome was seen.
Keywords: gut microbiome, ayurveda Prakriti, sequencing, multi-ethnic urban population
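The comparison of per-OTU p-values against an "effective alpha" can be illustrated with a short sketch. The abstract does not say how the effective alpha was derived; the Bonferroni correction used below is an assumption chosen only because it is a common way to adjust for testing many OTUs at once, and the p-values are hypothetical.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return the Bonferroni-adjusted alpha and a per-test significance flag.

    Assumption: the 'effective alpha' is alpha divided by the number of tests.
    """
    effective_alpha = alpha / len(p_values)
    flags = [p < effective_alpha for p in p_values]
    return effective_alpha, flags


# Hypothetical p-values for four OTUs
effective_alpha, flags = bonferroni_significant([0.04, 0.2, 0.03, 0.6])
print(effective_alpha, any(flags))
```

With these illustrative values, two OTUs would pass an uncorrected 0.05 threshold but none passes the adjusted one, which mirrors the kind of outcome the abstract reports.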
Procedia PDF Downloads 135
588 Discrete Breeding Swarm for Cost Minimization of Parallel Job Shop Scheduling Problem
Authors: Tarek Aboueldahab, Hanan Farag
Abstract:
The Parallel Job Shop Scheduling Problem (JSP) is a multi-objective, multi-constraint NP optimization problem. Traditional Artificial Intelligence (AI) techniques have been widely used; however, they can become trapped in a local minimum without reaching the optimum solution. We therefore propose a hybrid AI model in which Discrete Breeding Swarm (DBS) is added to traditional AI to avoid this trapping. The model is applied to cost minimization in the Car Sequencing and Operator Allocation (CSOA) problem. The practical experiment shows that our model outperforms other techniques in cost minimization.
Keywords: parallel job shop scheduling problem, artificial intelligence, discrete breeding swarm, car sequencing and operator allocation, cost minimization
Procedia PDF Downloads 188
587 The Relationship between Operating Condition and Sludge Wasting of an Aerobic Suspension-Sequencing Batch Reactor (ASSBR) Treating Phenolic Wastewater
Authors: Ali Alattabi, Clare Harris, Rafid Alkhaddar, Ali Alzeyadi
Abstract:
Petroleum refinery wastewater (PRW) can be considered one of the most significant sources of aquatic environmental pollution. It consists of oil and grease along with many other toxic organic pollutants. In recent years, a new technique has been implemented that uses different types of membranes and sequencing batch reactors (SBRs) to treat PRW. The SBR is a fill-and-draw type sludge system which operates in time instead of space. Many researchers have optimised SBR operating conditions to obtain maximum removal of undesired wastewater pollutants. The SBR has gained importance mainly because of its flexibility in cycle time. It can handle shock loads, requires less area for operation and is easy to operate. However, bulking sludge, or the discharge of floating or settled sludge during the draw or decant phase with some SBR configurations, is still one of the problems of the SBR system. The main aim of this study is to develop an innovative design for the SBR, optimising the process variables to give a more robust and efficient process. Several experimental tests will be developed to determine the removal percentages of chemical oxygen demand (COD), phenol and nitrogen compounds from synthetic PRW. Furthermore, the dissolved oxygen (DO), pH and oxidation-reduction potential (ORP) of the SBR system will be monitored online to ensure a good environment for the microorganisms to biodegrade the organic matter effectively.
Keywords: petroleum refinery wastewater, sequencing batch reactor, hydraulic retention time, Phenol, COD, mixed liquor suspended solids (MLSS)
Procedia PDF Downloads 260
586 Performance Evaluation of Flexible Manufacturing System: A Simulation Study
Authors: Mohammed Ali
Abstract:
In this paper, a flexible manufacturing system is evaluated under different manufacturing strategies. The objective is to test the impact of pallets and routing flexibility on system performance under different sequencing rules, dispatching rules and unbalanced load conditions. A computer simulation model is developed to evaluate the effects of these manufacturing strategies on the make-span performance of the flexible manufacturing system. The impact of the number of pallets is shown at different levels of routing flexibility. The same manufacturing system is modeled under different combinations of sequencing and dispatching rules. A series of simulation experiments is conducted and the results are analyzed. The simulation results show that pallets and routing flexibility have an impact on the performance of the system.
Keywords: flexibility, flexible manufacturing system, pallets, make-span, simulation
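How a sequencing rule can change make-span may be illustrated with a toy model. The sketch below is not the simulation model from the paper: it assumes identical parallel machines, ignores pallets and routing, and simply sends each job, in the order given by the rule, to the machine that becomes free first. The job times and the function name `makespan` are hypothetical.

```python
import heapq


def makespan(processing_times, n_machines):
    """List scheduling: each job goes to the earliest-free machine, in the given order."""
    free_at = [0] * n_machines          # time at which each machine becomes free
    heapq.heapify(free_at)
    for p in processing_times:
        start = heapq.heappop(free_at)  # earliest available machine
        heapq.heappush(free_at, start + p)
    return max(free_at)


jobs = [2, 3, 4, 6, 2, 2]                        # hypothetical processing times
fcfs = makespan(jobs, 2)                         # first come, first served
spt = makespan(sorted(jobs), 2)                  # shortest processing time first
lpt = makespan(sorted(jobs, reverse=True), 2)    # longest processing time first
print(fcfs, spt, lpt)
```

Even in this reduced setting the ordering rule alone changes the completion time of the last job, which is the kind of effect a full simulation would quantify alongside pallets and routing flexibility.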
Procedia PDF Downloads 417
585 Agarose Amplification Based Sequencing (AG-seq) Characterization Cell-free RNA in Preimplantation Spent Embryo Medium
Authors: Huajuan Shi
Abstract:
Background: Biopsy of the preimplantation embryo may increase potential risk and raise concerns about embryo viability. Clinically discarded spent embryo medium (SEM) has attracted the attention of researchers, sparking interest in noninvasive embryo screening. However, one of the major restrictions is the extremely low quantity of cf-RNA, which is difficult to amplify efficiently and without bias using traditional methods. Hence, there is an urgent need for an efficient and low-bias amplification method that can comprehensively and accurately capture cf-RNA information and truly reveal the state of SEM cf-RNA. Result: In the present study, we established an agarose PCR amplification system and significantly improved the amplification sensitivity and efficiency, by ~90 fold and 9.29%, respectively. We applied agarose to sequencing library preparation (named AG-seq) to quantify and characterize cf-RNA in SEM. The number of detected cf-RNAs (3533 vs 598) and the coverage of the 3' end were significantly increased, and the noise in detecting low-abundance genes was reduced. An increasing percentage of 5' end adenine and alternative splicing (AS) events in short fragments (< 400 bp) were discovered by AG-seq. Further, the profiles and characteristics of cf-RNA in spent cleavage medium (SCM) and spent blastocyst medium (SBM) indicated that 4-mer end motifs of cf-RNA fragments could clearly differentiate embryo development stages. Significance: This study established an efficient and low-cost SEM amplification and library preparation method. In addition, we successfully described the characteristics of SEM cf-RNA of the preimplantation embryo using AG-seq, including abundance features and fragment lengths. AG-seq facilitates the study of cf-RNA as a noninvasive embryo screening biomarker and opens up potential clinical uses for trace samples.
Keywords: cell-free RNA, agarose, spent embryo medium, RNA sequencing, non-invasive detection
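The 4-mer end motif profile mentioned in this abstract can be sketched in a few lines. The abstract does not describe how motifs were extracted, so the sketch assumes that the motif is the first four bases at the 5' end of each sequenced fragment and that fragments are given as DNA-alphabet read strings; the function name and the example fragments are hypothetical.

```python
from collections import Counter


def end_motif_frequencies(fragments, k=4):
    """Relative frequency of the first k bases (assumed 5' end motif) of each fragment."""
    motifs = Counter(f[:k].upper() for f in fragments if len(f) >= k)
    total = sum(motifs.values())
    return {m: n / total for m, n in motifs.items()} if total else {}


# Hypothetical fragments; the last one is shorter than k and is skipped
frags = ["AAGCTTGA", "AAGCCCTA", "CCTGAAGT", "AAG"]
freqs = end_motif_frequencies(frags)
print(freqs)
```

Two such frequency profiles, one per medium type, could then be compared motif by motif to see which ends are enriched at each developmental stage.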
Procedia PDF Downloads 92
584 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach
Authors: Mpho Mokoatle, Darlington Mapiye, James Mashiyane, Stephanie Muller, Gciniwe Dlamini
Abstract:
Great advances in high-throughput sequencing technologies have resulted in the availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotation, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on a k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isolates. Cluster analysis showed that k-mers may be used to discriminate phenotypes and that the discrimination becomes more concise as the size of the k-mers increases. The best performing classification model had a k-mer size of 10 (the longest k-mer) and an accuracy, recall, precision, specificity, and Matthews correlation coefficient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4, respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction. 
The analysis also highlights the importance of increasing the k-mer size to produce more biologically explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data and to identify phenotype relationships, which are important especially in explaining complex biological mechanisms.
Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing
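The k-mer representation and the Matthews correlation coefficient named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: overlapping k-mers are counted on a single strand, the classifier itself is not shown, and the function names and the toy sequence are hypothetical rather than taken from the study.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count overlapping k-mers in a DNA sequence (case-insensitive, one strand)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


counts = kmer_counts("ACGTACGT", 3)
# The six 3-mers are ACG, CGT, GTA, TAC, ACG, CGT
print(counts["ACG"], counts["CGT"], counts["GTA"], counts["TAC"])
```

The number of possible k-mers grows as 4^k, so a feature vector for k = 10 can have up to 1,048,576 entries per isolate, which is one concrete source of the trade-off between k-mer size and computing resources that the abstract describes.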
Procedia PDF Downloads 167
583 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach
Authors: Darlington Mapiye, Mpho Mokoatle, James Mashiyane, Stephanie Muller, Gciniwe Dlamini
Abstract:
Great advances in high-throughput sequencing technologies have resulted in the availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotation, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on a k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isolates. Cluster analysis showed that k-mers may be used to discriminate phenotypes and that the discrimination becomes more concise as the size of the k-mers increases. The best performing classification model had a k-mer size of 10 (the longest k-mer) and an accuracy, recall, precision, specificity, and Matthews correlation coefficient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4, respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction. 
The analysis also highlights the importance of increasing the k-mer size to produce more biologically explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data and to identify phenotype relationships, which are important especially in explaining complex biological mechanisms.
Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing
Procedia PDF Downloads 159
582 Whole Exome Sequencing in Characterizing Mysterious Crippling Disorder in India
Authors: Swarkar Sharma, Ekta Rai, Ankit Mahajan, Parvinder Kumar, Manoj K Dhar, Sushil Razdan, Kumarasamy Thangaraj, Carol Wise, Shiro Ikegawa M.D., K.K. Pandita M.D.
Abstract:
Rare disorders are poorly understood; hence, they remain uncharacterized, or patients are misdiagnosed and receive poor medical attention. A rare, mysterious skeletal disorder that remained unidentified for decades and left many people physically challenged and disabled for life has been reported in ‘Arai’, an isolated, remote village in the Poonch district of Jammu and Kashmir. The village is located deep in the mountains, and the population residing in the region is highly consanguineous. In our survey of the region, 70 affected people showing a similar phenotype were reported in the village, which has a population of approximately 5000 individuals. We were able to collect samples from two multigenerational extended families from the village. Through Whole Exome Sequencing (WES), we identified a rare variation, NM_003880.3:c.156C>A (NP_003871.1:p.Cys52Ter), which introduces a premature stop codon in the WISP3 gene. We found this variation segregating perfectly with the disease in one of the families. However, this variation was absent in the other family. Interestingly, a novel splice site mutation at position c.643+1G>A of the WISP3 gene, segregating perfectly with the disease, was observed in the second family. Thus, by exploiting WES and combining different lines of evidence (familial histories and genetic data, clinical features, and radiological and biochemical tests and findings), the disease has finally been diagnosed as a very rare recessive hereditary skeletal disease, “Progressive Pseudorheumatoid Arthropathy of Childhood” (PPAC), also known as “Spondyloepiphyseal Dysplasia Tarda with Progressive Arthropathy” (SEDT-PA). This genetic characterization and identification of the disease-causing mutations will aid in genetic counseling, which is critically required to curb this rare disorder and to prevent its appearance in future generations of the population.
Further, understanding the role of the WISP3 gene in the biological pathways should help in developing treatment for the disorder.
Keywords: whole exome sequencing, Next Generation Sequencing, rare disorders
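The arithmetic behind the reported nonsense variant c.156C>A (p.Cys52Ter) can be checked with a short sketch. It assumes the reference codon is TGC, which is inferred only from the codon encoding cysteine (TGT or TGC) with C as its third base; it was not read from the NM_003880.3 transcript. The helper names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
CYS_CODONS = {"TGT", "TGC"}


def codon_for_cds_position(pos):
    """Map a 1-based coding (c.) position to (codon number, 0-based index in codon)."""
    return (pos - 1) // 3 + 1, (pos - 1) % 3


def apply_substitution(codon, index, ref, alt):
    """Replace one base of a codon, checking that the reference base matches."""
    if codon[index] != ref:
        raise ValueError(f"expected {ref} at index {index} of {codon}")
    return codon[:index] + alt + codon[index + 1:]


if __name__ == "__main__":
    codon_number, index = codon_for_cds_position(156)
    reference = "TGC"  # assumption: the Cys codon whose third base is C
    mutant = apply_substitution(reference, index, "C", "A")
    print(codon_number, index, reference in CYS_CODONS, mutant, mutant in STOP_CODONS)
```

Position 156 is the third base of codon 52, and a C-to-A change there turns the assumed TGC codon into TGA, a stop codon, which is consistent with the p.Cys52Ter annotation in the abstract.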
Procedia PDF Downloads 411