Cheng-Yang Lee

Abstracts

2 A Pipeline for Detecting Copy Number Variation from Whole Exome Sequencing Using Comprehensive Tools

Authors: Cheng-Yang Lee, Petrus Tang, Tzu-Hao Chang

Abstract:

Copy number variations (CNVs) play an important role in many human diseases, such as autism, schizophrenia, and a number of cancers. Many disease-associated variants are found in the coding regions of the genome, and whole exome sequencing (WES) is a cost-effective and powerful technology for detecting variants that are enriched in exons, with potential applications in clinical settings. Although several algorithms have been developed to detect CNVs from WES data, and each has been compared with other algorithms on its developers' own samples to identify the most suitable method, no consistent dataset has been used across most of the algorithms to evaluate their CNV detection ability. In addition, most of the algorithms use a command-line interface, which may greatly limit the analysis capability of many laboratories. We create a series of simulated WES datasets from UCSC hg19 chromosome 22 and then evaluate the CNV detection ability of 19 algorithms from the OMICtools database using these datasets. We compute the sensitivity, specificity, and accuracy of each algorithm to validate the exome-derived CNVs. After comparing the 19 algorithms, we construct a platform that installs all of them in a virtual machine, such as VirtualBox, which can be set up conveniently on local computers, and we create a simple, easy-to-use script for detecting CNVs with the algorithms selected by the user. We also build a table describing the characteristics of each algorithm, such as input requirements and CNV detection ability, to give users a reference for choosing the most suitable algorithm.
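
As an illustration of the evaluation step, the sketch below computes sensitivity, specificity, and accuracy by comparing a set of simulated (true) CNV targets with the targets an algorithm reports. It is a minimal sketch under stated assumptions, not the authors' pipeline: the per-target comparison, the function names `confusion_counts` and `evaluate`, and the example identifiers are all hypothetical.

```python
def confusion_counts(truth, calls, targets):
    """Count TP, FP, TN, FN over exome targets.

    Assumption: `truth` and `calls` are sets of target identifiers
    (e.g. exon indices) and both are subsets of `targets`.
    """
    tp = len(truth & calls)            # simulated CNVs that were detected
    fp = len(calls - truth)            # reported CNVs that were not simulated
    fn = len(truth - calls)            # simulated CNVs that were missed
    tn = len(targets - truth - calls)  # normal targets correctly left uncalled
    return tp, fp, tn, fn


def evaluate(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy); NaN if undefined."""
    nan = float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else nan
    specificity = tn / (tn + fp) if (tn + fp) else nan
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else nan
    return sensitivity, specificity, accuracy


# Hypothetical example: 10 targets, 4 carry a simulated CNV, 3 are called.
targets = set(range(10))
truth = {1, 2, 3, 4}
calls = {3, 4, 5}
counts = confusion_counts(truth, calls, targets)
sens, spec, acc = evaluate(*counts)
print(counts, sens, spec, acc)
```

Counting per target is only one possible convention; a per-base or per-event comparison would change the counts but not the three formulas.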

Keywords: pipeline, whole exome sequencing, copy number variations, OMICtools

1 CMPD: Cancer Mutant Proteome Database

Authors: Po-Jung Huang, Chi-Ching Lee, Bertrand Chin-Ming Tan, Yuan-Ming Yeh, Julie Lichieh Chu, Tin-Wen Chen, Cheng-Yang Lee, Ruei-Chi Gan, Hsuan Liu, Petrus Tang

Abstract:

Whole-exome sequencing, which focuses on the protein-coding regions of disease- or cancer-associated genes based on a priori knowledge, is the most cost-effective method for studying the association between genetic alterations and disease. Recent advances in high-throughput sequencing technologies and proteomic techniques have provided an opportunity to integrate genomics and proteomics, making mutated peptides corresponding to mutated genes readily detectable. Since sequence database searching is the most widely used method for protein identification in mass spectrometry (MS)-based proteomics, a mutant proteome database is required to better approximate the real protein pool and improve the identification of disease-associated mutated proteins. Large-scale whole exome/genome sequencing studies have been launched by the National Cancer Institute (NCI), the Broad Institute, and The Cancer Genome Atlas (TCGA), providing not only a comprehensive report on the analysis of coding variants in diverse samples and cell lines but also an invaluable resource for the wider research community. However, no existing database collects the mutant protein sequences related to the variants identified in these studies. CMPD is designed to address this issue, serving as a bridge between genomic data and proteomic studies and focusing on protein sequence-altering variations originating from both germline and cancer-associated somatic variations.

Keywords: Cancer, mutant, TCGA, proteome
