Search results for: whole genome sequence
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1430

1280 DNpro: A Deep Learning Network Approach to Predicting Protein Stability Changes Induced by Single-Site Mutations

Authors: Xiao Zhou, Jianlin Cheng

Abstract:

A single amino acid mutation can have a significant impact on the stability of protein structure. Thus, the prediction of protein stability changes induced by single-site mutations is critical and useful for studying protein function and structure. Here, we present a deep learning network with the dropout technique for predicting protein stability changes upon single amino acid substitution. While using only protein sequence as input, the overall prediction accuracy of the method on a standard benchmark is >85%, which is higher than existing sequence-based methods and is comparable to the methods that use not only protein sequence but also tertiary structure, pH value and temperature. The results demonstrate that deep learning is a promising technique for protein stability prediction. The good performance of this sequence-based method makes it a valuable tool for predicting the impact of mutations on most proteins whose experimental structures are not available. Both the downloadable software package and the user-friendly web server (DNpro) that implement the method for predicting protein stability changes induced by amino acid mutations are freely available for the community to use.

Keywords: bioinformatics, deep learning, protein stability prediction, biological data mining

Procedia PDF Downloads 467
1279 Multivariate Genome-Wide Association Studies for Identifying Additional Loci for Myopia

Authors: Qiao Fan, Xiaobo Guo, Junxian Zhu, Xiaohu Ding, Ching-Yu Cheng, Tien-Yin Wong, Mingguang He, Heping Zhang, Xueqin Wang

Abstract:

A systematic, simultaneous analysis of multiple phenotypes in genome-wide association studies (GWASs) has drawn great attention as a way to integrate the signals from single phenotypes with increased power. However, the lack of an interpretable and efficient multivariate GWAS analysis impedes the application of such an approach. In this study, we propose to decompose the multivariate model into a series of simple univariate models. This transformation illuminates what exactly each individual trait contributes to the significant signals from the multivariate analyses. By employing our approach in the analysis of three myopia-related endophenotypes from the Singapore Malay Eye Study (SIMES), we identify novel candidate loci, which were successfully validated in the independent Guangzhou Twin Eye Study (GTES).

Keywords: GWAS multivariate, multiple traits, myopia, association

Procedia PDF Downloads 224
1278 Brachypodium: A Model Genus to Study Grass Genome Organisation at the Cytomolecular Level

Authors: R. Hasterok, A. Betekhtin, N. Borowska, A. Braszewska-Zalewska, E. Breda, K. Chwialkowska, R. Gorkiewicz, D. Idziak, J. Kwasniewska, M. Kwasniewski, D. Siwinska, A. Wiszynska, E. Wolny

Abstract:

In contrast to animals, the organisation of plant genomes at the cytomolecular level is still relatively poorly studied and understood. However, the Brachypodium genus in general and B. distachyon in particular represent exceptionally good model systems for such studies. This is due not only to their highly desirable ‘model’ biological features, such as small nuclear genome, low chromosome number and complex phylogenetic relations, but also to the rapidly and continuously growing repertoire of experimental tools, such as large collections of accessions, WGS information, large insert (BAC) libraries of genomic DNA, etc. Advanced cytomolecular techniques, such as fluorescence in situ hybridisation (FISH) with ever more sophisticated probes, empowered by cutting-edge microscopes and digital image acquisition and processing systems, offer unprecedented insight into chromatin organisation at various phases of the cell cycle. A good example is chromosome painting, which uses pools of chromosome-specific BAC clones and enables the tracking of individual chromosomes not only during cell division but also during interphase. This presentation outlines the present status of molecular cytogenetic analyses of plant genome structure, dynamics and evolution using B. distachyon and some of its relatives. The current projects focus on important scientific questions, such as: What mechanisms shape the karyotypes? Is the distribution of individual chromosomes within an interphase nucleus determined? Are there hot spots of structural rearrangement in Brachypodium chromosomes? Which epigenetic processes play a crucial role in B. distachyon embryo development and selective silencing of rRNA genes in Brachypodium allopolyploids? The authors acknowledge financial support from the Polish National Science Centre (grants no. 2012/04/A/NZ3/00572 and 2011/01/B/NZ3/00177).

Keywords: Brachypodium, B. distachyon, chromosome, FISH, molecular cytogenetics, nucleus, plant genome organisation

Procedia PDF Downloads 351
1277 In-Depth Analysis on Sequence Evolution and Molecular Interaction of Influenza Receptors (Hemagglutinin and Neuraminidase)

Authors: Dong Tran, Thanh Dac Van, Ly Le

Abstract:

Hemagglutinin (HA) and neuraminidase (NA) play an important role in host immune evasion throughout influenza virus evolution. The correlation between HA and NA evolution with respect to epitope evolution and drug interaction has yet to be investigated. In this study, combining sequence-to-structure evolution with statistical analysis of epitope/binding-site specificity, we identified potential therapeutic features of HA and NA: a specific antibody binding site on HA and a specific binding distribution of current inhibitors within the NA active site. Our approach uses sequence variation and molecular interaction to provide an effective strategy for establishing experimentally based distributed representations of protein-protein/ligand complexes. The most important advantage of our method is that it does not require a complete dataset of complexes but instead infers feature interactions directly from sequence variation and molecular interaction. Using correlated sequence analysis, we additionally identified co-evolved mutations associated with maintaining HA/NA structural and functional variability toward immunity and therapeutic treatment. Our investigation of HA binding specificity revealed that a unique conserved stalk domain interacts with a unique loop domain of universal antibodies (CR9114, CT149, CR8043, CR8020, F16v3, CR6261, F10). On the other hand, NA inhibitors (Oseltamivir, Zanamivir, Laninamivir) showed specific conserved residue contributions similar to those of the NA substrate (sialic acid), which can be exploited for drug design. Our study provides important insight into the rational design and identification of novel therapeutics targeting universally recognized features of influenza HA/NA.

Keywords: influenza virus, hemagglutinin (HA), neuraminidase (NA), sequence evolution

Procedia PDF Downloads 164
1276 Exploring Simple Sequence Repeats within Conserved microRNA Precursors Identified from Tea Expressed Sequence Tag (EST) Database

Authors: Anjan Hazra, Nirjhar Dasgupta, Chandan Sengupta, Sauren Das

Abstract:

Tea (Camellia sinensis) has received substantial attention from the scientific world from time to time, not only for its commercial importance but also for its demand among health-conscious people across the world, owing to its extensive use as a potential source of antioxidant supplements. These health-benefit traits primarily rely on regulatory networks of different metabolic pathways. Development of microsatellite markers from conserved genomic regions is worthwhile for studying the genetic diversity of closely related or self-pollinated species. Although several SSR markers have been reported in tea, trait-specific Simple Sequence Repeats (SSRs) that can be used for marker-assisted breeding are yet to be identified. MicroRNAs are endogenous, noncoding, short RNAs directly involved in regulating gene expression at the post-transcriptional level. It has been found that diversity in a miRNA gene interferes with the formation of its characteristic hairpin structure and its subsequent function. In the present study, the precursors of small regulatory RNAs (microRNAs) have been fished out from the tea Expressed Sequence Tag (EST) database. Furthermore, the simple sequence repeat motifs within the putative miRNA precursor genes are also identified in order to experimentally validate their existence and function. It is already known that genic-SSR markers are a very adept and breeder-friendly source for genetic diversity analysis. So, the potential outcome of this in-silico study would provide some novel clues to understanding the miRNA-triggered polymorphic genic expression controlling specific metabolic pathways accountable for tea quality.

Keywords: micro RNA, simple sequence repeats, tea quality, trait specific marker

Procedia PDF Downloads 311
1275 Performance of High Density Genotyping in Sahiwal Cattle Breed

Authors: Hamid Mustafa, Huson J. Heather, Kim Eiusoo, Adeela Ajmal, Tad S. Sonstegard

Abstract:

The objective of this study was to evaluate the informativeness of Bovine high-density SNP genotyping in the Sahiwal cattle population. This is the first attempt to assess the Bovine HD SNP genotyping array in any Pakistani indigenous cattle population. To evaluate these SNPs on a genome-wide scale, we considered 777,962 SNPs spanning the autosomes and the X chromosome in the Sahiwal cattle population. Fifteen (15) unrelated gDNA samples were genotyped with the Bovine HD Infinium. Approximately 500,939 SNPs were found to be polymorphic (MAF > 0.05) in the Sahiwal cattle population. The results of this study indicate the potential application of Bovine high-density SNP genotyping in Pakistani indigenous cattle populations. The information generated from this array can be applied in genetic prediction, characterization and genome-wide association studies of the Pakistani Sahiwal cattle population.
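The polymorphism criterion quoted above (MAF > 0.05) can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the genotype encoding (two-letter calls such as "AB", with "--" for a missing call) and the function names are assumptions made for the example.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the minor allele frequency (MAF) of one biallelic SNP.

    genotypes: list of two-character calls such as "AA", "AB", "BB".
    Missing calls are assumed to be encoded as "--" and are skipped.
    Returns None if no call is usable or more than two alleles appear.
    """
    counts = Counter()
    for call in genotypes:
        if call == "--":
            continue
        counts.update(call)  # counts each of the two alleles in the call
    total = sum(counts.values())
    if total == 0 or len(counts) > 2:
        return None
    if len(counts) == 1:
        return 0.0  # monomorphic: only one allele observed
    return min(counts.values()) / total


def polymorphic_snps(calls_by_snp, threshold=0.05):
    """Return the SNP identifiers whose MAF exceeds the threshold."""
    selected = []
    for snp_id, genotypes in calls_by_snp.items():
        maf = minor_allele_frequency(genotypes)
        if maf is not None and maf > threshold:
            selected.append(snp_id)
    return selected
```

With 15 fully genotyped animals there are 30 alleles per SNP, so under this definition a SNP passes the 0.05 threshold only if the minor allele is observed at least twice (1/30 is below 0.05, 2/30 is above it).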

Keywords: Sahiwal cattle, polymorphic SNPs, genotyping, Pakistan

Procedia PDF Downloads 428
1274 Expression Profiling and Immunohistochemical Analysis of Squamous Cell Carcinoma of Head and Neck (Tumor, Transition Zone, Normal) by Whole Genome Scale Sequencing

Authors: Veronika Zivicova, Petr Broz, Zdenek Fik, Alzbeta Mifkova, Jan Plzak, Zdenek Cada, Herbert Kaltner, Jana Fialova Kucerova, Hans-Joachim Gabius, Karel Smetana Jr.

Abstract:

The possibility to determine genome-wide expression profiles of cells and tissues opens a new level of analysis in the quest to define dysregulation in malignancy and thus identify new tumor markers. Toward this long-term aim, we here address two issues on this level for head and neck cancer specimens: i) defining profiles in different regions, i.e. the tumor, the transition zone and normal control and ii) comparing complete data sets for seven individual patients. Special focus in the flanking immunohistochemical part is given to adhesion/growth-regulatory galectins that upregulate chemo- and cytokine expression in an NF-κB-dependent manner, to these regulators and to markers of differentiation, i.e. keratins. The detailed listing of up- and down-regulations, also available in printed form (1), not only served to unveil new candidates for testing as markers but also let the impact of the tumor in the transition zone become apparent. The extent of interindividual variation raises a strong cautionary note on assuming uniformity of regulatory events, to be noted when considering therapeutic implications. Thus, a combination of test targets (and a network analysis for galectins and their downstream effectors) is (are) advised prior to reaching conclusions on further perspectives.

Keywords: galectins, genome scale sequencing, squamous cell carcinoma, transition zone

Procedia PDF Downloads 238
1273 Prediction and Identification of a Permissive Epitope Insertion Site for St Toxoid in cfaB from Enterotoxigenic Escherichia coli

Authors: N. Zeinalzadeh, Mahdi Sadeghi

Abstract:

Enterotoxigenic Escherichia coli (ETEC) is the most common cause of non-inflammatory diarrhea in developing countries, resulting in approximately 20% of all diarrheal episodes in children in these areas. ST is one of the most important virulence factors, and CFA/I is one of the frequent colonization factors that contribute to the process of ETEC infection. ST and CfaB (the CFA/I subunit) are among the vaccine candidates against ETEC. However, because of its small size, ST is not a good immunogen in its natural form. To increase its immunogenic potential, we explored candidate positions for ST insertion in the CfaB sequence. After bioinformatics analysis, one of the candidate positions was selected, and the chimeric gene (cfaB*st) sequence was synthesized and expressed in E. coli BL21 (DE3). The chimeric recombinant protein was purified with Ni-NTA columns and characterized by western blot analysis. Residues 74-75 of the CfaB sequence could be a good candidate position for insertion of ST and other epitopes.

Keywords: bioinformatics, CFA/I, enterotoxigenic E. coli, ST toxoid

Procedia PDF Downloads 448
1272 Nucleotide Based Validation of the Endangered Plant Diospyros mespiliformis (Ebenaceae) by Evaluating Short Sequence Region of Plastid rbcL Gene

Authors: Abdullah Alaklabi, Ibrahim A. Arif, Sameera O. Bafeel, Ahmad H. Alfarhan, Anis Ahamed, Jacob Thomas, Mohammad A. Bakir

Abstract:

Diospyros mespiliformis (Hochst. ex A.DC.; Ebenaceae) is a large deciduous medicinal plant. This plant species is currently listed as endangered in Saudi Arabia. Molecular identification of this plant species based on short sequence regions (571 and 664 bp) of the plastid rbcL (ribulose-1,5-bisphosphate carboxylase) gene was investigated in this study. The endangered plant specimens were collected from Al-Baha, Saudi Arabia (GPS coordinates: 19.8543987, 41.3059349). The phylogenetic tree inferred from the rbcL gene sequences showed that this species is very closely related to D. brandisiana. A close relationship was also observed among D. bejaudii, D. philippinensis and D. releyi (≥99.7% sequence homology). The partial rbcL gene sequence region (571 bp) amplified by the rbcL primer pair rbcLaF-rbcLaR failed to discriminate D. mespiliformis from the closely related plant species D. brandisiana. In contrast, the primer pair rbcL1F-rbcL724R yielded a longer amplicon, discriminated the species from D. brandisiana and demonstrated nucleotide variations at 3 different sites (645G>T; 663A>C; 710C>G). Although D. mespiliformis (EU980712) and D. brandisiana (EU980656) are very closely related species (99.4%), the studied specimen showed 100% sequence homology with D. mespiliformis and 99.6% with D. brandisiana. The present findings show that the short sequence region (664 bp) of the plastid rbcL gene, amplified by the primer pair rbcL1F-rbcL724R, can be used for authenticating samples of D. mespiliformis and may help in the authentic identification and management of this medicinally valuable endangered plant species.
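The two quantities reported in this abstract, percent sequence identity and substitution sites written in the form 645G>T, can be computed from a pair of already aligned sequences roughly as follows. This is a minimal sketch under stated assumptions: the sequences are pre-aligned to equal length, gaps are written as "-", and the function names and the position offset are hypothetical choices for the example, not the authors' method.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


def substitutions(reference, sample, first_position=1):
    """List substitutions as '<position><ref>><alt>' strings, e.g. '645G>T'.

    first_position is the coordinate assigned to the first aligned column.
    """
    found = []
    pairs = zip(reference.upper(), sample.upper())
    for position, (ref, alt) in enumerate(pairs, start=first_position):
        if ref != alt and ref != "-" and alt != "-":
            found.append(f"{position}{ref}>{alt}")
    return found
```

For example, `substitutions("ACGT", "ACTT")` reports a single difference at the third column. Real alignments would normally come from an alignment tool rather than being typed by hand.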

Keywords: Diospyros mespiliformis, endangered plant, identification, partial rbcL

Procedia PDF Downloads 432
1271 Unveiling the Chaura Thrust: Insights into a Blind Out-of-Sequence Thrust in Himachal Pradesh, India

Authors: Rajkumar Ghosh

Abstract:

The Chaura Thrust, located in Himachal Pradesh, India, is a prominent geological feature that exhibits characteristics of an out-of-sequence thrust fault. This paper explores the geological setting of Himachal Pradesh, focusing on the Chaura Thrust's unique characteristics, its classification as an out-of-sequence thrust, and the implications of its presence in the region. The introduction provides background information on thrust faults and out-of-sequence thrusts, emphasizing their significance in understanding the tectonic history and deformation patterns of an area. It also outlines the objectives of the paper, which include examining the Chaura Thrust's geological features, discussing its classification as an out-of-sequence thrust, and assessing its implications for the region. The paper delves into the geological setting of Himachal Pradesh, describing the tectonic framework and providing insights into the formation of thrust faults in the region. Special attention is given to the Chaura Thrust, including its location, extent, and geometry, along with an overview of the associated rock formations and structural characteristics. The concept of out-of-sequence thrusts is introduced, defining their distinctive behavior and highlighting their importance in the understanding of geological processes. The Chaura Thrust is then analyzed in the context of an out-of-sequence thrust, examining the evidence and characteristics that support this classification. Factors contributing to the out-of-sequence behavior of the Chaura Thrust, such as stress interactions and fault interactions, are discussed. The geological implications and significance of the Chaura Thrust are explored, addressing its impact on the regional geology, tectonic evolution, and seismic hazard assessment. The paper also discusses the potential geological hazards associated with the Chaura Thrust and the need for effective mitigation strategies in the region. 
Future research directions and recommendations are provided, highlighting areas that warrant further investigation, such as detailed structural analyses, geodetic measurements, and geophysical surveys. The importance of continued research in understanding and managing geological hazards related to the Chaura Thrust is emphasized. In conclusion, the Chaura Thrust in Himachal Pradesh represents an out-of-sequence thrust fault that has significant implications for the region's geology and tectonic evolution. By studying the unique characteristics and behavior of the Chaura Thrust, researchers can gain valuable insights into the geological processes occurring in Himachal Pradesh and contribute to a better understanding and mitigation of seismic hazards in the area.

Keywords: chaura thrust, out-of-sequence thrust, himachal pradesh, geological setting, tectonic framework, rock formations, structural characteristics, stress interactions, fault interactions, geological implications, seismic hazard assessment, geological hazards, future research, mitigation strategies.

Procedia PDF Downloads 79
1270 Neural Machine Translation for Low-Resource African Languages: Benchmarking State-of-the-Art Transformer for Wolof

Authors: Cheikh Bamba Dione, Alla Lo, Elhadji Mamadou Nguer, Siley O. Ba

Abstract:

In this paper, we propose two neural machine translation (NMT) systems (French-to-Wolof and Wolof-to-French) based on sequence-to-sequence with attention and transformer architectures. We trained our models on a parallel French-Wolof corpus of about 83k sentence pairs. Because of the low-resource setting, we experimented with advanced methods for handling data sparsity, including subword segmentation, back translation, and the copied corpus method. We evaluate the models using the BLEU score and find that transformer outperforms the classic seq2seq model in all settings, in addition to being less sensitive to noise. In general, the best scores are achieved when training the models on word-level-based units. For subword-level models, using back translation proves to be slightly beneficial in low-resource (WO) to high-resource (FR) language translation for the transformer (but not for the seq2seq) models. A slight improvement can also be observed when injecting copied monolingual text in the target language. Moreover, combining the copied method data with back translation leads to a substantial improvement of the translation quality.

Keywords: backtranslation, low-resource language, neural machine translation, sequence-to-sequence, transformer, Wolof

Procedia PDF Downloads 147
1269 Opaque Mineralogy of the Late Precambrian Ophiolites from Bou Azzer Area, Anti-atlas, Morrocco

Authors: Yaser Maher Abdelaziz Hawa

Abstract:

The basic-ultrabasic rocks of the Bou Azzer ophiolite complex in the Anti-Atlas, Morocco, enclose some oxide and sulfide minerals as disseminated traces. The oxide minerals show a wide variation in composition, ranging from Cr-free titanomagnetite and ilmenite in the chilled-margin gabbro of the upper part of the ophiolite sequence to Al-rich chromian spinel and pure magnetite enclosed in the serpentinized peridotite in the lower part of the sequence. Five mineral assemblages have been distinguished depending on the rock type of the ophiolite sequence: (1) gersdorffite + chalcopyrite + Al-Mg-rich chromian spinel + pure magnetite, hosted by serpentinized peridotite; (2) pyrite + chalcopyrite, enclosed in metagabbro and overlying the ultrabasic cumulates; (3) Al-Fe-rich chromian spinel with rims of Al-rich chromian magnetite, enclosed in wehrlite; (4) titanomagnetite replaced by sphene, enclosed in marginal gabbro; (5) pyrrhotite exsolving pentlandite + ilmenite + Al-rich chromian spinel + magnetite, enclosed in fresh olivine in the upper part of the ophiolite sequence.

Keywords: opaques, ophiolites, anti-atlas, morrocco

Procedia PDF Downloads 106
1268 COVID-19 Genomic Analysis and Complete Evaluation

Authors: Narin Salehiyan, Ramin Ghasemi Shayan

Abstract:

The subject of this chapter is the manipulation of coronavirus clones and complementary DNAs (cDNAs) of defective-interfering (DI) RNAs in order to investigate coronavirus RNA replication, transcription, recombination, protein processing and transport, virion assembly, the identification of coronavirus-specific cell receptors, and polymerase processing. The coronavirus genome is a nonsegmented, single-stranded, positive-sense RNA. Compared to other RNA viruses, its size is significantly greater, ranging from 27 to 32 kb. The gene encoding the large surface glycoprotein is up to 4.4 kb, encoding an imposing trimeric, highly glycosylated protein. This projects some 20 nm above the virion envelope, giving the virus the appearance, with a little imagination, of a crown or coronet. Coronavirus research has contributed to the understanding of many aspects of molecular biology in general, such as the mechanism of RNA synthesis, translational control, and protein transport and processing. It remains a treasure capable of generating unexpected insights.

Keywords: covid-19, corona, virus, genome, genetic

Procedia PDF Downloads 72
1267 The Various Legal Dimensions of Genomic Data

Authors: Amy Gooden

Abstract:

When human genomic data is considered, this is often done through only one dimension of the law, or the interplay between the various dimensions is not considered, thus providing an incomplete picture of the legal framework. This research considers and analyzes the various dimensions in South African law applicable to genomic sequence data – including property rights, personality rights, and intellectual property rights. The effective use of personal genomic sequence data requires the acknowledgement and harmonization of the rights applicable to such data.

Keywords: artificial intelligence, data, law, genomics, rights

Procedia PDF Downloads 138
1266 Genomic Surveillance of Bacillus Anthracis in South Africa Revealed a Unique Genetic Cluster of B- Clade Strains

Authors: Kgaugelo Lekota, Ayesha Hassim, Henriette Van Heerden

Abstract:

Bacillus anthracis, the causative agent of anthrax, is composed of three genetic groups, namely A, B, and C. Clade A is distributed worldwide, while sub-clade B has been identified in Kruger National Park (KNP), South Africa. KNP is one of the endemic anthrax regions in South Africa, with distinctive genetic diversity. Genomic surveillance of KNP B. anthracis strains was carried out on historical culture collection isolates (n=67) dating from the 1990s to 2015 using a whole genome sequencing approach. Whole genome single nucleotide polymorphisms (SNPs) and pan-genomic analysis were used to define the B. anthracis genetic population structure. This study showed that KNP has heterologous B. anthracis strains grouping in the A-clade, with the ABr.005/006 (Ancient A) SNP lineage more prominent. The 2012 and 2015 anthrax isolates are dispersed amongst minor sub-clades that prevail in non-stabilized genetic evolution strains. This was augmented with non-parsimony informative SNPs of the B. anthracis strains across minor sub-clades of the Ancient A clade. Pan-genomics of B. anthracis showed a clear distinction between A- and B-clade genomes, with 11 374 predicted clusters of protein-coding genes. Unique accessory genes of B-clade genomes included biosynthetic cell wall genes and genes for multidrug resistance to fosfomycin. South Africa harbors diverse B. anthracis strains with unique, defined SNPs. The sequenced B. anthracis strains in this study will serve as a means to further trace the dissemination of B. anthracis outbreaks globally and especially in South Africa.

Keywords: bacillus anthracis, whole genome single nucleotide polymorphisms, pangenomics, kruger national park

Procedia PDF Downloads 150
1265 Identification of Candidate Gene for Root Development and Its Association With Plant Architecture and Yield in Cassava

Authors: Abiodun Olayinka, Daniel Dzidzienyo, Pangirayi Tongoona, Samuel Offei, Edwige Gaby Nkouaya Mbanjo, Chiedozie Egesi, Ismail Yusuf Rabbi

Abstract:

Cassava (Manihot esculenta Crantz) is a major source of starch for various industrial applications. However, the traditional cultivation and harvesting methods of cassava are labour-intensive and inefficient, limiting the supply of fresh cassava roots for industrial starch production. To achieve improved productivity and quality of fresh cassava roots through mechanized cultivation, cassava cultivars with compact plant architecture and moderate plant height are needed. Plant architecture-related traits, such as plant height, harvest index, stem diameter, branching angle, and lodging tolerance, are critical for crop productivity and suitability for mechanized cultivation. However, the genetics of cassava plant architecture remain poorly understood. This study aimed to identify the genetic bases of the relationships between plant architecture traits and productivity-related traits, particularly starch content. A panel of 453 clones developed at the International Institute of Tropical Agriculture, Nigeria, was genotyped and phenotyped for 18 plant architecture and productivity-related traits at four locations in Nigeria. A genome-wide association study (GWAS) was conducted using the phenotypic data from a panel of 453 clones and 61,238 high-quality Diversity Arrays Technology sequencing (DArTseq) derived Single Nucleotide Polymorphism (SNP) markers that are evenly distributed across the cassava genome. Five significant associations between ten SNPs and three plant architecture component traits were identified through GWAS. We found five SNPs on chromosomes 6 and 16 that were significantly associated with shoot weight, harvest index, and total yield through genome-wide association mapping. We also discovered an essential candidate gene that is co-located with peak SNPs linked to these traits in M. esculenta. 
A review of the cassava reference genome v7.1 revealed that the SNP on chromosome 6 is in proximity to Manes.06G101600.1, a gene that regulates endodermal differentiation and root development in plants. The findings of this study provide insights into the genetic basis of plant architecture and yield in cassava. Cassava breeders could leverage this knowledge to optimize plant architecture and yield in cassava through marker-assisted selection and targeted manipulation of the candidate gene.

Keywords: manihot esculenta crantz, plant architecture, dartseq, snp markers, genome-wide association study

Procedia PDF Downloads 95
1264 Predicting Open Chromatin Regions in Cell-Free DNA Whole Genome Sequencing Data by Correlation Clustering  

Authors: Fahimeh Palizban, Farshad Noravesh, Amir Hossein Saeidian, Mahya Mehrmohamadi

Abstract:

In the past decade, the emergence of liquid biopsy has significantly improved cancer monitoring and detection. Dying cells, including those originating from tumors, shed their DNA into the blood and contribute to a pool of circulating fragments called cell-free DNA. Accordingly, identifying the tissue of origin of these DNA fragments from plasma can result in faster and more accurate disease diagnosis and more precise treatment protocols. Open chromatin regions (OCRs) are important epigenetic features of DNA that reflect cell types of origin. Profiling these features by DNase-seq, ATAC-seq, and histone ChIP-seq provides insights into tissue-specific and disease-specific regulatory mechanisms. Several studies in the area of cancer liquid biopsy integrate distinct genomic and epigenomic features for early cancer detection along with tissue-of-origin detection. However, multimodal analysis requires several types of experiments to cover the genomic and epigenomic aspects of a single sample, which leads to substantial cost and time. To overcome these limitations, the idea of predicting OCRs from whole genome sequencing (WGS) is of particular importance. In this regard, we propose a computational approach to predict open chromatin regions, as an important epigenetic feature, from cell-free DNA whole genome sequence data. To fulfill this objective, local sequencing depth is fed to our proposed algorithm, and the most probable open chromatin regions are predicted from whole genome sequencing data. Our method integrates signal processing with sequencing depth data and includes count normalization, Discrete Fourier Transform conversion, graph construction, graph cut optimization by linear programming, and clustering.
To validate the proposed method, we compared the output of the clustering (open chromatin region+, open chromatin region-) with previously validated open chromatin regions related to human blood samples in the ATAC-DB database. The percentage of overlap between predicted open chromatin regions and the experimentally validated regions obtained by ATAC-seq in ATAC-DB is greater than 67%, which indicates meaningful prediction. As is evident, OCRs are mostly located at the transcription start sites (TSS) of genes. In this regard, we assessed the concordance between the predicted OCRs and the human gene TSS regions obtained from refTSS, and it showed agreement of around 52.04% with all genes and ~78% with housekeeping genes. Accurately detecting open chromatin regions from plasma cell-free DNA-seq data is a very challenging computational problem due to the existence of several confounding factors, such as technical and biological variations. Although this approach is in its infancy, there has already been an attempt to apply it, which led to a tool named OCRDetector with some restrictions, such as the need for high-depth cfDNA WGS data, prior information about the distribution of OCRs, and the consideration of multiple features. In contrast, we implemented graph signal clustering based on a single depth feature in an unsupervised learning manner, which resulted in faster performance and decent accuracy. Overall, we investigated the epigenomic pattern of a cell-free DNA sample from a new computational perspective that can be used along with other tools to investigate genetic and epigenetic aspects of a single whole genome sequencing dataset for efficient liquid biopsy-related analysis.
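Two of the steps named in this abstract (count normalization and the Discrete Fourier Transform of a depth signal) and the overlap-based validation can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: mean scaling is only one possible normalization, the overlap definition (share of predicted intervals touching any reference interval, with half-open coordinates) is not taken from the abstract, and the graph construction, linear-programming graph cut and clustering steps are not shown.

```python
import cmath


def normalize_counts(depth):
    """Scale per-bin read depth so that the bins average to 1.0."""
    mean = sum(depth) / len(depth)
    if mean == 0:
        raise ValueError("depth signal is all zeros")
    return [d / mean for d in depth]


def dft_magnitudes(signal):
    """Magnitudes of the discrete Fourier transform (naive O(n^2) version)."""
    n = len(signal)
    magnitudes = []
    for k in range(n):
        coefficient = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        magnitudes.append(abs(coefficient))
    return magnitudes


def percent_overlapping(predicted, reference):
    """Percentage of predicted (start, end) intervals that overlap any
    reference interval. Intervals are half-open: start <= x < end."""
    if not predicted:
        return 0.0
    hits = 0
    for p_start, p_end in predicted:
        if any(p_start < r_end and r_start < p_end for r_start, r_end in reference):
            hits += 1
    return 100.0 * hits / len(predicted)
```

The naive transform is kept for readability; on genome-scale depth tracks a fast Fourier transform from a numerical library would be the practical choice.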

Keywords: open chromatin regions, cancer, cell-free DNA, epigenomics, graph signal processing, correlation clustering

Procedia PDF Downloads 150
1263 The Influence of Directionality on the Giovanelli Illusion

Authors: Michele Sinico

Abstract:

In the Giovanelli illusion, some collinear dots appear misaligned, when each dot lies within a circle and the circles are not collinear. In this illusion, the role of the frame of reference, determined by the circles, is considered a crucial factor. Three experiments were carried out to study the influence of directionality of the circles on the misalignment. The adjustment method was used. Participants changed the orthogonal position of each dot, from the left to the right of the sequence, until a collinear sequence of dots was achieved. The first experiment verified the illusory effect of the misalignment. In the second experiment, the influence of two different directionalities of the circles (-0.58° and +0.58°) on the misalignment was tested. The results show an over-normalization on the sequences of the dots. The third experiment tested the misalignment of the dots without any inclination of the sequence of circles (0°). Only a local illusory effect was found. These results demonstrate that the directionality of the circles, as a global factor, can increase the misalignment. The findings also indicate that directionality and the frame of reference are independent factors in explaining the Giovanelli illusion.

Keywords: Giovanelli illusion, visual illusion, directionality, misalignment, the frame of reference

Procedia PDF Downloads 178
1262 Complete Chloroplast DNA Sequences of Georgian Endemic Polyploid Wheats

Authors: M. Gogniashvili, I. Maisaia, A. Kotorashvili, N. Kotaria, T. Beridze

Abstract:

Three types of plasmon (A, B and G) are typical of the genus Triticum. In the polyploid species Triticum turgidum L. and Triticum aestivum L., plasmon B is detected. In the forthcoming paper, the complete nucleotide sequence of chloroplast DNA of 11 representatives of Georgian polyploid wheat species carrying plasmon B was determined. Sequencing of chloroplast DNA was performed on an Illumina MiSeq platform. Chloroplast DNA molecules were assembled using the SOAPdenovo computer program. All contigs were aligned to the reference chloroplast genome sequence using BLASTN. For detection of SNPs and indels and for phylogeny tree construction, the computer programs Mafft and Blast were used. Using Triticum aestivum L. subsp. macha (Dekapr. & Menabde) Mackey var. paleocolchicum Dekapr. et Menabde as a reference, 5 SNPs can be identified in the chloroplast DNA of Georgian endemic polyploid wheat: 2 noncoding substitutions and 3 coding substitutions. In comparison with the reference DNA, two inversions (38 bp and 56 bp) were observed in the paleocolchicum subspecies. Six 1 bp indels were detected in Georgian polyploid wheats, all of them at microsatellite stretches. The phylogeny tree shows that subspecies macha, carthlicum and paleocolchicum occupy different positions. According to the simplified scheme based on SNP and indel data, the ancestral female parent of all the studied polyploid wheats is an unknown predecessor X, from which four lines were formed. One SNP and two inversions (38 bp and 56 bp) caused the formation of subsp. paleocolchicum. The three other lines are the macha, durum and carthlicum lines. The macha line is further divided into two sublines (M_1 and M_4). The carthlicum line includes subsp. carthlicum and T. aestivum - C_1 - C_2 - A_1. One of the central questions of wheat domestication is which people(s) participated in it.
It is proposed that the predecessors of Georgian peoples (Proto-Kartvelians) must be placed, on the evidence of archaic lexical and toponymic data, in the mountainous regions of the western and central part of the Little Caucasus (the Transcaucasian foothills) at least 4,000 years ago. One possible explanation of the ‘wheat puzzle’ is that Kartvelian speakers brought domesticated wheat species and subspecies from the Fertile Crescent further north to the South Caucasus.

Keywords: chloroplast DNA, sequencing, SNP, triticum

Procedia PDF Downloads 153
1261 Epigenetic Mechanisms Involved in the Occurrence and Development of Infectious Diseases

Authors: Frank Boris Feutmba Keutchou, Saurelle Fabienne Bieghan Same, Verelle Elsa Fogang Pokam, Charles Ursula Metapi Meikeu, Angel Marilyne Messop Nzomo, Ousman Tamgue

Abstract:

Infectious diseases are one of the most important causes of morbidity and mortality worldwide. These diseases are caused by micro-pathogenic organisms, such as bacteria, viruses, parasites, and fungi. Heritable changes in gene expression that do not involve changes to the underlying DNA sequence are referred to as epigenetics. Emerging evidence suggests that epigenetic mechanisms are important in the emergence and progression of infectious diseases. Pathogens can manipulate host epigenetic machinery to promote their own replication and evade immune responses. The Human Genome Project has provided new opportunities for developing better tools for the diagnosis and identification of target genes. Several epigenetic modifications, such as DNA methylation, histone modifications, and non-coding RNA expression, have been shown to influence infectious disease outcomes. Understanding the epigenetic mechanisms underlying infectious diseases may result in the progression of new therapeutic approaches focusing on host-pathogen interactions. The goal of this study is to show how different infectious agents interact with host cells after infection.

Keywords: epigenetic, infectious disease, micro-pathogenic organism, phenotype

Procedia PDF Downloads 80
1260 Prevalence Determination of Hepatitis D Virus Genotypes among HBsAg Positive Patients in Kerman Province of Iran

Authors: Khabat Barkhordari, Ali Mohammad Arabzadeh

Abstract:

Hepatitis delta virus (HDV) is an RNA virus that needs the function of hepatitis B virus (HBV) for its propagation and assembly. Infection by HDV can occur simultaneously with HBV infection and cause acute hepatitis, or develop as a secondary infection in patients already suffering from HBV. Based on genome sequence analysis, HDV has several genotypes, which show a broad geographic distribution and diverse clinical features. The aim of the current study is to determine the prevalence of hepatitis delta virus genotypes in patients with positive HBsAg in Kerman province of Iran. In this cross-sectional study, a total of 400 patients with HBV infection attending the clinic center of Besat from 2012 to 2014 were included. We carried out ELISA to detect anti-HDV antibodies. Those testing positive were analyzed further for HDV RNA and for genotyping using restriction fragment length polymorphism (RFLP) and RT-nested PCR sequencing. Among the 400 patients in this study, 67 cases (16.75%) had anti-HDV antibodies, and HDV RNA was found in just 7 (1.75%) serum samples. Analysis of these 7 HDV-positive samples showed that all of them have genotype I. According to the current study, the HDV prevalence in Kerman is higher than the reported prevalence of 6.6% for Iran as a whole, and clade 1 (genotype 1) is the predominant clade of HDV in Kerman.
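
The percentages in the abstract follow directly from the reported counts. The short check below reproduces them; the helper name `percent` and the variable names are illustrative, and the last figure (RNA-positive share among seropositive patients) is derived here from the same counts rather than stated in the abstract.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


total_patients = 400   # HBsAg-positive patients enrolled
seropositive = 67      # anti-HDV antibody positive by ELISA
rna_positive = 7       # HDV RNA detected

seroprevalence = percent(seropositive, total_patients)         # 16.75
rna_prevalence = percent(rna_positive, total_patients)         # 1.75
rna_among_seropositive = percent(rna_positive, seropositive)   # about 10.4
```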

Keywords: genotyping, hepatitis delta virus, molecular epidemiology, Kerman, Iran

Procedia PDF Downloads 294
1259 Unraveling the Puzzle of Out-of-Sequence Thrusting in the Higher Himalaya: Focus on Jhakri-Chaura-Sarahan Thrust, Himachal Pradesh, India

Authors: Rajkumar Ghosh

Abstract:

The study examines the structural analysis of Chaura Thrust in Himachal Pradesh, India, focusing on the activation timing of Main Central Thrust (MCT) and South Tibetan Detachment System (STDS), mylonitised zones, and the characterization of box fold and its signature in the regional geology of Himachal Himalaya. The research aims to document the Higher Himalayan Out-of-Sequence Thrust (OOST) in Himachal Pradesh, which activated the MCTL and in between a zone south of MCTU. The study also documents the GBM-associated temperature range and the activation of Higher Himalayan Out-of-Sequence Thrust (OOST) in Himachal Pradesh. The findings contribute to understanding the structural analysis of Chaura Thrust and its signature in the regional geology of Himachal Himalaya. The study highlights the significance of microscopic studies in documenting mylonitized zones and identifying various types of crenulated schistosity. The study concludes that Chaura Thrust is not a blind thrust and details the field evidence for the OOST. The study characterizes the box fold and its signature in the regional geology of Himachal Himalaya. The study also documents the activation timing and ages of MCT, STDS, MBT, and MFT and identifies various types of crenulated schistosity under the microscope. The study also highlights the significance of microscopic studies in the structural analysis of Chaura Thrust. Finally, the study documents the activation of Higher Himalayan Out-of-Sequence Thrust (OOST) in Himachal Pradesh and the expectations for strain variation near the OOST.

Keywords: Chaura Thrust, Higher Himalaya, Jhakri Thrust, Main Central Thrust, Out-of-Sequence Thrust, Sarahan Thrust

Procedia PDF Downloads 89
1258 Predictive Pathogen Biology: Genome-Based Prediction of Pathogenic Potential and Countermeasures Targets

Authors: Debjit Ray

Abstract:

Horizontal gene transfer (HGT) and recombination lead to the emergence of bacterial antibiotic resistance and pathogenic traits. Comparing a large number of fully sequenced genomes across a species or genus can identify HGT events, define the phylogenetic range of HGT, and find potential sources of new resistance genes. In-depth comparative phylogenomics can also identify subtle genome or plasmid structural changes or mutations associated with phenotypic changes. Comparative phylogenomics requires accurately sequenced, complete and properly annotated genomes of the organism. Assembling closed genomes requires additional mate-pair reads or “long read” sequencing data to accompany short-read paired-end data. To bring down the cost and time required to produce assembled genomes and annotate genome features that inform drug resistance and pathogenicity, we are analyzing the performance for genome assembly of data from the Illumina NextSeq, which has faster throughput than the Illumina HiSeq (~1-2 days versus ~1 week), and shorter reads (150 bp paired-end versus 300 bp paired-end) but higher capacity (150-400M reads per run versus ~5-15M) compared to the Illumina MiSeq. Bioinformatics improvements are also needed to make rapid, routine production of complete genomes a reality. Modern assemblers such as SPAdes 3.6.0 running on a standard Linux blade can, in a few hours, convert mixes of reads from different library preps into high-quality assemblies with only a few gaps. Remaining breaks in scaffolds, which are generally due to repeats (e.g., rRNA genes), are addressed by our gap closure software, which avoids custom PCR or targeted sequencing. Our goal is to improve the understanding of the emergence of pathogenesis using sequencing, comparative genomics, and machine learning analysis of ~1000 pathogen genomes.
Machine learning algorithms will be used to digest the diverse features (change in virulence genes, recombination, horizontal gene transfer, patient diagnostics). Temporal data and evolutionary models can thus determine whether the origin of a particular isolate is likely to have been from the environment (could it have evolved from previous isolates). It can be useful for comparing differences in virulence along or across the tree. More intriguing, it can test whether there is a direction to virulence strength. This would open new avenues in the prediction of uncharacterized clinical bugs and multidrug resistance evolution and pathogen emergence.

Keywords: genomics, pathogens, genome assembly, superbugs

Procedia PDF Downloads 197
1257 Identification of Genomic Mutations in Prostate Cancer and Cancer Stem Cells By Single Cell RNAseq Analysis

Authors: Wen-Yang Hu, Ranli Lu, Mark Maienschein-Cline, Danping Hu, Larisa Nonn, Toshi Shioda, Gail S. Prins

Abstract:

Background: Genetic mutations are highly associated with increased prostate cancer risk. In addition to whole genome sequencing, somatic mutations can be identified by aligning transcriptome sequences to the human genome. Here we analyzed bulk RNAseq and single cell RNAseq data of human prostate cancer cells and their matched non-cancer cells in benign regions from 4 individual patients. Methods: Raw sequencing reads were aligned to the reference genome hg38 using STAR. Variants were annotated using Annovar with respect to overlapping gene annotation, effect on gene and protein sequence, and SIFT annotation of nonsynonymous variant effect. We determined cancer-specific novel alleles by comparing variant calls in cancer cells to matched benign cells from the same individual and selecting unique alleles that were detected only in the cancer samples. Results: In bulk RNAseq data from 3 patients, the most common variants were noncoding mutations at UTR3/UTR5, and the major variant types were single-nucleotide polymorphisms (SNPs) including frameshift mutations. The C>T transition is the most frequent SNP substitution. A total of 222 genes carrying unique exonic or UTR variants were revealed in cancer cells across 3 patients but not in benign cells. Among them, transcript levels of 7 genes (CITED2, YOD1, MCM4, HNRNPA2B1, KIF20B, DPYSL2, NR4A1) were significantly up- or down-regulated in cancer stem cells. Out of the 222 commonly mutated genes in cancer, 19 have nonsynonymous variants and 11 are damaged genes with variants including those flagged by SIFT, frameshifts, stop gain/loss, and insertions/deletions (indels). Two damaged genes, activating transcription factor 6 (ATF6) and histone demethylase KDM3A, are of particular interest; the former is a survival factor for certain cancer cells, while the latter positively activates androgen receptor target genes in prostate cancer.
Further, single cell RNAseq data of cancer cells and their matched non-cancer benign cells from both primary 2D and 3D tumoroid cultures were analyzed. Similar to the bulk RNAseq data, single cell RNAseq in cancer demonstrated that exonic mutations are less common than noncoding variants, with SNPs including frameshift mutations the most frequent types in cancer. Compared to cancer stem cell-enriched 3D tumoroids, 2D cancer cells carried 3 times more variants, 8 times more coding mutations and 10 times more nonsynonymous SNPs. Finally, in both 2D primary and 3D tumoroid cultures, cancer stem cells exhibited fewer coding mutations and fewer noncoding SNPs or insertions/deletions than non-stem cancer cells. Summary: Our study demonstrates the usefulness of bulk and single cell RNAseq data in identifying somatic mutations in prostate cancer, providing an alternative method for screening candidate genes for prostate cancer diagnosis and potential therapeutic targets. Cancer stem cells carry fewer somatic mutations than non-stem cancer cells due to their immortal strand DNA inherited from parental stem cells, which explains their long-lived characteristics.
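
The selection of cancer-specific alleles described in the Methods amounts to a set difference between the variant calls of the cancer sample and those of the matched benign sample. A minimal sketch, assuming variants are keyed by chromosome, position, reference allele, and alternate allele; the `Variant` type, the function names, and the example calls are hypothetical and do not come from the study.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Set


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def cancer_specific(
    cancer: Iterable[Variant], benign: Iterable[Variant]
) -> Set[Variant]:
    """Keep alleles called in the cancer sample but absent from the matched benign sample."""
    return set(cancer) - set(benign)


def substitution_counts(variants: Iterable[Variant]) -> Counter:
    """Count single-base substitutions by type (for example 'C>T'); indels are skipped."""
    return Counter(
        f"{v.ref}>{v.alt}"
        for v in variants
        if len(v.ref) == 1 and len(v.alt) == 1
    )


# Made-up calls for one patient, for illustration only.
cancer_calls = [
    Variant("chr1", 100, "C", "T"),
    Variant("chr2", 200, "G", "A"),
    Variant("chr3", 300, "C", "T"),
    Variant("chr4", 400, "A", "AT"),  # 1 bp insertion
]
benign_calls = [Variant("chr2", 200, "G", "A")]

unique_to_cancer = cancer_specific(cancer_calls, benign_calls)
counts = substitution_counts(unique_to_cancer)
```

In practice the comparison would be done per patient, so that an allele shared with the same individual's benign cells is never counted as cancer-specific.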

Keywords: prostate cancer, stem cell, genomic mutation, RNAseq

Procedia PDF Downloads 18
1256 Depositional Facies, High Resolution Sequence Stratigraphy, Reservoir Characterization of Early Oligocene Carbonates (Mukta Formation) Of North & Northwest of Heera, Mumbai Offshore

Authors: Almas Rajguru, Archana Kamath, Rachana Singh

Abstract:

The study aims to determine the depositional facies, high-resolution sequence stratigraphy, and diagenetic processes of Early Oligocene carbonates in N & N-W of Heera, Mumbai Offshore. Foraminiferal assemblage and microfacies from cores of Well A, B, C, D and E are indicative of facies association related to four depositional environments, i.e., restricted inner lagoons-tidal flats, shallow open lagoons, high energy carbonate bars-shoal complex and deeper mid-ramps of a westerly dipping homoclinal carbonate ramp. Two high-frequency (4th Order) depositional sequences bounded by sequence boundary, DS1 and DS2, displaying hierarchical stacking patterns, are identified and correlated across wells. Vadose zone diagenesis effect during short diastem/ subaerial exposure has rendered good porosity due to dissolution in HST carbonates and occasionally affected underlying TST sediments (Well D, C and E). On mapping and correlating the sequences, the presence of thin carbonate bars that can be potential reservoirs are envisaged along NW-SE direction, towards north and south of Wells E, D and C. A more pronounced development of these bars in the same orientation can be anticipated towards the west of the study area.

Keywords: sequence stratigraphy, depositional facies, diagenesis petrography, early Oligocene, Mumbai offshore

Procedia PDF Downloads 77
1255 Approximation of Convex Set by Compactly Semidefinite Representable Set

Authors: Anusuya Ghosh, Vishnu Narayanan

Abstract:

The approximation of a convex set by a semidefinite representable set plays an important role in semidefinite programming, especially in modern convex optimization. Optimizing a linear function over a convex set is a hard problem. However, optimizing the linear function over a semidefinite representable set that approximates the convex set is easy, as numerous efficient algorithms exist for solving semidefinite programming problems. Our approximation technique is therefore significant in optimization. We develop a technique to approximate any closed convex set K by a compactly semidefinite representable set. Further, we prove that there exists a sequence of compactly semidefinite representable sets that gives progressively tighter approximations of the closed convex set K. We discuss the convergence of this sequence of compactly semidefinite representable sets to the closed convex set K. The recession cone of K and the recession cone of the compactly semidefinite representable set are equal, so we say that the sequence of compactly semidefinite representable sets converges strongly to the closed convex set. Thus, this approximation technique is a very useful development in semidefinite programming.

Keywords: semidefinite programming, semidefinite representable set, compactly semidefinite representable set, approximation

Procedia PDF Downloads 386
1254 Sequence Stratigraphy and Petrophysical Analysis of Sawan Gas Field, Central Indus Basin, Pakistan

Authors: Saeed Ur Rehman Chaudhry

Abstract:

The objectives of the study are to reconstruct the sequence stratigraphic framework of Sawan Gas Field and to carry out petrophysical analysis of the reservoir identified using sequence stratigraphy. The study area lies in the Central Indus Basin, District Khairpur, Sindh province, Pakistan, tectonically in an extensional regime. The Lower Goru Formation and Sembar Formation act as reservoir and source, respectively. To achieve these objectives, a data set consisting of seismic lines PSM96-114, PSM96-115, PSM96-133, PSM98-201 and PSM98-202 and well logs of Sawan-01, Sawan-02 and Gajwaro-01 has been used. First, the seismic lines were interpreted. The interpretation shows extensional faulting in the area that cuts the entire Cretaceous section. A total of seven reflectors have been marked on each seismic line. The Lower Goru Formation thins towards the west. Seismic lines also show an eastward tilt of the stratigraphy due to uplift on the western side. Sequence stratigraphic reconstruction has been done by integrating seismic and wireline log data. A total of seven sequence boundaries have been interpreted between the top of the Chiltan Limestone and the top of the Lower Goru Formation. It has been observed on seismic lines that the Sembar Formation initially generated a shelf margin profile and then a ramp margin on which Lower Goru deposition took place. Shelf edge deltas and slope fans have been observed on seismic lines, and signatures of slope fans are also observed on wireline logs. A total of six sequences have been interpreted. Stratigraphic and sequence stratigraphic correlation has been carried out using Sawan-01, Sawan-02 and Gajwaro-01, and a Lowstand Systems Tract (LST) within the Lower Goru C sands has been marked as a zone of interest. The petrophysical interpretation includes shale volume, effective porosity, permeability, and water and hydrocarbon saturation.
On the basis of good effective porosity and hydrocarbon saturation petrophysical analysis confirms that the LST in Sawan-01 and Sawan-02 has good hydrocarbon potential.

Keywords: petrophysical analysis, reservoir potential, Sawan Gas Field, sequence stratigraphy

Procedia PDF Downloads 262
1253 The Effect of Ingredients Mixing Sequence in Rubber Compounding on the Formation of Bound Rubber and Cross-Link Density of Natural Rubber

Authors: Abu Hasan, Rochmadi, Hary Sulistyo, Suharto Honggokusumo

Abstract:

The purpose of this research is to study the effect of the ingredient mixing sequence in rubber compounding on the formation of bound rubber and the cross-link density of natural rubber, and also the relationship between bound rubber and cross-link density. Bound rubber formation in the rubber compound and cross-link density of the rubber vulcanizates were analyzed for a natural rubber formula that was masticated and mixed, followed by curing. There were four methods of mixing, and each mixing process was followed by four mixing sequence methods of carbon black into the rubber. In the first method of mixing sequence, rubber was masticated for 5 min and then rubber chemicals and carbon black N 330 were added simultaneously. In the second one, rubber was masticated for 1 min, followed by simultaneous addition of rubber chemicals and carbon black N 330 using a different method of mixing than the first one. In the third one, carbon black N 660 was used with the same mixing procedure as the second one, and in the last one, rubber was masticated for 3 min, and carbon black N 330 and rubber chemicals were added subsequently. The addition of rubber chemicals and carbon black into masticated rubber was distinguished by the sequence and time allocated for each mixing process. Carbon black was added in two stages. In the first sequence, 10 phr was added first and the remaining 40 phr was added later along with oil. In the second to the fourth sequences, carbon black was added in the first and second stages in phr ratios of 20:30, 30:20, and 40:10. The results showed that the ingredient mixing process influenced bound rubber formation and cross-link density. In three of the mixing methods, bound rubber formation was proportional to cross-link density. In contrast, in the fourth one, bound rubber formation and cross-link density had an inverse relation. Regardless of the mixing method used, bound rubber had a nonlinear relationship with cross-link density.
High cross-link density was formed when bound rubber formation was low. The cross-link density became constant at high bound rubber content.

Keywords: bound-rubber, cross-link density, natural rubber, rubber mixing process

Procedia PDF Downloads 411
1252 Evaluating the Potential of a Fast Growing Indian Marine Cyanobacterium by Reconstructing and Analysis of a Genome Scale Metabolic Model

Authors: Ruchi Pathania, Ahmad Ahmad, Shireesh Srivastava

Abstract:

Cyanobacteria are promising microbes that can capture and convert atmospheric CO₂ and light into valuable industrial bio-products like biofuels, biodegradable plastics, etc. Among their most attractive traits are faster autotrophic growth, year-round cultivation on non-arable land, high photosynthetic activity, much greater biomass and productivity, and ease of genetic manipulation. Cyanobacteria store carbon in the form of glycogen, which can be hydrolyzed to release glucose and fermented to form bioethanol or other valuable products. Marine cyanobacterial species are especially attractive for countries with a scarcity of freshwater. We recently identified a native marine cyanobacterium, Synechococcus sp. BDU 130192, which has a good growth rate and a high level of polyglucan accumulation compared to Synechococcus PCC 7002. In this study, we first sequenced the whole genome, and the sequences were annotated using the RAST server. A genome scale metabolic model (GSMM) was reconstructed using the COBRA toolbox. A GSMM is a computational representation of the metabolic reactions and metabolites of the target strain. GSMMs are analyzed by Flux Balance Analysis (FBA), which uses external nutrient uptake rates to estimate steady-state intracellular and extracellular reaction fluxes under an objective such as maximization of cell growth. The model, which we have named iSyn942, includes 942 reactions and 913 metabolites, with 831 metabolic, 78 transport and 33 exchange reactions. The phylogenetic tree obtained by BLAST search revealed that the strain is a close relative of Synechococcus PCC 7002. FBA was applied to the model iSyn942 to predict the theoretical yields (mol product produced/mol CO₂ consumed) of native and non-native products like acetone, butanol, etc. under phototrophic conditions by applying metabolic engineering strategies.
The reported strain can be a viable strain for biotechnological applications, and the model will be helpful to researchers interested in understanding the metabolism as well as to design metabolic engineering strategies for enhanced production of various bioproducts.
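
The optimization that flux balance analysis solves can be written compactly as a linear program. The formulation below is the standard textbook form, not taken from the abstract; the symbols S, v, c and the bound vectors are notation introduced here for illustration.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}
\end{aligned}
```

Here S is the stoichiometric matrix (metabolites by reactions, which would be 913 by 942 for iSyn942), v is the vector of reaction fluxes, S v = 0 is the steady-state mass balance, the bounds encode measured uptake rates and reaction reversibility, and c selects the objective, such as the biomass reaction or a product exchange flux. A theoretical yield is then the ratio of the optimal product flux to the CO₂ uptake flux.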

Keywords: cyanobacteria, flux balance analysis, genome scale metabolic model, metabolic engineering

Procedia PDF Downloads 158
1251 Molecular Cloning and Identification of a Double WAP Domain–Containing Protein 3 Gene from Chinese Mitten Crab Eriocheir sinensis

Authors: Fengmei Li, Li Xu, Guoliang Xia

Abstract:

Whey acidic protein (WAP) domain-containing proteins in crustaceans are involved in the innate immune response against microbial invasion. In the present study, a novel double WAP domain (DWD)-containing protein gene 3 was identified from the Chinese mitten crab Eriocheir sinensis (designated EsDWD3) by expressed sequence tag (EST) analysis and PCR techniques. The full-length cDNA of EsDWD3 was 1223 bp, consisting of a 5′-terminal untranslated region (UTR) of 74 bp, a 3′ UTR of 727 bp with a polyadenylation signal sequence AATAAA and a polyA tail, and an open reading frame (ORF) of 423 bp. The ORF encoded a polypeptide of 140 amino acids with a signal peptide of 22 amino acids. The deduced protein sequence of EsDWD3 showed 96.4% amino acid similarity to the previously reported EsDWD1 from E. sinensis, and phylogenetic tree analysis revealed that EsDWD3 was most closely related to the two previously reported double WAP domain-containing proteins of E. sinensis.
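
The relationship between the ORF length and the polypeptide length can be checked with simple codon arithmetic. The sketch below assumes that the 423 bp ORF includes the stop codon and that the signal peptide is cleaved from the mature protein; both are assumptions made here, and the function and variable names are illustrative.

```python
CODON_LENGTH = 3


def encoded_residues(orf_bp: int) -> int:
    """Number of amino acids encoded by an ORF whose length includes the stop codon."""
    if orf_bp % CODON_LENGTH != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // CODON_LENGTH - 1  # the final codon is the stop codon


orf_bp = 423            # ORF length reported in the abstract
signal_peptide_aa = 22  # signal peptide length reported in the abstract

polypeptide_aa = encoded_residues(orf_bp)        # 141 codons, 140 residues
mature_aa = polypeptide_aa - signal_peptide_aa   # residues left after cleavage
```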

Keywords: Chinese mitten crab, Eriocheir sinensis, cloning, double WAP domain-containing protein

Procedia PDF Downloads 355