Search results for: bio-informatics
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 203

113 Microarray Data Visualization and Preprocessing Using R and Bioconductor

Authors: Ruchi Yadav, Shivani Pandey, Prachi Srivastava

Abstract:

Microarrays provide a rich source of data on the molecular workings of cells. Each microarray reports on the abundance of tens of thousands of mRNAs. Virtually every human disease is being studied using microarrays in the hope of finding its molecular mechanisms. Bioinformatics analysis plays an important part in processing the information embedded in large-scale expression profiling studies and in laying the foundation for biological interpretation. A basic yet challenging task in the analysis of microarray gene expression data is the identification of changes in gene expression that are associated with particular biological conditions. Careful statistical design and analysis are essential to improve the efficiency and reliability of microarray experiments throughout the data acquisition and analysis process. One of the most popular platforms for microarray analysis is Bioconductor, an open source and open development software project based on the R programming language. This paper describes specific procedures for quality assessment, visualization and preprocessing of Affymetrix GeneChip data, details the Bioconductor packages used to analyze Affymetrix microarray data, and describes the analysis and outcome of each plot.
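As an illustration of what a preprocessing step can look like, the sketch below implements quantile normalization, which forces every array to share the same intensity distribution. This is an assumption made for illustration: the abstract does not name the normalization method it uses, and the paper itself works in R and Bioconductor, not Python. The function name `quantile_normalize` and the toy intensities are hypothetical.

```python
def quantile_normalize(columns):
    """Quantile-normalize equal-length intensity columns (one list per array).

    Hypothetical helper for illustration; ties are broken by position.
    """
    n = len(columns[0])
    # For each array, the probe indices ordered from lowest to highest intensity.
    orders = [sorted(range(n), key=col.__getitem__) for col in columns]
    # Reference distribution: mean of the k-th smallest value across all arrays.
    reference = [
        sum(col[order[k]] for col, order in zip(columns, orders)) / len(columns)
        for k in range(n)
    ]
    normalized = []
    for order in orders:
        out = [0.0] * n
        for rank, probe in enumerate(order):
            # Each probe receives the reference value for its within-array rank.
            out[probe] = reference[rank]
        normalized.append(out)
    return normalized


# Toy intensities for three arrays and four probes (made-up numbers).
arrays = [[5.0, 2.0, 3.0, 4.0], [4.0, 1.0, 4.5, 2.0], [3.0, 4.0, 6.0, 8.0]]
normalized = quantile_normalize(arrays)
```

After normalization every array contains the same set of values, only assigned to different probes according to each probe's original rank.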

Keywords: microarray analysis, R language, affymetrix visualization, bioconductor

Procedia PDF Downloads 480
112 The Standard of Best Interest of the Child in Custody Adjudication under the Malaysian Laws

Authors: Roslina Che Soh

Abstract:

Best interest of the child has been the prevailing principle of the custody legislation of most nations in the world. The tremendous shift from parental rights to parental responsibilities over the centuries has made the best interests of the child the foremost matter that parents must uphold in child upbringing. Although the commitment to this principle is enshrined in the United Nations Convention on the Rights of the Child, the content and application of the principle differ across borders. Differences persist even though many countries have experienced a substantial shift over the last several decades in the types of custodial arrangements that are thought to best serve children’s interests. The laws in Malaysia similarly uphold this principle but do not elaborate on the principle itself. The principle has been developed entirely by the courts through decided cases. Thus, this paper seeks to discuss the extent of the application of the best interest of the child principle in custody disputes. In doing so, it attempts to provide an overview of the current laws and the approach of the Civil and the Shariah courts in Malaysia in applying the principle in determining custody disputes. For purposes of comparison, it briefly examines the legislation and court practices in Australia and England on this matter. The purpose is to determine the best standard to be adopted by Malaysia and to propose improvements to the laws wherever appropriate.

Keywords: child custody, best interest, Malaysian law, bioinformatics, biomedicine

Procedia PDF Downloads 274
111 DNpro: A Deep Learning Network Approach to Predicting Protein Stability Changes Induced by Single-Site Mutations

Authors: Xiao Zhou, Jianlin Cheng

Abstract:

A single amino acid mutation can have a significant impact on the stability of protein structure. Thus, the prediction of protein stability change induced by single site mutations is critical and useful for studying protein function and structure. Here, we presented a deep learning network with the dropout technique for predicting protein stability changes upon single amino acid substitution. While using only protein sequence as input, the overall prediction accuracy of the method on a standard benchmark is >85%, which is higher than existing sequence-based methods and is comparable to the methods that use not only protein sequence but also tertiary structure, pH value and temperature. The results demonstrate that deep learning is a promising technique for protein stability prediction. The good performance of this sequence-based method makes it a valuable tool for predicting the impact of mutations on most proteins whose experimental structures are not available. Both the downloadable software package and the user-friendly web server (DNpro) that implement the method for predicting protein stability changes induced by amino acid mutations are freely available for the community to use.

Keywords: bioinformatics, deep learning, protein stability prediction, biological data mining

Procedia PDF Downloads 467
110 Linguistic Insights Improve Semantic Technology in Medical Research and Patient Self-Management Contexts

Authors: William Michael Short

Abstract:

‘Semantic Web’ technologies such as the Unified Medical Language System Metathesaurus, SNOMED-CT, and MeSH have been touted as transformational for the way users access online medical and health information, enabling both the automated analysis of natural-language data and the integration of heterogeneous health-related resources distributed across the Internet through the use of standardized terminologies that capture concepts and relationships between concepts that are expressed differently across datasets. However, the approaches that have so far characterized ‘semantic bioinformatics’ have not yet fulfilled the promise of the Semantic Web for medical and health information retrieval applications. This paper argues within the perspective of cognitive linguistics and cognitive anthropology that four features of human meaning-making must be taken into account before the potential of semantic technologies can be realized for this domain. First, many semantic technologies operate exclusively at the level of the word. However, texts convey meanings in ways beyond lexical semantics. For example, transitivity patterns (distributions of active or passive voice) and modality patterns (configurations of modal constituents like may, might, could, would, should) convey experiential and epistemic meanings that are not captured by single words. Language users also naturally associate stretches of text with discrete meanings, so that whole sentences can be ascribed senses similar to the senses of words (so-called ‘discourse topics’). Second, natural language processing systems tend to operate according to the principle of ‘one token, one tag’. For instance, occurrences of the word sound must be disambiguated for part of speech: in context, is sound a noun or a verb or an adjective? In syntactic analysis, deterministic annotation methods may be acceptable. But because natural language utterances are typically characterized by polyvalency and ambiguities of all kinds (including intentional ambiguities), such methods leave the meanings of texts highly impoverished. Third, ontologies tend to be disconnected from everyday language use and so struggle in cases where single concepts are captured through complex lexicalizations that involve profile shifts or other embodied representations. More problematically, concept graphs tend to capture ‘expert’ technical models rather than ‘folk’ models of knowledge and so may not match users’ common-sense intuitions about the organization of concepts in prototypical structures rather than Aristotelian categories. Fourth, and finally, most ontologies do not recognize the pervasively figurative character of human language. However, since the time of Galen the widespread use of metaphor in the linguistic usage of both medical professionals and lay persons has been recognized. In particular, metaphor is a well-documented linguistic tool for communicating experiences of pain. Because semantic medical knowledge-bases are designed to help capture variations within technical vocabularies – rather than the kinds of conventionalized figurative semantics that practitioners as well as patients actually utilize in clinical description and diagnosis – they fail to capture this dimension of linguistic usage. The failure of semantic technologies in these respects degrades the efficiency and efficacy not only of medical research, where information retrieval inefficiencies can lead to direct financial costs to organizations, but also of care provision, especially in contexts of patients’ self-management of complex medical conditions.

Keywords: ambiguity, bioinformatics, language, meaning, metaphor, ontology, semantic web, semantics

Procedia PDF Downloads 132
109 Modern Proteomics and the Application of Machine Learning Analyses in Proteomic Studies of Chronic Kidney Disease of Unknown Etiology

Authors: Dulanjali Ranasinghe, Isuru Supasan, Kaushalya Premachandra, Ranjan Dissanayake, Ajith Rajapaksha, Eustace Fernando

Abstract:

Proteomics studies of organisms are considered to be significantly more information-rich than their genomic counterparts because the proteome of an organism represents the expressed state of all its proteins at a given time. In modern top-down and bottom-up proteomics workflows, the primary analysis methods employed are gel-based methods such as two-dimensional (2D) electrophoresis and mass spectrometry-based methods. Machine learning (ML) and artificial intelligence (AI) have been used increasingly in modern biological data analyses. In particular, the fields of genomics, DNA sequencing, and bioinformatics have seen an increasing trend in the usage of ML and AI techniques in recent years. The use of these techniques in proteomics studies is only now beginning to materialise. Although there is a wealth of information available in the scientific literature pertaining to proteomics workflows, no comprehensive review addresses the various aspects of the combined use of proteomics and machine learning. The objective of this review is to provide a comprehensive outlook on the application of machine learning to known proteomics workflows in order to extract more meaningful information that could be useful in a plethora of applications such as medicine, agriculture, and biotechnology.

Keywords: proteomics, machine learning, gel-based proteomics, mass spectrometry

Procedia PDF Downloads 151
108 Insights of Interaction Studies between HSP-60, HSP-70 Proteins and HSF-1 in Bubalus bubalis

Authors: Ravinder Singh, C Rajesh, Saroj Badhan, Shailendra Mishra, Ranjit Singh Kataria

Abstract:

Heat shock proteins 60 and 70 (HSP60 and HSP70) are crucial chaperones that guide the correct folding of a multitude of denatured proteins under heat stress conditions. The heat shock factors are a family of transcription factors that control the expression of genes encoding proteins involved in the folding of damaged or improperly folded proteins during stress conditions. Under normal conditions, heat shock proteins bind to HSF-1, act as its repressors, and help maintain HSF-1 in its inactive, monomeric conformation. No experimental structure for any of these proteins in Bubalus bubalis is known to date. Therefore, a computational approach was explored to obtain and analyze the three-dimensional structures of these proteins. In this study, an extensive in silico analysis was performed, ranging from sequence comparison among species to comparative modeling of the Bubalus bubalis HSP60, HSP70 and HSF-1 proteins. The stereochemical properties of the models were assessed with several bioinformatics tools to ensure model accuracy. A docking approach was then used to study interactions between the heat shock proteins and HSF-1.

Keywords: Bubalus bubalis, comparative modelling, docking, heat shock protein

Procedia PDF Downloads 322
107 The Phylogenetic Investigation of Candidate Genes Related to Type II Diabetes in Man and Other Species

Authors: Srijoni Banerjee

Abstract:

Sequences of some of the candidate genes (e.g., CPE, CDKAL1, GCKR, HSD11B1, IGF2BP2, IRS1, LPIN1, PKLR, TNF, PPARG) implicated in complex diseases such as Type II diabetes in man have been compared with those of other species to investigate phylogenetic affinity. Based on the mRNA sequences of these genes from 7 to 8 species, distance matrices were obtained using the bioinformatics tools MEGA 5, BioEdit and Clustal W. Phylogenetic trees were obtained by the NJ and UPGMA clustering methods. The species compared were Xenopus l., Danio r., Macaca m., Homo sapiens s., Rattus n., Mus m., Gallus g. and Bos taurus. Both NJ and UPGMA clustering show close affinity of Homo sapiens s. (man) with Rattus n. (rat) and Mus m. (mouse) for the candidate genes, except in the case of the Lipin1 gene. The results support the functional similarity of these genes in physiological and biochemical processes in man and mouse/rat. Therefore, for understanding the complex etiology and treatment of this complex disease, the mouse/rat model is the best laboratory choice for experimentation.
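The UPGMA step mentioned above can be sketched as follows: repeatedly merge the two closest clusters and recompute distances to the new cluster as a size-weighted average. This is a minimal sketch, not the implementation in the tools the abstract names; the function name `upgma`, the taxon labels and the distance values are all hypothetical, and branch lengths are omitted.

```python
def upgma(names, dist):
    """Build a UPGMA tree (nested tuples) from a symmetric distance matrix."""
    # Active clusters: id -> (subtree, number of leaves).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Pairwise distances keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j] for i in clusters for j in clusters if i < j}
    next_id = len(names)
    while len(clusters) > 1:
        a, b = min(d, key=d.get)  # closest pair of clusters
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        new_d = {}
        for c in clusters:
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            # Average distance, weighted by the number of leaves in each cluster.
            new_d[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        # Drop distances involving the merged clusters, then add the new ones.
        d = {pair: v for pair, v in d.items() if a not in pair and b not in pair}
        d.update(new_d)
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree


# Toy distance matrix for four hypothetical taxa (values are made up).
taxa = ["A", "B", "C", "D"]
matrix = [
    [0.00, 0.20, 0.22, 0.50],
    [0.20, 0.00, 0.10, 0.52],
    [0.22, 0.10, 0.00, 0.51],
    [0.50, 0.52, 0.51, 0.00],
]
tree = upgma(taxa, matrix)
```

In this toy case B and C merge first, A joins them next, and D attaches last. Neighbor joining (NJ) differs in that it does not assume a constant rate of change across lineages.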

Keywords: phylogeny, candidate gene of type-2 diabetes, CPE, CDKAL1, GCKR, HSD11B1, IGF2BP2, IRS1, LPIN1, PKLR, TNF, PPARG

Procedia PDF Downloads 321
106 Characterization of Enterotoxigenic Escherichia coli CS6 Promoter

Authors: Mondal Indranil, Bhakat Debjyoti, Mukhopadayay Asish K., Chatterjee Nabendu S.

Abstract:

CS6 is the prevalent colonization factor (CF) in our region, and deciphering its molecular regulators would play a pivotal role in reducing the burden of ETEC pathogenesis. In prokaryotes, most genes are under the control of an operon, and the promoter present upstream of a gene regulates its transcription. Here the promoter of CS6 was characterized computationally and further analyzed by β-galactosidase assay and sequencing. Promoter constructs and deletions were prepared as required to analyze promoter activity. The effect of different additives on the CS6 promoter was analyzed by the β-galactosidase assay. Bioinformatics analysis with Softberry BPROM predicted fur, lrp, and crp boxes and the -10 and -35 regions upstream of the CS6 gene. Cloning the promoter into the promoterless plasmid pTL61T showed that the region -573 to +1 is indeed the promoter region, as predicted. Sequential deletion of the region upstream of CS6 revealed that promoter activity remains the same when -573 bp to -350 bp is deleted. After deletion of the upstream region -350 bp to -255 bp, however, promoter expression decreases drastically to 26%. Further deletion decreases promoter activity only slightly. So the region -350 bp to -255 bp holds the promoter sequence for the CS6 gene. Additives such as iron and NaCl modulate promoter activity in a dose-dependent manner. From the promoter analysis, it can be said that the minimal promoter region lies between -254 and +1. Important region(s) lie between -350 bp and -255 bp upstream in the promoter and might contain elements needed to control CS6 gene expression.

Keywords: microbiology, promoter, colonization factor, ETEC

Procedia PDF Downloads 162
105 Development of Fuzzy Logic Control Ontology for E-Learning

Authors: Muhammad Sollehhuddin A. Jalil, Mohd Ibrahim Shapiai, Rubiyah Yusof

Abstract:

Nowadays, ontologies are common in many areas such as artificial intelligence, bioinformatics, e-commerce and education. Ontology is one of the focus areas in the field of Information Retrieval. The purpose of an ontology is to describe a conceptual representation of concepts and their relationships within a particular domain. In other words, an ontology provides a common vocabulary for anyone who needs to share information in the domain. There are several domain ontologies in various fields, covering both engineering and non-engineering knowledge. However, only a few ontologies are available for engineering knowledge, and fuzzy logic, as engineering knowledge, is not yet available as a domain ontology. In general, fuzzy logic requires step-by-step guidelines and instructions for lab experiments. In this study, we present a domain ontology for Fuzzy Logic Control (FLC) knowledge. We use a Table of Content (ToC) with a middle-out strategy based on the Uschold and King method to develop the FLC ontology. The proposed framework is developed using Protégé as the ontology tool. The Pellet reasoner available in Protégé is then used to validate the presented framework. The presented framework offers better performance based on consistency and classification parameter indices. In general, this ontology can provide a platform for anyone who needs to understand FLC knowledge.

Keywords: engineering knowledge, fuzzy logic control ontology, ontology development, table of content

Procedia PDF Downloads 299
104 Isolation and Identification of Diacylglycerol Acyltransferase Type-2 (DGAT2) Genes from Three Egyptian Olive Cultivars

Authors: Yahia I. Mohamed, Ahmed I. Marzouk, Mohamed A. Yacout

Abstract:

The aim of this work was to study the genetic basis of oil accumulation in olive fruit by tracking the DGAT2 (diacylglycerol acyltransferase type-2) gene in three olive cultivars of Egyptian origin, namely Toffahi, Hamed and Maraki, using molecular marker techniques and bioinformatics tools. The results show that, first, a specific genomic band from the Maraki cultivar was identified as DGAT2 and was identical to this gene in Olea europaea, with 100% similarity. Second, a differential genomic band from the Maraki cultivar, produced by the RAPD fingerprinting technique, yielded a distinct predicted sequence identified as DGAT2 in Fragaria vesca subsp. vesca, with 76% sequence similarity. Third, a specific genomic band from the Hamed cultivar was identified as two fragments: (1) Olea europaea cultivar Koroneiki diacylglycerol acyltransferase type 2 mRNA, complete cds, with two matching regions at 99%, or (2) PREDICTED: Fragaria vesca subsp. vesca diacylglycerol O-acyltransferase 2-like (LOC101313050), mRNA, with 86% similarity.

Keywords: Olea europaea, fingerprinting, diacylglycerol acyltransferase type-2 (DGAT2), Egypt

Procedia PDF Downloads 503
103 Precise Identification of Clustered Regularly Interspaced Short Palindromic Repeats-Induced Mutations via Hidden Markov Model-Based Sequence Alignment

Authors: Jingyuan Hu, Zhandong Liu

Abstract:

CRISPR genome editing technology has transformed molecular biology by accurately targeting and altering an organism’s DNA. Despite the state-of-the-art precision of CRISPR genome editing, imprecise mutation outcomes and off-target effects present considerable risk, potentially leading to unintended genetic changes. Targeted deep sequencing, combined with bioinformatics sequence alignment, can detect such unwanted mutations. Nevertheless, the classical Needleman-Wunsch (NW) algorithm may produce false alignment outcomes, resulting in inaccurate mutation identification. The key to precisely identifying CRISPR-induced mutations lies in determining optimal parameters for the sequence alignment algorithm. Hidden Markov models (HMMs) are ideally suited for this task, offering flexibility across CRISPR systems by leveraging forward-backward algorithms for parameter estimation. In this study, we introduce CRISPR-HMM, a statistical software tool to precisely call CRISPR-induced mutations. We demonstrate that the software significantly improves precision in identifying CRISPR-induced mutations compared to NW-based alignment, thereby enhancing the overall understanding of the CRISPR gene-editing process.
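For context, the classical Needleman-Wunsch baseline can be sketched as a small dynamic program. The scoring parameters below (match, mismatch, gap) are arbitrary fixed values chosen for illustration; choosing them well is exactly the problem the abstract says an HMM addresses. This sketch is not CRISPR-HMM, and the function name and test sequences are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    # First row and column: aligning a prefix against nothing costs only gaps.
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


# A reference sequence and a read carrying a one-base deletion (toy strings).
nw_score, ref_aln, read_aln = needleman_wunsch("GATTACA", "GATACA")
```

Because the two T positions are interchangeable here, more than one alignment reaches the optimal score; ambiguity of this kind is one reason fixed-parameter alignment can misplace a mutation.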

Keywords: CRISPR, HMM, sequence alignment, gene editing

Procedia PDF Downloads 51
102 Virtual Screening of Potential Inhibitors against Efflux Pumps of Mycobacterium tuberculosis

Authors: Gagan Dhawan

Abstract:

Mycobacterium tuberculosis has been described as the ‘captain of death’, with an inherent multidrug resistance caused largely by its efficient efflux pumps. In this study, various open source tools combining chemo-informatics with bioinformatics were used for efficient in silico drug design. The target was the efflux pump Rv1218c, a member of the ABC transporter superfamily that is predicted to be a tetronasin transporter in M. tuberculosis. Recent studies have shown that Rv1218c forms a complex with two more efflux pumps (Rv1219c and Rv1217c) to provide multidrug resistance to the bacterium. The 3D structure of the protein was modeled, as no structure was available in the databases consulted for this gene. The TMHMM analysis of this protein in TubercuList has shown that it is present in the outer membrane of the bacterium. Virtual screening of compounds from various publicly available chemical libraries was performed on the M. tuberculosis protein using various open source tools. The resulting ligands were further assessed by evaluating and analyzing various physicochemical properties. On comparison of physicochemical properties, toxicity and docking, the ligand 2-(hydroxymethyl)-6-[4,5,6-trihydroxy-2-(hydroxymethyl)tetrahydropyran-3-yl]oxy-tetrahydropyran-3,4,5-triol was found to be best suited for further studies.

Keywords: drug resistance, efflux pump, molecular docking, virtual screening

Procedia PDF Downloads 369
101 Hybrid Structure Learning Approach for Assessing the Phosphate Laundries Impact

Authors: Emna Benmohamed, Hela Ltifi, Mounir Ben Ayed

Abstract:

The Bayesian Network (BN) is one of the most efficient classification methods. It is widely used in several fields (e.g., medical diagnostics, risk analysis, bioinformatics research). A BN is defined as a probabilistic graphical model that represents a formalism for reasoning under uncertainty. This classification method performs well in extracting new knowledge from data. The construction of the model consists of two phases: structure learning and parameter learning. For structure learning, the K2 algorithm is one of the representative data-driven algorithms and is based on a score-and-search approach. In addition, integrating expert knowledge into the structure learning process helps achieve the highest accuracy. In this paper, we propose a hybrid approach for learning the structure of a BN that combines an improved K2 algorithm, called the K2 algorithm for Parents and Children search (K2PC), with an expert-driven method. The evaluation of the experimental results on well-known benchmarks shows that our K2PC algorithm performs better in terms of correct structure detection. The real application of our model shows its efficiency in the analysis of the impact of phosphate laundry effluents on the watershed in the Gafsa area (southwestern Tunisia).

Keywords: Bayesian network, classification, expert knowledge, structure learning, surface water analysis

Procedia PDF Downloads 128
100 LncRNA NEAT1 Promotes NSCLC Progression through Acting as a ceRNA of miR-377-3p

Authors: Chengcao Sun, Shujun Li, Cuili Yang, Yongyong Xi, Liang Wang, Feng Zhang, Dejia Li

Abstract:

Recently, the long non-coding RNA (lncRNA) NEAT1 has been identified as an oncogenic gene in multiple cancer types, and elevated expression of NEAT1 has been tightly linked to tumorigenesis and cancer progression. However, the molecular basis for this observation has not been characterized in the progression of non-small cell lung cancer (NSCLC). In our studies, we found that NEAT1 was highly expressed in NSCLC patients and was a novel regulator of NSCLC progression. Patients whose tumors had high NEAT1 expression had a shorter overall survival than patients whose tumors had low NEAT1 expression. Further, NEAT1 significantly accelerated NSCLC cell growth and metastasis in vitro and tumor growth in vivo. Additionally, using bioinformatics analysis and RNA pull-down combined with luciferase reporter assays, we demonstrated that NEAT1 functioned as a competing endogenous RNA (ceRNA) for hsa-miR-377-3p, antagonized its functions and led to the de-repression of its endogenous target E2F3, a core oncogene in promoting NSCLC progression. Taken together, these observations imply that NEAT1 modulates the expression of the E2F3 gene by acting as a competing endogenous RNA, which may provide the missing link between the regulatory miRNA network and NSCLC progression.

Keywords: long non-coding RNA NEAT1, hsa-miRNA-377-3p, E2F3, non-small cell lung cancer, tumorigenesis

Procedia PDF Downloads 369
99 Decellularized Brain-Chitosan Scaffold for Neural Tissue Engineering

Authors: Yun-An Chen, Hung-Jun Lin, Tai-Horng Young, Der-Zen Liu

Abstract:

Decellularized brain extracellular matrix has been shown to influence cell proliferation, differentiation and associated cell phenotype. However, this scaffold is thought to have poor mechanical properties and to degrade rapidly, which makes recellularization difficult. In this study, we combined decellularized brain extracellular matrix with chitosan, a naturally occurring, non-cytotoxic polysaccharide, to form a 3-D scaffold for neural stem/precursor cell (NSPC) regeneration. HE staining and DAPI fluorescence staining confirmed that the decellularization process effectively removed cellular components from the brain. Alcian blue staining and immunofluorescence staining showed good preservation of GAGs and of collagen I and collagen IV, respectively. Decellularized brain extracellular matrix was mixed well with chitosan to form a 3-D scaffold (DB-C scaffold). The pore size was approximately 50±10 μm, as determined from SEM images. Alamar blue results demonstrated that NSPCs proliferated well in the DB-C scaffold. NSPCs cultured in this composite scaffold differentiated into neurons and astrocytes, as revealed by their expression of microtubule-associated protein 2 (MAP2) and glial fibrillary acidic protein (GFAP). In conclusion, the DB-C scaffold may provide bioinformatics cues for NSPC generation and aid functional recovery after CNS injury.

Keywords: brain, decellularization, chitosan, scaffold, neural stem/precursor cells

Procedia PDF Downloads 320
98 Genome-Wide Isoform Specific KDM5A/JARID1A/RBP2 Location Analysis Reveals Contribution of Chromatin-Interacting PHD Domain in Protein Recruitment to Binding Sites

Authors: Abul B. M. M. K. Islam, Nuria Lopez-Bigas, Elizaveta V. Benevolenskaya

Abstract:

RBP2 has been shown to be important for the control of cell differentiation through epigenetic mechanisms. The main aim of the present study is a genome-wide location analysis by ChIP-seq of human RBP2 isoforms that differ in a histone-binding domain. It is conceivable that the larger isoform (LI) of RBP2, which contains a specific H3K4me3-interacting domain, differs from the smaller isoform (SI) in genomic location, which may account for the observed diversity in RBP2 function. To distinguish the two RBP2 isoforms, we exploited the fact that the SI lacks the C-terminal PHD domain: we used antibodies detecting both RBP2 isoforms (AI) through a common central domain, and antibodies detecting only the LI, but not the SI, through the C-terminal PHD domain. Overall, our analysis suggests that RBP2 occupies about 77 nucleotides and binds GC-rich motifs of active genes, that it does not bind to centromere, telomere, or enhancer regions, and that its binding sites are conserved compared to random. A striking difference between the only-SI and only-LI peaks is that a larger number of only-SI peaks are located in CpG islands and close to TSSs. Enrichment analysis of the related genes indicates that several oncogenic pathways and metabolic pathways/processes are significantly enriched among only-SI/AI targets, but not among LI/only-LI peak targets.

Keywords: bioinformatics, cancer, ChIP-seq, KDM5A

Procedia PDF Downloads 307
97 Gene Prediction in DNA Sequences Using an Ensemble Algorithm Based on Goertzel Algorithm and Anti-Notch Filter

Authors: Hamidreza Saberkari, Mousa Shamsi, Hossein Ahmadi, Saeed Vaali, MohammadHossein Sedaaghi

Abstract:

In recent years, using signal processing tools for accurate identification of protein coding regions has become a challenge in bioinformatics. Most genomic signal processing methods are based on the period-3 characteristics of the nucleotides in DNA strands; consequently, spectral analysis is applied to numerical sequences derived from DNA to locate the periodic components. In this paper, a novel ensemble algorithm for gene prediction in DNA sequences is presented, based on the combination of the Goertzel algorithm and an anti-notch filter (ANF). The proposed algorithm has several advantages over conventional methods. First, it identifies protein coding regions more accurately because the Goertzel algorithm is tuned to the desired frequency. Second, it achieves faster detection. The proposed algorithm is applied to several genes, including genes available in the BG570 and HMR195 databases, and the results are compared to those of other methods using nucleotide-level evaluation criteria. Implementation results show the excellent performance of the proposed algorithm in identifying protein coding regions, specifically in the identification of small-scale gene areas.
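The Goertzel part of this idea can be sketched as follows: map the DNA string to four binary indicator sequences (one per base) and evaluate the spectral power at the single bin N/3, where period-3 behaviour shows up. This is a minimal sketch under stated assumptions: the indicator-sequence mapping is a common choice but is not confirmed by the abstract, the anti-notch filter stage is omitted, and the function names and test sequences are hypothetical.

```python
import math


def goertzel_power(x, k):
    """Return |X[k]|^2 for the length-N DFT of x, using the Goertzel recursion."""
    n = len(x)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for sample in x:
        # Second-order recursion tuned to bin k; no full FFT is needed.
        s = sample + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def period3_power(seq):
    """Sum the power at bin N/3 over the A, C, G and T indicator sequences."""
    n = len(seq)
    if n % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    k = n // 3
    total = 0.0
    for base in "ACGT":
        indicator = [1.0 if ch == base else 0.0 for ch in seq]
        total += goertzel_power(indicator, k)
    return total


# A strictly period-3 toy sequence versus a period-4 one (both made up).
strong = period3_power("ATG" * 10)
weak = period3_power("ACGT" * 9)
```

In practice this measure would be evaluated over a sliding window along the sequence so that peaks mark candidate coding regions; the window length is a tuning choice not specified here.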

Keywords: protein coding regions, period-3, anti-notch filter, Goertzel algorithm

Procedia PDF Downloads 387
96 Is there Anything Useful in That? High Value Product Extraction from Artemisia annua L. in the Spent Leaf and Waste Streams

Authors: Anike Akinrinlade

Abstract:

The world population is estimated to grow from 7.1 billion to 9.22 billion by 2075, increasing therefore by 23% from the current global population. Much of the demographic change up to 2075 will take place in the less developed regions. There are currently 54 countries defined as having ‘low-middle income’ economies, and they need new ways to generate valuable products from the resources that are currently available. Artemisia annua L. is widely used for the extraction of the phytochemical artemisinin, which accounts for around 0.01 to 1.4% of the dry weight of the plant. Artemisinin is used in the treatment of malaria, a disease rampant in sub-Saharan Africa and in many other countries. Once artemisinin has been extracted, the spent leaf and waste streams are disposed of as waste. A feasibility study was carried out on increasing the biomass value of A. annua by designing a biorefinery in which spent leaf and waste streams are utilized for high value product generation. Quercetin, ferulic acid, dihydroartemisinic acid, artemisinic acid and artemisinin were screened for in the waste stream samples and the spent leaf. The analytical results showed that artemisinin, artemisinic acid and dihydroartemisinic acid were present in the waste extracts, as well as camphor and arteannuin B. Ongoing efforts are looking at using more industrially relevant solvents to extract the phytochemicals from the waste fractions and investigating how microwave pyrolysis of spent leaf can be utilized to generate bio-products.

Keywords: high value product generation, bioinformatics, biomedicine, waste streams, spent leaf

Procedia PDF Downloads 349
95 Genome-Wide Identification of Genes Resistance to Nitric Oxide in Vibrio parahaemolyticus

Authors: Yantao Li, Jun Zheng

Abstract:

Food poisoning caused by consumption of contaminated food, especially seafood, is one of the most serious public health threats worldwide. Vibrio parahaemolyticus is an emerging bacterial pathogen and the leading cause of human gastroenteritis associated with food poisoning, especially in the southern coastal region of China. To successfully cause disease in the host, bacterial pathogens need to overcome the host-derived stresses encountered during infection. One of the toxic chemical species produced by the host is nitric oxide (NO). NO is generated from acidified nitrite in the stomach and by inducible NO synthase (iNOS) in host cells, and is toxic to bacteria. Bacterial pathogens have evolved mechanisms to cope with this toxic stress. Such mechanisms include genes that sense NO produced by the immune system and activate others to detoxify NO, and genes that repair the damage caused by toxic reactive nitrogen species (RNS) generated during NO stress. However, little is known about NO resistance in V. parahaemolyticus. In this study, transposon mutagenesis coupled with next generation sequencing (Tn-seq) will be utilized to identify genes for NO resistance in V. parahaemolyticus. Our strategy will include construction of a saturating transposon insertion library, challenge of the transposon library with NO, next generation sequencing (NGS), bioinformatics analysis and verification of the identified genes in vitro and in vivo.

Keywords: vibrio parahaemolyticus, nitric oxide, tn-seq, virulence

Procedia PDF Downloads 264
94 Classification of Multiple Cancer Types with Deep Convolutional Neural Network

Authors: Nan Deng, Zhenqiu Liu

Abstract:

Thousands of patients with metastatic tumors are diagnosed with cancers of unknown primary site each year. The inability to identify the primary cancer site may lead to inappropriate treatment and unexpected prognosis. A large amount of genomic and transcriptomic cancer data has now been generated by next-generation sequencing (NGS) technologies, and The Cancer Genome Atlas (TCGA) database has accrued thousands of human tumors and healthy controls, which provides an abundant resource for differentiating cancer types. Meanwhile, deep convolutional neural networks (CNNs) have shown high accuracy in classification across a large number of image object categories. Here, we use 25 primary tumor types and 3 normal tissues from TCGA, convert their RNA-Seq gene expression profiles to color images, and train, validate and test a CNN classifier directly on these images. The results show that our CNN classifier can achieve >80% test accuracy on most of the tumors and normal tissues. Since the gene expression pattern of distant metastases is similar to that of their primary tumors, the CNN classifier may provide a potential computational strategy for identifying the unknown primary origin of metastatic cancer in order to plan appropriate treatment for patients.
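
The abstract does not say how the expression profiles are mapped to images. A minimal sketch of the general idea, assuming a simple min-max scaling and a single grayscale channel rather than the color encoding used in the paper, is to fold the per-gene expression vector into a square pixel grid. The function name and input values are hypothetical.

```python
import math

def expression_to_image(values):
    """Scale a 1-D expression vector to 0-255 and fold it into a square grid."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant vector
    pixels = [round(255 * (v - lo) / span) for v in values]
    side = math.ceil(math.sqrt(len(pixels)))
    pixels += [0] * (side * side - len(pixels))  # zero-pad the final row
    return [pixels[r * side:(r + 1) * side] for r in range(side)]

# Hypothetical expression values for five genes (not TCGA data).
image = expression_to_image([0.0, 2.5, 5.0, 7.5, 10.0])
```

The gene order fixes which pixel each gene occupies, so the same ordering would have to be used for every sample before the grids are passed to a classifier.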

Keywords: bioinformatics, cancer, convolutional neural network, deep learning, gene expression pattern

Procedia PDF Downloads 299
93 Analysis of Osmotin as Transcription Factor/Cell Signaling Modulator Using Bioinformatic Tools

Authors: Usha Kiran, M. Z. Abdin

Abstract:

Osmotin is an abundant cationic multifunctional protein discovered in cells of tobacco (Nicotiana tabacum L. var Wisconsin 38) adapted to an environment of low osmotic potential. It protects plants from pathogens and is hence placed in the PRP family of proteins. Osmotin-induced proline accumulation has been reported in plants, including transgenic tomato and strawberry, where it confers tolerance against both biotic and abiotic stresses. The exact mechanism of proline induction by osmotin is, however, not known to date. These observations have led us to hypothesize that osmotin-induced proline accumulation could be due to its involvement as a transcription factor and/or cell signaling pathway modulator in proline biosynthesis. The present investigation was therefore undertaken to analyze the osmotin protein as a transcription factor/cell signaling modulator using bioinformatics tools. The results of available online DNA-binding motif search programs revealed that osmotin does not contain DNA-binding motifs. Alignment of the osmotin protein with protein sequences from DATF showed homology in the range of 0-20%, suggesting that it might not contain a DNA-binding motif. Further, in a search for a unique DNA-binding domain, superimposition of the osmotin 3D structure on modeled Arabidopsis transcription factors using Chimera also suggested the absence of such a domain. We did, however, find evidence implicating osmotin in cell signaling. From these results, we concluded that osmotin is not a transcription factor but regulates proline biosynthesis and accumulation through cell signaling during abiotic stresses.

Keywords: osmotin, cell signaling modulator, bioinformatic tools, protein

Procedia PDF Downloads 272
92 Computational Approach for Grp78–Nf-ΚB Binding Interactions in the Context of Neuroprotective Pathway in Brain Injuries

Authors: Janneth Gonzalez, Marco Avila, George Barreto

Abstract:

GRP78 participates in multiple functions in the cell during normal and pathological conditions, controlling calcium homeostasis, protein folding and the unfolded protein response. GRP78 is located in the endoplasmic reticulum, but it can change its location under stress, hypoxic and apoptotic conditions. NF-κB represents the keystone of the inflammatory process and regulates the transcription of several genes related to apoptosis, differentiation, and cell growth. A possible relationship between GRP78 and NF-κB could support and explain several mechanisms that may regulate a variety of cell functions, especially following brain injuries. Although several reports show interactions between NF-κB and members of the heat shock protein family, there is a lack of information on how GRP78 may interact with NF-κB and possibly regulate its downstream activation. Therefore, we assessed computational predictions of the protein-protein interactions between GRP78 (Chain A) and the NF-κB complex (IkB alpha and p65). The interaction interface of the docking model showed that the amino acids ASN 47, GLU 215 and GLY 403 of GRP78 and THR 54, ASN 182 and HIS 184 of NF-κB are key residues involved in the docking. The electrostatic field between the GRP78-NF-κB interfaces and molecular dynamics simulations support the possible interaction between the proteins. In conclusion, this work sheds some light on the possible GRP78-NF-κB complex, indicating key residues in this crosstalk, which may be used as an input for better drug design strategies targeting NF-κB downstream signaling as a new therapeutic approach following brain injuries.

Keywords: computational biology, protein interactions, Grp78, bioinformatics, molecular dynamics

Procedia PDF Downloads 342
91 Glycan Analyzer: Software to Annotate Glycan Structures from Exoglycosidase Experiments

Authors: Ian Walsh, Terry Nguyen-Khuong, Christopher H. Taron, Pauline M. Rudd

Abstract:

Glycoproteins and their covalently bonded glycans play critical roles in the immune system, cell communication, disease and disease prognosis. Ultra performance liquid chromatography (UPLC) coupled with mass spectrometry is conventionally used to qualitatively and quantitatively characterise glycan structures in a given sample. Exoglycosidases are enzymes that catalyze sequential removal of monosaccharides from the non-reducing end of glycans. They naturally have specificity for a particular type of sugar, its stereochemistry (α or β anomer) and its position of attachment to an adjacent sugar on the glycan. Thus, monitoring the peak movements (both in the UPLC and MS1) after application of exoglycosidases provides a unique and effective way to annotate sugars with high detail - i.e. differentiating positional and linkage isomers. Manual annotation of an exoglycosidase experiment is difficult and time consuming. As such, with increasing sample complexity and the number of exoglycosidases, the analysis could result in manually interpreting hundreds of peak movements. Recently, we have implemented pattern recognition software for automated interpretation of UPLC-MS1 exoglycosidase digestions. In this work, we explain the software, indicate how much time it will save and provide example usage showing the annotation of positional and linkage isomers in Immunoglobulin G, apolipoprotein J, and simple glycan standards.

Keywords: bioinformatics, automated glycan assignment, liquid chromatography, mass spectrometry

Procedia PDF Downloads 200
90 Application of KL Divergence for Estimation of Each Metabolic Pathway Genes

Authors: Shohei Maruyama, Yasuo Matsuyama, Sachiyo Aburatani

Abstract:

The development of methods to annotate unknown gene functions is an important task in bioinformatics. One approach to annotation is the identification of the metabolic pathway that a gene is involved in. Gene expression data have been utilized for this identification, since they reflect various intracellular phenomena. However, it has been difficult to estimate gene function with high accuracy. The low accuracy of the estimation is thought to be caused by the difficulty of accurately measuring gene expression: even when measured under the same conditions, gene expression usually varies. In this study, we proposed a feature extraction method that focuses on the variability of gene expression to estimate the genes' metabolic pathway accurately. First, we estimated the distribution of each gene's expression from replicate data. Next, we calculated the similarity between all gene pairs by KL divergence, which measures how much one distribution differs from another. Finally, we used the similarity vectors as feature vectors and trained a multiclass SVM to identify the genes' metabolic pathway. To evaluate our method, we applied it to budding yeast and trained the multiclass SVM to identify seven metabolic pathways. As a result, the accuracy obtained with our method was higher than that obtained from the raw gene expression data. Thus, our method based on KL divergence is useful for identifying the genes' metabolic pathway.
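
The first two steps above can be sketched as follows. This is a minimal illustration that assumes each gene's replicate measurements are modeled as a univariate normal distribution, for which the KL divergence has the closed form KL(P || Q) = ln(σ_Q/σ_P) + (σ_P² + (μ_P − μ_Q)²) / (2σ_Q²) − 1/2. The abstract does not state which distribution family the authors fitted, and the gene names and values below are made up.

```python
import math
from statistics import mean, stdev

def gaussian_kl(mu_p, sd_p, mu_q, sd_q):
    """KL(P || Q) for two univariate normal distributions."""
    return math.log(sd_q / sd_p) + (sd_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sd_q ** 2) - 0.5

def kl_feature_vectors(replicates):
    """Map each gene to its vector of KL divergences against every gene."""
    # Step 1: estimate a normal distribution per gene from its replicates.
    params = {gene: (mean(vals), stdev(vals)) for gene, vals in replicates.items()}
    genes = sorted(params)
    # Step 2: one divergence per gene pair; each row is one gene's feature vector.
    return {g: [gaussian_kl(*params[g], *params[h]) for h in genes] for g in genes}

# Hypothetical replicate measurements (not data from the study).
replicates = {
    "geneA": [1.0, 1.2, 0.8],
    "geneB": [1.1, 1.3, 0.9],
    "geneC": [5.0, 5.5, 4.5],
}
features = kl_feature_vectors(replicates)
```

Smaller values mean more similar distributions, and KL divergence is not symmetric, so the vector for geneA generally differs from the corresponding column read off the other genes' vectors. Each row could then be passed to a multiclass classifier as the feature vector for that gene.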

Keywords: metabolic pathways, gene expression data, microarray, Kullback–Leibler divergence, KL divergence, support vector machines, SVM, machine learning

Procedia PDF Downloads 403
89 Evaluation of Impact on Traffic Conditions Due to Electronic Toll Collection System Design in Thailand

Authors: Kankrong Suangka

Abstract:

This research explored the behaviors of tollway users that affect their decision to use the Electronic Toll Collection (ETC) system. It also evaluated the efficiency of toll plazas in terms of the number of ETC booths in the plaza and their lane location. The two main parameters selected for the scenarios analyzed were (1) the varying ratio of ETC-enabled users and (2) the varying locations of the dedicated ETC lane. A total of 42 scenarios were analyzed. The data indicated that in 2013, ETC users made up 22% of all toll users. It was found that adding one more ETC lane reduced the delay at the payment booths, provided that the volume of ETC users passing through the plaza was less than 1,200 vehicles/hour. Meanwhile, adding two ETC lanes can accommodate an increased traffic volume of around 1,200 to 1,800 vehicles/hour. In terms of the location of the ETC lane, it was found that for plazas with one ETC lane, installing the ETC lane at the far right is the best alternative. For toll plazas with 2 ETC lanes, the best layout is to have 1 lane in the middle and 1 lane at the far right. This layout shows the least delay when compared to other layouts. Furthermore, the results from this research showed that micro-simulation traffic models have potential for further application in designing toll plaza lanes. The results can also be used to analyze the system in nearby areas with similar traffic volumes and for further design improvements.

Keywords: the electronic toll collection system, average queuing delay, toll plaza configuration, bioinformatics, biomedicine

Procedia PDF Downloads 239
88 Antibody Reactivity of Synthetic Peptides Belonging to Proteins Encoded by Genes Located in Mycobacterium tuberculosis-Specific Genomic Regions of Differences

Authors: Abu Salim Mustafa

Abstract:

Comparisons of mycobacterial genomes have identified several Mycobacterium tuberculosis-specific genomic regions that are absent in other mycobacteria and are known as regions of differences. Due to their M. tuberculosis-specificity, the peptides encoded by these regions could be useful in the specific diagnosis of tuberculosis. To explore this possibility, overlapping synthetic peptides corresponding to 39 proteins predicted to be encoded by genes present in regions of differences were tested for antibody reactivity with sera from tuberculosis patients and healthy subjects. The results identified four immunodominant peptides corresponding to four different proteins, with three of the peptides showing significantly stronger antibody reactivity and rate of positivity with sera from tuberculosis patients than with sera from healthy subjects. The fourth peptide was recognized equally well by the sera of tuberculosis patients and healthy subjects. Prediction of antibody epitopes by bioinformatics analyses using the ABCpred server identified multiple linear epitopes in each peptide. Furthermore, peptide sequence analysis for sequence identity using BLAST suggested M. tuberculosis-specificity for the three peptides that had preferential reactivity with sera from tuberculosis patients, whereas the peptide with equal reactivity with sera of TB patients and healthy subjects showed significant identity with sequences present in non-tuberculous mycobacteria. The three identified M. tuberculosis-specific immunodominant peptides may be useful in the serological diagnosis of tuberculosis.

Keywords: genomic regions of differences, Mycobacterium tuberculosis, peptides, serodiagnosis

Procedia PDF Downloads 183
87 Non-Invasive Pre-Implantation Genetic Assessment Using NGS in IVF Clinical Routine

Authors: Katalin Gombos, Bence Gálik, Krisztina Ildikó Kalács, Krisztina Gödöny, Ákos Várnagy, József Bódis, Attila Gyenesei, Gábor L. Kovács

Abstract:

Although non-invasive pre-implantation genetic testing for aneuploidy (NIPGT-A) is potentially appropriate to assess chromosomal ploidy of the embryo, practical application of it in a routine IVF center has not been started in the absence of a recommendation. We developed a comprehensive workflow for a clinically applicable strategy for NIPGT-A based on next-generation sequencing (NGS) technology. We performed MALBAC whole genome amplification and NGS on spent blastocyst culture media of Day 3 embryos fertilized with intra-cytoplasmic sperm injection (ICSI). Spent embryonic culture media of morphologically good quality score embryos were enrolled in further analysis with the blank culture media as background control. Chromosomal abnormalities were identified by an optimized bioinformatics pipeline applying a copy number variation (CNV) detecting algorithm. We demonstrate a comprehensive workflow covering both wet- and dry-lab procedures supporting a clinically applicable strategy for NIPGT-A. It can be carried out within 48 h which is critical for the same-cycle blastocyst transfer, but also suitable for “freeze all” and “elective frozen embryo” strategies. The described integrated approach of non-invasive evaluation of embryonic DNA content of the culture media can potentially supplement existing pre-implantation genetic screening methods.
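
The abstract does not describe the CNV-detecting algorithm. As a rough sketch of the read-depth idea that such pipelines commonly build on, assuming per-chromosome read counts and a euploid reference profile are available, each chromosome's share of reads can be compared with the reference on a log2 scale. The reference profile, the counts and the 0.3 threshold are hypothetical.

```python
import math

def call_chromosome_copy_number(sample, reference, threshold=0.3):
    """Label each chromosome gain/loss/normal from its log2 read-share ratio."""
    s_total = sum(sample.values())
    r_total = sum(reference.values())
    calls = {}
    for chrom, count in sample.items():
        # Compare the fraction of reads on this chromosome with the reference fraction.
        ratio = math.log2((count / s_total) / (reference[chrom] / r_total))
        if ratio > threshold:
            calls[chrom] = "gain"
        elif ratio < -threshold:
            calls[chrom] = "loss"
        else:
            calls[chrom] = "normal"
    return calls

# Hypothetical read counts for three chromosomes (not data from the study).
reference = {"chr1": 1000, "chr2": 1000, "chr3": 1000}
sample = {"chr1": 1000, "chr2": 1500, "chr3": 1000}
calls = call_chromosome_copy_number(sample, reference)
```

A real pipeline would work on many fixed-size genomic bins, correct for GC content and amplification bias, and segment the signal before calling, none of which is attempted here.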

Keywords: next generation sequencing, in vitro fertilization, embryo assessment, non-invasive pre-implantation genetic testing

Procedia PDF Downloads 156
86 Characterization of the Intestinal Microbiota: A Signature in Fecal Samples from Patients with Irritable Bowel Syndrome

Authors: Mina Hojat Ansari, Kamran Bagheri Lankarani, Mohammad Reza Fattahi, Ali Reza Safarpour

Abstract:

Irritable bowel syndrome (IBS) is a common bowel disorder that is usually diagnosed on the basis of abdominal pain, fecal irregularities and bloating. Alteration in the intestinal microbial composition has been implicated in inflammatory and functional bowel disorders and has recently also been noted as a feature of IBS. Owing to the potential importance of the microbiota for both the efficacy of treatment and the prevention of disease, we examined the association between the intestinal microbiota and different bowel patterns in a cohort of subjects with IBS and healthy controls. Fresh fecal samples were collected from a total of 50 subjects, 30 of whom met the Rome IV criteria for IBS and 20 of whom were healthy controls. Total DNA was extracted, and library preparation was conducted following the standard protocol for small whole genome sequencing. The pooled libraries were sequenced on an Illumina NextSeq platform with a 2 × 150 paired-end read length, and the obtained sequences were analyzed using several bioinformatics programs. The majority of sequences obtained in the current study were assigned to bacteria. However, our findings highlighted significant variation in microbial taxa among the studied groups. The results therefore suggest a significant association of the microbiota with symptoms and bowel characteristics in patients with IBS. These alterations in fecal microbiota could be exploited as a biomarker for IBS or its subtypes, and they suggest that modification of the microbiota might be integrated into prevention and treatment strategies for IBS.

Keywords: irritable bowel syndrome, intestinal microbiota, small whole genome sequencing, fecal samples, Illumina

Procedia PDF Downloads 166
85 Genome Characterization and Phylogeny Analysis of Viruses Infected Invertebrates, Parvoviridae Family

Authors: Niloofar Fariborzi, Hamzeh Alipour, Kourosh Azizi, Neda Eskandarzade, Abozar Ghorbani

Abstract:

The family Parvoviridae consists of a large diversity of single-stranded DNA viruses, which cause mild to severe diseases in both vertebrates and invertebrates. The Parvoviridae are classified into three subfamilies: Parvovirinae infects vertebrates, Densovirinae infects invertebrates, and Hamaparvovirinae infects both vertebrates and invertebrates. Except for the NS1 region, which is the prime criterion for phylogeny analysis, other parts of the parvovirus genome, such as the UTRs, are diverse even among closely related viruses or within the same genus. It is believed that host switching in parvoviruses may be related to genetic changes in regions other than NS1; therefore, whole-genome screening is valuable for studying parvoviruses' host-virus interactions. The aim of this study was to analyze the genome organization and phylogeny of the complete genome sequences of 132 members of the Parvoviridae family, focusing on viruses that infect invertebrates. The maximum and minimum divergence within a subfamily belonged to Densovirinae and Parvovirinae, respectively. The greatest evolutionary divergence was between Hamaparvovirinae and Parvovirinae. Unclassified viruses were mostly from Parvovirinae and had the highest divergence from densoviruses and the lowest divergence from Parvovirinae viruses. In the phylogenetic tree, all hamaparvoviruses were found in the center of the densoviruses, with the exception of Syngnathid Ichthamaparvovirus 1 (NC_055527), which was positioned between two Parvovirinae members (NC_022089 and NC_038544). The proximity of hamaparvovirus members to some densoviruses strengthens the possibility that densoviruses may be the ancestors of hamaparvoviruses or vice versa. Therefore, examination and phylogeny analysis of the whole genome is necessary to understand host selection in the Parvoviridae family.

Keywords: densoviruses, parvoviridae, bioinformatics, phylogeny

Procedia PDF Downloads 93
84 The Genetic Architecture Underlying Dilated Cardiomyopathy in Singaporeans

Authors: Feng Ji Mervin Goh, Edmund Chee Jian Pua, Stuart Alexander Cook

Abstract:

Dilated cardiomyopathy (DCM) is a common cause of heart failure. Genetic mutations account for 50% of DCM cases, with TTN mutations being the most common, accounting for up to 25% of DCM cases. However, the genetic architecture underlying DCM in Asian patients is unknown. We evaluated 68 patients (female = 17) with DCM who underwent follow-up at the National Heart Centre, Singapore from 2013 through 2014. Clinical data were obtained and analyzed retrospectively. Genomic DNA was subjected to next-generation targeted sequencing. Nextera Rapid Capture Enrichment was used to capture the exons of a panel of 169 cardiac genes. DNA libraries were sequenced as paired-end 150-bp reads on Illumina MiSeq. Raw sequence reads were processed and analysed using standard bioinformatics techniques. The average age of onset of DCM was 46.1±10.21 years. The average left ventricular ejection fraction (LVEF), left ventricular diastolic internal diameter (LVIDd) and left ventricular systolic internal diameter (LVIDs) were 26.1±11.2%, 6.20±0.83 cm, and 5.23±0.92 cm, respectively. The frequencies of mutations in major DCM-associated genes were as follows: TTN (5.88% vs published frequency of 20%), LMNA (4.41% vs 6%), MYH7 (5.88% vs 4%), MYH6 (5.88% vs 4%), and SCN5A (4.41% vs 3%). The average callability at 10 times coverage for each major gene was: TTN (99.7%), LMNA (87.1%), MYH7 (94.8%), MYH6 (95.5%), and SCN5A (94.3%). In conclusion, TTN mutations are not common in Singaporean DCM patients. The frequencies of mutations in the other major DCM-associated genes are comparable to frequencies published in the current literature.
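
The reported percentages can be read back as patient counts in the 68-patient cohort. The short calculation below is an inference from the figures in the abstract, not a count stated by the authors, and the helper name is hypothetical.

```python
COHORT_SIZE = 68  # patients evaluated, as stated in the abstract

# Mutation frequencies (%) reported in the abstract.
reported_percent = {"TTN": 5.88, "LMNA": 4.41, "MYH7": 5.88, "MYH6": 5.88, "SCN5A": 4.41}

def implied_carriers(percent, cohort_size=COHORT_SIZE):
    """Nearest whole number of patients consistent with a reported percentage."""
    return round(percent / 100 * cohort_size)

carriers = {gene: implied_carriers(pct) for gene, pct in reported_percent.items()}

# Convert each implied count back to a percentage to check it matches the abstract.
recomputed = {gene: round(100 * n / COHORT_SIZE, 2) for gene, n in carriers.items()}
```

Under this reading, 5.88% corresponds to 4 of 68 patients and 4.41% to 3 of 68, so each per-gene frequency rests on a handful of carriers.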

Keywords: heart failure, dilated cardiomyopathy, genetics, next-generation sequencing

Procedia PDF Downloads 243