Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 11383

Search results for: protein structure classification

11383 Transfer Learning for Protein Structure Classification at Low Resolution

Authors: Alexander Hudson, Shaogang Gong

Abstract:

Structure determination is key to understanding protein function at a molecular level. Whilst significant advances have been made in predicting structure and function from amino acid sequence, researchers must still rely on expensive, time-consuming analytical methods to visualise detailed protein conformation. In this study, we demonstrate that it is possible to make accurate (≥80%) predictions of protein class and architecture from structures determined at low (>3 Å) resolution, using a deep convolutional neural network trained on high-resolution (≤3 Å) structures represented as 2D matrices. Thus, we provide proof of concept for high-speed, low-cost protein structure classification at low resolution, and a basis for extension to prediction of function. We investigate the impact of the input representation on classification performance, showing that side-chain information may not be necessary for fine-grained structure predictions. Finally, we confirm that high-resolution, low-resolution and NMR-determined structures inhabit a common feature space, and thus provide a theoretical foundation for boosting with single-image super-resolution.
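The "structures represented as 2D matrices" mentioned above (protein distance maps, per the keywords) can be illustrated with a minimal sketch. It assumes one representative atom per residue (for example the C-alpha atom) and uses made-up coordinates; the function name `distance_map` is hypothetical and this is not the authors' preprocessing pipeline.

```python
import math

def distance_map(coords):
    """Return the symmetric matrix of pairwise Euclidean distances.

    coords: list of (x, y, z) tuples, one per residue (assumed here to be
    C-alpha positions in angstroms).
    """
    n = len(coords)
    dmap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            dmap[i][j] = dmap[j][i] = d  # distances are symmetric
    return dmap

# Hypothetical coordinates for a three-residue fragment (illustration only).
ca_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
dmap = distance_map(ca_coords)
```

A matrix of this kind can be treated like a single-channel image, which is what makes a convolutional network a natural fit for the classification task described in the abstract.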

Keywords: transfer learning, protein distance maps, protein structure classification, neural networks

Procedia PDF Downloads 101

11382 Prediction of All-Beta Protein Secondary Structure Using Garnier-Osguthorpe-Robson Method

Authors: K. Tejasri, K. Suvarna Vani, S. Prathyusha, S. Ramya

Abstract:

Proteins are chains of amino acids joined by peptide bonds. Many different chain conformations are possible because of the many combinations of amino acids and the rotations allowed at numerous positions along the chain. Protein structure prediction is one of the central goals of bioinformatics and theoretical chemistry. Of the four levels of protein structure, we focus mainly on the secondary structure, which generally comprises alpha-helices and beta-sheets. Multi-class classification of imbalanced data is a real challenge and has to be addressed for the beta-strands: in an imbalanced data distribution, some classes have far fewer training samples than the others. The secondary structure data are extracted from the protein primary sequence, and the beta-strands are predicted using suitable machine learning algorithms.

Keywords: proteins, secondary structure elements, beta-sheets, beta-strands, alpha-helices, machine learning algorithms

Procedia PDF Downloads 63

11381 Protein Tertiary Structure Prediction by a Multiobjective Optimization and Neural Network Approach

Authors: Alexandre Barbosa de Almeida, Telma Woerle de Lima Soares

Abstract:

Protein structure prediction is a challenging task in the bioinformatics field. The biological function of proteins depends largely on the shape of their three-dimensional conformation, but fewer than 1% of all known proteins have a solved structure. This work proposes a deep learning model to address this problem, attempting to predict some aspects of protein conformations. Through a process of multiobjective dominance, a recurrent neural network was trained to abstract the particular bias of each individual multiobjective algorithm, generating a heuristic that could be useful for predicting some of the relevant aspects of the formation of the three-dimensional conformation, a process known as protein folding.

Keywords: Ab initio heuristic modeling, multiobjective optimization, protein structure prediction, recurrent neural network

Procedia PDF Downloads 174

11380 Estimation of Transition and Emission Probabilities

Authors: Aakansha Gupta, Neha Vadnere, Tapasvi Soni, M. Anbarsi

Abstract:

Protein secondary structure prediction is one of the most important goals pursued by bioinformatics and theoretical chemistry; it is highly important in medicine and biotechnology. Some aspects of protein function and genome analysis can be inferred from secondary structure prediction, which is used to help annotate sequences, classify proteins, identify domains, and recognize functional motifs. In this paper, we represent protein secondary structure as a mathematical model. To extract and predict the protein secondary structure from the primary structure, we require a set of parameters. Any constants appearing in the model are specified by these parameters, which also provide a mechanism for efficient and accurate use of data. Many algorithms exist for estimating these model parameters, of which the most popular is the Expectation Maximization (EM) algorithm. The model parameters are estimated from protein datasets such as RS126 using a Bayesian probabilistic method (the data set being categorical). This paper can then be extended to compare the efficiency of the EM algorithm with that of other algorithms for estimating the model parameters, which will in turn lead to an efficient component for protein secondary structure prediction. Furthermore, this paper provides scope for using these parameters to predict the secondary structure of proteins with machine learning techniques such as neural networks and fuzzy logic. The ultimate objective is to obtain greater accuracy than previously achieved.
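To make "transition and emission probabilities" concrete, here is a minimal sketch that estimates them by relative-frequency counting from sequences whose secondary-structure states are already labelled. This is a simpler technique than the EM algorithm named in the abstract (EM is needed when the states are not observed), and the two toy sequences and the function name `estimate_hmm` are assumptions made for illustration, not data from RS126.

```python
from collections import Counter, defaultdict

def normalise(table):
    """Turn nested counts into conditional probabilities that sum to 1."""
    probs = {}
    for key, counts in table.items():
        total = sum(counts.values())
        probs[key] = {item: c / total for item, c in counts.items()}
    return probs

def estimate_hmm(pairs):
    """Estimate P(next state | state) and P(residue | state) by counting.

    pairs: iterable of (residue_string, state_string) of equal length,
    with states such as H (helix), E (strand), C (coil).
    """
    trans, emit = defaultdict(Counter), defaultdict(Counter)
    for residues, states in pairs:
        for aa, s in zip(residues, states):
            emit[s][aa] += 1              # residue emitted in state s
        for prev, nxt in zip(states, states[1:]):
            trans[prev][nxt] += 1         # state-to-state transition
    return normalise(trans), normalise(emit)

# Hypothetical labelled toy data (illustration only).
toy = [("AVLG", "HHEC"), ("GAVL", "CHHE")]
trans, emit = estimate_hmm(toy)
```

With labelled data the counting estimate is the maximum-likelihood solution; EM generalises the same idea by replacing the hard counts with expected counts.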

Keywords: model parameters, expectation maximization algorithm, protein secondary structure prediction, bioinformatics

Procedia PDF Downloads 441

11379 Easymodel: Web-based Bioinformatics Software for Protein Modeling Based on Modeller

Authors: Alireza Dantism

Abstract:

Presently, describing the function of a protein sequence is one of the most common problems in biology. Usually, this problem can be facilitated by studying the three-dimensional structure of proteins. In the absence of a protein structure, comparative modeling often provides a useful three-dimensional model of the protein that is dependent on at least one known protein structure. Comparative modeling predicts the three-dimensional structure of a given protein sequence (target) mainly based on its alignment with one or more proteins of known structure (templates). Comparative modeling consists of five main steps: (1) finding similarity between the target sequence and at least one known template structure; (2) aligning the target sequence with the template(s); (3) building a model based on the alignment with the selected template(s); (4) predicting model errors; and (5) optimizing the built model. There are many computer programs and web servers that automate the comparative modeling process. One of the most important advantages of these servers is that they make comparative modeling available to both experts and non-experts, who can easily do their own modeling without programming knowledge; some experts, however, prefer to apply programming knowledge and do their modeling manually, because this lets them maximize the accuracy of their models. In this study, a web-based tool has been designed to predict the tertiary structure of proteins using the PHP and Python programming languages. This tool is called EasyModel. According to the user's inputs, EasyModel receives the unknown sequence (the target), the template protein file, which shares a percentage of similarity with the target's primary sequence, and related settings; it then predicts the tertiary structure of the unknown sequence and presents the results in the form of graphs and constructed protein files.

Keywords: structural bioinformatics, protein tertiary structure prediction, modeling, comparative modeling, modeller

Procedia PDF Downloads 58

11378 Comparison of Different Artificial Intelligence-Based Protein Secondary Structure Prediction Methods

Authors: Jamerson Felipe Pereira Lima, Jeane Cecília Bezerra de Melo

Abstract:

The difficulty and cost of obtaining protein tertiary structure information through experimental methods, such as X-ray crystallography or NMR spectroscopy, have driven the development of computational methods for the same purpose. One such approach is the prediction of three-dimensional structure from the residue chain; however, this has been proved to be an NP-hard problem, owing to the complexity of the process, as explained by the Levinthal paradox. An alternative solution is the prediction of intermediary structures, such as the secondary structure of the protein. Artificial intelligence methods, such as Bayesian statistics, artificial neural networks (ANN), and support vector machines (SVM), among others, have been used to predict protein secondary structure. Owing to their good results, artificial neural networks have been used as a standard method to predict protein secondary structure. Recently published methods that use this technique have, in general, achieved a Q3 accuracy between 75% and 83%, whereas the theoretical accuracy limit for protein prediction is 88%. Alternatively, to achieve better results, prediction methods based on support vector machines have been developed. The statistical evaluation of methods that use different AI techniques, such as ANNs and SVMs, is not a trivial problem, since different training sets, validation techniques, and other variables can influence the behavior of a prediction method. In this study, we propose a prediction method based on artificial neural networks, which is then compared with a selected SVM method. The chosen SVM protein secondary structure prediction method is the one proposed by Huang in his work Extracting Physicochemical Features to Predict Protein Secondary Structure (2013). The developed ANN method follows the same training and testing process that Huang used to validate his method, which comprises the CB513 protein data set and three-fold cross-validation, so that the comparative analysis can be made by directly comparing the statistical results of each method.
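The Q3 accuracy quoted above is the percentage of residues whose predicted three-state label (helix H, strand E, coil C) matches the observed one. The sketch below shows the calculation on a made-up pair of strings; the function name `q3_accuracy` and the example sequences are illustrative assumptions, not results from CB513.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues whose three-state label is predicted correctly."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)

# Hypothetical prediction vs. observed assignment for a 9-residue fragment.
q3 = q3_accuracy("HHHECCEEC", "HHHCCCEEE")  # 7 of 9 positions agree
```

Because Q3 is a per-residue average, comparing two methods fairly requires the same data set and the same cross-validation split, which is why the study reuses Huang's protocol.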

Keywords: artificial neural networks, protein secondary structure, protein structure prediction, support vector machines

Procedia PDF Downloads 581

11377 DNpro: A Deep Learning Network Approach to Predicting Protein Stability Changes Induced by Single-Site Mutations

Authors: Xiao Zhou, Jianlin Cheng

Abstract:

A single amino acid mutation can have a significant impact on the stability of protein structure. Thus, the prediction of protein stability change induced by single-site mutations is critical and useful for studying protein function and structure. Here, we present a deep learning network with the dropout technique for predicting protein stability changes upon single amino acid substitution. While using only protein sequence as input, the overall prediction accuracy of the method on a standard benchmark is >85%, which is higher than existing sequence-based methods and is comparable to the methods that use not only protein sequence but also tertiary structure, pH value and temperature. The results demonstrate that deep learning is a promising technique for protein stability prediction. The good performance of this sequence-based method makes it a valuable tool for predicting the impact of mutations on most proteins whose experimental structures are not available. Both the downloadable software package and the user-friendly web server (DNpro) that implement the method for predicting protein stability changes induced by amino acid mutations are freely available for the community to use.

Keywords: bioinformatics, deep learning, protein stability prediction, biological data mining

Procedia PDF Downloads 419

11376 Effect of Low Temperature on Structure and RNA Binding of E.coli CspA: A Molecular Dynamics Based Study

Authors: Amit Chaudhary, B. S. Yadav, P. K. Maurya, A. M., S. Srivastava, S. Singh, A. Mani

Abstract:

Cold shock protein A (CspA) is a major cold-inducible protein present in Escherichia coli. The protein is involved in stabilizing the secondary structure of RNA by working as a chaperone at cold temperatures. Two RNA binding motifs play a key role in the stabilizing activity. This study aimed to investigate the implications of low temperature for the structure and RNA binding activity of E. coli CspA. Molecular dynamics simulations were performed to compare the stability of the protein at 37 °C and 10 °C. The protein was mutated at the RNA binding motifs and docked with RNA to assess the stability of both complexes. Results suggest that CspA, as well as the CspA-RNA complex, is more stable at low temperature. It was also confirmed that RNP1 and RNP2 play a key role in RNA binding.

Keywords: CspA, homology modelling, mutation, molecular dynamics simulation

Procedia PDF Downloads 344

11375 Meta-Learning for Hierarchical Classification and Applications in Bioinformatics

Authors: Fabio Fabris, Alex A. Freitas

Abstract:

Hierarchical classification is a special type of classification task where the class labels are organised into a hierarchy, with more generic class labels being ancestors of more specific ones. Meta-learning for classification-algorithm recommendation consists of recommending to the user a classification algorithm, from a pool of candidate algorithms, for a dataset, based on the past performance of the candidate algorithms in other datasets. Meta-learning is normally used in conventional, non-hierarchical classification. By contrast, this paper proposes a meta-learning approach for the more challenging task of hierarchical classification, and evaluates it on a large number of bioinformatics datasets. Hierarchical classification is especially relevant for bioinformatics problems, as protein and gene functions tend to be organised into a hierarchy of class labels. This work proposes a meta-learning approach for recommending the best hierarchical classification algorithm for a hierarchical classification dataset. This work’s contributions are: 1) proposing an algorithm for splitting hierarchical datasets into new datasets to increase the number of meta-instances, 2) proposing meta-features for hierarchical classification, and 3) interpreting decision-tree meta-models for hierarchical classification algorithm recommendation.

Keywords: algorithm recommendation, meta-learning, bioinformatics, hierarchical classification

Procedia PDF Downloads 279

11374 Protein Remote Homology Detection and Fold Recognition by Combining Profiles with Kernel Methods

Authors: Bin Liu

Abstract:

Protein remote homology detection and fold recognition are two of the most important tasks in protein sequence analysis, which is critical for protein structure and function studies. In this study, we combined profile-based features with various string kernels and constructed several computational predictors for protein remote homology detection and fold recognition. Experimental results on two widely used benchmark datasets showed that these methods outperformed the competing methods, indicating that these predictors are useful computational tools for protein sequence analysis. By analyzing the discriminative features of the trained models, some interesting patterns were discovered that reflect the characteristics of protein superfamilies and folds, which are important for researchers interested in finding the patterns of protein folds.

Keywords: protein remote homology detection, protein fold recognition, profile-based features, Support Vector Machines (SVMs)

Procedia PDF Downloads 126

11373 A Basic Modeling Approach for the 3D Protein Structure of Insulin

Authors: Daniel Zarzo Montes, Manuel Zarzo Castelló

Abstract:

Proteins play a fundamental role in biology, but their structure is complex, and it is a challenge for teachers to conceptually explain the differences between their primary, secondary, tertiary, and quaternary structures. There are currently many computer programs for visualizing the 3D structure of proteins, but they require advanced training and knowledge. Moreover, it is difficult to visualize the sequence of amino acids in these models and to see how the protein conformation is reached. Given this drawback, a simple and instructive procedure is proposed for teaching protein structure to undergraduate and graduate students. For this purpose, insulin was chosen because it is a protein that consists of 51 amino acids, a relatively small number. The methodology consisted of using plastic atom models, which are frequently used in organic chemistry and biochemistry to explain the chirality of biomolecules. For didactic purposes, when the aim is to teach the biochemical foundations of proteins, a manipulative system seems convenient, starting from the chemical structure of amino acids. It has the advantage that the bonds between amino acids can be conveniently rotated, following the pattern marked by the 3D models. First, the 51 amino acids were modeled, and then they were linked according to the sequence of this protein. Next, the three disulfide bonds that characterize the stability of insulin were established, and then the alpha-helix structure was formed. In order to reach the tertiary 3D conformation of this protein, different interactive models available on the Internet were consulted. In conclusion, the proposed methodology seems very suitable for biology and biochemistry students because they can learn the fundamentals of protein modeling by means of a manipulative procedure as a basis for understanding the functionality of proteins. This methodology would be useful for a biology or biochemistry laboratory practical, at either the pre-university or university level.

Keywords: protein structure, 3D model, insulin, biomolecule

Procedia PDF Downloads 16

11372 An Attempt at the Multi-Criterion Classification of Small Towns

Authors: Jerzy Banski

Abstract:

The basic aim of this study is to discuss and assess different classifications and research approaches to small towns that take their social and economic functions into account, as well as relations with surrounding areas. The subject literature typically includes three types of approaches to the classification of small towns: 1) the structural, 2) the location-related, and 3) the mixed. The structural approach allows for the grouping of towns from the point of view of the social, cultural and economic functions they discharge. The location-related approach draws on the idea of there being a continuum between the center and the periphery. A mixed classification making simultaneous use of the different approaches to research brings the most information to bear in regard to categories of the urban locality. Bearing in mind the approaches to classification, it is possible to propose a synthetic method for classifying small towns that takes account of economic structure, location and the relationship between the towns and their surroundings. In the case of economic structure, the small centers may be divided into two basic groups – those featuring a multi-branch structure and those that are specialized economically. A second element of the classification reflects the locations of urban centers. Two basic types can be identified – the small town within the range of impact of a large agglomeration, or else the town outside such areas, which is to say located peripherally. The third component of the classification arises out of small towns’ relations with their surroundings. In consequence, it is possible to indicate 8 types of small-town: from local centers enjoying good accessibility and a multi-branch economic structure to peripheral supra-local centers characterised by a specialized economic structure.

Keywords: small towns, classification, functional structure, localization

Procedia PDF Downloads 158

11371 Scene Classification Using Hierarchy Neural Network, Directed Acyclic Graph Structure, and Label Relations

Authors: Po-Jen Chen, Jian-Jiun Ding, Hung-Wei Hsu, Chien-Yao Wang, Jia-Ching Wang

Abstract:

A more accurate scene classification algorithm using label relations and the hierarchy neural network was developed in this work. Many classification algorithms assume that the labels are mutually exclusive. This assumption holds in some specific problems; for scene classification, however, it is not reasonable. Because a photo contains a variety of objects, it is more practical to assign multiple labels to an image. In this paper, two label relations, the exclusive relation and the hierarchical relation, were adopted in the classification process to achieve more accurate multi-label classification results. Moreover, the hierarchy neural network (hierarchy NN) is applied to classify the image, and the directed acyclic graph structure is used to predict a more reasonable result that obeys the exclusive and hierarchical relations. Simulations show that, with these techniques, a much more accurate scene classification result can be achieved.
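The two label relations can be read as constraints on a multi-label prediction: a hierarchical relation requires a parent label whenever its child is present, and an exclusive relation forbids two labels from appearing together. The sketch below only checks a candidate label set against such constraints; it is not the authors' directed acyclic graph inference, and the label names and the function `is_consistent` are hypothetical.

```python
def is_consistent(labels, parents, exclusive):
    """Check a label set against hierarchical and exclusive relations.

    parents:   dict mapping a child label to its parent label.
    exclusive: iterable of (a, b) pairs that must not co-occur.
    """
    labels = set(labels)
    for child, parent in parents.items():
        if child in labels and parent not in labels:
            return False      # hierarchical relation violated
    for a, b in exclusive:
        if a in labels and b in labels:
            return False      # exclusive relation violated
    return True

# Hypothetical scene labels (illustration only).
parents = {"beach": "outdoor", "kitchen": "indoor"}
exclusive = [("indoor", "outdoor")]

ok = is_consistent({"outdoor", "beach"}, parents, exclusive)
missing_parent = is_consistent({"beach"}, parents, exclusive)
clash = is_consistent({"indoor", "outdoor"}, parents, exclusive)
```

A check like this could serve as a post-processing filter on raw network outputs, whereas the paper describes using the graph structure during prediction itself.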

Keywords: convolutional neural network, label relation, hierarchy neural network, scene classification

Procedia PDF Downloads 422

11370 Potential Use of Cnidoscolus Chayamansa Leaf from Mexico as High-Quality Protein Source

Authors: Diana Karina Baigts Allende, Mariana Gonzalez Diaz, Luis Antonio Chel Guerrero, Mukthar Sandoval Peraza

Abstract:

Poverty and food insecurity remain prevalent problems in developing countries, where the population's diet is based on cereals that are low in protein. In recent years, however, native plants have been studied as alternative sources of protein in order to improve nutritional intake. The Chaya crop, also called the spinach tree, is a pre-Hispanic plant native to Central America and southern Mexico (Mayan culture), which has been especially valued for its high nutritional content, particularly protein, and for some medicinal properties. The aim of this work was to study the effect of protein isolation processing on the structural quality of protein from Chaya leaf harvested in Yucatan, Mexico, in order: i) to valorize the Chaya crop and ii) to produce low-cost, high-quality protein. Chaya leaf was extruded and clarified, and the protein was recovered using: a) acid precipitation, by decreasing the pH until reaching the isoelectric point (3.5), and b) thermal coagulation, by heating the protein solution at 80 °C for 30 min. The recovered protein was re-dissolved in water and spray dried. The presence of Fraction I protein, known as RuBisCO (ribulose-1,5-bisphosphate carboxylase/oxygenase), was confirmed by gel electrophoresis (SDS-PAGE), where molecular weight bands of 55 kDa and 12 kDa were observed. The infrared spectrum showed changes in protein structure due to the isolation method. The use of high temperatures (thermal coagulation) greatly decreased protein solubility in comparison to isoelectric-precipitated protein; the nutritional properties according to the amino acid profile were also affected, with total essential amino acids decreasing from 435.9 to 367.8 mg/g. Chaya protein isolate obtained by acid precipitation showed higher protein quality according to the essential amino acid score compared to FAO recommendations, and could represent an important sustainable source of protein for human consumption.

Keywords: chaya leaf, nutritional properties, protein isolate, protein structure

Procedia PDF Downloads 310

11369 Evaluating Classification with Efficacy Metrics

Authors: Guofan Shao, Lina Tang, Hao Zhang

Abstract:

The values of image classification accuracy are affected by class size distributions and classification schemes, making it difficult to compare the performance of classification algorithms across different remote sensing data sources and classification systems. Based on the term efficacy from medicine and pharmacology, we have developed the metrics of image classification efficacy at the map and class levels. The novelty of this approach is that a baseline classification is involved in computing image classification efficacies so that the effects of class statistics are reduced. Furthermore, the image classification efficacies are interpretable and comparable, and thus, strengthen the assessment of image data classification methods. We use real-world and hypothetical examples to explain the use of image classification efficacies. The metrics of image classification efficacy meet the critical need to rectify the strategy for the assessment of image classification performance as image classification methods are becoming more diversified.

Keywords: accuracy assessment, efficacy, image classification, machine learning, uncertainty

Procedia PDF Downloads 177

11368 Attribute Index and Classification Method of Earthquake Damage Photographs of Engineering Structure

Authors: Ming Lu, Xiaojun Li, Bodi Lu, Juehui Xing

Abstract:

The damage caused by each large earthquake provides a comprehensive and profound real-world test of the dynamic performance and failure mechanisms of different engineering structures. Understanding the characteristics of engineering structures through seismic damage phenomena is often far superior to expensive shaking table experiments. After an earthquake, people record many different types of engineering damage photographs. However, a large number of earthquake damage photographs lack sufficient information, which reduces their value. To improve the research value and usefulness of engineering seismic damage photographs, this paper aims to explore and present seismic damage background information, which includes the earthquake magnitude, the earthquake intensity, and the characteristics of the damaged structure. Based on the research requirements of the earthquake engineering field, the authors use photographs of the 2008 Wenchuan M8.0 earthquake in China and provide four kinds of attribute indexes and classifications: seismic information, structure types, earthquake damage parts, and disaster causation factors. The final objective is to set up an engineering structural seismic damage database based on these four attribute indexes and classifications, and eventually to build a website providing seismic damage photographs.

Keywords: attribute index, classification method, earthquake damage picture, engineering structure

Procedia PDF Downloads 733

11367 Classifying and Predicting Efficiencies Using Interval DEA Grid Setting

Authors: Yiannis G. Smirlis

Abstract:

The classification and prediction of efficiencies in Data Envelopment Analysis (DEA) is an important issue, especially in large-scale problems or when new units frequently enter the set under assessment. In this paper, we contribute to the subject by proposing a grid structure based on interval segmentations of the range of values for the inputs and outputs. Such intervals, combined, define hyper-rectangles that partition the space of the problem. This structure, exploited by Interval DEA models and a dominance relation, acts as a DEA pre-processor, enabling the classification and prediction of efficiency scores without applying any DEA models.
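The grid idea can be sketched in two small steps: locate the hyper-rectangle that a unit falls into, and compare units with a dominance relation. The breakpoints, the unit values, and both function names below are assumptions for illustration; in particular, the dominance test shown is the usual Pareto-style definition (no more of any input, no less of any output, strictly better in at least one), which may differ from the relation the paper uses.

```python
from bisect import bisect_right

def grid_cell(values, breakpoints):
    """Index of the interval each value falls into, one index per dimension.

    breakpoints: one sorted list of interior cut points per dimension.
    """
    return tuple(bisect_right(bp, v) for v, bp in zip(values, breakpoints))

def dominates(a_in, a_out, b_in, b_out):
    """True if unit A uses no more input and yields no less output than B,
    and is strictly better in at least one dimension (assumed definition)."""
    no_worse = (all(x <= y for x, y in zip(a_in, b_in))
                and all(x >= y for x, y in zip(a_out, b_out)))
    strictly = (any(x < y for x, y in zip(a_in, b_in))
                or any(x > y for x, y in zip(a_out, b_out)))
    return no_worse and strictly

# Hypothetical cut points for one input and one output dimension.
cuts = [[10, 20], [5, 10]]
cell_a = grid_cell((12.0, 7.5), cuts)
cell_b = grid_cell((25.0, 2.0), cuts)
a_beats_b = dominates([12.0], [7.5], [25.0], [2.0])
```

Once every existing unit has been assigned to a cell, a new unit can be placed in the grid with a few binary searches, which is what makes a pre-processing step of this kind cheap compared with solving a DEA model.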

Keywords: data envelopment analysis, interval DEA, efficiency classification, efficiency prediction

Procedia PDF Downloads 134

11366 Insight into Structure and Functions of Acyl CoA Binding Protein of Leishmania major

Authors: Rohit Singh Dangi, Ravi Kant Pal, Monica Sundd

Abstract:

Acyl-CoA binding protein (ACBP) is a housekeeping protein that functions as an intracellular carrier of acyl-CoA esters. Given that the amastigote stage (blood stage) of Leishmania depends largely on fatty acids as its energy source, a large part of which is derived from its host, these proteins might play an important role in its survival. In Leishmania major, genome sequencing suggests the presence of six ACBPs, whose functions remain largely unknown. For functional and structural characterization, one of the ACBP genes was cloned, and the protein was heterologously expressed and purified. Acyl-CoA ester binding and stoichiometry were analyzed by isothermal titration calorimetry and dynamic light scattering. Our results show that the ACBP has a high affinity for longer acyl-CoA esters, from myristoyl-CoA to arachidonoyl-CoA, with a single binding site. To understand the binding mechanism and dynamics, nuclear magnetic resonance assignments of this protein are in progress. The protein's crystal structure was determined at 1.5 Å resolution and revealed the classical ACBP topology, a bundle of four alpha-helices. In the binding pocket, the loop between the first and the second helix (residues 16-26) is four residues longer than in other extensively studied ACBPs (PfACBP), and it curls upwards towards the pantothenate moiety of CoA to provide a large tunnel space for long acyl chain insertion.

Keywords: acyl-CoA binding protein (ACBP), acyl-CoA esters, crystal structure, isothermal titration calorimetry, Leishmania

Procedia PDF Downloads 415

11365 Lentil Protein Fortification in Cranberry Squash

Authors: Sandhya Devi A

Abstract:

The protein content of cranberry squash (protein: 0 g) may be increased by adding protein extracted from lentils (9 g), whose consumption is linked to a lower risk of developing heart disease. Protein may be extracted from lentil flour by alkaline extraction. The alkaline extraction of protein from lentil flour was optimized using a response surface approach in order to maximize both protein content and yield. The fortified cranberry squash can be consumed once a protein fortification syrup has been prepared and processed into the squash.

Keywords: alkaline extraction, cranberry squash, protein fortification, response surface methodology

Procedia PDF Downloads 78

11364 Hyperspectral Image Classification Using Tree Search Algorithm

Authors: Shreya Pare, Parvin Akhter

Abstract:

Remote sensing image classification is a very challenging task owing to the high dimensionality of hyperspectral images. Pixel-wise classification methods fail to take the spatial structure information of an image into account. Therefore, to improve classification performance, spatial information can be integrated into the classification process. In this paper, a multilevel thresholding algorithm based on a modified fuzzy entropy (MFE) function is used to segment hyperspectral images. The fuzzy parameters of the MFE function have been optimized using a new meta-heuristic algorithm based on the Tree-Search algorithm. The segmented image is classified by a large distribution machine (LDM) classifier. Experimental results are shown on a hyperspectral image dataset. The experimental outputs indicate that the proposed technique (MFE-TSA-LDM) achieves much higher classification accuracy for hyperspectral images than state-of-the-art classification techniques. The proposed algorithm provides accurate segmentation and classification maps, making it more suitable for image classification with large spatial structures.
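Multilevel thresholding partitions pixel intensities into several segments using more than one threshold. The sketch below shows only the final assignment step for a given set of thresholds; choosing the thresholds (here via the modified fuzzy entropy function and the tree-search optimiser, per the abstract) is the hard part and is not reproduced. The threshold values, the pixel values, and the function name `segment` are made up for illustration.

```python
from bisect import bisect_right

def segment(pixels, thresholds):
    """Assign each intensity to a segment index given k thresholds.

    k thresholds produce k + 1 segments, numbered 0..k.
    """
    cuts = sorted(thresholds)
    return [bisect_right(cuts, p) for p in pixels]

# Hypothetical 8-bit intensities and three hand-picked thresholds.
labels = segment([12, 80, 130, 200, 255], [64, 128, 192])
```

For a hyperspectral cube, the same assignment would be applied band by band or to a derived single-band image, depending on how the segmentation is set up.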

Keywords: classification, hyperspectral images, large distribution margin, modified fuzzy entropy function, multilevel thresholding, tree search algorithm, hyperspectral image classification using tree search algorithm

Procedia PDF Downloads 135

11363 On the Homology Modeling, Structural Function Relationship and Binding Site Prediction of Human Alsin Protein

Authors: Y. Ruchi, A. Prerna, S. Deepshikha

Abstract:

Amyotrophic lateral sclerosis (ALS), also known as “Lou Gehrig’s disease”, is a neurodegenerative disease associated with degeneration of motor neurons in the cerebral cortex, brain stem, and spinal cord, and is characterized by distal muscle weakness, atrophy, normal sensation, pyramidal signs, and progressive muscular paralysis. ALS2 is a slowly progressive, juvenile autosomal recessive disorder that maps to chromosome 2q33 and is associated with mutations in the alsin gene, a putative GTPase regulator. In this paper, we performed homology modeling of the alsin2 protein using multiple templates (3KCI_A, 4LIM_A, 402W_A, 4D9S_A, and 4DNV_A) with the Prime program in the Schrödinger software. The modeled structure was then used to identify effective binding sites on the basis of structural and physical properties using the SiteMap program in the Schrödinger software. Structural and functional analysis was done using the Prosite and ExPASy servers, which give insight into conserved domains and motifs that can be used for protein classification. This paper summarizes the structural, functional, and binding site properties of the alsin2 protein. These binding sites can be potential drug target sites and can be used for docking studies.

Keywords: ALS, binding site, homology modeling, neuronal degeneration

Procedia PDF Downloads 361

11362 Hydration of Protein-RNA Recognition Sites

Authors: Amita Barik, Ranjit Prasad Bahadur

Abstract:

We investigate the role of water molecules in 89 protein-RNA complexes taken from the Protein Data Bank. Those with tRNA and single-stranded RNA are less hydrated than those with duplex RNA or ribosomal proteins. Protein-RNA interfaces are hydrated less than protein-DNA interfaces, but more than protein-protein interfaces. The majority of the waters at protein-RNA interfaces make multiple H-bonds; however, a fraction make none. Those making H-bonds prefer the polar groups of the RNA over those of its partner protein. The spatial distribution of waters makes interfaces with ribosomal proteins and single-stranded RNA relatively ‘dry’ compared with interfaces with tRNA and duplex RNA. In contrast to protein-DNA interfaces, and mainly due to the presence of the 2’OH, the ribose in protein-RNA interfaces is hydrated more than the phosphate or the bases. The minor groove in protein-RNA interfaces is hydrated more than the major groove, while in protein-DNA interfaces the reverse holds. The strands make the highest number of water-mediated H-bonds per unit interface area, followed by the helices and the non-regular structures. The preserved waters at protein-RNA interfaces make a higher number of H-bonds than the other waters. Preserved waters contribute toward the affinity in protein-RNA recognition and should be treated carefully when engineering protein-RNA interfaces.

Keywords: h-bonds, minor-major grooves, preserved water, protein-RNA interfaces

Procedia PDF Downloads 258

11361 Effect of Extrusion Processing Parameters on Protein in Banana Flour Extrudates: Characterisation Using Fourier-Transform Infrared Spectroscopy

Authors: Surabhi Pandey, Pavuluri Srinivasa Rao

Abstract:

Extrusion processing is a high-temperature short-time (HTST) treatment that can improve protein quality and digestibility while retaining active nutrients. In-vitro protein digestibility of plant protein-based foods is generally enhanced by extrusion. The current study aimed to investigate the effect of extrusion cooking on in-vitro protein digestibility (IVPD) and conformational modification of protein in green banana flour extrudates. Green banana flour was extruded through a co-rotating twin-screw extruder, varying the moisture content, barrel temperature and screw speed in the ranges of 10-20%, 60-80 °C and 200-300 rpm, respectively, at a constant feed rate. Response surface methodology was used to optimise the result for IVPD. Fourier-transform infrared spectroscopy (FTIR) analysis provided a convenient and powerful means to monitor interactions and changes in the functional and conformational properties of extrudates. Results showed that protein digestibility was highest in the extrudate produced at 80 °C, 250 rpm and 15% feed moisture. FTIR analysis was done for the optimised sample having the highest IVPD. It showed no changes in the primary structure of the protein, while the secondary structure changed. To explain this behaviour, infrared spectroscopy analysis was carried out, mainly in the amide I and II regions. Moreover, curve fitting analysis showed the conformational changes produced in the flour due to protein denaturation. The quantitative analysis of the changes in the amide I and II regions provided information about the modifications produced in banana flour extrudates.

Keywords: extrusion, FTIR, protein conformation, raw banana flour, SDS-PAGE method

Procedia PDF Downloads 129

11360 Recurrent Neural Networks with Deep Hierarchical Mixed Structures for Chinese Document Classification

Authors: Zhaoxin Luo, Michael Zhu

Abstract:

Natural languages always contain complex semantic hierarchies. Obtaining a feature representation based on these hierarchies is key to the success of a model. Several RNN models have recently been proposed that use latent indicators to obtain the hierarchical structure of documents. However, a model that uses only a single-layer latent indicator cannot capture the true hierarchical structure of a language, especially a complex language like Chinese. In this paper, we propose a deep layered model that stacks arbitrarily many RNN layers equipped with latent indicators. By using EM and training the model hierarchically, we solve the computational problem of stacking RNN layers and make it possible to stack arbitrarily many of them. Our deep hierarchical model not only achieves results comparable to large pre-trained models on the Chinese short text classification problem but also achieves state-of-the-art results on the Chinese long text classification problem.

Keywords: natural language processing, recurrent neural network, hierarchical structure, document classification, Chinese

Procedia PDF Downloads 32

11359 Protein Crystallization Induced by Surface Plasmon Resonance

Authors: Tetsuo Okutsu

Abstract:

We have developed a crystallization plate that promotes protein crystallization. A gold thin film is deposited on the crystallization plate. A protein solution is dropped onto it, and crystallization is promoted when the protein is irradiated with light of a wavelength that the protein does not absorb. Protein is densely adsorbed on the gold thin film surface. The light excites the surface plasmon resonance of the gold thin film, the protein is excited by the resulting enhanced electric field, and the amino acid residues are radicalized to produce protein dimers. The dimers function as templates for protein crystals, so crystallization is promoted.

Keywords: lysozyme, plasmon, protein, crystallization, RNaseA

Procedia PDF Downloads 184

11358 Hybrid Structure Learning Approach for Assessing the Phosphate Laundries Impact

Authors: Emna Benmohamed, Hela Ltifi, Mounir Ben Ayed

Abstract:

The Bayesian Network (BN) is one of the most efficient classification methods. It is widely used in several fields (e.g., medical diagnostics, risk analysis, bioinformatics research). The BN is defined as a probabilistic graphical model that provides a formalism for reasoning under uncertainty. This classification method performs well in extracting new knowledge from data. The construction of the model consists of two phases: structure learning and parameter learning. For structure learning, the K2 algorithm is one of the representative data-driven algorithms and is based on a score-and-search approach. In addition, integrating the expert's knowledge into the structure learning process yields the highest accuracy. In this paper, we propose a hybrid approach that combines an improvement of the K2 algorithm, called the K2 algorithm for Parents and Children search (K2PC), with the expert-driven method for learning the structure of a BN. The evaluation of the experimental results on well-known benchmarks shows that our K2PC algorithm performs better in terms of correct structure detection. The real application of our model shows its efficiency in analysing the impact of phosphate laundry effluents on the watershed in the Gafsa area (southwestern Tunisia).
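The score-and-search idea behind K2 can be illustrated with a minimal sketch. This is the classic K2 procedure with the Cooper-Herskovits score in log form, not the K2PC variant proposed in the abstract, whose details are not given here. The function names, the `arity` mapping and the `max_parents` limit are assumptions made for illustration, and the data are assumed to be discrete, complete, and supplied with a fixed node ordering.

```python
from collections import Counter
from math import lgamma


def k2_log_score(data, child, parents, arity):
    """Log Cooper-Herskovits (K2) score of `child` given a candidate parent set.

    data: list of tuples of discrete values; arity[child]: number of states.
    """
    joint = Counter()   # counts N_ijk: (parent configuration, child value)
    totals = Counter()  # counts N_ij: parent configuration only
    for row in data:
        config = tuple(row[p] for p in parents)
        joint[(config, row[child])] += 1
        totals[config] += 1
    r = arity[child]
    # lgamma(r) = log((r-1)!), lgamma(n + r) = log((n + r - 1)!)
    score = sum(lgamma(r) - lgamma(n_ij + r) for n_ij in totals.values())
    # lgamma(n + 1) = log(n!)
    score += sum(lgamma(n_ijk + 1) for n_ijk in joint.values())
    return score


def k2_search(data, order, arity, max_parents=2):
    """Greedy K2: each node may only take parents that precede it in `order`."""
    parents = {node: [] for node in order}
    for idx, node in enumerate(order):
        best = k2_log_score(data, node, parents[node], arity)
        candidates = set(order[:idx])
        while candidates and len(parents[node]) < max_parents:
            scored = [
                (k2_log_score(data, node, parents[node] + [c], arity), c)
                for c in candidates
            ]
            new_score, new_parent = max(scored)
            if new_score <= best:
                break  # no candidate improves the score
            parents[node].append(new_parent)
            candidates.remove(new_parent)
            best = new_score
    return parents
```

In a toy dataset where variable 1 copies variable 0 and variable 2 is independent of both, `k2_search` should select variable 0 as the only parent of variable 1 and leave variable 2 without parents. Expert knowledge of the kind the abstract describes could then be applied on top of such a data-driven structure, but how the authors combine the two is not specified here.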

Keywords: Bayesian network, classification, expert knowledge, structure learning, surface water analysis

Procedia PDF Downloads 87

11357 Multilabel Classification with Neural Network Ensemble Method

Authors: Sezin Ekşioğlu

Abstract:

Multilabel classification is of great importance for several applications and is also a challenging research topic. It is a kind of supervised learning with binary targets. The difference between multilabel and binary classification is that multilabel problems involve more than one class: features can belong to one class or to many classes. There is a wide range of applications for multilabel prediction, such as image labeling, text categorization and gene functionality. Even though features are assigned to many classes, they may not always be properly classified. There are many ensemble methods for classification, and most researchers have been concerned with better multilabel methods. However, few of them focus on both the efficiency of classifiers and pairwise relationships at the same time in order to implement better multilabel classification. In this paper, we worked on modified ensemble methods that combine k-Nearest Neighbors and a neural network structure to address these issues and to improve multilabel classification. Publicly available datasets (yeast, emotion, scene and birds) are used to demonstrate the efficiency of the developed algorithm, and the technique is evaluated by accuracy, F1 score and Hamming loss metrics. Our algorithm improves on benchmark results for each dataset across the different metrics.
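The three evaluation metrics named above can be sketched for binary label matrices as follows. This is an illustrative sketch, not the authors' evaluation code: the abstract does not say which variants were used, so exact-match (subset) accuracy and micro-averaged F1 are assumptions, as are the function names.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    wrong = sum(
        t != p
        for row_t, row_p in zip(y_true, y_pred)
        for t, p in zip(row_t, row_p)
    )
    return wrong / (len(y_true) * len(y_true[0]))


def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose whole label vector is predicted exactly."""
    exact = sum(list(t) == list(p) for t, p in zip(y_true, y_pred))
    return exact / len(y_true)


def micro_f1(y_true, y_pred):
    """F1 computed from true/false positives pooled over all labels."""
    tp = fp = fn = 0
    for row_t, row_p in zip(y_true, y_pred):
        for t, p in zip(row_t, row_p):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

Each row of `y_true` and `y_pred` is one sample and each column one label (1 if the label applies, 0 otherwise). Lower Hamming loss is better, while higher accuracy and F1 are better.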

Keywords: multilabel, classification, neural network, KNN

Procedia PDF Downloads 125

11356 Urban Land Cover from GF-2 Satellite Images Using Object Based and Neural Network Classifications

Authors: Lamyaa Gamal El-Deen Taha, Ashraf Sharawi

Abstract:

China launched the GF-2 satellite in 2014. This study compares nearest neighbor object-based classification and neural network classification of the fused GF-2 image. First, the GF-2 image was rectified. Second, the two classification methods were applied to the fused GF-2 image and compared. Third, the overall accuracy of classification and the kappa index were calculated. Results indicate that nearest neighbor object-based classification is better than neural network classification for urban mapping.
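The two accuracy measures mentioned above are normally derived from a confusion matrix. The following is a minimal sketch, assuming a square matrix whose rows are reference classes and whose columns are predicted classes; the function name and the example matrix in the note below are hypothetical and do not come from the study.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] = number of samples of reference class i assigned to class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [
        sum(confusion[i][j] for i in range(n_classes)) for j in range(n_classes)
    ]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    if expected == 1.0:
        return observed, 1.0  # degenerate case: only one class present
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa
```

For a hypothetical two-class matrix `[[45, 5], [10, 40]]`, the overall accuracy is 0.85 and the chance agreement is 0.5, which gives a kappa of 0.7. Kappa is lower than overall accuracy because it discounts the agreement that would occur by chance.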

Keywords: GF-2 images, feature extraction-rectification, nearest neighbour object based classification, segmentation algorithms, neural network classification, multilayer perceptron

Procedia PDF Downloads 351

11355 Arabic Text Representation and Classification Methods: Current State of the Art

Authors: Rami Ayadi, Mohsen Maraoui, Mounir Zrigui

Abstract:

In this paper, we present a brief overview of the current state of the art in Arabic text representation and classification methods. We decompose Arabic text classification (TC) work into four categories. First, we describe some algorithms applied to the classification of Arabic text. Second, we cite the major works that compare classification algorithms applied to Arabic text. Third, we mention some authors who propose new classification methods. Finally, we investigate the impact of preprocessing on Arabic TC.

Keywords: text classification, Arabic, impact of preprocessing, classification algorithms

Procedia PDF Downloads 433

11354 Characterization of the GntR Family Transcriptional Regulator Rv0792c: A Potential Drug Target for Mycobacterium tuberculosis

Authors: Thanusha D. Abeywickrama, Inoka C. Perera, Genji Kurisu

Abstract:

Tuberculosis, considered the ninth leading cause of death worldwide, is caused by a single infectious agent, M. tuberculosis, and the drug resistance of this bacterium is a continuing threat to the world. TB preventive treatment is therefore expanding, and this study was designed to analyze the regulatory mechanism of the GntR transcriptional regulator gene Rv0792c, which lies among several genes coding for hypothetical proteins, a monooxygenase and an oxidoreductase. The gene encoding Rv0792c was cloned into pET28a, and the expressed protein was purified to near homogeneity by nickel affinity chromatography. It was previously reported that the protein binds within the intergenic region (BS region) between the Rv0792c gene and the monooxygenase gene (Rv0793). This resulted in the binding of three protein molecules to the BS region, suggesting tight control of the monooxygenase as well as of its own gene. Since the monooxygenase plays a key role in metabolism, this gene may have a global regulatory role. The natural ligand for this regulator is still under investigation. To characterize the Rv0792 protein structure, circular dichroism (CD) spectroscopy was carried out to determine its secondary structure elements. The CD spectrum data at room temperature revealed 17.4% helix, 21.8% antiparallel, 5.1% parallel, 12.3% turn and 43.5% other. Differential Scanning Calorimetry (DSC) was conducted to assess the thermal stability of Rv0792, giving a melting temperature of 57.2 ± 0.6 °C. The best fit to the graph of heat capacity (Cp) versus temperature was obtained with a non-two-state model, indicating that folding of the Rv0792 protein occurs through stable intermediates. The peak area (∆HCal) and peak shape (∆HVant) were calculated from the graph, and ∆HCal/∆HVant was close to 0.5, suggesting a dimeric nature of the protein.

Keywords: CD spectrum, DSC analysis, GntR transcriptional regulator, protein structure

Procedia PDF Downloads 195