Search results for: protein coding regions
5438 Identifying Protein-Coding and Non-Coding Regions in Transcriptomes
Authors: Angela U. Makolo
Abstract:
Protein-coding and Non-coding regions determine the biology of a sequenced transcriptome. Research advances have shown that Non-coding regions are important in disease progression and clinical diagnosis. Existing bioinformatics tools have been targeted towards Protein-coding regions alone. Therefore, there are challenges associated with gaining biological insights from transcriptome sequence data. These tools are also limited to computationally intensive sequence alignment, which is inadequate and less accurate to identify both Protein-coding and Non-coding regions. Alignment-free techniques can overcome the limitation of identifying both regions. Therefore, this study was designed to develop an efficient sequence alignment-free model for identifying both Protein-coding and Non-coding regions in sequenced transcriptomes. Feature grouping and randomization procedures were applied to the input transcriptomes (37,503 data points). Successive iterations were carried out to compute the gradient vector that converged the developed Protein-coding and Non-coding Region Identifier (PNRI) model to the approximate coefficient vector. The logistic regression algorithm was used with a sigmoid activation function. A parameter vector was estimated for every sample in 37,503 data points in a bid to reduce the generalization error and cost. Maximum Likelihood Estimation (MLE) was used for parameter estimation by taking the log-likelihood of six features and combining them into a summation function. Dynamic thresholding was used to classify the Protein-coding and Non-coding regions, and the Receiver Operating Characteristic (ROC) curve was determined. The generalization performance of PNRI was determined in terms of F1 score, accuracy, sensitivity, and specificity. The average generalization performance of PNRI was determined using a benchmark of multi-species organisms. 
The generalization error for identifying Protein-coding and Non-coding regions decreased from 0.514 to 0.508 and to 0.378, respectively, after three iterations. The cost (difference between the predicted and the actual outcome) also decreased from 1.446 to 0.842 and to 0.718, respectively, for the first, second and third iterations. The iterations terminated at the 390th epoch, having an error of 0.036 and a cost of 0.316. The computed elements of the parameter vector that maximized the objective function were 0.043, 0.519, 0.715, 0.878, 1.157, and 2.575. The PNRI gave an ROC of 0.97, indicating an improved predictive ability. The PNRI identified both Protein-coding and Non-coding regions with an F1 score of 0.970, accuracy of 0.969, sensitivity of 0.966, and specificity of 0.973. Using 13 non-human multi-species model organisms, the average generalization performance of the traditional method was 74.4%, while that of the developed model was 85.2%, making the developed model better at identifying Protein-coding and Non-coding regions in transcriptomes. The developed Protein-coding and Non-coding region identifier model efficiently identified the Protein-coding and Non-coding transcriptomic regions. It could be used in genome annotation and in the analysis of transcriptomes.
Keywords: sequence alignment-free model, dynamic thresholding classification, input randomization, genome annotation
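The training procedure this abstract describes (logistic regression with a sigmoid, fitted by iterating on the gradient of the log-likelihood, then thresholded) can be sketched roughly as follows. This is a minimal illustration and not the PNRI model: the two-feature toy data, the learning rate, the epoch count, and the names `fit_logistic`, `predict`, and `make_toy_data` are all assumptions made here, and a fixed threshold stands in for the dynamic thresholding the abstract mentions.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=200):
    """Batch gradient ascent on the mean log-likelihood.

    Returns the weight vector with the bias term first.
    lr and epochs are illustrative defaults, not values from the abstract.
    """
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = yi - sigmoid(z)  # gradient of the log-likelihood w.r.t. z
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def predict(w, x, threshold=0.5):
    # Classify as 1 (e.g. "coding") when the predicted probability reaches the threshold.
    p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))
    return 1 if p >= threshold else 0


def make_toy_data(n=200, seed=0):
    # Hypothetical, well-separated two-class data used only to exercise the code.
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 1.0 if label else -1.0
        X.append([rng.gauss(center, 0.3), rng.gauss(center, 0.3)])
        y.append(label)
    return X, y
```

Calling `fit_logistic(*make_toy_data())` returns a three-element weight vector. The abstract's model uses six features, which would give seven weights here including the bias.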
Procedia PDF Downloads 66
5437 Gene Prediction in DNA Sequences Using an Ensemble Algorithm Based on Goertzel Algorithm and Anti-Notch Filter
Authors: Hamidreza Saberkari, Mousa Shamsi, Hossein Ahmadi, Saeed Vaali, MohammadHossein Sedaaghi
Abstract:
In recent years, using signal processing tools for accurate identification of protein coding regions has become a challenge in bioinformatics. Most genomic signal processing methods are based on the period-3 characteristic of the nucleotides in DNA strands; consequently, spectral analysis is applied to the numerical sequences of DNA to locate the periodic components. In this paper, a novel ensemble algorithm for gene selection in DNA sequences is presented, based on the combination of the Goertzel algorithm and an anti-notch filter (ANF). The proposed algorithm has several advantages over conventional methods. Firstly, it identifies protein coding regions more accurately because the Goertzel algorithm is tuned to the desired frequency. Secondly, faster detection time is achieved. The proposed algorithm is applied to several genes, including genes available in the BG570 and HMR195 databases, and the results are compared to other methods based on nucleotide-level evaluation criteria. Implementation results show the excellent performance of the proposed algorithm in identifying protein coding regions, specifically in identifying small-scale gene areas.
Keywords: protein coding regions, period-3, anti-notch filter, Goertzel algorithm
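As a rough illustration of the period-3 idea in this abstract, the Goertzel recurrence can evaluate a single DFT bin (here bin N/3) of each nucleotide indicator sequence. This sketch covers only the Goertzel step. It does not include the anti-notch filter or the ensemble the authors describe, and the function names and the choice of binary indicator sequences are assumptions.

```python
import math


def goertzel_power(x, k):
    """Squared magnitude of DFT bin k of sequence x via the Goertzel recurrence."""
    n = len(x)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s1 = s2 = 0.0
    for sample in x:
        s0 = sample + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def period3_power(seq):
    """Sum the Goertzel power at bin N/3 over the A, C, G, T indicator sequences."""
    seq = seq.upper()
    n = len(seq) - len(seq) % 3  # trim so that N is divisible by 3
    seq = seq[:n]
    if n == 0:
        return 0.0
    k = n // 3
    return sum(
        goertzel_power([1.0 if c == base else 0.0 for c in seq], k)
        for base in "ACGT"
    )
```

A perfectly codon-periodic string such as `"ATG" * 10` yields a large value, while a homopolymer such as `"A" * 30` yields a value near zero. In practice the measure would be computed over a sliding window along the sequence.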
Procedia PDF Downloads 385
5436 Whole Exome Sequencing Data Analysis of Rare Diseases: Non-Coding Variants and Copy Number Variations
Authors: S. Fahiminiya, J. Nadaf, F. Rauch, L. Jerome-Majewska, J. Majewski
Abstract:
Background: Sequencing of the protein coding regions of the human genome (whole exome sequencing; WES) has demonstrated great success in identifying causal mutations for several rare human genetic disorders. Most WES studies have focused on rare variants in coding exons and splice sites, where missense substitutions lead to alteration of the protein product. Although focusing on this category of variants has explained many inherited genetic diseases in recent years, a subset of cases has remained inconclusive. Here, we present the results of our WES studies in which analyzing only rare variants in coding regions was not conclusive, but further investigation revealed the involvement of non-coding variants and copy number variations (CNVs) in the etiology of the diseases. Methods: Whole exome sequencing was performed using our standard protocols at Genome Quebec Innovation Center, Montreal, Canada. All bioinformatics analyses were done using an in-house WES pipeline. Results: To date, we have successfully identified several disease-causing mutations within gene coding regions (e.g. SCARF2: Van den Ende-Gupta syndrome and SNAP29: 22q11.2 deletion syndrome) using WES. In addition, we showed that variants in non-coding regions and CNVs are also important and should not be ignored or filtered out during bioinformatics analysis of WES data. For instance, in patients with osteogenesis imperfecta type V and in patients with glucocorticoid deficiency, we identified variants in the 5'UTR resulting in the production of longer or truncated non-functional proteins. Furthermore, CNVs were identified as the main cause of disease in patients with metaphyseal dysplasia with maxillary hypoplasia and brachydactyly and in patients with osteogenesis imperfecta type VII.
Conclusions: Our study highlights the importance of considering non-coding variants and CNVs during interpretation of WES data, as they can be the only cause of the disease under investigation.
Keywords: whole exome sequencing data, non-coding variants, copy number variations, rare diseases
Procedia PDF Downloads 417
5435 An Improvement of ComiR Algorithm for MicroRNA Target Prediction by Exploiting Coding Region Sequences of mRNAs
Authors: Giorgio Bertolazzi, Panayiotis Benos, Michele Tumminello, Claudia Coronnello
Abstract:
MicroRNAs are small non-coding RNAs that post-transcriptionally regulate the expression levels of messenger RNAs. MicroRNA regulation activity depends on the recognition of binding sites located on mRNA molecules. ComiR (Combinatorial miRNA targeting) is a user-friendly web tool designed to predict the targets of a set of microRNAs, starting from their expression profile. ComiR incorporates miRNA expression in a thermodynamic binding model, and it associates each gene with the probability of being a target of a set of miRNAs. ComiR algorithms were trained with information regarding binding sites in the 3’UTR region, using a reliable dataset containing the targets of endogenously expressed microRNAs in D. melanogaster S2 cells. This dataset was obtained by comparing the results from two different experimental approaches, i.e., inhibition and immunoprecipitation of the AGO1 protein; this protein is a component of the microRNA-induced silencing complex. In this work, we tested whether including coding region binding sites in the ComiR algorithm improves the performance of the tool in predicting microRNA targets. We focused the analysis on D. melanogaster and updated the underlying ComiR database with the currently available releases of mRNA and microRNA sequences. As a result, we find that the ComiR algorithm trained with information related to the coding regions is more efficient at predicting microRNA targets than the algorithm trained with 3’UTR information. On the other hand, we show that 3’UTR-based predictions can be seen as complementary to the coding region-based predictions, which suggests that both predictions, from 3’UTR and coding regions, should be considered in a comprehensive analysis.
Furthermore, we observed that the lists of targets obtained by analyzing data from only one experimental approach, that is, inhibition or immunoprecipitation of AGO1, are not reliable enough to test the performance of our microRNA target prediction algorithm. Further analysis will be conducted to investigate the effectiveness of the tool with data from other species, provided that validated datasets, obtained by comparing RISC protein inhibition and immunoprecipitation experiments, become available for the same samples. Finally, we propose to upgrade the existing ComiR web tool by including the model trained on coding regions, available together with the 3’UTR-based one.
Keywords: AGO1, coding region, Drosophila melanogaster, microRNA target prediction
Procedia PDF Downloads 449
5434 CMPD: Cancer Mutant Proteome Database
Authors: Po-Jung Huang, Chi-Ching Lee, Bertrand Chin-Ming Tan, Yuan-Ming Yeh, Julie Lichieh Chu, Tin-Wen Chen, Cheng-Yang Lee, Ruei-Chi Gan, Hsuan Liu, Petrus Tang
Abstract:
Whole-exome sequencing, which focuses on the protein coding regions of disease/cancer associated genes based on a priori knowledge, is the most cost-effective method to study the association between genetic alterations and disease. Recent advances in high-throughput sequencing technologies and proteomic techniques have provided an opportunity to integrate genomics and proteomics, allowing ready detection of mutated peptides corresponding to mutated genes. Since sequence database search is the most widely used method for protein identification in mass spectrometry (MS)-based proteomics, a mutant proteome database is required to better approximate the real protein pool and improve identification of disease-associated mutated proteins. Large-scale whole exome/genome sequencing studies were launched by the National Cancer Institute (NCI), the Broad Institute, and The Cancer Genome Atlas (TCGA), which provide not only a comprehensive report on the analysis of coding variants in diverse samples and cell lines but also an invaluable resource for the wider research community. No existing database collects the mutant protein sequences related to the variants identified in these studies. CMPD is designed to address this issue, serving as a bridge between genomic data and proteomic studies and focusing on protein sequence-altering variations originating from both germline and cancer-associated somatic variations.
Keywords: TCGA, cancer, mutant, proteome
Procedia PDF Downloads 591
5433 Relating Symptoms with Protein Production Abnormality in Patients with Down Syndrome
Authors: Ruolan Zhou
Abstract:
Trisomy of human chromosome 21 is the primary cause of Down Syndrome (DS), and this genetic disease has significantly burdened families and countries, causing great controversy. To address this problem, this research explores the relationship between the genetic abnormality and the disease's symptoms, adopting several techniques, including data analysis and enrichment analysis. It also draws on open-source websites, such as NCBI, DAVID, SOURCE, STRING, and UCSC, to complement its results. This research analyzed the variety of genes on human chromosome 21 with simple coding, and through this analysis it specified the protein-coding genes, their functions, and their locations. Using enrichment analysis, this paper found an abundance of keratin production-related coding proteins on human chromosome 21. Drawing on past research, this work attempts to disclose the relationship between trisomy of human chromosome 21 and keratin production abnormality, which might be the reason for common diseases in patients with Down Syndrome. Finally, by addressing the advantages and shortcomings of this research, the discussion provides specific directions for future research.
Keywords: Down Syndrome, protein production, genome, enrichment analysis
Procedia PDF Downloads 124
5432 Detection of Viral-Plant Interaction Using Some Pathogenesis Related Protein Genes to Identify Resistant Genes against Potato LeafRoll Virus and Potato Virus Y in Egyptian Isolates
Authors: Dalia. G. Aseel, E. E. Hafez, S. M. Hammad
Abstract:
Viral RNAs of both potato leaf roll virus (PLRV) and potato virus Y (PVY) were extracted from infected potato leaves collected from different Egyptian regions. Differential Display Polymerase Chain Reaction (DD-PCR) was performed using forward-strand primers for Endoglucanase, β-1,3-glucanases, Chitinase, Peroxidase and Polyphenol oxidase. The obtained data revealed different banding patterns depending on the viral type and the region of infection. For PLRV, 58 up-regulated and 19 down-regulated genes were detected, while 31 up-regulated and 14 down-regulated genes were observed for PVY. Based on nucleotide sequencing, variable phylogenetic relationships were reported for the three sequenced genes coding for: Induced stolon tip protein, Disease resistance RPP-like protein and non-specific lipid-transfer protein. In a complementary approach, using quantitative real-time PCR, the expression of the PR genes under study was estimated in leaves infected by PLRV and PVY in three potato cultivars (Spunta, Diamont and Cara). Infection with both viruses inhibited the expression of the five PR genes. In contrast, leaves infected by PLRV or PVY showed elevated expression of some defense genes. This interaction may also enhance and/or inhibit the expression of some genes responsible for the plant defense mechanisms.
Keywords: PLRV, PVY, PR genes, DD-PCR, qRT-PCR, sequencing
Procedia PDF Downloads 337
5431 A Study on Puzzle-Based Game to Teach Elementary Students to Code
Authors: Jaisoon Baek, Gyuhwan Oh
Abstract:
In this study, we developed a puzzle game based on coding and a web-based management system to observe the user's learning status in real time and maximize elementary students' understanding of coding. We improved upon existing coding games, which cannot be connected to textual coding languages or track the learner's state. We analyzed the syntax of various coding languages for the curriculum and provided a menu to convert icons into textual coding languages. In addition, the management system includes multiple types of tutoring, real-time analysis of user play data, and feedback. Following its application in regular elementary school software classes, students reported positive effects on their understanding of and interest in coding. It is expected that this will contribute to quality improvement in software education by providing content with proven educational value, breaking away from simple learning-oriented coding games.
Keywords: coding education, serious game, coding, education management system
Procedia PDF Downloads 140
5430 Computational Thinking Based Coding Environment for Coding and Free Semester Mathematics Education in Korea
Authors: Han Hyuk Cho, Hanik Jo
Abstract:
In recent years, coding education has been emphasized globally, and the Free Semester System and coding education were introduced to public schools in Korea from the beginning of 2016 and 2018, respectively. With the introduction of the Free Semester System and the rising demand for Computational Thinking (CT) capacity, this paper aims to design a ‘Coding Environment’ and a Minecraft-like Turtlecraft in which learners can design and construct mathematical objects through mathematical symbolic expressions. Students can transfer the constructed mathematical objects to the Turtlecraft environment (open-source codingmath website) and can also print them out with 3D printers. Furthermore, we design a learnable mathematics and coding curriculum by representing figurate numbers and patterns as executable expressions in the coding context and connecting them to algebraic symbols, which will allow students to experience mathematical patterns and symbolic coding expressions.
Keywords: coding education, computational thinking, mathematics education, TurtleMAL and Turtlecraft
Procedia PDF Downloads 205
5429 Effect of Extrusion Processing Parameters on Protein in Banana Flour Extrudates: Characterisation Using Fourier-Transform Infrared Spectroscopy
Authors: Surabhi Pandey, Pavuluri Srinivasa Rao
Abstract:
Extrusion processing is a high-temperature short-time (HTST) treatment which can improve protein quality and digestibility while retaining active nutrients. In-vitro protein digestibility of plant protein-based foods is generally enhanced by extrusion. The current study aimed to investigate the effect of extrusion cooking on in-vitro protein digestibility (IVPD) and conformational modification of protein in green banana flour extrudates. Green banana flour was extruded through a co-rotating twin-screw extruder, varying the moisture content, barrel temperature, and screw speed in the ranges of 10-20 %, 60-80 °C, and 200-300 rpm, respectively, at a constant feed rate. Response surface methodology was used to optimise the result for IVPD. Fourier-transform infrared spectroscopy (FTIR) analysis provided a convenient and powerful means to monitor interactions and changes in functional and conformational properties of extrudates. Results showed that protein digestibility was highest in the extrudate produced at 80 °C, 250 rpm and 15 % feed moisture. FTIR analysis was done for the optimised sample having the highest IVPD. FTIR analysis showed that there were no changes in the primary structure of the protein, while the secondary structure changed. In order to explain this behaviour, infrared spectroscopy analysis was carried out, mainly in the amide I and II regions. Moreover, curve fitting analysis showed the conformational changes produced in the flour due to protein denaturation. The quantitative analysis of the changes in the amide I and II regions provided information about the modifications produced in banana flour extrudates.
Keywords: extrusion, FTIR, protein conformation, raw banana flour, SDS-PAGE method
Procedia PDF Downloads 160
5428 A High Compression Ratio for a Losseless Image Compression Based on the Arithmetic Coding with the Sorted Run Length Coding: Meteosat Second Generation Image Compression
Authors: Cherifi Mehdi, Lahdir Mourad, Ameur Soltane
Abstract:
Image compression is at the heart of several multimedia techniques. It is used to reduce the number of bits required to represent an image. The Meteosat Second Generation (MSG) satellite acquires 12 image files every 15 minutes, which results in large database sizes. In this paper, a novel image compression method for MSG images, based on arithmetic coding with Sorted Run Length Coding (SRLC), is proposed. The SRLC finds the occurrences of consecutive pixels in the original image to create a sorted run. The arithmetic coding then encodes the sorted data from the previous stage into a unique code word that represents a binary code stream in sorted order, boosting the compression ratio. In this article, we show that our method achieves a better compression ratio and bit rate than the method based on Run Length Coding (RLC) and arithmetic coding. Evaluation criteria such as the compression ratio and the bit rate confirm the efficiency of our image compression method.
Keywords: image compression, arithmetic coding, Run Length Coding, RLC, Sorted Run Length Coding, SRLC, Meteosat Second Generation, MSG
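For orientation, plain run-length coding (the RLC baseline the abstract compares against) can be sketched as below. The sorting step that distinguishes SRLC is not specified in the abstract and is not reproduced here, and the function names are assumptions.

```python
from itertools import groupby


def rle_encode(pixels):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out
```

For example, `rle_encode([7, 7, 7, 0, 0, 5])` gives `[(7, 3), (0, 2), (5, 1)]`, and `rle_decode` restores the original list, which is what makes the scheme lossless. An entropy coder such as arithmetic coding would then be applied to the run stream.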
Procedia PDF Downloads 351
5427 Fortification of Concentrated Milk Protein Beverages with Soy Proteins: Impact of Divalent Cations and Heating Treatment on the Physical Stability
Authors: Yichao Liang, Biye Chen, Xiang Li, Steven R. Dimler
Abstract:
This study investigated the effects of adding calcium and magnesium chloride on the heat and storage stability of milk protein concentrate-soy protein isolate (8:2, respectively) mixtures containing 10% w/w total protein subjected to in-container sterilization (115 °C × 15 min). The particle size did not change when emulsions were heated at pH between 6.7 and 7.3, irrespective of the mixed protein ratio. Increasing the concentration of divalent cation salts resulted in an increase in protein particle size, dry sediment formation and sediment height, and a decrease in pH, heat stability and hydration in milk protein concentrate-soy protein isolate mixture solutions on sterilization at 115 °C. Fortification with divalent cation salts in milk protein concentrate-soy protein isolate mixture solutions resulted in accelerated protein sedimentation and two unique sediment regions during accelerated storage stability testing. Moreover, the heat stability decreased upon sterilization at 115 °C, with addition of MgCl₂ causing a greater increase in sedimentation velocity and compressibility than CaCl₂. Increasing the pH of milk protein concentrate-soy protein isolate mixture solutions from 6.7 to 7.2 resulted in an increase in viscosity following the heat treatment. The study demonstrated that the type and concentration of divalent cation salts used strongly impact the heat and storage stability of milk protein concentrate-soy protein isolate mixture nutritional beverages.
Keywords: divalent cation salts, heat stability, milk protein concentrate, soy protein isolate, storage stability
Procedia PDF Downloads 329
5426 Text Mining Techniques for Prioritizing Pathogenic Mutations in Protein Families Known to Misfold or Aggregate
Authors: Khaleel Saleh Al-Rababah
Abstract:
Amyloid fibril forming regions, which are known as protein aggregates, in the sequences of some protein families are associated with a number of diseases known as amyloidosis. Mutations play a role in forming fibrils by accelerating the fibril formation process. In this paper, we aim to extract the diseases that are caused by those mutations as a result of their impact on the structural and functional properties of the aggregated protein. We propose a text mining system to automatically extract mutations, diseases, and the relations between mutations and diseases. We present a finite state-based algorithm to cluster mutations found in the same sentence, since a sentence could contain different mutations that cause different diseases. We also present a co-reference algorithm that enables cross-linking of sentences.
Keywords: amyloid, amyloidosis, co reference, protein, text mining
Procedia PDF Downloads 521
5425 Spread Spectrum with Notch Frequency Using Pulse Coding Method for Switching Converter of Communication Equipment
Authors: Yasunori Kobori, Futoshi Fukaya, Takuya Arafune, Nobukazu Tsukiji, Nobukazu Takai, Haruo Kobayashi
Abstract:
This paper proposes an EMI spread spectrum technique that enables notch frequencies to be set using a pulse coding method for DC-DC switching converters of communication equipment. With the proposed spread spectrum technique using the pulse coding methods, PWM (Pulse Width Modulation) coding or PCM (Pulse Cycle Modulation) coding, the notches in the spectrum of the switching pulses appear at the frequencies obtained from empirically derived equations. This technique would be useful for switching converters in communication equipment that receives standard radio waves, allowing reception without being affected by noise from the switching converters. In our proposed technique, the notch frequencies in the spectrum depend on the pulse coding method. We investigated applying this technique to switching converters and found good agreement between the notch frequencies and the empirical equations. With PWM coding, the notch frequencies are given by the equation F=k/(WL-WS). With PCM coding, they are given by the equation F=k/(TL-TS).
Keywords: notch frequency, pulse coding, spread spectrum, switching converter
Procedia PDF Downloads 369
5424 Medical Image Compression by Region of Interest Based on DT-CWT Using Run-length Coding and Huffman Coding
Authors: Ali Seddiki, Mohamed Djebbouri, Driss Guerchi
Abstract:
Medical imaging produces pictures of the human body in digital form. Since these imaging techniques produce prohibitive amounts of data, compression is necessary for storage and communication purposes. In some areas of medicine, it may be sufficient to maintain high image quality only in the region of interest (ROI). This paper discusses a contribution to quality purpose compression in the region of interest of scintigraphic images based on the dual tree complex wavelet transform (DT-CWT) using Run-Length coding (RLE) and Huffman coding (HC).
Keywords: DT-CWT, region of interest, run length coding, Scintigraphic images
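The Huffman coding stage named in this abstract can be illustrated with a minimal code-table builder. This is a generic sketch, not the authors' pipeline: the wavelet transform and ROI handling are omitted, and the function name and the dictionary-merging construction are choices made here.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix-free code table {symbol: bitstring} from symbol frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Each heap entry is (count, tie_breaker, partial code table); the integer
    # tie-breaker keeps the dictionaries from ever being compared.
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)
        c2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]
```

Frequent symbols receive shorter codes: for the input `"aaaabbc"`, `a` gets a one-bit code and `b` and `c` get two-bit codes. In a run-length plus Huffman scheme, the symbols would typically be the run values or (value, length) pairs.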
Procedia PDF Downloads 280
5423 Lentil Protein Fortification in Cranberry Squash
Authors: Sandhya Devi A
Abstract:
The protein content of cranberry squash (protein: 0 g) may be increased by adding protein extracted from lentils (9 g), which is particularly linked to a lower risk of developing heart disease. Protein may be extracted from lentil flour using alkaline extraction. Alkaline extraction of protein from lentil flour was optimized using a response surface approach in order to maximize both protein content and yield. The squash may be fortified by preparing a protein fortification syrup and processing it into the squash.
Keywords: alkaline extraction, cranberry squash, protein fortification, response surface methodology
Procedia PDF Downloads 107
5422 Functional Variants Detection by RNAseq
Authors: Raffaele A. Calogero
Abstract:
RNAseq represents an attractive methodology for the detection of functional genomic variants. RNAseq results obtained from a polyA+ RNA selection protocol (POLYA) and from an exonic region capturing protocol (ACCESS) indicate that ACCESS detects 10% more coding SNVs/INDELs than POLYA. ACCESS also requires fewer reads than POLYA for coding SNV detection. However, if the analysis also aims at identifying SNVs/INDELs in the 5’ and 3’ UTRs, POLYA is definitely the preferred method. Neither ACCESS nor POLYA offers a particular advantage in the detection of fusion transcripts.
Keywords: fusion transcripts, INDEL, RNA-seq, WES, SNV
Procedia PDF Downloads 284
5421 Improved Performance Using Adaptive Pre-Coding in the Cellular Network
Authors: Yong-Jun Kim, Jae-Hyun Ro, Chang-Bin Ha, Hyoung-Kyu Song
Abstract:
This paper proposes a cooperative transmission scheme with pre-coding, because cellular communication requires high reliability. The cooperative transmission scheme uses a pre-coding method with limited feedback information among small cells. In particular, the proposed scheme has an adaptive mode that depends on the position of the mobile station. The scheme thus addresses the demands of recent wireless communication. Simulation results show that the proposed scheme performs better than the conventional scheme in the cellular network.
Keywords: CDD, cellular network, pre-coding, SPC
Procedia PDF Downloads 567
5420 Usage of Channel Coding Techniques for Peak-to-Average Power Ratio Reduction in Visible Light Communications Systems
Authors: P. L. D. N. M. de Silva, S. G. Edirisinghe, R. Weerasuriya
Abstract:
High peak-to-average power ratio (PAPR) is a concern in orthogonal frequency division multiplexing (OFDM) based visible light communication (VLC) systems. Discrete Fourier Transform spread (DFT-s) OFDM is an alternative single carrier modulation scheme which would address this concern. Employing channel coding techniques is another mechanism to reduce the PAPR. Previous research has studied the impact of these techniques separately. However, to the best of the authors' knowledge, no study has so far identified the improvement that can be gained by hybridizing these two techniques for VLC systems. Therefore, this is a novel study area under this research. In addition, channel coding techniques such as Polar codes and Turbo codes have been tested in the VLC domain. However, other efficient techniques such as Hamming coding and Convolutional coding have not been studied either. Therefore, the authors present the impact of the hybrid of DFT-s OFDM and channel coding (Hamming coding and Convolutional coding) on PAPR in VLC systems using Matlab simulations.
Keywords: convolutional coding, discrete Fourier transform spread orthogonal frequency division multiplexing, hamming coding, peak-to-average power ratio, visible light communications
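To make the PAPR metric concrete, here is a small sketch that converts one block of frequency-domain symbols to the time domain with a naive inverse DFT and reports the peak-to-average power ratio in dB. It is written in Python rather than the Matlab the authors used, it models neither the DFT-spreading nor the channel coding, and it ignores the real-valued (Hermitian-symmetric) signalling that VLC systems typically require. The function names are assumptions.

```python
import cmath
import math


def idft(X):
    """Naive O(N^2) inverse DFT; adequate for a small illustrative block."""
    n = len(X)
    return [
        sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def papr_db(x):
    """Peak-to-average power ratio of a time-domain block, in dB."""
    powers = [abs(v) ** 2 for v in x]
    return 10.0 * math.log10(max(powers) / (sum(powers) / len(powers)))
```

Loading all eight subcarriers with the same symbol produces a single time-domain spike and the worst case of 10·log10(8) dB, while a single active subcarrier gives a constant envelope and 0 dB.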
Procedia PDF Downloads 152
5419 Hydration of Protein-RNA Recognition Sites
Authors: Amita Barik, Ranjit Prasad Bahadur
Abstract:
We investigate the role of water molecules in 89 protein-RNA complexes taken from the Protein Data Bank. Those with tRNA and single-stranded RNA are less hydrated than with duplex or ribosomal proteins. Protein-RNA interfaces are hydrated less than protein-DNA interfaces, but more than protein-protein interfaces. The majority of the waters at protein-RNA interfaces make multiple H-bonds; however, a fraction make none. Those making H-bonds prefer the polar groups of the RNA over those of its partner protein. The spatial distribution of waters makes interfaces with ribosomal proteins and single-stranded RNA relatively ‘dry’ compared with interfaces with tRNA and duplex RNA. In contrast to protein-DNA interfaces, mainly due to the presence of the 2’OH, the ribose in protein-RNA interfaces is hydrated more than the phosphate or the bases. The minor groove in protein-RNA interfaces is hydrated more than the major groove, while in protein-DNA interfaces the reverse holds. The strands make the highest number of water-mediated H-bonds per unit interface area, followed by the helices and the non-regular structures. The preserved waters at protein-RNA interfaces make a higher number of H-bonds than the other waters. Preserved waters contribute toward the affinity in protein-RNA recognition and should be treated carefully when engineering protein-RNA interfaces.
Keywords: h-bonds, minor-major grooves, preserved water, protein-RNA interfaces
Procedia PDF Downloads 300
5418 Protein Crystallization Induced by Surface Plasmon Resonance
Authors: Tetsuo Okutsu
Abstract:
We have developed a crystallization plate with the function of promoting protein crystallization. A gold thin film is deposited on the crystallization plate. A protein solution is dropped thereon, and crystallization is promoted when the protein is irradiated with light of a wavelength that protein does not absorb. Protein is densely adsorbed on the gold thin film surface. The light excites the surface plasmon resonance of the gold thin film, the protein is excited by the generated enhanced electric field induced by surface plasmon resonance, and the amino acid residues are radicalized to produce protein dimers. The dimers function as templates for protein crystals, crystallization is promoted.Keywords: lysozyme, plasmon, protein, crystallization, RNaseA
Procedia PDF Downloads 216
5417 Image Steganography Using Predictive Coding for Secure Transmission
Authors: Baljit Singh Khehra, Jagreeti Kaur
Abstract:
In this paper, a steganographic strategy is used to hide a text file inside an image. To increase the storage capacity, predictive coding is used to embed the information. With the proposed scheme, secure information can be exchanged by means of predictive coding. The predictive coding produces a high-quality stego-image, and the secret information is embedded in its pixels. The proposed information-hiding scheme is effective compared with existing methods and lets users conceal information efficiently. Entropy, standard deviation, mean square error, and peak signal-to-noise ratio are the parameters used to evaluate the proposed method. The results of the proposed approach are quite promising.
Keywords: cryptography, steganography, reversible image, predictive coding
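Two of the evaluation parameters named in this abstract, mean square error and peak signal-to-noise ratio, can be sketched as follows. This is a minimal illustration only: the tiny 2x2 "images" are made-up values, the 8-bit peak of 255 is an assumption, and nothing here reproduces the authors' embedding scheme.

```python
import math

def mse(cover, stego):
    """Mean square error between two equally sized grayscale images (lists of rows)."""
    total = 0
    count = 0
    for row_c, row_s in zip(cover, stego):
        for c, s in zip(row_c, row_s):
            total += (c - s) ** 2
            count += 1
    return total / count

def psnr(cover, stego, peak=255):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit pixels."""
    err = mse(cover, stego)
    if err == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / err)

# Hypothetical cover image and a stego-image that differs in two pixels by 1.
cover = [[100, 101], [102, 103]]
stego = [[100, 100], [103, 103]]

print(mse(cover, stego))
print(round(psnr(cover, stego), 2))
```

A lower MSE and a higher PSNR both indicate that the stego-image is closer to the cover image.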
Procedia PDF Downloads 413
5416 Chemical Synthesis of a cDNA and Its Expression Analysis
Authors: Salman Akrokayan
Abstract:
Synthetic cDNA (ScDNA) of granulocyte colony-stimulating factor (G-CSF) was constructed using a DNA synthesizer with the aim of increasing its expression level. The 5' end of the G-CSF coding region in the ScDNA was modified by decreasing the GC content without altering the predicted amino acid sequence. The identity of the protein produced from the ScDNA was confirmed by a highly specific enzyme-linked immunosorbent assay. In conclusion, a synthetic G-CSF cDNA in combination with the recombinant DNA protocol offers a rapid and reliable strategy for synthesizing the target protein. However, the commercial utilization of this methodology requires rigorous validation and quality control.
Keywords: synthetic cDNA, recombinant G-CSF, cloning, gene expression
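The idea of lowering GC content while keeping the predicted amino acid sequence unchanged can be illustrated with synonymous codons. The sketch below is an assumption-laden toy: the two 9-nucleotide sequences are hypothetical and are not taken from G-CSF, and the codon table covers only the six codons used here.

```python
# Partial standard genetic code: only the codons used in this toy example.
CODON_TO_AA = {
    "GCC": "A", "GCT": "A",  # alanine
    "CCG": "P", "CCA": "P",  # proline
    "CTG": "L", "TTA": "L",  # leucine
}

def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)

def translate(seq):
    """Translate a coding sequence codon by codon using CODON_TO_AA."""
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, len(seq), 3))

# Hypothetical sequences: the second uses synonymous codons with fewer G/C bases.
original = "GCCCCGCTG"
modified = "GCTCCATTA"

print(translate(original), round(gc_content(original), 3))
print(translate(modified), round(gc_content(modified), 3))
```

Both sequences translate to the same three residues, while the GC fraction drops from 8/9 to 4/9.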
Procedia PDF Downloads 283
5415 Adaptive Multiple Transforms Hardware Architecture for Versatile Video Coding
Authors: T. Damak, S. Houidi, M. A. Ben Ayed, N. Masmoudi
Abstract:
The Versatile Video Coding (VVC) standard is currently under development by the Joint Video Exploration Team (JVET). An Adaptive Multiple Transforms (AMT) approach was announced. It is based on different transform modules that provide efficient coding. However, the AMT solution raises several issues, especially regarding the complexity of the selected set of transforms. This can be an important issue, particularly for future industrial adoption. This paper proposes an efficient hardware implementation of the most used transform in the AMT approach: the DCT II. The developed circuit is adapted to different block sizes and can reach a minimum frequency of 192 MHz, allowing an optimized execution time.
Keywords: adaptive multiple transforms, AMT, DCT II, hardware, transform, versatile video coding, VVC
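For reference, the DCT II named in this abstract can be written directly from its textbook definition. The sketch below is an unnormalized floating-point version for a 1-D block; it is an assumption for illustration and does not represent the integer approximations used in video codecs or the hardware architecture the paper proposes.

```python
import math

def dct_ii(x):
    """Unnormalized 1-D DCT-II: X[k] = sum_i x[i] * cos(pi/N * (i + 0.5) * k)."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
        for k in range(n)
    ]

# A constant block puts all its energy into the k = 0 (DC) coefficient.
block = [1.0, 1.0, 1.0, 1.0]
coeffs = dct_ii(block)
print([round(c, 6) for c in coeffs])
```

This direct form costs O(N^2) operations per block, which is one reason practical implementations rely on fast factorizations.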
Procedia PDF Downloads 145
5414 In silico Comparative Analysis of Chloroplast Genome (cpDNA) and Some Individual Genes (rbcL and trnH-psbA) in Pooideae Subfamily Members
Authors: Ibrahim Ilker Ozyigit, Ertugrul Filiz, Ilhan Dogan
Abstract:
An in silico analysis of Brachypodium distachyon, Triticum aestivum, Festuca arundinacea, Lolium perenne, and Hordeum vulgare subsp. vulgare of the Pooideae was performed based on complete chloroplast genomes, and on the rbcL coding and trnH-psbA intergenic spacer regions alone, to compare phylogenetic resolving power. Neighbor-joining, Minimum Evolution, and Unweighted Pair Group Method with Arithmetic Mean (UPGMA) methods were used to reconstruct phylogenies; the highest bootstrap support was obtained with data from the whole chloroplast genome sequence. The highest and lowest values from nucleotide diversity (π) analysis were found to be 0.315813 and 0.043495 in the rbcL coding region and the complete chloroplast genome, respectively. The highest transition/transversion bias (R) value was recorded as 1.384 in complete chloroplast genomes. The F. arundinacea-L. perenne clade was recovered in all phylogenies. Sequences of the rbcL and trnH-psbA regions were not able to resolve the Pooideae phylogenies due to lack of genetic variation.
Keywords: chloroplast DNA, Pooideae, phylogenetic analysis, rbcL, trnH-psbA
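Nucleotide diversity (π) is commonly computed as the average number of pairwise differences per site across aligned sequences. The sketch below shows that calculation together with a raw count of transitions and transversions. The three 4-base sequences are hypothetical, equal sampling of every sequence is assumed, and the raw counts are a simpler quantity than the model-based bias R reported in the abstract.

```python
from itertools import combinations

PURINES = {"A", "G"}

def nucleotide_diversity(seqs):
    """Average pairwise differences per site over all pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    length = len(seqs[0])
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / (len(pairs) * length)

def count_ts_tv(seqs):
    """Count transitions (purine<->purine, pyrimidine<->pyrimidine) and transversions."""
    ts = tv = 0
    for s1, s2 in combinations(seqs, 2):
        for a, b in zip(s1, s2):
            if a == b:
                continue
            if (a in PURINES) == (b in PURINES):
                ts += 1
            else:
                tv += 1
    return ts, tv

# Hypothetical aligned sequences of equal length, with no gaps.
aligned = ["ACGT", "ACGA", "GCGT"]
pi = nucleotide_diversity(aligned)
ts, tv = count_ts_tv(aligned)
print(round(pi, 4), ts, tv)
```

With these toy sequences there are four differences over three pairs of four sites, so π is 1/3.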
Procedia PDF Downloads 374
5413 Scintigraphic Image Coding of Region of Interest Based on SPIHT Algorithm Using Global Thresholding and Huffman Coding
Authors: A. Seddiki, M. Djebbouri, D. Guerchi
Abstract:
Medical imaging produces pictures of the human body in digital form. Since these imaging techniques produce prohibitive amounts of data, compression is necessary for storage and communication purposes. Many current compression schemes provide a very high compression rate but with considerable loss of quality. On the other hand, in some areas of medicine, it may be sufficient to maintain high image quality only in the region of interest (ROI). This paper presents a contribution to lossless compression in the region of interest of scintigraphic images, based on the SPIHT algorithm and global transform thresholding using Huffman coding.
Keywords: global thresholding transform, huffman coding, region of interest, SPIHT coding, scintigraphic images
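The Huffman coding stage mentioned in this abstract can be sketched on its own. The example below builds a prefix code from symbol frequencies with a binary heap; the input list stands in for hypothetical quantized coefficients, and the SPIHT and thresholding stages are not reproduced.

```python
import heapq
from collections import Counter

def huffman_codes(symbols):
    """Return a dict mapping each symbol to its Huffman bit string."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, unique tie-breaker, partial code table).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, codes1 = heapq.heappop(heap)
        c2, _, codes2 = heapq.heappop(heap)
        # Prefix the two least frequent subtrees with 0 and 1 and merge them.
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    return heap[0][2]

# Hypothetical quantized values; frequent symbols receive shorter codes.
data = [0, 0, 0, 0, 1, 1, 2, 3]
codes = huffman_codes(data)
encoded = "".join(codes[s] for s in data)
print(codes, len(encoded))
```

Because no code word is a prefix of another, the bit string can be decoded without separators, which is what makes this stage lossless.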
Procedia PDF Downloads 366
5412 Protein Remote Homology Detection and Fold Recognition by Combining Profiles with Kernel Methods
Authors: Bin Liu
Abstract:
Protein remote homology detection and fold recognition are two of the most important tasks in protein sequence analysis, and both are critical for protein structure and function studies. In this study, we combined profile-based features with various string kernels and constructed several computational predictors for protein remote homology detection and fold recognition. Experimental results on two widely used benchmark datasets showed that these methods outperformed the competing methods, indicating that these predictors are useful computational tools for protein sequence analysis. By analyzing the discriminative features of the trained models, some interesting patterns were discovered that reflect the characteristics of protein superfamilies and folds and are important for researchers interested in finding the patterns of protein folds.
Keywords: protein remote homology detection, protein fold recognition, profile-based features, Support Vector Machines (SVMs)
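One well-known string kernel is the spectrum kernel, which compares two sequences through the counts of their shared k-mers. The sketch below is an assumption chosen for illustration: the abstract does not say which string kernels were used, and the profile-based features it describes are not modeled here. The two peptide strings are hypothetical.

```python
from collections import Counter

def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def spectrum_kernel(a, b, k=2):
    """Inner product of the k-mer count vectors of sequences a and b."""
    counts_a = kmer_counts(a, k)
    counts_b = kmer_counts(b, k)
    return sum(counts_a[kmer] * counts_b[kmer] for kmer in counts_a)

# Hypothetical peptide fragments sharing the 2-mers "KV" and "VL".
seq_a = "MKVLMKV"
seq_b = "KVLAKV"
print(spectrum_kernel(seq_a, seq_b, k=2))
```

A kernel value like this can be supplied to a support vector machine as a similarity measure between sequences.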
Procedia PDF Downloads 160
5411 A Protein-Wave Alignment Tool for Frequency Related Homologies Identification in Polypeptide Sequences
Authors: Victor Prevost, Solene Landerneau, Michel Duhamel, Joel Sternheimer, Olivier Gallet, Pedro Ferrandiz, Marwa Mokni
Abstract:
The search for homologous proteins is one of the ongoing challenges in biology and bioinformatics. Traditionally, a pair of proteins is thought to be homologous when they originate from the same ancestral protein. In such a case, their sequences share similarities, and considerable research effort is spent investigating this question. On this basis, we propose the Protein-Wave Alignment Tool ("P-WAT"), developed within the framework of the France Relance 2030 plan. Our work takes into consideration the mass-related wave aspect of protein biosynthesis by associating a specific frequency with each amino acid according to its mass. Amino acids are then grouped by mass category. In this way, our algorithm produces specific alignments in addition to those obtained with a common amino acid coding system. For this purpose, we develop the original "P-WAT" algorithm, which is able to address large protein databases with different attributes, such as species and protein names, that allow us to align users' requests with a set of specific protein sequences. The primary intent of this algorithm is to achieve efficient alignments, in this specific conceptual frame, by minimizing execution costs and information loss. Our algorithm identifies sequence similarities by searching for matches of sub-sequences of different sizes, referred to as primers. It relies on Boolean operations upon a dot plot matrix to identify primer amino acids common to both proteins that are likely to be part of a significant alignment of peptides. From those primers, dynamic programming-like traceback operations generate alignments and alignment scores based on an adjusted PAM250 matrix.
Keywords: protein, alignment, homologous, Genodic
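The dot plot step described above can be illustrated with a generic sketch: build a Boolean matrix marking identical residues, then scan its diagonals for runs that could serve as seeds. This is not the P-WAT implementation. It matches residues by plain identity rather than by mass category, it omits the traceback and PAM250 scoring, and the function names, the `min_len` threshold, and the two peptide strings are all hypothetical.

```python
def dot_plot(a, b):
    """Boolean matrix with True where residue a[i] equals residue b[j]."""
    return [[x == y for y in b] for x in a]

def diagonal_runs(matrix, min_len=3):
    """Return (start_i, start_j, length) for maximal diagonal runs of True."""
    runs = []
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    for i in range(rows):
        for j in range(cols):
            if not matrix[i][j]:
                continue
            # Skip cells that continue a run started further up the diagonal.
            if i > 0 and j > 0 and matrix[i - 1][j - 1]:
                continue
            length = 0
            while (i + length < rows and j + length < cols
                   and matrix[i + length][j + length]):
                length += 1
            if length >= min_len:
                runs.append((i, j, length))
    return runs

# Hypothetical peptides sharing the sub-sequence "KTAY".
seq_a = "MKTAYIAK"
seq_b = "GGKTAYQ"
runs = diagonal_runs(dot_plot(seq_a, seq_b), min_len=3)
print(runs)
```

Each reported run gives the start position in both sequences and its length, which is the information a later extension or traceback step would need.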
Procedia PDF Downloads 112
5410 Computational Prediction of the Effect of S477N Mutation on the RBD Binding Affinity and Structural Characteristic, A Molecular Dynamics Study
Authors: Mohammad Hossein Modarressi, Mozhgan Mondeali, Khabat Barkhordari, Ali Etemadi
Abstract:
The COVID-19 pandemic, caused by SARS-CoV-2, has led to significant concerns worldwide due to its catastrophic effects on public health. SARS-CoV-2 infection is initiated by the binding of the receptor-binding domain (RBD) in its spike protein to the ACE2 receptor in the host cell membrane. Due to the error-prone nature of the viral RNA-dependent polymerase complex, the virus genome, including the coding region for the RBD, acquires new mutations, leading to the appearance of multiple variants. These variants can potentially impact transmission, virulence, antigenicity, and immune evasion properties. The S477N mutation, located in the RBD, has been observed in the SARS-CoV-2 omicron (B.1.1.529) variant. In this study, we investigated the consequences of the S477N mutation at the molecular level using computational approaches such as molecular dynamics simulation, protein-protein interaction analysis, immunoinformatics, and free energy computation. We showed that replacement of Ser with Asn increases the stability of the spike protein and its affinity to ACE2 and thus increases the transmission potential of the virus. This mutation changes the folding and secondary structure of the spike protein. It also reduces antibody neutralization, raising concern about re-infection, vaccine breakthrough, and therapeutic values.
Keywords: S477N, COVID-19, molecular dynamic, SARS-COV2 mutations
Procedia PDF Downloads 172
5409 Network Coding with Buffer Scheme in Multicast for Broadband Wireless Network
Authors: Gunasekaran Raja, Ramkumar Jayaraman, Rajakumar Arul, Kottilingam Kottursamy
Abstract:
Broadband Wireless Network (BWN) is a promising technology nowadays due to the increased number of smartphones. A buffering scheme using network coding considers the reliability and proper degree distribution in a Worldwide Interoperability for Microwave Access (WiMAX) multi-hop network. Network coding provides a secure way of transmission, which helps improve throughput and reduce packet loss in the multicast network. At the outset, improved network coding is proposed for the multicast wireless mesh network. To address the problem of performance overhead, the degree distribution guides the buffering decisions made in the encoding/decoding process. Consequently, BuS (Buffer Scheme) based on network coding is proposed for the multi-hop network. Here, the encoding process introduces a buffer for temporary storage to transmit packets with proper degree distribution. The simulation results depend on the number of packets received in the encoding/decoding process with proper degree distribution using the buffering scheme.
Keywords: encoding and decoding, buffer, network coding, degree distribution, broadband wireless networks, multicast
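The basic idea behind network coding, combining packets at a relay so that one transmission serves several receivers, can be sketched with a bytewise XOR. This is the classic textbook illustration and an assumption on our part: it does not implement the BuS buffer scheme or any degree distribution, and the packet contents are made up.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length packets."""
    if len(a) != len(b):
        raise ValueError("packets must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))

# Hypothetical packets held in a relay's buffer.
packet_1 = b"HELLO"
packet_2 = b"WORLD"

# The relay sends a single coded packet instead of two separate ones.
coded = xor_bytes(packet_1, packet_2)

# A receiver that already holds one packet recovers the other from the coded one.
recovered_2 = xor_bytes(coded, packet_1)
recovered_1 = xor_bytes(coded, packet_2)
print(recovered_1, recovered_2)
```

Decoding works because XOR is its own inverse, so a receiver needs only the coded packet and one of the originals.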
Procedia PDF Downloads 408