Search results for: pseudo-random sequence
1146 Sequence Analysis and Structural Implications of Rotavirus Capsid Proteins
Authors: Nishal Parbhoo, John B. Dewar, Samantha Gildenhuys
Abstract:
Rotavirus is the major cause of severe gastroenteritis worldwide in children aged 5 and younger. Death rates are high particularly in developing countries. The mature rotavirus is a non-enveloped triple-layered nucleocapsid containing 11 double-stranded RNA segments. Here a global view on the sequence and structure of the three main capsid proteins, VP7, VP6, and VP2 is taken by generating a consensus sequence for each of these rotavirus proteins, for each species obtained from published data of representative rotavirus genotypes from across the world and across species. The degree of conservation between species was represented on homology models for each of the proteins. VP7 shows the highest level of variation with 14 - 45 amino acids showing conservation of less than 60%. These changes are localized to the outer surface which is exposed to antibodies alluding to a possible mechanism in evading the immune system. The middle layer, VP6 shows lower variability with only 14-32 sites having lower than 70% conservation. The inner structural layer made up of VP2 showed the lowest variability with only 1-16 sites having less than 70% conservation across species. The results correlate with proteins’ multiple structural roles. Although the nucleotide sequences vary due to an error-prone replication and lack of proofreading, the corresponding amino acid sequence of VP2, 6 and 7 remains conserved. Sequence conservation maintained for the virus results in stable protein structures, fit for function. This can be exploited in drug design, molecular studies and biotechnological applications.Keywords: amino acid sequence conservation, capsid protein, protein structure, vaccine candidate
Procedia PDF Downloads 290
1145 The Influence of Music Education and the Order of Sounds on the Grouping of Sounds into Sequences of Six Tones
Authors: Adam Rosiński
Abstract:
This paper discusses an experiment conducted with two groups of participants, composed of musicians and non-musicians, in order to investigate the impact of the speed of a sound sequence and the order of sounds on the grouping of sounds into sequences of six tones. Significant differences were observed between musicians and non-musicians with respect to the threshold sequence speed at which the sequence was split into two streams. The differences in the results for the two groups suggest that the musical education of the participating listeners may be a vital factor. The criterion of musical education should be taken into account during experiments so that the results obtained are reliable, uniform, and free from interpretive errors.
Keywords: auditory scene analysis, education, hearing, psychoacoustics
Procedia PDF Downloads 102
1144 Radio Frequency Identification Encryption via Modified Two Dimensional Logistic Map
Authors: Hongmin Deng, Qionghua Wang
Abstract:
A modified two-dimensional (2D) logistic map based on cross feedback control is proposed. In statistical characteristics analysis, this 2D map exhibits more random chaotic dynamical properties than the classic one-dimensional (1D) logistic map. It is therefore used as a pseudo-random (PN) sequence generator: the real-valued PN sequence it produces is first quantized and then applied to a radio frequency identification (RFID) communication system. The system is experimentally validated on a Cortex-M0 development board, which shows its effectiveness in terms of key generation, key space size, and security. Finally, further cryptanalysis is carried out using the National Institute of Standards and Technology (NIST) test suite.
Keywords: chaos encryption, logistic map, pseudo-random sequence, RFID
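The quantization step described in this abstract can be illustrated with a minimal Python sketch. It uses the classic 1D logistic map x(n+1) = r·x(n)·(1 − x(n)), not the authors' modified 2D map, whose equations are not given here. The function name `logistic_bits`, the parameter r = 3.99, the burn-in length, and the 0.5 threshold are all assumptions chosen for illustration only.

```python
def logistic_bits(x0: float, n_bits: int, r: float = 3.99, burn_in: int = 100) -> list[int]:
    """Generate a bit sequence by thresholding the classic 1D logistic map.

    Illustrative only: this is NOT the modified 2D map from the abstract,
    and it is not suitable for real cryptographic use.
    """
    if not 0.0 < x0 < 1.0:
        raise ValueError("x0 must lie strictly between 0 and 1")
    x = x0
    # Discard the first iterations so the output depends less visibly on x0.
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    bits = []
    for _ in range(n_bits):
        x = r * x * (1.0 - x)            # one step of the logistic map
        bits.append(1 if x >= 0.5 else 0)  # quantize the real value to one bit
    return bits


if __name__ == "__main__":
    print(logistic_bits(0.3, 32))
```

The initial value `x0` plays the role of the key in this sketch: the same `x0` always reproduces the same bit sequence.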
Procedia PDF Downloads 401
1143 A Study of Environmental Test Sequences for Electrical Units
Authors: Jung Ho Yang, Yong Soo Kim
Abstract:
Electrical units are operated by electrical and electronic components. An environmental test sequence is useful for testing electrical units to reduce reliability issues. This study introduces test sequence guidelines based on relevant principles and considerations for electronic testing according to international standard IEC 60068-1 and the United States military standard MIL-STD-810G. Then, test sequences were proposed based on the descriptions for each test. Finally, General Motors (GM) specification GMW3172 was interpreted and compared to IEC 60068-1 and MIL-STD-810G.
Keywords: reliability, environmental test sequence, electrical units, IEC 60068-1, MIL-STD-810G
Procedia PDF Downloads 504
1142 Finding the Longest Common Subsequence in Normal DNA and Disease Affected Human DNA Using Self Organizing Map
Authors: G. Tamilpavai, C. Vishnuppriya
Abstract:
Bioinformatics is an active research area that combines biology and computer science. Finding the longest common subsequence (LCSS) is one of the major challenges in various bioinformatics applications. Computing the LCSS plays a vital role in biomedicine and is an essential task in DNA sequence analysis in genetics, where it supports a wide range of disease-diagnosis steps. The objective of the proposed system is to find the longest common subsequence present in normal and various disease-affected human DNA sequences using a Self Organizing Map (SOM) and LCSS. The human DNA sequences are collected from the National Center for Biotechnology Information (NCBI) database. Initially, each human DNA sequence is separated into k-mers using a k-mer separation rule. Mean and median values are calculated from each separated k-mer. These values are fed as input to the Self Organizing Map for clustering. The resulting clusters are then given to the Longest Common Subsequence (LCSS) algorithm to find the common subsequences present in every cluster. It returns n(n-1)/2 subsequences for each cluster, where n is the number of k-mers in that cluster. Experimental outcomes of the proposed system give the possible longest common subsequences of normal and disease-affected DNA data. The proposed system is thus a good initial aid for finding disease-causing sequences. Finally, performance analysis is carried out for different DNA sequences. The obtained values show that the LCSS is retrieved in a shorter time than with the existing system.
Keywords: clustering, k-mers, longest common subsequence, SOM
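The pairwise LCSS step can be sketched in Python with the textbook dynamic-programming algorithm. This is a generic illustration, not the authors' implementation: the SOM clustering stage is omitted, and the function names `lcs` and `pairwise_lcs` are hypothetical. It also shows where the n(n-1)/2 count comes from, since each unordered pair of k-mers in a cluster yields one subsequence.

```python
from itertools import combinations


def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b (standard DP)."""
    m, n = len(a), len(b)
    # table[i][j] = LCS length of the prefixes a[:i] and b[:j]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Walk back from the bottom-right corner to recover the subsequence.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def pairwise_lcs(kmers: list[str]) -> list[tuple[str, str, str]]:
    """LCS for every unordered pair in a cluster: n(n-1)/2 results."""
    return [(x, y, lcs(x, y)) for x, y in combinations(kmers, 2)]


if __name__ == "__main__":
    for x, y, common in pairwise_lcs(["ACGT", "ACGA", "TCGT", "GGGG"]):
        print(x, y, common)
```

The table costs O(mn) time and memory per pair, so for a cluster of n k-mers the total work grows quadratically in n.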
Procedia PDF Downloads 267
1141 Increase in Specificity of MicroRNA Detection by RT-qPCR Assay Using a Specific Extension Sequence
Authors: Kyung Jin Kim, Jiwon Kwak, Jae-Hoon Lee, Soo Suk Lee
Abstract:
We describe an innovative method for highly specific detection of miRNAs using a specially modified poly(A) adaptor RT-qPCR method. We use a uniquely designed specific extension sequence, which plays an important role in achieving high specificity of miRNA detection. As in previously reported methods, this method involves two reaction steps: poly(A) tailing and reverse transcription, followed by real-time PCR. First, miRNAs are extended by a poly(A) tailing reaction and then converted into cDNA. Here, we markedly reduced the reaction time by using a short poly(T) adaptor. Next, the cDNA is hybridized to the 3’-end of a specific extension sequence that contains the miRNA sequence, producing a novel PCR template. Thereafter, SYBR Green-based RT-qPCR proceeds with a universal poly(T) adaptor forward primer and a universal reverse primer. The target miRNA, miR-106b in human brain total RNA, could be detected quantitatively over seven orders of magnitude, demonstrating that the assay has a dynamic range of at least 7 logs. In addition, the better specificity of this novel extension-based assay compared with the well-known poly(A) tailing method for miRNA detection was confirmed by melt curve analysis of the real-time PCR product, clear gel electrophoresis, and sequence chromatogram images of the amplified DNAs.
Keywords: microRNA (miRNA), specific extension sequence, RT-qPCR, poly(A) tailing assay, reverse transcription
Procedia PDF Downloads 308
1140 End-to-End Spanish-English Sequence Learning Translation Model
Authors: Vidhu Mitha Goutham, Ruma Mukherjee
Abstract:
The low availability of well-trained, unlimited, dynamic-access models for specific languages makes it hard for corporate users to adopt quick translation techniques and incorporate them into product solutions. While translation tasks increasingly require a dynamic sequence learning curve, stable, cost-free open-source models are scarce. We survey and compare current translation techniques and propose a modified sequence-to-sequence model repurposed with attention techniques. Sequence learning using an encoder-decoder model is now paving the way for higher precision levels in translation. Using a Convolutional Neural Network (CNN) encoder and a Recurrent Neural Network (RNN) decoder background, we use Fairseq tools to produce an end-to-end bilingually trained Spanish-English machine translation model including source language detection. We obtain competitive results using a duo-lingo-corpus trained model that provides prospective, ready-made plug-in use for compound sentences and document translations. Our model serves as a decent system for large, organizational data translation needs. While acknowledging its shortcomings and future scope, it also identifies itself as a well-optimized deep neural network model and solution.
Keywords: attention, encoder-decoder, Fairseq, Seq2Seq, Spanish, translation
Procedia PDF Downloads 175
1139 Project Design Deliverables Sequence (PDD)
Authors: Nahed Al-Hajeri
Abstract:
Several factors can delay project completion; one of the main ones is delay in deliverable processing, i.e., the submission and review of documents. Most project cycles start with a list of deliverables but without a sequence for submitting them, and therefore without a direction to move in, which leads to overlapping activities and more interdependencies. Project Design Deliverables (PDD) was therefore developed as a solution for organizing transmittals (documents/drawings) received from contractors/consultants during the different phases of an EPC (Engineering, Procurement, and Construction) project. It gives stakeholders proper direction from the beginning in order to reduce inter-discipline dependency, avoid overlapping activities, and provide a list of deliverables, a sequence of activities, etc. PDD attempts to provide a list and sequencing of the engineering documents/drawings required during the different phases of a project, which benefits both client and contractor in performing planned activities through timely submission and review of deliverables. This helps to ensure improved quality and on-time completion of the project. Successful implementation begins with a detailed understanding of the specific challenges and requirements of the project. PDD helps in learning about vendor document submissions, including the general workflow and sequence, and in monitoring the submission and review of deliverables from the early stages of the project. It provides an overview for the submission of deliverables, in proper sequence, by those concerned during the project. A further goal of PDD is to hold all stakeholders responsible and accountable throughout the complete project cycle.
We believe that successful implementation of PDD, with a detailed list of documents and their sequence, will help organizations achieve the project target.
Keywords: EPC (Engineering, Procurement, and Construction), project design deliverables (PDD), econometrics sciences, management sciences
Procedia PDF Downloads 400
1138 An Industrial Steady State Sequence Disorder Model for Flow Controlled Multi-Input Single-Output Queues in Manufacturing Systems
Authors: Anthony John Walker, Glen Bright
Abstract:
The challenge faced by manufactures, when producing custom products, is that each product needs exact components. This can cause work-in-process instability due to component matching constraints imposed on assembly cells. Clearing type flow control policies have been used extensively in mediating server access between multiple arrival processes. Although the stability and performance of clearing policies has been well formulated and studied in the literature, the growth in arrival to departure sequence disorder for each arriving job, across a serving resource, is still an area for further analysis. In this paper, a closed form industrial model has been formulated that characterizes arrival-to-departure sequence disorder through stable manufacturing systems under clearing type flow control policy. Specifically addressed are the effects of sequence disorder imposed on a downstream assembly cell in terms of work-in-process instability induced through component matching constraints. Results from a simulated manufacturing system show that steady state average sequence disorder in parallel upstream processing cells can be balanced in order to decrease downstream assembly system instability. Simulation results also show that the closed form model accurately describes the growth and limiting behavior of average sequence disorder between parts arriving and departing from a manufacturing system flow controlled via clearing policy.Keywords: assembly system constraint, custom products, discrete sequence disorder, flow control
Procedia PDF Downloads 178
1137 Clastic Sequence Stratigraphy of Late Jurassic to Early Cretaceous Formations of Jaisalmer Basin, Rajasthan
Authors: Himanshu Kumar Gupta
Abstract:
The Jaisalmer Basin is one of the parts of the Rajasthan basin in northwestern India. The presence of five major unconformities/hiatuses of varying span i.e. at the top of Archean basement, Cambrian, Jurassic, Cretaceous, and Eocene have created the foundation for constructing a sequence stratigraphic framework. Based on basin formative tectonic events and their impact on sedimentation processes three first-order sequences have been identified in Rajasthan Basin. These are Proterozoic-Early Cambrian rift sequence, Permian to Middle-Late Eocene shelf sequence and Pleistocene - Recent sequence related to Himalayan Orogeny. The Permian to Middle Eocene I order sequence is further subdivided into three-second order sequences i.e. Permian to Late Jurassic II order sequence, Early to Late Cretaceous II order sequence and Paleocene to Middle-Late Eocene II order sequence. In this study, Late Jurassic to Early Cretaceous sequence was identified and log-based interpretation of smaller order T-R cycles have been carried out. A log profile from eastern margin to western margin (up to Shahgarh depression) has been taken. The depositional environment penetrated by the wells interpreted from log signatures gave three major facies association. The blocky and coarsening upward (funnel shape), the blocky and fining upward (bell shape) and the erratic (zig-zag) facies representing distributary mouth bar, distributary channel and marine mud facies respectively. Late Jurassic Formation (Baisakhi-Bhadasar) and Early Cretaceous Formation (Pariwar) shows a lesser number of T-R cycles in shallower and higher number of T-R cycles in deeper bathymetry. Shallowest well has 3 T-R cycles in Baisakhi-Bhadasar and 2 T-R cycles in Pariwar, whereas deeper well has 4 T-R cycles in Baisakhi-Bhadasar and 8 T-R cycles in Pariwar Formation. The Maximum Flooding surfaces observed from the stratigraphy analysis indicate major shale break (high shale content). 
The study area is dominated by alternating shale and sand lithologies, which occur in an approximate ratio of 70:30. A seismo-geological cross section has been prepared to understand the stratigraphic thickness variation and structural disposition of the strata. The formations are quite thick to the west, and their thickness reduces towards the east. The folded and faulted strata indicate compressional tectonics followed by extensional tectonics. Our interpretation, supported by seismic data up to the second-order sequence, indicates that the Late Jurassic sequence is a Highstand Systems Tract (Baisakhi-Bhadasar formations) and the Early Cretaceous sequence is a Regressive to Lowstand Systems Tract (Pariwar Formation).
Keywords: Jaisalmer Basin, sequence stratigraphy, system tract, T-R cycle
Procedia PDF Downloads 134
1136 Perceptual Organization within Temporal Displacement
Authors: Michele Sinico
Abstract:
The psychological present has an actual extension. When a sequence of instantaneous stimuli falls in this short interval of time, observers perceive a compresence of events in succession and the temporal order depends on the qualitative relationships between the perceptual properties of the events. Two experiments were carried out to study the influence of perceptual grouping, with and without temporal displacement, on the duration of auditory sequences. The psychophysical method of adjustment was adopted. The first experiment investigated the effect of temporal displacement of a white noise on sequence duration. The second experiment investigated the effect of temporal displacement, along the pitch dimension, on the temporal shortening of the sequence. The results suggest that the temporal order of sounds, in the case of temporal displacement, is organized along the pitch dimension.
Keywords: time perception, perceptual present, temporal displacement, Gestalt laws of perceptual organization
Procedia PDF Downloads 251
1135 Formulation of Optimal Shifting Sequence for Multi-Speed Automatic Transmission
Authors: Sireesha Tamada, Debraj Bhattacharjee, Pranab K. Dan, Prabha Bhola
Abstract:
The most important component in an automotive transmission system is the gearbox which controls the speed of the vehicle. In an automatic transmission, the right positioning of actuators ensures efficient transmission mechanism embodiment, wherein the challenge lies in formulating the number of actuators associated with modelling a gearbox. Data with respect to actuation and gear shifting sequence has been retrieved from the available literature, including patent documents, and has been used in this proposed heuristics based methodology for modelling actuation sequence in a gear box. This paper presents a methodological approach in designing a gearbox for the purpose of obtaining an optimal shifting sequence. The computational model considers factors namely, the number of stages and gear teeth as input parameters since these two are the determinants of the gear ratios in an epicyclic gear train. The proposed transmission schematic or stick diagram aids in developing the gearbox layout design. The number of iterations and development time required to design a gearbox layout is reduced by using this approach.Keywords: automatic transmission, gear-shifting, multi-stage planetary gearbox, rank ordered clustering
Procedia PDF Downloads 325
1134 Effect of Weave Structure and Picking Sequence on the Comfort Properties of Woven Fabrics
Authors: Muhammad Umair, Tanveer Hussain, Khubab Shaker, Yasir Nawab, Muhammad Maqsood, Madeha Jabbar
Abstract:
The term comfort is defined as 'the absence of unpleasantness or discomfort' or 'a neutral state compared to the more active state'. Comfort mainly is of three types: sensorial (tactile) comfort, psychological comfort and thermo-physiological comfort. Thermophysiological comfort is determined by the air permeability and moisture management properties of the garment. The aim of this study was to investigate the effect of weave structure and picking sequence on the comfort properties of woven fabrics. Six woven fabrics with two different weave structures i.e. 1/1 plain and 3/1 twill and three different picking sequences: (SPI, DPI, 3PI) were taken as input variables whereas air permeability, wetting time, wicking behavior and overall moisture management capability (OMMC) of fabrics were taken as response variables and a comparison is made of the effect of weave structure and picking sequence on the response variables. It was found that fabrics woven in twill weave design and with simultaneous triple pick insertion (3PI) give significantly better air permeability, shorter wetting time and better water spreading rate, as compared to plain woven fabrics and those with double pick insertion (DPI) or single pick insertion (SPI). It could be concluded that the thermophysiological comfort of woven fabrics may be significantly improved simply by selecting a suitable weave design and picking sequence.Keywords: air permeability, picking sequence, thermophysiological comfort, weave design
Procedia PDF Downloads 419
1133 PMEL Marker Identification of Dark and Light Feather Colours in Local Canary
Authors: Mudawamah Mudawamah, Muhammad Z. Fadli, Gatot Ciptadi, Aulanni’am
Abstract:
Canary breeding has spread throughout the regions of Indonesia among low- to middle-income communities and has become a source of income for them. An interesting feature of the canary market is that feather colour is one of the determining factors for price. This research contributes to a molecular database that can serve as a basis for selection and mating by Indonesian canary breeders. The study was experimental: genomic DNA was isolated from canary blood, amplified by PCR with the PMEL marker, and then sequenced. Twenty-four canaries with light and dark feather colours were used. The data were analysed using BioEdit and Network 4.6.0.0 software. The results showed that all samples amplified with the PMEL gene, giving a fragment length of 500 bp. At base position 40, cytosine (C) was found in the light-coloured canaries, whereas thymine (T) was found at the same position in the dark-coloured canaries. The sequences obtained were 286-415 bp long and comprised 10 haplotypes. We conclude that the PMEL gene (a white pigment gene) can likely be used to detect molecular genetic variation between dark and light feather colours.
Keywords: canary, haplotype, PMEL, sequence
Procedia PDF Downloads 237
1132 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach
Authors: Mpho Mokoatle, Darlington Mapiye, James Mashiyane, Stephanie Muller, Gciniwe Dlamini
Abstract:
Great advances in high-throughput sequencing technologies have resulted in the availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotations, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isolates. Cluster analysis showed that k-mers may be used to discriminate phenotypes and that the discrimination becomes more concise as the size of the k-mers increases. The best performing classification model had a k-mer size of 10 (the longest k-mer) and an accuracy, recall, precision, specificity, and Matthews correlation coefficient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4, respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction.
The analysis also highlights the importance of increasing the k-mer size to produce more biologically explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data, and identify phenotype relationships which are important especially in explaining complex biological mechanisms.
Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing
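The k-mer representation this abstract relies on can be illustrated with a short Python sketch: slide a window of length k over a sequence and count each substring. The function name `kmer_counts` and the toy sequence are assumptions for illustration; the authors' actual pipeline (resampling, model training, k-mer size selection) is not reproduced here.

```python
from collections import Counter


def kmer_counts(sequence: str, k: int) -> Counter:
    """Count every overlapping substring of length k in the sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    # A sequence of length L has L - k + 1 overlapping windows of length k.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


if __name__ == "__main__":
    counts = kmer_counts("ACGTACGT", 4)
    for kmer, count in sorted(counts.items()):
        print(kmer, count)
```

Over the DNA alphabet there are up to 4^k distinct k-mers, so the feature space grows quickly with k, which fits the trade-off between explainability and computing resources that the abstract mentions.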
Procedia PDF Downloads 167
1131 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach
Authors: Darlington Mapiye, Mpho Mokoatle, James Mashiyane, Stephanie Muller, Gciniwe Dlamini
Abstract:
Great advances in high-throughput sequencing technologies have resulted in the availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotations, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge which calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isolates. Cluster analysis showed that k-mers may be used to discriminate phenotypes and that the discrimination becomes more concise as the size of the k-mers increases. The best performing classification model had a k-mer size of 10 (the longest k-mer) and an accuracy, recall, precision, specificity, and Matthews correlation coefficient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4, respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction.
The analysis also highlights the importance of increasing the k-mer size to produce more biologically explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data, and identify phenotype relationships which are important especially in explaining complex biological mechanisms.
Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing
Procedia PDF Downloads 159
1130 DNpro: A Deep Learning Network Approach to Predicting Protein Stability Changes Induced by Single-Site Mutations
Authors: Xiao Zhou, Jianlin Cheng
Abstract:
A single amino acid mutation can have a significant impact on the stability of protein structure. Thus, the prediction of protein stability change induced by single site mutations is critical and useful for studying protein function and structure. Here, we presented a deep learning network with the dropout technique for predicting protein stability changes upon single amino acid substitution. While using only protein sequence as input, the overall prediction accuracy of the method on a standard benchmark is >85%, which is higher than existing sequence-based methods and is comparable to the methods that use not only protein sequence but also tertiary structure, pH value and temperature. The results demonstrate that deep learning is a promising technique for protein stability prediction. The good performance of this sequence-based method makes it a valuable tool for predicting the impact of mutations on most proteins whose experimental structures are not available. Both the downloadable software package and the user-friendly web server (DNpro) that implement the method for predicting protein stability changes induced by amino acid mutations are freely available for the community to use.Keywords: bioinformatics, deep learning, protein stability prediction, biological data mining
Procedia PDF Downloads 468
1129 In-Depth Analysis on Sequence Evolution and Molecular Interaction of Influenza Receptors (Hemagglutinin and Neuraminidase)
Authors: Dong Tran, Thanh Dac Van, Ly Le
Abstract:
Hemagglutinin (HA) and neuraminidase (NA) play an important role in host immune evasion throughout the evolution of the influenza virus. The correlation between HA and NA evolution with respect to epitopic evolution and drug interaction has yet to be investigated. In this study, by combining sequence-to-structure evolution with statistical analysis of epitopic/binding-site specificity, we identified potential therapeutic features of HA and NA: a specific antibody binding site on HA and a specific binding distribution of current inhibitors within the NA active site. Our approach uses sequence variation and molecular interaction to provide an effective strategy for establishing experimentally based distributed representations of protein-protein/ligand complexes. The most important advantage of our method is that it does not require a complete dataset of complexes but instead infers feature interactions directly from sequence variation and molecular interaction. Using correlated sequence analysis, we additionally identified co-evolved mutations associated with maintaining HA/NA structural and functional variability toward immunity and therapeutic treatment. Our investigation of HA binding specificity revealed that a unique conserved stalk domain interacts with a unique loop domain of universal antibodies (CR9114, CT149, CR8043, CR8020, F16v3, CR6261, F10). On the other hand, NA inhibitors (Oseltamivir, Zanamivir, Laninamivir) showed specific conserved residue contributions similar to those of the NA substrate (sialic acid), which can be exploited for drug design. Our study provides important insight into the rational design and identification of novel therapeutics targeting universally recognized features of influenza HA/NA.
Keywords: influenza virus, hemagglutinin (HA), neuraminidase (NA), sequence evolution
Procedia PDF Downloads 164
1128 Novel Coprocessor for DNA Sequence Alignment in Resequencing Applications
Authors: Atef Ibrahim, Hamed Elsimary, Abdullah Aljumah, Fayez Gebali
Abstract:
This paper presents a novel semi-systolic array architecture for an optimized parallel sequence alignment algorithm. This architecture has the advantage that it can be modified to be reused for multiple-pass processing in order to increase the number of processing elements that can be packed into a single FPGA and to increase the number of sequences that can be aligned in parallel in a single FPGA. This resolves the potential problem of many FPGA resources being left unused, for designs with large short-read lengths, when the previously published conventional hardware design is used. FPGA implementation results show that, for large short-read lengths (M>128), the proposed design has a slightly higher speed-up and FPGA utilization than the conventional one.
Keywords: bioinformatics, genome sequence alignment, re-sequencing applications, systolic array
Procedia PDF Downloads 531
1127 Exploring Simple Sequence Repeats within Conserved microRNA Precursors Identified from Tea Expressed Sequence Tag (EST) Database
Authors: Anjan Hazra, Nirjhar Dasgupta, Chandan Sengupta, Sauren Das
Abstract:
Tea (Camellia sinensis) has received substantial attention from the scientific world, not only for its commercial importance but also for its appeal to health-conscious people across the world as a potential source of antioxidant supplements. These health-benefit traits rely primarily on regulatory networks of different metabolic pathways. Developing microsatellite markers from conserved genomic regions is worthwhile for studying the genetic diversity of closely related or self-pollinated species. Although several SSR markers have been reported in tea, trait-specific Simple Sequence Repeats (SSRs) that can be used for marker-assisted breeding are yet to be identified. MicroRNAs are endogenous, noncoding, short RNAs directly involved in regulating gene expression at the post-transcriptional level. Diversity in a miRNA gene has been found to interfere with the formation of its characteristic hairpin structure and its subsequent function. In the present study, the precursors of small regulatory RNAs (microRNAs) were identified from the tea Expressed Sequence Tag (EST) database. Furthermore, the simple sequence repeat motifs within the putative miRNA precursor genes were identified so that their existence and function can be validated experimentally. Genic-SSR markers are already known to be an efficient and breeder-friendly resource for genetic diversity analysis. The outcome of this in-silico study may therefore provide novel clues for understanding the miRNA-triggered polymorphic gene expression that controls specific metabolic pathways accountable for tea quality.
Keywords: micro RNA, simple sequence repeats, tea quality, trait specific marker
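The abstract above describes locating simple sequence repeat motifs inside candidate precursor sequences. As a rough illustration only (this is not the authors' pipeline, and the function name `find_ssrs` and its thresholds are assumptions made for this sketch), a regular expression with a backreference can flag tandem repeats in a nucleotide string:

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=3):
    """Return (start, unit, repeat_count) for each tandem repeat found.

    Hypothetical helper for illustration: the unit-length range and the
    minimum repeat count are assumed defaults, not values from the study.
    """
    # ([ACGTU]{a,b}?) lazily captures the shortest candidate unit first;
    # \1{n,} then requires at least n further back-to-back copies of it.
    pattern = re.compile(
        r"([ACGTU]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


if __name__ == "__main__":
    # Toy sequence containing a (GA)4 dinucleotide repeat.
    for start, unit, count in find_ssrs("GGAGAGAGAGTT"):
        print(start, unit, count)
```

Note that this simple pattern also reports homopolymer runs as repeats of a two-base unit (for example, AAAAAA as three copies of AA); dedicated SSR tools apply separate thresholds per motif length.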
Procedia PDF Downloads 312
1126 Prediction and Identification of a Permissive Epitope Insertion Site for St Toxoid in cfaB from Enterotoxigenic Escherichia coli
Authors: N. Zeinalzadeh, Mahdi Sadeghi
Abstract:
Enterotoxigenic Escherichia coli (ETEC) is the most common cause of non-inflammatory diarrhea in developing countries, accounting for approximately 20% of all diarrheal episodes in children in these areas. ST is one of the most important virulence factors, and CFA/I is one of the frequent colonization factors that contribute to the process of ETEC infection. ST and CfaB (the CFA/I subunit) are among the vaccine candidates against ETEC. However, because of its small size, ST is not a good immunogen in its natural form. To increase its immunogenic potential, we explored candidate positions for ST insertion in the CfaB sequence. After bioinformatics analysis, one of the candidate positions was selected, and the chimeric gene (cfaB*st) sequence was synthesized and expressed in E. coli BL21 (DE3). The chimeric recombinant protein was purified with Ni-NTA columns and characterized by western blot analysis. Residues 74-75 of the CfaB sequence could be a good candidate position for the insertion of ST and other epitopes.
Keywords: bioinformatics, CFA/I, enterotoxigenic E. coli, ST toxoid
Procedia PDF Downloads 448
1125 Nucleotide Based Validation of the Endangered Plant Diospyros mespiliformis (Ebenaceae) by Evaluating Short Sequence Region of Plastid rbcL Gene
Authors: Abdullah Alaklabi, Ibrahim A. Arif, Sameera O. Bafeel, Ahmad H. Alfarhan, Anis Ahamed, Jacob Thomas, Mohammad A. Bakir
Abstract:
Diospyros mespiliformis (Hochst. ex A.DC.; Ebenaceae) is a large deciduous medicinal plant. This species is currently listed as endangered in Saudi Arabia. This study investigated molecular identification of the species based on short sequence regions (571 and 664 bp) of the plastid rbcL (ribulose-1,5-bisphosphate carboxylase) gene. The endangered plant specimens were collected from Al-Baha, Saudi Arabia (GPS coordinates: 19.8543987, 41.3059349). A phylogenetic tree inferred from the rbcL gene sequences showed that this species is very closely related to D. brandisiana. A close relationship was also observed among D. bejaudii, D. philippinensis and D. releyi (≥99.7% sequence homology). The partial rbcL gene sequence region (571 bp) amplified by the primer pair rbcLaF-rbcLaR failed to discriminate D. mespiliformis from the closely related species D. brandisiana. In contrast, the primer pair rbcL1F-rbcL724R yielded a longer amplicon, discriminated the species from D. brandisiana, and revealed nucleotide variations at 3 different sites (645G>T; 663A>C; 710C>G). Although D. mespiliformis (EU980712) and D. brandisiana (EU980656) are very closely related species (99.4%), the studied specimen showed 100% sequence homology with D. mespiliformis and 99.6% with D. brandisiana. The present findings show that the short sequence region (664 bp) of the plastid rbcL gene, amplified by the primer pair rbcL1F-rbcL724R, can be used to authenticate samples of D. mespiliformis and may help in the authentic identification and management of this medicinally valuable endangered plant species.
Keywords: Diospyros mespiliformis, endangered plant, identification partial rbcL
Procedia PDF Downloads 432
1124 Unveiling the Chaura Thrust: Insights into a Blind Out-of-Sequence Thrust in Himachal Pradesh, India
Authors: Rajkumar Ghosh
Abstract:
The Chaura Thrust, located in Himachal Pradesh, India, is a prominent geological feature that exhibits characteristics of an out-of-sequence thrust fault. This paper explores the geological setting of Himachal Pradesh, focusing on the Chaura Thrust's unique characteristics, its classification as an out-of-sequence thrust, and the implications of its presence in the region. The introduction provides background information on thrust faults and out-of-sequence thrusts, emphasizing their significance in understanding the tectonic history and deformation patterns of an area. It also outlines the objectives of the paper, which include examining the Chaura Thrust's geological features, discussing its classification as an out-of-sequence thrust, and assessing its implications for the region. The paper delves into the geological setting of Himachal Pradesh, describing the tectonic framework and providing insights into the formation of thrust faults in the region. Special attention is given to the Chaura Thrust, including its location, extent, and geometry, along with an overview of the associated rock formations and structural characteristics. The concept of out-of-sequence thrusts is introduced, defining their distinctive behavior and highlighting their importance in the understanding of geological processes. The Chaura Thrust is then analyzed in the context of an out-of-sequence thrust, examining the evidence and characteristics that support this classification. Factors contributing to the out-of-sequence behavior of the Chaura Thrust, such as stress interactions and fault interactions, are discussed. The geological implications and significance of the Chaura Thrust are explored, addressing its impact on the regional geology, tectonic evolution, and seismic hazard assessment. The paper also discusses the potential geological hazards associated with the Chaura Thrust and the need for effective mitigation strategies in the region. 
Future research directions and recommendations are provided, highlighting areas that warrant further investigation, such as detailed structural analyses, geodetic measurements, and geophysical surveys. The importance of continued research in understanding and managing geological hazards related to the Chaura Thrust is emphasized. In conclusion, the Chaura Thrust in Himachal Pradesh represents an out-of-sequence thrust fault that has significant implications for the region's geology and tectonic evolution. By studying the unique characteristics and behavior of the Chaura Thrust, researchers can gain valuable insights into the geological processes occurring in Himachal Pradesh and contribute to a better understanding and mitigation of seismic hazards in the area.
Keywords: chaura thrust, out-of-sequence thrust, himachal pradesh, geological setting, tectonic framework, rock formations, structural characteristics, stress interactions, fault interactions, geological implications, seismic hazard assessment, geological hazards, future research, mitigation strategies.
Procedia PDF Downloads 79
1123 Neural Machine Translation for Low-Resource African Languages: Benchmarking State-of-the-Art Transformer for Wolof
Authors: Cheikh Bamba Dione, Alla Lo, Elhadji Mamadou Nguer, Siley O. Ba
Abstract:
In this paper, we propose two neural machine translation (NMT) systems (French-to-Wolof and Wolof-to-French) based on sequence-to-sequence with attention and transformer architectures. We trained our models on a parallel French-Wolof corpus of about 83k sentence pairs. Because of the low-resource setting, we experimented with advanced methods for handling data sparsity, including subword segmentation, back translation, and the copied corpus method. We evaluate the models using the BLEU score and find that the transformer outperforms the classic seq2seq model in all settings, in addition to being less sensitive to noise. In general, the best scores are achieved when training the models on word-level units. For subword-level models, back translation proves slightly beneficial in low-resource (WO) to high-resource (FR) translation for the transformer models (but not for the seq2seq models). A slight improvement can also be observed when injecting copied monolingual text in the target language. Moreover, combining the copied data with back translation leads to a substantial improvement in translation quality.
Keywords: backtranslation, low-resource language, neural machine translation, sequence-to-sequence, transformer, Wolof
Procedia PDF Downloads 147
1122 Opaque Mineralogy of the Late Precambrian Ophiolites from Bou Azzer Area, Anti-Atlas, Morocco
Authors: Yaser Maher Abdelaziz Hawa
Abstract:
The basic-ultrabasic rocks of the Bou Azzer ophiolite complex in the Anti-Atlas, Morocco, enclose some oxide and sulfide minerals as disseminated traces. The oxide minerals show a wide variation in composition, ranging from Cr-free titanomagnetite and ilmenite in the chilled-margin gabbro of the upper part of the ophiolite sequence to Al-rich chromian spinel and pure magnetite enclosed in the serpentinized peridotite in the lower part of the sequence. Five mineral assemblages have been distinguished depending on the rock type of the ophiolite sequence: (1) gersdorffite + chalcopyrite + Al-Mg-rich chromian spinel + pure magnetite, hosted by serpentinized peridotite; (2) pyrite + chalcopyrite, enclosed in metagabbro and overlying the ultrabasic cumulates; (3) Al-Fe-rich chromian spinel with rims of Al-rich chromian magnetite, enclosed in wehrlite; (4) titanomagnetite replaced by sphene, enclosed in marginal gabbro; (5) pyrrhotite exsolving pentlandite + ilmenite + Al-rich chromian spinel + magnetite, enclosed in fresh olivine in the upper part of the ophiolite sequence.
Keywords: opaques, ophiolites, anti-atlas, morocco
Procedia PDF Downloads 106
1121 Precise Identification of Clustered Regularly Interspaced Short Palindromic Repeats-Induced Mutations via Hidden Markov Model-Based Sequence Alignment
Authors: Jingyuan Hu, Zhandong Liu
Abstract:
CRISPR genome editing technology has transformed molecular biology by accurately targeting and altering an organism’s DNA. Despite the state-of-the-art precision of CRISPR genome editing, imprecise mutation outcomes and off-target effects present considerable risk, potentially leading to unintended genetic changes. Targeted deep sequencing, combined with bioinformatics sequence alignment, can detect such unwanted mutations. Nevertheless, the classical method, the Needleman-Wunsch (NW) algorithm, may produce false alignment outcomes, resulting in inaccurate mutation identification. The key to precisely identifying CRISPR-induced mutations lies in determining optimal parameters for the sequence alignment algorithm. Hidden Markov models (HMMs) are well suited to this task, offering flexibility across CRISPR systems by leveraging forward-backward algorithms for parameter estimation. In this study, we introduce CRISPR-HMM, statistical software for precisely calling CRISPR-induced mutations. We demonstrate that the software significantly improves precision in identifying CRISPR-induced mutations compared to NW-based alignment, thereby enhancing the overall understanding of the CRISPR gene-editing process.
Keywords: CRISPR, HMM, sequence alignment, gene editing
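For readers unfamiliar with the baseline the abstract compares against, the following is a minimal sketch of Needleman-Wunsch global alignment with a linear gap penalty. It is not CRISPR-HMM and not the authors' code; the scoring values (`match`, `mismatch`, `gap`) are arbitrary assumptions, and they are the kind of fixed parameters whose choice the abstract identifies as decisive for the alignment outcome.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return (score, aligned_a, aligned_b).

    The default scores are illustrative assumptions, not tuned values.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # match or mismatch
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Toy reference and read with a one-base deletion.
    print(needleman_wunsch("ACGT", "ACT"))
```

Changing `gap` or `mismatch` can change which of several plausible alignments is returned for the same pair of sequences, so the same read may be called as a deletion or as a run of substitutions.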
Procedia PDF Downloads 52
1120 The Various Legal Dimensions of Genomic Data
Authors: Amy Gooden
Abstract:
When human genomic data is considered, this is often done through only one dimension of the law, or without regard to the interplay between the various dimensions, which gives an incomplete picture of the legal framework. This research considers and analyzes the various dimensions of South African law applicable to genomic sequence data, including property rights, personality rights, and intellectual property rights. The effective use of personal genomic sequence data requires the acknowledgement and harmonization of the rights applicable to such data.
Keywords: artificial intelligence, data, law, genomics, rights
Procedia PDF Downloads 138
1119 The Influence of Directionality on the Giovanelli Illusion
Authors: Michele Sinico
Abstract:
In the Giovanelli illusion, some collinear dots appear misaligned when each dot lies within a circle and the circles are not collinear. In this illusion, the frame of reference determined by the circles is considered a crucial factor. Three experiments were carried out to study the influence of the directionality of the circles on the misalignment. The adjustment method was used: participants changed the orthogonal position of each dot, from the left to the right of the sequence, until a collinear sequence of dots was achieved. The first experiment verified the illusory effect of the misalignment. The second experiment tested the influence of two different directionalities of the circles (-0.58° and +0.58°) on the misalignment. The results show an over-normalization of the sequences of dots. The third experiment tested the misalignment of the dots without any inclination of the sequence of circles (0°). Only a local illusory effect was found. These results demonstrate that the directionality of the circles, as a global factor, can increase the misalignment. The findings also indicate that directionality and the frame of reference are independent factors in explaining the Giovanelli illusion.
Keywords: Giovanelli illusion, visual illusion, directionality, misalignment, the frame of reference
Procedia PDF Downloads 178
1118 Unraveling the Puzzle of Out-of-Sequence Thrusting in the Higher Himalaya: Focus on Jhakri-Chaura-Sarahan Thrust, Himachal Pradesh, India
Authors: Rajkumar Ghosh
Abstract:
The study examines the structural analysis of Chaura Thrust in Himachal Pradesh, India, focusing on the activation timing of Main Central Thrust (MCT) and South Tibetan Detachment System (STDS), mylonitised zones, and the characterization of box fold and its signature in the regional geology of Himachal Himalaya. The research aims to document the Higher Himalayan Out-of-Sequence Thrust (OOST) in Himachal Pradesh, which activated the MCTL and in between a zone south of MCTU. The study also documents the GBM-associated temperature range and the activation of Higher Himalayan Out-of-Sequence Thrust (OOST) in Himachal Pradesh. The findings contribute to understanding the structural analysis of Chaura Thrust and its signature in the regional geology of Himachal Himalaya. The study highlights the significance of microscopic studies in documenting mylonitized zones and identifying various types of crenulated schistosity. The study concludes that Chaura Thrust is not a blind thrust and details the field evidence for the OOST. The study characterizes the box fold and its signature in the regional geology of Himachal Himalaya. The study also documents the activation timing and ages of MCT, STDS, MBT, and MFT and identifies various types of crenulated schistosity under the microscope. The study also highlights the significance of microscopic studies in the structural analysis of Chaura Thrust. Finally, the study documents the activation of Higher Himalayan Out-of-Sequence Thrust (OOST) in Himachal Pradesh and the expectations for strain variation near the OOST.
Keywords: Chaura Thrust, Higher Himalaya, Jhakri Thrust, Main Central Thrust, Out-of-Sequence Thrust, Sarahan Thrust
Procedia PDF Downloads 89
1117 Isolation and Characterization of Cotton Infecting Begomoviruses in Alternate Hosts from Cotton Growing Regions of Pakistan
Authors: M. Irfan Fareed, Muhammad Tahir, Alvina Gul Kazi
Abstract:
Castor bean (Ricinus communis; family Euphorbiaceae) is cultivated for the production of oil and as an ornamental plant throughout tropical regions. Leaf samples from castor bean plants with leaf curl and vein thickening were collected from areas around Okara (Pakistan) in 2011. PCR amplification using diagnostic primers showed the presence of a begomovirus, and subsequently the specific primer pair (BurNF 5’- CCATGGTTGTGGCAGTTGATTGACAGATAC-3’, BurNR 5’- CCATGGATTCACGCACAGGGGAACCC-3’) was used to amplify and clone the whole genome of the virus. The complete nucleotide sequence was determined to be 2,759 nt (accession No. HE985227). Alignments showed the highest level of nucleotide sequence identity (98.8%) with Cotton leaf curl Burewala virus (CLCuBuV; accession No. JF416947). The virus in castor bean lacks an intact C2 gene, as is typical of CLCuBuV in cotton. An amplification product of ca. 1.4 kb was obtained in PCR with primers for betasatellites, and the complete nucleotide sequence of a clone was determined to be 1,373 nt (HE985228). The sequence showed 96.3% nucleotide sequence identity to the recombinant Cotton leaf curl Multan betasatellite (CLCuMB; JF502389). This is the first report of CLCuBuV and its betasatellite infecting castor bean, showing that this plant species is an alternate host of the virus. Many alternate hosts have already been reported, including tobacco, tomato, hibiscus, okra, ageratum, Digera arvensis, and papaya, and now Ricinus communis. It is therefore suggested that these alternate hosts not be grown near cotton-growing regions.
Keywords: Ricinus communis, begomovirus, betasatellite, agriculture
Procedia PDF Downloads 533