Search results for: simple sequence repeats
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4107

4047 Radio Frequency Identification Encryption via Modified Two Dimensional Logistic Map

Authors: Hongmin Deng, Qionghua Wang

Abstract:

A modified two-dimensional (2D) logistic map based on cross feedback control is proposed. In a statistical characteristics analysis, this 2D map exhibits more random chaotic dynamical properties than the classic one-dimensional (1D) logistic map. It is therefore used as a pseudo-random (PN) sequence generator: the real-valued PN sequence it produces is first quantized and then applied to a radio frequency identification (RFID) communication system. The system is experimentally validated on a Cortex-M0 development board, which demonstrates its effectiveness in terms of key generation, key space size, and security. Finally, further cryptanalysis is carried out using the test suite of the National Institute of Standards and Technology (NIST).
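As a rough illustration of the quantize-then-encrypt idea, the sketch below uses the classic 1D logistic map x -> r*x*(1-x), not the authors' modified 2D map, whose equations are not given in the abstract. The function names, the threshold quantizer, the parameter r = 3.99, and the XOR stream cipher are all assumptions made for illustration; this is not a secure cipher.

```python
def logistic_bits(x0, r=3.99, n_bits=16, burn_in=100, threshold=0.5):
    """Quantize the classic 1D logistic map into a bit sequence.

    Assumption: a simple threshold quantizer; the abstract does not say
    how the real-valued sequence is quantized.
    """
    if not 0.0 < x0 < 1.0:
        raise ValueError("x0 must lie strictly between 0 and 1")
    x = x0
    for _ in range(burn_in):          # discard the initial transient
        x = r * x * (1.0 - x)
    bits = []
    for _ in range(n_bits):
        x = r * x * (1.0 - x)
        bits.append(1 if x >= threshold else 0)
    return bits


def keystream_bytes(x0, length, r=3.99):
    """Pack the quantized bits into `length` keystream bytes."""
    bits = logistic_bits(x0, r=r, n_bits=8 * length)
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)


def xor_cipher(data, x0, r=3.99):
    """XOR data with the keystream; applying it twice restores the input."""
    ks = keystream_bytes(x0, len(data), r=r)
    return bytes(d ^ k for d, k in zip(data, ks))
```

Here the initial value x0 plays the role of the key: the same x0 regenerates the same keystream, so `xor_cipher(xor_cipher(msg, x0), x0)` returns `msg`.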

Keywords: chaos encryption, logistic map, pseudo-random sequence, RFID

Procedia PDF Downloads 378
4046 A Study of Environmental Test Sequences for Electrical Units

Authors: Jung Ho Yang, Yong Soo Kim

Abstract:

Electrical units are operated by electrical and electronic components. An environmental test sequence is useful for testing electrical units to reduce reliability issues. This study introduces test sequence guidelines based on relevant principles and considerations for electronic testing according to international standard IEC-60068-1 and the United States military standard MIL-STD-810G. Then, test sequences were proposed based on the descriptions for each test. Finally, General Motors (GM) specification GMW3172 was interpreted and compared to IEC-60068-1 and MIL-STD-810G.

Keywords: reliability, environmental test sequence, electrical units, IEC 60068-1, MIL-STD-810G

Procedia PDF Downloads 478
4045 Finding the Longest Common Subsequence in Normal DNA and Disease Affected Human DNA Using Self Organizing Map

Authors: G. Tamilpavai, C. Vishnuppriya

Abstract:

Bioinformatics is an active research area that combines biology and computer science. Finding the longest common subsequence (LCSS) is one of the major challenges in various bioinformatics applications. Computing the LCSS plays a vital role in biomedicine and is an essential task in DNA sequence analysis in genetics, where it supports a wide range of disease diagnosis steps. The objective of the proposed system is to find the longest common subsequence present in normal and various disease-affected human DNA sequences using a Self Organizing Map (SOM) and LCSS. The human DNA sequences are collected from the National Center for Biotechnology Information (NCBI) database. Initially, each human DNA sequence is separated into k-mers using a k-mer separation rule. Mean and median values are calculated from each separated k-mer. These calculated values are fed as input to the Self Organizing Map for clustering. The obtained clusters are then given to the Longest Common Subsequence (LCSS) algorithm to find the common subsequences present in every cluster. It returns n(n-1)/2 subsequences for each cluster, where n is the number of k-mers in that cluster. Experimental outcomes of the proposed system produce the possible longest common subsequences of normal and disease-affected DNA data. Thus, the proposed system could be a good initial aid for finding disease-causing sequences. Finally, performance analysis is carried out for different DNA sequences. The obtained values show that the retrieval of the LCSS is done in a shorter time than with the existing system.
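The pairwise step can be sketched as follows. This is a minimal illustration, assuming that the "k-mer separation rule" is an overlapping sliding window and that a cluster is simply a list of k-mer strings; the SOM clustering stage is omitted, and the function names are hypothetical. It uses the textbook dynamic-programming LCS, which may differ from the authors' implementation.

```python
from itertools import combinations


def kmers(seq, k):
    """Split a sequence into overlapping k-mers (assumed separation rule)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def lcs(a, b):
    """Return one longest common subsequence of strings a and b."""
    m, n = len(a), len(b)
    # dp[i][j] holds the LCS length of a[:i] and b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    # Walk back through the table to recover the subsequence itself
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def pairwise_lcs(cluster):
    """LCS for every unordered pair in a cluster: n(n-1)/2 results."""
    return [(x, y, lcs(x, y)) for x, y in combinations(cluster, 2)]
```

For a cluster of n k-mers, `pairwise_lcs` returns n(n-1)/2 triples, matching the count quoted in the abstract.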

Keywords: clustering, k-mers, longest common subsequence, SOM

Procedia PDF Downloads 238
4044 Increase in Specificity of MicroRNA Detection by RT-qPCR Assay Using a Specific Extension Sequence

Authors: Kyung Jin Kim, Jiwon Kwak, Jae-Hoon Lee, Soo Suk Lee

Abstract:

We describe an innovative method for highly specific detection of miRNAs using a specially modified poly(A) adaptor RT-qPCR method. We use a uniquely designed specific extension sequence, which plays an important role in achieving high specificity of miRNA detection. As in previously reported methods, this method involves two reaction steps: poly(A) tailing and reverse transcription, followed by real-time PCR. First, miRNAs are extended by a poly(A) tailing reaction and then converted into cDNA. Here, we remarkably reduced the reaction time by using a short poly(T) adaptor. Next, the cDNA is hybridized to the 3’-end of a specific extension sequence that contains the miRNA sequence, producing a novel PCR template. Thereafter, the SYBR Green-based RT-qPCR proceeds with a universal poly(T) adaptor forward primer and a universal reverse primer. The target miRNA, miR-106b in human brain total RNA, could be detected quantitatively over seven orders of magnitude, which demonstrates that the assay has a dynamic range of at least 7 logs. In addition, the better specificity of this novel extension-based assay compared with the well-known poly(A) tailing method for miRNA detection was confirmed by melt curve analysis of the real-time PCR product, clear gel electrophoresis, and sequence chromatogram images of the amplified DNAs.

Keywords: microRNA(miRNA), specific extension sequence, RT-qPCR, poly(A) tailing assay, reverse transcription

Procedia PDF Downloads 280
4043 End-to-End Spanish-English Sequence Learning Translation Model

Authors: Vidhu Mitha Goutham, Ruma Mukherjee

Abstract:

The low availability of well-trained, unlimited, dynamic-access models for specific languages makes it hard for corporate users to adopt quick translation techniques and incorporate them into product solutions. Translation tasks increasingly require a dynamic sequence learning curve, yet stable, cost-free open-source models are scarce. We survey and compare current translation techniques and propose a modified sequence-to-sequence model repurposed with attention techniques. Sequence learning using an encoder-decoder model is now paving the way for higher precision levels in translation. Using a Convolutional Neural Network (CNN) encoder and a Recurrent Neural Network (RNN) decoder, we use Fairseq tools to produce an end-to-end bilingually trained Spanish-English machine translation model that includes source language detection. We achieve competitive results using a duo-lingo-corpus trained model intended for prospective, ready-made plug-in use for compound sentences and document translations. Our model serves as a decent system for large, organizational data translation needs. While acknowledging its shortcomings and future scope, it also identifies itself as a well-optimized deep neural network model and solution.

Keywords: attention, encoder-decoder, Fairseq, Seq2Seq, Spanish, translation

Procedia PDF Downloads 155
4042 Identification of Individuals in Forensic Situations after Allo-Hematopoietic Stem Cell Transplantation

Authors: Anupuma Raina, Ajay Parkash

Abstract:

In forensic investigation, DNA analysis helps in the identification of a particular individual under investigation. A set of short tandem repeat (STR) loci is widely used for individualization at the molecular level in forensic testing. STRs with tetrameric repeats of DNA are highly polymorphic and widely used for forensic DNA analysis. Identification of an individual becomes challenging for forensic examiners after hematopoietic stem cell transplantation (HSCT). HSCT is a well-accepted, life-saving treatment for malignant and nonmalignant diseases. It involves the administration of healthy donor stem cells to replace the patient’s own unhealthy stem cells. A successful HSCT results in completely donor-derived cells in the patient’s hematopoiesis and hence has the capability to change the genetic makeup of the patient. A situation in which an individual who has undergone HSCT then commits a crime is very rare, but not impossible. Keeping such a situation in mind, various biological samples such as blood, buccal swabs, and hair follicles were collected and studied at certain intervals after HSCT. Blood was collected from both the patient and the donor before the transplant. The DNA profiles of both were analyzed using a short tandem repeat kit for autosomal chromosomes. Among all exhibits studied, hair follicles were found to be the most suitable biological exhibit, as no donor DNA profile was observed in them for up to 90 days of study.

Keywords: chimerism, HSCT, STRs analysis, forensic identification

Procedia PDF Downloads 48
4041 Comparative Assessment of ISSR and RAPD Markers among Egyptian Jojoba Shrubs

Authors: Abdelsabour G. A. Khaled, Galal A.R. El-Sherbeny, Ahmed M. Hassanein, Gameel M. G. Aly

Abstract:

Classical methods of identification based on agronomical characterization are not always the most accurate, because these characteristics are unstable under different environments. Molecular markers provide excellent tools for estimating genetic diversity. In this study, genetic variation among nine Egyptian jojoba shrubs was tested using ISSR (inter-simple sequence repeat) and RAPD (random amplified polymorphic DNA) markers, along with morphological characterization. The average percentage of polymorphism (%P) was 58.17% for ISSR and 74.07% for RAPD markers. Genetic similarity among shrubs ranged from 82.9 to 97.9% based on ISSR markers and from 85.5 to 97.8% based on RAPD markers. The average PIC (polymorphism information content) values were 0.19 (ISSR) and 0.24 (RAPD). In the present study, RAPD markers were more efficient than ISSR markers: the RAPD technique exhibited a higher average marker index (MI) (1.26) than the ISSR technique (1.11). The correlation between the ISSR and RAPD data was not significant (0.076, P > 0.05). The dendrogram constructed from the combined RAPD and ISSR data gave a relatively different clustering pattern.
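For readers unfamiliar with how dominant-marker data such as ISSR or RAPD bands are scored, the following sketch computes the percentage of polymorphic bands and a pairwise similarity from a presence/absence matrix. The band matrix is invented for illustration, and the Jaccard coefficient is an assumption: the abstract does not say which similarity coefficient the authors used.

```python
def percent_polymorphism(bands):
    """Percentage of bands that are present in some, but not all, individuals.

    `bands` is a list of rows, one per band (locus); each row holds
    1 (band present) or 0 (band absent) for every individual.
    """
    polymorphic = sum(1 for row in bands if 0 < sum(row) < len(row))
    return 100.0 * polymorphic / len(bands)


def jaccard_similarity(bands, i, j):
    """Jaccard similarity between individuals i and j (assumed coefficient)."""
    both = sum(1 for row in bands if row[i] and row[j])
    either = sum(1 for row in bands if row[i] or row[j])
    return both / either if either else 1.0


# Hypothetical scoring of 4 bands across 3 shrubs
example_bands = [
    [1, 1, 1],  # monomorphic band
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 0],
]
```

With `example_bands`, three of the four bands are polymorphic, and shrubs 0 and 1 share two of the four bands that either of them shows.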

Keywords: correlation, molecular markers, polymorphism, marker index

Procedia PDF Downloads 458
4040 Project Design Deliverables Sequence (PDD)

Authors: Nahed Al-Hajeri

Abstract:

Several factors can delay project completion; one of the main ones is delay in deliverable processing, i.e., the submission and review of documents. Most project cycles start with a list of deliverables but without a sequence for submitting them, that is, without a direction in which to move, which leads to overlapping activities and more interdependencies. Project Design Deliverables (PDD) was therefore developed as a solution for organizing the transmittals (documents/drawings) received from contractors/consultants during the different phases of EPC (Engineering, Procurement, and Construction) projects. It gives the stakeholders proper direction from the beginning, reduces inter-discipline dependency, avoids overlapping of activities, and provides a list of deliverables, a sequence of activities, etc. PDD attempts to provide a list and sequence of the engineering documents/drawings required during the different phases of a project, which benefits both client and contractor in performing planned activities through timely submission and review of deliverables. This helps ensure improved quality and on-time completion of the project. Successful implementation begins with a detailed understanding of the specific challenges and requirements of the project. PDD helps stakeholders learn about vendor document submissions, including the general workflow and sequence, and monitor the submission and review of deliverables from the early stages of the project. This provides an overview of the submission of deliverables by the parties concerned, in proper sequence, over the course of the project. A further goal of PDD is to hold all stakeholders responsible and accountable throughout the complete project cycle. We believe that successful implementation of PDD, with a detailed list of documents and their sequence, will help organizations achieve their project targets.

Keywords: EPC (Engineering, Procurement, and Construction), project design deliverables (PDD), econometrics sciences, management sciences

Procedia PDF Downloads 375
4039 An Industrial Steady State Sequence Disorder Model for Flow Controlled Multi-Input Single-Output Queues in Manufacturing Systems

Authors: Anthony John Walker, Glen Bright

Abstract:

The challenge faced by manufacturers when producing custom products is that each product needs exact components. This can cause work-in-process instability due to component matching constraints imposed on assembly cells. Clearing-type flow control policies have been used extensively in mediating server access between multiple arrival processes. Although the stability and performance of clearing policies have been well formulated and studied in the literature, the growth in arrival-to-departure sequence disorder for each arriving job, across a serving resource, is still an area for further analysis. In this paper, a closed-form industrial model is formulated that characterizes arrival-to-departure sequence disorder through stable manufacturing systems under a clearing-type flow control policy. Specifically addressed are the effects of sequence disorder imposed on a downstream assembly cell in terms of work-in-process instability induced through component matching constraints. Results from a simulated manufacturing system show that steady-state average sequence disorder in parallel upstream processing cells can be balanced in order to decrease downstream assembly system instability. Simulation results also show that the closed-form model accurately describes the growth and limiting behavior of average sequence disorder between parts arriving at and departing from a manufacturing system that is flow controlled via a clearing policy.

Keywords: assembly system constraint, custom products, discrete sequence disorder, flow control

Procedia PDF Downloads 156
4038 Clastic Sequence Stratigraphy of Late Jurassic to Early Cretaceous Formations of Jaisalmer Basin, Rajasthan

Authors: Himanshu Kumar Gupta

Abstract:

The Jaisalmer Basin is part of the Rajasthan Basin in northwestern India. The presence of five major unconformities/hiatuses of varying span, i.e., at the top of the Archean basement, Cambrian, Jurassic, Cretaceous, and Eocene, provides the foundation for constructing a sequence stratigraphic framework. Based on basin-forming tectonic events and their impact on sedimentation processes, three first-order sequences have been identified in the Rajasthan Basin: a Proterozoic-Early Cambrian rift sequence, a Permian to Middle-Late Eocene shelf sequence, and a Pleistocene-Recent sequence related to the Himalayan Orogeny. The Permian to Middle Eocene first-order sequence is further subdivided into three second-order sequences, i.e., the Permian to Late Jurassic, Early to Late Cretaceous, and Paleocene to Middle-Late Eocene second-order sequences. In this study, the Late Jurassic to Early Cretaceous sequence was identified, and a log-based interpretation of smaller-order T-R cycles was carried out. A log profile from the eastern margin to the western margin (up to the Shahgarh depression) was taken. The depositional environments penetrated by the wells, interpreted from log signatures, gave three major facies associations: the blocky and coarsening-upward (funnel shape), the blocky and fining-upward (bell shape), and the erratic (zig-zag) facies, representing distributary mouth bar, distributary channel, and marine mud facies, respectively. The Late Jurassic formations (Baisakhi-Bhadasar) and the Early Cretaceous formation (Pariwar) show fewer T-R cycles in shallower bathymetry and more T-R cycles in deeper bathymetry. The shallowest well has 3 T-R cycles in Baisakhi-Bhadasar and 2 T-R cycles in Pariwar, whereas the deeper well has 4 T-R cycles in Baisakhi-Bhadasar and 8 T-R cycles in the Pariwar Formation. The maximum flooding surfaces observed in the stratigraphic analysis indicate major shale breaks (high shale content).
The study area is dominated by alternating shale and sand lithologies, which occur in an approximate ratio of 70:30. A seismo-geological cross section was prepared to understand the stratigraphic thickness variation and structural disposition of the strata. The formations are quite thick in the west and thin towards the east. The folded and faulted strata indicate compressional tectonics followed by extensional tectonics. Our interpretation, supported by seismic data up to the second-order sequence, indicates that the Late Jurassic sequence is a Highstand Systems Tract (Baisakhi-Bhadasar formations) and the Early Cretaceous sequence is a Regressive to Lowstand Systems Tract (Pariwar Formation).

Keywords: Jaisalmer Basin, sequence stratigraphy, system tract, T-R cycle

Procedia PDF Downloads 111
4037 Genetic Analysis of the Endangered Mangrove Species Avicennia Marina in Qatar Detected by Inter-Simple Sequence Repeat DNA Markers

Authors: Talaat Ahmed, Amna Babssail

Abstract:

Mangroves are evergreen trees that grow along the coastal areas of Qatar. The largest and oldest area of mangroves can be found around Al-Thakhira and Al-Khor. Other mangrove areas originate from fairly recent plantings by the government, although unfortunately the picturesque mangrove lake in Al-Wakra has now been uprooted. Avicennia marina is the predominant mangrove species found in the region. Mangroves protect and stabilize low-lying coastal land and provide protection and food sources for estuarine and coastal fishery food chains. They also serve as feeding, breeding, and nursery grounds for a variety of fish, crustaceans, reptiles, birds, and other wildlife. A total of 21 individuals of A. marina, representing seven diverse natural and artificial populations, were sampled throughout its range in Qatar. Leaves from 2-3 randomly selected trees at each location were collected. The locations are as follows: Al-Rawis, Ras-Madpak, Fuwairt, Summaseima, Al-khour, AL-Mafjar and Zekreet. Total genomic DNA was extracted using the commercial DNeasy Plant System kit (Qiagen, Inc., Valencia, CA) for use in genetic diversity analysis. A total of 12 Inter-Simple Sequence Repeat (ISSR) primers were used to amplify DNA fragments from the genomic DNA. The 12 ISSR primers amplified polymorphic bands among mangrove samples from different areas as well as within each area, indicating the existence of variation within each area and among the different mangrove areas in Qatar. The results could characterize the Avicennia marina populations that exist in different areas of Qatar and establish DNA fingerprint documentation for the mangrove populations to be used in further studies. Moreover, the existence of genetic variation within and among Avicennia marina populations is a strong indication of the ability of such populations to adapt to different environmental conditions in Qatar. This study could serve as a warning to save the mangroves in Qatar and to protect the environment as well.

Keywords: DNA fingerprint, Avicennia marina, genetic analysis, Qatar

Procedia PDF Downloads 372
4036 Perceptual Organization within Temporal Displacement

Authors: Michele Sinico

Abstract:

The psychological present has an actual extension. When a sequence of instantaneous stimuli falls in this short interval of time, observers perceive a compresence of events in succession and the temporal order depends on the qualitative relationships between the perceptual properties of the events. Two experiments were carried out to study the influence of perceptual grouping, with and without temporal displacement, on the duration of auditory sequences. The psychophysical method of adjustment was adopted. The first experiment investigated the effect of temporal displacement of a white noise on sequence duration. The second experiment investigated the effect of temporal displacement, along the pitch dimension, on temporal shortening of sequence. The results suggest that the temporal order of sounds, in the case of temporal displacement, is organized along the pitch dimension.

Keywords: time perception, perceptual present, temporal displacement, Gestalt laws of perceptual organization

Procedia PDF Downloads 229
4035 A CORDIC Based Design Technique for Efficient Computation of DCT

Authors: Deboraj Muchahary, Amlan Deep Borah, Abir J. Mondal, Alak Majumder

Abstract:

A discrete cosine transform (DCT) is described, and a technique to compute it using the fast Fourier transform (FFT) is developed. In this work, the DCT of a finite-length sequence is obtained by incorporating the CORDIC methodology into a radix-2 FFT algorithm. The proposed methodology is simple to comprehend and maintains a regular structure, thereby reducing computational complexity. DCTs are used extensively in digital processing for pattern recognition, so efficient computation of the DCT with a transparent design flow is highly desirable.
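The general idea of obtaining a DCT from a radix-2 FFT can be sketched as below. This is a minimal software illustration under stated assumptions, not the authors' design: it computes the unnormalized DCT-II by taking the FFT of the input followed by its mirror image, and it evaluates the twiddle factors with `cmath.exp` where a hardware design of the kind described would use CORDIC rotations. The function names are hypothetical.

```python
import cmath
import math


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT (length must be a power of two)."""
    n = len(x)
    if n == 0:
        raise ValueError("input must not be empty")
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor; a CORDIC rotation would replace this in hardware
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def dct2_via_fft(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi * (2n + 1) * k / (2N))."""
    n = len(x)
    mirrored = list(x) + list(reversed(x))   # even-symmetric extension, length 2N
    spectrum = fft_radix2(mirrored)
    return [
        0.5 * (cmath.exp(-1j * cmath.pi * k / (2 * n)) * spectrum[k]).real
        for k in range(n)
    ]


def dct2_direct(x):
    """Reference O(N^2) evaluation of the same DCT-II definition."""
    n = len(x)
    return [
        sum(x[m] * math.cos(math.pi * (2 * m + 1) * k / (2 * n)) for m in range(n))
        for k in range(n)
    ]
```

Comparing `dct2_via_fft` against `dct2_direct` on a short power-of-two input is a quick way to check the mirroring and phase-correction steps.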

Keywords: DCT, DFT, CORDIC, FFT

Procedia PDF Downloads 450
4034 Formulation of Optimal Shifting Sequence for Multi-Speed Automatic Transmission

Authors: Sireesha Tamada, Debraj Bhattacharjee, Pranab K. Dan, Prabha Bhola

Abstract:

The most important component in an automotive transmission system is the gearbox, which controls the speed of the vehicle. In an automatic transmission, the right positioning of actuators ensures an efficient embodiment of the transmission mechanism, and the challenge lies in determining the number of actuators associated with modelling a gearbox. Data on actuation and gear shifting sequences were retrieved from the available literature, including patent documents, and used in the proposed heuristics-based methodology for modelling the actuation sequence in a gearbox. This paper presents a methodological approach to designing a gearbox for the purpose of obtaining an optimal shifting sequence. The computational model takes the number of stages and the number of gear teeth as input parameters, since these two determine the gear ratios in an epicyclic gear train. The proposed transmission schematic, or stick diagram, aids in developing the gearbox layout design. The number of iterations and the development time required to design a gearbox layout are reduced by using this approach.
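To make the dependence of gear ratio on tooth counts concrete, here is a small sketch for a simple planetary (epicyclic) stage based on the standard Willis relation. It is only an illustration of why stage count and tooth numbers fix the ratios; it does not reproduce the paper's heuristic or its actuation model, and the function names and tooth counts are hypothetical.

```python
def stage_ratio(z_sun, z_ring, fixed="ring"):
    """Speed ratio (input speed / output speed) of one simple planetary stage.

    Derived from the Willis relation
        (w_sun - w_carrier) / (w_ring - w_carrier) = -z_ring / z_sun
    for three common configurations:
      fixed="ring":    sun drives, carrier is the output
      fixed="sun":     ring drives, carrier is the output
      fixed="carrier": sun drives, ring is the output (direction reversed)
    """
    if z_sun <= 0 or z_ring <= 0:
        raise ValueError("tooth counts must be positive")
    if fixed == "ring":
        return 1.0 + z_ring / z_sun
    if fixed == "sun":
        return 1.0 + z_sun / z_ring
    if fixed == "carrier":
        return -z_ring / z_sun
    raise ValueError("fixed must be 'ring', 'sun' or 'carrier'")


def overall_ratio(stages):
    """Overall ratio of stages connected in series: product of stage ratios."""
    total = 1.0
    for z_sun, z_ring, fixed in stages:
        total *= stage_ratio(z_sun, z_ring, fixed)
    return total
```

For example, a stage with a 30-tooth sun and a 90-tooth ring gives a 4:1 reduction when the ring is held, and two such stages in series give 16:1.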

Keywords: automatic transmission, gear-shifting, multi-stage planetary gearbox, rank ordered clustering

Procedia PDF Downloads 298
4033 Effect of Weave Structure and Picking Sequence on the Comfort Properties of Woven Fabrics

Authors: Muhammad Umair, Tanveer Hussain, Khubab Shaker, Yasir Nawab, Muhammad Maqsood, Madeha Jabbar

Abstract:

The term comfort is defined as 'the absence of unpleasantness or discomfort' or 'a neutral state compared to the more active state'. Comfort mainly is of three types: sensorial (tactile) comfort, psychological comfort and thermo-physiological comfort. Thermophysiological comfort is determined by the air permeability and moisture management properties of the garment. The aim of this study was to investigate the effect of weave structure and picking sequence on the comfort properties of woven fabrics. Six woven fabrics with two different weave structures i.e. 1/1 plain and 3/1 twill and three different picking sequences: (SPI, DPI, 3PI) were taken as input variables whereas air permeability, wetting time, wicking behavior and overall moisture management capability (OMMC) of fabrics were taken as response variables and a comparison is made of the effect of weave structure and picking sequence on the response variables. It was found that fabrics woven in twill weave design and with simultaneous triple pick insertion (3PI) give significantly better air permeability, shorter wetting time and better water spreading rate, as compared to plain woven fabrics and those with double pick insertion (DPI) or single pick insertion (SPI). It could be concluded that the thermophysiological comfort of woven fabrics may be significantly improved simply by selecting a suitable weave design and picking sequence.

Keywords: air permeability, picking sequence, thermophysiological comfort, weave design

Procedia PDF Downloads 397
4032 PMEL Marker Identification of Dark and Light Feather Colours in Local Canary

Authors: Mudawamah Mudawamah, Muhammad Z. Fadli, Gatot Ciptadi, Aulanni’am

Abstract:

Canary breeding has spread throughout the regions of Indonesia among low- and middle-income communities and has become a source of income for them. An interesting phenomenon of the canary market is that feather colour is one of the factors determining the price. This research contributes to a molecular database that can serve as a basis for selection and mating by Indonesian canary breeders. The research was experimental, using genomic DNA isolated from canary blood. The DNA was amplified by PCR with the PMEL marker and then sequenced. Twenty-four canaries with light and dark feather colours were used. The data were analysed using BioEdit and Network 4.6.0.0 software. The results showed that the PMEL gene was amplified in all samples, with a fragment length of 500 bp. At base position 40, cytosine (C) was found in the light-coloured canaries, whereas thymine (T) was found at the same position in the dark-coloured canaries. The sequencing results gave fragments of 286-415 bp and 10 haplotypes. It was concluded that the PMEL gene (a white pigment gene) can likely be used to detect molecular genetic variation between dark and light feather colours.

Keywords: canary, haplotype, PMEL, sequence

Procedia PDF Downloads 209
4031 Utilization of Developed Single Sequence Repeats Markers for Dalmatian Pyrethrum (Tanacetum cinerariifolium) in Preliminary Genetic Diversity Study on Natural Populations

Authors: F. Varga, Z. Liber, J. Jakše, A. Turudić, Z. Šatović, I. Radosavljević, N. Jeran, M. Grdiša

Abstract:

Dalmatian pyrethrum (Tanacetum cinerariifolium (Trevir.) Sch. Bip.; Asteraceae), a source of the commercially dominant plant insecticide pyrethrin, is a species endemic to the eastern Adriatic. Genetic diversity of T. cinerariifolium was previously studied using amplified fragment length polymorphism (AFLP) markers. However, microsatellite markers (simple sequence repeats, SSRs) are more informative because they are codominant, highly polymorphic, locus-specific, and more reproducible, and thus are most often used to assess the genetic diversity of plant species. Dalmatian pyrethrum is an outcrossing diploid (2n = 18) whose large genome size and high repeatability have prevented the success of the traditional approach to SSR marker development. The advent of next-generation sequencing, combined with a specifically developed method, recently enabled the development of, to the best of the authors' knowledge, the first set of SSRs for genomic characterization of Dalmatian pyrethrum, which is essential from the perspective of plant genetic resources conservation. To evaluate the effectiveness of the developed SSR markers in the genetic differentiation of Dalmatian pyrethrum populations, a preliminary genetic diversity study was conducted on 30 individuals from three geographically distinct natural populations in Croatia (northern Adriatic island of Mali Lošinj, southern Adriatic island of Čiovo, and Mount Biokovo) based on 12 SSR loci. Analysis of molecular variance (AMOVA) by randomization test with 10,000 permutations was performed in Arlequin 3.5. The average number of alleles per locus, observed and expected heterozygosity, probability of deviations from Hardy-Weinberg equilibrium, and inbreeding coefficient were calculated using GENEPOP 4.4. Genetic distance based on the proportion of common alleles (DPSA) was calculated using MICROSAT. Cluster analysis using the neighbor-joining method with 1,000 bootstraps was performed with PHYLIP to generate a dendrogram.
The results of the AMOVA analysis showed that the total SSR diversity was 23% within and 77% between the three populations. A slight deviation from Hardy-Weinberg equilibrium was observed in the Mali Lošinj population. Allele richness ranged from 2.92 to 3.92, with the highest number of private alleles observed in the Mali Lošinj population (17). The average observed DPSA between 30 individuals was 0.557. The highest DPSA (0.875) was observed between several pairs of Dalmatian pyrethrum individuals from the Mali Lošinj and Mt. Biokovo populations, and the lowest between two individuals from the Čiovo population. Neighbor-joining trees, based on DPSA, grouped individuals into clusters according to their population affiliation. The separation of Mt. Biokovo clade was supported (bootstrap value 58%), which is consistent with the previous study on AFLP markers, where isolated populations from Mt. Biokovo differed from the rest of the populations. The developed SSR markers are an effective tool for assessing the genetic diversity and structure of natural Dalmatian pyrethrum populations. These preliminary results are encouraging for a future comprehensive study with a larger sample size across the species' range. Combined with the biochemical data, these highly informative markers could help identify potential genotypes of interest for future development of breeding lines and cultivars that are both resistant to environmental stress and high in pyrethrins. Acknowledgment: This work has been supported by the Croatian Science Foundation under the project ‘Genetic background of Dalmatian pyrethrum (Tanacetum cinerariifolium /Trevir./ Sch. Bip.) insecticidal potential’- (PyrDiv) (IP-06-2016-9034) and by project KK.01.1.1.01.0005, Biodiversity and Molecular Plant Breeding, at the Centre of Excellence for Biodiversity and Molecular Plant Breeding (CoE CroP-BioDiv), Zagreb, Croatia.
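Some of the summary statistics named above can be sketched for diploid codominant genotypes as follows. This is a simplified illustration, not a substitute for GENEPOP or MICROSAT: expected heterozygosity is computed as 1 minus the sum of squared allele frequencies without any sample-size correction, and the shared-allele distance is taken as 1 minus the proportion of alleles two individuals share, averaged over loci, which is assumed here to correspond to the DPSA measure. The genotype data and function names are hypothetical.

```python
from collections import Counter


def expected_heterozygosity(genotypes):
    """He = 1 - sum(p_i^2) at one locus; genotypes is a list of (allele, allele)."""
    alleles = [a for genotype in genotypes for a in genotype]
    total = len(alleles)
    return 1.0 - sum((count / total) ** 2 for count in Counter(alleles).values())


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles at one locus."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def shared_allele_distance(ind1, ind2):
    """1 - proportion of shared alleles between two diploid individuals.

    Each individual is a list of (allele, allele) tuples, one per locus,
    with loci in the same order for both individuals.
    """
    shared = 0
    for g1, g2 in zip(ind1, ind2):
        # Multiset intersection counts alleles the two genotypes have in common
        shared += sum((Counter(g1) & Counter(g2)).values())
    return 1.0 - shared / (2 * len(ind1))


# Hypothetical allele sizes (bp) at one SSR locus for four individuals
locus = [(150, 150), (150, 154), (154, 158), (150, 158)]
```

For `locus`, the allele frequencies are 0.5, 0.25 and 0.25, so the expected heterozygosity is 0.625, while three of the four individuals are heterozygous.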

Keywords: Asteraceae, genetic diversity, genomic SSRs, NGS, pyrethrum, Tanacetum cinerariifolium

Procedia PDF Downloads 94
4030 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Mpho Mokoatle, Darlington Mapiye, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in the availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotation, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge that calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isolates. Cluster analysis showed that k-mers may be used to discriminate phenotypes and that the discrimination becomes more concise as the size of the k-mers increases. The best performing classification model had a k-mer size of 10 (the longest k-mer) and an accuracy, recall, precision, specificity, and Matthews correlation coefficient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4, respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction.
The analysis also highlights the importance of increasing the k-mer size to produce more biologically explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data and to identify phenotype relationships, which are important especially in explaining complex biological mechanisms.
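The k-mer representation mentioned in the abstract can be illustrated with a minimal sketch. The function names `kmer_counts` and `kmer_vector`, the toy sequence, and the choice to skip windows containing ambiguous bases are assumptions made for illustration; the abstract does not describe the authors' actual pipeline.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count every overlapping k-mer of length k in a DNA sequence."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Assumption: windows containing ambiguous bases (e.g. N) are skipped.
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def kmer_vector(sequence, vocabulary):
    """Turn a sequence into a fixed-length count vector over a k-mer vocabulary."""
    k = len(vocabulary[0])
    counts = kmer_counts(sequence, k)
    return [counts[kmer] for kmer in vocabulary]


if __name__ == "__main__":
    toy = "ACGTACGTNACG"  # hypothetical toy sequence, not MTB data
    print(kmer_counts(toy, 3).most_common(3))
    print(kmer_vector(toy, ["ACG", "CGT", "TTT"]))
```

Count vectors of this kind are one common input for a classifier. Larger values of k give more specific features, but the number of possible k-mers grows as 4^k, which is one plausible reading of the trade-off between explainability and computing resources that the abstract describes.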

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 140
4029 Phenotype Prediction of DNA Sequence Data: A Machine and Statistical Learning Approach

Authors: Darlington Mapiye, Mpho Mokoatle, James Mashiyane, Stephanie Muller, Gciniwe Dlamini

Abstract:

Great advances in high-throughput sequencing technologies have resulted in the availability of huge amounts of sequencing data in public and private repositories, enabling a holistic understanding of complex biological phenomena. Sequence data are used for a wide range of applications such as gene annotation, expression studies, personalized treatment and precision medicine. However, this rapid growth in sequence data poses a great challenge that calls for novel data processing and analytic methods, as well as huge computing resources. In this work, a machine and statistical learning approach for DNA sequence classification based on k-mer representation of sequence data is proposed. The approach is tested using whole genome sequences of Mycobacterium tuberculosis (MTB) isolates to (i) reduce the size of genomic sequence data, (ii) identify an optimum size of k-mers and utilize it to build classification models, (iii) predict the phenotype from whole genome sequence data of a given bacterial isolate, and (iv) demonstrate computing challenges associated with the analysis of whole genome sequence data in producing interpretable and explainable insights. The classification models were trained on 104 whole genome sequences of MTB isolates. Cluster analysis showed that k-mers may be used to discriminate phenotypes and that the discrimination becomes more concise as the size of the k-mers increases. The best performing classification model had a k-mer size of 10 (the longest k-mer) and an accuracy, recall, precision, specificity, and Matthews correlation coefficient of 72.0%, 80.5%, 80.5%, 63.6%, and 0.4, respectively. This study provides a comprehensive approach for resampling whole genome sequencing data, objectively selecting a k-mer size, and performing classification for phenotype prediction.
The analysis also highlights the importance of increasing the k-mer size to produce more biologically explainable results, which brings to the fore the interplay that exists amongst accuracy, computing resources and explainability of classification results. However, the analysis provides a new way to elucidate genetic information from genomic data and to identify phenotype relationships, which are important especially in explaining complex biological mechanisms.
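The five performance figures quoted in the abstract are all functions of a binary confusion matrix. The sketch below shows the standard definitions; the function name `binary_metrics` and the counts in the usage example are hypothetical and are not the study's confusion matrix, which the abstract does not report.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "recall": _ratio(tp, tp + fn),        # sensitivity
        "precision": _ratio(tp, tp + fp),
        "specificity": _ratio(tn, tn + fp),
        # Matthews correlation coefficient, in [-1, 1]
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the formulas.
    for name, value in binary_metrics(tp=40, fp=10, tn=30, fn=20).items():
        print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the matrix, which makes it a useful companion figure when the two phenotype classes are unevenly represented.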

Keywords: AWD-LSTM, bootstrapping, k-mers, next generation sequencing

Procedia PDF Downloads 128
4028 Comparative Study of Experimental and Theoretical Convective, Evaporative for Two Model Distiller

Authors: Khaoula Hidouri, Ali Benhmidene, Bechir Chouachi

Abstract:

The purification of brackish seawater has become a necessity rather than a choice in the face of demographic and industrial growth, especially in third world countries. Two models are considered in this work: a simple solar still and a simple solar still coupled with a heat pump. In this research, the water productivity of the Simple Solar Distiller (SSD) and the Simple Solar Distiller Hybrid Heat Pump (SSDHP) was determined as a function of the orientation, the use of the heat pump, and the use of a simple or double glass cover. The productivity can exceed 1.2 L/m²h for the SSDHP and 0.5 L/m²h for the SSD model. The global efficiency determined for the SSD and SSDHP models is 30% and 50%, respectively. The internal efficiency reached 35% for the SSD and 60% for the SSDHP model. The convective heat coefficient reached 2.5 W/m²°C and 0.5 W/m²°C for the SSDHP and SSD models, respectively.

Keywords: productivity, efficiency, convective heat coefficient, SSD model, SSDHP model

Procedia PDF Downloads 189
4027 Morphological and Molecular Characterization of Accessions of Black Fonio Millet (Digitaria Iburua Stapf) Grown in Selected Regions in Nigeria

Authors: Nwogiji Cletus Olando, Oselebe Happiness Ogba, Enoch Achigan-Dako

Abstract:

Digitaria iburua, commonly known as black fonio, is a cereal crop native to Africa and extensively cultivated by smallholder farmers in Northern Benin, Togo, and Nigeria. This crop holds immense nutritional and socio-cultural value. Unfortunately, limited knowledge about its genetic diversity exists due to a lack of scientific attention. As a result, its potential for improvement in food and agriculture remains largely untapped. To address this gap, a study was conducted using 41 accessions of D. iburua stored in the genebank of the Laboratory of Genetics, Biotechnology, and Seed Science at Abomey-Calavi University, Benin. The study employed both morphological and simple sequence repeat (SSR) markers to evaluate the genetic variability of the accessions. Agro-morphological assessments were carried out during the 2020 cropping season, utilizing an alpha lattice design with three replications. The collected data encompassed qualitative and quantitative traits. Additionally, molecular variability was assessed using eleven SSR markers. The results revealed significant phenotypic variability among the evaluated accessions, leading to their classification into three main clusters. Furthermore, the eleven SSR markers identified a total of 50 alleles, averaging 4.55 alleles per locus. The primers exhibited an average polymorphic information content value of 0.43, with the DE-ARC019 primer displaying the highest value (0.59). These findings suggest a substantial degree of genetic heterogeneity within the evaluated accessions, and the SSR markers employed in the study proved highly effective in detecting and characterizing this genetic variability. In conclusion, this study highlights the presence of significant genetic diversity in black fonio and provides valuable insights for future efforts aimed at its genetic improvement and conservation.
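The polymorphic information content (PIC) reported for the SSR primers is computed from allele frequencies at each locus. The sketch below uses the widely cited Botstein et al. definition; the abstract does not state which formula or software the authors used, so that choice, the function name, and the allele counts in the usage example are assumptions.

```python
def polymorphic_information_content(allele_counts):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2 for one locus."""
    total = sum(allele_counts)
    if total <= 0:
        raise ValueError("at least one allele observation is required")
    p = [count / total for count in allele_counts]  # allele frequencies
    homozygosity = sum(x * x for x in p)
    cross_term = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return 1 - homozygosity - cross_term


if __name__ == "__main__":
    # Hypothetical allele counts at a single SSR locus.
    print(round(polymorphic_information_content([20, 12, 6, 3]), 3))
    # Arithmetic from the abstract: 50 alleles over 11 markers.
    print(round(50 / 11, 2))
```

A locus with a single allele has a PIC of 0, and the value rises as alleles become more numerous and more evenly distributed, which is why PIC is used to rank how informative each marker is.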

Keywords: genetic diversity, Digitaria iburua, genetic improvement, simple sequence repeat markers, Nigeria, conservation

Procedia PDF Downloads 61
4026 DNpro: A Deep Learning Network Approach to Predicting Protein Stability Changes Induced by Single-Site Mutations

Authors: Xiao Zhou, Jianlin Cheng

Abstract:

A single amino acid mutation can have a significant impact on the stability of protein structure. Thus, the prediction of protein stability change induced by single site mutations is critical and useful for studying protein function and structure. Here, we presented a deep learning network with the dropout technique for predicting protein stability changes upon single amino acid substitution. While using only protein sequence as input, the overall prediction accuracy of the method on a standard benchmark is >85%, which is higher than existing sequence-based methods and is comparable to the methods that use not only protein sequence but also tertiary structure, pH value and temperature. The results demonstrate that deep learning is a promising technique for protein stability prediction. The good performance of this sequence-based method makes it a valuable tool for predicting the impact of mutations on most proteins whose experimental structures are not available. Both the downloadable software package and the user-friendly web server (DNpro) that implement the method for predicting protein stability changes induced by amino acid mutations are freely available for the community to use.

Keywords: bioinformatics, deep learning, protein stability prediction, biological data mining

Procedia PDF Downloads 434
4025 In-Depth Analysis on Sequence Evolution and Molecular Interaction of Influenza Receptors (Hemagglutinin and Neuraminidase)

Authors: Dong Tran, Thanh Dac Van, Ly Le

Abstract:

Hemagglutinin (HA) and neuraminidase (NA) play an important role in host immune evasion throughout the evolution of the influenza virus. The correlation between HA and NA evolution with respect to epitopic evolution and drug interaction has yet to be investigated. In this study, by combining sequence-to-structure evolution with statistical analysis of epitopic/binding site specificity, we identified potential therapeutic features of HA and NA that show the specific antibody binding site of HA and the specific binding distribution of current inhibitors within the NA active site. Our approach introduces the use of sequence variation and molecular interaction to provide an effective strategy for establishing experimentally based distributed representations of protein-protein/ligand complexes. The most important advantage of our method is that it does not require a complete dataset of complexes but instead directly infers feature interaction from sequence variation and molecular interaction. Using correlated sequence analysis, we additionally identified co-evolved mutations associated with maintaining HA/NA structural and functional variability toward immunity and therapeutic treatment. Our investigation of the HA binding specificity revealed that a unique conserved stalk domain interacts with a unique loop domain of universal antibodies (CR9114, CT149, CR8043, CR8020, F16v3, CR6261, F10). On the other hand, NA inhibitors (Oseltamivir, Zanamivir, Laninamivir) showed specific conserved residue contributions similar to those of the NA substrate (sialic acid), which can be exploited for drug design. Our study provides an important insight into the rational design and identification of novel therapeutics targeting universally recognized features of influenza HA/NA.

Keywords: influenza virus, hemagglutinin (HA), neuraminidase (NA), sequence evolution

Procedia PDF Downloads 136
4024 Novel Coprocessor for DNA Sequence Alignment in Resequencing Applications

Authors: Atef Ibrahim, Hamed Elsimary, Abdullah Aljumah, Fayez Gebali

Abstract:

This paper presents a novel semi-systolic array architecture for an optimized parallel sequence alignment algorithm. This architecture has the advantage that it can be modified to be reused for multiple pass processing in order to increase the number of processing elements that can be packed into a single FPGA and to increase the number of sequences that can be aligned in parallel in a single FPGA. This resolves the potential problem of many FPGA resources being left unused, for designs that have large values of short read length, when using the previously published conventional hardware design. FPGA implementation results show that, for large values of short read length (M>128), the proposed design has a slightly higher speedup and FPGA utilization than the conventional one.

Keywords: bioinformatics, genome sequence alignment, re-sequencing applications, systolic array

Procedia PDF Downloads 505
4023 Prediction and Identification of a Permissive Epitope Insertion Site for St Toxoid in cfaB from Enterotoxigenic Escherichia coli

Authors: N. Zeinalzadeh, Mahdi Sadeghi

Abstract:

Enterotoxigenic Escherichia coli (ETEC) is the most common cause of non-inflammatory diarrhea in developing countries, resulting in approximately 20% of all diarrheal episodes in children in these areas. ST is one of the most important virulence factors, and CFA/I is one of the frequent colonization factors that contribute to the process of ETEC infection. ST and CfaB (the CFA/I subunit) are among the vaccine candidates against ETEC. However, ST, because of its small size, is not a good immunogen in its natural form. To increase its immunogenic potential, here we explored candidate positions for ST insertion in the CfaB sequence. After bioinformatics analysis, one of the candidate positions was selected, and the chimeric gene (cfaB*st) sequence was synthesized and expressed in E. coli BL21 (DE3). The chimeric recombinant protein was purified with Ni-NTA columns and characterized by western blot analysis. Residues 74-75 of the CfaB sequence could be a good candidate position for the insertion of ST and other epitopes.

Keywords: bioinformatics, CFA/I, enterotoxigenic E. coli, ST toxoid

Procedia PDF Downloads 425
4022 Nucleotide Based Validation of the Endangered Plant Diospyros mespiliformis (Ebenaceae) by Evaluating Short Sequence Region of Plastid rbcL Gene

Authors: Abdullah Alaklabi, Ibrahim A. Arif, Sameera O. Bafeel, Ahmad H. Alfarhan, Anis Ahamed, Jacob Thomas, Mohammad A. Bakir

Abstract:

Diospyros mespiliformis (Hochst. ex A.DC.; Ebenaceae) is a large deciduous medicinal plant. This plant species is currently listed as endangered in Saudi Arabia. Molecular identification of this plant species based on short sequence regions (571 and 664 bp) of the plastid rbcL (ribulose-1,5-bisphosphate carboxylase) gene was investigated in this study. The endangered plant specimens were collected from Al-Baha, Saudi Arabia (GPS coordinates: 19.8543987, 41.3059349). The phylogenetic tree inferred from the rbcL gene sequences showed that this species is very closely related to D. brandisiana. A close relationship was also observed among D. bejaudii, D. philippinensis and D. releyi (≥99.7% sequence homology). The partial rbcL gene sequence region (571 bp) that was amplified by the rbcL primer pair rbcLaF-rbcLaR failed to discriminate D. mespiliformis from the closely related plant species D. brandisiana. In contrast, the primer pair rbcL1F-rbcL724R yielded a longer amplicon, discriminated the species from D. brandisiana and demonstrated nucleotide variations at 3 different sites (645G>T; 663A>C; 710C>G). Although D. mespiliformis (EU980712) and D. brandisiana (EU980656) are very closely related species (99.4%), the studied specimen showed 100% sequence homology with D. mespiliformis and 99.6% with D. brandisiana. The present findings showed that the short sequence region (664 bp) of the plastid rbcL gene, amplified by the primer pair rbcL1F-rbcL724R, can be used for authenticating samples of D. mespiliformis and may help in the authentic identification and management of this medicinally valuable endangered plant species.
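The percent-homology figures and the substitution notation used above (for example 645G>T) can be reproduced from a pairwise comparison of aligned sequences. The following is a minimal sketch that assumes the two sequences are already aligned, are of equal length, and contain no gaps; the function name and the toy sequences are hypothetical and are not rbcL data.

```python
def compare_aligned(reference, sample, start=1):
    """Return (percent identity, substitutions) for two pre-aligned sequences.

    Substitutions are written as '<position><ref>><alt>', with positions
    counted from `start` along the reference.
    """
    if not reference or len(reference) != len(sample):
        raise ValueError("sequences must be non-empty and of equal length")
    ref, alt = reference.upper(), sample.upper()
    variants = [
        f"{start + i}{r}>{s}"
        for i, (r, s) in enumerate(zip(ref, alt))
        if r != s
    ]
    identity = 100.0 * (len(ref) - len(variants)) / len(ref)
    return identity, variants


if __name__ == "__main__":
    # Hypothetical toy sequences for illustration only.
    identity, variants = compare_aligned("ACGTACGTAC", "ACGAACGTCC")
    print(f"{identity:.1f}% identity, substitutions: {variants}")
```

On real amplicons the sequences would first be aligned with a dedicated tool, since insertions and deletions shift positions and would otherwise be miscounted as runs of substitutions.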

Keywords: Diospyros mespiliformis, endangered plant, identification, partial rbcL

Procedia PDF Downloads 405
4021 Unveiling the Chaura Thrust: Insights into a Blind Out-of-Sequence Thrust in Himachal Pradesh, India

Authors: Rajkumar Ghosh

Abstract:

The Chaura Thrust, located in Himachal Pradesh, India, is a prominent geological feature that exhibits characteristics of an out-of-sequence thrust fault. This paper explores the geological setting of Himachal Pradesh, focusing on the Chaura Thrust's unique characteristics, its classification as an out-of-sequence thrust, and the implications of its presence in the region. The introduction provides background information on thrust faults and out-of-sequence thrusts, emphasizing their significance in understanding the tectonic history and deformation patterns of an area. It also outlines the objectives of the paper, which include examining the Chaura Thrust's geological features, discussing its classification as an out-of-sequence thrust, and assessing its implications for the region. The paper delves into the geological setting of Himachal Pradesh, describing the tectonic framework and providing insights into the formation of thrust faults in the region. Special attention is given to the Chaura Thrust, including its location, extent, and geometry, along with an overview of the associated rock formations and structural characteristics. The concept of out-of-sequence thrusts is introduced, defining their distinctive behavior and highlighting their importance in the understanding of geological processes. The Chaura Thrust is then analyzed in the context of an out-of-sequence thrust, examining the evidence and characteristics that support this classification. Factors contributing to the out-of-sequence behavior of the Chaura Thrust, such as stress interactions and fault interactions, are discussed. The geological implications and significance of the Chaura Thrust are explored, addressing its impact on the regional geology, tectonic evolution, and seismic hazard assessment. The paper also discusses the potential geological hazards associated with the Chaura Thrust and the need for effective mitigation strategies in the region. 
Future research directions and recommendations are provided, highlighting areas that warrant further investigation, such as detailed structural analyses, geodetic measurements, and geophysical surveys. The importance of continued research in understanding and managing geological hazards related to the Chaura Thrust is emphasized. In conclusion, the Chaura Thrust in Himachal Pradesh represents an out-of-sequence thrust fault that has significant implications for the region's geology and tectonic evolution. By studying the unique characteristics and behavior of the Chaura Thrust, researchers can gain valuable insights into the geological processes occurring in Himachal Pradesh and contribute to a better understanding and mitigation of seismic hazards in the area.

Keywords: Chaura Thrust, out-of-sequence thrust, Himachal Pradesh, geological setting, tectonic framework, rock formations, structural characteristics, stress interactions, fault interactions, geological implications, seismic hazard assessment, geological hazards, future research, mitigation strategies

Procedia PDF Downloads 54
4020 Replica-Exchange Metadynamics Simulations of G-Quadruplex DNA Structures Under Substitution of K+ by Na+ Ions

Authors: Juan Antonio Mondragon Sanchez, Ruben Santamaria

Abstract:

The DNA G-quadruplex is a four-stranded DNA structure formed by stacked planes of four base-paired guanines (G-quartets). Guanine-rich DNA sequences are present at many sites of genomic DNA and can potentially lead to the formation of G-quadruplexes, especially at the 3'-terminus of human telomeric DNA, which has many TTAGGG repeats. The formation and stabilization of a G-quadruplex by small ligands at the telomeric region can inhibit telomerase activity. In turn, the ligands can be used to regulate oncogene expression, making the G-quadruplex an attractive target for anticancer therapy. Clearly, the G-quadruplex structure in telomeric DNA is of fundamental importance for rational drug design. In this context, we investigate two G-quadruplex structures, the first following from the sequence TTAGGG(TTAGGG)3TT (HUT1) and the second from AAAGGG(TTAGGG)3AA (HUT2), both in a K+ solution. We determine the free energy surfaces of the HUT1 and HUT2 structures and investigate their conformations using replica-exchange metadynamics simulations. The carbonyl-carbonyl distances between different guanine residues are selected as the main collective variables to determine the free energy surfaces. The surfaces exhibit two main local minima, compatible with experiments on the conformational transformations of HUT1 and HUT2 under substitution of the K+ ions by Na+ ions. The conformational transitions are not observed in short MD simulations without the use of the metadynamics approach. The results of this work should help in understanding the formation and stability of the human telomeric G-quadruplex in environments that include K+ and Na+ ions.
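The shorthand used for the two sequences, in which a parenthesised unit is followed by its repeat count, can be expanded mechanically. The sketch below expands HUT1 and HUT2 and counts their runs of three or more guanines; the helper names and the three-guanine threshold are assumptions made for illustration and are not part of the simulation protocol.

```python
import re


def expand_repeat_notation(notation):
    """Expand '(UNIT)n' shorthand, e.g. 'A(TG)3C' becomes 'ATGTGTGC'."""
    return re.sub(
        r"\(([ACGT]+)\)(\d+)",
        lambda match: match.group(1) * int(match.group(2)),
        notation,
    )


def count_g_tracts(sequence, min_length=3):
    """Count maximal runs of at least `min_length` consecutive guanines."""
    return len(re.findall(r"G{%d,}" % min_length, sequence))


if __name__ == "__main__":
    for name, notation in [("HUT1", "TTAGGG(TTAGGG)3TT"),
                           ("HUT2", "AAAGGG(TTAGGG)3AA")]:
        sequence = expand_repeat_notation(notation)
        print(name, sequence, len(sequence), count_g_tracts(sequence))
```

Both sequences expand to 26 nucleotides containing four GGG tracts, consistent with the four guanine columns needed to stack G-quartets in a single-stranded quadruplex.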

Keywords: G-quadruplex, metadynamics, molecular dynamics, replica-exchange

Procedia PDF Downloads 322
4019 Neural Machine Translation for Low-Resource African Languages: Benchmarking State-of-the-Art Transformer for Wolof

Authors: Cheikh Bamba Dione, Alla Lo, Elhadji Mamadou Nguer, Siley O. Ba

Abstract:

In this paper, we propose two neural machine translation (NMT) systems (French-to-Wolof and Wolof-to-French) based on sequence-to-sequence with attention and transformer architectures. We trained our models on a parallel French-Wolof corpus of about 83k sentence pairs. Because of the low-resource setting, we experimented with advanced methods for handling data sparsity, including subword segmentation, back translation, and the copied corpus method. We evaluate the models using the BLEU score and find that the transformer outperforms the classic seq2seq model in all settings, in addition to being less sensitive to noise. In general, the best scores are achieved when training the models on word-level units. For subword-level models, using back translation proves to be slightly beneficial in low-resource (WO) to high-resource (FR) language translation for the transformer (but not for the seq2seq) models. A slight improvement can also be observed when injecting copied monolingual text in the target language. Moreover, combining the copied corpus data with back translation leads to a substantial improvement in translation quality.

Keywords: backtranslation, low-resource language, neural machine translation, sequence-to-sequence, transformer, Wolof

Procedia PDF Downloads 119
4018 Basic Study of Mammographic Image Magnification System with Eye-Detector and Simple EEG Scanner

Authors: Aika Umemuro, Mitsuru Sato, Mizuki Narita, Saya Hori, Saya Sakurai, Tomomi Nakayama, Ayano Nakazawa, Toshihiro Ogura

Abstract:

Mammography requires the detection of very small calcifications, and physicians search for microcalcifications by magnifying the images as they read them. The mouse is necessary to zoom in on the images, but this can be tiring and distracting when many images are read in a single day. Therefore, an image magnification system combining an eye-detector and a simple electroencephalograph (EEG) scanner was devised, and its operability was evaluated. Two experiments were conducted in this study: the measurement of eye-detection error using an eye-detector and the measurement of the time required for image magnification using a simple EEG scanner. Eye-detector validation showed that the mean distance of eye-detection error ranged from 0.64 cm to 2.17 cm, with an overall mean of 1.24 ± 0.81 cm for the observers. The results showed that the eye detection error was small enough for the magnified area of the mammographic image. The average time required for point magnification in the verification of the simple EEG scanner ranged from 5.85 to 16.73 seconds, and individual differences were observed. The reason for this may be that the size of the simple EEG scanner used was not adjustable, so it did not fit well for some subjects. The use of a simple EEG scanner with size adjustment would solve this problem. Therefore, the image magnification system using the eye-detector and the simple EEG scanner is useful.

Keywords: EEG scanner, eye-detector, mammography, observers

Procedia PDF Downloads 202