Search results for: genome sequence
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1378

1078 Quantifying Late Cenozoic Out‐of‐Sequence Thrusting at Chaura, Sutlej Valley, Himachal Pradesh, India

Authors: Rajkumar Ghosh

Abstract:

Out-of-sequence thrusts (OOSTs) are reported at different geographic locations, under various local names, along the Siwalik Himalaya (SH), Lesser Himalaya (LH), and Higher Himalaya (HH) in the Bhutan, India, Nepal, and Pakistan Himalayan ranges. Most OOSTs have been identified within the upper LH and the lower HH based on geochronological age jumps across them. These thrusts were active from the Late Miocene to recent times. The Chaura Thrust (CT) was deciphered from an age jump in Apatite Fission Track (AFT) data and is considered a blind thrust based on variable exhumation rates in the Chaura region, Satluj river valley, Himachal Pradesh. The CT is located north of the Jhakri Thrust (JhT) and is also identified as the Sarahan Thrust (ST). Structural documentation of the rocks near the OOST in Chaura has not been done so far. A detailed structural study of the Jeori Group of rocks was carried out in this study to understand the manifestation of the Chaura Thrust and associated structures at meso- to micro-scale. Box folds, scar folds, kink folds, crenulation cleavages, and boudins are developed in the Chaura region. These structures usually do not indicate shear sense. When studied under an optical microscope, the Chaura samples reveal that the mica fish are usually lenticular, with aspect ratio (R) varying from 6 to 11 and inclination angle (α) from 15° to 40°. Based on ‘R’ and ‘α’, elongated sigmoid-shaped and parallelogram-shaped mica fish were also documented. Asymmetric mica fish demonstrate top-to-S/SW ductile shear, which is similar to that of the Chaura Thrust. Grain boundary migration (GBM) structures in quartzo-feldspathic grains from the Jeori Group of rocks indicate deformation temperatures ranging from 400 to 650°C. This suggests that the OOST at Chaura, i.e., the Chaura Thrust, underwent thrusting in the ductile regime.

Keywords: out-of-sequence thrust, chaura thrust, sarahan thrust, jhakri thrust, higher himalaya, s/c-fabric

Procedia PDF Downloads 48
1077 The Convolution Recurrent Network of Using Residual LSTM to Process the Output of the Downsampling for Monaural Speech Enhancement

Authors: Shibo Wei, Ting Jiang

Abstract:

Convolutional-recurrent neural networks (CRN) have achieved much success recently in the speech enhancement field. The common processing method is to use convolution layers to compress the feature space through multiple downsampling steps and then model the compressed features with an LSTM layer. Finally, the enhanced speech is obtained by deconvolution operations that integrate the global information of the speech sequence. However, the feature space compression process may cause a loss of information, so we propose to model the downsampling result of each step with a residual LSTM layer, then join it with the output of the deconvolution layer and feed both to the next deconvolution layer. In this way, we aim to better integrate the global information of the speech sequence. The experimental results show that the network model we introduce (RES-CRN) achieves better performance than an LSTM without residual connections and than simply stacking LSTMs in the original CRN, in terms of scale-invariant signal-to-noise ratio (SI-SNR), speech quality (PESQ), and intelligibility (STOI).
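
The SI-SNR metric named in the abstract can be illustrated with a minimal sketch. It follows the commonly used definition (zero-mean signals, projection of the estimate onto the clean target); the function name `si_snr` and the `eps` stabiliser are illustrative assumptions, not taken from the paper.

```python
import math

def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant signal-to-noise ratio in dB (hypothetical helper)."""
    if len(estimate) != len(target):
        raise ValueError("signals must have the same length")
    n = len(target)
    # Remove the mean so a constant offset does not affect the score.
    est_mean = sum(estimate) / n
    tgt_mean = sum(target) / n
    est = [x - est_mean for x in estimate]
    tgt = [x - tgt_mean for x in target]
    # Project the estimate onto the target: the part that is "signal".
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt) + eps
    scale = dot / tgt_energy
    s_target = [scale * t for t in tgt]
    # Whatever is left after the projection is counted as noise.
    e_noise = [e - s for e, s in zip(est, s_target)]
    num = sum(s * s for s in s_target)
    den = sum(e * e for e in e_noise) + eps
    return 10.0 * math.log10(num / den + eps)
```

Because the estimate is projected onto the target before the ratio is taken, rescaling the estimate leaves the score essentially unchanged, which is the property the "scale-invariant" name refers to.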

Keywords: convolutional-recurrent neural networks, speech enhancement, residual LSTM, SI-SNR

Procedia PDF Downloads 168
1076 Integration of an Evidence-Based Medicine Curriculum into Physician Assistant Education: Teaching for Today and the Future

Authors: Martina I. Reinhold, Theresa Bacon-Baguley

Abstract:

Background: Medical knowledge continuously evolves, and to help health care providers stay up-to-date, evidence-based medicine (EBM) has emerged as a model. The practice of EBM requires new skills of the health care provider, including directed literature searches, the critical evaluation of research studies, and the direct application of the findings to patient care. This paper describes the integration and evaluation of an evidence-based medicine course sequence into a Physician Assistant curriculum. This course sequence teaches students to manage and use the best clinical research evidence to competently practice medicine. A survey was developed to assess the outcomes of the EBM course sequence. Methodology: The cornerstone of the three-semester sequence of EBM is interactive small group discussions that are designed to introduce students to the most clinically applicable skills to identify, manage and use the best clinical research evidence to improve the health of their patients. During the three-semester sequence, the students are assigned each semester to participate in small group discussions that are facilitated by faculty with varying backgrounds and expertise. Prior to the start of the first EBM course in the winter semester, PA students complete a knowledge-based survey that was developed by the authors to assess the effectiveness of the course series. The survey consists of 53 Likert scale questions that address the nine objectives for the course series. At the end of the three-semester course series, the same survey is given to all students in the program, and the results from before and after the sequence of EBM courses are compared. Specific attention is paid to the overall performance of students in the nine course objectives.
Results: We find that students from the Classes of 2016 and 2017 consistently improve (as measured by percent correct responses on the survey tool) after the EBM course series (Class of 2016: Pre- 62% Post- 75%; Class of 2017: Pre- 61% Post- 70%). The biggest increase in knowledge was observed in the areas of finding and evaluating the evidence, with asking concise clinical questions (Class of 2016: Pre- 61% Post- 81%; Class of 2017: Pre- 61% Post- 75%) and searching the medical database (Class of 2016: Pre- 24% Post- 65%; Class of 2017: Pre- 35% Post- 66%). Questions requiring students to analyze, evaluate and report on the available clinical evidence regarding diagnosis showed improvement, but to a lesser extent (Class of 2016: Pre- 56% Post- 77%; Class of 2017: Pre- 56% Post- 61%). Conclusions: Outcomes showed that students did gain skills that will allow them to apply EBM principles. In addition, the outcomes of the knowledge-based survey allowed the faculty to focus on areas needing improvement, specifically the translation of best evidence into patient care. To address this area, the clinical faculty developed case scenarios that were incorporated into the lecture and discussion sessions, allowing students to better connect the research studies with patient care. Students commented that ‘class discussion and case examples’ contributed most to their learning and that ‘it was helpful to learn how to develop research questions and how to analyze studies and their significance to a potential client’. As evidenced by the outcomes, the EBM courses achieved the goals of the course and were well received by the students.

Keywords: evidence-based medicine, clinical education, assessment tool, physician assistant

Procedia PDF Downloads 104
1075 The Golden Ratio as a Common ‘Topos’ of Architectural, Musical and Stochastic Research of Iannis Xenakis

Authors: Nikolaos Mamalis

Abstract:

The work of the eminent architect and composer has undoubtedly been influenced both by his architecture and collaboration with Le Corbusier and by the conquests of the musical avant-garde of the 20th century (Schoenberg, Messiaen, Bartók, electroacoustic music). It is known that the golden mean and the Fibonacci sequence played a momentous role in the architectural avant-garde (Modulor) and expanded into musical pursuits. Especially in the 1950s (serialism), it was a structural tool for composition. Xenakis' architectural and musical work (Sacrifice, Metastasis, Rebonds, etc.) received the influence of the Golden Section, as has been repeatedly demonstrated. However, the idea of this retrospective sequence and the reflection raised by the search for new proportions, both in the architectural and the musical work of Xenakis, was not limited to constituting a step, a workable formula that acted unifyingly with regard to the other parameters of the musical work, or an aesthetic model that makes sense - philosophically and poetically - of an anthropocentric dimension as in other composers (see Luigi Nono); it triggered a qualitative leap, an opening of the composer to the assimilation of mathematical concepts and scientific types in music and the consolidation of new sound horizons of stochastic music.

Keywords: golden ratio, music, space, stochastic music

Procedia PDF Downloads 18
1074 Charge Transport in Biological Molecules

Authors: E. L. Albuquerque, U. L. Fulco, G. S. Ourique

Abstract:

The focus of this work is on the numerical investigation of the charge transport properties of the de novo-designed alpha3 polypeptide, as well as of its variants, all of them probed by gene engineering. The theoretical framework makes use of a tight-binding model Hamiltonian, together with ab-initio calculations within quantum chemistry simulation. The alpha3 polypeptide is a 21-residue peptide with three repeats of the seven-residue amino acid sequence Leu-Glu-Thr-Leu-Ala-Lys-Ala, forming an alpha-helical bundle structure. Its variants are obtained by Ala→Gln substitution at the e (5th) and g (7th) positions, respectively, of the alpha3 polypeptide amino acid sequence. Using transmission electron microscopy and atomic force microscopy, it was observed that the alpha3 polypeptide and one of its variants do have the ability to form fibrous assemblies, while the other does not. Our main aim is to investigate whether or not the biased alpha3 polypeptide and its variants can also be identified by quantum charge transport measurements through current-voltage (IxV) curves as a pattern to characterize their fibrous assemblies. It was observed that each peptide has a characteristic current pattern, which may be distinguished by charge transport measurements, suggesting that it might be a useful tool for the development of biosensors.
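
As a small illustration of the sequences described in the abstract, the sketch below assembles the 21-residue alpha3 peptide from three heptad repeats and derives the two Ala→Gln variants. The helper name `build_peptide` is hypothetical, and applying the substitution in every repeat is an assumption; the abstract does not say whether one or all repeats are modified.

```python
# Heptad repeat from the abstract; positions 1..7 correspond to a..g.
HEPTAD = ["Leu", "Glu", "Thr", "Leu", "Ala", "Lys", "Ala"]

def build_peptide(repeats=3, substitutions=None):
    """Return the residue list for `repeats` copies of the heptad.

    `substitutions` maps a 1-based heptad position to a replacement residue.
    Assumption: the substitution is applied in every repeat.
    """
    unit = list(HEPTAD)
    for position, residue in (substitutions or {}).items():
        unit[position - 1] = residue
    return unit * repeats

alpha3 = build_peptide()
variant_e = build_peptide(substitutions={5: "Gln"})  # Ala -> Gln at position e
variant_g = build_peptide(substitutions={7: "Gln"})  # Ala -> Gln at position g

print("-".join(alpha3))
print("-".join(variant_e))
print("-".join(variant_g))
```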

Keywords: charge transport properties, electronic transmittance, current-voltage characteristics, biological sensor

Procedia PDF Downloads 645
1073 Optimal Design of Composite Patch for a Cracked Pipe by Utilizing Genetic Algorithm and Finite Element Method

Authors: Mahdi Fakoor, Seyed Mohammad Navid Ghoreishi

Abstract:

Composite patching is a common way of reinforcing cracked pipes and cylinders. The effects of composite patch reinforcement on the fracture parameters of a cracked pipe depend on a variety of parameters, such as the number of layers and the angle, thickness, and material of each layer. Therefore, stacking sequence optimization of the composite patch becomes crucial for applications involving cracked pipes. In this study, in order to obtain the optimal stacking sequence for a composite patch that has minimum weight and maximum resistance to crack propagation, a coupled Multi-Objective Genetic Algorithm (MOGA) and Finite Element Method (FEM) process is proposed. This optimization process was carried out for longitudinal and transverse semi-elliptical cracks, and the optimal stacking sequences and Pareto front for each kind of crack are presented. The proposed algorithm is validated against results collected from the existing literature.
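
The Pareto front mentioned in the abstract can be sketched for the two stated objectives, minimum weight and maximum crack resistance. This is only a brute-force non-dominated filter, not the MOGA/FEM process itself; the function name and the sample (weight, resistance) values are hypothetical.

```python
def pareto_front(designs):
    """Return the non-dominated (weight, resistance) pairs.

    A design dominates another if it is no heavier and no less resistant,
    and strictly better in at least one of the two objectives.
    """
    front = []
    for i, (w_i, r_i) in enumerate(designs):
        dominated = any(
            (w_j <= w_i and r_j >= r_i) and (w_j < w_i or r_j > r_i)
            for j, (w_j, r_j) in enumerate(designs)
            if j != i
        )
        if not dominated:
            front.append((w_i, r_i))
    return sorted(front)

# Hypothetical candidate patches: (weight, crack resistance), arbitrary units.
candidates = [(1.0, 2.0), (2.0, 3.0), (2.5, 2.5), (3.0, 5.0), (1.5, 1.0)]
print(pareto_front(candidates))
```

In this toy set, (2.5, 2.5) is dropped because (2.0, 3.0) is both lighter and more resistant, and (1.5, 1.0) is dropped because (1.0, 2.0) beats it on both counts.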

Keywords: multi objective optimization, pareto front, composite patch, cracked pipe

Procedia PDF Downloads 286
1072 Negative Sequence-Based Protection Techniques for Microgrid Connected Power Systems

Authors: Isabelle Snyder, Travis Smith

Abstract:

Microgrid protection presents challenges to conventional protection techniques due to the low-induced fault current. Protection relays present in microgrid applications require a combination of settings groups to adjust based on the architecture of the microgrid in islanded and grid-connected modes. In a radial system where the microgrid is at the other end of the feeder, directional elements can be used to identify the direction of the fault current and switch settings groups accordingly (grid-connected or microgrid-connected). However, with multiple microgrid connections, this concept becomes more challenging, and the direction of the current alone is not sufficient to identify the source of the fault current contribution. ORNL has previously developed adaptive relaying schemes through other DOE-funded research projects that will be evaluated and used as a baseline for this research. The four protection techniques in this study are labeled as follows: (1) Adaptive Current only Protection System (ACPS), (2) Intentional Unbalanced Control for Protection Control (IUCPC), (3) Adaptive Protection System with Communication Controller (APSCC), and (4) Adaptive Model-Driven Protective Relay (AMDPR).

Keywords: adaptive relaying, microgrid protection, sequence components, islanding detection

Procedia PDF Downloads 37
1071 Whole Exome Sequencing Data Analysis of Rare Diseases: Non-Coding Variants and Copy Number Variations

Authors: S. Fahiminiya, J. Nadaf, F. Rauch, L. Jerome-Majewska, J. Majewski

Abstract:

Background: Sequencing of the protein-coding regions of the human genome (Whole Exome Sequencing; WES) has demonstrated great success in the identification of causal mutations for several rare genetic disorders in humans. Generally, most WES studies have focused on rare variants in coding exons and splice sites, where missense substitutions lead to alteration of the protein product. Although focusing on this category of variants has revealed the mystery behind many inherited genetic diseases in recent years, a subset of them still remains inconclusive. Here, we present the results of our WES studies in which analyzing only rare variants in coding regions was not conclusive, but further investigation revealed the involvement of non-coding variants and copy number variations (CNVs) in the etiology of the diseases. Methods: Whole exome sequencing was performed using our standard protocols at Genome Quebec Innovation Center, Montreal, Canada. All bioinformatics analyses were done using an in-house WES pipeline. Results: To date, we have successfully identified several disease-causing mutations within gene coding regions (e.g. SCARF2: Van den Ende-Gupta syndrome and SNAP29: 22q11.2 deletion syndrome) by using WES. In addition, we showed that variants in non-coding regions and CNVs also have important value and should not be ignored and/or filtered out during bioinformatics analysis of WES data. For instance, in patients with osteogenesis imperfecta type V and in patients with glucocorticoid deficiency, we identified variants in the 5'UTR, resulting in the production of longer or truncated non-functional proteins. Furthermore, CNVs were identified as the main cause of the diseases in patients with metaphyseal dysplasia with maxillary hypoplasia and brachydactyly and in patients with osteogenesis imperfecta type VII.
Conclusions: Our study highlights the importance of considering non-coding variants and CNVs during interpretation of WES data, as they can be the only cause of disease under investigation.

Keywords: whole exome sequencing data, non-coding variants, copy number variations, rare diseases

Procedia PDF Downloads 387
1070 Performance Analysis and Comparison of Various 1-D and 2-D Prime Codes for OCDMA Systems

Authors: Gurjit Kaur, Shashank Johri, Arpit Mehrotra

Abstract:

In this paper, we have analyzed and compared the performance of various coding schemes. The basic 1D prime sequence codes are unique in only one dimension, i.e. time slots, whereas 2D coding techniques are unique not only in their time slots but also in their wavelengths. In this research, we have evaluated and compared the performance of 1D and 2D coding techniques constructed using the prime sequence coding pattern for an OCDMA system on a single platform. Results show that the 1D Extended Prime Code (EPC) can support a larger number of active users than the other codes, but at the expense of a larger code length, which further increases the complexity of the code. The Modified Prime Code (MPC) supports fewer active users at λc=2, but it has a shorter code length than the 1D prime code. Analysis shows that 2D prime codes support fewer active users than 1D codes, but they have a large code family and are the most secure of the codes compared. The performance of all these codes is analyzed on the basis of the number of active users supported at a Bit Error Rate (BER) of 10⁻⁹.
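
For readers unfamiliar with the prime sequence coding pattern, here is a minimal sketch of the textbook 1D prime code construction over GF(p): sequence i has elements (i·j mod p), and each element places a single pulse inside one block of p time slots. It is assumed, not confirmed by the abstract, that this is the base construction the authors use; the function names are illustrative.

```python
def prime_sequence(p, i):
    """Prime sequence S_i = (i*j mod p) for j = 0..p-1, with p prime."""
    return [(i * j) % p for j in range(p)]

def prime_code(p, i):
    """Binary codeword of length p*p and weight p built from S_i.

    Block j (time slots j*p .. j*p + p - 1) holds one pulse at offset S_i[j].
    """
    code = [0] * (p * p)
    for j, offset in enumerate(prime_sequence(p, i)):
        code[j * p + offset] = 1
    return code

p = 5
family = [prime_code(p, i) for i in range(p)]
for i, word in enumerate(family):
    pulses = [slot for slot, bit in enumerate(word) if bit]
    print(f"code {i}: pulses at slots {pulses}")
```

Each codeword has length p² and weight p, so the family size and the code length both grow with the chosen prime, which is the trade-off between supported users and code length that the abstract discusses.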

Keywords: CDMA, OCDMA, BER, OOC, PC, EPC, MPC, 2-D PC/PC, λc, λa

Procedia PDF Downloads 478
1069 Unsupervised Classification of DNA Barcodes Species Using Multi-Library Wavelet Networks

Authors: Abdesselem Dakhli, Wajdi Bellil, Chokri Ben Amar

Abstract:

A DNA barcode is a short mitochondrial DNA fragment whose nucleotides are made up of three subunits: a phosphate group, a sugar, and a nucleic base (A, T, C, or G). DNA barcodes provide a good source of the information needed to classify living species. This intuition has been confirmed by many experimental results. Species classification with DNA barcode sequences has been studied by several researchers. The classification problem assigns unknown species to known ones by analyzing their barcodes. This task has to be supported by reliable methods and algorithms. To analyze species regions or entire genomes, it becomes necessary to use sequence similarity methods. A large set of sequences can be compared simultaneously using Multiple Sequence Alignment, which is known to be NP-complete. To make this type of analysis feasible, heuristics such as progressive alignment have been developed. Another tool for similarity search against a database of sequences is BLAST, which outputs shorter regions of high similarity between a query sequence and matched sequences in the database. However, all these methods are still computationally very expensive and require significant computational infrastructure. Our goal is to build predictive models that are highly accurate and interpretable. This approach avoids the complex problem of form and structure in different classes of organisms. The models are evaluated on empirical data, and their classification performance is compared with other methods. Our system consists of three phases. The first is called transformation, which is composed of three steps: Electron-Ion Interaction Pseudopotential (EIIP) for the codification of DNA barcodes, Fourier Transform, and Power Spectrum Signal Processing. The second is called approximation, which is empowered by the use of Multi-Library Wavelet Neural Networks (MLWNN). The third is called the classification of DNA barcodes, which is realized by applying a hierarchical classification algorithm.
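
The transformation phase described above (EIIP coding, Fourier transform, power spectrum) can be sketched as follows. The EIIP values are the commonly published ones for the four nucleotides; the function names and the example sequence are illustrative assumptions, and a plain O(n²) DFT is used only to keep the sketch dependency-free.

```python
import cmath

# Commonly published EIIP values for the four nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}

def eiip_encode(sequence):
    """Map a DNA string to its numeric EIIP signal."""
    return [EIIP[base] for base in sequence.upper()]

def power_spectrum(signal):
    """Squared magnitude of the discrete Fourier transform of `signal`."""
    n = len(signal)
    spectrum = []
    for k in range(n):
        x_k = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        spectrum.append(abs(x_k) ** 2)
    return spectrum

# Hypothetical barcode fragment, for illustration only.
fragment = "ACGTTGCA"
spectrum = power_spectrum(eiip_encode(fragment))
print([round(value, 6) for value in spectrum])
```

The resulting spectrum is a fixed-length numeric description of the fragment, the kind of input the approximation phase would then consume.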

Keywords: DNA barcode, electron-ion interaction pseudopotential, Multi Library Wavelet Neural Networks (MLWNN)

Procedia PDF Downloads 286
1068 Investigations of Protein Aggregation Using Sequence and Structure Based Features

Authors: M. Michael Gromiha, A. Mary Thangakani, Sandeep Kumar, D. Velmurugan

Abstract:

The main cause of several neurodegenerative diseases, such as Alzheimer's, Parkinson's, and spongiform encephalopathies, is the formation of amyloid fibrils and plaques in proteins. We have analyzed different sets of proteins and peptides to understand the influence of sequence-based features on the protein aggregation process. The comparison of 373 pairs of homologous mesophilic and thermophilic proteins showed that aggregation-prone regions (APRs) are present in both. However, the thermophilic protein monomers show a greater ability to ‘stow away’ the APRs in their hydrophobic cores and protect them from solvent exposure. The comparison of amyloid-forming and amorphous β-aggregating hexapeptides suggested distinct preferences for specific residues at the six positions as well as for all possible combinations of nine residue pairs. The compositions of residues at different positions and of residue pairs have been converted into energy potentials and utilized for distinguishing between amyloid-forming and amorphous β-aggregating peptides. Our method could correctly identify the amyloid-forming peptides with an accuracy of 95-100% in different datasets of peptides.

Keywords: aggregation, amyloids, thermophilic proteins, amino acid residues, machine learning techniques

Procedia PDF Downloads 581
1067 Reusing Assessments Tests by Generating Arborescent Test Groups Using a Genetic Algorithm

Authors: Ovidiu Domşa, Nicolae Bold

Abstract:

Using Information and Communication Technologies (ICT) notions in education and in the three basic processes of education (teaching, learning and assessment) can bring benefits to pupils and to the professional development of teachers. In this matter, we refer to these notions as concepts taken from the informatics area and apply them to the domain of education. These notions refer to genetic algorithms and arborescent structures, used in the specific process of assessment or evaluation. This paper uses these kinds of notions to generate subtrees from a main tree of tests that are related to one another by their degree of difficulty. These subtrees must contain the highest number of connections between the nodes and the lowest number of missing edges (which are subtrees of the main tree) and, in the particular case where no subtree without missing edges exists, the subtrees that have the lowest (minimal) number of missing edges between the nodes, where a node is a test and an edge is a direct connection between two tests that differ by one degree of difficulty. The subtrees are represented as sequences. The tests are the same (a number coding a test represents that test in every sequence), and they are reused for each sequence of tests.

Keywords: chromosome, genetic algorithm, subtree, test

Procedia PDF Downloads 294
1066 Enhancing Sewage Sludge Management through Integrated Hydrothermal Liquefaction and Anaerobic Digestion: A Comparative Study

Authors: Harveen Kaur Tatla, Parisa Niknejad, Rajender Gupta, Bipro Ranjan Dhar, Mohd. Adana Khan

Abstract:

Sewage sludge management presents a pressing challenge in the realm of wastewater treatment, calling for sustainable and efficient solutions. This study explores the integration of Hydrothermal Liquefaction (HTL) and Anaerobic Digestion (AD) as a promising approach to address the complexities associated with sewage sludge treatment. The integration of these two processes offers a complementary and synergistic framework, allowing for the mitigation of inherent limitations, thereby enhancing overall efficiency, product quality, and the comprehensive utilization of sewage sludge. In this research, we investigate the optimal sequencing of HTL and AD within the treatment framework, aiming to discern which sequence, whether HTL followed by AD or AD followed by HTL, yields superior results. We explore a range of HTL working temperatures, including 250°C, 300°C, and 350°C, coupled with residence times of 30 and 60 minutes. To evaluate the effectiveness of each sequence, a battery of tests is conducted on the resultant products, encompassing Total Ammonia Nitrogen (TAN), Chemical Oxygen Demand (COD), and Volatile Fatty Acids (VFA). Additionally, elemental analysis is employed to determine which sequence maximizes energy recovery. Our findings illuminate the intricate dynamics of HTL and AD integration for sewage sludge management, shedding light on the temperature-residence time interplay and its impact on treatment efficiency. This study not only contributes to the optimization of sewage sludge treatment but also underscores the potential of integrated processes in sustainable waste management strategies. The insights gleaned from this research hold promise for advancing the field of wastewater treatment and resource recovery, addressing critical environmental and energy challenges.

Keywords: Anaerobic Digestion (AD), aqueous phase, energy recovery, Hydrothermal Liquefaction (HTL), sewage sludge management, sustainability.

Procedia PDF Downloads 45
1065 Quality Management in Construction Project

Authors: Harsh Panchal, Saurabh Amrutkar

Abstract:

Quality management is an essential part of any project and is directly related to the performance of a project. Quality management depends on multiple factors at different stages of a project, from time management to construction logistics. A project is a mixture of various components that include itinerary management, health and safety, crew productivity, and many more. From the survey conducted, we came to the conclusion that advancement in technology and an indigenous approach to any project will result in maximum quality standards and better project performance. In this paper, we discuss various components of the factors above that compromise the quality of a project and how they can be controlled in order to achieve maximum quality assurance using quality planning and total quality management. The paper also focuses on the limitations and problems faced in each factor responsible for quality management and on tackling them using techniques and processes based on activities and on identifying the sequence, critical path, and duration. The project management concept that deals with the sequence of scope, cost, and time gives us an overview of the ongoing quality management and, in a nutshell, gives us hints for regulating the current procedure for maximum achievable quality. It also deals with the problems faced by engineers that make the mundane work process slow, reducing the quality outcome drastically.

Keywords: management, performance, project, quality

Procedia PDF Downloads 123
1064 Effect of Bacillus Pumilus Strains on Heavy Metal Accumulation in Lettuce Grown on Contaminated Soil

Authors: Sabeen Alam, Mehboob Alam

Abstract:

The research work entitled “Effect of Bacillus pumilus strains on heavy metal accumulation in lettuce grown on contaminated soil” focused on the functional role of Bacillus pumilus strains, inoculated onto lettuce seed, in mitigating heavy metals in chromite mining soil. In this experiment, factor A was three Bacillus pumilus strains (sequences C-2 PMW-8, C-1 SSK-8 and C-1 PWK-7). The soil used for this experiment was collected from the Prang Ghar mining site, and lettuce seeds were grown in three levels of chromite mining soil (2.27, 4.65 and 7.14 %). Minimum days to germination were noted in lettuce grown on garden soil inoculated with sequence. The maximum germination percentage was noted for C-1 SSK-8 grown on garden soil, maximum lettuce height for sequence C-2 PMW-8, fresh leaf weight for C-1 PWK-7 inoculated lettuce, dry weight of lettuce leaf for lettuce inoculated with the C-1 SSK-8 and C-1 PWK-7 strains, number of leaves per plant for lettuce inoculated with C-1 SSK-8, leaf area for C-2 PMW-8 inoculated lettuce, survival percentage for C-1 SSK-8 treated lettuce, and chlorophyll content for C-2 PMW-8. Results for heavy metal accumulation showed minimum chromium (Cr), cadmium (Cd), manganese (Mn), and lead (Pb) in lettuce and in soil for all three sequences. It can be concluded that chromite mining soil significantly reduced the growth and survival of lettuce, but inoculating the lettuce with Bacillus pumilus strains enhanced its growth and survival. Similarly, heavy metal accumulation in plant and soil was minimal regardless of the Bacillus pumilus strain used; all three sequences had the same mitigating effect on heavy metals in both soil and lettuce. All three Bacillus pumilus strains ensured a reduction in heavy metal content (Mn, Cd, Cr) in lettuce to below the maximum permissible limits of WHO 2011.

Keywords: bacillus pumilus, heavy metals, permissible limits, lettuce, chromite mining soil, mitigating effect

Procedia PDF Downloads 17
1063 Sequence Component-Based Adaptive Protection for Microgrids Connected Power Systems

Authors: Isabelle Snyder

Abstract:

Microgrid protection presents challenges to conventional protection techniques due to the low induced fault current. Protection relays present in microgrid applications require a combination of settings groups to adjust based on the architecture of the microgrid in islanded and grid-connected modes. In a radial system where the microgrid is at the other end of the feeder, directional elements can be used to identify the direction of the fault current and switch settings groups accordingly (grid-connected or microgrid-connected). However, with multiple microgrid connections, this concept becomes more challenging, and the direction of the current alone is not sufficient to identify the source of the fault current contribution. ORNL has previously developed adaptive relaying schemes through other DOE-funded research projects that will be evaluated and used as a baseline for this research. The four protection techniques in this study are the following: (1) Adaptive Current only Protection System (ACPS), (2) Intentional Unbalanced Control for Protection Control (IUCPC), (3) Adaptive Protection System with Communication Controller (APSCC), and (4) Adaptive Model-Driven Protective Relay (AMDPR). The first two methods identify the islanding condition at the relay without communication, by monitoring the current sequence component generated by the system (ACPS) or induced by inverter control during islanded mode (IUCPC), and adjust the settings accordingly. These two methods are used as a backup to the APSCC, which relies on a communication network to communicate the islanded configuration to the system components. The fourth method relies on a short circuit model inside the relay that is used in conjunction with communication to adjust the system configuration, compute the fault current, and adjust the settings accordingly.
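
The current sequence components that the first two methods monitor can be obtained with the standard symmetrical-component (Fortescue) transform. The sketch below shows only that transform; it does not reproduce the ACPS or IUCPC logic, and the function name and example phasors are illustrative assumptions.

```python
import cmath
import math

# Rotation operator a = 1 at an angle of 120 degrees.
A = cmath.exp(2j * math.pi / 3)

def sequence_components(i_a, i_b, i_c):
    """Return (zero, positive, negative) sequence phasors of three phase currents."""
    i_zero = (i_a + i_b + i_c) / 3
    i_pos = (i_a + A * i_b + A * A * i_c) / 3
    i_neg = (i_a + A * A * i_b + A * i_c) / 3
    return i_zero, i_pos, i_neg

# Balanced a-b-c set: only the positive sequence is non-zero.
balanced = sequence_components(1 + 0j, A * A, A)

# Hypothetical unbalanced case: phase b current lost.
unbalanced = sequence_components(1 + 0j, 0j, A)

print([round(abs(x), 4) for x in balanced])
print([round(abs(x), 4) for x in unbalanced])
```

A balanced set produces no negative-sequence current, so a non-zero negative-sequence magnitude is a signature of unbalance that a relay can watch for locally, without any communication channel.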

Keywords: adaptive relaying, microgrid protection, sequence components, islanding detection, communication controlled protection, integrated short circuit model

Procedia PDF Downloads 56
1062 Genomic Resilience and Ecological Vulnerability in Coffea Arabica: Insights from Whole Genome Resequencing at Its Center of Origin

Authors: Zewdneh Zana Zate

Abstract:

The study focuses on the evolutionary and ecological genomics of both wild and cultivated Coffea arabica L. at its center of origin, Ethiopia, aiming to uncover how this vital species may withstand future climate changes. Utilizing bioclimatic models, we project the future distribution of Arabica under varied climate scenarios for 2050 and 2080, identifying potential conservation zones and immediate risk areas. Through whole-genome resequencing of accessions from Ethiopian gene banks, this research assesses genetic diversity and divergence between wild and cultivated populations. It explores relationships, demographic histories, and potential hybridization events among Coffea arabica accessions to better understand the species' origins and its connection to parental species. This genomic analysis also seeks to detect signs of natural or artificial selection across populations. Integrating these genomic discoveries with ecological data, the study evaluates the current and future ecological and genomic vulnerabilities of wild Coffea arabica, emphasizing necessary adaptations for survival. We have identified key genomic regions linked to environmental stress tolerance, which could be crucial for breeding more resilient Arabica varieties. Additionally, our ecological modeling predicted a contraction of suitable habitats, urging immediate conservation actions in identified key areas. This research not only elucidates the evolutionary history and adaptive strategies of Arabica but also informs conservation priorities and breeding strategies to enhance resilience to climate change. By synthesizing genomic and ecological insights, we provide a robust framework for developing effective management strategies aimed at sustaining Coffea arabica, a species of profound global importance, in its native habitat under evolving climatic conditions.

Keywords: coffea arabica, climate change adaptation, conservation strategies, genomic resilience

Procedia PDF Downloads 3
1061 Detection, Analysis and Determination of the Origin of Copy Number Variants (CNVs) in Intellectual Disability/Developmental Delay (ID/DD) Patients and Autistic Spectrum Disorders (ASD) Patients by Molecular and Cytogenetic Methods

Authors: Pavlina Capkova, Josef Srovnal, Vera Becvarova, Marie Trkova, Zuzana Capkova, Andrea Stefekova, Vaclava Curtisova, Alena Santava, Sarka Vejvalkova, Katerina Adamova, Radek Vodicka

Abstract:

ASDs are heterogeneous and complex developmental diseases with a significant genetic background. Recurrent CNVs are known to be a frequent cause of ASD. These CNVs can have, however, a variable expressivity which results in a spectrum of phenotypes from asymptomatic to ID/DD/ASD. ASD is associated with ID in ~75% individuals. Various platforms are used to detect pathogenic mutations in the genome of these patients. The performed study is focused on a determination of the frequency of pathogenic mutations in a group of ASD patients and a group of ID/DD patients using various strategies along with a comparison of their detection rate. The possible role of the origin of these mutations in aetiology of ASD was assessed. The study included 35 individuals with ASD and 68 individuals with ID/DD (64 males and 39 females in total), who underwent rigorous genetic, neurological and psychological examinations. Screening for pathogenic mutations involved karyotyping, screening for FMR1 mutations and for metabolic disorders, a targeted MLPA test with probe mixes Telomeres 3 and 5, Microdeletion 1 and 2, Autism 1, MRX and a chromosomal microarray analysis (CMA) (Illumina or Affymetrix). Chromosomal aberrations were revealed in 7 (1 in the ASD group) individuals by karyotyping. FMR1 mutations were discovered in 3 (1 in the ASD group) individuals. The detection rate of pathogenic mutations in ASD patients with a normal karyotype was 15.15% by MLPA and CMA. The frequencies of the pathogenic mutations were 25.0% by MLPA and 35.0% by CMA in ID/DD patients with a normal karyotype. CNVs inherited from asymptomatic parents were more abundant than de novo changes in ASD patients (11.43% vs. 5.71%) in contrast to the ID/DD group where de novo mutations prevailed over inherited ones (26.47% vs. 16.18%). ASD patients shared more frequently their mutations with their fathers than patients from ID/DD group (8.57% vs. 1.47%). 
Maternally inherited mutations predominated in the ID/DD group in comparison with the ASD group (14.7% vs. 2.86 %). CNVs of an unknown significance were found in 10 patients by CMA and in 3 patients by MLPA. Although the detection rate is the highest when using CMA, recurrent CNVs can be easily detected by MLPA. CMA proved to be more efficient in the ID/DD group where a larger spectrum of rare pathogenic CNVs was revealed. This study determined that maternally inherited highly penetrant mutations and de novo mutations more often resulted in ID/DD without ASD in patients. The paternally inherited mutations could be, however, a source of the greater variability in the genome of the ASD patients and contribute to the polygenic character of the inheritance of ASD. As the number of the subjects in the group is limited, a larger cohort is needed to confirm this conclusion. Inherited CNVs have a role in aetiology of ASD possibly in combination with additional genetic factors - the mutations elsewhere in the genome. The identification of these interactions constitutes a challenge for the future. Supported by MH CZ – DRO (FNOl, 00098892), IGA UP LF_2016_010, TACR TE02000058 and NPU LO1304.

Keywords: autistic spectrum disorders, copy number variant, chromosomal microarray, intellectual disability, karyotyping, MLPA, multiplex ligation-dependent probe amplification

Procedia PDF Downloads 325
1060 Robust Data Image Watermarking for Data Security

Authors: Harsh Vikram Singh, Ankur Rai, Anand Mohan

Abstract:

In this paper, we propose a secure and robust data hiding algorithm based on the DCT, the Arnold transform, and a chaotic sequence. The watermark image is scrambled by the Arnold cat map to increase its security, and the chaotic map is then used to spread the watermark signal in the middle band of the DCT coefficients of the cover image. The chaotic map can be used as a pseudo-random generator for digital data hiding, to increase security and robustness. Robustness and imperceptibility of the proposed algorithm were evaluated using bit error rate (BER), normalized correlation (NC), and peak signal-to-noise ratio (PSNR) for different watermark and cover images, such as the Lena, Girl, and Tank images, and for different gain factors. We use a binary logo image and a text image as watermarks. The experimental results demonstrate that the proposed algorithm achieves higher security and robustness against JPEG compression, as well as other attacks such as addition of noise, low-pass filtering, and cropping, compared to other existing algorithms using DCT coefficients. Moreover, the proposed algorithm does not need the original cover image to recover the watermarks.
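The scrambling step mentioned in the abstract can be sketched as follows. This is a minimal illustration of the classical Arnold cat map on a square image, not the authors' implementation: the function names, the nested-list image representation, and the iteration count are assumptions, and the DCT embedding and chaotic spreading steps are not shown.

```python
def arnold_cat_map(image, iterations=1):
    """Scramble a square N x N image (list of rows) with the Arnold cat map.

    Each pixel at (x, y) moves to ((x + y) mod N, (x + 2y) mod N).
    The iteration count acts as a simple key (an assumption of this sketch).
    """
    n = len(image)
    if any(len(row) != n for row in image):
        raise ValueError("the Arnold cat map needs a square image")
    current = [row[:] for row in image]
    for _ in range(iterations):
        scrambled = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                scrambled[(x + y) % n][(x + 2 * y) % n] = current[x][y]
        current = scrambled
    return current


def inverse_arnold_cat_map(image, iterations=1):
    """Undo arnold_cat_map by reading each pixel back from its mapped position."""
    n = len(image)
    current = [row[:] for row in image]
    for _ in range(iterations):
        restored = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                restored[x][y] = current[(x + y) % n][(x + 2 * y) % n]
        current = restored
    return current


def bit_error_rate(original_bits, extracted_bits):
    """Fraction of watermark bits that differ after extraction."""
    errors = sum(a != b for a, b in zip(original_bits, extracted_bits))
    return errors / len(original_bits)
```

Because the map is periodic for any image size, the iteration count should not be a multiple of that period, otherwise the "scrambled" watermark equals the original.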

Keywords: data hiding, watermarking, DCT, chaotic sequence, arnold transforms

Procedia PDF Downloads 484
1059 Effect of Media Osmolarity on Vi Biosynthesis on Salmonella enterica serovar Typhi Strain C6524 Cultured on Batch System

Authors: Dwi Arisandi Wijaya, Ernawati Arifin Giri-Rachman, Neni Nurainy

Abstract:

Typhoid fever can be prevented with a vaccine based on the Vi polysaccharide, a virulence factor of S. typhi. To produce a high yield of Vi polysaccharide from bacteria, it is important to understand the biosynthesis of Vi polysaccharide and the regulators involved. In vivo, S. typhi faces varying osmolarity, and the bacterial two-component system OmpR-EnvZ up- and down-regulates capsular Vi polysaccharide biosynthesis. A high-yielding Vi polysaccharide strain, S. typhi strain C6524, was used to study the effect of media osmolarity on Vi polysaccharide biosynthesis and the osmoregulation pattern of the strain. S. typhi strain C6524 was grown in a batch system on media with osmolarities of 50 mM, 100 mM, and 150 mM. Vi polysaccharide concentration was measured by ELISA. To further investigate the osmoregulation pattern of strain C6524, the osmoregulator gene, OmpR, was isolated and sequenced using primers specific to the OmpR gene. Nucleotide sequences were analyzed with BLAST and LALIGN, and amino acid sequences with Prosite and multiple sequence alignment. Cultivation gave average Vi polysaccharide contents of 11.49 μg/mL, 12.06 μg/mL, and 14.53 μg/mL at 50 mM, 100 mM, and 150 mM osmolarity, respectively. ANOVA showed that the 150 mM osmolarity treatment significantly affected Vi content. Nucleotide sequence analysis shows 100% identity between S. typhi strain C6524 and Ty2. Amino acid sequence analysis shows that the OmpR response regulator protein of strain C6524 also has the α4-β5-α5 motif, which is important for activating the regulatory system when it is phosphorylated by the kinase domain. This indicates that the osmolarity response regulator of S. typhi strain C6524 does not differ from that of S. typhi strain Ty2. The high Vi response in the 150 mM osmolarity treatment calls for further research on RcsB-RcsC, another two-component system involved in Vi biosynthesis.

Keywords: osmoregulator, OmpR, Salmonella, Vi polysaccharide

Procedia PDF Downloads 162
1058 The Role of Named Entity Recognition for Information Extraction

Authors: Girma Yohannis Bade, Olga Kolesnikova, Grigori Sidorov

Abstract:

Named entity recognition (NER) is a building block for information extraction. Although information extraction has been automated with a variety of techniques for finding and extracting relevant information from unstructured documents, discovering targeted knowledge still poses a number of research difficulties because of the variability and lack of structure in Web data. NER, a subtask of information extraction (IE), emerged to ease these difficulties. It finds proper names (named entities) in a document, such as names of persons, countries, locations, organizations, dates, and events, and assigns them to predetermined labels, which is an initial step in IE tasks. This survey paper presents the roles and importance of NER to IE from the perspective of different algorithms and application domains. It summarizes how researchers have implemented NER in particular application areas such as finance, medicine, defense, business, food science, archeology, and so on. It also outlines three types of sequence labeling algorithms for NER: feature-based, neural network-based, and rule-based. Finally, the state of the art and the evaluation metrics of NER are presented.

Keywords: the role of NER, named entity recognition, information extraction, sequence labeling algorithms, named entity application area

Procedia PDF Downloads 44
1057 SARS-CoV-2: Prediction of Critical Charged Amino Acid Mutations

Authors: Atlal El-Assaad

Abstract:

Viruses change with time through mutations and result in new variants that may persist or disappear. A mutation refers to an actual change in the virus genetic sequence, and a variant is a viral genome that may contain one or more mutations. Critical mutations may cause the virus to be more transmissible, with high disease severity, and more vulnerable to diagnostics, therapeutics, and vaccines. Thus, variants carrying such mutations may increase the risk to human health and are considered variants of concern (VOC). Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the positive-sense single-stranded RNA virus, contagious in humans, that caused coronavirus disease 2019 (COVID-19), has been studied thoroughly, and several variants were revealed across the world with their corresponding mutations. SARS-CoV-2 has four structural proteins, known as the S (spike), E (envelope), M (membrane), and N (nucleocapsid) proteins, but prior studies and vaccine development focused on genetic mutations in the S protein due to its vital role in allowing the virus to attach to and fuse with the membrane of a host cell. Specifically, subunit S1 catalyzes attachment, whereas subunit S2 mediates fusion. From this perspective, we studied all charged amino acid mutations of the SARS-CoV-2 viral spike protein S1 when bound to antibody CC12.1 in a crystal structure and assessed the effect of different mutations. We generated all missense mutants of SARS-CoV-2 protein amino acids (AAs) within the SARS-CoV-2:CC12.1 complex model. To generate the family of mutants in each complex, we mutated every charged amino acid to each of the other charged amino acids (Lysine (K), Arginine (R), Glutamic Acid (E), and Aspartic Acid (D)) and studied the new binding of the complex after each mutation. We applied Poisson-Boltzmann electrostatic calculations feeding into free energy calculations to determine the effect of each mutation on binding. 
After analyzing our data, we identified charged amino acids that are key for binding. Furthermore, we validated those findings against published experimental genetic data. Our results are the first to propose in silico potential life-threatening mutations of SARS-CoV-2 beyond the mutations present in the five common variants found worldwide.
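The mutant-generation step described in the abstract (every charged residue replaced by each of the other charged residues) can be sketched as below. This is only an illustration under stated assumptions: the function name, the label format, and the toy peptide in the usage note are hypothetical, and the electrostatic and free energy calculations are not reproduced.

```python
# One-letter codes of the four charged amino acids named in the abstract.
CHARGED = "KRED"


def charged_point_mutants(sequence):
    """List (label, mutant_sequence) for every charged-to-charged substitution.

    Labels use the common <original><position><replacement> style with
    1-based positions, e.g. "K2R" (a convention assumed for this sketch).
    """
    mutants = []
    for index, residue in enumerate(sequence):
        if residue not in CHARGED:
            continue  # only charged positions are mutated
        for replacement in CHARGED:
            if replacement == residue:
                continue  # skip the wild-type residue itself
            label = f"{residue}{index + 1}{replacement}"
            mutant = sequence[:index] + replacement + sequence[index + 1:]
            mutants.append((label, mutant))
    return mutants
```

For a hypothetical toy peptide such as "AKDG", which has two charged positions, this yields six single-point mutants, three per charged residue.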

Keywords: SARS-CoV-2, variant, ionic amino acid, protein-protein interactions, missense mutation, AESOP

Procedia PDF Downloads 73
1056 Multi-Objective Optimization of the Thermal-Hydraulic Behavior for a Sodium Fast Reactor with a Gas Power Conversion System and a Loss of off-Site Power Simulation

Authors: Avent Grange, Frederic Bertrand, Jean-Baptiste Droin, Amandine Marrel, Jean-Henry Ferrasse, Olivier Boutin

Abstract:

CEA and its industrial partners are designing a gas Power Conversion System (PCS) based on a Brayton cycle for the ASTRID Sodium-cooled Fast Reactor. Investigations of control and regulation requirements to operate this PCS during operating, incidental and accidental transients are necessary to adapt core heat removal. To this aim, we developed a methodology to optimize the thermal-hydraulic behavior of the reactor during normal operations, incidents and accidents. This methodology consists of a multi-objective optimization for a specific sequence, whose aim is to increase component lifetime by reducing simultaneously several thermal stresses and to bring the reactor into a stable state. Furthermore, the multi-objective optimization complies with safety and operating constraints. Operating, incidental and accidental sequences use specific regulations to control the thermal-hydraulic reactor behavior, each of them is defined by a setpoint, a controller and an actuator. In the multi-objective problem, the parameters used to solve the optimization are the setpoints and the settings of the controllers associated with the regulations included in the sequence. In this way, the methodology allows designers to define an optimized and specific control strategy of the plant for the studied sequence and hence to adapt PCS piloting at its best. The multi-objective optimization is performed by evolutionary algorithms coupled to surrogate models built on variables computed by the thermal-hydraulic system code, CATHARE2. The methodology is applied to a loss of off-site power sequence. Three variables are controlled: the sodium outlet temperature of the sodium-gas heat exchanger, turbomachine rotational speed and water flow through the heat sink. These regulations are chosen in order to minimize thermal stresses on the gas-gas heat exchanger, on the sodium-gas heat exchanger and on the vessel. The main results of this work are optimal setpoints for the three regulations. 
Moreover, Proportional-Integral-Derivative (PID) control setting is considered and efficient actuators used in controls are chosen through sensitivity analysis results. Finally, the optimized regulation system and the reactor control procedure, provided by the optimization process, are verified through a direct CATHARE2 calculation.

Keywords: gas power conversion system, loss of off-site power, multi-objective optimization, regulation, sodium fast reactor, surrogate model

Procedia PDF Downloads 279
1055 DNA-Based Gold Nanoprobe Biosensor to Detect Pork Contaminant

Authors: Rizka Ardhiyana, Liesbetini Haditjaroko, Sri Mulijani, Reki Ashadi Wicaksono, Raafqi Ranasasmita

Abstract:

Designing a sensitive, specific, and easy-to-use method to detect pork contamination in the food industry remains a major challenge. In the current study, we developed a sensitive thiol-bond AuNP-Probe biosensor that changes color when it detects pork DNA in the Cytochrome B region. The interaction between the biosensor and the DNA sample is measured by spectrophotometer at 540 nm. The biosensor is made by reducing gold with trisodium citrate to produce gold nanoparticles with a diameter of 39.05 nm. The AuNP-Probe biosensor (gold nanoprobe) achieved a limit of detection of 16.04 ng DNA/µl and a limit of quantification of 53.48 ng DNA/µl. The linearity (R²) between the change in color absorbance and DNA concentration is 0.9916. The biosensor has good specificity, as it does not cross-react with chicken or beef DNA. To verify specificity towards the target sequence, the target sequence was amplified by PCR and the PCR product was reacted with the biosensor. The PCR DNA isolate resulted in a 2.7-fold higher absorbance compared to the pork DNA isolate alone (without PCR). The sensitivity and specificity of the method show the promising application of the thiol-bond AuNP biosensor in pork detection.
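The abstract reports a linearity (R²), a limit of detection, and a limit of quantification but does not say how they were computed. The sketch below shows one common convention, assumed here rather than taken from the study: fit a straight calibration line, then scale the ratio of a noise standard deviation to the slope by fixed factors (3.3 and 10 are typical defaults; some laboratories use 3 and 10). All function and parameter names are illustrative.

```python
def fit_line(x_values, y_values):
    """Ordinary least-squares fit y = slope * x + intercept, plus R^2."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(x_values, y_values))
    ss_tot = sum((y - mean_y) ** 2 for y in y_values)
    return slope, intercept, 1.0 - ss_res / ss_tot


def detection_limits(sigma, slope, lod_factor=3.3, loq_factor=10.0):
    """Return (LOD, LOQ) as factor * sigma / slope.

    sigma is the standard deviation of the blank or of the regression
    residuals; the factors are conventions, not values from the abstract.
    """
    return lod_factor * sigma / slope, loq_factor * sigma / slope
```

With concentration on the x axis (ng DNA/µl) and absorbance change at 540 nm on the y axis, both limits come out in concentration units.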

Keywords: biosensor, DNA probe, gold nanoparticle (AuNP), pork meat, qPCR

Procedia PDF Downloads 328
1054 Partial M-Sequence Code Families Applied in Spectral Amplitude Coding Fiber-Optic Code-Division Multiple-Access Networks

Authors: Shin-Pin Tseng

Abstract:

Numerous spectral amplitude coding (SAC) fiber-optic code-division multiple-access (FO-CDMA) techniques are appealing because they can provide moderate security and relieve the effects of multiuser interference (MUI). Nonetheless, the performance of previous networks is degraded by a fixed in-phase cross-correlation (IPCC) value. To address this problem, a new SAC FO-CDMA network using partial M-sequence (PMS) code is presented in this study. Because the proposed PMS code originates from M-sequence code, a system using the PMS code can effectively suppress the effects of MUI. In addition, a two-code keying (TCK) scheme can be applied in the proposed SAC FO-CDMA network to enhance overall network performance. For system flexibility, simple optical encoders/decoders (codecs) using fiber Bragg gratings (FBGs) were also developed. First, we constructed a diagram of the SAC FO-CDMA network, including (N/2-1) optical transmitters, (N/2-1) optical receivers, and one N×N star coupler that broadcasts the transmitted optical signals to the input port of each optical receiver. Here, the parameter N is the code length of the PMS code. In addition, the proposed SAC network uses superluminescent diodes (SLDs) as light sources, which saves considerable system cost compared with other FO-CDMA methods. Each optical transmitter is composed of an SLD, one optical switch, and two optical encoders configured according to the assigned PMS codewords. Each optical receiver includes a 1 × 2 splitter, two optical decoders, and one balanced photodiode for mitigating the effect of MUI. To simplify the subsequent analysis, some assumptions were made. First, the unpolarized SLD has a flat power spectral density (PSD). Second, the received optical power at the input port of each optical receiver is the same. 
Third, all photodiodes in the proposed network have the same electrical properties. Fourth, transmitting '1' and '0' has an equal probability. Subsequently, taking phase-induced intensity noise (PIIN) and thermal noise into account, the corresponding performance was evaluated and compared with that of previous SAC FO-CDMA networks. The numerical results show that the proposed network performs about 25% better than networks using other codes at BER = 10⁻⁹. This is because the effect of PIIN was effectively mitigated and the received power was doubled. As a result, the SAC FO-CDMA network using PMS codes is a candidate for next-generation optical network applications.
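The abstract does not describe how PMS codewords are derived, so the sketch below only illustrates the parent code: it generates a plain M-sequence with a linear feedback shift register and checks the fixed in-phase cross-correlation between cyclic shifts, which equals (N+1)/4 for a length-N M-sequence. The function names and the tap choices (recurrences for the primitive polynomials x³+x+1 and x⁴+x+1) are assumptions of this example.

```python
def m_sequence(taps, degree):
    """Generate one period (2**degree - 1 bits) of an LFSR sequence.

    taps lists the delays whose bits are XORed to form the next bit,
    e.g. taps=(3, 2) means a[k] = a[k-3] XOR a[k-2]. The output is
    maximal-length only if the taps correspond to a primitive polynomial.
    """
    bits = [1] + [0] * (degree - 1)  # any non-zero initial fill works
    length = 2 ** degree - 1
    while len(bits) < length:
        new_bit = 0
        for delay in taps:
            new_bit ^= bits[-delay]
        bits.append(new_bit)
    return bits


def cyclic_shift(code, shift):
    """Rotate a codeword left by `shift` chips."""
    shift %= len(code)
    return code[shift:] + code[:shift]


def in_phase_cross_correlation(code_a, code_b):
    """Number of chip positions where both unipolar codewords are 1."""
    return sum(a & b for a, b in zip(code_a, code_b))
```

For degree 3 (N = 7) each codeword has weight 4 and any two distinct shifts overlap in exactly 2 chips. This constant overlap is the property that balanced detection exploits to cancel multiuser interference in M-sequence-based SAC systems.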

Keywords: spectral amplitude coding, SAC, fiber-optic code-division multiple-access, FO-CDMA, partial M-sequence, PMS code, fiber Bragg grating, FBG

Procedia PDF Downloads 348
1053 Using Machine-Learning Methods for Allergen Amino Acid Sequence's Permutations

Authors: Kuei-Ling Sun, Emily Chia-Yu Su

Abstract:

Allergy is a hypersensitive overreaction of the immune system to environmental stimuli and a major health problem. These overreactions include rashes, sneezing, fever, food allergies, anaphylaxis, asthma, shock, and other abnormal conditions. Allergies can be caused by food, insect stings, pollen, animal hair, and other allergens. The development of allergies is due to both genetic and environmental factors. Allergies involve immunoglobulin E antibodies, a part of the body’s immune system. Immunoglobulin E antibodies bind to an allergen and then to a receptor on mast cells or basophils, triggering the release of inflammatory chemicals such as histamine. Motivated by the increasingly serious problems of environmental change, changing lifestyles, air pollution, and other factors, in this study we collect both allergens and non-allergens from several databases and use several machine learning methods for classification, including logistic regression (LR), stepwise regression, decision tree (DT), and neural networks (NN), to compare models and determine the permutations of allergen amino acid sequences.

Keywords: allergy, classification, decision tree, logistic regression, machine learning

Procedia PDF Downloads 274
1052 Modelling Causal Effects from Complex Longitudinal Data via Point Effects of Treatments

Authors: Xiaoqin Wang, Li Yin

Abstract:

Background and purpose: In many applications, one estimates causal effects arising from a complex stochastic process, where a sequence of treatments is assigned to influence a certain outcome of interest, and there exist time-dependent covariates between treatments. When covariates are plentiful and/or continuous, statistical modeling is needed to reduce the huge dimensionality of the problem and allow for the estimation of causal effects. Recently, Wang and Yin (Annals of Statistics, 2020) derived a new general formula, which expresses these causal effects in terms of the point effects of treatments in single-point causal inference. As a result, it is possible to conduct the modeling via point effects. The purpose of this work is to study the modeling of these causal effects via point effects. Challenges and solutions: The time-dependent covariates are often influenced by earlier treatments and in turn influence subsequent treatments. Consequently, the standard parameters, i.e., the means of the outcome given all treatments and covariates, are essentially all different (null paradox). Furthermore, the dimension of the parameters is huge (curse of dimensionality). Therefore, it can be difficult to conduct the modeling in terms of standard parameters. Instead of standard parameters, we use point effects of treatments to develop a likelihood-based parametric approach to the modeling of these causal effects, and we are able to model the causal effects of a sequence of treatments by modeling a small number of point effects of individual treatments. Achievements: We are able to conduct the modeling of the causal effects from a sequence of treatments in the familiar framework of single-point causal inference. The simulation shows that our method achieves not only an unbiased estimate of the causal effect but also the nominal level of type I error and a low level of type II error for the hypothesis testing. 
We have applied this method to a longitudinal study of COVID-19 mortality among Scandinavian countries and found that the Swedish approach performed far worse than the other countries' approaches with respect to COVID-19 mortality, and that the poor performance was largely due to its early measures during the initial period of the pandemic.

Keywords: causal effect, point effect, statistical modelling, sequential causal inference

Procedia PDF Downloads 172
1051 Design and Performance Evaluation of Hybrid Corrugated-GFRP Infill Panels

Authors: Woo Young Jung, Sung Min Park, Ho Young Son, Viriyavudh Sim

Abstract:

This study presents a way to reduce earthquake damage and support emergency rehabilitation of critical structures such as schools, high-tech factories, and hospitals subjected to strong ground motions associated with climate changes. As recent events show, a strong earthquake causes serious damage to critical structures, which may then be affected by a sequence of aftershocks (or a tsunami) due to fault plane adjustments. Therefore, to improve the seismic performance of critical structures, retrofitting and strengthening of structures exposed to aftershock sequences after emergency rehabilitation following strong earthquakes are widely studied. This study used composite material rather than concrete and steel for emergency rehabilitation because of its high strength and stiffness, light weight, rapid manufacturing, and dynamic performance. The study also aimed to improve the seismic performance, or seismic retrofit, of critical structures subjected to strong ground motions and earthquake aftershocks by utilizing GFRP-Corrugated Infill Panels (GCIP).

Keywords: aftershock, composite material, GFRP, infill panel

Procedia PDF Downloads 307
1050 Tectonic Complexity: Out-of-Sequence Thrusting in the Higher Himalaya of Jhakri-Sarahan region, Himachal Pradesh, India

Authors: Rajkumar Ghosh

Abstract:

The study focuses on the tectonics of out-of-sequence thrusting (OOST) in the NW region of the Himalaya, particularly in Himachal Pradesh. The research aims to identify the features and nature of OOST in the field and the associated rock types and lithological boundaries in the field of NW Himalaya, Himachal Pradesh, India. The research employs fieldwork and micro-structure observations, correlations, and analyses to identify and analyze the OOST features and associated rock types. The study reveals the presence of three OOSTs, namely Jhakri Thrust (JT), Sarahan Thrust (ST), and Chaura Thrust (CT), which consist of several branches, some of which are still active. The thrust system exhibits varying internal geometry, including box folds, boudins, scar folds, crenulation cleavages, kink folds, and tension gashes. The CT, which is concealed beneath Jutogh Thrust sheet, represents a steepened downward thrust, while the JT has a western dip and is south-westward verging. The research provides crucial information on the tectonics of OOST in the NW region of the Himalaya, particularly in Himachal Pradesh, which is crucial in understanding the regional geological evolution and associated hazards. The data were collected through fieldwork and micro-structure observations, correlations, and analyses of rock samples. The data were analyzed using tectonic and geochronological techniques to identify the nature and characteristics of OOST. The research addressed the question of identifying Higher Himalayan OOST in the field of NW Himalaya, Himachal Pradesh, India, and the associated rock types and lithological boundaries. The study concludes that there is minimal documentation and a lack of suitable exposure of rocks to generalize the features of OOST in the field in NW Higher Himalaya, Himachal Pradesh. The study recommends more extensive mapping and fieldwork to improve understanding of OOST in the region.

Keywords: out-of-sequence thrust (OOST), main central thrust (MCT), jhakri thrust (JT), sarahan thrust (ST), chaura thrust (CT), higher himalaya (HH)

Procedia PDF Downloads 59
1049 Masked Candlestick Model: A Pre-Trained Model for Trading Prediction

Authors: Ling Qi, Matloob Khushi, Josiah Poon

Abstract:

This paper introduces a pre-trained Masked Candlestick Model (MCM) for trading time-series data. The pre-trained model is based on three core designs. First, we convert trading price data at each data point as a set of normalized elements and produce embeddings of each element. Second, we generate a masked sequence of such embedded elements as inputs for self-supervised learning. Third, we use the encoder mechanism from the transformer to train the inputs. The masked model learns the contextual relations among the sequence of embedded elements, which can aid downstream classification tasks. To evaluate the performance of the pre-trained model, we fine-tune MCM for three different downstream classification tasks to predict future price trends. The fine-tuned models achieved better accuracy rates for all three tasks than the baseline models. To better analyze the effectiveness of MCM, we test the same architecture for three currency pairs, namely EUR/GBP, AUD/USD, and EUR/JPY. The experimentation results demonstrate MCM’s effectiveness on all three currency pairs and indicate the MCM’s capability for signal extraction from trading data.
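The first two design steps in the abstract (normalizing each candlestick into elements, then masking part of the sequence for self-supervised learning) can be sketched roughly as follows. The abstract gives neither the normalization scheme nor the masking ratio, so the open-relative normalization, the 15% ratio, the `[MASK]` token, and all function names are assumptions of this example; the embedding and transformer encoder are not shown.

```python
import random

MASK = "[MASK]"  # placeholder token; an assumption of this sketch


def normalize_candle(open_price, high, low, close):
    """Express high, low and close as relative changes from the open price."""
    return [high / open_price - 1.0,
            low / open_price - 1.0,
            close / open_price - 1.0]


def mask_sequence(elements, mask_ratio=0.15, seed=None):
    """Hide a random subset of elements and return (masked, targets).

    targets maps each masked position to its original value, which is what
    a masked model would be trained to reconstruct.
    """
    rng = random.Random(seed)
    n_masked = max(1, round(len(elements) * mask_ratio))
    positions = sorted(rng.sample(range(len(elements)), n_masked))
    masked = list(elements)
    for position in positions:
        masked[position] = MASK
    targets = {position: elements[position] for position in positions}
    return masked, targets
```

A sequence of candles would first be flattened into one list of normalized elements and then passed through `mask_sequence`; fixing `seed` makes the masking reproducible across runs.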

Keywords: masked language model, transformer, time series prediction, trading prediction, embedding, transfer learning, self-supervised learning

Procedia PDF Downloads 85