Search results for: sequence data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25145

24725 Opening up Government Datasets for Big Data Analysis to Support Policy Decisions

Authors: K. Hardy, A. Maurushat

Abstract:

Policy makers are increasingly looking to make evidence-based decisions. Evidence-based decisions have historically relied on the rigorous methodologies of empirical studies by research institutes, as well as on less reliable immediate surveys and polls, often with limited sample sizes. As we move into the era of Big Data analytics, policy makers are looking to different methodologies to deliver reliable empirics in real time. The question is not why people did something over the last 10 years, but why they are doing it now, whether it is undesirable, and how we can have an impact to promote change immediately. Big data analytics rely heavily on government data that has been released into the public domain. The open data movement promises greater productivity and more efficient delivery of services; however, Australian government agencies remain reluctant to release their data to the general public. This paper considers the barriers to releasing government data as open data, and how these barriers might be overcome.

Keywords: big data, open data, productivity, data governance

Procedia PDF Downloads 350
24724 A Review on Existing Challenges of Data Mining and Future Research Perspectives

Authors: Hema Bhardwaj, D. Srinivasa Rao

Abstract:

Technology for analysing, processing, and extracting meaningful data from enormous and complicated datasets can be termed as "big data." The technique of big data mining and big data analysis is extremely helpful for business movements such as making decisions, building organisational plans, researching the market efficiently, improving sales, etc., because typical management tools cannot handle such complicated datasets. Special computational and statistical issues, such as measurement errors, noise accumulation, spurious correlation, and storage and scalability limitations, are brought on by big data. These unique problems call for new computational and statistical paradigms. This research paper offers an overview of the literature on big data mining, its process, along with problems and difficulties, with a focus on the unique characteristics of big data. Organizations have several difficulties when undertaking data mining, which has an impact on their decision-making. Every day, terabytes of data are produced, yet only around 1% of that data is really analyzed. The idea of the mining and analysis of data and knowledge discovery techniques that have recently been created with practical application systems is presented in this study. This article's conclusion also includes a list of issues and difficulties for further research in the area. The report discusses the management's main big data and data mining challenges.

Keywords: big data, data mining, data analysis, knowledge discovery techniques, data mining challenges

Procedia PDF Downloads 89
24723 A Systematic Review on Challenges in Big Data Environment

Authors: Rimmy Yadav, Anmol Preet Kaur

Abstract:

Big Data has demonstrated vast potential for streamlining operations, supporting decisions, and spotting business trends in fields such as manufacturing, finance, and information technology. This paper gives a multidisciplinary overview of the research issues in big data and of its procedures, tools, and systems as they relate to privacy, data storage management, network and energy utilization, fault tolerance, and information representation. In addition, the resulting challenges and opportunities of the Big Data platform are presented.

Keywords: big data, privacy, data management, network and energy consumption

Procedia PDF Downloads 282
24722 Survey on Big Data Stream Classification by Decision Tree

Authors: Mansoureh Ghiasabadi Farahani, Samira Kalantary, Sara Taghi-Pour, Mahboubeh Shamsi

Abstract:

Nowadays, the development of computer technology and its recent applications provides access to new types of data that have not been considered by traditional data analysts. Two particularly interesting characteristics of such data sets are their huge size and streaming nature. Incremental learning techniques have been used extensively to address the data stream classification problem. This paper presents a concise survey of the obstacles and requirements in classifying data streams using decision trees. The most important issue is to maintain a balance between accuracy and efficiency: the algorithm should provide good classification performance with a reasonable response time.

Keywords: big data, data streams, classification, decision tree

Procedia PDF Downloads 496
24721 Robust and Dedicated Hybrid Cloud Approach for Secure Authorized Deduplication

Authors: Aishwarya Shekhar, Himanshu Sharma

Abstract:

Data deduplication is an important data compression technique for eliminating duplicate copies of repeating data, and it has been widely used in cloud storage to reduce the amount of storage space and save bandwidth. In this process, duplicate data is expunged, leaving only one copy (a single instance) of the data to be stored, although an index of every data item is still maintained. Data deduplication is thus an approach for minimizing the storage space an organization requires to retain its data. In most companies, the storage systems carry identical copies of numerous pieces of data. Deduplication eliminates these additional copies by saving just one copy of the data and replacing the other copies with pointers that lead back to the primary copy. To avoid this duplication of data and to preserve confidentiality in the cloud, we apply the concept of a hybrid cloud. A hybrid cloud is a fusion of at least one public and one private cloud. As a proof of concept, we implement Java code that provides security and removes all types of duplicated data from the cloud.
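
The "one stored copy plus pointers" idea described in this abstract can be illustrated with a minimal sketch. It is written in Python rather than the authors' Java, and it leaves out the authorization and confidentiality layers entirely. The class name `DedupStore`, its methods, and the file names are hypothetical and chosen only for illustration; identifying duplicates by a SHA-256 digest is an assumption, not something the abstract states.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: one physical copy per distinct content."""

    def __init__(self):
        self.blobs = {}  # digest -> bytes (the single stored copy)
        self.index = {}  # file name -> digest (the pointer to that copy)

    def put(self, name, data):
        """Store data under name; return True only if new bytes were stored."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data
        # The index entry is always kept, even when the content is a duplicate.
        self.index[name] = digest
        return is_new

    def get(self, name):
        """Follow the pointer from a file name back to the primary copy."""
        return self.blobs[self.index[name]]


store = DedupStore()
store.put("report_v1.txt", b"quarterly figures")
store.put("report_copy.txt", b"quarterly figures")  # duplicate content
store.put("notes.txt", b"meeting notes")
```

Here three names are indexed, but only two blobs are physically kept, which is the storage saving the abstract refers to.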

Keywords: confidentiality, deduplication, data compression, hybridity of cloud

Procedia PDF Downloads 362
24720 A Review of Machine Learning for Big Data

Authors: Devatha Kalyan Kumar, Aravindraj D., Sadathulla A.

Abstract:

Big data are now rapidly expanding in all engineering and science domains and in many others. The potential of large or massive data is undoubtedly significant, and making sense of it requires new ways of thinking and new learning techniques to address the various big data challenges. Machine learning is continuously unleashing its power in a wide range of applications. This paper reviews the latest advances in research on machine learning for big data processing. First, it surveys the machine learning techniques used in recent studies, such as deep learning, representation learning, transfer learning, active learning, and distributed and parallel learning. It then focuses on the challenges and possible solutions of machine learning for big data.

Keywords: active learning, big data, deep learning, machine learning

Procedia PDF Downloads 411
24719 Strengthening Legal Protection of Personal Data through Technical Protection Regulation in Line with Human Rights

Authors: Tomy Prihananto, Damar Apri Sudarmadi

Abstract:

Indonesia recognizes the right to privacy as a human right. Indonesia provides legal protection against data management activities because the protection of personal data is a part of human rights. This paper aims to describe the arrangement of data management in Indonesia. It is descriptive research with a qualitative approach, with data collected from a literature study. The results are a comprehensive arrangement of data that has been set up as a technical requirement of data protection by encryption methods. Arrangements on encryption and on protection of personal data are mutually reinforcing in the protection of personal data. Indonesia has two important and immediately enacted laws that provide protection for the privacy of information that is part of human rights.

Keywords: Indonesia, protection, personal data, privacy, human rights, encryption

Procedia PDF Downloads 160
24718 Meta Model for Optimum Design Objective Function of Steel Frames Subjected to Seismic Loads

Authors: Salah R. Al Zaidee, Ali S. Mahdi

Abstract:

Except for simple problems of statically determinate structures, optimum design problems in structural engineering have implicit objective functions where structural analysis and design are essential within each searching loop. With these implicit functions, the structural engineer is usually forced to write his/her own computer code for analysis, design, and searching for the optimum design among many feasible candidates and cannot take advantage of available software for structural analysis, design, and searching for the optimum solution. The meta-model is a regression model used to transform an implicit objective function into an explicit one, which in turn decouples the structural analysis and design processes from the optimum searching process. With the meta-model, well-known software for structural analysis and design can be used in sequence with optimum searching software. In this paper, the meta-model has been used to develop an explicit objective function for plane steel frames subjected to dead, live, and seismic forces. Frame topology is assumed as predefined based on architectural and functional requirements. Column and beam sections and different connection details are the main design variables in this study. Columns and beams are grouped to reduce the number of design variables and to make the problem similar to that adopted in engineering practice. Data for the implicit objective function have been generated based on analysis and assessment of many design proposals with CSI SAP software. These data have been used later in SPSS software to develop a pure quadratic nonlinear regression model for the explicit objective function. Good correlations, with a coefficient R² in the range from 0.88 to 0.99, have been noted between the original implicit functions and the corresponding explicit functions generated with the meta-model.
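
As a rough illustration of the meta-model idea, the sketch below fits a pure quadratic regression (intercept, linear terms, and squared terms, with no cross terms) by ordinary least squares and reports R². It uses NumPy rather than the SPSS workflow named in the abstract. The function names, the two design variables, and the synthetic noise-free data are all assumptions made for the example; they do not come from the paper.

```python
import numpy as np


def design_matrix(X):
    """Columns: intercept, each variable, each variable squared (no cross terms)."""
    X = np.asarray(X, dtype=float)
    return np.hstack([np.ones((X.shape[0], 1)), X, X ** 2])


def fit_pure_quadratic(X, y):
    """Least-squares coefficients of the pure quadratic model."""
    coef, *_ = np.linalg.lstsq(design_matrix(X), np.asarray(y, dtype=float), rcond=None)
    return coef


def r_squared(y, y_hat):
    """Coefficient of determination between observed and fitted values."""
    y = np.asarray(y, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in for "implicit objective" samples (hypothetical data).
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 5.0, size=(40, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 0] ** 2 + 0.25 * X[:, 1] ** 2

coef = fit_pure_quadratic(X, y)
r2 = r_squared(y, design_matrix(X) @ coef)
```

Once fitted, the explicit function `design_matrix(X) @ coef` can be handed to any general-purpose optimizer without rerunning the structural analysis inside each search iteration, which is the decoupling the abstract describes.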

Keywords: meta-model, objective function, steel frames, seismic analysis, design

Procedia PDF Downloads 223
24717 Comparison of Effects over the Autonomic Nervous System When Using Force Training and Interval Training in Indoor Cycling with University Students

Authors: Daniel Botero, Oscar Rubiano, Pedro P. Barragan, Jaime Baron, Leonardo Rodriguez Perdomo, Jaime Rodriguez

Abstract:

In the last decade, interval training (IT) has gained importance compared with strength training (ST). However, few studies have analyzed the impact of these training modes on the autonomic nervous system (ANS). This work aimed to compare the activity of the ANS when it is exposed to an IT or ST indoor cycling mode. After approval by the ethics committee, a cross-over clinical trial with 22 healthy participants (age 21 ± 3 years) was implemented. Participants were randomly assigned to the sequence groups force-interval (F-I) and interval-force (I-F), with 11 participants in each group. The heart rate time series was obtained before and after each training using the POLAR TEAM® heart monitor. The ANS was evaluated by spectral analysis of heart rate variability (HRV) using the fast Fourier transform (Kubios software). Each group completed 8 weeks of training in its sequence (4 weeks with each training), with an intermediate washout period of two weeks. The HRV power in the low-frequency band (LF = 0.04-0.15 Hz, related to the sympathetic nervous system), in the high-frequency band (HF = 0.15-0.4 Hz, related to the parasympathetic nervous system), and the LF/HF ratio (with reference to a modulation of the parasympathetic over the sympathetic) were calculated. Afterward, the difference between the parameters before and after training was computed. Then, to evaluate statistical differences between the trainings, the method of Wellek was applied (Wellek and Blettner, 2012, Medicine, 109 (15), 276-81). To determine the difference in effect on the parasympathetic system when force training and IT are used, a t-test was performed, giving a T value of 0.73 with p-value ≤ 0.1. For the sympathetic system, a T of 0.33 with p ≤ 0.1 was obtained, and for LF/HF the T was 1.44 with p ≥ 0.1. The carry-over effect was then evaluated and was not present.
Significant changes in autonomic activity with strength or interval training were not observed. However, a modulation of the parasympathetic over the sympathetic can be observed. These findings are probably explained by the small sample and/or a training time insufficient to generate changes.
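
The LF and HF band powers mentioned in this abstract can be sketched with a plain FFT periodogram. This is not the Kubios procedure; it assumes the heart-rate series has already been resampled to an even rate (4 Hz here, an assumption), and the function names and the synthetic two-tone signal are hypothetical. Only the band edges (0.04-0.15 Hz and 0.15-0.4 Hz) come from the abstract.

```python
import numpy as np


def band_power(signal, fs, f_lo, f_hi):
    """Sum of periodogram values for frequencies in [f_lo, f_hi)."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                            # remove the DC component
    spectrum = np.abs(np.fft.rfft(x)) ** 2      # unnormalized periodogram
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    mask = (freqs >= f_lo) & (freqs < f_hi)
    return spectrum[mask].sum()


def lf_hf_ratio(signal, fs):
    """Ratio of low-frequency to high-frequency band power."""
    lf = band_power(signal, fs, 0.04, 0.15)
    hf = band_power(signal, fs, 0.15, 0.40)
    return lf / hf


# Synthetic series: one LF tone (0.10 Hz) and one HF tone (0.25 Hz).
fs = 4.0                                        # assumed resampling rate in Hz
t = np.arange(int(300 * fs)) / fs               # 300 seconds of samples
series = 2.0 * np.sin(2 * np.pi * 0.10 * t) + 1.0 * np.sin(2 * np.pi * 0.25 * t)

ratio = lf_hf_ratio(series, fs)
```

Because the LF tone has twice the amplitude of the HF tone, and both fall exactly on FFT bins for this length, the ratio comes out at about 4. Real recordings would need artifact correction and interpolation of the beat-to-beat intervals before this step.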

Keywords: autonomic nervous, force training, indoor cycling, interval training

Procedia PDF Downloads 203
24716 Antibacterial Activity of Endophytic Bacteria against Multidrug-Resistant Bacteria: Isolation, Characterization, and Antibacterial Activity

Authors: Maryam Beiranvand, Sajad Yaghoubi

Abstract:

Background: Some microbes can colonize plants’ inner tissues without causing obvious damage and can even produce useful bioactive substances. In the present study, the diversity of the endophytic bacteria associated with medicinal plants from Iran was investigated by culturing techniques and molecular gene identification, and the isolates were tested for antibacterial activity. Results: In the spring season from 2013 to 2014, 35 herb pharmacology samples were collected, sterilized, meshed, and then cultured on selective media. A total of 199 endophytic bacteria were successfully isolated from 35 tissue cultures of medicinal plants, and sixty-seven out of 199 bacterial isolates were subjected to identification by 16S rRNA gene sequence analysis. Based on gene sequence similarity and phylogenetic analyses, these isolates were grouped into five classes, fourteen orders, seventeen families, twenty-one genera, and forty strains. The most abundant group of endophytic bacteria was actinobacterial, consisting of thirty-two (47%) out of 67 bacterial isolates. Ten (22.3%) out of 67 bacterial isolates remained unidentified and were classified only at the genus level. The signature of their 16S rRNA gene formed a distinct line in a phylogenetic tree, showing that they might be new species of bacteria. One (5.2%) out of 67 bacterial isolates was still not well categorized. Forty-two out of 67 strains were candidates for antimicrobial activity tests. Nineteen (45%) out of 42 strains showed antimicrobial activity against multidrug-resistant (MDR) bacteria; thirteen (68%) out of 19 strains were assigned to the class Actinobacteria. Four (21%) out of 19 strains belonged to the Bacillaceae family, one (5.2%) out of 19 strains to the Paenibacillaceae family, and one (5.2%) out of 19 strains to the Pseudomonadaceae family. The other twenty-three strains did not show inhibitory activities.
Conclusions: Our research showed a high level of phylogenetic diversity and the intoxicating antibiotic activity of endophytic bacteria in the herb pharmacology of Iran.

Keywords: antibacterial activity, endophytic bacteria, multidrug-resistant bacteria, whole genome sequencing

Procedia PDF Downloads 64
24715 Big Brain: A Single Database System for a Federated Data Warehouse Architecture

Authors: X. Gumara Rigol, I. Martínez de Apellaniz Anzuola, A. Garcia Serrano, A. Franzi Cros, O. Vidal Calbet, A. Al Maruf

Abstract:

Traditional federated architectures for data warehousing work well when corporations have existing regional data warehouses and there is a need to aggregate data at a global level. Schibsted Media Group has been maturing from a decentralised organisation into a more globalised one and needed to build some of the regional data warehouses for some brands at the same time as the global one. In this paper, we present the architectural alternatives studied and explain why a custom federated approach was the recommendation taken forward to implementation. Although the data warehouses are logically federated, the implementation uses a single database system, which presented many advantages, such as cost reduction and improved data access for global users, giving consumers of the data a common data model for detailed analysis across different geographies and a flexible layer for local specific needs in the same place.

Keywords: data integration, data warehousing, federated architecture, Online Analytical Processing (OLAP)

Procedia PDF Downloads 219
24714 A Computational Framework for Decoding Hierarchical Interlocking Structures with SL Blocks

Authors: Yuxi Liu, Boris Belousov, Mehrzad Esmaeili Charkhab, Oliver Tessmann

Abstract:

This paper presents a computational solution for designing reconfigurable interlocking structures that are fully assembled with SL Blocks. Formed by S-shaped and L-shaped tetracubes, SL Block is a specific type of interlocking puzzle. Analogous to molecular self-assembly, the aggregation of SL blocks will build a reversible hierarchical and discrete system where a single module can be numerously replicated to compose semi-interlocking components that further align, wrap, and braid around each other to form complex high-order aggregations. These aggregations can be disassembled and reassembled, responding dynamically to design inputs and changes with a unique capacity for reconfiguration. To use these aggregations as architectural structures, we developed computational tools that automate the configuration of SL blocks based on architectural design objectives. There are three critical phases in our work. First, we revisit the hierarchy of the SL block system and devise a top-down-type design strategy. From this, we propose two key questions: 1) How to translate 3D polyominoes into SL block assembly? 2) How to decompose the desired voxelized shapes into a set of 3D polyominoes with interlocking joints? These two questions can be considered the Hamiltonian path problem and the 3D polyomino tiling problem. Then, we derive our solution to each of them based on two methods. The first method is to construct the optimal closed path from an undirected graph built from the voxelized shape and translate the node sequence of the resulting path into the assembly sequence of SL blocks. The second approach describes interlocking relationships of 3D polyominoes as a joint connection graph. Lastly, we formulate the desired shapes and leverage our methods to achieve their reconfiguration within different levels. We show that our computational strategy will facilitate the efficient design of hierarchical interlocking structures with a self-replicating geometric module.
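
To make the graph formulation in this abstract concrete, the sketch below builds an undirected graph from a voxelized shape (face-sharing voxels are neighbours) and searches for a Hamiltonian path by simple backtracking. The node order of the path is what would be translated into an assembly sequence. This is a brute-force illustration only: it does not reproduce the authors' optimal closed-path construction, and the function names and the 2×2×2 test shape are assumptions.

```python
from itertools import product


def voxel_graph(voxels):
    """Undirected graph: voxels are nodes, face-sharing voxels are neighbours."""
    cells = set(voxels)
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    graph = {}
    for x, y, z in cells:
        neighbours = [(x + dx, y + dy, z + dz) for dx, dy, dz in steps]
        graph[(x, y, z)] = [n for n in neighbours if n in cells]
    return graph


def hamiltonian_path(graph, start):
    """Return a path visiting every node exactly once, or None if none exists."""
    path = [start]
    visited = {start}

    def extend():
        if len(path) == len(graph):
            return True
        for nxt in graph[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                if extend():
                    return True
                # Dead end: undo the move and try the next neighbour.
                visited.remove(nxt)
                path.pop()
        return False

    return path if extend() else None


# Hypothetical voxelized shape: a 2 x 2 x 2 cube of eight voxels.
shape = list(product(range(2), repeat=3))
path = hamiltonian_path(voxel_graph(shape), (0, 0, 0))
```

Backtracking takes exponential time in the worst case, so it is only practical for small shapes; larger voxel models would need heuristics or a dedicated solver.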

Keywords: computational design, SL-blocks, 3D polyomino puzzle, combinatorial problem

Procedia PDF Downloads 113
24713 Identification of New Familial Breast Cancer Susceptibility Genes: Are We There Yet?

Authors: Ian Campbell, Gillian Mitchell, Paul James, Na Li, Ella Thompson

Abstract:

The genetic cause of the majority of multiple-case breast cancer families remains unresolved. Next generation sequencing has emerged as an efficient strategy for identifying predisposing mutations in individuals with inherited cancer. We are conducting whole exome sequence analysis of germ line DNA from multiple affected relatives from breast cancer families, with the aim of identifying rare protein truncating and non-synonymous variants that are likely to include novel cancer predisposing mutations. Data from more than 200 exomes show that on average each individual carries 30-50 protein truncating mutations and 300-400 rare non-synonymous variants. Heterogeneity among our exome data strongly suggests that numerous moderate penetrance genes remain to be discovered, with each gene individually accounting for only a small fraction of families (~0.5%). This scenario marks validation of candidate breast cancer predisposing genes in large case-control studies as the rate-limiting step in resolving the missing heritability of breast cancer. The aim of this study is to screen genes that are recurrently mutated among our exome data in a larger cohort of cases and controls to assess the prevalence of inactivating mutations that may be associated with breast cancer risk. We are using the Agilent HaloPlex Target Enrichment System to screen the coding regions of 168 genes in 1,000 BRCA1/2 mutation-negative familial breast cancer cases and 1,000 cancer-naive controls. To date, our interim analysis has identified 21 genes which carry an excess of truncating mutations in multiple breast cancer families versus controls. Established breast cancer susceptibility gene PALB2 is the most frequently mutated gene (13/998 cases versus 0/1009 controls), but other interesting candidates include NPSR1, GSN, POLD2, and TOX3. These and other genes are being validated in a second cohort of 1,000 cases and controls.
Our experience demonstrates that beyond PALB2, the prevalence of mutations in the remaining breast cancer predisposition genes is likely to be very low, making definitive validation exceptionally challenging.

Keywords: predisposition, familial, exome sequencing, breast cancer

Procedia PDF Downloads 472
24712 Isolation, Characterization, and Antibacterial Activity of Endophytic Bacteria from Iranian Medicinal Plants

Authors: Maryam Beiranvand, Sajad Yaghoubi

Abstract:

Background: Some microbes can colonize plants’ inner tissues without causing obvious damage and can even produce useful bioactive substances. In the present study, the diversity of the endophytic bacteria associated with medicinal plants from Iran was investigated by culturing techniques and molecular gene identification, and the isolates were tested for antibacterial activity. Results: In the spring season from 2013 to 2014, 35 herb pharmacology samples were collected, sterilized, meshed, and then cultured on selective media. A total of 199 endophytic bacteria were successfully isolated from 35 tissue cultures of medicinal plants, and sixty-seven out of 199 bacterial isolates were subjected to identification by 16S rRNA gene sequence analysis. Based on gene sequence similarity and phylogenetic analyses, these isolates were grouped into five classes, fourteen orders, seventeen families, twenty-one genera, and forty strains. The most abundant group of endophytic bacteria was actinobacterial, consisting of thirty-two (47%) out of 67 bacterial isolates. Ten (22.3%) out of 67 bacterial isolates remained unidentified and were classified only at the genus level. The signature of their 16S rRNA gene formed a distinct line in a phylogenetic tree, showing that they might be new species of bacteria. One (5.2%) out of 67 bacterial isolates was still not well categorized. Forty-two out of 67 strains were candidates for antimicrobial activity tests. Nineteen (45%) out of 42 strains showed antimicrobial activity against multidrug-resistant (MDR) bacteria; thirteen (68%) out of 19 strains were assigned to the class Actinobacteria. Four (21%) out of 19 strains belonged to the Bacillaceae family, one (5.2%) out of 19 strains to the Paenibacillaceae family, and one (5.2%) out of 19 strains to the Pseudomonadaceae family. The other twenty-three strains did not show inhibitory activities.
Conclusions: Our research showed a high level of phylogenetic diversity and the intoxicating antibiotic activity of endophytic bacteria in the herb pharmacology of Iran.

Keywords: medical plant, endophytic bacteria, antimicrobial activity, whole genome sequencing analysis

Procedia PDF Downloads 95
24711 A Review Paper on Data Mining and Genetic Algorithm

Authors: Sikander Singh Cheema, Jasmeen Kaur

Abstract:

In this paper, the concept of data mining is summarized, along with one of its important processes, knowledge discovery in databases (KDD). Data mining based on genetic algorithms is examined, and ways to implement data mining with genetic algorithms are surveyed. This paper also conducts a formal review of data mining tasks and genetic algorithms in various fields.

Keywords: data mining, KDD, genetic algorithm, descriptive mining, predictive mining

Procedia PDF Downloads 569
24710 Data-Mining Approach to Analyzing Industrial Process Information for Real-Time Monitoring

Authors: Seung-Lock Seo

Abstract:

This work presents a data-mining empirical monitoring scheme for industrial processes with partially unbalanced data. Measurement data of good operations are relatively easy to gather, but during unusual special events or faults it is generally difficult to collect process information, and it can be almost impossible to analyze some noisy data of industrial processes. In such cases, noise filtering techniques can be used to enhance process monitoring performance on a real-time basis. In addition, pre-processing of raw process data helps to eliminate unwanted variation in industrial process data. In this work, the performance of various monitoring schemes was tested and demonstrated for discrete batch process data. The results showed that monitoring performance improved significantly in terms of the monitoring success rate for given process faults.

Keywords: data mining, process data, monitoring, safety, industrial processes

Procedia PDF Downloads 376
24709 A Cross Cultural Study of Jewish and Arab Listeners: Perception of Harmonic Sequences

Authors: Roni Granot

Abstract:

Musical intervals are the building blocks of melody and harmony. Intervals differ in terms of their size, direction, or quality as consonants or dissonants. In Western music, perceptual dissonance is mostly associated with the sensation of beats or periodicity, whereas cognitive dissonance is associated with rules of harmony and voice leading. These two perceptions can be studied separately in musical cultures that include melodic structures with little or no harmonic structure. In the Arab musical system, there are a number of different quarter-tone intervals creating various combinations of consonant and dissonant intervals. While traditional Arab music includes only melody, today’s Arab pop music includes harmonization of songs, often using typical Western harmonic sequences. Therefore, the Arab population in Israel presents an interesting case which enables us to examine the distinction between perceptual and cognitive dissonance. In the current study, we compared the responses of 34 Jewish Western listeners and 56 Arab listeners to two types of stimuli and their relationships: harmonic sequences and isolated harmonic intervals (dyads). Harmonic sequences were presented in synthesized piano tones and represented five levels of harmonic prototypicality (tonic ending; tonic ending with half-flattened third; deceptive cadence; half cadence; and dissonant unrelated ending) and were rated on 5-point scales of closure and surprise. Here we report only findings related to the harmonic sequences.
A one-way repeated measures ANOVA with one within-subjects factor with five levels (Type of sequence) and one between-subjects factor (Musical background) indicates a main effect of Type of sequence for surprise ratings, F(4, 85) = 51, p < .001, and for closure ratings, F(4, 78) = 9.54, p < .001; no main effect of Background on either surprise or closure ratings; and a marginally significant Type X Background interaction for surprise, F(4, 352) = 6.05, p = .069, and closure ratings, F(4, 324) = 3.89, p < .01. Planned comparisons show that the Type of sequence X Background interaction centers on surprise and closure ratings of the regular versus the half-flattened third tonic and the deceptive versus the half cadence. The half-flattened third tonic is rated as less surprising and as demanding less continuation than the regular tonic by the Arab listeners as compared to the Western listeners. In addition, the half cadence is rated as more surprising but demanding less continuation than the deceptive cadence by the Arab listeners as compared to the Western listeners. Together, our results suggest that despite the vast exposure of Arab listeners to Western harmony, sensitivity to harmonic rules seems to be partial, with a preference for oriental sonorities such as the half-flattened third. In addition, the percept of directionality, which demands sensitivity to the level on which closure is obtained and which is strongly entrenched in Western harmony, may not be fully integrated into the Arab listeners’ mental harmonic scheme. Results will be discussed in terms of broad differences between Western and Eastern aesthetic ideals.

Keywords: harmony, cross cultural, Arab music, closure

Procedia PDF Downloads 258
24708 Digital Transformation: Actionable Insights to Optimize the Building Performance

Authors: Jovian Cheung, Thomas Kwok, Victor Wong

Abstract:

Buildings are entwined with smart city developments. Building performance relies heavily on electrical and mechanical (E&M) systems and services accounting for about 40 percent of global energy use. By bringing together technological advances and energy- and operation-efficiency initiatives in buildings, people can raise building performance and enhance the sustainability of the built environment in their daily lives. Digital transformation of buildings is a profound development through which the city leverages the changes and opportunities of digital technologies. To optimize building performance, an intelligent power quality and energy management system has been developed for transforming data into actions. The system is formed by interfacing and integrating legacy metering and internet of things technologies in the building and applying big data techniques. It provides an operation and energy profile and actionable insights for a building, which make it possible to optimize building performance by raising people's awareness of E&M services and energy consumption, predicting the operation of E&M systems, benchmarking building performance, and prioritizing asset and energy management opportunities. The intelligent power quality and energy management system comprises four elements, namely the Integrated Building Performance Map, Building Performance Dashboard, Power Quality Analysis, and Energy Performance Analysis. It provides a predictive operation sequence of E&M systems in response to the built environment and building activities. The system collects the live operating conditions of E&M systems over time to identify abnormal system performance, predict failure trends, and alert users before an anticipated system failure. The actionable insights collected can also be used for system design enhancement in the future.
This paper illustrates how the intelligent power quality and energy management system provides an operation and energy profile to optimize building performance and actionable insights to revitalize an existing building into a smart building. The system is driving building performance optimization and supporting the development of Hong Kong into a suitable smart city to be admired.

Keywords: intelligent buildings, internet of things technologies, big data analytics, predictive operation and maintenance, building performance

Procedia PDF Downloads 132
24707 A Survey of Semantic Integration Approaches in Bioinformatics

Authors: Chaimaa Messaoudi, Rachida Fissoune, Hassan Badir

Abstract:

Technological advances in computer science and data analysis are continuously producing huge volumes of biological data, which are available on the web. Such advances require powerful data integration techniques to extract pertinent knowledge and information for a specific question. Biomedical exploration of these big data often requires the use of complex queries across multiple autonomous, heterogeneous, and distributed data sources. Semantic integration is an active area of research in several disciplines, such as databases, information integration, and ontology. We provide a survey of some approaches and techniques for integrating biological data, focusing on those developed in the ontology community.

Keywords: biological ontology, linked data, semantic data integration, semantic web

Procedia PDF Downloads 424
24706 Molecular Identification of Camel Tick and Investigation of Its Natural Infection by Rickettsia and Borrelia in Saudi Arabia

Authors: Reem Alajmi, Hind Al Harbi, Tahany Ayaad, Zainab Al Musawi

Abstract:

Hard ticks Hyalomma spp. (family: Ixodidae) are obligate ectoparasites in all their life stages on some domestic animals, mainly camels and cattle. Ticks may lead to many economic and public health problems because of their blood-feeding behavior. They also act as vectors for many bacterial, viral, and protozoan agents that may cause serious diseases such as tick-borne encephalitis, Rocky Mountain spotted fever, Q fever, and Lyme disease, which can affect humans and/or animals. In the present study, ticks that attack camels in the Riyadh region, Saudi Arabia, were identified molecularly based on the partial sequence of the mitochondrial 16S rRNA gene. The study also aims to detect natural infections of the collected camel ticks with Rickettsia spp. and Borrelia spp. using PCR/hybridization of the citrate synthase encoding gene present in bacterial cells. Hard ticks infesting camels were collected from different camels on a farm in the Riyadh region, Saudi Arabia. The results showed that the collected specimens belong to two species: Hyalomma dromedarii, representing 99% of the identified specimens, and Hyalomma marginatum, accounting for 1% of the identified ticks. Molecular identification was made by BLAST comparison of the sequences obtained in this study with sequences already present and identified in GenBank. All obtained sequences of H. dromedarii specimens showed 97-100% identity with the same gene sequence of the same species (Accession # L34306.1), which was used as a reference. Meanwhile, no intraspecific variation of H. marginatum was measured because only one specimen was collected. The results also showed that the intraspecific variability between individuals of H. dromedarii ranged from 0.2 to 6.6% in 92% of samples, while the remaining 7% of the total H. dromedarii samples showed about 10.3% individual differences. However, the interspecific variability between H. dromedarii and H. marginatum was approximately 18.3%.
On the other hand, using PCR/hybridization, we could detect natural infection of camel ticks with Rickettsia spp. and Borrelia spp. The results revealed the natural presence of both bacteria in the collected ticks. Rickettsia spp. infection was present in 29% of collected ticks, while 35% of collected specimens were infected with Borrelia spp. The results of the present study are a new record for the molecular identification of camel ticks in Riyadh, Saudi Arabia, and their natural infection with both Rickettsia spp. and Borrelia spp. These results may help scientists to devise a sound and direct control strategy for ticks in order to protect camels, one of the most important economic animals. The results of this project also spotlight the diseases that might be transmitted by ticks, so that a direct protective plan can be put in place to prevent the spread of these dangerous agents. Further molecular studies using other mitochondrial and nuclear genes for tick identification are needed to confirm the results of the present study.

Keywords: camel ticks, Rickettsia spp., Borrelia spp., mitochondrial 16S rRNA gene

Procedia PDF Downloads 255
24705 Genetic Analysis of the Endangered Mangrove Species Avicennia Marina in Qatar Detected by Inter-Simple Sequence Repeat DNA Markers

Authors: Talaat Ahmed, Amna Babssail

Abstract:

Mangroves are evergreen trees that grow along the coastal areas of Qatar. The largest and oldest area of mangroves can be found around Al-Thakhira and Al-Khor. Other mangrove areas originate from fairly recent plantings by the government, although unfortunately the picturesque mangrove lake in Al-Wakra has now been uprooted. Avicennia marina is the predominant mangrove species found in the region. Mangroves protect and stabilize low-lying coastal land, and provide protection and food sources for estuarine and coastal fishery food chains. They also serve as feeding, breeding and nursery grounds for a variety of fish, crustaceans, reptiles, birds and other wildlife. A total of 21 individuals of A. marina, representing seven diverse natural and artificial populations, were sampled throughout its range in Qatar. Leaves from 2-3 randomly selected trees at each location were collected. The locations are as follows: Al-Rawis, Ras-Madpak, Fuwairt, Summaseima, Al-khour, AL-Mafjar and Zekreet. Total genomic DNA was extracted using the commercial DNeasy Plant System kit (Qiagen, Inc., Valencia, CA) for genetic diversity analysis. A total of 12 Inter-Simple Sequence Repeat (ISSR) primers were used to amplify DNA fragments from the genomic DNA. The 12 ISSR primers amplified polymorphic bands among mangrove samples from different areas as well as within each area, indicating the existence of variation within each area and among the different mangrove areas in Qatar. The results could characterize the Avicennia marina populations that exist in different areas of Qatar and establish DNA fingerprint documentation for mangrove populations to be used in further studies. Moreover, the existence of genetic variation within and among Avicennia marina populations is a strong indication of the ability of such populations to adapt to different environmental conditions in Qatar. This study could be a warning to save the mangroves in Qatar and save the environment as well.

Keywords: DNA fingerprint, Avicennia marina, genetic analysis, Qatar

Procedia PDF Downloads 372
24704 Classification of Generative Adversarial Network Generated Multivariate Time Series Data Featuring Transformer-Based Deep Learning Architecture

Authors: Thrivikraman Aswathi, S. Advaith

Abstract:

Where the use of real data is limited, for example when it is hard to get access to a large volume of real data, synthetic data generation is needed. It produces high-quality synthetic data while maintaining the statistical properties of a specific dataset. In the present work, a generative adversarial network (GAN) is trained to produce multivariate time series (MTS) data, since MTS data is now being gathered more often in various real-world systems. The GAN-generated MTS data is then fed into a transformer-based deep learning architecture that categorizes the data into predefined classes. Further, the model is evaluated across various distinct domains by generating the corresponding MTS data.

Keywords: GAN, transformer, classification, multivariate time series

Procedia PDF Downloads 104
24703 Generative AI: A Comparison of Conditional Tabular Generative Adversarial Networks and Conditional Tabular Generative Adversarial Networks with Gaussian Copula in Generating Synthetic Data with Synthetic Data Vault

Authors: Lakshmi Prayaga, Chandra Prayaga, Aaron Wade, Gopi Shankar Mallu, Harsha Satya Pola

Abstract:

Synthetic data generated by Generative Adversarial Networks and Autoencoders is becoming more common as a way to combat the problem of insufficient data for research purposes. However, generating synthetic data is a tedious task requiring an extensive mathematical and programming background. Open-source platforms such as the Synthetic Data Vault (SDV) and Mostly AI offer a platform that is user-friendly and accessible to non-technical professionals for generating synthetic data to augment existing data for further analysis. The SDV also provides for additions to the generic GAN, such as the Gaussian copula. We present the results from two synthetic data sets (CTGAN data and CTGAN with Gaussian copula) generated by the SDV and report the findings. The results indicate that the ROC and AUC values for the data generated by adding the Gaussian copula layer are much higher than those for the data generated by the CTGAN.

Keywords: synthetic data generation, generative adversarial networks, conditional tabular GAN, Gaussian copula

Procedia PDF Downloads 48
24702 Environmental Radioactivity Analysis by a Sequential Approach

Authors: G. Medkour Ishak-Boushaki, A. Taibi, M. Allab

Abstract:

Quantitative environmental radioactivity measurements are needed to determine the level of exposure of a population to ionizing radiation and to assess the associated risks. Gamma spectrometry remains a very powerful tool for the analysis of radionuclides present in an environmental sample, but the basic problem in such measurements is the low rate of detected events. Using large environmental samples could help to get around this difficulty but, unfortunately, raises new issues of gamma-ray attenuation and self-absorption. Recently, a new method has been suggested to detect and identify, without quantification and in a short time, a gamma ray from a low-count source. Unlike the usual gamma spectrometry measurements, this method does not require a pulse height spectrum acquisition. It is based on a chronological record of each detected photon through simultaneous measurement of its energy ε and its arrival time τ at the detector, the parameter pairs [ε,τ] defining an event mode sequence (EMS). The EMS series are analyzed sequentially by a Bayesian approach to detect the presence of a given radioactive source. The main objective of the present work is to test the applicability of this sequential approach to the detection of radioactive environmental materials. Moreover, for appropriate health oversight of the public and of the workers concerned, the analysis has been extended to obtain a reliable quantification of the radionuclides present in environmental samples. As an illustration, we consider the problem of detecting and quantifying 238U. A Monte Carlo simulated experiment is carried out, consisting of the detection, by a Ge(Hp) semiconductor junction, of 63 keV gamma rays emitted by 234Th (progeny of 238U). The generated EMS series are analyzed by Bayesian inference.
Applying the sequential Bayesian approach to environmental radioactivity analysis offers the possibility of reducing the measurement time without requiring large environmental samples, and consequently avoids the associated drawbacks. The work is still in progress.
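To make the idea of a sequential Bayesian analysis of an event mode sequence concrete, the following is a minimal sketch and not the authors' method. It assumes that events have already been filtered to an energy window around the line of interest, that arrivals follow a Poisson process, and that the background rate and source rate are known. The function names, rates, and prior are all hypothetical.

```python
import math
import random


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sequential_posterior(arrival_times, bkg_rate, src_rate, prior=0.5):
    """Posterior probability that a source is present, updated per event.

    Assumption: inter-arrival times are exponential with rate bkg_rate
    (background only) or bkg_rate + src_rate (source present).
    """
    log_odds = math.log(prior / (1.0 - prior))
    total_rate = bkg_rate + src_rate
    posteriors = []
    previous = 0.0
    for t in arrival_times:
        dt = t - previous
        previous = t
        # Log-likelihood ratio of one exponential waiting time.
        log_odds += math.log(total_rate / bkg_rate) - src_rate * dt
        posteriors.append(_sigmoid(log_odds))
    return posteriors


def simulate_arrivals(rate, n_events, rng):
    """Monte Carlo arrival times of a Poisson process (hypothetical helper)."""
    t, times = 0.0, []
    for _ in range(n_events):
        t += rng.expovariate(rate)
        times.append(t)
    return times


rng = random.Random(0)
with_source = simulate_arrivals(5.0, 200, rng)      # background 2 + source 3
background_only = simulate_arrivals(2.0, 200, rng)  # background 2 only
p_source = sequential_posterior(with_source, 2.0, 3.0)[-1]
p_background = sequential_posterior(background_only, 2.0, 3.0)[-1]
```

Because the posterior is updated after every photon, a decision threshold can be checked as events arrive rather than after a full spectrum has been accumulated, which is the source of the reduced measurement time described above.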

Keywords: Bayesian approach, event mode sequence, gamma spectrometry, Monte Carlo method

Procedia PDF Downloads 479
24701 Biotechnological Interventions for Crop Improvement in Nutricereal Pearl Millet

Authors: Supriya Ambawat, Subaran Singh, C. Tara Satyavathi, B. S. Rajpurohit, Ummed Singh, Balraj Singh

Abstract:

Pearl millet [Pennisetum glaucum (L.) R. Br.] is an important staple food of the arid and semiarid tropical regions of Asia, Africa, and Latin America. It is rightly termed a nutricereal, as it has high nutritional value and is a good source of carbohydrate, protein, fat, ash, dietary fiber, potassium, magnesium, iron, zinc, etc. Pearl millet has a low prolamine fraction and is gluten free, which is useful for people with a gluten allergy. It has several health benefits, such as reduction in blood pressure, thyroid, diabetes, cardiovascular and celiac diseases, but its direct consumption as food has declined significantly for several reasons. Keeping this in view, it is important to reorient efforts to generate demand through value addition and quality improvement and to create awareness of the nutritional merits of pearl millet. In India, multilocational coordinated trials of developed hybrids were conducted at various centers through the Indian Council of Agricultural Research-All India Coordinated Research Project on Pearl Millet. The gene banks of pearl millet contain varieties with high levels of iron and zinc, which were bred with high-yielding varieties to produce new pearl millet varieties with elevated iron levels. Thus, using breeding approaches and biochemical analysis, a total of 167 hybrids and 61 varieties were identified and released for cultivation in different agro-ecological zones of the country, including some biofortified hybrids rich in Fe and Zn. Further, several biotechnological interventions, such as molecular markers, next-generation sequencing (NGS), association mapping, nested association mapping (NAM), MAGIC populations, genome editing, genotyping by sequencing (GBS), and genome-wide association studies (GWAS), have made advances in millet improvement possible by identifying and tagging the genes underlying a trait in the genome. Using DArT markers, very high-density linkage maps were constructed for pearl millet.
Improved HHB67 has been released using marker-assisted selection (MAS) strategies, and genomic tools were used to identify Fe-Zn quantitative trait loci (QTL). The draft genome sequence of millet has also opened various ways to explore pearl millet. Further, the genomic positions of simple sequence repeat (SSR) markers significantly associated with iron and zinc content are being identified in the consensus map, and research is in progress towards mapping QTLs for flour rancidity. The sequence information is being used to explore genes and enzymatic pathways responsible for rancidity of flour. Thus, the development and application of several biotechnological approaches along with biofortification can accelerate the genetic gain targets for pearl millet improvement and help improve its quality.

Keywords: biotechnological approaches, genomic tools, malnutrition, MAS, nutricereal, pearl millet, sequencing

Procedia PDF Downloads 155
24700 Approximation Property Pass to Free Product

Authors: Kankeyanathan Kannan

Abstract:

Approximation properties of group C*-algebras are everywhere; they are powerful, important, and the backbone of countless breakthroughs. For a discrete group G, let A(G) denote its Fourier algebra, and let M₀A(G) denote the space of completely bounded Fourier multipliers on G. An approximate identity on G is a sequence (Φn) of finitely supported functions such that (Φn) converges uniformly to the constant function 1. In this paper we prove that the approximation property passes to free products.

Keywords: approximation property, weakly amenable, strong invariant approximation property, invariant approximation property

Procedia PDF Downloads 656
24699 In silico Comparative Analysis of Chloroplast Genome (cpDNA) and Some Individual Genes (rbcL and trnH-psbA) in Pooideae Subfamily Members

Authors: Ibrahim Ilker Ozyigit, Ertugrul Filiz, Ilhan Dogan

Abstract:

An in silico analysis of Brachypodium distachyon, Triticum aestivum, Festuca arundinacea, Lolium perenne, and Hordeum vulgare subsp. vulgare of the Pooideae was performed based on complete chloroplast genomes, as well as on the rbcL coding and trnH-psbA intergenic spacer regions alone, to compare phylogenetic resolving power. Neighbor-joining, Minimum Evolution, and Unweighted Pair Group Method with Arithmetic Mean methods were used to reconstruct phylogenies; the highest bootstrap support was obtained with the data from the whole chloroplast genome sequence. The highest and lowest values from nucleotide diversity (π) analysis were found to be 0.315813 and 0.043495, in the rbcL coding region of the chloroplast genome and in the complete chloroplast genome, respectively. The highest transition/transversion bias (R) value was recorded as 1.384 in complete chloroplast genomes. The F. arundinacea-L. perenne clade was recovered in all phylogenies. Sequences of the rbcL and trnH-psbA regions were not able to resolve the Pooideae phylogenies due to lack of genetic variation.
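For readers unfamiliar with the nucleotide diversity (π) statistic reported above, here is a minimal sketch of its usual definition: the average proportion of differing sites over all pairs of aligned sequences. The sequences below are invented toy data and are not taken from the study; the function name is hypothetical, and gaps and ambiguous bases are not handled.

```python
from itertools import combinations


def nucleotide_diversity(aligned_seqs):
    """Average pairwise proportion of differing sites (pi).

    Assumes all sequences are already aligned to the same length.
    """
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(aligned_seqs, 2))
    total = 0.0
    for a, b in pairs:
        # Count the sites at which the two sequences differ.
        differences = sum(1 for x, y in zip(a, b) if x != y)
        total += differences / length
    return total / len(pairs)


# Toy alignment (hypothetical data): 3 sequences, 4 sites.
toy = ["ACGT", "ACGA", "TCGA"]
pi = nucleotide_diversity(toy)
```

In the toy alignment the three pairs differ at 1, 2, and 1 of 4 sites, so π is 4/12, or about 0.333. A region with little variation gives a π close to zero.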

Keywords: chloroplast DNA, Pooideae, phylogenetic analysis, rbcL, trnH-psbA

Procedia PDF Downloads 356
24698 Contaminated Sites Prioritization Process Promoting and Redevelopment Planning

Authors: Che-An Lin, Wan-Ying Tsai, Ying-Shin Chen, Yu-Jen Chung

Abstract:

With the number and area of contaminated sites continuing to increase in Taiwan, the Government has to make a priority list for screening contaminated sites under limited funds and information. This study investigated the 261 contaminated sites announced by the Taiwan EPA (excluding agricultural lands); after preliminary screening, 211 valid records were used to propose a screening system, and removed contaminated sites were used to check its accuracy. The system includes two dimensions, which create the sequence and are plotted on X and Y axes to construct four quadrants. One dimension covers environmental and social priority and the other economic priority. The evaluated items were population density, land values, traffic hub, pollutant compound, pollutant concentrations, pollutant transport pathways, land usage sites, site areas, and water conductivity. The screening classified the sites as: 1. prioritization promoting sites (10%), 2. environmental and social priority sites (17%), 3. economic priority sites (30%), and 4. non-priority sites (43%). Finally, this study used three of the removed contaminated sites to verify the screening system. As surmised, each of them is in line with the priority site and economic priority site classes.
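The two-dimension, four-quadrant screening described above can be sketched as follows. This is an illustrative assumption only: the abstract does not say how the evaluated items are combined into scores or where the axes are cut, so the 0 to 1 scores, the 0.5 thresholds, and the site names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    env_social: float  # hypothetical 0-1 environmental and social score
    economic: float    # hypothetical 0-1 economic score


def classify(site, env_cut=0.5, econ_cut=0.5):
    """Assign a site to one of four quadrants (thresholds are assumed)."""
    high_env = site.env_social >= env_cut
    high_econ = site.economic >= econ_cut
    if high_env and high_econ:
        return "prioritization promoting site"
    if high_env:
        return "environmental and social priority site"
    if high_econ:
        return "economic priority site"
    return "non-priority site"


# Invented example sites, one per quadrant.
sites = [
    Site("site A", 0.8, 0.9),
    Site("site B", 0.7, 0.2),
    Site("site C", 0.1, 0.6),
    Site("site D", 0.3, 0.4),
]
labels = {s.name: classify(s) for s in sites}
```

Keeping the two dimensions separate, rather than summing them into one score, preserves the distinction between sites that matter for environmental and social reasons and those that matter for economic reasons.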

Keywords: contaminated sites, redevelopment, environmental, economics

Procedia PDF Downloads 455
24697 A Privacy Protection Scheme Supporting Fuzzy Search for NDN Routing Cache Data Name

Authors: Feng Tao, Ma Jing, Guo Xian, Wang Jing

Abstract:

Named Data Networking (NDN) replaces the IP address of the traditional network with a data name and adopts a dynamic cache mechanism. In the existing mechanism, however, only one-to-one search can be achieved because every data item has a unique name corresponding to it. There is a certain mapping relationship between data content and data name, so if the data name is intercepted by an adversary, the privacy of the data content and of the user's interest can hardly be guaranteed. To solve this problem, this paper proposes a one-to-many fuzzy search scheme based on order-preserving encryption, which reduces the query overhead by optimizing the caching strategy. In this scheme, we use hash values to keep the user's query safe from each node during the search, and likewise to protect the privacy of the requested data content.

Keywords: NDN, order-preserving encryption, fuzzy search, privacy

Procedia PDF Downloads 462
24696 Synthesis, Inhibitory Activity, and Molecular Modelling of 2-Hydroxy-3-Oxo-3-Phenylpropionate Derivatives as HIV-1-Integrase Inhibitors

Authors: O. J. Jesumoroti, Faridoon, R. Klein, K. A. Iobb, D. Mnkadhla, H. C. Hoppe, P. T. Kaye

Abstract:

The 1,3-aryl diketo acid (DKA)-based agents represent an important class of HIV integrase (IN) strand transfer inhibitors. In order to study the chelating role of the divalent metal ion in the inhibition of IN strand transfer, we designed and synthesized a series of 2-hydroxy-3-oxo-3-phenylpropionate derivatives with the notion that such compounds could interact with the divalent ion in the active site of IN. The synthetic sequence to the desired compounds involves Doebner-Knoevenagel condensation, Fischer esterification, and ketohydroxylation using a nucleophilic re-oxidant; the compounds were characterized by their IR, 1H NMR, 13C NMR, and HRMS spectroscopic data and by melting point determination. Molecular docking was also employed in this study and revealed interaction with the active site of the enzyme. However, there is a disparity with the corresponding anti-HIV activity determined by the experimental bioassay: these compounds lack potency at low micromolar concentrations when compared with the results of the docking studies. Nevertheless, the results of the study suggest modification of the aryl ring with one or two hydroxyl groups to improve the inhibitory activity.

Keywords: anti-HIV-1 integrase, ketohydroxylation, molecular docking, propionate derivatives

Procedia PDF Downloads 176