Search results for: whole exome sequencing data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25180

24730 Bean in Turkey: Characterization, Inter Gene Pool Hybridization Events, Breeding, Utilizations

Authors: Faheem Shahzad Baloch, Muhammad Azhar Nadeem, Muhammad Amjad Nawaz, Ephrem Habyarimana, Gonul Comertpay, Tolga Karakoy, Rustu Hatipoglu, Mehmet Zahit Yeken, Vahdettin Ciftci

Abstract:

Turkey is considered a bridge between Europe, Asia, and Africa and possibly played an important role in the distribution of many crops, including common bean. Hundreds of common bean landraces can be found in Turkey, particularly in farmers’ fields, and they consistently contribute to overall production. To investigate the existing genetic diversity and hybridization events between the Andean and Mesoamerican gene pools in the Turkish common bean, 188 common bean accessions (182 landraces and 6 modern cultivars as controls) were collected from 19 different Turkish geographic regions. These accessions were characterized using phenotypic data (growth habit and seed weight), geographic provenance, and 12557 high-quality whole-genome DArTseq markers; 3767 novel DArTseq loci were also identified. The clustering algorithms resolved the Turkish common bean landrace germplasm into the two recognized gene pools, the Mesoamerican and the Andean. Hybridization events were observed in both gene pools (14.36% of the accessions), mostly in the Mesoamerican (7.97% of the accessions), and were low relative to previous European studies. This lower level of hybridization indicates that Turkish common bean germplasm has remained closer to its original form than European germplasm. The Mesoamerican gene pool showed a higher level of diversity, while the Andean gene pool was predominant (56.91% of the accessions) but genetically less diverse and phenotypically purer, reflecting farmers’ greater preference for the Andean gene pool. We also found some genetically distinct landraces and, overall, a meaningful level of genetic variability that can be used by the scientific community in breeding efforts to develop superior common bean strains.

Keywords: bean germplasm, DArTseq markers, genotyping by sequencing, Turkey, whole genome diversity

Procedia PDF Downloads 240
24729 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning

Authors: Walid Cherif

Abstract:

Data mining has seen major advances in recent years because of the spread of the internet, which generates a tremendous volume of data every day, and because of advances in technologies that facilitate the analysis of these data. In particular, classification techniques are a subdomain of data mining that determines which group each data instance belongs to within a given dataset; they are used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or based on machine learning, and each type has its own limits. Data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach that differs from existing algorithms in its reliability computations. The proposed approach outperformed most common classification techniques, with an F-measure exceeding 97% on the IRIS dataset.

Keywords: data mining, knowledge discovery, machine learning, similarity measurement, supervised classification

Procedia PDF Downloads 460
24728 Student Feedback and Its Impact on Fostering the Quality of Teaching at the Academia

Authors: S. Vanker, A. Aaver, A. Roio, L. Nuut

Abstract:

To distinguish effective from less effective or ineffective approaches to course instruction, we hold that faculty members need regular feedback from their students in order to know how well their teaching styles have worked. Undergraduate students’ motivation can be sustained by continually improving the quality of teaching and by properly sequencing the academic courses in both the curricula and the timetables. At the Estonian Aviation Academy, four different forms of feedback are used: lecture monitoring, questionnaires for all students, subject monitoring in the study information system, and direct feedback received by the lecturer. Questionnaires for all students are arranged once per study year, separately for first-year and senior students. The results are discussed in academic departments together with student representatives and analyzed with the teaching staff, and improvements are suggested if needed. In addition, a monitoring system is planned in which a lecturer acts in both roles, as an observer and as the lecturer. This will foster a better exchange of experience and thereby help to make the whole study process more interesting.

Keywords: learner motivation, feedback, student support, undergraduate education

Procedia PDF Downloads 317
24727 Benders Decomposition Approach to Solve the Hybrid Flow Shop Scheduling Problem

Authors: Ebrahim Asadi-Gangraj

Abstract:

The hybrid flow shop scheduling problem (HFS) involves sequencing in a flow shop where, at any stage, there exist one or more related or unrelated parallel machines. This production system is a common manufacturing environment in many real industries, such as the steel manufacturing, ceramic tile manufacturing, and car assembly industries. In this research, a mixed integer linear programming (MILP) model is presented for the hybrid flow shop scheduling problem, in which the objective is to minimize the maximum completion time (makespan). For this purpose, a Benders Decomposition (BD) method is developed to solve the research problem. The proposed approach is tested on small- to moderate-scale test problems. The experimental results show that the Benders decomposition approach can solve the hybrid flow shop scheduling problem in a reasonable time, especially for small and moderate-size test problems.

Keywords: hybrid flow shop, mixed integer linear programming, Benders decomposition, makespan

Procedia PDF Downloads 181
24726 Seismic Data Scaling: Uncertainties, Potential and Applications in Workstation Interpretation

Authors: Ankur Mundhra, Shubhadeep Chakraborty, Y. R. Singh, Vishal Das

Abstract:

Seismic data scaling affects the dynamic range of the data, and with the present-day lower cost of storage and higher reliability of hard disk data, scaling is not recommended. However, when dealing with data of different vintages, which were perhaps processed in 16 bits or even 8 bits and need to be processed together with the available 32-bit data, scaling is performed. Scaling also amplifies low-amplitude events in deeper regions, which otherwise disappear because high-amplitude shallow events saturate the amplitude scale. We have focused on the significance of scaling data to aid interpretation. This study elucidates a proper seismic loading procedure in workstations without using the default preset parameters available in most software suites. Differences in, and the distribution of, amplitude values at different depths are probed in this exercise. Proper loading parameters are identified, and the associated steps that need to be taken care of while loading data are explained. Finally, the exercise interprets the uncertainties that might arise when correlating scaled and unscaled versions of seismic data with synthetics. Because a seismic well tie correlates the seismic reflection events with well markers, it is used in our study to identify regions that are enhanced and/or affected by the scaling parameter(s).

Keywords: clipping, compression, resolution, seismic scaling

Procedia PDF Downloads 464
24725 Using Multiomic Plasma Profiling From Liquid Biopsies to Identify Potential Signatures for Disease Diagnostics in Late-Stage Non-small Cell Lung Cancer (NSCLC) in Trinidad and Tobago

Authors: Nicole Ramlachan, Samuel Mark West

Abstract:

Lung cancer is the leading cause of cancer-associated deaths in North America; the vast majority of cases are non-small cell lung cancer (NSCLC), which has a five-year survival rate of only 24%. Non-invasive discovery of biomarkers associated with early diagnosis of NSCLC, using liquid biopsy-based multiomics profiling of plasma, can enable precision oncology efforts. Although tissue biopsies are currently the gold standard for tumor profiling, this method has many limitations: biopsies are invasive, risky, and sometimes hard to obtain, and they give only a limited tumor profile. Blood-based tests provide a less invasive, more robust approach to interrogating both tumor- and non-tumor-derived signals. We intend to examine 30 stage III-IV NSCLC patients pre-surgery and collect plasma samples. Cell-free DNA (cfDNA) will be extracted from plasma, and next-generation sequencing (NGS) performed. Through the analysis of tumor-specific alterations, including single nucleotide variants (SNVs), insertions, deletions, copy number variations (CNVs), and methylation alterations, we intend to identify tumor-derived DNA (ctDNA) among the total pool of cfDNA. This would generate data to be used as an accurate form of cancer genotyping for diagnostic purposes. Liquid biopsies offer opportunities to improve the surveillance of cancer patients during treatment and would supplement current diagnosis and tumor profiling strategies with an approach previously not readily available in Trinidad and Tobago. It would be advantageous to use this approach in diagnosis and tumor profiling as well as to monitor cancer patients, providing early information regarding disease evolution and treatment efficacy and allowing treatment strategies to be reoriented in time, thereby improving clinical oncology outcomes.

Keywords: genomics, multiomics, clinical genetics, genotyping, oncology, diagnostics

Procedia PDF Downloads 156
24724 Association of Social Data as a Tool to Support Government Decision Making

Authors: Diego Rodrigues, Marcelo Lisboa, Elismar Batista, Marcos Dias

Abstract:

Based on data on child labor, this work raises questions about how to understand and locate the factors that make up child labor rates, and which properties are important for analyzing these cases. Data mining techniques for discovering valid patterns in Brazilian social databases were used to evaluate child labor data from the State of Tocantins (located in the north of Brazil, with a territory of 277000 km² and comprising 139 counties). This work aims to detect factors that are determinants of the practice of child labor and their relationships with financial, educational, regional, and social indicators, generating information that is not explicit in the government database and thus enabling better monitoring and updating of policies for this purpose.

Keywords: social data, government decision making, association of social data, data mining

Procedia PDF Downloads 365
24723 Outlier Detection in Stock Market Data using Tukey Method and Wavelet Transform

Authors: Sadam Alwadi

Abstract:

Outlier values are a problem that frequently occurs in the data observation or recording process, so data imputation has become an essential matter. This work makes use of methods described in prior work to detect outlier values in a collection of stock market data. To implement the detection and find solutions that may be helpful for investors, real closing price data were obtained from the Amman Stock Exchange (ASE). The Tukey and Maximum Overlapping Discrete Wavelet Transform (MODWT) methods will be used to detect and impute the outlier values.
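
The Tukey step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the conventional fence multiplier of 1.5 and uses made-up closing prices rather than ASE data; the function name `tukey_outliers` is hypothetical, and the MODWT step is not covered here.

```python
from statistics import quantiles


def tukey_outliers(values, k=1.5):
    """Flag values outside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is the conventional multiplier; it is an assumption here,
    not a setting taken from the paper.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return lower, upper, outliers


# Illustrative closing prices (invented for this example).
prices = [10.1, 10.3, 10.2, 10.4, 10.0, 10.2, 14.9, 10.3]
lower, upper, outliers = tukey_outliers(prices)
print(outliers)
```

A flagged value could then be replaced by an imputed one, for example the median of its neighbours, although the abstract does not say which imputation rule the authors use.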

Keywords: outlier values, imputation, stock market data, detecting, estimation

Procedia PDF Downloads 78
24722 Identification and Characterization of Heavy Metal Resistant Bacteria from the Klip River

Authors: P. Chihomvu, P. Stegmann, M. Pillay

Abstract:

Pollution of the Klip River has caused microorganisms inhabiting it to develop protective survival mechanisms. This study isolated and characterized the heavy metal resistant bacteria in the Klip River. Water and sediment samples were collected from six sites along the course of the river. The pH, turbidity, salinity, temperature and dissolved oxygen were measured in-situ. The concentrations of six heavy metals (Cd, Cu, Fe, Ni, Pb, and Zn) of the water samples were determined by atomic absorption spectroscopy. Biochemical and antibiotic profiles of the isolates were assessed using the API 20E® and Kirby Bauer Method. Growth studies were carried out using spectrophotometric methods. The isolates were identified using 16SrDNA sequencing. The uppermost part of the Klip River with the lowest pH had the highest levels of heavy metals. Turbidity, salinity and specific conductivity increased measurably at Site 4 (Henley on Klip Weir). MIC tests showed that 16 isolates exhibited high iron and lead resistance. Antibiotic susceptibility tests revealed that the isolates exhibited multi-tolerances to drugs such as tetracycline, ampicillin, and amoxicillin.

Keywords: Klip River, heavy metals, resistance, 16SrDNA

Procedia PDF Downloads 321
24721 PEINS: A Generic Compression Scheme Using Probabilistic Encoding and Irrational Number Storage

Authors: P. Jayashree, S. Rajkumar

Abstract:

With social networks and smart devices generating a multitude of data, effective data management is the need of the hour for networks and cloud applications. Some applications need efficient storage, while others need efficient communication over networks, and data reduction is a handy solution that meets both requirements. Most data compression techniques are based on data statistics and may result in either lossy or lossless data reduction. Though lossy reduction produces better compression ratios than lossless methods, many applications require data accuracy and minute details to be preserved. A variety of data compression algorithms exists in the literature for different forms of data such as text, image, and multimedia data. In the proposed work, a generic progressive compression algorithm based on probabilistic encoding, called PEINS, is put forward as an enhancement of the irrational number storage coding technique; it addresses the storage issues of increasing data volumes as a cost-effective solution and also offers, to some extent, data security as a secondary outcome. The proposed work shows cost effectiveness in terms of a better compression ratio with no deterioration in compression time.

Keywords: compression ratio, generic compression, irrational number storage, probabilistic encoding

Procedia PDF Downloads 287
24720 IoT Device Cost Effective Storage Architecture and Real-Time Data Analysis/Data Privacy Framework

Authors: Femi Elegbeleye, Omobayo Esan, Muienge Mbodila, Patrick Bowe

Abstract:

This paper focuses on a cost-effective storage architecture using a fog and cloud data storage gateway and presents the design of a data privacy model and a data analytics framework for real-time analysis using machine learning methods. The paper begins with the system analysis, the system architecture and its component design, and the overall system operations. The results obtained from this study of the data privacy model show that combining two or more data privacy models yields stronger privacy for the data, and that a fog storage gateway has several advantages over traditional cloud storage: our results show that fog has reduced latency/delay, low bandwidth consumption, and low energy usage compared with cloud storage, so fog storage will help to lessen excessive cost. The paper dwells on the system descriptions, with a focus on the research design and the framework design for the data privacy model, data storage, and real-time analytics. It also shows the major system components and their framework specification. Lastly, the overall research system architecture, its structure, and its interrelationships are shown.

Keywords: IoT, fog, cloud, data analysis, data privacy

Procedia PDF Downloads 92
24719 Comparison of Selected Pier-Scour Equations for Wide Piers Using Field Data

Authors: Nordila Ahmad, Thamer Mohammad, Bruce W. Melville, Zuliziana Suif

Abstract:

Current methods for predicting local scour at wide bridge piers were developed on the basis of laboratory studies, and very few scour predictions have been tested with field data. A laboratory wide-pier scour equation from previous findings is presented and evaluated with field data. A wide range of field data was used, consisting of both live-bed and clear-water scour. A method for assessing the quality of the data was developed and applied to the data set. Three other wide-pier scour equations from the literature were used to compare the performance of each predictive method. The best-performing scour equation was analyzed using statistical analysis. Comparisons of computed and observed scour depths indicate that the equation from the previous publication produced the smallest discrepancy ratio and RMSE value when compared with the large amount of laboratory and field data.
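
The two comparison metrics named in this abstract can be sketched as follows. The sketch assumes that the discrepancy ratio is the computed scour depth divided by the observed depth (some studies use the logarithm of that ratio instead), and the depths are invented for illustration; the function names are hypothetical.

```python
import math


def discrepancy_ratios(computed, observed):
    """Per-site ratio of computed to observed scour depth (assumed definition)."""
    return [c / o for c, o in zip(computed, observed)]


def rmse(computed, observed):
    """Root-mean-square error between computed and observed scour depths."""
    squared = [(c - o) ** 2 for c, o in zip(computed, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative scour depths in metres (not field data from the paper).
observed = [1.0, 2.0, 4.0]
computed = [1.5, 2.0, 3.0]

ratios = discrepancy_ratios(computed, observed)
error = rmse(computed, observed)
print(ratios, round(error, 3))
```

A ratio above 1 indicates over-prediction and a ratio below 1 under-prediction, which is why the ratio is usually reported alongside an aggregate error such as RMSE.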

Keywords: field data, local scour, scour equation, wide piers

Procedia PDF Downloads 400
24718 Genome-Wide Insights into Whole Gut Microbiota of Rainbow Trout, Oncorhynchus mykiss, Associated with Changes in Dietary Composition and Temperature Regimens

Authors: John N. Idenyi, Hadimundeen Abdallah, Abigeal D. Adeyemi, Jonathan C. Eya

Abstract:

Gut microbiomes play a significant role in the growth, metabolism, and health of fish. However, we know very little about the interactive effects of variations in dietary composition and temperature on rainbow trout gut microbiota. Exactly 288 rainbow trout weighing 45.6 g ± 0.05 (average ± SD) were fed four isocaloric, isolipidic, and isonitrogenous diets comprising 40% crude protein and 20% crude lipid, formulated as 100% animal-based protein (AP) with a blend of 50 fish oil (FO)/50 camelina oil (CO), 100% AP with 100% CO, 100% plant-based protein (PP) with a blend of 50 FO/50 CO, or 100% PP with 100% CO, at 14 or 18°C for 150 days. Gut content was analyzed using 16S rRNA gene and shotgun sequencing. The most abundant phyla identified regardless of diet were Tenericutes, Firmicutes, Proteobacteria, Spirochaetes, Bacteroidetes, and Actinobacteria, while Aeromonadaceae and Enterobacteriaceae were the dominant families at 18°C. Moreover, gut microbes were dominated by genes relating to amino acid, carbohydrate, fat, and energy metabolism and were influenced by temperature. The shared functional profiles for all the diets suggest that plant protein sources in combination with CO could be as good as fish meal with 50/50 FO and CO in rainbow trout farming.

Keywords: aquafeed, aquaculture, microbiome, rainbow trout

Procedia PDF Downloads 86
24717 The Maximum Throughput Analysis of UAV Datalink 802.11b Protocol

Authors: Inkyu Kim, SangMan Moon

Abstract:

The IEEE 802.11b protocol provides a data rate of up to 11 Mbps, whereas the aerospace industry seeks a higher-data-rate COTS data link system for UAVs. The Total Maximum Throughput (TMT) and delay time have been studied by many researchers in past years. This paper provides the theoretical data throughput performance of a UAV formation flight data link using the existing 802.11b performance theory. We operate UAV formation flight with more than 30 quadcopters using the 802.11b protocol. We predict that the number of UAVs in a formation flight is bounded by the performance limitations of the data link protocol.

Keywords: UAV datalink, UAV formation flight datalink, UAV WLAN datalink application, UAV IEEE 802.11b datalink application

Procedia PDF Downloads 387
24716 Methods for Distinction of Cattle Using Supervised Learning

Authors: Radoslav Židek, Veronika Šidlová, Radovan Kasarda, Birgit Fuerst-Waltl

Abstract:

Machine learning represents a set of topics dealing with the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction based on models derived from existing data. The data can present identification patterns that are used to classify observations into groups. The result of the analysis is a pattern that can be used to identify a data set without the need to obtain the input data used to create this pattern. Important requirements in this process are careful data preparation, validation of the model used, and its suitable interpretation. For breeders, it is important to know the origin of animals from the point of view of genetic diversity. When pedigree information is missing, other methods can be used to trace an animal’s origin. The genetic diversity recorded in genetic data holds relatively useful information for identifying animals originating from individual countries. We conclude that the application of data mining to molecular genetic data using supervised learning is an appropriate tool for hypothesis testing and for identifying an individual.

Keywords: genetic data, Pinzgau cattle, supervised learning, machine learning

Procedia PDF Downloads 542
24715 Router 1X3 - RTL Design and Verification

Authors: Nidhi Gopal

Abstract:

Routing is the process of moving a packet of data from source to destination and enables messages to pass from one computer to another and eventually reach the target machine. A router is a networking device that forwards data packets between computer networks. It is connected to two or more data lines from different networks (as opposed to a network switch, which connects data lines from one single network). This paper mainly emphasizes upon the study of router device, its top level architecture, and how various sub-modules of router i.e. Register, FIFO, FSM and Synchronizer are synthesized, and simulated and finally connected to its top module.

Keywords: data packets, networking, router, routing

Procedia PDF Downloads 800
24714 Noise Reduction in Web Data: A Learning Approach Based on Dynamic User Interests

Authors: Julius Onyancha, Valentina Plekhanova

Abstract:

One of the significant issues facing web users is the amount of noise in web data which hinders the process of finding useful information in relation to their dynamic interests. Current research works consider noise as any data that does not form part of the main web page and propose noise web data reduction tools which mainly focus on eliminating noise in relation to the content and layout of web data. This paper argues that not all data that form part of the main web page is of a user interest and not all noise data is actually noise to a given user. Therefore, learning of noise web data allocated to the user requests ensures not only reduction of noisiness level in a web user profile, but also a decrease in the loss of useful information hence improves the quality of a web user profile. Noise Web Data Learning (NWDL) tool/algorithm capable of learning noise web data in web user profile is proposed. The proposed work considers elimination of noise data in relation to dynamic user interest. In order to validate the performance of the proposed work, an experimental design setup is presented. The results obtained are compared with the current algorithms applied in noise web data reduction process. The experimental results show that the proposed work considers the dynamic change of user interest prior to elimination of noise data. The proposed work contributes towards improving the quality of a web user profile by reducing the amount of useful information eliminated as noise.

Keywords: web log data, web user profile, user interest, noise web data learning, machine learning

Procedia PDF Downloads 261
24713 Microarrays: Wide Clinical Utilities and Advances in Healthcare

Authors: Salma M. Wakil

Abstract:

Advances in the field of genetics have made it possible to detect a large number of inherited disorders at the molecular level and have directed the development of innovative technologies. These innovations have led to gene sequencing, prenatal mutation detection, pre-implantation genetic diagnosis, population-based carrier screening, and genome-wide analyses using microarrays. Microarrays are widely used in establishing clinical and diagnostic setups for genetic anomalies at a massive scale, with the advent of cytoscan molecular karyotyping as a clinical utility card for detecting chromosomal aberrations with high coverage across the entire human genome. Unlike a regular karyotype that relies on the microscopic inspection of chromosomes, molecular karyotyping with cytoscan constructs virtual chromosomes based on copy number analysis of DNA, which improves its resolution 100-fold. We have been investigating a large number of patients with developmental delay and intellectual disability with this platform to establish microdeletion syndromes and have detected a number of novel CNVs of clinical relevance in the Arabian population.

Keywords: microarrays, molecular karyotyping, developmental delay, genetics

Procedia PDF Downloads 450
24712 Expression of Selected miRNAs in Placenta of the Intrauterine Restricted Growth Fetuses in Cattle

Authors: Karolina Rutkowska, Hubert Pausch, Jolanta Oprzadek, Krzysztof Flisikowski

Abstract:

The placenta is one of the most important organs and plays a crucial role in fetal growth and development. Placental dysfunction is one of the primary causes of intrauterine growth restriction (IUGR). Cattle have a cotyledonary placenta, which consists of two anatomical parts: fetal and maternal. In cattle, during the first months of pregnancy, it is very easy to separate the maternal caruncle from the fetal cotyledon tissue, easier in fact than removing an ordinary glove from one's hand, which makes it easier to conduct tissue-specific molecular studies. Typically, animal models for the study of IUGR are created using surgical methods and malnutrition of the pregnant mother or, in the case of mice, by genetic modifications. However, the proposed cattle model with the MIMT1Del/WT deletion is unique because it was created without any surgical methods, which significantly distinguishes it from the other animal models. The primary objective of the study was to identify differential expression of selected miRNAs in the placenta from normal and intrauterine growth restricted fetuses. The expression of miRNA was examined in the fetal and maternal parts of the placenta from 24 fetuses (12 samples from the fetal part of the placenta and 12 samples from the maternal part of the placenta). miRNA sequencing was performed in the placenta of MIMT1Del/WT fetuses and MIMT1WT/WT fetuses, and miRNAs involved in fetal growth and development were then selected. miRNA expression was analyzed by reverse-transcription polymerase chain reaction (RT-PCR) on an ABI7500 machine, with SNORD47 as the reference gene. The results were expressed as $2^{\Delta\Delta C_t}$, where $\Delta\Delta C_t = (C_{t,ij} - C_{t,\mathrm{SNORD47}j}) - (C_{t,i1} - C_{t,\mathrm{SNORD47}1})$; here $C_{t,ij}$ and $C_{t,\mathrm{SNORD47}j}$ are the Ct values for gene $i$ and for SNORD47 in sample $j$, and $C_{t,i1}$ and $C_{t,\mathrm{SNORD47}1}$ are the Ct values in sample 1.
Differences between groups were evaluated by one-way analysis of variance (ANOVA), and Bonferroni’s tests were used for interpretation of the data. All normalised miRNA expression values are expressed as natural logarithms. The data are expressed as least squares means with standard errors. Significance was declared when P < 0.05. The study shows that miRNA expression depends on the part of the placenta from which the miRNAs originate (fetal or maternal) and on the genotype of the animal. miRNAs offer a new approach to studying IUGR. Corresponding tissue samples were collected according to standard veterinary protocols and the European Union Normative for Care and Use of Experimental Animals. All animal experiments were approved by the Animal Ethics Committee of the State Provincial Office of Southern Finland (ESAVI-2010-08583/YM-23).
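
The ΔΔCt normalisation described in this abstract can be sketched numerically. The Ct values below are invented, and the function name is hypothetical. The final step uses the widely cited 2^(-ΔΔCt) convention for relative expression, which is an assumption here: the sign of the exponent in the abstract's own expression cannot be confirmed from the extracted text.

```python
def delta_delta_ct(ct_gene_j, ct_ref_j, ct_gene_1, ct_ref_1):
    """ΔΔCt = (Ct_gene,j - Ct_ref,j) - (Ct_gene,1 - Ct_ref,1).

    Sample 1 is the calibrator sample; "ref" stands for the reference
    gene (SNORD47 in the abstract).
    """
    return (ct_gene_j - ct_ref_j) - (ct_gene_1 - ct_ref_1)


# Illustrative Ct values (not data from the study).
ddct = delta_delta_ct(ct_gene_j=26.0, ct_ref_j=20.0,
                      ct_gene_1=24.0, ct_ref_1=20.0)

# Assumed convention: relative expression = 2 ** (-ΔΔCt).
fold_change = 2 ** (-ddct)
print(ddct, fold_change)
```

Under that convention, a positive ΔΔCt means the miRNA needed more cycles to reach threshold in sample j than in the calibrator, so its relative expression is below 1.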

Keywords: placenta, intrauterine growth restriction, miRNA, cattle

Procedia PDF Downloads 311
24711 Data Mining and Knowledge Management Application to Enhance Business Operations: An Exploratory Study

Authors: Zeba Mahmood

Abstract:

Modern business organizations are adopting technological advancements to achieve a competitive edge and satisfy their consumers. Developments in information technology systems have changed the way business is conducted today. Business operations today rely more on the data they obtain, and this data is continuously increasing in volume. Data stored in different locations is difficult to find and use without the effective implementation of data mining and knowledge management techniques. Organizations that smartly identify, obtain, and then convert data into useful formats for decision making and operational improvement create additional value for their customers and enhance their operational capabilities. Marketers and customer relationship departments of firms use data mining techniques to make relevant decisions. This paper emphasizes the identification of different data mining and knowledge management techniques that are applied in different business industries. The challenges and issues in executing these techniques are also discussed and critically analyzed.

Keywords: knowledge, knowledge management, knowledge discovery in databases, business, operational, information, data mining

Procedia PDF Downloads 526
24710 Indexing and Incremental Approach Using Map Reduce Bipartite Graph (MRBG) for Mining Evolving Big Data

Authors: Adarsh Shroff

Abstract:

Big data is a collection of datasets so large and complex that it becomes difficult to process using database management tools. Operations such as search, analysis, and visualization are performed on big data using data mining, which is the process of extracting patterns or knowledge from large data sets. The results of data mining applications become stale and obsolete over time. Incremental processing is a promising approach to refreshing mining results: it utilizes previously saved states to avoid the expense of re-computation from scratch. This project uses i2MapReduce, an incremental processing extension to MapReduce, the most widely used framework for mining big data. i2MapReduce performs key-value pair level incremental processing rather than task-level re-computation, supports not only one-step computation but also more sophisticated iterative computation, which is widely used in data mining applications, and incorporates a set of novel techniques to reduce I/O overhead for accessing preserved fine-grain computation states. To optimize the mining results, i2MapReduce is evaluated using a one-step algorithm and three iterative algorithms with diverse computation characteristics.
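
The idea of refreshing results from saved per-key state, rather than recomputing from scratch, can be shown with a toy word count. This is a plain-Python sketch of the general principle only; it does not use or reproduce the i2MapReduce API, and the function names are hypothetical.

```python
from collections import Counter


def map_record(record):
    """Map step: emit (word, count) pairs for one input record."""
    return Counter(record.split())


def incremental_update(state, added=(), removed=()):
    """Refresh saved per-key counts using only the changed records.

    `state` is the previously saved result; only keys touched by the
    added or removed records are recomputed.
    """
    new_state = Counter(state)
    for record in added:
        new_state.update(map_record(record))
    for record in removed:
        new_state.subtract(map_record(record))
    return +new_state  # unary plus drops keys whose count fell to zero


# Initial run over two records, then a delta: one added, one removed.
state = incremental_update(Counter(), added=["big data mining", "data mining"])
state = incremental_update(state, added=["big graphs"], removed=["data mining"])
print(dict(state))
```

The second call touches only the words in the changed records, which is the saving that key-value pair level incremental processing aims for.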

Keywords: big data, map reduce, incremental processing, iterative computation

Procedia PDF Downloads 344
24709 Analyzing Large Scale Recurrent Event Data with a Divide-And-Conquer Approach

Authors: Jerry Q. Cheng

Abstract:

Currently, in analyzing large-scale recurrent event data, there are many challenges such as memory limitations, unscalable computing time, etc. In this research, a divide-and-conquer method is proposed using parametric frailty models. Specifically, the data is randomly divided into many subsets, and the maximum likelihood estimator from each individual data set is obtained. Then a weighted method is proposed to combine these individual estimators as the final estimator. It is shown that this divide-and-conquer estimator is asymptotically equivalent to the estimator based on the full data. Simulation studies are conducted to demonstrate the performance of this proposed method. This approach is applied to a large real dataset of repeated heart failure hospitalizations.
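
The split, estimate, and combine scheme described in this abstract can be sketched on a much simpler model. The sketch below assumes an exponential rate model in place of the parametric frailty models used in the paper, and it assumes weights proportional to each subset's estimated Fisher information; the abstract does not specify the authors' weighting. All names and the simulated data are hypothetical.

```python
import random


def exp_rate_mle(sample):
    """Maximum likelihood estimate of an exponential rate: n / sum(x)."""
    return len(sample) / sum(sample)


def divide_and_conquer_rate(data, n_subsets, seed=0):
    """Randomly split the data, estimate per subset, combine by weights.

    Weights are the estimated Fisher information n / rate**2 of each
    subset (an assumed choice, not the paper's).
    """
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    subsets = [shuffled[i::n_subsets] for i in range(n_subsets)]
    estimates = [exp_rate_mle(s) for s in subsets]
    weights = [len(s) / est ** 2 for s, est in zip(subsets, estimates)]
    return sum(w * e for w, e in zip(weights, estimates)) / sum(weights)


# Simulated data with true rate 2.0 (illustrative only).
rng = random.Random(42)
data = [rng.expovariate(2.0) for _ in range(20000)]

full = exp_rate_mle(data)
combined = divide_and_conquer_rate(data, n_subsets=10)
print(round(full, 3), round(combined, 3))
```

Each subset can be processed independently and within memory limits, and the combined estimate stays close to the full-data estimate, which is the property the abstract establishes asymptotically for its own estimator.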

Keywords: big data analytics, divide-and-conquer, recurrent event data, statistical computing

Procedia PDF Downloads 159
24708 Characterization of Defense-Related Genes and Metabolite Profiling in Oil Palm Elaeis guineensis during Interaction with Ganoderma boninense

Authors: Mohammad Nazri Abdul Bahari, Nurshafika Mohd Sakeh, Siti Nor Akmar Abdullah

Abstract:

Basal stem rot (BSR) is the most devastating disease in oil palm. Among the oil palm pathogenic fungi, the most prevalent and virulent species associated with BSR is Ganoderma boninense. Early detection of G. boninense attack in oil palm, before physical symptoms have appeared, can offer opportunities to prevent the spread of the necrotrophic fungus. However, poor understanding of molecular defense responses and of the roles of antifungal metabolites in oil palm against G. boninense has complicated control measures. Hence, characterization of defense-related molecular responses and production of antifungal compounds during early interaction with G. boninense is of utmost importance. Four-month-old oil palm (Elaeis guineensis) seedlings were artificially infected with G. boninense-inoculated rubber wood blocks via the sitting technique. RNA was extracted from root and leaf tissues at 0, 3, 7 and 11 days post inoculation (d.p.i.), followed by sequencing using the RNA-Seq method. Differentially expressed genes (DEGs) of the oil palm-G. boninense interaction were identified, while changes in metabolite profile will be examined in relation to the DEGs. The RNA-Seq data generated a total of 113,829,376 and 313,293,229 paired-end clean reads from untreated (0 d.p.i.) and treated (3, 7, 11 d.p.i.) samples respectively, each with two biological replicates. The paired-end reads were mapped to the Elaeis guineensis reference genome to screen out non-oil palm genes and subsequently generated 74,794 coding sequences. DEG analysis of phytohormone biosynthetic genes in oil palm roots revealed that, at p-value ≤ 0.01, ethylene and jasmonic acid may act in an antagonistic manner with salicylic acid to coordinate the defense response at early interaction with G. boninense. Findings on metabolite profiling of G. boninense-infected oil palm roots and leaves are hoped to explain the defense-related compounds elicited by Elaeis guineensis in response to G. boninense colonization.
The study aims to shed light on the molecular defense response of oil palm at early interaction with G. boninense and to promote prevention measures against Ganoderma infection.

Keywords: Ganoderma boninense, metabolites, phytohormones, RNA-Seq

Procedia PDF Downloads 255
24707 Interconnections between Chronic Jet Lag and Neurological Disorders

Authors: Suliman Khan, Rabeea Siddique, Mengzhou Xue

Abstract:

Background: Patients with neurological disorders often display altered circadian rhythms. Circadian rhythms disrupted by chronic jetlag or shiftwork are thought to increase the risk and severity of human diseases, including cancer and psychiatric and related brain diseases. In this study, we investigated the impact of shiftwork or chronic jetlag (CJL)-like conditions on the mouse brain. Transcriptome profiling based on RNA sequencing revealed that genes associated with serious neurological disorders were differentially expressed in the nucleus accumbens (NAc) and prefrontal cortex (PFC). According to the qPCR analysis, several key regulatory genes associated with neurological disorders were significantly altered in the NAc, PFC, hypothalamus, hippocampus, and striatum. Serotonin levels and the expression levels of serotonin transporters and receptors were significantly altered in mice exposed to CJL. Overall, these results indicate that CJL may increase the risk of neurological disorders by disrupting the key regulatory genes, biological functions, serotonin, and corticosterone. These molecular linkages can be studied further to investigate the mechanism underlying CJL- or shiftwork-mediated neurological disorders in order to develop treatment strategies.

Keywords: chronic jetlag, molecular profiles, brain disorders, circadian rhythms

Procedia PDF Downloads 114
24706 Adoption of Big Data by Global Chemical Industries

Authors: Ashiff Khan, A. Seetharaman, Abhijit Dasgupta

Abstract:

The new era of big data (BD) is influencing chemical industries tremendously, providing several opportunities to reshape the way they operate and helping them shift towards intelligent manufacturing. Despite the availability of free software and the large amount of real-time data generated and stored in process plants, chemical industries are still in the early stages of big data adoption. The industry is just starting to realize the importance of the large amount of data it owns for making the right decisions and supporting its strategies. This article explores the importance of the professional competencies and data science that influence BD in chemical industries and can help them move towards intelligent manufacturing quickly and reliably. This article utilizes a literature review and identifies potential applications in the chemical industry for moving from conventional methods to a data-driven approach. The scope of this document is limited to the adoption of BD in chemical industries and the variables identified in this article. To achieve this objective, government, academia, and industry must work together to overcome all present and future challenges.

Keywords: chemical engineering, big data analytics, industrial revolution, professional competence, data science

Procedia PDF Downloads 80
24705 Secure Multiparty Computations for Privacy Preserving Classifiers

Authors: M. Sumana, K. S. Hareesha

Abstract:

Secure computations are essential when performing privacy-preserving data mining. Distributed privacy-preserving data mining involves two or more sites that cannot pool their data with a third party, because doing so would violate laws protecting the individual. Hence, in order to model the private data without compromising privacy or incurring information loss, secure multiparty computations are used. Secure computations of product, mean, variance, dot product, and sigmoid function using the additive and multiplicative homomorphic properties are discussed. The computations are performed on vertically partitioned data with a single site holding the class value.
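
As a rough illustration of the multiplicative homomorphic property mentioned above, the sketch below uses textbook (unpadded) RSA, where multiplying ciphertexts yields an encryption of the product of the plaintexts. This is an assumption for illustration only: the abstract does not name a cryptosystem, the key is a toy size that offers no security, and the names `encrypt`, `decrypt`, `secure_product`, and `site_values` are hypothetical.

```python
# Toy textbook RSA parameters (far too small to be secure; illustration only).
P, Q = 61, 53
N = P * Q                              # public modulus
E = 17                                 # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def encrypt(message):
    """Encrypt an integer 0 <= message < N with the public key."""
    return pow(message, E, N)


def decrypt(ciphertext):
    """Decrypt with the private key, held only by the party allowed to see the result."""
    return pow(ciphertext, D, N)


def secure_product(ciphertexts):
    """Multiply ciphertexts modulo N; no plaintext is seen during this step."""
    product = 1
    for c in ciphertexts:
        product = (product * c) % N
    return product


site_values = [7, 6]                   # hypothetical private inputs, one per site
encrypted = [encrypt(v) for v in site_values]
recovered = decrypt(secure_product(encrypted))  # equals 7 * 6 while below N
```

The result is only correct while the true product stays below `N`. An additively homomorphic scheme would play the analogous role for sums, means, and dot products.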

Keywords: homomorphic property, secure product, secure mean and variance, secure dot product, vertically partitioned data

Procedia PDF Downloads 407
24704 Dietary Gluten and the Balance of Gut Microbiota in the Dextran Sulphate Sodium Induced Colitis Model

Authors: Austin Belfiori, Kevin Rinek, Zach Barcroft, Jennifer Berglind

Abstract:

Diet influences the composition of the gut microbiota and the host's health. Disruption of the balance among the microbiota, epithelial cells, and resident immune cells in the intestine is involved in the pathogenesis of inflammatory bowel disease (IBD). To study the role of gut microbiota in intestinal inflammation, the microbiome of control mice (C57BL6) given a gluten-containing standard diet versus C57BL6 mice given the gluten-free (GF) feed (n=10 in each group) was examined. All mice received 3% DSS for 5 days. Throughout the study, feces were collected and processed for DNA extraction and MiSeq Illumina sequencing of the V4 region of the bacterial 16S rRNA gene. Alpha and beta diversities and compositional differences at phylum and genus levels were determined in intestinal microbiota. The mice receiving the GF diet showed a significantly increased abundance of Firmicutes and a decrease of Bacteroides and Lactobacillus at phylum level. Therefore, the gluten-free diet led to reductions in beneficial gut bacteria populations. These findings indicate a role of wheat gluten in dysbiosis of the intestinal microbiota.

Keywords: gluten, colitis, microbiota, DSS, dextran sulphate sodium

Procedia PDF Downloads 206
24703 Study of Pre-Handwriting Factors Necessary for Successful Handwriting in Children

Authors: Lalitchandra J. Shah, Katarzyna Bialek, Melinda L. Clarke, Jessica L. Jansson

Abstract:

Handwriting is essential to academic success; however, the current literature is limited in the identification of pre-handwriting skills. The purpose of this study was to identify the pre-handwriting skills that occupational therapy practitioners deem important to handwriting success, as well as those that aid in intervention planning. The online survey instrument consisted of 33 questions that assessed various skills related to the development of handwriting and captured demographic information. Both occupational therapists and occupational therapy assistants were included in the survey study. The survey found that the respondents agreed that purposeful scribbling, the ability of a child to copy (vertical/horizontal lines, circles, squares, and triangles), imitating an oblique cross, cognitive skills (attention, praxis, self-regulation, sequencing), grasp patterns, hand dominance, in-hand manipulation skills (shift, translation, rotation), bilateral integration, stabilization of paper, crossing midline, and visual perception were important indicators of handwriting readiness. The results of the survey support existing research regarding the skills necessary for the successful development of handwriting in children.

Keywords: development, handwriting, occupational therapy, visual perceptual skills

Procedia PDF Downloads 346
24702 MicroRNA in Bovine Corpus Luteum during Early Pregnancy

Authors: Rreze Gecaj, Corina Schanzenbach, Benedikt Kirchner, Michael Pfaffl, Bajram Berisha

Abstract:

The maintenance of the corpus luteum (CL) during early pregnancy in cattle is a critical and multifarious process. A luteotrophic mechanism originating from the embryo is widely accepted as the triggering signal for CL maintenance. In cattle, it is interferon-tau (IFNT) secretion from the conceptus that prevents CL regression and ensures progesterone production for the establishment of pregnancy. In addition to endocrine and paracrine signals, microRNA (miRNA) can also support CL sustainability during early pregnancy. MiRNAs are small non-coding nucleic acids that regulate gene expression post-transcriptionally and have been shown to be involved in the modulation of CL function. However, the role of miRNAs in corpus luteum function in early pregnancy remains largely unexplored. This study aims at profiling the expression of miRNA in the CL during early pregnancy in cattle by comparing it with the CL from the late cycle and with the regressed CL. Corpora lutea were assigned to two groups during the cycle (C13 group, late CL: days 13-18; C18 group, regressed CL: day >18) and one group during early pregnancy (group P: months 1-2). The stage of the estrous cycle was determined by macroscopic examination, and fetal age was estimated by crown-rump length measurement. A total of 9 corpora lutea from individual animals were included in the study, three corpora lutea for each group. The miRNA population was profiled using small RNA next-generation sequencing, and biologically significant miRNAs were evaluated for differential expression using the DESeq2 methodology. We show that 6 differentially expressed miRNAs (bta-mir-2890, -2332, -2441-3p, -148b, -1248 and -29c) are common to both comparisons, P vs C13 and P vs C18. For each stage individually, we also identified unique miRNAs differentially expressed only in the given comparison.
bta-miR-23a and -769 were unique miRNAs differentially expressed in P vs C13, whereas forty-four unique miRNAs were identified as differentially expressed in P vs C18. These data confirm that miRNAs are highly abundant in luteal tissue during early pregnancy and potentially regulate CL maintenance at this stage of fetal development.

Keywords: bovine, corpus luteum, microRNA, pregnancy, RNA-Seq

Procedia PDF Downloads 254
24701 Cross Project Software Fault Prediction at Design Phase

Authors: Pradeep Singh, Shrish Verma

Abstract:

Software fault prediction models are created using the source code, processed metrics from the same or a previous version of the code, and related fault data. Some companies do not store and keep track of all the artifacts required for software fault prediction. To construct a fault prediction model for such companies, training data from other projects can be one potential solution. The earlier a fault is predicted, the less it costs to correct. The training data consist of metrics data and related fault data at function/module level. This paper investigates fault prediction at an early stage using cross-project data, focusing on design metrics. In this study, empirical analysis is carried out to validate design metrics for cross-project fault prediction. The machine learning technique used for evaluation is Naïve Bayes. The design phase metrics of other projects can be used as an initial guideline for projects where no previous fault data is available. We analyze seven data sets from the NASA Metrics Data Program, which offer design as well as code metrics. Overall, the results of cross-project prediction are comparable to those of within-company learning.
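
The cross-project setup can be sketched as training a Naïve Bayes classifier on one project's design metrics and applying it to modules of another project. Everything specific here is an assumption for illustration: the Gaussian variant, the two unnamed design metrics, and the tiny synthetic numbers, which are not taken from the NASA Metrics Data Program data sets. The function names `fit_gaussian_nb` and `predict` are hypothetical.

```python
import math


def fit_gaussian_nb(rows, labels):
    """Estimate a class prior and per-metric mean/variance for each class."""
    model = {}
    for label in set(labels):
        group = [r for r, y in zip(rows, labels) if y == label]
        columns = list(zip(*group))
        means = [sum(col) / len(col) for col in columns]
        variances = [
            sum((v - m) ** 2 for v in col) / len(col) + 1e-9  # avoid zero variance
            for col, m in zip(columns, means)
        ]
        model[label] = (len(group) / len(rows), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one module."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


# Synthetic source project: two design metrics per module, label 1 = faulty.
source_metrics = [(2, 3), (3, 2), (2, 2), (3, 3), (9, 8), (8, 9), (10, 9), (9, 10)]
source_faults = [0, 0, 0, 0, 1, 1, 1, 1]

# Synthetic target project with no fault history of its own.
target_metrics = [(2.5, 2.5), (9, 9)]

model = fit_gaussian_nb(source_metrics, source_faults)
predictions = [predict(model, row) for row in target_metrics]
```

The point of the sketch is only the data flow: the model never sees fault labels from the target project, which is the situation the abstract describes for companies without stored fault data.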

Keywords: software metrics, fault prediction, cross project, within project

Procedia PDF Downloads 336