Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 24455

Search results for: microarray data

24425 Application of KL Divergence for Estimation of Each Metabolic Pathway Genes

Authors: Shohei Maruyama, Yasuo Matsuyama, Sachiyo Aburatani

Abstract:

The development of the method to annotate unknown gene functions is an important task in bioinformatics. One of the approaches for the annotation is The identification of the metabolic pathway that genes are involved in. Gene expression data have been utilized for the identification, since gene expression data reflect various intracellular phenomena. However, it has been difficult to estimate the gene function with high accuracy. It is considered that the low accuracy of the estimation is caused by the difficulty of accurately measuring a gene expression. Even though they are measured under the same condition, the gene expressions will vary usually. In this study, we proposed a feature extraction method focusing on the variability of gene expressions to estimate the genes' metabolic pathway accurately. First, we estimated the distribution of each gene expression from replicate data. Next, we calculated the similarity between all gene pairs by KL divergence, which is a method for calculating the similarity between distributions. Finally, we utilized the similarity vectors as feature vectors and trained the multiclass SVM for identifying the genes' metabolic pathway. To evaluate our developed method, we applied the method to budding yeast and trained the multiclass SVM for identifying the seven metabolic pathways. As a result, the accuracy that calculated by our developed method was higher than the one that calculated from the raw gene expression data. Thus, our developed method combined with KL divergence is useful for identifying the genes' metabolic pathway.

Keywords: metabolic pathways, gene expression data, microarray, Kullback–Leibler divergence, KL divergence, support vector machines, SVM, machine learning

Procedia PDF Downloads 385

24424 Genome-Wide Expression Profiling of Cicer arietinum Heavy Metal Toxicity

Authors: B. S. Yadav, A. Mani, S. Srivastava

Abstract:

Chickpea (Cicer arietinum L.) is an annual, self-pollinating, diploid (2n = 2x = 16) pulse crop that ranks second in world legume production after common bean (Phaseolus vulgaris). ICC 4958 flowers approximately 39 days after sowing under peninsular Indian conditions and the crop matures in less than 90 days in rained environments. The estimated collective yield losses due to abiotic stresses (6.4 million t) have been significantly higher than for biotic stresses (4.8 million t). Most legumes are known to be salt sensitive, and therefore, it is becoming increasingly important to produce cultivars tolerant to high-salinity in addition to other abiotic and biotic stresses for sustainable chickpea production. Our aim was to identify the genes that are involved in the defence mechanism against heavy metal toxicity in chickpea and establish the biological network of heavy metal toxicity in chickpea. ICC4958 variety of chick pea was taken and grown in normal condition and 150µM concentration of different heavy metal salt like CdCl₂, K₂Cr2O₇, NaAsO₂. At 15th day leave samples were collected and stored in RNA Later solution microarray was performed for checking out differential gene expression pattern. Our studies revealed that 111 common genes that involved in defense mechanism were up regulated and 41 genes were commonly down regulated during treatment of 150µM concentration of CdCl₂, K₂Cr₂O₇, and NaAsO₂. Biological network study shows that the genes which are differentially expressed are highly connected and having high betweenness and centrality.

Keywords: abiotic stress, biological network, chickpea, microarray

Procedia PDF Downloads 178

24423 Design of Experiment for Optimizing Immunoassay Microarray Printing

Authors: Alex J. Summers, Jasmine P. Devadhasan, Douglas Montgomery, Brittany Fischer, Jian Gu, Frederic Zenhausern

Abstract:

Immunoassays have been utilized for several applications, including the detection of pathogens. Our laboratory is in the development of a tier 1 biothreat panel utilizing Vertical Flow Assay (VFA) technology for simultaneous detection of pathogens and toxins. One method of manufacturing VFA membranes is with non-contact piezoelectric dispensing, which provides advantages, such as low-volume and rapid dispensing without compromising the structural integrity of antibody or substrate. Challenges of this processinclude premature discontinuation of dispensing and misaligned spotting. Preliminary data revealed the Yp 11C7 mAb (11C7)reagent to exhibit a large angle of failure during printing which may have contributed to variable printing outputs. A Design of Experiment (DOE) was executed using this reagent to investigate the effects of hydrostatic pressure and reagent concentration on microarray printing outputs. A Nano-plotter 2.1 (GeSIM, Germany) was used for printing antibody reagents ontonitrocellulose membrane sheets in a clean room environment. A spotting plan was executed using Spot-Front-End software to dispense volumes of 11C7 reagent (20-50 droplets; 1.5-5 mg/mL) in a 6-test spot array at 50 target membrane locations. Hydrostatic pressure was controlled by raising the Pressure Compensation Vessel (PCV) above or lowering it below our current working level. It was hypothesized that raising or lowering the PCV 6 inches would be sufficient to cause either liquid accumulation at the tip or discontinue droplet formation. After aspirating 11C7 reagent, we tested this hypothesis under stroboscope.75% of the effective raised PCV height and of our hypothesized lowered PCV height were used. Humidity (55%) was maintained using an Airwin BO-CT1 humidifier. The number and quality of membranes was assessed after staining printed membranes with dye. The droplet angle of failure was recorded before and after printing to determine a “stroboscope score” for each run. The DOE set was analyzed using JMP software. Hydrostatic pressure and reagent concentration had a significant effect on the number of membranes output. As hydrostatic pressure was increased by raising the PCV 3.75 inches or decreased by lowering the PCV -4.5 inches, membrane output decreased. However, with the hydrostatic pressure closest to equilibrium, our current working level, membrane output, reached the 50-membrane target. As the reagent concentration increased from 1.5 to 5 mg/mL, the membrane output also increased. Reagent concentration likely effected the number of membrane output due to the associated dispensing volume needed to saturate the membranes. However, only hydrostatic pressure had a significant effect on stroboscope score, which could be due to discontinuation of dispensing, and thus the stroboscope check could not find a droplet to record. Our JMP predictive model had a high degree of agreement with our observed results. The JMP model predicted that dispensing the highest concentration of 11C7 at our current PCV working level would yield the highest number of quality membranes, which correlated with our results. Acknowledgements: This work was supported by the Chemical Biological Technologies Directorate (Contract # HDTRA1-16-C-0026) and the Advanced Technology International (Contract # MCDC-18-04-09-002) from the Department of Defense Chemical and Biological Defense program through the Defense Threat Reduction Agency (DTRA).

Keywords: immunoassay, microarray, design of experiment, piezoelectric dispensing

Procedia PDF Downloads 162

24422 Network Based Molecular Profiling of Intracranial Ependymoma over Spinal Ependymoma

Authors: Hyeon Su Kim, Sungjin Park, Hae Ryung Chang, Hae Rim Jung, Young Zoo Ahn, Yon Hui Kim, Seungyoon Nam

Abstract:

Ependymoma, one of the most common parenchymal spinal cord tumor, represents 3-6% of all CNS tumor. Especially intracranial ependymomas, which are more frequent in childhood, have a more poor prognosis and more malignant than spinal ependymomas. Although there are growing needs to understand pathogenesis, detailed molecular understanding of pathogenesis remains to be explored. A cancer cell is composed of complex signaling pathway networks, and identifying interaction between genes and/or proteins are crucial for understanding these pathways. Therefore, we explored each ependymoma in terms of differential expressed genes and signaling networks. We used Microsoft Excel™ to manipulate microarray data gathered from NCBI’s GEO Database. To analyze and visualize signaling network, we used web-based PATHOME algorithm and Cytoscape. We show HOX family and NEFL are down-regulated but SCL family is up-regulated in cerebrum and posterior fossa cancers over a spinal cancer, and JAK/STAT signaling pathway and Chemokine signaling pathway are significantly different in the both intracranial ependymoma comparing to spinal ependymoma. We are considering there may be an age-dependent mechanism under different histological pathogenesis. We annotated mutation data of each gene subsequently in order to find potential target genes.

Keywords: systems biology, ependymoma, deg, network analysis

Procedia PDF Downloads 279

24421 Gene Expression Profile Reveals Breast Cancer Proliferation and Metastasis

Authors: Nandhana Vivek, Bhaskar Gogoi, Ayyavu Mahesh

Abstract:

Breast cancer metastasis plays a key role in cancer progression and fatality. The present study examines the potential causes of metastasis in breast cancer by investigating the novel interactions between genes and their pathways. The gene expression profile of GSE99394, GSE1246464, and GSE103865 was downloaded from the GEO data repository to analyze the differentially expressed genes (DEGs). Protein-protein interactions, target factor interactions, pathways and gene relationships, and functional enrichment networks were investigated. The proliferation pathway was shown to be highly expressed in breast cancer progression and metastasis in all three datasets. Gene Ontology analysis revealed 11 DEGs as gene targets to control breast cancer metastasis: LYN, DLGAP5, CXCR4, CDC6, NANOG, IFI30, TXP2, AGTR1, MKI67, and FTH1. Upon studying the function, genomic and proteomic data, and pathway involvement of the target genes, DLGAP5 proved to be a promising candidate due to it being highly differentially expressed in all datasets. The study takes a unique perspective on the avenues through which DLGAP5 promotes metastasis. The current investigation helps pave the way in understanding the role DLGAP5 plays in metastasis, which leads to an increased incidence of death among breast cancer patients.

Keywords: genomics, metastasis, microarray, cancer

Procedia PDF Downloads 80

24420 Intra-miR-ExploreR, a Novel Bioinformatics Platform for Integrated Discovery of MiRNA:mRNA Gene Regulatory Networks

Authors: Surajit Bhattacharya, Daniel Veltri, Atit A. Patel, Daniel N. Cox

Abstract:

miRNAs have emerged as key post-transcriptional regulators of gene expression, however identification of biologically-relevant target genes for this epigenetic regulatory mechanism remains a significant challenge. To address this knowledge gap, we have developed a novel tool in R, Intra-miR-ExploreR, that facilitates integrated discovery of miRNA targets by incorporating target databases and novel target prediction algorithms, using statistical methods including Pearson and Distance Correlation on microarray data, to arrive at high confidence intragenic miRNA target predictions. We have explored the efficacy of this tool using Drosophila melanogaster as a model organism for bioinformatics analyses and functional validation. A number of putative targets were obtained which were also validated using qRT-PCR analysis. Additional features of the tool include downloadable text files containing GO analysis from DAVID and Pubmed links of literature related to gene sets. Moreover, we are constructing interaction maps of intragenic miRNAs, using both micro array and RNA-seq data, focusing on neural tissues to uncover regulatory codes via which these molecules regulate gene expression to direct cellular development.

Keywords: miRNA, miRNA:mRNA target prediction, statistical methods, miRNA:mRNA interaction network

Procedia PDF Downloads 484

24419 The Interplay between Autophagy and Macrophages' Polarization in Wound Healing: A Genetic Regulatory Network Analysis

Authors: Mayada Mazher, Ahmed Moustafa, Ahmed Abdellatif

Abstract:

Background: Autophagy is a eukaryotic, highly conserved catabolic process implicated in many pathophysiologies such as wound healing. Autophagy-associated genes serve as a scaffolding platform for signal transduction of macrophage polarization during the inflammatory phase of wound healing and tissue repair process. In the current study, we report a model for the interplay between autophagy-associated genes and macrophages polarization associated genes. Methods: In silico analysis was performed on 249 autophagy-related genes retrieved from the public autophagy database and gene expression data retrieved from Gene Expression Omnibus (GEO); GSE81922 and GSE69607 microarray data macrophages polarization 199 DEGS. An integrated protein-protein interaction network was constructed for autophagy and macrophage gene sets. The gene sets were then used for GO terms pathway enrichment analysis. Common transcription factors for autophagy and macrophages' polarization were identified. Finally, microRNAs enriched in both autophagy and macrophages were predicated. Results: In silico prediction of common transcription factors in DEGs macrophages and autophagy gene sets revealed a new role for the transcription factors, HOMEZ, GABPA, ELK1 and REL, that commonly regulate macrophages associated genes: IL6,IL1M, IL1B, NOS1, SOC3 and autophagy-related genes: Atg12, Rictor, Rb1cc1, Gaparab1, Atg16l1. Conclusions: Autophagy and macrophages' polarization are interdependent cellular processes, and both autophagy-related proteins and macrophages' polarization related proteins coordinate in tissue remodelling via transcription factors and microRNAs regulatory network. The current work highlights a potential new role for transcription factors HOMEZ, GABPA, ELK1 and REL in wound healing.

Keywords: autophagy related proteins, integrated network analysis, macrophages polarization M1 and M2, tissue remodelling

Procedia PDF Downloads 130

24418 Detection, Analysis and Determination of the Origin of Copy Number Variants (CNVs) in Intellectual Disability/Developmental Delay (ID/DD) Patients and Autistic Spectrum Disorders (ASD) Patients by Molecular and Cytogenetic Methods

Authors: Pavlina Capkova, Josef Srovnal, Vera Becvarova, Marie Trkova, Zuzana Capkova, Andrea Stefekova, Vaclava Curtisova, Alena Santava, Sarka Vejvalkova, Katerina Adamova, Radek Vodicka

Abstract:

ASDs are heterogeneous and complex developmental diseases with a significant genetic background. Recurrent CNVs are known to be a frequent cause of ASD. These CNVs can have, however, a variable expressivity which results in a spectrum of phenotypes from asymptomatic to ID/DD/ASD. ASD is associated with ID in ~75% individuals. Various platforms are used to detect pathogenic mutations in the genome of these patients. The performed study is focused on a determination of the frequency of pathogenic mutations in a group of ASD patients and a group of ID/DD patients using various strategies along with a comparison of their detection rate. The possible role of the origin of these mutations in aetiology of ASD was assessed. The study included 35 individuals with ASD and 68 individuals with ID/DD (64 males and 39 females in total), who underwent rigorous genetic, neurological and psychological examinations. Screening for pathogenic mutations involved karyotyping, screening for FMR1 mutations and for metabolic disorders, a targeted MLPA test with probe mixes Telomeres 3 and 5, Microdeletion 1 and 2, Autism 1, MRX and a chromosomal microarray analysis (CMA) (Illumina or Affymetrix). Chromosomal aberrations were revealed in 7 (1 in the ASD group) individuals by karyotyping. FMR1 mutations were discovered in 3 (1 in the ASD group) individuals. The detection rate of pathogenic mutations in ASD patients with a normal karyotype was 15.15% by MLPA and CMA. The frequencies of the pathogenic mutations were 25.0% by MLPA and 35.0% by CMA in ID/DD patients with a normal karyotype. CNVs inherited from asymptomatic parents were more abundant than de novo changes in ASD patients (11.43% vs. 5.71%) in contrast to the ID/DD group where de novo mutations prevailed over inherited ones (26.47% vs. 16.18%). ASD patients shared more frequently their mutations with their fathers than patients from ID/DD group (8.57% vs. 1.47%). Maternally inherited mutations predominated in the ID/DD group in comparison with the ASD group (14.7% vs. 2.86 %). CNVs of an unknown significance were found in 10 patients by CMA and in 3 patients by MLPA. Although the detection rate is the highest when using CMA, recurrent CNVs can be easily detected by MLPA. CMA proved to be more efficient in the ID/DD group where a larger spectrum of rare pathogenic CNVs was revealed. This study determined that maternally inherited highly penetrant mutations and de novo mutations more often resulted in ID/DD without ASD in patients. The paternally inherited mutations could be, however, a source of the greater variability in the genome of the ASD patients and contribute to the polygenic character of the inheritance of ASD. As the number of the subjects in the group is limited, a larger cohort is needed to confirm this conclusion. Inherited CNVs have a role in aetiology of ASD possibly in combination with additional genetic factors - the mutations elsewhere in the genome. The identification of these interactions constitutes a challenge for the future. Supported by MH CZ – DRO (FNOl, 00098892), IGA UP LF_2016_010, TACR TE02000058 and NPU LO1304.

Keywords: autistic spectrum disorders, copy number variant, chromosomal microarray, intellectual disability, karyotyping, MLPA, multiplex ligation-dependent probe amplification

Procedia PDF Downloads 333

24417 Processing Big Data: An Approach Using Feature Selection

Authors: Nikat Parveen, M. Ananthi

Abstract:

Big data is one of the emerging technology, which collects the data from various sensors and those data will be used in many fields. Data retrieval is one of the major issue where there is a need to extract the exact data as per the need. In this paper, large amount of data set is processed by using the feature selection. Feature selection helps to choose the data which are actually needed to process and execute the task. The key value is the one which helps to point out exact data available in the storage space. Here the available data is streamed and R-Center is proposed to achieve this task.

Keywords: big data, key value, feature selection, retrieval, performance

Procedia PDF Downloads 319

24416 The Taxonomic and Functional Diversity in Edaphic Microbial Communities from Antarctic Dry Valleys

Authors: Sean T. S. Wei, Joy D. Van Nostrand, Annapoorna Maitrayee Ganeshram, Stephen B. Pointing

Abstract:

McMurdo Dry Valleys are a largely ice-free polar desert protected by international treaty as an Antarctic special managed area. The terrestrial landscape is dominated by oligotrophic mineral soil with extensive rocky outcrops. Several environmental stresses: low temperature, lack of liquid water, UV exposure and oligotrophic substrates, restrict the major biotic component to microorganisms. The bacterial diversity and the putative physiological capacity of microbial communities of quartz rocks (hypoliths) and soil of a maritime-influenced Dry Valleys were interrogated by two metagenomic approaches: 454 pyro-sequencing and Geochp DNA microarray. The most abundant phylum in hypoliths was Cyanobacteria (46%), whereas in solils Actinobacteria (31%) were most abundant. The Proteobacteria and Bacteriodetes were the only other phyla to comprise >10% of both communities. Carbon fixation was indicated by photoautotrophic and chemoautotrophic pathways for both hypolith and soil communities. The fungi accounted for polymer carbon transformations, particularly for aromatic compounds. The complete nitrogen cycling was observed in both communities. The fungi in particular displayed pathways related to ammonification. Environmental stress response pathways were common among bacteria, whereas the nutrient stress response pathways were more widely present in bacteria, archaea and fungi. The diversity of bacterialphage was also surveyed by Geochip. Data suggested that different substrates supported different viral families: Leviviridae, Myoviridae, Podoviridae and Siphoviridiae were ubiquitous. However, Corticoviridae and Microviridae only occurred in wetter soils.

Keywords: Antarctica, hypolith, soil, dry valleys, geochip, functional diversity, stress response

Procedia PDF Downloads 431

24415 Applications of Big Data in Education

Authors: Faisal Kalota

Abstract:

Big Data and analytics have gained a huge momentum in recent years. Big Data feeds into the field of Learning Analytics (LA) that may allow academic institutions to better understand the learners’ needs and proactively address them. Hence, it is important to have an understanding of Big Data and its applications. The purpose of this descriptive paper is to provide an overview of Big Data, the technologies used in Big Data, and some of the applications of Big Data in education. Additionally, it discusses some of the concerns related to Big Data and current research trends. While Big Data can provide big benefits, it is important that institutions understand their own needs, infrastructure, resources, and limitation before jumping on the Big Data bandwagon.

Keywords: big data, learning analytics, analytics, big data in education, Hadoop

Procedia PDF Downloads 392

24414 Computational Approaches to Study Lineage Plasticity in Human Pancreatic Ductal Adenocarcinoma

Authors: Almudena Espin Perez, Tyler Risom, Carl Pelz, Isabel English, Robert M. Angelo, Rosalie Sears, Andrew J. Gentles

Abstract:

Pancreatic ductal adenocarcinoma (PDAC) is one of the most deadly malignancies. The role of the tumor microenvironment (TME) is gaining significant attention in cancer research. Despite ongoing efforts, the nature of the interactions between tumors, immune cells, and stromal cells remains poorly understood. The cell-intrinsic properties that govern cell lineage plasticity in PDAC and extrinsic influences of immune populations require technically challenging approaches due to the inherently heterogeneous nature of PDAC. Understanding the cell lineage plasticity of PDAC will improve the development of novel strategies that could be translated to the clinic. Members of the team have demonstrated that the acquisition of ductal to neuroendocrine lineage plasticity in PDAC confers therapeutic resistance and is a biomarker of poor outcomes in patients. Our approach combines computational methods for deconvolving bulk transcriptomic cancer data using CIBERSORTx and high-throughput single-cell imaging using Multiplexed Ion Beam Imaging (MIBI) to study lineage plasticity in PDAC and its relationship to the infiltrating immune system. The CIBERSORTx algorithm uses signature matrices from immune cells and stroma from sorted and single-cell data in order to 1) infer the fractions of different immune cell types and stromal cells in bulked gene expression data and 2) impute a representative transcriptome profile for each cell type. We studied a unique set of 300 genomically well-characterized primary PDAC samples with rich clinical annotation. We deconvolved the PDAC transcriptome profiles using CIBERSORTx, leveraging publicly available single-cell RNA-seq data from normal pancreatic tissue and PDAC to estimate cell type proportions in PDAC, and digitally reconstruct cell-specific transcriptional profiles from our study dataset. We built signature matrices and optimized by simulations and comparison to ground truth data. We identified cell-type-specific transcriptional programs that contribute to cancer cell lineage plasticity, especially in the ductal compartment. We also studied cell differentiation hierarchies using CytoTRACE and predict cell lineage trajectories for acinar and ductal cells that we believe are pinpointing relevant information on PDAC progression. Collaborators (Angelo lab, Stanford University) has led the development of the Multiplexed Ion Beam Imaging (MIBI) platform for spatial proteomics. We will use in the very near future MIBI from tissue microarray of 40 PDAC samples to understand the spatial relationship between cancer cell lineage plasticity and stromal cells focused on infiltrating immune cells, using the relevant markers of PDAC plasticity identified from the RNA-seq analysis.

Keywords: deconvolution, imaging, microenvironment, PDAC

Procedia PDF Downloads 105

24413 Analysis of Big Data

Authors: Sandeep Sharma, Sarabjit Singh

Abstract:

As per the user demand and growth trends of large free data the storage solutions are now becoming more challenge-able to protect, store and to retrieve data. The days are not so far when the storage companies and organizations are start saying 'no' to store our valuable data or they will start charging a huge amount for its storage and protection. On the other hand as per the environmental conditions it becomes challenge-able to maintain and establish new data warehouses and data centers to protect global warming threats. A challenge of small data is over now, the challenges are big that how to manage the exponential growth of data. In this paper we have analyzed the growth trend of big data and its future implications. We have also focused on the impact of the unstructured data on various concerns and we have also suggested some possible remedies to streamline big data.

Keywords: big data, unstructured data, volume, variety, velocity

Procedia PDF Downloads 527

24412 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, WangQun Lin

Abstract:

This paper introduces the concept and principle of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of typical cleaning process, puts forward a well scalability and versatility data cleaning framework, in view of data with attribute dependency relation, designs several of violation data discovery algorithms by formal formula, which can obtain inconsistent data to all target columns with condition attribute dependent no matter data is structured (SQL) or unstructured (NoSQL), and gives 6 data cleaning methods based on these algorithms.

Keywords: data cleaning, dependency rules, violation data discovery, data repair

Procedia PDF Downloads 545

24411 Alternating Electric fields-Induced Senescence in Glioblastoma

Authors: Eun Ho Kim

Abstract:

Innovations have conjured up a mode of treating GBM cancer cells in the newly diagnosed patients in a period of 4.9 months at an improved median OS, which brings along only a few minor side effects in the phase III of the clinical trial. This mode has been termed the Alternating Electric Fields (AEF). The study at hand is aimed at determining whether the AEF treatment is beneficial in sensitizing the GBM cancer cells through the process of increasing the AEF –induced senescence. The methodology to obtain the findings for this research ranged across various components, such as obtaining and testing SA-β-gal staining, flow cytometry, Western blotting, morphology, and Positron Emission Tomography (PET) / Computed Tomography (CT), immunohistochemical staining and microarray. The number of cells that displayed a senescence-specific morphology and positive SA-ß-Gal activity gradually increased up to 5 days. These results suggest that p16, p21 and p27 are essential regulators of AEF -induced senescence via NF-κB activation. The results showed that the AEF treatment is functional in enhancing the AEF –induced senescence in the GBM cells via an apoptosis- independent mechanism. This research concludes that this mode of treatment is a trustworthy protocol that can be effectively employed to overcome the limitations of the conventional mode of treatment on GBM.

Keywords: alternating electric fields, senescence, glioblastoma, cell death

Procedia PDF Downloads 69

24410 Apoptosis Pathway Targeted by Thymoquinone in MCF7 Breast Cancer Cell Line

Authors: M. Marjaneh, M. Y. Narazah, H. Shahrul

Abstract:

Array-based gene expression analysis is a powerful tool to profile expression of genes and to generate information on therapeutic effects of new anti-cancer compounds. Anti-apoptotic effect of thymoquinone was studied in MCF7 breast cancer cell line using gene expression profiling with cDNA micro array. The purity and yield of RNA samples were determined using RNeasyPlus Mini kit. The Agilent RNA 6000 Nano LabChip kit evaluated the quantity of the RNA samples. AffinityScript RT oligo-dT promoter primer was used to generate cDNA strands. T7 RNA polymerase was used to convert cDNA to cRNA. The cRNA samples and human universal reference RNA were labelled with Cy-3-CTP and Cy-5-CTP, respectively. Feature Extraction and GeneSpring software analysed the data. The single experiment analysis revealed involvement of 64 pathways with up-regulated genes and 78 pathways with down-regulated genes. The MAPK and p38-MAPK pathways were inhibited due to the up-regulation of PTPRR gene. The inhibition of p38-MAPK suggested up-regulation of TGF-ß pathway. Inhibition of p38 - MAPK caused up-regulation of TP53 and down-regulation of Bcl2 genes indicating involvement of intrinsic apoptotic pathway. Down-regulation of CARD16 gene as an adaptor molecule regulated CASP1 and suggested necrosis-like programmed cell death and involvement of caspase in apoptosis. Furthermore, down-regulation of GPCR, EGF-EGFR signalling pathways suggested reduction of ER. Involvement of AhR pathway which control cytochrome P450 and glucuronidation pathways showed metabolism of Thymoquinone. The findings showed differential expression of several genes in apoptosis pathways with thymoquinone treatment in estrogen receptor-positive breast cancer cells.

Keywords: cDNA microarray, thymoquinone, CARD16, PTPRR, CASP10

Procedia PDF Downloads 331

24409 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Many types of research are concerned with mining massive amounts of data and big data streams. Mining big data faces a lot of challenges including scalability, speed, heterogeneity, accuracy, provenance and privacy. In telecommunication industry, mining big data is like a mining for gold; it represents a big opportunity and maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication and the benefits and opportunities gained from them.

Keywords: mining big data, big data, machine learning, telecommunication

Procedia PDF Downloads 382

24408 JavaScript Object Notation Data against eXtensible Markup Language Data in Software Applications a Software Testing Approach

Authors: Theertha Chandroth

Abstract:

This paper presents a comparative study on how to check JSON (JavaScript Object Notation) data against XML (eXtensible Markup Language) data from a software testing point of view. JSON and XML are widely used data interchange formats, each with its unique syntax and structure. The objective is to explore various techniques and methodologies for validating comparison and integration between JSON data to XML and vice versa. By understanding the process of checking JSON data against XML data, testers, developers and data practitioners can ensure accurate data representation, seamless data interchange, and effective data validation.

Keywords: XML, JSON, data comparison, integration testing, Python, SQL

Procedia PDF Downloads 119

24407 Multi-Source Data Fusion for Urban Comprehensive Management

Authors: Bolin Hua

Abstract:

In city governance, various data are involved, including city component data, demographic data, housing data and all kinds of business data. These data reflects different aspects of people, events and activities. Data generated from various systems are different in form and data source are different because they may come from different sectors. In order to reflect one or several facets of an event or rule, data from multiple sources need fusion together. Data from different sources using different ways of collection raised several issues which need to be resolved. Problem of data fusion include data update and synchronization, data exchange and sharing, file parsing and entry, duplicate data and its comparison, resource catalogue construction. Governments adopt statistical analysis, time series analysis, extrapolation, monitoring analysis, value mining, scenario prediction in order to achieve pattern discovery, law verification, root cause analysis and public opinion monitoring. The result of Multi-source data fusion is to form a uniform central database, which includes people data, location data, object data, and institution data, business data and space data. We need to use meta data to be referred to and read when application needs to access, manipulate and display the data. A uniform meta data management ensures effectiveness and consistency of data in the process of data exchange, data modeling, data cleansing, data loading, data storing, data analysis, data search and data delivery.

Keywords: multi-source data fusion, urban comprehensive management, information fusion, government data

Procedia PDF Downloads 366

24406 Reviewing Privacy Preserving Distributed Data Mining

Authors: Sajjad Baghernezhad, Saeideh Baghernezhad

Abstract:

Nowadays considering human involved in increasing data development some methods such as data mining to extract science are unavoidable. One of the discussions of data mining is inherent distribution of the data usually the bases creating or receiving such data belong to corporate or non-corporate persons and do not give their information freely to others. Yet there is no guarantee to enable someone to mine special data without entering in the owner’s privacy. Sending data and then gathering them by each vertical or horizontal software depends on the type of their preserving type and also executed to improve data privacy. In this study it was attempted to compare comprehensively preserving data methods; also general methods such as random data, coding and strong and weak points of each one are examined.

Keywords: data mining, distributed data mining, privacy protection, privacy preserving

Procedia PDF Downloads 502

24405 The Right to Data Portability and Its Influence on the Development of Digital Services

Authors: Roman Bieda

Abstract:

The General Data Protection Regulation (GDPR) will come into force on 25 May 2018 which will create a new legal framework for the protection of personal data in the European Union. Article 20 of GDPR introduces a right to data portability. This right allows for data subjects to receive the personal data which they have provided to a data controller, in a structured, commonly used and machine-readable format, and to transmit this data to another data controller. The right to data portability, by facilitating transferring personal data between IT environments (e.g.: applications), will also facilitate changing the provider of services (e.g. changing a bank or a cloud computing service provider). Therefore, it will contribute to the development of competition and the digital market. The aim of this paper is to discuss the right to data portability and its influence on the development of new digital services.

Keywords: data portability, digital market, GDPR, personal data

Procedia PDF Downloads 454

24404 Recent Advances in Data Warehouse

Authors: Fahad Hanash Alzahrani

Abstract:

This paper describes some recent advances in a quickly developing area of data storing and processing based on Data Warehouses and Data Mining techniques, which are associated with software, hardware, data mining algorithms and visualisation techniques having common features for any specific problems and tasks of their implementation.

Keywords: data warehouse, data mining, knowledge discovery in databases, on-line analytical processing

Procedia PDF Downloads 382

24403 How to Use Big Data in Logistics Issues

Authors: Mehmet Akif Aslan, Mehmet Simsek, Eyup Sensoy

Abstract:

Big Data stands for today’s cutting-edge technology. As the technology becomes widespread, so does Data. Utilizing massive data sets enable companies to get competitive advantages over their adversaries. Out of many area of Big Data usage, logistics has significance role in both commercial sector and military. This paper lays out what big data is and how it is used in both military and commercial logistics.

Keywords: big data, logistics, operational efficiency, risk management

Procedia PDF Downloads 625

24402 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, more various information can be extracted. In addition, development and dissemination of boards such as Arduino and Raspberry Pie have made it possible to easily test various sensors, and it is possible to collect sensor data directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, there are many difficulties in using the board to collect data, and there are many difficulties in using it when the user is not a computer programmer, or when using it for the first time. Even if data are collected, lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: clustering, data mining, DBSCAN, k-means, k-medoids, sensor data

Procedia PDF Downloads 355

24401 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, veracity, and come from a variety of sources. Public Administrations are using (big) data, implementing base registries, and enforcing data sharing within the entire government to deliver (big) data related integrated services, provision of insights to users, and for good governance. Government (Big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services like data storage, hosting services to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationship in the government (big) data ecosystem. We also discuss our research findings. We did not find too much published research articles about the government (big) data ecosystem, including its definition and classification of actors and their roles. Therefore, we lent ideas for the government (big) data ecosystem from numerous areas that include scientific research data, humanitarian data, open government data, industry data, in the literature.

Keywords: big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review

Procedia PDF Downloads 142

24400 Government Big Data Ecosystem: A Systematic Literature Review

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Data that is high in volume, velocity, veracity and comes from a variety of sources is usually generated in all sectors including the government sector. Globally public administrations are pursuing (big) data as new technology and trying to adopt a data-centric architecture for hosting and sharing data. Properly executed, big data and data analytics in the government (big) data ecosystem can be led to data-driven government and have a direct impact on the way policymakers work and citizens interact with governments. In this research paper, we conduct a systematic literature review. The main aims of this paper are to highlight essential aspects of the government (big) data ecosystem and to explore the most critical socio-technical factors that contribute to the successful implementation of government (big) data ecosystem. The essential aspects of government (big) data ecosystem include definition, data types, data lifecycle models, and actors and their roles. We also discuss the potential impact of (big) data in public administration and gaps in the government data ecosystems literature. As this is a new topic, we did not find specific articles on government (big) data ecosystem and therefore focused our research on various relevant areas like humanitarian data, open government data, scientific research data, industry data, etc.

Keywords: applications of big data, big data, big data types. big data ecosystem, critical success factors, data-driven government, egovernment, gaps in data ecosystems, government (big) data, literature review, public administration, systematic review

Procedia PDF Downloads 207

24399 A Machine Learning Decision Support Framework for Industrial Engineering Purposes

Authors: Anli Du Preez, James Bekker

Abstract:

Data is currently one of the most critical and influential emerging technologies. However, the true potential of data is yet to be exploited since, currently, about 1% of generated data are ever actually analyzed for value creation. There is a data gap where data is not explored due to the lack of data analytics infrastructure and the required data analytics skills. This study developed a decision support framework for data analytics by following Jabareen’s framework development methodology. The study focused on machine learning algorithms, which is a subset of data analytics. The developed framework is designed to assist data analysts with little experience, in choosing the appropriate machine learning algorithm given the purpose of their application.

Keywords: Data analytics, Industrial engineering, Machine learning, Value creation

Procedia PDF Downloads 151

24398 Providing Security to Private Cloud Using Advanced Encryption Standard Algorithm

Authors: Annapureddy Srikant Reddy, Atthanti Mahendra, Samala Chinni Krishna, N. Neelima

Abstract:

In our present world, we are generating a lot of data and we, need a specific device to store all these data. Generally, we store data in pen drives, hard drives, etc. Sometimes we may loss the data due to the corruption of devices. To overcome all these issues, we implemented a cloud space for storing the data, and it provides more security to the data. We can access the data with just using the internet from anywhere in the world. We implemented all these with the java using Net beans IDE. Once user uploads the data, he does not have any rights to change the data. Users uploaded files are stored in the cloud with the file name as system time and the directory will be created with some random words. Cloud accepts the data only if the size of the file is less than 2MB.

Keywords: cloud space, AES, FTP, NetBeans IDE

Procedia PDF Downloads 190

24397 Business Intelligence for Profiling of Telecommunication Customer

Authors: Rokhmatul Insani, Hira Laksmiwati Soemitro

Abstract:

Business Intelligence is a methodology that exploits the data to produce information and knowledge systematically, business intelligence can support the decision-making process. Some methods in business intelligence are data warehouse and data mining. A data warehouse can store historical data from transactional data. For data modelling in data warehouse, we apply dimensional modelling by Kimball. While data mining is used to extracting patterns from the data and get insight from the data. Data mining has many techniques, one of which is segmentation. For profiling of telecommunication customer, we use customer segmentation according to customer’s usage of services, customer invoice and customer payment. Customers can be grouped according to their characteristics and can be identified the profitable customers. We apply K-Means Clustering Algorithm for segmentation. The input variable for that algorithm we use RFM (Recency, Frequency and Monetary) model. All process in data mining, we use tools IBM SPSS modeller.

Keywords: business intelligence, customer segmentation, data warehouse, data mining

Procedia PDF Downloads 457

24396 PDDA: Priority-Based, Dynamic Data Aggregation Approach for Sensor-Based Big Data Framework

Authors: Lutful Karim, Mohammed S. Al-kahtani

Abstract:

Sensors are being used in various applications such as agriculture, health monitoring, air and water pollution monitoring, traffic monitoring and control and hence, play the vital role in the growth of big data. However, sensors collect redundant data. Thus, aggregating and filtering sensors data are significantly important to design an efficient big data framework. Current researches do not focus on aggregating and filtering data at multiple layers of sensor-based big data framework. Thus, this paper introduces (i) three layers data aggregation and framework for big data and (ii) a priority-based, dynamic data aggregation scheme (PDDA) for the lowest layer at sensors. Simulation results show that the PDDA outperforms existing tree and cluster-based data aggregation scheme in terms of overall network energy consumptions and end-to-end data transmission delay.

Keywords: big data, clustering, tree topology, data aggregation, sensor networks

Procedia PDF Downloads 321