Search results for: genomic data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 24869

24419 Parallel Vector Processing Using Multi Level Orbital DATA

Authors: Nagi Mekhiel

Abstract:

Many applications use vector operations, applying a single instruction to multiple data that map to different locations in conventional memory. Transferring data from memory is limited by access latency and bandwidth, which affects the performance gain of vector processing. We present a memory system that makes all of its content available to processors in time, so that processors need not access the memory: we force each location to be available to all processors at a specific time. The data move in different orbits to become available to other processors in higher orbits at different times. We use this memory to apply parallel vector operations to data streams at the first orbit level. Data processed in the first level move to the upper orbit one data element at a time, allowing a processor in that orbit to apply another vector operation to deal with the serial code limitations inherent in all parallel applications and to interleave it with lower-level vector operations.

Keywords: Memory Organization, Parallel Processors, Serial Code, Vector Processing

Procedia PDF Downloads 259
24418 Reconstructability Analysis for Landslide Prediction

Authors: David Percy

Abstract:

Landslides are a geologic phenomenon that affects a large number of inhabited places and are constantly being monitored and studied for the prediction of future occurrences. Reconstructability analysis (RA) is a methodology for extracting informative models from large volumes of data that works exclusively with discrete data. While RA has been used extensively in medical applications and social science, we are introducing it to the spatial sciences through applications like landslide prediction. Since RA works exclusively with discrete data, such as soil classification or bedrock type, working with continuous data, such as porosity, requires that these data be binned for inclusion in the model. RA constructs models of the data which pick out the most informative elements, independent variables (IVs), from each layer that predict the dependent variable (DV), landslide occurrence. Each layer included in the model retains its classification data as a primary encoding of the data. Unlike other machine learning algorithms that force the data into one-hot encoding schemes, RA works directly with the data as it is encoded, with the exception of continuous data, which must be binned. The usual physical and derived layers are included in the model, and testing our results against other published methodologies, such as neural networks, yields accuracy that is similar but with the advantage of a completely transparent model. The result of an RA session with a data set is a report on every combination of variables and the associated probability of landslide events occurring. In this way, every combination of informative states can be examined.
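The binning step mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `bin_values`, the porosity values, and the cut points are all hypothetical, and the abstract does not say how bin edges were chosen.

```python
from bisect import bisect_right


def bin_values(values, edges):
    """Map each continuous value to a discrete bin index.

    `edges` holds ascending interior cut points, so there are
    len(edges) + 1 bins. A value equal to a cut point goes to the
    upper bin.
    """
    return [bisect_right(edges, v) for v in values]


# Hypothetical porosity readings and cut points (assumptions for illustration).
porosity = [0.05, 0.12, 0.27, 0.33, 0.41]
edges = [0.10, 0.30]

classes = bin_values(porosity, edges)
print(classes)  # [0, 1, 1, 2, 2]
```

Once binned, a continuous layer such as porosity can be treated like any other categorical layer (soil class, bedrock type) in a discrete model.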

Keywords: reconstructability analysis, machine learning, landslides, raster analysis

Procedia PDF Downloads 55
24417 Data Analytics in Hospitality Industry

Authors: Tammy Wee, Detlev Remy, Arif Perdana

Abstract:

In recent years, data analytics has become a buzzword in the hospitality industry. The hospitality industry is another example of a data-rich industry that has yet to fully benefit from the insights of data analytics. Effective use of data analytics can change how hotels operate, market and position themselves competitively in the hospitality industry. However, at the moment, the data obtained by individual hotels remain under-utilized. This research is a preliminary study of data analytics in the hospitality industry, using an in-depth face-to-face interview at one hotel as the start of a multi-level research project. The main case study of this research, hotel A, belongs to an international hotel chain brand that has been systematically gathering data on its own customers for the past five years. The data collection points run from the moment a guest books a room until the guest leaves the hotel premises, and include room reservation, spa booking, and catering. Although hotel A has been gathering data intelligence on its customers for some time, it has yet to utilize the data to its fullest potential, and it is aware of its limitations as well as the potential of data analytics. Currently, the utilization of data analytics in hotel A is limited to the area of customer service improvement, namely to enhance the personalization of service for each individual customer. Hotel A is able to utilize the data to improve and enhance its service, which in turn encourages repeat customers. According to hotel A, 50% of its guests returned to the hotel, and 70% extended their stay because of the personalized service. Apart from using data analytics to enhance customer service, hotel A also uses the data in marketing. Hotel A uses data analytics to predict or forecast changes in consumer behavior and demand by tracking its guests' booking preferences, payment preferences and demand shifts between properties.
However, hotel A admitted that the data it has been collecting was not fully utilized due to two challenges. The first challenge of using data analytics in hotel A is that the data is not clean. At the moment, the data collected for one guest profile is meaningful for only one department in the hotel but meaningless for another. Cleaning up the data and setting standards that allow use by different departments are among the main concerns of hotel A. The second challenge of using data analytics in hotel A is the non-integrated internal systems. At the moment, the internal systems used by hotel A do not integrate well with each other, limiting the ability to collect data systematically. Hotel A is considering another system to replace the current one for more comprehensive data collection. Hotel proprietors recognize the potential of data analytics, as reported in this research; however, implementing a system to collect data comes with a cost. This research has identified the current utilization of data analytics and the challenges faced in implementing it.

Keywords: data analytics, hospitality industry, customer relationship management, hotel marketing

Procedia PDF Downloads 171
24416 Application of Bacteriophages as Natural Antibiotics in Aquaculture

Authors: Chamilani Nikapitiya, Mahanama De Zoysa, Jehee Lee

Abstract:

Most bacterial diseases are associated with high mortalities in aquaculture species and cause huge economic losses. Different approaches have been taken to prevent or control bacterial diseases, including the use of vaccines, probiotics, chemotherapy, water quality management, etc. Antibiotics are widely applied as chemotherapy to control bacterial diseases; however, it has been shown that frequent use of antibiotics favors the development of multi-drug resistant bacteria. Therefore, phages and phage-encoded lytic proteins are known to be among the most promising alternatives to antibiotics for avoiding the emergence of antibiotic-resistant bacteria. We isolated and characterized two lytic phages, namely pAh-1 and pAs-1, against pathogenic Aeromonas hydrophila and Aeromonas salmonicida, respectively. Morphological characteristics were analyzed by transmission electron microscopy (TEM), and host strain specificities were tested with Aeromonas and other closely related bacterial strains. TEM analysis revealed that both pAh-1 and pAs-1 are composed of an icosahedral head and a segmented tail, and we suggest that they are new members of the Myoviridae family. Genome sizes of the isolated phages were estimated by restriction enzyme digestion of genomic DNA using selected endonucleases followed by agarose gel electrophoresis. The estimated genome sizes of pAh-1 and pAs-1 were approximately 64 Kbp and 120 Kbp, respectively. Both pAh-1 and pAs-1 showed narrow host specificity. Moreover, the protective effects of phage therapy against the fish pathogen A. hydrophila were investigated in a zebrafish model. The survival rate was 40% higher when zebrafish received an intra-peritoneal (i.p.) injection of pAh-1 and were simultaneously challenged with A. hydrophila (2 × 10⁶ CFU/fish) compared to that without phage treatment. Overall, the results suggest that both pAh-1 and pAs-1 can be used as a potential phage therapy to control Aeromonas infections in aquaculture.

Keywords: Aeromonas infections, antibiotic resistance, bacteriophage, bio-control, lytic phage

Procedia PDF Downloads 188
24415 Genomic and Transcriptomic Analysis of Antibiotic Resistance Genes in Biological Wastewater Treatment Systems Treating Domestic and Hospital Effluents

Authors: Thobela Conco, Sheena Kumari, Chika Nnadozie, Mahmoud Nasr, Thor A. Stenström, Mushal Ali, Arshad Ismail, Faizal Bux

Abstract:

The discharge of antibiotics and their residues into wastewater treatment plants (WWTPs) creates a conducive environment for the development of antibiotic resistant pathogens. This presents a risk of potential dissemination of antibiotic resistant pathogens and antibiotic resistance genes into the environment. It is, therefore, necessary to study the level of antibiotic resistance genes (ARGs) among bacterial pathogens that proliferate in biological wastewater treatment systems. In the current study, metagenomic and meta-transcriptomic sequences of samples collected from the influents, secondary effluents and post-chlorinated effluents of three wastewater treatment plants treating domestic and hospital effluents in Durban, South Africa, were analyzed for profiling of ARGs among bacterial pathogens. Results show that a variety of ARGs, mostly aminoglycoside, β-lactamase, tetracycline and sulfonamide resistance genes, were harbored by diverse bacterial genera found at different stages of treatment. A significant variation in the diversity of pathogens and ARGs between the treatment plants was observed; however, treated final effluent samples from all three plants showed a significant reduction in bacterial pathogens and detected ARGs. Both pre- and post-chlorinated samples showed the presence of mobile genetic elements (MGEs), indicating the inefficiency of chlorination in removing ARGs integrated with MGEs. In conclusion, the study showed that the wastewater treatment plants efficiently reduced and removed certain ARGs, even though their initial focus was the removal of biological nutrients.

Keywords: antibiotic resistance, mobile genetic elements, wastewater, wastewater treatment plants

Procedia PDF Downloads 213
24414 Realization of a (GIS) for Drilling (DWS) through the Adrar Region

Authors: Djelloul Benatiallah, Ali Benatiallah, Abdelkader Harouz

Abstract:

Geographic Information Systems (GIS) include various methods and computer techniques to model, digitally capture, store, manage, view and analyze geographic data. GIS appeal to many scientific and technical fields and draw on many methods. In this article we present a complete and operational geographic information system that follows the theoretical principles of data management and is adapted to spatial data, especially data concerning the monitoring of drinking water supply (DWS) wells in the Adrar region. The system is expected, on the one hand, to offer standard features for consulting, updating and editing beneficiary and geographical data and, on the other hand, to provide specific functionality for data entered by contractors, parameterized calculations and statistics.

Keywords: GIS, DWS, drilling, Adrar

Procedia PDF Downloads 302
24413 Characterization and PCR Detection of Selected Strains of Psychrotrophic Bacteria Isolated From Raw Milk

Authors: Kidane workelul, Li xu, Xiaoyang Pang, Jiaping Lv

Abstract:

Dairy products are exceptionally ideal media for the growth of microorganisms because of their high nutritional content. There are several ways that milk might get contaminated throughout the milking process, including how the raw milk is transported and stored, as well as how long it is kept before being processed. Psychrotrophic bacteria are among the organisms that can deteriorate the quality of milk, mainly through their heat-resistant protease and lipase enzymes. For this research, 8 selected strains of psychrotrophic bacteria (Enterococcus hirae, Pseudomonas fluorescens, Pseudomonas azotoformans, Pseudomonas putida, Exiguobacterium indicum, Pseudomonas paralactis, Acinetobacter indicus, Serratia liquefaciens) were chosen and characterized according to the research methodology protocol. The 8 selected strains were cultured, plated and incubated, and their genomic DNA was extracted and amplified. The purpose of the study was to identify their psychrotrophic properties, test for lipase hydrolysis, determine their optimal incubation temperature, design a primer from the conserved region of the target lipA gene of the P. fluorescens strain, optimize primer specificity and sensitivity, and detect lipase-positive strains by PCR using the designed primers. Based on the findings, all 8 selected strains isolated from stored raw milk are psychrotrophic bacteria, 6 of the 8 strains are positive for lipase hydrolysis, their optimal temperature is 20 to 30 °C, and the designed primer is highly specific: it amplifies only the lipase-positive strains and does not amplify the others. Thus, the result is promising and could help in detecting psychrotrophic bacteria producing heat-resistant enzymes (lipase) at an early stage before the milk is processed, which will save production losses for the dairy industry.

Keywords: dairy industry, heat-resistant, lipA, milk, primer and psychrotrophic

Procedia PDF Downloads 50
24412 Generic Data Warehousing for Consumer Electronics Retail Industry

Authors: S. Habte, K. Ouazzane, P. Patel, S. Patel

Abstract:

The dynamic and highly competitive nature of the consumer electronics retail industry means that businesses in this industry are experiencing different decision-making challenges in relation to pricing, inventory control, consumer satisfaction and product offerings. To overcome the challenges facing retailers and create opportunities, we propose a generic data warehousing solution which can be applied to a wide range of consumer electronics retailers with minimal configuration. The solution includes a dimensional data model, a template SQL script, high-level architectural descriptions, an ETL tool developed using C#, a set of APIs, and data access tools. It has been successfully applied by ASK Outlets Ltd UK, resulting in improved productivity and enhanced sales growth.

Keywords: consumer electronics, data warehousing, dimensional data model, generic, retail industry

Procedia PDF Downloads 404
24411 Forensic Applications of Quantum Dots

Authors: Samaneh Nabavi, Hadi Shirzad, Somayeh Khanjani, Shirin Jalili

Abstract:

Quantum dots (QDs) are semiconductor nanocrystals that exhibit intrinsic optical and electrical properties that are size dependent due to the quantum confinement effect. Quantum confinement is brought about by the fact that in bulk semiconductor material the electronic structure consists of continuous bands, and that as the size of the semiconductor material decreases its radius becomes less than the Bohr exciton radius (the distance between the electron and electron-hole) and discrete energy levels result. As a result, QDs have a broad absorption range and a narrow emission which correlates to the band gap energy (E), and hence QD size. QDs can thus be tuned to give the desired wavelength of fluorescence emission. Due to their unique properties, QDs have attracted considerable attention in different scientific areas. They have also been considered for forensic applications in recent years. The ability of QDs to fluoresce up to 20 times brighter than available fluorescent dyes makes them an attractive nanomaterial for enhancing the visualization of latent or poorly developed fingermarks. Furthermore, the potential applications of QDs in the detection of nitroaromatic explosives, such as TNT, based on directive fluorescence quenching of QDs, the electron transfer quenching process or fluorescence resonance energy transfer have received attention. DNA analysis is tightly associated with forensic applications in molecular diagnostics. The amount of DNA acquired at a crime scene is inherently limited. This limited amount of human DNA has to be quantified accurately after the process of DNA extraction. Accordingly, highly sensitive detection of human genomic DNA is an essential issue for forensic study. QDs also have a variety of advantages as an emission probe in forensic DNA quantification.
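The link between band gap energy and emission wavelength can be illustrated with the photon energy relation E = hc/λ. The sketch below assumes, as a simplification, that the emitted photon carries exactly the band gap energy; the function name and the example band gap values are hypothetical and are not taken from the abstract.

```python
PLANCK_EV_S = 4.135667696e-15   # Planck constant in eV*s
LIGHT_SPEED_M_S = 299_792_458.0  # speed of light in m/s


def emission_wavelength_nm(band_gap_ev):
    """Wavelength (nm) of a photon whose energy equals the band gap (eV)."""
    if band_gap_ev <= 0:
        raise ValueError("band gap must be positive")
    return PLANCK_EV_S * LIGHT_SPEED_M_S / band_gap_ev * 1e9


# Hypothetical band gaps: a larger gap (smaller dot) gives a shorter wavelength.
for gap in (1.8, 2.0, 2.5):
    print(f"{gap:.1f} eV -> {emission_wavelength_nm(gap):.1f} nm")
```

A larger band gap, which corresponds to a smaller dot under quantum confinement, shifts the emission toward the blue end of the spectrum.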

Keywords: forensic science, quantum dots, DNA typing, explosive sensor, fingermark analysis

Procedia PDF Downloads 849
24410 Estimation of Level of Pesticide in Recurrent Pregnancy Loss and Its Correlation with Paraoxanase1 Gene in North Indian Population

Authors: Apurva Singh, S. P. Jaiswar, Apala Priyadarshini, Akancha Pandey

Abstract:

Objective: The aim of this study is to find the association of PON1 gene polymorphism with pesticides in RPL subjects. Background: Recurrent pregnancy loss (RPL) is defined as three or more sequential abortions before the 20th week of gestation. Pesticides and their derivatives (organochlorine and organophosphate) are proposed as a possible chemical cause of RPL in the sub-humid region of India. The paraoxonase-1 enzyme (PON1) plays an important role in the toxicity of some organophosphate pesticides, with low PON1 activity being associated with higher pesticide sensitivity. Methodology: This is a case-control study done in the Department of Obstetrics & Gynaecology and the Department of Biochemistry, K.G.M.U, Lucknow, India. The subjects were enrolled after fulfilling the inclusion and exclusion criteria. Inclusion criteria: cases were subjects having two or more spontaneous abortions; controls were healthy females having one or more living children. Exclusion criteria (cases and controls): subjects having any of the following were excluded from the study: diabetes mellitus, hypertension, tuberculosis, immunocompromised status, any endocrine disorder, and genital, colon or breast cancer or any other malignancy. Blood samples were collected in EDTA tubes from cases and healthy control women, and genomic DNA was extracted by the phenol-chloroform method. Pesticide residues in blood were estimated by HPLC. Biochemical estimation was also performed. Genotyping of the PON1 gene polymorphism was performed by RFLP. Statistical analysis of the data was performed using the SPSS16.3 software. Results: A total of 14 pesticides (12 organochlorine and 2 organophosphate) were selected on the basis of their persistent nature and consumption rate.
The significance of pesticide levels (ppb) was estimated by the Mann-Whitney test, and significantly higher levels of β-HCH (p: 0.04), γ-HCH (p: 0.001), δ-HCH (p: 0.002), chlorpyrifos (p: 0.001), pp-DDD (p: 0.001) and fenvalerate (p: 0.001) were found in the case group compared to the control group. The levels of antioxidant enzymes were found to be significantly decreased among the cases. The wild-type homozygous genotype TT was more frequent among control groups. However, the heterozygous genotype (Tt) was more frequent in cases than in control groups (CI-0.3-1.3) (p=0.06). Conclusion: Higher levels of pesticides with endocrine disrupting potential in cases indicate the possible role of these compounds as one of the causes of recurrent pregnancy loss. Possibly, the increased pesticide level indicates increased levels of oxidative damage, which has been associated with recurrent miscarriage; it may reflect indirect evidence of toxicity rather than the direct cause. Since both factors are reported to increase risk, individuals with higher levels of these 'toxic compounds', especially those with 'high-risk genotypes', might be more susceptible to recurrent pregnancy loss.

Keywords: paraoxonase, pesticides, PON1, RPL

Procedia PDF Downloads 137
24409 Sequential Data Assimilation with High-Frequency (HF) Radar Surface Current

Authors: Lei Ren, Michael Hartnett, Stephen Nash

Abstract:

Abundant surface current measurements from an HF radar system in a coastal area are assimilated into a model to improve its forecasting ability. A simple sequential data assimilation scheme, Direct Insertion (DI), is applied to update model forecast states. The influence of Direct Insertion data assimilation over time is analyzed at one reference point. Vector maps of surface current from models are compared with HF radar measurements. The Root-Mean-Squared-Error (RMSE) between modeling results and HF radar measurements is calculated during the last four days with no data assimilation.
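The two quantities named in the abstract can be sketched in a few lines. This is an illustrative simplification on a one-dimensional list of grid points, not the authors' model: the function names, the use of `None` for grid points without a radar measurement, and the sample values are assumptions.

```python
import math


def direct_insertion(model_state, observations):
    """Overwrite model values with observations wherever one is available.

    `None` in `observations` marks a grid point with no measurement.
    """
    return [m if o is None else o for m, o in zip(model_state, observations)]


def rmse(model, measured):
    """Root-mean-squared error over grid points that have a measurement."""
    pairs = [(m, o) for m, o in zip(model, measured) if o is not None]
    if not pairs:
        raise ValueError("no overlapping measurements")
    return math.sqrt(sum((m - o) ** 2 for m, o in pairs) / len(pairs))


# Hypothetical surface current speeds (m/s) at three grid points.
model = [0.20, 0.35, 0.50]
radar = [0.25, None, 0.40]

print(direct_insertion(model, radar))  # [0.25, 0.35, 0.4]
print(round(rmse(model, radar), 4))    # 0.0791
```

Direct Insertion trusts the measurement completely at observed points, which is why it is considered the simplest sequential scheme; the RMSE then measures how far a forecast drifts from the radar once insertion stops.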

Keywords: data assimilation, CODAR, HF radar, surface current, direct insertion

Procedia PDF Downloads 565
24408 Measured versus Default Interstate Traffic Data in New Mexico, USA

Authors: M. A. Hasan, M. R. Islam, R. A. Tarefder

Abstract:

This study investigates how site-specific traffic data differ from the Mechanistic Empirical Pavement Design Software default values. Two Weigh-in-Motion (WIM) stations were installed on Interstate-40 (I-40) and Interstate-25 (I-25) to develop site-specific data. A computer program named WIM Data Analysis Software (WIMDAS) was developed using Microsoft C-Sharp (.Net) for quality checking and processing of raw WIM data. A complete year of data, from November 2013 to October 2014, was analyzed using the developed WIM Data Analysis Program. After that, the vehicle class distribution, directional distribution, lane distribution, monthly adjustment factor, hourly distribution, axle load spectra, average number of axles per vehicle, axle spacing, lateral wander distribution, and wheelbase distribution were calculated. Then a comparative study was done between measured data and AASHTOWare default values. It was found that the measured general traffic inputs for I-40 and I-25 differ significantly from the default values.

Keywords: AASHTOWare, traffic, weigh-in-motion, axle load distribution

Procedia PDF Downloads 335
24407 Quantitative Trait Loci Analysis in Multiple Sorghum Mapping Populations Facilitates the Dissection of Genetic Control of Drought Tolerance Related Traits in Sorghum [Sorghum bicolor (Moench)]

Authors: Techale B., Hongxu Dong, Mihrete Getinet, Aregash Gabizew, Andrew H. Paterson, Kassahun Bantte

Abstract:

The genetic architecture of drought tolerance is expected to involve multiple loci that are unlikely to all segregate for alternative alleles in a single bi-parental population. Therefore, the identification of quantitative trait loci (QTL) that are expressed in diverse genetic backgrounds of multiple bi-parental populations provides evidence about both background-specific and common genetic variants. The purpose of this study was to map QTL related to drought tolerance using three connected mapping populations of different genetic backgrounds to gain insight into the genomic landscape of this important trait in elite Ethiopian germplasm. The three bi-parental populations, each with 207 F₂:₃ lines, were evaluated using an alpha lattice design with two replications under two moisture stress environments. Drought tolerance related traits were analyzed separately for each population using composite interval mapping, finding a total of 105 QTLs. All the QTLs identified from individual populations were projected on a combined consensus map, comprising a total of 25 meta QTLs for seven traits. The consensus map allowed us to deduce locations of a larger number of markers than possible in any individual map, providing a reference for genetic studies in different genetic backgrounds. The mQTL identified in this study could be used for marker-assisted breeding programs in sorghum after validation. Only one trait, reduced leaf senescence, showed a striking bias of allele distribution, indicating substantial standing variation among present varieties that might be employed in improving drought tolerance of Ethiopian and other sorghums.

Keywords: Drought tolerance, Mapping populations, Meta QTL, QTL mapping, Sorghum

Procedia PDF Downloads 172
24406 Design of Knowledge Management System with Geographic Information System

Authors: Angga Hidayah Ramadhan, Luciana Andrawina, M. Azani Hasibuan

Abstract:

Data can serve as the core of a decision if it is processed well, that is, if data is turned into information and information into knowledge that supports a wise decision. Today, many organizations have not realized this, including the XYZ University Admission Directorate, the executor of the national admission called Seleksi Masuk Bersama (SMB), whose workers have so far relied only on intuition to make decisions. If the data were processed in this way, the organization could analyze it to make the right decisions and sell as many PINs as possible to student candidates or registrants taking part in SMB. Therefore, a Knowledge Management System (KMS) with a Geographic Information System (GIS) using 5C4C is needed to make the organization's data more useful and to support decision making. This information system processes data into information based on the PIN sales data with 5C (Contextualized, Categorize, Calculation, Correction, Condensed) and converts information into knowledge with 4C (Comparing, Consequence, Connection, Conversation). After these steps, the data can make it easier to take decisions, resolve problems, communicate, and train inexperienced employees more quickly. The GIS functionality provides visualization based on spatial data and can be used to indicate events in each province with the indicators provided in the system. The system also has a function to store tacit knowledge and then convert it into explicit knowledge in an expert system, based on the problems found from the consequences of the information. With the system, each team can make decisions in the same structured way and, most importantly, based on actual events and data.

Keywords: 5C4C, data, information, knowledge

Procedia PDF Downloads 459
24405 Cas9-Assisted Direct Cloning and Refactoring of a Silent Biosynthetic Gene Cluster

Authors: Peng Hou

Abstract:

Natural products produced by marine bacteria serve as an immense reservoir for anti-infective drugs and therapeutic agents. Nowadays, heterologous expression of gene clusters of interest has been widely adopted as an effective strategy for natural product discovery. Briefly, the heterologous expression workflow is: biosynthetic gene cluster identification, pathway construction and expression, and product detection. However, gene cluster capture using the traditional transformation-associated recombination (TAR) protocol is inefficient (0.5% positive colony rate). To make things worse, most of these putative new natural products are only predicted by bioinformatics analysis such as antiSMASH, and their corresponding biosynthetic pathways are either not expressed or expressed at very low levels under laboratory conditions. Those setbacks have inspired us to seek new technologies to efficiently edit and refactor biosynthetic gene clusters. Recently, two cutting-edge techniques have attracted our attention: CRISPR-Cas9 and Gibson Assembly. So far, we have pretreated Brevibacillus laterosporus strain genomic DNA with CRISPR-Cas9 nucleases that specifically generated breaks near the gene cluster of interest. This trial resulted in an increase in the efficiency of gene cluster capture (9%). Moreover, by using Gibson Assembly to add or delete certain operons and tailoring enzymes regardless of end compatibility, the silent construct (~80 kb) has been successfully refactored into an active one, yielding the expected series of analogs. With the appearance of these novel molecular tools, we are confident that the development of a mature high-throughput pipeline for DNA assembly, transformation, product isolation and identification will no longer be a daydream for marine natural product discovery.

Keywords: biosynthesis, CRISPR-Cas9, DNA assembly, refactor, TAR cloning

Procedia PDF Downloads 273
24404 A Policy Strategy for Building Energy Data Management in India

Authors: Shravani Itkelwar, Deepak Tewari, Bhaskar Natarajan

Abstract:

The energy consumption data plays a vital role in energy efficiency policy design, implementation, and impact assessment. Any demand-side energy management intervention's success relies on the availability of accurate, comprehensive, granular, and up-to-date data on energy consumption. The Building sector, including residential and commercial, is one of the largest consumers of energy in India after the Industrial sector. With economic growth and increasing urbanization, the building sector is projected to grow at an unprecedented rate, resulting in a 5.6 times escalation in energy consumption till 2047 compared to 2017. Therefore, energy efficiency interventions will play a vital role in decoupling the floor area growth and associated energy demand, thereby increasing the need for robust data. In India, multiple institutions are involved in the collection and dissemination of data. This paper focuses on energy consumption data management in the building sector in India for both residential and commercial segments. It evaluates the robustness of data available through administrative and survey routes to estimate the key performance indicators and identify critical data gaps for making informed decisions. The paper explores several issues in the data, such as lack of comprehensiveness, non-availability of disaggregated data, the discrepancy in different data sources, inconsistent building categorization, and others. The identified data gaps are justified with appropriate examples. Moreover, the paper prioritizes required data in order of relevance to policymaking and groups it into "available," "easy to get," and "hard to get" categories. The paper concludes with recommendations to address the data gaps by leveraging digital initiatives, strengthening institutional capacity, institutionalizing exclusive building energy surveys, and standardization of building categorization, among others, to strengthen the management of building sector energy consumption data.

Keywords: energy data, energy policy, energy efficiency, buildings

Procedia PDF Downloads 180
24403 A Survey on Data-Centric and Data-Aware Techniques for Large Scale Infrastructures

Authors: Silvina Caíno-Lores, Jesús Carretero

Abstract:

Large scale computing infrastructures have been widely developed with the core objective of providing a suitable platform for high-performance and high-throughput computing. These systems are designed to support resource-intensive and complex applications, which can be found in many scientific and industrial areas. Currently, large scale data-intensive applications are hindered by the high latencies that result from access to vastly distributed data. Recent works have suggested that improving data locality is key to moving towards exascale infrastructures efficiently, as solutions to this problem aim to reduce the bandwidth consumed in data transfers and the overheads that arise from them. There are several techniques that attempt to move computations closer to the data. In this survey we analyse the different mechanisms that have been proposed to provide data locality for large scale high-performance and high-throughput systems. This survey intends to assist the scientific computing community in understanding the various technical aspects and strategies that have been reported in recent literature regarding data locality. As a result, we present an overview of locality-oriented techniques, which are grouped into four main categories: application development, task scheduling, in-memory computing and storage platforms. Finally, the authors include a discussion on future research lines and synergies among the former techniques.

Keywords: data locality, data-centric computing, large scale infrastructures, cloud computing

Procedia PDF Downloads 253
24402 Wind Speed Data Analysis in Colombia in 2013 and 2015

Authors: Harold P. Villota, Alejandro Osorio B.

Abstract:

Energy meteorology is a field that studies energy complementarity and the use of renewable sources in interconnected systems. To diversify the energy matrix in Colombia with wind sources, it is necessary to understand the available wind databases. However, the time series provided by 260 automatic weather stations contain empty and not-applicable values, so the purpose of this work is to fill the time series by selecting two years to characterize and impute, and to use them as a basis for completing the data between 2005 and 2020.

Keywords: complementarity, wind speed, renewable, Colombia, characterization, imputation

Procedia PDF Downloads 160
24401 Industrial Process Mining Based on Data Pattern Modeling and Nonlinear Analysis

Authors: Hyun-Woo Cho

Abstract:

Unexpected events may occur with serious impacts on industrial processes. This work uses a data representation technique to model and analyze process data patterns for the purpose of diagnosis. The use of a triangular representation of process data is evaluated on a simulated process. Furthermore, the effects of different pre-treatment techniques based on linear or nonlinear reduced spaces are compared. The fault pattern is extracted in the reduced space rather than in the original data space. The results show that the diagnosis method based on the nonlinear technique produces more reliable results and outperforms the linear method.

Keywords: process monitoring, data analysis, pattern modeling, fault, nonlinear techniques

Procedia PDF Downloads 381
24400 In silico Subtractive Genomics Approach for Identification of Strain-Specific Putative Drug Targets among Hypothetical Proteins of Drug-Resistant Klebsiella pneumoniae Strain 825795-1

Authors: Umairah Natasya Binti Mohd Omeershffudin, Suresh Kumar

Abstract:

Klebsiella pneumoniae is a Gram-negative enteric bacterium that causes nosocomial and urinary tract infections. Of particular concern is the global emergence of multidrug-resistant (MDR) strains of Klebsiella pneumoniae. Characterization of antibiotic resistance determinants at the genomic level plays a critical role in understanding, and potentially controlling, the spread of MDR pathogens. In this study, drug-resistant Klebsiella pneumoniae strain 825795-1 was investigated with extensive computational approaches aimed at identifying novel drug targets among hypothetical proteins. We analyzed the 1099 hypothetical proteins available in the genome and used an in silico genome subtraction methodology to identify potential pathogen-specific drug targets against Klebsiella pneumoniae. We employed bioinformatics tools to subtract the strain-specific paralogous and host-specific homologous sequences from the bacterial proteome. The resulting 645 proteins were further refined to identify the essential genes in the pathogenic bacterium using the Database of Essential Genes (DEG). We found 135 unique essential proteins in the target proteome that could be used as novel targets for drug design. Further, we identified 49 cytoplasmic proteins as potential drug targets through sub-cellular localization prediction. We also investigated these proteins in the DrugBank databases, and 11 of the unique essential proteins showed druggability according to the FDA-approved DrugBank databases, with diverse broad-spectrum properties. The results of this study will facilitate the discovery of new drugs against Klebsiella pneumoniae.
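The successive filtering steps described in the abstract can be illustrated with a minimal sketch. It assumes that the upstream tools have already annotated each protein with boolean flags; the `Protein` record, its field names, and the function `subtractive_filter` are hypothetical and are not taken from the paper. The DrugBank druggability check is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protein:
    # Hypothetical annotation record; in practice each flag would come from
    # a separate tool (homology search, DEG lookup, localization predictor).
    pid: str
    is_paralog: bool
    has_host_homolog: bool
    is_essential: bool
    location: str


def subtractive_filter(proteins):
    """Apply the subtraction stages in order and record how many proteins survive each."""
    stages = [
        ("non_paralogous", lambda p: not p.is_paralog),
        ("non_host_homologous", lambda p: not p.has_host_homolog),
        ("essential", lambda p: p.is_essential),
        ("cytoplasmic", lambda p: p.location == "cytoplasmic"),
    ]
    remaining = list(proteins)
    counts = {"input": len(remaining)}
    for name, keep in stages:
        remaining = [p for p in remaining if keep(p)]
        counts[name] = len(remaining)
    return remaining, counts
```

Recording the count after every stage makes it easy to compare a run against a reported funnel such as the one in the abstract (1099, 645, 135, 49).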

Keywords: pneumonia, drug target, hypothetical protein, subtractive genomics

Procedia PDF Downloads 168
24399 Recommender System Based on Mining Graph Databases for Data-Intensive Applications

Authors: Mostafa Gamal, Hoda K. Mohamed, Islam El-Maddah, Ali Hamdi

Abstract:

In recent years, many digital documents have been created on the web due to the rapid growth of "social applications" communities and "data-intensive applications". The evolution of online multimedia data poses new challenges in storing and querying large amounts of data for online recommender systems. Graph data models have been shown to be more efficient than relational data models for processing complex data. This paper explains the key differences between graph and relational databases, their strengths and weaknesses, and why graph databases are the best technology for building a real-time recommendation system. The paper also discusses several similarity metric algorithms that can be used to compute a similarity score for pairs of nodes based on their neighbourhoods or their properties. Finally, the paper explores how NLP strategies can improve the accuracy and coverage of real-time recommendations by extracting information from stored unstructured knowledge, which makes up the bulk of the world’s data, and using it to enrich the graph database. As the size and number of data items are increasing rapidly, the proposed system should meet current and future needs.
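As an illustration of a neighbourhood-based similarity score, the sketch below uses the Jaccard index over neighbour sets. The abstract does not name the metrics the paper covers, so Jaccard is chosen here only as a common example; the adjacency-dictionary representation and the function names are assumptions.

```python
def jaccard(a, b):
    """Jaccard index of two neighbour sets: |a ∩ b| / |a ∪ b|."""
    union = a | b
    if not union:
        return 0.0  # two isolated nodes: define similarity as 0
    return len(a & b) / len(union)


def rank_similar(graph, node):
    """Score every other node against `node`, most similar first.

    `graph` maps each node to the set of its neighbours (assumed layout).
    Ties are broken alphabetically so the ordering is deterministic.
    """
    scores = [
        (other, jaccard(graph[node], graph[other]))
        for other in graph
        if other != node
    ]
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))
```

For example, with `graph = {"alice": {"book", "lamp", "pen"}, "bob": {"book", "pen"}, "carol": {"sofa"}}`, calling `rank_similar(graph, "alice")` ranks `bob` first, since the two share two of three distinct neighbours, and `carol` last with a score of 0.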

Keywords: graph databases, NLP, recommendation systems, similarity metrics

Procedia PDF Downloads 100
24398 Digital Revolution a Veritable Infrastructure for Technological Development

Authors: Osakwe Jude Odiakaosa

Abstract:

Today’s digital society is characterized by e-education or e-learning, e-commerce, and so on, all of which have been propelled by the digital revolution. Digital technologies such as computer technology, the Global Positioning System (GPS) and Geographic Information Systems (GIS) have had a tremendous impact on the field of technology. This development has positively affected the scope, methods, and speed of data acquisition, data management, and the rate of delivery of the results of data processing (maps and other map products). This paper addresses the impact of the revolution brought about by digital technology.

Keywords: digital revolution, internet, technology, data management

Procedia PDF Downloads 443
24397 BigCrypt: A Probable Approach of Big Data Encryption to Protect Personal and Business Privacy

Authors: Abdullah Al Mamun, Talal Alkharobi

Abstract:

As data sizes grow, people have become more accustomed to storing large amounts of secret information in cloud storage. Companies regularly need to transfer massive business files from one end to another. Privacy is lost if the data are transmitted as they are, and the same scenario repeats unless the communication mechanism is secured by proper encryption. Although asymmetric key encryption solves the main problem of symmetric key encryption, it can only encrypt a limited amount of data, which makes it unsuitable for encrypting large data. In this paper we propose a probable approach, based on Pretty Good Privacy, for encrypting big data using both symmetric and asymmetric keys. Our goal is to encrypt huge collections of information and transmit them through a secure communication channel in order to preserve business and personal privacy. To justify our method, an experimental dataset from three different platforms is provided. We show that our approach works efficiently and reliably for massive volumes of various data.

Keywords: big data, cloud computing, cryptography, hadoop, public key

Procedia PDF Downloads 314
24396 Implementation of Big Data Concepts Led by the Business Pressures

Authors: Snezana Savoska, Blagoj Ristevski, Violeta Manevska, Zlatko Savoski, Ilija Jolevski

Abstract:

Big data is widely accepted by pharmaceutical companies as a result of business demands created through legal pressure. Pharmaceutical companies face many legal demands as well as demands imposed by standards, and they have to adapt their procedures to the legislation. To cope with these demands, they have to standardize their use of current information technology and adopt the latest software tools. This paper highlights some important aspects of the experience gained from implementing big data projects in a Macedonian pharmaceutical company. These projects improved the company's business processes with the help of new software tools selected to comply with legal and business demands. The company uses IT as a strategic tool to obtain a competitive advantage in the market and to reengineer its processes to meet the demands of the new Internet economy and of quality standards. The company is required to manage vast amounts of structured as well as unstructured data. For these reasons, it implements projects for emerging and appropriate software tools that deal with the big data concepts accepted in the company.

Keywords: big data, unstructured data, SAP ERP, Documentum

Procedia PDF Downloads 264
24395 Saving Energy at a Wastewater Treatment Plant through Electrical and Production Data Analysis

Authors: Adriano Araujo Carvalho, Arturo Alatrista Corrales

Abstract:

This paper shows how the analysis of electrical energy consumption and production data was used to find opportunities to save energy at the Taboada wastewater treatment plant in Callao, Peru. To access the data, independent data networks were used for the electrical and the process instruments. The data were analyzed as part of an ISO 50001 energy audit, which considered Energy Performance Indexes for each process and followed a step-by-step guide presented in this text. By applying this methodology and data mining techniques to information gathered through electronic multimeters (conveniently placed on substation switchboards connected to a cloud network), it was possible to assess the performance of each process thoroughly and thus reveal previously hidden saving opportunities. The data analysis reduced both costs and energy use, allowing the plant to save significant resources and to be certified under ISO 50001.
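A per-process Energy Performance Index can be sketched as energy consumed per unit of production output. The abstract does not give the plant's actual index definitions, so the ratio, the units (kWh per unit of output), the 10% tolerance, and the function names below are all assumptions made for illustration.

```python
def energy_performance_index(energy_kwh, output_units):
    """Energy used per unit of production output (assumed index definition)."""
    if output_units <= 0:
        raise ValueError("output_units must be positive")
    return energy_kwh / output_units


def flag_processes(readings, baseline, tolerance=0.10):
    """Return processes whose index exceeds their baseline by more than `tolerance`.

    `readings` maps process name -> (energy_kwh, output_units);
    `baseline` maps process name -> reference index. Both are hypothetical inputs.
    """
    flagged = {}
    for process, (energy_kwh, output_units) in readings.items():
        enpi = energy_performance_index(energy_kwh, output_units)
        if enpi > baseline[process] * (1 + tolerance):
            flagged[process] = enpi
    return flagged
```

Comparing each process against its own baseline, rather than against a plant-wide total, is what lets an analysis of this kind point to the specific process where a saving opportunity lies.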

Keywords: energy and production data analysis, energy management, ISO 50001, wastewater treatment plant energy analysis

Procedia PDF Downloads 189
24394 Data Clustering in Wireless Sensor Network Implemented on Self-Organization Feature Map (SOFM) Neural Network

Authors: Krishan Kumar, Mohit Mittal, Pramod Kumar

Abstract:

The wireless sensor network is one of the most promising communication networks for monitoring remote environmental areas. In this network, all the sensor nodes communicate with each other via radio signals. The sensor nodes are capable of sensing, data storage and processing, and they relay the collected information through neighboring nodes to a particular node. Data collection and processing are performed by data aggregation techniques. For data aggregation, a clustering technique is implemented in the sensor network using a self-organizing feature map (SOFM) neural network. Some of the sensor nodes are selected as cluster head nodes. Information is aggregated at the cluster head nodes from the non-cluster head nodes and is then transferred to the base station (or sink nodes). The aim of this paper is to manage the huge amount of data with the help of the SOFM neural network. Only the clustered data are transferred to the base station, instead of all the information aggregated at the cluster head nodes. This reduces the battery consumption caused by managing the huge amount of data, and the network lifetime is extended to a great extent.

Keywords: artificial neural network, data clustering, self organization feature map, wireless sensor network

Procedia PDF Downloads 507
24393 Review and Comparison of Associative Classification Data Mining Approaches

Authors: Suzan Wedyan

Abstract:

Data mining is one of the main phases of knowledge discovery in databases (KDD) and is responsible for finding hidden and useful knowledge in databases. Data mining covers many different tasks, including regression, pattern recognition, clustering, classification, and association rule discovery. In recent years, a promising data mining approach called associative classification (AC) has been proposed. AC integrates classification and association rule discovery to build classification models (classifiers). This paper surveys and critically compares several AC algorithms with reference to the different procedures used in each algorithm, such as rule learning, rule sorting, rule pruning, classifier building, and class allocation for test cases.
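The rule learning, rule sorting, and class allocation steps named above can be illustrated with a deliberately simplified sketch. It learns only single-item rules of the form "item implies class", filters them by support and confidence, ranks them, and classifies a test case with the first matching rule. This is not any specific algorithm from the survey; the function names and thresholds are assumptions, and real AC algorithms mine multi-item rules and add a pruning step.

```python
from collections import Counter


def learn_rules(rows, min_support=0.2, min_confidence=0.6):
    """Learn single-item class association rules from (items, label) rows.

    Returns tuples (item, label, support, confidence), sorted by confidence,
    then support, then item name, all in a fixed deterministic order.
    """
    n = len(rows)
    item_counts = Counter()
    pair_counts = Counter()
    for items, label in rows:
        for item in set(items):
            item_counts[item] += 1
            pair_counts[(item, label)] += 1

    rules = []
    for (item, label), count in pair_counts.items():
        support = count / n                    # fraction of rows with item and label
        confidence = count / item_counts[item]  # fraction of item rows with label
        if support >= min_support and confidence >= min_confidence:
            rules.append((item, label, support, confidence))

    rules.sort(key=lambda r: (-r[3], -r[2], r[0]))
    return rules


def classify(rules, items, default):
    """Assign the class of the highest-ranked rule whose item occurs in `items`."""
    for item, label, _support, _confidence in rules:
        if item in items:
            return label
    return default
```

Ranking by confidence first and support second is one common ordering; the surveyed algorithms differ precisely in choices like this one, and in how they prune the ranked list.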

Keywords: associative classification, classification, data mining, learning, rule ranking, rule pruning, prediction

Procedia PDF Downloads 530
24392 Hierarchical Checkpoint Protocol in Data Grids

Authors: Rahma Souli-Jbali, Minyar Sassi Hidri, Rahma Ben Ayed

Abstract:

The grid of computing nodes has emerged as a representative means of connecting distributed computers or resources scattered all over the world for the purpose of computing and distributed storage. Since fault tolerance becomes complex due to the variable availability of resources in a decentralized grid environment, it can be used in connection with replication in data grids. The objective of our work is to present fault tolerance in data grids with a data replication-driven model based on clustering. The performance of the protocol is evaluated with the OMNeT++ simulator. The computational results show the efficiency of our protocol in terms of recovery time and the number of processes involved in rollbacks.

Keywords: data grids, fault tolerance, clustering, Chandy-Lamport

Procedia PDF Downloads 329
24391 An Observation of the Information Technology Research and Development Based on Article Data Mining: A Survey Study on Science Direct

Authors: Muhammet Dursun Kaya, Hasan Asil

Abstract:

One of the most important factors in research and development is deep insight into the evolution of scientific development. State-of-the-art tools and instruments can considerably assist researchers, and many organizations around the world have become aware of the advantages of data mining for acquiring the required knowledge from unstructured data. This paper reviews the articles on information technology published in the past five years with the aid of data mining. A clustering approach was used to study these articles, and the results revealed that three topics, namely health, innovation, and information systems, have attracted particular attention from researchers.

Keywords: information technology, data mining, scientific development, clustering

Procedia PDF Downloads 271
24390 Security in Resource Constraints: Network Energy Efficient Encryption

Authors: Mona Almansoori, Ahmed Mustafa, Ahmad Elshamy

Abstract:

Wireless nodes in a sensor network gather, process, and communicate critical information. The information flowing through such a network is critical for decision making and data processing, and the integrity of these data is one of the most critical factors in wireless security; it must be ensured without compromising the processing and transmission capability of the network. This paper presents a mechanism to securely transmit data over a chain of sensor nodes without compromising the throughput of the network, using the battery resources available at the sensor node.

Keywords: hybrid protocol, data integrity, lightweight encryption, neighbor based key sharing, sensor node data processing, Z-MAC

Procedia PDF Downloads 139