Search results for: Data Warehousing tools
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8184

Search results for: Data Warehousing tools

7554 An Analysis of Genetic Algorithm Based Test Data Compression Using Modified PRL Coding

Authors: K. S. Neelukumari, K. B. Jayanthi

Abstract:

In this paper genetic based test data compression is targeted for improving the compression ratio and for reducing the computation time. The genetic algorithm is based on extended pattern run-length coding. The test set contains a large number of X value that can be effectively exploited to improve the test data compression. In this coding method, a reference pattern is set and its compatibility is checked. For this process, a genetic algorithm is proposed to reduce the computation time of encoding algorithm. This coding technique encodes the 2n compatible pattern or the inversely compatible pattern into a single test data segment or multiple test data segment. The experimental result shows that the compression ratio and computation time is reduced.

Keywords: Backtracking, test data compression (TDC), x-filling, x-propagating and genetic algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1850
7553 Parallel Vector Processing Using Multi Level Orbital DATA

Authors: Nagi Mekhiel

Abstract:

Many applications use vector operations by applying single instruction to multiple data that map to different locations in conventional memory. Transferring data from memory is limited by access latency and bandwidth affecting the performance gain of vector processing. We present a memory system that makes all of its content available to processors in time so that processors need not to access the memory, we force each location to be available to all processors at a specific time. The data move in different orbits to become available to other processors in higher orbits at different time. We use this memory to apply parallel vector operations to data streams at first orbit level. Data processed in the first level move to upper orbit one data element at a time, allowing a processor in that orbit to apply another vector operation to deal with serial code limitations inherited in all parallel applications and interleaved it with lower level vector operations.

Keywords: Memory organization, parallel processors, serial code, vector processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1041
7552 Mechanisms of Organic Contaminants Uptake and Degradation in Plants

Authors: E.Kvesitadze, T.Sadunishvili, G.Kvesitadze

Abstract:

As a result of urbanization, the unpredictable growth of industry and transport, production of chemicals, military activities, etc. the concentration of anthropogenic toxicants spread in nature exceeds all the permissible standards. Most dangerous among these contaminants are organic compounds having great persistence, bioaccumulation, and toxicity along with our awareness of their prominent occurrence in the environment and food chain. Among natural ecological tools, plants still occupying above 40% of the world land, until recently, were considered as organisms having only a limited ecological potential, accumulating in plant biomass and partially volatilizing contaminants of different structure. However, analysis of experimental data of the last two decades revealed the essential role of plants in environment remediation due to ability to carry out intracellular degradation processes leading to partial or complete decomposition of carbon skeleton of different structure contaminants. Though, phytoremediation technologies still are in research and development, their various applications have been successfully used. The paper aims to analyze mechanisms of organic contaminants uptake and detoxification in plants, being the less studied issue in evaluation and exploration of plants potential for environment remediation.

Keywords: organic contaminants, Detoxification, metalloenzymes, plant ultrastructure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3040
7551 Exploring Dimensionality, Systematic Mutations and Number of Contacts in Simple HP ab-initio Protein Folding Using a Blackboard-based Agent Platform

Authors: Hiram I. Beltrán, Arturo Rojo-Domínguez, Máximo Eduardo Sánchez Gutiérrez, Pedro Pablo González Pérez

Abstract:

A computational platform is presented in this contribution. It has been designed as a virtual laboratory to be used for exploring optimization algorithms in biological problems. This platform is built on a blackboard-based agent architecture. As a test case, the version of the platform presented here is devoted to the study of protein folding, initially with a bead-like description of the chain and with the widely used model of hydrophobic and polar residues (HP model). Some details of the platform design are presented along with its capabilities and also are revised some explorations of the protein folding problems with different types of discrete space. It is also shown the capability of the platform to incorporate specific tools for the structural analysis of the runs in order to understand and improve the optimization process. Accordingly, the results obtained demonstrate that the ensemble of computational tools into a single platform is worthwhile by itself, since experiments developed on it can be designed to fulfill different levels of information in a self-consistent fashion. By now, it is being explored how an experiment design can be useful to create a computational agent to be included within the platform. These inclusions of designed agents –or software pieces– are useful for the better accomplishment of the tasks to be developed by the platform. Clearly, while the number of agents increases the new version of the virtual laboratory thus enhances in robustness and functionality.

Keywords: genetic algorithms, multi-agent systems, bioinformatics, optimization, protein folding, structural biology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1864
7550 An Investigation into the Application of Artificial Neural Networks to the Prediction of Injuries in Sport

Authors: J. McCullagh, T. Whitfort

Abstract:

Artificial Neural Networks (ANNs) have been used successfully in many scientific, industrial and business domains as a method for extracting knowledge from vast amounts of data. However the use of ANN techniques in the sporting domain has been limited. In professional sport, data is stored on many aspects of teams, games, training and players. Sporting organisations have begun to realise that there is a wealth of untapped knowledge contained in the data and there is great interest in techniques to utilise this data. This study will use player data from the elite Australian Football League (AFL) competition to train and test ANNs with the aim to predict the onset of injuries. The results demonstrate that an accuracy of 82.9% was achieved by the ANNs’ predictions across all examples with 94.5% of all injuries correctly predicted. These initial findings suggest that ANNs may have the potential to assist sporting clubs in the prediction of injuries.

Keywords: Artificial Neural Networks, data, injuries, sport

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2826
7549 Analysis of Medical Data using Data Mining and Formal Concept Analysis

Authors: Anamika Gupta, Naveen Kumar, Vasudha Bhatnagar

Abstract:

This paper focuses on analyzing medical diagnostic data using classification rules in data mining and context reduction in formal concept analysis. It helps in finding redundancies among the various medical examination tests used in diagnosis of a disease. Classification rules have been derived from positive and negative association rules using the Concept lattice structure of the Formal Concept Analysis. Context reduction technique given in Formal Concept Analysis along with classification rules has been used to find redundancies among the various medical examination tests. Also it finds out whether expensive medical tests can be replaced by some cheaper tests.

Keywords: Data Mining, Formal Concept Analysis, Medical Data, Negative Classification Rules.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1710
7548 Data Transmission Reliability in Short Message Integrated Distributed Monitoring Systems

Authors: Sui Xin, Li Chunsheng, Tian Di

Abstract:

Short message integrated distributed monitoring systems (SM-DMS) are growing rapidly in wireless communication applications in various areas, such as electromagnetic field (EMF) management, wastewater monitoring, and air pollution supervision, etc. However, delay in short messages often makes the data embedded in SM-DMS transmit unreliably. Moreover, there are few regulations dealing with this problem in SMS transmission protocols. In this study, based on the analysis of the command and data requirements in the SM-DMS, we developed a processing model for the control center to solve the delay problem in data transmission. Three components of the model: the data transmission protocol, the receiving buffer pool method, and the timer mechanism were described in detail. Discussions on adjusting the threshold parameter in the timer mechanism were presented for the adaptive performance during the runtime of the SM-DMS. This model optimized the data transmission reliability in SM-DMS, and provided a supplement to the data transmission reliability protocols at the application level.

Keywords: Delay, SMS, reliability, distributed monitoringsystem (DMS), wireless communication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1679
7547 Data Gathering Protocols for Wireless Sensor Networks

Authors: Dhinu Johnson, Gurdip Singh

Abstract:

Sensor network applications are often data centric and involve collecting data from a set of sensor nodes to be delivered to various consumers. Typically, nodes in a sensor network are resource-constrained, and hence the algorithms operating in these networks must be efficient. There may be several algorithms available implementing the same service, and efficient considerations may require a sensor application to choose the best suited algorithm. In this paper, we present a systematic evaluation of a set of algorithms implementing the data gathering service. We propose a modular infrastructure for implementing such algorithms in TOSSIM with separate configurable modules for various tasks such as interest propagation, data propagation, aggregation, and path maintenance. By appropriately configuring these modules, we propose a number of data gathering algorithms, each of which incorporates a different set of heuristics for optimizing performance. We have performed comprehensive experiments to evaluate the effectiveness of these heuristics, and we present results from our experimentation efforts.

Keywords: Data Centric Protocols, Shortest Paths, Sensor networks, Message passing systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1421
7546 Measured versus Default Interstate Traffic Data in New Mexico, USA

Authors: M. A. Hasan, M. R. Islam, R. A. Tarefder

Abstract:

This study investigates how the site specific traffic data differs from the Mechanistic Empirical Pavement Design Software default values. Two Weigh-in-Motion (WIM) stations were installed in Interstate-40 (I-40) and Interstate-25 (I-25) to developed site specific data. A computer program named WIM Data Analysis Software (WIMDAS) was developed using Microsoft C-Sharp (.Net) for quality checking and processing of raw WIM data. A complete year data from November 2013 to October 2014 was analyzed using the developed WIM Data Analysis Program. After that, the vehicle class distribution, directional distribution, lane distribution, monthly adjustment factor, hourly distribution, axle load spectra, average number of axle per vehicle, axle spacing, lateral wander distribution, and wheelbase distribution were calculated. Then a comparative study was done between measured data and AASHTOWare default values. It was found that the measured general traffic inputs for I-40 and I-25 significantly differ from the default values.

Keywords: AASHTOWare, Traffic, Weigh-in-Motion, Axle load Distribution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1676
7545 Energy Efficient In-Network Data Processing in Sensor Networks

Authors: Prakash G L, Thejaswini M, S H Manjula, K R Venugopal, L M Patnaik

Abstract:

The Sensor Network consists of densely deployed sensor nodes. Energy optimization is one of the most important aspects of sensor application design. Data acquisition and aggregation techniques for processing data in-network should be energy efficient. Due to the cross-layer design, resource-limited and noisy nature of Wireless Sensor Networks(WSNs), it is challenging to study the performance of these systems in a realistic setting. In this paper, we propose optimizing queries by aggregation of data and data redundancy to reduce energy consumption without requiring all sensed data and directed diffusion communication paradigm to achieve power savings, robust communication and processing data in-network. To estimate the per-node power consumption POWERTossim mica2 energy model is used, which provides scalable and accurate results. The performance analysis shows that the proposed methods overcomes the existing methods in the aspects of energy consumption in wireless sensor networks.

Keywords: Data Aggregation, Directed Diffusion, Partial Aggregation, Packet Merging, Query Plan.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1808
7544 Preliminary Analysis of Energy Efficiency in Data Center: Case Study

Authors: Xiaoshu Lu, Tao Lu, Matias Remes, Martti Viljanen

Abstract:

As the data-driven economy is growing faster than ever and the demand for energy is being spurred, we are facing unprecedented challenges of improving energy efficiency in data centers. Effectively maximizing energy efficiency or minimising the cooling energy demand is becoming pervasive for data centers. This paper investigates overall energy consumption and the energy efficiency of cooling system for a data center in Finland as a case study. The power, cooling and energy consumption characteristics and operation condition of facilities are examined and analysed. Potential energy and cooling saving opportunities are identified and further suggestions for improving the performance of cooling system are put forward. Results are presented as a comprehensive evaluation of both the energy performance and good practices of energy efficient cooling operations for the data center. Utilization of an energy recovery concept for cooling system is proposed. The conclusion we can draw is that even though the analysed data center demonstrated relatively high energy efficiency, based on its power usage effectiveness value, there is still a significant potential for energy saving from its cooling systems.

Keywords: Data center, case study, cooling system, energyefficiency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1518
7543 A Multi-Agent Framework for Data Mining

Authors: Kamal Ali Albashiri, Khaled Ahmed Kadouh

Abstract:

A generic and extendible Multi-Agent Data Mining (MADM) framework, MADMF (the Multi-Agent Data Mining Framework) is described. The central feature of the framework is that it avoids the use of agreed meta-language formats by supporting a framework of wrappers. The advantage offered is that the framework is easily extendible, so that further data agents and mining agents can simply be added to the framework. A demonstration MADMF framework is currently available. The paper includes details of the MADMF architecture and the wrapper principle incorporated into it. A full description and evaluation of the framework-s operation is provided by considering two MADM scenarios.

Keywords: Multi-Agent Data Mining (MADM), Frequent Itemsets, Meta ARM, Association Rule Mining, Classifier generator.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2046
7542 Cyber Fraud Schemes: Modus Operandi, Tools and Techniques, and the Role of European Legislation as a Defense Strategy

Authors: Papathanasiou Anastasios, Liontos George, Liagkou Vasiliki, Glavas Euripides

Abstract:

The purpose of this paper is to describe the growing problem of various cyber fraud schemes that exist on the internet and are currently among the most prevalent. The main focus of this paper is to provide a detailed description of the modus operandi, tools, and techniques utilized in four basic typologies of cyber frauds: Business Email Compromise (BEC) attacks, investment fraud, romance scams, and online sales fraud. The paper aims to shed light on the methods employed by cybercriminals in perpetrating these types of fraud, as well as the strategies they use to deceive and victimize individuals and businesses on the internet. Furthermore, this study outlines defense strategies intended to tackle the issue head-on, with a particular emphasis on the crucial role played by European legislation. European legislation has proactively adapted to the evolving landscape of cyber fraud, striving to enhance cybersecurity awareness, bolster user education, and implement advanced technical controls to mitigate associated risks. The paper evaluates the advantages and innovations brought about by the European legislation while also acknowledging potential flaws that cybercriminals might exploit. As a result, recommendations for refining the legislation are offered in this study in order to better address this pressing issue.

Keywords: Business email compromise, cybercrime, European legislation, investment fraud, Network and Information Security, online sales fraud, romance scams.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 95
7541 Deep Learning for Renewable Power Forecasting: An Approach Using LSTM Neural Networks

Authors: Fazıl Gökgöz, Fahrettin Filiz

Abstract:

Load forecasting has become crucial in recent years and become popular in forecasting area. Many different power forecasting models have been tried out for this purpose. Electricity load forecasting is necessary for energy policies, healthy and reliable grid systems. Effective power forecasting of renewable energy load leads the decision makers to minimize the costs of electric utilities and power plants. Forecasting tools are required that can be used to predict how much renewable energy can be utilized. The purpose of this study is to explore the effectiveness of LSTM-based neural networks for estimating renewable energy loads. In this study, we present models for predicting renewable energy loads based on deep neural networks, especially the Long Term Memory (LSTM) algorithms. Deep learning allows multiple layers of models to learn representation of data. LSTM algorithms are able to store information for long periods of time. Deep learning models have recently been used to forecast the renewable energy sources such as predicting wind and solar energy power. Historical load and weather information represent the most important variables for the inputs within the power forecasting models. The dataset contained power consumption measurements are gathered between January 2016 and December 2017 with one-hour resolution. Models use publicly available data from the Turkish Renewable Energy Resources Support Mechanism. Forecasting studies have been carried out with these data via deep neural networks approach including LSTM technique for Turkish electricity markets. 432 different models are created by changing layers cell count and dropout. The adaptive moment estimation (ADAM) algorithm is used for training as a gradient-based optimizer instead of SGD (stochastic gradient). ADAM performed better than SGD in terms of faster convergence and lower error rates. Models performance is compared according to MAE (Mean Absolute Error) and MSE (Mean Squared Error). Best five MAE results out of 432 tested models are 0.66, 0.74, 0.85 and 1.09. The forecasting performance of the proposed LSTM models gives successful results compared to literature searches.

Keywords: Deep learning, long-short-term memory, energy, renewable energy load forecasting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1561
7540 Surface Thermodynamics Approach to Mycobacterium tuberculosis (M-TB) – Human Sputum Interactions

Authors: J. L. Chukwuneke, C. H. Achebe, S. N. Omenyi

Abstract:

This research work presents the surface thermodynamics approach to M-TB/HIV-Human sputum interactions. This involved the use of the Hamaker coefficient concept as a surface energetics tool in determining the interaction processes, with the surface interfacial energies explained using van der Waals concept of particle interactions. The Lifshitz derivation for van der Waals forces was applied as an alternative to the contact angle approach which has been widely used in other biological systems. The methodology involved taking sputum samples from twenty infected persons and from twenty uninfected persons for absorbance measurement using a digital Ultraviolet visible Spectrophotometer. The variables required for the computations with the Lifshitz formula were derived from the absorbance data. The Matlab software tools were used in the mathematical analysis of the data produced from the experiments (absorbance values). The Hamaker constants and the combined Hamaker coefficients were obtained using the values of the dielectric constant together with the Lifshitz Equation. The absolute combined Hamaker coefficients A132abs and A131abs on both infected and uninfected sputum samples gave the values of A132abs = 0.21631x10-21Joule for M-TB infected sputum and Ã132abs = 0.18825x10-21Joule for M-TB/HIV infected sputum. The significance of this result is the positive value of the absolute combined Hamaker coefficient which suggests the existence of net positive van der waals forces demonstrating an attraction between the bacteria and the macrophage. This however, implies that infection can occur. It was also shown that in the presence of HIV, the interaction energy is reduced by 13% conforming adverse effects observed in HIV patients suffering from tuberculosis.

Keywords: Absorbance, dielectric constant, Hamaker coefficient, Lifshitz formula, macrophage, Mycobacterium tuberculosis, Van der Waals forces.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1754
7539 Investigation on Performance of Change Point Algorithm in Time Series Dynamical Regimes and Effect of Data Characteristics

Authors: Farhad Asadi, Mohammad Javad Mollakazemi

Abstract:

In this paper, Bayesian online inference in models of data series are constructed by change-points algorithm, which separated the observed time series into independent series and study the change and variation of the regime of the data with related statistical characteristics. variation of statistical characteristics of time series data often represent separated phenomena in the some dynamical system, like a change in state of brain dynamical reflected in EEG signal data measurement or a change in important regime of data in many dynamical system. In this paper, prediction algorithm for studying change point location in some time series data is simulated. It is verified that pattern of proposed distribution of data has important factor on simpler and smother fluctuation of hazard rate parameter and also for better identification of change point locations. Finally, the conditions of how the time series distribution effect on factors in this approach are explained and validated with different time series databases for some dynamical system.

Keywords: Time series, fluctuation in statistical characteristics, optimal learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1789
7538 Fuzzy Types Clustering for Microarray Data

Authors: Seo Young Kim, Tai Myong Choi

Abstract:

The main goal of microarray experiments is to quantify the expression of every object on a slide as precisely as possible, with a further goal of clustering the objects. Recently, many studies have discussed clustering issues involving similar patterns of gene expression. This paper presents an application of fuzzy-type methods for clustering DNA microarray data that can be applied to typical comparisons. Clustering and analyses were performed on microarray and simulated data. The results show that fuzzy-possibility c-means clustering substantially improves the findings obtained by others.

Keywords: Clustering, microarray data, Fuzzy-type clustering, Validation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1499
7537 Computer Aided Design Solution Based on Genetic Algorithms for FMEA and Control Plan in Automotive Industry

Authors: Nadia Belu, Laurentiu M. Ionescu, Agnieszka Misztal

Abstract:

In this paper we propose a computer-aided solution with Genetic Algorithms in order to reduce the drafting of reports: FMEA analysis and Control Plan required in the manufacture of the product launch and improved knowledge development teams for future projects. The solution allows to the design team to introduce data entry required to FMEA. The actual analysis is performed using Genetic Algorithms to find optimum between RPN risk factor and cost of production. A feature of Genetic Algorithms is that they are used as a means of finding solutions for multi criteria optimization problems. In our case, along with three specific FMEA risk factors is considered and reduce production cost. Analysis tool will generate final reports for all FMEA processes. The data obtained in FMEA reports are automatically integrated with other entered parameters in Control Plan. Implementation of the solution is in the form of an application running in an intranet on two servers: one containing analysis and plan generation engine and the other containing the database where the initial parameters and results are stored. The results can then be used as starting solutions in the synthesis of other projects. The solution was applied to welding processes, laser cutting and bending to manufacture chassis for buses. Advantages of the solution are efficient elaboration of documents in the current project by automatically generating reports FMEA and Control Plan using multiple criteria optimization of production and build a solid knowledge base for future projects. The solution which we propose is a cheap alternative to other solutions on the market using Open Source tools in implementation.

Keywords: Automotive industry, control plan, FMEA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2856
7536 Robust Regression and its Application in Financial Data Analysis

Authors: Mansoor Momeni, Mahmoud Dehghan Nayeri, Ali Faal Ghayoumi, Hoda Ghorbani

Abstract:

This research is aimed to describe the application of robust regression and its advantages over the least square regression method in analyzing financial data. To do this, relationship between earning per share, book value of equity per share and share price as price model and earning per share, annual change of earning per share and return of stock as return model is discussed using both robust and least square regressions, and finally the outcomes are compared. Comparing the results from the robust regression and the least square regression shows that the former can provide the possibility of a better and more realistic analysis owing to eliminating or reducing the contribution of outliers and influential data. Therefore, robust regression is recommended for getting more precise results in financial data analysis.

Keywords: Financial data analysis, Influential data, Outliers, Robust regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1906
7535 Hierarchical Checkpoint Protocol in Data Grids

Authors: Rahma Souli-Jbali, Minyar Sassi Hidri, Rahma Ben Ayed

Abstract:

Grid of computing nodes has emerged as a representative means of connecting distributed computers or resources scattered all over the world for the purpose of computing and distributed storage. Since fault tolerance becomes complex due to the availability of resources in decentralized grid environment, it can be used in connection with replication in data grids. The objective of our work is to present fault tolerance in data grids with data replication-driven model based on clustering. The performance of the protocol is evaluated with Omnet++ simulator. The computational results show the efficiency of our protocol in terms of recovery time and the number of process in rollbacks.

Keywords: Data grids, fault tolerance, chandy-lamport, clustering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 921
7534 Fuzzy Based Problem-Solution Data Structureas a Data Oriented Model for ABS Controlling

Authors: Ahmad Habibizad Navin, Mehdi Naghian Fesharaki, Mohamad Teshnelab, Ehsan Shahamatnia

Abstract:

The anti-lock braking systems installed on vehicles for safe and effective braking, are high-order nonlinear and timevariant. Using fuzzy logic controllers increase efficiency of such systems, but impose a high computational complexity as well. The main concept introduced by this paper is reducing computational complexity of fuzzy controllers by deploying problem-solution data structure. Unlike conventional methods that are based on calculations, this approach is based on data oriented modeling.

Keywords: ABS, Fuzzy controller, PSDS, Time-Memory tradeoff, Data oriented modeling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1710
7533 Use of Bayesian Network in Information Extraction from Unstructured Data Sources

Authors: Quratulain N. Rajput, Sajjad Haider

Abstract:

This paper applies Bayesian Networks to support information extraction from unstructured, ungrammatical, and incoherent data sources for semantic annotation. A tool has been developed that combines ontologies, machine learning, and information extraction and probabilistic reasoning techniques to support the extraction process. Data acquisition is performed with the aid of knowledge specified in the form of ontology. Due to the variable size of information available on different data sources, it is often the case that the extracted data contains missing values for certain variables of interest. It is desirable in such situations to predict the missing values. The methodology, presented in this paper, first learns a Bayesian network from the training data and then uses it to predict missing data and to resolve conflicts. Experiments have been conducted to analyze the performance of the presented methodology. The results look promising as the methodology achieves high degree of precision and recall for information extraction and reasonably good accuracy for predicting missing values.

Keywords: Information Extraction, Bayesian Network, ontology, Machine Learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2202
7532 Data Acquisition from Cell Phone using Logical Approach

Authors: Keonwoo Kim, Dowon Hong, Kyoil Chung, Jae-Cheol Ryou

Abstract:

Cell phone forensics to acquire and analyze data in the cellular phone is nowadays being used in a national investigation organization and a private company. In order to collect cellular phone flash memory data, we have two methods. Firstly, it is a logical method which acquires files and directories from the file system of the cell phone flash memory. Secondly, we can get all data from bit-by-bit copy of entire physical memory using a low level access method. In this paper, we describe a forensic tool to acquire cell phone flash memory data using a logical level approach. By our tool, we can get EFS file system and peek memory data with an arbitrary region from Korea CDMA cell phone.

Keywords: Forensics, logical method, acquisition, cell phone, flash memory.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4088
7531 Data Migration Methodology from Relational to NoSQL Databases

Authors: Mohamed Hanine, Abdesadik Bendarag, Omar Boutkhoum

Abstract:

Currently, the field of data migration is very topical. As the number of applications developed rapidly, the ever-increasing volume of data collected has driven the architectural migration from Relational Database Management System (RDBMS) to NoSQL (Not Only SQL) database. This very recent technology is important enough in the field of database management. The main aim of this paper is to present a methodology for data migration from RDBMS to NoSQL database. To illustrate this methodology, we implement a software prototype using MySQL as a RDBMS and MongoDB as a NoSQL database. Although this is a hard engineering work, our results show that the proposed methodology can successfully accomplish the goal of this study.

Keywords: Data Migration, MySQL, RDBMS, NoSQL, MongoDB.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4337
7530 Tool Wear of Titanium/Tungsten/Silicon/Aluminum-based-coated end Mill Cutters in Millin Hardened Steel

Authors: Tadahiro Wada, Koji Iwamoto

Abstract:

In turning hardened steel, polycrystalline cubic boron nitride (cBN) compacts are widely used, due to their higher hardness and higher thermal conductivity. However, in milling hardened steel, fracture of cBN cutting tools readily occurs because they have poor fracture toughness. Therefore, coated cemented carbide tools, which have good fracture toughness and wear resistance, are generally widely used. In this study, hardened steel (ASTM D2, JIS SKD11, 60HRC) was milled with three physical vapor deposition (PVD)-coated cemented carbide end mill cutters in order to determine effective tool materials for cutting hardened steel at high cutting speeds. The coating films used were (Ti,W)N/(Ti,W,Si)N and (Ti,W)N/(Ti,W,Si,Al)N coating films. (Ti,W,Si,Al)N is a new type of coating film. The inner layer of the (Ti,W)N/(Ti,W,Si)N and (Ti,W)N/(Ti,W,Si,Al)N coating system is (Ti,W)N coating film, and the outer layer is (Ti,W,Si)N and (Ti,W,Si,Al)N coating films, respectively. Furthermore, commercial (Ti,Al)N-based coating film was also used. The following results were obtained: (1) In milling hardened steel at a cutting speed of 3.33 m/s, the tool wear width of the (Ti,W)N/(Ti,W,Si,Al)N-coated tool was smaller than that of the (Ti,W)N/(Ti,W,Si)N-coated tool. And, compared with the commercial (Ti,Al)N, the tool wear width of the (Ti,W)N/(Ti,W,Si,Al)N-coated tool was smaller than that of the (Ti,Al)N-coated tool. (2) The tool wear of the (Ti,W)N/(Ti,W,Si,Al)N-coated tool increased with an increase in cutting speed. (3) The (Ti,W)N/(Ti,W,Si,Al)N-coated cemented carbide was an effective tool material for high-speed cutting below a cutting speed of 3.33 m/s.

Keywords: cutting, physical vapor deposition (PVD) coating system, hardened steel, tool wear

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2034
7529 Performance Comparison of Particle Swarm Optimization with Traditional Clustering Algorithms used in Self-Organizing Map

Authors: Anurag Sharma, Christian W. Omlin

Abstract:

Self-organizing map (SOM) is a well known data reduction technique used in data mining. It can reveal structure in data sets through data visualization that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOM, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of an adaptive heuristic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOM. The application of our method to several standard data sets demonstrates its feasibility. PSO algorithm utilizes a so-called U-matrix of SOM to determine cluster boundaries; the results of this novel automatic method compare very favorably to boundary detection through traditional algorithms namely k-means and hierarchical based approach which are normally used to interpret the output of SOM.

Keywords: cluster boundaries, clustering, code vectors, data mining, particle swarm optimization, self-organizing maps, U-matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1888
7528 Data Hiding by Vector Quantization in Color Image

Authors: Yung-Gi Wu

Abstract:

With the growing of computer and network, digital data can be spread to anywhere in the world quickly. In addition, digital data can also be copied or tampered easily so that the security issue becomes an important topic in the protection of digital data. Digital watermark is a method to protect the ownership of digital data. Embedding the watermark will influence the quality certainly. In this paper, Vector Quantization (VQ) is used to embed the watermark into the image to fulfill the goal of data hiding. This kind of watermarking is invisible which means that the users will not conscious the existing of embedded watermark even though the embedded image has tiny difference compared to the original image. Meanwhile, VQ needs a lot of computation burden so that we adopt a fast VQ encoding scheme by partial distortion searching (PDS) and mean approximation scheme to speed up the data hiding process. The watermarks we hide to the image could be gray, bi-level and color images. Texts are also can be regarded as watermark to embed. In order to test the robustness of the system, we adopt Photoshop to fulfill sharpen, cropping and altering to check if the extracted watermark is still recognizable. Experimental results demonstrate that the proposed system can resist the above three kinds of tampering in general cases.

Keywords: Data hiding, vector quantization, watermark.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1758
7527 Approximate Range-Sum Queries over Data Cubes Using Cosine Transform

Authors: Wen-Chi Hou, Cheng Luo, Zhewei Jiang, Feng Yan

Abstract:

In this research, we propose to use the discrete cosine transform to approximate the cumulative distributions of data cube cells- values. The cosine transform is known to have a good energy compaction property and thus can approximate data distribution functions easily with small number of coefficients. The derived estimator is accurate and easy to update. We perform experiments to compare its performance with a well-known technique - the (Haar) wavelet. The experimental results show that the cosine transform performs much better than the wavelet in estimation accuracy, speed, space efficiency, and update easiness.

Keywords: DCT, Data Cube

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1932
7526 Digital filters for Hot-Mix Asphalt Complex Modulus Test Data Using Genetic Algorithm Strategies

Authors: Madhav V. Chitturi, Anshu Manik, Kasthurirangan Gopalakrishnan

Abstract:

The dynamic or complex modulus test is considered to be a mechanistically based laboratory test to reliably characterize the strength and load-resistance of Hot-Mix Asphalt (HMA) mixes used in the construction of roads. The most common observation is that the data collected from these tests are often noisy and somewhat non-sinusoidal. This hampers accurate analysis of the data to obtain engineering insight. The goal of the work presented in this paper is to develop and compare automated evolutionary computational techniques to filter test noise in the collection of data for the HMA complex modulus test. The results showed that the Covariance Matrix Adaptation-Evolutionary Strategy (CMA-ES) approach is computationally efficient for filtering data obtained from the HMA complex modulus test.

Keywords: HMA, dynamic modulus, GA, evolutionarycomputation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1553
7525 Strategy for Optimal Configuration Design of Existing Structures by Topology and Shape Optimization Tools

Authors: Waqas Saleem, Fan Yuqing

Abstract:

A strategy is implemented to find the improved configuration design of an existing aircraft structure by executing topology and shape optimizations. Structural analysis of the Initial Design Space is performed in ANSYS under the loads pertinent to operating and ground conditions. By using the FEA results and data, an initial optimized layout configuration is attained by exploiting nonparametric topology optimization in TOSCA software. Topological optimized surfaces are then smoothened and imported in ANSYS to develop the geometrical features. Nodes at the critical locations of resulting voids are selected for sketching rough profiles. Rough profiles are further refined and CAD feasible geometric features are generated. The modified model is then analyzed under the same loadings and constraints as defined for topology optimization. Shape at the peak stress concentration areas are further optimized by exploiting the shape optimization in TOSCA.shape module. The harmonized stressed model with the modified surfaces is then imported in CATIA to develop the final design.

Keywords: Structural optimization, Topology optimization, Shape optimization, Tail fin

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2789