Search results for: Data preparation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7632

Search results for: Data preparation

7422 A Prediction of Attractive Evaluation Objects Based On Complex Sequential Data

Authors: Shigeaki Sakurai, Makino Kyoko, Shigeru Matsumoto

Abstract:

This paper proposes a method that predicts attractive evaluation objects. In the learning phase, the method inductively acquires trend rules from complex sequential data. The data is composed of two types of data. One is numerical sequential data. Each evaluation object has respective numerical sequential data. The other is text sequential data. Each evaluation object is described in texts. The trend rules represent changes of numerical values related to evaluation objects. In the prediction phase, the method applies new text sequential data to the trend rules and evaluates which evaluation objects are attractive. This paper verifies the effect of the proposed method by using stock price sequences and news headline sequences. In these sequences, each stock brand corresponds to an evaluation object. This paper discusses validity of predicted attractive evaluation objects, the process time of each phase, and the possibility of application tasks.

Keywords: Trend rule, frequent pattern, numerical sequential data, text sequential data, evaluation object.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1190
7421 Hydrogen Production by Gasification of Biomass from Copoazu Waste

Authors: Emilio Delgado, William Aperador, Alis Pataquiva

Abstract:

Biomass is becoming a large renewable resource for power generation; it is involved in higher frequency in environmentally clean processes, and even it is used for biofuels preparation. On the other hand, hydrogen – other energy source – can be produced in a variety of methods including gasification of biomass. In this study, the production of hydrogen by gasification of biomass waste is examined. This work explores the production of a gaseous mixture with high power potential from Amazonas´ specie known as copoazu, using a counter-flow fixed-bed bioreactor.

Keywords: Copoazu, Gasification, Hydrogen production.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1717
7420 A Comparative Study of Fine Grained Security Techniques Based on Data Accessibility and Inference

Authors: Azhar Rauf, Sareer Badshah, Shah Khusro

Abstract:

This paper analyzes different techniques of the fine grained security of relational databases for the two variables-data accessibility and inference. Data accessibility measures the amount of data available to the users after applying a security technique on a table. Inference is the proportion of information leakage after suppressing a cell containing secret data. A row containing a secret cell which is suppressed can become a security threat if an intruder generates useful information from the related visible information of the same row. This paper measures data accessibility and inference associated with row, cell, and column level security techniques. Cell level security offers greatest data accessibility as it suppresses secret data only. But on the other hand, there is a high probability of inference in cell level security. Row and column level security techniques have least data accessibility and inference. This paper introduces cell plus innocent security technique that utilizes the cell level security method but suppresses some innocent data to dodge an intruder that a suppressed cell may not necessarily contain secret data. Four variations of the technique namely cell plus innocent 1/4, cell plus innocent 2/4, cell plus innocent 3/4, and cell plus innocent 4/4 respectively have been introduced to suppress innocent data equal to 1/4, 2/4, 3/4, and 4/4 percent of the true secret data inside the database. Results show that the new technique offers better control over data accessibility and inference as compared to the state-of-theart security techniques. This paper further discusses the combination of techniques together to be used. The paper shows that cell plus innocent 1/4, 2/4, and 3/4 techniques can be used as a replacement for the cell level security.

Keywords: Fine Grained Security, Data Accessibility, Inference, Row, Cell, Column Level Security.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1430
7419 Weka Based Desktop Data Mining as Web Service

Authors: Sujala.D.Shetty, S.Vadivel, Sakshi Vaghella

Abstract:

Data mining is the process of sifting through large volumes of data, analyzing data from different perspectives and summarizing it into useful information. One of the widely used desktop applications for data mining is the Weka tool which is nothing but a collection of machine learning algorithms implemented in Java and open sourced under the General Public License (GPL). A web service is a software system designed to support interoperable machine to machine interaction over a network using SOAP messages. Unlike a desktop application, a web service is easy to upgrade, deliver and access and does not occupy any memory on the system. Keeping in mind the advantages of a web service over a desktop application, in this paper we are demonstrating how this Java based desktop data mining application can be implemented as a web service to support data mining across the internet.

Keywords: desktop application, Weka mining, web service

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4033
7418 Feasibility Investigation of Near Infrared Spectrometry for Particle Size Estimation of Nano Structures

Authors: A. Bagheri Garmarudi, M. Khanmohammadi, N. Khoddami, K. Shabani

Abstract:

Determination of nano particle size is substantial since the nano particle size exerts a significant effect on various properties of nano materials. Accordingly, proposing non-destructive, accurate and rapid techniques for this aim is of high interest. There are some conventional techniques to investigate the morphology and grain size of nano particles such as scanning electron microscopy (SEM), atomic force microscopy (AFM) and X-ray diffractometry (XRD). Vibrational spectroscopy is utilized to characterize different compounds and applied for evaluation of the average particle size based on relationship between particle size and near infrared spectra [1,4] , but it has never been applied in quantitative morphological analysis of nano materials. So far, the potential application of nearinfrared (NIR) spectroscopy with its ability in rapid analysis of powdered materials with minimal sample preparation, has been suggested for particle size determination of powdered pharmaceuticals. The relationship between particle size and diffuse reflectance (DR) spectra in near infrared region has been applied to introduce a method for estimation of particle size. Back propagation artificial neural network (BP-ANN) as a nonlinear model was applied to estimate average particle size based on near infrared diffuse reflectance spectra. Thirty five different nano TiO2 samples with different particle size were analyzed by DR-FTNIR spectrometry and the obtained data were processed by BP- ANN.

Keywords: near infrared, particle size, chemometrics, neuralnetwork, nano structure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1796
7417 Influence of Parameters of Modeling and Data Distribution for Optimal Condition on Locally Weighted Projection Regression Method

Authors: Farhad Asadi, Mohammad Javad Mollakazemi, Aref Ghafouri

Abstract:

Recent research in neural networks science and neuroscience for modeling complex time series data and statistical learning has focused mostly on learning from high input space and signals. Local linear models are a strong choice for modeling local nonlinearity in data series. Locally weighted projection regression is a flexible and powerful algorithm for nonlinear approximation in high dimensional signal spaces. In this paper, different learning scenario of one and two dimensional data series with different distributions are investigated for simulation and further noise is inputted to data distribution for making different disordered distribution in time series data and for evaluation of algorithm in locality prediction of nonlinearity. Then, the performance of this algorithm is simulated and also when the distribution of data is high or when the number of data is less the sensitivity of this approach to data distribution and influence of important parameter of local validity in this algorithm with different data distribution is explained.

Keywords: Local nonlinear estimation, LWPR algorithm, Online training method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1559
7416 Noise Reduction in Web Data: A Learning Approach Based on Dynamic User Interests

Authors: Julius Onyancha, Valentina Plekhanova

Abstract:

One of the significant issues facing web users is the amount of noise in web data which hinders the process of finding useful information in relation to their dynamic interests. Current research works consider noise as any data that does not form part of the main web page and propose noise web data reduction tools which mainly focus on eliminating noise in relation to the content and layout of web data. This paper argues that not all data that form part of the main web page is of a user interest and not all noise data is actually noise to a given user. Therefore, learning of noise web data allocated to the user requests ensures not only reduction of noisiness level in a web user profile, but also a decrease in the loss of useful information hence improves the quality of a web user profile. Noise Web Data Learning (NWDL) tool/algorithm capable of learning noise web data in web user profile is proposed. The proposed work considers elimination of noise data in relation to dynamic user interest. In order to validate the performance of the proposed work, an experimental design setup is presented. The results obtained are compared with the current algorithms applied in noise web data reduction process. The experimental results show that the proposed work considers the dynamic change of user interest prior to elimination of noise data. The proposed work contributes towards improving the quality of a web user profile by reducing the amount of useful information eliminated as noise.

Keywords: Web log data, web user profile, user interest, noise web data learning, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1683
7415 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning

Authors: Walid Cherif

Abstract:

Data mining has, over recent years, seen big advances because of the spread of internet, which generates everyday a tremendous volume of data, and also the immense advances in technologies which facilitate the analysis of these data. In particular, classification techniques are a subdomain of Data Mining which determines in which group each data instance is related within a given dataset. It is used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or machine learning. Each type of these techniques has its own limits. Nowadays, current data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach which is different from actual algorithms by its reliability computations. Results of the proposed approach exceeded most common classification techniques with an f-measure exceeding 97% on the IRIS Dataset.

Keywords: Data mining, knowledge discovery, machine learning, similarity measurement, supervised classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1478
7414 Application of Stabilized Polyaniline Microparticles for Better Protective Ability of Zinc Coatings

Authors: N. Boshkova, K. Kamburova, N. Tabakova, N. Boshkov, Ts. Radeva

Abstract:

Coatings based on polyaniline (PANI) can improve the resistance of steel against corrosion. In this work, the preparation of stable suspensions of colloidal PANI-SiO2 particles, suitable for obtaining of composite anticorrosive coating on steel, is described. Electrokinetic data as a function of pH are presented, showing that the zeta potentials of the PANI-SiO2 particles are governed primarily by the charged groups at the silica oxide surface. Electrosteric stabilization of the PANI-SiO2 particles’ suspension against aggregation is realized at pH>5.5 (EB form of PANI) by adsorption of positively charged polyelectrolyte molecules onto negatively charged PANI-SiO2 particles. The PANI-SiO2 particles are incorporated by electrodeposition into the metal matrix of zinc in order to obtain composite (hybrid) coatings. The latter are aimed to ensure sacrificial protection of steel mainly in aggressive media leading to local corrosion damages. The surface morphology of the composite zinc coatings is investigated with SEM. The influence of PANI-SiO2 particles on the cathodic and anodic processes occurring in the starting electrolyte for obtaining of the coatings is followed with cyclic voltammetry. The electrochemical and corrosion behavior is evaluated with potentiodynamic polarization curves and polarization resistance measurements. The beneficial effect of the stabilized PANI-SiO2 particles for the increased protective ability of the composites is commented and discussed.

Keywords: Corrosion, polyaniline particles, zinc, protective ability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 768
7413 Moving Data Mining Tools toward a Business Intelligence System

Authors: Nittaya Kerdprasop, Kittisak Kerdprasop

Abstract:

Data mining (DM) is the process of finding and extracting frequent patterns that can describe the data, or predict unknown or future values. These goals are achieved by using various learning algorithms. Each algorithm may produce a mining result completely different from the others. Some algorithms may find millions of patterns. It is thus the difficult job for data analysts to select appropriate models and interpret the discovered knowledge. In this paper, we describe a framework of an intelligent and complete data mining system called SUT-Miner. Our system is comprised of a full complement of major DM algorithms, pre-DM and post-DM functionalities. It is the post-DM packages that ease the DM deployment for business intelligence applications.

Keywords: Business intelligence, data mining, functionalprogramming, intelligent system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1685
7412 Analysis of Diverse Clustering Tools in Data Mining

Authors: S. Sarumathi, N. Shanthi, M. Sharmila

Abstract:

Clustering in data mining is an unsupervised learning technique of aggregating the data objects into meaningful groups such that the intra cluster similarity of objects are maximized and inter cluster similarity of objects are minimized. Over the past decades several clustering tools were emerged in which clustering algorithms are inbuilt and are easier to use and extract the expected results. Data mining mainly deals with the huge databases that inflicts on cluster analysis and additional rigorous computational constraints. These challenges pave the way for the emergence of powerful expansive data mining clustering softwares. In this survey, a variety of clustering tools used in data mining are elucidated along with the pros and cons of each software.

Keywords: Cluster Analysis, Clustering Algorithms, Clustering Techniques, Association, Visualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2150
7411 Preparation and in vivo Assessment of Nystatin-Loaded Solid Lipid Nanoparticles for Topical Delivery against Cutaneous Candidiasis

Authors: Rawia M. Khalil, Ahmed A. Abd El Rahman, Mahfouz A. Kassem, Mohamed S. El Ridi, Mona M. Abou Samra, Ghada E. A. Awad, Soheir S. Mansy

Abstract:

Solid lipid nanoparticles (SLNs) have gained great attention for the topical treatment of skin associated fungal infection as they facilitate the skin penetration of loaded drugs. Our work deals with the preparation of nystatin loaded solid lipid nanoparticles (NystSLNs) using the hot homogenization and ultrasonication method. The prepared NystSLNs were characterized in terms of entrapment efficiency, particle size, zeta potential, transmission electron microscopy, differential scanning calorimetry, rheological behavior and in vitro drug release. A stability study for 6 months was performed. A microbiological study was conducted in male rats infected with Candida albicans, by counting the colonies and examining the histopathological changes induced on the skin of infected rats. The results showed that SLNs dispersions are spherical in shape with particle size ranging from 83.26±11.33 to 955.04±1.09 nm. The entrapment efficiencies are ranging from 19.73±1.21 to 72.46±0.66% with zeta potential ranging from -18.9 to -38.8 mV and shear-thinning rheological Behavior. The stability studies done for 6 months showed that nystatin (Nyst) is a good candidate for topical SLN formulations. A least number of colony forming unit/ ml (cfu/ml) was recorded for the selected NystSLN compared to the drug solution and the commercial Nystatin® cream present in the market. It can be fulfilled from this work that SLNs provide a good skin targeting effect and may represent promising carrier for topical delivery of Nyst offering the sustained release and maintaining the localized effect, resulting in an effective treatment of cutaneous fungal infection.

Keywords: Candida infections, Hot homogenization, Nystatin, Solid lipid nanoparticles, Stability, Topical delivery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2798
7410 A Monte Carlo Method to Data Stream Analysis

Authors: Kittisak Kerdprasop, Nittaya Kerdprasop, Pairote Sattayatham

Abstract:

Data stream analysis is the process of computing various summaries and derived values from large amounts of data which are continuously generated at a rapid rate. The nature of a stream does not allow a revisit on each data element. Furthermore, data processing must be fast to produce timely analysis results. These requirements impose constraints on the design of the algorithms to balance correctness against timely responses. Several techniques have been proposed over the past few years to address these challenges. These techniques can be categorized as either dataoriented or task-oriented. The data-oriented approach analyzes a subset of data or a smaller transformed representation, whereas taskoriented scheme solves the problem directly via approximation techniques. We propose a hybrid approach to tackle the data stream analysis problem. The data stream has been both statistically transformed to a smaller size and computationally approximated its characteristics. We adopt a Monte Carlo method in the approximation step. The data reduction has been performed horizontally and vertically through our EMR sampling method. The proposed method is analyzed by a series of experiments. We apply our algorithm on clustering and classification tasks to evaluate the utility of our approach.

Keywords: Data Stream, Monte Carlo, Sampling, DensityEstimation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1382
7409 Determination of Vitamin C (Ascorbic Acid) in Orange Juices Product

Authors: Wanida Wonsawat

Abstract:

This research describes a voltammetric approach to determine amounts of vitamin C (Ascorbic acid) in orange juice sample, using three screen printed electrode. The anodic currents of vitamin C were proportional to vitamin C concentration in the range of 0 – 10.0 mM with the limit of detection of 1.36 mM. The method was successfully employed with 2 µL of the working solution dropped on the electrode surface. The proposed method was applied for the analysis of vitamin C in packed orange juice without sample purification or complexion of sample preparation step.

Keywords: Ascorbic acid, Vitamin C, Juice, Voltammetry

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 8398
7408 Improved Data Warehousing: Lessons Learnt from the Systems Approach

Authors: Roelien Goede

Abstract:

Data warehousing success is not high enough. User dissatisfaction and failure to adhere to time frames and budgets are too common. Most traditional information systems practices are rooted in hard systems thinking. Today, the great systems thinkers are forgotten by information systems developers. A data warehouse is still a system and it is worth investigating whether systems thinkers such as Churchman can enhance our practices today. This paper investigates data warehouse development practices from a systems thinking perspective. An empirical investigation is done in order to understand the everyday practices of data warehousing professionals from a systems perspective. The paper presents a model for the application of Churchman-s systems approach in data warehouse development.

Keywords: Data warehouse development, Information systemsdevelopment, Interpretive case study, Systems thinking

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1536
7407 Centralized Resource Management for Network Infrastructure Including Ip Telephony by Integrating a Mediator Between the Heterogeneous Data Sources

Authors: Mohammed Fethi Khalfi, Malika Kandouci

Abstract:

Over the past decade, mobile has experienced a revolution that will ultimately change the way we communicate.All these technologies have a common denominator exploitation of computer information systems, but their operation can be tedious because of problems with heterogeneous data sources.To overcome the problems of heterogeneous data sources, we propose to use a technique of adding an extra layer interfacing applications of management or supervision at the different data sources.This layer will be materialized by the implementation of a mediator between different host applications and information systems frequently used hierarchical and relational manner such that the heterogeneity is completely transparent to the VoIP platform.

Keywords: TOIP, Data Integration, Mediation, informationcomputer system, heterogeneous data sources

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1289
7406 Secure Multiparty Computations for Privacy Preserving Classifiers

Authors: M. Sumana, K. S. Hareesha

Abstract:

Secure computations are essential while performing privacy preserving data mining. Distributed privacy preserving data mining involve two to more sites that cannot pool in their data to a third party due to the violation of law regarding the individual. Hence in order to model the private data without compromising privacy and information loss, secure multiparty computations are used. Secure computations of product, mean, variance, dot product, sigmoid function using the additive and multiplicative homomorphic property is discussed. The computations are performed on vertically partitioned data with a single site holding the class value.

Keywords: Homomorphic property, secure product, secure mean and variance, secure dot product, vertically partitioned data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 869
7405 Design of Buffer Management for Industry to Avoid Sensor Data- Conflicts

Authors: Dae-ho Won, Jong-wook Hong, Yeon-Mo Yang, Jinung An

Abstract:

To reduce accidents in the industry, WSNs(Wireless Sensor networks)- sensor data is used. WSNs- sensor data has the persistence and continuity. therefore, we design and exploit the buffer management system that has the persistence and continuity to avoid and delivery data conflicts. To develop modules, we use the multi buffers and design the buffer management modules that transfer sensor data through the context-aware methods.

Keywords: safe management system, buffer management, context-aware, input data stream

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1511
7404 Security in Resource Constraints Network Light Weight Encryption for Z-MAC

Authors: Mona Almansoori, Ahmed Mustafa, Ahmad Elshamy

Abstract:

Wireless sensor network was formed by a combination of nodes, systematically it transmitting the data to their base stations, this transmission data can be easily compromised if the limited processing power and the data consistency from these nodes are kept in mind; there is always a discussion to address the secure data transfer or transmission in actual time. This will present a mechanism to securely transmit the data over a chain of sensor nodes without compromising the throughput of the network by utilizing available battery resources available in the sensor node. Our methodology takes many different advantages of Z-MAC protocol for its efficiency, and it provides a unique key by sharing the mechanism using neighbor node MAC address. We present a light weighted data integrity layer which is embedded in the Z-MAC protocol to prove that our protocol performs well than Z-MAC when we introduce the different attack scenarios.

Keywords: Hybrid MAC protocol, data integrity, lightweight encryption, Neighbor based key sharing, Sensor node data processing, Z-MAC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 500
7403 Data Recording for Remote Monitoring of Autonomous Vehicles

Authors: Rong-Terng Juang

Abstract:

Autonomous vehicles offer the possibility of significant benefits to social welfare. However, fully automated cars might not be going to happen in the near further. To speed the adoption of the self-driving technologies, many governments worldwide are passing laws requiring data recorders for the testing of autonomous vehicles. Currently, the self-driving vehicle, (e.g., shuttle bus) has to be monitored from a remote control center. When an autonomous vehicle encounters an unexpected driving environment, such as road construction or an obstruction, it should request assistance from a remote operator. Nevertheless, large amounts of data, including images, radar and lidar data, etc., have to be transmitted from the vehicle to the remote center. Therefore, this paper proposes a data compression method of in-vehicle networks for remote monitoring of autonomous vehicles. Firstly, the time-series data are rearranged into a multi-dimensional signal space. Upon the arrival, for controller area networks (CAN), the new data are mapped onto a time-data two-dimensional space associated with the specific CAN identity. Secondly, the data are sampled based on differential sampling. Finally, the whole set of data are encoded using existing algorithms such as Huffman, arithmetic and codebook encoding methods. To evaluate system performance, the proposed method was deployed on an in-house built autonomous vehicle. The testing results show that the amount of data can be reduced as much as 1/7 compared to the raw data.

Keywords: Autonomous vehicle, data recording, remote monitoring, controller area network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1286
7402 The Practice of Teaching Chemistry by the Application of Online Tests

Authors: Nikolina Ribarić

Abstract:

E-learning is most commonly defined as a set of applications and processes, such as Web-based learning, computer-based learning, virtual classrooms and digital collaboration, that enable access to instructional content through a variety of electronic media. The main goal of an e-learning system is learning, and the way to evaluate the impact of an e-learning system is by examining whether students learn effectively with the help of that system. Testmoz is a program for online preparation of knowledge evaluation assignments. The program provides teachers with computer support during the design of assignments and evaluating them. Students can review and solve assignments and also check the correctness of their solutions. Research into the increase of motivation by the practice of providing teaching content by applying online tests prepared in the Testmoz program, was carried out with students of the 8th grade of Ljubo Babić Primary School in Jastrebarsko. The students took the tests in their free time, from home, for an unlimited number of times. SPSS was used to process the data obtained by the research instruments. The results of the research showed that students preferred to practice teaching content, and achieved better educational results in chemistry, when they had access to online tests for repetition and practicing in relation to subject content which was checked after repetition and practicing in "the classical way" – i.e., solving assignments in a workbook or writing assignments in worksheets.

Keywords: Chemistry class, e-learning, online test, Testmoz.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 468
7401 Cloud Computing Databases: Latest Trends and Architectural Concepts

Authors: Tarandeep Singh, Parvinder S. Sandhu

Abstract:

The Economic factors are leading to the rise of infrastructures provides software and computing facilities as a service, known as cloud services or cloud computing. Cloud services can provide efficiencies for application providers, both by limiting up-front capital expenses, and by reducing the cost of ownership over time. Such services are made available in a data center, using shared commodity hardware for computation and storage. There is a varied set of cloud services available today, including application services (salesforce.com), storage services (Amazon S3), compute services (Google App Engine, Amazon EC2) and data services (Amazon SimpleDB, Microsoft SQL Server Data Services, Google-s Data store). These services represent a variety of reformations of data management architectures, and more are on the horizon.

Keywords: Data Management in Cloud, AWS, EC2, S3, SQS, TQG.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1932
7400 Data Annotation Models and Annotation Query Language

Authors: Neerja Bhatnagar, Benjoe A. Juliano, Renee S. Renner

Abstract:

This paper presents data annotation models at five levels of granularity (database, relation, column, tuple, and cell) of relational data to address the problem of unsuitability of most relational databases to express annotations. These models do not require any structural and schematic changes to the underlying database. These models are also flexible, extensible, customizable, database-neutral, and platform-independent. This paper also presents an SQL-like query language, named Annotation Query Language (AnQL), to query annotation documents. AnQL is simple to understand and exploits the already-existent wide knowledge and skill set of SQL.

Keywords: annotation query language, data annotations, data annotation models, semantic data annotations

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2313
7399 Use XML Format like a Model of Data Backup

Authors: Souleymane Oumtanaga, Kadjo Tanon Lambert, Koné Tiémoman, Tety Pierre, Dowa N’sreke Florent

Abstract:

Nowadays data backup format doesn-t cease to appear raising so the anxiety on their accessibility and their perpetuity. XML is one of the most promising formats to guarantee the integrity of data. This article suggests while showing one thing man can do with XML. Indeed XML will help to create a data backup model. The main task will consist in defining an application in JAVA able to convert information of a database in XML format and restore them later.

Keywords: Backup, Proprietary format, parser, syntactic tree.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1691
7398 REDUCER – An Architectural Design Pattern for Reducing Large and Noisy Data Sets

Authors: Apkar Salatian

Abstract:

To relieve the burden of reasoning on a point to point basis, in many domains there is a need to reduce large and noisy data sets into trends for qualitative reasoning. In this paper we propose and describe a new architectural design pattern called REDUCER for reducing large and noisy data sets that can be tailored for particular situations. REDUCER consists of 2 consecutive processes: Filter which takes the original data and removes outliers, inconsistencies or noise; and Compression which takes the filtered data and derives trends in the data. In this seminal article we also show how REDUCER has successfully been applied to 3 different case studies.

Keywords: Design Pattern, filtering, compression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1450
7397 Comparing Emotion Recognition from Voice and Facial Data Using Time Invariant Features

Authors: Vesna Kirandziska, Nevena Ackovska, Ana Madevska Bogdanova

Abstract:

The problem of emotion recognition is a challenging problem. It is still an open problem from the aspect of both intelligent systems and psychology. In this paper, both voice features and facial features are used for building an emotion recognition system. A Support Vector Machine classifiers are built by using raw data from video recordings. In this paper, the results obtained for the emotion recognition are given, and a discussion about the validity and the expressiveness of different emotions is presented. A comparison between the classifiers build from facial data only, voice data only and from the combination of both data is made here. The need for a better combination of the information from facial expression and voice data is argued.

Keywords: Emotion recognition, facial recognition, signal processing, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1956
7396 Analysis of Data Gathering Schemes for Layered Sensor Networks with Multihop Polling

Authors: Bhed Bahadur Bista, Danda B. Rawat

Abstract:

In this paper, we investigate multihop polling and data gathering schemes in layered sensor networks in order to extend the life time of the networks. A network consists of three layers. The lowest layer contains sensors. The middle layer contains so called super nodes with higher computational power, energy supply and longer transmission range than sensor nodes. The top layer contains a sink node. A node in each layer controls a number of nodes in lower layer by polling mechanism to gather data. We will present four types of data gathering schemes: intermediate nodes do not queue data packet, queue single packet, queue multiple packets and aggregate data, to see which data gathering scheme is more energy efficient for multihop polling in layered sensor networks.

Keywords: layered sensor network, polling, data gatheringschemes.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1522
7395 Incremental Algorithm to Cluster the Categorical Data with Frequency Based Similarity Measure

Authors: S.Aranganayagi, K.Thangavel

Abstract:

Clustering categorical data is more complicated than the numerical clustering because of its special properties. Scalability and memory constraint is the challenging problem in clustering large data set. This paper presents an incremental algorithm to cluster the categorical data. Frequencies of attribute values contribute much in clustering similar categorical objects. In this paper we propose new similarity measures based on the frequencies of attribute values and its cardinalities. The proposed measures and the algorithm are experimented with the data sets from UCI data repository. Results prove that the proposed method generates better clusters than the existing one.

Keywords: Clustering, Categorical, Incremental, Frequency, Domain

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1783
7394 Multimedia Data Fusion for Event Detection in Twitter by Using Dempster-Shafer Evidence Theory

Authors: Samar M. Alqhtani, Suhuai Luo, Brian Regan

Abstract:

Data fusion technology can be the best way to extract useful information from multiple sources of data. It has been widely applied in various applications. This paper presents a data fusion approach in multimedia data for event detection in twitter by using Dempster-Shafer evidence theory. The methodology applies a mining algorithm to detect the event. There are two types of data in the fusion. The first is features extracted from text by using the bag-ofwords method which is calculated using the term frequency-inverse document frequency (TF-IDF). The second is the visual features extracted by applying scale-invariant feature transform (SIFT). The Dempster - Shafer theory of evidence is applied in order to fuse the information from these two sources. Our experiments have indicated that comparing to the approaches using individual data source, the proposed data fusion approach can increase the prediction accuracy for event detection. The experimental result showed that the proposed method achieved a high accuracy of 0.97, comparing with 0.93 with texts only, and 0.86 with images only.

Keywords: Data fusion, Dempster-Shafer theory, data mining, event detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1752
7393 Model Order Reduction for Frequency Response and Effect of Order of Method for Matching Condition

Authors: Aref Ghafouri, Mohammad Javad Mollakazemi, Farhad Asadi

Abstract:

In this paper, model order reduction method is used for approximation in linear and nonlinearity aspects in some experimental data. This method can be used for obtaining offline reduced model for approximation of experimental data and can produce and follow the data and order of system and also it can match to experimental data in some frequency ratios. In this study, the method is compared in different experimental data and influence of choosing of order of the model reduction for obtaining the best and sufficient matching condition for following the data is investigated in format of imaginary and reality part of the frequency response curve and finally the effect and important parameter of number of order reduction in nonlinear experimental data is explained further.

Keywords: Frequency response, Order of model reduction, frequency matching condition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2013