Search results for: Data storage
7433 A Comparative Study of Fine Grained Security Techniques Based on Data Accessibility and Inference
Authors: Azhar Rauf, Sareer Badshah, Shah Khusro
Abstract:
This paper analyzes different techniques of the fine grained security of relational databases for the two variables-data accessibility and inference. Data accessibility measures the amount of data available to the users after applying a security technique on a table. Inference is the proportion of information leakage after suppressing a cell containing secret data. A row containing a secret cell which is suppressed can become a security threat if an intruder generates useful information from the related visible information of the same row. This paper measures data accessibility and inference associated with row, cell, and column level security techniques. Cell level security offers greatest data accessibility as it suppresses secret data only. But on the other hand, there is a high probability of inference in cell level security. Row and column level security techniques have least data accessibility and inference. This paper introduces cell plus innocent security technique that utilizes the cell level security method but suppresses some innocent data to dodge an intruder that a suppressed cell may not necessarily contain secret data. Four variations of the technique namely cell plus innocent 1/4, cell plus innocent 2/4, cell plus innocent 3/4, and cell plus innocent 4/4 respectively have been introduced to suppress innocent data equal to 1/4, 2/4, 3/4, and 4/4 percent of the true secret data inside the database. Results show that the new technique offers better control over data accessibility and inference as compared to the state-of-theart security techniques. This paper further discusses the combination of techniques together to be used. The paper shows that cell plus innocent 1/4, 2/4, and 3/4 techniques can be used as a replacement for the cell level security.
Keywords: Fine Grained Security, Data Accessibility, Inference, Row, Cell, Column Level Security.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14717432 Weka Based Desktop Data Mining as Web Service
Authors: Sujala.D.Shetty, S.Vadivel, Sakshi Vaghella
Abstract:
Data mining is the process of sifting through large volumes of data, analyzing data from different perspectives and summarizing it into useful information. One of the widely used desktop applications for data mining is the Weka tool which is nothing but a collection of machine learning algorithms implemented in Java and open sourced under the General Public License (GPL). A web service is a software system designed to support interoperable machine to machine interaction over a network using SOAP messages. Unlike a desktop application, a web service is easy to upgrade, deliver and access and does not occupy any memory on the system. Keeping in mind the advantages of a web service over a desktop application, in this paper we are demonstrating how this Java based desktop data mining application can be implemented as a web service to support data mining across the internet.Keywords: desktop application, Weka mining, web service
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 40817431 Influence of Parameters of Modeling and Data Distribution for Optimal Condition on Locally Weighted Projection Regression Method
Authors: Farhad Asadi, Mohammad Javad Mollakazemi, Aref Ghafouri
Abstract:
Recent research in neural networks science and neuroscience for modeling complex time series data and statistical learning has focused mostly on learning from high input space and signals. Local linear models are a strong choice for modeling local nonlinearity in data series. Locally weighted projection regression is a flexible and powerful algorithm for nonlinear approximation in high dimensional signal spaces. In this paper, different learning scenario of one and two dimensional data series with different distributions are investigated for simulation and further noise is inputted to data distribution for making different disordered distribution in time series data and for evaluation of algorithm in locality prediction of nonlinearity. Then, the performance of this algorithm is simulated and also when the distribution of data is high or when the number of data is less the sensitivity of this approach to data distribution and influence of important parameter of local validity in this algorithm with different data distribution is explained.
Keywords: Local nonlinear estimation, LWPR algorithm, Online training method.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16017430 Noise Reduction in Web Data: A Learning Approach Based on Dynamic User Interests
Authors: Julius Onyancha, Valentina Plekhanova
Abstract:
One of the significant issues facing web users is the amount of noise in web data which hinders the process of finding useful information in relation to their dynamic interests. Current research works consider noise as any data that does not form part of the main web page and propose noise web data reduction tools which mainly focus on eliminating noise in relation to the content and layout of web data. This paper argues that not all data that form part of the main web page is of a user interest and not all noise data is actually noise to a given user. Therefore, learning of noise web data allocated to the user requests ensures not only reduction of noisiness level in a web user profile, but also a decrease in the loss of useful information hence improves the quality of a web user profile. Noise Web Data Learning (NWDL) tool/algorithm capable of learning noise web data in web user profile is proposed. The proposed work considers elimination of noise data in relation to dynamic user interest. In order to validate the performance of the proposed work, an experimental design setup is presented. The results obtained are compared with the current algorithms applied in noise web data reduction process. The experimental results show that the proposed work considers the dynamic change of user interest prior to elimination of noise data. The proposed work contributes towards improving the quality of a web user profile by reducing the amount of useful information eliminated as noise.Keywords: Web log data, web user profile, user interest, noise web data learning, machine learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17347429 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning
Authors: Walid Cherif
Abstract:
Data mining has, over recent years, seen big advances because of the spread of internet, which generates everyday a tremendous volume of data, and also the immense advances in technologies which facilitate the analysis of these data. In particular, classification techniques are a subdomain of Data Mining which determines in which group each data instance is related within a given dataset. It is used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or machine learning. Each type of these techniques has its own limits. Nowadays, current data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach which is different from actual algorithms by its reliability computations. Results of the proposed approach exceeded most common classification techniques with an f-measure exceeding 97% on the IRIS Dataset.
Keywords: Data mining, knowledge discovery, machine learning, similarity measurement, supervised classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15277428 Proposal of Commutation Protocol in Hybrid Sensors and Vehicular Networks for Intelligent Transport Systems
Authors: Taha Bensiradj, Samira Moussaoui
Abstract:
Hybrid Sensors and Vehicular Networks (HSVN), represent a hybrid network, which uses several generations of Ad-Hoc networks. It is used especially in Intelligent Transport Systems (ITS). The HSVN allows making collaboration between the Wireless Sensors Network (WSN) deployed on the border of the road and the Vehicular Network (VANET). This collaboration is defined by messages exchanged between the two networks for the purpose to inform the drivers about the state of the road, provide road safety information and more information about traffic on the road. Moreover, this collaboration created by HSVN, also allows the use of a network and the advantage of improving another network. For example, the dissemination of information between the sensors quickly decreases its energy, and therefore, we can use vehicles that do not have energy constraint to disseminate the information between sensors. On the other hand, to solve the disconnection problem in VANET, the sensors can be used as gateways that allow sending the messages received by one vehicle to another. However, because of the short communication range of the sensor and its low capacity of storage and processing of data, it is difficult to ensure the exchange of road messages between it and the vehicle, which can be moving at high speed at the time of exchange. This represents the time where the vehicle is in communication range with the sensor. This work is the proposition of a communication protocol between the sensors and the vehicle used in HSVN. The latter has as the purpose to ensure the exchange of road messages in the available time of exchange.
Keywords: HSVN, ITS, VANET, WSN.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12337427 Moving Data Mining Tools toward a Business Intelligence System
Authors: Nittaya Kerdprasop, Kittisak Kerdprasop
Abstract:
Data mining (DM) is the process of finding and extracting frequent patterns that can describe the data, or predict unknown or future values. These goals are achieved by using various learning algorithms. Each algorithm may produce a mining result completely different from the others. Some algorithms may find millions of patterns. It is thus the difficult job for data analysts to select appropriate models and interpret the discovered knowledge. In this paper, we describe a framework of an intelligent and complete data mining system called SUT-Miner. Our system is comprised of a full complement of major DM algorithms, pre-DM and post-DM functionalities. It is the post-DM packages that ease the DM deployment for business intelligence applications.Keywords: Business intelligence, data mining, functionalprogramming, intelligent system.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17427426 Analysis of Diverse Clustering Tools in Data Mining
Authors: S. Sarumathi, N. Shanthi, M. Sharmila
Abstract:
Clustering in data mining is an unsupervised learning technique of aggregating the data objects into meaningful groups such that the intra cluster similarity of objects are maximized and inter cluster similarity of objects are minimized. Over the past decades several clustering tools were emerged in which clustering algorithms are inbuilt and are easier to use and extract the expected results. Data mining mainly deals with the huge databases that inflicts on cluster analysis and additional rigorous computational constraints. These challenges pave the way for the emergence of powerful expansive data mining clustering softwares. In this survey, a variety of clustering tools used in data mining are elucidated along with the pros and cons of each software.
Keywords: Cluster Analysis, Clustering Algorithms, Clustering Techniques, Association, Visualization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22017425 A Monte Carlo Method to Data Stream Analysis
Authors: Kittisak Kerdprasop, Nittaya Kerdprasop, Pairote Sattayatham
Abstract:
Data stream analysis is the process of computing various summaries and derived values from large amounts of data which are continuously generated at a rapid rate. The nature of a stream does not allow a revisit on each data element. Furthermore, data processing must be fast to produce timely analysis results. These requirements impose constraints on the design of the algorithms to balance correctness against timely responses. Several techniques have been proposed over the past few years to address these challenges. These techniques can be categorized as either dataoriented or task-oriented. The data-oriented approach analyzes a subset of data or a smaller transformed representation, whereas taskoriented scheme solves the problem directly via approximation techniques. We propose a hybrid approach to tackle the data stream analysis problem. The data stream has been both statistically transformed to a smaller size and computationally approximated its characteristics. We adopt a Monte Carlo method in the approximation step. The data reduction has been performed horizontally and vertically through our EMR sampling method. The proposed method is analyzed by a series of experiments. We apply our algorithm on clustering and classification tasks to evaluate the utility of our approach.Keywords: Data Stream, Monte Carlo, Sampling, DensityEstimation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14177424 Improved Data Warehousing: Lessons Learnt from the Systems Approach
Authors: Roelien Goede
Abstract:
Data warehousing success is not high enough. User dissatisfaction and failure to adhere to time frames and budgets are too common. Most traditional information systems practices are rooted in hard systems thinking. Today, the great systems thinkers are forgotten by information systems developers. A data warehouse is still a system and it is worth investigating whether systems thinkers such as Churchman can enhance our practices today. This paper investigates data warehouse development practices from a systems thinking perspective. An empirical investigation is done in order to understand the everyday practices of data warehousing professionals from a systems perspective. The paper presents a model for the application of Churchman-s systems approach in data warehouse development.Keywords: Data warehouse development, Information systemsdevelopment, Interpretive case study, Systems thinking
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15967423 Centralized Resource Management for Network Infrastructure Including Ip Telephony by Integrating a Mediator Between the Heterogeneous Data Sources
Authors: Mohammed Fethi Khalfi, Malika Kandouci
Abstract:
Over the past decade, mobile has experienced a revolution that will ultimately change the way we communicate.All these technologies have a common denominator exploitation of computer information systems, but their operation can be tedious because of problems with heterogeneous data sources.To overcome the problems of heterogeneous data sources, we propose to use a technique of adding an extra layer interfacing applications of management or supervision at the different data sources.This layer will be materialized by the implementation of a mediator between different host applications and information systems frequently used hierarchical and relational manner such that the heterogeneity is completely transparent to the VoIP platform.Keywords: TOIP, Data Integration, Mediation, informationcomputer system, heterogeneous data sources
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13327422 Secure Multiparty Computations for Privacy Preserving Classifiers
Authors: M. Sumana, K. S. Hareesha
Abstract:
Secure computations are essential while performing privacy preserving data mining. Distributed privacy preserving data mining involve two to more sites that cannot pool in their data to a third party due to the violation of law regarding the individual. Hence in order to model the private data without compromising privacy and information loss, secure multiparty computations are used. Secure computations of product, mean, variance, dot product, sigmoid function using the additive and multiplicative homomorphic property is discussed. The computations are performed on vertically partitioned data with a single site holding the class value.Keywords: Homomorphic property, secure product, secure mean and variance, secure dot product, vertically partitioned data.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9197421 Design of Buffer Management for Industry to Avoid Sensor Data- Conflicts
Authors: Dae-ho Won, Jong-wook Hong, Yeon-Mo Yang, Jinung An
Abstract:
To reduce accidents in the industry, WSNs(Wireless Sensor networks)- sensor data is used. WSNs- sensor data has the persistence and continuity. therefore, we design and exploit the buffer management system that has the persistence and continuity to avoid and delivery data conflicts. To develop modules, we use the multi buffers and design the buffer management modules that transfer sensor data through the context-aware methods.Keywords: safe management system, buffer management, context-aware, input data stream
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15547420 Security in Resource Constraints Network Light Weight Encryption for Z-MAC
Authors: Mona Almansoori, Ahmed Mustafa, Ahmad Elshamy
Abstract:
Wireless sensor network was formed by a combination of nodes, systematically it transmitting the data to their base stations, this transmission data can be easily compromised if the limited processing power and the data consistency from these nodes are kept in mind; there is always a discussion to address the secure data transfer or transmission in actual time. This will present a mechanism to securely transmit the data over a chain of sensor nodes without compromising the throughput of the network by utilizing available battery resources available in the sensor node. Our methodology takes many different advantages of Z-MAC protocol for its efficiency, and it provides a unique key by sharing the mechanism using neighbor node MAC address. We present a light weighted data integrity layer which is embedded in the Z-MAC protocol to prove that our protocol performs well than Z-MAC when we introduce the different attack scenarios.
Keywords: Hybrid MAC protocol, data integrity, lightweight encryption, Neighbor based key sharing, Sensor node data processing, Z-MAC.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5647419 Effects of Grape Seed Oil on Postharvest Life and Quality of Some Grape Cultivars
Authors: Zeki Kara, Kevser Yazar
Abstract:
Table grapes (Vitis vinifera L.) are an important crop worldwide. Postharvest problems like berry shattering, decay and stem dehydration are some of the important factors that limit the marketing of table grapes. Edible coatings are an alternative for increasing shelf-life of fruits, protecting fruits from humidity and oxygen effects, thus retarding their deterioration. This study aimed to compare different grape seed oil applications (GSO, 0.5 g L-1, 1 g L-1, 2 g L-1) and SO2 generating pads effects (SO2-1, SO2-2). Treated grapes with GSO and generating pads were packaged into polyethylene trays and stored at 0 ± 1°C and 85-95% moisture. Effects of the applications were investigated by some quality and sensory evaluations with intervals of 15 days. SO2 applications were determined the most effective treatments for minimizing weight loss and changes in TA, pH, color and appearance value. Grape seed oil applications were determined as a good alternative for grape preservation, improving weight losses and °Brix, TA, the color values and sensory analysis. Commercially, ‘Alphonse Lavallée’ clusters were stored for 75 days and ‘Antep Karası’ clusters for 60 days. The data obtained from GSO indicated that it had a similar quality result to SO2 for up to 40 days storage.
Keywords: Postharvest, quality, sensory analyses, Vitis vinifera L.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 8377418 Data Recording for Remote Monitoring of Autonomous Vehicles
Authors: Rong-Terng Juang
Abstract:
Autonomous vehicles offer the possibility of significant benefits to social welfare. However, fully automated cars might not be going to happen in the near further. To speed the adoption of the self-driving technologies, many governments worldwide are passing laws requiring data recorders for the testing of autonomous vehicles. Currently, the self-driving vehicle, (e.g., shuttle bus) has to be monitored from a remote control center. When an autonomous vehicle encounters an unexpected driving environment, such as road construction or an obstruction, it should request assistance from a remote operator. Nevertheless, large amounts of data, including images, radar and lidar data, etc., have to be transmitted from the vehicle to the remote center. Therefore, this paper proposes a data compression method of in-vehicle networks for remote monitoring of autonomous vehicles. Firstly, the time-series data are rearranged into a multi-dimensional signal space. Upon the arrival, for controller area networks (CAN), the new data are mapped onto a time-data two-dimensional space associated with the specific CAN identity. Secondly, the data are sampled based on differential sampling. Finally, the whole set of data are encoded using existing algorithms such as Huffman, arithmetic and codebook encoding methods. To evaluate system performance, the proposed method was deployed on an in-house built autonomous vehicle. The testing results show that the amount of data can be reduced as much as 1/7 compared to the raw data.
Keywords: Autonomous vehicle, data recording, remote monitoring, controller area network.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13517417 Shelf Life Extension of Milk Pomade Sweet – Sherbet with Crunchy Peanut Chips by MAP in Various Packaging Materials
Authors: Eva Vorma, Sandra Muizniece-Brasava, Lija Dukalska, Janis Skalbe
Abstract:
The objective of the research was to evaluate the hardness stability of milk pomade sweets packed in several packaging materials (OPP, Multibarrier 60 HFP, BIALON 65 HFP, BIALON 50 HFP, ECOLEAN) by several packaging technologies – modified atmosphere (MAP) (consisting of 30% CO2+70% N2; 30% N2+70% CO2 and 100% CO2) and control – in air ambiance. Samples were stored at the room temperature +21±1 °C. The studies of the samples were carried out before packaging and after 2, 4, 6, 8, and 10 storage weeks.Keywords: packaging, shelf life, sherbet with crunchy peanutchips, hardness.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17497416 Open Cloud Computing with Fault Tolerance
Authors: K. Zuva, T. Zuva, K. O. M. Mapoka
Abstract:
Cloud Computing (CC) has become one of the most talked about emerging technologies that provides powerful computing and large storage environments through the use of the Internet. Cloud computing provides different dynamically scalable computing resources as a service. It brings economic benefits to individuals and businesses that adopt the technology. In theory adoption of cloud computing reduces capital and operational expenditure on information technology. For this to be a reality there is need to solve some challenges and at the same time addressing concerns that consumers have about cloud computing. This paper looks at Cloud Computing in general then highlights the challenges of Cloud Computing and finally suggests solutions to some of the challenges.Keywords: Cloud Computing, SaaS, PaaS, IaaS, Internet.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23687415 String Searching in Dispersed Files using MDS Convolutional Codes
Authors: A. S. Poornima, R. Aparna, B. B. Amberker, Prashant Koulgi
Abstract:
In this paper, we propose use of convolutional codes for file dispersal. The proposed method is comparable in complexity to the information Dispersal Algorithm proposed by M.Rabin and for particular choices of (non-binary) convolutional codes, is almost as efficient as that algorithm in terms of controlling expansion in the total storage. Further, our proposed dispersal method allows string search.Keywords: Convolutional codes, File dispersal, Filereconstruction, Information Dispersal Algorithm, String search.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12797414 Data Annotation Models and Annotation Query Language
Authors: Neerja Bhatnagar, Benjoe A. Juliano, Renee S. Renner
Abstract:
This paper presents data annotation models at five levels of granularity (database, relation, column, tuple, and cell) of relational data to address the problem of unsuitability of most relational databases to express annotations. These models do not require any structural and schematic changes to the underlying database. These models are also flexible, extensible, customizable, database-neutral, and platform-independent. This paper also presents an SQL-like query language, named Annotation Query Language (AnQL), to query annotation documents. AnQL is simple to understand and exploits the already-existent wide knowledge and skill set of SQL.Keywords: annotation query language, data annotations, data annotation models, semantic data annotations
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23557413 Use XML Format like a Model of Data Backup
Authors: Souleymane Oumtanaga, Kadjo Tanon Lambert, Koné Tiémoman, Tety Pierre, Dowa N’sreke Florent
Abstract:
Nowadays data backup format doesn-t cease to appear raising so the anxiety on their accessibility and their perpetuity. XML is one of the most promising formats to guarantee the integrity of data. This article suggests while showing one thing man can do with XML. Indeed XML will help to create a data backup model. The main task will consist in defining an application in JAVA able to convert information of a database in XML format and restore them later.
Keywords: Backup, Proprietary format, parser, syntactic tree.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17307412 REDUCER – An Architectural Design Pattern for Reducing Large and Noisy Data Sets
Authors: Apkar Salatian
Abstract:
To relieve the burden of reasoning on a point to point basis, in many domains there is a need to reduce large and noisy data sets into trends for qualitative reasoning. In this paper we propose and describe a new architectural design pattern called REDUCER for reducing large and noisy data sets that can be tailored for particular situations. REDUCER consists of 2 consecutive processes: Filter which takes the original data and removes outliers, inconsistencies or noise; and Compression which takes the filtered data and derives trends in the data. In this seminal article we also show how REDUCER has successfully been applied to 3 different case studies.
Keywords: Design Pattern, filtering, compression.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14907411 Comparing Emotion Recognition from Voice and Facial Data Using Time Invariant Features
Authors: Vesna Kirandziska, Nevena Ackovska, Ana Madevska Bogdanova
Abstract:
The problem of emotion recognition is a challenging problem. It is still an open problem from the aspect of both intelligent systems and psychology. In this paper, both voice features and facial features are used for building an emotion recognition system. A Support Vector Machine classifiers are built by using raw data from video recordings. In this paper, the results obtained for the emotion recognition are given, and a discussion about the validity and the expressiveness of different emotions is presented. A comparison between the classifiers build from facial data only, voice data only and from the combination of both data is made here. The need for a better combination of the information from facial expression and voice data is argued.
Keywords: Emotion recognition, facial recognition, signal processing, machine learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20187410 Analysis of Data Gathering Schemes for Layered Sensor Networks with Multihop Polling
Authors: Bhed Bahadur Bista, Danda B. Rawat
Abstract:
In this paper, we investigate multihop polling and data gathering schemes in layered sensor networks in order to extend the life time of the networks. A network consists of three layers. The lowest layer contains sensors. The middle layer contains so called super nodes with higher computational power, energy supply and longer transmission range than sensor nodes. The top layer contains a sink node. A node in each layer controls a number of nodes in lower layer by polling mechanism to gather data. We will present four types of data gathering schemes: intermediate nodes do not queue data packet, queue single packet, queue multiple packets and aggregate data, to see which data gathering scheme is more energy efficient for multihop polling in layered sensor networks.
Keywords: layered sensor network, polling, data gatheringschemes.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15687409 Incremental Algorithm to Cluster the Categorical Data with Frequency Based Similarity Measure
Authors: S.Aranganayagi, K.Thangavel
Abstract:
Clustering categorical data is more complicated than the numerical clustering because of its special properties. Scalability and memory constraint is the challenging problem in clustering large data set. This paper presents an incremental algorithm to cluster the categorical data. Frequencies of attribute values contribute much in clustering similar categorical objects. In this paper we propose new similarity measures based on the frequencies of attribute values and its cardinalities. The proposed measures and the algorithm are experimented with the data sets from UCI data repository. Results prove that the proposed method generates better clusters than the existing one.Keywords: Clustering, Categorical, Incremental, Frequency, Domain
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18207408 Multimedia Data Fusion for Event Detection in Twitter by Using Dempster-Shafer Evidence Theory
Authors: Samar M. Alqhtani, Suhuai Luo, Brian Regan
Abstract:
Data fusion technology can be the best way to extract useful information from multiple sources of data. It has been widely applied in various applications. This paper presents a data fusion approach in multimedia data for event detection in twitter by using Dempster-Shafer evidence theory. The methodology applies a mining algorithm to detect the event. There are two types of data in the fusion. The first is features extracted from text by using the bag-ofwords method which is calculated using the term frequency-inverse document frequency (TF-IDF). The second is the visual features extracted by applying scale-invariant feature transform (SIFT). The Dempster - Shafer theory of evidence is applied in order to fuse the information from these two sources. Our experiments have indicated that comparing to the approaches using individual data source, the proposed data fusion approach can increase the prediction accuracy for event detection. The experimental result showed that the proposed method achieved a high accuracy of 0.97, comparing with 0.93 with texts only, and 0.86 with images only.Keywords: Data fusion, Dempster-Shafer theory, data mining, event detection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17997407 Model Order Reduction for Frequency Response and Effect of Order of Method for Matching Condition
Authors: Aref Ghafouri, Mohammad Javad Mollakazemi, Farhad Asadi
Abstract:
In this paper, model order reduction method is used for approximation in linear and nonlinearity aspects in some experimental data. This method can be used for obtaining offline reduced model for approximation of experimental data and can produce and follow the data and order of system and also it can match to experimental data in some frequency ratios. In this study, the method is compared in different experimental data and influence of choosing of order of the model reduction for obtaining the best and sufficient matching condition for following the data is investigated in format of imaginary and reality part of the frequency response curve and finally the effect and important parameter of number of order reduction in nonlinear experimental data is explained further.
Keywords: Frequency response, Order of model reduction, frequency matching condition.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20587406 Building a Scalable Telemetry Based Multiclass Predictive Maintenance Model in R
Authors: Jaya Mathew
Abstract:
Many organizations are faced with the challenge of how to analyze and build Machine Learning models using their sensitive telemetry data. In this paper, we discuss how users can leverage the power of R without having to move their big data around as well as a cloud based solution for organizations willing to host their data in the cloud. By using ScaleR technology to benefit from parallelization and remote computing or R Services on premise or in the cloud, users can leverage the power of R at scale without having to move their data around.
Keywords: Predictive maintenance, machine learning, big data, cloud, on premise SQL, R.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19207405 Rapid Processing Techniques Applied to Sintered Nickel Battery Technologies for Utility Scale Applications
Authors: J. D. Marinaccio, I. Mabbett, C. Glover, D. Worsley
Abstract:
Through use of novel modern/rapid processing techniques such as screen printing and Near-Infrared (NIR) radiative curing, process time for the sintering of sintered nickel plaques, applicable to alkaline nickel battery chemistries, has been drastically reduced from in excess of 200 minutes with conventional convection methods to below 2 minutes using NIR curing methods. Steps have also been taken to remove the need for forming gas as a reducing agent by implementing carbon as an in-situ reducing agent, within the ink formulation.Keywords: Batteries, energy, iron, nickel, storage.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23417404 Ethereum Based Smart Contracts for Trade and Finance
Authors: Rishabh Garg
Abstract:
Traditionally, business parties build trust with a centralized operating mechanism, such as payment by letter of credit. However, the increase in cyber-attacks and malicious hacking has jeopardized business operations and finance practices. Emerging markets, due to their high banking risks and the large presence of digital financing, are looking for technology that enables transparency and traceability of any transaction in trade, finance or supply chain management. Blockchain systems, in the absence of any central authority, enable transactions across the globe with the help of decentralized applications. DApps consist of a front-end, a blockchain back-end, and middleware, that is, the code that connects the two. The front-end can be a sophisticated web app or mobile app, which is used to implement the functions/methods on the smart contract. Web apps can employ technologies such as HTML, CSS, React and Express. In this wake, fintech and blockchain products are popping up in brokerages, digital wallets, exchanges, post-trade clearance, settlement, middleware, infrastructure and base protocols. The present paper provides a technology driven solution, financial inclusion and innovative working paradigm for business and finance.
Keywords: Authentication, blockchain, channel, cryptography, DApps, data portability, Decentralized Public Key Infrastructure, Ethereum, hash function, Hashgraph, Privilege creep, Proof of Work algorithm, revocation, storage variables, Zero Knowledge Proof.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 577