Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 29

Search results for: Data stream

29 Real-Time Data Stream Partitioning over a Sliding Window in Real-Time Spatial Big Data

Authors: Sana Hamdi, Emna Bouazizi, Sami Faiz

Abstract:

In recent years, real-time spatial applications, like location-aware services and traffic monitoring, have become more and more important. Such applications result dynamic environments where data as well as queries are continuously moving. As a result, there is a tremendous amount of real-time spatial data generated every day. The growth of the data volume seems to outspeed the advance of our computing infrastructure. For instance, in real-time spatial Big Data, users expect to receive the results of each query within a short time period without holding in account the load of the system. But with a huge amount of real-time spatial data generated, the system performance degrades rapidly especially in overload situations. To solve this problem, we propose the use of data partitioning as an optimization technique. Traditional horizontal and vertical partitioning can increase the performance of the system and simplify data management. But they remain insufficient for real-time spatial Big data; they can’t deal with real-time and stream queries efficiently. Thus, in this paper, we propose a novel data partitioning approach for real-time spatial Big data named VPA-RTSBD (Vertical Partitioning Approach for Real-Time Spatial Big data). This contribution is an implementation of the Matching algorithm for traditional vertical partitioning. We find, firstly, the optimal attribute sequence by the use of Matching algorithm. Then, we propose a new cost model used for database partitioning, for keeping the data amount of each partition more balanced limit and for providing a parallel execution guarantees for the most frequent queries. VPA-RTSBD aims to obtain a real-time partitioning scheme and deals with stream data. It improves the performance of query execution by maximizing the degree of parallel execution. This affects QoS (Quality Of Service) improvement in real-time spatial Big Data especially with a huge volume of stream data. The performance of our contribution is evaluated via simulation experiments. The results show that the proposed algorithm is both efficient and scalable, and that it outperforms comparable algorithms.

Keywords: quality of service, vertical partitioning, real-time spatial big data, horizontal partitioning, matching algorithm, hamming distance, stream query

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 438
28 Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text

Authors: Paolo Fantozzi, Luigi Laura, Umberto Nanni

Abstract:

The problem of Entity relation discovery in structured data, a well covered topic in literature, consists in searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary, or a specific collection of named items. In many cases machine learning and/or text mining techniques are used for this goal. These approaches might be unfeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists in collecting the cooccurrences of any two words (entities) in order to create a graph of relations - a cooccurrence graph. Indeed each cooccurrence highlights some grade of semantic correlation between the words because it is more common to have related words close each other than having them in the opposite sides of the text. Some authors have used sliding windows for such problem: they count all the occurrences within a sliding windows running over the whole text. In this paper we generalise such technique, coming up to a Weighted-Distance Sliding Window, where each occurrence of two named items within the window is accounted with a weight depending on the distance between items: a closer distance implies a stronger evidence of a relationship. We develop an experiment in order to support this intuition, by applying this technique to a data set consisting in the text of the Bible, split into verses.

Keywords: cooccurrence graph, entity relation graph, unstructured text, weighted distance

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 278
27 Visual Text Analytics Technologies for Real-Time Big Data: Chronological Evolution and Issues

Authors: Siti Azrina B. A. Aziz, Siti Hafizah A. Hamid

Abstract:

New approaches to analyze and visualize data stream in real-time basis is important in making a prompt decision by the decision maker. Financial market trading and surveillance, large-scale emergency response and crowd control are some example scenarios that require real-time analytic and data visualization. This situation has led to the development of techniques and tools that support humans in analyzing the source data. With the emergence of Big Data and social media, new techniques and tools are required in order to process the streaming data. Today, ranges of tools which implement some of these functionalities are available. In this paper, we present chronological evolution evaluation of technologies for supporting of real-time analytic and visualization of the data stream. Based on the past research papers published from 2002 to 2014, we gathered the general information, main techniques, challenges and open issues. The techniques for streaming text visualization are identified based on Text Visualization Browser in chronological order. This paper aims to review the evolution of streaming text visualization techniques and tools, as well as to discuss the problems and challenges for each of identified tools.

Keywords: Text Mining, visual analytics, information visualization, big data visualization, visual text analytics tools

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 504
26 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Many types of research are concerned with mining massive amounts of data and big data streams. Mining big data faces a lot of challenges including scalability, speed, heterogeneity, accuracy, provenance and privacy. In telecommunication industry, mining big data is like a mining for gold; it represents a big opportunity and maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication and the benefits and opportunities gained from them.

Keywords: Machine Learning, Telecommunication, Big Data, data streams, mining big data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1486
25 On the Optimality Assessment of Nanoparticle Size Spectrometry and Its Association to the Entropy Concept

Authors: A. Shaygani, R. Saifi, M. S. Saidi, M. Sani

Abstract:

Particle size distribution, the most important characteristics of aerosols, is obtained through electrical characterization techniques. The dynamics of charged nanoparticles under the influence of electric field in Electrical Mobility Spectrometer (EMS) reveals the size distribution of these particles. The accuracy of this measurement is influenced by flow conditions, geometry, electric field and particle charging process, therefore by the transfer function (transfer matrix) of the instrument. In this work, a wire-cylinder corona charger was designed and the combined fielddiffusion charging process of injected poly-disperse aerosol particles was numerically simulated as a prerequisite for the study of a multichannel EMS. The result, a cloud of particles with no uniform charge distribution, was introduced to the EMS. The flow pattern and electric field in the EMS were simulated using Computational Fluid Dynamics (CFD) to obtain particle trajectories in the device and therefore to calculate the reported signal by each electrometer. According to the output signals (resulted from bombardment of particles and transferring their charges as currents), we proposed a modification to the size of detecting rings (which are connected to electrometers) in order to evaluate particle size distributions more accurately. Based on the capability of the system to transfer information contents about size distribution of the injected particles, we proposed a benchmark for the assessment of optimality of the design. This method applies the concept of Von Neumann entropy and borrows the definition of entropy from information theory (Shannon entropy) to measure optimality. Entropy, according to the Shannon entropy, is the ''average amount of information contained in an event, sample or character extracted from a data stream''. Evaluating the responses (signals) which were obtained via various configurations of detecting rings, the best configuration which gave the best predictions about the size distributions of injected particles, was the modified configuration. It was also the one that had the maximum amount of entropy. A reasonable consistency was also observed between the accuracy of the predictions and the entropy content of each configuration. In this method, entropy is extracted from the transfer matrix of the instrument for each configuration. Ultimately, various clouds of particles were introduced to the simulations and predicted size distributions were compared to the exact size distributions.

Keywords: CFD, Von Neumann Entropy, aerosol nano-particle, Electrical Mobility Spectrometer

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1570
24 Real-Time Image Encryption Using a 3D Discrete Dual Chaotic Cipher

Authors: M. F. Haroun, T. A. Gulliver

Abstract:

In this paper, an encryption algorithm is proposed for real-time image encryption. The scheme employs a dual chaotic generator based on a three dimensional (3D) discrete Lorenz attractor. Encryption is achieved using non-autonomous modulation where the data is injected into the dynamics of the master chaotic generator. The second generator is used to permute the dynamics of the master generator using the same approach. Since the data stream can be regarded as a random source, the resulting permutations of the generator dynamics greatly increase the security of the transmitted signal. In addition, a technique is proposed to mitigate the error propagation due to the finite precision arithmetic of digital hardware. In particular, truncation and rounding errors are eliminated by employing an integer representation of the data which can easily be implemented. The simple hardware architecture of the algorithm makes it suitable for secure real-time applications.

Keywords: FPGA, Chaotic systems, Image Encryption, non-autonomous modulation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 764
23 Survey on Arabic Sentiment Analysis in Twitter

Authors: Sarah O. Alhumoud, Mawaheb I. Altuwaijri, Tarfa M. Albuhairi, Wejdan M. Alohaideb

Abstract:

Large-scale data stream analysis has become one of the important business and research priorities lately. Social networks like Twitter and other micro-blogging platforms hold an enormous amount of data that is large in volume, velocity and variety. Extracting valuable information and trends out of these data would aid in a better understanding and decision-making. Multiple analysis techniques are deployed for English content. Moreover, one of the languages that produce a large amount of data over social networks and is least analyzed is the Arabic language. The proposed paper is a survey on the research efforts to analyze the Arabic content in Twitter focusing on the tools and methods used to extract the sentiments for the Arabic content on Twitter.

Keywords: Social Networks, Big Data, sentiment analysis

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3849
22 Frequent and Systematic Timing Enhancement of Congestion Window in Typical Transmission Control Protocol

Authors: Ghassan A. Abed, Akbal O. Salman, Bayan M. Sabbar

Abstract:

Transmission Control Protocol (TCP) among the wired and wireless networks, it still has a practical problem; where the congestion control mechanism does not permit the data stream to get complete bandwidth over the existing network links. To solve this problem, many TCP protocols have been introduced with high speed performance. Therefore, an enhanced congestion window (cwnd) for the congestion control mechanism is proposed in this article to improve the performance of TCP by increasing the number of cycles of the new window to improve the transmitted packet number. The proposed algorithm used a new mechanism based on the available bandwidth of the connection to detect the capacity of network path in order to improve the regular clocking of congestion avoidance mechanism. The work in this paper based on using Network Simulator 2 (NS-2) to simulate the proposed algorithm.

Keywords: Congestion Control, ns-2, TCP, cwnd

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1267
21 A Content-Based Optimization of Data Stream Television Multiplex

Authors: Jaroslav Polec, Martin Šimek, Michal Martinovič, Elena Šikudová

Abstract:

The television multiplex has reserved capacity and therefore we can use only limited number of videos for propagation of it. Appropriate composition of the multiplex has a major impact on how many videos is spread by multiplex. Therefore in this paper is designed a simple algorithm to optimize capacity utilization multiplex. Significant impact on the number of programs in the multiplex has also the fact from which programs is composed. Content of multiplex can be movies, news, sport, animated stories, documentaries, etc. These types have their own specific characteristics that affect their resulting data stream. In this paper is also done an impact analysis of the composition of the multiplex to use its capacity by video content. 

Keywords: Capacity, content, frame, Multiplex, group of pictures

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1098
20 Semantic Support for Hypothesis-Based Research from Smart Environment Monitoring and Analysis Technologies

Authors: T. S. Myers, J. Trevathan

Abstract:

Improvements in the data fusion and data analysis phase of research are imperative due to the exponential growth of sensed data. Currently, there are developments in the Semantic Sensor Web community to explore efficient methods for reuse, correlation and integration of web-based data sets and live data streams. This paper describes the integration of remotely sensed data with web-available static data for use in observational hypothesis testing and the analysis phase of research. The Semantic Reef system combines semantic technologies (e.g., well-defined ontologies and logic systems) with scientific workflows to enable hypothesis-based research. A framework is presented for how the data fusion concepts from the Semantic Reef architecture map to the Smart Environment Monitoring and Analysis Technologies (SEMAT) intelligent sensor network initiative. The data collected via SEMAT and the inferred knowledge from the Semantic Reef system are ingested to the Tropical Data Hub for data discovery, reuse, curation and publication.

Keywords: Ontologies, Information Architecture, Semantic technologies Sensor networks

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1315
19 Unsupervised Outlier Detection in Streaming Data Using Weighted Clustering

Authors: Yogita, Durga Toshniwal

Abstract:

Outlier detection in streaming data is very challenging because streaming data cannot be scanned multiple times and also new concepts may keep evolving. Irrelevant attributes can be termed as noisy attributes and such attributes further magnify the challenge of working with data streams. In this paper, we propose an unsupervised outlier detection scheme for streaming data. This scheme is based on clustering as clustering is an unsupervised data mining task and it does not require labeled data, both density based and partitioning clustering are combined for outlier detection. In this scheme partitioning clustering is also used to assign weights to attributes depending upon their respective relevance and weights are adaptive. Weighted attributes are helpful to reduce or remove the effect of noisy attributes. Keeping in view the challenges of streaming data, the proposed scheme is incremental and adaptive to concept evolution. Experimental results on synthetic and real world data sets show that our proposed approach outperforms other existing approach (CORM) in terms of outlier detection rate, false alarm rate, and increasing percentages of outliers.

Keywords: Concept Evolution, Irrelevant Attributes, Streaming Data, Unsupervised Outlier Detection

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2257
18 Content Based Sampling over Transactional Data Streams

Authors: Mansour Tarafdar, Mohammad Saniee Abade

Abstract:

This paper investigates the problem of sampling from transactional data streams. We introduce CFISDS as a content based sampling algorithm that works on a landmark window model of data streams and preserve more informed sample in sample space. This algorithm that work based on closed frequent itemset mining tasks, first initiate a concept lattice using initial data, then update lattice structure using an incremental mechanism.Incremental mechanism insert, update and delete nodes in/from concept lattice in batch manner. Presented algorithm extracts the final samples on demand of user. Experimental results show the accuracy of CFISDS on synthetic and real datasets, despite on CFISDS algorithm is not faster than exist sampling algorithms such as Z and DSS.

Keywords: Sampling, data streams, closed frequent item set mining

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1347
17 Efficient and Extensible Data Processing Framework in Ubiquitious Sensor Networks

Authors: Junghoon Lee, Gyung-Leen Park, Ho-Young Kwak, Cheol Min Kim

Abstract:

This paper presents the design and implements the prototype of an intelligent data processing framework in ubiquitous sensor networks. Much focus is put on how to handle the sensor data stream as well as the interoperability between the low-level sensor data and application clients. Our framework first addresses systematic middleware which mitigates the interaction between the application layer and low-level sensors, for the sake of analyzing a great volume of sensor data by filtering and integrating to create value-added context information. Then, an agent-based architecture is proposed for real-time data distribution to efficiently forward a specific event to the appropriate application registered in the directory service via the open interface. The prototype implementation demonstrates that our framework can host a sophisticated application on the ubiquitous sensor network and it can autonomously evolve to new middleware, taking advantages of promising technologies such as software agents, XML, cloud computing, and the like.

Keywords: Middleware, Sensor Network, Event Detection, intelligent farm

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1017
16 Design of Buffer Management for Industry to Avoid Sensor Data- Conflicts

Authors: Dae-ho Won, Jong-wook Hong, Yeon-Mo Yang, Jinung An

Abstract:

To reduce accidents in the industry, WSNs(Wireless Sensor networks)- sensor data is used. WSNs- sensor data has the persistence and continuity. therefore, we design and exploit the buffer management system that has the persistence and continuity to avoid and delivery data conflicts. To develop modules, we use the multi buffers and design the buffer management modules that transfer sensor data through the context-aware methods.

Keywords: context-aware, Buffer Management, safe management system, input data stream

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1182
15 Fault Localization and Alarm Correlation in Optical WDM Networks

Authors: G. Ramesh, S. Sundara Vadivelu

Abstract:

For several high speed networks, providing resilience against failures is an essential requirement. The main feature for designing next generation optical networks is protecting and restoring high capacity WDM networks from the failures. Quick detection, identification and restoration make networks more strong and consistent even though the failures cannot be avoided. Hence, it is necessary to develop fast, efficient and dependable fault localization or detection mechanisms. In this paper we propose a new fault localization algorithm for WDM networks which can identify the location of a failure on a failed lightpath. Our algorithm detects the failed connection and then attempts to reroute data stream through an alternate path. In addition to this, we develop an algorithm to analyze the information of the alarms generated by the components of an optical network, in the presence of a fault. It uses the alarm correlation in order to reduce the list of suspected components shown to the network operators. By our simulation results, we show that our proposed algorithms achieve less blocking probability and delay while getting higher throughput.

Keywords: delay, Blocking probability, Alarm correlation, fault localization, WDM networks

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1707
14 A New History Based Method to Handle the Recurring Concept Shifts in Data Streams

Authors: Hossein Morshedlou, Ahmad Abdollahzade Barforoush

Abstract:

Recent developments in storage technology and networking architectures have made it possible for broad areas of applications to rely on data streams for quick response and accurate decision making. Data streams are generated from events of real world so existence of associations, which are among the occurrence of these events in real world, among concepts of data streams is logical. Extraction of these hidden associations can be useful for prediction of subsequent concepts in concept shifting data streams. In this paper we present a new method for learning association among concepts of data stream and prediction of what the next concept will be. Knowing the next concept, an informed update of data model will be possible. The results of conducted experiments show that the proposed method is proper for classification of concept shifting data streams.

Keywords: History, classification, Data Stream, Concept Shift

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 941
13 A Novel Approach to Optimal Cutting Tool Replacement

Authors: Cem Karacal, Sohyung Cho, William Yu

Abstract:

In metal cutting industries, mathematical/statistical models are typically used to predict tool replacement time. These off-line methods usually result in less than optimum replacement time thereby either wasting resources or causing quality problems. The few online real-time methods proposed use indirect measurement techniques and are prone to similar errors. Our idea is based on identifying the optimal replacement time using an electronic nose to detect the airborne compounds released when the tool wear reaches to a chemical substrate doped into tool material during the fabrication. The study investigates the feasibility of the idea, possible doping materials and methods along with data stream mining techniques for detection and monitoring different phases of tool wear.

Keywords: e-nose, Data Stream Mining, tool condition monitoring, cutting tool replacement

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1527
12 Optimal Convolutive Filters for Real-Time Detection and Arrival Time Estimation of Transient Signals

Authors: Michal Natora, Felix Franke, Klaus Obermayer

Abstract:

Linear convolutive filters are fast in calculation and in application, and thus, often used for real-time processing of continuous data streams. In the case of transient signals, a filter has not only to detect the presence of a specific waveform, but to estimate its arrival time as well. In this study, a measure is presented which indicates the performance of detectors in achieving both of these tasks simultaneously. Furthermore, a new sub-class of linear filters within the class of filters which minimize the quadratic response is proposed. The proposed filters are more flexible than the existing ones, like the adaptive matched filter or the minimum power distortionless response beamformer, and prove to be superior with respect to that measure in certain settings. Simulations of a real-time scenario confirm the advantage of these filters as well as the usefulness of the performance measure.

Keywords: Beam Forming, Adaptive matched filter, minimum variance distortionless response, Capon beam former, linear filters, performance measure

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1179
11 An Efficient Approach to Mining Frequent Itemsets on Data Streams

Authors: Sara Ansari, Mohammad Hadi Sadreddini

Abstract:

The increasing importance of data stream arising in a wide range of advanced applications has led to the extensive study of mining frequent patterns. Mining data streams poses many new challenges amongst which are the one-scan nature, the unbounded memory requirement and the high arrival rate of data streams. In this paper, we propose a new approach for mining itemsets on data stream. Our approach SFIDS has been developed based on FIDS algorithm. The main attempts were to keep some advantages of the previous approach and resolve some of its drawbacks, and consequently to improve run time and memory consumption. Our approach has the following advantages: using a data structure similar to lattice for keeping frequent itemsets, separating regions from each other with deleting common nodes that results in a decrease in search space, memory consumption and run time; and Finally, considering CPU constraint, with increasing arrival rate of data that result in overloading system, SFIDS automatically detect this situation and discard some of unprocessing data. We guarantee that error of results is bounded to user pre-specified threshold, based on a probability technique. Final results show that SFIDS algorithm could attain about 50% run time improvement than FIDS approach.

Keywords: Data Stream, frequent itemset, stream mining

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1112
10 Performance Evaluation of Data Transfer Protocol GridFTP for Grid Computing

Authors: Hiroyuki Ohsaki, Makoto Imase

Abstract:

In Grid computing, a data transfer protocol called GridFTP has been widely used for efficiently transferring a large volume of data. Currently, two versions of GridFTP protocols, GridFTP version 1 (GridFTP v1) and GridFTP version 2 (GridFTP v2), have been proposed in the GGF. GridFTP v2 supports several advanced features such as data streaming, dynamic resource allocation, and checksum transfer, by defining a transfer mode called X-block mode. However, in the literature, effectiveness of GridFTP v2 has not been fully investigated. In this paper, we therefore quantitatively evaluate performance of GridFTP v1 and GridFTP v2 using mathematical analysis and simulation experiments. We reveal the performance limitation of GridFTP v1, and quantitatively show effectiveness of GridFTP v2. Through several numerical examples, we show that by utilizing the data streaming feature, the average file transfer time of GridFTP v2 is significantly smaller than that of GridFTP v1.

Keywords: Grid Computing, Performance Evaluation, Queuing theory, GridFTP

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1084
9 A Monte Carlo Method to Data Stream Analysis

Authors: Kittisak Kerdprasop, Nittaya Kerdprasop, Pairote Sattayatham

Abstract:

Data stream analysis is the process of computing various summaries and derived values from large amounts of data which are continuously generated at a rapid rate. The nature of a stream does not allow a revisit on each data element. Furthermore, data processing must be fast to produce timely analysis results. These requirements impose constraints on the design of the algorithms to balance correctness against timely responses. Several techniques have been proposed over the past few years to address these challenges. These techniques can be categorized as either dataoriented or task-oriented. The data-oriented approach analyzes a subset of data or a smaller transformed representation, whereas taskoriented scheme solves the problem directly via approximation techniques. We propose a hybrid approach to tackle the data stream analysis problem. The data stream has been both statistically transformed to a smaller size and computationally approximated its characteristics. We adopt a Monte Carlo method in the approximation step. The data reduction has been performed horizontally and vertically through our EMR sampling method. The proposed method is analyzed by a series of experiments. We apply our algorithm on clustering and classification tasks to evaluate the utility of our approach.

Keywords: Monte Carlo, Sampling, Data Stream, DensityEstimation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1078
8 A New Scheme for Improving the Quality of Service in Heterogeneous Wireless Network for Data Stream Sending

Authors: Ebadollah Zohrevandi, Rasoul Roustaei, Omid Moradtalab

Abstract:

In this paper, we first consider the quality of service problems in heterogeneous wireless networks for sending the video data, which their problem of being real-time is pronounced. At last, we present a method for ensuring the end-to-end quality of service at application layer level for adaptable sending of the video data at heterogeneous wireless networks. To do this, mechanism in different layers has been used. We have used the stop mechanism, the adaptation mechanism and the graceful degrade at the application layer, the multi-level congestion feedback mechanism in the network layer and connection cutting off decision mechanism in the link layer. At the end, the presented method and the achieved improvement is simulated and presented in the NS-2 software.

Keywords: congestion, handoff, adaptation mechanism, stop mechanism, Heterogeneous wireless networks, Graceful degrade

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1091
7 A Cumulative Learning Approach to Data Mining Employing Censored Production Rules (CPRs)

Authors: Rekha Kandwal, Kamal K.Bharadwaj

Abstract:

Knowledge is indispensable but voluminous knowledge becomes a bottleneck for efficient processing. A great challenge for data mining activity is the generation of large number of potential rules as a result of mining process. In fact sometimes result size is comparable to the original data. Traditional data mining pruning activities such as support do not sufficiently reduce the huge rule space. Moreover, many practical applications are characterized by continual change of data and knowledge, thereby making knowledge voluminous with each change. The most predominant representation of the discovered knowledge is the standard Production Rules (PRs) in the form If P Then D. Michalski & Winston proposed Censored Production Rules (CPRs), as an extension of production rules, that exhibit variable precision and supports an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form: If P Then D Unless C, where C (Censor) is an exception to the rule. Such rules are employed in situations in which the conditional statement 'If P Then D' holds frequently and the assertion C holds rarely. By using a rule of this type we are free to ignore the exception conditions, when the resources needed to establish its presence, are tight or there is simply no information available as to whether it holds or not. Thus the 'If P Then D' part of the CPR expresses important information while the Unless C part acts only as a switch changes the polarity of D to ~D. In this paper a scheme based on Dempster-Shafer Theory (DST) interpretation of a CPR is suggested for discovering CPRs from the discovered flat PRs. The discovery of CPRs from flat rules would result in considerable reduction of the already discovered rules. The proposed scheme incrementally incorporates new knowledge and also reduces the size of knowledge base considerably with each episode. Examples are given to demonstrate the behaviour of the proposed scheme. The suggested cumulative learning scheme would be useful in mining data streams.

Keywords: Data Mining, Machine Learning, Censored production rules, cumulative learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1173
6 Approximate Frequent Pattern Discovery Over Data Stream

Authors: Kittisak Kerdprasop, Nittaya Kerdprasop

Abstract:

Frequent pattern discovery over data stream is a hard problem because a continuously generated nature of stream does not allow a revisit on each data element. Furthermore, pattern discovery process must be fast to produce timely results. Based on these requirements, we propose an approximate approach to tackle the problem of discovering frequent patterns over continuous stream. Our approximation algorithm is intended to be applied to process a stream prior to the pattern discovery process. The results of approximate frequent pattern discovery have been reported in the paper.

Keywords: approximate algorithm, Frequent pattern discovery, Data stream analysis

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 997
5 SUPAR: System for User-Centric Profiling of Association Rules in Streaming Data

Authors: Sarabjeet Kaur Kochhar

Abstract:

With a surge of stream processing applications novel techniques are required for generation and analysis of association rules in streams. The traditional rule mining solutions cannot handle streams because they generally require multiple passes over the data and do not guarantee the results in a predictable, small time. Though researchers have been proposing algorithms for generation of rules from streams, there has not been much focus on their analysis. We propose Association rule profiling, a user centric process for analyzing association rules and attaching suitable profiles to them depending on their changing frequency behavior over a previous snapshot of time in a data stream. Association rule profiles provide insights into the changing nature of associations and can be used to characterize the associations. We discuss importance of characteristics such as predictability of linkages present in the data and propose metric to quantify it. We also show how association rule profiles can aid in generation of user specific, more understandable and actionable rules. The framework is implemented as SUPAR: System for Usercentric Profiling of Association Rules in streaming data. The proposed system offers following capabilities: i) Continuous monitoring of frequency of streaming item-sets and detection of significant changes therein for association rule profiling. ii) Computation of metrics for quantifying predictability of associations present in the data. iii) User-centric control of the characterization process: user can control the framework through a) constraint specification and b) non-interesting rule elimination.

Keywords: data streams, predictability, User subjectivity, Change detection, Association rule profiles

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1096
4 Exponentially Weighted Simultaneous Estimation of Several Quantiles

Authors: Valeriy Naumov, Olli Martikainen

Abstract:

In this paper we propose new method for simultaneous generating multiple quantiles corresponding to given probability levels from data streams and massive data sets. This method provides a basis for development of single-pass low-storage quantile estimation algorithms, which differ in complexity, storage requirement and accuracy. We demonstrate that such algorithms may perform well even for heavy-tailed data.

Keywords: Quantile Estimation, Data Stream, heavy-taileddistribution, tail index

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1217
3 Cumulative Learning based on Dynamic Clustering of Hierarchical Production Rules(HPRs)

Authors: Kamal K.Bharadwaj, Rekha Kandwal

Abstract:

An important structuring mechanism for knowledge bases is building clusters based on the content of their knowledge objects. The objects are clustered based on the principle of maximizing the intraclass similarity and minimizing the interclass similarity. Clustering can also facilitate taxonomy formation, that is, the organization of observations into a hierarchy of classes that group similar events together. Hierarchical representation allows us to easily manage the complexity of knowledge, to view the knowledge at different levels of details, and to focus our attention on the interesting aspects only. One of such efficient and easy to understand systems is Hierarchical Production rule (HPRs) system. A HPR, a standard production rule augmented with generality and specificity information, is of the following form Decision If < condition> Generality Specificity . HPRs systems are capable of handling taxonomical structures inherent in the knowledge about the real world. In this paper, a set of related HPRs is called a cluster and is represented by a HPR-tree. This paper discusses an algorithm based on cumulative learning scenario for dynamic structuring of clusters. The proposed scheme incrementally incorporates new knowledge into the set of clusters from the previous episodes and also maintains summary of clusters as Synopsis to be used in the future episodes. Examples are given to demonstrate the behaviour of the proposed scheme. The suggested incremental structuring of clusters would be useful in mining data streams.

Keywords: Data Mining, Clustering, cumulative learning, hierarchical production rules

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 976
2 ASC – A Stream Cipher with Built – In MAC Functionality

Authors: Kai-Thorsten Wirt

Abstract:

In this paper we present the design of a new encryption scheme. The scheme we propose is a very exible encryption and authentication primitive. We build this scheme on two relatively new design principles: t-functions and fast pseudo hadamard transforms. We recapitulate the theory behind these principles and analyze their security properties and efficiency. In more detail we propose a streamcipher which outputs a message authentication tag along with theencrypted data stream with only little overhead. Moreover we proposesecurity-speed tradeoffs. Our scheme is faster than other comparablet-function based designs while offering the same security level.

Keywords: Cryptography, MAC, stream cipher, Combined Primitives, T-Function, FPHT

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1605
1 A Keyword-Based Filtering Technique of Document-Centric XML using NFA Representation

Authors: Changwoo Byun, Kyounghan Lee, Seog Park

Abstract:

XML is becoming a de facto standard for online data exchange. Existing XML filtering techniques based on a publish/subscribe model are focused on the highly structured data marked up with XML tags. These techniques are efficient in filtering the documents of data-centric XML but are not effective in filtering the element contents of the document-centric XML. In this paper, we propose an extended XPath specification which includes a special matching character '%' used in the LIKE operation of SQL in order to solve the difficulty of writing some queries to adequately filter element contents using the previous XPath specification. We also present a novel technique for filtering a collection of document-centric XMLs, called Pfilter, which is able to exploit the extended XPath specification. We show several performance studies, efficiency and scalability using the multi-query processing time (MQPT).

Keywords: XML Data Stream, Document-centric XML, Filtering Technique, Value-based Predicates

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1411