Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 24104

Search results for: data cache

24104 Impact of Stack Caches: Locality Awareness and Cost Effectiveness

Authors: Abdulrahman K. Alshegaifi, Chun-Hsi Huang

Abstract:

Treating data based on its location in memory has received much attention in recent years due to its different properties, which offer important aspects for cache utilization. Stack data and non-stack data may interfere with each other’s locality in the data cache. One of the important aspects of stack data is that it has high spatial and temporal locality. In this work, we simulate non-unified cache design that split data cache into stack and non-stack caches in order to maintain stack data and non-stack data separate in different caches. We observe that the overall hit rate of non-unified cache design is sensitive to the size of non-stack cache. Then, we investigate the appropriate size and associativity for stack cache to achieve high hit ratio especially when over 99% of accesses are directed to stack cache. The result shows that on average more than 99% of stack cache accuracy is achieved by using 2KB of capacity and 1-way associativity. Further, we analyze the improvement in hit rate when adding small, fixed, size of stack cache at level1 to unified cache architecture. The result shows that the overall hit rate of unified cache design with adding 1KB of stack cache is improved by approximately, on average, 3.9% for Rijndael benchmark. The stack cache is simulated by using SimpleScalar toolset.

Keywords: hit rate, locality of program, stack cache, stack data

Procedia PDF Downloads 275

24103 Formal Verification of Cache System Using a Novel Cache Memory Model

Authors: Guowei Hou, Lixin Yu, Wei Zhuang, Hui Qin, Xue Yang

Abstract:

Formal verification is proposed to ensure the correctness of the design and make functional verification more efficient. As cache plays a vital role in the design of System on Chip (SoC), and cache with Memory Management Unit (MMU) and cache memory unit makes the state space too large for simulation to verify, then a formal verification is presented for such system design. In the paper, a formal model checking verification flow is suggested and a new cache memory model which is called “exhaustive search model” is proposed. Instead of using large size ram to denote the whole cache memory, exhaustive search model employs just two cache blocks. For cache system contains data cache (Dcache) and instruction cache (Icache), Dcache memory model and Icache memory model are established separately using the same mechanism. At last, the novel model is employed to the verification of a cache which is module of a custom-built SoC system that has been applied in practical, and the result shows that the cache system is verified correctly using the exhaustive search model, and it makes the verification much more manageable and flexible.

Keywords: cache system, formal verification, novel model, system on chip (SoC)

Procedia PDF Downloads 468

24102 Evaluating the Impact of Replacement Policies on the Cache Performance and Energy Consumption in Different Multicore Embedded Systems

Authors: Sajjad Rostami-Sani, Mojtaba Valinataj, Amir-Hossein Khojir-Angasi

Abstract:

The cache has an important role in the reduction of access delay between a processor and memory in high-performance embedded systems. In these systems, the energy consumption is one of the most important concerns, and it will become more important with smaller processor feature sizes and higher frequencies. Meanwhile, the cache system dissipates a significant portion of energy compared to the other components of a processor. There are some elements that can affect the energy consumption of the cache such as replacement policy and degree of associativity. Due to these points, it can be inferred that selecting an appropriate configuration for the cache is a crucial part of designing a system. In this paper, we investigate the effect of different cache replacement policies on both cache’s performance and energy consumption. Furthermore, the impact of different Instruction Set Architectures (ISAs) on cache’s performance and energy consumption has been investigated.

Keywords: energy consumption, replacement policy, instruction set architecture, multicore processor

Procedia PDF Downloads 113

24101 On Performance of Cache Replacement Schemes in NDN-IoT

Authors: Rasool Sadeghi, Sayed Mahdi Faghih Imani, Negar Najafi

Abstract:

The inherent features of Named Data Networking (NDN) provides a robust solution for Internet of Thing (IoT). Therefore, NDN-IoT has emerged as a combined architecture which exploits the benefits of NDN for interconnecting of the heterogeneous objects in IoT. In NDN-IoT, caching schemes are a key role to improve the network performance. In this paper, we consider the effectiveness of cache replacement schemes in NDN-IoT scenarios. We investigate the impact of replacement schemes on average delay, average hop count, and average interest retransmission when replacement schemes are Least Frequently Used (LFU), Least Recently Used (LRU), First-In-First-Out (FIFO) and Random. The simulation results demonstrate that LFU and LRU present a stable performance when the cache size changes. Moreover, the network performance improves when the number of consumers increases.

Keywords: NDN-IoT, cache replacement, performance, ndnSIM

Procedia PDF Downloads 325

24100 DCASH: Dynamic Cache Synchronization Algorithm for Heterogeneous Reverse Y Synchronizing Mobile Database Systems

Authors: Gunasekaran Raja, Kottilingam Kottursamy, Rajakumar Arul, Ramkumar Jayaraman, Krithika Sairam, Lakshmi Ravi

Abstract:

The synchronization server maintains a dynamically changing cache, which contains the data items which were requested and collected by the mobile node from the server. The order and presence of tuples in the cache changes dynamically according to the frequency of updates performed on the data, by the server and client. To synchronize, the data which has been modified by client and the server at an instant are collected, batched together by the type of modification (insert/ update/ delete), and sorted according to their update frequencies. This ensures that the DCASH (Dynamic Cache Synchronization Algorithm for Heterogeneous Reverse Y synchronizing Mobile Database Systems) gives priority to the frequently accessed data with high usage. The optimal memory management algorithm is proposed to manage data items according to their frequency, theorems were written to show the current mobile data activity is reverse Y in nature and the experiments were tested with 2g and 3g networks for various mobile devices to show the reduced response time and energy consumption.

Keywords: mobile databases, synchronization, cache, response time

Procedia PDF Downloads 360

24099 Cache Analysis and Software Optimizations for Faster on-Chip Network Simulations

Authors: Khyamling Parane, B. M. Prabhu Prasad, Basavaraj Talawar

Abstract:

Fast simulations are critical in reducing time to market in CMPs and SoCs. Several simulators have been used to evaluate the performance and power consumed by Network-on-Chips. Researchers and designers rely upon these simulators for design space exploration of NoC architectures. Our experiments show that simulating large NoC topologies take hours to several days for completion. To speed up the simulations, it is necessary to investigate and optimize the hotspots in simulator source code. Among several simulators available, we choose Booksim2.0, as it is being extensively used in the NoC community. In this paper, we analyze the cache and memory system behaviour of Booksim2.0 to accurately monitor input dependent performance bottlenecks. Our measurements show that cache and memory usage patterns vary widely based on the input parameters given to Booksim2.0. Based on these measurements, the cache configuration having least misses has been identified. To further reduce the cache misses, we use software optimization techniques such as removal of unused functions, loop interchanging and replacing post-increment operator with pre-increment operator for non-primitive data types. The cache misses were reduced by 18.52%, 5.34% and 3.91% by employing above technology respectively. We also employ thread parallelization and vectorization to improve the overall performance of Booksim2.0. The OpenMP programming model and SIMD are used for parallelizing and vectorizing the more time-consuming portions of Booksim2.0. Speedups of 2.93x and 3.97x were observed for the Mesh topology with 30 × 30 network size by employing thread parallelization and vectorization respectively.

Keywords: cache behaviour, network-on-chip, performance profiling, vectorization

Procedia PDF Downloads 167

24098 A Privacy Protection Scheme Supporting Fuzzy Search for NDN Routing Cache Data Name

Authors: Feng Tao, Ma Jing, Guo Xian, Wang Jing

Abstract:

Named Data Networking (NDN) replaces IP address of traditional network with data name, and adopts dynamic cache mechanism. In the existing mechanism, however, only one-to-one search can be achieved because every data has a unique name corresponding to it. There is a certain mapping relationship between data content and data name, so if the data name is intercepted by an adversary, the privacy of the data content and user’s interest can hardly be guaranteed. In order to solve this problem, this paper proposes a one-to-many fuzzy search scheme based on order-preserving encryption to reduce the query overhead by optimizing the caching strategy. In this scheme, we use hash value to ensure the user’s query safe from each node in the process of search, so does the privacy of the requiring data content.

Keywords: NDN, order-preserving encryption, fuzzy search, privacy

Procedia PDF Downloads 447

24097 Trimma: Trimming Metadata Storage and Latency for Hybrid Memory Systems

Authors: Yiwei Li, Boyu Tian, Mingyu Gao

Abstract:

Hybrid main memory systems combine both performance and capacity advantages from heterogeneous memory technologies. With larger capacities, higher associativities, and finer granularities, hybrid memory systems currently exhibit significant metadata storage and lookup overheads for flexibly remapping data blocks between the two memory tiers. To alleviate the inefficiencies of existing designs, we propose Trimma, the combination of a multi-level metadata structure and an efficient metadata cache design. Trimma uses a multilevel metadata table to only track truly necessary address remap entries. The saved memory space is effectively utilized as extra DRAM cache capacity to improve performance. Trimma also uses separate formats to store the entries with non-identity and identity mappings. This improves the overall remap cache hit rate, further boosting the performance. Trimma is transparent to software and compatible with various types of hybrid memory systems. When evaluated on a representative DDR4 + NVM hybrid memory system, Trimma achieves up to 2.4× and on average 58.1% speedup benefits, compared with a state-of-the-art design that only leverages the unallocated fast memory space for caching. Trimma addresses metadata management overheads and targets future scalable large-scale hybrid memory architectures.

Keywords: memory system, data cache, hybrid memory, non-volatile memory

Procedia PDF Downloads 26

24096 A Survey on Countermeasures of Cache-Timing Attack on AES Systems

Authors: Settana M. Abdulh, Naila A. Sadalla, Yaseen H. Taha, Howaida Elshoush

Abstract:

Side channel attacks are based on side channel information, which is information that is leaked from encryption systems. This includes timing information, power consumption as well as electromagnetic or even sound leaking which can exploited by an attacker. Implementing side channel attacks are possible if and only if an attacker has access to a cryptosystem. In this case, the attacker can exploit bad implementation in software or hardware which is not controlled by encryption implementer. Thus, he/she will represent a real threat to the security system. Several countermeasures have been proposed to eliminate side channel information vulnerability.Cache timing attack is a special type of side channel attack. Here, timing information is collected and analyzed by an attacker to guess sensitive information such as encryption key or plaintext. This paper reviews the technique applied in this attack and surveys the countermeasures against it, evaluating the feasibility and usability of each. Based on this evaluation, finally we pose several recommendations about using these countermeasures.

Keywords: AES algorithm, side channel attack, cache timing attack, cache timing countermeasure

Procedia PDF Downloads 263

24095 Hydrogen: Contention-Aware Hybrid Memory Management for Heterogeneous CPU-GPU Architectures

Authors: Yiwei Li, Mingyu Gao

Abstract:

Integrating hybrid memories with heterogeneous processors could leverage heterogeneity in both compute and memory domains for better system efficiency. To ensure performance isolation, we introduce Hydrogen, a hardware architecture to optimize the allocation of hybrid memory resources to heterogeneous CPU-GPU systems. Hydrogen supports efficient capacity and bandwidth partitioning between CPUs and GPUs in both memory tiers. We propose decoupled memory channel mapping and token-based data migration throttling to enable flexible partitioning. We also support epoch-based online search for optimized configurations and lightweight reconfiguration with reduced data movements. Hydrogen significantly outperforms existing designs by 1.21x on average and up to 1.31x.

Keywords: hybrid memory, heterogeneous systems, dram cache, graphics processing units

Procedia PDF Downloads 20

24094 Secure Network Coding against Content Pollution Attacks in Named Data Network

Authors: Tao Feng, Xiaomei Ma, Xian Guo, Jing Wang

Abstract:

Named Data Network (NDN) is one of the future Internet architecture, all nodes (i.e., hosts, routers) are allowed to have a local cache, used to satisfy incoming requests for content. However, depending on caching allows an adversary to perform attacks that are very effective and relatively easy to implement, such as content pollution attack. In this paper, we use a method of secure network coding based on homomorphic signature system to solve this problem. Firstly ,we use a dynamic public key technique, our scheme for each generation authentication without updating the initial secret key used. Secondly, employing the homomorphism of hash function, intermediate node and destination node verify the signature of the received message. In addition, when the network topology of NDN is simple and fixed, the code coefficients in our scheme are generated in a pseudorandom number generator in each node, so the distribution of the coefficients is also avoided. In short, our scheme not only can efficiently prevent against Intra/Inter-GPAs, but also can against the content poisoning attack in NDN.

Keywords: named data networking, content polloution attack, network coding signature, internet architecture

Procedia PDF Downloads 301

24093 Speedup Breadth-First Search by Graph Ordering

Authors: Qiuyi Lyu, Bin Gong

Abstract:

Breadth-First Search(BFS) is a core graph algorithm that is widely used for graph analysis. As it is frequently used in many graph applications, improve the BFS performance is essential. In this paper, we present a graph ordering method that could reorder the graph nodes to achieve better data locality, thus, improving the BFS performance. Our method is based on an observation that the sibling relationships will dominate the cache access pattern during the BFS traversal. Therefore, we propose a frequency-based model to construct the graph order. First, we optimize the graph order according to the nodes’ visit frequency. Nodes with high visit frequency will be processed in priority. Second, we try to maximize the child nodes overlap layer by layer. As it is proved to be NP-hard, we propose a heuristic method that could greatly reduce the preprocessing overheads. We conduct extensive experiments on 16 real-world datasets. The result shows that our method could achieve comparable performance with the state-of-the-art methods while the graph ordering overheads are only about 1/15.

Keywords: breadth-first search, BFS, graph ordering, graph algorithm

Procedia PDF Downloads 102

24092 Improve B-Tree Index’s Performance Using Lock-Free Hash Table

Authors: Zhanfeng Ma, Zhiping Xiong, Hu Yin, Zhengwei She, Aditya P. Gurajada, Tianlun Chen, Ying Li

Abstract:

Many RDBMS vendors use B-tree index to achieve high performance for point queries and range queries, and some of them also employ hash index to further enhance the performance as hash table is more efficient for point queries. However, there are extra overheads to maintain a separate hash index, for example, hash mapping for all data records must always be maintained, which results in more memory space consumption; locking, logging and other mechanisms are needed to guarantee ACID, which affects the concurrency and scalability of the system. To relieve the overheads, Hash Cached B-tree (HCB) index is proposed in this paper, which consists of a standard disk-based B-tree index and an additional in-memory lock-free hash table. Initially, only the B-tree index is constructed for all data records, the hash table is built on the fly based on runtime workload, only data records accessed by point queries are indexed using hash table, this helps reduce the memory footprint. Changes to hash table are done using compare-and-swap (CAS) without performing locking and logging, this helps improve the concurrency and avoid contention. The hash table is also optimized to be cache conscious. HCB index is implemented in SAP ASE database, compared with the standard B-tree index, early experiments and customer adoptions show significant performance improvement. This paper provides an overview of the design of HCB index and reports the experimental results.

Keywords: B-tree, compare-and-swap, lock-free hash table, point queries, range queries, SAP ASE database

Procedia PDF Downloads 256

24091 Automatic Tuning for a Systemic Model of Banking Originated Losses (SYMBOL) Tool on Multicore

Authors: Ronal Muresano, Andrea Pagano

Abstract:

Nowadays, the mathematical/statistical applications are developed with more complexity and accuracy. However, these precisions and complexities have brought as result that applications need more computational power in order to be executed faster. In this sense, the multicore environments are playing an important role to improve and to optimize the execution time of these applications. These environments allow us the inclusion of more parallelism inside the node. However, to take advantage of this parallelism is not an easy task, because we have to deal with some problems such as: cores communications, data locality, memory sizes (cache and RAM), synchronizations, data dependencies on the model, etc. These issues are becoming more important when we wish to improve the application’s performance and scalability. Hence, this paper describes an optimization method developed for Systemic Model of Banking Originated Losses (SYMBOL) tool developed by the European Commission, which is based on analyzing the application's weakness in order to exploit the advantages of the multicore. All these improvements are done in an automatic and transparent manner with the aim of improving the performance metrics of our tool. Finally, experimental evaluations show the effectiveness of our new optimized version, in which we have achieved a considerable improvement on the execution time. The time has been reduced around 96% for the best case tested, between the original serial version and the automatic parallel version.

Keywords: algorithm optimization, bank failures, OpenMP, parallel techniques, statistical tool

Procedia PDF Downloads 343

24090 Enhanced Disk-Based Databases towards Improved Hybrid in-Memory Systems

Authors: Samuel Kaspi, Sitalakshmi Venkatraman

Abstract:

In-memory database systems are becoming popular due to the availability and affordability of sufficiently large RAM and processors in modern high-end servers with the capacity to manage large in-memory database transactions. While fast and reliable in-memory systems are still being developed to overcome cache misses, CPU/IO bottlenecks and distributed transaction costs, disk-based data stores still serve as the primary persistence. In addition, with the recent growth in multi-tenancy cloud applications and associated security concerns, many organisations consider the trade-offs and continue to require fast and reliable transaction processing of disk-based database systems as an available choice. For these organizations, the only way of increasing throughput is by improving the performance of disk-based concurrency control. This warrants a hybrid database system with the ability to selectively apply an enhanced disk-based data management within the context of in-memory systems that would help improve overall throughput. The general view is that in-memory systems substantially outperform disk-based systems. We question this assumption and examine how a modified variation of access invariance that we call enhanced memory access, (EMA) can be used to allow very high levels of concurrency in the pre-fetching of data in disk-based systems. We demonstrate how this prefetching in disk-based systems can yield close to in-memory performance, which paves the way for improved hybrid database systems. This paper proposes a novel EMA technique and presents a comparative study between disk-based EMA systems and in-memory systems running on hardware configurations of equivalent power in terms of the number of processors and their speeds. The results of the experiments conducted clearly substantiate that when used in conjunction with all concurrency control mechanisms, EMA can increase the throughput of disk-based systems to levels quite close to those achieved by in-memory system. The promising results of this work show that enhanced disk-based systems facilitate in improving hybrid data management within the broader context of in-memory systems.

Keywords: in-memory database, disk-based system, hybrid database, concurrency control

Procedia PDF Downloads 383

24089 The Classical and Hellenistic Architectural Elements of the Temple of Echmun in Sidon

Authors: Amal Alatar

Abstract:

The paper focuses on the exploration of architectural characteristics and decorative elements of the temple of Echmun, emphasizing the socio-economic significance of Sidon during the Greek and Roman periods to understand the implications of their spread and development on the Phoenician cities, as well as reveal the symbolical and societal connotations that may have been connected with the buildings, in order to allow a well-founded examination of common characteristics. In general, studying Phoenician archaeology posed some problems. The main problem is that most major Phoenician settlements lay beneath modern urban centers. This situation often prevented or largely restricted full archaeological investigations; the publications are frequently not complete enough to determine the basic characteristics of the architectural elements. Another key problem is the political instability of the region, which affected the archaeological research in the Phoenician homeland for many years. Nevertheless, during the past decades, an ever-growing cache of data was acquired from the archaeological surroundings of the Phoenician sites. Both the architectural elements from the Greek and Roman period have never been studied as a group before. Surprisingly, they have been largely ignored, despite their apparent profusion throughout the cities. The Roman period of Sidon has generally been neglected in preference to earlier periods, where it is often difficult to distinguish between Roman, Bronze age, medieval and Ottoman structures.

Keywords: archaeology, classical, Hellenistic, Eshmun Temple, architecture, Sidon, Lebanon

Procedia PDF Downloads 61

24088 Low Power CNFET SRAM Design

Authors: Pejman Hosseiniun, Rose Shayeghi, Iman Rahbari, Mohamad Reza Kalhor

Abstract:

CNFET has emerged as an alternative material to silicon for high performance, high stability and low power SRAM design in recent years. SRAM functions as cache memory in computers and many portable devices. In this paper, a new SRAM cell design based on CNFET technology is proposed. The proposed SRAM cell design for CNFET is compared with SRAM cell designs implemented with the conventional CMOS and FinFET in terms of speed, power consumption, stability, and leakage current. The HSPICE simulation and analysis show that the dynamic power consumption of the proposed 8T CNFET SRAM cell’s is reduced about 48% and the SNM is widened up to 56% compared to the conventional CMOS SRAM structure at the expense of 2% leakage power and 3% write delay increase.

Keywords: SRAM cell, CNFET, low power, HSPICE

Procedia PDF Downloads 371

24087 Security Design of Root of Trust Based on RISC-V

Authors: Kang Huang, Wanting Zhou, Shiwei Yuan, Lei Li

Abstract:

Since information technology develops rapidly, the security issue has become an increasingly critical for computer system. In particular, as cloud computing and the Internet of Things (IoT) continue to gain widespread adoption, computer systems need to new security threats and attacks. The Root of Trust (RoT) is the foundation for providing basic trusted computing, which is used to verify the security and trustworthiness of other components. Design a reliable Root of Trust and guarantee its own security are essential for improving the overall security and credibility of computer systems. In this paper, we discuss the implementation of self-security technology based on the RISC-V Root of Trust at the hardware level. To effectively safeguard the security of the Root of Trust, researches on security safeguard technology on the Root of Trust have been studied. At first, a lightweight and secure boot framework is proposed as a secure mechanism. Secondly, two kinds of memory protection mechanism are built to against memory attacks. Moreover, hardware implementation of proposed method has been also investigated. A series of experiments and tests have been carried on to verify to effectiveness of the proposed method. The experimental results demonstrated that the proposed approach is effective in verifying the integrity of the Root of Trust’s own boot rom, user instructions, and data, ensuring authenticity and enabling the secure boot of the Root of Trust’s own system. Additionally, our approach provides memory protection against certain types of memory attacks, such as cache leaks and tampering, and ensures the security of root-of-trust sensitive information, including keys.

Keywords: root of trust, secure boot, memory protection, hardware security

Procedia PDF Downloads 141

24086 Processing Big Data: An Approach Using Feature Selection

Authors: Nikat Parveen, M. Ananthi

Abstract:

Big data is one of the emerging technology, which collects the data from various sensors and those data will be used in many fields. Data retrieval is one of the major issue where there is a need to extract the exact data as per the need. In this paper, large amount of data set is processed by using the feature selection. Feature selection helps to choose the data which are actually needed to process and execute the task. The key value is the one which helps to point out exact data available in the storage space. Here the available data is streamed and R-Center is proposed to achieve this task.

Keywords: big data, key value, feature selection, retrieval, performance

Procedia PDF Downloads 304

24085 Applications of Big Data in Education

Authors: Faisal Kalota

Abstract:

Big Data and analytics have gained a huge momentum in recent years. Big Data feeds into the field of Learning Analytics (LA) that may allow academic institutions to better understand the learners’ needs and proactively address them. Hence, it is important to have an understanding of Big Data and its applications. The purpose of this descriptive paper is to provide an overview of Big Data, the technologies used in Big Data, and some of the applications of Big Data in education. Additionally, it discusses some of the concerns related to Big Data and current research trends. While Big Data can provide big benefits, it is important that institutions understand their own needs, infrastructure, resources, and limitation before jumping on the Big Data bandwagon.

Keywords: big data, learning analytics, analytics, big data in education, Hadoop

Procedia PDF Downloads 380

24084 Analysis of Big Data

Authors: Sandeep Sharma, Sarabjit Singh

Abstract:

As per the user demand and growth trends of large free data the storage solutions are now becoming more challenge-able to protect, store and to retrieve data. The days are not so far when the storage companies and organizations are start saying 'no' to store our valuable data or they will start charging a huge amount for its storage and protection. On the other hand as per the environmental conditions it becomes challenge-able to maintain and establish new data warehouses and data centers to protect global warming threats. A challenge of small data is over now, the challenges are big that how to manage the exponential growth of data. In this paper we have analyzed the growth trend of big data and its future implications. We have also focused on the impact of the unstructured data on various concerns and we have also suggested some possible remedies to streamline big data.

Keywords: big data, unstructured data, volume, variety, velocity

Procedia PDF Downloads 510

24083 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, WangQun Lin

Abstract:

This paper introduces the concept and principle of data cleaning, analyzes the types and causes of dirty data, and proposes several key steps of typical cleaning process, puts forward a well scalability and versatility data cleaning framework, in view of data with attribute dependency relation, designs several of violation data discovery algorithms by formal formula, which can obtain inconsistent data to all target columns with condition attribute dependent no matter data is structured (SQL) or unstructured (NoSQL), and gives 6 data cleaning methods based on these algorithms.

Keywords: data cleaning, dependency rules, violation data discovery, data repair

Procedia PDF Downloads 530

24082 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Many types of research are concerned with mining massive amounts of data and big data streams. Mining big data faces a lot of challenges including scalability, speed, heterogeneity, accuracy, provenance and privacy. In telecommunication industry, mining big data is like a mining for gold; it represents a big opportunity and maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication and the benefits and opportunities gained from them.

Keywords: mining big data, big data, machine learning, telecommunication

Procedia PDF Downloads 364

24081 JavaScript Object Notation Data against eXtensible Markup Language Data in Software Applications a Software Testing Approach

Authors: Theertha Chandroth

Abstract:

This paper presents a comparative study on how to check JSON (JavaScript Object Notation) data against XML (eXtensible Markup Language) data from a software testing point of view. JSON and XML are widely used data interchange formats, each with its unique syntax and structure. The objective is to explore various techniques and methodologies for validating comparison and integration between JSON data to XML and vice versa. By understanding the process of checking JSON data against XML data, testers, developers and data practitioners can ensure accurate data representation, seamless data interchange, and effective data validation.

Keywords: XML, JSON, data comparison, integration testing, Python, SQL

Procedia PDF Downloads 83

24080 Multi-Source Data Fusion for Urban Comprehensive Management

Authors: Bolin Hua

Abstract:

In city governance, various data are involved, including city component data, demographic data, housing data and all kinds of business data. These data reflects different aspects of people, events and activities. Data generated from various systems are different in form and data source are different because they may come from different sectors. In order to reflect one or several facets of an event or rule, data from multiple sources need fusion together. Data from different sources using different ways of collection raised several issues which need to be resolved. Problem of data fusion include data update and synchronization, data exchange and sharing, file parsing and entry, duplicate data and its comparison, resource catalogue construction. Governments adopt statistical analysis, time series analysis, extrapolation, monitoring analysis, value mining, scenario prediction in order to achieve pattern discovery, law verification, root cause analysis and public opinion monitoring. The result of Multi-source data fusion is to form a uniform central database, which includes people data, location data, object data, and institution data, business data and space data. We need to use meta data to be referred to and read when application needs to access, manipulate and display the data. A uniform meta data management ensures effectiveness and consistency of data in the process of data exchange, data modeling, data cleansing, data loading, data storing, data analysis, data search and data delivery.

Keywords: multi-source data fusion, urban comprehensive management, information fusion, government data

Procedia PDF Downloads 347

24079 Reviewing Privacy Preserving Distributed Data Mining

Authors: Sajjad Baghernezhad, Saeideh Baghernezhad

Abstract:

Nowadays considering human involved in increasing data development some methods such as data mining to extract science are unavoidable. One of the discussions of data mining is inherent distribution of the data usually the bases creating or receiving such data belong to corporate or non-corporate persons and do not give their information freely to others. Yet there is no guarantee to enable someone to mine special data without entering in the owner’s privacy. Sending data and then gathering them by each vertical or horizontal software depends on the type of their preserving type and also executed to improve data privacy. In this study it was attempted to compare comprehensively preserving data methods; also general methods such as random data, coding and strong and weak points of each one are examined.

Keywords: data mining, distributed data mining, privacy protection, privacy preserving

Procedia PDF Downloads 487

24078 The Right to Data Portability and Its Influence on the Development of Digital Services

Authors: Roman Bieda

Abstract:

The General Data Protection Regulation (GDPR) will come into force on 25 May 2018 which will create a new legal framework for the protection of personal data in the European Union. Article 20 of GDPR introduces a right to data portability. This right allows for data subjects to receive the personal data which they have provided to a data controller, in a structured, commonly used and machine-readable format, and to transmit this data to another data controller. The right to data portability, by facilitating transferring personal data between IT environments (e.g.: applications), will also facilitate changing the provider of services (e.g. changing a bank or a cloud computing service provider). Therefore, it will contribute to the development of competition and the digital market. The aim of this paper is to discuss the right to data portability and its influence on the development of new digital services.

Keywords: data portability, digital market, GDPR, personal data

Procedia PDF Downloads 439

24077 Recent Advances in Data Warehouse

Authors: Fahad Hanash Alzahrani

Abstract:

This paper describes some recent advances in a quickly developing area of data storing and processing based on Data Warehouses and Data Mining techniques, which are associated with software, hardware, data mining algorithms and visualisation techniques having common features for any specific problems and tasks of their implementation.

Keywords: data warehouse, data mining, knowledge discovery in databases, on-line analytical processing

Procedia PDF Downloads 364

24076 How to Use Big Data in Logistics Issues

Authors: Mehmet Akif Aslan, Mehmet Simsek, Eyup Sensoy

Abstract:

Big Data stands for today’s cutting-edge technology. As the technology becomes widespread, so does Data. Utilizing massive data sets enable companies to get competitive advantages over their adversaries. Out of many area of Big Data usage, logistics has significance role in both commercial sector and military. This paper lays out what big data is and how it is used in both military and commercial logistics.

Keywords: big data, logistics, operational efficiency, risk management

Procedia PDF Downloads 608

24075 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, more various information can be extracted. In addition, development and dissemination of boards such as Arduino and Raspberry Pie have made it possible to easily test various sensors, and it is possible to collect sensor data directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, there are many difficulties in using the board to collect data, and there are many difficulties in using it when the user is not a computer programmer, or when using it for the first time. Even if data are collected, lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: clustering, data mining, DBSCAN, k-means, k-medoids, sensor data

Procedia PDF Downloads 342