Search results for: lossless data compression
7461 Biological Data Integration using SOA
Authors: Noura Meshaan Al-Otaibi, Amin Yousef Noaman
Abstract:
Nowadays scientific data is inevitably digital and stored in a wide variety of formats in heterogeneous systems. Scientists need to access an integrated view of remote or local heterogeneous data sources with advanced data accessing, analyzing, and visualization tools. This research suggests the use of Service Oriented Architecture (SOA) to integrate biological data from different data sources. This work shows SOA will solve the problems that facing integration process and if the biologist scientists can access the biological data in easier way. There are several methods to implement SOA but web service is the most popular method. The Microsoft .Net Framework used to implement proposed architecture.Keywords: Bioinformatics, Biological data, Data Integration, SOA and Web Services.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24447460 STATISTICA Software: A State of the Art Review
Authors: S. Sarumathi, N. Shanthi, S. Vidhya, P. Ranjetha
Abstract:
Data mining idea is mounting rapidly in admiration and also in their popularity. The foremost aspire of data mining method is to extract data from a huge data set into several forms that could be comprehended for additional use. The data mining is a technology that contains with rich potential resources which could be supportive for industries and businesses that pay attention to collect the necessary information of the data to discover their customer’s performances. For extracting data there are several methods are available such as Classification, Clustering, Association, Discovering, and Visualization… etc., which has its individual and diverse algorithms towards the effort to fit an appropriate model to the data. STATISTICA mostly deals with excessive groups of data that imposes vast rigorous computational constraints. These results trials challenge cause the emergence of powerful STATISTICA Data Mining technologies. In this survey an overview of the STATISTICA software is illustrated along with their significant features.
Keywords: Data Mining, STATISTICA Data Miner, Text Miner, Enterprise Server, Classification, Association, Clustering, Regression.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 25937459 Proposal of Data Collection from Probes
Authors: M. Kebisek, L. Spendla, M. Kopcek, T. Skulavik
Abstract:
In our paper we describe the security capabilities of data collection. Data are collected with probes located in the near and distant surroundings of the company. Considering the numerous obstacles e.g. forests, hills, urban areas, the data collection is realized in several ways. The collection of data uses connection via wireless communication, LAN network, GSM network and in certain areas data are collected by using vehicles. In order to ensure the connection to the server most of the probes have ability to communicate in several ways. Collected data are archived and subsequently used in supervisory applications. To ensure the collection of the required data, it is necessary to propose algorithms that will allow the probes to select suitable communication channel.
Keywords: Communication, computer network, data collection, probe.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17697458 Linguistic Summarization of Structured Patent Data
Authors: E. Y. Igde, S. Aydogan, F. E. Boran, D. Akay
Abstract:
Patent data have an increasingly important role in economic growth, innovation, technical advantages and business strategies and even in countries competitions. Analyzing of patent data is crucial since patents cover large part of all technological information of the world. In this paper, we have used the linguistic summarization technique to prove the validity of the hypotheses related to patent data stated in the literature.Keywords: Data mining, fuzzy sets, linguistic summarization, patent data.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11957457 Metadata Update Mechanism Improvements in Data Grid
Authors: S. Farokhzad, M. Reza Salehnamadi
Abstract:
Grid environments include aggregation of geographical distributed resources. Grid is put forward in three types of computational, data and storage. This paper presents a research on data grid. Data grid is used for covering and securing accessibility to data from among many heterogeneous sources. Users are not worry on the place where data is located in it, provided that, they should get access to the data. Metadata is used for getting access to data in data grid. Presently, application metadata catalogue and SRB middle-ware package are used in data grids for management of metadata. At this paper, possibility of updating, streamlining and searching is provided simultaneously and rapidly through classified table of preserving metadata and conversion of each table to numerous tables. Meanwhile, with regard to the specific application, the most appropriate and best division is set and determined. Concurrency of implementation of some of requests and execution of pipeline is adaptability as a result of this technique.Keywords: Grids, data grid, metadata, update.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16797456 Optimized Approach for Secure Data Sharing in Distributed Database
Authors: Ahmed Mateen, Zhu Qingsheng, Ahmad Bilal
Abstract:
In the current age of technology, information is the most precious asset of a company. Today, companies have a large amount of data. As the data become larger, access to data for some particular information is becoming slower day by day. Faster data processing to shape it in the form of information is the biggest issue. The major problems in distributed databases are the efficiency of data distribution and response time of data distribution. The security of data distribution is also a big issue. For these problems, we proposed a strategy that can maximize the efficiency of data distribution and also increase its response time. This technique gives better results for secure data distribution from multiple heterogeneous sources. The newly proposed technique facilitates the companies for secure data sharing efficiently and quickly.
Keywords: ER-schema, electronic record, P2P framework, API, query formulation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10527455 Time Temperature Dependence of Long Fiber Reinforced Polypropylene Manufactured by Direct Long Fiber Thermoplastic Process
Authors: K. A. Weidenmann, M. Grigo, B. Brylka, P. Elsner, T. Böhlke
Abstract:
In order to reduce fuel consumption, the weight of automobiles has to be reduced. Fiber reinforced polymers offer the potential to reach this aim because of their high stiffness to weight ratio. Additionally, the use of fiber reinforced polymers in automotive applications has to allow for an economic large-scale production. In this regard, long fiber reinforced thermoplastics made by direct processing offer both mechanical performance and processability in injection moulding and compression moulding. The work presented in this contribution deals with long glass fiber reinforced polypropylene directly processed in compression moulding (D-LFT). For the use in automotive applications both the temperature and the time dependency of the materials properties have to be investigated to fulfill performance requirements during crash or the demands of service temperatures ranging from -40 °C to 80 °C. To consider both the influence of temperature and time, quasistatic tensile tests have been carried out at different temperatures. These tests have been complemented by high speed tensile tests at different strain rates. As expected, the increase in strain rate results in an increase of the elastic modulus which correlates to an increase of the stiffness with decreasing service temperature. The results are in good accordance with results determined by dynamic mechanical analysis within the range of 0.1 to 100 Hz. The experimental results from different testing methods were grouped and interpreted by using different time temperature shift approaches. In this regard, Williams-Landel-Ferry and Arrhenius approach based on kinetics have been used. As the theoretical shift factor follows an arctan function, an empirical approach was also taken into consideration. It could be shown that this approach describes best the time and temperature superposition for glass fiber reinforced polypropylene manufactured by D-LFT processing.
Keywords: Composite, long fiber reinforced thermoplastics, mechanical properties, dynamic mechanical analysis, time temperature superposition.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16797454 A Sub-mW Low Noise Amplifier for Wireless Sensor Networks
Authors: Gianluca Cornetta, David J. Santos, Balwant Godara
Abstract:
A 1.2 V, 0.61 mA bias current, low noise amplifier (LNA) suitable for low-power applications in the 2.4 GHz band is presented. Circuit has been implemented, laid out and simulated using a UMC 130 nm RF-CMOS process. The amplifier provides a 13.3 dB power gain a noise figure NF< 2.28 dB and a 1-dB compression point of -15.69 dBm, while dissipating 0.74 mW. Such performance make this design suitable for wireless sensor networks applications such as ZigBee.Keywords: Current Reuse, IEEE 802.15.4 (ZigBee), Low NoiseAmplifiers, Wireless Sensor Networks.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18007453 Using Data Clustering in Oral Medicine
Authors: Fahad Shahbaz Khan, Rao Muhammad Anwer, Olof Torgersson
Abstract:
The vast amount of information hidden in huge databases has created tremendous interests in the field of data mining. This paper examines the possibility of using data clustering techniques in oral medicine to identify functional relationships between different attributes and classification of similar patient examinations. Commonly used data clustering algorithms have been reviewed and as a result several interesting results have been gathered.Keywords: Oral Medicine, Cluto, Data Clustering, Data Mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19547452 The Application of Data Mining Technology in Building Energy Consumption Data Analysis
Authors: Liang Zhao, Jili Zhang, Chongquan Zhong
Abstract:
Energy consumption data, in particular those involving public buildings, are impacted by many factors: the building structure, climate/environmental parameters, construction, system operating condition, and user behavior patterns. Traditional methods for data analysis are insufficient. This paper delves into the data mining technology to determine its application in the analysis of building energy consumption data including energy consumption prediction, fault diagnosis, and optimal operation. Recent literature are reviewed and summarized, the problems faced by data mining technology in the area of energy consumption data analysis are enumerated, and research points for future studies are given.
Keywords: Data mining, data analysis, prediction, optimization, building operational performance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 36877451 Query Algebra for Semistuctured Data
Authors: Ei Ei Myat, Ni Lar Thein
Abstract:
With the tremendous growth of World Wide Web (WWW) data, there is an emerging need for effective information retrieval at the document level. Several query languages such as XML-QL, XPath, XQL, Quilt and XQuery are proposed in recent years to provide faster way of querying XML data, but they still lack of generality and efficiency. Our approach towards evolving a framework for querying semistructured documents is based on formal query algebra. Two elements are introduced in the proposed framework: first, a generic and flexible data model for logical representation of semistructured data and second, a set of operators for the manipulation of objects defined in the data model. In additional to accommodating several peculiarities of semistructured data, our model offers novel features such as bidirectional paths for navigational querying and partitions for data transformation that are not available in other proposals.Keywords: Algebra, Semistructured data, Query Algebra.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13577450 Application of Recycled Tungsten Carbide Powder for Fabrication of Iron Based Powder Metallurgy Alloy
Authors: Yukinori Taniguchi, Kazuyoshi Kurita, Kohei Mizuta, Keigo Nishitani, Ryuichi Fukuda
Abstract:
Tungsten carbide is widely used as a tool material in metal manufacturing process. Since tungsten is typical rare metal, establishment of recycle process of tungsten carbide tools and restore into cemented carbide material bring great impact to metal manufacturing industry. Recently, recycle process of tungsten carbide has been developed and established gradually. However, the demands for quality of cemented carbide tool are quite severe because hardness, toughness, anti-wear ability, heat resistance, fatigue strength and so on should be guaranteed for precision machining and tool life. Currently, it is hard to restore the recycled tungsten carbide powder entirely as raw material for new processed cemented carbide tool. In this study, to suggest positive use of recycled tungsten carbide powder, we have tried to fabricate a carbon based sintered steel which shows reinforced mechanical properties with recycled tungsten carbide powder. We have made set of newly designed sintered steels. Compression test of sintered specimen in density ratio of 0.85 (which means 15% porosity inside) has been conducted. As results, at least 1.7 times higher in nominal strength in the amount of 7.0 wt.% was shown in recycled WC powder. The strength reached to over 600 MPa for the Fe-WC-Co-Cu sintered alloy. Wear test has been conducted by using ball-on-disk type friction tester using 5 mm diameter ball with normal force of 2 N in the dry conditions. Wear amount after 1,000 m running distance shows that about 1.5 times longer life was shown in designed sintered alloy. Since results of tensile test showed that same tendency in previous testing, it is concluded that designed sintered alloy can be used for several mechanical parts with special strength and anti-wear ability in relatively low cost due to recycled tungsten carbide powder.Keywords: Tungsten carbide, recycle process, compression test, powder metallurgy, anti-wear ability.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14517449 Simulation Data Summarization Based on Spatial Histograms
Authors: Jing Zhao, Yoshiharu Ishikawa, Chuan Xiao, Kento Sugiura
Abstract:
In order to analyze large-scale scientific data, research on data exploration and visualization has gained popularity. In this paper, we focus on the exploration and visualization of scientific simulation data, and define a spatial V-Optimal histogram for data summarization. We propose histogram construction algorithms based on a general binary hierarchical partitioning as well as a more specific one, the l-grid partitioning. For effective data summarization and efficient data visualization in scientific data analysis, we propose an optimal algorithm as well as a heuristic algorithm for histogram construction. To verify the effectiveness and efficiency of the proposed methods, we conduct experiments on the massive evacuation simulation data.Keywords: Simulation data, data summarization, spatial histograms, exploration and visualization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7307448 Joint Use of Factor Analysis (FA) and Data Envelopment Analysis (DEA) for Ranking of Data Envelopment Analysis
Authors: Reza Nadimi, Fariborz Jolai
Abstract:
This article combines two techniques: data envelopment analysis (DEA) and Factor analysis (FA) to data reduction in decision making units (DMU). Data envelopment analysis (DEA), a popular linear programming technique is useful to rate comparatively operational efficiency of decision making units (DMU) based on their deterministic (not necessarily stochastic) input–output data and factor analysis techniques, have been proposed as data reduction and classification technique, which can be applied in data envelopment analysis (DEA) technique for reduction input – output data. Numerical results reveal that the new approach shows a good consistency in ranking with DEA.Keywords: Effectiveness, Decision Making, Data EnvelopmentAnalysis, Factor Analysis
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24067447 Tribological Investigation and the Effect of Karanja Biodiesel on Engine Wear in Compression Ignition Engine
Authors: Ajay V. Kolhe, R. E. Shelke, S. S. Khandare
Abstract:
Various biomass based resources, which can be used as an extender, or a complete substitute of diesel fuel may have very significant role in the development of agriculture, industrial and transport sectors in the energy crisis. Use of Karanja oil methyl ester biodiesel in a CI DI engine was found highly compatible with engine performance along with lower exhaust emission as compared to diesel fuel but with slightly higher NOx emission and low wear characteristics. The combustion related properties of vegetable oils are somewhat similar to diesel oil. Neat vegetable oils or their blends with diesel, however, pose various long-term problems in compression ignition engines. These undesirable features of vegetable oils are because of their inherent properties like high viscosity, low volatility, and polyunsaturated character. Pongamia methyl ester (PME) was prepared by transesterification process using methanol for long term engine operations. The physical and combustion-related properties of the fuels thus developed were found to be closer to that of the diesel. A neat biodiesel (PME) was selected as a fuel for the tribological study of biofuels. Two similar new engines were completely disassembled and subjected to dimensioning of various vital moving parts and then subjected to long-term endurance tests on neat biodiesel and diesel respectively. After completion of the test, both the engines were again disassembled for physical inspection and wear measurement of various vital parts. The lubricating oil samples drawn from both engines were subjected to atomic absorption spectroscopy (AAS) for measurement of various wear metal traces present. The additional lubricating property of biodiesel fuel due to higher viscosity as compared to diesel fuel resulted in lower wear of moving parts and thus improved the engine durability with a bio-diesel fuel. Results reported from AAS tests confirmed substantially lower wear and thus improved life for biodiesel operated engines.
Keywords: Transesterification, PME, wear of engine parts, Metal traces and AAS.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24317446 Observations about the Principal Components Analysis and Data Clustering Techniques in the Study of Medical Data
Authors: Cristina G. Dascâlu, Corina Dima Cozma, Elena Carmen Cotrutz
Abstract:
The medical data statistical analysis often requires the using of some special techniques, because of the particularities of these data. The principal components analysis and the data clustering are two statistical methods for data mining very useful in the medical field, the first one as a method to decrease the number of studied parameters, and the second one as a method to analyze the connections between diagnosis and the data about the patient-s condition. In this paper we investigate the implications obtained from a specific data analysis technique: the data clustering preceded by a selection of the most relevant parameters, made using the principal components analysis. Our assumption was that, using the principal components analysis before data clustering - in order to select and to classify only the most relevant parameters – the accuracy of clustering is improved, but the practical results showed the opposite fact: the clustering accuracy decreases, with a percentage approximately equal with the percentage of information loss reported by the principal components analysis.Keywords: Data clustering, medical data, principal components analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14857445 A Simple Adaptive Atomic Decomposition Voice Activity Detector Implemented by Matching Pursuit
Authors: Thomas Bryan, Veton Kepuska, Ivica Kostanic
Abstract:
A simple adaptive voice activity detector (VAD) is implemented using Gabor and gammatone atomic decomposition of speech for high Gaussian noise environments. Matching pursuit is used for atomic decomposition, and is shown to achieve optimal speech detection capability at high data compression rates for low signal to noise ratios. The most active dictionary elements found by matching pursuit are used for the signal reconstruction so that the algorithm adapts to the individual speakers dominant time-frequency characteristics. Speech has a high peak to average ratio enabling matching pursuit greedy heuristic of highest inner products to isolate high energy speech components in high noise environments. Gabor and gammatone atoms are both investigated with identical logarithmically spaced center frequencies, and similar bandwidths. The algorithm performs equally well for both Gabor and gammatone atoms with no significant statistical differences. The algorithm achieves 70% accuracy at a 0 dB SNR, 90% accuracy at a 5 dB SNR and 98% accuracy at a 20dB SNR using 30d B SNR as a reference for voice activity.Keywords: Atomic Decomposition, Gabor, Gammatone, Matching Pursuit, Voice Activity Detection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17777444 Hardware Implementation of Local Binary Pattern Based Two-Bit Transform Motion Estimation
Authors: Seda Yavuz, Anıl Çelebi, Aysun Taşyapı Çelebi, Oğuzhan Urhan
Abstract:
Nowadays, demand for using real-time video transmission capable devices is ever-increasing. So, high resolution videos have made efficient video compression techniques an essential component for capturing and transmitting video data. Motion estimation has a critical role in encoding raw video. Hence, various motion estimation methods are introduced to efficiently compress the video. Low bit‑depth representation based motion estimation methods facilitate computation of matching criteria and thus, provide small hardware footprint. In this paper, a hardware implementation of a two-bit transformation based low-complexity motion estimation method using local binary pattern approach is proposed. Image frames are represented in two-bit depth instead of full-depth by making use of the local binary pattern as a binarization approach and the binarization part of the hardware architecture is explained in detail. Experimental results demonstrate the difference between the proposed hardware architecture and the architectures of well-known low-complexity motion estimation methods in terms of important aspects such as resource utilization, energy and power consumption.
Keywords: Binarization, hardware architecture, local binary pattern, motion estimation, two-bit transform.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13607443 Missing Link Data Estimation with Recurrent Neural Network: An Application Using Speed Data of Daegu Metropolitan Area
Authors: JaeHwan Yang, Da-Woon Jeong, Seung-Young Kho, Dong-Kyu Kim
Abstract:
In terms of ITS, information on link characteristic is an essential factor for plan or operation. But in practical cases, not every link has installed sensors on it. The link that does not have data on it is called “Missing Link”. The purpose of this study is to impute data of these missing links. To get these data, this study applies the machine learning method. With the machine learning process, especially for the deep learning process, missing link data can be estimated from present link data. For deep learning process, this study uses “Recurrent Neural Network” to take time-series data of road. As input data, Dedicated Short-range Communications (DSRC) data of Dalgubul-daero of Daegu Metropolitan Area had been fed into the learning process. Neural Network structure has 17 links with present data as input, 2 hidden layers, for 1 missing link data. As a result, forecasted data of target link show about 94% of accuracy compared with actual data.Keywords: Data Estimation, link data, machine learning, road network.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14807442 CNet Module Design of IMCS
Authors: Youkyung Park, SeungYup Kang, SungHo Kim, SimKyun Yook
Abstract:
IMCS is Integrated Monitoring and Control System for thermal power plant. This system consists of mainly two parts; controllers and OIS (Operator Interface System). These two parts are connected by Ethernet-based communication. The controller side of communication is managed by CNet module and OIS side is managed by data server of OIS. CNet module sends the data of controller to data server and receives commend data from data server. To minimizes or balance the load of data server, this module buffers data created by controller at every cycle and send buffered data to data server on request of data server. For multiple data server, this module manages the connection line with each data server and response for each request from multiple data server. CNet module is included in each controller of redundant system. When controller fail-over happens on redundant system, this module can provide data of controller to data sever without loss. This paper presents three main features – separation of get task, usage of ring buffer and monitoring communication status –of CNet module to carry out these functions.Keywords: Ethernet communication, DCS, power plant, ring buffer, data integrity
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15517441 A Robust Watermarking using Blind Source Separation
Authors: Anil Kumar, K. Negrat, A. M. Negrat, Abdelsalam Almarimi
Abstract:
In this paper, we present a robust and secure algorithm for watermarking, the watermark is first transformed into the frequency domain using the discrete wavelet transform (DWT). Then the entire DWT coefficient except the LL (Band) discarded, these coefficients are permuted and encrypted by specific mixing. The encrypted coefficients are inserted into the most significant spectral components of the stego-image using a chaotic system. This technique makes our watermark non-vulnerable to the attack (like compression, and geometric distortion) of an active intruder, or due to noise in the transmission link.Keywords: Blind source separation (BSS), Chaotic system, Watermarking, DWT.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15147440 Design of a DCT-based Image Compression with Efficient Enhancement Filter
Authors: Yen-Yu Chen, Pao-Ching Chu, Ya-Ling Tsai
Abstract:
The algorithm represents the DCT coefficients to concentrate signal energy and proposes combination and dictator to eliminate the correlation in the same level subband for encoding the DCT-based images. This work adopts DCT and modifies the SPIHT algorithm to encode DCT coefficients. The proposed algorithm also provides the enhancement function in low bit rate in order to improve the perceptual quality. Experimental results indicate that the proposed technique improves the quality of the reconstructed image in terms of both PSNR and the perceptual results close to JPEG2000 at the same bit rate.
Keywords: JPEG 2000, enhancement filter
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16717439 Optical Fish Tracking in Fishways using Neural Networks
Authors: Alvaro Rodriguez, Maria Bermudez, Juan R. Rabuñal, Jeronimo Puertas
Abstract:
One of the main issues in Computer Vision is to extract the movement of one or several points or objects of interest in an image or video sequence to conduct any kind of study or control process. Different techniques to solve this problem have been applied in numerous areas such as surveillance systems, analysis of traffic, motion capture, image compression, navigation systems and others, where the specific characteristics of each scenario determine the approximation to the problem. This paper puts forward a Computer Vision based algorithm to analyze fish trajectories in high turbulence conditions in artificial structures called vertical slot fishways, designed to allow the upstream migration of fish through obstructions in rivers. The suggested algorithm calculates the position of the fish at every instant starting from images recorded with a camera and using neural networks to execute fish detection on images. Different laboratory tests have been carried out in a full scale fishway model and with living fishes, allowing the reconstruction of the fish trajectory and the measurement of velocities and accelerations of the fish. These data can provide useful information to design more effective vertical slot fishways.
Keywords: Computer Vision, Neural Network, Fishway, Fish Trajectory, Tracking
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19877438 Big Data: Concepts, Technologies and Applications in the Public Sector
Authors: A. Alexandru, C. A. Alexandru, D. Coardos, E. Tudora
Abstract:
Big Data (BD) is associated with a new generation of technologies and architectures which can harness the value of extremely large volumes of very varied data through real time processing and analysis. It involves changes in (1) data types, (2) accumulation speed, and (3) data volume. This paper presents the main concepts related to the BD paradigm, and introduces architectures and technologies for BD and BD sets. The integration of BD with the Hadoop Framework is also underlined. BD has attracted a lot of attention in the public sector due to the newly emerging technologies that allow the availability of network access. The volume of different types of data has exponentially increased. Some applications of BD in the public sector in Romania are briefly presented.
Keywords: Big data, big data Analytics, Hadoop framework, cloud computing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23007437 Strengthening Legal Protection of Personal Data through Technical Protection Regulation in Line with Human Rights
Authors: Tomy Prihananto, Damar Apri Sudarmadi
Abstract:
Indonesia recognizes the right to privacy as a human right. Indonesia provides legal protection against data management activities because the protection of personal data is a part of human rights. This paper aims to describe the arrangement of data management and data management in Indonesia. This paper is a descriptive research with qualitative approach and collecting data from literature study. Results of this paper are comprehensive arrangement of data that have been set up as a technical requirement of data protection by encryption methods. Arrangements on encryption and protection of personal data are mutually reinforcing arrangements in the protection of personal data. Indonesia has two important and immediately enacted laws that provide protection for the privacy of information that is part of human rights.
Keywords: Indonesia, protection, personal data, privacy, human rights, encryption.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9637436 Big Brain: A Single Database System for a Federated Data Warehouse Architecture
Authors: X. Gumara Rigol, I. Martínez de Apellaniz Anzuola, A. Garcia Serrano, A. Franzi Cros, O. Vidal Calbet, A. Al Maruf
Abstract:
Traditional federated architectures for data warehousing work well when corporations have existing regional data warehouses and there is a need to aggregate data at a global level. Schibsted Media Group has been maturing from a decentralised organisation into a more globalised one and needed to build both some of the regional data warehouses for some brands at the same time as the global one. In this paper, we present the architectural alternatives studied and why a custom federated approach was the notable recommendation to go further with the implementation. Although the data warehouses are logically federated, the implementation uses a single database system which presented many advantages like: cost reduction and improved data access to global users allowing consumers of the data to have a common data model for detailed analysis across different geographies and a flexible layer for local specific needs in the same place.Keywords: Data integration, data warehousing, federated architecture, online analytical processing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6917435 An In-Depth Analysis of Open Data Portals as an Emerging Public E-Service
Authors: Martin Lnenicka
Abstract:
Governments collect and produce large amounts of data. Increasingly, governments worldwide have started to implement open data initiatives and also launch open data portals to enable the release of these data in open and reusable formats. Therefore, a large number of open data repositories, catalogues and portals have been emerging in the world. The greater availability of interoperable and linkable open government data catalyzes secondary use of such data, so they can be used for building useful applications which leverage their value, allow insight, provide access to government services, and support transparency. The efficient development of successful open data portals makes it necessary to evaluate them systematic, in order to understand them better and assess the various types of value they generate, and identify the required improvements for increasing this value. Thus, the attention of this paper is directed particularly to the field of open data portals. The main aim of this paper is to compare the selected open data portals on the national level using content analysis and propose a new evaluation framework, which further improves the quality of these portals. It also establishes a set of considerations for involving businesses and citizens to create eservices and applications that leverage on the datasets available from these portals.
Keywords: Big data, content analysis, criteria comparison, data quality, open data, open data portals, public sector.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 30637434 ATM Service Analysis Using Predictive Data Mining
Authors: S. Madhavi, S. Abirami, C. Bharathi, B. Ekambaram, T. Krishna Sankar, A. Nattudurai, N. Vijayarangan
Abstract:
The high utilization rate of Automated Teller Machine (ATM) has inevitably caused the phenomena of waiting for a long time in the queue. This in turn has increased the out of stock situations. The ATM utilization helps to determine the usage level and states the necessity of the ATM based on the utilization of the ATM system. The time in which the ATM used more frequently (peak time) and based on the predicted solution the necessary actions are taken by the bank management. The analysis can be done by using the concept of Data Mining and the major part are analyzed based on the predictive data mining. The results are predicted from the historical data (past data) and track the relevant solution which is required. Weka tool is used for the analysis of data based on predictive data mining.
Keywords: ATM, Bank Management, Data Mining, Historical data, Predictive Data Mining, Weka tool.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 56007433 File System-Based Data Protection Approach
Authors: Jaechun No
Abstract:
As data to be stored in storage subsystems tremendously increases, data protection techniques have become more important than ever, to provide data availability and reliability. In this paper, we present the file system-based data protection (WOWSnap) that has been implemented using WORM (Write-Once-Read-Many) scheme. In the WOWSnap, once WORM files have been created, only the privileged read requests to them are allowed to protect data against any intentional/accidental intrusions. Furthermore, all WORM files are related to their protection cycle that is a time period during which WORM files should securely be protected. Once their protection cycle is expired, the WORM files are automatically moved to the general-purpose data section without any user interference. This prevents the WORM data section from being consumed by unnecessary files. We evaluated the performance of WOWSnap on Linux cluster.Keywords: Data protection, Protection cycle, WORM
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16537432 The Data Mining usage in Production System Management
Authors: Pavel Vazan, Pavol Tanuska, Michal Kebisek
Abstract:
The paper gives the pilot results of the project that is oriented on the use of data mining techniques and knowledge discoveries from production systems through them. They have been used in the management of these systems. The simulation models of manufacturing systems have been developed to obtain the necessary data about production. The authors have developed the way of storing data obtained from the simulation models in the data warehouse. Data mining model has been created by using specific methods and selected techniques for defined problems of production system management. The new knowledge has been applied to production management system. Gained knowledge has been tested on simulation models of the production system. An important benefit of the project has been proposal of the new methodology. This methodology is focused on data mining from the databases that store operational data about the production process.Keywords: data mining, data warehousing, management of production system, simulation
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3461