Search results for: data reception

7447 Research of Data Cleaning Methods Based on Dependency Rules

Authors: Yang Bao, Shi Wei Deng, Wang Qun Lin

Abstract:

This paper introduces the concept and principle of data cleaning, analyzes the types and causes of dirty data, and outlines the key steps of a typical cleaning process. It puts forward a scalable and versatile data cleaning framework. For data with attribute dependency relations, it formally specifies several violation data discovery algorithms that can find inconsistent data in all target columns subject to conditional attribute dependencies, whether the data is structured (SQL) or unstructured (NoSQL), and it gives six data cleaning methods based on these algorithms.

Keywords: Data cleaning, dependency rules, violation data discovery, data repair.

Downloads: 2612
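The violation discovery described in entry 7447 above can be illustrated with a minimal sketch. It assumes the simplest kind of dependency rule, a functional dependency in which the `lhs` columns determine one `rhs` column. The function name, the row layout, and the sample data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def find_fd_violations(rows, lhs, rhs):
    """Return groups of rows that agree on the `lhs` columns but disagree on `rhs`.

    Hypothetical helper: each row is a dict, `lhs` is a list of column names
    and `rhs` is a single column name.
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[column] for column in lhs)
        groups[key].append(row)

    violations = {}
    for key, members in groups.items():
        # More than one distinct right-hand value means the rule is broken.
        if len({member[rhs] for member in members}) > 1:
            violations[key] = members
    return violations


# Illustrative data: the rule "zip determines city" is violated for 10001.
rows = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "Newark"},
    {"zip": "94105", "city": "San Francisco"},
]
violations = find_fd_violations(rows, ["zip"], "city")
```

Grouping by the left-hand side keeps the check to a single pass over the rows, and the same approach works for rows read from a SQL table or from a document store.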

7446 Coalescing Data Marts

Authors: N. Parimala, P. Pahwa

Abstract:

OLAP uses multidimensional structures to provide access to data for analysis. Traditionally, OLAP operations are more focused on retrieving data from a single data mart. An exception is the drill across operator. This, however, is restricted to retrieving facts on common dimensions of the multiple data marts. Our concern is to define further operations for retrieving data from multiple data marts. Towards this, we have defined six operations which coalesce data marts. While doing so we consider the common as well as the non-common dimensions of the data marts.

Keywords: Data warehouse, Dimension, OLAP, Star Schema.

Downloads: 1559

7445 Kazakhstani Koreans' Conflict of Linguistic Identity: In-between the Sovietized and Kazakhstani Citizens

Authors: Soon-ok Myong, Byong-soon Chun

Abstract:

This paper intends to identify the ethnic Kazakhstani Koreans' political process of identity formation by exploring their narrative and practice about the state language represented in the course of their becoming the new citizens of a new independent state. The Russophone Kazakhstani Koreans' inability to speak the official language of their affiliated state is considered there as dissatisfying the basic requirement of citizens of the independent state, so that they are becoming marginalized from the public sphere. Their contradictory attitude that at once demonstrates nominal reception and practical rejection of the obligatory state language unveils a high barrier inside between their self-language and other-language. In this paper, the ethnic Korean group's conflicting linguistic identity is not seen as a free and simple choice, but as a dynamic struggle and political process in which the subject's past experiences and memories intersect with the external elements of pressure.

Keywords: Ethnic Kazakhstani Koreans, Soviet Korean's Russification, Linguistic Identity, Russian-Kazakh Dichotomy.

Downloads: 1589

7444 Performance Evaluation of an Efficient Asynchronous Protocol for WDM Ring MANs

Authors: Peristera A. Baziana

Abstract:

The idea of asynchronous transmission in wavelength division multiplexing (WDM) ring MANs is studied in this paper. In particular, we present an efficient access technique to coordinate the collision-free transmission of variable-size IP traffic in WDM ring core networks. Each node is equipped with a tunable transmitter and a tunable receiver. In this way, all the wavelengths are exploited for both transmission and reception. In order to evaluate the performance measures of average throughput, queuing delay and packet dropping probability at the buffers, a simulation model that assumes symmetric access rights among the nodes is developed based on Poisson statistics. Extensive numerical results show that the proposed protocol achieves, apart from high bandwidth exploitation over a wide range of offered load, fairness of queuing delay and dropping events among the different packet size categories.

Keywords: Asynchronous transmission, collision avoidance, wavelength division multiplexing.

Downloads: 2093

7443 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Many types of research are concerned with mining massive amounts of data and big data streams. Mining big data faces a lot of challenges including scalability, speed, heterogeneity, accuracy, provenance and privacy. In the telecommunication industry, mining big data is like mining for gold; it represents a big opportunity for maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication and the benefits and opportunities gained from them.

Keywords: Mining Big Data, Big Data, Machine learning, Data Streams, Telecommunication.

Downloads: 2480

7442 Performance Analysis of MUSIC, Root-MUSIC and ESPRIT DOA Estimation Algorithm

Authors: N. P. Waweru, D. B. O. Konditi, P. K. Langat

Abstract:

Direction of Arrival estimation refers to defining a mathematical function called a pseudospectrum that gives an indication of the angle at which a signal impinges on the antenna array. This estimation is an efficient method of improving the quality of service in a communication system by focusing the reception and transmission only in the estimated direction, thereby increasing fidelity with a provision to suppress interferers. This improvement is largely dependent on the performance of the algorithm employed in the estimation. Many DOA algorithms exist, amongst which are MUSIC, Root-MUSIC and ESPRIT. In this paper, the performance of these three algorithms is analyzed in terms of complexity, accuracy as assessed and characterized by the CRLB, and memory requirements in various environments and array sizes. It is found that the three algorithms are high resolution and dependent on the operating environment and the array size.

Keywords: Direction of Arrival, Autocorrelation matrix, Eigenvalue decomposition, MUSIC, ESPRIT, CRLB.

Downloads: 8757

7441 Comparison of Inter Cell Interference Coordination Approaches

Authors: Selma Sbit, Mohamed Bechir Dadi, Belgacem Chibani Rhaimi

Abstract:

This work aims to compare various techniques used to mitigate Inter-Cell Interference (ICI) in Long Term Evolution (LTE) and LTE-Advanced systems. For that, we evaluate the performance of each one. In mobile communication networks, systems are limited by ICI, particularly that caused by the deployment of small cells within conventional cell deployments. Therefore, various mitigation techniques, named Inter-Cell Interference Coordination (ICIC) techniques, enhanced Inter-Cell Interference Coordination (eICIC) techniques and Coordinated Multi-Point transmission and reception (CoMP), are proposed. This paper presents a comparative study of these strategies. It can be concluded that CoMP techniques can improve SINR and system capacity compared to ICIC and eICIC. In fact, the SINR value reaches 15 dB at a distance of 0.5 km between the user equipment and the serving base station if we use CoMP technology, whereas it cannot exceed 12 dB and 9 dB for the eICIC and ICIC approaches respectively, as reflected in simulations.

Keywords: 4th generation, interference, coordination, ICIC.

Downloads: 1005
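To put the SINR figures quoted in entry 7441 above into perspective, the short sketch below converts them from decibels to linear ratios and then to a Shannon upper bound on spectral efficiency. The Shannon bound is an assumption used here for illustration only; it is not stated to be the capacity model used in the paper, and the function names are hypothetical.

```python
import math


def db_to_linear(value_db):
    """Convert a power ratio from decibels to a linear ratio."""
    return 10 ** (value_db / 10)


def shannon_spectral_efficiency(sinr_db):
    """Upper bound on spectral efficiency in bit/s/Hz: log2(1 + SINR)."""
    return math.log2(1 + db_to_linear(sinr_db))


# SINR values at 0.5 km as quoted in the abstract.
reported_sinr_db = {"CoMP": 15.0, "eICIC": 12.0, "ICIC": 9.0}

# Illustrative bound only: assumes an ideal link limited by SINR alone.
efficiency = {
    scheme: shannon_spectral_efficiency(sinr_db)
    for scheme, sinr_db in reported_sinr_db.items()
}
```

Because each 3 dB step roughly doubles the linear SINR, the bound for the three schemes differs by about one bit/s/Hz per step at these values.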

7440 A Distributed Topology Control Algorithm to Conserve Energy in Heterogeneous Wireless Mesh Networks

Authors: F. O. Aron, T. O. Olwal, A. Kurien, M. O. Odhiambo

Abstract:

A considerable amount of energy is consumed during transmission and reception of messages in a wireless mesh network (WMN). Reducing per-node transmission power would greatly increase the network lifetime via power conservation in addition to increasing the network capacity via better spatial bandwidth reuse. In this work, the problem of topology control in a hybrid WMN of heterogeneous wireless devices with varying maximum transmission ranges is considered. A localized distributed topology control algorithm is presented which calculates the optimal transmission power so that (1) network connectivity is maintained, (2) node transmission power is reduced to cover only the nearest neighbours, and (3) network lifetime is extended. Simulations and analysis of results are carried out in the NS-2 environment to demonstrate the correctness and effectiveness of the proposed algorithm.

Keywords: Topology Control, Wireless Mesh Networks, Backbone, Energy Efficiency, Localized Algorithm.

Downloads: 1394

7439 Improved Rake Receiver Based On the Signal Sign Separation in Maximal Ratio Combining Technique for Ultra-Wideband Wireless Communication Systems

Authors: Rashid A. Fayadh, F. Malek, Hilal A. Fadhil, Norshafinash Saudin

Abstract:

When receiving high data rates in ultra-wideband (UWB) technology for many users, multiple-user interference and inter-symbol interference are obstacles in the multi-path reception technique. Rake receivers were designed to collect many resolvable paths, even more than a hundred. Rake receiver implementation structures have been proposed that increase complexity in order to obtain better performance in indoor or outdoor multi-path receivers by reducing the bit error rate (BER). Several rake structures were therefore proposed in the past to reduce the number of resolvable paths that must be combined and estimated. To this aim, we suggest two improved rake receivers based on signal sign separation in the maximal ratio combiner (MRC), called positive-negative MRC selective rake (P-N/MRC-S-rake) and positive-negative MRC partial rake (P-N/MRC-P-rake) receivers. These receivers were introduced to reduce complexity by using fewer fingers and to improve performance with a low BER. Before the decision circuit, a comparator compares the positive quantity with the negative quantity to decide whether the transmitted bit is 1 or 0. The BER was obtained by MATLAB simulation in multi-path environments for impulse radio time-hopping binary phase shift keying (TH-BPSK) modulation, and the results were compared with those of conventional rake receivers.

Keywords: Selective and partial rake receivers, positive and negative signal separation, maximal ratio combiner, bit error rate performance.

Downloads: 1901
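The comparator step mentioned in entry 7439 above can be sketched as follows. This is one illustrative reading of the abstract, not the authors' receiver. It assumes real-valued path gains that are known at the receiver, a mapping of bit 1 to +1 and bit 0 to -1, and made-up gain and noise values; all function names are hypothetical.

```python
def mrc_finger_outputs(samples, gains):
    """Weight each resolved-path sample by its estimated path gain (real-valued MRC)."""
    return [gain * sample for gain, sample in zip(gains, samples)]


def decide_bit(finger_outputs):
    """Accumulate positive and negative finger outputs separately, then compare them."""
    positive = sum(value for value in finger_outputs if value > 0)
    negative = sum(-value for value in finger_outputs if value < 0)
    return 1 if positive > negative else 0


def receive(tx_symbol, gains, noise):
    """Pass one BPSK symbol (+1 or -1) through the paths and recover the bit."""
    samples = [gain * tx_symbol + n for gain, n in zip(gains, noise)]
    return decide_bit(mrc_finger_outputs(samples, gains))


# Illustrative three-finger channel; the values are assumptions.
gains = [0.9, -0.5, 0.3]
noise = [0.1, 0.2, -0.4]
bit_for_plus = receive(+1, gains, noise)
bit_for_minus = receive(-1, gains, noise)
```

In this simplified form the comparison gives the same decision as taking the sign of the combined sum; the sketch only makes the separation into positive and negative quantities explicit.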

7438 Comparative Analysis of Diverse Collection of Big Data Analytics Tools

Authors: S. Vidhya, S. Sarumathi, N. Shanthi

Abstract:

Over the past era, many efforts and studies have been carried out to develop proficient tools for performing various tasks in big data. Recently, big data has received a lot of publicity, for good reasons. Because the collections of datasets are large and complex, they are difficult to process with traditional data processing applications. This concern makes it all the more necessary to produce various big data tools. Moreover, the main aim of big data analytics is to apply advanced analytic techniques to very large, diverse datasets, ranging in size from terabytes to zettabytes and in type from structured to unstructured and from batch to streaming. Big data is useful for data sets whose size or type is beyond the capability of traditional relational databases to capture, manage, and process the data with low latency. These challenges have led to the emergence of powerful big data tools. In this survey, a varied collection of big data tools is illustrated and compared by their salient features.

Keywords: Big data, Big data analytics, Business analytics, Data analysis, Data visualization, Data discovery.

Downloads: 3775

7437 Process Oriented Architecture for Emergency Scenarios in the Czech Republic

Authors: Tomáš Ludík, Josef Navrátil, Alena Langerová

Abstract:

Tackling emergency situations is performed based on emergency scenarios. These scenarios do not have a uniform form in the Czech Republic. They are unstructured and developed primarily in the text form. This does not allow solving emergency situations efficiently. For this reason, the paper aims at defining a Process Oriented Architecture to support and thus to improve tackling emergency situations in the Czech Republic. The innovative Process Oriented Architecture is based on the Workflow Reference Model while taking into account the options of Business Process Management Suites for the implementation of process oriented emergency scenarios. To verify the proposed architecture the Proof of Concept has been used which covers the reception of an emergency event at the district emergency operations centre. Within the particular implementation of the proposed architecture the Bonita Open Solution has been used. The architecture created in this way is suitable not only for emergency management, but also for educational purposes.

Keywords: Business Process Management Suite, Czech Republic, Emergency Scenarios, Process Execution, Process Oriented Architecture.

Downloads: 1826

7436 Multi-labeled Data Expressed by a Set of Labels

Authors: Tetsuya Furukawa, Masahiro Kuzunishi

Abstract:

Collected data must be organized to be utilized efficiently, and hierarchical classification of data is an efficient approach to organizing data. When data is classified into multiple categories or annotated with a set of labels, users request multi-labeled data by giving a set of labels. There are several interpretations of the data expressed by a set of labels. This paper discusses which data is expressed by a set of labels by introducing orders for sets of labels and shows that there are four types of orders, which are characterized by whether the labels of the expressed data include every label of the given set of labels within the range of the set. Desirable properties of the orders, namely that data is also expressed by the higher set of labels and that different sets of labels express different data, are discussed.

Keywords: Classification Hierarchies, Multi-labeled Data, Multiple Classification, Orders of Sets of Labels

Downloads: 1304

7435 A Basic Study on Ubiquitous Overloaded Vehicles Regulation System

Authors: Byung-Wan Jo, Kwang-Won Yoon, Ji-Sun Choi

Abstract:

A load management method for roads has become necessary because overloaded vehicles damage road facilities, and existing systems for preventing this damage still show many problems. Accordingly, an efficient management system for preventing overloaded vehicles could be organized by using the road itself as a scale, applying a genetic algorithm to analyze the load and the driving information of vehicles. Therefore, this paper organized a ubiquitous sensor network system for the development of an intelligent overloaded vehicle regulation system. In this study, to use the behavior of the road, the deformation was measured by installing an underground box-type indoor model, and an indoor experiment was held using a genetic algorithm. We also examined the feasibility of a wireless overloaded vehicle regulation system through an experiment on the transmission and reception distance. If this system is applied to roads and bridges, it might be effective for economy and convenience through the establishment of a U-IT system.

Keywords: Overload Vehicle, Genetic Algorithm, Embedded System, WIM Sensor, overload vehicle regulation

Downloads: 1566

7434 The Comparison of Data Replication in Distributed Systems

Authors: Iman Zangeneh, Mostafa Moradi, Ali Mokhtarbaf

Abstract:

The necessity of the ever-increasing use of distributed data in computer networks is obvious to all. One technique that is applied to distributed data to increase efficiency and reliability is data replication. In this paper, after introducing this technique and its advantages, we will examine some dynamic data replication strategies. We will examine their characteristics for some overus scenario and then we will propose some suggestions for their improvement.

Keywords: data replication, data hiding, consistency, dynamic data replication strategy

Downloads: 1635

7433 Implementation of an IoT Sensor Data Collection and Analysis Library

Authors: Jihyun Song, Kyeongjoo Kim, Minsoo Lee

Abstract:

Due to the development of information technology and wireless Internet technology, various data are being generated in various fields. These data are advantageous in that they provide real-time information to the users themselves. However, when the data are accumulated and analyzed, more varied information can be extracted. In addition, the development and dissemination of boards such as Arduino and Raspberry Pi have made it possible to easily test various sensors, and it is possible to collect sensor data directly by using database application tools such as MySQL. These directly collected data can be used for various research and can be useful as data for data mining. However, there are many difficulties in using the board to collect data, especially when the user is not a computer programmer or is using it for the first time. Even if data are collected, lack of expert knowledge or experience may cause difficulties in data analysis and visualization. In this paper, we aim to construct a library for sensor data collection and analysis to overcome these problems.

Keywords: Clustering, data mining, DBSCAN, k-means, k-medoids, sensor data.

Downloads: 2010

7432 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, veracity, and come from a variety of sources. Public Administrations are using (big) data, implementing base registries, and enforcing data sharing within the entire government to deliver (big) data related integrated services, provision of insights to users, and for good governance. Government (Big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services like data storage and hosting services to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of the government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationships in the government (big) data ecosystem. We also discuss our research findings. We did not find many published research articles about the government (big) data ecosystem, including its definition and classification of actors and their roles. Therefore, we borrowed ideas for the government (big) data ecosystem from numerous areas in the literature, including scientific research data, humanitarian data, open government data, and industry data.

Keywords: Big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review.

Downloads: 2143

7431 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analyzing DNA microarray data sets is a great challenge for bioinformaticians because of the complexity of using statistical and machine learning techniques. The challenge is doubled if the microarray data sets contain missing data, which happens regularly, because these techniques cannot deal with missing data. One of the most important data analysis processes on a microarray data set is feature selection. This process finds the most important genes that affect a certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.

Keywords: DNA microarray, feature selection, missing data, bioinformatics.

Downloads: 2791

7430 Automatic Real-Patient Medical Data De-Identification for Research Purposes

Authors: Petr Vcelak, Jana Kleckova

Abstract:

Our medicine-oriented research is based on a medical data set of real patients. It is a security problem to share private patient data with people other than clinicians or hospital staff. We have to remove person identification information from the medical data. After a de-identification process, the medical data without private data are available for any research purposes. In this paper, we introduce a universal automatic rule-based de-identification application that does this on heterogeneous medical data. A patient's private identification is replaced by a unique identification number, even in burned-in annotations in pixel data. The identical identification is used for all of a patient's medical data, so it keeps relationships in the data. The hospital can take advantage of research feedback based on the results.

Keywords: DASTA, De-identification, DICOM, Health Level Seven, Medical data, OCR, Personal data

Downloads: 1642
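The consistent replacement of patient identifiers described in entry 7430 above can be sketched as a small mapping. This is a minimal illustration under stated assumptions: the class, the field names (`patient_id`, `name`, `birth_date`), and the pseudonym format are all hypothetical, and the sketch does not cover the rule engine or the burned-in pixel annotations mentioned in the abstract.

```python
import itertools


class Pseudonymizer:
    """Replace patient identifiers with stable, non-identifying numbers."""

    def __init__(self):
        self._ids = {}
        self._counter = itertools.count(1)

    def pseudonym(self, patient_key):
        # The same patient always maps to the same pseudonym, so records
        # from different sources stay linked after de-identification.
        if patient_key not in self._ids:
            self._ids[patient_key] = f"P{next(self._counter):06d}"
        return self._ids[patient_key]

    def deidentify(self, record, id_field="patient_id",
                   private_fields=("name", "birth_date")):
        # Drop the private fields and swap the identifier for its pseudonym.
        clean = {k: v for k, v in record.items() if k not in private_fields}
        clean[id_field] = self.pseudonym(record[id_field])
        return clean


# Illustrative records for one patient from two hypothetical sources.
pseudonymizer = Pseudonymizer()
scan = pseudonymizer.deidentify(
    {"patient_id": "750101/1234", "name": "Jan Novak", "modality": "MR"}
)
report = pseudonymizer.deidentify(
    {"patient_id": "750101/1234", "birth_date": "1975-01-01", "finding": "normal"}
)
```

Keeping the mapping in one object is what preserves the relationship between the two records; in practice the mapping itself would need to be stored securely or discarded.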

7429 Analyzing Multi-Labeled Data Based on the Roll of a Concept against a Semantic Range

Authors: Masahiro Kuzunishi, Tetsuya Furukawa, Ke Lu

Abstract:

Classifying data hierarchically is an efficient approach to analyzing data. Data is usually classified into multiple categories, or annotated with a set of labels. To analyze multi-labeled data, such data must be specified by giving a set of labels as a semantic range. Data is analyzed for certain purposes. This paper shows which multi-labeled data should be the target of analysis for those purposes, and discusses the role of a label against a set of labels by investigating the change when a label is added to the set of labels. These discussions give methods for the advanced analysis of multi-labeled data, which are based on the role of a label against a semantic range.

Keywords: Classification Hierarchies, Data Analysis, Multilabeled Data, Orders of Sets of Labels

Downloads: 1208

7428 MIMO-OFDM Channel Tracking Using a Dynamic ANN Topology

Authors: Manasjyoti Bhuyan, Kandarpa Kumar Sarma

Abstract:

All the available algorithms for blind estimation, namely the constant modulus algorithm (CMA) and the decision-directed algorithm (DDA/DFE), suffer from the problem of convergence to local minima. Also, if the channel drifts considerably, any DDA loses track of the channel. So, their usage is limited in varying channel conditions. The primary limitation in such cases is the requirement of certain overhead bits in the transmit framework, which leads to wasteful use of the bandwidth. Also, such arrangements fail to use channel state information (CSI), which is an important aid in improving the quality of reception. In this work, the main objective is to reduce the overhead imposed by the pilot symbols, which in effect reduces the system throughput. We also formulate an arrangement based on certain dynamic Artificial Neural Network (ANN) topologies which not only contributes towards lowering the overhead but also facilitates the use of the CSI. A 2×2 Multiple Input Multiple Output (MIMO) system is simulated and the performance variation with different channel estimation schemes is evaluated. A new semi-blind approach based on dynamic ANN is proposed for channel tracking in varying channel conditions, and the performance is compared with perfectly known CSI and least square (LS) based estimation.

Keywords: MIMO, Artificial Neural Network (ANN), CMA, LS, CSI.

Downloads: 2371

7427 Array Signal Processing: DOA Estimation for Missing Sensors

Authors: Lalita Gupta, R. P. Singh

Abstract:

Array signal processing involves signal enumeration and source localization. Array signal processing is centered on the ability to fuse temporal and spatial information captured via sampling signals emitted from a number of sources at the sensors of an array in order to carry out a specific estimation task: source characteristics (mainly localization of the sources) and/or array characteristics (mainly array geometry) estimation. Array signal processing is a part of signal processing that uses sensors organized in patterns or arrays to detect signals and to determine information about them. Beamforming is a general signal processing technique used to control the directionality of the reception or transmission of a signal. Using beamforming, we can direct the majority of the signal energy we receive from a group of array elements. Multiple signal classification (MUSIC) is a highly popular eigenstructure-based estimation method of direction of arrival (DOA) with high resolution. This paper enumerates the effect of missing sensors in DOA estimation. The accuracy of the MUSIC-based DOA estimation is degraded significantly both by the effects of the missing sensors among the receiving array elements and by the unequal channel gain and phase errors of the receiver.

Keywords: Array Signal Processing, Beamforming, ULA, Direction of Arrival, MUSIC

Downloads: 3020
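The directionality control that entry 7427 above attributes to beamforming can be illustrated with a conventional delay-and-sum beamformer on a uniform linear array (ULA). Note that this is plain beamforming, not the MUSIC estimator the paper studies. The sketch assumes a noise-free single source, half-wavelength sensor spacing, and eight sensors; the function names and all parameter values are assumptions.

```python
import cmath
import math


def steering_vector(angle_deg, n_sensors, spacing_wavelengths=0.5):
    """Phase progression across a ULA for a plane wave arriving from angle_deg."""
    phase_step = (2 * math.pi * spacing_wavelengths
                  * math.sin(math.radians(angle_deg)))
    return [cmath.exp(1j * phase_step * k) for k in range(n_sensors)]


def beam_power(look_deg, source_deg, n_sensors=8):
    """Normalized output power when the array is steered to look_deg."""
    weights = steering_vector(look_deg, n_sensors)
    signal = steering_vector(source_deg, n_sensors)
    # Delay-and-sum: undo the assumed phase at each sensor, then average.
    output = sum(w.conjugate() * x for w, x in zip(weights, signal)) / n_sensors
    return abs(output) ** 2


# Scan look directions in 5-degree steps for a source assumed at 20 degrees.
scan = {angle: beam_power(angle, source_deg=20) for angle in range(-90, 91, 5)}
best_angle = max(scan, key=scan.get)
```

The scan peaks at the look direction that matches the source, which is the sense in which the array "focuses" reception. Removing entries from the weight and signal lists would be one way to experiment with missing sensors.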

7426 Steganalysis of Data Hiding via Halftoning and Coordinate Projection

Authors: Woong Hee Kim, Ilhwan Park

Abstract:

Steganography is the art of hiding and transmitting data through apparently innocuous carriers in an effort to conceal the existence of the data. A lot of steganography algorithms have been proposed recently. Many of them use digital image data as a carrier. In the data hiding scheme of halftoning and coordinate projection, still image data is used as a carrier, and the data of the carrier image are modified for data embedding. In this paper, we present three features for the analysis of data hiding via halftoning and coordinate projection. We also present a classifier using the proposed three features.

Keywords: Steganography, steganalysis, digital halftoning, data hiding.

Downloads: 1600

7425 Biological Data Integration using SOA

Authors: Noura Meshaan Al-Otaibi, Amin Yousef Noaman

Abstract:

Nowadays scientific data is inevitably digital and stored in a wide variety of formats in heterogeneous systems. Scientists need to access an integrated view of remote or local heterogeneous data sources with advanced data accessing, analyzing, and visualization tools. This research suggests the use of Service Oriented Architecture (SOA) to integrate biological data from different data sources. This work shows that SOA can solve the problems facing the integration process and examines whether biologists can access biological data more easily. There are several methods to implement SOA, but web services are the most popular. The Microsoft .NET Framework is used to implement the proposed architecture.

Keywords: Bioinformatics, Biological data, Data Integration, SOA and Web Services.

Downloads: 2473

7424 STATISTICA Software: A State of the Art Review

Authors: S. Sarumathi, N. Shanthi, S. Vidhya, P. Ranjetha

Abstract:

Data mining is rapidly growing in esteem and popularity. The foremost aim of data mining is to extract data from a huge data set into forms that can be comprehended for further use. Data mining is a technology with rich potential that can support industries and businesses that collect the necessary information from their data to discover their customers' performance. For extracting data, several methods are available, such as classification, clustering, association, discovering, and visualization, each with its own diverse algorithms for fitting an appropriate model to the data. STATISTICA mostly deals with very large groups of data, which imposes vast and rigorous computational constraints. These challenges led to the emergence of powerful STATISTICA Data Mining technologies. In this survey, an overview of the STATISTICA software is given along with its significant features.

Keywords: Data Mining, STATISTICA Data Miner, Text Miner, Enterprise Server, Classification, Association, Clustering, Regression.

Downloads: 2607

7423 Proposal of Data Collection from Probes

Authors: M. Kebisek, L. Spendla, M. Kopcek, T. Skulavik

Abstract:

In our paper we describe the security capabilities of data collection. Data are collected with probes located in the near and distant surroundings of the company. Considering the numerous obstacles, e.g. forests, hills, and urban areas, the data collection is realized in several ways. The collection of data uses connection via wireless communication, LAN network, GSM network, and in certain areas data are collected by using vehicles. In order to ensure the connection to the server, most of the probes have the ability to communicate in several ways. Collected data are archived and subsequently used in supervisory applications. To ensure the collection of the required data, it is necessary to propose algorithms that will allow the probes to select a suitable communication channel.

Keywords: Communication, computer network, data collection, probe.

Downloads: 1782

7422 Linguistic Summarization of Structured Patent Data

Authors: E. Y. Igde, S. Aydogan, F. E. Boran, D. Akay

Abstract:

Patent data have an increasingly important role in economic growth, innovation, technical advantages and business strategies, and even in competition between countries. Analyzing patent data is crucial since patents cover a large part of all technological information of the world. In this paper, we have used the linguistic summarization technique to prove the validity of the hypotheses related to patent data stated in the literature.

Keywords: Data mining, fuzzy sets, linguistic summarization, patent data.

Downloads: 1217

7421 Metadata Update Mechanism Improvements in Data Grid

Authors: S. Farokhzad, M. Reza Salehnamadi

Abstract:

Grid environments aggregate geographically distributed resources. Grids come in three types: computational, data, and storage. This paper presents research on the data grid. A data grid is used for covering and securing access to data from among many heterogeneous sources. Users need not worry about where the data is located, provided that they can access it. Metadata is used for accessing data in a data grid. Presently, the application metadata catalogue and the SRB middleware package are used in data grids for the management of metadata. In this paper, updating, streamlining, and searching are made possible simultaneously and rapidly through a classified table for preserving metadata and the conversion of each table into numerous tables. Meanwhile, with regard to the specific application, the most appropriate division is determined. Concurrent execution of some requests and pipelined execution are made possible by this technique.

Keywords: Grids, data grid, metadata, update.

Downloads: 1699

7420 Optimized Approach for Secure Data Sharing in Distributed Database

Authors: Ahmed Mateen, Zhu Qingsheng, Ahmad Bilal

Abstract:

In the current age of technology, information is the most precious asset of a company. Today, companies have a large amount of data. As the data become larger, access to data for some particular information is becoming slower day by day. Faster data processing to shape it into information is the biggest issue. The major problems in distributed databases are the efficiency of data distribution and the response time of data distribution. The security of data distribution is also a big issue. For these problems, we propose a strategy that can maximize the efficiency of data distribution and also improve its response time. This technique gives better results for secure data distribution from multiple heterogeneous sources. The newly proposed technique helps companies share data securely, efficiently, and quickly.

Keywords: ER-schema, electronic record, P2P framework, API, query formulation.

Downloads: 1067

7419 Using Data Clustering in Oral Medicine

Authors: Fahad Shahbaz Khan, Rao Muhammad Anwer, Olof Torgersson

Abstract:

The vast amount of information hidden in huge databases has created tremendous interest in the field of data mining. This paper examines the possibility of using data clustering techniques in oral medicine to identify functional relationships between different attributes and classification of similar patient examinations. Commonly used data clustering algorithms have been reviewed and as a result several interesting results have been gathered.

Keywords: Oral Medicine, Cluto, Data Clustering, Data Mining.

Downloads: 1977

7418 The Application of Data Mining Technology in Building Energy Consumption Data Analysis

Authors: Liang Zhao, Jili Zhang, Chongquan Zhong

Abstract:

Energy consumption data, in particular those involving public buildings, are impacted by many factors: the building structure, climate/environmental parameters, construction, system operating condition, and user behavior patterns. Traditional methods for data analysis are insufficient. This paper delves into data mining technology to determine its application in the analysis of building energy consumption data, including energy consumption prediction, fault diagnosis, and optimal operation. Recent literature is reviewed and summarized, the problems faced by data mining technology in the area of energy consumption data analysis are enumerated, and research points for future studies are given.

Keywords: Data mining, data analysis, prediction, optimization, building operational performance.

Downloads: 3709