Search results for: link data

7471 Establishing Pairwise Keys Using Key Predistribution Schemes for Sensor Networks

Authors: Y. Harold Robinson, M. Rajaram

Abstract:

Designing cost-efficient, secure network protocols for Wireless Sensor Networks (WSNs) is a challenging problem because sensors are resource-limited wireless devices. Security services such as authentication and improved pairwise key establishment are critical to highly efficient networks of sensor nodes. For sensor nodes to communicate with each other securely and efficiently, cryptographic techniques are necessary. This paper proposes two key predistribution schemes that enable a mobile sink to establish a secure data-communication link, on the fly, with any sensor node. The intermediate nodes along the path to the sink are able to verify the authenticity and integrity of the incoming packets using a predicted value of the key generated by the sender’s essential power. The proposed schemes are based on the pairwise key with the mobile sink. Our analytical results clearly show that our schemes perform better in terms of network resilience to node capture than existing schemes if used in wireless sensor networks with mobile sinks.

Keywords: Wireless Sensor Networks, predistribution scheme, cryptographic techniques.

Downloads 1555

7470 Using Data Clustering in Oral Medicine

Authors: Fahad Shahbaz Khan, Rao Muhammad Anwer, Olof Torgersson

Abstract:

The vast amount of information hidden in huge databases has created tremendous interest in the field of data mining. This paper examines the possibility of using data clustering techniques in oral medicine to identify functional relationships between different attributes and to classify similar patient examinations. Commonly used data clustering algorithms have been reviewed, and as a result several interesting results have been gathered.

Keywords: Oral Medicine, Cluto, Data Clustering, Data Mining.

Downloads 1925

7469 The Application of Data Mining Technology in Building Energy Consumption Data Analysis

Authors: Liang Zhao, Jili Zhang, Chongquan Zhong

Abstract:

Energy consumption data, in particular those involving public buildings, are impacted by many factors: the building structure, climate/environmental parameters, construction, system operating condition, and user behavior patterns. Traditional methods for data analysis are insufficient. This paper examines data mining technology to determine its application in the analysis of building energy consumption data, including energy consumption prediction, fault diagnosis, and optimal operation. Recent literature is reviewed and summarized, the problems faced by data mining technology in the area of energy consumption data analysis are enumerated, and research points for future studies are given.

Keywords: Data mining, data analysis, prediction, optimization, building operational performance.

Downloads 3645

7468 Query Algebra for Semistructured Data

Authors: Ei Ei Myat, Ni Lar Thein

Abstract:

With the tremendous growth of World Wide Web (WWW) data, there is an emerging need for effective information retrieval at the document level. Several query languages such as XML-QL, XPath, XQL, Quilt and XQuery have been proposed in recent years to provide a faster way of querying XML data, but they still lack generality and efficiency. Our approach towards evolving a framework for querying semistructured documents is based on formal query algebra. Two elements are introduced in the proposed framework: first, a generic and flexible data model for logical representation of semistructured data and second, a set of operators for the manipulation of objects defined in the data model. In addition to accommodating several peculiarities of semistructured data, our model offers novel features such as bidirectional paths for navigational querying and partitions for data transformation that are not available in other proposals.

Keywords: Algebra, Semistructured data, Query Algebra.

Downloads 1331

7467 Simulation Data Summarization Based on Spatial Histograms

Authors: Jing Zhao, Yoshiharu Ishikawa, Chuan Xiao, Kento Sugiura

Abstract:

In order to analyze large-scale scientific data, research on data exploration and visualization has gained popularity. In this paper, we focus on the exploration and visualization of scientific simulation data, and define a spatial V-Optimal histogram for data summarization. We propose histogram construction algorithms based on a general binary hierarchical partitioning as well as a more specific one, the l-grid partitioning. For effective data summarization and efficient data visualization in scientific data analysis, we propose an optimal algorithm as well as a heuristic algorithm for histogram construction. To verify the effectiveness and efficiency of the proposed methods, we conduct experiments on the massive evacuation simulation data.
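As an illustration of the V-Optimal idea mentioned in the abstract above, the following is a minimal sketch of the classic one-dimensional V-Optimal histogram computed by dynamic programming. It is not the authors' spatial algorithm (which partitions space hierarchically or with an l-grid); the function name `v_optimal_histogram` and the 1-D setting are assumptions made for illustration only.

```python
def v_optimal_histogram(values, num_buckets):
    """Split `values` (kept in order) into `num_buckets` contiguous buckets
    minimising the total within-bucket sum of squared errors (SSE).

    Hypothetical 1-D illustration; returns (total_sse, [(start, end), ...])
    where each bucket covers values[start:end].
    """
    n = len(values)
    if not 1 <= num_buckets <= n:
        raise ValueError("num_buckets must be between 1 and len(values)")

    # Prefix sums let us compute the SSE of any slice in O(1).
    prefix = [0.0] * (n + 1)
    prefix_sq = [0.0] * (n + 1)
    for i, v in enumerate(values):
        prefix[i + 1] = prefix[i] + v
        prefix_sq[i + 1] = prefix_sq[i] + v * v

    def sse(i, j):
        """SSE of values[i:j] around its own mean."""
        s = prefix[j] - prefix[i]
        return (prefix_sq[j] - prefix_sq[i]) - s * s / (j - i)

    inf = float("inf")
    # cost[b][j]: best SSE for the first j values using exactly b buckets.
    cost = [[inf] * (n + 1) for _ in range(num_buckets + 1)]
    split = [[0] * (n + 1) for _ in range(num_buckets + 1)]
    cost[0][0] = 0.0
    for b in range(1, num_buckets + 1):
        for j in range(b, n + 1):
            for i in range(b - 1, j):
                candidate = cost[b - 1][i] + sse(i, j)
                if candidate < cost[b][j]:
                    cost[b][j] = candidate
                    split[b][j] = i

    # Walk the split table backwards to recover bucket boundaries.
    buckets = []
    j = n
    for b in range(num_buckets, 0, -1):
        i = split[b][j]
        buckets.append((i, j))
        j = i
    buckets.reverse()
    return cost[num_buckets][n], buckets
```

This exact version runs in O(B·n²) time, which is one reason a heuristic alternative, as the abstract mentions, can be attractive for massive simulation data.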

Keywords: Simulation data, data summarization, spatial histograms, exploration and visualization.

Downloads 693

7466 Joint Use of Factor Analysis (FA) and Data Envelopment Analysis (DEA) for Ranking of Data Envelopment Analysis

Authors: Reza Nadimi, Fariborz Jolai

Abstract:

This article combines two techniques, data envelopment analysis (DEA) and factor analysis (FA), for data reduction in decision making units (DMUs). DEA, a popular linear programming technique, is useful for comparatively rating the operational efficiency of DMUs based on their deterministic (not necessarily stochastic) input–output data. Factor analysis techniques have been proposed as data reduction and classification techniques, and can be applied within DEA to reduce input–output data. Numerical results reveal that the new approach shows good consistency in ranking with DEA.

Keywords: Effectiveness, Decision Making, Data Envelopment Analysis, Factor Analysis

Downloads 2378

7465 On the Invariant Uniform Roe Algebra as Crossed Product

Authors: Kankeyanathan Kannan

Abstract:

The uniform Roe C*-algebra (also called the uniform translation C*-algebra) provides a link between coarse geometry and C*-algebra theory. The uniform Roe algebra has great importance in geometry, topology and analysis. We consider some of the elementary concepts associated with coarse spaces.

Keywords: Invariant Approximation Property, Uniform Roe algebras.

Downloads 1685

7464 The Effects of Multipath on OFDM Systems for Broadband Power-Line Communications a Case of Medium Voltage Channel

Authors: Justinian Anatory, N. Theethayi, R. Thottappillil, C. Mwase, N.H. Mvungi

Abstract:

Power-line networks are widely used today for broadband data transmission. However, multipath propagation within broadband power line communication (BPLC) systems, caused by stochastic changes in the network load impedances, branches, etc., affects network or channel capacity performance. This paper investigates the performance of typical medium voltage channels that use Orthogonal Frequency Division Multiplexing (OFDM) techniques with Quadrature Amplitude Modulation (QAM) subcarriers. It has been observed that when the load impedances differ from the line characteristic impedance, channel performance decreases. Also, as the number of branches in the link between the transmitter and receiver increases, a loss of 4 dB/branch is found in the signal to noise ratio (SNR). The information presented in the paper could be useful for an appropriate design of BPLC systems.
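The per-branch SNR loss reported in the abstract above can be turned into simple arithmetic. The sketch below assumes the 4 dB/branch figure accumulates linearly with the number of branches, which the abstract suggests but does not state as a general law; the function names and the 30 dB starting value in the usage note are hypothetical.

```python
def snr_after_branches(snr_db_without_branches, num_branches, loss_db_per_branch=4.0):
    """Estimate link SNR in dB after `num_branches` branches.

    Assumes each branch costs a fixed loss in dB (4 dB per the abstract)
    and that the losses simply add up.
    """
    if num_branches < 0:
        raise ValueError("num_branches must be non-negative")
    return snr_db_without_branches - loss_db_per_branch * num_branches


def db_to_linear(value_db):
    """Convert a power ratio from decibels to a linear ratio."""
    return 10 ** (value_db / 10)
```

For example, a hypothetical link with 30 dB SNR and three branches would be estimated at `snr_after_branches(30.0, 3)`, that is 18 dB, under this linear assumption.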

Keywords: Communication channel model, Power-line communication, Transfer function, Multipath, Branched network, OFDM, QAM, performance evaluation

Downloads 1806

7463 Social Business Process Management and Business Process Management Maturity

Authors: Dalia Suša Vugec, Vesna Bosilj Vukšić, Ljubica Milanović Glavan

Abstract:

Business process management (BPM) is a well-known holistic discipline focused on managing business processes with the intention of achieving a higher level of BPM maturity and better organizational performance. Recently, traditional BPM has run into some of its limitations, such as the model-reality divide and lost innovation. Following the latest trends, and in an attempt to overcome the issues of traditional BPM, the principles of social software have been applied to managing business processes, which led to the development of social BPM. However, few authors or studies deal with this topic, so this study aims to help fill that literature gap and to examine the link between the level of BPM maturity and the usage of social BPM. To meet these objectives, a survey of companies with more than 50 employees was conducted. The results reveal that the usage of social BPM is higher in companies that have achieved a higher level of BPM maturity. This paper provides an overview, analysis and discussion of the collected data regarding BPM maturity and social BPM within the observed companies and identifies the main social BPM principles.

Keywords: Business process management, BPM maturity, process performance index, social BPM.

Downloads 1583

7462 QoS Routing in Wired Sensor Networks with Partial Updates

Authors: Arijit Ghos, Tony Gigargis

Abstract:

QoS routing is an important component of Traffic Engineering in networks that provide QoS guarantees. QoS routing is dependent on the link state information which is typically flooded across the network. This affects both the quality of the routing and the utilization of the network resources. In this paper, we examine establishing QoS routes with partial state updates in wired sensor networks.

Keywords:

Downloads 1157

7461 High Glucose Increases Acetylcholine-Induced Ca2+ Entry and Protein Expression of STIM1

Authors: Hong Ding, Fatiha Benslimane, Isra Marei, Chris R. Triggle

Abstract:

Hyperglycaemia is a key factor that contributes to the development of diabetes-related microvascular disease and a major risk factor for endothelial dysfunction. In the current study, we explored glucose-induced abnormal intracellular calcium ([Ca2+]i) homeostasis in mouse microvessel endothelial cells (MMECs) in high glucose (HG) (40 mmol/L) versus control (low glucose, LG) (11 mmol/L). We demonstrated that exposure of MMECs to HG for 3 days did not change basal [Ca2+]i; however, there was a significant increase in acetylcholine-induced Ca2+ entry. Western blots illustrated that exposure to HG also increased STIM1 (Stromal Interaction Molecule 1), but not Orai1 (the pore forming subunit), protein expression levels. Although the link between HG-induced changes in STIM1 expression, enhanced Ca2+ entry and endothelial dysfunction requires further study, the current data suggest that targeting these pathways may reduce the impact of HG on endothelial function.

Keywords: store-operated calcium entry, hyperglycaemia, STIM1, endothelial dysfunction

Downloads 1669

7460 Observations about the Principal Components Analysis and Data Clustering Techniques in the Study of Medical Data

Authors: Cristina G. Dascâlu, Corina Dima Cozma, Elena Carmen Cotrutz

Abstract:

The statistical analysis of medical data often requires special techniques because of the particularities of these data. Principal components analysis and data clustering are two statistical methods for data mining that are very useful in the medical field, the first as a method to decrease the number of studied parameters, and the second as a method to analyze the connections between diagnosis and the data about the patient's condition. In this paper we investigate the implications of a specific data analysis technique: data clustering preceded by a selection of the most relevant parameters, made using principal components analysis. Our assumption was that using principal components analysis before data clustering, in order to select and classify only the most relevant parameters, would improve the accuracy of clustering, but the practical results showed the opposite: the clustering accuracy decreases by a percentage approximately equal to the percentage of information loss reported by the principal components analysis.

Keywords: Data clustering, medical data, principal components analysis.

Downloads 1451

7459 CNet Module Design of IMCS

Authors: Youkyung Park, SeungYup Kang, SungHo Kim, SimKyun Yook

Abstract:

IMCS is an Integrated Monitoring and Control System for thermal power plants. The system consists of two main parts: controllers and the OIS (Operator Interface System). These two parts are connected by Ethernet-based communication. The controller side of the communication is managed by the CNet module, and the OIS side is managed by the data server of the OIS. The CNet module sends controller data to the data server and receives command data from the data server. To minimize or balance the load on the data server, this module buffers the data created by the controller at every cycle and sends the buffered data to the data server on request. For multiple data servers, this module manages the connection line with each data server and responds to each request from the multiple data servers. A CNet module is included in each controller of a redundant system. When a controller fail-over happens in a redundant system, this module can provide controller data to the data server without loss. This paper presents three main features of the CNet module that carry out these functions: separation of the get task, usage of a ring buffer, and monitoring of communication status.
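The buffering behaviour described in the abstract above (store controller data every cycle, hand it over when the data server asks) can be sketched with a fixed-size ring buffer. This is a minimal illustration, not the CNet implementation: the class name `RingBuffer`, the `push`/`drain` methods, and the overwrite-oldest policy when the buffer is full are all assumptions.

```python
class RingBuffer:
    """Fixed-capacity ring buffer for per-cycle controller samples.

    Hypothetical sketch: when full, the oldest sample is overwritten
    (an assumed policy; the abstract does not specify one).
    """

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._slots = [None] * capacity
        self._head = 0   # index of the oldest buffered sample
        self._count = 0  # number of buffered samples

    def push(self, sample):
        """Called once per controller cycle to buffer a new sample."""
        capacity = len(self._slots)
        tail = (self._head + self._count) % capacity
        self._slots[tail] = sample
        if self._count == capacity:
            # Buffer was full: the oldest sample was just overwritten.
            self._head = (self._head + 1) % capacity
        else:
            self._count += 1

    def drain(self):
        """Called on a data-server request; returns samples oldest first."""
        capacity = len(self._slots)
        samples = [self._slots[(self._head + k) % capacity]
                   for k in range(self._count)]
        self._head = (self._head + self._count) % capacity
        self._count = 0
        return samples
```

Because the controller only writes at the tail and the server request only reads from the head, the controller cycle never has to wait for the network, which is the load-balancing effect the abstract describes.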

Keywords: Ethernet communication, DCS, power plant, ring buffer, data integrity

Downloads 1519

7458 Exploring the Spatial Characteristics of Mortality Map: A Statistical Area Perspective

Authors: Jung-Hong Hong, Jing-Cen Yang, Cai-Yu Ou

Abstract:

The analysis of geographic inequality relies heavily on the use of location-enabled statistical data and quantitative measures to present the spatial patterns of the selected phenomena and analyze their differences. To protect the privacy of individual instances and link to administrative units, point-based datasets are spatially aggregated to area-based statistical datasets, where only the overall status for the selected levels of spatial units is used for decision making. The partition of the spatial units thus has a dominant influence on the outcomes of the analyzed results, well known as the Modifiable Areal Unit Problem (MAUP). A new spatial reference framework, the Taiwan Geographical Statistical Classification (TGSC), was recently introduced in Taiwan based on the spatial partition principles of homogeneous consideration of the number of population and households. Compared to the traditional township units, TGSC provides additional levels of spatial units with finer granularity for presenting spatial phenomena and enables domain experts to select an appropriate dissemination level for publishing statistical data. This paper compares the results of using TGSC and township units, respectively, on the mortality data and examines the spatial characteristics of their outcomes. For the mortality data of Taitung County between January 1st, 2008 and December 31st, 2010, the all-cause age-standardized death rate (ASDR) for township units ranges from 571 to 1757 per 100,000 persons, whereas the 2nd dissemination area (TGSC) shows greater variation, ranging from 0 to 2222 per 100,000. The finer granularity of the TGSC spatial units clearly provides better outcomes for identifying and evaluating geographic inequality, and these outcomes can be further analyzed with statistical measures from other perspectives (e.g., population, area, environment).
The management and analysis of the statistical data referring to the TGSC in this research are strongly supported by Geographic Information System (GIS) technology. An integrated workflow is developed that consists of processing death certificates, geocoding street addresses, quality assurance of the geocoded results, automatic calculation of statistical measures, standardized encoding of measures, and geo-visualization of statistical outcomes. This paper also introduces a set of auxiliary measures from a geographic distribution perspective to further examine the hidden spatial characteristics of mortality data and justify the analyzed results. With a common statistical area framework like TGSC, the preliminary results demonstrate promising potential for developing a web-based statistical service that can effectively access domain statistical data and present the analyzed outcomes in meaningful ways to avoid wrong decision making.
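For readers unfamiliar with the ASDR figures quoted in the abstract above, the following sketch shows one common way such a rate is computed, direct age standardization. The abstract does not say which method or standard population the authors used, so the method, the function name, and the numbers in the usage note are assumptions for illustration.

```python
def age_standardized_death_rate(deaths, population, standard_population,
                                per=100_000):
    """Directly age-standardized death rate for one spatial unit.

    deaths[i] and population[i] are the deaths and person counts in age
    group i of the area; standard_population[i] is the size of the same
    age group in a chosen reference population (assumed input).
    """
    if not (len(deaths) == len(population) == len(standard_population)):
        raise ValueError("all inputs must cover the same age groups")
    total_standard = sum(standard_population)
    rate = 0.0
    for d, p, s in zip(deaths, population, standard_population):
        if p <= 0:
            raise ValueError("each age group needs a positive population")
        # Age-specific rate weighted by the group's share of the standard.
        rate += (d / p) * (s / total_standard)
    return rate * per
```

With two hypothetical age groups, `age_standardized_death_rate([1, 10], [1000, 1000], [3000, 1000])` weights the age-specific rates 0.001 and 0.01 by 0.75 and 0.25, giving 325 per 100,000. Small spatial units with few or no deaths in the period yield rates at or near zero, which is consistent with the wider range the abstract reports for the finer TGSC units.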

Keywords: Mortality map, spatial patterns, statistical area, variation.

Downloads 946

7457 Big Data: Concepts, Technologies and Applications in the Public Sector

Authors: A. Alexandru, C. A. Alexandru, D. Coardos, E. Tudora

Abstract:

Big Data (BD) is associated with a new generation of technologies and architectures which can harness the value of extremely large volumes of very varied data through real time processing and analysis. It involves changes in (1) data types, (2) accumulation speed, and (3) data volume. This paper presents the main concepts related to the BD paradigm, and introduces architectures and technologies for BD and BD sets. The integration of BD with the Hadoop Framework is also underlined. BD has attracted a lot of attention in the public sector due to the newly emerging technologies that allow the availability of network access. The volume of different types of data has exponentially increased. Some applications of BD in the public sector in Romania are briefly presented.

Keywords: Big data, big data Analytics, Hadoop framework, cloud computing.

Downloads 2262

7456 Information Measures Based on Sampling Distributions

Authors: Om Parkash, A. K. Thukral, C. P. Gandhi

Abstract:

Information theory and Statistics play an important role in Biological Sciences when we use information measures for the study of diversity and equitability. In this communication, we develop the link among the three disciplines and prove that sampling distributions can be used to develop new information measures. Our study is interdisciplinary and will find applications in biological systems.

Keywords: Entropy, concavity, symmetry, arithmetic mean, diversity, equitability.

Downloads 1341

7455 Computing SAGB-Gröbner Basis of Ideals of Invariant Rings by Using Gaussian Elimination

Authors: Sajjad Rahmany, Abdolali Basiri

Abstract:

The link between Gröbner bases and linear algebra was described by Lazard [4,5], who realized that the Gröbner basis computation could be achieved by applying Gaussian elimination over Macaulay's matrix. In this paper, we indicate how the same technique may be used for SAGBI-Gröbner basis computations in invariant rings.

Keywords: Gröbner basis, SAGBI-Gröbner basis, reduction, Invariant ring, permutation groups.

Downloads 2950

7454 Strengthening Legal Protection of Personal Data through Technical Protection Regulation in Line with Human Rights

Authors: Tomy Prihananto, Damar Apri Sudarmadi

Abstract:

Indonesia recognizes the right to privacy as a human right. Indonesia provides legal protection against data management activities because the protection of personal data is a part of human rights. This paper aims to describe the arrangement of data management in Indonesia. It is descriptive research with a qualitative approach, collecting data from a literature study. The results are a comprehensive arrangement of data that has been set up as a technical requirement of data protection by encryption methods. Arrangements on encryption and on the protection of personal data are mutually reinforcing in the protection of personal data. Indonesia has two important and immediately enacted laws that provide protection for the privacy of information that is part of human rights.

Keywords: Indonesia, protection, personal data, privacy, human rights, encryption.

Downloads 917

7453 Big Brain: A Single Database System for a Federated Data Warehouse Architecture

Authors: X. Gumara Rigol, I. Martínez de Apellaniz Anzuola, A. Garcia Serrano, A. Franzi Cros, O. Vidal Calbet, A. Al Maruf

Abstract:

Traditional federated architectures for data warehousing work well when corporations have existing regional data warehouses and there is a need to aggregate data at a global level. Schibsted Media Group has been maturing from a decentralised organisation into a more globalised one and needed to build some of the regional data warehouses for some brands at the same time as the global one. In this paper, we present the architectural alternatives studied and why a custom federated approach was the recommendation chosen for implementation. Although the data warehouses are logically federated, the implementation uses a single database system, which presented many advantages, such as cost reduction and improved data access for global users, giving consumers of the data a common data model for detailed analysis across different geographies and a flexible layer for local specific needs in the same place.

Keywords: Data integration, data warehousing, federated architecture, online analytical processing.

Downloads 654

7452 An In-Depth Analysis of Open Data Portals as an Emerging Public E-Service

Authors: Martin Lnenicka

Abstract:

Governments collect and produce large amounts of data. Increasingly, governments worldwide have started to implement open data initiatives and to launch open data portals to enable the release of these data in open and reusable formats. Therefore, a large number of open data repositories, catalogues and portals have been emerging in the world. The greater availability of interoperable and linkable open government data catalyzes secondary use of such data, so they can be used for building useful applications which leverage their value, allow insight, provide access to government services, and support transparency. The efficient development of successful open data portals makes it necessary to evaluate them systematically, in order to understand them better, assess the various types of value they generate, and identify the improvements required to increase this value. Thus, the attention of this paper is directed particularly to the field of open data portals. The main aim of this paper is to compare selected open data portals on the national level using content analysis and to propose a new evaluation framework, which further improves the quality of these portals. It also establishes a set of considerations for involving businesses and citizens in creating e-services and applications that leverage the datasets available from these portals.

Keywords: Big data, content analysis, criteria comparison, data quality, open data, open data portals, public sector.

Downloads 3017

7451 ATM Service Analysis Using Predictive Data Mining

Authors: S. Madhavi, S. Abirami, C. Bharathi, B. Ekambaram, T. Krishna Sankar, A. Nattudurai, N. Vijayarangan

Abstract:

The high utilization rate of Automated Teller Machines (ATMs) has inevitably caused long waiting times in the queue. This in turn has increased out-of-stock situations. ATM utilization helps to determine the usage level and indicates the necessity of the ATM based on the utilization of the ATM system. The time at which the ATM is used most frequently (peak time) is determined, and based on the predicted solution the necessary actions are taken by the bank management. The analysis is done using data mining, and the major part is analyzed using predictive data mining. The results are predicted from the historical data (past data) and used to track the relevant solution that is required. The Weka tool is used for the analysis of data based on predictive data mining.

Keywords: ATM, Bank Management, Data Mining, Historical data, Predictive Data Mining, Weka tool.

Downloads 5565

7450 A Survey on Opportunistic Routing in Mobile Ad Hoc Networks

Authors: R. Poonkuzhali, M. Y. Sanavullah, A. Sabari, T. Dhivyaa

Abstract:

Opportunistic Routing (OR) increases transmission reliability and network throughput. Traditional routing protocols preselect one or more predetermined nodes before transmission starts and use a predetermined neighbor to forward a packet in each hop. Opportunistic routing overcomes the drawback of unreliable wireless transmission by broadcasting, so that one transmission can be overheard by multiple neighbors. The first cooperation-optimal protocol for Multirate OR (COMO) is used to achieve social efficiency and prevent selfish behavior of the nodes. The novel link-correlation-aware OR improves performance by exploiting diverse, low-correlated forward links. Context-aware Adaptive OR (CAOR) uses an active suppression mechanism to reduce packet duplication. Context-aware OR (COR) can provide efficient routing in mobile networks. By using Cooperative Opportunistic Routing in Mobile Ad hoc Networks (CORMAN), the problem of opportunistic data transfer can be tackled. Compared with all the other protocols, COMO is the best, as it achieves social efficiency and prevents selfish behavior of the nodes.

Keywords: CAOR, COMO, COR, CORMAN, MANET, Opportunistic Routing, Reliability, Throughput.

Downloads 1825

7449 File System-Based Data Protection Approach

Authors: Jaechun No

Abstract:

As the amount of data to be stored in storage subsystems increases tremendously, data protection techniques have become more important than ever for providing data availability and reliability. In this paper, we present a file system-based data protection approach (WOWSnap) that has been implemented using a WORM (Write-Once-Read-Many) scheme. In WOWSnap, once WORM files have been created, only privileged read requests to them are allowed, to protect data against any intentional or accidental intrusions. Furthermore, every WORM file is associated with a protection cycle, a time period during which the file should be securely protected. Once their protection cycle has expired, the WORM files are automatically moved to the general-purpose data section without any user interference. This prevents the WORM data section from being consumed by unnecessary files. We evaluated the performance of WOWSnap on a Linux cluster.
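The WORM rules in the abstract above (write once, privileged reads only, automatic move to the general section after the protection cycle) can be illustrated with a small in-memory model. This is a sketch of the described behaviour only, not WOWSnap itself; the class `WormStore`, its method names, and the use of plain numbers for time are assumptions.

```python
class WormStore:
    """In-memory model of a WORM data section plus a general-purpose section.

    Hypothetical illustration: time is passed in explicitly as a number so
    the behaviour is deterministic.
    """

    def __init__(self):
        self.worm = {}     # name -> (data, expires_at)
        self.general = {}  # name -> data

    def create_worm(self, name, data, now, protection_cycle):
        """Write a file once; it stays protected until now + protection_cycle."""
        if name in self.worm:
            raise PermissionError("WORM file already exists: " + name)
        self.worm[name] = (bytes(data), now + protection_cycle)

    def write(self, name, data):
        """Ordinary writes are rejected while the file is under WORM protection."""
        if name in self.worm:
            raise PermissionError("file is write-once: " + name)
        self.general[name] = bytes(data)

    def read(self, name, privileged=False):
        """Only privileged readers may read files in the WORM section."""
        if name in self.worm:
            if not privileged:
                raise PermissionError("privileged read required: " + name)
            return self.worm[name][0]
        return self.general[name]

    def expire(self, now):
        """Move files whose protection cycle has ended to the general section."""
        expired = [n for n, (_, end) in self.worm.items() if end <= now]
        for name in expired:
            data, _ = self.worm.pop(name)
            self.general[name] = data
        return expired
```

Calling `expire` periodically keeps the WORM section from filling up with files that no longer need protection, mirroring the automatic migration the abstract describes.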

Keywords: Data protection, Protection cycle, WORM

Downloads 1619

7448 The Data Mining usage in Production System Management

Authors: Pavel Vazan, Pavol Tanuska, Michal Kebisek

Abstract:

The paper gives the pilot results of a project oriented toward the use of data mining techniques, and the knowledge discovered through them, in production systems. These techniques have been used in the management of such systems. Simulation models of manufacturing systems have been developed to obtain the necessary data about production. The authors have developed a way of storing the data obtained from the simulation models in a data warehouse. A data mining model has been created by using specific methods and selected techniques for defined problems of production system management. The new knowledge has been applied to the production management system and tested on simulation models of the production system. An important benefit of the project has been the proposal of a new methodology. This methodology is focused on data mining from databases that store operational data about the production process.

Keywords: data mining, data warehousing, management of production system, simulation

Downloads 3439

7447 A Review: Comparative Study of Diverse Collection of Data Mining Tools

Authors: S. Sarumathi, N. Shanthi, S. Vidhya, M. Sharmila

Abstract:

Considerable effort and research have gone into developing efficient tools for performing various tasks in data mining. Due to the massive amount of information embedded in huge data warehouses maintained in several domains, the extraction of meaningful patterns is no longer feasible. This issue makes the development of data mining tools all the more necessary. Furthermore, the major aim of data mining software is to build a resourceful predictive or descriptive model for handling large amounts of information more efficiently and in a more user-friendly way. Data mining mainly deals with excessive collections of data that impose rigorous computational constraints. The resulting challenges have led to the emergence of powerful data mining technologies. In this survey, a diverse collection of data mining tools is presented and contrasted in terms of the salient features and performance behavior of each tool.

Keywords: Business Analytics, Data Mining, Data Analysis, Machine Learning, Text Mining, Predictive Analytics, Visualization.

Downloads 3324

7446 Landscape Data Transformation: Categorical Descriptions to Numerical Descriptors

Authors: Dennis A. Apuan

Abstract:

Categorical data based on descriptions of the agricultural landscape impose some mathematical and analytical limitations. This problem, however, can be overcome by data transformation through a coding scheme and the use of a non-parametric multivariate approach. The present study describes data transformation from qualitative to numerical descriptors. In a collection of 103 random soil samples over a 60 hectare field, categorical data were obtained for the following variables: levels of nitrogen, phosphorus, potassium, pH, hue, chroma, value, and data on topography, vegetation type, and the presence of rocks. Categorical data were coded, and Spearman's rho correlation was then calculated using PAST software ver. 1.78, on which Principal Component Analysis was based. Results revealed successful data transformation, generating 1030 quantitative descriptors. Visualization based on the new set of descriptors showed clear differences among sites, and the amount of variation was successfully measured. Possible applications of data transformation are discussed.
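The first two steps named in the abstract above, coding categorical levels as numbers and computing Spearman's rho, can be sketched in plain Python. The authors used PAST software; the ordinal mapping `LEVEL_CODES`, the function names, and the sample values in the usage note below are hypothetical and only illustrate the idea.

```python
import math

# Hypothetical coding scheme: ordered categories mapped to integer codes.
LEVEL_CODES = {"low": 1, "medium": 2, "high": 3}


def encode(levels, codes=LEVEL_CODES):
    """Turn a list of categorical labels into numeric codes."""
    return [codes[level] for level in levels]


def average_ranks(xs):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: the Pearson correlation of the tie-averaged ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)
```

For instance, `spearman_rho(encode(["low", "medium", "high", "high"]), encode(["low", "low", "medium", "high"]))` gives the rank correlation between two coded soil variables. A matrix of such pairwise correlations is the kind of input a principal component analysis can then be based on.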

Keywords: data transformation, numerical descriptors, principal component analysis

Downloads 1464

7445 A Survey of Semantic Integration Approaches in Bioinformatics

Authors: Chaimaa Messaoudi, Rachida Fissoune, Hassan Badir

Abstract:

Technological advances in computer science and data analysis are helping to produce a continuous stream of huge volumes of biological data, which are available on the web. Such advances require powerful data integration techniques to extract the knowledge and information pertinent to a specific question. Biomedical exploration of these big data often requires complex queries across multiple autonomous, heterogeneous and distributed data sources. Semantic integration is an active area of research in several disciplines, such as databases, information integration, and ontology. We provide a survey of some approaches and techniques for integrating biological data, focusing on those developed in the ontology community.

Keywords: Semantic data integration, biological ontology, linked data, semantic web, OWL, RDF.

Downloads 1776

7444 The Safety of WiMAX in Solid Propellant Rocket Production

Authors: Jiradett K., Ornin S.

Abstract:

With the advance in wireless networking, IEEE 802.16 WiMAX technology has been widely deployed for several applications such as "last mile" broadband service, cellular backhaul, and high-speed enterprise connectivity. As a result, the military has for many years employed WiMAX as a high-speed wireless data link because of its point-to-multipoint and non-line-of-sight (NLOS) capability. However, the risk of using WiMAX is a critical factor in some sensitive areas of military applications, especially in ammunition manufacturing such as solid propellant rocket production. US DoD policy states that the following certification requirements must be met for WiMAX: electromagnetic effects on the environment (E3) and Hazards of Electromagnetic Radiation to Ordnance (HERO). This paper discusses the Recommended Power Densities and Safe Separation Distance (SSD) for HERO on WiMAX systems deployed in solid propellant rocket production. The research found that WiMAX is safe to operate in close proximity to rocket production, based on AF Guidance Memorandum immediately changing AFMAN 91-201.

Keywords: WiMAX, ammunition, explosive, munition, solid propellant, safety, rocket, missile

Downloads 1958

7443 Summarizing Data Sets for Data Mining by Using Statistical Methods in Coastal Engineering

Authors: Yunus Doğan, Ahmet Durap

Abstract:

Coastal regions are among the most heavily used places, both in terms of the natural balance and by the growing population. In coastal engineering, the most valuable data concern wave behavior. The amount of this data becomes very large because observations take place over periods of hours, days and months. In this study, statistical methods such as wave spectrum analysis methods and standard statistical methods have been used. The goal of the study is to discover profiles of different coastal areas using these statistical methods and thus to obtain an instance-based data set from the big data for analysis with data mining algorithms. In the experimental studies, six sample data sets of wave behavior, obtained from 20 minutes of observations in Mersin Bay, Turkey, were converted to an instance-based form, and different clustering techniques from data mining were used to discover similar coastal places. Moreover, this study argues that this summarization approach can be used in other fields that collect big data, such as medicine.
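As a rough illustration of turning a long observation series into one instance (one row of summary features), the sketch below reduces a list of wave heights to a few statistics. The feature set is an assumption, since the abstract does not list the statistics actually used; the significant wave height follows the common definition as the mean of the highest one-third of waves, and the function name and sample values are hypothetical.

```python
from statistics import mean, pstdev


def summarize_wave_heights(heights):
    """Reduce one observation window of wave heights (metres) to a feature row."""
    ordered = sorted(heights, reverse=True)
    # Highest one-third of the waves; keep at least one wave for short series.
    top_third = ordered[:max(1, len(ordered) // 3)]
    return {
        "mean": mean(heights),
        "std": pstdev(heights),      # population standard deviation
        "max": ordered[0],
        "h_sig": mean(top_third),    # significant wave height (H1/3)
    }


# Made-up heights standing in for one 20-minute observation window.
window = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
instance = summarize_wave_heights(window)
```

Each observation window yields one such row, and rows from different coastal sites could then be passed to any clustering algorithm.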

Keywords: Clustering algorithms, coastal engineering, data mining, data summarization, statistical methods.

Downloads 1187

7442 Dimensional Modeling of HIV Data Using Open Source

Authors: Charles D. Otine, Samuel B. Kucel, Lena Trojer

Abstract:

The choice of data modeling technique for an information system is determined by the objective of the resultant data model. Dimensional modeling is the preferred technique for data destined for data warehouses and data mining, presenting data models that ease analysis and queries, in contrast with entity relationship modeling. The establishment of data warehouses as components of information system landscapes in many organizations has subsequently led to the development of dimensional modeling. This has been significantly more developed and reported for commercial database management systems than for open source ones, thereby making it less affordable for those in resource constrained settings. This paper presents dimensional modeling of HIV patient information using open source modeling tools. It aims to take advantage of the fact that the regions most affected by HIV are also heavily resource constrained (sub-Saharan Africa) while having large quantities of HIV data. Two HIV data source systems were studied to identify appropriate dimensions and facts; these were then modeled using two open source dimensional modeling tools. Use of open source would reduce the software costs of dimensional modeling and in turn make data warehousing and data mining more feasible even for those in resource constrained settings but with data available.
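To make the idea of dimensions and facts concrete, here is a minimal star-schema sketch using SQLite, an open source database bundled with Python. The table names, columns, and rows are all hypothetical; the abstract does not say which dimensions and facts the authors identified or which modeling tools they used.

```python
import sqlite3

# Hypothetical star schema: one fact table keyed to two dimension tables.
SCHEMA = """
CREATE TABLE dim_patient (
    patient_key INTEGER PRIMARY KEY,
    sex         TEXT,
    age_group   TEXT
);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,
    year     INTEGER,
    month    INTEGER
);
CREATE TABLE fact_visit (
    patient_key INTEGER REFERENCES dim_patient(patient_key),
    date_key    INTEGER REFERENCES dim_date(date_key),
    cd4_count   INTEGER  -- the numeric measure being analysed
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up rows, for illustration only.
conn.executemany("INSERT INTO dim_patient VALUES (?, ?, ?)",
                 [(1, "F", "25-34"), (2, "M", "35-44")])
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(1, 2023, 1), (2, 2023, 6)])
conn.executemany("INSERT INTO fact_visit VALUES (?, ?, ?)",
                 [(1, 1, 350), (1, 2, 450), (2, 1, 200)])

# A typical analytical query: aggregate the fact measure by dimension attributes.
rows = conn.execute("""
    SELECT p.age_group, d.year, AVG(f.cd4_count)
    FROM fact_visit AS f
    JOIN dim_patient AS p ON p.patient_key = f.patient_key
    JOIN dim_date AS d ON d.date_key = f.date_key
    GROUP BY p.age_group, d.year
    ORDER BY p.age_group
""").fetchall()
```

The query joins the fact table to each dimension once and groups by dimension attributes, which is the access pattern that dimensional models are meant to simplify compared with a normalized entity relationship design.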

Keywords: About Database, Data Mining, Data warehouse, Dimensional Modeling, Open Source.

Downloads 1910