Search results for: Data representation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7767

Search results for: Data representation

7497 Big Brain: A Single Database System for a Federated Data Warehouse Architecture

Authors: X. Gumara Rigol, I. Martínez de Apellaniz Anzuola, A. Garcia Serrano, A. Franzi Cros, O. Vidal Calbet, A. Al Maruf

Abstract:

Traditional federated architectures for data warehousing work well when corporations have existing regional data warehouses and there is a need to aggregate data at a global level. Schibsted Media Group has been maturing from a decentralised organisation into a more globalised one and needed to build both some of the regional data warehouses for some brands at the same time as the global one. In this paper, we present the architectural alternatives studied and why a custom federated approach was the notable recommendation to go further with the implementation. Although the data warehouses are logically federated, the implementation uses a single database system which presented many advantages like: cost reduction and improved data access to global users allowing consumers of the data to have a common data model for detailed analysis across different geographies and a flexible layer for local specific needs in the same place.

Keywords: Data integration, data warehousing, federated architecture, online analytical processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 671
7496 An In-Depth Analysis of Open Data Portals as an Emerging Public E-Service

Authors: Martin Lnenicka

Abstract:

Governments collect and produce large amounts of data. Increasingly, governments worldwide have started to implement open data initiatives and also launch open data portals to enable the release of these data in open and reusable formats. Therefore, a large number of open data repositories, catalogues and portals have been emerging in the world. The greater availability of interoperable and linkable open government data catalyzes secondary use of such data, so they can be used for building useful applications which leverage their value, allow insight, provide access to government services, and support transparency. The efficient development of successful open data portals makes it necessary to evaluate them systematic, in order to understand them better and assess the various types of value they generate, and identify the required improvements for increasing this value. Thus, the attention of this paper is directed particularly to the field of open data portals. The main aim of this paper is to compare the selected open data portals on the national level using content analysis and propose a new evaluation framework, which further improves the quality of these portals. It also establishes a set of considerations for involving businesses and citizens to create eservices and applications that leverage on the datasets available from these portals.

Keywords: Big data, content analysis, criteria comparison, data quality, open data, open data portals, public sector.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3031
7495 ATM Service Analysis Using Predictive Data Mining

Authors: S. Madhavi, S. Abirami, C. Bharathi, B. Ekambaram, T. Krishna Sankar, A. Nattudurai, N. Vijayarangan

Abstract:

The high utilization rate of Automated Teller Machine (ATM) has inevitably caused the phenomena of waiting for a long time in the queue. This in turn has increased the out of stock situations. The ATM utilization helps to determine the usage level and states the necessity of the ATM based on the utilization of the ATM system. The time in which the ATM used more frequently (peak time) and based on the predicted solution the necessary actions are taken by the bank management. The analysis can be done by using the concept of Data Mining and the major part are analyzed based on the predictive data mining. The results are predicted from the historical data (past data) and track the relevant solution which is required. Weka tool is used for the analysis of data based on predictive data mining.

Keywords: ATM, Bank Management, Data Mining, Historical data, Predictive Data Mining, Weka tool.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5580
7494 Amplitude and Phase Analysis of EEG Signal by Complex Demodulation

Authors: Sun K. Yoo, Hee Cheol Kang

Abstract:

Analysis of amplitude and phase characteristics for delta, theta, and alpha bands at localized time instant from EEG signals is important for the characterizing information processing in the brain. In this paper, complex demodulation method was used to analyze EEG (Electroencephalographic) signal, particularly for auditory evoked potential response signal, with sufficient time resolution and designated frequency bandwidth resolution required. The complex demodulation decomposes raw EEG signal into 3 designated delta, theta, and alpha bands with complex EEG signal representation at sampled time instant, which can enable the extraction of amplitude envelope and phase information. Throughout simulated test data, and real EEG signal acquired during auditory attention task, it can extract the phase offset, phase and frequency changing instant and decomposed amplitude envelope for delta, theta, and alpha bands. The complex demodulation technique can be efficiently used in brain signal analysis in case of phase, and amplitude information required.

Keywords: EEG, Complex Demodulation, Amplitude, Phase.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4707
7493 A New Fuzzy Decision Support Method for Analysis of Economic Factors of Turkey's Construction Industry

Authors: R. Tur, A. Yardımcı

Abstract:

Imperfect knowledge cannot be avoided all the time. Imperfections may have several forms; uncertainties, imprecision and incompleteness. When we look to classification of methods for the management of imperfect knowledge we see fuzzy set-based techniques. The choice of a method to process data is linked to the choice of knowledge representation, which can be numerical, symbolic, logical or semantic and it depends on the nature of the problem to be solved for example decision support, which will be mentioned in our study. Fuzzy Logic is used for its ability to manage imprecise knowledge, but it can take advantage of the ability of neural networks to learn coefficients or functions. Such an association of methods is typical of so-called soft computing. In this study a new method was used for the management of imprecision for collected knowledge which related to economic analysis of construction industry in Turkey. Because of sudden changes occurring in economic factors decrease competition strength of construction companies. The better evaluation of these changes in economical factors in view of construction industry will made positive influence on company-s decisions which are dealing construction.

Keywords: Fuzzy logic, decision support systems, construction industry.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1602
7492 Enhanced Interference Management Technique for Multi-Cell Multi-Antenna System

Authors: Simon E. Uguru, Victor E. Idigo, Obinna S. Oguejiofor, Naveed Nawaz

Abstract:

As the deployment of the Fifth Generation (5G) mobile communication networks take shape all over the world, achieving spectral efficiency, energy efficiency, and dealing with interference are among the greatest challenges encountered so far. The aim of this study is to mitigate inter-cell interference (ICI) in a multi-cell multi-antenna system while maximizing the spectral efficiency of the system. In this study, a system model was devised that showed a miniature representation of a multi-cell multi-antenna system. Based on this system model, a convex optimization problem was formulated to maximize the spectral efficiency of the system while mitigating the ICI. This optimization problem was solved using CVX, which is a modeling system for constructing and solving discipline convex programs. The solutions to the optimization problem are sub-optimal coordinated beamformers. These coordinated beamformers direct each data to the served user equipments (UEs) in each cell without interference during downlink transmission, thereby maximizing the system-wide spectral efficiency.

Keywords: coordinated beamforming, convex optimization, inter-cell interference, multi-antenna, multi-cell, spectral efficiency

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 382
7491 File System-Based Data Protection Approach

Authors: Jaechun No

Abstract:

As data to be stored in storage subsystems tremendously increases, data protection techniques have become more important than ever, to provide data availability and reliability. In this paper, we present the file system-based data protection (WOWSnap) that has been implemented using WORM (Write-Once-Read-Many) scheme. In the WOWSnap, once WORM files have been created, only the privileged read requests to them are allowed to protect data against any intentional/accidental intrusions. Furthermore, all WORM files are related to their protection cycle that is a time period during which WORM files should securely be protected. Once their protection cycle is expired, the WORM files are automatically moved to the general-purpose data section without any user interference. This prevents the WORM data section from being consumed by unnecessary files. We evaluated the performance of WOWSnap on Linux cluster.

Keywords: Data protection, Protection cycle, WORM

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1625
7490 The Data Mining usage in Production System Management

Authors: Pavel Vazan, Pavol Tanuska, Michal Kebisek

Abstract:

The paper gives the pilot results of the project that is oriented on the use of data mining techniques and knowledge discoveries from production systems through them. They have been used in the management of these systems. The simulation models of manufacturing systems have been developed to obtain the necessary data about production. The authors have developed the way of storing data obtained from the simulation models in the data warehouse. Data mining model has been created by using specific methods and selected techniques for defined problems of production system management. The new knowledge has been applied to production management system. Gained knowledge has been tested on simulation models of the production system. An important benefit of the project has been proposal of the new methodology. This methodology is focused on data mining from the databases that store operational data about the production process.

Keywords: data mining, data warehousing, management of production system, simulation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3446
7489 A Review: Comparative Study of Diverse Collection of Data Mining Tools

Authors: S. Sarumathi, N. Shanthi, S. Vidhya, M. Sharmila

Abstract:

There have been a lot of efforts and researches undertaken in developing efficient tools for performing several tasks in data mining. Due to the massive amount of information embedded in huge data warehouses maintained in several domains, the extraction of meaningful pattern is no longer feasible. This issue turns to be more obligatory for developing several tools in data mining. Furthermore the major aspire of data mining software is to build a resourceful predictive or descriptive model for handling large amount of information more efficiently and user friendly. Data mining mainly contracts with excessive collection of data that inflicts huge rigorous computational constraints. These out coming challenges lead to the emergence of powerful data mining technologies. In this survey a diverse collection of data mining tools are exemplified and also contrasted with the salient features and performance behavior of each tool.

Keywords: Business Analytics, Data Mining, Data Analysis, Machine Learning, Text Mining, Predictive Analytics, Visualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3331
7488 Landscape Data Transformation: Categorical Descriptions to Numerical Descriptors

Authors: Dennis A. Apuan

Abstract:

Categorical data based on description of the agricultural landscape imposed some mathematical and analytical limitations. This problem however can be overcome by data transformation through coding scheme and the use of non-parametric multivariate approach. The present study describes data transformation from qualitative to numerical descriptors. In a collection of 103 random soil samples over a 60 hectare field, categorical data were obtained from the following variables: levels of nitrogen, phosphorus, potassium, pH, hue, chroma, value and data on topography, vegetation type, and the presence of rocks. Categorical data were coded, and Spearman-s rho correlation was then calculated using PAST software ver. 1.78 in which Principal Component Analysis was based. Results revealed successful data transformation, generating 1030 quantitative descriptors. Visualization based on the new set of descriptors showed clear differences among sites, and amount of variation was successfully measured. Possible applications of data transformation are discussed.

Keywords: data transformation, numerical descriptors, principalcomponent analysis

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1473
7487 A Survey of Semantic Integration Approaches in Bioinformatics

Authors: Chaimaa Messaoudi, Rachida Fissoune, Hassan Badir

Abstract:

Technological advances of computer science and data analysis are helping to provide continuously huge volumes of biological data, which are available on the web. Such advances involve and require powerful techniques for data integration to extract pertinent knowledge and information for a specific question. Biomedical exploration of these big data often requires the use of complex queries across multiple autonomous, heterogeneous and distributed data sources. Semantic integration is an active area of research in several disciplines, such as databases, information-integration, and ontology. We provide a survey of some approaches and techniques for integrating biological data, we focus on those developed in the ontology community.

Keywords: Semantic data integration, biological ontology, linked data, semantic web, OWL, RDF.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1788
7486 A Novel Approach towards Segmentation of Breast Tumors from Screening Mammograms for Efficient Decision Support System

Authors: M.Suganthi, M.Madheswaran

Abstract:

This paper presents a novel approach to finding a priori interesting regions in mammograms. In order to delineate those regions of interest (ROI-s) in mammograms, which appear to be prominent, a topographic representation called the iso-level contour map consisting of iso-level contours at multiple intensity levels and region segmentation based-thresholding have been proposed. The simulation results indicate that the computed boundary gives the detection rate of 99.5% accuracy.

Keywords: Breast Cancer, Mammogram, and Segmentation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1441
7485 Mapping Crime against Women in India: Spatio-Temporal Analysis, 2001-2012

Authors: Ritvik Chauhan, Vijay Kumar Baraik

Abstract:

Women are most vulnerable to crime despite occupying central position in shaping a society as the first teacher of children. In India too, having equal rights and constitutional safeguards, the incidences of crime against them are large and grave. In this context of crime against women, especially rape has been increasing over time. This paper explores the spatial and temporal aspects of crime against women in India with special reference to rape. It also examines the crime against women with its spatial, socio-economic and demographic associates using related data obtained from the National Crime Records Bureau India, Indian Census and other government sources of the Government of India. The simple statistical, choropleth mapping and other cartographic representation methods have been used to see the crime rates, spatio-temporal patterns of crime, and association of crime with its correlates.  The major findings are visible spatial variations across the country and are also in the rising trends in terms of incidence and rates over the reference period. The study also indicates that the geographical associations are somewhat observed. However, selected indicators of socio-economic factors seem to have no significant bearing on crime against women at this level.

Keywords: Crime against women, crime mapping, trend analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1650
7484 Representation of Power System for Electromagnetic Transient Calculation

Authors: P. Sowa

Abstract:

The new idea of analyze of power system failure with use of artificial neural network is proposed. An analysis of the possibility of simulating phenomena accompanying system faults and restitution is described. It was indicated that the universal model for the simulation of phenomena in whole analyzed range does not exist. The main classic method of search of optimal structure and parameter identification are described shortly. The example with results of calculation is shown.

Keywords: Dynamic equivalents, Network reduction, Neural networks, Power system analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1860
7483 Development of a Catchment Water Quality Model for Continuous Simulations of Pollutants Build-up and Wash-off

Authors: Iqbal Hossain, Dr. Monzur Imteaz, Dr. Shirley Gato-Trinidad, Prof. Abdallah Shanableh

Abstract:

Estimation of runoff water quality parameters is required to determine appropriate water quality management options. Various models are used to estimate runoff water quality parameters. However, most models provide event-based estimates of water quality parameters for specific sites. The work presented in this paper describes the development of a model that continuously simulates the accumulation and wash-off of water quality pollutants in a catchment. The model allows estimation of pollutants build-up during dry periods and pollutants wash-off during storm events. The model was developed by integrating two individual models; rainfall-runoff model, and catchment water quality model. The rainfall-runoff model is based on the time-area runoff estimation method. The model allows users to estimate the time of concentration using a range of established methods. The model also allows estimation of the continuing runoff losses using any of the available estimation methods (i.e., constant, linearly varying or exponentially varying). Pollutants build-up in a catchment was represented by one of three pre-defined functions; power, exponential, or saturation. Similarly, pollutants wash-off was represented by one of three different functions; power, rating-curve, or exponential. The developed runoff water quality model was set-up to simulate the build-up and wash-off of total suspended solids (TSS), total phosphorus (TP) and total nitrogen (TN). The application of the model was demonstrated using available runoff and TSS field data from road and roof surfaces in the Gold Coast, Australia. The model provided excellent representation of the field data demonstrating the simplicity yet effectiveness of the proposed model.

Keywords: Catchment, continuous pollutants build-up, pollutants wash-off, runoff, runoff water quality model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3086
7482 Summarizing Data Sets for Data Mining by Using Statistical Methods in Coastal Engineering

Authors: Yunus Doğan, Ahmet Durap

Abstract:

Coastal regions are the one of the most commonly used places by the natural balance and the growing population. In coastal engineering, the most valuable data is wave behaviors. The amount of this data becomes very big because of observations that take place for periods of hours, days and months. In this study, some statistical methods such as the wave spectrum analysis methods and the standard statistical methods have been used. The goal of this study is the discovery profiles of the different coast areas by using these statistical methods, and thus, obtaining an instance based data set from the big data to analysis by using data mining algorithms. In the experimental studies, the six sample data sets about the wave behaviors obtained by 20 minutes of observations from Mersin Bay in Turkey and converted to an instance based form, while different clustering techniques in data mining algorithms were used to discover similar coastal places. Moreover, this study discusses that this summarization approach can be used in other branches collecting big data such as medicine.

Keywords: Clustering algorithms, coastal engineering, data mining, data summarization, statistical methods.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1200
7481 Dimensional Modeling of HIV Data Using Open Source

Authors: Charles D. Otine, Samuel B. Kucel, Lena Trojer

Abstract:

Selecting the data modeling technique for an information system is determined by the objective of the resultant data model. Dimensional modeling is the preferred modeling technique for data destined for data warehouses and data mining, presenting data models that ease analysis and queries which are in contrast with entity relationship modeling. The establishment of data warehouses as components of information system landscapes in many organizations has subsequently led to the development of dimensional modeling. This has been significantly more developed and reported for the commercial database management systems as compared to the open sources thereby making it less affordable for those in resource constrained settings. This paper presents dimensional modeling of HIV patient information using open source modeling tools. It aims to take advantage of the fact that the most affected regions by the HIV virus are also heavily resource constrained (sub-Saharan Africa) whereas having large quantities of HIV data. Two HIV data source systems were studied to identify appropriate dimensions and facts these were then modeled using two open source dimensional modeling tools. Use of open source would reduce the software costs for dimensional modeling and in turn make data warehousing and data mining more feasible even for those in resource constrained settings but with data available.

Keywords: About Database, Data Mining, Data warehouse, Dimensional Modeling, Open Source.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1919
7480 Attitudes of Gratitude: An Analysis of 30 Cancer Narratives Published by Leading U.S. Cancer Care Centers

Authors: Maria L. McLeod

Abstract:

This study examines the ways in which cancer patient narratives are portrayed and framed on the websites of three leading U.S. cancer care centers – The University of Texas MD Anderson Cancer Center in Houston, Memorial Sloan Kettering Cancer Center in New York, and Seattle Cancer Care Alliance. Thirty patient stories, 10 from each cancer center website blog, were analyzed using qualitative and quantitative textual analysis of unstructured data, documenting common themes and other elements of story structure and content. Patient narratives were coded using grounded theory as the basis for conducting emergent qualitative research. As part of a systematic, inductive approach to collecting and analyzing data, recurrent and unique themes were examined and compared in terms of positive and negative framing, patient agency, and institutional praise. All three of these cancer care centers are teaching hospitals, with university affiliations, that emphasize an evidence-based scientific approach to treatment that utilizes the latest research and cutting-edge techniques and technology. The featured cancer stories suggest positive outcomes based on anecdotal narratives as opposed to the science-based treatment models employed by the cancer centers. An analysis of 30 sample stories found skewed representation of the “cancer experience” that emphasizes positive outcomes while minimizing or excluding more negative realities of cancer diagnosis and treatment. The stories also deemphasize patient agency, instead focusing on deference and gratitude toward the cancer care centers, which are cast in the role of savior.  

Keywords: Cancer framing, cancer narratives, survivor stories, patient narratives.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 493
7479 Connected Vertex Cover in 2-Connected Planar Graph with Maximum Degree 4 is NP-complete

Authors: Priyadarsini P. L. K, Hemalatha T.

Abstract:

This paper proves that the problem of finding connected vertex cover in a 2-connected planar graph ( CVC-2 ) with maximum degree 4 is NP-complete. The motivation for proving this result is to give a shorter and simpler proof of NP-Completeness of TRA-MLC (the Top Right Access point Minimum-Length Corridor) problem [1], by finding the reduction from CVC-2. TRA-MLC has many applications in laying optical fibre cables for data communication and electrical wiring in floor plans.The problem of finding connected vertex cover in any planar graph ( CVC ) with maximum degree 4 is NP-complete [2]. We first show that CVC-2 belongs to NP and then we find a polynomial reduction from CVC to CVC-2. Let a graph G0 and an integer K form an instance of CVC, where G0 is a planar graph and K is an upper bound on the size of the connected vertex cover in G0. We construct a 2-connected planar graph, say G, by identifying the blocks and cut vertices of G0, and then finding the planar representation of all the blocks of G0, leading to a plane graph G1. We replace the cut vertices with cycles in such a way that the resultant graph G is a 2-connected planar graph with maximum degree 4. We consider L = K -2t+3 t i=1 di where t is the number of cut vertices in G1 and di is the number of blocks for which ith cut vertex is common. We prove that G will have a connected vertex cover with size less than or equal to L if and only if G0 has a connected vertex cover of size less than or equal to K.

Keywords: NP-complete, 2-Connected planar graph, block, cut vertex

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1971
7478 Efficient Lossless Compression of Weather Radar Data

Authors: Wei-hua Ai, Wei Yan, Xiang Li

Abstract:

Data compression is used operationally to reduce bandwidth and storage requirements. An efficient method for achieving lossless weather radar data compression is presented. The characteristics of the data are taken into account and the optical linear prediction is used for the PPI images in the weather radar data in the proposed method. The next PPI image is identical to the current one and a dramatic reduction in source entropy is achieved by using the prediction algorithm. Some lossless compression methods are used to compress the predicted data. Experimental results show that for the weather radar data, the method proposed in this paper outperforms the other methods.

Keywords: Lossless compression, weather radar data, optical linear prediction, PPI image

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2216
7477 Conceptualizing the Knowledge to Manage and Utilize Data Assets in the Context of Digitization: Case Studies of Multinational Industrial Enterprises

Authors: Martin Böhmer, Agatha Dabrowski, Boris Otto

Abstract:

The trend of digitization significantly changes the role of data for enterprises. Data turn from an enabler to an intangible organizational asset that requires management and qualifies as a tradeable good. The idea of a networked economy has gained momentum in the data domain as collaborative approaches for data management emerge. Traditional organizational knowledge consequently needs to be extended by comprehensive knowledge about data. The knowledge about data is vital for organizations to ensure that data quality requirements are met and data can be effectively utilized and sovereignly governed. As this specific knowledge has been paid little attention to so far by academics, the aim of the research presented in this paper is to conceptualize it by proposing a “data knowledge model”. Relevant model entities have been identified based on a design science research (DSR) approach that iteratively integrates insights of various industry case studies and literature research.

Keywords: Data management, digitization, Industry 4.0, knowledge engineering, metamodel.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1404
7476 A Methodology for Data Migration between Different Database Management Systems

Authors: Bogdan Walek, Cyril Klimes

Abstract:

In present days the area of data migration is very topical. Current tools for data migration in the area of relational database have several disadvantages that are presented in this paper. We propose a methodology for data migration of the database tables and their data between various types of relational database systems (RDBMS). The proposed methodology contains an expert system. The expert system contains a knowledge base that is composed of IFTHEN rules and based on the input data suggests appropriate data types of columns of database tables. The proposed tool, which contains an expert system, also includes the possibility of optimizing the data types in the target RDBMS database tables based on processed data of the source RDBMS database tables. The proposed expert system is shown on data migration of selected database of the source RDBMS to the target RDBMS.

Keywords: Expert system, fuzzy, data migration, database, relational database, data type, relational database management system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3426
7475 Opening up Government Datasets for Big Data Analysis to Support Policy Decisions

Authors: K. Hardy, A. Maurushat

Abstract:

Policy makers are increasingly looking to make evidence-based decisions. Evidence-based decisions have historically used rigorous methodologies of empirical studies by research institutes, as well as less reliable immediate survey/polls often with limited sample sizes. As we move into the era of Big Data analytics, policy makers are looking to different methodologies to deliver reliable empirics in real-time. The question is not why did these people do this for the last 10 years, but why are these people doing this now, and if the this is undesirable, and how can we have an impact to promote change immediately. Big data analytics rely heavily on government data that has been released in to the public domain. The open data movement promises greater productivity and more efficient delivery of services; however, Australian government agencies remain reluctant to release their data to the general public. This paper considers the barriers to releasing government data as open data, and how these barriers might be overcome.

Keywords: Big data, open data, productivity, transparency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1584
7474 Forthcoming Big Data on Smart Buildings and Cities: An Experimental Study on Correlations among Urban Data

Authors: Yu-Mi Song, Sung-Ah Kim, Dongyoun Shin

Abstract:

Cities are complex systems of diverse and inter-tangled activities. These activities and their complex interrelationships create diverse urban phenomena. And such urban phenomena have considerable influences on the lives of citizens. This research aimed to develop a method to reveal the causes and effects among diverse urban elements in order to enable better understanding of urban activities and, therefrom, to make better urban planning strategies. Specifically, this study was conducted to solve a data-recommendation problem found on a Korean public data homepage. First, a correlation analysis was conducted to find the correlations among random urban data. Then, based on the results of that correlation analysis, the weighted data network of each urban data was provided to people. It is expected that the weights of urban data thereby obtained will provide us with insights into cities and show us how diverse urban activities influence each other and induce feedback.

Keywords: Big data, correlation analysis, data recommendation system, urban data network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1066
7473 On the Combination of Patient-Generated Data with Data from a Secure Clinical Network Environment – A Practical Example

Authors: Jeroen S. de Bruin, Karin Schindler, Christian Schuh

Abstract:

With increasingly more mobile health applications appearing due to the popularity of smartphones, the possibility arises that these data can be used to improve the medical diagnostic process, as well as the overall quality of healthcare, while at the same time lowering costs. However, as of yet there have been no reports of a successful combination of patient-generated data from smartphones with data from clinical routine. In this paper we describe how these two types of data can be combined in a secure way without modification to hospital information systems, and how they can together be used in a medical expert system for automatic nutritional classification and triage.

Keywords: Data integration, disease-related malnutrition, expert systems, mobile health.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2166
7472 Comparison of Imputation Techniques for Efficient Prediction of Software Fault Proneness in Classes

Authors: Geeta Sikka, Arvinder Kaur Takkar, Moin Uddin

Abstract:

Missing data is a persistent problem in almost all areas of empirical research. The missing data must be treated very carefully, as data plays a fundamental role in every analysis. Improper treatment can distort the analysis or generate biased results. In this paper, we compare and contrast various imputation techniques on missing data sets and make an empirical evaluation of these methods so as to construct quality software models. Our empirical study is based on NASA-s two public dataset. KC4 and KC1. The actual data sets of 125 cases and 2107 cases respectively, without any missing values were considered. The data set is used to create Missing at Random (MAR) data Listwise Deletion(LD), Mean Substitution(MS), Interpolation, Regression with an error term and Expectation-Maximization (EM) approaches were used to compare the effects of the various techniques.

Keywords: Missing data, Imputation, Missing Data Techniques.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1631
7471 Cluster Analysis for the Statistical Modeling of Aesthetic Judgment Data Related to Comics Artists

Authors: George E. Tsekouras, Evi Sampanikou

Abstract:

We compare three categorical data clustering algorithms with respect to the problem of classifying cultural data related to the aesthetic judgment of comics artists. Such a classification is very important in Comics Art theory since the determination of any classes of similarities in such kind of data will provide to art-historians very fruitful information of Comics Art-s evolution. To establish this, we use a categorical data set and we study it by employing three categorical data clustering algorithms. The performances of these algorithms are compared each other, while interpretations of the clustering results are also given.

Keywords: Aesthetic judgment, comics artists, cluster analysis, categorical data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1605
7470 IoT Device Cost Effective Storage Architecture and Real-Time Data Analysis/Data Privacy Framework

Authors: Femi Elegbeleye, Seani Rananga

Abstract:

This paper focused on cost effective storage architecture using fog and cloud data storage gateway, and presented the design of the framework for the data privacy model and data analytics framework on a real-time analysis when using machine learning method. The paper began with the system analysis, system architecture and its component design, as well as the overall system operations. Several results obtained from this study on data privacy models show that when two or more data privacy models are integrated via a fog storage gateway, we often have more secure data. Our main focus in the study is to design a framework for the data privacy model, data storage, and real-time analytics. This paper also shows the major system components and their framework specification. And lastly, the overall research system architecture was shown, including its structure, and its interrelationships.

Keywords: IoT, fog storage, cloud storage, data analysis, data privacy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 159
7469 Re-Presenting the Egyptian Informal Urbanism in Films between 1994 and 2014

Authors: R. Mofeed, N. Elgendy

Abstract:

Cinema constructs mind-spaces that reflect inherent human thoughts and emotions. As a representational art, Cinema would introduce comprehensive images of life phenomena in different ways. The term “represent” suggests verity of meanings; bring into presence, replace or typify. In that sense, Cinema may present a phenomenon through direct embodiment, or introduce a substitute image that replaces the original phenomena, or typify it by relating the produced image to a more general category through a process of abstraction. This research is interested in questioning the type of images that Egyptian Cinema introduces to informal urbanism and how these images were conditioned and reshaped in the last twenty years. The informalities/slums phenomenon first appeared in Egypt and, particularly, Cairo in the early sixties, however, this phenomenon was completely ignored by the state and society until the eighties, and furthermore, its evident representation in Cinema was by the mid-nineties. The Informal City represents the illegal housing developments, and it is a fast growing form of urbanization in Cairo. Yet, this expanding phenomenon is still depicted as the minority, exceptional and marginal through the Cinematic lenses. This paper aims at tracing the forms of representations of the urban informalities in the Egyptian Cinema between 1994 and 2014, and how did that affect the popular mind and its perception of these areas. The paper runs two main lines of inquiry; the first traces the phenomena through a chronological and geographical mapping of the informal urbanism has been portrayed in films. This analysis is based on an academic research work at Cairo University in Fall 2014. The visual tracing through maps and timelines allowed a reading of the phases of ignorance, presence, typifying and repetition in the representation of this huge sector of the city through more than 50 films that has been investigated. The analysis clearly revealed the “portrayed image” of informality by the Cinema through the examined period. However, the second part of the paper explores the “perceived image”. A designed questionnaire is applied to highlight the main features of that image that is perceived by both inhabitants of informalities and other Cairenes based on watching selected films. The questionnaire covers the different images of informalities proposed in the Cinema whether in a comic or a melodramatic background and highlight the descriptive terms used, to see which of them resonate with the mass perceptions and affected their mental images. The two images; “portrayed” and “perceived” are then to be encountered to reflect on issues of repetitions, stereotyping and reality. The formulated stereotype of informal urbanism is finally outlined and justified in relation to both production consumption mechanisms of films and the State official vision of informalities.

Keywords: Cairo, cinema, informal urbanism, representation, stereotype.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1416
7468 Some Equalities Connected with Fuzzy Soft Matrices

Authors: D. R. Jain

Abstract:

The aim of this paper is to use matrix representation of Fuzzy soft sets for proving some equalities connected with Fuzzy soft sets based on set-operations.

Keywords: Equality, Fuzzy soft matrix, Fuzzy soft sets, operations.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1745