Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8716

Search results for: XML Data Management

8596 Query Algebra for Semistuctured Data

Authors: Ei Ei Myat, Ni Lar Thein

Abstract:

With the tremendous growth of World Wide Web (WWW) data, there is an emerging need for effective information retrieval at the document level. Several query languages such as XML-QL, XPath, XQL, Quilt and XQuery are proposed in recent years to provide faster way of querying XML data, but they still lack of generality and efficiency. Our approach towards evolving a framework for querying semistructured documents is based on formal query algebra. Two elements are introduced in the proposed framework: first, a generic and flexible data model for logical representation of semistructured data and second, a set of operators for the manipulation of objects defined in the data model. In additional to accommodating several peculiarities of semistructured data, our model offers novel features such as bidirectional paths for navigational querying and partitions for data transformation that are not available in other proposals.

Keywords: Algebra, Semistructured data, Query Algebra.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 992
8595 Simulation Data Summarization Based on Spatial Histograms

Authors: Jing Zhao, Yoshiharu Ishikawa, Chuan Xiao, Kento Sugiura

Abstract:

In order to analyze large-scale scientific data, research on data exploration and visualization has gained popularity. In this paper, we focus on the exploration and visualization of scientific simulation data, and define a spatial V-Optimal histogram for data summarization. We propose histogram construction algorithms based on a general binary hierarchical partitioning as well as a more specific one, the l-grid partitioning. For effective data summarization and efficient data visualization in scientific data analysis, we propose an optimal algorithm as well as a heuristic algorithm for histogram construction. To verify the effectiveness and efficiency of the proposed methods, we conduct experiments on the massive evacuation simulation data.

Keywords: Simulation data, data summarization, spatial histograms, exploration and visualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 415
8594 Joint Use of Factor Analysis (FA) and Data Envelopment Analysis (DEA) for Ranking of Data Envelopment Analysis

Authors: Reza Nadimi, Fariborz Jolai

Abstract:

This article combines two techniques: data envelopment analysis (DEA) and Factor analysis (FA) to data reduction in decision making units (DMU). Data envelopment analysis (DEA), a popular linear programming technique is useful to rate comparatively operational efficiency of decision making units (DMU) based on their deterministic (not necessarily stochastic) input–output data and factor analysis techniques, have been proposed as data reduction and classification technique, which can be applied in data envelopment analysis (DEA) technique for reduction input – output data. Numerical results reveal that the new approach shows a good consistency in ranking with DEA.

Keywords: Effectiveness, Decision Making, Data EnvelopmentAnalysis, Factor Analysis

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2117
8593 Comparison of Different Methods to Produce Fuzzy Tolerance Relations for Rainfall Data Classification in the Region of Central Greece

Authors: N. Samarinas, C. Evangelides, C. Vrekos

Abstract:

The aim of this paper is the comparison of three different methods, in order to produce fuzzy tolerance relations for rainfall data classification. More specifically, the three methods are correlation coefficient, cosine amplitude and max-min method. The data were obtained from seven rainfall stations in the region of central Greece and refers to 20-year time series of monthly rainfall height average. Three methods were used to express these data as a fuzzy relation. This specific fuzzy tolerance relation is reformed into an equivalence relation with max-min composition for all three methods. From the equivalence relation, the rainfall stations were categorized and classified according to the degree of confidence. The classification shows the similarities among the rainfall stations. Stations with high similarity can be utilized in water resource management scenarios interchangeably or to augment data from one to another. Due to the complexity of calculations, it is important to find out which of the methods is computationally simpler and needs fewer compositions in order to give reliable results.

Keywords: Classification, fuzzy logic, tolerance relations, rainfall data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 740
8592 Device Discover: A Component for Network Management System using Simple Network Management Protocol

Authors: Garima Gupta, Daya Gupta

Abstract:

Virtually all existing networked system management tools use a Manager/Agent paradigm. That is, distributed agents are deployed on managed devices to collect local information and report it back to some management unit. Even those that use standard protocols such as SNMP fall into this model. Using standard protocol has the advantage of interoperability among devices from different vendors. However, it may not be able to provide customized information that is of interest to satisfy specific management needs. In this dissertation work, different approaches are used to collect information regarding the devices attached to a Local Area Network. An SNMP aware application is being developed that will manage the discovery procedure and will be used as data collector.

Keywords: ICMP Scanner, Network Discovery, NetworkManagement, SNMP Scanner.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1298
8591 An Approach to Improvement of Information Integrity in Key Areas of Portfolio Management

Authors: Victoria A. Bakhtina

Abstract:

At a time of growing market turbulence and a strong shifts towards increasingly complex risk models and more stringent audit requirements, it is more critical than ever to maintain the highest quality of financial and credit information. IFC implemented an approach that helps increase data integrity and quality significantly. This approach is called “Screening". Screening is based on linking information from different sources to identify potential inconsistencies in key financial and credit data. That, in turn, can help to ease the trials of portfolio supervision, and improve overall company global reporting and assessment systems. IFC experience showed that when used regularly, Screening led to improved information.

Keywords: Information Integrity, Information Quality, Business Rules, Portfolio Management

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1150
8590 Unsupervised Text Mining Approach to Early Warning System

Authors: Ichihan Tai, Bill Olson, Paul Blessner

Abstract:

Traditional early warning systems that alarm against crisis are generally based on structured or numerical data; therefore, a system that can make predictions based on unstructured textual data, an uncorrelated data source, is a great complement to the traditional early warning systems. The Chicago Board Options Exchange (CBOE) Volatility Index (VIX), commonly referred to as the fear index, measures the cost of insurance against market crash, and spikes in the event of crisis. In this study, news data is consumed for prediction of whether there will be a market-wide crisis by predicting the movement of the fear index, and the historical references to similar events are presented in an unsupervised manner. Topic modeling-based prediction and representation are made based on daily news data between 1990 and 2015 from The Wall Street Journal against VIX index data from CBOE.

Keywords: Early Warning System, Knowledge Management, Topic Modeling, Market Prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1559
8589 Urban Corridor Management Strategy Based on Intelligent Transportation System

Authors: Sourabh Jain, Sukhvir Singh Jain, Gaurav V. Jain

Abstract:

Intelligent Transportation System (ITS) is the application of technology for developing a user–friendly transportation system for urban areas in developing countries. The goal of urban corridor management using ITS in road transport is to achieve improvements in mobility, safety, and the productivity of the transportation system within the available facilities through the integrated application of advanced monitoring, communications, computer, display, and control process technologies, both in the vehicle and on the road. This paper attempts to present the past studies regarding several ITS available that have been successfully deployed in urban corridors of India and abroad, and to know about the current scenario and the methodology considered for planning, design, and operation of Traffic Management Systems. This paper also presents the endeavor that was made to interpret and figure out the performance of the 27.4 Km long study corridor having eight intersections and four flyovers. The corridor consisting of 6 lanes as well as 8 lanes divided road network. Two categories of data were collected on February 2016 such as traffic data (traffic volume, spot speed, delay) and road characteristics data (no. of lanes, lane width, bus stops, mid-block sections, intersections, flyovers). The instruments used for collecting the data were video camera, radar gun, mobile GPS and stopwatch. From analysis, the performance interpretations incorporated were identification of peak hours and off peak hours, congestion and level of service (LOS) at mid blocks, delay followed by the plotting speed contours and recommending urban corridor management strategies. From the analysis, it is found that ITS based urban corridor management strategies will be useful to reduce congestion, fuel consumption and pollution so as to provide comfort and efficiency to the users. The paper presented urban corridor management strategies based on sensors incorporated in both vehicles and on the roads.

Keywords: Congestion, ITS Strategies, Mobility, Safety.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1293
8588 Management of Air Pollutants from Point Sources

Authors: N. Lokeshwari, G. Srinikethan, V. S. Hegde

Abstract:

Monitoring is essential to assessing the effectiveness of air pollution control actions. The goal of the air quality information system is through monitoring, to keep authorities, major polluters and the public informed on the short and long-term changes in air quality, thereby helping to raise awareness. Mathematical models are the best tools available for the prediction of the air quality management. The main objective of the work was to apply a Model that predicts the concentration levels of different pollutants at any instant of time. In this study, distribution of air pollutants concentration such as nitrogen dioxides (NO2), sulphur dioxides (SO2) and total suspended particulates (TSP) of industries are determined by using Gaussian model. Besides that, the effect of wind speed and its direction on the pollutant concentration within the affected area were evaluated. In order to determine the efficiency and percentage of error in the modeling, validation process of data was done. Sampling of air quality was conducted in getting existing air quality around a factory and the concentrations of pollutants in a plume were inversely proportional to wind velocity. The resultant ground level concentrations were then compared to the quality standards to determine if there could be a negative impact on health. This study concludes that concentration of pollutants can be significantly predicted using Gaussian Model. The data base management is developed for the air data of Hubli-Dharwad region.

Keywords: DBMS, NO2, SO2, Wind rose plots.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1797
8587 Design and Implementation of Client Server Network Management System for Ethernet LAN

Authors: May Paing Paing Zaw, Su Myat Marlar Soe

Abstract:

Network Management Systems have played a great important role in information systems. Management is very important and essential in any fields. There are many managements such as configuration management, fault management, performance management, security management, accounting management and etc. Among them, configuration, fault and security management is more important than others. Because these are essential and useful in any fields. Configuration management is to monitor and maintain the whole system or LAN. Fault management is to detect and troubleshoot the system. Security management is to control the whole system. This paper intends to increase the network management functionalities including configuration management, fault management and security management. In configuration management system, this paper specially can support the USB ports and devices to detect and read devices configuration and solve to detect hardware port and software ports. In security management system, this paper can provide the security feature for the user account setting and user management and proxy server feature. And all of the history of the security such as user account and proxy server history are kept in the java standard serializable file. So the user can view the history of the security and proxy server anytime. If the user uses this system, the user can ping the clients from the network and the user can view the result of the message in fault management system. And this system also provides to check the network card and can show the NIC card setting. This system is used RMI (Remote Method Invocation) and JNI (Java Native Interface) technology. This paper is to implement the client/server network management system using Java 2 Standard Edition (J2SE). This system can provide more than 10 clients. And then this paper intends to show data or message structure of client/server and how to work using TCP/IP protocol.

Keywords: TCP/ IP based client server application

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3247
8586 Improving Academic Performance Prediction using Voting Technique in Data Mining

Authors: Ikmal Hisyam Mohamad Paris, Lilly Suriani Affendey, Norwati Mustapha

Abstract:

In this paper we compare the accuracy of data mining methods to classifying students in order to predicting student-s class grade. These predictions are more useful for identifying weak students and assisting management to take remedial measures at early stages to produce excellent graduate that will graduate at least with second class upper. Firstly we examine single classifiers accuracy on our data set and choose the best one and then ensembles it with a weak classifier to produce simple voting method. We present results show that combining different classifiers outperformed other single classifiers for predicting student performance.

Keywords: Classification, Data Mining, Prediction, Combination of Multiple Classifiers.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2489
8585 EEIA: Energy Efficient Indexed Aggregation in Smart Wireless Sensor Networks

Authors: Mohamed Watfa, William Daher, Hisham Al Azar

Abstract:

The main idea behind in network aggregation is that, rather than sending individual data items from sensors to sinks, multiple data items are aggregated as they are forwarded by the sensor network. Existing sensor network data aggregation techniques assume that the nodes are preprogrammed and send data to a central sink for offline querying and analysis. This approach faces two major drawbacks. First, the system behavior is preprogrammed and cannot be modified on the fly. Second, the increased energy wastage due to the communication overhead will result in decreasing the overall system lifetime. Thus, energy conservation is of prime consideration in sensor network protocols in order to maximize the network-s operational lifetime. In this paper, we give an energy efficient approach to query processing by implementing new optimization techniques applied to in-network aggregation. We first discuss earlier approaches in sensors data management and highlight their disadvantages. We then present our approach “Energy Efficient Indexed Aggregation" (EEIA) and evaluate it through several simulations to prove its efficiency, competence and effectiveness.

Keywords: Sensor Networks, Data Base, Data Fusion, Aggregation, Indexing, Energy Efficiency

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1516
8584 Churn Prediction for Telecommunication Industry Using Artificial Neural Networks

Authors: Ulas Vural, M. Ergun Okay, E. Mesut Yildiz

Abstract:

Telecommunication service providers demand accurate and precise prediction of customer churn probabilities to increase the effectiveness of their customer relation services. The large amount of customer data owned by the service providers is suitable for analysis by machine learning methods. In this study, expenditure data of customers are analyzed by using an artificial neural network (ANN). The ANN model is applied to the data of customers with different billing duration. The proposed model successfully predicts the churn probabilities at 83% accuracy for only three months expenditure data and the prediction accuracy increases up to 89% when the nine month data is used. The experiments also show that the accuracy of ANN model increases on an extended feature set with information of the changes on the bill amounts.

Keywords: Customer relationship management, churn prediction, telecom industry, deep learning, Artificial Neural Networks, ANN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 266
8583 Bureau Management Technologies and Information Systems in Developing Countries

Authors: Mehmet Altınöz

Abstract:

This study focuses on bureau management technologies and information systems in developing countries. Developing countries use such systems which facilitate executive and organizational functions through the utilization of bureau management technologies and provide the executive staff with necessary information. The concepts of data and information differ from each other in developing countries, and thus the concepts of data processing and information processing are different. Symbols represent ideas, objects, figures, letters and numbers. Data processing system is an integrated system which deals with the processing of the data related to the internal and external environment of the organization in order to make decisions, create plans and develop strategies; it goes without saying that this system is composed of both human beings and machines. Information is obtained through the acquisition and the processing of data. On the other hand, data are raw communicative messages. Within this framework, data processing equals to producing plausible information out of raw data. Organizations in developing countries need to obtain information relevant to them because rapid changes in the organizational arena require rapid access to accurate information. The most significant role of the directors and managers who work in the organizational arena is to make decisions. Making a correct decision is possible only when the directors and managers are equipped with sound ideas and appropriate information. Therefore, acquisition, organization and distribution of information gain significance. Today-s organizations make use of computer-assisted “Management Information Systems" in order to obtain and distribute information. Decision Support System which is closely related to practice is an information system that facilitates the director-s task of making decisions. Decision Support System integrates human intelligence, information technology and software in order to solve the complex problems. With the support of the computer technology and software systems, Decision Support System produces information relevant to the decision to be made by the director and provides the executive staff with supportive ideas about the decision. Artificial Intelligence programs which transfer the studies and experiences of the people to the computer are called expert systems. An expert system stores expert information in a limited area and can solve problems by deriving rational consequences. Bureau management technologies and information systems in developing countries create a kind of information society and information economy which make those countries have their places in the global socio-economic structure and which enable them to play a reasonable and fruitful role; therefore it is of crucial importance to make use of information and management technologies in order to work together with innovative and enterprising individuals and it is also significant to create “scientific policies" based on information and technology in the fields of economy, politics, law and culture.

Keywords: Bureau Management, Information Systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1218
8582 Strategic Management Methods in Non-profit Making Organization

Authors: P. Řehoř, D. Holátová, V. Doležalová

Abstract:

Paper deals with analysis of strategic management methods in non-profit making organization in the Czech Republic. Strategic management represents an aggregate of methods and approaches that can be applied for managing organizations - in this article the organizations which associate owners and keepers of nonstate forest properties. Authors use these methods of strategic management: analysis of stakeholders, SWOT analysis and questionnaire inquiries. The questionnaire was distributed electronically via e-mail. In October 2013 we obtained data from a total of 84 questionnaires. Based on the results the authors recommend the using of confrontation strategy which improves the competitiveness of non-profit making organizations.

Keywords: Strategic management, non-profit making organization, strategy analysis, SWOT analysis, strategy, competitiveness.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3995
8581 Observations about the Principal Components Analysis and Data Clustering Techniques in the Study of Medical Data

Authors: Cristina G. Dascâlu, Corina Dima Cozma, Elena Carmen Cotrutz

Abstract:

The medical data statistical analysis often requires the using of some special techniques, because of the particularities of these data. The principal components analysis and the data clustering are two statistical methods for data mining very useful in the medical field, the first one as a method to decrease the number of studied parameters, and the second one as a method to analyze the connections between diagnosis and the data about the patient-s condition. In this paper we investigate the implications obtained from a specific data analysis technique: the data clustering preceded by a selection of the most relevant parameters, made using the principal components analysis. Our assumption was that, using the principal components analysis before data clustering - in order to select and to classify only the most relevant parameters – the accuracy of clustering is improved, but the practical results showed the opposite fact: the clustering accuracy decreases, with a percentage approximately equal with the percentage of information loss reported by the principal components analysis.

Keywords: Data clustering, medical data, principal components analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1218
8580 Server Virtualization Using User Behavior Model Focus on Provisioning Concept

Authors: D. Prangchumpol

Abstract:

Server provisioning is one of the most attractive topics in virtualization systems. Virtualization is a method of running multiple independent virtual operating systems on a single physical computer. It is a way of maximizing physical resources to maximize the investment in hardware. Additionally, it can help to consolidate servers, improve hardware utilization and reduce the consumption of power and physical space in the data center. However, management of heterogeneous workloads, especially for resource utilization of the server, or so called provisioning becomes a challenge. In this paper, a new concept for managing workloads based on user behavior is presented. The experimental results show that user behaviors are different in each type of service workload and time. Understanding user behaviors may improve the efficiency of management in provisioning concept. This preliminary study may be an approach to improve management of data centers running heterogeneous workloads for provisioning in virtualization system.

Keywords: association rule, provisioning, server virtualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1446
8579 Spatio-Temporal Data Mining with Association Rules for Lake Van

Authors: T. Aydin, M. F. Alaeddinoglu

Abstract:

People, throughout the history, have made estimates and inferences about the future by using their past experiences. Developing information technologies and the improvements in the database management systems make it possible to extract useful information from knowledge in hand for the strategic decisions. Therefore, different methods have been developed. Data mining by association rules learning is one of such methods. Apriori algorithm, one of the well-known association rules learning algorithms, is not commonly used in spatio-temporal data sets. However, it is possible to embed time and space features into the data sets and make Apriori algorithm a suitable data mining technique for learning spatiotemporal association rules. Lake Van, the largest lake of Turkey, is a closed basin. This feature causes the volume of the lake to increase or decrease as a result of change in water amount it holds. In this study, evaporation, humidity, lake altitude, amount of rainfall and temperature parameters recorded in Lake Van region throughout the years are used by the Apriori algorithm and a spatio-temporal data mining application is developed to identify overflows and newlyformed soil regions (underflows) occurring in the coastal parts of Lake Van. Identifying possible reasons of overflows and underflows may be used to alert the experts to take precautions and make the necessary investments.

Keywords: Apriori algorithm, association rules, data mining, spatio-temporal data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1164
8578 Missing Link Data Estimation with Recurrent Neural Network: An Application Using Speed Data of Daegu Metropolitan Area

Authors: JaeHwan Yang, Da-Woon Jeong, Seung-Young Kho, Dong-Kyu Kim

Abstract:

In terms of ITS, information on link characteristic is an essential factor for plan or operation. But in practical cases, not every link has installed sensors on it. The link that does not have data on it is called “Missing Link”. The purpose of this study is to impute data of these missing links. To get these data, this study applies the machine learning method. With the machine learning process, especially for the deep learning process, missing link data can be estimated from present link data. For deep learning process, this study uses “Recurrent Neural Network” to take time-series data of road. As input data, Dedicated Short-range Communications (DSRC) data of Dalgubul-daero of Daegu Metropolitan Area had been fed into the learning process. Neural Network structure has 17 links with present data as input, 2 hidden layers, for 1 missing link data. As a result, forecasted data of target link show about 94% of accuracy compared with actual data.

Keywords: Data Estimation, link data, machine learning, road network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 934
8577 Situation-based Knowledge Presentation for Mobile Workers

Authors: Alessandra Agostini, Roberto Boselli, Flavio De Paoli, Riccardo Dondi

Abstract:

The work presented in this paper focus on Knowledge Management services enabling CSCW (Computer Supported Cooperative Work) applications to provide an appropriate adaptation to the user and the situation in which the user is working. In this paper, we explain how a knowledge management system can be designed to support users in different situations exploiting contextual data, users' preferences, and profiles of involved artifacts (e.g., documents, multimedia files, mockups...). The presented work roots in the experience we had in the MILK project and early steps made in the MAIS project.

Keywords: Information Management Systems, InformationRetrieval, Knowledge Management, Mobile CommunicationSystems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1261
8576 CNet Module Design of IMCS

Authors: Youkyung Park, SeungYup Kang, SungHo Kim, SimKyun Yook

Abstract:

IMCS is Integrated Monitoring and Control System for thermal power plant. This system consists of mainly two parts; controllers and OIS (Operator Interface System). These two parts are connected by Ethernet-based communication. The controller side of communication is managed by CNet module and OIS side is managed by data server of OIS. CNet module sends the data of controller to data server and receives commend data from data server. To minimizes or balance the load of data server, this module buffers data created by controller at every cycle and send buffered data to data server on request of data server. For multiple data server, this module manages the connection line with each data server and response for each request from multiple data server. CNet module is included in each controller of redundant system. When controller fail-over happens on redundant system, this module can provide data of controller to data sever without loss. This paper presents three main features – separation of get task, usage of ring buffer and monitoring communication status –of CNet module to carry out these functions.

Keywords: Ethernet communication, DCS, power plant, ring buffer, data integrity

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1247
8575 Factors Influencing Knowledge Management Process Model: A Case Study of Manufacturing Industry in Thailand

Authors: Daranee Pimchangthong, Supaporn Tinprapa

Abstract:

The objectives of this research were to explore factors influencing knowledge management process in the manufacturing industry and develop a model to support knowledge management processes. The studied factors were technology infrastructure, human resource, knowledge sharing, and the culture of the organization. The knowledge management processes included discovery, capture, sharing, and application. Data were collected through questionnaires and analyzed using multiple linear regression and multiple correlation. The results found that technology infrastructure, human resource, knowledge sharing, and culture of the organization influenced the discovery and capture processes. However, knowledge sharing had no influence in sharing and application processes. A model to support knowledge management processes was developed, which indicated that sharing knowledge needed further improvement in the organization.

Keywords: knowledge management, knowledge management process, tacit knowledge

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1493
8574 Big Data: Concepts, Technologies and Applications in the Public Sector

Authors: A. Alexandru, C. A. Alexandru, D. Coardos, E. Tudora

Abstract:

Big Data (BD) is associated with a new generation of technologies and architectures which can harness the value of extremely large volumes of very varied data through real time processing and analysis. It involves changes in (1) data types, (2) accumulation speed, and (3) data volume. This paper presents the main concepts related to the BD paradigm, and introduces architectures and technologies for BD and BD sets. The integration of BD with the Hadoop Framework is also underlined. BD has attracted a lot of attention in the public sector due to the newly emerging technologies that allow the availability of network access. The volume of different types of data has exponentially increased. Some applications of BD in the public sector in Romania are briefly presented.

Keywords: Big data, big data Analytics, Hadoop framework, cloud computing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1794
8573 Effects of Audit Quality and Corporate Governance on Earnings Management of Quoted Deposit Money Banks in Nigeria

Authors: Joel S. Akintayo, Ramat T. Salman

Abstract:

The stakeholders’ pressure on corporate managers to maintain firm’s profitability has created economic incentives for management to engage in earnings management practices. Therefore, this study examines the effects of audit quality and corporate governance on earnings management of quoted deposit money banks (DMBs) in Nigeria. This study specifically investigates the influence of audit tenure, audit fee, board independence, and board size on earnings management of DMBs. Explanatory research design was employed in carrying out the study while secondary data were sourced from the annual reports and accounts of all the 15 quoted DMBs in Nigerian Stock Exchange as at December 31, 2015 for a period of 10 years covering from 2006 to 2015. The data obtained for the study were analyzed using panel regression analysis approach. The findings reveal that board independence has a negative significant effect on earnings management at a 5% level of significance (p=0.002), while audit fee has a positive significant effect on earnings management at a 5% level of significance (p=0.013) and audit tenure has a negative significant effect on earnings management of DMBs at a 5% level of significance (p=0.003). Surprisingly, board size was statistically not significant at a 5% level of significance (p=0.086). The study concludes that high audit quality and sound corporate governance could improve the earnings quality of DMBs. Hence, the study recommends that the authorities saddled with the responsibility of banking supervision in Nigeria such the Securities and Exchange Commission (SEC) and CBN to advise the National Assembly in Nigeria to pass into law the three years professional requirement for audit tenure.

Keywords: Audit quality, audit tenure, audit fee, board independence, corporate governance, earnings management.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 894
8572 Methodology for the Multi-Objective Analysis of Data Sets in Freight Delivery

Authors: Dale Dzemydiene, Aurelija Burinskiene, Arunas Miliauskas, Kristina Ciziuniene

Abstract:

Data flow and the purpose of reporting the data are different and dependent on business needs. Different parameters are reported and transferred regularly during freight delivery. This business practices form the dataset constructed for each time point and contain all required information for freight moving decisions. As a significant amount of these data is used for various purposes, an integrating methodological approach must be developed to respond to the indicated problem. The proposed methodology contains several steps: (1) collecting context data sets and data validation; (2) multi-objective analysis for optimizing freight transfer services. For data validation, the study involves Grubbs outliers analysis, particularly for data cleaning and the identification of statistical significance of data reporting event cases. The Grubbs test is often used as it measures one external value at a time exceeding the boundaries of standard normal distribution. In the study area, the test was not widely applied by authors, except when the Grubbs test for outlier detection was used to identify outsiders in fuel consumption data. In the study, the authors applied the method with a confidence level of 99%. For the multi-objective analysis, the authors would like to select the forms of construction of the genetic algorithms, which have more possibilities to extract the best solution. For freight delivery management, the schemas of genetic algorithms' structure are used as a more effective technique. Due to that, the adaptable genetic algorithm is applied for the description of choosing process of the effective transportation corridor. In this study, the multi-objective genetic algorithm methods are used to optimize the data evaluation and select the appropriate transport corridor. The authors suggest a methodology for the multi-objective analysis, which evaluates collected context data sets and uses this evaluation to determine a delivery corridor for freight transfer service in the multi-modal transportation network. In the multi-objective analysis, authors include safety components, the number of accidents a year, and freight delivery time in the multi-modal transportation network. The proposed methodology has practical value in the management of multi-modal transportation processes.

Keywords: Multi-objective decision support, analysis, data validation, freight delivery, multi-modal transportation, genetic programming methods.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 87
8571 Big Brain: A Single Database System for a Federated Data Warehouse Architecture

Authors: X. Gumara Rigol, I. Martínez de Apellaniz Anzuola, A. Garcia Serrano, A. Franzi Cros, O. Vidal Calbet, A. Al Maruf

Abstract:

Traditional federated architectures for data warehousing work well when corporations have existing regional data warehouses and there is a need to aggregate data at a global level. Schibsted Media Group has been maturing from a decentralised organisation into a more globalised one and needed to build both some of the regional data warehouses for some brands at the same time as the global one. In this paper, we present the architectural alternatives studied and why a custom federated approach was the notable recommendation to go further with the implementation. Although the data warehouses are logically federated, the implementation uses a single database system which presented many advantages like: cost reduction and improved data access to global users allowing consumers of the data to have a common data model for detailed analysis across different geographies and a flexible layer for local specific needs in the same place.

Keywords: Data integration, data warehousing, federated architecture, online analytical processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 423
8570 An In-Depth Analysis of Open Data Portals as an Emerging Public E-Service

Authors: Martin Lnenicka

Abstract:

Governments collect and produce large amounts of data. Increasingly, governments worldwide have started to implement open data initiatives and also launch open data portals to enable the release of these data in open and reusable formats. Therefore, a large number of open data repositories, catalogues and portals have been emerging in the world. The greater availability of interoperable and linkable open government data catalyzes secondary use of such data, so they can be used for building useful applications which leverage their value, allow insight, provide access to government services, and support transparency. The efficient development of successful open data portals makes it necessary to evaluate them systematic, in order to understand them better and assess the various types of value they generate, and identify the required improvements for increasing this value. Thus, the attention of this paper is directed particularly to the field of open data portals. The main aim of this paper is to compare the selected open data portals on the national level using content analysis and propose a new evaluation framework, which further improves the quality of these portals. It also establishes a set of considerations for involving businesses and citizens to create eservices and applications that leverage on the datasets available from these portals.

Keywords: Big data, content analysis, criteria comparison, data quality, open data, open data portals, public sector.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2765
8569 File System-Based Data Protection Approach

Authors: Jaechun No

Abstract:

As data to be stored in storage subsystems tremendously increases, data protection techniques have become more important than ever, to provide data availability and reliability. In this paper, we present the file system-based data protection (WOWSnap) that has been implemented using WORM (Write-Once-Read-Many) scheme. In the WOWSnap, once WORM files have been created, only the privileged read requests to them are allowed to protect data against any intentional/accidental intrusions. Furthermore, all WORM files are related to their protection cycle that is a time period during which WORM files should securely be protected. Once their protection cycle is expired, the WORM files are automatically moved to the general-purpose data section without any user interference. This prevents the WORM data section from being consumed by unnecessary files. We evaluated the performance of WOWSnap on Linux cluster.

Keywords: Data protection, Protection cycle, WORM

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1303
8568 Knowledge-Driven Decision Support System Based on Knowledge Warehouse and Data Mining by Improving Apriori Algorithm with Fuzzy Logic

Authors: Pejman Hosseinioun, Hasan Shakeri, Ghasem Ghorbanirostam

Abstract:

In recent years, we have seen an increasing importance of research and study on knowledge source, decision support systems, data mining and procedure of knowledge discovery in data bases and it is considered that each of these aspects affects the others. In this article, we have merged information source and knowledge source to suggest a knowledge based system within limits of management based on storing and restoring of knowledge to manage information and improve decision making and resources. In this article, we have used method of data mining and Apriori algorithm in procedure of knowledge discovery one of the problems of Apriori algorithm is that, a user should specify the minimum threshold for supporting the regularity. Imagine that a user wants to apply Apriori algorithm for a database with millions of transactions. Definitely, the user does not have necessary knowledge of all existing transactions in that database, and therefore cannot specify a suitable threshold. Our purpose in this article is to improve Apriori algorithm. To achieve our goal, we tried using fuzzy logic to put data in different clusters before applying the Apriori algorithm for existing data in the database and we also try to suggest the most suitable threshold to the user automatically.

Keywords: Decision support system, data mining, knowledge discovery, data discovery, fuzzy logic.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1763
8567 Load Forecasting Using Neural Network Integrated with Economic Dispatch Problem

Authors: Mariyam Arif, Ye Liu, Israr Ul Haq, Ahsan Ashfaq

Abstract:

High cost of fossil fuels and intensifying installations of alternate energy generation sources are intimidating main challenges in power systems. Making accurate load forecasting an important and challenging task for optimal energy planning and management at both distribution and generation side. There are many techniques to forecast load but each technique comes with its own limitation and requires data to accurately predict the forecast load. Artificial Neural Network (ANN) is one such technique to efficiently forecast the load. Comparison between two different ranges of input datasets has been applied to dynamic ANN technique using MATLAB Neural Network Toolbox. It has been observed that selection of input data on training of a network has significant effects on forecasted results. Day-wise input data forecasted the load accurately as compared to year-wise input data. The forecasted load is then distributed among the six generators by using the linear programming to get the optimal point of generation. The algorithm is then verified by comparing the results of each generator with their respective generation limits.

Keywords: Artificial neural networks, demand-side management, economic dispatch, linear programming, power generation dispatch.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 549